You are on page 1of 95

# Vector and Complex Calculus

## for the Physical Sciences

Notes for Math 321 @ UW Madison
c
Fabian
Waleffe
Department of Mathematics &
Department of Engineering Physics
Spring 2013

Contents
1 Vectors
1
Get your bearings . . . . . . . . . . . . . . . . . . . . . . . .
1.1
Magnitude and direction . . . . . . . . . . . . . . . . .
1.2
Polar and Cartesian representations of 2D vectors . .
1.3
Spherical, Cylindrical and Cartesian representations in
2
Addition and scaling of vectors . . . . . . . . . . . . . . . . .
3
Abstract Vector Spaces . . . . . . . . . . . . . . . . . . . . .
4
Bases and Components . . . . . . . . . . . . . . . . . . . . . .
5
Points and Coordinates . . . . . . . . . . . . . . . . . . . . .
5.1
Position vector . . . . . . . . . . . . . . . . . . . . . .
5.2
Lines and Planes . . . . . . . . . . . . . . . . . . . . .
6
Dot (a.k.a. Scalar or Inner) Product . . . . . . . . . . . . . .
7
Orthonormal bases . . . . . . . . . . . . . . . . . . . . . . . .
7.1
Definition of dot product for Rn (Optional) . . . . . .
7.2
Norm of a vector (Optional) . . . . . . . . . . . . . . .
7.3
Definition of norm for Rn (Optional) . . . . . . . . . .
8
Cross (a.k.a. Vector or Area) Product . . . . . . . . . . . . .
8.1
Orientation of Bases . . . . . . . . . . . . . . . . . . .
8.2
Double cross product (Triple vector product) . . . .
9
Index notation . . . . . . . . . . . . . . . . . . . . . . . . . .
9.1
Levi-Civita (a.k.a. alternating or permutation) symbol
9.2
Sigma notation, free and dummy indices . . . . . . . .
9.3
Einstein summation convention . . . . . . . . . . . . .
10 Mixed (or Box) product and Determinant . . . . . . . . . .
11 Points, Lines, Planes, etc. . . . . . . . . . . . . . . . . . . . .
12 Orthogonal Transformations and Matrices . . . . . . . . . . .
12.1 Change of Cartesian Basis . . . . . . . . . . . . . . . .
12.2 Matrices . . . . . . . . . . . . . . . . . . . . . . . . . .
12.3 Orthogonal Matrices and Euler Angles . . . . . . . . .
12.4 Determinant of a matrix . . . . . . . . . . . . . . . . .
12.5 Three views of Ax = b . . . . . . . . . . . . . . . . . .
12.6 Eigenvalues and Eigenvectors (Math 320 not 321) . . .

. .
. .
. .
3D
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .
. .

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

5
5
5
5
6
7
8
9
10
10
11
12
14
15
16
16
17
18
19
21
21
21
22
24
27
28
28
30
31
33
33
35

2 Vector Calculus
1
Vector function of a scalar variable . . . . . . .
2
Motion of a particle . . . . . . . . . . . . . . .
3
Motion of a system of particles (optional) . . .
4
Motion of a rigid body (optional) . . . . . . . .
5
Curves, Surfaces, Volumes and their integrals .
5.1
Curves . . . . . . . . . . . . . . . . . . .
5.2
Integrals along curves, or line integrals
5.3
Surfaces . . . . . . . . . . . . . . . . . .

.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.

37
37
38
39
40
42
42
43
45

.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.

5.4
Surface integrals . . . . . . . . . . . . . . . . . . . . . .
5.5
Volumes and volume integrals . . . . . . . . . . . . . . .
5.6
Mappings, curvilinear coordinates . . . . . . . . . . . .
5.7
Change of variables . . . . . . . . . . . . . . . . . . . .
Grad, div, curl . . . . . . . . . . . . . . . . . . . . . . . . . . .
6.1
Geometric definition of the Gradient . . . . . . . . . . .
6.2
Directional derivative, gradient and the operator . .
6.3
Div and Curl . . . . . . . . . . . . . . . . . . . . . . . .
6.4
Vector identities . . . . . . . . . . . . . . . . . . . . . .
6.5
Grad, Div, Curl in cylindrical and spherical coordinates
Fundamental theorems of vector calculus . . . . . . . . . . . . .
7.1
Integration in R2 and R3 . . . . . . . . . . . . . . . . .
7.2
Fundamental theorem of Calculus . . . . . . . . . . . .
7.3
Fundamental theorem in R2 . . . . . . . . . . . . . . . .
7.4
Green and Stokes theorems . . . . . . . . . . . . . . . .
7.5
Divergence form of Greens theorem . . . . . . . . . . .
7.6
Gauss theorem . . . . . . . . . . . . . . . . . . . . . . .
7.7
Other useful forms of the fundamental theorem in 3D .

3 Complex Calculus
1
Basics of Series and Complex Numbers . . . . . .
1.1
Algebra of Complex numbers . . . . . . .
1.2
Limits and Derivatives . . . . . . . . . . .
1.3
Geometric sums and series . . . . . . . . .
1.4
Ratio test . . . . . . . . . . . . . . . . . .
1.5
Power series . . . . . . . . . . . . . . . . .
1.6
Complex transcendentals . . . . . . . . .
1.7
Polar representation . . . . . . . . . . . .
1.8
Logs . . . . . . . . . . . . . . . . . . . . .
2
Functions of a complex variable . . . . . . . . . .
2.1
Visualization of complex functions . . . .
2.2
Cauchy-Riemann equations . . . . . . . .
2.3
Geometry of Cauchy-Riemann, Conformal
3
Integration of Complex Functions . . . . . . . . .
3.1
Cauchys theorem . . . . . . . . . . . . .
3.2
Cauchys formula . . . . . . . . . . . . . .
4
Applications of complex integration . . . . . . . .

. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
Mapping
. . . . . .
. . . . . .
. . . . . .
. . . . . .

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

47
47
49
51
55
55
55
56
57
58
60
60
60
61
62
64
64
65

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.

69
69
69
70
71
71
72
73
75
76
78
78
79
81
84
85
88
90

Chapter 1

Vectors
What is a vector? In calculus, you may have defined vectors as lists of numbers such as (a1 , a2 ) or (a1 , a2 , a3 ),
while in physics, and in geometry before that, you encountered vectors as quantities with both magnitude
and direction such as displacements, velocities and forces.

1.1

We write
a=aa

(1)

## for a vector a of magnitude |a| = a in the direction a

. Magnitude of vector a is denoted |a|, and |a| 0
is a positive real number with appropriate physical units (meters, Newtons, . . . ). Direction of vector a is
denoted a
with |
a| = 1 and, since their magnitude is unity, direction vectors are often called unit vectors,
but note that unit vectors have no physical units!
Example 1.1.
a = 2L heading 30 clockwise from North
specifies a horizontal displacement a of magnitude |a| = 2L and direction a
= 30 clockwise from North. L
is a reference length, for instance L = 1 meter or 1 mile, and North is a reference direction, magnetic north
or true North on the earth.

D
B
a
a

## Two points A and B specify a displacement a = AB. The same

displacement a starting from point C leads to point D, with

## CD = a = AB. Conversely, we can specify points (locations)

by specifying displacements from a reference point, thus point B
is located displacement a from point A, and D is a from C.

Vectors are denoted with a boldface: a, b, u, v, . . . , usually lower-case but sometimes upper-case as in a
magnetic field B, electric field E or force F . Some authors write ~a for a and we use that notation to denote

the displacement from point A to point B as AB. Writing by hand, we use the typographical notation for
boldface which is a for a, a
for a
, etc.
e

1.2

## For vectors in a plane (2D), a direction a

can be specified by an angle from a reference direction as in
example 1.1. In navigation, that angle is usually defined clockwise from North and called an azimuthal
angle. In mathematical physics, we specify the direction using an angle counterclockwise from a reference
5

6
direction x
. We can specify a 2D vector a by giving the pair (a, ) for vector a of magnitude |a| = a and
direction a
= angle counterclockwise from x
.

ay y

ax x

We can also specify the vector as a sum of displacements in two reference perpendicular directions, ax in
direction x
and ay in direction y
perpendicular to x
. Now the pair (ax , ay ) specifies a. The pairs (a, ) and
(ax , ay ) are equivalent representations of a,
a (a, ) (ax , ay )
but in general they are not equal
(a, ) 6= (ax , ay ).
To express the equality of the representations, we need to include the direction vectors and write
a = aa
= ax x
+ ay y

(2)

## then by Pythagoras and basic trigonometry

)
ax = a cos
ay = a sin

a2 = a2x + a2y ,

(3)

a
= cos x
+ sin y
.

(4)

The representation a = a a
, where a
=a
() is a function of , is a polar representation of the 2D vector
a and a = ax x
+ ay y
is a cartesian representation with x
and y
two arbitrary perpendicular directions.
The vector a is specified by the pair (a, ) or (ax , ay ), or many other pairs of numbers depending on
how we specify the direction, for instance 20 feet toward the water tower then 5 meters toward the oak tree
specifies a displacement as (20ft, 5m) but the reference directions may not be perpendicular. Clearly, the
same physical 2D vector can be represented by an infinity of pairs of numbers depending on how we choose
our reference magnitudes and directions.

1.3

## Spherical, Cylindrical and Cartesian representations in 3D

How do you specify a direction in 3D space? Astronomers use an azimuthal angle measured clockwise
from North in the horizontal plane and an inclination angle, say, measured from the upward direction (or
Zenith), in the azimuthal (or meridional ) plane. In ballistics and computer graphics, they may prefer to use
an elevation angle measured up from the horizontal instead of the inclination angle down from the vertical.
In mathematical physics, we use the angle between the z and a
directions and an angle from the x

## A little trigonometry yields

a
= sin + cos z,

= cos x
+ sin y
.

(5)

c
F.
Waleffe, Math 321, 2013/1/21

The spherical representation (a, , ) specifies vector a through its magnitude a, its polar angle and
its azimuthal angle . The cylindrical representation (aH , , az ) specifies a by its horizontal magnitude
aH , azimuthal angle and vertical component az and the cartesian representation consists of the familiar
(ax , ay , az ). A 3D vector a can thus be represented by 3 real numbers, for instance (a, , ) or (aH , , az ) or
(ax , ay , az ). Each of these triplets are equivalent representations of the vector a
a (a, , ) (aH , , az ) (ax , ay , az )

(6)

## but they are not equal to each other, in general,

(a, , ) 6= (aH , , az ) 6= (ax , ay , az ).

(7)

## To express equality we need the direction vectors to write

a=aa
= aH + az z = ax x
+ ay y
+ az z.

(8)

aH

ay y

az z

aH

ax x

From this vector equation, and with a little help from Pythagoras and basic trigonometry, we deduce
a2 = a2H + a2z ,

## a2H = a2x + a2y ,

(9)

az
aH
+
z = sin + cos z,
a
a
ay
ax
x
+
y
= cos x
+ sin y
.
=
aH
aH

(10)

a
=

(11)

These representations indicate the two fundamental algebraic properties of vectors: they can be added and
scaled.

a
b

c
b

a+

a+b

+c
b+c

## Vector addition is commutative: a + b = b + a, and associative: a + b + c = (a + b) + c = a + (b + c). Note

that a, b and c are not in the same plane, in general.

(a)
a

## To every vector a we can associate an opposite vector denoted (a) that

is the displacement exactly opposite to a. Vector subtraction b a is then
defined as the addition of b and (a). In particular a+(a) = 0 corresponds
to no net displacement. This is an important difference between points and
displacements, there is no special point in our space, but there is one special
displacement: the zero vector 0 such that a + (a) = 0 = (a) + a and
a + 0 = a, for any a.

8
The other key operation that characterizes vectors is scaling, that is, multiplication by a real number R.

b
(a + b)
b
||a

a + b

a+b

||a

a
a

Geometrically, v = a is a new vector parallel to a but of length |v| = |||a|. The direction of v is
the same as a if > 0 and opposite to a if < 0. Obviously (1)a = (a), multiplying a by (1)
yields the previously defined opposite of a. Other geometrically obvious properties are distributivity with
respect to addition of real factors: ( + )a = a + a, and with respect to multiplication of real factors:
()a = (a). Slightly less trivial is distributivity with respect to vector addition: (a + b) = a + b,
which geometrically corresponds to similarity of triangles, as illustrated above.

## Abstract Vector Spaces

Vector addition and multiplication by a number (scaling) are the two key operations that define a Vector
Space, provided those operations satisfy the following 8 properties a, b in the vector space and , in
R. The symbol means for all or for any.
a + b = b + a,

(12)

a + (b + c) = (a + b) + c,

(13)

a + 0 = a,

(14)

a + (a) = 0.

(15)

## Required scalar multiplication properties:

( + )a = a + a,

(16)

()a = (a),

(17)

(a + b) = a + b,

(18)

1 a = a.

(19)

When these properties are used to define the vector space they are referred to as axioms, i.e. the defining
properties.
The vector space Rn : Consider the set of ordered n-tuplets of real numbers x (x1 , x2 , . . . , xn ). These
could correspond to student grades on a particular exam, for instance. What kind of operations would we
want to do on these lists of student grades? Well probably want to add several grades for each student and
well probably want to rescale the grades. So the natural operations on these n-tuplets are addition defined1
x + y , (x1 + y1 , x2 + y2 , . . . , xn + yn ) = y + x.

(20)

## and multiplication by a real number R defined as

x , (x1 , x2 , . . . , xn ).

(21)

The set of n-tuplets of real numbers equipped with addition and multiplication by a real number as just
defined is an important vector space called Rn . The vector spaces R2 and R3 will be particularly important
to us as theyll soon corresponds to the components of our physical vectors. But we also use Rn for very
large n when studying systems of equations, for instance.
1 The

## symbol , means equal by definition.

c
F.
Waleffe, Math 321, 2013/1/21

Exercises:
1. Show that addition and scalar multiplication of n-tuplets satisfy the 8 required properties listed above.
2. Define addition and scalar multiplication of n-tuplets of complex numbers and show that all 8 properties
are satisfied. That vector space is called Cn .
3. The set of real functions f (x) is also a vector space. Define addition in the obvious way: f (x) + g(x)
h(x) another real function, and scalar multiplication: f (x) = F (x) yet another real function. Show
that all 8 properties are again satisfied.
4. Suppose you define addition of n-tuplets x = (x1 , x2 , . . . , xn ) as usual but define scalar multiplication
according to x = (x1 , x2 , , xn ), that is, only the first component is multiplied by . Which
property is violated? What if you defined x = (x1 , 0, , 0), which property would be violated?
5. From the 8 properties, show that (0)a = 0 and (1)a = (a), a, i.e. show that multiplication by the
scalar 0 yields the neutral element for addition, and multiplication by 1 yields the additive inverse.

## Bases and Components

Addition and scaling of vectors allow us to define the concepts of linear combination, basis, components and
dimension. These concepts apply to any vector space.
A linear combination of vectors a and b is an expression of the form a + b. This linear combination
yields another vector v. The set of all such vectors, obtained by taking any , R, is itself a vector space
(or more correctly a vector subspace if a and b are two vectors in 3D space for instance). We say that a
and b form a basis for that (sub)space. We also say that this is the (sub)space spanned by a and b. For
a given vector v, the unique real numbers and such that v = a + b are called the components of v
with respect to the basis a, b.

v = a + b
b
a

Linear Independence: The vectors a1 , a2 , , ak for some integer k are linearly independent (L.I.) if
the only way to have
1 a1 + 2 a2 + + k ak = 0
is for all the s to be zero:
1 = 2 = = k = 0.
Dimension: The dimension of a vector space is the largest number of linearly independent vectors, n
say, in that space. A basis for that space consists of n linearly independent vectors. A vector v has n
components (some of them possibly zero) with respect to any basis in that space.
Examples
Two non-parallel vectors a and b in a plane (for instance, horizontal plane) are L.I. and these vectors
form a basis for vectors (for instance, displacements) in the plane. Any given vector v in the plane can be
written as v = a + b, for a unique pair (, ). v is the diagonal of the parallelogram a, b. Three or
more vectors in a plane are linearly dependent.
Three non-coplanar vectors a, b and c in 3D space are L.I. and those vectors form a basis for 3D
space. However 4 or more vectors in 3D are linearly dependent. Any given vector v can be expanded as
v = a + b + c, for a unique triplet of real numbers (, , ). Make sketches to illustrate.

10
The 8 properties of addition and scalar multiplication imply that if two vectors u and v are expanded
with respect to the same basis a1 , a2 , a3 so
u = u1 a1 + u2 a2 + u3 a3 ,
v = v1 a1 + v2 a2 + v3 a3 ,
then
u + v = (u1 + v1 )a1 + (u2 + v2 )a2 + (u3 + v3 )a3 ,
v = (v1 )a1 + (v2 )a2 + (v3 )a3 ,
so addition and scalar multiplication are performed component by component and the triplets of real components (v1 , v2 , v3 ) are elements of the vector space R3 . A basis a1 , a2 , a3 in 3D space provides a one-to-one
correspondence (mapping) between displacements v in 3D (Euclidean) space (call it E3 ) and triplets of real
numbers in R3
v E3 (v1 , v2 , v3 ) R3 .
Exercises
1. Given vectors a, b in E3 , show that the set of all v = a + b, , R is a vector space.
2. Show that the set of all vectors v = a + b, R and fixed a, b is not a vector space.
3. If you defined addition of ordered pairs x = (x1 , x2 ) as usual but scalar multiplication by x =
(x1 , x2 ), would it be possible to represent any vector x as a linear combination of two basis vectors
a and b?
4. Given three points P1 , P2 , P3 in E3 , let M be the midpoint of segment P1 P2 , what are the components

of P2 P3 and P3 M in the basis P1 P2 , P1 P3 ?
5. Find a basis for Rn (consider the natural basis: e1 = (1, 0, , 0), e2 = (0, 1, 0, , 0), etc.)
6. Find a basis for Cn . What is the dimension of that space?
7. What is the dimension of the vector space of real continuous functions f (x) in 0 < x < 1?
8. What could be a basis for the vector space of nice functions f (x) in (0, 1)? (i.e. 0 < x < 1) (whats
a nice function? smooth functions are infinitely differentiable, thats nice!)

5
5.1

## Points and Coordinates

Position vector

In elementary calculus and linear algebra it is easy to confuse points and vectors. In R3 for instance, a point
P is defined as a triplet (x1 , x2 , x3 ) but a vector a is also defined by a real number triplet (a1 , a2 , a3 ). In
physical space, points and displacements are two clearly different things.

P
r

The confusion arises from the fundamental way to locate points by specifying displacements from a reference point called the origin and denoted O. An arbitrary

## point P is then specified by providing the displacement vector r = OP . That

vector is called the position vector of P and denoted r for radial vector from the
origin O.

A Cartesian system of coordinates consists of a reference point O and three mutually orthogonal directions

x
, y
, z that provide a basis for displacements in 3D Euclidean space E3 . The position vector r = OP of
point P can then be specified in spherical, cylindrical or cartesian form as in sect. 1.3 now with the special
r instead of the arbitrary a,
r = r r = + z z = x x
+yy
+ z z,
(22)

c
F.
Waleffe, Math 321, 2013/1/21

11

(
p
= x2 + y 2

y = sin
(

r=

z2

x = cos

= r sin
z = r cos

(23)

(24)

## and we can eliminate to obtain

r=

p
x2 + y 2 + z 2

x = r sin cos
y = r sin sin

z = r cos

(25)

Note that x
, y
, z form a basis for 3D Euclidean vector space, but and z do not, and r does not either.
The cartesian basis x
, y
, z consists of three fixed and mutually orthogonal directions independent of P ,
but and r depend on P , each point has its own and r. We will construct and use full cylindrical and
)
spherical orthogonal bases, (,
,
z) and (
r , ,
later in vector calculus. These cylindrical and spherical
basis vectors vary with P , or more precisely with and .
Once a Cartesian system of coordinates, O, x
, y
, z, has been chosen, the cartesian coordinates (x, y, z)
of P are the cartesian components of r = x x
+yy
+ z z. The cylindrical coordinates of P are (, , z) and
its spherical coordinates are (r, , ), but cylindrical and spherical coordinates are not vector components,
they are the cylindrical and spherical representations of r.
Coordinates of points can be specified in many other ways that do not correspond to a displacement
vector. In navigation for instance, a point P in a horizontal plane can be located by specifying the angles,
1 and 2 say, between a reference direction (North) and the lines of sight from P to two reference points P1
and P2 (light houses or radio beacons). In the global positioning system (GPS), a point P is located from
the distances to (at least) 3 satellites. Thus, in general, coordinates of points are not components of vectors.

5.2

## Lines and Planes

The line passing through point A that is parallel to the direction of a consists of all points P such that
P
A

rA

AP = a,

R
(26)

## This vector equation expresses that the vector AP is parallel to a. In terms

of an origin O we have OP = OA + AP and we can write the vector (parametric) equation of that same line as
r = rA + a,

(27)

## where r = OP and rA = OA are the position vectors of P and A with

respect to O, respectively. The real number is the parameter of the line,
it is the coordinate of P in the system A, a.

Likewise the equation of a plane passing through A and parallel to the vectors a and b consists of all
points P such that

AP = a + b, , R
(28)
or in terms of O
r = rA + a + b.

(29)

This is the parametric vector equation of that plane with parameters , , that are the coordinates of P in
the system of reference A, a, b.

12
Exercises

1. Pick two vectors a, b and some arbitrary point A in the plane of your sheet of paper. If AB = a + b,
sketch the region where B can be if: (i) and are both between 0 and 1, (ii) || || 1.
2. Given three points P1 , P2 , P3 in R3 , what are the coordinates of the midpoints of the triangle with

respect to P1 and the basis P1 P2 , P1 P3 ?
3. Show that the line segment connecting the midpoints of two sides of a triangle is parallel to and equal
to half of the third side using methods of plane geometry and using vectors.
4. Show that the medians of a triangle intersect at the same point which is 2/3 of the way down from
the vertices along each median (a median is a line that connects a vertex to the middle of the opposite
side). Do this using both plane geometry and vector methods.

5. Given three point A, B, C, not co-linear, find a point X such that XA + XB + XC = 0. Show
that the line through A and X cuts BC at its mid-point. Deduce similar results for the other sides
of the triangle ABC and therefore that X is the point of intersection of the medians. Sketch. [Hint:

XB = XA + AB, XC = ]

6. Given four points A, B, C, D not co-planar, find a point X such that XA + XB + XC + XD = 0.
Show that the line through A and X intersects the triangle BCD at its center of area. Deduce similar
results for the other faces and therefore that the medians of the tetrahedron ABCD, defined as the
lines joining each vertex to the center of area of the opposite triangle, all intersect at the same point
X which is 3/4 of the way down from the vertices along the medians. Visualize. [Hint: solve previous
problem first, of course.]
Partial solutions to problems 3 and 4:
B

E
C

## Let D and E be the midpoints of segments CB and CA. Geometry: consider

the triangles ABC and EDC. They are similar (why?), so ED = AB/2 and
those two segments are parallel. Next consider triangles BAG and EDG.
Those are similar too (why?), so AG = 2GD and BG = 2GE, done! Vector

## A b + BE for some , . Writing this equality in terms of a and b yields

= = 2/3, done! (Why does this imply that the line through C and G
cuts AB at its midpoint?).

No need to introduce cartesian coordinates, that would be horrible and full of unnecessary algebra. The

vector solution refers to a vector basis: a = CA and b = CB which is perfect (or intrinsic) for the problem,
although it is not orthogonal!

## The geometric definition of the dot product of vectors in 3D Euclidean space is

a b , |a| |b| cos ,

(30)

where 0 is the angle between the vectors a and b when their tails coincide. The dot product is
a real number such that a b = 0 iff a and b are orthogonal (perpendicular). The 0 vector is considered
orthogonal to any vector. The dot product of any vector with itself is the square of its length a a = |a|2 .
The dot product is directly related to the perpendicular projections of b onto a and a onto b. The latter are,
respectively,

c
F.
Waleffe, Math 321, 2013/1/21

13

a
b

a
b

|b| cos =

ab
ab
=a
b 6= a
b=
= |a| cos ,
|a|
|b|

(31)

where a
= a/|a| and
b = b/|b| are the unit vectors in the a and b directions, respectively. Recall that a unit
vector is a vector of length one |
a| = 1 a 6= 0.
In physics, the work W done by a force F on a particle undergoing the displacement ` is equal to
distance |`| times the component of F in the direction of `, but that is equal to the total force |F | times
the component of the displacement ` in the direction of F . Both of these statements are contained in the
symmetric definition W = F `, see exercise 1 below.
Parallel and Perpendicular Components
We often want to decompose a vector b into vector components, bk and b , parallel and perpendicular to a
vector a, respectively, such that b = bk + b with

bk = (b a
) a
=

ba
a
aa

b = b bk = b (b a
)
a=b

bk

(32)

ba
a
aa

## Properties of the dot product

The dot product has the following properties, most of which are immediate:
1.

a b = b a,

2.

## (a) b = a (b) = (a b),

3.

a (b + c) = a b + a c,

4.

a a 0,

5.

(a b)2 (a a) (b b)

a a = 0 a = 0,

ba

b+c

ca

(Cauchy-Schwarz)

## To verify the distributive property, a (b + c) = a b + a c, geometrically, note that the magnitude of a

drops out so all we need to check is a
b+a
c = a
(b + c) or in other words, that the perpendicular
projections of b and c onto a line parallel to a add up to the projection of b + c onto that line. This is
obvious from the figure. In general, a, b and c are not in the same plane. To visualize the 3D case interpret
a
b as the (signed) distance between two planes perpendicular to a that pass through the tail and head of
b and likewise for a
c and a
(b + c). The result follows directly since b, c and b + c form a triangle. So
the picture is the same but the dotted lines represent planes seen from the sides and b, c, b + c are not in
the plane of the paper, in general.

Exercises
1. A skier slides down an inclined plane with a total vertical drop of h, show that the work done by
gravity is independent of the slope. Use F and `s and sketch the geometry of this result.

14
2. Sketch the solutions of a x = , where a and are known.
3. Sketch c = a + b then calculate c c = (a + b) (a + b) and deduce the law of cosines.
4. If n is a unit vector, show that a a (a n)n is orthogonal to n, a. Sketch.
5. If c = a + b show that c = a + b (defined in previous exercise). Interpret geometrically.
6. B is a magnetic field and v is the velocity of a particle. We want to decompose v = v + v k where v
is perpendicular to the magnetic field and v k is parallel to it. Derive vector expressions for v and v k .
7. Show that the three normals (or heights, or altitudes) dropped from the vertices of a triangle perpendicular to their opposite sides intersect at the same point H say. [Hint: this is similar to problem 4 in

section 4 but now AD and BE are defined by AD CB = 0 and BE CA = 0 and the goal is to show

that CH AB = 0. Do you need to find H to prove the result?].
8. A and B are two points on a sphere of radius R specified by their longitude and latitude. Find the
shortest distance between A and B, traveling on the sphere. [If O is the center of the sphere consider

OA OB to determine their angle].
9. Consider v(t) = a + tb where t R and a, b are arbitrary constant. What is the minimum |v| and for
what t? Solve two ways: geometrically and using calculus.
10. Show that the point of intersection of the medians G (center of gravity), the point of intersection of the
heights H (orthocenter ) and the point of intersection of the perpendicular bisectors O (circumcenter )
of an arbitrary triangle, all lie on the same line (known as the Euler line).

Orthonormal bases

Given an arbitrary vector v and three non co-planar vectors a, b and c in E3 , you can find the three scalars
, and such that
v = a + b + c
by parallel projections (sect. 4). The scalars , and are the components of v in the basis a, b, c. Finding
those components is simpler if the basis is orthogonal, that is if the basis vectors a, b and c are mutually
orthogonal, in which case a b = b c = c a = 0. For an orthogonal basis, a projection parallel to a
say, is a projection perpendicular to b and c, but a perpendicular projection is a dot product. In fact, we
can forget geometry and crank out the vector algebra: take the dot product of both sides of the equation
v = a + b + c with each of the 3 basis vectors to obtain
=

av
,
aa

bv
,
bb

cv
.
cc

An orthonormal basis is even better. Thats a basis for which the vectors are mutually orthogonal and
of unit norm. Such a basis is often denoted2 e1 , e2 , e3 . Its compact definition is
ei ej = ij

(33)

## where i, j = 1, 2, 3 and ij is the Kronecker symbol, ij = 1 if i = j and 0 if i 6= j.

2 Forget about the notation i, j, k for cartesian unit vectors. This is 19th century notation, it is unfortunately still very
common in elementary courses but that old notation will get in the way if you stick to it. We will NEVER use i, j, k, instead
we use (
x, y
, z
) or (e1 , e2 , e3 ) or (ex , ey , ez ) to denote a set of three orthonormal vectors in 3D euclidean space. We will soon
use indices i, j and k (next line already!). Those indices are positive integers that can take all the values from 1 to n, the
dimension of the space. We spend most of our time in 3D space, so most of the time the possible values for these indices i, j
and k are 1, 2 and 3. We will use those indices a lot!. They should not be confused with those old orthonormal vectors i, j, k
from elementary calculus.

c
F.
Waleffe, Math 321, 2013/1/21

15

The components of a vector v with respect to the orthonormal basis e1 , e2 , e3 in E3 are the real numbers
v1 , v2 , v3 such that

v = v e + v e + v e
vi ei
1 1
2 2
3 3
(34)
i=1

vi = ei v, i = 1, 2, 3.
If two vectors a and b are expanded in terms of e1 , e2 , e3 , that is
a = a1 e1 + a2 e2 + a3 e3 ,

b = b1 e1 + b2 e2 + b3 e3 ,

use the distributivity properties of the dot product and the orthonormality of the basis to show that
a b = a1 b1 + a2 b2 + a3 b3 .

(35)

## B Show that this formula is valid only for orthonormal bases.

One remarkable property of this formula is that its value is independent of the orthonormal basis. The
dot product is a geometric property of the vectors a and b, independent of the basis. This is obvious from
the geometric definition (30) but not from its expression in terms of components (35). If e1 , e2 , e3 and e1 0 ,
e2 0 , e3 0 are two distinct orthogonal bases then
a = a1 e1 + a2 e2 + a3 e3 = a01 e1 0 + a02 e2 0 + a03 e3 0
but, in general, the components in the two bases are distinct: a1 6= a01 , a2 6= a02 , a3 6= a03 , and likewise for
another vector b, yet
a b = a1 b1 + a2 b2 + a3 b3 = a01 b01 + a02 b02 + a03 b03 .
(36)
The simple algebraic form of the dot product is invariant under a change of orthonormal basis.

Exercises
1. Given the orthonormal (cartesian) basis (e1 , e2 , e3 ), consider a = a1 e1 + a2 e2 , b = b1 e1 + b2 e2 ,
v = v1 e1 + v2 e2 . What are the components of v (i) in terms of a and b? (ii) in terms of a and b
where a b = 0?
2. If (e1 , e2 , e3 ) and (e01 , e02 , e03 ) are two distinct orthogonal bases and a and b are arbitrary vectors, prove
(36) but construct an example that a1 b1 + 2a2 b2 + 3a3 b3 6= a01 b01 + 2a02 b02 + 3a03 b03 in general.
P3
P
3. If w = i=1 wi ei , calculate ej w using
notation and (33).
P3
P3
P3
4. Why is not true that ei i=1 wi ei = i=1 wi (ei ei ) = i=1 wi ii = w1 + w2 + w3 ?
P3
P3
P
notation and (33).
5. If v = i=1 vi ei and w = i=1 wi ei , calculate v w using
P3
P3
6. If v = i=1 vi ai and w = i=1 wi ai , where the basis ai , i = 1, 2, 3, is not orthonormal, calculate
v w.
P3
P3 P3
P3
7. Calculate (i) j=1 ij aj , (ii) i=1 j=1 ij aj bi , (iii) j=1 jj .

7.1

## Definition of dot product for Rn (Optional)

The geometric definition of the dot product (30) is great for oriented line segments as it emphasizes the
geometric aspects, but the algebraic formula (35) is very useful for calculations. Its also the way to define
the dot product for other vector spaces where the concept of angle between vectors may not be obvious e.g.
what is the angle between the vectors (1,2,3,4) and (4,3,2,1) in R4 ?! The dot product (a.k.a. scalar product
or inner product) of the vectors x and y in Rn is defined as suggested by (35):
x y , x1 y1 + x2 y2 + + xn yn .

(37)

16
Verify that this definition satisfies the first 4 properties of the dot product. To show the Cauchy-Schwarz
property, you need a bit of Calculus and a classical trick: consider v = x + y, then by prop 4 of the dot
product: (x + y) (x + y) 0, and by props 1,2,3, F () (x + y) (x + y) = 2 y y + 2x y + x x.
For given, but arbitrary, x and y, this is a quadratic polynomial in . That polynomial F () has a single
minimum at = (x y)/(y y). Find that minimum value of F () and deduce the Cauchy-Schwarz
inequality. Once we know that the definition (37) satisfies Cauchy-Schwarz, (x y)2 (x x) (y y), we
can define the length of a vector by |x| = (x x)1/2 (this is called the Euclidean length since it corresponds
to length in Euclidean geometry by Pythagorass theorem) and the angle between two vectors in Rn by
cos = (x y)/(|x| |y|). A vector space for which a dot (or inner) product is defined is called a Hilbert space
(or an inner product space).
The bottom line is that for more complex vector spaces, the dot (or scalar or inner) product is a key mathematical construct that allows us to generalize the concept of angle between vectors and, most importantly,
to define orthogonal vectors.
Exercises
1. So what is the angle between (1, 2, 3, 4) and (4, 3, 2, 1)?
2. Can you define a dot product for the vector space of real functions f (x)?
3. Find a vector orthogonal to (1, 2, 3, 4). Find all the vectors orthogonal to (1, 2, 3, 4).
4. Decompose (4,2,1,7) into the sum of two vectors one of which is parallel and the other perpendicular
to (1, 2, 3, 4).
5. Show that cos nx with n integer, is a set of orthogonal functions on (0, ). Find formulas for the
components of a function f (x) in terms of that orthogonal basis. In particular, find the components
of sin x in terms of the cosine basis in that (0, ) interval.

7.2

## Norm of a vector (Optional)

The norm of a vector, denoted kak, is a positive real number that defines its size or length (but not in
the sense of the number of its components). For displacement vectors in Euclidean spaces, the norm is the

length of the displacement, kak = |a| i.e. the distance between point A and B if AB = a. The following
properties are geometrically straightforward for length of displacement vectors:
1.

## kak 0 and kak = 0 a = 0,

2.

kak = || kak,

3.

ka + bk kak + kbk.

(triangle inequality)

Draw the triangle formed by a, b and a + b to see why the latter is called the triangle inequality. For more
general vector spaces, these properties become the defining properties (axioms) that a norm must satisfy. A
vector space for which a norm is defined is called a Banach space.

7.3

## Definition of norm for Rn (Optional)

For other types of vector space, there are many possible definitions for the norm of a vector as long as those
definitions satisfy the 3 norm properties. In Rn , the p-norm of vector x is defined by the positive number

1/p
kxkp , |x1 |p + |x2 |p + + |xn |p
,

(38)

where p 1 is a real number. Commonly used norms are the 2-norm kxk2 which is the square root of the
sum of the squares, the 1-norm kxk1 (sum of absolute values) and the infinity norm, kxk , defined as the
limit as p of the above expression.

c
F.
Waleffe, Math 321, 2013/1/21

17

Note that the 2-norm kxk2 = (x x)1/2 and for that reason is also called the Euclidean norm. In fact, if
a dot product is defined, then a norm can always be defined as the square root of the dot product. In other
words, every Hilbert space is a Banach space, but the converse is not true.
1. Show that the infinity norm kxk = maxi |xi |.
2. Show that the p-norm satisfies the three norm properties for p = 1, 2, .
3. Define a norm for Cn .
4. Define the 2-norm for real functions f (x) in 0 < x < 1.

## Cross (a.k.a. Vector or Area) Product

The cross product is a very useful operation for physical applications (mechanics, electromagnetism), but
it is particular to 3D space. The cross product of two vectors a, b is the vector denoted a b that is (1)
orthogonal to both a and b, (2) has magnitude equal to the area of the parallelogram with sides a and b,
(3) has direction determined by the right-hand rule (or the cork-screw rule), i.e.
a b , An

(39)

where A = |a| |b| sin is the area of the parallelogram spanned by a and b, and n
is a unit vector perpendicular to both a and b in the direction such that a, b, n
is right handed. is the angle between a and
b as defined for the dot product (is sin 0?). The following figure illustrates the cross-product with a
perspective view (left) and a top view (right), with a b out of the paper in the top view.

ab

ab

ba
a
An important geometric property of the cross product is
a b = a b = a b

(40)

where a = a (a
b)
b is the vector component of a perpendicular to b and likewise b = b (b a
)
a is
the vector component of b perpendicular to a (so the meaning of is relative). The cross-product has the
following properties:
1.

a b = b a,

2.

## (a) b = a (b) = (a b),

3.

c (a + b) = (c a) + (c b)

(anti-commutativity)

a a = 0,

So we manipulate the cross product as wed expect except for the anti-commutativity, which is a big difference
from our other elementary products! The first 2 properties are geometrically obvious from the definition.
To show the third property (distributivity) let c = |c|
c and get rid of |c| by prop 2. All three cross products
give vectors perpendicular to c and furthermore from (40) we have c a = c a , c b = c b and
c (a + b) = c (a + b) , where means perpendicular to c, a = a (a c)
c, etc. So the cross-products
eliminate the components parallel to c and all the action is in the plane perpendicular to c.

18

c b
c (a + b)

c a

a
c

8.1

## To visualize the distributivity property it suffices to look at that plane

from the top, with c pointing out of the sheet of paper. Then a cross
product by c is equivalent to a rotation of the perpendicular components
by /2 counterclockwise. Since a, b and a + b form a triangle, their
perpendicular projections a , b and (a + b) form a triangle and
therefore ca, cb and c(a+b) also form a triangle, demonstrating
distributivity.

(a + b)

Orientation of Bases

If we pick an arbitrary unit vector e1 , then a unit vector e2 orthogonal to e1 , then there are two possible
unit vectors e3 orthogonal to both e1 and e2 . One choice gives a right-handed basis (i.e. e1 in right thumb
direction, e2 in right index direction and e3 in right major direction). The other choice gives a left-handed
basis. These two types of bases are mirror images of each other as illustrated in the following figure, where
e1 0 = e1 point straight out of the paper (or screen).

e2 0

e2

e3 0
e1

e3

e1

Left-handed

Right-handed

mirror

This figure reveals an interesting subtlety of the cross product. For this particular choice of left and right
handed bases (other arrangements are possible of course), e1 0 = e1 and e2 0 = e2 but e3 0 = e3 so e1 e2 = e3
and e1 0 e2 0 = e3 = e3 0 . This indicates that the mirror image of the cross-product is not the cross-product
of the mirror images. On the opposite, the mirror image of the cross-product e3 0 is minus the cross-product of
the images e1 0 e2 0 . We showed this for a special case, but this is general, the cross-product is not invariant
under reflection, it changes sign. Physical laws should not depend on the choice of basis, so this implies that
they should not be expressed in terms of an odd number of cross products. When we write that the velocity
of a particle is v = r, v and r are good vectors (reflecting as they should under mirror symmetry) but
is not quite a true vector, it is a pseudo-vector. It changes sign under reflection. That is because rotation
vectors are themselves defined according to the right-hand rule, so an expression such as r actually
contains two applications of the right hand rule. Likewise in the Lorentz force F = qv B, F and v are
good vectors, but since the definition involves a cross-product, it must be that B is a pseudo-vector. Indeed
B is itself a cross-product so the definition of F actually contains two cross-products.
The orientation (right-handed or left-handed) did not matter to us before but, now that weve defined
the cross-product with the right-hand rule, well typically choose right-handed bases. We dont have to,
geometrically speaking, but we need to from an algebraic point of view otherwise wed need two sets of
algebraic formula, one for right-handed bases and one for left-handed bases. In terms of our right-handed
cross product definition, we can define a right-handed basis by
e1 e2 = e3 ,

(41)

## then deduce geometrically

e2 e3 = e1 ,
e2 e1 = e3 ,

e3 e1 = e2 ,

e1 e3 = e2 ,

e3 e2 = e1 .

(42)
(43)

c
F.
Waleffe, Math 321, 2013/1/21

19

Note that (42) are cyclic rotations of the basis vectors in (41), i.e. (e1 , e2 , e3 ) (e2 , e3 , e1 ) (e3 , e1 , e2 ).
The orderings of the basis vectors in (43) do not correspond to cyclic rotations. For 3 elements, a cyclic
rotation corresponds to an even number of permutations. For instance to go from (e1 , e2 , e3 ) to (e2 , e3 , e1 )
we can first permute (switch) e1 e2 to obtain (e2 , e1 , e3 ) then permute e1 and e3 . The concept of even
and odd number of permutations is more general. But for three elements it is useful to think in terms of
cyclic and acyclic permutations.
If we expand a and b in terms of the right-handed e1 , e2 , e3 , then apply the 3 properties of the crossproduct i.e. in compact summation form
a=

3
X
i=1

ai ei ,

b=

3
X
j=1

bj ej ,

ab=

3 X
3
X

ai bj (ei ej ),

i=1 j=1

we obtain
a b = (a2 b3 a3 b2 )e1 + (a3 b1 a1 b3 )e2 + (a1 b2 a2 b1 )e3 .

(44)

Verify this result explicitly. What would the formula be if the basis was left-handed?
That expansion of the cross product with respect to a right-handed orthonormal basis (44) is often remembered using the formal determinant (i.e. this is not a true determinant, its just convenient mnemonics)

e1 e2 e3

a1 a2 a3 .

b1 b2 b3

8.2

## Double cross product (Triple vector product)

I Exercise: Visualize the vector (a b) a. Sketch it. What are its geometric properties? What is its
magnitude?
Double cross products3 occur frequently in applications (e.g. angular momentum of a rotating body)
directly or indirectly (recall above discussion about mirror reflection and cross-products in physics). They
have simple expressions
(a b) c = (a c) b (b c) a,
(45)
a (b c) = (a c) b (a b)c.
One follows from the other after some manipulations and renaming of vectors, but we can remember both
at once as: middle vector times dot product of the other two minus other vector within parentheses times dot
product of the other two. 4
Lets visualize the geometry of the double product: (a b) is orthogonal to both a and b, and v =
(a b) c is orthogonal to (a b) so v is in the a, b plane and v = a + b for some scalars and .
Now v is also orthogonal to c, so v c = (a c) + (b c) = 0, therefore = (b c) and = (a c) for
some scalar and v = (b c)a (a c)b. This almost gives the formula (45) but we still need to show
that = 1.
Write c = ck +c , where ck is parallel, and c perpendicular, to ab,
so v = (a b) c and a c = a c , b c = b c , since parallel to
v
b
a b means perpendicular to a and b. So a, b, v and c are all in the
plane perpendicular to a b. Thats easier to visualize. To determine

a
in v = (bc )a(ac )b, take the cross product with b to obtain
v b = (b c ) (a b). Now from the figure and the definition of v
we have v b = (a b)|c | |b| sin , and the figure also shows that
c
bc = |b| |c | cos( 2 ), therefore = 1 since sin = cos(/2)
ab
(with positive clockwise). [This derivation can be improved]
3 The

double vector product is often called triple vector product, there are 3 vectors but only 2 vector products!
is more useful than the confusing BAC-CAB rule for remembering the 2nd. Try applying the BAC-CAB mnemonic
to (b c) a for confusing fun!
4 This

20
Exercises
1. Show that |a b|2 + (a b)2 = |a|2 |b|2 , a, b.
2. A particle of charge q moving at velocity v in a magnetic field B experiences the Lorentz force F =
qv B. Show that there is no force in the direction of the magnetic field and that the Lorentz force
does no work on the particle. Does it have any effect on the particle?
3. Sketch three vectors such that a + b + c = 0, show that a b = b c = c a in two ways (1) from the
geometric definition of the cross product and (2) from the algebraic properties of the cross product.
Deduce the law of sines relating the sines of the angles of a triangle and the lengths of its sides.
4. Consider any three points P1 , P2 , P3 in 3D Euclidean space. Show that 12 (r 1 r 2 + r 2 r 3 + r 3 r 1 ) is
a vector whose magnitude is the area of the triangle and is perpendicular to the triangle in a direction
determined by the ordering P1 , P2 , P3 and the right hand rule.
5. Consider an arbitrary non-self intersecting quadrilateral in the (x, y) plane with vertices P1 , P2 , P3 , P4 .
Show that 12 (r 1 r 2 + r 2 r 3 + r 3 r 4 + r 4 r 1 ) is a vector whose magnitude is the area of the
z direction depending on whether P1 , P2 , P3 , P4 are oriented counterclockwise, or clockwise, respectively. What is the compact formula for the area of the quadrilateral in
terms of the (x, y) coordinates of each point? What happens if the points are in a plane but not the
(x, y) plane?
6. Four arbitrary points in 3D Euclidean space E3 are the vertices of an arbitrary tetrahedron. Consider
the area vectors perpendicular pointing outward for each of the faces of the tetrahedron, with magnitudes equal to the area of the corresponding triangular face. Make sketch. Show that the sum of these
area vectors is zero.
7. Vector c is the (right hand) rotation of vector b about e3 by angle . Find the components of c in
terms of the components of b.
8. Vector c is the (right hand) rotation of vector b about a by angle . Find c in terms of a, b, and
basic vector operations. Find c if a (a1 , a2 , a3 ) and b = (b1 , b2 , b3 ), in cartesian components. [Hint:
consider the intrinsic basis a
, b and a
b ]
9. Show by (1) cross product geometry and (2) cross product algebra that all the vectors x such that
a x = b have the form
ba
x = a +
, R
kak2
10. Show the Jacobi identity:

a (b c) + b (c a) + c (a b) = 0.

11. If n
is any unit vector, show algebraically and geometrically that any vector a can be decomposed as
a = (a n
)
n+n
(a n
) ak + a .

(46)

## The first component is parallel to n, the second is perpendicular to n.

12. A left-handed basis e1 0 , e2 0 , e3 0 , is defined by ei 0 ej 0 = ij and e1 0 e2 0 = e3 0 . Show that
(ei 0 ej 0 ) ek 0 has the opposite sign to the corresponding expression for a right-handed basis, i, j, k
(the definition of the cross-product remaining its right-hand rule self). Thus deduce that the formula
for the components of the cross-product in the left handed basis would all change sign.
13. Prove (45) using the intrinsic right-handed orthonormal basis e1 = a/|a|, e3 = (a b)/|a b| and
e2 = e3 e1 . Then a = a1 e1 , b = b1 e1 + b2 e2 , c = c1 e1 + c2 e2 + c3 e3 . Visualize and explain why this
is a general result and therefore a proof of the double cross product identity.
14. Prove (45) using the right-handed orthonormal basis e1 = c /|c |, e3 = (ab)/|ab| and e2 = e3 e1 .
In that basis a = a1 e1 + a2 e2 and b = b1 e1 + b2 e2 but what is c? Show by direct calculation that
a b = (a1 b2 a2 b1 )e3 and v = |c |(a1 b2 a2 b1 )e2 = |c |a1 b |c |b1 a. Why is this (a c)b (b c)a
and thus a proof of the identity?

c
F.
Waleffe, Math 321, 2013/1/21

9
9.1

21

Index notation
Levi-Civita (a.k.a. alternating or permutation) symbol

We have used the Kronecker symbol (33) to express all the dot products ei ej = ij in a very compact form.
There is a similar symbol, ijk , the Levi-Civita or alternating symbol, defined as

## if (i, j, k) is an even permutation of (1, 2, 3),

1
1
if (i, j, k) is an odd permutation of (1, 2, 3),
ijk =
(47)

0
otherwise,
or, explicitly: 123 = 231 = 312 = 1 and 213 = 132 = 321 = 1, all other ijk = 0. Recall that for 3
elements an even permutation is the same as a cyclic permutation, therefore ijk = jki = kij , i, j, k (why?).
The ijk symbol provides a compact expression for the components of the cross-product of right-handed basis
vectors:
(ei ej ) ek = ijk .
(48)
but since this is the k-component of (ei ej ) we can also write
(ei ej ) =

3
X

ijk ek .

(49)

k=1

Note that there is only one non-zero term in the latter sum (but then, why cant we drop the sum?). Verify
this result for yourself.

9.2

## Sigma notation, free and dummy indices

The expansion of vectors a and b in terms of basis e1 ,e2 ,e3 , a = a1 e1 +a2 e2 +a3 e3 and b = b1 e1 +b2 e2 +b3 e3 ,
can be written compactly using the sigma () notation
a=

3
X

ai ei ,

b=

i=1

3
X

bi ei .

(50)

i=1

We have introduced the Kronecker symbol ij and the Levi-Civita symbol ijk in order to write and
perform our basic vector operations such as dot and cross products in compact forms, when the basis is
orthonormal and right-handed, for instance using (33) and (49)
ab=

3 X
3
X

ai bj ei ej =

i=1 j=1

ab=

3 X
3
X

ai bj ij =

3
X

ai bi

(51)

ai bj ijk ek

(52)

i=1 j=1

3 X
3
X

ai bj ei ej =

i=1 j=1

3 X
3 X
3
X

i=1

## i=1 j=1 k=1

Note that i and j are dummy or summation indices in the sums (50) and (51), they do not have a specific
value, they have all the possible values in their range. It is their place in the particular expression and their
range that matters, not their name
a=

3
X
i=1

ai ei =

3
X
j=1

aj ej =

3
X

ak ek = 6=

k=1

3
X

Indices come in two kinds, the dummies and the free. Heres an example

3
X
ei (a b)c =
aj bj ci ,
j=1

ak ei

(53)

k=1

(54)

22
here j is a dummy summation index, but i is free, we can pick for it any value 1, 2, 3. Freedom comes with
constraints. If we use i on the left-hand side of the equation, then we have no choice, we must use i for ci
on the right hand side. By convention we try to use i, j, k, l, m, n, to denote indices, which are positive
integers. Greek letters are sometimes used for indices.
Mathematical operations impose some naming constraints however. Although, we can use the same index
name, i, in the expansions of a and b, when they appear separately as in (50), we cannot use the same index
name if we multiply them as in (51) and (52). Bad things will happen if you do, for instance
!
!
3
3
3
X
X
X
ab=
ai ei
bi ei =
ai bi ei ei = 0 (WRONG!)
(55)
i=1

9.3

i=1

i=1

## Einstein summation convention

While he was developing the theory of general relativity, Einstein noticed that many of the sums that occur
in calculations involve terms where the summation index appears twice. For example, i appears twice in the
single sums in (50), i and j appear twice in the double sum in (51) and i, j and k each appear twice in the
triple sum in (52). To facilitate such manipulations he dropped the signs and adopted the summation
convention that a repeated index implicitly denotes a sum over all values of that index. In a
letter to a friend he wrote I have made a great discovery in mathematics; I have suppressed the summation
sign every time that the summation must be made over an index which occurs twice. Thus with Einsteins
summation convention we write
a = ai ei ,

b = bj ej ,

a b = ai bi ,

a b = ijk ai bj ek ,

(56)

and any repeated index implies a sum over all values of that index. This is a very useful and widely used
notation but you have to use it with care and there are cases where it cannot be used. Indices can never be
repeated more than twice, if they are, thats probably a mistake as in (55), if not then you are out of luck
and need to use s or invent your own notation. A few common operations in the summation convention:
We love to see a ij involved in a sum since this collapses that sum. This is called the substitution rule, if
ij appears in a sum, we can forget about it and eliminate the summation index, for example
ai ij = aj ,

kl kl = kk = ll = 3,

ij ijk = iik = 0

(57)

note the second result kk = 3 because k is repeated, so there is a sum over all values of k and kk =
11 + 22 + 33 . The last result is because ijk vanishes whenever two indices are the same. That last
expression ij ijk involves a double sum over i and over j. The ij collapses one of those sums. It doesnt
matter which index we choose to eliminate since both are dummy indices. Lets compute the l component
of a b from (56). We pick l because i, j and k are already taken. The l component is
el (a b) = ijk ai bj ek el = ijk ai bj kl = ijl ai bj = lmn am bn

(58)

what happened on that last step? first, ijk = kij because (i, j, k) to (k, i, j) is a cyclic permutation which
corresponds to an even number of permutation in space of odd dimension (dimension 3, here) and the value
of ijk does not change under even permutations. Then i and j are dummies and we renamed them m and
n respectively being careful to keep the order. The final result is worth memorizing: if c = a b, the l
component of c is cl = lmn am bn , or switching indices to i, j, k
c = a b ci = ijk aj bk c = ei ijk aj bk .

(59)

In the spirit of no pain-no gain, lets write the double cross product identity (a b) c in this indicial
notation. Let v = a b then the i component of the double cross product v c is ijk vj ck . Now we need
the j component of v = a b. Since i and k are taken we use l, m as new dummy indices, and we have
vj = jlm al bm . So the i component of the double cross product (a b) c is
ijk jlm al bm ck .

(60)

c
F.
Waleffe, Math 321, 2013/1/21

23

Note that j, k, l and m are repeated, so this expression is a quadruple sum! According to our double cross
product identity it should be equal to the i component of (a c)b (b c)a for any a, b, c. We want the i
component of the latter expression since i is a free index in (60), that i component is
(aj cj )bi (bj cj )ai

(61)

(wait! isnt j repeated 4 times? no, its not. Its repeated twice in separate terms so this is a difference
of two sums over j). Since (60) and (61) are equal to each other for any a, b, c, this should be telling us
something about ijk , but to extract that out we need to rewrite (61) in the form al bm ck . How? by making
use of our ability to rename dummy variables and adding variables using ij . Lets look at the first term in
(61), (aj cj )bi , heres how to write it in the form al ck bm as in (60):
(aj cj )bi = (ak ck )bi = (lk al ck )(im bm ) = lk im al ck bm .

(62)

Do similar manipulations to the second term in (61) to obtain (bj cj )ai = il km al ck bm and
ijk jlm al bm ck = (lk im il km )al ck bm .

(63)

Since this equality holds for any al , ck , bm , we must have ijk jlm = (lk im il km ). Thats true but its not
written in a nice way so lets clean it up to a form thats easier to reconstruct. First note that ijk = jki since
ijk is invariant under a cyclic permutation of its indices. So our identity becomes jki jlm = (lk im il km ).
Weve done that flipping so the summation index j is in first place in both  factors. Now we prefer the
lexicographic order (i, j, k) to (j, k, i) so lets rename all the indices being careful to rename the correct
indices on both sides. This yields
ijk ilm = jl km jm kl

(64)

This takes some digesting. Go through it carefully again. And again, as many times as it takes. The identity
(64) is actually pretty easy to remember and verify. First, ijk ilm is a sum over i but there is never more
than one non-zero term (why?). Second, the only possible values for that expression are +1, 0 and 1
(why?). The only way to get 1 is to have (j, k) = (l, m) with j = l 6= k = m (why?), but in that case the
right hand side of (64) is also 1 (why?). The only way to get 1 is to have (j, k) = (m, l) with j = m 6= k = l
(why?), but in that case the right hand side is 1 also (why?). Finally, to get 0 we need j = k or l = m and
the right-hand side again vanishes in either case. For instance, if j = k then we can switch j and k in one
of the terms and jl km jm kl = jl km km jl = 0.
Formula (64) has a generalization that does not include summation over one index
ijk lmn =il jm kn + im jn kl + in jl km
im jl kn in jm kl il jn km

(65)

note that the first line correspond to (i, j, k) and (l, m, n) matching up to cyclic rotations, while the second
line corresponds to (i, j, k) matching with an odd (acyclic) rotation of (l, m, n).

Exercises
1. Explain why ijk = jki = jik for any integer i, j, k.
2. Using (48) and Einsteins notation show that (a b) ek = ijk ai bj and (a b) = ijk ai bj ek =
ijk aj bk ei .
3. Show that ijk ljk = 2il by direct deduction and by application of (64).
4. Deduce (64) from (65).

24

10

## Mixed (or Box) product and Determinant

A mixed product5 of three vectors has the form (a b) c, it involves both a cross and a dot product. The
result is a scalar. We have already encountered mixed products (e.g. eqn. (48)) but their geometric and
algebraic properties are so important that they merit their own subsection.
(a b) c = (b c) a = (c a) b =
a (b c) = b (c a) = c (a b) =
volume of the parallelepiped spanned by a, b, c

(66)

ab

c
b
a
Take a and b as the base of the parallelepiped then a b is perpendicular to the base and has magnitude
equal to the base area. The height is z c where z is the unit vector perpendicular to the base, i.e. parallel to
a b. So the volume is (a b) c. Signwise, (a b) c > 0 if a, b and c, in that order, form a right-handed
basis (not orthogonal in general), and (a b) c < 0 if the triplet is left-handed. Taking b and c, or c and
a, as the base, you get the same volume and sign. The dot product commutes, so (b c) a = a (b c),
yielding the identity
(a b) c = a (b c).
(67)
We can switch the dot and the cross without changing the result. We have shown (66) geometrically. The
properties of the dot and cross products yield many other results such as (a b) c = (b a) c, etc. We
can collect all these results as follows.
A mixed product is one form of a scalar function of three vectors called the determinant
(a b) c , det(a, b, c),

(68)

whose value is the signed volume of the parallelepiped with sides a, b, c. The determinant has three
fundamental properties
1. it changes sign if any two vectors are permuted, e.g.
det(a, b, c) = det(b, a, c) = det(b, c, a),

(69)

## 2. it is linear in any of its vectors e.g. , d,

det(a + d, b, c) = det(a, b, c) + det(d, b, c),

(70)

## 3. if the triplet e1 , e2 , e3 is right-handed and orthonormal then

det(e1 , e2 , e3 ) = 1.

(71)

5 The mixed product is called the triple scalar product by some authors, but there are only 2 products and they apply to
three vectors.

c
F.
Waleffe, Math 321, 2013/1/21

25

Deduce these from the properties of the dot and cross products as well as geometrically. Property (70) is
a combination of the distributivity properties of the dot and cross products with respect to vector addition
and multiplication by a scalar. For example,

det(a + d, b, c) = (a + d) (b c) = a (b c) + d (b c)
= det(a, b, c) + det(d, b, c).
From these three properties, you deduce easily that the determinant is zero if any two vectors are identical
(from prop 1), or if any vector is zero (from prop 2 with = 1 and d = 0), and that the determinant does
not change if we add a multiple of one vector to another, for example
det(a, b, a) = 0,
det(a, 0, c) = 0,
det(a + b, b, c) = det(a, b, c).

(72)

Geometrically, this last one corresponds to a shearing of the parallelepiped, with no change in volume or
orientation. One key application of determinants is
det(a, b, c) 6= 0 a, b, c form a basis.

(73)

If det(a, b, c) = 0 then either one of the vectors is zero or they are co-planar and a, b, c cannot provide a basis
for vectors in E3 . This is how the determinant is introduced in elementary linear algebra, it determines
whether a system of linear equations has a solution or not. But the determinant is so much more, it
determines the volume of the parallelepiped and its orientation!
The 3 fundamental properties fully specify the determinant as explored in exercises 5, 6 below. If the
vectors are expanded in terms of a right-handed orthonormal basis, i.e. a = ai ei , b = bj ej , c = ck ek
(summation convention), then we obtain the following formula for the determinant in terms of the vector
components
det(a, b, c) = (a b) c = ai bj ck (ei ej ) ek = ijk ai bj ck .
(74)
Expanding that expression
ijk ai bj ck = a1 b2 c3 + a2 b3 c1 + a3 b1 c2 a2 b1 c3 a3 b2 c1 a1 b3 c2 ,

(75)

## we recognize the familiar algebraic determinants

a1

det(a, b, c) = ijk ai bj ck b1
c1

a2
b2
c2

a3
b3
c3

a1

= a2

a3

b1
b2
b3

c1
c2
c3

(76)

Note that it does not matter whether we put the vector components along rows or columns. This is a
non-trivial and important property of determinants. (see section on matrices).
This familiar determinant has the same three fundamental properties (69), (70), (71) of course
1. it changes sign if any two columns (or

a1

a2

a3

b1 a1 c1
b1 c1

b2 c2 = b2 a2 c2
b3 a3 c3
b3 c3

## 2. it is linear in any of its columns (or rows) e.g.

a1 + d1 b1 c1

a2 + d2 b2 c2 =

a3 + d3 b3 c3

, (d1 , d2 , d3 ),

a1 b1 c1 d1
a2 b2 c2 + d2
a3 b3 c3 d3

(77)

b1
b2
b3

c1
c2
c3

(78)

26
3. finally, the determinant of the natural basis

is
1
0
0

0
1
0

0
0
1

= 1,

(79)

You deduce easily from these three properties that the det vanishes if any column (or row) is zero or if any two
columns (or rows) is a multiple of another, and that the determinant does not change if we add to one column
(row) a linear combination of the other columns (rows). These properties allow us to calculate determinants
by successive shearings and column-swapping. There is another explicit formula for determinants, in addition
to the ijk ai bj ck formula, it is the Laplace (or Cofactor) expansion in terms of 2-by-2 determinants, e.g.

a1 b1 c1

a2 b2 c2 = a1 b2 c2 a2 b1 c1 + a3 b1 c1 ,
(80)
b2 c1
b3 c3
b3 c3

a3 b3 c3
where the 2-by-2 determinants are

a1

a2

b1
= a1 b2 a2 b1 .
b2

(81)

This formula is nothing but a (b c) expressed with respect to a right handed basis. To verify that,
compute the components of (b c) first, then dot with the components of a. This cofactor expansion
formula can be applied to any column or row, however there are 1 factors that appear. We wont go into
the straightforward details, but all that follows directly from the column swapping property (77). Thats
essentially the identities a (b c) = b (c a) = .

Exercises

a b
1. Show that the 2-by-2 determinant 1 1 = a1 b2 a2 b1 , is the signed area of the parallelogram
a2 b2
with sides a = a1 e1 + a2 e2 , b = b1 e1 + b2 e2 . It is positive if a, b, a, b is a counterclockwise cycle,
negative if the cycle is clockwise. Sketch (of course).
2. The determinant det(a, b, c) of three oriented line segments a, b, c is a geometric quantity. Show that
det(a, b, c) = |a| |b| |c| sin cos . Specify and . Sketch.
3. Show that |a| |b| |c| det(a, b, c) |a| |b| |c|. When do the equalities apply? Sketch.
4. Use properties (69) and (70) to show that
det(a + d, b + e, c) =
det(a, b, c) + det(a, e, c) + det(d, b, c) + det(d, e, c).
5. Use properties (69) and (71) to show that det(ei , ej , ek ) = ijk .
6. Use property (70) and exercise 5 above to show that if a = ai ei , b = bi ei , c = ci ei (summation
convention) then det(a, b, c) = ijk ai bj ck .
7. Prove the identity (a b) (c d) = (a c)(b d) (a d)(b c) using both vector identities and indicial
notation.
8. Express (a b) (a b) in terms of dot products of a and b.
9. Show that (a a)(b b) (a b)2 is the square of the area of the parallelogram spanned by a and b.
10. If A is the area the parallelogram with sides a and b, show that

aa ab
2

.
A =
ab bb

c
F.
Waleffe, Math 321, 2013/1/21

27

11. If det(a, b, c) 6= 0, then any vector v can be expanded as v = a + b + c. Find explicit expressions
for the components , , in terms of v and the basis vectors a, b, c in the general case when the
latter are not orthogonal. [Hint: project on cross products of the basis vectors, then collect the mixed
products into determinants and deduce Cramers rule].
12. Given three vectors a1 , a2 , a3 such that D = a1 (a2 a3 ) 6= 0, define
a01 = (a2 a3 )/D,

(82)

## This is the reciprocal basis of the basis a1 , a2 , a3 .

(i) Show that ai a0j = ij , i, j = 1, 2, 3.
P3
P3
(ii) Show that if v = i=1 vi ai and v = i=1 vi0 a0i , then vi = v a0i and vi0 = v ai . So the components
in one basis are obtained by projecting onto the other basis.
13. If a and b are linearly independent and c is any arbitrary vector, find , and such that c =
a + b + (a b). Express , and in terms of dot products only. [Hint: find and first, then
use ck = c c .]
14. Express (a b) c in terms of dot products of a, b and c only. [Hint: solve problem 13 first.]
15. Provide an algorithm to compute the volume of the parallelepiped (a, b, c) by taking only dot products. [Hint: rectify the parallelepiped (a, b, c) (a, b , c ) (a, b , c ) where b and c are
perpendicular to a, and c is perpendicular to both a and b . Explain geometrically why these
transformations do not change the volume. Explain why these transformations do not change the
determinant by using the properties of determinants.]
16. (*) If V is the volume of the parallelepiped with sides a,

aa ab

2
V = b a b b
ca cb

b, c show that

a c
b c .
cc

Do this in several ways: (i) from problem 13, (ii) using indicial notation and the formula (65).

11

## Points, Lines, Planes, etc.

We discussed points, lines and planes in section 5 and reviewed the concepts of position vector OP = r =
r
r = + z z = x
x + yy
+ z z and parametric equations of lines and planes. Here, we briefly review implicit
equations of lines and planes and some applications of dot and cross products to points, lines and planes.
The center of
r c of a system of P
N particles of mass mi located at position r i , i = 1, . . . , N is defined
Pmass
N
N
by M r c = i=1 mi r i where M = i=1 mi is the total mass. This is a coordinate-free expression for
the center of mass. In particular, if all the masses are equal then for N = 2, r c = (r 1 + r 2 )/2, for
N = 3, r c = (r 1 + r 2 + r 3 )/3.
The vector equation of a line parallel to a passing through a point r 0 is
r = r 0 + a,

R (r r 0 ) a = 0

(83)

These are the (explicit) parametric equation r = r 0 + a, with parameter , and the implicit equation
(r r 0 ) a = 0 of a line.
The equation of a plane through r 0 parallel to a and b (with ab 6= 0), or (equivalently) perpendicular
to n a b is
r = r 0 + a + b, , R (r r 0 ) n = 0
(84)
These are the explicit, parametric equation of the plane with parameters and , and the implicit
equation of the plane.

28
The equation of a sphere of center r c and radius R is
|r r c | = R r = r c + R n
,

n s.t. |
n| = 1.

(85)

Exercises
1. Show that the center of gravity of three points of equal mass is at the point of intersection of the
medians of the triangle formed by the three points.
2. What are the equations of lines and planes in Cartesian coordinates?
3. Find vector equations for the line passing through the two points r 1 , r 2 and the plane through the
three points r 1 , r 2 , r 3 .
4. What is the distance between the point r 1 and the plane through r 0 perpendicular to a?
5. What is the distance between the point r 1 and the plane through r 0 parallel to a and b?
6. What is the distance between the line parallel to a that passes through point A and the line parallel
to b that passes through point B?
7. A particle was at point P1 at time t1 and is moving at the constant velocity v 1 . Another particle was
at P2 at t2 and is moving at the constant velocity v 2 . How close did the particles get to each other
and at what time? What conditions are needed for a collision?
8. Point C is obtained by rotating point B about the axis passing through point A, with direction a, by

angle (right hand rotation by about a). Find an explicit vector expression for OC in terms of OB,

OA, a and . Make clean sketches. Express your vector result in cartesian form.

12

12.1

## Change of Cartesian Basis

Consider two orthonormal bases (e1 , e2 , e3 ) and (e01 , e02 , e03 ) in 3D euclidean space. A vector v can be
expanded in terms of each bases as v = v1 e1 + v2 e2 + v3 e3 vi ei and v = v10 e1 0 + v20 e2 0 + v30 e3 0 vi0 e0i .
What are the relationships between the two sets of components (v1 , v2 , v3 ) and (v10 , v20 , v30 )?
In 2D, you can find the relations between the components (v1 , v2 )
and (v10 , v20 ) directly using geometry,

v2
v20

v
v10

+v2 sin ,

v20 = v1 sin

+v2 cos .

(86)

Can you?

v1
However, it is easier and more systematic to use the basis vectors
and vector operations. The starting point is the vector identity

e2

e02

v10 = v1 cos

(87)

e01

e1

## v10 = e01 v = v1 e01 e1 + v2 e01 e2

= v1 cos + v2 sin .

(88)

c
F.
Waleffe, Math 321, 2013/1/21

29

In 3D, a direct geometric derivation is quite challenging but using the direction vectors and vector algebra
is just as straightforward. The components (v1 , v2 , v3 ) 6= (v10 , v20 , v30 ) for distinct bases, but the geometric
vector v is independent of the choice of basis, thus
v = vi0 ei 0 = vj ej ,
using Einsteins summation convention, and the relationship between the two sets of coordinates are then
vi0 = ei 0 v = (ei 0 ej ) vj , Qij vj ,

(89)

## vi = ei v = (ei ej 0 ) vj0 = Qji vj0 ,

(90)

Qij , ei 0 ej = cos ij

(91)

and, likewise,
where we defined
where ij is the angle between the unit vectors e0i and ej and these coefficients Qij = cos ij are the direction
cosines. These 9 coefficients Qij are the elements of a 3-by-3 matrix Q, that is, a 3-by-3 table with the first
index i corresponding to the row index and the second index j to the column index

## Q11 Q12 Q13

e1 e1 e1 0 e2 e1 0 e3
 
Q Qij = Q21 Q22 Q23 = e2 0 e1 e2 0 e2 e2 0 e3
(92)
Q31 Q32 Q33
e3 0 e1 e3 0 e2 e3 0 e3
This is the Direction Cosine Matrix.
There are 9 direction cosines (in 3D space) but orthonormality of both bases imply many constraints, so
these coefficients are not independent. These constraints follow from eqns. (89), (90) which must hold for
any (v1 , v2 , v3 ) and (v10 , v20 , v30 ). Substituting (90) into (89) (watching out for dummy indices!) yields
vi0 = Qik Qjk vj0 ,

v 0 Qik Qjk = ij .

(93)

This means that the rows of matrix Q (92) are orthogonal to each other and of unit magnitude. Likewise,
substituting (89) into (90) gives
vi = Qki Qkj vj ,

v Qki Qkj = ij ,

(94)

and the columns of matrix Q (92) are also orthogonal to each other and of unit magnitude.
These two sets of orthogonal relationships can also be derived more geometrically as follows. The coefficient Qij = ei 0 ej is both the j component of ei 0 in the {e1 , e2 , e3 } basis, and the i component of ej in the
{e1 0 , e2 0 , e3 0 } basis. Therefore we can write
ei 0 = (ei 0 ek ) ek = Qik ek Qi1 e1 + Qi2 e2 + Qi3 e3 ,
in other words, e0i (Qi1 , Qi2 , Qi3 ) in the basis (e1 , e2 , e3 ), and this is the i-th row of matrix Q (92). Now
the dot product of 2 vectors a = ak ek and b = bl el is a b = ak bk since the basis (e1 , e2 , e3 ) is orthonormal,
so the dot product of e0i = Qik ek and e0j = Qjl el is
ei 0 ej 0 = Qik Qjk = ij ,

(95)

since (e01 , e02 , e03 ) is also orthonormal and therefore ei 0 ej 0 = ij . So the rows of Q are orthonormal because
they are the components of (e01 , e02 , e03 ) in the basis (e1 , e2 , e3 ).
Likewise,
ej = (ej ek 0 ) ek 0 = Qkj ek 0 Q1j e1 0 + Q2j e2 0 + Q3j e3 0 .
and ej (Q1j , Q2j , Q3j ) in the basis (e01 , e02 , e03 ), this corresponds to the j-th column of matrix Q (92).
Then
ei ej = Qki Qkj = ij ,
(96)
and the columns of Q are orthonormal because they are the components of (e1 , e2 , e3 ) in the basis (e01 , e02 , e03 )
.

30
Exercises
1. Find Q if (e01 , e02 , e03 ) is the right hand rotation of (e1 , e2 , e3 ) about e3 by angle . Verify (95) and
(96) for your Q. If v = vi ei , find the components of v in the basis (e01 , e02 , e03 ) in terms of (v1 , v2 , v3 )
and .
2. Find Q if (e01 , e02 , e03 ) is the right hand rotation of (e1 , e2 , e3 ) about e2 by angle . Verify (95) and (96)
3. Find Q if e01 = e1 , e02 = e3 , e03 = e2 . Verify (95) and (96) for your Q.
4. What is the value of the determinant of Q? Go back to section 10 to find out and explain.
5. If a and b are two arbitrary vectors and (e1 , e2 , e3 ) and (e01 , e02 , e03 ) are two distinct arbitrary bases,
we have already shown that ai bi = a0i b0i (summation convention) in section (36). Verify this invariance
directly from the transformation rule (89), vi0 = Qij vj , showing of your mastery of index notation.

12.2

Matrices

That Q was a very special matrix, namely an orthogonal matrix. More generally a 3-by-3 real matrix A is
a table of 9 real numbers

## A11 A12 A13

A [Aij ] A21 A22 A23 .
(97)
A31 A32 A33
Matrices are denoted by a capital letter, e.g. A and Q and by square brackets [ ]. By convention, vectors in
R3 are defined as 3-by-1 matrices e.g.

x1
x = x1 ,
x3
although for typographical reasons we often write x = (x1 , x2 , x3 ) but not [x1 , x2 , x3 ] which would denote
a 1-by-3 matrix, or row vector. The term matrix is similar to vectors in that it implies precise rules for
manipulations of these objects (for vectors these are the two fundamental addition and scalar multiplication
operations with specific properties, see Sect. 2).
Matrix-vector multiply
Equation (89) shows how matrix-vector multiplication should be defined. The matrix vector product Ax (A
3-by-3, x 3-by-1) is a 3-by-1 vector b in R3 whose i-th component is the dot product of row i of matrix A
with the column x,
Ax = b bi = Aij xj
(98)
where Aij xj Ai1 x1 +Ai2 x2 +Ai3 x3 in the summation convention. The product of a matrix with a (column)
vector is performed row-by-column. This product is defined only if the number of columns of A is equal to
the number of rows of x. A 2-by-1 vector cannot be multiplied by a 3-by-3 matrix.
Identity Matrix
There is a unique matrix such that Ix = x, x. For x R3 , show that

1
I= 0
0

0
1
0

0
0 .
1

(99)

c
F.
Waleffe, Math 321, 2013/1/21

31

Matrix-Matrix multiply
Two successive linear transformation of coordinates, that is,
x0i = Aij xj ,

then

## x00i = Bij x0j

(summation over repeated indices) can be combined into one transformation from xj to x00i
x00i = Bik Akj xj , Cij xj
where
Cij , (BA)ij = Bik Akj .

(100)

This defines matrix multiplication. The product of two matrices BA is a matrix, C say, whose (i, j) element
Cij is the dot product of row i of B with column j of A. As for matrix-vector multiplication, the product
of two matrices is done row-by-column. This requires that the number of columns of the first matrix in the
product (B) equals the number of rows of the second matrix (A). Thus, the product of a 3-by-3 matrix
and a 2-by-2 matrix is not defined, for instance. We can only multiply M-by-N by an N-by-P, that is inner
dimensions must match. In general, BA 6= AB, matrix multiplication does not commute. You can visualize
this by considering two successive rotation of axes, one by angle about e3 , followed by one by about e02 .
This is not the same as rotating by about e2 , then by about e03 . You can also see it algebraically
(BA)ij = Bik Akj 6= Aik Bkj = (AB)ij .
Matrix transpose
The transformation (90) involves the sum Aji x0j that is similar to the matrix vector multiply except that
the multiplication is column-by-column! To write this as a matrix-vector multiply, we define the transpose
matrix AT whose row i correspond to column i of A. If the (i, j) element of A is Aij then the (i, j) element
of AT is Aji
(AT )ij = (A)ji .
Then
xi = Aji x0j

x = AT x 0 .

(101)

## A symmetric matrix A is such that A = AT , but an anti-symmetric matrix A is such that A = AT .

B Show that the transpose of a product is equal to the product of the transposes in reverse order (AB)T =
B T AT .

12.3

## [Your favorite lecturer will twist and turn more in class]

Arbitrary matrices are typically denoted A, while orthogonal matrices are typically denoted Q in the
literature. In matrix notation, the orthogonality conditions (95), (96) read
QT Q = QQT = I.

(102)

A matrix that satisfy these relationships is called an orthogonal matrix (it should have been called orthonormal). A proper orthogonal matrix has determinant equal to 1 and corresponds to a pure rotation. An
improper orthogonal matrix has determinant -1. It corresponds to a combination of rotations and an odd
number of reflections. The product of orthogonal matrices is an orthogonal matrix.
This is useful as any 3-by-3 proper orthogonal matrix can be decomposed into the product of three
elementary rotations. There are several ways to define these elementary rotations but a common one that
corresponds to spherical coordinates is to (1) rotate by about e3 , (2) rotate by about e2 0 , (3) rotate by
about e3 00 . The 3 angles , , are called Euler angles. Our choice of angles here is typically labelled

32
ZY Z, since the successive bases rotation are around the z-axis, then y, then z. Hence a general 3-by-3
orthogonal matrix A can always be written as

cos
sin 0
cos 0 sin
cos sin 0
sin cos 0 .
1
0
Q = sin cos 0 0
(103)
0
0
1
sin 0 cos
0
0
1
To define an arbitrary orthogonal matrix, we can then simply pick any three arbitrary (Euler) angles
, , and construct an orthonormal matrix using (103). Another important procedure to do this is the
Gram-Schmidt procedure: pick any three a1 , a2 , a3 and orthonormalize them, i.e.
(1) First, define q 1 = a1 /ka1 k and a02 = a2 (a2 q 1 )q 1 , a03 = a3 (a3 q 1 )q 1 ,
(2) next, define q 2 = a02 /ka02 k and a003 = a03 (a03 q 2 )q 2 ,
(3) finally, define q 3 = a003 /ka003 k.
The vectors q 1 , q 2 , q 3 form an orthonormal basis. This procedure generalizes not only to any dimension
but also to other vector spaces, e.g. to construct orthogonal polynomials.
Exercises
1. Give explicit examples of 2-by-2 and 3-by-3 symmetric and antisymmetric matrices.
2. If xT = [x1 , x2 , x3 ], calculate xT x and xxT .
3. Show that xT x and xxT are symmetric (explicitly and by matrix manipulations).
4. If A is a square matrix of appropriate size, what is xT Ax?
5. Show that the product of two orthogonal matrices is an orthogonal matrix. Interpret geometrically.
6. What is the general form of a 3-by-3 orthogonal and symmetric matrix?
7. What is the orthogonal matrix corresponding to a reflection about the x z plane? What is its
determinant?
8. What is the most general form of a 2-by-2 orthogonal matrix?
9. Suppose that you would like to rotate an object (i.e. a set of points) about a given axis by an angle
. Can you explain how to do this? [Hint: (1) Translation: express coordinates of any point r with
respect to any point r 0 on the rotation axis: r r 0 . (2) Perform two elementary rotations to align
the vertical axis with the rotation axis, i.e. find the Euler angles and . Express the coordinates
of r r 0 in that new set of coordinates. (3) Rotate the vector by , this is equivalent to multiplying
by the transpose of the matrix corresponding to rotation of axes by . Then you need to re-express
the coordinates in terms of the original axes! thats a few multiplication by transpose of matrices you
10. What is the rotation matrix corresponding to rotation by about e2 ?
11. What is the matrix corresponding to (right-handed) rotation by angle about the direction e1 +e2 +e3 ?
12. Find the components of a vector a rotated by angle about the direction e1 + 2e2 + 2e3 .
13. Pick three non-trivial but arbitrary vectors in R3 (e.g. using Matlabs randn(3,3)), then construct an
orthonormal basis q 1 , q 2 , q 3 using the Gram-Schmidt procedure. Verify that the matrix Q = [q 1 , q 2 , q 3 ]
is orthogonal. Note in particular that the rows are orthogonal eventhough you orthogonalized the
columns only.
14. Pick two arbitrary vectors a1 , a2 in R3 and orthogonalize them to construct q 1 , q 2 . Consider the
3-by-2 matrix Q = [q 1 , q 2 ] and compute QQT and QT Q. Can you explain the results?

c
F.
Waleffe, Math 321, 2013/1/21

12.4

33

Determinant of a matrix

See earlier discussion of determinants (section on mixed product). The determinant of a matrix has the
explicit formula det(A) = ijk Ai1 Aj2 Ak3 , the only non-zero terms are for (i, j, k) equal to a permutation of
(1, 2, 3). We can deduce several fundamental properties of determinants from that formula. We can reorder
Ai1 Aj2 Ak3 into A1l A2m A3n using an even number of permutations if (i, j, k) is an even perm of (1,2,3) and
an odd number for odd permutations. So
det(A) = ijk Ai1 Aj2 Ak3 = lmn A1l A2m A3n = det(AT ).

(104)

## Another useful result is that

ijk Ail Ajm Akn = ijk lmn Ai1 Aj2 Ak3

(105)

## Then it is easy to prove that det(AB) = det(A) det(B):

det(AB) = ijk Ail Bl1 Ajm Bm2 Akn Bn3 = ijk lmn Ai1 Aj2 Ak3 Bl1 Bm2 Bn3 = det(A) det(B)

(106)

One nice thing is that these results and manipulations generalize straightforwardly to any dimension.

12.5

Three views of Ax = b

Column View
I View b as a linear combination of the columns of A.
Write A as a row of columns, A = [a1 , a2 , a3 ], where aT1 = [a11 , a21 , a31 ] etc., then
b = Ax = x1 a1 + x2 a2 + x3 a3
and b is a linear combination of the columns a1 , a2 , a3 . If x is unknown, the linear system of equations
Ax = b will have a solution for any b if and only if the columns form a basis, i.e. iff det(a1 , a2 , a3 )
det(A) 6= 0. If the determinant is zero, then the 3 columns are in the same plane and the system will have
a solution only if b is also in that plane.
As seen in earlier exercises, we can find the components (x1 , x2 , x3 ) by thinking geometrically and projecting on the reciprocal basis e.g.
x1 =
Likewise
x2 =

det(b, a2 , a3 )
b (a2 a3 )

.
a1 (a2 a3 )
det(a1 , a2 , a3 )

det(a1 , b, a3 )
,
det(a1 , a2 , a3 )

x3 =

(107)

det(a1 , a2 , b)
.
det(a1 , a2 , a3 )

This is a nifty formula. Component xi equals the determinant where vector i is replaced by b divided
by the determinant of the basis vectors. You can deduce this directly from the algebraic properties of
determinants, for example,
det(b, a2 , a3 ) = det(x1 a1 + x2 a2 + x3 a3 , a2 , a3 ) = x1 det(a1 , a2 , a3 ).
This is Cramers rule and it generalizes to any dimension, however computing determinants in higher
dimensions can be very costly and the next approach is computationally much more efficient.
Row View:
I View x as the intersection of planes perpendicular to the rows of A.
View A as a column of rows, A = [n1 , n2 , n3 ]T , where nT1 = [a11 , a12 , a13 ] is the first row of A, etc., then

T
n1
n1 x = b1
n2 x = b2
b = Ax = nT2 x

n3 x = b3
nT3

34
and x is seen as the position vector of the intersection of three planes. Recall that n x = C is the equation
of a plane perpendicular to n and passing through a point x0 such that n x0 = C, for instance the point
x0 = Cn/knk.
To find x such that Ax = b, for given b and A, we can combine the equations in order to eliminate
unknowns, i.e.

n1 x = b1
n1 x = b1

n2 x = b2
(n2 2 n1 ) x = b2 2 b1

n3 x = b3
(n3 3 n1 ) x = b3 3 b1
where we pick 2 and 3 such that the new normal vectors n02 = n2 2 n1 and n03 = n3 3 n1 have a
zero 1st component i.e. n02 = (0, a022 , a023 ), n03 = (0, a032 , a033 ). At the next step, one defines a n003 = n03 3 n02
picking 3 so that the 1st and 2nd components of n003 are zero, i.e. n003 = (0, 0, a0033 ). And the resulting system
of equations is then easy to solve by backward substitution. This is Gaussian Elimination which in general
requires swapping of equations to avoid dividing by small numbers. We could also pick the s and s to
orthogonalize the ns, just as in the Gram-Schmidt procedure. That is better in terms of roundoff error and
does not require equation swapping but is computationally twice as expensive as Gaussian elimination.
Linear Transformation of vectors into vectors
I View b as a linear transformation of x.
Here A is a black box that transforms the vector input x into the vector output b. This is the most
general view of Ax = b. The transformation is linear, this means that
A(x + y) = (Ax) + (Ay),

, R, x, y Rn

(108)

This can be checked directly from the explicit definition of matrix-vector multiply:
X

Aik (xk + yk ) =

Aik xk +

Aik yk .

This linearity property is a key property because if A is really a black box (e.g. the matrix is not actually
known, its just a machine that takes a vector and spits out another vector) we can figure out the effect of
A onto any vector x once we know Ae1 , Ae2 , . . . , Aen .
This transformation view of matrices leads to the following extra rules of matrix manipulations.
X
X
X
Ax + Bx = (A + B)x
Aik xk +
Bik xk =
(Aik + Bik )xk , xk
(109)
k

## so matrices are added components by components and A + B = B + A, (A + B) + C = A + (B + C). The

zero matrix is the matrix whose entries are all zero.
Matrix-scalar multiply
A(x) = (A)x

X
k

Aik (xk ) =

X
(Aik )xk , , xk

(110)

so multiplication by a scalar is also done component by component and (A) = ()A = (A).
In other words, matrices can be seen as elements of a vector space! This point of view is also useful
in some instances (in fact, computer languages like C and Fortran typically store matrices as long vectors.
Fortran stores it column by column, and C row by row). The set of orthogonal matrices does NOT form a
vector space because the sum of two orthogonal matrices is not, in general, an orthogonal matrix. The set
of orthogonal matrices is a group, the orthogonal group O(3) (for 3-by-3 matrices). The special orthogonal
group SO(3) is the set of all 3-by-3 proper orthogonal matrices, i.e. orthogonal matrices with determinant
=+1 that correspond to pure rotation, not reflections. The motion of a rigid body about its center of inertia
is a motion in SO(3), not R3 . SO(3) is the configuration space of a rigid body.

c
F.
Waleffe, Math 321, 2013/1/21

35

Exercises
B Pick a random 3-by-3 matrix A and a vector b, ideally in matlab using its A=randn(3,3), b=randn(3,1).
Solve Ax = b using Cramers rule and Gaussian Elimination. Ideally again in matlab, unless punching
numbers into your calculator really turns you on. Matlab knows all about matrices and vectors. To compute
det(a1 , a2 , a3 ) = det(A) and det(b, a2 , a3 ) in matlab, simply use det(A), det(b,A(:,2),A(:,3)). Type
help matfun, or help elmat, and or demos for a peek at all the goodies in matlab.

12.6

## Problem: Given a matrix A, find x 6= 0 and such that

Ax = x.

(111)

These special vectors are eigenvectors for A. They are simply shrunk or elongated by the transformation A.
The scalar is the eigenvalue. The eigenvalue problem can be rewritten
(A I)x = 0
where I is the identity matrix of the same size as A. This will have a non-zero solution iff
det(A I) = 0.

(112)

This is the characteristic equation for . If A is n-by-n, it is a polynomial of degree n in called the
characteristic polynomial.

36

Chapter 2

Vector Calculus
1

## Vector function of a scalar variable

The position vector of a moving particle is a vector function r(t) of the scalar time t, for instance, a particle
moving a constant velocity v 0 has position vector r = r 0 + tv 0 = r(t) where r 0 is the position at t = 0. The
derivative of a vector function r(t) is defined as usual as the limit of a ratio
dr
r(t + h) r(t)
= lim
.
h0
dt
h

(1)

The derivative of the position vector is of course the instantaneous velocity vector v(t) = dr/dt, and for the
simple motion r = r 0 + tv 0 , dr/dt = v 0 . In general, a position vector function r(t) describes a curve in
three-dimensional space, the particle trajectory, and v(t) is tangent to that curve as we will discuss further in
section 5.1 below. The derivative of the velocity vector is the acceleration vector a(t) = dv/dt. We sometime
use Newtons dot notation for time derivatives: dr/dt r,
d2 r/dt2 = r, etc.
Rules for derivatives of vector functions are similar to those of simple functions. The derivative of a sum
of vector functions is the sum of the derivatives,
d
da db
(a + b) =
+
.
dt
dt
dt

(2)

## We can prove as in calc 1 the various product rules:

d
d
da
(a) =
a+ ,
dt
dt
dt

d
da
db
(a b) =
b+a
,
dt
dt
dt

(3)

da
db
d
(a b) =
b+a
,
dt
dt
dt

(4)

d
da
db
dc
[(a b) c] = (
b) c + (a
) c + (a b) ,
dt
dt
dt
dt

(5)

therefore

d
da
db
dc
det(a, b, c) = det( , b, c) + det(a, , c) + det(a, b, ).
(6)
dt
dt
dt
dt
All of these are as expected but the formula for the derivative of a determinant is worth noting because it
generalizes to any dimension.1
1 For

a
b1
d 1
a2 b2

dt a
b3
3

c1
c2
c3

a 1

= a 2

a 3

b1
b2
b3

c1
c2
c3

a1

+ a2

a
3

b 1
b 2
b 3

c1
c2
c3

and of course we could also take the derivatives along rows instead of columns.

37

a1

+ a2

a3

b1
b2
b3

c1
c2
c3

38

Exercises
1. Show that if u(t) is any vector with constant length, then
u

du
= 0,
dt

t.

(7)

The derivative of a vector of constant magnitude is orthogonal to the vector. [Hint: u u = u20 ]
2. If r(t) is not of constant magnitude, what is the geometric meaning of points where r dr/dt = 0?
Make sketches to illustrate such r(t) and points.
3. Show that d|a|/dt = a
da/dt for any vector function a(t). Make a sketch to illustrate.
4. If v(t) = dr/dt show that d(r v)/dt = r dv/dt. In mechanics, r mv , L is the angular momentum
of the particle of mass m and velocity v with respect to the origin.
We now illustrate all these concepts and results by considering the basic problems of classical mechanics:
motion of a particle and motion of a rigid body.

Motion of a particle

## In classical mechanics, the motion of a particle of mass m is governed by Newtons law

F = ma,

(8)

where F is the resultant of the forces acting on the particle and a(t) = dv/dt = d2 r/dt2 is its acceleration,
with r(t) its position vector. Newtons law is a vector equation.
I Deduce the angular momentum law dL/dt = T where T = r F is the torque (see exercise 4 above).
Free motion
If F = 0 then a = dv/dt = 0 so the velocity of the particle is constant, v(t) = v 0 say, and its position is given
by the vector differential equation dr/dt = v 0 whose solution is r(t) = r 0 + tv 0 where r 0 is a constant of
integration which corresponds to the position of the particle at time t = 0. The particle moves in a straight
line through r 0 parallel to v 0 .
Constant acceleration
dv
d2 r
=
= a(t) = a0
dt2
dt
where a0 is a time-independent vector. Integrating we find
v(t) = a0 t + v 0 ,

r(t) = a0

t2
+ v0 t + r0
2

(9)

(10)

where v 0 and r 0 are vector constants of integration. They are easily interpreted as the velocity and position
at t = 0. The trajectory is a parabola passing through r 0 parallel to v 0 at t = 0. The parabolic motion is
in the plane through r 0 that is parallel to v 0 and a0 .
Uniform rotation
If a particle rotates with angular velocity about an axis parallel to n
that passes through the point r A
(|
n| = 1 and define > 0 for right-handed rotation about n
, < 0 for left-handed rotation) then its velocity
is
dr
v(t) =
= (r r A )
(11)
dt
where = n is the rotation vector.
B Show that |r(t) r a | remains constant. Calculate the particle acceleration if and r A are constants
and interpret geometrically. Find the force required to sustain such a motion according to Newtons law.

c
F.
Waleffe, Math 321, 2013/1/21

39

## Motion under a central force

A force F = F (r) r where r = |r| that always points toward the origin (if F (r) > 0, away if F (r) < 0 )
and depends only on the distance to the origin is called a central force. The gravitational force for planetary
motion and the Coulomb force in electromagnetism are of that kind. Newtons law for a particle submitted
to such a force is
dv
m
= F (r) r
(12)
dt
where v = dr/dt and r(t) = r
r is the position vector of the particle, hence both r and r are functions of
time t, in general. Motion due to such a force has two conserved quantities, angular momentum and energy.
1. Conservation of angular momentum F (r)
Take the cross product of (12) with r to obtain
r

d
dv
=0
(r v) = 0 r v = r 0 v 0 , L0 z
dt
dt

(13)

where L0 > 0 and z are constants (see exercise 4 in the previous section). So the motion remains in
the plane that passes through the origin O and is orthogonal to z (why?). Now r vdt = r dr =
2 dA(t) z = L0 zdt where dA(t) = 12 |r(t) dr(t)| = L0 /2 dt is the infinitesimal triangular area swept
by r in time dt. This yields
Keplers law: The radius vector sweeps equal areas in equal times.
2. Conservation of energy: kinetic + potential
Take the dot product of (12) with v to obtain
m


dv
d  vv
v + F (r) r v = 0
m
+ V (r) = 0,
dt
dt
2

(14)

where dV (r)/dr F (r) as by the chain rule dV (r)/dt = (dV /dr)(dr/dt) = (dV /dr) r v (see exercise
3 in the previous section) This implies that


|v|2
m
+ V (r) = E0
(15)
2
where E0 is a constant. The first term m|v|2 /2 is the kinetic energy and the second V (r) is the potential
energy which is defined up to an arbitrary constant. The constant E0 is the total conserved energy.
Note that V (r) and E0 can be negative but m|v|2 /2 0, so the physically admissible r domain is that
were V (r) is less than E0 .

## Motion of a system of particles (optional)

Consider N particles of mass mi at positions r i , i = 1, . . . , N . The net force acting on particle number i is
F i and Newtons law for each particle reads mi ri = F i . Summing over all is yields
N
X
i=1

mi ri =

N
X

F i.

i=1

Great cancellations occur on both sides. On the left side, let r i = r c + si , where r c is the center of mass
and si is the position vector of particle i with respect to the center of mass, then
X
X
X
X
mi r i =
mi (r c + si ) = M r c +
mi si
mi si = 0,
i

P
P
as, by definition of the
r c , where M = i mi is thePtotal mass.PIf the masses mi
P center of mass
P i mi r i = M P
are constants then i mi si = 0 i mi s i = 0 i mi si = 0. In that case, i mi ri = i mi (
r c + si ) =

40
P

mi rc = M rc . On the right-hand side, by action-reaction, all internal forces cancel out and the resultant
P
P (e)
is therefore the sum of all external forces only i F i = i F i = F (e) .
Therefore,
M rc = F (e)
(16)
i

where M is the total mass and F (e) is the resultant of all external forces acting on all the particles. The
motion of the center of mass of a system of particles is that of a single particle of mass M with position
vector r c under the action of the sum of all external forces. This is a fundamental theorem of mechanics.
There are also nice cancellations occurring for the motion about the center of mass. This involves
considering angular momentum and torques about the center of mass. Taking the cross-product of Newtons
law, mi ri = F i , with si for each particle and summing over all particles gives
X
X
si mi ri =
si F i .
i

On the left hand side, r i r c + si and the definition of center of mass implies
X

si mi ri =

si mi (
r c + si ) =

si mi si =

d
dt

mi si = 0. Therefore
!

si mi s i

This last expression is the rate of change of the total angular momentum about the center of mass
Lc

N
X

(si mi s i ) .

i=1

On the right hand side, one can argue that the (internal) force exerted by particle j on particle i is in the
direction of the relative position of j with respect to i, f ij ij (r i r j ). By action-reaction the force from
i onto j is f ji = f ij = ij (r i r j ), and the net contribution to the torque from the internal forces will
cancel out: r i f ij + r j f ji = 0. This is true with respect to any point and in particular, with respect to
the center of mass si f ij + sj f ji = 0. Hence, for the motion about the center of mass we have
dLc
= T c(e)
dt

(17)

P
where T (e) = i si F i is the net torque about the center of mass due to external forces only. This is
another fundamental theorem, that the rate of change of the total angular momentum about the center of
mass is equal to the total torque due to the external forces only.
B If f ij = (r i r j ) and f ji = (r j r i ), show algebraically and geometrically that si f ij +sj f ji = 0,
where s is the position vector from the center of mass.

## Motion of a rigid body (optional)

The two vector differential equations for motion of the center of mass and evolution of the angular momentum
about the center of mass are sufficient to fully determine the motion of a rigid body.
A rigid body is such that all lengths and angles are preserved within the rigid body. If A, B and C are

any three points of the rigid body, then AB AC = constant.
Kinematics of a rigid body
Consider a right-handed orthonormal basis, e1 (t), e2 (t), e3 (t) tied to the body. These vectors are functions
of time t because they are frozen into the body so they rotate with it. However the basis remains orthonormal
as all lengths and angles are preserved. Hence ei (t) ej (t) = ij i, j = 1, 2, 3, and t and differentiating
with respect to time
dei
dej
ej + ei
= 0.
(18)
dt
dt

c
F.
Waleffe, Math 321, 2013/1/21

41

In particular, as seen in an earlier exercise, the derivative of a unit vector is orthogonal to the vector:
el del /dt = 0, l = 1, 2, 3. So we can write
del
l el ,
dt

l = 1, 2, 3

(19)

as this guarantees that el del /dt = 0 for any l . Substituting this expression into (18) yields
( i ei ) ej + ei ( j ej ) = 0,
and rewriting the mixed products
(ei ej ) i = (ei ej ) j .

(20)

Now let
l

kl ek = 1l e1 + 2l e2 + 3l e3 ,

## so kl is the k component of vector l . Substituting in (20) gives

X
X
ijk ki =
ijk kj
k

(21)

where as before ijk (ei ej ) ek . The sums over k have at most one non-zero term. This yields the three
equations
(i, j, k) = (1, 2, 3) 31 = 32
(i, j, k) = (2, 3, 1) 12 = 13
(22)
(i, j, k) = (3, 1, 2) 23 = 21 .
The second equation, for instance, says that the first component of 2 is equal to the first component of
3 . Now ll is arbitrary according to (19) (why?), so we can choose to define 11 , the first component of
1 , for instance, equal to the first components of the other two vectors that are equal to each other, i.e.
11 = 12 = 13 . Likewise, pick 22 = 23 = 21 and 33 = 31 = 32 . This choice implies that
1 = 2 = 3

(23)

## The vector (t) is the Poisson vector of the rigid body.

The Poisson vector (t) gives the rate of change of any vector tied to the body. Indeed, if A and B are

any two points of the body then the vector c AB can be expanded with respect to the body basis e1 (t),
e2 (t), e3 (t)
c(t) = c1 e1 (t) + c2 e2 (t) + c3 e3 (t),
but the components ci c(t) ei (t) are constants because all lengths and angles, and therefore all dot
products, are time-invariant. Thus
3

X
dc X dei
=
ci
=
ci ( ei ) = c.
dt
dt
i=1
i=1
This is true for any vector tied to the body (material vectors), implying that the Poisson vector is unique
for the body.
Dynamics of rigid body
The center of mass of a rigid body moves according to the sum of the external forces as for a system of
particles. A continuous rigid body can be considered as a continuous distribution of infinitesimal masses
dm
Z
N
X
mi si
s dm
i=1

42
where the three-dimensional integral is over all points s in the domain V of the body (dm is the measure of
the infinitesimal volume element dV , or in other words dm = dV , where (s) is the mass density at point
s).
For the motion about the center of mass, the position vectors si are frozen into the body hence s i = si
for any point of the body. The total angular momentum for a rigid system of particles then reads
L=

mi si s i =

mi si ( si ) =


mi |si |2 si (si ) .

(24)

## and for a continuous rigid body

Z
L=


|s|2 s (s ) dm.

(25)

The Poisson vector is unique for the body, so it does not depend on s and we should be able Rto take it out
of the sum, or integral. Thats easy for the ksk2 term, but how can we get out of the s (s ) dm
term?! We need to introduce the concepts of tensor product and tensors to do this, but we can give a hint
by switching to index notation L Li , s si , |s| = s, i , with i = 1, 2, 3 and writing
Z
Li =


s2 i si (sj j ) dm =

Z



s2 ij si sj dm j , Jij j

(26)

where J is the tensor of inertia of the rigid body, independent of the rotation vector .

## Curves, Surfaces, Volumes and their integrals

5.1

Curves
v0

r0

r(t)

P
Recall the parametric equation of a line: r(t) = r 0 + tv 0 , where r(t) =

## OP is the position vector of a point P on the line with respect to some

origin O, r 0 is the position vector of a reference point on the line and
v 0 is a vector parallel to the line. Note that this can be interpreted as
the linear motion of a particle with constant velocity v 0 that was at
the point r 0 at time t = 0 and r(t) is the position at time t.

O
More generally, a vector function r(t) of a real variable t defines a curve C.
The vector function r(t) is the parametric representation of that curve and
t is the parameter. It is useful to think of t as time and r(t) as the position
of a particle at time t. The collection of all the positions for a range of t is
the particle trajectory. The vector r = r(t + h) r(t) is a secant vector
connecting two points on the curve, if we divide r by h and take the limit
as h 0 we obtain the vector dr/dt which is tangent to the curve at r(t).
If t is time, then dr/dt = v is the velocity.

C
dr/dt
r(t)

r(t + h)
O

c
F.
Waleffe, Math 321, 2013/1/21

43

The parameter can have any name and does not need to correspond
to time. For instance r(t) = x
a cos (t) + y
a sin (t), with (t) =
cos t and x
, y
orthonormal, is the position of a particle moving
around a circle of radius a, but the particle oscillates back and
forth around the circle. That same circle is more simply described
by
r() = x
a cos + y
a sin ,

dr/d
r()

## where is a real parameter that can be interpreted as the angle

between the position vector and the x
basis vector.
Remark: Note the common abuse of notation where r(t) and r() are two different vector functions.
Both represent points on the same curve but not the same points, for instance r(t = /2) = a
x while
r( = /2) = a
y . In principle, we should distinguish between the two vector functions writing r = f (t) and
r = g() with f (t) = g((t)), but in applications this quickly becomes cumbersome and a nice advantage of
this abuse of notation is that it fits nicely with the chain rule
dr d
dr
=
.
dt
d dt
Remark: A convenient parametrization, conceptually, is to use distance along the curve as the parameter,
often denoted s and called the arclength. Once a curve is specified and an s = 0 reference point has been
chosen then the distance s from that reference point along the curve uniquely defines a point on the curve,
e.g. for the highway 90 curve, s could be picked as distance from Chicago, positive westward. For the circle
of radius a, this could be the parametrization r(s) = x
a cos(s/a) + y
a sin(s/a), where s is now the distance
along the circle from the x-axis and s/a is the angle in radians.

5.2

## Integrals along curves, or line integrals

dr

C
Line element: Given a curve C, the line element denoted dr is
an infinitesimal secant vector. This is a useful shortcut for the
procedure of approximating the curve by a succession of secant
vectors r n = r n r n1 where r n1 and r n are two consecutive
points on the curve, with n = 1, 2, . . . , N integer, then taking the
limit max |r n | 0 (so N ). In that limit, the direction of
the secant vector r n becomes identical with that of the tangent
vector at that point. If an explicit parametric representation r(t)
is known for the curve then

r n

r n1

dr =

rn

dr(t)
dt
dt

(27)

R
The typical line integral along a curve C has the form C F dr where F (r) is a vector field, i.e. a vector
function of position. If F (r) is a force, this integral represent the net work done by the force on a particle as
the latter moves along the curve. We can make sense of this integral as the limit of a sum, namely breaking
up the curve into a chain of N secant vectors r n as above then
Z
F dr =
C

lim

max |r n |0

N
X

F n r n

(28)

n=1

where F n is an estimate of the average value of F along the segments r n1 r n . A simple choice is
F n = F (r n ) but better choices are the trapezoidal rule F n = 21 (F (r n ) + F (r n1 )), or the midpoint rule

44

1
RF n = F 2 (r n + r n1 ) . These different choices for F n give finite sums that converge to the same limit,
F dr, but the trapezoidal and midpoint rules will converge faster for nice functions, and give more
C
accurate finite sum approximations.
If an explicit representation r(t) is known then we can reduce the line integral to a regular Calc I integral:
Z

tb

F dr =
C

ta



dr(t)
dt,
F (r(t))
dt

(29)

where r(ta ) is the starting point of curve C and r(tb ) is its end point. These may be the same point even if
ta 6= tb (e.g. integral once around a circle from = 0 to = 2).
Likewise, we can use the limit-of-a-sum definition to make sense of many other types of line integrals
such as
Z
Z
Z
Z
f (r) |dr|,
F |dr|,
f (r) dr,
F dr.
C

The first one gives a scalar result and the latter three give vector results.
I One important example is
Z
N
X
|dr| = lim
|r n |
|r n |0

(30)

n=1

which is the length of the curve C from its starting point r a = r 0 to its end point r b = r N . If a parametrization
r = r(t) is known then

Z
Z tb
dr(t)

(31)
|dr| =
dt |dt|
C
ta
where r a = r(ta ) and r b = r(tb ). Thats almost a Calc I integral, except for that |dt|, what does that mean?!
Again you can understand that from the limit-of-a-sum definition with t0 = ta , tN = tb and tn = tn tn1 .
If ta < tb then tn > 0 and dt > 0, so |dt| = dt and were blissfully happy. But if tb < ta then tn < 0 and
dt < 0, so |dt| = dt and
Z tb
Z ta
( )|dt| =
( )dt,
if ta > tb .
(32)
ta

I A special example of a

R
C

tb

F dr integral is

Z
r dr =
C

lim

|r n |0

N
X
n=1

tb

r n r n =
ta



dr(t)
dt
r(t)
dt

## This integral yields a vector 2A

z whose magnitude is twice the area
A swept by the position vector r(t) when the curve C lies in a plane
perpendicular to z and O is in that plane (recall Keplers law that the
radius vector sweeps equal areas in equal times ). This follows directly
from the fact that r n r n = 2(A)
z is the area of the parallelogram
with sides r n and r n which is twice the area A of the triangle r n1 ,
r n , r n . If C and O are not coplanar then the vectors r n r n are
not necessarily in the same direction and their vector sum is not the
area swept by r. In that more general case, the surface Ris conical and
to calculate its area S we would need to calculate S = 12 C |r dr|.

(33)

r n

r n1

A
rn
z

Exercises:
1. What is the curve described by r(t) = a cos t x
+ a sin t y
+ bt z, where a, b and are constant real
numbers and x
, y
, z represent the orthonormal unit vectors for cartesian coordinates? What is the
tangent to the curve at r(t)?
2. Consider the vector function r() = r c + a cos e1 + b sin e2 , where r c , e1 , e2 , a and b are constants,
with ei ej = ij . What kind of curve is this? Next, assume that r c , e1 and e2 are in the same plane.

c
F.
Waleffe, Math 321, 2013/1/21

45

## Consider cartesian coordinates (x, y) in that plane such that r = x

x + yy
. Assume that the angle
between e1 and x
is . Derive the equation of the curve in terms of the cartesian coordinates (x, y)
(i) in parametric form, (ii) in implicit form f (x, y) = 0. Simplify your equations as much as possible.
Find a geometric interpretation for the parameter .
3. Generalize the previous exercise to the case where r c is not in the same plane as e1 and e2 . Consider
general cartesian coordinates (x, y, z) such that r = x
x + yy
+ z z. Assume that all the angles between
e1 and e2 and the basis vectors {
x, y
, z} are known. How many independent angles is that? Specify
those angles. Derive the parametric equations of the curve for the cartesian coordinates (x, y, z) in
terms of the parameter .
4. Derive integrals for the length and area of the planar curve in the previous exercise. Clean up your
integrals and compute them if possible.
R
R
5. Calculate C dr and C r dr along the curve of the preceding exercise from r(0) to r(3/2).
R
R
6. Calculate C F dr and C F dr with F = (
z r)/|
z r|2 when C is the circle of radius R in the
x, y plane centered at the origin. How do the integrals depend on R?

5.3

Surfaces

[You favorite lecturer will draw many pretty pictures, but you should learn to draw them all by yourself ]
Recall the parametric equation of a plane: r(u, v) = r 0 + u a + v b where r 0 is a point on the plane, a, b
are two vectors parallel to the plane (but not to each other) and u, v are real parameters. These parameters
specify coordinates for the surface. More generally, the parametric equation of a surface prescribes the
position vector r as a function of two real parameters, u and v say, r(u, v). Again, the names of the
parameters do not matter, r(s, t) is also common. The following two examples are fundamental.
I If the surface can be parametrized by the cartesian coordinates x and y, i.e. the surface is described
by z = h(x, y), then the position vector of a point on the surface is
r(x, y) = x x
+yy
+ h(x, y) z,

(34)

where x
, y
, z are the unit vectors in the respective coordinate directions.
I The surface of a sphere of radius R centered at r c can be parametrized by
r(u, v) = r c + e1 R cos u cos v + e2 R sin u cos v + e3 R sin v,

(35)

where ei ej = ij . Verify that |r r c | = R for any real u and v. The parameters u and v can be
interpreted as angles. With e3 as the polar axis, v is the latitude and u is the longitude. The parametric
representation r(, ) = r c + e1 R cos sin + e2 R sin sin + e3 R cos , is a different but equally valid
parametric representation of the same sphere. Here is the polar angle, or co-latitude. It is the angle
between the position vector and the polar axis e3 . The angle is the azimuthal, or longitudinal, angle.
Coordinate curves and Tangent vectors If one of the parameters is held fixed, v = v0 say, we obtain
a curve r(u, v0 ). There is one such curve for every value of v. For the sphere parametrized as in (35),
r(u, v0 ) is the v0 -parallel, the circle at latitude v0 . Likewise r(u0 , v) describes another curve. This would be
a longitude circle, or meridian, for the sphere. The set of all such curves generates the surface. These two
families of curves are parametric curves or coordinate curves for the surface. The vectors r/u and r/v
are tangent to their respective parametric curves and hence to the surface. These two vectors taken at the
same point r(u, v) define the tangent plane at that point. The coordinates are said to be orthogonal if the
tangent vectors r/u and r/v at each point r(u, v) are orthogonal to each other.
I Consider the sphere x2 +y 2 +z 2 = R2 . Parametrize the northern hemisphere z 0 using both (34) and
(35). Make 3D perspective sketches of the coordinate curves for both parametrizations. What happens at
the pole and and the equator? Calculate r/u and r/v for both parametrizations. Are the coordinates
orthogonal?

46
Normal to the surface at a point At any point r(u, v) on a surface, there is an infinity of tangent
directions but there is only one normal direction. The normal to the surface at a point r(u, v) is given by
N=

r
r

.
u v

(36)

Note that the ordering (u, v) specifies an orientation for the surface, i.e. an up and down side, and that
N is not a unit vector, in general.
Surface element: The surface element dS at a point r on a surface S is a vector perpendicular to
the surface at that point with a magnitude dS that is the area of an infinitesimal patch of surface at that
point. Just like in the case of the line element along a curve, this is a useful shortcut for the limit-of-a-sum
interpretation of integrals. The surface element is often written dS = n
dS where n
is the unit normal to
the surface at that point. If a parametric representation r(u, v) for the surface is known then


r
r

dudv,
(37)
dS =
u v
r
du and
since the right hand side represents the area of the parallelogram formed by the line elements u
r
6= N but n
= N /|N | since n
is a unit vector. Although we often need to refer to
v dv. Note that n
the unit normal n
, it is usually not needed to compute it explicitly since in practice it is the area element
dS = N dudv which is required.

Exercises:
1. Compute tangent vectors and the normal to the surface z = h(x, y). Show that r/x and r/y
are not orthogonal to each other in general. Determine the class of functions h(x, y) for which (x, y)
are orthogonal coordinates on the surface z = h(x, y) and interpret geometrically. Derive an explicit
formula for the area element dS = |dS| in terms of h(x, y).
2. Deduce from the implicit equation |r r c | = R for a sphere of radius R and center r c that (r r c )
r/u = (r r c ) r/v = 0 for any u and v, where r(u, v) is any parametrization of a point on
the sphere. Compute r/u and r/v for the parametrization (35) and verify that these vectors are
perpendicular to the radial vector r r c and to each other (when evaluated at the same point on the
sphere, whatever that point is). Orthogonality of r/u and r/v implies that those u and v are
orthogonal coordinates for the sphere. Compute the surface element dS for the sphere in terms of both
the longitude-latitude parametrization (35) and the longitude-colatitude parametrization , . Do the
surface elements point toward or away from the center of the sphere?
3. Explain why the surface described by r(u, v) = x
a cos u cos v + y
b sin u cos v + z c sin v where a, b and
c are real constants is the surface of an ellipsoid. Are u and v orthogonal coordinates for that surface?
Consider cartesian coordinates (x, y, z) such that r = x
x + yy
+ z z. Derive the implicit equation
f (x, y, z) = 0 satisfied by all such r(u, v)s.
4. Consider the vector function r(u, v) = r c +e1 a cos u cos v+e2 b sin u cos v+e3 c sin v, where r c , e1 , e2 , e3 ,
a, b and c are constants with ei ej = ij . What does the set of all such r(u, v)s represent? Consider
cartesian coordinates (x, y, z) such that r = x
x + yy
+ z z. Assume that all the angles between e1 ,
e2 , e3 and the basis vectors {
x, y
, z} are known. How many independent angles is that? Specify
which angles. Can you assume {e1 , e2 , e3 } is right handed? Explain. Derive the implicit equation
f (x, y, z) = 0 satisfied by all such r(u, v)s. Express your answer in terms of the minimum independent
angles that you specified earlier.
5. Explain why the surface of a torus (i.e. donut or tire) can be parametrized as x = (R + a cos ) cos ,
y = (R + a cos ) sin , z = a sin . Interpret the geometric meaning of the parameters R, a, and
. What are the ranges of and needed to cover the entire torus? Do these parameters provide
orthogonal coordinates for the torus? Calculate the surface element dS.

c
F.
Waleffe, Math 321, 2013/1/21

5.4

47

Surface integrals

R
The typical surface integral is of the form S v dS. This represents
the flux of v through the surface S. If
R
v(r) is the velocity of a fluid, water or air, at point r, then S v dS is the time-rate at which volume of fluid
is flowing through that surface per unit time. Indeed, v dS = (v n
) dS where n
is the unit normal to the
surface and v n
is the component of fluid velocity that is perpendicular to the surface. If that component is
zero, the fluid moves tangentially to the surface, not through the surface. Speed area = volume per unit
time, so v dS is the volume of fluid passing through the surface element dS per unit
R time at that point at
that time. The total volume passing through the entire surface S per unit time is S v dS. Such integrals
are often called flux integrals.
We can make sense of such integrals as the limit of a sum, for instance by picking a series of points on
the surface, then forming a triangular meshing of the surface and evaluating the integral as the limit of the
sum of v evaluated at the center of such triangles dotted with the unit normal to the triangle and times the
area of the triangle (making sure of course that for each triangle we pick its normal n
to correspond to the
same side of the surface as the neighboring elements).
If a parametric representation r(u, v) is known for the surface then we can also write

Z  
Z
r
r

dudv,
(38)
v
v dS =
u v
A
S
where A is the domain in the u, v parameter plane that corresponds to S.
As for line integrals, we can make sense of many other types of surface integrals such as
Z
p dS,
S

which would represent the netR pressure force on S if p = p(r) is the pressure at pointR r. Other surface
integrals could have the form S v dS, etc. In particular the total area of surface S is S |dS|.
Exercises:
1. Compute the percentage of surface area that lies north of the arctic circle on the earth (assume it is a
2. Provide an explicit integral for the total surface area of the torus of outer radius R and inner radius a.
R
3. Calculate S r dS where S is (i) the square 0 x, y a at z = b, (ii) the surface of the sphere of
radius R centered at (0, 0, 0), (iii) the surface of the sphere of radius R centered at x = x0 , y = z = 0.
R
4. Calculate S (r/r3 ) dS where S is the surface of the sphere of radius R centered at the origin. How
does the result depend on R?
5. The pressure outside the sphere of radius R centered at r c is p = p0 + Ar a
where a
is an arbitrary but
fixed unit vector and p0 and A are constants. The pressure inside the sphere is the constant p1 > p0 .
Calculate the net force on the sphere. Calculate the net torque on the sphere about its center r c and

5.5

## Volumes and volume integrals

We have seen that r(t) is the parametric equation of a curve, r(u, v) represents a surface, now we discuss
r(u, v, w) which is the parametric representation of a volume. Curves r(t) are one dimensional objects so
they have only one parameter t or each point on the known curve is determined by a single coordinate.
Surfaces are two-dimensional and require two parameters u and v, which are coordinates for points on that
surface. Volumes are three-dimensional objects that require three parameters u, v, w say. Each point is
specified by three coordinates.
I A sphere of radius R centered at r c has the implicit equation |r r c | R or (r r c ) (r r c ) R2
to avoid square roots. In cartesian coordinates this translates into the implicit equation
(x xc )2 + (y yc )2 + (z zc )2 R2 .

(39)

48
An explicit parametrization for that sphere is
r(u, v, w) = r c + e1 w cos u cos v + e2 w sin u cos v + e3 w sin v,

(40)

## r(r, , ) = r c + e1 r sin cos + e2 r sin sin + e3 r cos ,

(41)

or
where {e1 , e2 , e3 } are any three orthonormal vectors such that ei ej = ij . For the parametrization (40)
we recognize u and v as the longitude and latitude parameters used earlier for spherical surfaces (35) and
note that w has units of length. From orthonormality of {e1 , e2 , e3 } we deduce that (r r c ) (r r c ) = w2 ,
so w can be interpreted as the distance from the center of the sphere. To parametrize the entire sphere we
need to consider all the u, v and ws such that
0 u < 2,

v ,
2
2

0 w R.

(42)

The parametrization (41) is mathematically equivalent to this u, v, w parametrization but it is in the standard
form of spherical coordinates, with r representing the distance to the origin, the azimuthal (or longitude)
angle and the polar angle. To describe the full sphere we need
0 r R,

0 < 2,

0 .

(43)

Coordinate curves
For a curve r(t), all we needed to worry about was the tangent dr/dt and the line element dr = (dr/dt)dt.
For surfaces, r(u, v) we have two sets of coordinates curves with tangents r/u and r/v, a normal
N = (r/u) (r/v) and a surface element dS = N dudv. Now for volumes r(u, v, w), we have three
sets of coordinates curves with tangents r/u, r/v and r/w. A u-coordinate curve for instance,
corresponds to r(u, v, w) with v and w fixed. There is a double infinity of such one dimensional curves, one
for each v, w pair. For the parametrization (40), the u-coordinate curves correspond to parallels, i.e. circles
of fixed radius w at fixed latitude v. The v-coordinate curves are meridians, i.e. circles of fixed radius w
through the poles. The w-coordinate curves are radial lines out of the origin.
Coordinate surfaces
For volumes r(u, v, w), we also have three sets of coordinate surfaces corresponding to one parameter
fixed and the other two free. A w-isosurface for instance corresponds to r(u, v, w) for a fixed w. There is a
single infinity of such two dimensional (u, v) surfaces. For the parametrization (40) such surfaces correspond
to spherical surfaces of radius w centered at r c . Likewise, if we fix u but let v and w free, we get another
surface, and v fixed with u and w free is another coordinate surface.
Line Elements
Thus given a volume parametrization r(u, v, w) we can define four types of line elements, one for each
of the coordinate directions (r/u)du, (r/v)dv, (r/w)dw and a general line element corresponding to
the infinitesimal displacement from coordinates (u, v, w) to the coordinates (u + du, v + dv, w + dw). That
general line element dr is given by (chain rule):
dr =

r
r
r
du +
dv +
dw.
u
v
w

(44)

Surface Elements
Likewise, there are three basic types of surface elements, one for each coordinate surface. The surface
element on a w-isosurface, for example, is given by


r
r
dS w =

dudv,
(45)
u v
while the surface elements on a u-isosurface and a v-isosurface are respectively




r
r
r
r
dS u =

dvdw,
dS v =

dudw.
v
w
w u

(46)

c
F.
Waleffe, Math 321, 2013/1/21

49

Note that surface orientations are built into the order of the coordinates.
Volume Element
Last but not least, a parametrization r(u, v, w) defines a volume element given by the mixed (i.e. triple
scalar) product




r
r
r
r r r
dV =

du dv dw det
,
,
du dv dw.
(47)
u v
w
u v w
The definition of volume integrals as limit-of-a-sum should be obvious by now. If an explicit parametrization r(u, v, w) for the volume is known, we can use the volume element (47) and write the volume integral
in r space as an iterated triple integral over u, v, w. Be careful that there is an orientation implicitly built
into the ordering of the parameters, as should be obvious from the definition of the mixed product and
determinants. The volume element dV is usually meant to be positive so the sign of the mixed product
and the bounds of integrations for the parameters u, v and w must be chosen to respect that. (Recall the
definition of |dt| in the line integrals section).
Exercises
1. Calculate the line, surface and volume elements for the coordinates (40). You need to calculate 4 line
elements and 3 surfaces elements. One line element for each coordinate curve and the general line
element. Verify that these coordinates are orthogonal.
2. Formulate integral expressions in terms of the coordinates (40) and (41) for the surface and volume of
a sphere of radius R. Calculate those integrals.
3. A curve r(t) is given in terms of the (u, v, w) coordinates (40), i.e. r(t) = r(u, v, w) with (u(t), v(t), w(t))
for t = ta to t = tb . Find an explicit expression in terms of (u(t), v(t), w(t)) as a t-integral for the
length of that curve.
4. Find suitable coordinates for a torus. Are your coordinates orthogonal? Compute the volume of that
torus.

5.6

## Mappings, curvilinear coordinates

Parametrizations of curves, surfaces and volumes is essentially equivalent to the concepts of mappings and
curvilinear coordinates.
Mappings
The parametrization (35) for the surface of a sphere of radius R provides a mapping of that surface to the
0 u < 2, /2 v /2 rectangle in the (u, v) plane. In a mapping r(u, v) a small rectangle of sides
du, dv at a point (u, v) in the (u, v) plane is mapped to a small parallelogram of sides (r/u)du, (r/v)dv
at point r(u, v) in the Euclidean space.
The parametrization (40) for the sphere of radius R provides a mapping from the sphere of radius R in
Euclidean space to the box 0 u < 2, /2 v /2, 0 w R in the (u, v, w) space. In a mapping
r(u, v, w), the infinitesimal box of sides du, dv, dw located at point (u, v, w) in the (u, v, w) space is mapped
to a parallelepiped of sides (r/u)du, (r/v)dv, (r/w)dw at the point r(u, v, w) in the Euclidean space.
Curvilinear coordinates, orthogonal coordinates
The parametrizations r(u, v) and r(u, v, w) define coordinates for a surface or a volume, respectively. If the
coordinate curves are not straight lines one talks of curvilinear coordinates. These mappings define good
coordinates if the coordinate curves intersect transversally i.e. if the coordinate curves are not tangent to
each other. If the coordinate curves intersect transversally at a point then the coordinates tangent vectors at
that point provide linearly independent directions in the space of r. Tangent intersections at a point r would
imply that the tangent vectors are not linearly independent at that point and that (r/u) (r/v) = 0

50
at r(u, v) in the surface case or that det(r/u, r/v, r/w) = 0 at that point r(u, v, w) in the volume
case.
The coordinates (u, v, w) are orthogonal if the coordinate curves in r-space intersect at right angles. This
is the best kind of transversal intersection and these are the most desirable type of coordinates, however
non-orthogonal coordinates are sometimes more convenient for some problems. Two fundamental examples
of orthogonal curvilinear coordinates are
I Cylindrical (or polar) coordinates
x = cos ,

y = sin ,

z = z.

(48)

I Spherical coordinates
x = r sin cos ,

y = r sin sin ,

z = r cos .

(49)

Changing notation from (x, y, z) to (x1 , x2 , x3 ) and from (u, v, w) to (q1 , q2 , q3 ) a general change of
coordinates from cartesian (x, y, z) (x1 , x2 , x3 ) to curvilinear (q1 , q2 , q3 ) coordinates is expressed succinctly
by
xi = xi (q1 , q2 , q3 ), i = 1, 2, 3.
(50)
The position vector r can be expressed in terms of the qj s through the cartesian expression:
r(q1 , q2 , q3 ) = x
x(q1 , q2 , q3 ) + y
y(q1 , q2 , q3 ) + z z(q1 , q2 , q3 ) =

3
X

ei xi (q1 , q2 , q3 ),

(51)

i=1

where ei ej = ij . The qi coordinate curve is the curve r(q1 , q2 , q3 ) where qi is free but the other two variables
are fixed. The qi isosurface is the surface r(q1 , q2 , q3 ) where qi is fixed and the other two parameters are free.
The coordinate tangent vectors r/qi are key to the coordinates. They provide a natural vector basis
for those coordinates. The coordinates are orthogonal if these tangent vectors are orthogonal to each other
at each point. In that case it is useful to define the unit vector qi in the qi coordinate direction by

r
r
,

= hi qi ,
(52)
hi =
qi
qi
where hi is the the magnitude of the tangent vector in the qi direction, r/qi , and qi qj = ij for orthogonal
coordinates. These hi s are called the scale factors. Note that this decomposition of r/qi into a scale factor
hi and a direction qi clashes with the summation convention. The distance traveled in x-space when changing
qi by dqi , keeping the other qs fixed, is |dr| = hi dqi (no summation). The distance ds travelled in x-space
when the orthogonal curvilinear coordinates change from (q1 , q2 , q3 ) to (q1 + dq1 , q2 + dq2 , q3 + dq3 ) is
ds2 = dr dr = h21 dq12 + h22 dq22 + h23 dq32 .

(53)

Although the cartesian unit vectors ei are independent of the coordinates, the curvilinear unit vectors qi
in general are functions of the coordinates, even if the latter are orthogonal. Hence qi /qj is in general
non-zero. For orthogonal coordinates, those derivatives qi /qj can be expressed in terms of the scale factors
and the unit vectors.
For instance, for spherical coordinates (q1 , q2 , q3 ) (r, , ), the unit vector in the q1 r direction is the
vector
r(r, , )
=x
sin cos + y
sin sin + z cos ,
(54)
r
so the scale coefficient h1 hr = 1 and the unit vector
q1 r er = x
sin cos + y
sin sin + z cos .

(55)

## The position vector r can be expressed as

r = r r = x x
+yy
+ z z

3
X
i=1

xi e i .

(56)

c
F.
Waleffe, Math 321, 2013/1/21

51

## So its expression in terms of spherical coordinates and their unit vectors, r = r

r , is simpler than in cartesian
coordinates, r = x x
+y y
+z z, but there is a catch! The radial unit vector r = r(, ) varies in the azimuthal
and polar angle directions, while the cartesian unit vectors x
, y
, z are independent of the coordinates!
For orthogonal coordinates, the scale factors hi s determine everything. In particular, the surface and
volume elements can be expressed in terms of the hi s. For instance, the surface element for a q3 -isosurface
is
dS 3 = q3 h1 h2 dq1 dq2 ,
(57)
and the volume element
dV = h1 h2 h3 dq1 dq2 dq3 ,

(58)

assuming that q1 , q2 , q3 is right-handed. These follow directly from (45) and (47) and (52) when the coordinates are orthogonal.
Exercises
1. Find the scale factors hi and the unit vectors qi for cylindrical and spherical coordinates. Express the
3 surface elements and the volume element in terms of those scale factors and unit vectors. Sketch the
unit vector qi in the (x, y, z) space (use several views rather than trying to make an ugly 3D sketch!).
Express the position vector r in terms of the unit vectors qi . Calculate the derivatives qi /qj for all
i, j and express these derivatives in terms of the scale factors hk and the unit vectors qk , k = 1, 2, 3.
2. A curve in the (x, y) plane is given in terms of polar coordinates as = (). Deduce -integral
expressions for the length of the curve and for the area swept by the radial vector.
3. Consider elliptical coordinates (u, v, w) defined by x = cosh u cos v, y = sinh u sin v , z = w for
some > 0, where x, y and z are standard cartesian coordinates in 3D Euclidean space. What do the
coordinate curves correspond to in the (x, y, z) space? Are these orthogonal coordinates? What is the
volume element in terms of elliptical coordinates?
4. For general curvilinear coordinates, not necessarily orthogonal, is the qi -isosurface perpendicular to
r/qi ? is it orthogonal to (r/qj )(r/qk ) where i, j, k are all distinct? What about for orthogonal
coordinates?

5.7

Change of variables

Parametrizations of surfaces and volumes and curvilinear coordinates are geometric examples of a change of
variables. These change of variables and the associated formula and geometric concepts can occur in other
non-geometric contexts. The fundamental relationship is the formula for a volume element (47). In the
context of a general change of variables from (x1 , x2 , x3 ) to (q1 , q2 , q3 ) that formula (47) reads
dx1 dx2 dx3 = dVx = J dq1 dq2 dq3 = J dVq

(59)

where

x1 /q1 x1 /q2 x1 /q3



xi
(60)
J = x2 /q1 x2 /q2 x2 /q3 = det
qj
x3 /q1 x3 /q2 x3 /q3
is the Jacobian determinant and dVx is a volume element in the x-space while dVq is the corresponding
volume element in q-space. The Jacobian is the determinant of the Jacobian matrix

## x1 /q1 x1 /q2 x1 /q3

xi
.
(61)
J = x2 /q1 x2 /q2 x2 /q3 Jij =
qj
x3 /q1 x3 /q2 x3 /q3
The vectors (dq1 , 0, 0), (0, dq2 , 0) and (0, 0, dq3 ) at point (q1 , q2 , q3 ) in q-space are mapped to the vectors
(r/q1 )dq1 , (r/q2 )dq2 , (r/q3 )dq3 . In component form this is
(1)

dx
x1 /q1 x1 /q2 x1 /q3
dq1
1(1)
(62)
dx2 = x2 /q1 x2 /q2 x2 /q3 0 ,
(1)
x3 /q1 x3 /q2 x3 /q3
0
dx
3

52
(1)

(1)

(1)

where (dx1 , dx2 , dx3 ) are the x-components of the vector (r/q1 )dq1 . Similar relations hold for the
other basis vectors. Note that the rectangular box in q-space is in general mapped to a non-rectangular
parallelepiped in x-space so the notation dx1 dx2 dx3 for the volume element in (59) is a (common) abuse of
notation.
The formulas (59), (60) tells us how to change variables in multiple integrals. This formula generalizes
to higher dimension, and also to lower dimension. In the 2 variable case, we have a 2-by-2 determinant
that can also be understood as a special case of the surface element formula (37) for a mapping r(q1 , q2 )
from a 2D space (q1 , q2 ) to another 2D-space (x1 , x2 ). In that case r(q1 , q2 ) = e1 x1 (q1 , q2 ) + e2 x2 (q1 , q2 ) so
(r/q1 ) (r/q2 )dq1 dq2 = e3 dAx and
dx1 dx2 = dAx = J dq1 dq2 = J dAq

(63)

x1 /q1
J =
x2 /q1

x1 /q2
.
x2 /q2

(64)

## Change of variables example

Consider a Carnot cycle for a perfect gas. The equation of state is P V = nRT = N kT , where P is
pressure, V is the volume of gas and T is the temperature in Kelvins. The volume V contains n moles of gas
corresponding to N molecules, R is the gas constant and k is Boltzmanns constant with nR = N k. A Carnot
cycle is an idealized thermodynamic cycle in which a gas goes through (1) a heated isothermal expansion at
temperature T1 , (2) an adiabatic expansion at constant entropy S2 , (3) an isothermal compression releasing
heat at temperature T3 < T1 and (4) an adiabatic compression at constant entropy S4 < S2 . For a perfect
monoatomic gas, constant entropy means constant P V where = CP /CV = 5/3 with CP and CV the heat
capacity at constant pressure P or constant volume V , respectively. Thus let S = P V (this S is not the
physical entropy but it is constant whenever entropy is constant, we can call S a pseudo-entropy).

T1
T1
S4
P

S4

S2

S2
T3

T3

S
R Vb

Now the work done by the gas when its volume changes from Va to Vb is Va P dV (since work = Force
displacement, P = force/area and V =area displacement). Thus the (yellow) area inside the cycle in the
(P, V ) plane is the net work performed by the gas during one cycle. Although we can calculate that area
by working in the (P, V ) plane, it is easier to calculate it by using a change of variables from P , V to T ,
S. The area inside the cycle in the (P, V ) plane is not the same as the area inside the cycle in the (S, T )
plane. There is a distortion. An element of area in the (P, V ) plane aligned with the T and S coordinates
(i.e. with the dashed curves in the (P, V ) plane) is





P
V
P
V

e +
e
dS
e +
e
dT
(65)
dA =
S P
S V
T P
T V
where eP and eV are the unit vectors in the P and V directions in the (P, V ) plane, respectively. This is
r
entirely similar to the area element of surface r(u, v) being equal to dS = u
du r
v dv (but dont confuse

c
F.
Waleffe, Math 321, 2013/1/21

53

the surface element dS with the pseudo-entropy differential dS used in the present example!). Calculating
out the cross product, we obtain



P V

P,V
V P
(66)
dA =

## dSdT = JS,T dSdT

S T
S T
where the sign will be chosen to get a positive area (this depends on the bounds of integrations) and

P P

S T
P,V
P,V

(67)
JS,T = det(J S,T ) =

V V

S T
is the Jacobian determinant (here the vertical bars are the common notation for determinants). It is the
determinant of the Jacobian matrix J P,V
S,T that corresponds to the mapping from (S, T ) to (P, V ). The cycle
area AP,V in the (P, V ) plane is thus
Z

T1

S2

AP,V =
T3

S4

P,V

JS,T dSdT.

(68)

P,V
Note that the vertical bars in this formula are for absolute value of JS,T
and the bounds have been selected so
that dS > 0 and dT > 0 (in the limit-of-a-sum sense). This expression for the (P, V ) area
expressed in terms
P,V
of (S, T ) coordinates is simpler than if we used (P, V ) coordinates, except for that JS,T since we do not have

explicit expression for P (S, T ) and V (S, T ). What we have in fact are the inverse functions T = P V /(N k)
and S = P V . To find the partial derivatives that we need we could (1) find the inverse functions by solving
for P and V in terms of T and S then compute the partial derivatives and the Jacobian, or (2) use implicit
differentiation e.g. N k T /T = N k = V P/T + P V /T and S/T = 0 = V P/T + P V 1 V /T
etc. and solve for the partial derivatives we need. But there is a simpler way that makes use of an important
property of Jacobians.
Geometric meaning of the Jacobian determinant and its inverse
P,V

The Jacobian JS,T represents the stretching factor of area elements when moving from the (S, T ) plane to the
(P, V ) plane. If dAS,T is an area element centered at point (S, T ) in the (S, T ) plane then that area element
gets mapped to an area element dAP,V centered at the corresponding point in the (P, V ) plane. Thats what
equation (66) represents. In that equation we have in mind the mapping of a rectangular element of area
P,V
dSdT in the (S, T ) plane to a parallelogram element in the (P, V ) plane. The stretching factor is |JS,T | (as
we saw earlier, the meaning of the sign is related to orientation, but here we are worrying only about areas,
so we take absolute values). That relationship is valid for area elements of any shape, not just rectangles
to parallelogram since the differential relationships implies an implicit limit-of-a-sum and in that limit, the
pointwise area stretching is independent of the shape of the area elements. A disk element in the (S, T )
plane would be mapped to an ellipse element in the (P, V ) plane but the pointwise area stretching would be
the same as for a rectangular element (this is not true for finite size areas). So equation (66) can be written
in the more general form
P,V
dAP,V = |JS,T | dAS,T
(69)
which is locally valid for area elements of any shape. The key point is that if we consider the inverse map,
S,T
back from (P, V ) to (S, T ) then there is an are stretching give by the Jacobian JP,V = (S/P )(T /V )
(S/V )(T /P ) such that
S,T
dAS,T = |JP,V | dAP,V
(70)
but since we are coming back to the original dAS,T element we must have
P,V

S,T

JS,T JP,V = 1,

(71)

54
so the Jacobian determinant are inverses of one another. This inverse relationship actually holds for the
Jacobian matrices also


P,V
S,T
1 0
J S,T J P,V =
(72)
.
0 1
The latter can be derived from the chain rule since (note the consistent ordering of the partials)




S,T
P,V
S/P S/V
P/S P/T
,
J P,V =
J S,T =
T /P T /V
V /S V /T

(73)

and the matrix product of those two Jacobian matrices yields the identity matrix. For instance, the first
row times the first column gives
 


 




P
S
P
T
P
+
=
= 1.
S T P V
T S P V
P V
A subscript has been added to remind which other variable is held fixed during the partial differentiation.
The inverse relationship between the Jacobian determinants (71) then follows from the inverse relationship
between the Jacobian matrices (72) since the determinant of a product is the product of the determinants.
This important property of determinants can be verified directly by explicit calculation for this 2-by-2 case.
So whats the work done by the gas during one Carnot cycle? Well,
P,V

S,T

JS,T = JP,V
so
Z

1

T1


=
S2

AP,V =
T3

S4

S T
S T

P V
V P

Z
P,V

JS,T dSdT =

1

T1

T3

= N k V P V 1 P V

S2

S4

1

Nk
(1 )S

Nk
N k(T1 T3 ) S2
dSdT =
ln ,
( 1)S
( 1)
S4

(74)

(75)

## since > 1 and other quantities are positive.

Exercises
1. Calculate the area between the curves xy = 1 , xy = 2 and y = 1 x, y = 2 x in the (x, y) plane.
Sketch the area. (1 , 2 , 1 , 2 > 0.)
2. Calculate the area between the curves xy = 1 , xy = 2 and y 2 = 21 x, y 2 = 22 x in the (x, y) plane.
Sketch the area. (1 , 2 , 1 , 2 > 0.)
3. Calculate the area between the curves x2 + y 2 = 21 x, x2 + y 2 = 22 x and x2 + y 2 = 21 y, x2 + y 2 =
22 y. Sketch the area. (1 , 2 , 1 , 2 > 0.)
4. Calculate the area of the ellipse x2 /a2 +y 2 /b2 = 1 and the volume of the ellipsoid x2 /a2 +y 2 /b2 +z 2 /c2 =
1 by transforming them to a disk and a sphere, respectively, using a change of variables. [Hint: consider
the change of variables x = au, y = bv, z = cw.]
R R
R
2
2
2
5. Calculate the integral e(x +y ) dxdy. Deduce the value of the Poisson integral ex dx.
[Hint: switch to polar coordinates].
R R
6. Calculate (a2 + x2 + y 2 ) dxdy. Where a 6= 0 and is real. Discuss the values of for which
the integral exists.

c
F.
Waleffe, Math 321, 2013/1/21

55

Consider a scalar function of a vector variable: f (r), for instance the pressure p(r) as a function of position,
or the temperature T (r) at point r, etc. One way to visualize such functions is to consider isosurfaces or
level sets, these are the set of all rs for which f (r) = C0 , for some constant C0 . In cartesian coordinates
r = xe+yy
+ z z and the scalar function is a function of the three coordinates f (r) f (x, y, z), hence we
can interpret an isosurface f (x, y, z) = C0 as a single equation for the 3 unknowns x, y, z. In general, we are
free to pick two of those variables, x and y say, then solve the equation f (x, y, z) = C0 for z. For example,
the isosurfaces of f (r) = x2 + y 2 + z 2 are determined by the equation f (r) = C0 . This is the sphere or radius

C0 , if C0 0.

6.1

## Geometric definition of the Gradient

The value of f (r) at a point r 0 defines an isosurface f (r) = f (r 0 ) through that point r 0 . The gradient of
f (r) at r 0 can be defined geometrically as the vector, denoted f (r 0 ), that
(i) is perpendicular to the isosurface f (r) = f (r 0 ) at the point r 0 and points in the direction of increase
of f (r) and
(ii) has a magnitude equal to the rate of change of f (r) with distance from the isosurface.
From this geometric definition, we deduce that the plane tangent to the isosurface f (r) = f (r 0 ) at r 0 has
the equation
(r r 0 ) f (r 0 ) = 0
(76)
Likewise, the plane (r r 0 ) f (r 0 ) =  is parallel to the tangent plane but further up in the direction of
the gradient if  > 0 or down if  < 0. More generally, the function f (r) can be locally (i.e. for r near r 0 )
approximated by
f (r) f (r 0 ) + (r r 0 ) f (r 0 ).
(77)
Indeed since f points in the direction of fastest increase of f (r) and has a magnitude equal to the rate of
change of f with distance in that direction, the change in f as we move slightly away from r 0 only depends
on how much we have moved in the direction of the gradient. The actual change in f is the distance times
the rate of change with distance, hence it is (r r 0 ) f (r 0 ). Equation (77) states that the isosurfaces of
f (r) look like planes perpendicular to f (r 0 ) for r in the neighborhood for r 0 . Equation (77) is a linear
approximation of f (r) in the neighborhood of r 0 , just like f (x) f (x0 ) + (x x0 )f 0 (x0 ) for functions of one
variable. This is not exact, there is a small error that goes to zero faster than |r r 0 | as r r 0 . The limit
as r r 0 can be written as the exact differential relation
df (r) = dr f (r),

(78)

where df (r) = f (r + dr) f (r) is the differential change in f (r) when r changes from r to r + dr. This
differential relationship holds for any r and dr. It is analogous to the differential relationship df = f 0 (x) dx
for functions of a single variable.
I It follows immediately from the geometric definition of the gradient that if r = |r| is the distance to
the origin and r = r/r is the unit radial vector, then r = r and more generally f (r) = rdf /dr. For
instance, (1/r) =
r /r2 .


6.2

## The rate of change of f (r) in the direction of the unit vector n

at point r, denoted f (r)/n is defined as
f (r + h
n) f (r)
f (r)
= lim
.
h0
n
h

(79)

f (r)
=n
f (r).
n

(80)

## From (77) and (78),

56
This is exact because the error of the linear approximation (77), f (r + h
n) f (r) + h
n f (r), goes to
zero faster than h as h 0.
In particular,
f
f
f
=x
f,
=y
f,
= z f,
(81)
x
y
z
hence
f = x

f
f
f
+y

+ z ,
x
y
z

(82)

for cartesian coordinates x, y, z. We can also deduce this important result from the Chain rule for functions
of several variables. In cartesian coordinates, written (x1 , x2 , x3 ) in place of (x, y, z), the position vector
reads r = x1 e1 + x2 e2 + x3 e3 and the scalar function f (r) f (x1 , x2 , x3 ) is a function of the 3 coordinates.
The directional derivative
f (r + h
n) f (r)
f
f
f
f (r)
= lim
= n1
+ n2
+ n3
=n
f.
h0
n
h
x1
x2
x3

(83)

## This follows from rewriting the difference f (r + h

n) f (r) as the telescoping sum
[f (x1 + hn1 , x2 + hn2 , x3 + hn1 )f (x1 , x2 + hn2 , x3 + hn3 )]
+[f (x1 , x2 + hn2 , x3 + hn3 )f (x1 , x2 , x3 + hn3 )]
+[f (x1 , x2 , x3 + hn3 )f (x1 , x2 , x3 )]

(84)

then using continuity and the definition of the partial derivatives, e.g.
lim

h0

f
f (r + e3 ) f (r)
f (r + hn3 e3 ) f (r)
= n3 lim
= n3
.
0
h

x3

## Either way, we establish that in cartesian coordinates

= e1

+ e2
+ e3
ei i
x1
x2
x3

(85)

where i is short for /xi and we have used the convention of summation over all values of the repeated
index i.

6.3

## Div and Curl

Well depart from our geometric point of view to first define divergence and curl computationally based on
their cartesian representation. Here we consider vector fields v(r) which are vector functions of a vector
variable, for example the velocity v(r) of a fluid at point r, or the electric field E(r) at point r, etc. For
the geometric meaning of divergence and curl, see the sections on divergence and Stokes theorems.
The divergence of a vector field v(r) is defined as the dot product v. Now since the unit vectors ei
are constant for cartesian coordinates, v = ei i ej vj = (ei ej ) i vj = ij i vj hence
v = i vi = 1 v1 + 2 v2 + 3 v3 .

(86)

Likewise, the curl of a vector field v(r) is the cross product v. In cartesian coordinates, v =
(ej j ) (ek vk ) = (ej ek ) j vk . Recall that ijk = ei (ej ek ), or in other words ijk is the i component
of the vector ej ek , thus ej ek = ei ijk and
v = ei ijk j vk .

(87)

(Recall that a b = ei ijk aj bk .) The right hand side is a triple sum over all values of the repeated indices
i, j and k! But that triple sum is not too bad since ijk = 1 depending on whether (i, j, k) is a cyclic

c
F.
Waleffe, Math 321, 2013/1/21

57

(=even) permutation or an acyclic (=odd) permutation of (1, 2, 3) and vanishes in all other instances. Thus
(87) expands to
v = e1 (2 v3 3 v2 ) + e2 (3 v1 1 v3 ) + e3 (1 v2 2 v1 ) .
(88)
We can also write that the i-th cartesian component of the curl is


ei v = v i = ijk j vk .

(89)

## Note that the divergence is a scalar but the curl is a vector.

6.4

Vector identities

In the following f = f (r) is an arbitrary scalar field while v(r) and w(r) are vector fields. Two fundamental
identities that can be remembered from a (a b) = 0 and a (a) = 0 are
( v) = 0,

(90)

(f ) = 0.

(91)

and
The divergence of a curl and the curl of a gradient vanish identically (assuming all those derivatives exist).
These can be proved using index notation. Other useful identities are

12

22

(f v) = (f ) v + f ( v),

(92)

(f v) = (f ) v + f ( v),

(93)

( v) = ( v) 2 v,

(94)

32

where = =
+ +
is the Laplacian operator.
The identity (92) is verified easily using indicial notation
(f v) = i (f vi ) = (i f )vi + f (i vi ) = (f ) v + f ( v).
Likewise the second identity (93) follows from ijk j (f vk ) = ijk (j f )vk + f ijk (j vk ). The identity (94) is
easily remembered from the double cross product formula a (b c) = b(a c) c(a b) but note that the
in the first term must appear first since ( v) 6= ( v). That identity can be verified using indicial
notation if one knows the double cross product identity in terms of the permutation tensor (see earlier notes
on ijk and index notation) namely
ijk klm = kij klm = il jm im jl .

(95)

(96)

## A slightly trickier identity is

where (w )v = (wj j )vi in indicial notation. This can be verified using (95) and can be remembered from
the double cross-product identity with the additional input that is a vector operator, not just a regular
vector, hence (v w) represents derivatives of a product and this doubles the number of terms of the
resulting expression. The first two terms are the double cross product a (b c) = (a c)b (a b)c for
derivatives of v while the last two terms are the double cross product for derivatives of w.
Another similar identity is
v ( w) = (w) v (v )w.
(97)
ijk vj (klm l wm ) = (i wj )vj (vj j )wi .

(98)

58
Note that this last identity involves the gradient of a vector field w. This makes sense and is a tensor,
i.e. a geometric object whose components with respect to a basis form a matrix. In indicial notation, the
components of w are i wj and there are 9 of them. This is very different from w = i wi which is a
scalar.
The bottom line is that these identities can be reconstructed relatively easily from our knowledge of
regular vector identities for dot, cross and double cross products, however is a vector of derivatives and
one needs to watch out more carefully for the order and the product rules. If in doubt, jump to indicial
notation.
Exercises
1. Verify (90) and (91).
2. Digest and verify the identity (95) ab initio.
3. Verify the double cross product identities a(bc) = b(ac)c(ab) and (ab)c = b(ac)a(bc)
in indicial notation using (95).
4. Verify (96) and (97) using indicial notation and (95).
5. Use (95) to derive vector identities for (a b) (c d) and ( v) ( w).
6. Show that (v w) = w ( v) v ( w) and explain how to reconstruct this from the rules
for the mixed (or box) product a (b c) of three regular vectors.
7. Find the fastest way to show that r = 3 and r = 0.
8. Find the fastest way to show that (
r /r2 ) = 0 and (
r /r2 ) = 0 for all r 6= 0.
9. Quickly calculate B and B when B = (
z r)/|
z r|2 (cf. Biot-Savart law for a line current)
[Hint: use both vector identities and cartesian coordinates where convenient].

6.5

## Grad, Div, Curl in cylindrical and spherical coordinates

[to be developed]
Show that

+y

+ z
x
y
z
1

+ z
= +

= r +
+

r
r
r sin

=x

(99)

where
(
x, y
, z) are fixed mutually orthogonal cartesian unit vectors,
(,
,
z) are mutually orthogonal cylindrical unit vectors but and
depend on ,
)
(
r , ,
are mutually orthogonal spherical unit vectors but r and depend on both and .
Show that

= ,

r = sin ,

r = ,

= cos

=
r

(100)
(101)
(102)

c
F.
Waleffe, Math 321, 2013/1/21

59

See Chapter 1, section 1.3 to grind this out intelligently, but learn also to re-derive these relationships
geometrically as all good physicists know how to do.
With these relations, one can derive the expression for div, curl, Laplacian, etc in cylindrical and spherical
coordinates. For instance
= (v) + v ( )

(v )


1

1 v
1

=
+v
+

r
r
r sin
(103)
1 v v
v

=
+
+

r
r

r sin

1 v
v
1

=
+
=
(v sin )
r
r tan
r sin
Show likewise that
1

1
(v sin ) +
w
(u
r + v + w)
= 2 (r2 u) +
r r
r sin
r sin

(104)

and in particular
2 f = f =

1 2 f
(r
)+
r2 r
r
r sin


sin

1 f
r


+

r sin

1 f
r sin


(105)

60

## Fundamental theorems of vector calculus

Integration in R2 and R3

7.1

R
The integral of a function
of two variables f (x, y) over a domain A of R2 denoted A f (x, y)dA can be defined
P
as a the limit of a n f (xn , yn )An where the An s, n = 1, . . . , N provide an (approximate) partition of
A that breaks up A into a set of small area elements, squares or triangles for instance. An is the area of
those element n and (xn , yn ) is a point inside that element, for instance the center of area of the triangle.
The integral would be the limit of such sums when the area of the triangles goes to zero and their number
N must then go to infinity. This limit should be such that the aspect ratios of the triangles remain bounded
away from 0 so we get a finer and finer sampling of A. This definition also provides a way to approximate
the integral by such a finite sum.
yT

## If we imagine breaking up A into small squares aligned with

the x and y axes then the sum over all squares inside A
can be performed row by row. Each row-sum then tends to
an integral in the x-direction, this leads to the conclusion
that the integral can be calculated as iterated integrals

A
y

yT

f (x, y)dA =

yB

x` (y)

xr (y)

dy
yB

f (x, y)dx

(106)

x` (y)

xr (y)

## We can also imagine summing up column by column

instead and each column-sum then tends to an integral in the y-direction, this leads to the iterated integrals
Z

xR

f (x, y)dA =
A

yt (x)

dx
xL

yt (x)

f (x, y)dy.

(107)

yb (x)

## Note of course that the limits of integrations differ

from those of the previous iterated integrals.

yb (x)

xL
x
xR
This iterated integral approach readily extends to integrals over three-dimensional domains in R3 and
more generally to integrals in Rn .

7.2

## The fundamental theorem of calculus can be written

Z b
dF
dx = F (b) F (a).
a dx

(108)

Once again we can interpret this in terms of limits of finite differences. The derivative is defined as
dF
F
= lim
x0 x
dx

(109)

## where F = F (x + x) F (x), while the integral

Z

f (x)dx = lim
a

xn 0

N
X
n=1

f (
xn )xn

(110)

c
F.
Waleffe, Math 321, 2013/1/21

61

## where xn = xn xn1 and xn1 x

n xn , with n = 1, . . . , N and x0 = a, xN = b, so the set of xn s
provides a partition of the interval [a, b]. The best choice for x
n is the midpoint x
n = (xn + xn1 )/2. This
is the midpoint scheme in numerical integration methods. Putting these two limits together and setting
Fn = F (xn ) F (xn1 ) we can write
b

Z
a

N
N
X
X
Fn
dF
dx = lim
xn = lim
Fn = F (b) F (a).
xn 0
xn 0
dx
xn
n=1
n=1

## We can also write this in the integral form

Z b
Z F (b)
dF
dx =
dF = F (b) F (a).
a dx
F (a)

(111)

(112)

Fundamental theorem in R2

7.3

From the fundamental theorem of calculus and the reduction of integrals on a domain A of R2 to iterated
integrals on intervals in R we obtain for a function G(x, y)
Z
A

G
dA =
x

yT

xr (y)

dy
yB

x` (y)

G
dx =
x

yT



G(xr (y), y) G(x` (y), y) dy.

(113)

yB

This looks nice enough but we can rewrite the integral on the right-hand side as a line integral over the
boundary of A. The boundary of A is a closed curve C often denoted A (not to be confused with a partial
derivative). The boundary C has two parts C1 and C2 .
The curve C1 can be parametrized in terms of y as r(y) =
x
xr (y) + y
y with y = yB yT , hence
Z
Z yT
G(x, y)
y dr =
G(xr (y), y)dy.

yT

C2

C1

C1

yB

yB

## Likewise, the curve C2 can be parametrized using y as

r(y) = x
x` (y) + y
y with y = yT yB , hence
Z
Z yB
G(x, y)
y dr =
G(x` (y), y)dy.
C2

x` (y)

yT

xr (y)

Putting these two results together the right hand side of (113) becomes
Z yT
Z
Z
I


G(xr (y), y) G(x` (y), y) dy =
Gy
dr +
Gy
dr =
Gy
dr,
yB

C1

C2

## where C = C1 + C2 is the closed curve bounding A. Then (113) becomes

I
Z
G
dA =
Gy
dr.
(114)
C
A x
H
The symbol is used to emphasize that the integral is over a closed curve. Note that the curve C has been
oriented counter-clockwise such that the interior is to the left of the curve.
Similarly the fundamental theorem of calculus and iterated integrals lead to the result that
Z
Z xR Z yt (x)
Z xR


F
F
dA =
dx
dy =
F (x, yt (x)) F (x, yb (x)) dx,
(115)
A y
xL
yb (x) y
xL
and the integral on the right hand side can be rewritten as a line integral around the boundary curve
C = C3 + C4 .

62
The curve C3 can be parametrized in terms of x as r(x) =
x
x + y
yb (x) with x = xL xR , hence
Z
Z xR
F (x, y)
x dr =
F (x, yb (x))dx.

C4

yt (x)

yb (x)

C3

xL

## Likewise, the curve C4 can be parametrized using x as

r(x) = x
x + y
yt (x) with x = xR xL , hence
Z
Z xL
F (x, y)
x dr =
F (x, yt (x))dx.

C3

C4

xL

xR

xR

## The right hand side of (115) becomes

Z
Z xR


F (x, yt (x)) F (x, yb (x)) dx =

Z
Fx
dr

C4

xL

I
Fx
dr =

C3

Fx
dr,
C

where C = C3 + C4 is the closed curve bounding A oriented counter-clockwise as before. Then (115) becomes
I
Z
F
dA = F x
dr.
(116)
C
A y

7.4

## Green and Stokes theorems

The two results (114) and (116) can be combined into a single important formula. Subtract (116) from (114)
to deduce the curl form of Greens theorem
Z 
A

G F

x
y

I
(F x
+ G
y ) dr =

dA =
C

(F dx + Gdy).

(117)

Note that dx and dy in the last integral are not independent quantities, they are the projection of the
line element dr onto the basis vectors x
and y
as written in the middle line integral. If (x(t), y(t)) is a
parametrization for the curve then dx = (dx/dt)dt and dy = (dy/dt)dt and the t-bounds of integration
should be picked to correspond to counter-clockwise orientation. Note also that Greens theorem (117) is
the formula to remember since it includes both (114) when F = 0 and (116) when G = 0.
Greens theorem can be written in several equivalent forms. Define the vector field v = F (x, y)
x+
G(x, y)
y . A simple calculation verifies that its curl is purely in the z direction indeed (88) gives v =
z (G/x F/y) thus (117) can be rewritten in the form
Z
I
( v) z dA =
v dr,
(118)
A

## This result also applies to any 3D vector field v(x, y, z) = F (x, y, z)

x + G(x, y, z)
y + H(x, y, z)
z and any
planar surface A perpendicular to z since z ( v) still equals G/x F/y for such 3D vector fields
and the line element dr of the boundary curve C of such planar area is perpendicular to z so v dr is still
equal to F (x, y, z)dx + G(x, y, z)dy. The extra z coordinate is a mere parameter for the integrals and (118)
applies equally well to 3D vector field v(x, y, z) provided A is a planar area perpendicular to z.
In fact that last restriction on A itself can be removed. This is Stokes theorem which reads
Z
I
( v) dS =
v dr,
(119)
S

where S is a bounded orientable surface in 3D space, not necessarily planar, and C is its closed curve
boundary. The orientation of the surface as determined by the direction of its normal n
, where dS = n
dS,
and the orientation of the boundary curve C must obey the right-hand rule. Thus a corkscrew turning in the

c
F.
Waleffe, Math 321, 2013/1/21

63

## direction of C would go through S in the direction of its normal n

. That restriction is a direct consequence
of the fact that the right hand rule enters the definition of the curl as the cross product v.
Stokes theorem (119) provides a geometric interpretation for the curl. At any point r, consider a small
disk of area A perpendicular to an arbitrary unit vector n
, then Stokes theorem states that
I
1
v dr
(120)
( v) n
= lim
A0 A C
H
where C is the circle bounding the disk A oriented with n
. The line integral v dr is called the circulation
of the vector field v around the closed curve C.

## Proof of Stokes theorem

Index notation enables a fairly straightforward proof of Stokes theorem for the more general case of a
surface S in 3D space that can be parametrized by a good function r(s, t) (differentiable and integrability
as needed). Such a surface S can fold and twist (it could even intersect itself!) and is therefore of a more
general kind than those that can be parametrized by the cartesian coordinates x and y. The main restriction
on S is that it must be orientable. This means that it must have an up and a down as defined by the
direction of the normal r/s r/t. The famous Mobius strip only has one side and is the classical
example of a non-orientable surface. The boundary of the Mobius strip forms a knot.
I Let xi (s, t) represent the i component of the position vector r(s, t) in the surface S with i = 1, 2, 3.
Then


r r
vk
vk xl xm
xl xm

=ijk
= ijk ilm
( v)
ilm
s
t
xj
s t
xj s t
vk xl xm
= (jl km jm kl )
xj s t
vk xk xj
vk xj xk

=
xj s t
xj s t

vk xk

vk xk
=

s  t
t s 


xk
xk
=
vk

vk
s
t
t
s
F (s, t)
G(s, t)

,
(121)
=
s
t
where we have used (95), then the chain rule
vk xj
vk x1
vk x2
vk x3

vk
=
+
+
=
,
xj s
x1 s
x2 s
x3 s
s

and equality of mixed partials 2 xk /st = 2 xk /ts with vk (s, t) = vk x1 (s, t), x2 (s, t), x3 (s, t) . This
proof demonstrates the power of index notation with the summation convention. The first line of (121) is
a quintuple sum over all values of the indices i, j, k, l and m! This would be unmanageable without the
compact notation.
Now for any point on the surface S with position vector r(s, t)
 




xi
xi
xi
xi
ds +
dt = vi
ds + vi
dt = F (s, t)ds + G(s, t)dt,
(122)
v dr = vi
s
t
s
t
and we have again reduced Stokes theorem to Greens theorem (117) but expressed in terms of s and t
instead of x and y. In details we have shown that



Z
Z
Z 
r r
G(s, t) F (s, t)
( v) dS = ( v)

dsdt =

dsdt,
(123)
s
t
s
t
S
A
A

64
I

I
v dr =

## F (s, t)ds + G(s, t)dt.

The right hand sides of (123) and (124) are equal by Greens theorem (117).

7.5

(124)

CA

## Divergence form of Greens theorem

The fundamental theorems (114) and (116) in R2 can be rewritten in a more palatable form.

The line element dr at a point on the curve C is in the direction of the unit
tangent t at that point, so dr = tds, where t points in the counterclockwise
direction of the curve. Then t z = n
is the unit outward normal n
to the
curve at that point and z n
= t so

x
t = x
(
zn
) = (
x z) n
=
yn

(125)

y
t = y
(
zn
) = (
y z) n
=x
n
.

(126)

Hence since dr = tds, the fundamental theorems (114) and (116) can be
rewritten
I
I
Z
F
dA =
Fy
dr =
Fx
n
ds,
(127)
C
C
A x
Z
I
I
F
dA = F x
dr =
Fy
n
ds.
(128)
A y
C
C

The right hand side of these equations is easier to remember since they have x
going with /x and y

with /y and both equations have positive signs. But there are hidden subtleties. The arclength element
ds = |dr| is positive by definition and n
must be the unit outward normal to the boundary, so if an
explicit
parametrization
of
the
boundary
curve
is known, the bounds of integration should be picked so that
H
H
ds
=
|dr|
>
0
would
be
the
length
of
the
curve with a positive sign. For the dr line integrals, the
C
C
bounds of integration must correspond to counter-clockwise orientation of C.
Formulas (127) and (128) can be combined in more useful forms. First, if u is the signed distance in the
direction of the unit vector u
, then the (directional) derivative in the direction u
is F/u = u
F
ux F/x + uy F/y, where u
= ux x
+ uy y
, therefore combining (127) and (128) accordingly we obtain
Z
A

F
dA =
u

I
Fu
n
ds.

(129)

## This result is written in a coordinate-free form. It applies to any direction u

in the x,y plane.
Another useful combination is to add (127) to (128) written for a function G(x, y) in place of F (x, y)
yielding the divergence-form of Greens theorem
Z 
A

F
G
+
x
y

I
(F x
+ G
y) n
ds.

dA =

(130)

The left hand side integrand is easily recognized as the divergence v of the vector field v(x, y) = F x
+G
y.

7.6

Gauss theorem

Gauss theorem is the 3D version of the divergence form of Greens theorem. It is proved by first extending
the fundamental theorem of calculus to 3D.
If V is a bounded volume in 3D space and F (x, y, z) is a scalar function of the cartesian coordinates
(x, y, z), then we have
Z
I
F
dV =
F z n
dS,
(131)
V z
S

c
F.
Waleffe, Math 321, 2013/1/21

65

## where S is the closed surface enclosing V and n

is the unit outward normal to S.
The proof of this result is similar to that for (114). Assume that the surface can be parametrized using
x and y in two pieces: an upper hemisphere at z = zu (x, y) and a lower hemisphere at z = zl (x, y) with
x, y in a domain A, the projection of S onto the x, y plane, that is the same domain for both the upper and
lower surfaces. The closed surface S is not a sphere in general but we used the word hemisphere to help
visualize the problem. For a sphere, A is the equatorial disk, z = zu (x, y) is the northern hemisphere and
z = zl (x, y) is the southern hemisphere.
Iterated integrals with dV = dA dz and the fundamental theorem of calculus give
Z
Z
Z
Z zu (x,y)


F
F
dV =
dz =
F (x, y, zu (x, y)) F (x, y, zl (x, y)) dA.
(132)
dA
z
A
zl (x,y)
A
V z

z
dS

dA

We can interpret that integral over A as an integral over the entire closed
surface S that bounds V . All we need for that is a bit of geometry. If dA is
the projection of the surface element dS = n
dS onto the x, y plane then we
have dA = cos dS =
zn
dS. The + sign applies to the upper surface
for which n
is pointing up (i.e. in the direction of z) and the - sign for the
bottom surface where n
points down (and would be opposite to the n
on
the side figure). Thus we obtain
I
Z
F
dV =
F z n
dS,
(133)
S
V z

where n
is the unit outward normal to S.
We can obtain similar results for the volume integrals of F/x and F/y and combine those to obtain
Z
I
F
dV =
Fu
n
dS,
(134)
V u
S
for arbitrary but fixed direction u
. This is the 3D version of (129) and of the fundamental theorem of
calculus.
We can combine this theorem into many useful forms. Writing it for F (x, y, z) in the x
direction, with
the y
version for a function G(x, y, z) and the z version for a function H(x, y, z) we obtain Gausss theorem
Z
I
v dV =
vn
dS,
(135)
V

where v = F x
+ G
y + H z. This is the 3D version of (130). Note
H that both (134) and (135) are expressed in
coordinate-free forms. These are general results. The integral S v n
dS is the flux of v through the surface
S. If v(r) is the velocity of a fluid at point r then that integral represents the time-rate at which volume of
fluid flows through the surface S.
Gauss theorem provides a coordinate-free interpretation for the divergence. Consider a small sphere of
volume V and surface S centered at a point r, then Gauss theorem states that
I
1
v = lim
vn
dS.
(136)
V 0 V
S
Note that (134) and (135) are equivalent. We deduced (135) from (134), but we can also deduce (134)
from (135) by considering the special v = F u
where u
is a unit vector independent of r. Then from our
vector identities (F u
) = u
F = F/u.

7.7

## A useful form of (134) is to write it in indicial form as

I
Z
F
dV =
nj F dS.
V xj
S

(137)

66
Then with f (r) in place of F (r) we deduce that
Z
I
f
ej
dV =
ej nj f dS
xj
V
S

(138)

since the cartesian unit vectors ej are independent of position. This result can be written in coordinate-free
form as
Z
I
f dV =
fn
dS.
(139)
V

One application of this form of the fundamental theorem is to prove Archimedes principle.
Next, writing (137) for vk in place of F yields
Z
I
vk
dV =
nj vk dS,
V xj
S
which can be multiplied by the position-independent ijk and summed over all j and k to give
I
Z
vk
ijk nj vk dS.
dV =
ijk
xj
S
V
The coordinate-free form of this is

(141)

I
v dV =

n
v dS.

(142)

The integral theorems (139) and (142) provide yet other geometric interpretations for the gradient
I
1
f = lim
fn
dS,
V 0 V
S
and the curl

(140)

1
V 0 V

(143)

I
n
v dS.

v = lim

(144)

These are similar to the result (136) for the divergence. However, the geometric definition of the gradient
as the vector pointing in the direction of greatest rate of change (sect. 6.1) and of the n
component of the
curl as the limit of the local circulation per unit area as given by Stokes theorem (120) are perhaps more
fundamental.
In applications we use the fundamental theorem as we do in 1D, namely to reduce a 3D integral to a 2D
integral, for instance. However we also use them the other way, to evaluate a complicated surface integral
as a simpler volume integral for instance.
Exercises
H
H
1. If C is any closed curve in 3D space (i) calculate C r dr in two ways, (ii) calculate C f dr in two
ways, where f (r) is a scalar function. [Hint: by direct calculation and by Stokes theorem].
H
2. If C is any closed curve in 3D space not passing through the origin calculate C r3 r dr in two ways.
3. Calculate the circulation of the vector field B = (
z r)/|
at the origin in a plane perpendicular to z; (ii) about any closed curve C in 3D that does not go around
the z-axis; (ii) about any closed curve C0 that does go around the z axis. Whats wrong with the z-axis
anyway?
4. Consider v = r where is a constant vector, independent of r. (i) Evaluate the circulation of
v about the circle of radius R centered at the origin in the plane perpendicular to the direction n
by
direct calculation of the line integral; (ii) Calculate the curl of v using vector identities; (iii) calculate
the circulation of v about a circle of radius R centered at r 0 in the plane perpendicular to n
.
H
5. If C is any closed curve in 2D, calculate C n
dr where n
(r) is the unit outside normal to C at the point
r of C.

c
F.
Waleffe, Math 321, 2013/1/21
6. If S is any closed surface in 3D, calculate
r on S.

67
H

n
dS where n
(r) is the unit outside normal to S at a point

H
7. If S is any closed surface in 3D, calculate p n
dS where n
is the unit outside normal to S and
p(r) = (p0 g z r) where p0 , and g are constants (This is Archimedes principle with as fluid
density and g as the acceleration of gravity.)
8. Calculate the flux of r through (i) the surface of a sphere of radius R centered at the origin in two
ways; (ii) through the surface of the sphere of radius R centered at r 0 ; (iii) through the surface of a
cube of side L with one corner at the origin in two ways.
9. (i) Calculate the flux of v = r/r3 through the surface of a sphere of radius  centered at the origin. (ii)
Calculate the flux of that vector field through a closed surface that does not enclose the origin [Hint:
use the divergence theorem] (iii) Calculate the flux through an arbitrary closed surface that encloses
the origin [Hint: use divergence theorem and (i) to isolate the origin. Whats wrong with the origin
anyway?]
10. Calculate |r r 0 |1 and v with v = (r r 0 )/|r r 0 |3 where r 0 is a constant vector.
11. Calculate the flux of v through the surface of a sphere of radius R centered at the origin when v =
1 (r r 1 )/|r r 1 |3 + 2 (r r 2 )/|r r 2 |3 where 1 and 2 are scalar constants and (i) |r 1 | and |r 2 |
PN
are both less than R; (ii) |r 1 | < R < |r 2 |. Generalize to v = i=1 i (r r i )/|r r i |3 .
R
12. Calculate F (r) = V0 f dV0 where V0 is the inside of a sphere of radius R centered at O, f = (r
r 0 )/|r r 0 |3 with |r| > R and the integral is over r 0 . Warning: this is essentially the gravity field
or force at r due to a sphere of uniform mass density. Supposedly, it took Newton 20 years to figure
it out... of course he was never taught calculus, he had to invent it, and vector calculus came much
after him and he never knew about Gauss theorem. The integral can be cranked out if youre good
at analytic integration. But the smart solution is to realize that the integral over r 0 is essentially a
sum over r i as in the previous exercise so we can figure out the flux of F through any closed surface
enclosing all of V0 . Now by symmetry F (r) = F (r)
r , so knowing the flux is enough to figure out F (r).
Newton would be impressed!

68

Chapter 3

Complex Calculus
1

1.1

## Algebra of Complex numbers

A complex number z = x + iy is composed of a real part Re(z) = x and an imaginary part Im(z) = y, both
of which are real numbers, x, y R. Complex numbers can be defined as pairs of real numbers (x, y) with
special manipulation rules. Thats how complex numbers are defined in Fortran or C. We can map complex
numbers to the plane R2 with the real part as the x axis and the imaginary part as the y-axis. We refer to
that mapping as the complex plane. This is a very useful visualization.
The form x + iy is convenient with the special symbol i standing as the imaginary unit defined such that
i2 = 1. With that form and that special i2 = 1 rule, complex numbers can be manipulated like regular
real numbers.

z = x + iy
|z|
x
z = x i y
z1 + z2 = (x1 + iy1 ) + (x2 + iy2 ) = (x1 + x2 ) + i(y1 + y2 ).

(1)

This is identical to vector addition for the 2D vectors (x1 , y1 ) and (x2 , y2 ).
Multiplication:
z1 z2 = (x1 + iy1 )(x2 + iy2 ) = (x1 x2 y1 y2 ) + i(x1 y2 + x2 y1 ).

(2)

Complex conjugate:
z = x iy

(3)

An overbar z or a star z denotes the complex conjugate of z, which is same as z but with the sign of the
imaginary part flipped. It is readily verified that the complex conjugate of a sum is the sum of the conjugates:
(z1 + z2 ) = z1 + z2 , and the complex conjugate of a product is the product of the conjugates (z1 z2 ) = z1 z2
(show that as an exercise).
Modulus (or Norm)
p

|z| = zz = x2 + y 2 ,
(4)
This modulus is equivalent to the euclidean norm of the 2D vector (x, y), hence it obviously satisfy the
triangle inequality |z1 + z2 | |z1 | + |z2 |. However we can verify that |z1 z2 | = |z1 | |z2 |.
69

70
Division:




x1 x2 + y1 y2
x2 y1 x1 y2
(x1 + iy1 )(x2 iy2 )
z1
z1 z2
=
+i
.
=
=
z2
z2 z2
x22 + y22
x22 + y22
x22 + y22

(5)

All the usual algebraic formula apply, for instance (z + a)2 = z 2 + 2za + a2 and more generally the
binomial formula (defining 0! = 1)
(z + a)n =

n  
X
n
k=0

z k ank =

n
X
k=0

n!
z k ank .
k!(n k)!

(6)

Exercises:
1. Prove that (z1 + z2 ) = z1 + z2 , (z1 z2 ) = z1 z2 and |z1 z2 | = |z1 ||z2 |.
2. Calculate (1 + i)/(2 + i3).
3. Show that the final formula for division follows from the definition of multiplication (as it should): if
z = z1 /z2 then z1 = zz2 , solve for Re(z) and Im(z).

1.2

## Limits and Derivatives

The modulus allows the definition of distance and limit. The distance between two complex numbers z and
a is the modulus of their difference |z a|. A complex number z tends to a complex number a if |z a| 0,
where |z a| is the euclidean distance between the complex numbers z and a in the complex plane. A
function f (z) is continuous at a if limza f (z) = f (a). These concepts allow the definition of derivatives
and series.
The derivative of a function f (z) at z is
df (z)
f (z + a) f (z)
= lim
a0
dz
a

(7)

where a is a complex number and a 0 means |a| 0. This limit must be the same no matter how a 0.
We can use the binomial formula (6) as done in Calc I to deduce that
dz n
= nz n1
dz

(8)

for any integer n = 0, 1, 2, . . ., and we can define the anti-derivative of z n as z n+1 /(n + 1) + C for all
integer n 6= 1. All the usual rules of differentiation: product rule, quotient rule, chain rule,. . . , still apply
for complex differentiation and we will not bother to prove those here, the proofs are just like in Calc I.
So there is nothing special about complex derivatives, or is there? Consider the function f (z) = Re(z) =
x, the real part of z. What is its derivative? Hmm. . . , none of the rules of differentiation help us here, so
lets go back to first principles:
Re(z + a) Re(z)
Re(a)
dRe(z)
= lim
= lim
=?!
a0
a0
dz
a
a

(9)

What is that limit? If a is real, then a = Re(a) so the limit is 1, but if a is imaginary then Re(a) = 0 and
the limit is 0. So there is no limit that holds for all a 0. The limit depends on how a 0, and we cannot
define the z-derivative of Re(z). Re(z) is continuous everywhere, but nowhere z-differentiable!

Exercises:
1. Prove formula (8) from the limit definition of the derivative [Hint: use the binomial formula].
2. Prove that (8) also applies to negative integer powers z n = 1/z n from the limit definition of the
derivative.

c
F.
Waleffe, Math 321, 2013/1/21

1.3

71

## For any complex number q 6= 1, the geometric sum

1 q n+1
.
1q

1 + q + q2 + + qn =

(10)

To prove this, let Sn = 1 + q + + q n and note that qSn = Sn + q n+1 1, then solve that for Sn .
The geometric series is the limit of the sum as n . It follows from (10), that the geometric series
converges to 1/(1 q) if |q| < 1, and diverges if |q| > 1,

qn = 1 + q + q2 + =

n=0

1
,
1q

iff

|q| < 1.

(11)

P
Note that we have two different functions of q: (1) the series n=0 q n which only exists when |q| < 1, (2)
the function 1/(1 q) which is defined and smooth everywhere except at q = 1. These two expressions, the
geometric series and the function 1/(1 q) are identical in the disk |q| < 1, but they are not at all identical
outside of that disk since the series does not make any sense (i.e. it diverges) outside of it. What happens
on the unit circle |q| = 1? (consider for example q = 1, q = 1, q = i, . . . )

=(q)
diverges

|q| = 1

converges

<(q)

Exercises:
1. Derive formula (10) and absorb the idea of the proof. What is Sn when q = 1?
2. Calculate q N + q N +2 + q N +4 + q N +6 + .... with |q| < 1.

1.4

Ratio test

The geometric series leads to a useful test for convergence of the general series

an = a0 + a1 + a2 +

(12)

n=0

We can make sense of this series again as the limit of the partial sums Sn = a0 + a1 + + an as n .
Any one of these finite partial sums exists but the infinite sum does not necessarily converge. Example: take
an = 1 n, then Sn = n + 1 and Sn as n .
A necessary condition for convergence is that an 0 as n as you learned in Math 222 and can
explain why, but that is not sufficient. A sufficient condition for convergence is obtained by comparison to
a geometric series. This leads to the Ratio Test: the series (12) converges if
lim

|an+1 |
=L<1
|an |

(13)

72
Why does the ratio test work? If L < 1, then pick any q such that L < q < 1 and one can find a (sufficiently
large) N such that |an+1 |/|an | < q for all n N so we can write


|aN +1 | |aN +2 | |aN +1 |
|aN | + |aN +1 | + |aN +2 | + |aN +3 | + = |aN | 1 +
+
+
|aN |
|aN +1 | |aN |
(14)

|aN |
< |aN | 1 + q + q 2 + =
< .
1q
If L > 1, then we can reverse the proof (i.e. pick q with 1 < q < L and N such that |an+1 |/|an | > q n N )
to show that the series diverges. If L = 1, the ratio test does not determine convergence.

1.5

Power series

## A power series has the form

cn (z a)n = c0 + c1 (z a) + c2 (z a)2 +

(15)

n=0

where the cn s are complex coefficients and z and a are complex numbers. It is a series in powers of (z a).
By the ratio test, the power series converges if

cn+1 |z a|
cn+1 (z a)n+1

= |z a| lim
< 1,
(16)
lim
n
n
cn (z a)n
cn
R
where we have defined

=(z)

cn+1
= 1.
lim
n
cn R

(17)

|z a| < R
a
R

## The power series converges if |z a| < R. It diverges if |z a| > R.

|z a| = R is a circle of radius R centered at a, hence R is called
the radius of convergence of the power series. R can be 0, or
anything in between. But the key point is that power series always
converge in a disk |z a| < R and diverge outside of that disk.

<(z)
This geometric convergence inside a disk implies that power series can be differentiated (and integrated)
term-by-term inside their disk of convergence (why?). The disk of convergence of theP
derivative or integral

n
series is the same as that of the original
series.
For
instance,
the
geometric
series
n=0 z converges in
P
n1
|z| < 1 and its term-by-term derivative n=0 nz
does also, as you can verify by the ratio test.
Taylor Series
The Taylor Series of a function f (z) about z = a is

X
1
f (n) (a)
f (z) = f (a) + f 0 (a)(z a) + f 00 (a)(z a)2 + =
(z a)n ,
2
n!
n=0

(18)

where f (n) (a) = dn f /dz n (a) is the nth derivative of f (z) at a and n! = n(n 1) 1 is the factorial of n,
with 0! = 1 by convenient definition. The equality between f (z) and its Taylor series is only valid if the
series converges. The geometric series

X
1
= 1 + z + z2 + =
zn
1z
n=0

(19)

c
F.
Waleffe, Math 321, 2013/1/21

73

## is the Taylor series of f (z) = 1/(1 z) about z = 0. As mentioned earlier,

P the function 1/(1 z) exists and
is infinitely differentiable everywhere except at z = 1 while the series n=0 z n only exists in the unit circle
|z| < 1.
Several useful Taylor series are more easily derived from the geometric series (11), (19) than from the
general formula (18) (even if you really like calculating lots of derivatives!). For instance

X
1
2
4
=
1
+
z
+
z
+

=
z 2n
1 z2
n=0

(20)

X
1
= 1 z + z2 =
(z)n
1+z
n=0

(21)

ln(1 + z) = z

X
z2
(1)n z n+1
+ =
2
n+1
n=0

(22)

The last series is obtained by integrating both sides of the previous equation and matching at z = 0 to
determine the constant of integration. These series converge only in |z| < 1 while the functions on the left
hand side exist for (much) larger domains of z.

Exercises:
1. Explain why the domain of convergence of a power series is always a disk (possibly infinitely large),
not an ellipse or a square or any other shape [Hint: read the notes carefully]. (Anything can happen
on the boundary of the disk: weak (algebraic) divergence or convergence, perpetual oscillations, etc.,
recall the geometric series).
P
2. Show that if a function f (z) = n=0 cn (z a)n for all zs within the (non-zero) disk of convergence
of the power series, then the cn s must have the form provided by formula (18).
3. What is the Taylor series of 1/(1 z) about z = 0? what is its radius of convergence? does the series
converge at z = 2? why not?
4. What is the Taylor series of the function 1/(1+z 2 ) about z = 0? what is its radius of convergence? Use
a computer or calculator to test the convergence of the series inside and outside its disk of convergence.
5. What is the Taylor series of 1/z about z = 2? what is its radius of convergence? [Hint: z = a + (z a)]
6. What is the Taylor series of 1/(1 + z)2 about z = 0?
7. Look back at all the places in these notes and exercises (including earlier subsections) where we have
used the geometric series for theoretical or computational reasons.

1.6

Complex transcendentals

The complex versions of the Taylor series for the exponential, cosine and sine functions
exp(z) = 1 + z +

(23)

X
z2
z4
z 2n
+
=
(1)n
2
4!
(2n)!
n=0

(24)

X
z3
z5
z 2n+1
+
=
(1)n
3!
5!
(2n + 1)!
n=0

(25)

cos z = 1

sin z = z

X
zn
z2
+ =
2
n!
n=0

converge in the entire complex plane for any z with |z| < as is readily checked from the ratio test. These
series can now serve as the definition of these functions for complex arguments. We can verify all the usual

74
properties of these functions from the series expansion. In general we can integrate and differentiate series
term by term inside the disk of convergence of the power series. Doing so for exp(z) shows that the function
is still equal to its derivative
!

X
d X zn
d
z n1
exp(z) =
=
= exp(z),
(26)
dz
dz n=0 n!
(n 1)!
n=1
meaning that exp(z) is the solution of the complex differential equation df /dz = f with f (0) = 1. Likewise
the series (24) for cos z and (25) for sin z imply
d
cos z = sin z,
dz

d
sin z = cos z.
dz

(27)

Another slight tour de force with the series for exp(z) is to use the binomial formula (6) to obtain
exp(z + a) =

X
X
n   k nk
n
X
X
X
(z + a)n
n z a
z k ank
=
=
.
n!
n!
k!(n k)!
k
n=0
n=0
n=0
k=0

(28)

k=0

The double sum is over the triangular region 0 n , 0 k n in n, k space. If we interchange the
order of summation, wed have to sum over k = 0 and n = k (sketch it!). Changing variables to
k, m = n k the range of m is 0 to as that of k and the double sum reads
!
!
X

X
X
X am
z k am
zk
exp(z + a) =
=
= exp(z) exp(a).
(29)
k!m!
k!
m!
m=0
m=0
k=0

k=0

This is a major property of the exponential function and we verified it from its series expansion (23) for
general complex arguments z and a. It implies that if we define as before
e = exp(1) = 1 + 1 +

1 1
1
1
+ +
+
+ = 2.71828...
2 6 24 120

(30)

then exp(n) = [exp(1)]n = en and exp(1) = [exp(1/2)]2 thus exp(1/2) = e1/2 etc. so we can still identify
exp(z) as the number e to the complex power z and (29) is the regular algebraic rule for exponents: ez+a =
ez ea . In particular
exp(z) = ez = ex+iy = ex eiy ,
(31)
ex is our regular real exponential but eiy is the exponential of a pure imaginary number. We can make sense
of this from the series (23), (24) and (25) to obtain
eiz = cos z + i sin z,
or
cos z =

## eiz = cos z i sin z,

eiz + eiz
,
2

sin z =

eiz eiz
.
2i

(32)

(33)

These hold for any complex number z. [Exercise: Show that eiz is not the conjugate of eiz unless z is real].
For z real, this is Eulers formula usually written in terms of a real angle
ei = cos + i sin .

(34)

This is arguably one of the most important formula in all of mathematics! It reduces all of trigonometry to
algebra among other things. For instance ei(+) = ei ei implies
cos( + ) + i sin( + ) =(cos + i sin )(cos + i sin )
=(cos cos sin sin ) + i(sin cos + sin cos )
which yields two trigonometric identities in one swoop.

(35)

c
F.
Waleffe, Math 321, 2013/1/21

75

Exercises:
1. Use series to compute the number e to 4 digits. How many terms do you need?
2. Use series to compute exp(i), cos(i) and sin(i) to 4 digits.
3. Express cos(1 + 3i) in terms of real expressions and factors of i that a 221 student might understand
and be able to calculate.
4. What is the conjugate of exp(iz)?
5. Use Eulers formula and geometric sums to derive compact formulas for the trigonometric sums
1 + cos x + cos 2x + cos 3x + + cos N x =?

(36)

## sin x + sin 2x + sin 3x + + sin N x =?

(37)

6. Generalize the previous results by deriving compact formulas for the geometric trigonometric series
1 + p cos x + p2 cos 2x + p3 cos 3x + + pN cos N x =?

(38)

(39)

## where p is an arbitrary real constant.

7. The formula (35) leads to the well-known double angle formula cos 2 = 2 cos2 1 and sin 2 =
2 sin cos . They also lead to the triple angle formula cos 3 = 4 cos3 3 cos and sin 3 =
sin (4 cos2 1). These formula suggests that cos n is a polynomial of degree n in cos and that
sin n is sin times a polynomial of degree n1 in cos . Derive explicit formulas for those polynomials.
[Hint: use Eulers formula for ein and the binomial formula]. The polynomial for cos n in powers of
cos is the Chebyshev polynomial Tn (x) with cos n = Tn (cos ).

1.7

Polar representation

Introducing polar coordinates in the complex plane such that x = r cos and y = r sin , then using Eulers
formula (34), any complex number can be written
z = x + iy = rei = |z|ei arg(z) .

(40)

This is the polar form of the complex number z. Its modulus is |z| = r and the angle = arg(z) + 2k is
called the phase of z, where k = 0, 1, 2, . . . is an integer. A key issue is that for a given z, its phase is
only defined up to an arbitrary multiple of 2 since replacing by 2 does not change z. However the
argument arg(z) is a function of z and therefore we want it to be uniquely defined for every z. For instance
we can define 0 arg(z) < 2, or < arg(z) . These are just two among an infinite number of possible
definitions. Although computer functions (Fortran, C, Matlab, ...) make a specific choice (typically the
2nd one), that choice may not be suitable in some cases. The proper choice is problem dependent. This is
because while is continuous, arg(z) is necessarily discontinuous. For example, if we define 0 arg(z) < 2,
then a point moving about the unit circle at angular velocity will have a phase = t but arg(z) = t
mod 2 which is discontinuous at t = 2k.
The cartesian representation x + iy of a complex number z is perfect for addition/subtraction but the
polar representation rei is more convenient for multiplication and division since
z1 z2 = r1 ei1 r2 ei2 = r1 r2 ei(1 +2 ) ,

(41)

r1 ei1
r1
z1
=
= ei(1 2 ) .
i
2
z2
r2 e
r2

(42)

76

1.8

Logs

The power series expansion of functions is remarkably powerful and closely tied to the theory of functions of
a complex variable. A priori, it doesnt seem very general, how, for instance, could we expand f (z) = 1/z
into a series in positive powers of z
1
= a0 + a1 z + a2 z 2 +
z

??

We can in fact do this easily using the geometric series. For any a 6= 0
1
1
1
=
=
z
a + (z a)
a


1+

X
1
(z a)n
=
(1)n n+1 .
za
a
n=0
a

(43)

Thus we can expand 1/z in powers of z a for any a 6= 0. That (geometric) series converges in the disk
|z a| < |a|. This is the disk of radius |a| centered at a. By taking a sufficiently far away from 0, that
disk where the series converges can be made as big as one wants but it can never include the origin which of
course is the sole singular point of the function 1/z. Integrating (43) for a = 1 term by term yields
ln z =

(1)n

n=0

(z 1)n+1
n+1

(44)

as the antiderivative of 1/z that vanishes at z = 1. This looks nice, however that series only converges for
|z 1| < 1. We need a better definition that works for a larger
P domain in the z-plane.
The Taylor series definition of the exponential exp(z) = n=0 z n /n! is very good. It converges for all zs,
it led us to Eulers formula ei = cos + i sin and it allowed us to verify the key property of the exponential,
namely exp(a + b) = exp(a) exp(b) (where a and b are any complex numbers), from which we deduced other
goodies: exp(z) ez with e = exp(1) = 2.71828 . . ., and ez = ex+iy = ex eiy .
What about ln z? As for functions of a single real variable we can introduce ln z as the inverse of ez or
as the integral of 1/z that vanishes at z = 1.
ln z as the inverse of ez
Given z we want to define the function ln z as the inverse of the exponential. This means we want to find a
complex number w such that ew = z. We can solve this equation for w as a function of z by using the polar
representation for z, z = |z|ei arg(z) , together with the cartesian form for w, w = u + iv, where u = Re(w)
and v = Im(w) are real. We obtain
ew = z eu+iv = |z|ei arg(z) ,
eu = |z|,
u = ln |z|,

eiv = ei arg(z) ,

(why?)

v = arg(z) + 2k,

(45)

where k = 0, 1, 2, Note that |z| 0 is a positive real number so ln |z| is our good old natural log of a
positive real number. We have managed to find the inverse of the exponential
ew = z w = ln |z| + i arg(z) + 2ik.

(46)

The equation ew = z for w, given z, has an infinite number of solutions. This make sense since ew = eu eiv =
eu (cos v + i sin v) is periodic of period 2 in v, so if w = u + iv is a solution, so is u + i(v + 2k) for any
integer k. We can take any one of those solutions as our definition of ln z, in particular


ln z = ln |z|ei arg(z) = ln |z| + i arg(z).

(47)

This definition is unique since we assume that arg z is uniquely defined in terms of z. However different
definitions of arg z lead to different definitions of ln z.

c
F.
Waleffe, Math 321, 2013/1/21

77

Example: If arg(z) is defined by 0 arg(z) < 2 then ln(3) = ln 3 + i, but if we define instead
arg(z) < then ln(3) = ln 3 i.
Note that you can now take logs of negative numbers! Note also that the ln z definition fits with our
usual manipulative rules for logs. In particular since ln(ab) = ln a + ln b then ln z = ln(rei ) = ln r + i. This
is the easy way to remember what ln z is.
Complex powers
As for functions of real variables, we can now define general complex powers in terms of the complex log and
the complex exponential
ab = eb ln a = eb ln |a| eib arg(a) ,
(48)
be careful that b is complex in general, so eb ln |a| is not necessarily real. Once again we need to define arg(a)
and different definitions can actually lead to different values for ab .
In particular, we have the complex power functions
z a = ea ln z = ea ln |z| eia arg(z)

(49)

(50)

## and the complex exponential functions

These functions are well-defined once we have defined a range for arg(z) in the case of z a and for arg(a) in
the case of az .
Once again, a peculiar feature of functions of complex variables is that the user is left free to choose
whichever definition is more convenient for the particular problem under consideration. Note that different
definition for the arg(a) provides definitions for ab that do not simply differ by an additive multiple of 2i
as was the case for ln z. For example
(1)i = ei ln(1) = e arg(1) = e2k
for some k, so the various possible definitions of (1)i will differ by a multiplicative integer power of e2 .
Roots
The fundamental theorem of algebra states that any nth order polynomial equation of the form cn z n +
cn1 z n1 + + c1 z + c0 = 0 with cn 6= 0 always has n roots in the complex plane. This can be stated as
saying that there always exist n complex numbers z1 , . . ., zn such that
cn z n + cn1 z n1 + + c0 = cn (z z1 ) (z zn ).

(51)

The numbers z1 , . . ., zn are the roots or zeros of the polynomial. These roots can be repeated as for the
polynomial 2z 2 4z + 2 = 2(z 1)2 . This expansion is called factoring the polynomial.
The equation 2z 2 2 = 0 has two real roots z = 1 and 2z 2 2 = 2(z 1)(z + 1). The equation
2
3z + 3 = 0 has no real roots, however it has two imaginary roots z = i and 3z 2 + 3 = 3(z i)(z + i).
The equation z n a = 0, with a complex and n a positive integer, therefore has n roots. We might be
tempted to write the solution as z = a1/n but what does that mean? According to our definition of ab above,
we have a1/n = e(ln a)/n which depends on the argument of a since ln a = ln |a| + i arg(a). When we define a
function we need to make the definition unique, but here we are looking for all the roots. This means that
we have to consider all possible definitions of arg(a). Heres a correct way to think about this. Using the
polar representation z = rei
z n = rn ein = a = |a|ei(arg(a)+2k) r = |a|1/n ,

2
arg(a)
+k
n
n

(52)

where k = 0, 1, 2, . . . The moduli |z| = r and |a| are positive real numbers, so |a|1/n is our good old root
function giving a positive real value, but = arg(z) has many possible values that differ by a multiple of
2/n. When n is a positive integer, this yields n distinct values of modulo 2, yielding n distinct values for
z. It is useful to visualize these roots. They are all equispaced on the circle of radius |a|1/n in the complex
plane.

78

Exercises:
1. Find all the roots, visualize and locate them in the complex plane and factor the corresponding polynomial (i) z 4 = 1, (ii) z 4 + 1 = 0, (iii) z 2 = i, (iv) 2z 2 + 5z + 2 = 0.
2. Investigate the solutions of the equation z b = 1 when (i) b is a rational number, i.e. b = p/q with p,
q integers, (ii) when b is irrational e.g. b = , (iii) when b is complex, e.g. b = 1 + i. Visualize the
solutions in the complex plane if possible.
3. If a and b are complex numbers, whats wrong with saying that if w = ea+ib = ea eib then |w| = ea and
arg(w) = b + 2k? Isnt that what we did in (45)?

2
2.1

## Functions of a complex variable

Visualization of complex functions

A function w = f (z) of a complex variable z = x + iy has complex values w = u + iv, where u, v are real.
The real and imaginary parts of w = f (z) are functions of the real variables x and y
f (z) = u(x, y) + i v(x, y).

(53)

z 2 = (x2 y 2 ) + i 2xy

(54)

## For example, w = z 2 = (x + iy)2 is

2

with a real part u(x, y) = x y and an imaginary part v(x, y) = 2xy. How do we visualize complex
functions? In calc I, for real functions of one real variable, y = f (x), we made an xy plot. Here x and y are
independent variables and w = f (z) corresponds to two real functions of two real variables Re(f (z)) = u(x, y)
and Im(f (z)) = v(x, y). One way to visualize f (z) is to make a 3D plot with u as the height above the (x, y)
plane. We could do the same for v(x, y), however a prettier idea is to color the surface u = u(x, y) in the
3D space (x, y, u) by the value of v(x, y). Here is such a plot for w = z 2 :
w=z2
1

0.8
1

0.6

0.4

0.5

0.2

0
0

0.5

0.2

0.4
1
1
0.6
1

0.5
0.5

0.8

0
0.5

0.5
1

c
F.
Waleffe, Math 321, 2013/1/21

79

Note the nice saddle-structure, u = x2 y 2 is the parabola u = x2 along the real axis y = 0, but u = y 2
along the imaginary axis, x = 0. Again the color of the surface is the value of v(x, y) at that point, as
given by the colorbar on the right. This visualization leads to pretty pictures but they quickly become too
complicated to handle, in large part because of the 2D projection on the screen or page. It is often more
useful to make 2D contour plots of u(x, y) and v(x, y):
contours of u=Real(z2)

contours of v=Imag(z2)

0.8

0.8

0.6

0.8

0.6

1.5

0.6

0.4

0.4

0.4
0.2

0.5

0.2

0.2
0

0.2
0.2

0.2

0.5

0.4
0.4

0.4
1
0.6

0.6

0.8

0.8

1
1

0.8

0.6

0.4

0.2

0.2

0.4

0.6

0.8

0.6
1.5
0.8

1
1

0.8

0.6

0.4

0.2

0.2

0.4

0.6

0.8

showing the isocurves (or isolines or level sets) of the function, u(x, y) on the left and v(x, y) on the right,
enhanced by constant coloring between contours. Note that the v(x, y) colors indeed match the colors on
the earlier 3D picture. The saddle-structure of both u(x, y) and v(x, y) is quite clear.

2.2

Cauchy-Riemann equations

We reviewed fundamental examples of complex functions: z n , ez , cos z, sin z, ln z, z 1/n , z a , az , etc. as well
as special complex functions such as Re(z), Im(z), z . We showed that Re(z) is not z-differentiable (9), and
Im(z) and z are not either. For instance,
(z + a ) z
a
dz
= lim
= lim
= e2i
a0
a0 a
dz
a

(55)

where a = |a|ei , so the limit is different for every . If a is real, then = 0 and the limit is 1, but if a is
imaginary then = /2 and the limit is 1. If |a| = e then a 0 in a logarithmic spiral as , but
there is no limit in that case since e2i keeps spinning around the unit circle without ever converging to
anything. We cannot define a limit as a 0, so z is not differentiable with respect to z. It is special for a
function to be z-differentiable.
The statement that the complex derivative of f (z) exists in a neighborhood of a point z has powerful
consequences. Thats because the limit in the definition of the derivative (7) can be taken in many different
ways. If we take a z = x real, we find that
f (z + z) f (z)
u(x + x, y) u(x, y)
v(x + x, y) v(x, y)
df
= lim
= lim
+ i lim
x0
x0
dz z0
z
x
x
u
v
=
+i ,
x
x
but if we pick a z = iy pure imaginary, we obtain
df
f (z + z) f (z)
u(x, y + y) u(x, y)
v(x, y + y) v(x, y)
= lim
= lim
+ i lim
y0
y0
dz z0
z
i y
i y
u v
=i
+
.
y
y

(56)

(57)

80
If the df /dz exists, the limit should be the same no matter how z 0, hence (56) and (57) must be
identical implying that
u
v
u
v
(58)
=
,
= .
x
y
y
x
These are the Cauchy-Riemann equations relating the partial derivatives of the real and imaginary part of
a function of a complex variable f (z) = u(x, y) + i v(x, y). This derivation shows that the Cauchy-Riemann
equations are necessary conditions on u(x, y) and v(x, y) if f (z) is differentiable in a neighborhood of z. If
df /dz exists then the Cauchy-Riemann equations (58) necessarily hold.
Example 1: The function f (z) = z 2 has u = x2 y 2 and v = 2xy. Its z-derivative dz 2 /dz = 2z exists
everywhere and the Cauchy-Riemann equations (58) are satisfied everywhere since u/x = 2x = v/y
and u/y = 2y = v/x.
Example 2: The function f (z) = z = x iy has u = x, v = y. Its z-derivative dz /dz =?! does not
exist anywhere as we showed earlier and the Cauchy-Riemann equations (58) do not hold anywhere since
u/x = 1 6= v/y = 1.
B The converse is also true, if the Cauchy-Riemann equations are satisfied in a neighborhood of a point
(x, y) then the functions u(x, y) and v(x, y) are called conjugate functions and they in fact consitute the real
and imaginary part of a differentiable function of a complex variable f (z). To prove this we need to show
that the z-derivative of the function f (z) u(x, y) + iv(x, y) exists independently of how the limit is taken.
Writing a = + i with and real, we have
f (z + a) f (z)
df
= lim
dz a0
a


u(x + , y + ) + iv(x + , y + ) u(x, y) + iv(x, y)
= lim
a0
+ i




u(x + , y + ) u(x, y) + i v(x + , y + ) v(x, y)
= lim
.
a0
+ i

(59)

Now the functions u(x, y) and v(x, y) being differentiable in the neigborhood of (x, y) implies that locally
we can write
u
u
+
+ o(a),
x
y
v
v
v(x + , y + ) v(x, y) =
+
+ o(a)
x
y

u(x + , y + ) u(x, y) =

(60)

where the derivatives are evaluated at the point (x, y) and o(a) (little oh of a) is a remainder that goes
to zero faster than a, so lima0 o(a)/a = 0. (The notation O(a) (big Oh of a) denotes an expression that
goes to zero as fast as a so lima0 O(a)/a = C for some complex constant C.) Using these local expansions
and the Cauchy-Riemann equations (58) to replace the y-derivatives by x-derivatives, we can rewrite (59) as



 

u
v
+ i
u
v
+i
lim
=
+i
,
(61)
x
x a0 + i
x
x
hence the limit is indeed independent of how a = + i tends to zero.

## A function f (z) that is differentiable in a neighborhood of z is said to be analytic (or holomorphic ) in

that neighborhood. A function f (z) = u(x, y) + iv(x, y) is analytic in a neighborhood of z if and only if the
Cauchy-Riemann equations (58) are satisfied in that neighborhood.
Another important consequence of z-differentiability and the Cauchy-Riemann equations (58) is that the
real and imaginary parts of a differentiable function f (z) = u(x, y) + i v(x, y) both satisfy Laplaces equation
2u 2u
2v
2v
+
=
0
=
+
.
x2
y 2
x2
y 2

(62)

c
F.
Waleffe, Math 321, 2013/1/21

81

Exercises:
1. Deduce (62) from the Cauchy-Riemann equations (58). Verify these results (58) and (62), for f (z) = z 2 ,
ez , ln z, etc.
2. Is the function |z| analytic? Why? What about the functions Re(z) and f (|z|)?
3. Given u(x, y) find its conjugate function v(x, y), if possible,
such that u(x, y) + iv(x, y) f (z), for (i)
p
u = y; (ii) u = x + y; (iii) u = cos x cosh y, (iv) u = ln x2 + y 2 .
4. Substituting (60) into (59) we obtain



 



u
v

u
v

o(a)
+i
lim
+
+i
lim
+ lim
a0 a
x
x a0 + i
y
y a0 + i




u
v
1 u
v
=
+i
+
+i
x
x
i y
y
since


lim

a0

+ i


= 1,

lim

a0

+ i


=

1
.
i

## But this does not agree with (61). Why not?

5. Explain why we can write (60).

2.3

## Geometry of Cauchy-Riemann, Conformal Mapping

y

The Cauchy-Riemann equations (58) connecting the real and imaginary part of a z-differentiable function f (z) = u(x, y) + i v(x, y)
have remarkable geometric implications.
x

## For f (z) = z 2 = (x2 y 2 ) + i 2xy, the figure on the left shows

the contours u(x, y) = x2 y 2 = 0, 1, 4, 9 (blue) which are
hyperbolas with asymptotes y = x. The contours v(x, y) = 2xy =
0, 1, 4, 9 (red) are also hyperbolas but now with asymptotes
x = 0 and y = 0. Solid is positive, dashed is negative. The u and
v contours intersect everywhere at 90 degrees, except at z = 0.

The orthogonality of the contours of u = Re(f (z)) and v = Im(f (z)) wherever df /dz exists but does
not vanish, is general and follows directly from the Cauchy-Riemann equations. Indeed, the gradient u =
(u/x, u/y) at a point (x, y) is perpendicular to the contour of u(x, y) through that point (x, y). Likewise
the gradient v = (v/x, v/y) at that same point (x, y) is perpendicular to the isocontour of v(x, y)
through that point. The Cauchy-Riemann equations (58) imply that these two gradients are perpendicular
to each other
u v
u v
v v
v v
u v =
+
=

= 0.
(63)
x x y y
y x x y
Since gradients are always perpendicular to their respective isocontours, and here u v, the isocontours
of u and v are also orthogonal to each other. Therefore if f (z) = u(x, y)+iv(x, y) is z-differentiable (analytic)
then the isocontours of u(x, y) and v(x, y) are orthogonal to each other wherever they intersect. This is one
application of analytic functions: their real and imaginary parts u(x, y) and v(x, y) provide orthogonal
coordinates in the (x, y) plane.
Orthogonality of the u = Re(f (z)) and v = Im(f (z)) contours holds wherever df /dz exists except possibly
at critical points where df /dz = 0 and u = v = 0. For the example w = z 2 plotted above, the contours
are orthogonal everywhere except at z = 0 where dz 2 /dz = 2z = 0.

82
Conformal Mapping
We can visualize the function w = f (z) = u(x, y) + iv(x, y) as a mapping from the complex plane z = x + iy
to the complex plane w = u + iv. For example, f (z) = z 2 is the mapping z w = z 2 , or equivalently from
(x, y) (u, v) = (x2 y 2 , 2xy).

v
w = z2

z
x

z = w1/2

The vertical line u = u0 in the w-plane is the image of the hyperbola x2 y 2 = u0 in the z-plane. The
horizontal line v = v0 in the w-plane is the image of the hyperbola 2xy = v0 in the z-plane. Every point z
in the z-plane has a single image w in the w-plane, however the latter has two pre-images z and z in the
z-plane, indeed the inverse functions are z = w1/2 .
In polar form z = rei w = r2 ei2 . This means that every radial line from the origin with angle
from the x-axis in the z-plane is mapped to a radial line from the origin with angle 2 from the u-axis in
the w-plane.
The blue and red curves intersect at 90 degrees in both planes. That is the orthogonality of u and v, but
the dotted radial line intersects the blue and red curves at 45 degrees in both planes, for example. In fact
any angle between any two curves in the z-plane is preserved in the w-plane except at z = w = 0 where they
are doubled from the z to the w-plane.
This is another general property of z-differentiable complex functions f (z). If f (z) is z-differentiable
(analytic) then the mapping w = f (z) preserves all angles at all zs such that f 0 (z) 6= 0 when mapping from
the z-plane to the w-plane.
To show this in general, consider three neighboring points in the z-plane: z, z + dz1 and z + dz2 . We
are interested in seeing what happens to the angle between the two infinitesimal vectors dz1 and dz2 . If
dz1 = |dz1 |ei1 and dz2 = |dz2 |ei2 then the angle between those two vectors is = 2 1 and this is the
phase of dz2 /dz1 = |dz2 |/|dz1 |ei(2 1 ) .

dz2
z

dz1
w = f (z)

dw2
dw1
w

## The point z is mapped to the point w = f (z), the point z + dz1 is

mapped to w + dw1 = f (z + dz1 ) ' f (z) + f 0 (z)dz1 and z + dz2 is
mapped to w +dw2 = f (z +dz2 ) ' f (z)+f 0 (z)dz2 . The angle between
the infinitesimal vectors dw1 = f 0 (z)dz1 and dw2 = f 0 (z)dz2 at w is
the phase of dw2 /dw1 = dz2 /dz1 , hence it is identical to the angle
between dz1 and dz2 .

All angles are preserved by the mapping w = f (z) except where f 0 (z) = 0 and dw1 = dw2 = 0 at
first order. The dws would be 2nd order in dz if f 0 (z) = 0 but f 00 (z) 6= 0 yielding dw1 = f 00 (z)dz12 /2 and
dw2 = f 00 (z)dz22 /2 hence dw2 /dw1 = (dz2 /dz1 )2 = (|dz2 |/|dz1 |)2 ei2(2 1 ) and angles are doubled at such
points. Likewise angles at points where f 0 (z) = f 00 (z) = 0 but f 000 (z) 6= 0 would be tripled, and so on. For

c
F.
Waleffe, Math 321, 2013/1/21

83

example, the mapping w = z 2 preserves all angles except at the origin z = 0 where angles are doubled by this
mapping z = rei z 2 = r2 ei2 . The mapping w = z 3 preserves all angles except at z = 0 where angles are
tripled since z = rei becomes z 3 = r3 ei3 .
A mapping that preserves all angles is called conformal. Analytic functions f (z) provide conformal
mappings between z and w = f (z) at all points where f 0 (z) 6= 0.
Examples of conformal mappings

w = z2

z
x

z = w1/2

w = z 2 The (magenta) vertical line x = x0 maps to the parabola u = x20 y 2 , v = 2x0 y in the (u, v) plane
and the (green) horizontal lime y = y0 becomes the (green) parabola u = x2 y02 , v = 2xy0 in the (u, v)
plane. The green and magenta curves intersect at 90 degrees in both planes. Angles between the dotted line
and the green and magenta curves are the same in both planes, except at z = w = 0. What happens there?
The definition of the inverse function is w1/2 = |w|1/2 ei arg(w)/2 with < arg(w) . What would the
map look like if we defined 0 arg(w) < 2?

y=

v
w = ez

z
u

y =

z = ln w

w = ez = ex eiy Maps the strip < x < , < y to the entire w-plane. z = x0 + iy
w = ex0 eiy circles of radius ex0 in the w-plane (magenta). z = x + iy0 w = ex eiy0 radial lines
with polar angle arg(w) = y0 in w-plane (green). z = x + iax w = ex eiax radial lines out of the

84
origin in z-plane mapped to logarithmic spirals in w-plane since z = x + iax with a fixed (and real) becomes
w = ex eiax rei so r = ex , = ax and r = e/a in the w-plane (blue). Notes: z = 0 w = 1. ez+2i = ez ,
periodic of complex period 2i, so ez maps an infinite number of zs to the same w. The inverse function
z = ln w = ln |w| + i arg(w) showed in this picture corresponds to the definition < arg(w) . All angles
are preserved e.g. the angles between green and magenta curves, as well as between blue and colored curves,
except at w = 0. What zs correspond to w = 0?

v
w = cosh(z)

z = ln(w +

w2 1)

w = cosh(z) = (ez + ez )/2 = (ex eiy + ex eiy )/2 = cosh x cos y + i sinh x sin y u + iv. Maps the semiinfinite strip 0 x < , < y to the entire w-plane. cosh(z) = cosh(z) and cosh(z+2i) = cosh(z),
even in z and periodic of period 2i. x = x0 0 u = cosh x0 cos y, v = sinh x0 sin y, ellipses in the
w-plane (magenta). y = y0 0 u = cosh x cos y0 , v = sinh x sin y0 , hyperbolas in the w-plane
(green).

2 1), but
w
This mapping gives orthogonal,
confocal
elliptic
coordinates.
The
inverse
map
is
z
=
ln(w
+

for what definition of w2 1? (not Matlab!). The line from w = to w = 1 is a branch cut, our
definition for ln(w + w2 1) is discontinuous across that line.

Exercises:
1. Consider the mapping w = z 2 . Determine precisely where the triangle (i) (1, 0), (1, 1), (0, 1) in the
z-plane gets mapped to in the w = u + iv plane; (ii) same but for triangle (0, 0), (1, 0), (1, 1). Do not
simply map the vertices, determine precisely what happens to each edge of the triangles.
2. Analyze the mapping w = 1/z. Determine what isocontours of u and v look like in the z-plane.
Determine where radial lines ( = constant) and circles (r = constant) in the z-plane get mapped to
in the w-plane.
3. Analyze the mappings w = ez and w = cosh z = (ez + ez )/2.
4. Determine what happens to circles and radial lines in the z-plane under the Joukowski mapping w =
(z + 1/z).

## Integration of Complex Functions

Rb
What do we mean by a f (z)dz when f (z) is a complex function of the complex variable z and the bounds
a and b are complex numbers in the z-plane?

c
F.
Waleffe, Math 321, 2013/1/21

85

In general
R we need to specify the path C in the complex plane to go from a to b and we need to write the
integral as C f (z)dz. Then if z0 = a, z1 , z2 , . . . , zN = b are successive points on the path from a to b we
can define the integral as usual as
N
X

Z
f (z)dz = lim

zn 0

f (
zn )zn

(64)

n=1

where zn = zn zn1 and zn is a point on the path (or the line segment) between zn1 and zn . This
definition
R also provides a practical way to estimate the integral. In particular if |f (z)| M along the curve
then | C f (z)dz| M L where L 0 is the length of the curve from a to b. Note also that the integral from
a to b along C is minus that from b to a along the same curve since all the zn change sign for that curve.
If C is from a to b, well use C to denote the same path but with the opposite orientation, from b to a. If
we have a parametrization for the curve, say z(t) with t real and z(ta ) = a, z(tb ) = b then the integral can
be expressed as
Z
Z tb
 dz
dt.
(65)
f (z)dz =
f z(t)
dt
C
ta
Examples: To compute the integral of 1/z along the path C1 that consists
of the unit circle counterclockwise from a = 1 to b = i, we can parametrize
the circle using the polar angle as z() = ei then dz = iei d and
Z
C1

1
dz =
z

/2

C1

1 i

ie d = i ,
i
e
2

but along the path C2 which consists of the unit circle clockwise between
the same endpoints a = 1 to b = i

i
1

Z
C2

C2

1
dz =
z

Z
0

3/2

1 i
3
ie d = i .
ei
2

## Clearly the integral of 1/z from a = 1 to b = i depends on the path.

However for the function z 2 over the same two paths with z = ei , z 2 = ei2 and dz = iei d, we find
Z

z 2 dz =

C1

/2

iei3 d =

 i 1
1  i3/2
b3 a3
e
1 =
=
,
3
3
3

3/2

 i 1
1  i9/2
b3 a3
e
1 =
=
.
3
3
3
C2
0
Rb
Thus for z 2 it appears that we obtain the expected result a z 2 dz = (b3 a3 )/3, independently of the path.
Weve only checked two special paths, so we do not know for sure but, clearly, a key issue is to determine
when an integral depends on the path of integration or not.
Z

iei3 d =

z dz =

3.1

Cauchys theorem

The integral of a complex function is independent of the path of integration if and only if the integral over
a closed contour always vanishes. Indeed if C1 and C2 are two distinct paths from a to b then the curve
C = C1 C2 which goes from a to b along C1 then back from b to a along C2 is closed. The integral along
that close curve is zero if and only if the integral along C1 and C2 are equal.
Writing z = x + iy and f (z) = u(x, y) + iv(x, y) the complex integral around a closed curve C can be
written as
I
I
I
I
f (z)dz = (u + iv)(dx + idy) = (udx vdy) + i (vdx + udy)
(66)
C

86
hence the real and imaginary parts of the integral are real line integrals. These line integrals can be turned
into area integrals using the curl form of Greens theorem:
I
I
I
f (z)dz =
(udx vdy) + i (vdx + udy)
C
C


ZC 
Z 
u
u v
v

dA + i

dA,
(67)
=

x y
x y
A
A
where A is the interior domain bounded by the closed curve C. But the Cauchy-Riemann equations (58) give
u v

= 0,
x y

v
u
+
=0
x y

(68)

whenever the function f (z) is analytic in the neighborhood of the point z = x + iy. Thus both integrals
vanish if f (z) is analytic at all points of A. This is Cauchys theorem,
I
f (z)dz = 0
(69)
C

## if df /dz exists everywhere inside (and on) the closed curve C.

Functions like ez , cos z, sin z and z n with n 0 are differentiable for all z, hence the integral of such
functions around any closed contour C vanishes. But what about the integral of simple functions such as
z n with n > 0? Those functions are analytic everywhere except at z = 0 so the integral of 1/z n around
any closed contour that does not include the origin will still vanish. Lets figure out what happens when
the contour circles the origin. Consider the circle z = Rei , which is of radius R and centered at the origin,
oriented counter-clockwise, then as before dz = iRei d and

Z 2
Z 2
I
dz
iRei
2i if n = 1,
1n
i(1n)
=
d
=
iR
e
d
=
n
n ein
0
if n 6= 1.
z
R
0
|z|=R
0
This result in fact holds for any closed counter-clockwise curve C around the origin.

A
C0

## circle C0 of radius  > 0 as small as needed to be inside the outer

closed curve C. Now, the function 1/z n is analytic everywhere inside
the domain A bounded by the counter-clockwise outer boundary C and
the inner circle boundary C0 oriented clockwise (emphasized here by
the minus sign) so the interior A is always to the left of the boundary,
as required by convention for the curl-form of Greens theorem in vector
calculus.

By Cauchys theorem, this implies that the integral over the closed contour, which consists of the sum of
the outer counter-clockwise curve C and the inner clockwise small circle about the origin C0 , vanishes
I
I
I
1
1
1
dz
=
0

dz
=
dz.
n
n
n
z
z
z
C+(C0 )
C
C0
In other words the integral about the closed contour C equals the integral about the closed inner circle C0 ,
both of which have the same orientation, counter-clockwise in this case. 1
This result can be slightly generalized to the functions (z a)n , n < 0 for any a (consider a small circle
about a: z = a + ei , etc.) so, combining with Cauchys theorem when n 0 we get the important result
that for integer n = 0, 1, 2, . . . and a closed contour C oriented counter-clockwise then

I
2i if n = 1 and C encloses a,
(z a)n dz =
(70)
0
otherwise.
C
1 Recall that we used this singularity isolation technique in conjunction with the divergence theorem to evaluate the flux of
r
/r2 = r/r3 (the inverse square law of gravity and electrostatics) through any closed surface enclosing the origin, as well as
in conjunction with Stokes theorem for the circulation of a line current B = (
z r)/|
z r|2 = /
= around a loop
enclosing the z-axis.

c
F.
Waleffe, Math 321, 2013/1/21

87

Connection with ln z
The integral of 1/z is of course directly related to ln z, the natural log of z which can be defined as the
antiderivative of 1/z that vanishes at z = 1, that is
Z z
1
d.
ln z

1
We use as the dummy variable of integration since z is the upper limit of integration.
But along what path from 1 to z? Heres the 2i multiplicity again. We have seen earlier in this section
that the integral of 1/z from a to b depends on how we go around the origin. If we get one result along
one path, we can get the same result + 2i if we use a path that loops around the origin one more time
counterclockwise than the original path. Or 2i if it loops clockwise, etc. Look back at exercise (7) in
section (3.1). If we define a range for arg(z), e.g. 0 arg(z) < 2, we find
Z z
1
d = ln |z| + i arg(z) + 2ik
(71)

1
for some Rspecific k that depends on the actual path taken from 1 to z and our definition of arg(z). The
z
notation 1 is not complete for this integral. The integral is path-dependent and it is necessary to specify
that path in more details, however all possible paths give the same answer modulo 2i.

Exercises:
closed paths are oriented counterclockwise unless specified otherwise.
1. Calculate the integral of f (z) = z + 2/z along the path C that goes once around the circle |z| = R > 0.
Discuss result in terms of R.
2. Calculate the integral of f (z) = az + b/z + c/(z + 1), where a, b and c are complex constants, around
(i) the circle of radius R > 0 centered at z = 0, (ii) the circle of radius 2 centered at z = 0, (iii) the
triangle 1/2, 2 + i, 1 2i.
3. Calculate the integral of f (z) = 1/(z 2 4) around (i) the unit circle, (ii) the parallelogram 0, 2 i, 4,
2 + i. [Hint: use partial fractions]
4. Calculate the integral of f (z) = 1/(z 4 1) along the circle of radius 1 centered at i.
5. Calculate the integral of sin(1/(3z)) over the square 1, i, 1, i. [Hint: use the Taylor series for sin z].
6. Calculate the integral of 1/z from z = 1 to z = 2ei/4 along (i) the path 1 2 along the real line then
2 2ei/4 along the circle of radius 2, (ii) along 1 2 on the real line, followed by 2 2ei/4 along
the circle of radius 2, clockwise.
7. If a is an arbitrary complex number, show that the integral of 1/z along the straight line from 1 to a
is equal to the integral of 1/z from 1 to |a| along the real line + the integral of 1/z along the circle of
radius |a| from |a| to a along a certain circular path. Draw a sketch!! Discuss which circular path and
calculate the integral. What happens if a is real but negative?
8. Does the integral of 1/z 2 from z = a to z = b (with a and b complex) depend on the path? Explain.
9. Pause and marvel at the power of (69) combined with (70). Continue.
P
n
10. The expansionH (43) with a =
H 1/z = nn=0 (1 z) . Using this expansion together with (70)
P1 gives
we find that |z|=1 dz/z =
n=0 |z|=1 (1 z) dz = 0. But this does not match with our explicit
H
calculation that |z|=1 dz/z = 2i. Whats wrong?!

88

3.2

Cauchys formula

The combination of (69) with (70) and partial fraction and/or Taylor series expansions is quite powerful
as we have already seen in the exercises, but there is another fundamental result that can be derived from
them. This is Cauchys formula
I
f (z)
dz = 2if (a)
(72)
C za
which holds for any closed counterclockwise contour C that encloses a provided f (z) is analytic (differentiable)
everywhere inside and on C.
H
The proof of this result follows the approach we used to calculate C dz/(z a) in section 3.1. Using
Cauchys theorem (69), the integral over C is equal to the integral over a small counterclockwise circle Ca of
radius  centered at a. Thats because the function f (z)/(z a) is analytic in the domain between C and
the circle Ca : z = a + ei with = 0 2, so
I
C

f (z)
dz =
za

I
Ca

f (z)
dz =
za

## f (a + ei ) id = 2if (a).

(73)

The final step follows from the fact that the integral has the same value no matter what  > 0 we pick. Then
taking the limit  0+ , the function f (a + ei ) f (a) because f (z) is a nice continuous and differentiable
function everywhere inside C, and in particular at z = a.
Cauchys formula has major consequences that follows from the fact that it applies to any a inside C. To
emphasize that, let us rewrite it with z in place of a, using as the dummy variable of integration
I
f ()
d.
(74)
2if (z) =
C z
This provides an integral formula for f (z) at any z inside C in terms of its values on C. Thus knowing f (z)
on C completely determines f (z) everywhere inside the contour! This formula is at the basis of boundary
integral methods.
Mean Value Theorem
Since (74) holds for any closed contour C as long as f (z) is continuous and differentiable inside and on that
contour, we can write it for a circle of radius r centered at z, = z + rei where d = irei d and (74) yields
1
f (z) =
2

f (z + rei )d

(75)

which states that f (z) is equal to its average over a circle centered at z. This is true as long as f (z) is
differentiable at all points inside the circle of radius r. This mean value theorem also applies to the real and
imaginary parts of f (z) = u(x, y) + iv(x, y). It implies that u(x, y), v(x, y) and |f (z)| do not have extrema
inside a domain where f (z) is differentiable. Points where f 0 (z) = 0 and therefore u/x = u/y =
v/x = v/y = 0 are saddle points, not local maxima or minima.
Generalized Cauchy formula and Taylor Series
Cauchys formula also implies that if f 0 (z) exists in the neighborhood of a point a then f (z) is infinitely
differentiable in that neighborhood! Furthermore, f (z) can be expanded in a Taylor series about a that
converges inside a disk whose radius is equal to the distance between a and the nearest singularity of f (z).
That is why we use the special word analytic instead of simply differentiable. For a function of a complex
variable being differentiable in a neighborhood is a really big deal!
B To show that f (z) is infinitely differentiable, we can show that the derivative of the right-hand side
of (74) with respect to z exists by using the limit definition of the derivative and being careful to justify

c
F.
Waleffe, Math 321, 2013/1/21

89

existence of the integrals and the limit. The final result is the same as that obtained by differentiating with
respect to z under the integral sign, yielding
I
f ()
d.
(76)
2if 0 (z) =
(
z)2
C
Doing this repeatedly we obtain
2if

(n)

I
(z) = n!
C

f ()
d.
( z)n+1

(77)

where f (n) (z) is the nth derivative of f (z) and n! = n(n 1) 1 is the factorial of n. Since all the integrals
exist, all the derivatives exist. Formula (77) is a generalized Cauchy formula.

B Another derivation of these results that establishes convergence of the Taylor series expansion at the
same time is to use the geometric series (11) and the slick trick that we used in (43) to write

X
1
1
1
1
(z a)n
=
=
=
z

a
z
( a) (z a)
a 1
( a)n+1
n=0
a

where the geometric series converges provided |z a| < | a|. Cauchys formula (74) then becomes
I
I X
I

X
f ()
(z a)n
f ()
n
2if (z) =
f ()
d =
(z a)
d =
d
n+1

z
(

a)
(

a)n+1
Ca
Ca n=0
Ca
n=0

(78)

(79)

where Ca is a circle centered at a whose radius is as large as desired provided f (z) is differentiable inside and
on the circle. For instance if f (z) = 1/z then the radius of the circle must be less then |a| since f (z) has a
singularity at z = 0 but is nice everwhere else. If f (z) = 1/(z + i) then the radius must be less than |a + i|
which is the distance between a and i since f (z) has a singularity at i. In general, the radius of the circle
must be less than the distance between a and the nearest singularity of f (z). To justify interchanging the
integral and the series we need to show that each integral exists and that the series of the integrals converges.
If |f ()| M on Ca and |z a|/| a| q < 1 since Ca is a circle of radius r centered at a and z is inside
that circle while is on the circle so a = rei , d = irei d and

I

(z a)n f ()

(80)
d 2M q n

n+1
Ca ( a)
showing that all integrals converge and the series of integrals also converges since q < 1.
The series (79) provides a power series expansion for f (z)
I

X
X
f ()
2if (z) =
(z a)n
d
=
cn (a)(z a)n
n+1
(

a)
Ca
n=0
n=0

(81)

that converges inside a disk centered at a with radius equal to the distance between a and the nearest
singularity of f (z). The series can be differentiated term-by-term and the derivative series also converges in
the same disk. Hence all derivatives of f (z) exist in that disk. In particular we find that
I
f ()
f (n) (a)
=
d.
(82)
cn (a) = 2i
n!
(

a)n+1
Ca
which is the generalized Cauchy formula (77) and the series (79) is none other than the familiar Taylor Series

X
f 00 (a)
f (n) (a)
2
f (z) = f (a) + f (a)(z a) +
(z a) + =
(z a)n .
2
n!
n=0
0

(83)


Finally, Cauchys theorem tells us that the integral on the right of (82) has the same value on any closed
contour (counterclockwise) enclosing a but no other singularities of f (z), so the formula holds for any such
closed contour as written in (77). However convergence of the Taylor series only occurs inside a disk centered
at a and of radius equal to the distance between a and the nearest singularity of f (z).

90

Exercises:
1. Why can we take (z a)n outside of the integrals in (79)?
2. Verify the estimate (80). Why does that estimate implies that the series of integrals converges?
3. Consider the integral of f (z)/(z a)2 about a small circle Ca of radius  centered at a: z = a + ei ,
0 < 2. Study the limit of the -integral as  0+ . Does your limit agree with the generalized
Cauchy formula (77), (82)?
4. Find the Taylor series of 1/(1 + x2 ) and show that its radius of convergence is |x| < 1 [Hint: use
the geometric series]. Explain why the radius of convergence is one in terms of the singularities of
1/(1 + z 2 ). Would the Taylor series of 1/(1 + x2 ) about a = 1 have a smaller or larger radius of
convergence than that about a = 0?
5. Show that since an analytic function f (z) = u(x, y) + iv(x, y) is infinitely differentiable, its real and
imaginary parts are infinitely differentiable with respect to x and y. Show that 2 u = 2 v = 0 where
2 = 2 /x2 + 2 /y 2 is the 2D Laplacian and 2 (x, y) = 0 is Laplaces equation. Functions that
satisfy Laplaces equations are called harmonic functions. [Hint: use the Cauchy-Riemann equations
repeatedly].
6. Show that ekx cos ky and ekx sin ky are solutions of Laplaces equation for any real k. [Hint: consider
the complex function f (z) = ekz ]. These solutions occur in a variety of applications, e.g. surface gravity
waves with the surface at x = 0.
7. Calculate the integrals of cos(z)/z n and sin(z)/z n over the unit circle, where n is a positive integer.

## Applications of complex integration

One application of complex (a.k.a. contour) integration is to turn difficult real integrals into simple complex
integrals.
Example 1: What is the average of the function F (t) = 3/(5 + 4 cos t)? Since F (t) is periodic of period
2/, let = t and the average of F (t) is the same as the average of f () = 3/(5 + 4 cos ) over one period.
R 2
That average is (2)1 0 f ()d. To compute that integral we think integral over the unit circle in the
complex plane! Indeed the unit circle with |z| = 1 has the simple parametrization
z = ei dz = iei d d =

dz
.
iz

(84)

Furthermore
cos =

ei + ei
z + 1/z
=
,
2
2

so we obtain
Z
0

3
d =
5 + 4 cos

I
|z|=1

3
dz
3
=
5 + 2(z + 1/z) iz
2i

I
|z|=1

3
dz
=
1
2i
(z + 2 )(z + 2)

2i
z+2


= 2.

(85)

z= 21

What magic was that? We turned our integral of a periodic function over its period into an integral from
0 to 2 (that can always be done), then we turned that integral into a complex integral over the unit circle
(that can always be done too). That led us to the integral over a closed curve of a relatively nice function
(thats not always the case).

c
F.
Waleffe, Math 321, 2013/1/21

91

z = ei Our complex function has two simple poles, at 1/2 and 2. Since 2
2

is outside the unit circle, it does not contribute to the integral, but the
simple pole at 1/2 does. So the integrand has the form g(z)/(z a)
with a = 1/2 inside our domain and g(z) = 1/(z+2), is a good analytic
function inside the unit circle. So one application of Cauchys formula,
et voil`
a. The function 3/(5 + 4 cos ) which oscillates between 1/3 and
3 has an average of 1.

1/2

## Related exercises: calculate

Z
0

3 cos n
d
5 + 4 cos

(86)

where n is an integer. [Hint: use symmetries to write the integral in [0, 2], do not use 2 cos n = ein + ein
(why not? try it out to find out the problem), use instead cos n = Re(ein ) with n 0.] An even function
f () = f () periodic of period 2 can be expanded in a Fourier series f () = a0 + a1 cos + a2 cos 2 +
a3 cos 3 + . This expansion is useful in all sorts of applications: numerical calculations, Rsignal processing

etc. The coefficient a0 is the average of f (). The other coefficients are given by an = 2 1 0 f () cos nd,
i.e. the integrals (86). So what is the Fourier (cosine) series of 3/(5 + 4 cos )? Can you say something about
its convergence?

Example 2:
Z

dx
=
1 + x2

(87)

This integral is easily done since 1/(1 + x2 ) = d/dx(arctan x), but we use contour integration to demonstrate
the method. The integral is equal to the integral of 1/(1 + z 2 ) over the real line z = x with x = .
That complex function has two simple poles at z = i since z 2 + 1 = (z + i)(z i).

C
i

Cx

## So we turn this into a contour integration by considering the

closed path consisting of Cx : z = x with x = R R (real line)
+ the semi-circle C : z = Rei with = 0 . Since i is the
only simple pole inside our closed contour C = Cx + C , Cauchys
formula gives


I
I
dz
(z + i)1
1
=
dz = 2i
= .
2
zi
z + i z=i
C z +1
C
To get the integral we want, we need to take R and figure
out the C part. That part goes to zero as R since

Z
Z
R
dz iRei d
Rd
= 2
.
=
<

2
2
2i
2
+1
R 1
C z + 1
0 R e
0 R 1

## Example 3: We use the same technique for

Z

dx
=
1 + x4

Z
R

dz
1 + z4

(88)

Here the integrand has 4 simple poles at zk = ei/4+2i(k1)/4 , where k = 1, 2, 3, 4. These are the roots of
z 4 = 1 = ei+2i(k1) . They are on the unit circle, equispaced by /2. Note that z1 and z3 = z1 are
the roots of z 2 i = 0 while z2 and z4 = z2 are the roots of z 2 + i = 0, so z 4 + 1 = (z 2 i)(z 2 + i) =
(z z1 )(z + z1 )(z z2 )(z + z2 ).

92
We use the same closed contour C = Cx + C as above but now
there are two simple poles inside that contour. We need to isolate
I
I
I
=
+
.

C
C2

C1

z2
R

z1

Cx
z4

Likewise

I
C2

dz
= 2i
4
z +1

C2

## Then Cauchys formula gives



I
1

dz
= 2i
=
.
2 z2)
4+1
z
2z
(z
2z
1 1
1
C1
2

z3

C1

1
2z2 (z22

z12 )

=
= .
2(z2 )
2z4
2z1

These manipulations are best understood by looking at the figure which shows that z2 = z4 = z1 together
with z12 = i, z22 = i. Adding both results gives
I
C

dz

=
4
z +1
2

1
1
+
z1
z1


=

ei/4 + ei/4

= cos = .
2
4
2

As before we need to take R and figure out the C part. That part goes to zero as R since
Z
Z
Z

dz iRei d
Rd
R

=
<
= 4
.

4
4
4i
4
+1
R 1
C2 z + 1
0 R e
0 R 1

## We could extend the same method to

Z

x2
dx
1 + x8

(89)

R
(and the much simpler x/(1 + x8 )dx = 0 ;-) ) We would use the same closed contour again, but there
would be 4 simple poles inside it and therefore 4 separate contributions.

Example 4:
Z

dx
=
(1 + x2 )2

Z
R

dz
=
(z 2 + 1)2

Z
R

dz
.
(z i)2 (z + i)2

(90)

We use the same closed contour once more, but now we have a double pole inside the contour at z = i. We
can figure out the contribution from that double pole by using the generalized form of Cauchys formula
(77). The integral over C vanishes as R as before and


Z
d

dx
2
=
2i
(z
+
i)
= .
(91)
2 )2
(1
+
x
dz
2

z=i
A (z a)n in the denominator, with n a positive integer, is called and n-th order pole.

Warning: Cauchys generalized formula is cool but can fail where Taylor coupled with (70) will succeed.
Example:
I
ez+1/z dz
(92)
|z|=1

which has an infinite order pole (a.k.a. essential singularity) at z = 0 and for which Cauchys formula is
not directly useful, but Taylor series and the simple (70) makes this a relative snap for the thinking person.

c
F.
Waleffe, Math 321, 2013/1/21

93

This all looks unbelievably mysterious if you do not understand the key ideas and are just trying to plug
into a formula. If you understand, it is pretty magical.

Example 5:
Z

sin x
dx = .
(93)
x
R
This is a trickier problem. Our impulse is to consider R (sin z)/z dz but that integrand is a super good
function! Indeed (sin z)/z = 1 z 2 /3! + z 4 /5! is analytic in the entire plane, its Taylor series converges
in the entire plane. For obvious reasons such functions are called entire functions. But we love singularities
now since they actually make our life easier. So we write
Z
Z ix
Z iz
sin x
e
e
dx = Im
dx = Im
dz,
(94)
x
x

R z
where Im stands for imaginary part of. Now we have a nice simple pole at z = 0. But thats another
problem since the pole is on the contour! We have to modify our favorite contour a little bit to avoid the
pole by going over or below it. If we go below and close along C as before, then well have a pole inside our
contour. If we go over it, we wont have any pole inside the closed contour. We get the same result either
way (luckily!), but the algebra is a tad simpler if we leave the pole out.
So we consider the closed contour C = C1 + C2 + C3 + C4 where C1
is the real axis from R to , C2 is the semi-circle from  to 
in the top half-plane, C3 is the real axis from  to R and C4 is our
good old semi-circle of radius R. The integrand eiz /z is analytic
everywhere except at z = 0 where it has a simple pole, but since
that
pole is outside our closed contour, Cauchys theorem gives
H
=
0 or
C
Z
Z
Z

C4
C2
C1
R

C3


=
C1 +C3

C2

C4

## The integral over the semi-circle C2 : z = ei , dz = iei d, is

Z
Z
i
eiz
dz = i

eie d i as  0.
z
C2
0
R
As before wed like to show that the C4 0 as R . This is trickier than the previous cases weve
encountered. On the semi-circle z = Rei and dz = iRei d, as weve seen so many times, we dont even
need to think about it anymore (do you?), so
Z
Z
Z
i
eiz
dz = i
eiRe d = i
eiR cos eR sin d.
(95)
0
0
C4 z
This is a pretty scary integral. But with a bit of courage and intelligence its not as bad as it looks. The
integrand has two factors, eiR cos whose norm is always 1 and eR sin which is real and exponentially small
for all in 0 < < , except at 0 and where it is exactly 1. Sketch eR sin in 0 and its pretty
clear the integral should go to zero as R . To show this rigorously, lets consider its modulus (norm) as
we did in the previous cases. Then since (i) the modulus of a sum is less or equal to the sum of the moduli
(triangle inequality), (ii) the modulus of a product is the product of the moduli and (iii) |eiR cos | = 1 when
R and are real (which they are)
Z
Z

0
eiR cos eR sin d <
eR sin d
(96)
0

we still cannot calculate that last integral but we dont need to. We just need to show that it is smaller than
something that goes to zero as R , so our integral will be squeezed to zero.

94
Plotting sin for 0 , we see that it is symmetric with respect
to /2 and that 2/ sin when 0 /2, or changing the
signs 2/ sin and since ex increases monotonically with x,

2/
1

/2

## in 0 /2. This is Jordans Lemma

Z
Z
Z /2
Z /2

1 eR
2R/
R sin
R sin
iRei
<

e
d
=

e
d
<
2
e
d
=
2
e
d

R
0
0
0
0
R
so C4 0 as R and collecting our results we obtain (93).

(97)

Exercises
All of the above of course, +
R 2
1. Calculate 0 1/(a + b sin s)ds where a and b are real numbers. Does the integral exist for any real
values of a and b?
R
2. Make up and solve an exam question which is basically the same as dx/(1 + x2 ) in terms of the
logic and difficulty, but is different in the details.
R
3. Calculate dx/(1 + x2 + x4 ). Can you provide an upper bound for this integral based on integrals
calculated earlier?
R
R

2
2
2
4. Given the Poisson integral ex dx = , what is ex /a dx where a is real? (that should
R x2 /a2 ikx
be easy!). Next, calculate e
e dx where a and k are arbitrary real numbers. [This is the
2

Fourier transform of the Gaussian ex /a . Complete the square. Note that you can pick a, k > 0
(why?), then integrate over an infinite rectangle that consists of the real axis and comes back along
the line y = ka2 /2 (why? justify).
5. The Fresnel integrals come up in optics and quantum mechanics. They are
Z
Z
cos x2 dx, and
sin x2 dx.

R 2
Calculate them both by considering 0 eix dx. The goal is to reduce this to a Poisson integral. This
would be the case if x2 (ei/4 x)2 . So consider the closed path that goes from 0 to R on the real
axis, then on the circle of radius R to Rei/4 then back on the diagonal z = sei/4 with s real.

Branch cuts
Ok if youve made it this far and are still thinking hard, you may have noticed that weveonly dealt with
integer powers. What about fractional powers? First lets take a look at the integral of z over the unit
circle z = ei from = 0 to 0 + 2
I
|z|=1

0 +2

z dz =
0

ei/2 iei d =

2i 3i0 /2 i3
4i 3i0 /2
e
(e 1) =
e
3
3

(98)

The answer depends on 0 ! The integral over the closed circle depends on
where we start on the circle?!
This is weird, whats going on? The problem is with the definition of z. We have implicitly defined

c
F.
Waleffe, Math 321, 2013/1/21

95

## z = |z|1/2 ei arg(z)/2 with

0 arg(z) < 0 + 2 or 0 < arg(z) 0 + 2. But each 0 corresponds to a
different definition for z.

Forreal variables the equation y 2 = x 0 had two solutions y = x and we defined x 0. Cant we
define z
in a similar way? The equationw2 = z in the complex plane always hastwo solutions. We can say

z and z but we still need to define z since z is complex.Could we define z to be such that its real
part is always positive? yes, and thats equivalent to defining z = |z|1/2 ei arg(z)/2 with < arg(z) <
(check it). But thats not complete because the sqrt of a negative real number is pure imaginary, so what do
we do about those numbers? We can define < arg(z) , so real negative numbers have arg(z) = , not
, by definition. This is indeed the definition that
Matlab chooses. But it may not be appropriate for our
problem because it introduces a discontinuity in z as we crossthe negative real axis. If that is not desirable
for our problem than we could define 0 arg(z) < 2. Now z is continuous across the negative real axis
but there is a jump across the positive real axis. Not matter what definition we pick, there will always be a
discontinuity somewhere. We cannot go around z = 0 without encountering such a jump, z = 0 is called a
branch point and the semi-infinite curve emanating from z = 0 across which arg(z) jumps is called a branch
cut.
Heres a simple example that illustrates the extra subtleties and techniques.
Z
x
dx
1 + x2
0

First note that this integral does indeed exist since x/(1 + x2 ) x3/2 as x and therefore goes to
zero fast enough to be integrable.
Our first impulse is to see this as an integral over the real axis from 0 to

2
of the complex function
z/(z
+ 1). That function has simple poles at z = i as we know well. But

theres a problem: z is not analytic at z = 0 which is on our contour again. No big deal, we can avoid
it as we saw in the (sin x)/x example. So lets
take the same 4-piece closed contour as in that problem.
But were not all set yet because we have a z, what do we mean by that when z is complex? We need to
define that function
so that it is analytic everywhere inside and onour contour. Writing z = |z|ei arg(z) then

we can define z = |z|1/2 ei arg(z)/2 . We need to define arg(z) so z is analytic inside and on our contour.
The definitions arg(z)
< would not work with our decision to close in the upper half place. Why?
because arg(z) and thus z would not be continuous at the junction between C4 and C1 . We could close in
the lower half plane, or we can pick another branch cut for arg(z). The standard definition < arg(z)
would work. Try it! Well take a more exotic choice to illustrate branch cuts more dramatically. Lets pick
0 arg(z) < 2.
To be continued . . .