Course Notes for the University of Wisconsin-Madison Mathematics 321 (Analysis) course.

© All Rights Reserved

67 views

Course Notes for the University of Wisconsin-Madison Mathematics 321 (Analysis) course.

© All Rights Reserved

- Physics Notes
- A Small Compendium on Vector and Tensor Algebra and Calculus
- Additional Mathematics Project SPM
- HRW Physics 3 Vectors
- hints2
- 1Q09 Influ Coeff
- Lec-12.pdf
- Lect-week3_ENE2006_sp2019.pdf
- Pre-calculus / math notes (unit 21 of 22)
- Introduction to Inverse Problems - Sari Lasanen
- GlowScript Annotations 1
- dela_4-1.pdf
- Physics Topologic
- math213a01
- Dynamic Eigen Value Analysis
- Lab Exp 1
- Graphics Slides 03
- LaTeX RefSheet
- Effect of Disturbance Directions on Closed-Loop
- Topic 1 Problem Set 2016

You are on page 1of 95

Notes for Math 321 @ UW Madison

c

Fabian

Waleffe

Department of Mathematics &

Department of Engineering Physics

University of Wisconsin, Madison

Spring 2013

Contents

1 Vectors

1

Get your bearings . . . . . . . . . . . . . . . . . . . . . . . .

1.1

Magnitude and direction . . . . . . . . . . . . . . . . .

1.2

Polar and Cartesian representations of 2D vectors . .

1.3

Spherical, Cylindrical and Cartesian representations in

2

Addition and scaling of vectors . . . . . . . . . . . . . . . . .

3

Abstract Vector Spaces . . . . . . . . . . . . . . . . . . . . .

4

Bases and Components . . . . . . . . . . . . . . . . . . . . . .

5

Points and Coordinates . . . . . . . . . . . . . . . . . . . . .

5.1

Position vector . . . . . . . . . . . . . . . . . . . . . .

5.2

Lines and Planes . . . . . . . . . . . . . . . . . . . . .

6

Dot (a.k.a. Scalar or Inner) Product . . . . . . . . . . . . . .

7

Orthonormal bases . . . . . . . . . . . . . . . . . . . . . . . .

7.1

Definition of dot product for Rn (Optional) . . . . . .

7.2

Norm of a vector (Optional) . . . . . . . . . . . . . . .

7.3

Definition of norm for Rn (Optional) . . . . . . . . . .

8

Cross (a.k.a. Vector or Area) Product . . . . . . . . . . . . .

8.1

Orientation of Bases . . . . . . . . . . . . . . . . . . .

8.2

Double cross product (Triple vector product) . . . .

9

Index notation . . . . . . . . . . . . . . . . . . . . . . . . . .

9.1

Levi-Civita (a.k.a. alternating or permutation) symbol

9.2

Sigma notation, free and dummy indices . . . . . . . .

9.3

Einstein summation convention . . . . . . . . . . . . .

10 Mixed (or Box) product and Determinant . . . . . . . . . .

11 Points, Lines, Planes, etc. . . . . . . . . . . . . . . . . . . . .

12 Orthogonal Transformations and Matrices . . . . . . . . . . .

12.1 Change of Cartesian Basis . . . . . . . . . . . . . . . .

12.2 Matrices . . . . . . . . . . . . . . . . . . . . . . . . . .

12.3 Orthogonal Matrices and Euler Angles . . . . . . . . .

12.4 Determinant of a matrix . . . . . . . . . . . . . . . . .

12.5 Three views of Ax = b . . . . . . . . . . . . . . . . . .

12.6 Eigenvalues and Eigenvectors (Math 320 not 321) . . .

. .

. .

. .

3D

. .

. .

. .

. .

. .

. .

. .

. .

. .

. .

. .

. .

. .

. .

. .

. .

. .

. .

. .

. .

. .

. .

. .

. .

. .

. .

. .

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

5

5

5

5

6

7

8

9

10

10

11

12

14

15

16

16

17

18

19

21

21

21

22

24

27

28

28

30

31

33

33

35

2 Vector Calculus

1

Vector function of a scalar variable . . . . . . .

2

Motion of a particle . . . . . . . . . . . . . . .

3

Motion of a system of particles (optional) . . .

4

Motion of a rigid body (optional) . . . . . . . .

5

Curves, Surfaces, Volumes and their integrals .

5.1

Curves . . . . . . . . . . . . . . . . . . .

5.2

Integrals along curves, or line integrals

5.3

Surfaces . . . . . . . . . . . . . . . . . .

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

37

37

38

39

40

42

42

43

45

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

5.4

Surface integrals . . . . . . . . . . . . . . . . . . . . . .

5.5

Volumes and volume integrals . . . . . . . . . . . . . . .

5.6

Mappings, curvilinear coordinates . . . . . . . . . . . .

5.7

Change of variables . . . . . . . . . . . . . . . . . . . .

Grad, div, curl . . . . . . . . . . . . . . . . . . . . . . . . . . .

6.1

Geometric definition of the Gradient . . . . . . . . . . .

6.2

Directional derivative, gradient and the operator . .

6.3

Div and Curl . . . . . . . . . . . . . . . . . . . . . . . .

6.4

Vector identities . . . . . . . . . . . . . . . . . . . . . .

6.5

Grad, Div, Curl in cylindrical and spherical coordinates

Fundamental theorems of vector calculus . . . . . . . . . . . . .

7.1

Integration in R2 and R3 . . . . . . . . . . . . . . . . .

7.2

Fundamental theorem of Calculus . . . . . . . . . . . .

7.3

Fundamental theorem in R2 . . . . . . . . . . . . . . . .

7.4

Green and Stokes theorems . . . . . . . . . . . . . . . .

7.5

Divergence form of Greens theorem . . . . . . . . . . .

7.6

Gauss theorem . . . . . . . . . . . . . . . . . . . . . . .

7.7

Other useful forms of the fundamental theorem in 3D .

3 Complex Calculus

1

Basics of Series and Complex Numbers . . . . . .

1.1

Algebra of Complex numbers . . . . . . .

1.2

Limits and Derivatives . . . . . . . . . . .

1.3

Geometric sums and series . . . . . . . . .

1.4

Ratio test . . . . . . . . . . . . . . . . . .

1.5

Power series . . . . . . . . . . . . . . . . .

1.6

Complex transcendentals . . . . . . . . .

1.7

Polar representation . . . . . . . . . . . .

1.8

Logs . . . . . . . . . . . . . . . . . . . . .

2

Functions of a complex variable . . . . . . . . . .

2.1

Visualization of complex functions . . . .

2.2

Cauchy-Riemann equations . . . . . . . .

2.3

Geometry of Cauchy-Riemann, Conformal

3

Integration of Complex Functions . . . . . . . . .

3.1

Cauchys theorem . . . . . . . . . . . . .

3.2

Cauchys formula . . . . . . . . . . . . . .

4

Applications of complex integration . . . . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

. . . . . .

Mapping

. . . . . .

. . . . . .

. . . . . .

. . . . . .

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

47

47

49

51

55

55

55

56

57

58

60

60

60

61

62

64

64

65

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

69

69

69

70

71

71

72

73

75

76

78

78

79

81

84

85

88

90

Chapter 1

Vectors

What is a vector? In calculus, you may have defined vectors as lists of numbers such as (a1 , a2 ) or (a1 , a2 , a3 ),

while in physics, and in geometry before that, you encountered vectors as quantities with both magnitude

and direction such as displacements, velocities and forces.

1.1

We write

a=aa

(1)

. Magnitude of vector a is denoted |a|, and |a| 0

is a positive real number with appropriate physical units (meters, Newtons, . . . ). Direction of vector a is

denoted a

with |

a| = 1 and, since their magnitude is unity, direction vectors are often called unit vectors,

but note that unit vectors have no physical units!

Example 1.1.

a = 2L heading 30 clockwise from North

specifies a horizontal displacement a of magnitude |a| = 2L and direction a

= 30 clockwise from North. L

is a reference length, for instance L = 1 meter or 1 mile, and North is a reference direction, magnetic north

or true North on the earth.

D

B

a

a

displacement a starting from point C leads to point D, with

by specifying displacements from a reference point, thus point B

is located displacement a from point A, and D is a from C.

Vectors are denoted with a boldface: a, b, u, v, . . . , usually lower-case but sometimes upper-case as in a

magnetic field B, electric field E or force F . Some authors write ~a for a and we use that notation to denote

the displacement from point A to point B as AB. Writing by hand, we use the typographical notation for

boldface which is a for a, a

for a

, etc.

e

1.2

can be specified by an angle from a reference direction as in

example 1.1. In navigation, that angle is usually defined clockwise from North and called an azimuthal

angle. In mathematical physics, we specify the direction using an angle counterclockwise from a reference

5

6

direction x

. We can specify a 2D vector a by giving the pair (a, ) for vector a of magnitude |a| = a and

direction a

= angle counterclockwise from x

.

ay y

ax x

We can also specify the vector as a sum of displacements in two reference perpendicular directions, ax in

direction x

and ay in direction y

perpendicular to x

. Now the pair (ax , ay ) specifies a. The pairs (a, ) and

(ax , ay ) are equivalent representations of a,

a (a, ) (ax , ay )

but in general they are not equal

(a, ) 6= (ax , ay ).

To express the equality of the representations, we need to include the direction vectors and write

a = aa

= ax x

+ ay y

(2)

)

ax = a cos

ay = a sin

a2 = a2x + a2y ,

(3)

a

= cos x

+ sin y

.

(4)

The representation a = a a

, where a

=a

() is a function of , is a polar representation of the 2D vector

a and a = ax x

+ ay y

is a cartesian representation with x

and y

two arbitrary perpendicular directions.

The vector a is specified by the pair (a, ) or (ax , ay ), or many other pairs of numbers depending on

how we specify the direction, for instance 20 feet toward the water tower then 5 meters toward the oak tree

specifies a displacement as (20ft, 5m) but the reference directions may not be perpendicular. Clearly, the

same physical 2D vector can be represented by an infinity of pairs of numbers depending on how we choose

our reference magnitudes and directions.

1.3

How do you specify a direction in 3D space? Astronomers use an azimuthal angle measured clockwise

from North in the horizontal plane and an inclination angle, say, measured from the upward direction (or

Zenith), in the azimuthal (or meridional ) plane. In ballistics and computer graphics, they may prefer to use

an elevation angle measured up from the horizontal instead of the inclination angle down from the vertical.

In mathematical physics, we use the angle between the z and a

directions and an angle from the x

a

= sin + cos z,

= cos x

+ sin y

.

(5)

c

F.

Waleffe, Math 321, 2013/1/21

The spherical representation (a, , ) specifies vector a through its magnitude a, its polar angle and

its azimuthal angle . The cylindrical representation (aH , , az ) specifies a by its horizontal magnitude

aH , azimuthal angle and vertical component az and the cartesian representation consists of the familiar

(ax , ay , az ). A 3D vector a can thus be represented by 3 real numbers, for instance (a, , ) or (aH , , az ) or

(ax , ay , az ). Each of these triplets are equivalent representations of the vector a

a (a, , ) (aH , , az ) (ax , ay , az )

(6)

(a, , ) 6= (aH , , az ) 6= (ax , ay , az ).

(7)

a=aa

= aH + az z = ax x

+ ay y

+ az z.

(8)

aH

ay y

az z

aH

ax x

From this vector equation, and with a little help from Pythagoras and basic trigonometry, we deduce

a2 = a2H + a2z ,

(9)

az

aH

+

z = sin + cos z,

a

a

ay

ax

x

+

y

= cos x

+ sin y

.

=

aH

aH

(10)

a

=

(11)

These representations indicate the two fundamental algebraic properties of vectors: they can be added and

scaled.

a

b

c

b

a+

a+b

+c

b+c

that a, b and c are not in the same plane, in general.

(a)

a

is the displacement exactly opposite to a. Vector subtraction b a is then

defined as the addition of b and (a). In particular a+(a) = 0 corresponds

to no net displacement. This is an important difference between points and

displacements, there is no special point in our space, but there is one special

displacement: the zero vector 0 such that a + (a) = 0 = (a) + a and

a + 0 = a, for any a.

8

The other key operation that characterizes vectors is scaling, that is, multiplication by a real number R.

b

(a + b)

b

||a

a + b

a+b

||a

a

a

Geometrically, v = a is a new vector parallel to a but of length |v| = |||a|. The direction of v is

the same as a if > 0 and opposite to a if < 0. Obviously (1)a = (a), multiplying a by (1)

yields the previously defined opposite of a. Other geometrically obvious properties are distributivity with

respect to addition of real factors: ( + )a = a + a, and with respect to multiplication of real factors:

()a = (a). Slightly less trivial is distributivity with respect to vector addition: (a + b) = a + b,

which geometrically corresponds to similarity of triangles, as illustrated above.

Vector addition and multiplication by a number (scaling) are the two key operations that define a Vector

Space, provided those operations satisfy the following 8 properties a, b in the vector space and , in

R. The symbol means for all or for any.

Required vector addition properties:

a + b = b + a,

(12)

a + (b + c) = (a + b) + c,

(13)

a + 0 = a,

(14)

a + (a) = 0.

(15)

( + )a = a + a,

(16)

()a = (a),

(17)

(a + b) = a + b,

(18)

1 a = a.

(19)

When these properties are used to define the vector space they are referred to as axioms, i.e. the defining

properties.

The vector space Rn : Consider the set of ordered n-tuplets of real numbers x (x1 , x2 , . . . , xn ). These

could correspond to student grades on a particular exam, for instance. What kind of operations would we

want to do on these lists of student grades? Well probably want to add several grades for each student and

well probably want to rescale the grades. So the natural operations on these n-tuplets are addition defined1

by adding the respective components:

x + y , (x1 + y1 , x2 + y2 , . . . , xn + yn ) = y + x.

(20)

x , (x1 , x2 , . . . , xn ).

(21)

The set of n-tuplets of real numbers equipped with addition and multiplication by a real number as just

defined is an important vector space called Rn . The vector spaces R2 and R3 will be particularly important

to us as theyll soon corresponds to the components of our physical vectors. But we also use Rn for very

large n when studying systems of equations, for instance.

1 The

c

F.

Waleffe, Math 321, 2013/1/21

Exercises:

1. Show that addition and scalar multiplication of n-tuplets satisfy the 8 required properties listed above.

2. Define addition and scalar multiplication of n-tuplets of complex numbers and show that all 8 properties

are satisfied. That vector space is called Cn .

3. The set of real functions f (x) is also a vector space. Define addition in the obvious way: f (x) + g(x)

h(x) another real function, and scalar multiplication: f (x) = F (x) yet another real function. Show

that all 8 properties are again satisfied.

4. Suppose you define addition of n-tuplets x = (x1 , x2 , . . . , xn ) as usual but define scalar multiplication

according to x = (x1 , x2 , , xn ), that is, only the first component is multiplied by . Which

property is violated? What if you defined x = (x1 , 0, , 0), which property would be violated?

5. From the 8 properties, show that (0)a = 0 and (1)a = (a), a, i.e. show that multiplication by the

scalar 0 yields the neutral element for addition, and multiplication by 1 yields the additive inverse.

Addition and scaling of vectors allow us to define the concepts of linear combination, basis, components and

dimension. These concepts apply to any vector space.

A linear combination of vectors a and b is an expression of the form a + b. This linear combination

yields another vector v. The set of all such vectors, obtained by taking any , R, is itself a vector space

(or more correctly a vector subspace if a and b are two vectors in 3D space for instance). We say that a

and b form a basis for that (sub)space. We also say that this is the (sub)space spanned by a and b. For

a given vector v, the unique real numbers and such that v = a + b are called the components of v

with respect to the basis a, b.

v = a + b

b

a

Linear Independence: The vectors a1 , a2 , , ak for some integer k are linearly independent (L.I.) if

the only way to have

1 a1 + 2 a2 + + k ak = 0

is for all the s to be zero:

1 = 2 = = k = 0.

Dimension: The dimension of a vector space is the largest number of linearly independent vectors, n

say, in that space. A basis for that space consists of n linearly independent vectors. A vector v has n

components (some of them possibly zero) with respect to any basis in that space.

Examples

Two non-parallel vectors a and b in a plane (for instance, horizontal plane) are L.I. and these vectors

form a basis for vectors (for instance, displacements) in the plane. Any given vector v in the plane can be

written as v = a + b, for a unique pair (, ). v is the diagonal of the parallelogram a, b. Three or

more vectors in a plane are linearly dependent.

Three non-coplanar vectors a, b and c in 3D space are L.I. and those vectors form a basis for 3D

space. However 4 or more vectors in 3D are linearly dependent. Any given vector v can be expanded as

v = a + b + c, for a unique triplet of real numbers (, , ). Make sketches to illustrate.

10

The 8 properties of addition and scalar multiplication imply that if two vectors u and v are expanded

with respect to the same basis a1 , a2 , a3 so

u = u1 a1 + u2 a2 + u3 a3 ,

v = v1 a1 + v2 a2 + v3 a3 ,

then

u + v = (u1 + v1 )a1 + (u2 + v2 )a2 + (u3 + v3 )a3 ,

v = (v1 )a1 + (v2 )a2 + (v3 )a3 ,

so addition and scalar multiplication are performed component by component and the triplets of real components (v1 , v2 , v3 ) are elements of the vector space R3 . A basis a1 , a2 , a3 in 3D space provides a one-to-one

correspondence (mapping) between displacements v in 3D (Euclidean) space (call it E3 ) and triplets of real

numbers in R3

v E3 (v1 , v2 , v3 ) R3 .

Exercises

1. Given vectors a, b in E3 , show that the set of all v = a + b, , R is a vector space.

2. Show that the set of all vectors v = a + b, R and fixed a, b is not a vector space.

3. If you defined addition of ordered pairs x = (x1 , x2 ) as usual but scalar multiplication by x =

(x1 , x2 ), would it be possible to represent any vector x as a linear combination of two basis vectors

a and b?

4. Given three points P1 , P2 , P3 in E3 , let M be the midpoint of segment P1 P2 , what are the components

of P2 P3 and P3 M in the basis P1 P2 , P1 P3 ?

5. Find a basis for Rn (consider the natural basis: e1 = (1, 0, , 0), e2 = (0, 1, 0, , 0), etc.)

6. Find a basis for Cn . What is the dimension of that space?

7. What is the dimension of the vector space of real continuous functions f (x) in 0 < x < 1?

8. What could be a basis for the vector space of nice functions f (x) in (0, 1)? (i.e. 0 < x < 1) (whats

a nice function? smooth functions are infinitely differentiable, thats nice!)

5

5.1

Position vector

In elementary calculus and linear algebra it is easy to confuse points and vectors. In R3 for instance, a point

P is defined as a triplet (x1 , x2 , x3 ) but a vector a is also defined by a real number triplet (a1 , a2 , a3 ). In

physical space, points and displacements are two clearly different things.

P

r

The confusion arises from the fundamental way to locate points by specifying displacements from a reference point called the origin and denoted O. An arbitrary

vector is called the position vector of P and denoted r for radial vector from the

origin O.

A Cartesian system of coordinates consists of a reference point O and three mutually orthogonal directions

x

, y

, z that provide a basis for displacements in 3D Euclidean space E3 . The position vector r = OP of

point P can then be specified in spherical, cylindrical or cartesian form as in sect. 1.3 now with the special

r instead of the arbitrary a,

r = r r = + z z = x x

+yy

+ z z,

(22)

c

F.

Waleffe, Math 321, 2013/1/21

11

(

p

= x2 + y 2

y = sin

(

r=

z2

x = cos

= r sin

z = r cos

(23)

(24)

r=

p

x2 + y 2 + z 2

x = r sin cos

y = r sin sin

z = r cos

(25)

Note that x

, y

, z form a basis for 3D Euclidean vector space, but and z do not, and r does not either.

The cartesian basis x

, y

, z consists of three fixed and mutually orthogonal directions independent of P ,

but and r depend on P , each point has its own and r. We will construct and use full cylindrical and

)

spherical orthogonal bases, (,

,

z) and (

r , ,

later in vector calculus. These cylindrical and spherical

basis vectors vary with P , or more precisely with and .

Once a Cartesian system of coordinates, O, x

, y

, z, has been chosen, the cartesian coordinates (x, y, z)

of P are the cartesian components of r = x x

+yy

+ z z. The cylindrical coordinates of P are (, , z) and

its spherical coordinates are (r, , ), but cylindrical and spherical coordinates are not vector components,

they are the cylindrical and spherical representations of r.

Coordinates of points can be specified in many other ways that do not correspond to a displacement

vector. In navigation for instance, a point P in a horizontal plane can be located by specifying the angles,

1 and 2 say, between a reference direction (North) and the lines of sight from P to two reference points P1

and P2 (light houses or radio beacons). In the global positioning system (GPS), a point P is located from

the distances to (at least) 3 satellites. Thus, in general, coordinates of points are not components of vectors.

5.2

The line passing through point A that is parallel to the direction of a consists of all points P such that

P

A

rA

AP = a,

R

(26)

of an origin O we have OP = OA + AP and we can write the vector (parametric) equation of that same line as

r = rA + a,

(27)

respect to O, respectively. The real number is the parameter of the line,

it is the coordinate of P in the system A, a.

Likewise the equation of a plane passing through A and parallel to the vectors a and b consists of all

points P such that

AP = a + b, , R

(28)

or in terms of O

r = rA + a + b.

(29)

This is the parametric vector equation of that plane with parameters , , that are the coordinates of P in

the system of reference A, a, b.

12

Exercises

1. Pick two vectors a, b and some arbitrary point A in the plane of your sheet of paper. If AB = a + b,

sketch the region where B can be if: (i) and are both between 0 and 1, (ii) || || 1.

2. Given three points P1 , P2 , P3 in R3 , what are the coordinates of the midpoints of the triangle with

respect to P1 and the basis P1 P2 , P1 P3 ?

3. Show that the line segment connecting the midpoints of two sides of a triangle is parallel to and equal

to half of the third side using methods of plane geometry and using vectors.

4. Show that the medians of a triangle intersect at the same point which is 2/3 of the way down from

the vertices along each median (a median is a line that connects a vertex to the middle of the opposite

side). Do this using both plane geometry and vector methods.

5. Given three point A, B, C, not co-linear, find a point X such that XA + XB + XC = 0. Show

that the line through A and X cuts BC at its mid-point. Deduce similar results for the other sides

of the triangle ABC and therefore that X is the point of intersection of the medians. Sketch. [Hint:

XB = XA + AB, XC = ]

6. Given four points A, B, C, D not co-planar, find a point X such that XA + XB + XC + XD = 0.

Show that the line through A and X intersects the triangle BCD at its center of area. Deduce similar

results for the other faces and therefore that the medians of the tetrahedron ABCD, defined as the

lines joining each vertex to the center of area of the opposite triangle, all intersect at the same point

X which is 3/4 of the way down from the vertices along the medians. Visualize. [Hint: solve previous

problem first, of course.]

Partial solutions to problems 3 and 4:

B

E

C

the triangles ABC and EDC. They are similar (why?), so ED = AB/2 and

those two segments are parallel. Next consider triangles BAG and EDG.

Those are similar too (why?), so AG = 2GD and BG = 2GE, done! Vector

= = 2/3, done! (Why does this imply that the line through C and G

cuts AB at its midpoint?).

No need to introduce cartesian coordinates, that would be horrible and full of unnecessary algebra. The

vector solution refers to a vector basis: a = CA and b = CB which is perfect (or intrinsic) for the problem,

although it is not orthogonal!

a b , |a| |b| cos ,

(30)

where 0 is the angle between the vectors a and b when their tails coincide. The dot product is

a real number such that a b = 0 iff a and b are orthogonal (perpendicular). The 0 vector is considered

orthogonal to any vector. The dot product of any vector with itself is the square of its length a a = |a|2 .

The dot product is directly related to the perpendicular projections of b onto a and a onto b. The latter are,

respectively,

c

F.

Waleffe, Math 321, 2013/1/21

13

a

b

a

b

|b| cos =

ab

ab

=a

b 6= a

b=

= |a| cos ,

|a|

|b|

(31)

where a

= a/|a| and

b = b/|b| are the unit vectors in the a and b directions, respectively. Recall that a unit

vector is a vector of length one |

a| = 1 a 6= 0.

In physics, the work W done by a force F on a particle undergoing the displacement ` is equal to

distance |`| times the component of F in the direction of `, but that is equal to the total force |F | times

the component of the displacement ` in the direction of F . Both of these statements are contained in the

symmetric definition W = F `, see exercise 1 below.

Parallel and Perpendicular Components

We often want to decompose a vector b into vector components, bk and b , parallel and perpendicular to a

vector a, respectively, such that b = bk + b with

bk = (b a

) a

=

ba

a

aa

b = b bk = b (b a

)

a=b

bk

(32)

ba

a

aa

The dot product has the following properties, most of which are immediate:

1.

a b = b a,

2.

3.

a (b + c) = a b + a c,

4.

a a 0,

5.

(a b)2 (a a) (b b)

a a = 0 a = 0,

ba

b+c

ca

(Cauchy-Schwarz)

drops out so all we need to check is a

b+a

c = a

(b + c) or in other words, that the perpendicular

projections of b and c onto a line parallel to a add up to the projection of b + c onto that line. This is

obvious from the figure. In general, a, b and c are not in the same plane. To visualize the 3D case interpret

a

b as the (signed) distance between two planes perpendicular to a that pass through the tail and head of

b and likewise for a

c and a

(b + c). The result follows directly since b, c and b + c form a triangle. So

the picture is the same but the dotted lines represent planes seen from the sides and b, c, b + c are not in

the plane of the paper, in general.

Exercises

1. A skier slides down an inclined plane with a total vertical drop of h, show that the work done by

gravity is independent of the slope. Use F and `s and sketch the geometry of this result.

14

2. Sketch the solutions of a x = , where a and are known.

3. Sketch c = a + b then calculate c c = (a + b) (a + b) and deduce the law of cosines.

4. If n is a unit vector, show that a a (a n)n is orthogonal to n, a. Sketch.

5. If c = a + b show that c = a + b (defined in previous exercise). Interpret geometrically.

6. B is a magnetic field and v is the velocity of a particle. We want to decompose v = v + v k where v

is perpendicular to the magnetic field and v k is parallel to it. Derive vector expressions for v and v k .

7. Show that the three normals (or heights, or altitudes) dropped from the vertices of a triangle perpendicular to their opposite sides intersect at the same point H say. [Hint: this is similar to problem 4 in

section 4 but now AD and BE are defined by AD CB = 0 and BE CA = 0 and the goal is to show

that CH AB = 0. Do you need to find H to prove the result?].

8. A and B are two points on a sphere of radius R specified by their longitude and latitude. Find the

shortest distance between A and B, traveling on the sphere. [If O is the center of the sphere consider

OA OB to determine their angle].

9. Consider v(t) = a + tb where t R and a, b are arbitrary constant. What is the minimum |v| and for

what t? Solve two ways: geometrically and using calculus.

10. Show that the point of intersection of the medians G (center of gravity), the point of intersection of the

heights H (orthocenter ) and the point of intersection of the perpendicular bisectors O (circumcenter )

of an arbitrary triangle, all lie on the same line (known as the Euler line).

Orthonormal bases

Given an arbitrary vector v and three non co-planar vectors a, b and c in E3 , you can find the three scalars

, and such that

v = a + b + c

by parallel projections (sect. 4). The scalars , and are the components of v in the basis a, b, c. Finding

those components is simpler if the basis is orthogonal, that is if the basis vectors a, b and c are mutually

orthogonal, in which case a b = b c = c a = 0. For an orthogonal basis, a projection parallel to a

say, is a projection perpendicular to b and c, but a perpendicular projection is a dot product. In fact, we

can forget geometry and crank out the vector algebra: take the dot product of both sides of the equation

v = a + b + c with each of the 3 basis vectors to obtain

=

av

,

aa

bv

,

bb

cv

.

cc

An orthonormal basis is even better. Thats a basis for which the vectors are mutually orthogonal and

of unit norm. Such a basis is often denoted2 e1 , e2 , e3 . Its compact definition is

ei ej = ij

(33)

2 Forget about the notation i, j, k for cartesian unit vectors. This is 19th century notation, it is unfortunately still very

common in elementary courses but that old notation will get in the way if you stick to it. We will NEVER use i, j, k, instead

we use (

x, y

, z

) or (e1 , e2 , e3 ) or (ex , ey , ez ) to denote a set of three orthonormal vectors in 3D euclidean space. We will soon

use indices i, j and k (next line already!). Those indices are positive integers that can take all the values from 1 to n, the

dimension of the space. We spend most of our time in 3D space, so most of the time the possible values for these indices i, j

and k are 1, 2 and 3. We will use those indices a lot!. They should not be confused with those old orthonormal vectors i, j, k

from elementary calculus.

c

F.

Waleffe, Math 321, 2013/1/21

15

The components of a vector v with respect to the orthonormal basis e1 , e2 , e3 in E3 are the real numbers

v1 , v2 , v3 such that

v = v e + v e + v e

vi ei

1 1

2 2

3 3

(34)

i=1

vi = ei v, i = 1, 2, 3.

If two vectors a and b are expanded in terms of e1 , e2 , e3 , that is

a = a1 e1 + a2 e2 + a3 e3 ,

b = b1 e1 + b2 e2 + b3 e3 ,

use the distributivity properties of the dot product and the orthonormality of the basis to show that

a b = a1 b1 + a2 b2 + a3 b3 .

(35)

One remarkable property of this formula is that its value is independent of the orthonormal basis. The

dot product is a geometric property of the vectors a and b, independent of the basis. This is obvious from

the geometric definition (30) but not from its expression in terms of components (35). If e1 , e2 , e3 and e1 0 ,

e2 0 , e3 0 are two distinct orthogonal bases then

a = a1 e1 + a2 e2 + a3 e3 = a01 e1 0 + a02 e2 0 + a03 e3 0

but, in general, the components in the two bases are distinct: a1 6= a01 , a2 6= a02 , a3 6= a03 , and likewise for

another vector b, yet

a b = a1 b1 + a2 b2 + a3 b3 = a01 b01 + a02 b02 + a03 b03 .

(36)

The simple algebraic form of the dot product is invariant under a change of orthonormal basis.

Exercises

1. Given the orthonormal (cartesian) basis (e1 , e2 , e3 ), consider a = a1 e1 + a2 e2 , b = b1 e1 + b2 e2 ,

v = v1 e1 + v2 e2 . What are the components of v (i) in terms of a and b? (ii) in terms of a and b

where a b = 0?

2. If (e1 , e2 , e3 ) and (e01 , e02 , e03 ) are two distinct orthogonal bases and a and b are arbitrary vectors, prove

(36) but construct an example that a1 b1 + 2a2 b2 + 3a3 b3 6= a01 b01 + 2a02 b02 + 3a03 b03 in general.

P3

P

3. If w = i=1 wi ei , calculate ej w using

notation and (33).

P3

P3

P3

4. Why is not true that ei i=1 wi ei = i=1 wi (ei ei ) = i=1 wi ii = w1 + w2 + w3 ?

P3

P3

P

notation and (33).

5. If v = i=1 vi ei and w = i=1 wi ei , calculate v w using

P3

P3

6. If v = i=1 vi ai and w = i=1 wi ai , where the basis ai , i = 1, 2, 3, is not orthonormal, calculate

v w.

P3

P3 P3

P3

7. Calculate (i) j=1 ij aj , (ii) i=1 j=1 ij aj bi , (iii) j=1 jj .

7.1

The geometric definition of the dot product (30) is great for oriented line segments as it emphasizes the

geometric aspects, but the algebraic formula (35) is very useful for calculations. Its also the way to define

the dot product for other vector spaces where the concept of angle between vectors may not be obvious e.g.

what is the angle between the vectors (1,2,3,4) and (4,3,2,1) in R4 ?! The dot product (a.k.a. scalar product

or inner product) of the vectors x and y in Rn is defined as suggested by (35):

x y , x1 y1 + x2 y2 + + xn yn .

(37)

16

Verify that this definition satisfies the first 4 properties of the dot product. To show the Cauchy-Schwarz

property, you need a bit of Calculus and a classical trick: consider v = x + y, then by prop 4 of the dot

product: (x + y) (x + y) 0, and by props 1,2,3, F () (x + y) (x + y) = 2 y y + 2x y + x x.

For given, but arbitrary, x and y, this is a quadratic polynomial in . That polynomial F () has a single

minimum at = (x y)/(y y). Find that minimum value of F () and deduce the Cauchy-Schwarz

inequality. Once we know that the definition (37) satisfies Cauchy-Schwarz, (x y)2 (x x) (y y), we

can define the length of a vector by |x| = (x x)1/2 (this is called the Euclidean length since it corresponds

to length in Euclidean geometry by Pythagorass theorem) and the angle between two vectors in Rn by

cos = (x y)/(|x| |y|). A vector space for which a dot (or inner) product is defined is called a Hilbert space

(or an inner product space).

The bottom line is that for more complex vector spaces, the dot (or scalar or inner) product is a key mathematical construct that allows us to generalize the concept of angle between vectors and, most importantly,

to define orthogonal vectors.

Exercises

1. So what is the angle between (1, 2, 3, 4) and (4, 3, 2, 1)?

2. Can you define a dot product for the vector space of real functions f (x)?

3. Find a vector orthogonal to (1, 2, 3, 4). Find all the vectors orthogonal to (1, 2, 3, 4).

4. Decompose (4,2,1,7) into the sum of two vectors one of which is parallel and the other perpendicular

to (1, 2, 3, 4).

5. Show that cos nx with n integer, is a set of orthogonal functions on (0, ). Find formulas for the

components of a function f (x) in terms of that orthogonal basis. In particular, find the components

of sin x in terms of the cosine basis in that (0, ) interval.

7.2

The norm of a vector, denoted kak, is a positive real number that defines its size or length (but not in

the sense of the number of its components). For displacement vectors in Euclidean spaces, the norm is the

length of the displacement, kak = |a| i.e. the distance between point A and B if AB = a. The following

properties are geometrically straightforward for length of displacement vectors:

1.

2.

kak = || kak,

3.

ka + bk kak + kbk.

(triangle inequality)

Draw the triangle formed by a, b and a + b to see why the latter is called the triangle inequality. For more

general vector spaces, these properties become the defining properties (axioms) that a norm must satisfy. A

vector space for which a norm is defined is called a Banach space.

7.3

For other types of vector space, there are many possible definitions for the norm of a vector as long as those

definitions satisfy the 3 norm properties. In Rn , the p-norm of vector x is defined by the positive number

1/p

kxkp , |x1 |p + |x2 |p + + |xn |p

,

(38)

where p 1 is a real number. Commonly used norms are the 2-norm kxk2 which is the square root of the

sum of the squares, the 1-norm kxk1 (sum of absolute values) and the infinity norm, kxk , defined as the

limit as p of the above expression.

c

F.

Waleffe, Math 321, 2013/1/21

17

Note that the 2-norm kxk2 = (x x)1/2 and for that reason is also called the Euclidean norm. In fact, if

a dot product is defined, then a norm can always be defined as the square root of the dot product. In other

words, every Hilbert space is a Banach space, but the converse is not true.

1. Show that the infinity norm kxk = maxi |xi |.

2. Show that the p-norm satisfies the three norm properties for p = 1, 2, .

3. Define a norm for Cn .

4. Define the 2-norm for real functions f (x) in 0 < x < 1.

The cross product is a very useful operation for physical applications (mechanics, electromagnetism), but

it is particular to 3D space. The cross product of two vectors a, b is the vector denoted a b that is (1)

orthogonal to both a and b, (2) has magnitude equal to the area of the parallelogram with sides a and b,

(3) has direction determined by the right-hand rule (or the cork-screw rule), i.e.

a b , An

(39)

where A = |a| |b| sin is the area of the parallelogram spanned by a and b, and n

is a unit vector perpendicular to both a and b in the direction such that a, b, n

is right handed. is the angle between a and

b as defined for the dot product (is sin 0?). The following figure illustrates the cross-product with a

perspective view (left) and a top view (right), with a b out of the paper in the top view.

ab

ab

ba

a

An important geometric property of the cross product is

a b = a b = a b

(40)

where a = a (a

b)

b is the vector component of a perpendicular to b and likewise b = b (b a

)

a is

the vector component of b perpendicular to a (so the meaning of is relative). The cross-product has the

following properties:

1.

a b = b a,

2.

3.

c (a + b) = (c a) + (c b)

(anti-commutativity)

a a = 0,

So we manipulate the cross product as wed expect except for the anti-commutativity, which is a big difference

from our other elementary products! The first 2 properties are geometrically obvious from the definition.

To show the third property (distributivity) let c = |c|

c and get rid of |c| by prop 2. All three cross products

give vectors perpendicular to c and furthermore from (40) we have c a = c a , c b = c b and

c (a + b) = c (a + b) , where means perpendicular to c, a = a (a c)

c, etc. So the cross-products

eliminate the components parallel to c and all the action is in the plane perpendicular to c.

18

c b

c (a + b)

c a

a

c

8.1

from the top, with c pointing out of the sheet of paper. Then a cross

product by c is equivalent to a rotation of the perpendicular components

by /2 counterclockwise. Since a, b and a + b form a triangle, their

perpendicular projections a , b and (a + b) form a triangle and

therefore ca, cb and c(a+b) also form a triangle, demonstrating

distributivity.

(a + b)

Orientation of Bases

If we pick an arbitrary unit vector e1 , then a unit vector e2 orthogonal to e1 , then there are two possible

unit vectors e3 orthogonal to both e1 and e2 . One choice gives a right-handed basis (i.e. e1 in right thumb

direction, e2 in right index direction and e3 in right major direction). The other choice gives a left-handed

basis. These two types of bases are mirror images of each other as illustrated in the following figure, where

e1 0 = e1 point straight out of the paper (or screen).

e2 0

e2

e3 0

e1

e3

e1

Left-handed

Right-handed

mirror

This figure reveals an interesting subtlety of the cross product. For this particular choice of left and right

handed bases (other arrangements are possible of course), e1 0 = e1 and e2 0 = e2 but e3 0 = e3 so e1 e2 = e3

and e1 0 e2 0 = e3 = e3 0 . This indicates that the mirror image of the cross-product is not the cross-product

of the mirror images. On the opposite, the mirror image of the cross-product e3 0 is minus the cross-product of

the images e1 0 e2 0 . We showed this for a special case, but this is general, the cross-product is not invariant

under reflection, it changes sign. Physical laws should not depend on the choice of basis, so this implies that

they should not be expressed in terms of an odd number of cross products. When we write that the velocity

of a particle is v = r, v and r are good vectors (reflecting as they should under mirror symmetry) but

is not quite a true vector, it is a pseudo-vector. It changes sign under reflection. That is because rotation

vectors are themselves defined according to the right-hand rule, so an expression such as r actually

contains two applications of the right hand rule. Likewise in the Lorentz force F = qv B, F and v are

good vectors, but since the definition involves a cross-product, it must be that B is a pseudo-vector. Indeed

B is itself a cross-product so the definition of F actually contains two cross-products.

The orientation (right-handed or left-handed) did not matter to us before but, now that weve defined

the cross-product with the right-hand rule, well typically choose right-handed bases. We dont have to,

geometrically speaking, but we need to from an algebraic point of view otherwise wed need two sets of

algebraic formula, one for right-handed bases and one for left-handed bases. In terms of our right-handed

cross product definition, we can define a right-handed basis by

e1 e2 = e3 ,

(41)

e2 e3 = e1 ,

e2 e1 = e3 ,

e3 e1 = e2 ,

e1 e3 = e2 ,

e3 e2 = e1 .

(42)

(43)

c

F.

Waleffe, Math 321, 2013/1/21

19

Note that (42) are cyclic rotations of the basis vectors in (41), i.e. (e1 , e2 , e3 ) (e2 , e3 , e1 ) (e3 , e1 , e2 ).

The orderings of the basis vectors in (43) do not correspond to cyclic rotations. For 3 elements, a cyclic

rotation corresponds to an even number of permutations. For instance to go from (e1 , e2 , e3 ) to (e2 , e3 , e1 )

we can first permute (switch) e1 e2 to obtain (e2 , e1 , e3 ) then permute e1 and e3 . The concept of even

and odd number of permutations is more general. But for three elements it is useful to think in terms of

cyclic and acyclic permutations.

If we expand a and b in terms of the right-handed e1 , e2 , e3 , then apply the 3 properties of the crossproduct i.e. in compact summation form

a=

3

X

i=1

ai ei ,

b=

3

X

j=1

bj ej ,

ab=

3 X

3

X

ai bj (ei ej ),

i=1 j=1

we obtain

a b = (a2 b3 a3 b2 )e1 + (a3 b1 a1 b3 )e2 + (a1 b2 a2 b1 )e3 .

(44)

Verify this result explicitly. What would the formula be if the basis was left-handed?

That expansion of the cross product with respect to a right-handed orthonormal basis (44) is often remembered using the formal determinant (i.e. this is not a true determinant, its just convenient mnemonics)

e1 e2 e3

a1 a2 a3 .

b1 b2 b3

8.2

I Exercise: Visualize the vector (a b) a. Sketch it. What are its geometric properties? What is its

magnitude?

Double cross products3 occur frequently in applications (e.g. angular momentum of a rotating body)

directly or indirectly (recall above discussion about mirror reflection and cross-products in physics). They

have simple expressions

(a b) c = (a c) b (b c) a,

(45)

a (b c) = (a c) b (a b)c.

One follows from the other after some manipulations and renaming of vectors, but we can remember both

at once as: middle vector times dot product of the other two minus other vector within parentheses times dot

product of the other two. 4

Lets visualize the geometry of the double product: (a b) is orthogonal to both a and b, and v =

(a b) c is orthogonal to (a b) so v is in the a, b plane and v = a + b for some scalars and .

Now v is also orthogonal to c, so v c = (a c) + (b c) = 0, therefore = (b c) and = (a c) for

some scalar and v = (b c)a (a c)b. This almost gives the formula (45) but we still need to show

that = 1.

Write c = ck +c , where ck is parallel, and c perpendicular, to ab,

so v = (a b) c and a c = a c , b c = b c , since parallel to

v

b

a b means perpendicular to a and b. So a, b, v and c are all in the

plane perpendicular to a b. Thats easier to visualize. To determine

a

in v = (bc )a(ac )b, take the cross product with b to obtain

v b = (b c ) (a b). Now from the figure and the definition of v

we have v b = (a b)|c | |b| sin , and the figure also shows that

c

bc = |b| |c | cos( 2 ), therefore = 1 since sin = cos(/2)

ab

(with positive clockwise). [This derivation can be improved]

3 The

double vector product is often called triple vector product, there are 3 vectors but only 2 vector products!

is more useful than the confusing BAC-CAB rule for remembering the 2nd. Try applying the BAC-CAB mnemonic

to (b c) a for confusing fun!

4 This

20

Exercises

1. Show that |a b|2 + (a b)2 = |a|2 |b|2 , a, b.

2. A particle of charge q moving at velocity v in a magnetic field B experiences the Lorentz force F =

qv B. Show that there is no force in the direction of the magnetic field and that the Lorentz force

does no work on the particle. Does it have any effect on the particle?

3. Sketch three vectors such that a + b + c = 0, show that a b = b c = c a in two ways (1) from the

geometric definition of the cross product and (2) from the algebraic properties of the cross product.

Deduce the law of sines relating the sines of the angles of a triangle and the lengths of its sides.

4. Consider any three points P1 , P2 , P3 in 3D Euclidean space. Show that 12 (r 1 r 2 + r 2 r 3 + r 3 r 1 ) is

a vector whose magnitude is the area of the triangle and is perpendicular to the triangle in a direction

determined by the ordering P1 , P2 , P3 and the right hand rule.

5. Consider an arbitrary non-self intersecting quadrilateral in the (x, y) plane with vertices P1 , P2 , P3 , P4 .

Show that 12 (r 1 r 2 + r 2 r 3 + r 3 r 4 + r 4 r 1 ) is a vector whose magnitude is the area of the

quadrilateral and points in the

z direction depending on whether P1 , P2 , P3 , P4 are oriented counterclockwise, or clockwise, respectively. What is the compact formula for the area of the quadrilateral in

terms of the (x, y) coordinates of each point? What happens if the points are in a plane but not the

(x, y) plane?

6. Four arbitrary points in 3D Euclidean space E3 are the vertices of an arbitrary tetrahedron. Consider

the area vectors perpendicular pointing outward for each of the faces of the tetrahedron, with magnitudes equal to the area of the corresponding triangular face. Make sketch. Show that the sum of these

area vectors is zero.

7. Vector c is the (right hand) rotation of vector b about e3 by angle . Find the components of c in

terms of the components of b.

8. Vector c is the (right hand) rotation of vector b about a by angle . Find c in terms of a, b, and

basic vector operations. Find c if a (a1 , a2 , a3 ) and b = (b1 , b2 , b3 ), in cartesian components. [Hint:

consider the intrinsic basis a

, b and a

b ]

9. Show by (1) cross product geometry and (2) cross product algebra that all the vectors x such that

a x = b have the form

ba

x = a +

, R

kak2

10. Show the Jacobi identity:

a (b c) + b (c a) + c (a b) = 0.

11. If n

is any unit vector, show algebraically and geometrically that any vector a can be decomposed as

a = (a n

)

n+n

(a n

) ak + a .

(46)

12. A left-handed basis e1 0 , e2 0 , e3 0 , is defined by ei 0 ej 0 = ij and e1 0 e2 0 = e3 0 . Show that

(ei 0 ej 0 ) ek 0 has the opposite sign to the corresponding expression for a right-handed basis, i, j, k

(the definition of the cross-product remaining its right-hand rule self). Thus deduce that the formula

for the components of the cross-product in the left handed basis would all change sign.

13. Prove (45) using the intrinsic right-handed orthonormal basis e1 = a/|a|, e3 = (a b)/|a b| and

e2 = e3 e1 . Then a = a1 e1 , b = b1 e1 + b2 e2 , c = c1 e1 + c2 e2 + c3 e3 . Visualize and explain why this

is a general result and therefore a proof of the double cross product identity.

14. Prove (45) using the right-handed orthonormal basis e1 = c /|c |, e3 = (ab)/|ab| and e2 = e3 e1 .

In that basis a = a1 e1 + a2 e2 and b = b1 e1 + b2 e2 but what is c? Show by direct calculation that

a b = (a1 b2 a2 b1 )e3 and v = |c |(a1 b2 a2 b1 )e2 = |c |a1 b |c |b1 a. Why is this (a c)b (b c)a

and thus a proof of the identity?

c

F.

Waleffe, Math 321, 2013/1/21

9

9.1

21

Index notation

Levi-Civita (a.k.a. alternating or permutation) symbol

We have used the Kronecker symbol (33) to express all the dot products ei ej = ij in a very compact form.

There is a similar symbol, ijk , the Levi-Civita or alternating symbol, defined as

1

1

if (i, j, k) is an odd permutation of (1, 2, 3),

ijk =

(47)

0

otherwise,

or, explicitly: 123 = 231 = 312 = 1 and 213 = 132 = 321 = 1, all other ijk = 0. Recall that for 3

elements an even permutation is the same as a cyclic permutation, therefore ijk = jki = kij , i, j, k (why?).

The ijk symbol provides a compact expression for the components of the cross-product of right-handed basis

vectors:

(ei ej ) ek = ijk .

(48)

but since this is the k-component of (ei ej ) we can also write

(ei ej ) =

3

X

ijk ek .

(49)

k=1

Note that there is only one non-zero term in the latter sum (but then, why cant we drop the sum?). Verify

this result for yourself.

9.2

The expansion of vectors a and b in terms of basis e1 ,e2 ,e3 , a = a1 e1 +a2 e2 +a3 e3 and b = b1 e1 +b2 e2 +b3 e3 ,

can be written compactly using the sigma () notation

a=

3

X

ai ei ,

b=

i=1

3

X

bi ei .

(50)

i=1

We have introduced the Kronecker symbol ij and the Levi-Civita symbol ijk in order to write and

perform our basic vector operations such as dot and cross products in compact forms, when the basis is

orthonormal and right-handed, for instance using (33) and (49)

ab=

3 X

3

X

ai bj ei ej =

i=1 j=1

ab=

3 X

3

X

ai bj ij =

3

X

ai bi

(51)

ai bj ijk ek

(52)

i=1 j=1

3 X

3

X

ai bj ei ej =

i=1 j=1

3 X

3 X

3

X

i=1

Note that i and j are dummy or summation indices in the sums (50) and (51), they do not have a specific

value, they have all the possible values in their range. It is their place in the particular expression and their

range that matters, not their name

a=

3

X

i=1

ai ei =

3

X

j=1

aj ej =

3

X

ak ek = 6=

k=1

3

X

Indices come in two kinds, the dummies and the free. Heres an example

3

X

ei (a b)c =

aj bj ci ,

j=1

ak ei

(53)

k=1

(54)

22

here j is a dummy summation index, but i is free, we can pick for it any value 1, 2, 3. Freedom comes with

constraints. If we use i on the left-hand side of the equation, then we have no choice, we must use i for ci

on the right hand side. By convention we try to use i, j, k, l, m, n, to denote indices, which are positive

integers. Greek letters are sometimes used for indices.

Mathematical operations impose some naming constraints however. Although, we can use the same index

name, i, in the expansions of a and b, when they appear separately as in (50), we cannot use the same index

name if we multiply them as in (51) and (52). Bad things will happen if you do, for instance

!

!

3

3

3

X

X

X

ab=

ai ei

bi ei =

ai bi ei ei = 0 (WRONG!)

(55)

i=1

9.3

i=1

i=1

While he was developing the theory of general relativity, Einstein noticed that many of the sums that occur

in calculations involve terms where the summation index appears twice. For example, i appears twice in the

single sums in (50), i and j appear twice in the double sum in (51) and i, j and k each appear twice in the

triple sum in (52). To facilitate such manipulations he dropped the signs and adopted the summation

convention that a repeated index implicitly denotes a sum over all values of that index. In a

letter to a friend he wrote I have made a great discovery in mathematics; I have suppressed the summation

sign every time that the summation must be made over an index which occurs twice. Thus with Einsteins

summation convention we write

a = ai ei ,

b = bj ej ,

a b = ai bi ,

a b = ijk ai bj ek ,

(56)

and any repeated index implies a sum over all values of that index. This is a very useful and widely used

notation but you have to use it with care and there are cases where it cannot be used. Indices can never be

repeated more than twice, if they are, thats probably a mistake as in (55), if not then you are out of luck

and need to use s or invent your own notation. A few common operations in the summation convention:

We love to see a ij involved in a sum since this collapses that sum. This is called the substitution rule, if

ij appears in a sum, we can forget about it and eliminate the summation index, for example

ai ij = aj ,

kl kl = kk = ll = 3,

ij ijk = iik = 0

(57)

note the second result kk = 3 because k is repeated, so there is a sum over all values of k and kk =

11 + 22 + 33 . The last result is because ijk vanishes whenever two indices are the same. That last

expression ij ijk involves a double sum over i and over j. The ij collapses one of those sums. It doesnt

matter which index we choose to eliminate since both are dummy indices. Lets compute the l component

of a b from (56). We pick l because i, j and k are already taken. The l component is

el (a b) = ijk ai bj ek el = ijk ai bj kl = ijl ai bj = lmn am bn

(58)

what happened on that last step? first, ijk = kij because (i, j, k) to (k, i, j) is a cyclic permutation which

corresponds to an even number of permutation in space of odd dimension (dimension 3, here) and the value

of ijk does not change under even permutations. Then i and j are dummies and we renamed them m and

n respectively being careful to keep the order. The final result is worth memorizing: if c = a b, the l

component of c is cl = lmn am bn , or switching indices to i, j, k

c = a b ci = ijk aj bk c = ei ijk aj bk .

(59)

In the spirit of no pain-no gain, lets write the double cross product identity (a b) c in this indicial

notation. Let v = a b then the i component of the double cross product v c is ijk vj ck . Now we need

the j component of v = a b. Since i and k are taken we use l, m as new dummy indices, and we have

vj = jlm al bm . So the i component of the double cross product (a b) c is

ijk jlm al bm ck .

(60)

c

F.

Waleffe, Math 321, 2013/1/21

23

Note that j, k, l and m are repeated, so this expression is a quadruple sum! According to our double cross

product identity it should be equal to the i component of (a c)b (b c)a for any a, b, c. We want the i

component of the latter expression since i is a free index in (60), that i component is

(aj cj )bi (bj cj )ai

(61)

(wait! isnt j repeated 4 times? no, its not. Its repeated twice in separate terms so this is a difference

of two sums over j). Since (60) and (61) are equal to each other for any a, b, c, this should be telling us

something about ijk , but to extract that out we need to rewrite (61) in the form al bm ck . How? by making

use of our ability to rename dummy variables and adding variables using ij . Lets look at the first term in

(61), (aj cj )bi , heres how to write it in the form al ck bm as in (60):

(aj cj )bi = (ak ck )bi = (lk al ck )(im bm ) = lk im al ck bm .

(62)

Do similar manipulations to the second term in (61) to obtain (bj cj )ai = il km al ck bm and

ijk jlm al bm ck = (lk im il km )al ck bm .

(63)

Since this equality holds for any al , ck , bm , we must have ijk jlm = (lk im il km ). Thats true but its not

written in a nice way so lets clean it up to a form thats easier to reconstruct. First note that ijk = jki since

ijk is invariant under a cyclic permutation of its indices. So our identity becomes jki jlm = (lk im il km ).

Weve done that flipping so the summation index j is in first place in both factors. Now we prefer the

lexicographic order (i, j, k) to (j, k, i) so lets rename all the indices being careful to rename the correct

indices on both sides. This yields

ijk ilm = jl km jm kl

(64)

This takes some digesting. Go through it carefully again. And again, as many times as it takes. The identity

(64) is actually pretty easy to remember and verify. First, ijk ilm is a sum over i but there is never more

than one non-zero term (why?). Second, the only possible values for that expression are +1, 0 and 1

(why?). The only way to get 1 is to have (j, k) = (l, m) with j = l 6= k = m (why?), but in that case the

right hand side of (64) is also 1 (why?). The only way to get 1 is to have (j, k) = (m, l) with j = m 6= k = l

(why?), but in that case the right hand side is 1 also (why?). Finally, to get 0 we need j = k or l = m and

the right-hand side again vanishes in either case. For instance, if j = k then we can switch j and k in one

of the terms and jl km jm kl = jl km km jl = 0.

Formula (64) has a generalization that does not include summation over one index

ijk lmn =il jm kn + im jn kl + in jl km

im jl kn in jm kl il jn km

(65)

note that the first line correspond to (i, j, k) and (l, m, n) matching up to cyclic rotations, while the second

line corresponds to (i, j, k) matching with an odd (acyclic) rotation of (l, m, n).

Exercises

1. Explain why ijk = jki = jik for any integer i, j, k.

2. Using (48) and Einsteins notation show that (a b) ek = ijk ai bj and (a b) = ijk ai bj ek =

ijk aj bk ei .

3. Show that ijk ljk = 2il by direct deduction and by application of (64).

4. Deduce (64) from (65).

24

10

A mixed product5 of three vectors has the form (a b) c, it involves both a cross and a dot product. The

result is a scalar. We have already encountered mixed products (e.g. eqn. (48)) but their geometric and

algebraic properties are so important that they merit their own subsection.

(a b) c = (b c) a = (c a) b =

a (b c) = b (c a) = c (a b) =

volume of the parallelepiped spanned by a, b, c

(66)

ab

c

b

a

Take a and b as the base of the parallelepiped then a b is perpendicular to the base and has magnitude

equal to the base area. The height is z c where z is the unit vector perpendicular to the base, i.e. parallel to

a b. So the volume is (a b) c. Signwise, (a b) c > 0 if a, b and c, in that order, form a right-handed

basis (not orthogonal in general), and (a b) c < 0 if the triplet is left-handed. Taking b and c, or c and

a, as the base, you get the same volume and sign. The dot product commutes, so (b c) a = a (b c),

yielding the identity

(a b) c = a (b c).

(67)

We can switch the dot and the cross without changing the result. We have shown (66) geometrically. The

properties of the dot and cross products yield many other results such as (a b) c = (b a) c, etc. We

can collect all these results as follows.

A mixed product is one form of a scalar function of three vectors called the determinant

(a b) c , det(a, b, c),

(68)

whose value is the signed volume of the parallelepiped with sides a, b, c. The determinant has three

fundamental properties

1. it changes sign if any two vectors are permuted, e.g.

det(a, b, c) = det(b, a, c) = det(b, c, a),

(69)

det(a + d, b, c) = det(a, b, c) + det(d, b, c),

(70)

det(e1 , e2 , e3 ) = 1.

(71)

5 The mixed product is called the triple scalar product by some authors, but there are only 2 products and they apply to

three vectors.

c

F.

Waleffe, Math 321, 2013/1/21

25

Deduce these from the properties of the dot and cross products as well as geometrically. Property (70) is

a combination of the distributivity properties of the dot and cross products with respect to vector addition

and multiplication by a scalar. For example,

det(a + d, b, c) = (a + d) (b c) = a (b c) + d (b c)

= det(a, b, c) + det(d, b, c).

From these three properties, you deduce easily that the determinant is zero if any two vectors are identical

(from prop 1), or if any vector is zero (from prop 2 with = 1 and d = 0), and that the determinant does

not change if we add a multiple of one vector to another, for example

det(a, b, a) = 0,

det(a, 0, c) = 0,

det(a + b, b, c) = det(a, b, c).

(72)

Geometrically, this last one corresponds to a shearing of the parallelepiped, with no change in volume or

orientation. One key application of determinants is

det(a, b, c) 6= 0 a, b, c form a basis.

(73)

If det(a, b, c) = 0 then either one of the vectors is zero or they are co-planar and a, b, c cannot provide a basis

for vectors in E3 . This is how the determinant is introduced in elementary linear algebra, it determines

whether a system of linear equations has a solution or not. But the determinant is so much more, it

determines the volume of the parallelepiped and its orientation!

The 3 fundamental properties fully specify the determinant as explored in exercises 5, 6 below. If the

vectors are expanded in terms of a right-handed orthonormal basis, i.e. a = ai ei , b = bj ej , c = ck ek

(summation convention), then we obtain the following formula for the determinant in terms of the vector

components

det(a, b, c) = (a b) c = ai bj ck (ei ej ) ek = ijk ai bj ck .

(74)

Expanding that expression

ijk ai bj ck = a1 b2 c3 + a2 b3 c1 + a3 b1 c2 a2 b1 c3 a3 b2 c1 a1 b3 c2 ,

(75)

a1

det(a, b, c) = ijk ai bj ck b1

c1

a2

b2

c2

a3

b3

c3

a1

= a2

a3

b1

b2

b3

c1

c2

c3

(76)

Note that it does not matter whether we put the vector components along rows or columns. This is a

non-trivial and important property of determinants. (see section on matrices).

This familiar determinant has the same three fundamental properties (69), (70), (71) of course

1. it changes sign if any two columns (or

a1

a2

a3

b1 a1 c1

b1 c1

b2 c2 = b2 a2 c2

b3 a3 c3

b3 c3

a1 + d1 b1 c1

a2 + d2 b2 c2 =

a3 + d3 b3 c3

, (d1 , d2 , d3 ),

a1 b1 c1 d1

a2 b2 c2 + d2

a3 b3 c3 d3

(77)

b1

b2

b3

c1

c2

c3

(78)

26

3. finally, the determinant of the natural basis

is

1

0

0

0

1

0

0

0

1

= 1,

(79)

You deduce easily from these three properties that the det vanishes if any column (or row) is zero or if any two

columns (or rows) is a multiple of another, and that the determinant does not change if we add to one column

(row) a linear combination of the other columns (rows). These properties allow us to calculate determinants

by successive shearings and column-swapping. There is another explicit formula for determinants, in addition

to the ijk ai bj ck formula, it is the Laplace (or Cofactor) expansion in terms of 2-by-2 determinants, e.g.

a1 b1 c1

a2 b2 c2 = a1 b2 c2 a2 b1 c1 + a3 b1 c1 ,

(80)

b2 c1

b3 c3

b3 c3

a3 b3 c3

where the 2-by-2 determinants are

a1

a2

b1

= a1 b2 a2 b1 .

b2

(81)

This formula is nothing but a (b c) expressed with respect to a right handed basis. To verify that,

compute the components of (b c) first, then dot with the components of a. This cofactor expansion

formula can be applied to any column or row, however there are 1 factors that appear. We wont go into

the straightforward details, but all that follows directly from the column swapping property (77). Thats

essentially the identities a (b c) = b (c a) = .

Exercises

a b

1. Show that the 2-by-2 determinant 1 1 = a1 b2 a2 b1 , is the signed area of the parallelogram

a2 b2

with sides a = a1 e1 + a2 e2 , b = b1 e1 + b2 e2 . It is positive if a, b, a, b is a counterclockwise cycle,

negative if the cycle is clockwise. Sketch (of course).

2. The determinant det(a, b, c) of three oriented line segments a, b, c is a geometric quantity. Show that

det(a, b, c) = |a| |b| |c| sin cos . Specify and . Sketch.

3. Show that |a| |b| |c| det(a, b, c) |a| |b| |c|. When do the equalities apply? Sketch.

4. Use properties (69) and (70) to show that

det(a + d, b + e, c) =

det(a, b, c) + det(a, e, c) + det(d, b, c) + det(d, e, c).

5. Use properties (69) and (71) to show that det(ei , ej , ek ) = ijk .

6. Use property (70) and exercise 5 above to show that if a = ai ei , b = bi ei , c = ci ei (summation

convention) then det(a, b, c) = ijk ai bj ck .

7. Prove the identity (a b) (c d) = (a c)(b d) (a d)(b c) using both vector identities and indicial

notation.

8. Express (a b) (a b) in terms of dot products of a and b.

9. Show that (a a)(b b) (a b)2 is the square of the area of the parallelogram spanned by a and b.

10. If A is the area the parallelogram with sides a and b, show that

aa ab

2

.

A =

ab bb

c

F.

Waleffe, Math 321, 2013/1/21

27

11. If det(a, b, c) 6= 0, then any vector v can be expanded as v = a + b + c. Find explicit expressions

for the components , , in terms of v and the basis vectors a, b, c in the general case when the

latter are not orthogonal. [Hint: project on cross products of the basis vectors, then collect the mixed

products into determinants and deduce Cramers rule].

12. Given three vectors a1 , a2 , a3 such that D = a1 (a2 a3 ) 6= 0, define

a01 = (a2 a3 )/D,

(82)

(i) Show that ai a0j = ij , i, j = 1, 2, 3.

P3

P3

(ii) Show that if v = i=1 vi ai and v = i=1 vi0 a0i , then vi = v a0i and vi0 = v ai . So the components

in one basis are obtained by projecting onto the other basis.

13. If a and b are linearly independent and c is any arbitrary vector, find , and such that c =

a + b + (a b). Express , and in terms of dot products only. [Hint: find and first, then

use ck = c c .]

14. Express (a b) c in terms of dot products of a, b and c only. [Hint: solve problem 13 first.]

15. Provide an algorithm to compute the volume of the parallelepiped (a, b, c) by taking only dot products. [Hint: rectify the parallelepiped (a, b, c) (a, b , c ) (a, b , c ) where b and c are

perpendicular to a, and c is perpendicular to both a and b . Explain geometrically why these

transformations do not change the volume. Explain why these transformations do not change the

determinant by using the properties of determinants.]

16. (*) If V is the volume of the parallelepiped with sides a,

aa ab

2

V = b a b b

ca cb

b, c show that

a c

b c .

cc

Do this in several ways: (i) from problem 13, (ii) using indicial notation and the formula (65).

11

We discussed points, lines and planes in section 5 and reviewed the concepts of position vector OP = r =

r

r = + z z = x

x + yy

+ z z and parametric equations of lines and planes. Here, we briefly review implicit

equations of lines and planes and some applications of dot and cross products to points, lines and planes.

The center of

r c of a system of P

N particles of mass mi located at position r i , i = 1, . . . , N is defined

Pmass

N

N

by M r c = i=1 mi r i where M = i=1 mi is the total mass. This is a coordinate-free expression for

the center of mass. In particular, if all the masses are equal then for N = 2, r c = (r 1 + r 2 )/2, for

N = 3, r c = (r 1 + r 2 + r 3 )/3.

The vector equation of a line parallel to a passing through a point r 0 is

r = r 0 + a,

R (r r 0 ) a = 0

(83)

These are the (explicit) parametric equation r = r 0 + a, with parameter , and the implicit equation

(r r 0 ) a = 0 of a line.

The equation of a plane through r 0 parallel to a and b (with ab 6= 0), or (equivalently) perpendicular

to n a b is

r = r 0 + a + b, , R (r r 0 ) n = 0

(84)

These are the explicit, parametric equation of the plane with parameters and , and the implicit

equation of the plane.

28

The equation of a sphere of center r c and radius R is

|r r c | = R r = r c + R n

,

n s.t. |

n| = 1.

(85)

Exercises

1. Show that the center of gravity of three points of equal mass is at the point of intersection of the

medians of the triangle formed by the three points.

2. What are the equations of lines and planes in Cartesian coordinates?

3. Find vector equations for the line passing through the two points r 1 , r 2 and the plane through the

three points r 1 , r 2 , r 3 .

4. What is the distance between the point r 1 and the plane through r 0 perpendicular to a?

5. What is the distance between the point r 1 and the plane through r 0 parallel to a and b?

6. What is the distance between the line parallel to a that passes through point A and the line parallel

to b that passes through point B?

7. A particle was at point P1 at time t1 and is moving at the constant velocity v 1 . Another particle was

at P2 at t2 and is moving at the constant velocity v 2 . How close did the particles get to each other

and at what time? What conditions are needed for a collision?

8. Point C is obtained by rotating point B about the axis passing through point A, with direction a, by

angle (right hand rotation by about a). Find an explicit vector expression for OC in terms of OB,

OA, a and . Make clean sketches. Express your vector result in cartesian form.

12

12.1

Consider two orthonormal bases (e1 , e2 , e3 ) and (e01 , e02 , e03 ) in 3D euclidean space. A vector v can be

expanded in terms of each bases as v = v1 e1 + v2 e2 + v3 e3 vi ei and v = v10 e1 0 + v20 e2 0 + v30 e3 0 vi0 e0i .

What are the relationships between the two sets of components (v1 , v2 , v3 ) and (v10 , v20 , v30 )?

In 2D, you can find the relations between the components (v1 , v2 )

and (v10 , v20 ) directly using geometry,

v2

v20

v

v10

+v2 sin ,

v20 = v1 sin

+v2 cos .

(86)

Can you?

v1

However, it is easier and more systematic to use the basis vectors

and vector operations. The starting point is the vector identity

e2

e02

v10 = v1 cos

(87)

e01

e1

= v1 cos + v2 sin .

(88)

c

F.

Waleffe, Math 321, 2013/1/21

29

In 3D, a direct geometric derivation is quite challenging but using the direction vectors and vector algebra

is just as straightforward. The components (v1 , v2 , v3 ) 6= (v10 , v20 , v30 ) for distinct bases, but the geometric

vector v is independent of the choice of basis, thus

v = vi0 ei 0 = vj ej ,

using Einsteins summation convention, and the relationship between the two sets of coordinates are then

vi0 = ei 0 v = (ei 0 ej ) vj , Qij vj ,

(89)

(90)

Qij , ei 0 ej = cos ij

(91)

and, likewise,

where we defined

where ij is the angle between the unit vectors e0i and ej and these coefficients Qij = cos ij are the direction

cosines. These 9 coefficients Qij are the elements of a 3-by-3 matrix Q, that is, a 3-by-3 table with the first

index i corresponding to the row index and the second index j to the column index

e1 e1 e1 0 e2 e1 0 e3

Q Qij = Q21 Q22 Q23 = e2 0 e1 e2 0 e2 e2 0 e3

(92)

Q31 Q32 Q33

e3 0 e1 e3 0 e2 e3 0 e3

This is the Direction Cosine Matrix.

There are 9 direction cosines (in 3D space) but orthonormality of both bases imply many constraints, so

these coefficients are not independent. These constraints follow from eqns. (89), (90) which must hold for

any (v1 , v2 , v3 ) and (v10 , v20 , v30 ). Substituting (90) into (89) (watching out for dummy indices!) yields

vi0 = Qik Qjk vj0 ,

v 0 Qik Qjk = ij .

(93)

This means that the rows of matrix Q (92) are orthogonal to each other and of unit magnitude. Likewise,

substituting (89) into (90) gives

vi = Qki Qkj vj ,

v Qki Qkj = ij ,

(94)

and the columns of matrix Q (92) are also orthogonal to each other and of unit magnitude.

These two sets of orthogonal relationships can also be derived more geometrically as follows. The coefficient Qij = ei 0 ej is both the j component of ei 0 in the {e1 , e2 , e3 } basis, and the i component of ej in the

{e1 0 , e2 0 , e3 0 } basis. Therefore we can write

ei 0 = (ei 0 ek ) ek = Qik ek Qi1 e1 + Qi2 e2 + Qi3 e3 ,

in other words, e0i (Qi1 , Qi2 , Qi3 ) in the basis (e1 , e2 , e3 ), and this is the i-th row of matrix Q (92). Now

the dot product of 2 vectors a = ak ek and b = bl el is a b = ak bk since the basis (e1 , e2 , e3 ) is orthonormal,

so the dot product of e0i = Qik ek and e0j = Qjl el is

ei 0 ej 0 = Qik Qjk = ij ,

(95)

since (e01 , e02 , e03 ) is also orthonormal and therefore ei 0 ej 0 = ij . So the rows of Q are orthonormal because

they are the components of (e01 , e02 , e03 ) in the basis (e1 , e2 , e3 ).

Likewise,

ej = (ej ek 0 ) ek 0 = Qkj ek 0 Q1j e1 0 + Q2j e2 0 + Q3j e3 0 .

and ej (Q1j , Q2j , Q3j ) in the basis (e01 , e02 , e03 ), this corresponds to the j-th column of matrix Q (92).

Then

ei ej = Qki Qkj = ij ,

(96)

and the columns of Q are orthonormal because they are the components of (e1 , e2 , e3 ) in the basis (e01 , e02 , e03 )

.

30

Exercises

1. Find Q if (e01 , e02 , e03 ) is the right hand rotation of (e1 , e2 , e3 ) about e3 by angle . Verify (95) and

(96) for your Q. If v = vi ei , find the components of v in the basis (e01 , e02 , e03 ) in terms of (v1 , v2 , v3 )

and .

2. Find Q if (e01 , e02 , e03 ) is the right hand rotation of (e1 , e2 , e3 ) about e2 by angle . Verify (95) and (96)

for your Q.

3. Find Q if e01 = e1 , e02 = e3 , e03 = e2 . Verify (95) and (96) for your Q.

4. What is the value of the determinant of Q? Go back to section 10 to find out and explain.

5. If a and b are two arbitrary vectors and (e1 , e2 , e3 ) and (e01 , e02 , e03 ) are two distinct arbitrary bases,

we have already shown that ai bi = a0i b0i (summation convention) in section (36). Verify this invariance

directly from the transformation rule (89), vi0 = Qij vj , showing of your mastery of index notation.

12.2

Matrices

That Q was a very special matrix, namely an orthogonal matrix. More generally a 3-by-3 real matrix A is

a table of 9 real numbers

A [Aij ] A21 A22 A23 .

(97)

A31 A32 A33

Matrices are denoted by a capital letter, e.g. A and Q and by square brackets [ ]. By convention, vectors in

R3 are defined as 3-by-1 matrices e.g.

x1

x = x1 ,

x3

although for typographical reasons we often write x = (x1 , x2 , x3 ) but not [x1 , x2 , x3 ] which would denote

a 1-by-3 matrix, or row vector. The term matrix is similar to vectors in that it implies precise rules for

manipulations of these objects (for vectors these are the two fundamental addition and scalar multiplication

operations with specific properties, see Sect. 2).

Matrix-vector multiply

Equation (89) shows how matrix-vector multiplication should be defined. The matrix vector product Ax (A

3-by-3, x 3-by-1) is a 3-by-1 vector b in R3 whose i-th component is the dot product of row i of matrix A

with the column x,

Ax = b bi = Aij xj

(98)

where Aij xj Ai1 x1 +Ai2 x2 +Ai3 x3 in the summation convention. The product of a matrix with a (column)

vector is performed row-by-column. This product is defined only if the number of columns of A is equal to

the number of rows of x. A 2-by-1 vector cannot be multiplied by a 3-by-3 matrix.

Identity Matrix

There is a unique matrix such that Ix = x, x. For x R3 , show that

1

I= 0

0

0

1

0

0

0 .

1

(99)

c

F.

Waleffe, Math 321, 2013/1/21

31

Matrix-Matrix multiply

Two successive linear transformation of coordinates, that is,

x0i = Aij xj ,

then

(summation over repeated indices) can be combined into one transformation from xj to x00i

x00i = Bik Akj xj , Cij xj

where

Cij , (BA)ij = Bik Akj .

(100)

This defines matrix multiplication. The product of two matrices BA is a matrix, C say, whose (i, j) element

Cij is the dot product of row i of B with column j of A. As for matrix-vector multiplication, the product

of two matrices is done row-by-column. This requires that the number of columns of the first matrix in the

product (B) equals the number of rows of the second matrix (A). Thus, the product of a 3-by-3 matrix

and a 2-by-2 matrix is not defined, for instance. We can only multiply M-by-N by an N-by-P, that is inner

dimensions must match. In general, BA 6= AB, matrix multiplication does not commute. You can visualize

this by considering two successive rotation of axes, one by angle about e3 , followed by one by about e02 .

This is not the same as rotating by about e2 , then by about e03 . You can also see it algebraically

(BA)ij = Bik Akj 6= Aik Bkj = (AB)ij .

Matrix transpose

The transformation (90) involves the sum Aji x0j that is similar to the matrix vector multiply except that

the multiplication is column-by-column! To write this as a matrix-vector multiply, we define the transpose

matrix AT whose row i correspond to column i of A. If the (i, j) element of A is Aij then the (i, j) element

of AT is Aji

(AT )ij = (A)ji .

Then

xi = Aji x0j

x = AT x 0 .

(101)

B Show that the transpose of a product is equal to the product of the transposes in reverse order (AB)T =

B T AT .

12.3

Arbitrary matrices are typically denoted A, while orthogonal matrices are typically denoted Q in the

literature. In matrix notation, the orthogonality conditions (95), (96) read

QT Q = QQT = I.

(102)

A matrix that satisfy these relationships is called an orthogonal matrix (it should have been called orthonormal). A proper orthogonal matrix has determinant equal to 1 and corresponds to a pure rotation. An

improper orthogonal matrix has determinant -1. It corresponds to a combination of rotations and an odd

number of reflections. The product of orthogonal matrices is an orthogonal matrix.

This is useful as any 3-by-3 proper orthogonal matrix can be decomposed into the product of three

elementary rotations. There are several ways to define these elementary rotations but a common one that

corresponds to spherical coordinates is to (1) rotate by about e3 , (2) rotate by about e2 0 , (3) rotate by

about e3 00 . The 3 angles , , are called Euler angles. Our choice of angles here is typically labelled

32

ZY Z, since the successive bases rotation are around the z-axis, then y, then z. Hence a general 3-by-3

orthogonal matrix A can always be written as

cos

sin 0

cos 0 sin

cos sin 0

sin cos 0 .

1

0

Q = sin cos 0 0

(103)

0

0

1

sin 0 cos

0

0

1

To define an arbitrary orthogonal matrix, we can then simply pick any three arbitrary (Euler) angles

, , and construct an orthonormal matrix using (103). Another important procedure to do this is the

Gram-Schmidt procedure: pick any three a1 , a2 , a3 and orthonormalize them, i.e.

(1) First, define q 1 = a1 /ka1 k and a02 = a2 (a2 q 1 )q 1 , a03 = a3 (a3 q 1 )q 1 ,

(2) next, define q 2 = a02 /ka02 k and a003 = a03 (a03 q 2 )q 2 ,

(3) finally, define q 3 = a003 /ka003 k.

The vectors q 1 , q 2 , q 3 form an orthonormal basis. This procedure generalizes not only to any dimension

but also to other vector spaces, e.g. to construct orthogonal polynomials.

Exercises

1. Give explicit examples of 2-by-2 and 3-by-3 symmetric and antisymmetric matrices.

2. If xT = [x1 , x2 , x3 ], calculate xT x and xxT .

3. Show that xT x and xxT are symmetric (explicitly and by matrix manipulations).

4. If A is a square matrix of appropriate size, what is xT Ax?

5. Show that the product of two orthogonal matrices is an orthogonal matrix. Interpret geometrically.

6. What is the general form of a 3-by-3 orthogonal and symmetric matrix?

7. What is the orthogonal matrix corresponding to a reflection about the x z plane? What is its

determinant?

8. What is the most general form of a 2-by-2 orthogonal matrix?

9. Suppose that you would like to rotate an object (i.e. a set of points) about a given axis by an angle

. Can you explain how to do this? [Hint: (1) Translation: express coordinates of any point r with

respect to any point r 0 on the rotation axis: r r 0 . (2) Perform two elementary rotations to align

the vertical axis with the rotation axis, i.e. find the Euler angles and . Express the coordinates

of r r 0 in that new set of coordinates. (3) Rotate the vector by , this is equivalent to multiplying

by the transpose of the matrix corresponding to rotation of axes by . Then you need to re-express

the coordinates in terms of the original axes! thats a few multiplication by transpose of matrices you

already have].

10. What is the rotation matrix corresponding to rotation by about e2 ?

11. What is the matrix corresponding to (right-handed) rotation by angle about the direction e1 +e2 +e3 ?

12. Find the components of a vector a rotated by angle about the direction e1 + 2e2 + 2e3 .

13. Pick three non-trivial but arbitrary vectors in R3 (e.g. using Matlabs randn(3,3)), then construct an

orthonormal basis q 1 , q 2 , q 3 using the Gram-Schmidt procedure. Verify that the matrix Q = [q 1 , q 2 , q 3 ]

is orthogonal. Note in particular that the rows are orthogonal eventhough you orthogonalized the

columns only.

14. Pick two arbitrary vectors a1 , a2 in R3 and orthogonalize them to construct q 1 , q 2 . Consider the

3-by-2 matrix Q = [q 1 , q 2 ] and compute QQT and QT Q. Can you explain the results?

c

F.

Waleffe, Math 321, 2013/1/21

12.4

33

Determinant of a matrix

See earlier discussion of determinants (section on mixed product). The determinant of a matrix has the

explicit formula det(A) = ijk Ai1 Aj2 Ak3 , the only non-zero terms are for (i, j, k) equal to a permutation of

(1, 2, 3). We can deduce several fundamental properties of determinants from that formula. We can reorder

Ai1 Aj2 Ak3 into A1l A2m A3n using an even number of permutations if (i, j, k) is an even perm of (1,2,3) and

an odd number for odd permutations. So

det(A) = ijk Ai1 Aj2 Ak3 = lmn A1l A2m A3n = det(AT ).

(104)

ijk Ail Ajm Akn = ijk lmn Ai1 Aj2 Ak3

(105)

det(AB) = ijk Ail Bl1 Ajm Bm2 Akn Bn3 = ijk lmn Ai1 Aj2 Ak3 Bl1 Bm2 Bn3 = det(A) det(B)

(106)

One nice thing is that these results and manipulations generalize straightforwardly to any dimension.

12.5

Three views of Ax = b

Column View

I View b as a linear combination of the columns of A.

Write A as a row of columns, A = [a1 , a2 , a3 ], where aT1 = [a11 , a21 , a31 ] etc., then

b = Ax = x1 a1 + x2 a2 + x3 a3

and b is a linear combination of the columns a1 , a2 , a3 . If x is unknown, the linear system of equations

Ax = b will have a solution for any b if and only if the columns form a basis, i.e. iff det(a1 , a2 , a3 )

det(A) 6= 0. If the determinant is zero, then the 3 columns are in the same plane and the system will have

a solution only if b is also in that plane.

As seen in earlier exercises, we can find the components (x1 , x2 , x3 ) by thinking geometrically and projecting on the reciprocal basis e.g.

x1 =

Likewise

x2 =

det(b, a2 , a3 )

b (a2 a3 )

.

a1 (a2 a3 )

det(a1 , a2 , a3 )

det(a1 , b, a3 )

,

det(a1 , a2 , a3 )

x3 =

(107)

det(a1 , a2 , b)

.

det(a1 , a2 , a3 )

This is a nifty formula. Component xi equals the determinant where vector i is replaced by b divided

by the determinant of the basis vectors. You can deduce this directly from the algebraic properties of

determinants, for example,

det(b, a2 , a3 ) = det(x1 a1 + x2 a2 + x3 a3 , a2 , a3 ) = x1 det(a1 , a2 , a3 ).

This is Cramers rule and it generalizes to any dimension, however computing determinants in higher

dimensions can be very costly and the next approach is computationally much more efficient.

Row View:

I View x as the intersection of planes perpendicular to the rows of A.

View A as a column of rows, A = [n1 , n2 , n3 ]T , where nT1 = [a11 , a12 , a13 ] is the first row of A, etc., then

T

n1

n1 x = b1

n2 x = b2

b = Ax = nT2 x

n3 x = b3

nT3

34

and x is seen as the position vector of the intersection of three planes. Recall that n x = C is the equation

of a plane perpendicular to n and passing through a point x0 such that n x0 = C, for instance the point

x0 = Cn/knk.

To find x such that Ax = b, for given b and A, we can combine the equations in order to eliminate

unknowns, i.e.

n1 x = b1

n1 x = b1

n2 x = b2

(n2 2 n1 ) x = b2 2 b1

n3 x = b3

(n3 3 n1 ) x = b3 3 b1

where we pick 2 and 3 such that the new normal vectors n02 = n2 2 n1 and n03 = n3 3 n1 have a

zero 1st component i.e. n02 = (0, a022 , a023 ), n03 = (0, a032 , a033 ). At the next step, one defines a n003 = n03 3 n02

picking 3 so that the 1st and 2nd components of n003 are zero, i.e. n003 = (0, 0, a0033 ). And the resulting system

of equations is then easy to solve by backward substitution. This is Gaussian Elimination which in general

requires swapping of equations to avoid dividing by small numbers. We could also pick the s and s to

orthogonalize the ns, just as in the Gram-Schmidt procedure. That is better in terms of roundoff error and

does not require equation swapping but is computationally twice as expensive as Gaussian elimination.

Linear Transformation of vectors into vectors

I View b as a linear transformation of x.

Here A is a black box that transforms the vector input x into the vector output b. This is the most

general view of Ax = b. The transformation is linear, this means that

A(x + y) = (Ax) + (Ay),

, R, x, y Rn

(108)

This can be checked directly from the explicit definition of matrix-vector multiply:

X

Aik (xk + yk ) =

Aik xk +

Aik yk .

This linearity property is a key property because if A is really a black box (e.g. the matrix is not actually

known, its just a machine that takes a vector and spits out another vector) we can figure out the effect of

A onto any vector x once we know Ae1 , Ae2 , . . . , Aen .

This transformation view of matrices leads to the following extra rules of matrix manipulations.

Matrix-Matrix addition

X

X

X

Ax + Bx = (A + B)x

Aik xk +

Bik xk =

(Aik + Bik )xk , xk

(109)

k

zero matrix is the matrix whose entries are all zero.

Matrix-scalar multiply

A(x) = (A)x

X

k

Aik (xk ) =

X

(Aik )xk , , xk

(110)

so multiplication by a scalar is also done component by component and (A) = ()A = (A).

In other words, matrices can be seen as elements of a vector space! This point of view is also useful

in some instances (in fact, computer languages like C and Fortran typically store matrices as long vectors.

Fortran stores it column by column, and C row by row). The set of orthogonal matrices does NOT form a

vector space because the sum of two orthogonal matrices is not, in general, an orthogonal matrix. The set

of orthogonal matrices is a group, the orthogonal group O(3) (for 3-by-3 matrices). The special orthogonal

group SO(3) is the set of all 3-by-3 proper orthogonal matrices, i.e. orthogonal matrices with determinant

=+1 that correspond to pure rotation, not reflections. The motion of a rigid body about its center of inertia

is a motion in SO(3), not R3 . SO(3) is the configuration space of a rigid body.

c

F.

Waleffe, Math 321, 2013/1/21

35

Exercises

B Pick a random 3-by-3 matrix A and a vector b, ideally in matlab using its A=randn(3,3), b=randn(3,1).

Solve Ax = b using Cramers rule and Gaussian Elimination. Ideally again in matlab, unless punching

numbers into your calculator really turns you on. Matlab knows all about matrices and vectors. To compute

det(a1 , a2 , a3 ) = det(A) and det(b, a2 , a3 ) in matlab, simply use det(A), det(b,A(:,2),A(:,3)). Type

help matfun, or help elmat, and or demos for a peek at all the goodies in matlab.

12.6

Ax = x.

(111)

These special vectors are eigenvectors for A. They are simply shrunk or elongated by the transformation A.

The scalar is the eigenvalue. The eigenvalue problem can be rewritten

(A I)x = 0

where I is the identity matrix of the same size as A. This will have a non-zero solution iff

det(A I) = 0.

(112)

This is the characteristic equation for . If A is n-by-n, it is a polynomial of degree n in called the

characteristic polynomial.

36

Chapter 2

Vector Calculus

1

The position vector of a moving particle is a vector function r(t) of the scalar time t, for instance, a particle

moving a constant velocity v 0 has position vector r = r 0 + tv 0 = r(t) where r 0 is the position at t = 0. The

derivative of a vector function r(t) is defined as usual as the limit of a ratio

dr

r(t + h) r(t)

= lim

.

h0

dt

h

(1)

The derivative of the position vector is of course the instantaneous velocity vector v(t) = dr/dt, and for the

simple motion r = r 0 + tv 0 , dr/dt = v 0 . In general, a position vector function r(t) describes a curve in

three-dimensional space, the particle trajectory, and v(t) is tangent to that curve as we will discuss further in

section 5.1 below. The derivative of the velocity vector is the acceleration vector a(t) = dv/dt. We sometime

use Newtons dot notation for time derivatives: dr/dt r,

d2 r/dt2 = r, etc.

Rules for derivatives of vector functions are similar to those of simple functions. The derivative of a sum

of vector functions is the sum of the derivatives,

d

da db

(a + b) =

+

.

dt

dt

dt

(2)

d

d

da

(a) =

a+ ,

dt

dt

dt

d

da

db

(a b) =

b+a

,

dt

dt

dt

(3)

da

db

d

(a b) =

b+a

,

dt

dt

dt

(4)

d

da

db

dc

[(a b) c] = (

b) c + (a

) c + (a b) ,

dt

dt

dt

dt

(5)

therefore

d

da

db

dc

det(a, b, c) = det( , b, c) + det(a, , c) + det(a, b, ).

(6)

dt

dt

dt

dt

All of these are as expected but the formula for the derivative of a determinant is worth noting because it

generalizes to any dimension.1

1 For

determinants in R3 it reads

a

b1

d 1

a2 b2

dt a

b3

3

c1

c2

c3

a 1

= a 2

a 3

b1

b2

b3

c1

c2

c3

a1

+ a2

a

3

b 1

b 2

b 3

c1

c2

c3

and of course we could also take the derivatives along rows instead of columns.

37

a1

+ a2

a3

b1

b2

b3

c1

c2

c3

38

Exercises

1. Show that if u(t) is any vector with constant length, then

u

du

= 0,

dt

t.

(7)

The derivative of a vector of constant magnitude is orthogonal to the vector. [Hint: u u = u20 ]

2. If r(t) is not of constant magnitude, what is the geometric meaning of points where r dr/dt = 0?

Make sketches to illustrate such r(t) and points.

3. Show that d|a|/dt = a

da/dt for any vector function a(t). Make a sketch to illustrate.

4. If v(t) = dr/dt show that d(r v)/dt = r dv/dt. In mechanics, r mv , L is the angular momentum

of the particle of mass m and velocity v with respect to the origin.

We now illustrate all these concepts and results by considering the basic problems of classical mechanics:

motion of a particle and motion of a rigid body.

Motion of a particle

F = ma,

(8)

where F is the resultant of the forces acting on the particle and a(t) = dv/dt = d2 r/dt2 is its acceleration,

with r(t) its position vector. Newtons law is a vector equation.

I Deduce the angular momentum law dL/dt = T where T = r F is the torque (see exercise 4 above).

Free motion

If F = 0 then a = dv/dt = 0 so the velocity of the particle is constant, v(t) = v 0 say, and its position is given

by the vector differential equation dr/dt = v 0 whose solution is r(t) = r 0 + tv 0 where r 0 is a constant of

integration which corresponds to the position of the particle at time t = 0. The particle moves in a straight

line through r 0 parallel to v 0 .

Constant acceleration

dv

d2 r

=

= a(t) = a0

dt2

dt

where a0 is a time-independent vector. Integrating we find

v(t) = a0 t + v 0 ,

r(t) = a0

t2

+ v0 t + r0

2

(9)

(10)

where v 0 and r 0 are vector constants of integration. They are easily interpreted as the velocity and position

at t = 0. The trajectory is a parabola passing through r 0 parallel to v 0 at t = 0. The parabolic motion is

in the plane through r 0 that is parallel to v 0 and a0 .

Uniform rotation

If a particle rotates with angular velocity about an axis parallel to n

that passes through the point r A

(|

n| = 1 and define > 0 for right-handed rotation about n

, < 0 for left-handed rotation) then its velocity

is

dr

v(t) =

= (r r A )

(11)

dt

where = n is the rotation vector.

B Show that |r(t) r a | remains constant. Calculate the particle acceleration if and r A are constants

and interpret geometrically. Find the force required to sustain such a motion according to Newtons law.

c

F.

Waleffe, Math 321, 2013/1/21

39

A force F = F (r) r where r = |r| that always points toward the origin (if F (r) > 0, away if F (r) < 0 )

and depends only on the distance to the origin is called a central force. The gravitational force for planetary

motion and the Coulomb force in electromagnetism are of that kind. Newtons law for a particle submitted

to such a force is

dv

m

= F (r) r

(12)

dt

where v = dr/dt and r(t) = r

r is the position vector of the particle, hence both r and r are functions of

time t, in general. Motion due to such a force has two conserved quantities, angular momentum and energy.

1. Conservation of angular momentum F (r)

Take the cross product of (12) with r to obtain

r

d

dv

=0

(r v) = 0 r v = r 0 v 0 , L0 z

dt

dt

(13)

where L0 > 0 and z are constants (see exercise 4 in the previous section). So the motion remains in

the plane that passes through the origin O and is orthogonal to z (why?). Now r vdt = r dr =

2 dA(t) z = L0 zdt where dA(t) = 12 |r(t) dr(t)| = L0 /2 dt is the infinitesimal triangular area swept

by r in time dt. This yields

Keplers law: The radius vector sweeps equal areas in equal times.

2. Conservation of energy: kinetic + potential

Take the dot product of (12) with v to obtain

m

dv

d vv

v + F (r) r v = 0

m

+ V (r) = 0,

dt

dt

2

(14)

where dV (r)/dr F (r) as by the chain rule dV (r)/dt = (dV /dr)(dr/dt) = (dV /dr) r v (see exercise

3 in the previous section) This implies that

|v|2

m

+ V (r) = E0

(15)

2

where E0 is a constant. The first term m|v|2 /2 is the kinetic energy and the second V (r) is the potential

energy which is defined up to an arbitrary constant. The constant E0 is the total conserved energy.

Note that V (r) and E0 can be negative but m|v|2 /2 0, so the physically admissible r domain is that

were V (r) is less than E0 .

Consider N particles of mass mi at positions r i , i = 1, . . . , N . The net force acting on particle number i is

F i and Newtons law for each particle reads mi ri = F i . Summing over all is yields

N

X

i=1

mi ri =

N

X

F i.

i=1

Great cancellations occur on both sides. On the left side, let r i = r c + si , where r c is the center of mass

and si is the position vector of particle i with respect to the center of mass, then

X

X

X

X

mi r i =

mi (r c + si ) = M r c +

mi si

mi si = 0,

i

P

P

as, by definition of the

r c , where M = i mi is thePtotal mass.PIf the masses mi

P center of mass

P i mi r i = M P

are constants then i mi si = 0 i mi s i = 0 i mi si = 0. In that case, i mi ri = i mi (

r c + si ) =

40

P

mi rc = M rc . On the right-hand side, by action-reaction, all internal forces cancel out and the resultant

P

P (e)

is therefore the sum of all external forces only i F i = i F i = F (e) .

Therefore,

M rc = F (e)

(16)

i

where M is the total mass and F (e) is the resultant of all external forces acting on all the particles. The

motion of the center of mass of a system of particles is that of a single particle of mass M with position

vector r c under the action of the sum of all external forces. This is a fundamental theorem of mechanics.

There are also nice cancellations occurring for the motion about the center of mass. This involves

considering angular momentum and torques about the center of mass. Taking the cross-product of Newtons

law, mi ri = F i , with si for each particle and summing over all particles gives

X

X

si mi ri =

si F i .

i

On the left hand side, r i r c + si and the definition of center of mass implies

X

si mi ri =

si mi (

r c + si ) =

si mi si =

d

dt

mi si = 0. Therefore

!

si mi s i

This last expression is the rate of change of the total angular momentum about the center of mass

Lc

N

X

(si mi s i ) .

i=1

On the right hand side, one can argue that the (internal) force exerted by particle j on particle i is in the

direction of the relative position of j with respect to i, f ij ij (r i r j ). By action-reaction the force from

i onto j is f ji = f ij = ij (r i r j ), and the net contribution to the torque from the internal forces will

cancel out: r i f ij + r j f ji = 0. This is true with respect to any point and in particular, with respect to

the center of mass si f ij + sj f ji = 0. Hence, for the motion about the center of mass we have

dLc

= T c(e)

dt

(17)

P

where T (e) = i si F i is the net torque about the center of mass due to external forces only. This is

another fundamental theorem, that the rate of change of the total angular momentum about the center of

mass is equal to the total torque due to the external forces only.

B If f ij = (r i r j ) and f ji = (r j r i ), show algebraically and geometrically that si f ij +sj f ji = 0,

where s is the position vector from the center of mass.

The two vector differential equations for motion of the center of mass and evolution of the angular momentum

about the center of mass are sufficient to fully determine the motion of a rigid body.

A rigid body is such that all lengths and angles are preserved within the rigid body. If A, B and C are

any three points of the rigid body, then AB AC = constant.

Kinematics of a rigid body

Consider a right-handed orthonormal basis, e1 (t), e2 (t), e3 (t) tied to the body. These vectors are functions

of time t because they are frozen into the body so they rotate with it. However the basis remains orthonormal

as all lengths and angles are preserved. Hence ei (t) ej (t) = ij i, j = 1, 2, 3, and t and differentiating

with respect to time

dei

dej

ej + ei

= 0.

(18)

dt

dt

c

F.

Waleffe, Math 321, 2013/1/21

41

In particular, as seen in an earlier exercise, the derivative of a unit vector is orthogonal to the vector:

el del /dt = 0, l = 1, 2, 3. So we can write

del

l el ,

dt

l = 1, 2, 3

(19)

as this guarantees that el del /dt = 0 for any l . Substituting this expression into (18) yields

( i ei ) ej + ei ( j ej ) = 0,

and rewriting the mixed products

(ei ej ) i = (ei ej ) j .

(20)

Now let

l

kl ek = 1l e1 + 2l e2 + 3l e3 ,

X

X

ijk ki =

ijk kj

k

(21)

where as before ijk (ei ej ) ek . The sums over k have at most one non-zero term. This yields the three

equations

(i, j, k) = (1, 2, 3) 31 = 32

(i, j, k) = (2, 3, 1) 12 = 13

(22)

(i, j, k) = (3, 1, 2) 23 = 21 .

The second equation, for instance, says that the first component of 2 is equal to the first component of

3 . Now ll is arbitrary according to (19) (why?), so we can choose to define 11 , the first component of

1 , for instance, equal to the first components of the other two vectors that are equal to each other, i.e.

11 = 12 = 13 . Likewise, pick 22 = 23 = 21 and 33 = 31 = 32 . This choice implies that

1 = 2 = 3

(23)

The Poisson vector (t) gives the rate of change of any vector tied to the body. Indeed, if A and B are

any two points of the body then the vector c AB can be expanded with respect to the body basis e1 (t),

e2 (t), e3 (t)

c(t) = c1 e1 (t) + c2 e2 (t) + c3 e3 (t),

but the components ci c(t) ei (t) are constants because all lengths and angles, and therefore all dot

products, are time-invariant. Thus

3

X

dc X dei

=

ci

=

ci ( ei ) = c.

dt

dt

i=1

i=1

This is true for any vector tied to the body (material vectors), implying that the Poisson vector is unique

for the body.

Dynamics of rigid body

The center of mass of a rigid body moves according to the sum of the external forces as for a system of

particles. A continuous rigid body can be considered as a continuous distribution of infinitesimal masses

dm

Z

N

X

mi si

s dm

i=1

42

where the three-dimensional integral is over all points s in the domain V of the body (dm is the measure of

the infinitesimal volume element dV , or in other words dm = dV , where (s) is the mass density at point

s).

For the motion about the center of mass, the position vectors si are frozen into the body hence s i = si

for any point of the body. The total angular momentum for a rigid system of particles then reads

L=

mi si s i =

mi si ( si ) =

mi |si |2 si (si ) .

(24)

Z

L=

|s|2 s (s ) dm.

(25)

The Poisson vector is unique for the body, so it does not depend on s and we should be able Rto take it out

of the sum, or integral. Thats easy for the ksk2 term, but how can we get out of the s (s ) dm

term?! We need to introduce the concepts of tensor product and tensors to do this, but we can give a hint

by switching to index notation L Li , s si , |s| = s, i , with i = 1, 2, 3 and writing

Z

Li =

s2 i si (sj j ) dm =

Z

s2 ij si sj dm j , Jij j

(26)

where J is the tensor of inertia of the rigid body, independent of the rotation vector .

5.1

Curves

v0

r0

r(t)

P

Recall the parametric equation of a line: r(t) = r 0 + tv 0 , where r(t) =

origin O, r 0 is the position vector of a reference point on the line and

v 0 is a vector parallel to the line. Note that this can be interpreted as

the linear motion of a particle with constant velocity v 0 that was at

the point r 0 at time t = 0 and r(t) is the position at time t.

O

More generally, a vector function r(t) of a real variable t defines a curve C.

The vector function r(t) is the parametric representation of that curve and

t is the parameter. It is useful to think of t as time and r(t) as the position

of a particle at time t. The collection of all the positions for a range of t is

the particle trajectory. The vector r = r(t + h) r(t) is a secant vector

connecting two points on the curve, if we divide r by h and take the limit

as h 0 we obtain the vector dr/dt which is tangent to the curve at r(t).

If t is time, then dr/dt = v is the velocity.

C

dr/dt

r(t)

r(t + h)

O

c

F.

Waleffe, Math 321, 2013/1/21

43

The parameter can have any name and does not need to correspond

to time. For instance r(t) = x

a cos (t) + y

a sin (t), with (t) =

cos t and x

, y

orthonormal, is the position of a particle moving

around a circle of radius a, but the particle oscillates back and

forth around the circle. That same circle is more simply described

by

r() = x

a cos + y

a sin ,

dr/d

r()

between the position vector and the x

basis vector.

Remark: Note the common abuse of notation where r(t) and r() are two different vector functions.

Both represent points on the same curve but not the same points, for instance r(t = /2) = a

x while

r( = /2) = a

y . In principle, we should distinguish between the two vector functions writing r = f (t) and

r = g() with f (t) = g((t)), but in applications this quickly becomes cumbersome and a nice advantage of

this abuse of notation is that it fits nicely with the chain rule

dr d

dr

=

.

dt

d dt

Remark: A convenient parametrization, conceptually, is to use distance along the curve as the parameter,

often denoted s and called the arclength. Once a curve is specified and an s = 0 reference point has been

chosen then the distance s from that reference point along the curve uniquely defines a point on the curve,

e.g. for the highway 90 curve, s could be picked as distance from Chicago, positive westward. For the circle

of radius a, this could be the parametrization r(s) = x

a cos(s/a) + y

a sin(s/a), where s is now the distance

along the circle from the x-axis and s/a is the angle in radians.

5.2

dr

C

Line element: Given a curve C, the line element denoted dr is

an infinitesimal secant vector. This is a useful shortcut for the

procedure of approximating the curve by a succession of secant

vectors r n = r n r n1 where r n1 and r n are two consecutive

points on the curve, with n = 1, 2, . . . , N integer, then taking the

limit max |r n | 0 (so N ). In that limit, the direction of

the secant vector r n becomes identical with that of the tangent

vector at that point. If an explicit parametric representation r(t)

is known for the curve then

r n

r n1

dr =

rn

dr(t)

dt

dt

(27)

R

The typical line integral along a curve C has the form C F dr where F (r) is a vector field, i.e. a vector

function of position. If F (r) is a force, this integral represent the net work done by the force on a particle as

the latter moves along the curve. We can make sense of this integral as the limit of a sum, namely breaking

up the curve into a chain of N secant vectors r n as above then

Z

F dr =

C

lim

max |r n |0

N

X

F n r n

(28)

n=1

where F n is an estimate of the average value of F along the segments r n1 r n . A simple choice is

F n = F (r n ) but better choices are the trapezoidal rule F n = 21 (F (r n ) + F (r n1 )), or the midpoint rule

44

1

RF n = F 2 (r n + r n1 ) . These different choices for F n give finite sums that converge to the same limit,

F dr, but the trapezoidal and midpoint rules will converge faster for nice functions, and give more

C

accurate finite sum approximations.

If an explicit representation r(t) is known then we can reduce the line integral to a regular Calc I integral:

Z

tb

F dr =

C

ta

dr(t)

dt,

F (r(t))

dt

(29)

where r(ta ) is the starting point of curve C and r(tb ) is its end point. These may be the same point even if

ta 6= tb (e.g. integral once around a circle from = 0 to = 2).

Likewise, we can use the limit-of-a-sum definition to make sense of many other types of line integrals

such as

Z

Z

Z

Z

f (r) |dr|,

F |dr|,

f (r) dr,

F dr.

C

The first one gives a scalar result and the latter three give vector results.

I One important example is

Z

N

X

|dr| = lim

|r n |

|r n |0

(30)

n=1

which is the length of the curve C from its starting point r a = r 0 to its end point r b = r N . If a parametrization

r = r(t) is known then

Z

Z tb

dr(t)

(31)

|dr| =

dt |dt|

C

ta

where r a = r(ta ) and r b = r(tb ). Thats almost a Calc I integral, except for that |dt|, what does that mean?!

Again you can understand that from the limit-of-a-sum definition with t0 = ta , tN = tb and tn = tn tn1 .

If ta < tb then tn > 0 and dt > 0, so |dt| = dt and were blissfully happy. But if tb < ta then tn < 0 and

dt < 0, so |dt| = dt and

Z tb

Z ta

( )|dt| =

( )dt,

if ta > tb .

(32)

ta

I A special example of a

R

C

tb

F dr integral is

Z

r dr =

C

lim

|r n |0

N

X

n=1

tb

r n r n =

ta

dr(t)

dt

r(t)

dt

z whose magnitude is twice the area

A swept by the position vector r(t) when the curve C lies in a plane

perpendicular to z and O is in that plane (recall Keplers law that the

radius vector sweeps equal areas in equal times ). This follows directly

from the fact that r n r n = 2(A)

z is the area of the parallelogram

with sides r n and r n which is twice the area A of the triangle r n1 ,

r n , r n . If C and O are not coplanar then the vectors r n r n are

not necessarily in the same direction and their vector sum is not the

area swept by r. In that more general case, the surface Ris conical and

to calculate its area S we would need to calculate S = 12 C |r dr|.

(33)

r n

r n1

A

rn

z

Exercises:

1. What is the curve described by r(t) = a cos t x

+ a sin t y

+ bt z, where a, b and are constant real

numbers and x

, y

, z represent the orthonormal unit vectors for cartesian coordinates? What is the

tangent to the curve at r(t)?

2. Consider the vector function r() = r c + a cos e1 + b sin e2 , where r c , e1 , e2 , a and b are constants,

with ei ej = ij . What kind of curve is this? Next, assume that r c , e1 and e2 are in the same plane.

c

F.

Waleffe, Math 321, 2013/1/21

45

x + yy

. Assume that the angle

between e1 and x

is . Derive the equation of the curve in terms of the cartesian coordinates (x, y)

(i) in parametric form, (ii) in implicit form f (x, y) = 0. Simplify your equations as much as possible.

Find a geometric interpretation for the parameter .

3. Generalize the previous exercise to the case where r c is not in the same plane as e1 and e2 . Consider

general cartesian coordinates (x, y, z) such that r = x

x + yy

+ z z. Assume that all the angles between

e1 and e2 and the basis vectors {

x, y

, z} are known. How many independent angles is that? Specify

those angles. Derive the parametric equations of the curve for the cartesian coordinates (x, y, z) in

terms of the parameter .

4. Derive integrals for the length and area of the planar curve in the previous exercise. Clean up your

integrals and compute them if possible.

R

R

5. Calculate C dr and C r dr along the curve of the preceding exercise from r(0) to r(3/2).

R

R

6. Calculate C F dr and C F dr with F = (

z r)/|

z r|2 when C is the circle of radius R in the

x, y plane centered at the origin. How do the integrals depend on R?

5.3

Surfaces

[You favorite lecturer will draw many pretty pictures, but you should learn to draw them all by yourself ]

Recall the parametric equation of a plane: r(u, v) = r 0 + u a + v b where r 0 is a point on the plane, a, b

are two vectors parallel to the plane (but not to each other) and u, v are real parameters. These parameters

specify coordinates for the surface. More generally, the parametric equation of a surface prescribes the

position vector r as a function of two real parameters, u and v say, r(u, v). Again, the names of the

parameters do not matter, r(s, t) is also common. The following two examples are fundamental.

I If the surface can be parametrized by the cartesian coordinates x and y, i.e. the surface is described

by z = h(x, y), then the position vector of a point on the surface is

r(x, y) = x x

+yy

+ h(x, y) z,

(34)

where x

, y

, z are the unit vectors in the respective coordinate directions.

I The surface of a sphere of radius R centered at r c can be parametrized by

r(u, v) = r c + e1 R cos u cos v + e2 R sin u cos v + e3 R sin v,

(35)

where ei ej = ij . Verify that |r r c | = R for any real u and v. The parameters u and v can be

interpreted as angles. With e3 as the polar axis, v is the latitude and u is the longitude. The parametric

representation r(, ) = r c + e1 R cos sin + e2 R sin sin + e3 R cos , is a different but equally valid

parametric representation of the same sphere. Here is the polar angle, or co-latitude. It is the angle

between the position vector and the polar axis e3 . The angle is the azimuthal, or longitudinal, angle.

Coordinate curves and Tangent vectors If one of the parameters is held fixed, v = v0 say, we obtain

a curve r(u, v0 ). There is one such curve for every value of v. For the sphere parametrized as in (35),

r(u, v0 ) is the v0 -parallel, the circle at latitude v0 . Likewise r(u0 , v) describes another curve. This would be

a longitude circle, or meridian, for the sphere. The set of all such curves generates the surface. These two

families of curves are parametric curves or coordinate curves for the surface. The vectors r/u and r/v

are tangent to their respective parametric curves and hence to the surface. These two vectors taken at the

same point r(u, v) define the tangent plane at that point. The coordinates are said to be orthogonal if the

tangent vectors r/u and r/v at each point r(u, v) are orthogonal to each other.

I Consider the sphere x2 +y 2 +z 2 = R2 . Parametrize the northern hemisphere z 0 using both (34) and

(35). Make 3D perspective sketches of the coordinate curves for both parametrizations. What happens at

the pole and and the equator? Calculate r/u and r/v for both parametrizations. Are the coordinates

orthogonal?

46

Normal to the surface at a point At any point r(u, v) on a surface, there is an infinity of tangent

directions but there is only one normal direction. The normal to the surface at a point r(u, v) is given by

N=

r

r

.

u v

(36)

Note that the ordering (u, v) specifies an orientation for the surface, i.e. an up and down side, and that

N is not a unit vector, in general.

Surface element: The surface element dS at a point r on a surface S is a vector perpendicular to

the surface at that point with a magnitude dS that is the area of an infinitesimal patch of surface at that

point. Just like in the case of the line element along a curve, this is a useful shortcut for the limit-of-a-sum

interpretation of integrals. The surface element is often written dS = n

dS where n

is the unit normal to

the surface at that point. If a parametric representation r(u, v) for the surface is known then

r

r

dudv,

(37)

dS =

u v

r

du and

since the right hand side represents the area of the parallelogram formed by the line elements u

r

6= N but n

= N /|N | since n

is a unit vector. Although we often need to refer to

v dv. Note that n

the unit normal n

, it is usually not needed to compute it explicitly since in practice it is the area element

dS = N dudv which is required.

Exercises:

1. Compute tangent vectors and the normal to the surface z = h(x, y). Show that r/x and r/y

are not orthogonal to each other in general. Determine the class of functions h(x, y) for which (x, y)

are orthogonal coordinates on the surface z = h(x, y) and interpret geometrically. Derive an explicit

formula for the area element dS = |dS| in terms of h(x, y).

2. Deduce from the implicit equation |r r c | = R for a sphere of radius R and center r c that (r r c )

r/u = (r r c ) r/v = 0 for any u and v, where r(u, v) is any parametrization of a point on

the sphere. Compute r/u and r/v for the parametrization (35) and verify that these vectors are

perpendicular to the radial vector r r c and to each other (when evaluated at the same point on the

sphere, whatever that point is). Orthogonality of r/u and r/v implies that those u and v are

orthogonal coordinates for the sphere. Compute the surface element dS for the sphere in terms of both

the longitude-latitude parametrization (35) and the longitude-colatitude parametrization , . Do the

surface elements point toward or away from the center of the sphere?

3. Explain why the surface described by r(u, v) = x

a cos u cos v + y

b sin u cos v + z c sin v where a, b and

c are real constants is the surface of an ellipsoid. Are u and v orthogonal coordinates for that surface?

Consider cartesian coordinates (x, y, z) such that r = x

x + yy

+ z z. Derive the implicit equation

f (x, y, z) = 0 satisfied by all such r(u, v)s.

4. Consider the vector function r(u, v) = r c +e1 a cos u cos v+e2 b sin u cos v+e3 c sin v, where r c , e1 , e2 , e3 ,

a, b and c are constants with ei ej = ij . What does the set of all such r(u, v)s represent? Consider

cartesian coordinates (x, y, z) such that r = x

x + yy

+ z z. Assume that all the angles between e1 ,

e2 , e3 and the basis vectors {

x, y

, z} are known. How many independent angles is that? Specify

which angles. Can you assume {e1 , e2 , e3 } is right handed? Explain. Derive the implicit equation

f (x, y, z) = 0 satisfied by all such r(u, v)s. Express your answer in terms of the minimum independent

angles that you specified earlier.

5. Explain why the surface of a torus (i.e. donut or tire) can be parametrized as x = (R + a cos ) cos ,

y = (R + a cos ) sin , z = a sin . Interpret the geometric meaning of the parameters R, a, and

. What are the ranges of and needed to cover the entire torus? Do these parameters provide

orthogonal coordinates for the torus? Calculate the surface element dS.

c

F.

Waleffe, Math 321, 2013/1/21

5.4

47

Surface integrals

R

The typical surface integral is of the form S v dS. This represents

the flux of v through the surface S. If

R

v(r) is the velocity of a fluid, water or air, at point r, then S v dS is the time-rate at which volume of fluid

is flowing through that surface per unit time. Indeed, v dS = (v n

) dS where n

is the unit normal to the

surface and v n

is the component of fluid velocity that is perpendicular to the surface. If that component is

zero, the fluid moves tangentially to the surface, not through the surface. Speed area = volume per unit

time, so v dS is the volume of fluid passing through the surface element dS per unit

R time at that point at

that time. The total volume passing through the entire surface S per unit time is S v dS. Such integrals

are often called flux integrals.

We can make sense of such integrals as the limit of a sum, for instance by picking a series of points on

the surface, then forming a triangular meshing of the surface and evaluating the integral as the limit of the

sum of v evaluated at the center of such triangles dotted with the unit normal to the triangle and times the

area of the triangle (making sure of course that for each triangle we pick its normal n

to correspond to the

same side of the surface as the neighboring elements).

If a parametric representation r(u, v) is known for the surface then we can also write

Z

Z

r

r

dudv,

(38)

v

v dS =

u v

A

S

where A is the domain in the u, v parameter plane that corresponds to S.

As for line integrals, we can make sense of many other types of surface integrals such as

Z

p dS,

S

which would represent the netR pressure force on S if p = p(r) is the pressure at pointR r. Other surface

integrals could have the form S v dS, etc. In particular the total area of surface S is S |dS|.

Exercises:

1. Compute the percentage of surface area that lies north of the arctic circle on the earth (assume it is a

perfect sphere). Show your work, dont just google it.

2. Provide an explicit integral for the total surface area of the torus of outer radius R and inner radius a.

R

3. Calculate S r dS where S is (i) the square 0 x, y a at z = b, (ii) the surface of the sphere of

radius R centered at (0, 0, 0), (iii) the surface of the sphere of radius R centered at x = x0 , y = z = 0.

R

4. Calculate S (r/r3 ) dS where S is the surface of the sphere of radius R centered at the origin. How

does the result depend on R?

5. The pressure outside the sphere of radius R centered at r c is p = p0 + Ar a

where a

is an arbitrary but

fixed unit vector and p0 and A are constants. The pressure inside the sphere is the constant p1 > p0 .

Calculate the net force on the sphere. Calculate the net torque on the sphere about its center r c and

about the origin.

5.5

We have seen that r(t) is the parametric equation of a curve, r(u, v) represents a surface, now we discuss

r(u, v, w) which is the parametric representation of a volume. Curves r(t) are one dimensional objects so

they have only one parameter t or each point on the known curve is determined by a single coordinate.

Surfaces are two-dimensional and require two parameters u and v, which are coordinates for points on that

surface. Volumes are three-dimensional objects that require three parameters u, v, w say. Each point is

specified by three coordinates.

I A sphere of radius R centered at r c has the implicit equation |r r c | R or (r r c ) (r r c ) R2

to avoid square roots. In cartesian coordinates this translates into the implicit equation

(x xc )2 + (y yc )2 + (z zc )2 R2 .

(39)

48

An explicit parametrization for that sphere is

r(u, v, w) = r c + e1 w cos u cos v + e2 w sin u cos v + e3 w sin v,

(40)

(41)

or

where {e1 , e2 , e3 } are any three orthonormal vectors such that ei ej = ij . For the parametrization (40)

we recognize u and v as the longitude and latitude parameters used earlier for spherical surfaces (35) and

note that w has units of length. From orthonormality of {e1 , e2 , e3 } we deduce that (r r c ) (r r c ) = w2 ,

so w can be interpreted as the distance from the center of the sphere. To parametrize the entire sphere we

need to consider all the u, v and ws such that

0 u < 2,

v ,

2

2

0 w R.

(42)

The parametrization (41) is mathematically equivalent to this u, v, w parametrization but it is in the standard

form of spherical coordinates, with r representing the distance to the origin, the azimuthal (or longitude)

angle and the polar angle. To describe the full sphere we need

0 r R,

0 < 2,

0 .

(43)

Coordinate curves

For a curve r(t), all we needed to worry about was the tangent dr/dt and the line element dr = (dr/dt)dt.

For surfaces, r(u, v) we have two sets of coordinates curves with tangents r/u and r/v, a normal

N = (r/u) (r/v) and a surface element dS = N dudv. Now for volumes r(u, v, w), we have three

sets of coordinates curves with tangents r/u, r/v and r/w. A u-coordinate curve for instance,

corresponds to r(u, v, w) with v and w fixed. There is a double infinity of such one dimensional curves, one

for each v, w pair. For the parametrization (40), the u-coordinate curves correspond to parallels, i.e. circles

of fixed radius w at fixed latitude v. The v-coordinate curves are meridians, i.e. circles of fixed radius w

through the poles. The w-coordinate curves are radial lines out of the origin.

Coordinate surfaces

For volumes r(u, v, w), we also have three sets of coordinate surfaces corresponding to one parameter

fixed and the other two free. A w-isosurface for instance corresponds to r(u, v, w) for a fixed w. There is a

single infinity of such two dimensional (u, v) surfaces. For the parametrization (40) such surfaces correspond

to spherical surfaces of radius w centered at r c . Likewise, if we fix u but let v and w free, we get another

surface, and v fixed with u and w free is another coordinate surface.

Line Elements

Thus given a volume parametrization r(u, v, w) we can define four types of line elements, one for each

of the coordinate directions (r/u)du, (r/v)dv, (r/w)dw and a general line element corresponding to

the infinitesimal displacement from coordinates (u, v, w) to the coordinates (u + du, v + dv, w + dw). That

general line element dr is given by (chain rule):

dr =

r

r

r

du +

dv +

dw.

u

v

w

(44)

Surface Elements

Likewise, there are three basic types of surface elements, one for each coordinate surface. The surface

element on a w-isosurface, for example, is given by

r

r

dS w =

dudv,

(45)

u v

while the surface elements on a u-isosurface and a v-isosurface are respectively

r

r

r

r

dS u =

dvdw,

dS v =

dudw.

v

w

w u

(46)

c

F.

Waleffe, Math 321, 2013/1/21

49

Note that surface orientations are built into the order of the coordinates.

Volume Element

Last but not least, a parametrization r(u, v, w) defines a volume element given by the mixed (i.e. triple

scalar) product

r

r

r

r r r

dV =

du dv dw det

,

,

du dv dw.

(47)

u v

w

u v w

The definition of volume integrals as limit-of-a-sum should be obvious by now. If an explicit parametrization r(u, v, w) for the volume is known, we can use the volume element (47) and write the volume integral

in r space as an iterated triple integral over u, v, w. Be careful that there is an orientation implicitly built

into the ordering of the parameters, as should be obvious from the definition of the mixed product and

determinants. The volume element dV is usually meant to be positive so the sign of the mixed product

and the bounds of integrations for the parameters u, v and w must be chosen to respect that. (Recall the

definition of |dt| in the line integrals section).

Exercises

1. Calculate the line, surface and volume elements for the coordinates (40). You need to calculate 4 line

elements and 3 surfaces elements. One line element for each coordinate curve and the general line

element. Verify that these coordinates are orthogonal.

2. Formulate integral expressions in terms of the coordinates (40) and (41) for the surface and volume of

a sphere of radius R. Calculate those integrals.

3. A curve r(t) is given in terms of the (u, v, w) coordinates (40), i.e. r(t) = r(u, v, w) with (u(t), v(t), w(t))

for t = ta to t = tb . Find an explicit expression in terms of (u(t), v(t), w(t)) as a t-integral for the

length of that curve.

4. Find suitable coordinates for a torus. Are your coordinates orthogonal? Compute the volume of that

torus.

5.6

Parametrizations of curves, surfaces and volumes is essentially equivalent to the concepts of mappings and

curvilinear coordinates.

Mappings

The parametrization (35) for the surface of a sphere of radius R provides a mapping of that surface to the

0 u < 2, /2 v /2 rectangle in the (u, v) plane. In a mapping r(u, v) a small rectangle of sides

du, dv at a point (u, v) in the (u, v) plane is mapped to a small parallelogram of sides (r/u)du, (r/v)dv

at point r(u, v) in the Euclidean space.

The parametrization (40) for the sphere of radius R provides a mapping from the sphere of radius R in

Euclidean space to the box 0 u < 2, /2 v /2, 0 w R in the (u, v, w) space. In a mapping

r(u, v, w), the infinitesimal box of sides du, dv, dw located at point (u, v, w) in the (u, v, w) space is mapped

to a parallelepiped of sides (r/u)du, (r/v)dv, (r/w)dw at the point r(u, v, w) in the Euclidean space.

Curvilinear coordinates, orthogonal coordinates

The parametrizations r(u, v) and r(u, v, w) define coordinates for a surface or a volume, respectively. If the

coordinate curves are not straight lines one talks of curvilinear coordinates. These mappings define good

coordinates if the coordinate curves intersect transversally i.e. if the coordinate curves are not tangent to

each other. If the coordinate curves intersect transversally at a point then the coordinates tangent vectors at

that point provide linearly independent directions in the space of r. Tangent intersections at a point r would

imply that the tangent vectors are not linearly independent at that point and that (r/u) (r/v) = 0

50

at r(u, v) in the surface case or that det(r/u, r/v, r/w) = 0 at that point r(u, v, w) in the volume

case.

The coordinates (u, v, w) are orthogonal if the coordinate curves in r-space intersect at right angles. This

is the best kind of transversal intersection and these are the most desirable type of coordinates, however

non-orthogonal coordinates are sometimes more convenient for some problems. Two fundamental examples

of orthogonal curvilinear coordinates are

I Cylindrical (or polar) coordinates

x = cos ,

y = sin ,

z = z.

(48)

I Spherical coordinates

x = r sin cos ,

y = r sin sin ,

z = r cos .

(49)

Changing notation from (x, y, z) to (x1 , x2 , x3 ) and from (u, v, w) to (q1 , q2 , q3 ) a general change of

coordinates from cartesian (x, y, z) (x1 , x2 , x3 ) to curvilinear (q1 , q2 , q3 ) coordinates is expressed succinctly

by

xi = xi (q1 , q2 , q3 ), i = 1, 2, 3.

(50)

The position vector r can be expressed in terms of the qj s through the cartesian expression:

r(q1 , q2 , q3 ) = x

x(q1 , q2 , q3 ) + y

y(q1 , q2 , q3 ) + z z(q1 , q2 , q3 ) =

3

X

ei xi (q1 , q2 , q3 ),

(51)

i=1

where ei ej = ij . The qi coordinate curve is the curve r(q1 , q2 , q3 ) where qi is free but the other two variables

are fixed. The qi isosurface is the surface r(q1 , q2 , q3 ) where qi is fixed and the other two parameters are free.

The coordinate tangent vectors r/qi are key to the coordinates. They provide a natural vector basis

for those coordinates. The coordinates are orthogonal if these tangent vectors are orthogonal to each other

at each point. In that case it is useful to define the unit vector qi in the qi coordinate direction by

r

r

,

= hi qi ,

(52)

hi =

qi

qi

where hi is the the magnitude of the tangent vector in the qi direction, r/qi , and qi qj = ij for orthogonal

coordinates. These hi s are called the scale factors. Note that this decomposition of r/qi into a scale factor

hi and a direction qi clashes with the summation convention. The distance traveled in x-space when changing

qi by dqi , keeping the other qs fixed, is |dr| = hi dqi (no summation). The distance ds travelled in x-space

when the orthogonal curvilinear coordinates change from (q1 , q2 , q3 ) to (q1 + dq1 , q2 + dq2 , q3 + dq3 ) is

ds2 = dr dr = h21 dq12 + h22 dq22 + h23 dq32 .

(53)

Although the cartesian unit vectors ei are independent of the coordinates, the curvilinear unit vectors qi

in general are functions of the coordinates, even if the latter are orthogonal. Hence qi /qj is in general

non-zero. For orthogonal coordinates, those derivatives qi /qj can be expressed in terms of the scale factors

and the unit vectors.

For instance, for spherical coordinates (q1 , q2 , q3 ) (r, , ), the unit vector in the q1 r direction is the

vector

r(r, , )

=x

sin cos + y

sin sin + z cos ,

(54)

r

so the scale coefficient h1 hr = 1 and the unit vector

q1 r er = x

sin cos + y

sin sin + z cos .

(55)

r = r r = x x

+yy

+ z z

3

X

i=1

xi e i .

(56)

c

F.

Waleffe, Math 321, 2013/1/21

51

r , is simpler than in cartesian

coordinates, r = x x

+y y

+z z, but there is a catch! The radial unit vector r = r(, ) varies in the azimuthal

and polar angle directions, while the cartesian unit vectors x

, y

, z are independent of the coordinates!

For orthogonal coordinates, the scale factors hi s determine everything. In particular, the surface and

volume elements can be expressed in terms of the hi s. For instance, the surface element for a q3 -isosurface

is

dS 3 = q3 h1 h2 dq1 dq2 ,

(57)

and the volume element

dV = h1 h2 h3 dq1 dq2 dq3 ,

(58)

assuming that q1 , q2 , q3 is right-handed. These follow directly from (45) and (47) and (52) when the coordinates are orthogonal.

Exercises

1. Find the scale factors hi and the unit vectors qi for cylindrical and spherical coordinates. Express the

3 surface elements and the volume element in terms of those scale factors and unit vectors. Sketch the

unit vector qi in the (x, y, z) space (use several views rather than trying to make an ugly 3D sketch!).

Express the position vector r in terms of the unit vectors qi . Calculate the derivatives qi /qj for all

i, j and express these derivatives in terms of the scale factors hk and the unit vectors qk , k = 1, 2, 3.

2. A curve in the (x, y) plane is given in terms of polar coordinates as = (). Deduce -integral

expressions for the length of the curve and for the area swept by the radial vector.

3. Consider elliptical coordinates (u, v, w) defined by x = cosh u cos v, y = sinh u sin v , z = w for

some > 0, where x, y and z are standard cartesian coordinates in 3D Euclidean space. What do the

coordinate curves correspond to in the (x, y, z) space? Are these orthogonal coordinates? What is the

volume element in terms of elliptical coordinates?

4. For general curvilinear coordinates, not necessarily orthogonal, is the qi -isosurface perpendicular to

r/qi ? is it orthogonal to (r/qj )(r/qk ) where i, j, k are all distinct? What about for orthogonal

coordinates?

5.7

Change of variables

Parametrizations of surfaces and volumes and curvilinear coordinates are geometric examples of a change of

variables. These change of variables and the associated formula and geometric concepts can occur in other

non-geometric contexts. The fundamental relationship is the formula for a volume element (47). In the

context of a general change of variables from (x1 , x2 , x3 ) to (q1 , q2 , q3 ) that formula (47) reads

dx1 dx2 dx3 = dVx = J dq1 dq2 dq3 = J dVq

(59)

where

x1 /q1 x1 /q2 x1 /q3

xi

(60)

J = x2 /q1 x2 /q2 x2 /q3 = det

qj

x3 /q1 x3 /q2 x3 /q3

is the Jacobian determinant and dVx is a volume element in the x-space while dVq is the corresponding

volume element in q-space. The Jacobian is the determinant of the Jacobian matrix

xi

.

(61)

J = x2 /q1 x2 /q2 x2 /q3 Jij =

qj

x3 /q1 x3 /q2 x3 /q3

The vectors (dq1 , 0, 0), (0, dq2 , 0) and (0, 0, dq3 ) at point (q1 , q2 , q3 ) in q-space are mapped to the vectors

(r/q1 )dq1 , (r/q2 )dq2 , (r/q3 )dq3 . In component form this is

(1)

dx

x1 /q1 x1 /q2 x1 /q3

dq1

1(1)

(62)

dx2 = x2 /q1 x2 /q2 x2 /q3 0 ,

(1)

x3 /q1 x3 /q2 x3 /q3

0

dx

3

52

(1)

(1)

(1)

where (dx1 , dx2 , dx3 ) are the x-components of the vector (r/q1 )dq1 . Similar relations hold for the

other basis vectors. Note that the rectangular box in q-space is in general mapped to a non-rectangular

parallelepiped in x-space so the notation dx1 dx2 dx3 for the volume element in (59) is a (common) abuse of

notation.

The formulas (59), (60) tells us how to change variables in multiple integrals. This formula generalizes

to higher dimension, and also to lower dimension. In the 2 variable case, we have a 2-by-2 determinant

that can also be understood as a special case of the surface element formula (37) for a mapping r(q1 , q2 )

from a 2D space (q1 , q2 ) to another 2D-space (x1 , x2 ). In that case r(q1 , q2 ) = e1 x1 (q1 , q2 ) + e2 x2 (q1 , q2 ) so

(r/q1 ) (r/q2 )dq1 dq2 = e3 dAx and

dx1 dx2 = dAx = J dq1 dq2 = J dAq

(63)

x1 /q1

J =

x2 /q1

x1 /q2

.

x2 /q2

(64)

Consider a Carnot cycle for a perfect gas. The equation of state is P V = nRT = N kT , where P is

pressure, V is the volume of gas and T is the temperature in Kelvins. The volume V contains n moles of gas

corresponding to N molecules, R is the gas constant and k is Boltzmanns constant with nR = N k. A Carnot

cycle is an idealized thermodynamic cycle in which a gas goes through (1) a heated isothermal expansion at

temperature T1 , (2) an adiabatic expansion at constant entropy S2 , (3) an isothermal compression releasing

heat at temperature T3 < T1 and (4) an adiabatic compression at constant entropy S4 < S2 . For a perfect

monoatomic gas, constant entropy means constant P V where = CP /CV = 5/3 with CP and CV the heat

capacity at constant pressure P or constant volume V , respectively. Thus let S = P V (this S is not the

physical entropy but it is constant whenever entropy is constant, we can call S a pseudo-entropy).

T1

T1

S4

P

S4

S2

S2

T3

T3

S

R Vb

Now the work done by the gas when its volume changes from Va to Vb is Va P dV (since work = Force

displacement, P = force/area and V =area displacement). Thus the (yellow) area inside the cycle in the

(P, V ) plane is the net work performed by the gas during one cycle. Although we can calculate that area

by working in the (P, V ) plane, it is easier to calculate it by using a change of variables from P , V to T ,

S. The area inside the cycle in the (P, V ) plane is not the same as the area inside the cycle in the (S, T )

plane. There is a distortion. An element of area in the (P, V ) plane aligned with the T and S coordinates

(i.e. with the dashed curves in the (P, V ) plane) is

P

V

P

V

e +

e

dS

e +

e

dT

(65)

dA =

S P

S V

T P

T V

where eP and eV are the unit vectors in the P and V directions in the (P, V ) plane, respectively. This is

r

entirely similar to the area element of surface r(u, v) being equal to dS = u

du r

v dv (but dont confuse

c

F.

Waleffe, Math 321, 2013/1/21

53

the surface element dS with the pseudo-entropy differential dS used in the present example!). Calculating

out the cross product, we obtain

P V

P,V

V P

(66)

dA =

S T

S T

where the sign will be chosen to get a positive area (this depends on the bounds of integrations) and

P P

S T

P,V

P,V

(67)

JS,T = det(J S,T ) =

V V

S T

is the Jacobian determinant (here the vertical bars are the common notation for determinants). It is the

determinant of the Jacobian matrix J P,V

S,T that corresponds to the mapping from (S, T ) to (P, V ). The cycle

area AP,V in the (P, V ) plane is thus

Z

T1

S2

AP,V =

T3

S4

P,V

JS,T dSdT.

(68)

P,V

Note that the vertical bars in this formula are for absolute value of JS,T

and the bounds have been selected so

that dS > 0 and dT > 0 (in the limit-of-a-sum sense). This expression for the (P,V ) area

expressed in terms

P,V

of (S, T ) coordinates is simpler than if we used (P, V ) coordinates, except for that JS,T since we do not have

explicit expression for P (S, T ) and V (S, T ). What we have in fact are the inverse functions T = P V /(N k)

and S = P V . To find the partial derivatives that we need we could (1) find the inverse functions by solving

for P and V in terms of T and S then compute the partial derivatives and the Jacobian, or (2) use implicit

differentiation e.g. N k T /T = N k = V P/T + P V /T and S/T = 0 = V P/T + P V 1 V /T

etc. and solve for the partial derivatives we need. But there is a simpler way that makes use of an important

property of Jacobians.

Geometric meaning of the Jacobian determinant and its inverse

P,V

The Jacobian JS,T represents the stretching factor of area elements when moving from the (S, T ) plane to the

(P, V ) plane. If dAS,T is an area element centered at point (S, T ) in the (S, T ) plane then that area element

gets mapped to an area element dAP,V centered at the corresponding point in the (P, V ) plane. Thats what

equation (66) represents. In that equation we have in mind the mapping of a rectangular element of area

P,V

dSdT in the (S, T ) plane to a parallelogram element in the (P, V ) plane. The stretching factor is |JS,T | (as

we saw earlier, the meaning of the sign is related to orientation, but here we are worrying only about areas,

so we take absolute values). That relationship is valid for area elements of any shape, not just rectangles

to parallelogram since the differential relationships implies an implicit limit-of-a-sum and in that limit, the

pointwise area stretching is independent of the shape of the area elements. A disk element in the (S, T )

plane would be mapped to an ellipse element in the (P, V ) plane but the pointwise area stretching would be

the same as for a rectangular element (this is not true for finite size areas). So equation (66) can be written

in the more general form

P,V

dAP,V = |JS,T | dAS,T

(69)

which is locally valid for area elements of any shape. The key point is that if we consider the inverse map,

S,T

back from (P, V ) to (S, T ) then there is an are stretching give by the Jacobian JP,V = (S/P )(T /V )

(S/V )(T /P ) such that

S,T

dAS,T = |JP,V | dAP,V

(70)

but since we are coming back to the original dAS,T element we must have

P,V

S,T

JS,T JP,V = 1,

(71)

54

so the Jacobian determinant are inverses of one another. This inverse relationship actually holds for the

Jacobian matrices also

P,V

S,T

1 0

J S,T J P,V =

(72)

.

0 1

The latter can be derived from the chain rule since (note the consistent ordering of the partials)

S,T

P,V

S/P S/V

P/S P/T

,

J P,V =

J S,T =

T /P T /V

V /S V /T

(73)

and the matrix product of those two Jacobian matrices yields the identity matrix. For instance, the first

row times the first column gives

P

S

P

T

P

+

=

= 1.

S T P V

T S P V

P V

A subscript has been added to remind which other variable is held fixed during the partial differentiation.

The inverse relationship between the Jacobian determinants (71) then follows from the inverse relationship

between the Jacobian matrices (72) since the determinant of a product is the product of the determinants.

This important property of determinants can be verified directly by explicit calculation for this 2-by-2 case.

So whats the work done by the gas during one Carnot cycle? Well,

P,V

S,T

JS,T = JP,V

so

Z

1

T1

=

S2

AP,V =

T3

S4

S T

S T

P V

V P

Z

P,V

JS,T dSdT =

1

T1

T3

= N k V P V 1 P V

S2

S4

1

Nk

(1 )S

Nk

N k(T1 T3 ) S2

dSdT =

ln ,

( 1)S

( 1)

S4

(74)

(75)

Exercises

1. Calculate the area between the curves xy = 1 , xy = 2 and y = 1 x, y = 2 x in the (x, y) plane.

Sketch the area. (1 , 2 , 1 , 2 > 0.)

2. Calculate the area between the curves xy = 1 , xy = 2 and y 2 = 21 x, y 2 = 22 x in the (x, y) plane.

Sketch the area. (1 , 2 , 1 , 2 > 0.)

3. Calculate the area between the curves x2 + y 2 = 21 x, x2 + y 2 = 22 x and x2 + y 2 = 21 y, x2 + y 2 =

22 y. Sketch the area. (1 , 2 , 1 , 2 > 0.)

4. Calculate the area of the ellipse x2 /a2 +y 2 /b2 = 1 and the volume of the ellipsoid x2 /a2 +y 2 /b2 +z 2 /c2 =

1 by transforming them to a disk and a sphere, respectively, using a change of variables. [Hint: consider

the change of variables x = au, y = bv, z = cw.]

R R

R

2

2

2

5. Calculate the integral e(x +y ) dxdy. Deduce the value of the Poisson integral ex dx.

[Hint: switch to polar coordinates].

R R

6. Calculate (a2 + x2 + y 2 ) dxdy. Where a 6= 0 and is real. Discuss the values of for which

the integral exists.

c

F.

Waleffe, Math 321, 2013/1/21

55

Consider a scalar function of a vector variable: f (r), for instance the pressure p(r) as a function of position,

or the temperature T (r) at point r, etc. One way to visualize such functions is to consider isosurfaces or

level sets, these are the set of all rs for which f (r) = C0 , for some constant C0 . In cartesian coordinates

r = xe+yy

+ z z and the scalar function is a function of the three coordinates f (r) f (x, y, z), hence we

can interpret an isosurface f (x, y, z) = C0 as a single equation for the 3 unknowns x, y, z. In general, we are

free to pick two of those variables, x and y say, then solve the equation f (x, y, z) = C0 for z. For example,

the isosurfaces of f (r) = x2 + y 2 + z 2 are determined by the equation f (r) = C0 . This is the sphere or radius

C0 , if C0 0.

6.1

The value of f (r) at a point r 0 defines an isosurface f (r) = f (r 0 ) through that point r 0 . The gradient of

f (r) at r 0 can be defined geometrically as the vector, denoted f (r 0 ), that

(i) is perpendicular to the isosurface f (r) = f (r 0 ) at the point r 0 and points in the direction of increase

of f (r) and

(ii) has a magnitude equal to the rate of change of f (r) with distance from the isosurface.

From this geometric definition, we deduce that the plane tangent to the isosurface f (r) = f (r 0 ) at r 0 has

the equation

(r r 0 ) f (r 0 ) = 0

(76)

Likewise, the plane (r r 0 ) f (r 0 ) = is parallel to the tangent plane but further up in the direction of

the gradient if > 0 or down if < 0. More generally, the function f (r) can be locally (i.e. for r near r 0 )

approximated by

f (r) f (r 0 ) + (r r 0 ) f (r 0 ).

(77)

Indeed since f points in the direction of fastest increase of f (r) and has a magnitude equal to the rate of

change of f with distance in that direction, the change in f as we move slightly away from r 0 only depends

on how much we have moved in the direction of the gradient. The actual change in f is the distance times

the rate of change with distance, hence it is (r r 0 ) f (r 0 ). Equation (77) states that the isosurfaces of

f (r) look like planes perpendicular to f (r 0 ) for r in the neighborhood for r 0 . Equation (77) is a linear

approximation of f (r) in the neighborhood of r 0 , just like f (x) f (x0 ) + (x x0 )f 0 (x0 ) for functions of one

variable. This is not exact, there is a small error that goes to zero faster than |r r 0 | as r r 0 . The limit

as r r 0 can be written as the exact differential relation

df (r) = dr f (r),

(78)

where df (r) = f (r + dr) f (r) is the differential change in f (r) when r changes from r to r + dr. This

differential relationship holds for any r and dr. It is analogous to the differential relationship df = f 0 (x) dx

for functions of a single variable.

I It follows immediately from the geometric definition of the gradient that if r = |r| is the distance to

the origin and r = r/r is the unit radial vector, then r = r and more generally f (r) = rdf /dr. For

instance, (1/r) =

r /r2 .

6.2

at point r, denoted f (r)/n is defined as

f (r + h

n) f (r)

f (r)

= lim

.

h0

n

h

(79)

f (r)

=n

f (r).

n

(80)

56

This is exact because the error of the linear approximation (77), f (r + h

n) f (r) + h

n f (r), goes to

zero faster than h as h 0.

In particular,

f

f

f

=x

f,

=y

f,

= z f,

(81)

x

y

z

hence

f = x

f

f

f

+y

+ z ,

x

y

z

(82)

for cartesian coordinates x, y, z. We can also deduce this important result from the Chain rule for functions

of several variables. In cartesian coordinates, written (x1 , x2 , x3 ) in place of (x, y, z), the position vector

reads r = x1 e1 + x2 e2 + x3 e3 and the scalar function f (r) f (x1 , x2 , x3 ) is a function of the 3 coordinates.

The directional derivative

f (r + h

n) f (r)

f

f

f

f (r)

= lim

= n1

+ n2

+ n3

=n

f.

h0

n

h

x1

x2

x3

(83)

n) f (r) as the telescoping sum

[f (x1 + hn1 , x2 + hn2 , x3 + hn1 )f (x1 , x2 + hn2 , x3 + hn3 )]

+[f (x1 , x2 + hn2 , x3 + hn3 )f (x1 , x2 , x3 + hn3 )]

+[f (x1 , x2 , x3 + hn3 )f (x1 , x2 , x3 )]

(84)

then using continuity and the definition of the partial derivatives, e.g.

lim

h0

f

f (r + e3 ) f (r)

f (r + hn3 e3 ) f (r)

= n3 lim

= n3

.

0

h

x3

= e1

+ e2

+ e3

ei i

x1

x2

x3

(85)

where i is short for /xi and we have used the convention of summation over all values of the repeated

index i.

6.3

Well depart from our geometric point of view to first define divergence and curl computationally based on

their cartesian representation. Here we consider vector fields v(r) which are vector functions of a vector

variable, for example the velocity v(r) of a fluid at point r, or the electric field E(r) at point r, etc. For

the geometric meaning of divergence and curl, see the sections on divergence and Stokes theorems.

The divergence of a vector field v(r) is defined as the dot product v. Now since the unit vectors ei

are constant for cartesian coordinates, v = ei i ej vj = (ei ej ) i vj = ij i vj hence

v = i vi = 1 v1 + 2 v2 + 3 v3 .

(86)

Likewise, the curl of a vector field v(r) is the cross product v. In cartesian coordinates, v =

(ej j ) (ek vk ) = (ej ek ) j vk . Recall that ijk = ei (ej ek ), or in other words ijk is the i component

of the vector ej ek , thus ej ek = ei ijk and

v = ei ijk j vk .

(87)

(Recall that a b = ei ijk aj bk .) The right hand side is a triple sum over all values of the repeated indices

i, j and k! But that triple sum is not too bad since ijk = 1 depending on whether (i, j, k) is a cyclic

c

F.

Waleffe, Math 321, 2013/1/21

57

(=even) permutation or an acyclic (=odd) permutation of (1, 2, 3) and vanishes in all other instances. Thus

(87) expands to

v = e1 (2 v3 3 v2 ) + e2 (3 v1 1 v3 ) + e3 (1 v2 2 v1 ) .

(88)

We can also write that the i-th cartesian component of the curl is

ei v = v i = ijk j vk .

(89)

6.4

Vector identities

In the following f = f (r) is an arbitrary scalar field while v(r) and w(r) are vector fields. Two fundamental

identities that can be remembered from a (a b) = 0 and a (a) = 0 are

( v) = 0,

(90)

(f ) = 0.

(91)

and

The divergence of a curl and the curl of a gradient vanish identically (assuming all those derivatives exist).

These can be proved using index notation. Other useful identities are

12

22

(f v) = (f ) v + f ( v),

(92)

(f v) = (f ) v + f ( v),

(93)

( v) = ( v) 2 v,

(94)

32

where = =

+ +

is the Laplacian operator.

The identity (92) is verified easily using indicial notation

(f v) = i (f vi ) = (i f )vi + f (i vi ) = (f ) v + f ( v).

Likewise the second identity (93) follows from ijk j (f vk ) = ijk (j f )vk + f ijk (j vk ). The identity (94) is

easily remembered from the double cross product formula a (b c) = b(a c) c(a b) but note that the

in the first term must appear first since ( v) 6= ( v). That identity can be verified using indicial

notation if one knows the double cross product identity in terms of the permutation tensor (see earlier notes

on ijk and index notation) namely

ijk klm = kij klm = il jm im jl .

(95)

(96)

where (w )v = (wj j )vi in indicial notation. This can be verified using (95) and can be remembered from

the double cross-product identity with the additional input that is a vector operator, not just a regular

vector, hence (v w) represents derivatives of a product and this doubles the number of terms of the

resulting expression. The first two terms are the double cross product a (b c) = (a c)b (a b)c for

derivatives of v while the last two terms are the double cross product for derivatives of w.

Another similar identity is

v ( w) = (w) v (v )w.

(97)

In indicial notation this reads

ijk vj (klm l wm ) = (i wj )vj (vj j )wi .

(98)

58

Note that this last identity involves the gradient of a vector field w. This makes sense and is a tensor,

i.e. a geometric object whose components with respect to a basis form a matrix. In indicial notation, the

components of w are i wj and there are 9 of them. This is very different from w = i wi which is a

scalar.

The bottom line is that these identities can be reconstructed relatively easily from our knowledge of

regular vector identities for dot, cross and double cross products, however is a vector of derivatives and

one needs to watch out more carefully for the order and the product rules. If in doubt, jump to indicial

notation.

Exercises

1. Verify (90) and (91).

2. Digest and verify the identity (95) ab initio.

3. Verify the double cross product identities a(bc) = b(ac)c(ab) and (ab)c = b(ac)a(bc)

in indicial notation using (95).

4. Verify (96) and (97) using indicial notation and (95).

5. Use (95) to derive vector identities for (a b) (c d) and ( v) ( w).

6. Show that (v w) = w ( v) v ( w) and explain how to reconstruct this from the rules

for the mixed (or box) product a (b c) of three regular vectors.

7. Find the fastest way to show that r = 3 and r = 0.

8. Find the fastest way to show that (

r /r2 ) = 0 and (

r /r2 ) = 0 for all r 6= 0.

9. Quickly calculate B and B when B = (

z r)/|

z r|2 (cf. Biot-Savart law for a line current)

[Hint: use both vector identities and cartesian coordinates where convenient].

6.5

[to be developed]

Show that

+y

+ z

x

y

z

1

+ z

= +

= r +

+

r

r

r sin

=x

(99)

where

(

x, y

, z) are fixed mutually orthogonal cartesian unit vectors,

(,

,

z) are mutually orthogonal cylindrical unit vectors but and

depend on ,

)

(

r , ,

are mutually orthogonal spherical unit vectors but r and depend on both and .

Show that

= ,

r = sin ,

r = ,

= cos

=

r

(100)

(101)

(102)

c

F.

Waleffe, Math 321, 2013/1/21

59

See Chapter 1, section 1.3 to grind this out intelligently, but learn also to re-derive these relationships

geometrically as all good physicists know how to do.

With these relations, one can derive the expression for div, curl, Laplacian, etc in cylindrical and spherical

coordinates. For instance

= (v) + v ( )

(v )

1

1 v

1

=

+v

+

r

r

r sin

(103)

1 v v

v

=

+

+

r

r

r sin

1 v

v

1

=

+

=

(v sin )

r

r tan

r sin

Show likewise that

1

1

(v sin ) +

w

(u

r + v + w)

= 2 (r2 u) +

r r

r sin

r sin

(104)

and in particular

2 f = f =

1 2 f

(r

)+

r2 r

r

r sin

sin

1 f

r

+

r sin

1 f

r sin

(105)

60

Integration in R2 and R3

7.1

R

The integral of a function

of two variables f (x, y) over a domain A of R2 denoted A f (x, y)dA can be defined

P

as a the limit of a n f (xn , yn )An where the An s, n = 1, . . . , N provide an (approximate) partition of

A that breaks up A into a set of small area elements, squares or triangles for instance. An is the area of

those element n and (xn , yn ) is a point inside that element, for instance the center of area of the triangle.

The integral would be the limit of such sums when the area of the triangles goes to zero and their number

N must then go to infinity. This limit should be such that the aspect ratios of the triangles remain bounded

away from 0 so we get a finer and finer sampling of A. This definition also provides a way to approximate

the integral by such a finite sum.

yT

the x and y axes then the sum over all squares inside A

can be performed row by row. Each row-sum then tends to

an integral in the x-direction, this leads to the conclusion

that the integral can be calculated as iterated integrals

A

y

yT

f (x, y)dA =

yB

x` (y)

xr (y)

dy

yB

f (x, y)dx

(106)

x` (y)

xr (y)

instead and each column-sum then tends to an integral in the y-direction, this leads to the iterated integrals

Z

xR

f (x, y)dA =

A

yt (x)

dx

xL

yt (x)

f (x, y)dy.

(107)

yb (x)

from those of the previous iterated integrals.

yb (x)

xL

x

xR

This iterated integral approach readily extends to integrals over three-dimensional domains in R3 and

more generally to integrals in Rn .

7.2

Z b

dF

dx = F (b) F (a).

a dx

(108)

Once again we can interpret this in terms of limits of finite differences. The derivative is defined as

dF

F

= lim

x0 x

dx

(109)

Z

f (x)dx = lim

a

xn 0

N

X

n=1

f (

xn )xn

(110)

c

F.

Waleffe, Math 321, 2013/1/21

61

n xn , with n = 1, . . . , N and x0 = a, xN = b, so the set of xn s

provides a partition of the interval [a, b]. The best choice for x

n is the midpoint x

n = (xn + xn1 )/2. This

is the midpoint scheme in numerical integration methods. Putting these two limits together and setting

Fn = F (xn ) F (xn1 ) we can write

b

Z

a

N

N

X

X

Fn

dF

dx = lim

xn = lim

Fn = F (b) F (a).

xn 0

xn 0

dx

xn

n=1

n=1

Z b

Z F (b)

dF

dx =

dF = F (b) F (a).

a dx

F (a)

(111)

(112)

Fundamental theorem in R2

7.3

From the fundamental theorem of calculus and the reduction of integrals on a domain A of R2 to iterated

integrals on intervals in R we obtain for a function G(x, y)

Z

A

G

dA =

x

yT

xr (y)

dy

yB

x` (y)

G

dx =

x

yT

G(xr (y), y) G(x` (y), y) dy.

(113)

yB

This looks nice enough but we can rewrite the integral on the right-hand side as a line integral over the

boundary of A. The boundary of A is a closed curve C often denoted A (not to be confused with a partial

derivative). The boundary C has two parts C1 and C2 .

The curve C1 can be parametrized in terms of y as r(y) =

x

xr (y) + y

y with y = yB yT , hence

Z

Z yT

G(x, y)

y dr =

G(xr (y), y)dy.

yT

C2

C1

C1

yB

yB

r(y) = x

x` (y) + y

y with y = yT yB , hence

Z

Z yB

G(x, y)

y dr =

G(x` (y), y)dy.

C2

x` (y)

yT

xr (y)

Putting these two results together the right hand side of (113) becomes

Z yT

Z

Z

I

G(xr (y), y) G(x` (y), y) dy =

Gy

dr +

Gy

dr =

Gy

dr,

yB

C1

C2

I

Z

G

dA =

Gy

dr.

(114)

C

A x

H

The symbol is used to emphasize that the integral is over a closed curve. Note that the curve C has been

oriented counter-clockwise such that the interior is to the left of the curve.

Similarly the fundamental theorem of calculus and iterated integrals lead to the result that

Z

Z xR Z yt (x)

Z xR

F

F

dA =

dx

dy =

F (x, yt (x)) F (x, yb (x)) dx,

(115)

A y

xL

yb (x) y

xL

and the integral on the right hand side can be rewritten as a line integral around the boundary curve

C = C3 + C4 .

62

The curve C3 can be parametrized in terms of x as r(x) =

x

x + y

yb (x) with x = xL xR , hence

Z

Z xR

F (x, y)

x dr =

F (x, yb (x))dx.

C4

yt (x)

yb (x)

C3

xL

r(x) = x

x + y

yt (x) with x = xR xL , hence

Z

Z xL

F (x, y)

x dr =

F (x, yt (x))dx.

C3

C4

xL

xR

xR

Z

Z xR

F (x, yt (x)) F (x, yb (x)) dx =

Z

Fx

dr

C4

xL

I

Fx

dr =

C3

Fx

dr,

C

where C = C3 + C4 is the closed curve bounding A oriented counter-clockwise as before. Then (115) becomes

I

Z

F

dA = F x

dr.

(116)

C

A y

7.4

The two results (114) and (116) can be combined into a single important formula. Subtract (116) from (114)

to deduce the curl form of Greens theorem

Z

A

G F

x

y

I

(F x

+ G

y ) dr =

dA =

C

(F dx + Gdy).

(117)

Note that dx and dy in the last integral are not independent quantities, they are the projection of the

line element dr onto the basis vectors x

and y

as written in the middle line integral. If (x(t), y(t)) is a

parametrization for the curve then dx = (dx/dt)dt and dy = (dy/dt)dt and the t-bounds of integration

should be picked to correspond to counter-clockwise orientation. Note also that Greens theorem (117) is

the formula to remember since it includes both (114) when F = 0 and (116) when G = 0.

Greens theorem can be written in several equivalent forms. Define the vector field v = F (x, y)

x+

G(x, y)

y . A simple calculation verifies that its curl is purely in the z direction indeed (88) gives v =

z (G/x F/y) thus (117) can be rewritten in the form

Z

I

( v) z dA =

v dr,

(118)

A

x + G(x, y, z)

y + H(x, y, z)

z and any

planar surface A perpendicular to z since z ( v) still equals G/x F/y for such 3D vector fields

and the line element dr of the boundary curve C of such planar area is perpendicular to z so v dr is still

equal to F (x, y, z)dx + G(x, y, z)dy. The extra z coordinate is a mere parameter for the integrals and (118)

applies equally well to 3D vector field v(x, y, z) provided A is a planar area perpendicular to z.

In fact that last restriction on A itself can be removed. This is Stokes theorem which reads

Z

I

( v) dS =

v dr,

(119)

S

where S is a bounded orientable surface in 3D space, not necessarily planar, and C is its closed curve

boundary. The orientation of the surface as determined by the direction of its normal n

, where dS = n

dS,

and the orientation of the boundary curve C must obey the right-hand rule. Thus a corkscrew turning in the

c

F.

Waleffe, Math 321, 2013/1/21

63

. That restriction is a direct consequence

of the fact that the right hand rule enters the definition of the curl as the cross product v.

Stokes theorem (119) provides a geometric interpretation for the curl. At any point r, consider a small

disk of area A perpendicular to an arbitrary unit vector n

, then Stokes theorem states that

I

1

v dr

(120)

( v) n

= lim

A0 A C

H

where C is the circle bounding the disk A oriented with n

. The line integral v dr is called the circulation

of the vector field v around the closed curve C.

Index notation enables a fairly straightforward proof of Stokes theorem for the more general case of a

surface S in 3D space that can be parametrized by a good function r(s, t) (differentiable and integrability

as needed). Such a surface S can fold and twist (it could even intersect itself!) and is therefore of a more

general kind than those that can be parametrized by the cartesian coordinates x and y. The main restriction

on S is that it must be orientable. This means that it must have an up and a down as defined by the

direction of the normal r/s r/t. The famous Mobius strip only has one side and is the classical

example of a non-orientable surface. The boundary of the Mobius strip forms a knot.

I Let xi (s, t) represent the i component of the position vector r(s, t) in the surface S with i = 1, 2, 3.

Then

r r

vk

vk xl xm

xl xm

=ijk

= ijk ilm

( v)

ilm

s

t

xj

s t

xj s t

vk xl xm

= (jl km jm kl )

xj s t

vk xk xj

vk xj xk

=

xj s t

xj s t

vk xk

vk xk

=

s t

t s

xk

xk

=

vk

vk

s

t

t

s

F (s, t)

G(s, t)

,

(121)

=

s

t

where we have used (95), then the chain rule

vk xj

vk x1

vk x2

vk x3

vk

=

+

+

=

,

xj s

x1 s

x2 s

x3 s

s

and equality of mixed partials 2 xk /st = 2 xk /ts with vk (s, t) = vk x1 (s, t), x2 (s, t), x3 (s, t) . This

proof demonstrates the power of index notation with the summation convention. The first line of (121) is

a quintuple sum over all values of the indices i, j, k, l and m! This would be unmanageable without the

compact notation.

Now for any point on the surface S with position vector r(s, t)

xi

xi

xi

xi

ds +

dt = vi

ds + vi

dt = F (s, t)ds + G(s, t)dt,

(122)

v dr = vi

s

t

s

t

and we have again reduced Stokes theorem to Greens theorem (117) but expressed in terms of s and t

instead of x and y. In details we have shown that

Z

Z

Z

r r

G(s, t) F (s, t)

( v) dS = ( v)

dsdt =

dsdt,

(123)

s

t

s

t

S

A

A

64

I

I

v dr =

The right hand sides of (123) and (124) are equal by Greens theorem (117).

7.5

(124)

CA

The fundamental theorems (114) and (116) in R2 can be rewritten in a more palatable form.

The line element dr at a point on the curve C is in the direction of the unit

tangent t at that point, so dr = tds, where t points in the counterclockwise

direction of the curve. Then t z = n

is the unit outward normal n

to the

curve at that point and z n

= t so

x

t = x

(

zn

) = (

x z) n

=

yn

(125)

y

t = y

(

zn

) = (

y z) n

=x

n

.

(126)

Hence since dr = tds, the fundamental theorems (114) and (116) can be

rewritten

I

I

Z

F

dA =

Fy

dr =

Fx

n

ds,

(127)

C

C

A x

Z

I

I

F

dA = F x

dr =

Fy

n

ds.

(128)

A y

C

C

The right hand side of these equations is easier to remember since they have x

going with /x and y

with /y and both equations have positive signs. But there are hidden subtleties. The arclength element

ds = |dr| is positive by definition and n

must be the unit outward normal to the boundary, so if an

explicit

parametrization

of

the

boundary

curve

is known, the bounds of integration should be picked so that

H

H

ds

=

|dr|

>

0

would

be

the

length

of

the

curve with a positive sign. For the dr line integrals, the

C

C

bounds of integration must correspond to counter-clockwise orientation of C.

Formulas (127) and (128) can be combined in more useful forms. First, if u is the signed distance in the

direction of the unit vector u

, then the (directional) derivative in the direction u

is F/u = u

F

ux F/x + uy F/y, where u

= ux x

+ uy y

, therefore combining (127) and (128) accordingly we obtain

Z

A

F

dA =

u

I

Fu

n

ds.

(129)

in the x,y plane.

Another useful combination is to add (127) to (128) written for a function G(x, y) in place of F (x, y)

yielding the divergence-form of Greens theorem

Z

A

F

G

+

x

y

I

(F x

+ G

y) n

ds.

dA =

(130)

The left hand side integrand is easily recognized as the divergence v of the vector field v(x, y) = F x

+G

y.

7.6

Gauss theorem

Gauss theorem is the 3D version of the divergence form of Greens theorem. It is proved by first extending

the fundamental theorem of calculus to 3D.

If V is a bounded volume in 3D space and F (x, y, z) is a scalar function of the cartesian coordinates

(x, y, z), then we have

Z

I

F

dV =

F z n

dS,

(131)

V z

S

c

F.

Waleffe, Math 321, 2013/1/21

65

is the unit outward normal to S.

The proof of this result is similar to that for (114). Assume that the surface can be parametrized using

x and y in two pieces: an upper hemisphere at z = zu (x, y) and a lower hemisphere at z = zl (x, y) with

x, y in a domain A, the projection of S onto the x, y plane, that is the same domain for both the upper and

lower surfaces. The closed surface S is not a sphere in general but we used the word hemisphere to help

visualize the problem. For a sphere, A is the equatorial disk, z = zu (x, y) is the northern hemisphere and

z = zl (x, y) is the southern hemisphere.

Iterated integrals with dV = dA dz and the fundamental theorem of calculus give

Z

Z

Z

Z zu (x,y)

F

F

dV =

dz =

F (x, y, zu (x, y)) F (x, y, zl (x, y)) dA.

(132)

dA

z

A

zl (x,y)

A

V z

z

dS

dA

We can interpret that integral over A as an integral over the entire closed

surface S that bounds V . All we need for that is a bit of geometry. If dA is

the projection of the surface element dS = n

dS onto the x, y plane then we

have dA = cos dS =

zn

dS. The + sign applies to the upper surface

for which n

is pointing up (i.e. in the direction of z) and the - sign for the

bottom surface where n

points down (and would be opposite to the n

on

the side figure). Thus we obtain

I

Z

F

dV =

F z n

dS,

(133)

S

V z

where n

is the unit outward normal to S.

We can obtain similar results for the volume integrals of F/x and F/y and combine those to obtain

Z

I

F

dV =

Fu

n

dS,

(134)

V u

S

for arbitrary but fixed direction u

. This is the 3D version of (129) and of the fundamental theorem of

calculus.

We can combine this theorem into many useful forms. Writing it for F (x, y, z) in the x

direction, with

the y

version for a function G(x, y, z) and the z version for a function H(x, y, z) we obtain Gausss theorem

Z

I

v dV =

vn

dS,

(135)

V

where v = F x

+ G

y + H z. This is the 3D version of (130). Note

H that both (134) and (135) are expressed in

coordinate-free forms. These are general results. The integral S v n

dS is the flux of v through the surface

S. If v(r) is the velocity of a fluid at point r then that integral represents the time-rate at which volume of

fluid flows through the surface S.

Gauss theorem provides a coordinate-free interpretation for the divergence. Consider a small sphere of

volume V and surface S centered at a point r, then Gauss theorem states that

I

1

v = lim

vn

dS.

(136)

V 0 V

S

Note that (134) and (135) are equivalent. We deduced (135) from (134), but we can also deduce (134)

from (135) by considering the special v = F u

where u

is a unit vector independent of r. Then from our

vector identities (F u

) = u

F = F/u.

7.7

I

Z

F

dV =

nj F dS.

V xj

S

(137)

66

Then with f (r) in place of F (r) we deduce that

Z

I

f

ej

dV =

ej nj f dS

xj

V

S

(138)

since the cartesian unit vectors ej are independent of position. This result can be written in coordinate-free

form as

Z

I

f dV =

fn

dS.

(139)

V

One application of this form of the fundamental theorem is to prove Archimedes principle.

Next, writing (137) for vk in place of F yields

Z

I

vk

dV =

nj vk dS,

V xj

S

which can be multiplied by the position-independent ijk and summed over all j and k to give

I

Z

vk

ijk nj vk dS.

dV =

ijk

xj

S

V

The coordinate-free form of this is

(141)

I

v dV =

n

v dS.

(142)

The integral theorems (139) and (142) provide yet other geometric interpretations for the gradient

I

1

f = lim

fn

dS,

V 0 V

S

and the curl

(140)

1

V 0 V

(143)

I

n

v dS.

v = lim

(144)

These are similar to the result (136) for the divergence. However, the geometric definition of the gradient

as the vector pointing in the direction of greatest rate of change (sect. 6.1) and of the n

component of the

curl as the limit of the local circulation per unit area as given by Stokes theorem (120) are perhaps more

fundamental.

In applications we use the fundamental theorem as we do in 1D, namely to reduce a 3D integral to a 2D

integral, for instance. However we also use them the other way, to evaluate a complicated surface integral

as a simpler volume integral for instance.

Exercises

H

H

1. If C is any closed curve in 3D space (i) calculate C r dr in two ways, (ii) calculate C f dr in two

ways, where f (r) is a scalar function. [Hint: by direct calculation and by Stokes theorem].

H

2. If C is any closed curve in 3D space not passing through the origin calculate C r3 r dr in two ways.

3. Calculate the circulation of the vector field B = (

z r)/|

z r|2 (i) about a circle of radius R centered

at the origin in a plane perpendicular to z; (ii) about any closed curve C in 3D that does not go around

the z-axis; (ii) about any closed curve C0 that does go around the z axis. Whats wrong with the z-axis

anyway?

4. Consider v = r where is a constant vector, independent of r. (i) Evaluate the circulation of

v about the circle of radius R centered at the origin in the plane perpendicular to the direction n

by

direct calculation of the line integral; (ii) Calculate the curl of v using vector identities; (iii) calculate

the circulation of v about a circle of radius R centered at r 0 in the plane perpendicular to n

.

H

5. If C is any closed curve in 2D, calculate C n

dr where n

(r) is the unit outside normal to C at the point

r of C.

c

F.

Waleffe, Math 321, 2013/1/21

6. If S is any closed surface in 3D, calculate

r on S.

67

H

n

dS where n

(r) is the unit outside normal to S at a point

H

7. If S is any closed surface in 3D, calculate p n

dS where n

is the unit outside normal to S and

p(r) = (p0 g z r) where p0 , and g are constants (This is Archimedes principle with as fluid

density and g as the acceleration of gravity.)

8. Calculate the flux of r through (i) the surface of a sphere of radius R centered at the origin in two

ways; (ii) through the surface of the sphere of radius R centered at r 0 ; (iii) through the surface of a

cube of side L with one corner at the origin in two ways.

9. (i) Calculate the flux of v = r/r3 through the surface of a sphere of radius centered at the origin. (ii)

Calculate the flux of that vector field through a closed surface that does not enclose the origin [Hint:

use the divergence theorem] (iii) Calculate the flux through an arbitrary closed surface that encloses

the origin [Hint: use divergence theorem and (i) to isolate the origin. Whats wrong with the origin

anyway?]

10. Calculate |r r 0 |1 and v with v = (r r 0 )/|r r 0 |3 where r 0 is a constant vector.

11. Calculate the flux of v through the surface of a sphere of radius R centered at the origin when v =

1 (r r 1 )/|r r 1 |3 + 2 (r r 2 )/|r r 2 |3 where 1 and 2 are scalar constants and (i) |r 1 | and |r 2 |

PN

are both less than R; (ii) |r 1 | < R < |r 2 |. Generalize to v = i=1 i (r r i )/|r r i |3 .

R

12. Calculate F (r) = V0 f dV0 where V0 is the inside of a sphere of radius R centered at O, f = (r

r 0 )/|r r 0 |3 with |r| > R and the integral is over r 0 . Warning: this is essentially the gravity field

or force at r due to a sphere of uniform mass density. Supposedly, it took Newton 20 years to figure

it out... of course he was never taught calculus, he had to invent it, and vector calculus came much

after him and he never knew about Gauss theorem. The integral can be cranked out if youre good

at analytic integration. But the smart solution is to realize that the integral over r 0 is essentially a

sum over r i as in the previous exercise so we can figure out the flux of F through any closed surface

enclosing all of V0 . Now by symmetry F (r) = F (r)

r , so knowing the flux is enough to figure out F (r).

Newton would be impressed!

68

Chapter 3

Complex Calculus

1

1.1

A complex number z = x + iy is composed of a real part Re(z) = x and an imaginary part Im(z) = y, both

of which are real numbers, x, y R. Complex numbers can be defined as pairs of real numbers (x, y) with

special manipulation rules. Thats how complex numbers are defined in Fortran or C. We can map complex

numbers to the plane R2 with the real part as the x axis and the imaginary part as the y-axis. We refer to

that mapping as the complex plane. This is a very useful visualization.

The form x + iy is convenient with the special symbol i standing as the imaginary unit defined such that

i2 = 1. With that form and that special i2 = 1 rule, complex numbers can be manipulated like regular

real numbers.

z = x + iy

|z|

x

z = x i y

Addition/subtraction:

z1 + z2 = (x1 + iy1 ) + (x2 + iy2 ) = (x1 + x2 ) + i(y1 + y2 ).

(1)

This is identical to vector addition for the 2D vectors (x1 , y1 ) and (x2 , y2 ).

Multiplication:

z1 z2 = (x1 + iy1 )(x2 + iy2 ) = (x1 x2 y1 y2 ) + i(x1 y2 + x2 y1 ).

(2)

Complex conjugate:

z = x iy

(3)

An overbar z or a star z denotes the complex conjugate of z, which is same as z but with the sign of the

imaginary part flipped. It is readily verified that the complex conjugate of a sum is the sum of the conjugates:

(z1 + z2 ) = z1 + z2 , and the complex conjugate of a product is the product of the conjugates (z1 z2 ) = z1 z2

(show that as an exercise).

Modulus (or Norm)

p

|z| = zz = x2 + y 2 ,

(4)

This modulus is equivalent to the euclidean norm of the 2D vector (x, y), hence it obviously satisfy the

triangle inequality |z1 + z2 | |z1 | + |z2 |. However we can verify that |z1 z2 | = |z1 | |z2 |.

69

70

Division:

x1 x2 + y1 y2

x2 y1 x1 y2

(x1 + iy1 )(x2 iy2 )

z1

z1 z2

=

+i

.

=

=

z2

z2 z2

x22 + y22

x22 + y22

x22 + y22

(5)

All the usual algebraic formula apply, for instance (z + a)2 = z 2 + 2za + a2 and more generally the

binomial formula (defining 0! = 1)

(z + a)n =

n

X

n

k=0

z k ank =

n

X

k=0

n!

z k ank .

k!(n k)!

(6)

Exercises:

1. Prove that (z1 + z2 ) = z1 + z2 , (z1 z2 ) = z1 z2 and |z1 z2 | = |z1 ||z2 |.

2. Calculate (1 + i)/(2 + i3).

3. Show that the final formula for division follows from the definition of multiplication (as it should): if

z = z1 /z2 then z1 = zz2 , solve for Re(z) and Im(z).

1.2

The modulus allows the definition of distance and limit. The distance between two complex numbers z and

a is the modulus of their difference |z a|. A complex number z tends to a complex number a if |z a| 0,

where |z a| is the euclidean distance between the complex numbers z and a in the complex plane. A

function f (z) is continuous at a if limza f (z) = f (a). These concepts allow the definition of derivatives

and series.

The derivative of a function f (z) at z is

df (z)

f (z + a) f (z)

= lim

a0

dz

a

(7)

where a is a complex number and a 0 means |a| 0. This limit must be the same no matter how a 0.

We can use the binomial formula (6) as done in Calc I to deduce that

dz n

= nz n1

dz

(8)

for any integer n = 0, 1, 2, . . ., and we can define the anti-derivative of z n as z n+1 /(n + 1) + C for all

integer n 6= 1. All the usual rules of differentiation: product rule, quotient rule, chain rule,. . . , still apply

for complex differentiation and we will not bother to prove those here, the proofs are just like in Calc I.

So there is nothing special about complex derivatives, or is there? Consider the function f (z) = Re(z) =

x, the real part of z. What is its derivative? Hmm. . . , none of the rules of differentiation help us here, so

lets go back to first principles:

Re(z + a) Re(z)

Re(a)

dRe(z)

= lim

= lim

=?!

a0

a0

dz

a

a

(9)

What is that limit? If a is real, then a = Re(a) so the limit is 1, but if a is imaginary then Re(a) = 0 and

the limit is 0. So there is no limit that holds for all a 0. The limit depends on how a 0, and we cannot

define the z-derivative of Re(z). Re(z) is continuous everywhere, but nowhere z-differentiable!

Exercises:

1. Prove formula (8) from the limit definition of the derivative [Hint: use the binomial formula].

2. Prove that (8) also applies to negative integer powers z n = 1/z n from the limit definition of the

derivative.

c

F.

Waleffe, Math 321, 2013/1/21

1.3

71

1 q n+1

.

1q

1 + q + q2 + + qn =

(10)

To prove this, let Sn = 1 + q + + q n and note that qSn = Sn + q n+1 1, then solve that for Sn .

The geometric series is the limit of the sum as n . It follows from (10), that the geometric series

converges to 1/(1 q) if |q| < 1, and diverges if |q| > 1,

qn = 1 + q + q2 + =

n=0

1

,

1q

iff

|q| < 1.

(11)

P

Note that we have two different functions of q: (1) the series n=0 q n which only exists when |q| < 1, (2)

the function 1/(1 q) which is defined and smooth everywhere except at q = 1. These two expressions, the

geometric series and the function 1/(1 q) are identical in the disk |q| < 1, but they are not at all identical

outside of that disk since the series does not make any sense (i.e. it diverges) outside of it. What happens

on the unit circle |q| = 1? (consider for example q = 1, q = 1, q = i, . . . )

=(q)

diverges

|q| = 1

converges

<(q)

Exercises:

1. Derive formula (10) and absorb the idea of the proof. What is Sn when q = 1?

2. Calculate q N + q N +2 + q N +4 + q N +6 + .... with |q| < 1.

1.4

Ratio test

The geometric series leads to a useful test for convergence of the general series

an = a0 + a1 + a2 +

(12)

n=0

We can make sense of this series again as the limit of the partial sums Sn = a0 + a1 + + an as n .

Any one of these finite partial sums exists but the infinite sum does not necessarily converge. Example: take

an = 1 n, then Sn = n + 1 and Sn as n .

A necessary condition for convergence is that an 0 as n as you learned in Math 222 and can

explain why, but that is not sufficient. A sufficient condition for convergence is obtained by comparison to

a geometric series. This leads to the Ratio Test: the series (12) converges if

lim

|an+1 |

=L<1

|an |

(13)

72

Why does the ratio test work? If L < 1, then pick any q such that L < q < 1 and one can find a (sufficiently

large) N such that |an+1 |/|an | < q for all n N so we can write

|aN +1 | |aN +2 | |aN +1 |

|aN | + |aN +1 | + |aN +2 | + |aN +3 | + = |aN | 1 +

+

+

|aN |

|aN +1 | |aN |

(14)

|aN |

< |aN | 1 + q + q 2 + =

< .

1q

If L > 1, then we can reverse the proof (i.e. pick q with 1 < q < L and N such that |an+1 |/|an | > q n N )

to show that the series diverges. If L = 1, the ratio test does not determine convergence.

1.5

Power series

cn (z a)n = c0 + c1 (z a) + c2 (z a)2 +

(15)

n=0

where the cn s are complex coefficients and z and a are complex numbers. It is a series in powers of (z a).

By the ratio test, the power series converges if

cn+1 |z a|

cn+1 (z a)n+1

= |z a| lim

< 1,

(16)

lim

n

n

cn (z a)n

cn

R

where we have defined

=(z)

cn+1

= 1.

lim

n

cn R

(17)

|z a| < R

a

R

|z a| = R is a circle of radius R centered at a, hence R is called

the radius of convergence of the power series. R can be 0, or

anything in between. But the key point is that power series always

converge in a disk |z a| < R and diverge outside of that disk.

<(z)

This geometric convergence inside a disk implies that power series can be differentiated (and integrated)

term-by-term inside their disk of convergence (why?). The disk of convergence of theP

derivative or integral

n

series is the same as that of the original

series.

For

instance,

the

geometric

series

n=0 z converges in

P

n1

|z| < 1 and its term-by-term derivative n=0 nz

does also, as you can verify by the ratio test.

Taylor Series

The Taylor Series of a function f (z) about z = a is

X

1

f (n) (a)

f (z) = f (a) + f 0 (a)(z a) + f 00 (a)(z a)2 + =

(z a)n ,

2

n!

n=0

(18)

where f (n) (a) = dn f /dz n (a) is the nth derivative of f (z) at a and n! = n(n 1) 1 is the factorial of n,

with 0! = 1 by convenient definition. The equality between f (z) and its Taylor series is only valid if the

series converges. The geometric series

X

1

= 1 + z + z2 + =

zn

1z

n=0

(19)

c

F.

Waleffe, Math 321, 2013/1/21

73

P the function 1/(1 z) exists and

is infinitely differentiable everywhere except at z = 1 while the series n=0 z n only exists in the unit circle

|z| < 1.

Several useful Taylor series are more easily derived from the geometric series (11), (19) than from the

general formula (18) (even if you really like calculating lots of derivatives!). For instance

X

1

2

4

=

1

+

z

+

z

+

=

z 2n

1 z2

n=0

(20)

X

1

= 1 z + z2 =

(z)n

1+z

n=0

(21)

ln(1 + z) = z

X

z2

(1)n z n+1

+ =

2

n+1

n=0

(22)

The last series is obtained by integrating both sides of the previous equation and matching at z = 0 to

determine the constant of integration. These series converge only in |z| < 1 while the functions on the left

hand side exist for (much) larger domains of z.

Exercises:

1. Explain why the domain of convergence of a power series is always a disk (possibly infinitely large),

not an ellipse or a square or any other shape [Hint: read the notes carefully]. (Anything can happen

on the boundary of the disk: weak (algebraic) divergence or convergence, perpetual oscillations, etc.,

recall the geometric series).

P

2. Show that if a function f (z) = n=0 cn (z a)n for all zs within the (non-zero) disk of convergence

of the power series, then the cn s must have the form provided by formula (18).

3. What is the Taylor series of 1/(1 z) about z = 0? what is its radius of convergence? does the series

converge at z = 2? why not?

4. What is the Taylor series of the function 1/(1+z 2 ) about z = 0? what is its radius of convergence? Use

a computer or calculator to test the convergence of the series inside and outside its disk of convergence.

5. What is the Taylor series of 1/z about z = 2? what is its radius of convergence? [Hint: z = a + (z a)]

6. What is the Taylor series of 1/(1 + z)2 about z = 0?

7. Look back at all the places in these notes and exercises (including earlier subsections) where we have

used the geometric series for theoretical or computational reasons.

1.6

Complex transcendentals

The complex versions of the Taylor series for the exponential, cosine and sine functions

exp(z) = 1 + z +

(23)

X

z2

z4

z 2n

+

=

(1)n

2

4!

(2n)!

n=0

(24)

X

z3

z5

z 2n+1

+

=

(1)n

3!

5!

(2n + 1)!

n=0

(25)

cos z = 1

sin z = z

X

zn

z2

+ =

2

n!

n=0

converge in the entire complex plane for any z with |z| < as is readily checked from the ratio test. These

series can now serve as the definition of these functions for complex arguments. We can verify all the usual

74

properties of these functions from the series expansion. In general we can integrate and differentiate series

term by term inside the disk of convergence of the power series. Doing so for exp(z) shows that the function

is still equal to its derivative

!

X

d X zn

d

z n1

exp(z) =

=

= exp(z),

(26)

dz

dz n=0 n!

(n 1)!

n=1

meaning that exp(z) is the solution of the complex differential equation df /dz = f with f (0) = 1. Likewise

the series (24) for cos z and (25) for sin z imply

d

cos z = sin z,

dz

d

sin z = cos z.

dz

(27)

Another slight tour de force with the series for exp(z) is to use the binomial formula (6) to obtain

exp(z + a) =

X

X

n k nk

n

X

X

X

(z + a)n

n z a

z k ank

=

=

.

n!

n!

k!(n k)!

k

n=0

n=0

n=0

k=0

(28)

k=0

The double sum is over the triangular region 0 n , 0 k n in n, k space. If we interchange the

order of summation, wed have to sum over k = 0 and n = k (sketch it!). Changing variables to

k, m = n k the range of m is 0 to as that of k and the double sum reads

!

!

X

X

X

X am

z k am

zk

exp(z + a) =

=

= exp(z) exp(a).

(29)

k!m!

k!

m!

m=0

m=0

k=0

k=0

This is a major property of the exponential function and we verified it from its series expansion (23) for

general complex arguments z and a. It implies that if we define as before

e = exp(1) = 1 + 1 +

1 1

1

1

+ +

+

+ = 2.71828...

2 6 24 120

(30)

then exp(n) = [exp(1)]n = en and exp(1) = [exp(1/2)]2 thus exp(1/2) = e1/2 etc. so we can still identify

exp(z) as the number e to the complex power z and (29) is the regular algebraic rule for exponents: ez+a =

ez ea . In particular

exp(z) = ez = ex+iy = ex eiy ,

(31)

ex is our regular real exponential but eiy is the exponential of a pure imaginary number. We can make sense

of this from the series (23), (24) and (25) to obtain

eiz = cos z + i sin z,

or

cos z =

eiz + eiz

,

2

sin z =

eiz eiz

.

2i

(32)

(33)

These hold for any complex number z. [Exercise: Show that eiz is not the conjugate of eiz unless z is real].

For z real, this is Eulers formula usually written in terms of a real angle

ei = cos + i sin .

(34)

This is arguably one of the most important formula in all of mathematics! It reduces all of trigonometry to

algebra among other things. For instance ei(+) = ei ei implies

cos( + ) + i sin( + ) =(cos + i sin )(cos + i sin )

=(cos cos sin sin ) + i(sin cos + sin cos )

which yields two trigonometric identities in one swoop.

(35)

c

F.

Waleffe, Math 321, 2013/1/21

75

Exercises:

1. Use series to compute the number e to 4 digits. How many terms do you need?

2. Use series to compute exp(i), cos(i) and sin(i) to 4 digits.

3. Express cos(1 + 3i) in terms of real expressions and factors of i that a 221 student might understand

and be able to calculate.

4. What is the conjugate of exp(iz)?

5. Use Eulers formula and geometric sums to derive compact formulas for the trigonometric sums

1 + cos x + cos 2x + cos 3x + + cos N x =?

(36)

(37)

6. Generalize the previous results by deriving compact formulas for the geometric trigonometric series

1 + p cos x + p2 cos 2x + p3 cos 3x + + pN cos N x =?

(38)

(39)

7. The formula (35) leads to the well-known double angle formula cos 2 = 2 cos2 1 and sin 2 =

2 sin cos . They also lead to the triple angle formula cos 3 = 4 cos3 3 cos and sin 3 =

sin (4 cos2 1). These formula suggests that cos n is a polynomial of degree n in cos and that

sin n is sin times a polynomial of degree n1 in cos . Derive explicit formulas for those polynomials.

[Hint: use Eulers formula for ein and the binomial formula]. The polynomial for cos n in powers of

cos is the Chebyshev polynomial Tn (x) with cos n = Tn (cos ).

1.7

Polar representation

Introducing polar coordinates in the complex plane such that x = r cos and y = r sin , then using Eulers

formula (34), any complex number can be written

z = x + iy = rei = |z|ei arg(z) .

(40)

This is the polar form of the complex number z. Its modulus is |z| = r and the angle = arg(z) + 2k is

called the phase of z, where k = 0, 1, 2, . . . is an integer. A key issue is that for a given z, its phase is

only defined up to an arbitrary multiple of 2 since replacing by 2 does not change z. However the

argument arg(z) is a function of z and therefore we want it to be uniquely defined for every z. For instance

we can define 0 arg(z) < 2, or < arg(z) . These are just two among an infinite number of possible

definitions. Although computer functions (Fortran, C, Matlab, ...) make a specific choice (typically the

2nd one), that choice may not be suitable in some cases. The proper choice is problem dependent. This is

because while is continuous, arg(z) is necessarily discontinuous. For example, if we define 0 arg(z) < 2,

then a point moving about the unit circle at angular velocity will have a phase = t but arg(z) = t

mod 2 which is discontinuous at t = 2k.

The cartesian representation x + iy of a complex number z is perfect for addition/subtraction but the

polar representation rei is more convenient for multiplication and division since

z1 z2 = r1 ei1 r2 ei2 = r1 r2 ei(1 +2 ) ,

(41)

r1 ei1

r1

z1

=

= ei(1 2 ) .

i

2

z2

r2 e

r2

(42)

76

1.8

Logs

The power series expansion of functions is remarkably powerful and closely tied to the theory of functions of

a complex variable. A priori, it doesnt seem very general, how, for instance, could we expand f (z) = 1/z

into a series in positive powers of z

1

= a0 + a1 z + a2 z 2 +

z

??

We can in fact do this easily using the geometric series. For any a 6= 0

1

1

1

=

=

z

a + (z a)

a

1+

X

1

(z a)n

=

(1)n n+1 .

za

a

n=0

a

(43)

Thus we can expand 1/z in powers of z a for any a 6= 0. That (geometric) series converges in the disk

|z a| < |a|. This is the disk of radius |a| centered at a. By taking a sufficiently far away from 0, that

disk where the series converges can be made as big as one wants but it can never include the origin which of

course is the sole singular point of the function 1/z. Integrating (43) for a = 1 term by term yields

ln z =

(1)n

n=0

(z 1)n+1

n+1

(44)

as the antiderivative of 1/z that vanishes at z = 1. This looks nice, however that series only converges for

|z 1| < 1. We need a better definition that works for a larger

P domain in the z-plane.

The Taylor series definition of the exponential exp(z) = n=0 z n /n! is very good. It converges for all zs,

it led us to Eulers formula ei = cos + i sin and it allowed us to verify the key property of the exponential,

namely exp(a + b) = exp(a) exp(b) (where a and b are any complex numbers), from which we deduced other

goodies: exp(z) ez with e = exp(1) = 2.71828 . . ., and ez = ex+iy = ex eiy .

What about ln z? As for functions of a single real variable we can introduce ln z as the inverse of ez or

as the integral of 1/z that vanishes at z = 1.

ln z as the inverse of ez

Given z we want to define the function ln z as the inverse of the exponential. This means we want to find a

complex number w such that ew = z. We can solve this equation for w as a function of z by using the polar

representation for z, z = |z|ei arg(z) , together with the cartesian form for w, w = u + iv, where u = Re(w)

and v = Im(w) are real. We obtain

ew = z eu+iv = |z|ei arg(z) ,

eu = |z|,

u = ln |z|,

eiv = ei arg(z) ,

(why?)

v = arg(z) + 2k,

(45)

where k = 0, 1, 2, Note that |z| 0 is a positive real number so ln |z| is our good old natural log of a

positive real number. We have managed to find the inverse of the exponential

ew = z w = ln |z| + i arg(z) + 2ik.

(46)

The equation ew = z for w, given z, has an infinite number of solutions. This make sense since ew = eu eiv =

eu (cos v + i sin v) is periodic of period 2 in v, so if w = u + iv is a solution, so is u + i(v + 2k) for any

integer k. We can take any one of those solutions as our definition of ln z, in particular

ln z = ln |z|ei arg(z) = ln |z| + i arg(z).

(47)

This definition is unique since we assume that arg z is uniquely defined in terms of z. However different

definitions of arg z lead to different definitions of ln z.

c

F.

Waleffe, Math 321, 2013/1/21

77

Example: If arg(z) is defined by 0 arg(z) < 2 then ln(3) = ln 3 + i, but if we define instead

arg(z) < then ln(3) = ln 3 i.

Note that you can now take logs of negative numbers! Note also that the ln z definition fits with our

usual manipulative rules for logs. In particular since ln(ab) = ln a + ln b then ln z = ln(rei ) = ln r + i. This

is the easy way to remember what ln z is.

Complex powers

As for functions of real variables, we can now define general complex powers in terms of the complex log and

the complex exponential

ab = eb ln a = eb ln |a| eib arg(a) ,

(48)

be careful that b is complex in general, so eb ln |a| is not necessarily real. Once again we need to define arg(a)

and different definitions can actually lead to different values for ab .

In particular, we have the complex power functions

z a = ea ln z = ea ln |z| eia arg(z)

(49)

(50)

These functions are well-defined once we have defined a range for arg(z) in the case of z a and for arg(a) in

the case of az .

Once again, a peculiar feature of functions of complex variables is that the user is left free to choose

whichever definition is more convenient for the particular problem under consideration. Note that different

definition for the arg(a) provides definitions for ab that do not simply differ by an additive multiple of 2i

as was the case for ln z. For example

(1)i = ei ln(1) = e arg(1) = e2k

for some k, so the various possible definitions of (1)i will differ by a multiplicative integer power of e2 .

Roots

The fundamental theorem of algebra states that any nth order polynomial equation of the form cn z n +

cn1 z n1 + + c1 z + c0 = 0 with cn 6= 0 always has n roots in the complex plane. This can be stated as

saying that there always exist n complex numbers z1 , . . ., zn such that

cn z n + cn1 z n1 + + c0 = cn (z z1 ) (z zn ).

(51)

The numbers z1 , . . ., zn are the roots or zeros of the polynomial. These roots can be repeated as for the

polynomial 2z 2 4z + 2 = 2(z 1)2 . This expansion is called factoring the polynomial.

The equation 2z 2 2 = 0 has two real roots z = 1 and 2z 2 2 = 2(z 1)(z + 1). The equation

2

3z + 3 = 0 has no real roots, however it has two imaginary roots z = i and 3z 2 + 3 = 3(z i)(z + i).

The equation z n a = 0, with a complex and n a positive integer, therefore has n roots. We might be

tempted to write the solution as z = a1/n but what does that mean? According to our definition of ab above,

we have a1/n = e(ln a)/n which depends on the argument of a since ln a = ln |a| + i arg(a). When we define a

function we need to make the definition unique, but here we are looking for all the roots. This means that

we have to consider all possible definitions of arg(a). Heres a correct way to think about this. Using the

polar representation z = rei

z n = rn ein = a = |a|ei(arg(a)+2k) r = |a|1/n ,

2

arg(a)

+k

n

n

(52)

where k = 0, 1, 2, . . . The moduli |z| = r and |a| are positive real numbers, so |a|1/n is our good old root

function giving a positive real value, but = arg(z) has many possible values that differ by a multiple of

2/n. When n is a positive integer, this yields n distinct values of modulo 2, yielding n distinct values for

z. It is useful to visualize these roots. They are all equispaced on the circle of radius |a|1/n in the complex

plane.

78

Exercises:

1. Find all the roots, visualize and locate them in the complex plane and factor the corresponding polynomial (i) z 4 = 1, (ii) z 4 + 1 = 0, (iii) z 2 = i, (iv) 2z 2 + 5z + 2 = 0.

2. Investigate the solutions of the equation z b = 1 when (i) b is a rational number, i.e. b = p/q with p,

q integers, (ii) when b is irrational e.g. b = , (iii) when b is complex, e.g. b = 1 + i. Visualize the

solutions in the complex plane if possible.

3. If a and b are complex numbers, whats wrong with saying that if w = ea+ib = ea eib then |w| = ea and

arg(w) = b + 2k? Isnt that what we did in (45)?

2

2.1

Visualization of complex functions

A function w = f (z) of a complex variable z = x + iy has complex values w = u + iv, where u, v are real.

The real and imaginary parts of w = f (z) are functions of the real variables x and y

f (z) = u(x, y) + i v(x, y).

(53)

z 2 = (x2 y 2 ) + i 2xy

(54)

2

with a real part u(x, y) = x y and an imaginary part v(x, y) = 2xy. How do we visualize complex

functions? In calc I, for real functions of one real variable, y = f (x), we made an xy plot. Here x and y are

independent variables and w = f (z) corresponds to two real functions of two real variables Re(f (z)) = u(x, y)

and Im(f (z)) = v(x, y). One way to visualize f (z) is to make a 3D plot with u as the height above the (x, y)

plane. We could do the same for v(x, y), however a prettier idea is to color the surface u = u(x, y) in the

3D space (x, y, u) by the value of v(x, y). Here is such a plot for w = z 2 :

w=z2

1

0.8

1

0.6

0.4

0.5

0.2

0

0

0.5

0.2

0.4

1

1

0.6

1

0.5

0.5

0.8

0

0.5

0.5

1

c

F.

Waleffe, Math 321, 2013/1/21

79

Note the nice saddle-structure, u = x2 y 2 is the parabola u = x2 along the real axis y = 0, but u = y 2

along the imaginary axis, x = 0. Again the color of the surface is the value of v(x, y) at that point, as

given by the colorbar on the right. This visualization leads to pretty pictures but they quickly become too

complicated to handle, in large part because of the 2D projection on the screen or page. It is often more

useful to make 2D contour plots of u(x, y) and v(x, y):

contours of u=Real(z2)

contours of v=Imag(z2)

0.8

0.8

0.6

0.8

0.6

1.5

0.6

0.4

0.4

0.4

0.2

0.5

0.2

0.2

0

0.2

0.2

0.2

0.5

0.4

0.4

0.4

1

0.6

0.6

0.8

0.8

1

1

0.8

0.6

0.4

0.2

0.2

0.4

0.6

0.8

0.6

1.5

0.8

1

1

0.8

0.6

0.4

0.2

0.2

0.4

0.6

0.8

showing the isocurves (or isolines or level sets) of the function, u(x, y) on the left and v(x, y) on the right,

enhanced by constant coloring between contours. Note that the v(x, y) colors indeed match the colors on

the earlier 3D picture. The saddle-structure of both u(x, y) and v(x, y) is quite clear.

2.2

Cauchy-Riemann equations

We reviewed fundamental examples of complex functions: z n , ez , cos z, sin z, ln z, z 1/n , z a , az , etc. as well

as special complex functions such as Re(z), Im(z), z . We showed that Re(z) is not z-differentiable (9), and

Im(z) and z are not either. For instance,

(z + a ) z

a

dz

= lim

= lim

= e2i

a0

a0 a

dz

a

(55)

where a = |a|ei , so the limit is different for every . If a is real, then = 0 and the limit is 1, but if a is

imaginary then = /2 and the limit is 1. If |a| = e then a 0 in a logarithmic spiral as , but

there is no limit in that case since e2i keeps spinning around the unit circle without ever converging to

anything. We cannot define a limit as a 0, so z is not differentiable with respect to z. It is special for a

function to be z-differentiable.

The statement that the complex derivative of f (z) exists in a neighborhood of a point z has powerful

consequences. Thats because the limit in the definition of the derivative (7) can be taken in many different

ways. If we take a z = x real, we find that

f (z + z) f (z)

u(x + x, y) u(x, y)

v(x + x, y) v(x, y)

df

= lim

= lim

+ i lim

x0

x0

dz z0

z

x

x

u

v

=

+i ,

x

x

but if we pick a z = iy pure imaginary, we obtain

df

f (z + z) f (z)

u(x, y + y) u(x, y)

v(x, y + y) v(x, y)

= lim

= lim

+ i lim

y0

y0

dz z0

z

i y

i y

u v

=i

+

.

y

y

(56)

(57)

80

If the df /dz exists, the limit should be the same no matter how z 0, hence (56) and (57) must be

identical implying that

u

v

u

v

(58)

=

,

= .

x

y

y

x

These are the Cauchy-Riemann equations relating the partial derivatives of the real and imaginary part of

a function of a complex variable f (z) = u(x, y) + i v(x, y). This derivation shows that the Cauchy-Riemann

equations are necessary conditions on u(x, y) and v(x, y) if f (z) is differentiable in a neighborhood of z. If

df /dz exists then the Cauchy-Riemann equations (58) necessarily hold.

Example 1: The function f (z) = z 2 has u = x2 y 2 and v = 2xy. Its z-derivative dz 2 /dz = 2z exists

everywhere and the Cauchy-Riemann equations (58) are satisfied everywhere since u/x = 2x = v/y

and u/y = 2y = v/x.

Example 2: The function f (z) = z = x iy has u = x, v = y. Its z-derivative dz /dz =?! does not

exist anywhere as we showed earlier and the Cauchy-Riemann equations (58) do not hold anywhere since

u/x = 1 6= v/y = 1.

B The converse is also true, if the Cauchy-Riemann equations are satisfied in a neighborhood of a point

(x, y) then the functions u(x, y) and v(x, y) are called conjugate functions and they in fact consitute the real

and imaginary part of a differentiable function of a complex variable f (z). To prove this we need to show

that the z-derivative of the function f (z) u(x, y) + iv(x, y) exists independently of how the limit is taken.

Writing a = + i with and real, we have

f (z + a) f (z)

df

= lim

dz a0

a

u(x + , y + ) + iv(x + , y + ) u(x, y) + iv(x, y)

= lim

a0

+ i

u(x + , y + ) u(x, y) + i v(x + , y + ) v(x, y)

= lim

.

a0

+ i

(59)

Now the functions u(x, y) and v(x, y) being differentiable in the neigborhood of (x, y) implies that locally

we can write

u

u

+

+ o(a),

x

y

v

v

v(x + , y + ) v(x, y) =

+

+ o(a)

x

y

u(x + , y + ) u(x, y) =

(60)

where the derivatives are evaluated at the point (x, y) and o(a) (little oh of a) is a remainder that goes

to zero faster than a, so lima0 o(a)/a = 0. (The notation O(a) (big Oh of a) denotes an expression that

goes to zero as fast as a so lima0 O(a)/a = C for some complex constant C.) Using these local expansions

and the Cauchy-Riemann equations (58) to replace the y-derivatives by x-derivatives, we can rewrite (59) as

u

v

+ i

u

v

+i

lim

=

+i

,

(61)

x

x a0 + i

x

x

hence the limit is indeed independent of how a = + i tends to zero.

that neighborhood. A function f (z) = u(x, y) + iv(x, y) is analytic in a neighborhood of z if and only if the

Cauchy-Riemann equations (58) are satisfied in that neighborhood.

Another important consequence of z-differentiability and the Cauchy-Riemann equations (58) is that the

real and imaginary parts of a differentiable function f (z) = u(x, y) + i v(x, y) both satisfy Laplaces equation

2u 2u

2v

2v

+

=

0

=

+

.

x2

y 2

x2

y 2

(62)

c

F.

Waleffe, Math 321, 2013/1/21

81

Exercises:

1. Deduce (62) from the Cauchy-Riemann equations (58). Verify these results (58) and (62), for f (z) = z 2 ,

ez , ln z, etc.

2. Is the function |z| analytic? Why? What about the functions Re(z) and f (|z|)?

3. Given u(x, y) find its conjugate function v(x, y), if possible,

such that u(x, y) + iv(x, y) f (z), for (i)

p

u = y; (ii) u = x + y; (iii) u = cos x cosh y, (iv) u = ln x2 + y 2 .

4. Substituting (60) into (59) we obtain

u

v

u

v

o(a)

+i

lim

+

+i

lim

+ lim

a0 a

x

x a0 + i

y

y a0 + i

u

v

1 u

v

=

+i

+

+i

x

x

i y

y

since

lim

a0

+ i

= 1,

lim

a0

+ i

=

1

.

i

5. Explain why we can write (60).

2.3

y

The Cauchy-Riemann equations (58) connecting the real and imaginary part of a z-differentiable function f (z) = u(x, y) + i v(x, y)

have remarkable geometric implications.

x

the contours u(x, y) = x2 y 2 = 0, 1, 4, 9 (blue) which are

hyperbolas with asymptotes y = x. The contours v(x, y) = 2xy =

0, 1, 4, 9 (red) are also hyperbolas but now with asymptotes

x = 0 and y = 0. Solid is positive, dashed is negative. The u and

v contours intersect everywhere at 90 degrees, except at z = 0.

The orthogonality of the contours of u = Re(f (z)) and v = Im(f (z)) wherever df /dz exists but does

not vanish, is general and follows directly from the Cauchy-Riemann equations. Indeed, the gradient u =

(u/x, u/y) at a point (x, y) is perpendicular to the contour of u(x, y) through that point (x, y). Likewise

the gradient v = (v/x, v/y) at that same point (x, y) is perpendicular to the isocontour of v(x, y)

through that point. The Cauchy-Riemann equations (58) imply that these two gradients are perpendicular

to each other

u v

u v

v v

v v

u v =

+

=

= 0.

(63)

x x y y

y x x y

Since gradients are always perpendicular to their respective isocontours, and here u v, the isocontours

of u and v are also orthogonal to each other. Therefore if f (z) = u(x, y)+iv(x, y) is z-differentiable (analytic)

then the isocontours of u(x, y) and v(x, y) are orthogonal to each other wherever they intersect. This is one

application of analytic functions: their real and imaginary parts u(x, y) and v(x, y) provide orthogonal

coordinates in the (x, y) plane.

Orthogonality of the u = Re(f (z)) and v = Im(f (z)) contours holds wherever df /dz exists except possibly

at critical points where df /dz = 0 and u = v = 0. For the example w = z 2 plotted above, the contours

are orthogonal everywhere except at z = 0 where dz 2 /dz = 2z = 0.

82

Conformal Mapping

We can visualize the function w = f (z) = u(x, y) + iv(x, y) as a mapping from the complex plane z = x + iy

to the complex plane w = u + iv. For example, f (z) = z 2 is the mapping z w = z 2 , or equivalently from

(x, y) (u, v) = (x2 y 2 , 2xy).

v

w = z2

z

x

z = w1/2

The vertical line u = u0 in the w-plane is the image of the hyperbola x2 y 2 = u0 in the z-plane. The

horizontal line v = v0 in the w-plane is the image of the hyperbola 2xy = v0 in the z-plane. Every point z

in the z-plane has a single image w in the w-plane, however the latter has two pre-images z and z in the

z-plane, indeed the inverse functions are z = w1/2 .

In polar form z = rei w = r2 ei2 . This means that every radial line from the origin with angle

from the x-axis in the z-plane is mapped to a radial line from the origin with angle 2 from the u-axis in

the w-plane.

The blue and red curves intersect at 90 degrees in both planes. That is the orthogonality of u and v, but

the dotted radial line intersects the blue and red curves at 45 degrees in both planes, for example. In fact

any angle between any two curves in the z-plane is preserved in the w-plane except at z = w = 0 where they

are doubled from the z to the w-plane.

This is another general property of z-differentiable complex functions f (z). If f (z) is z-differentiable

(analytic) then the mapping w = f (z) preserves all angles at all zs such that f 0 (z) 6= 0 when mapping from

the z-plane to the w-plane.

To show this in general, consider three neighboring points in the z-plane: z, z + dz1 and z + dz2 . We

are interested in seeing what happens to the angle between the two infinitesimal vectors dz1 and dz2 . If

dz1 = |dz1 |ei1 and dz2 = |dz2 |ei2 then the angle between those two vectors is = 2 1 and this is the

phase of dz2 /dz1 = |dz2 |/|dz1 |ei(2 1 ) .

dz2

z

dz1

w = f (z)

dw2

dw1

w

mapped to w + dw1 = f (z + dz1 ) ' f (z) + f 0 (z)dz1 and z + dz2 is

mapped to w +dw2 = f (z +dz2 ) ' f (z)+f 0 (z)dz2 . The angle between

the infinitesimal vectors dw1 = f 0 (z)dz1 and dw2 = f 0 (z)dz2 at w is

the phase of dw2 /dw1 = dz2 /dz1 , hence it is identical to the angle

between dz1 and dz2 .

All angles are preserved by the mapping w = f (z) except where f 0 (z) = 0 and dw1 = dw2 = 0 at

first order. The dws would be 2nd order in dz if f 0 (z) = 0 but f 00 (z) 6= 0 yielding dw1 = f 00 (z)dz12 /2 and

dw2 = f 00 (z)dz22 /2 hence dw2 /dw1 = (dz2 /dz1 )2 = (|dz2 |/|dz1 |)2 ei2(2 1 ) and angles are doubled at such

points. Likewise angles at points where f 0 (z) = f 00 (z) = 0 but f 000 (z) 6= 0 would be tripled, and so on. For

c

F.

Waleffe, Math 321, 2013/1/21

83

example, the mapping w = z 2 preserves all angles except at the origin z = 0 where angles are doubled by this

mapping z = rei z 2 = r2 ei2 . The mapping w = z 3 preserves all angles except at z = 0 where angles are

tripled since z = rei becomes z 3 = r3 ei3 .

A mapping that preserves all angles is called conformal. Analytic functions f (z) provide conformal

mappings between z and w = f (z) at all points where f 0 (z) 6= 0.

Examples of conformal mappings

w = z2

z

x

z = w1/2

w = z 2 The (magenta) vertical line x = x0 maps to the parabola u = x20 y 2 , v = 2x0 y in the (u, v) plane

and the (green) horizontal lime y = y0 becomes the (green) parabola u = x2 y02 , v = 2xy0 in the (u, v)

plane. The green and magenta curves intersect at 90 degrees in both planes. Angles between the dotted line

and the green and magenta curves are the same in both planes, except at z = w = 0. What happens there?

The definition of the inverse function is w1/2 = |w|1/2 ei arg(w)/2 with < arg(w) . What would the

map look like if we defined 0 arg(w) < 2?

y=

v

w = ez

z

u

y =

z = ln w

w = ez = ex eiy Maps the strip < x < , < y to the entire w-plane. z = x0 + iy

w = ex0 eiy circles of radius ex0 in the w-plane (magenta). z = x + iy0 w = ex eiy0 radial lines

with polar angle arg(w) = y0 in w-plane (green). z = x + iax w = ex eiax radial lines out of the

84

origin in z-plane mapped to logarithmic spirals in w-plane since z = x + iax with a fixed (and real) becomes

w = ex eiax rei so r = ex , = ax and r = e/a in the w-plane (blue). Notes: z = 0 w = 1. ez+2i = ez ,

periodic of complex period 2i, so ez maps an infinite number of zs to the same w. The inverse function

z = ln w = ln |w| + i arg(w) showed in this picture corresponds to the definition < arg(w) . All angles

are preserved e.g. the angles between green and magenta curves, as well as between blue and colored curves,

except at w = 0. What zs correspond to w = 0?

v

w = cosh(z)

z = ln(w +

w2 1)

w = cosh(z) = (ez + ez )/2 = (ex eiy + ex eiy )/2 = cosh x cos y + i sinh x sin y u + iv. Maps the semiinfinite strip 0 x < , < y to the entire w-plane. cosh(z) = cosh(z) and cosh(z+2i) = cosh(z),

even in z and periodic of period 2i. x = x0 0 u = cosh x0 cos y, v = sinh x0 sin y, ellipses in the

w-plane (magenta). y = y0 0 u = cosh x cos y0 , v = sinh x sin y0 , hyperbolas in the w-plane

(green).

2 1), but

w

This mapping gives orthogonal,

confocal

elliptic

coordinates.

The

inverse

map

is

z

=

ln(w

+

for what definition of w2 1? (not Matlab!). The line from w = to w = 1 is a branch cut, our

definition for ln(w + w2 1) is discontinuous across that line.

Exercises:

1. Consider the mapping w = z 2 . Determine precisely where the triangle (i) (1, 0), (1, 1), (0, 1) in the

z-plane gets mapped to in the w = u + iv plane; (ii) same but for triangle (0, 0), (1, 0), (1, 1). Do not

simply map the vertices, determine precisely what happens to each edge of the triangles.

2. Analyze the mapping w = 1/z. Determine what isocontours of u and v look like in the z-plane.

Determine where radial lines ( = constant) and circles (r = constant) in the z-plane get mapped to

in the w-plane.

3. Analyze the mappings w = ez and w = cosh z = (ez + ez )/2.

4. Determine what happens to circles and radial lines in the z-plane under the Joukowski mapping w =

(z + 1/z).

Rb

What do we mean by a f (z)dz when f (z) is a complex function of the complex variable z and the bounds

a and b are complex numbers in the z-plane?

c

F.

Waleffe, Math 321, 2013/1/21

85

In general

R we need to specify the path C in the complex plane to go from a to b and we need to write the

integral as C f (z)dz. Then if z0 = a, z1 , z2 , . . . , zN = b are successive points on the path from a to b we

can define the integral as usual as

N

X

Z

f (z)dz = lim

zn 0

f (

zn )zn

(64)

n=1

where zn = zn zn1 and zn is a point on the path (or the line segment) between zn1 and zn . This

definition

R also provides a practical way to estimate the integral. In particular if |f (z)| M along the curve

then | C f (z)dz| M L where L 0 is the length of the curve from a to b. Note also that the integral from

a to b along C is minus that from b to a along the same curve since all the zn change sign for that curve.

If C is from a to b, well use C to denote the same path but with the opposite orientation, from b to a. If

we have a parametrization for the curve, say z(t) with t real and z(ta ) = a, z(tb ) = b then the integral can

be expressed as

Z

Z tb

dz

dt.

(65)

f (z)dz =

f z(t)

dt

C

ta

Examples: To compute the integral of 1/z along the path C1 that consists

of the unit circle counterclockwise from a = 1 to b = i, we can parametrize

the circle using the polar angle as z() = ei then dz = iei d and

Z

C1

1

dz =

z

/2

C1

1 i

ie d = i ,

i

e

2

but along the path C2 which consists of the unit circle clockwise between

the same endpoints a = 1 to b = i

i

1

Z

C2

C2

1

dz =

z

Z

0

3/2

1 i

3

ie d = i .

ei

2

However for the function z 2 over the same two paths with z = ei , z 2 = ei2 and dz = iei d, we find

Z

z 2 dz =

C1

/2

iei3 d =

i 1

1 i3/2

b3 a3

e

1 =

=

,

3

3

3

3/2

i 1

1 i9/2

b3 a3

e

1 =

=

.

3

3

3

C2

0

Rb

Thus for z 2 it appears that we obtain the expected result a z 2 dz = (b3 a3 )/3, independently of the path.

Weve only checked two special paths, so we do not know for sure but, clearly, a key issue is to determine

when an integral depends on the path of integration or not.

Z

iei3 d =

z dz =

3.1

Cauchys theorem

The integral of a complex function is independent of the path of integration if and only if the integral over

a closed contour always vanishes. Indeed if C1 and C2 are two distinct paths from a to b then the curve

C = C1 C2 which goes from a to b along C1 then back from b to a along C2 is closed. The integral along

that close curve is zero if and only if the integral along C1 and C2 are equal.

Writing z = x + iy and f (z) = u(x, y) + iv(x, y) the complex integral around a closed curve C can be

written as

I

I

I

I

f (z)dz = (u + iv)(dx + idy) = (udx vdy) + i (vdx + udy)

(66)

C

86

hence the real and imaginary parts of the integral are real line integrals. These line integrals can be turned

into area integrals using the curl form of Greens theorem:

I

I

I

f (z)dz =

(udx vdy) + i (vdx + udy)

C

C

ZC

Z

u

u v

v

dA + i

dA,

(67)

=

x y

x y

A

A

where A is the interior domain bounded by the closed curve C. But the Cauchy-Riemann equations (58) give

u v

= 0,

x y

v

u

+

=0

x y

(68)

whenever the function f (z) is analytic in the neighborhood of the point z = x + iy. Thus both integrals

vanish if f (z) is analytic at all points of A. This is Cauchys theorem,

I

f (z)dz = 0

(69)

C

Functions like ez , cos z, sin z and z n with n 0 are differentiable for all z, hence the integral of such

functions around any closed contour C vanishes. But what about the integral of simple functions such as

z n with n > 0? Those functions are analytic everywhere except at z = 0 so the integral of 1/z n around

any closed contour that does not include the origin will still vanish. Lets figure out what happens when

the contour circles the origin. Consider the circle z = Rei , which is of radius R and centered at the origin,

oriented counter-clockwise, then as before dz = iRei d and

Z 2

Z 2

I

dz

iRei

2i if n = 1,

1n

i(1n)

=

d

=

iR

e

d

=

n

n ein

0

if n 6= 1.

z

R

0

|z|=R

0

This result in fact holds for any closed counter-clockwise curve C around the origin.

A

C0

closed curve C. Now, the function 1/z n is analytic everywhere inside

the domain A bounded by the counter-clockwise outer boundary C and

the inner circle boundary C0 oriented clockwise (emphasized here by

the minus sign) so the interior A is always to the left of the boundary,

as required by convention for the curl-form of Greens theorem in vector

calculus.

By Cauchys theorem, this implies that the integral over the closed contour, which consists of the sum of

the outer counter-clockwise curve C and the inner clockwise small circle about the origin C0 , vanishes

I

I

I

1

1

1

dz

=

0

dz

=

dz.

n

n

n

z

z

z

C+(C0 )

C

C0

In other words the integral about the closed contour C equals the integral about the closed inner circle C0 ,

both of which have the same orientation, counter-clockwise in this case. 1

This result can be slightly generalized to the functions (z a)n , n < 0 for any a (consider a small circle

about a: z = a + ei , etc.) so, combining with Cauchys theorem when n 0 we get the important result

that for integer n = 0, 1, 2, . . . and a closed contour C oriented counter-clockwise then

I

2i if n = 1 and C encloses a,

(z a)n dz =

(70)

0

otherwise.

C

1 Recall that we used this singularity isolation technique in conjunction with the divergence theorem to evaluate the flux of

r

/r2 = r/r3 (the inverse square law of gravity and electrostatics) through any closed surface enclosing the origin, as well as

in conjunction with Stokes theorem for the circulation of a line current B = (

z r)/|

z r|2 = /

= around a loop

enclosing the z-axis.

c

F.

Waleffe, Math 321, 2013/1/21

87

Connection with ln z

The integral of 1/z is of course directly related to ln z, the natural log of z which can be defined as the

antiderivative of 1/z that vanishes at z = 1, that is

Z z

1

d.

ln z

1

We use as the dummy variable of integration since z is the upper limit of integration.

But along what path from 1 to z? Heres the 2i multiplicity again. We have seen earlier in this section

that the integral of 1/z from a to b depends on how we go around the origin. If we get one result along

one path, we can get the same result + 2i if we use a path that loops around the origin one more time

counterclockwise than the original path. Or 2i if it loops clockwise, etc. Look back at exercise (7) in

section (3.1). If we define a range for arg(z), e.g. 0 arg(z) < 2, we find

Z z

1

d = ln |z| + i arg(z) + 2ik

(71)

1

for some Rspecific k that depends on the actual path taken from 1 to z and our definition of arg(z). The

z

notation 1 is not complete for this integral. The integral is path-dependent and it is necessary to specify

that path in more details, however all possible paths give the same answer modulo 2i.

Exercises:

closed paths are oriented counterclockwise unless specified otherwise.

1. Calculate the integral of f (z) = z + 2/z along the path C that goes once around the circle |z| = R > 0.

Discuss result in terms of R.

2. Calculate the integral of f (z) = az + b/z + c/(z + 1), where a, b and c are complex constants, around

(i) the circle of radius R > 0 centered at z = 0, (ii) the circle of radius 2 centered at z = 0, (iii) the

triangle 1/2, 2 + i, 1 2i.

3. Calculate the integral of f (z) = 1/(z 2 4) around (i) the unit circle, (ii) the parallelogram 0, 2 i, 4,

2 + i. [Hint: use partial fractions]

4. Calculate the integral of f (z) = 1/(z 4 1) along the circle of radius 1 centered at i.

5. Calculate the integral of sin(1/(3z)) over the square 1, i, 1, i. [Hint: use the Taylor series for sin z].

6. Calculate the integral of 1/z from z = 1 to z = 2ei/4 along (i) the path 1 2 along the real line then

2 2ei/4 along the circle of radius 2, (ii) along 1 2 on the real line, followed by 2 2ei/4 along

the circle of radius 2, clockwise.

7. If a is an arbitrary complex number, show that the integral of 1/z along the straight line from 1 to a

is equal to the integral of 1/z from 1 to |a| along the real line + the integral of 1/z along the circle of

radius |a| from |a| to a along a certain circular path. Draw a sketch!! Discuss which circular path and

calculate the integral. What happens if a is real but negative?

8. Does the integral of 1/z 2 from z = a to z = b (with a and b complex) depend on the path? Explain.

9. Pause and marvel at the power of (69) combined with (70). Continue.

P

n

10. The expansionH (43) with a =

H 1/z = nn=0 (1 z) . Using this expansion together with (70)

P1 gives

we find that |z|=1 dz/z =

n=0 |z|=1 (1 z) dz = 0. But this does not match with our explicit

H

calculation that |z|=1 dz/z = 2i. Whats wrong?!

88

3.2

Cauchys formula

The combination of (69) with (70) and partial fraction and/or Taylor series expansions is quite powerful

as we have already seen in the exercises, but there is another fundamental result that can be derived from

them. This is Cauchys formula

I

f (z)

dz = 2if (a)

(72)

C za

which holds for any closed counterclockwise contour C that encloses a provided f (z) is analytic (differentiable)

everywhere inside and on C.

H

The proof of this result follows the approach we used to calculate C dz/(z a) in section 3.1. Using

Cauchys theorem (69), the integral over C is equal to the integral over a small counterclockwise circle Ca of

radius centered at a. Thats because the function f (z)/(z a) is analytic in the domain between C and

the circle Ca : z = a + ei with = 0 2, so

I

C

f (z)

dz =

za

I

Ca

f (z)

dz =

za

(73)

The final step follows from the fact that the integral has the same value no matter what > 0 we pick. Then

taking the limit 0+ , the function f (a + ei ) f (a) because f (z) is a nice continuous and differentiable

function everywhere inside C, and in particular at z = a.

Cauchys formula has major consequences that follows from the fact that it applies to any a inside C. To

emphasize that, let us rewrite it with z in place of a, using as the dummy variable of integration

I

f ()

d.

(74)

2if (z) =

C z

This provides an integral formula for f (z) at any z inside C in terms of its values on C. Thus knowing f (z)

on C completely determines f (z) everywhere inside the contour! This formula is at the basis of boundary

integral methods.

Mean Value Theorem

Since (74) holds for any closed contour C as long as f (z) is continuous and differentiable inside and on that

contour, we can write it for a circle of radius r centered at z, = z + rei where d = irei d and (74) yields

1

f (z) =

2

f (z + rei )d

(75)

which states that f (z) is equal to its average over a circle centered at z. This is true as long as f (z) is

differentiable at all points inside the circle of radius r. This mean value theorem also applies to the real and

imaginary parts of f (z) = u(x, y) + iv(x, y). It implies that u(x, y), v(x, y) and |f (z)| do not have extrema

inside a domain where f (z) is differentiable. Points where f 0 (z) = 0 and therefore u/x = u/y =

v/x = v/y = 0 are saddle points, not local maxima or minima.

Generalized Cauchy formula and Taylor Series

Cauchys formula also implies that if f 0 (z) exists in the neighborhood of a point a then f (z) is infinitely

differentiable in that neighborhood! Furthermore, f (z) can be expanded in a Taylor series about a that

converges inside a disk whose radius is equal to the distance between a and the nearest singularity of f (z).

That is why we use the special word analytic instead of simply differentiable. For a function of a complex

variable being differentiable in a neighborhood is a really big deal!

B To show that f (z) is infinitely differentiable, we can show that the derivative of the right-hand side

of (74) with respect to z exists by using the limit definition of the derivative and being careful to justify

c

F.

Waleffe, Math 321, 2013/1/21

89

existence of the integrals and the limit. The final result is the same as that obtained by differentiating with

respect to z under the integral sign, yielding

I

f ()

d.

(76)

2if 0 (z) =

(

z)2

C

Doing this repeatedly we obtain

2if

(n)

I

(z) = n!

C

f ()

d.

( z)n+1

(77)

where f (n) (z) is the nth derivative of f (z) and n! = n(n 1) 1 is the factorial of n. Since all the integrals

exist, all the derivatives exist. Formula (77) is a generalized Cauchy formula.

B Another derivation of these results that establishes convergence of the Taylor series expansion at the

same time is to use the geometric series (11) and the slick trick that we used in (43) to write

X

1

1

1

1

(z a)n

=

=

=

z

a

z

( a) (z a)

a 1

( a)n+1

n=0

a

where the geometric series converges provided |z a| < | a|. Cauchys formula (74) then becomes

I

I X

I

X

f ()

(z a)n

f ()

n

2if (z) =

f ()

d =

(z a)

d =

d

n+1

z

(

a)

(

a)n+1

Ca

Ca n=0

Ca

n=0

(78)

(79)

where Ca is a circle centered at a whose radius is as large as desired provided f (z) is differentiable inside and

on the circle. For instance if f (z) = 1/z then the radius of the circle must be less then |a| since f (z) has a

singularity at z = 0 but is nice everwhere else. If f (z) = 1/(z + i) then the radius must be less than |a + i|

which is the distance between a and i since f (z) has a singularity at i. In general, the radius of the circle

must be less than the distance between a and the nearest singularity of f (z). To justify interchanging the

integral and the series we need to show that each integral exists and that the series of the integrals converges.

If |f ()| M on Ca and |z a|/| a| q < 1 since Ca is a circle of radius r centered at a and z is inside

that circle while is on the circle so a = rei , d = irei d and

I

(z a)n f ()

(80)

d 2M q n

n+1

Ca ( a)

showing that all integrals converge and the series of integrals also converges since q < 1.

The series (79) provides a power series expansion for f (z)

I

X

X

f ()

2if (z) =

(z a)n

d

=

cn (a)(z a)n

n+1

(

a)

Ca

n=0

n=0

(81)

that converges inside a disk centered at a with radius equal to the distance between a and the nearest

singularity of f (z). The series can be differentiated term-by-term and the derivative series also converges in

the same disk. Hence all derivatives of f (z) exist in that disk. In particular we find that

I

f ()

f (n) (a)

=

d.

(82)

cn (a) = 2i

n!

(

a)n+1

Ca

which is the generalized Cauchy formula (77) and the series (79) is none other than the familiar Taylor Series

X

f 00 (a)

f (n) (a)

2

f (z) = f (a) + f (a)(z a) +

(z a) + =

(z a)n .

2

n!

n=0

0

(83)

Finally, Cauchys theorem tells us that the integral on the right of (82) has the same value on any closed

contour (counterclockwise) enclosing a but no other singularities of f (z), so the formula holds for any such

closed contour as written in (77). However convergence of the Taylor series only occurs inside a disk centered

at a and of radius equal to the distance between a and the nearest singularity of f (z).

90

Exercises:

1. Why can we take (z a)n outside of the integrals in (79)?

2. Verify the estimate (80). Why does that estimate implies that the series of integrals converges?

3. Consider the integral of f (z)/(z a)2 about a small circle Ca of radius centered at a: z = a + ei ,

0 < 2. Study the limit of the -integral as 0+ . Does your limit agree with the generalized

Cauchy formula (77), (82)?

4. Find the Taylor series of 1/(1 + x2 ) and show that its radius of convergence is |x| < 1 [Hint: use

the geometric series]. Explain why the radius of convergence is one in terms of the singularities of

1/(1 + z 2 ). Would the Taylor series of 1/(1 + x2 ) about a = 1 have a smaller or larger radius of

convergence than that about a = 0?

5. Show that since an analytic function f (z) = u(x, y) + iv(x, y) is infinitely differentiable, its real and

imaginary parts are infinitely differentiable with respect to x and y. Show that 2 u = 2 v = 0 where

2 = 2 /x2 + 2 /y 2 is the 2D Laplacian and 2 (x, y) = 0 is Laplaces equation. Functions that

satisfy Laplaces equations are called harmonic functions. [Hint: use the Cauchy-Riemann equations

repeatedly].

6. Show that ekx cos ky and ekx sin ky are solutions of Laplaces equation for any real k. [Hint: consider

the complex function f (z) = ekz ]. These solutions occur in a variety of applications, e.g. surface gravity

waves with the surface at x = 0.

7. Calculate the integrals of cos(z)/z n and sin(z)/z n over the unit circle, where n is a positive integer.

One application of complex (a.k.a. contour) integration is to turn difficult real integrals into simple complex

integrals.

Example 1: What is the average of the function F (t) = 3/(5 + 4 cos t)? Since F (t) is periodic of period

2/, let = t and the average of F (t) is the same as the average of f () = 3/(5 + 4 cos ) over one period.

R 2

That average is (2)1 0 f ()d. To compute that integral we think integral over the unit circle in the

complex plane! Indeed the unit circle with |z| = 1 has the simple parametrization

z = ei dz = iei d d =

dz

.

iz

(84)

Furthermore

cos =

ei + ei

z + 1/z

=

,

2

2

so we obtain

Z

0

3

d =

5 + 4 cos

I

|z|=1

3

dz

3

=

5 + 2(z + 1/z) iz

2i

I

|z|=1

3

dz

=

1

2i

(z + 2 )(z + 2)

2i

z+2

= 2.

(85)

z= 21

What magic was that? We turned our integral of a periodic function over its period into an integral from

0 to 2 (that can always be done), then we turned that integral into a complex integral over the unit circle

(that can always be done too). That led us to the integral over a closed curve of a relatively nice function

(thats not always the case).

c

F.

Waleffe, Math 321, 2013/1/21

91

z = ei Our complex function has two simple poles, at 1/2 and 2. Since 2

2

is outside the unit circle, it does not contribute to the integral, but the

simple pole at 1/2 does. So the integrand has the form g(z)/(z a)

with a = 1/2 inside our domain and g(z) = 1/(z+2), is a good analytic

function inside the unit circle. So one application of Cauchys formula,

et voil`

a. The function 3/(5 + 4 cos ) which oscillates between 1/3 and

3 has an average of 1.

1/2

Z

0

3 cos n

d

5 + 4 cos

(86)

where n is an integer. [Hint: use symmetries to write the integral in [0, 2], do not use 2 cos n = ein + ein

(why not? try it out to find out the problem), use instead cos n = Re(ein ) with n 0.] An even function

f () = f () periodic of period 2 can be expanded in a Fourier series f () = a0 + a1 cos + a2 cos 2 +

a3 cos 3 + . This expansion is useful in all sorts of applications: numerical calculations, Rsignal processing

etc. The coefficient a0 is the average of f (). The other coefficients are given by an = 2 1 0 f () cos nd,

i.e. the integrals (86). So what is the Fourier (cosine) series of 3/(5 + 4 cos )? Can you say something about

its convergence?

Example 2:

Z

dx

=

1 + x2

(87)

This integral is easily done since 1/(1 + x2 ) = d/dx(arctan x), but we use contour integration to demonstrate

the method. The integral is equal to the integral of 1/(1 + z 2 ) over the real line z = x with x = .

That complex function has two simple poles at z = i since z 2 + 1 = (z + i)(z i).

C

i

Cx

closed path consisting of Cx : z = x with x = R R (real line)

+ the semi-circle C : z = Rei with = 0 . Since i is the

only simple pole inside our closed contour C = Cx + C , Cauchys

formula gives

I

I

dz

(z + i)1

1

=

dz = 2i

= .

2

zi

z + i z=i

C z +1

C

To get the integral we want, we need to take R and figure

out the C part. That part goes to zero as R since

Z

Z

R

dz iRei d

Rd

= 2

.

=

<

2

2

2i

2

+1

R 1

C z + 1

0 R e

0 R 1

Z

dx

=

1 + x4

Z

R

dz

1 + z4

(88)

Here the integrand has 4 simple poles at zk = ei/4+2i(k1)/4 , where k = 1, 2, 3, 4. These are the roots of

z 4 = 1 = ei+2i(k1) . They are on the unit circle, equispaced by /2. Note that z1 and z3 = z1 are

the roots of z 2 i = 0 while z2 and z4 = z2 are the roots of z 2 + i = 0, so z 4 + 1 = (z 2 i)(z 2 + i) =

(z z1 )(z + z1 )(z z2 )(z + z2 ).

92

We use the same closed contour C = Cx + C as above but now

there are two simple poles inside that contour. We need to isolate

both singularities leading to

I

I

I

=

+

.

C

C2

C1

z2

R

z1

Cx

z4

Likewise

I

C2

dz

= 2i

4

z +1

C2

I

1

dz

= 2i

=

.

2 z2)

4+1

z

2z

(z

2z

1 1

1

C1

2

z3

C1

1

2z2 (z22

z12 )

=

= .

2(z2 )

2z4

2z1

These manipulations are best understood by looking at the figure which shows that z2 = z4 = z1 together

with z12 = i, z22 = i. Adding both results gives

I

C

dz

=

4

z +1

2

1

1

+

z1

z1

=

ei/4 + ei/4

= cos = .

2

4

2

As before we need to take R and figure out the C part. That part goes to zero as R since

Z

Z

Z

dz iRei d

Rd

R

=

<

= 4

.

4

4

4i

4

+1

R 1

C2 z + 1

0 R e

0 R 1

Z

x2

dx

1 + x8

(89)

R

(and the much simpler x/(1 + x8 )dx = 0 ;-) ) We would use the same closed contour again, but there

would be 4 simple poles inside it and therefore 4 separate contributions.

Example 4:

Z

dx

=

(1 + x2 )2

Z

R

dz

=

(z 2 + 1)2

Z

R

dz

.

(z i)2 (z + i)2

(90)

We use the same closed contour once more, but now we have a double pole inside the contour at z = i. We

can figure out the contribution from that double pole by using the generalized form of Cauchys formula

(77). The integral over C vanishes as R as before and

Z

d

dx

2

=

2i

(z

+

i)

= .

(91)

2 )2

(1

+

x

dz

2

z=i

A (z a)n in the denominator, with n a positive integer, is called and n-th order pole.

Warning: Cauchys generalized formula is cool but can fail where Taylor coupled with (70) will succeed.

Example:

I

ez+1/z dz

(92)

|z|=1

which has an infinite order pole (a.k.a. essential singularity) at z = 0 and for which Cauchys formula is

not directly useful, but Taylor series and the simple (70) makes this a relative snap for the thinking person.

c

F.

Waleffe, Math 321, 2013/1/21

93

This all looks unbelievably mysterious if you do not understand the key ideas and are just trying to plug

into a formula. If you understand, it is pretty magical.

Example 5:

Z

sin x

dx = .

(93)

x

R

This is a trickier problem. Our impulse is to consider R (sin z)/z dz but that integrand is a super good

function! Indeed (sin z)/z = 1 z 2 /3! + z 4 /5! is analytic in the entire plane, its Taylor series converges

in the entire plane. For obvious reasons such functions are called entire functions. But we love singularities

now since they actually make our life easier. So we write

Z

Z ix

Z iz

sin x

e

e

dx = Im

dx = Im

dz,

(94)

x

x

R z

where Im stands for imaginary part of. Now we have a nice simple pole at z = 0. But thats another

problem since the pole is on the contour! We have to modify our favorite contour a little bit to avoid the

pole by going over or below it. If we go below and close along C as before, then well have a pole inside our

contour. If we go over it, we wont have any pole inside the closed contour. We get the same result either

way (luckily!), but the algebra is a tad simpler if we leave the pole out.

So we consider the closed contour C = C1 + C2 + C3 + C4 where C1

is the real axis from R to , C2 is the semi-circle from to

in the top half-plane, C3 is the real axis from to R and C4 is our

good old semi-circle of radius R. The integrand eiz /z is analytic

everywhere except at z = 0 where it has a simple pole, but since

that

pole is outside our closed contour, Cauchys theorem gives

H

=

0 or

C

Z

Z

Z

C4

C2

C1

R

C3

=

C1 +C3

C2

C4

Z

Z

i

eiz

dz = i

eie d i as 0.

z

C2

0

R

As before wed like to show that the C4 0 as R . This is trickier than the previous cases weve

encountered. On the semi-circle z = Rei and dz = iRei d, as weve seen so many times, we dont even

need to think about it anymore (do you?), so

Z

Z

Z

i

eiz

dz = i

eiRe d = i

eiR cos eR sin d.

(95)

0

0

C4 z

This is a pretty scary integral. But with a bit of courage and intelligence its not as bad as it looks. The

integrand has two factors, eiR cos whose norm is always 1 and eR sin which is real and exponentially small

for all in 0 < < , except at 0 and where it is exactly 1. Sketch eR sin in 0 and its pretty

clear the integral should go to zero as R . To show this rigorously, lets consider its modulus (norm) as

we did in the previous cases. Then since (i) the modulus of a sum is less or equal to the sum of the moduli

(triangle inequality), (ii) the modulus of a product is the product of the moduli and (iii) |eiR cos | = 1 when

R and are real (which they are)

Z

Z

0

eiR cos eR sin d <

eR sin d

(96)

0

we still cannot calculate that last integral but we dont need to. We just need to show that it is smaller than

something that goes to zero as R , so our integral will be squeezed to zero.

94

Plotting sin for 0 , we see that it is symmetric with respect

to /2 and that 2/ sin when 0 /2, or changing the

signs 2/ sin and since ex increases monotonically with x,

2/

1

/2

Z

Z

Z /2

Z /2

1 eR

2R/

R sin

R sin

iRei

<

e

d

=

e

d

<

2

e

d

=

2

e

d

R

0

0

0

0

R

so C4 0 as R and collecting our results we obtain (93).

(97)

Exercises

All of the above of course, +

R 2

1. Calculate 0 1/(a + b sin s)ds where a and b are real numbers. Does the integral exist for any real

values of a and b?

R

2. Make up and solve an exam question which is basically the same as dx/(1 + x2 ) in terms of the

logic and difficulty, but is different in the details.

R

3. Calculate dx/(1 + x2 + x4 ). Can you provide an upper bound for this integral based on integrals

calculated earlier?

R

R

2

2

2

4. Given the Poisson integral ex dx = , what is ex /a dx where a is real? (that should

R x2 /a2 ikx

be easy!). Next, calculate e

e dx where a and k are arbitrary real numbers. [This is the

2

Fourier transform of the Gaussian ex /a . Complete the square. Note that you can pick a, k > 0

(why?), then integrate over an infinite rectangle that consists of the real axis and comes back along

the line y = ka2 /2 (why? justify).

5. The Fresnel integrals come up in optics and quantum mechanics. They are

Z

Z

cos x2 dx, and

sin x2 dx.

R 2

Calculate them both by considering 0 eix dx. The goal is to reduce this to a Poisson integral. This

would be the case if x2 (ei/4 x)2 . So consider the closed path that goes from 0 to R on the real

axis, then on the circle of radius R to Rei/4 then back on the diagonal z = sei/4 with s real.

Branch cuts

Ok if youve made it this far and are still thinking hard, you may have noticed that weveonly dealt with

integer powers. What about fractional powers? First lets take a look at the integral of z over the unit

circle z = ei from = 0 to 0 + 2

I

|z|=1

0 +2

z dz =

0

ei/2 iei d =

2i 3i0 /2 i3

4i 3i0 /2

e

(e 1) =

e

3

3

(98)

The answer depends on 0 ! The integral over the closed circle depends on

where we start on the circle?!

This is weird, whats going on? The problem is with the definition of z. We have implicitly defined

c

F.

Waleffe, Math 321, 2013/1/21

95

0 arg(z) < 0 + 2 or 0 < arg(z) 0 + 2. But each 0 corresponds to a

different definition for z.

Forreal variables the equation y 2 = x 0 had two solutions y = x and we defined x 0. Cant we

define z

in a similar way? The equationw2 = z in the complex plane always hastwo solutions. We can say

z and z but we still need to define z since z is complex.Could we define z to be such that its real

part is always positive? yes, and thats equivalent to defining z = |z|1/2 ei arg(z)/2 with < arg(z) <

(check it). But thats not complete because the sqrt of a negative real number is pure imaginary, so what do

we do about those numbers? We can define < arg(z) , so real negative numbers have arg(z) = , not

, by definition. This is indeed the definition that

Matlab chooses. But it may not be appropriate for our

problem because it introduces a discontinuity in z as we crossthe negative real axis. If that is not desirable

for our problem than we could define 0 arg(z) < 2. Now z is continuous across the negative real axis

but there is a jump across the positive real axis. Not matter what definition we pick, there will always be a

discontinuity somewhere. We cannot go around z = 0 without encountering such a jump, z = 0 is called a

branch point and the semi-infinite curve emanating from z = 0 across which arg(z) jumps is called a branch

cut.

Heres a simple example that illustrates the extra subtleties and techniques.

Z

x

dx

1 + x2

0

First note that this integral does indeed exist since x/(1 + x2 ) x3/2 as x and therefore goes to

zero fast enough to be integrable.

Our first impulse is to see this as an integral over the real axis from 0 to

2

of the complex function

z/(z

+ 1). That function has simple poles at z = i as we know well. But

theres a problem: z is not analytic at z = 0 which is on our contour again. No big deal, we can avoid

it as we saw in the (sin x)/x example. So lets

take the same 4-piece closed contour as in that problem.

But were not all set yet because we have a z, what do we mean by that when z is complex? We need to

define that function

so that it is analytic everywhere inside and onour contour. Writing z = |z|ei arg(z) then

we can define z = |z|1/2 ei arg(z)/2 . We need to define arg(z) so z is analytic inside and on our contour.

The definitions arg(z)

< would not work with our decision to close in the upper half place. Why?

because arg(z) and thus z would not be continuous at the junction between C4 and C1 . We could close in

the lower half plane, or we can pick another branch cut for arg(z). The standard definition < arg(z)

would work. Try it! Well take a more exotic choice to illustrate branch cuts more dramatically. Lets pick

0 arg(z) < 2.

To be continued . . .

- Physics NotesUploaded bylala2398
- A Small Compendium on Vector and Tensor Algebra and CalculusUploaded bydfcortesv
- Additional Mathematics Project SPMUploaded byPeggy Chance
- HRW Physics 3 VectorsUploaded byPedro Sarmiento Solis
- hints2Uploaded byGeorham
- 1Q09 Influ CoeffUploaded byho-fa
- Lec-12.pdfUploaded byragunath.t
- Lect-week3_ENE2006_sp2019.pdfUploaded byJungdong Park
- Pre-calculus / math notes (unit 21 of 22)Uploaded byOmegaUser
- Introduction to Inverse Problems - Sari LasanenUploaded byMarcoags26
- GlowScript Annotations 1Uploaded byrobbie_delavega7000
- dela_4-1.pdfUploaded byaa
- Physics TopologicUploaded byssheltonaytac
- math213a01Uploaded byvinhkhale
- Dynamic Eigen Value AnalysisUploaded byBhanoday Reddy
- Lab Exp 1Uploaded byNumanAbdullah
- Graphics Slides 03Uploaded byapi-3738981
- LaTeX RefSheetUploaded byAlessioHr
- Effect of Disturbance Directions on Closed-LoopUploaded byvija
- Topic 1 Problem Set 2016Uploaded byPaul Amezquita
- [J. J. Moreau, P. D. Panagiotopoulos (Eds.)] Nonsm(BookZZ.org)Uploaded bySun Shine
- mm-6Uploaded bySomasekhar Chowdary Kakarala
- LecturesUploaded byPrajwal Prakash
- Gate Syllabus EC ImpUploaded bymanikantamnk11
- Prezentacija Marsej_ver.6_finish.pptUploaded byGeorge Acosta
- tran 1995 0219Uploaded byParticle Beam Physics Lab
- 9780817683849-c1.pdfUploaded byGaro Ohanoglu
- On Frequency Sensitive Competitive Learning for VQ Codebook DesignUploaded byLucas Arruda
- vector 1Uploaded byMantu Das
- January 2009 QP - Unit 1 Edexcel PhysicsUploaded bynaamashir

- Comparison of AMOMUploaded bynoahb110
- AA210_Homework_4_2018_19Uploaded bynoahb110
- Heywood Internal Combustion Engines Chapter 3 SolutionsUploaded bynoahb110
- Design and Simulation of Two-Stroke Engines- Blair.pdfUploaded bynoahb110
- Concepts and Applications of Finite Element Analysis - Cook Plesha Witt 4th EdUploaded bynoahb110
- Polygonal Finite Elements for Finite ElasticityUploaded bynoahb110
- Cap. 02.pdfUploaded byRaul Laurindo
- Introduction to Fluid Mechanics- Fox Chapter 6 Solution ManualUploaded bynoahb110
- Stress Concentration in Keyways and Optimization of Keyway DesignUploaded bynoahb110
- Metric Keyway Sizes.pdfUploaded bynoahb110
- EM3711TUploaded bynoahb110
- Pitting_Geometry_Factor.pdfUploaded bynoahb110
- 8020_Price List_wBlack_2013.pdfUploaded bynoahb110
- Classical Dynamics.pdfUploaded bynoahb110
- Classical DynamicsUploaded bynoahb110
- Solid Mechanics MitUploaded byDebasisMohapatra
- 1998 Scoring GuidelinesUploaded bynoahb110

- 1.Mathematical Physics.pptxUploaded bySk Sajedul
- 2005 Jun MS Edexcel Mechanics-1 6677Uploaded byKaushal
- Lista2Uploaded byLeonardo Vicino
- Maneuvering Board ManualUploaded byAlex Cr
- modelling of induction motorUploaded byAnonymous vLerKYA
- Lec8p3Uploaded byGamil Tawaf
- 11th Class PhysicsUploaded byRishabh Sharma
- Física - CinemáticaUploaded bydouglas_gato_148334
- Willcox_Burnback_AIAA-2005-4350-538Uploaded byDinto Thomas
- PHY2049_Fall2015_Test3Uploaded bykevin fernandez
- Text by W C VetschUploaded byMark Power
- Archi Lecture 1Uploaded byJames Aries Barnachea
- FS-gatesUploaded bysanbarun
- Ipc Describing Motion Verbally With Distance and Displacement Study NotesUploaded byGrant
- Application of the finite-element method to electromagnetic and electrical topicsUploaded byamorais64
- [John P. Cullerne, Anton Machacek] the Language of(1)Uploaded byNasim Salim
- Young`s Double slit experimentUploaded bygeorgallidesmarcos
- Relative VelocityUploaded byRahul Gupta
- Data MiningWith R by L TorgoUploaded bySohail Ahmad
- EEE. 2mark and 16 Marks Quest-with AnswerUploaded byPrasanth Govindaraj
- Ch 03 VectorsUploaded byasuhass
- Archimedes' principle in general coordinatesUploaded bycentari
- Web Design IngUploaded byfalconn
- Algebraic and ScalarUploaded bydtr17
- Sun 1994 Creative Mechanical DesignUploaded byMehul Jogani
- Course Plan Mechanics 1(BS Maths)Uploaded byAmbreen Khan
- Mathematics iUploaded byVikram Rao
- 1.1 VECTOR CALCULUS (Derivative of Scalar & Vector Function)Uploaded byZaid Imran
- 3D Force TheoryUploaded byArvind M
- Advance Roving Device for Rescue OperationUploaded byArun Kumar Yadav