Vector Calculus

Michael Corral

Schoolcraft College

About the author:

Michael Corral is an Adjunct Faculty member of the Department of Mathematics

at Schoolcraft College. He received a B.A. in Mathematics from the University

of California at Berkeley, and received an M.A. in Mathematics and an M.S. in

Industrial & Operations Engineering from the University of Michigan.

This text was typeset in LaTeX 2ε with the KOMA-Script bundle, using the GNU

Emacs text editor on a Fedora Linux system. The graphics were created using

MetaPost, PGF, and Gnuplot.

Copyright © 2008 Michael Corral.

Permission is granted to copy, distribute and/or modify this document under the terms

of the GNU Free Documentation License, Version 1.2 or any later version published

by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts,

and no Back-Cover Texts. A copy of the license is included in the section entitled

“GNU Free Documentation License”.

Preface

This book covers calculus in two and three variables. It is suitable for a one-semester

course, normally known as “Vector Calculus”, “Multivariable Calculus”, or simply

“Calculus III”. The prerequisites are the standard courses in single-variable calculus

(a.k.a. Calculus I and II).

I have tried to be somewhat rigorous about proving results. But while it is important
for students to see full-blown proofs - since that is how mathematics works - too
much rigor and emphasis on proofs can impede the flow of learning for the vast majority
of the audience at this level. If I were to rate the level of rigor in the book on a
scale of 1 to 10, with 1 being completely informal and 10 being completely rigorous, I
would rate it as a 5.

There are 420 exercises throughout the text, which in my experience are more than

enough for a semester course in this subject. There are exercises at the end of each

section, divided into three categories: A, B and C. The A exercises are mostly of a

routine computational nature, the B exercises are slightly more involved, and the C

exercises usually require some effort or insight to solve. A crude way of describing A,

B and C would be “Easy”, “Moderate” and “Challenging”, respectively. However, many

of the B exercises are easy and not all the C exercises are difficult.

There are a few exercises that require the student to write his or her own computer
program to solve some numerical approximation problems (e.g. the Monte Carlo
method for approximating multiple integrals, in Section 3.4). The code samples in the
text are in the Java programming language, hopefully with enough comments so that
the reader can figure out what is being done even without knowing Java. Those exercises
do not mandate the use of Java, so students are free to implement the solutions
using the language of their choice. While it would have been simple to use a scripting
language like Python, and perhaps even easier with a functional programming
language (such as Haskell or Scheme), Java was chosen due to its ubiquity, relatively
clear syntax, and easy availability for multiple platforms.

Answers and hints to most odd-numbered and some even-numbered exercises are

provided in Appendix A. Appendix B contains a proof of the right-hand rule for the

cross product, which seems to have virtually disappeared from calculus texts over

the last few decades. Appendix C contains a brief tutorial on Gnuplot for graphing

functions of two variables.

This book is released under the GNU Free Documentation License (GFDL), which

allows others to not only copy and distribute the book but also to modify it. For more

details, see the included copy of the GFDL. So that there is no ambiguity on this

matter, anyone can make as many copies of this book as desired and distribute it

as desired, without needing my permission. The PDF version will always be freely

available to the public at no cost (go to http://www.mecmath.net). Feel free to

contact me at mcorral@schoolcraft.edu for any questions on this or any other

matter involving the book (e.g. comments, suggestions, corrections, etc). I welcome

your input.

Finally, I would like to thank my students in Math 240 for being the guinea pigs

for the initial draft of this book, and for finding the numerous errors and typos it

contained.

January 2008 MICHAEL CORRAL

Contents

Preface

1 Vectors in Euclidean Space
1.1 Introduction
1.2 Vector Algebra
1.3 Dot Product
1.4 Cross Product
1.5 Lines and Planes
1.6 Surfaces
1.7 Curvilinear Coordinates
1.8 Vector-Valued Functions
1.9 Arc Length

2 Functions of Several Variables
2.1 Functions of Two or Three Variables
2.2 Partial Derivatives
2.3 Tangent Plane to a Surface
2.4 Directional Derivatives and the Gradient
2.5 Maxima and Minima
2.6 Unconstrained Optimization: Numerical Methods
2.7 Constrained Optimization: Lagrange Multipliers

3 Multiple Integrals
3.1 Double Integrals
3.2 Double Integrals Over a General Region
3.3 Triple Integrals
3.4 Numerical Approximation of Multiple Integrals
3.5 Change of Variables in Multiple Integrals
3.6 Application: Center of Mass
3.7 Application: Probability and Expected Value

4 Line and Surface Integrals
4.1 Line Integrals
4.2 Properties of Line Integrals
4.3 Green’s Theorem
4.4 Surface Integrals and the Divergence Theorem
4.5 Stokes’ Theorem
4.6 Gradient, Divergence, Curl and Laplacian

Bibliography

Appendix A: Answers and Hints to Selected Exercises

Appendix B: Proof of the Right-Hand Rule for the Cross Product

Appendix C: 3D Graphing with Gnuplot

GNU Free Documentation License

History

Index

1 Vectors in Euclidean Space

1.1 Introduction

In single-variable calculus, the functions that one encounters are functions of a variable
(usually x or t) that varies over some subset of the real number line (which we
denote by $\mathbb{R}$). For such a function, say, y = f(x), the graph of the function f
consists of the points (x, y) = (x, f(x)). These points lie in the Euclidean plane, which,
in the Cartesian or rectangular coordinate system, consists of all ordered pairs of
real numbers (a, b). We use the word “Euclidean” to denote a system in which all the
usual rules of Euclidean geometry hold. We denote the Euclidean plane by $\mathbb{R}^2$; the
“2” represents the number of dimensions of the plane. The Euclidean plane has two
perpendicular coordinate axes: the x-axis and the y-axis.

In vector (or multivariable) calculus, we will deal with functions of two or three variables
(usually x, y or x, y, z, respectively). The graph of a function of two variables, say,
z = f(x, y), lies in Euclidean space, which in the Cartesian coordinate system consists
of all ordered triples of real numbers (a, b, c). Since Euclidean space is 3-dimensional,
we denote it by $\mathbb{R}^3$. The graph of f consists of the points (x, y, z) = (x, y, f(x, y)). The
3-dimensional coordinate system of Euclidean space can be represented on a flat surface,
such as this page or a blackboard, only by giving the illusion of three dimensions,
in the manner shown in Figure 1.1.1. Euclidean space has three mutually perpendicular
coordinate axes (x, y and z), and three mutually perpendicular coordinate planes:
the xy-plane, yz-plane and xz-plane (see Figure 1.1.2).

Figure 1.1.1 (the point P(a, b, c) plotted against the x-, y- and z-axes)

Figure 1.1.2 (the xy-plane, yz-plane and xz-plane)

The coordinate system shown in Figure 1.1.1 is known as a right-handed coordinate
system, because it is possible, using the right hand, to point the index finger in
the positive direction of the x-axis, the middle finger in the positive direction of the
y-axis, and the thumb in the positive direction of the z-axis, as in Figure 1.1.3.

Figure 1.1.3 Right-handed coordinate system

An equivalent way of defining a right-handed system is if you can point your thumb

upwards in the positive z-axis direction while using the remaining four fingers to

rotate the x-axis towards the y-axis. Doing the same thing with the left hand is what

defines a left-handed coordinate system. Notice that switching the x- and y-axes

in a right-handed system results in a left-handed system, and that rotating either

type of system does not change its “handedness”. Throughout the book we will use a

right-handed system.

For functions of three variables, the graphs exist in 4-dimensional space (i.e. $\mathbb{R}^4$),
which we cannot see in our 3-dimensional space, let alone simulate in 2-dimensional
space. So we can only think of 4-dimensional space abstractly. For an entertaining
discussion of this subject, see the book by ABBOTT.¹

So far, we have discussed the position of an object in 2-dimensional or 3-dimensional

space. But what about something such as the velocity of the object, or its acceleration?

Or the gravitational force acting on the object? These phenomena all seem to involve

motion and direction in some way. This is where the idea of a vector comes in.

¹ One thing you will learn is why a 4-dimensional creature would be able to reach inside an egg and
remove the yolk without cracking the shell!

You have already dealt with velocity and acceleration in single-variable calculus.
For example, for motion along a straight line, if y = f(t) gives the displacement of
an object after time t, then dy/dt = f′(t) is the velocity of the object at time t. The
derivative f′(t) is just a number, which is positive if the object is moving in an
agreed-upon “positive” direction, and negative if it moves in the opposite of that direction. So
you can think of that number, which was called the velocity of the object, as having
two components: a magnitude, indicated by a nonnegative number, preceded by a
direction, indicated by a plus or minus symbol (representing motion in the positive
direction or the negative direction, respectively), i.e. f′(t) = ±a for some number a ≥ 0.
Then a is the magnitude of the velocity (normally called the speed of the object), and
the ± represents the direction of the velocity (though the + is usually omitted for the
positive direction).

For motion along a straight line, i.e. in a 1-dimensional space, the velocities are
also contained in that 1-dimensional space, since they are just numbers. For general
motion along a curve in 2- or 3-dimensional space, however, velocity will need to be
represented by a multidimensional object which should have both a magnitude and a
direction. A geometric object which has those features is an arrow, which in elementary
geometry is called a “directed line segment”. This is the motivation for how we
will define a vector.

Definition 1.1. A (nonzero) vector is a directed line segment drawn from a point P
(called its initial point) to a point Q (called its terminal point), with P and Q being
distinct points. The vector is denoted by $\overrightarrow{PQ}$. Its magnitude is the length of the line
segment, denoted by $\lVert \overrightarrow{PQ} \rVert$, and its direction is the same as that of the directed line
segment. The zero vector is just a point, and it is denoted by $\mathbf{0}$.

To indicate the direction of a vector, we draw an arrow from its initial point to its

terminal point. We will often denote a vector by a single bold-faced letter (e.g. v) and

use the terms “magnitude” and “length” interchangeably. Note that our definition

could apply to systems with any number of dimensions (see Figure 1.1.4 (a)-(c)).

Figure 1.1.4 Vectors in different dimensions: (a) One dimension, (b) Two dimensions, (c) Three dimensions

A few things need to be noted about the zero vector. Our motivation for what a
vector is included the notions of magnitude and direction. What is the magnitude of
the zero vector? We define it to be zero, i.e. $\lVert \mathbf{0} \rVert = 0$. This agrees with the definition of
the zero vector as just a point, which has zero length. What about the direction of the
zero vector? A single point really has no well-defined direction. Notice that we were
careful to only define the direction of a nonzero vector, which is well-defined since the
initial and terminal points are distinct. Not everyone agrees on the direction of the
zero vector. Some contend that the zero vector has arbitrary direction (i.e. can take
any direction), some say that it has indeterminate direction (i.e. the direction cannot
be determined), while others say that it has no direction. Our definition of the zero
vector, however, does not require it to have a direction, and we will leave it at that.²

Now that we know what a vector is, we need a way of determining when two vectors

are equal. This leads us to the following definition.

Definition 1.2. Two nonzero vectors are equal if they have the same magnitude and

the same direction. Any vector with zero magnitude is equal to the zero vector.

By this definition, vectors with the same magnitude and direction but with different
initial points would be equal. For example, in Figure 1.1.5 the vectors u, v and w all
have the same magnitude $\sqrt{5}$ (by the Pythagorean Theorem). And we see that u and
w are parallel, since they lie on lines having the same slope $\tfrac{1}{2}$, and they point in the
same direction. So u = w, even though they have different initial points. We also see
that v is parallel to u but points in the opposite direction. So u ≠ v.

Figure 1.1.5 (the vectors u, v and w in the xy-plane)

So we can see that there are an infinite number of vectors for a given magnitude

and direction, those vectors all being equal and differing only by their initial and

terminal points. Is there a single vector which we can choose to represent all those

equal vectors? The answer is yes, and is suggested by the vector w in Figure 1.1.5.

² In the subject of linear algebra there is a more abstract way of defining a vector where the concept of
“direction” is not really used. See ANTON and RORRES.

Unless otherwise indicated, when speaking of “the vector” with a given magnitude

and direction, we will mean the one whose initial point is at the origin of the

coordinate system.

Thinking of vectors as starting from the origin provides a way of dealing with vectors
in a standard way, since every coordinate system has an origin. But there will
be times when it is convenient to consider a different initial point for a vector (for
example, when adding vectors, which we will do in the next section).

Another advantage of using the origin as the initial point is that it provides an easy

correspondence between a vector and its terminal point.

Example 1.1. Let v be the vector in $\mathbb{R}^3$ whose initial point is at the origin and whose
terminal point is (3, 4, 5). Though the point (3, 4, 5) and the vector v are different objects,
it is convenient to write v = (3, 4, 5). When doing this, it is understood that the
initial point of v is at the origin (0, 0, 0) and the terminal point is (3, 4, 5).

Figure 1.1.6 Correspondence between points and vectors: (a) The point (3,4,5), (b) The vector (3,4,5)

Unless otherwise stated, when we refer to vectors as v = (a, b) in $\mathbb{R}^2$ or v = (a, b, c)
in $\mathbb{R}^3$, we mean vectors in Cartesian coordinates starting at the origin. Also, we will
write the zero vector $\mathbf{0}$ in $\mathbb{R}^2$ and $\mathbb{R}^3$ as (0, 0) and (0, 0, 0), respectively.

The point-vector correspondence provides an easy way to check if two vectors are

equal, without having to determine their magnitude and direction. Similar to seeing

if two points are the same, you are now seeing if the terminal points of vectors starting

at the origin are the same. For each vector, find the (unique!) vector it equals whose

initial point is the origin. Then compare the coordinates of the terminal points of

these “new” vectors: if those coordinates are the same, then the original vectors are

equal. To get the “new” vectors starting at the origin, you translate each vector to

start at the origin by subtracting the coordinates of the original initial point from the

original terminal point. The resulting point will be the terminal point of the “new”

vector whose initial point is the origin. Do this for each original vector then compare.

Example 1.2. Consider the vectors $\overrightarrow{PQ}$ and $\overrightarrow{RS}$ in $\mathbb{R}^3$, where P = (2, 1, 5), Q = (3, 5, 7),
R = (1, −3, −2) and S = (2, 1, 0). Does $\overrightarrow{PQ} = \overrightarrow{RS}$?

Solution: The vector $\overrightarrow{PQ}$ is equal to the vector v with initial point (0, 0, 0) and terminal
point Q − P = (3, 5, 7) − (2, 1, 5) = (3 − 2, 5 − 1, 7 − 5) = (1, 4, 2).

Similarly, $\overrightarrow{RS}$ is equal to the vector w with initial point (0, 0, 0) and terminal point
S − R = (2, 1, 0) − (1, −3, −2) = (2 − 1, 1 − (−3), 0 − (−2)) = (1, 4, 2).

So $\overrightarrow{PQ} = \mathbf{v} = (1, 4, 2)$ and $\overrightarrow{RS} = \mathbf{w} = (1, 4, 2)$.

$\therefore\ \overrightarrow{PQ} = \overrightarrow{RS}$
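The translate-and-compare procedure used in this example can be sketched in a few lines of Python. This is only an illustration, not code from the text: the function names `translate_to_origin` and `vectors_equal` are hypothetical, and points are assumed to be tuples of exact (integer) coordinates so that `==` is a safe comparison.

```python
def translate_to_origin(initial, terminal):
    """Terminal point of the equal vector whose initial point is the origin.

    Subtracts the coordinates of the initial point from the terminal point.
    """
    return tuple(t - i for i, t in zip(initial, terminal))


def vectors_equal(p, q, r, s):
    """True if the vector from p to q equals the vector from r to s."""
    return translate_to_origin(p, q) == translate_to_origin(r, s)


# Points from Example 1.2
P, Q = (2, 1, 5), (3, 5, 7)
R, S = (1, -3, -2), (2, 1, 0)

print(translate_to_origin(P, Q))  # (1, 4, 2)
print(translate_to_origin(R, S))  # (1, 4, 2)
print(vectors_equal(P, Q, R, S))  # True
```

With floating-point coordinates an exact `==` test may be too strict, so a tolerance-based comparison would be the more cautious choice there.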

Figure 1.1.7 (translating $\overrightarrow{PQ}$ to v and $\overrightarrow{RS}$ to w, where P = (2, 1, 5), Q = (3, 5, 7), R = (1, −3, −2), S = (2, 1, 0) and v = w = (1, 4, 2))

Recall the distance formula for points in the Euclidean plane:

For points $P = (x_1, y_1)$, $Q = (x_2, y_2)$ in $\mathbb{R}^2$, the distance d between P and Q is:

$$d = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2} \tag{1.1}$$

By this formula, we have the following result:

For a vector $\overrightarrow{PQ}$ in $\mathbb{R}^2$ with initial point $P = (x_1, y_1)$ and terminal point
$Q = (x_2, y_2)$, the magnitude of $\overrightarrow{PQ}$ is:

$$\lVert \overrightarrow{PQ} \rVert = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2} \tag{1.2}$$

Finding the magnitude of a vector v = (a, b) in $\mathbb{R}^2$ is a special case of formula (1.2)
with P = (0, 0) and Q = (a, b):

For a vector v = (a, b) in $\mathbb{R}^2$, the magnitude of v is:

$$\lVert \mathbf{v} \rVert = \sqrt{a^2 + b^2} \tag{1.3}$$

To calculate the magnitude of vectors in $\mathbb{R}^3$, we need a distance formula for points
in Euclidean space (we will postpone the proof until the next section):

Theorem 1.1. The distance d between points $P = (x_1, y_1, z_1)$ and $Q = (x_2, y_2, z_2)$ in $\mathbb{R}^3$ is:

$$d = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2} \tag{1.4}$$

The proof will use the following result:

Theorem 1.2. For a vector v = (a, b, c) in $\mathbb{R}^3$, the magnitude of v is:

$$\lVert \mathbf{v} \rVert = \sqrt{a^2 + b^2 + c^2} \tag{1.5}$$

Proof: There are four cases to consider:

Case 1: a = b = c = 0. Then $\mathbf{v} = \mathbf{0}$, so $\lVert \mathbf{v} \rVert = 0 = \sqrt{0^2 + 0^2 + 0^2} = \sqrt{a^2 + b^2 + c^2}$.

Case 2: exactly two of a, b, c are 0. Without loss of generality, we assume that a =
b = 0 and c ≠ 0 (the other two possibilities are handled in a similar manner). Then
v = (0, 0, c), which is a vector of length |c| along the z-axis. So
$\lVert \mathbf{v} \rVert = |c| = \sqrt{c^2} = \sqrt{0^2 + 0^2 + c^2} = \sqrt{a^2 + b^2 + c^2}$.

Case 3: exactly one of a, b, c is 0. Without loss of generality, we assume that a = 0,
b ≠ 0 and c ≠ 0 (the other two possibilities are handled in a similar manner). Then
v = (0, b, c), which is a vector in the yz-plane, so by the Pythagorean Theorem we have
$\lVert \mathbf{v} \rVert = \sqrt{b^2 + c^2} = \sqrt{0^2 + b^2 + c^2} = \sqrt{a^2 + b^2 + c^2}$.

Figure 1.1.8 (the points P, Q(a, b, c), R and S, with the vector v drawn from P to Q)

Case 4: none of a, b, c are 0. Without loss of generality, we can
assume that a, b, c are all positive (the other seven possibilities
are handled in a similar manner). Consider the points
P = (0, 0, 0), Q = (a, b, c), R = (a, b, 0), and S = (a, 0, 0), as shown
in Figure 1.1.8. Applying the Pythagorean Theorem to the
right triangle △PSR gives $|PR|^2 = a^2 + b^2$. A second application
of the Pythagorean Theorem, this time to the right triangle
△PQR, gives $\lVert \mathbf{v} \rVert = |PQ| = \sqrt{|PR|^2 + |QR|^2} = \sqrt{a^2 + b^2 + c^2}$.

This proves the theorem. QED
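As a quick numerical illustration of Case 4, take the sample vector v = (3, 4, 12), so that a = 3, b = 4 and c = 12 (values chosen here for convenience; they do not come from the text). The two applications of the Pythagorean Theorem then read:

```latex
\begin{aligned}
|PR|^2 &= 3^2 + 4^2 = 25, \\
\lVert \mathbf{v} \rVert = |PQ| &= \sqrt{|PR|^2 + |QR|^2} = \sqrt{25 + 12^2} = \sqrt{169} = 13,
\end{aligned}
```

which agrees with applying formula (1.5) directly: $\sqrt{9 + 16 + 144} = 13$.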

Example 1.3. Calculate the following:

(a) The magnitude of the vector $\overrightarrow{PQ}$ in $\mathbb{R}^2$ with P = (−1, 2) and Q = (5, 5).

Solution: By formula (1.2), $\lVert \overrightarrow{PQ} \rVert = \sqrt{(5 - (-1))^2 + (5 - 2)^2} = \sqrt{36 + 9} = \sqrt{45} = 3\sqrt{5}$.

(b) The magnitude of the vector v = (8, 3) in $\mathbb{R}^2$.

Solution: By formula (1.3), $\lVert \mathbf{v} \rVert = \sqrt{8^2 + 3^2} = \sqrt{73}$.

(c) The distance between the points P = (2, −1, 4) and Q = (4, 2, −3) in $\mathbb{R}^3$.

Solution: By formula (1.4), the distance $d = \sqrt{(4 - 2)^2 + (2 - (-1))^2 + (-3 - 4)^2} = \sqrt{4 + 9 + 49} = \sqrt{62}$.

(d) The magnitude of the vector v = (5, 8, −2) in $\mathbb{R}^3$.

Solution: By formula (1.5), $\lVert \mathbf{v} \rVert = \sqrt{5^2 + 8^2 + (-2)^2} = \sqrt{25 + 64 + 4} = \sqrt{93}$.
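The four calculations in Example 1.3 can be reproduced numerically. The following is a minimal sketch, assuming vectors and points are given as tuples of numbers; the helper names `magnitude` and `distance` are hypothetical and are not taken from the text.

```python
import math


def magnitude(v):
    """Length of a vector given by its components, as in formulas (1.3) and (1.5)."""
    return math.sqrt(sum(c * c for c in v))


def distance(p, q):
    """Distance between two points, as in formulas (1.1) and (1.4).

    Computed as the magnitude of the coordinate-wise difference q - p.
    """
    return magnitude(tuple(b - a for a, b in zip(p, q)))


print(distance((-1, 2), (5, 5)))         # part (a): compare with sqrt(45)
print(magnitude((8, 3)))                 # part (b): compare with sqrt(73)
print(distance((2, -1, 4), (4, 2, -3)))  # part (c): compare with sqrt(62)
print(magnitude((5, 8, -2)))             # part (d): compare with sqrt(93)
```

The results are floating-point approximations of the exact square roots found above, so they should be compared with a tolerance (for example with `math.isclose`) rather than with `==`.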

Exercises

A

1. Calculate the magnitudes of the following vectors:

(a) v = (2, −1) (b) v = (2, −1, 0) (c) v = (3, 2, −2) (d) v = (0, 0, 1) (e) v = (6, 4, −4)

2. For the points P = (1, −1, 1), Q = (2, −2, 2), R = (2, 0, 1), S = (3, −1, 2), does $\overrightarrow{PQ} = \overrightarrow{RS}$?

3. For the points P = (0, 0, 0), Q = (1, 3, 2), R = (1, 0, 1), S = (2, 3, 4), does $\overrightarrow{PQ} = \overrightarrow{RS}$?

B

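4. Let v = (1, 0, 0) and w = (a, 0, 0) be vectors in $\mathbb{R}^3$. Show that $\lVert \mathbf{w} \rVert = |a| \, \lVert \mathbf{v} \rVert$.

5. Let v = (a, b, c) and w = (3a, 3b, 3c) be vectors in $\mathbb{R}^3$. Show that $\lVert \mathbf{w} \rVert = 3 \lVert \mathbf{v} \rVert$.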

C

6. Though we will see a simple proof of Theorem 1.1
in the next section, it is possible to prove it using
methods similar to those in the proof of Theorem
1.2. Prove the special case of Theorem 1.1 where
the points $P = (x_1, y_1, z_1)$ and $Q = (x_2, y_2, z_2)$ satisfy
the following conditions:

$x_2 > x_1 > 0$, $y_2 > y_1 > 0$, and $z_2 > z_1 > 0$.

(Hint: Think of Case 4 in the proof of Theorem
1.2, and consider Figure 1.1.9.)

Figure 1.1.9 (the points $P(x_1, y_1, z_1)$, $Q(x_2, y_2, z_2)$, $R(x_2, y_2, z_1)$, $S(x_1, y_1, 0)$, $T(x_2, y_2, 0)$ and $U(x_2, y_1, 0)$)

1.2 Vector Algebra

Now that we know what vectors are, we can start to perform some of the usual algebraic
operations on them (e.g. addition, subtraction). Before doing that, we will
introduce the notion of a scalar.

Definition 1.3. A scalar is a quantity that can be represented by a single number.

For our purposes, scalars will always be real numbers.³ Examples of scalar quantities
are mass, electric charge, and speed (not velocity).⁴ We can now define scalar
multiplication of a vector.

Definition 1.4. For a scalar k and a nonzero vector v, the scalar multiple of v by k,
denoted by kv, is the vector whose magnitude is $|k| \, \lVert \mathbf{v} \rVert$, points in the same direction
as v if k > 0, points in the opposite direction as v if k < 0, and is the zero vector $\mathbf{0}$ if
k = 0. For the zero vector $\mathbf{0}$, we define $k\mathbf{0} = \mathbf{0}$ for any scalar k.

Two vectors v and w are parallel (denoted by $\mathbf{v} \parallel \mathbf{w}$) if one is a scalar multiple of
the other. You can think of scalar multiplication of a vector as stretching or shrinking
the vector, and as flipping the vector in the opposite direction if the scalar is a negative
number (see Figure 1.2.1).

Figure 1.2.1 (the scalar multiples 2v, 3v, 0.5v, −v and −2v of a vector v)
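Once vectors are written in coordinates, the "one is a scalar multiple of the other" test for parallel vectors can be checked without any division. The sketch below is an illustration only: the function name `is_parallel` is hypothetical, and it assumes both vectors are tuples of the same length with exact (integer or `fractions.Fraction`) components. Note that, under the definition above, the zero vector is a scalar multiple (by 0) of every vector, so it counts as parallel to everything.

```python
from itertools import combinations


def is_parallel(v, w):
    """True if one of v, w is a scalar multiple of the other.

    Checks v[i] * w[j] == v[j] * w[i] for every pair of coordinates,
    which avoids dividing by a component that might be 0.
    """
    return all(v[i] * w[j] == v[j] * w[i]
               for i, j in combinations(range(len(v)), 2))


print(is_parallel((1, 2, 3), (-2, -4, -6)))  # True: w = -2 v
print(is_parallel((1, 2, 3), (1, 2, 4)))     # False
print(is_parallel((0, 0, 0), (1, 2, 3)))     # True: the zero vector is 0 * w
```

The cross-multiplication form is used because a direct ratio test such as `w[i] / v[i]` fails whenever a component of v is zero.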

Recall that translating a nonzero vector means that the initial point of the vector

is changed but the magnitude and direction are preserved. We are now ready to define

the sum of two vectors.

Definition 1.5. The sum of vectors v and w, denoted by v + w, is obtained by translating
w so that its initial point is at the terminal point of v; the initial point of v + w
is the initial point of v, and its terminal point is the new terminal point of w.

³ The term scalar was invented by 19th century Irish mathematician, physicist and astronomer William
Rowan Hamilton, to convey the sense of something that could be represented by a point on a scale or
graduated ruler. The word vector comes from Latin, where it means “carrier”.

⁴ An alternate definition of scalars and vectors, used in physics, is that under certain types of coordinate
transformations (e.g. rotations), a quantity that is not affected is a scalar, while a quantity that is
affected (in a certain way) is a vector. See MARION for details.

Intuitively, adding w to v means tacking on w to the end of v (see Figure 1.2.2).

Figure 1.2.2 Adding vectors v and w: (a) Vectors v and w, (b) Translate w to the end of v, (c) The sum v + w

Notice that our definition is valid for the zero vector (which is just a point, and

hence can be translated), and so we see that v + 0 = v = 0 + v for any vector v. In

particular, 0 + 0 = 0. Also, it is easy to see that v + (−v) = 0, as we would expect. In

general, since the scalar multiple −v = −1 v is a well-defined vector, we can define

vector subtraction as follows: v − w = v + (−w). See Figure 1.2.3.

Figure 1.2.3 Subtracting vectors v and w: (a) Vectors v and w, (b) Translate −w to the end of v, (c) The difference v − w

Figure 1.2.4 shows the use of “geometric proofs” of various laws of vector algebra,

that is, it uses laws from elementary geometry to prove statements about vectors. For

example, (a) shows that v + w = w + v for any vectors v, w. And (c) shows how you

can think of v − w as the vector that is tacked on to the end of w to add up to v.

Figure 1.2.4 “Geometric” vector algebra: (a) Add vectors, (b) Subtract vectors, (c) Combined add/subtract

Notice that we have temporarily abandoned the practice of starting vectors at the
origin. In fact, we have not even mentioned coordinates in this section so far. Since we
will deal mostly with Cartesian coordinates in this book, the following two theorems
are useful for performing vector algebra on vectors in $\mathbb{R}^2$ and $\mathbb{R}^3$ starting at the origin.

Theorem 1.3. Let $\mathbf{v} = (v_1, v_2)$, $\mathbf{w} = (w_1, w_2)$ be vectors in $\mathbb{R}^2$, and let k be a scalar. Then

(a) $k\mathbf{v} = (kv_1, kv_2)$

(b) $\mathbf{v} + \mathbf{w} = (v_1 + w_1, v_2 + w_2)$

Proof: (a) Without loss of generality, we assume that $v_1, v_2 > 0$ (the other possibilities
are handled in a similar manner). If k = 0 then $k\mathbf{v} = 0\mathbf{v} = \mathbf{0} = (0, 0) = (0v_1, 0v_2) = (kv_1, kv_2)$,
which is what we needed to show. If k ≠ 0, then $(kv_1, kv_2)$ lies on a line with
slope $\frac{kv_2}{kv_1} = \frac{v_2}{v_1}$, which is the same as the slope of the line on which v (and hence kv) lies,
and $(kv_1, kv_2)$ points in the same direction on that line as kv. Also, by formula (1.3) the
magnitude of $(kv_1, kv_2)$ is

$$\sqrt{(kv_1)^2 + (kv_2)^2} = \sqrt{k^2 v_1^2 + k^2 v_2^2} = \sqrt{k^2 (v_1^2 + v_2^2)} = |k| \sqrt{v_1^2 + v_2^2} = |k| \, \lVert \mathbf{v} \rVert .$$

So kv and $(kv_1, kv_2)$ have the same magnitude and direction. This proves (a).

Figure 1.2.5 (w translated to start at the end of v; the terminal point of v + w is $(v_1 + w_1, v_2 + w_2)$)

(b) Without loss of generality, we assume that
$v_1, v_2, w_1, w_2 > 0$ (the other possibilities are handled
in a similar manner). From Figure 1.2.5,
we see that when translating w to start at
the end of v, the new terminal point of w is
$(v_1 + w_1, v_2 + w_2)$, so by the definition of v + w
this must be the terminal point of v + w. This
proves (b). QED

Theorem 1.4. Let $\mathbf{v} = (v_1, v_2, v_3)$, $\mathbf{w} = (w_1, w_2, w_3)$ be vectors in $\mathbb{R}^3$, let k be a scalar. Then

(a) $k\mathbf{v} = (kv_1, kv_2, kv_3)$

(b) $\mathbf{v} + \mathbf{w} = (v_1 + w_1, v_2 + w_2, v_3 + w_3)$
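Theorems 1.3 and 1.4 reduce scalar multiplication and vector addition to coordinate-wise arithmetic, which translates directly into code. The following is a minimal sketch, assuming vectors are represented as tuples of numbers of equal length; the function names `scale`, `add` and `subtract` and the sample vectors are hypothetical and chosen only for illustration.

```python
def scale(k, v):
    """Scalar multiple kv, computed component by component."""
    return tuple(k * c for c in v)


def add(v, w):
    """Sum v + w, computed component by component."""
    return tuple(a + b for a, b in zip(v, w))


def subtract(v, w):
    """Difference v - w, defined as v + (-w)."""
    return add(v, scale(-1, w))


v, w = (1, 2, 3), (4, -5, 6)

print(scale(2, v))     # (2, 4, 6)
print(add(v, w))       # (5, -3, 9)
print(subtract(v, w))  # (-3, 7, -3)
```

Because the helpers iterate over whatever components they are given, the same code covers vectors in both two and three dimensions.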

The following theorem summarizes the basic laws of vector algebra.

Theorem 1.5. For any vectors u, v, w, and scalars k, l, we have

(a) v + w = w + v Commutative Law

(b) u + (v + w) = (u + v) + w Associative Law

(c) v + 0 = v = 0 + v Additive Identity

(d) v + (−v) = 0 Additive Inverse

(e) k(lv) = (kl)v Associative Law

(f) k(v + w) = kv + kw Distributive Law

(g) (k + l)v = kv + lv Distributive Law

Proof: (a) We already presented a geometric proof of this in Figure 1.2.4(a).

(b) To illustrate the difference between analytic proofs and geometric proofs in vector
algebra, we will present both types here. For the analytic proof, we will use vectors
in $\mathbb{R}^3$ (the proof for $\mathbb{R}^2$ is similar).

Let $\mathbf{u} = (u_1, u_2, u_3)$, $\mathbf{v} = (v_1, v_2, v_3)$, $\mathbf{w} = (w_1, w_2, w_3)$ be vectors in $\mathbb{R}^3$. Then

$$\begin{aligned}
\mathbf{u} + (\mathbf{v} + \mathbf{w}) &= (u_1, u_2, u_3) + ((v_1, v_2, v_3) + (w_1, w_2, w_3)) \\
&= (u_1, u_2, u_3) + (v_1 + w_1, v_2 + w_2, v_3 + w_3) && \text{by Theorem 1.4(b)} \\
&= (u_1 + (v_1 + w_1), u_2 + (v_2 + w_2), u_3 + (v_3 + w_3)) && \text{by Theorem 1.4(b)} \\
&= ((u_1 + v_1) + w_1, (u_2 + v_2) + w_2, (u_3 + v_3) + w_3) && \text{by properties of real numbers} \\
&= (u_1 + v_1, u_2 + v_2, u_3 + v_3) + (w_1, w_2, w_3) && \text{by Theorem 1.4(b)} \\
&= (\mathbf{u} + \mathbf{v}) + \mathbf{w}
\end{aligned}$$

This completes the analytic proof of (b). Figure 1.2.6 provides the geometric proof.

Figure 1.2.6 Associative Law for vector addition

(c) We already discussed this in the remarks following Figure 1.2.2.

(d) We already discussed this in the remarks following Figure 1.2.2.

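(e) We will prove this for a vector $\mathbf{v} = (v_1, v_2, v_3)$ in $\mathbb{R}^3$ (the proof for $\mathbb{R}^2$ is similar):

$$\begin{aligned}
k(l\mathbf{v}) &= k(lv_1, lv_2, lv_3) && \text{by Theorem 1.4(a)} \\
&= (klv_1, klv_2, klv_3) && \text{by Theorem 1.4(a)} \\
&= (kl)(v_1, v_2, v_3) && \text{by Theorem 1.4(a)} \\
&= (kl)\mathbf{v}
\end{aligned}$$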

(f) and (g): Left as exercises for the reader. QED
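The laws in Theorem 1.5 can be spot-checked numerically on sample vectors. Such a check is not a proof, since it tests only one choice of u, v, w, k and l, but it can catch mistakes when implementing vector arithmetic. The sketch below assumes tuple-based helpers `add` and `scale` (hypothetical names) and uses integer components so that every comparison is exact.

```python
def add(v, w):
    """Component-wise sum of two vectors."""
    return tuple(a + b for a, b in zip(v, w))


def scale(k, v):
    """Component-wise scalar multiple of a vector."""
    return tuple(k * c for c in v)


# Sample data (arbitrary integers chosen for illustration)
u, v, w = (1, -2, 3), (4, 0, -5), (-7, 6, 2)
k, l = 3, -2
zero = (0, 0, 0)

checks = {
    "(a) commutative": add(v, w) == add(w, v),
    "(b) associative": add(u, add(v, w)) == add(add(u, v), w),
    "(c) additive identity": add(v, zero) == v == add(zero, v),
    "(d) additive inverse": add(v, scale(-1, v)) == zero,
    "(e) associative": scale(k, scale(l, v)) == scale(k * l, v),
    "(f) distributive": scale(k, add(v, w)) == add(scale(k, v), scale(k, w)),
    "(g) distributive": scale(k + l, v) == add(scale(k, v), scale(l, v)),
}

for law, holds in checks.items():
    print(law, holds)
```

With floating-point components some of these equalities can fail by rounding error, so a tolerance-based comparison would be needed in that setting.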

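A unit vector is a vector with magnitude 1. Notice that for any nonzero vector v,
the vector $\frac{\mathbf{v}}{\lVert \mathbf{v} \rVert}$ is a unit vector which points in the same direction as v, since $\frac{1}{\lVert \mathbf{v} \rVert} > 0$
and $\left\lVert \frac{\mathbf{v}}{\lVert \mathbf{v} \rVert} \right\rVert = \frac{\lVert \mathbf{v} \rVert}{\lVert \mathbf{v} \rVert} = 1$. Dividing a nonzero vector v by $\lVert \mathbf{v} \rVert$ is often called normalizing v.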
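Normalizing a vector is a one-line computation once the magnitude is available. The following is a minimal sketch, assuming a vector is a tuple of numbers; the function name `normalize` and the sample vector are hypothetical. The zero vector is rejected explicitly, since it has no direction and dividing by its magnitude is undefined.

```python
import math


def normalize(v):
    """Return the unit vector pointing in the same direction as the nonzero vector v."""
    length = math.sqrt(sum(c * c for c in v))
    if length == 0:
        raise ValueError("the zero vector cannot be normalized")
    return tuple(c / length for c in v)


unit = normalize((3, 4, 12))  # the magnitude of (3, 4, 12) is 13
print(unit)
print(math.sqrt(sum(c * c for c in unit)))  # approximately 1
```

Because of floating-point rounding, the magnitude of the result is only approximately 1, so it is best compared with `math.isclose` rather than `==`.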

There are specific unit vectors which we will often use, called the basis vectors:
i = (1, 0, 0), j = (0, 1, 0), and k = (0, 0, 1) in $\mathbb{R}^3$; i = (1, 0) and j = (0, 1) in $\mathbb{R}^2$.

These are useful for several reasons: they are mutually perpendicular, since they lie
on distinct coordinate axes; they are all unit vectors: $\lVert \mathbf{i} \rVert = \lVert \mathbf{j} \rVert = \lVert \mathbf{k} \rVert = 1$; every vector
can be written as a unique scalar combination of the basis vectors: v = (a, b) = a i + b j
in $\mathbb{R}^2$, v = (a, b, c) = a i + b j + c k in $\mathbb{R}^3$. See Figure 1.2.7.

[Figure 1.2.7: Basis vectors in different dimensions. Panels: (a) i, j in ℝ²; (b) v = a i + b j; (c) i, j, k in ℝ³; (d) v = a i + b j + c k]

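When a vector $\mathbf{v} = (a, b, c)$ is written as $\mathbf{v} = a\,\mathbf{i} + b\,\mathbf{j} + c\,\mathbf{k}$, we say that $\mathbf{v}$ is in component form, and that $a$, $b$, and $c$ are the $\mathbf{i}$, $\mathbf{j}$, and $\mathbf{k}$ components, respectively, of $\mathbf{v}$. We have:

\[
\begin{aligned}
\mathbf{v} = v_1\mathbf{i} + v_2\mathbf{j} + v_3\mathbf{k},\ k \text{ a scalar} &\implies k\mathbf{v} = kv_1\mathbf{i} + kv_2\mathbf{j} + kv_3\mathbf{k} \\
\mathbf{v} = v_1\mathbf{i} + v_2\mathbf{j} + v_3\mathbf{k},\ \mathbf{w} = w_1\mathbf{i} + w_2\mathbf{j} + w_3\mathbf{k} &\implies \mathbf{v} + \mathbf{w} = (v_1 + w_1)\mathbf{i} + (v_2 + w_2)\mathbf{j} + (v_3 + w_3)\mathbf{k} \\
\mathbf{v} = v_1\mathbf{i} + v_2\mathbf{j} + v_3\mathbf{k} &\implies \|\mathbf{v}\| = \sqrt{v_1^2 + v_2^2 + v_3^2}
\end{aligned}
\]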

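Example 1.4. Let $\mathbf{v} = (2, 1, -1)$ and $\mathbf{w} = (3, -4, 2)$ in $\mathbb{R}^3$.

(a) Find $\mathbf{v} - \mathbf{w}$.

Solution: $\mathbf{v} - \mathbf{w} = (2 - 3, 1 - (-4), -1 - 2) = (-1, 5, -3)$

(b) Find $3\mathbf{v} + 2\mathbf{w}$.

Solution: $3\mathbf{v} + 2\mathbf{w} = (6, 3, -3) + (6, -8, 4) = (12, -5, 1)$

(c) Write $\mathbf{v}$ and $\mathbf{w}$ in component form.

Solution: $\mathbf{v} = 2\,\mathbf{i} + \mathbf{j} - \mathbf{k}$, $\mathbf{w} = 3\,\mathbf{i} - 4\,\mathbf{j} + 2\,\mathbf{k}$

(d) Find the vector $\mathbf{u}$ such that $\mathbf{u} + \mathbf{v} = \mathbf{w}$.

Solution: By Theorem 1.5, $\mathbf{u} = \mathbf{w} - \mathbf{v} = -(\mathbf{v} - \mathbf{w}) = -(-1, 5, -3) = (1, -5, 3)$, by part (a).

(e) Find the vector $\mathbf{u}$ such that $\mathbf{u} + \mathbf{v} + \mathbf{w} = \mathbf{0}$.

Solution: By Theorem 1.5, $\mathbf{u} = -\mathbf{w} - \mathbf{v} = -(3, -4, 2) - (2, 1, -1) = (-5, 3, -1)$.

(f) Find the vector $\mathbf{u}$ such that $2\mathbf{u} + \mathbf{i} - 2\,\mathbf{j} = \mathbf{k}$.

Solution: $2\mathbf{u} = -\mathbf{i} + 2\,\mathbf{j} + \mathbf{k} \implies \mathbf{u} = -\frac{1}{2}\,\mathbf{i} + \mathbf{j} + \frac{1}{2}\,\mathbf{k}$

(g) Find the unit vector $\frac{\mathbf{v}}{\|\mathbf{v}\|}$.

Solution: $\frac{\mathbf{v}}{\|\mathbf{v}\|} = \frac{1}{\sqrt{2^2 + 1^2 + (-1)^2}}\,(2, 1, -1) = \left( \frac{2}{\sqrt{6}}, \frac{1}{\sqrt{6}}, \frac{-1}{\sqrt{6}} \right)$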
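The computations in Example 1.4 can be checked numerically. The following is a minimal Python sketch, assuming vectors are stored as plain tuples; the helper names `add`, `scale`, `sub`, `norm`, and `normalize` are illustrative choices, not notation from the text.

```python
import math

def add(v, w):
    """Componentwise sum v + w."""
    return tuple(a + b for a, b in zip(v, w))

def scale(k, v):
    """Scalar multiple k v."""
    return tuple(k * a for a in v)

def sub(v, w):
    """Difference v - w, defined as v + (-1)w."""
    return add(v, scale(-1, w))

def norm(v):
    """Magnitude of v: square root of the sum of squared components."""
    return math.sqrt(sum(a * a for a in v))

def normalize(v):
    """Unit vector v / ||v||; undefined for the zero vector."""
    n = norm(v)
    if n == 0:
        raise ValueError("cannot normalize the zero vector")
    return scale(1 / n, v)

v = (2, 1, -1)
w = (3, -4, 2)

print(sub(v, w))                        # part (a)
print(add(scale(3, v), scale(2, w)))    # part (b)
print(normalize(v))                     # part (g)
```

Parts (a) and (b) use only integer arithmetic, so they can be compared exactly with the solutions above; part (g) is a floating-point approximation of the exact answer.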


We can now easily prove Theorem 1.1 from the previous section. The distance $d$ between two points $P = (x_1, y_1, z_1)$ and $Q = (x_2, y_2, z_2)$ in $\mathbb{R}^3$ is the same as the length of the vector $\mathbf{w} - \mathbf{v}$, where the vectors $\mathbf{v}$ and $\mathbf{w}$ are defined as $\mathbf{v} = (x_1, y_1, z_1)$ and $\mathbf{w} = (x_2, y_2, z_2)$ (see Figure 1.2.8). So since $\mathbf{w} - \mathbf{v} = (x_2 - x_1, y_2 - y_1, z_2 - z_1)$, then

\[
d = \|\mathbf{w} - \mathbf{v}\| = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2}
\]

by Theorem 1.2.

[Figure 1.2.8: Proof of Theorem 1.2: d = ‖w − v‖]

Exercises

A

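1. Let $\mathbf{v} = (-1, 5, -2)$ and $\mathbf{w} = (3, 1, 1)$.

(a) Find $\mathbf{v} - \mathbf{w}$. (b) Find $\mathbf{v} + \mathbf{w}$. (c) Find $\frac{\mathbf{v}}{\|\mathbf{v}\|}$. (d) Find $\left\| \frac{1}{2}(\mathbf{v} - \mathbf{w}) \right\|$.

(e) Find $\left\| \frac{1}{2}(\mathbf{v} + \mathbf{w}) \right\|$. (f) Find $-2\,\mathbf{v} + 4\,\mathbf{w}$. (g) Find $\mathbf{v} - 2\,\mathbf{w}$.

(h) Find the vector $\mathbf{u}$ such that $\mathbf{u} + \mathbf{v} + \mathbf{w} = \mathbf{i}$.

(i) Find the vector $\mathbf{u}$ such that $\mathbf{u} + \mathbf{v} + \mathbf{w} = 2\,\mathbf{j} + \mathbf{k}$.

(j) Is there a scalar $m$ such that $m(\mathbf{v} + 2\,\mathbf{w}) = \mathbf{k}$? If so, find it.

2. For the vectors $\mathbf{v}$ and $\mathbf{w}$ from Exercise 1, is $\|\mathbf{v} - \mathbf{w}\| = \|\mathbf{v}\| - \|\mathbf{w}\|$? If not, which quantity is larger?

3. For the vectors $\mathbf{v}$ and $\mathbf{w}$ from Exercise 1, is $\|\mathbf{v} + \mathbf{w}\| = \|\mathbf{v}\| + \|\mathbf{w}\|$? If not, which quantity is larger?

B

4. Prove Theorem 1.5(f) for $\mathbb{R}^3$. 5. Prove Theorem 1.5(g) for $\mathbb{R}^3$.

C

6. We know that every vector in $\mathbb{R}^3$ can be written as a scalar combination of the vectors $\mathbf{i}$, $\mathbf{j}$, and $\mathbf{k}$. Can every vector in $\mathbb{R}^3$ be written as a scalar combination of just $\mathbf{i}$ and $\mathbf{j}$, i.e. for any vector $\mathbf{v}$ in $\mathbb{R}^3$, are there scalars $m$, $n$ such that $\mathbf{v} = m\,\mathbf{i} + n\,\mathbf{j}$? Justify your answer.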


1.3 Dot Product

You may have noticed that while we did define multiplication of a vector by a scalar in

the previous section on vector algebra, we did not define multiplication of a vector by

a vector. We will now see one type of multiplication of vectors, called the dot product.

Definition 1.6. Let $\mathbf{v} = (v_1, v_2, v_3)$ and $\mathbf{w} = (w_1, w_2, w_3)$ be vectors in $\mathbb{R}^3$. The dot product of $\mathbf{v}$ and $\mathbf{w}$, denoted by $\mathbf{v} \cdot \mathbf{w}$, is given by:

\[
\mathbf{v} \cdot \mathbf{w} = v_1 w_1 + v_2 w_2 + v_3 w_3 \tag{1.6}
\]

Similarly, for vectors $\mathbf{v} = (v_1, v_2)$ and $\mathbf{w} = (w_1, w_2)$ in $\mathbb{R}^2$, the dot product is:

\[
\mathbf{v} \cdot \mathbf{w} = v_1 w_1 + v_2 w_2 \tag{1.7}
\]

Notice that the dot product of two vectors is a scalar, not a vector. So the associative law that holds for multiplication of numbers and for addition of vectors (see Theorem 1.5(b),(e)), does not hold for the dot product of vectors. Why? Because for vectors $\mathbf{u}$, $\mathbf{v}$, $\mathbf{w}$, the dot product $\mathbf{u} \cdot \mathbf{v}$ is a scalar, and so $(\mathbf{u} \cdot \mathbf{v}) \cdot \mathbf{w}$ is not defined since the left side of that dot product (the part in parentheses) is a scalar and not a vector.

For vectors $\mathbf{v} = v_1\mathbf{i} + v_2\mathbf{j} + v_3\mathbf{k}$ and $\mathbf{w} = w_1\mathbf{i} + w_2\mathbf{j} + w_3\mathbf{k}$ in component form, the dot product is still $\mathbf{v} \cdot \mathbf{w} = v_1 w_1 + v_2 w_2 + v_3 w_3$.

Also notice that we defined the dot product in an analytic way, i.e. by referencing

vector coordinates. There is a geometric way of defining the dot product, which we

will now develop as a consequence of the analytic definition.

Definition 1.7. The angle between two nonzero vectors with the same initial point

is the smallest angle between them.

We do not define the angle between the zero vector and any other vector. Any two nonzero vectors with the same initial point have two angles between them: $\theta$ and $360^\circ - \theta$. We will always choose the smallest nonnegative angle $\theta$ between them, so that $0^\circ \le \theta \le 180^\circ$. See Figure 1.3.1.

[Figure 1.3.1: Angle between vectors. Panels: (a) 0° < θ < 180°; (b) θ = 180°; (c) θ = 0°]

We can now take a more geometric view of the dot product by establishing a rela-

tionship between the dot product of two vectors and the angle between them.


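Theorem 1.6. Let $\mathbf{v}$, $\mathbf{w}$ be nonzero vectors, and let $\theta$ be the angle between them. Then

\[
\cos\theta = \frac{\mathbf{v} \cdot \mathbf{w}}{\|\mathbf{v}\|\,\|\mathbf{w}\|} \tag{1.8}
\]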

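Proof: We will prove the theorem for vectors in $\mathbb{R}^3$ (the proof for $\mathbb{R}^2$ is similar). Let $\mathbf{v} = (v_1, v_2, v_3)$ and $\mathbf{w} = (w_1, w_2, w_3)$. By the Law of Cosines (see Figure 1.3.2), we have

\[
\|\mathbf{v} - \mathbf{w}\|^2 = \|\mathbf{v}\|^2 + \|\mathbf{w}\|^2 - 2\,\|\mathbf{v}\|\,\|\mathbf{w}\| \cos\theta \tag{1.9}
\]

(note that equation (1.9) holds even for the "degenerate" cases $\theta = 0^\circ$ and $180^\circ$).

[Figure 1.3.2: vectors v, w, and v − w, with angle θ between v and w]

Since $\mathbf{v} - \mathbf{w} = (v_1 - w_1, v_2 - w_2, v_3 - w_3)$, expanding $\|\mathbf{v} - \mathbf{w}\|^2$ in equation (1.9) gives

\[
\begin{aligned}
\|\mathbf{v}\|^2 + \|\mathbf{w}\|^2 - 2\,\|\mathbf{v}\|\,\|\mathbf{w}\| \cos\theta &= (v_1 - w_1)^2 + (v_2 - w_2)^2 + (v_3 - w_3)^2 \\
&= (v_1^2 - 2v_1w_1 + w_1^2) + (v_2^2 - 2v_2w_2 + w_2^2) + (v_3^2 - 2v_3w_3 + w_3^2) \\
&= (v_1^2 + v_2^2 + v_3^2) + (w_1^2 + w_2^2 + w_3^2) - 2(v_1w_1 + v_2w_2 + v_3w_3) \\
&= \|\mathbf{v}\|^2 + \|\mathbf{w}\|^2 - 2(\mathbf{v} \cdot \mathbf{w}) \text{ , so}
\end{aligned}
\]

\[
-2\,\|\mathbf{v}\|\,\|\mathbf{w}\| \cos\theta = -2(\mathbf{v} \cdot \mathbf{w}) \text{ , so since } \mathbf{v} \ne \mathbf{0} \text{ and } \mathbf{w} \ne \mathbf{0} \text{ then}
\]

\[
\cos\theta = \frac{\mathbf{v} \cdot \mathbf{w}}{\|\mathbf{v}\|\,\|\mathbf{w}\|} \text{ , since } \|\mathbf{v}\| > 0 \text{ and } \|\mathbf{w}\| > 0. \quad \text{QED}
\]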

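Example 1.5. Find the angle $\theta$ between the vectors $\mathbf{v} = (2, 1, -1)$ and $\mathbf{w} = (3, -4, 1)$.

Solution: Since $\mathbf{v} \cdot \mathbf{w} = (2)(3) + (1)(-4) + (-1)(1) = 1$, $\|\mathbf{v}\| = \sqrt{6}$, and $\|\mathbf{w}\| = \sqrt{26}$, then

\[
\cos\theta = \frac{\mathbf{v} \cdot \mathbf{w}}{\|\mathbf{v}\|\,\|\mathbf{w}\|} = \frac{1}{\sqrt{6}\,\sqrt{26}} = \frac{1}{2\sqrt{39}} \approx 0.08 \implies \theta = 85.41^\circ
\]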
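Formula (1.8) translates directly into a short computation. The sketch below is one possible Python rendering, assuming tuple vectors; the function name `angle_degrees` and the clamping step (which only guards against floating-point rounding pushing the cosine slightly outside [−1, 1]) are additions for illustration.

```python
import math

def dot(v, w):
    """Dot product: sum of products of corresponding components."""
    return sum(a * b for a, b in zip(v, w))

def norm(v):
    """Magnitude of v."""
    return math.sqrt(dot(v, v))

def angle_degrees(v, w):
    """Angle between nonzero vectors v and w, in degrees, via formula (1.8)."""
    nv, nw = norm(v), norm(w)
    if nv == 0 or nw == 0:
        raise ValueError("the angle is not defined for the zero vector")
    c = dot(v, w) / (nv * nw)
    c = max(-1.0, min(1.0, c))  # guard against rounding error only
    return math.degrees(math.acos(c))

# Vectors from Example 1.5
print(round(angle_degrees((2, 1, -1), (3, -4, 1)), 2))
```

Because `math.acos` returns a value in [0, π], the result automatically lies in the range 0° to 180° required by Definition 1.7.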

Two nonzero vectors are perpendicular if the angle between them is $90^\circ$. Since $\cos 90^\circ = 0$, we have the following important corollary to Theorem 1.6:

Corollary 1.7. Two nonzero vectors $\mathbf{v}$ and $\mathbf{w}$ are perpendicular if and only if $\mathbf{v} \cdot \mathbf{w} = 0$.

We will write $\mathbf{v} \perp \mathbf{w}$ to indicate that $\mathbf{v}$ and $\mathbf{w}$ are perpendicular.

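Since $\cos\theta > 0$ for $0^\circ \le \theta < 90^\circ$ and $\cos\theta < 0$ for $90^\circ < \theta \le 180^\circ$, we also have:

Corollary 1.8. If $\theta$ is the angle between nonzero vectors $\mathbf{v}$ and $\mathbf{w}$, then

\[
\mathbf{v} \cdot \mathbf{w} \text{ is }
\begin{cases}
> 0 & \text{for } 0^\circ \le \theta < 90^\circ \\
0 & \text{for } \theta = 90^\circ \\
< 0 & \text{for } 90^\circ < \theta \le 180^\circ
\end{cases}
\]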

By Corollary 1.8, the dot product can be thought of as a way of telling if the angle

between two vectors is acute, obtuse, or a right angle, depending on whether the dot

product is positive, negative, or zero, respectively. See Figure 1.3.3.

[Figure 1.3.3: Sign of the dot product & angle between vectors. Panels: (a) v · w > 0, 0° ≤ θ < 90°; (b) v · w < 0, 90° < θ ≤ 180°; (c) v · w = 0, θ = 90°]

Example 1.6. Are the vectors $\mathbf{v} = (-1, 5, -2)$ and $\mathbf{w} = (3, 1, 1)$ perpendicular?

Solution: Yes, $\mathbf{v} \perp \mathbf{w}$ since $\mathbf{v} \cdot \mathbf{w} = (-1)(3) + (5)(1) + (-2)(1) = 0$.
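Corollary 1.8 means the kind of angle can be read off from the sign of the dot product alone, with no square roots or inverse cosines. A small Python sketch of that test follows; the function name `classify_angle` and its string labels are hypothetical, and "acute" here covers the whole range 0° ≤ θ < 90° while "obtuse" covers 90° < θ ≤ 180°, matching the corollary.

```python
def dot(v, w):
    """Dot product of two vectors given as tuples."""
    return sum(a * b for a, b in zip(v, w))

def classify_angle(v, w):
    """Classify the angle between nonzero vectors by the sign of v . w."""
    if not any(v) or not any(w):
        raise ValueError("the angle is not defined for the zero vector")
    d = dot(v, w)
    if d > 0:
        return "acute"    # 0 <= theta < 90 degrees
    if d < 0:
        return "obtuse"   # 90 < theta <= 180 degrees
    return "right"        # theta = 90 degrees, i.e. v is perpendicular to w

# Vectors from Example 1.6
print(classify_angle((-1, 5, -2), (3, 1, 1)))
```

With integer components the test is exact; with floating-point components a dot product that should be zero may come out as a tiny nonzero number, so a tolerance would be needed in that setting.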

The following theorem summarizes the basic properties of the dot product.

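Theorem 1.9. For any vectors $\mathbf{u}$, $\mathbf{v}$, $\mathbf{w}$, and scalar $k$, we have

(a) $\mathbf{v} \cdot \mathbf{w} = \mathbf{w} \cdot \mathbf{v}$ Commutative Law

(b) $(k\mathbf{v}) \cdot \mathbf{w} = \mathbf{v} \cdot (k\mathbf{w}) = k(\mathbf{v} \cdot \mathbf{w})$ Associative Law

(c) $\mathbf{v} \cdot \mathbf{0} = 0 = \mathbf{0} \cdot \mathbf{v}$

(d) $\mathbf{u} \cdot (\mathbf{v} + \mathbf{w}) = \mathbf{u} \cdot \mathbf{v} + \mathbf{u} \cdot \mathbf{w}$ Distributive Law

(e) $(\mathbf{u} + \mathbf{v}) \cdot \mathbf{w} = \mathbf{u} \cdot \mathbf{w} + \mathbf{v} \cdot \mathbf{w}$ Distributive Law

(f) $|\mathbf{v} \cdot \mathbf{w}| \le \|\mathbf{v}\|\,\|\mathbf{w}\|$ Cauchy-Schwarz Inequality⁵

Proof: The proofs of parts (a)-(e) are straightforward applications of the definition of the dot product, and are left to the reader as exercises. We will prove part (f).

(f) If either $\mathbf{v} = \mathbf{0}$ or $\mathbf{w} = \mathbf{0}$, then $\mathbf{v} \cdot \mathbf{w} = 0$ by part (c), and so the inequality holds trivially. So assume that $\mathbf{v}$ and $\mathbf{w}$ are nonzero vectors. Then by Theorem 1.6,

\[
\begin{aligned}
\mathbf{v} \cdot \mathbf{w} &= \cos\theta\,\|\mathbf{v}\|\,\|\mathbf{w}\| \text{ , so} \\
|\mathbf{v} \cdot \mathbf{w}| &= |\cos\theta|\,\|\mathbf{v}\|\,\|\mathbf{w}\| \text{ , so} \\
|\mathbf{v} \cdot \mathbf{w}| &\le \|\mathbf{v}\|\,\|\mathbf{w}\| \text{ since } |\cos\theta| \le 1. \quad \text{QED}
\end{aligned}
\]

⁵ Also known as the Cauchy-Schwarz-Buniakovski Inequality.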

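Using Theorem 1.9, we see that if $\mathbf{u} \cdot \mathbf{v} = 0$ and $\mathbf{u} \cdot \mathbf{w} = 0$, then $\mathbf{u} \cdot (k\mathbf{v} + l\mathbf{w}) = k(\mathbf{u} \cdot \mathbf{v}) + l(\mathbf{u} \cdot \mathbf{w}) = k(0) + l(0) = 0$ for all scalars $k$, $l$. Thus, we have the following fact:

If $\mathbf{u} \perp \mathbf{v}$ and $\mathbf{u} \perp \mathbf{w}$, then $\mathbf{u} \perp (k\mathbf{v} + l\mathbf{w})$ for all scalars $k$, $l$.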

For vectors v and w, the collection of all scalar combinations kv + lw is called the

span of v and w. If nonzero vectors v and w are parallel, then their span is a line; if

they are not parallel, then their span is a plane. So what we showed above is that a

vector which is perpendicular to two other vectors is also perpendicular to their span.

The dot product can be used to derive properties of the magnitudes of vectors, the

most important of which is the Triangle Inequality, as given in the following theorem:

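Theorem 1.10. For any vectors $\mathbf{v}$, $\mathbf{w}$, we have

(a) $\|\mathbf{v}\|^2 = \mathbf{v} \cdot \mathbf{v}$

(b) $\|\mathbf{v} + \mathbf{w}\| \le \|\mathbf{v}\| + \|\mathbf{w}\|$ Triangle Inequality

(c) $\|\mathbf{v} - \mathbf{w}\| \ge \|\mathbf{v}\| - \|\mathbf{w}\|$

Proof: (a) Left as an exercise for the reader.

(b) By part (a) and Theorem 1.9, we have

\[
\begin{aligned}
\|\mathbf{v} + \mathbf{w}\|^2 &= (\mathbf{v} + \mathbf{w}) \cdot (\mathbf{v} + \mathbf{w}) = \mathbf{v} \cdot \mathbf{v} + \mathbf{v} \cdot \mathbf{w} + \mathbf{w} \cdot \mathbf{v} + \mathbf{w} \cdot \mathbf{w} \\
&= \|\mathbf{v}\|^2 + 2(\mathbf{v} \cdot \mathbf{w}) + \|\mathbf{w}\|^2 \text{ , so since } a \le |a| \text{ for any real number } a \text{, we have} \\
&\le \|\mathbf{v}\|^2 + 2\,|\mathbf{v} \cdot \mathbf{w}| + \|\mathbf{w}\|^2 \text{ , so by Theorem 1.9(f) we have} \\
&\le \|\mathbf{v}\|^2 + 2\,\|\mathbf{v}\|\,\|\mathbf{w}\| + \|\mathbf{w}\|^2 = (\|\mathbf{v}\| + \|\mathbf{w}\|)^2 \text{ and so}
\end{aligned}
\]

$\|\mathbf{v} + \mathbf{w}\| \le \|\mathbf{v}\| + \|\mathbf{w}\|$ after taking square roots of both sides, which proves (b).

(c) Since $\mathbf{v} = \mathbf{w} + (\mathbf{v} - \mathbf{w})$, then $\|\mathbf{v}\| = \|\mathbf{w} + (\mathbf{v} - \mathbf{w})\| \le \|\mathbf{w}\| + \|\mathbf{v} - \mathbf{w}\|$ by the Triangle Inequality, so subtracting $\|\mathbf{w}\|$ from both sides gives $\|\mathbf{v}\| - \|\mathbf{w}\| \le \|\mathbf{v} - \mathbf{w}\|$. QED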
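A numerical spot check is not a proof, but it can make the Cauchy-Schwarz and Triangle Inequalities concrete. The sketch below, in Python with hypothetical helper names, evaluates both sides of each inequality; the small tolerance `tol` is an assumption added so that the equality case (parallel vectors) is not rejected because of floating-point rounding.

```python
import math

def dot(v, w):
    return sum(a * b for a, b in zip(v, w))

def norm(v):
    return math.sqrt(dot(v, v))

def add(v, w):
    return tuple(a + b for a, b in zip(v, w))

def cauchy_schwarz_holds(v, w, tol=1e-12):
    """Check |v . w| <= ||v|| ||w|| up to a rounding tolerance."""
    return abs(dot(v, w)) <= norm(v) * norm(w) + tol

def triangle_inequality_holds(v, w, tol=1e-12):
    """Check ||v + w|| <= ||v|| + ||w|| up to a rounding tolerance."""
    return norm(add(v, w)) <= norm(v) + norm(w) + tol

# A perpendicular pair and a parallel pair (where equality is attained)
for v, w in [((2, 1, 4), (1, -2, 0)), ((4, 2, -1), (8, 4, -2))]:
    print(cauchy_schwarz_holds(v, w), triangle_inequality_holds(v, w))
```

For the parallel pair both inequalities hold with equality, which is the boundary case $|\cos\theta| = 1$ in the proof of Theorem 1.9(f).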

[Figure 1.3.4: triangle with sides v, w, and v + w]

The Triangle Inequality gets its name from the fact that in any triangle, no one side is longer than the sum of the lengths of the other two sides (see Figure 1.3.4). Another way of saying this is with the familiar statement "the shortest distance between two points is a straight line."

Exercises

A

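1. Let $\mathbf{v} = (5, 1, -2)$ and $\mathbf{w} = (4, -4, 3)$. Calculate $\mathbf{v} \cdot \mathbf{w}$.

2. Let $\mathbf{v} = -3\,\mathbf{i} - 2\,\mathbf{j} - \mathbf{k}$ and $\mathbf{w} = 6\,\mathbf{i} + 4\,\mathbf{j} + 2\,\mathbf{k}$. Calculate $\mathbf{v} \cdot \mathbf{w}$.

For Exercises 3-8, find the angle $\theta$ between the vectors $\mathbf{v}$ and $\mathbf{w}$.

3. $\mathbf{v} = (5, 1, -2)$, $\mathbf{w} = (4, -4, 3)$ 4. $\mathbf{v} = (7, 2, -10)$, $\mathbf{w} = (2, 6, 4)$

5. $\mathbf{v} = (2, 1, 4)$, $\mathbf{w} = (1, -2, 0)$ 6. $\mathbf{v} = (4, 2, -1)$, $\mathbf{w} = (8, 4, -2)$

7. $\mathbf{v} = -\mathbf{i} + 2\,\mathbf{j} + \mathbf{k}$, $\mathbf{w} = -3\,\mathbf{i} + 6\,\mathbf{j} + 3\,\mathbf{k}$ 8. $\mathbf{v} = \mathbf{i}$, $\mathbf{w} = 3\,\mathbf{i} + 2\,\mathbf{j} + 4\,\mathbf{k}$

9. Let $\mathbf{v} = (8, 4, 3)$ and $\mathbf{w} = (-2, 1, 4)$. Is $\mathbf{v} \perp \mathbf{w}$? Justify your answer.

10. Let $\mathbf{v} = (6, 0, 4)$ and $\mathbf{w} = (0, 2, -1)$. Is $\mathbf{v} \perp \mathbf{w}$? Justify your answer.

11. For $\mathbf{v}$, $\mathbf{w}$ from Exercise 5, verify the Cauchy-Schwarz Inequality $|\mathbf{v} \cdot \mathbf{w}| \le \|\mathbf{v}\|\,\|\mathbf{w}\|$.

12. For $\mathbf{v}$, $\mathbf{w}$ from Exercise 6, verify the Cauchy-Schwarz Inequality $|\mathbf{v} \cdot \mathbf{w}| \le \|\mathbf{v}\|\,\|\mathbf{w}\|$.

13. For $\mathbf{v}$, $\mathbf{w}$ from Exercise 5, verify the Triangle Inequality $\|\mathbf{v} + \mathbf{w}\| \le \|\mathbf{v}\| + \|\mathbf{w}\|$.

14. For $\mathbf{v}$, $\mathbf{w}$ from Exercise 6, verify the Triangle Inequality $\|\mathbf{v} + \mathbf{w}\| \le \|\mathbf{v}\| + \|\mathbf{w}\|$.

B

Note: Consider only vectors in $\mathbb{R}^3$ for Exercises 15-25.

15. Prove Theorem 1.9(a). 16. Prove Theorem 1.9(b).

17. Prove Theorem 1.9(c). 18. Prove Theorem 1.9(d).

19. Prove Theorem 1.9(e). 20. Prove Theorem 1.10(a).

21. Prove or give a counterexample: If $\mathbf{u} \cdot \mathbf{v} = \mathbf{u} \cdot \mathbf{w}$, then $\mathbf{v} = \mathbf{w}$.

C

22. Prove or give a counterexample: If $\mathbf{v} \cdot \mathbf{w} = 0$ for all $\mathbf{v}$, then $\mathbf{w} = \mathbf{0}$.

23. Prove or give a counterexample: If $\mathbf{u} \cdot \mathbf{v} = \mathbf{u} \cdot \mathbf{w}$ for all $\mathbf{u}$, then $\mathbf{v} = \mathbf{w}$.

24. Prove that $\big|\, \|\mathbf{v}\| - \|\mathbf{w}\| \,\big| \le \|\mathbf{v} - \mathbf{w}\|$ for all $\mathbf{v}$, $\mathbf{w}$.

[Figure 1.3.5: projection u of v onto w along the line L]

25. For nonzero vectors $\mathbf{v}$ and $\mathbf{w}$, the projection of $\mathbf{v}$ onto $\mathbf{w}$ (sometimes written as $\mathrm{proj}_{\mathbf{w}}\mathbf{v}$) is the vector $\mathbf{u}$ along the same line $L$ as $\mathbf{w}$ whose terminal point is obtained by dropping a perpendicular line from the terminal point of $\mathbf{v}$ to $L$ (see Figure 1.3.5). Show that

\[
\|\mathbf{u}\| = \frac{|\mathbf{v} \cdot \mathbf{w}|}{\|\mathbf{w}\|}.
\]

(Hint: Consider the angle between $\mathbf{v}$ and $\mathbf{w}$.)

26. Let $\alpha$, $\beta$, and $\gamma$ be the angles between a nonzero vector $\mathbf{v}$ in $\mathbb{R}^3$ and the vectors $\mathbf{i}$, $\mathbf{j}$, and $\mathbf{k}$, respectively. Show that $\cos^2\alpha + \cos^2\beta + \cos^2\gamma = 1$.

(Note: $\alpha$, $\beta$, $\gamma$ are often called the direction angles of $\mathbf{v}$, and $\cos\alpha$, $\cos\beta$, $\cos\gamma$ are called the direction cosines.)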


1.4 Cross Product

In Section 1.3 we defined the dot product, which gave a way of multiplying two vectors. The resulting product, however, was a scalar, not a vector. In this section we will define a product of two vectors that does result in another vector. This product, called the cross product, is only defined for vectors in $\mathbb{R}^3$. The definition may appear strange and lacking motivation, but we will see the geometric basis for it shortly.

Definition 1.8. Let $\mathbf{v} = (v_1, v_2, v_3)$ and $\mathbf{w} = (w_1, w_2, w_3)$ be vectors in $\mathbb{R}^3$. The cross product of $\mathbf{v}$ and $\mathbf{w}$, denoted by $\mathbf{v} \times \mathbf{w}$, is the vector in $\mathbb{R}^3$ given by:

\[
\mathbf{v} \times \mathbf{w} = (v_2 w_3 - v_3 w_2,\ v_3 w_1 - v_1 w_3,\ v_1 w_2 - v_2 w_1) \tag{1.10}
\]

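[Figure 1.4.1: the basis vectors i, j, and k = i × j]

Example 1.7. Find $\mathbf{i} \times \mathbf{j}$.

Solution: Since $\mathbf{i} = (1, 0, 0)$ and $\mathbf{j} = (0, 1, 0)$, then

\[
\begin{aligned}
\mathbf{i} \times \mathbf{j} &= ((0)(0) - (0)(1),\ (0)(0) - (1)(0),\ (1)(1) - (0)(0)) \\
&= (0, 0, 1) \\
&= \mathbf{k}
\end{aligned}
\]

Similarly it can be shown that $\mathbf{j} \times \mathbf{k} = \mathbf{i}$ and $\mathbf{k} \times \mathbf{i} = \mathbf{j}$.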
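Formula (1.10) can be transcribed component by component. The following Python sketch, with illustrative function names `cross` and `dot` and tuple vectors as an assumption, reproduces Example 1.7 and the two products stated right after it.

```python
def cross(v, w):
    """Cross product of two vectors in R^3, following formula (1.10)."""
    v1, v2, v3 = v
    w1, w2, w3 = w
    return (v2 * w3 - v3 * w2,
            v3 * w1 - v1 * w3,
            v1 * w2 - v2 * w1)

def dot(v, w):
    return sum(a * b for a, b in zip(v, w))

i, j, k = (1, 0, 0), (0, 1, 0), (0, 0, 1)

print(cross(i, j))  # expected to equal k
print(cross(j, k))  # expected to equal i
print(cross(k, i))  # expected to equal j

# The product is perpendicular to both factors (dot products vanish).
v, w = (1, 3, 25), (-7, 8, 15)
print(dot(cross(v, w), v), dot(cross(v, w), w))
```

Tuple unpacking makes the function raise an error for vectors that do not have exactly three components, which matches the fact that the cross product is only defined in $\mathbb{R}^3$.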

In the above example, the cross product of the given vectors was perpendicular to

both those vectors. It turns out that this will always be the case.

Theorem 1.11. If the cross product $\mathbf{v} \times \mathbf{w}$ of two nonzero vectors $\mathbf{v}$ and $\mathbf{w}$ is also a nonzero vector, then it is perpendicular to both $\mathbf{v}$ and $\mathbf{w}$.

Proof: We will show that $(\mathbf{v} \times \mathbf{w}) \cdot \mathbf{v} = 0$:

\[
\begin{aligned}
(\mathbf{v} \times \mathbf{w}) \cdot \mathbf{v} &= (v_2 w_3 - v_3 w_2,\ v_3 w_1 - v_1 w_3,\ v_1 w_2 - v_2 w_1) \cdot (v_1, v_2, v_3) \\
&= v_2 w_3 v_1 - v_3 w_2 v_1 + v_3 w_1 v_2 - v_1 w_3 v_2 + v_1 w_2 v_3 - v_2 w_1 v_3 \\
&= v_1 v_2 w_3 - v_1 v_2 w_3 + w_1 v_2 v_3 - w_1 v_2 v_3 + v_1 w_2 v_3 - v_1 w_2 v_3 \\
&= 0 \text{ , after rearranging the terms.}
\end{aligned}
\]

$\therefore\ \mathbf{v} \times \mathbf{w} \perp \mathbf{v}$ by Corollary 1.7.

The proof that $\mathbf{v} \times \mathbf{w} \perp \mathbf{w}$ is similar. QED

As a consequence of the above theorem and Theorem 1.9, we have the following:

Corollary 1.12. If the cross product $\mathbf{v} \times \mathbf{w}$ of two nonzero vectors $\mathbf{v}$ and $\mathbf{w}$ is also a nonzero vector, then it is perpendicular to the span of $\mathbf{v}$ and $\mathbf{w}$.

The span of any two nonzero, nonparallel vectors $\mathbf{v}$, $\mathbf{w}$ in $\mathbb{R}^3$ is a plane $P$, so the above corollary shows that $\mathbf{v} \times \mathbf{w}$ is perpendicular to that plane. As shown in Figure 1.4.2, there are two possible directions for $\mathbf{v} \times \mathbf{w}$, one the opposite of the other. It turns out (see Appendix B) that the direction of $\mathbf{v} \times \mathbf{w}$ is given by the right-hand rule, that is, the vectors $\mathbf{v}$, $\mathbf{w}$, $\mathbf{v} \times \mathbf{w}$ form a right-handed system. Recall from Section 1.1 that this means that you can point your thumb upwards in the direction of $\mathbf{v} \times \mathbf{w}$ while rotating $\mathbf{v}$ towards $\mathbf{w}$ with the remaining four fingers.

[Figure 1.4.2: Direction of v × w, showing v, w, the plane P, v × w, and −v × w]

We will now derive a formula for the magnitude of $\mathbf{v} \times \mathbf{w}$, for nonzero vectors $\mathbf{v}$, $\mathbf{w}$:

\[
\begin{aligned}
\|\mathbf{v} \times \mathbf{w}\|^2 &= (v_2 w_3 - v_3 w_2)^2 + (v_3 w_1 - v_1 w_3)^2 + (v_1 w_2 - v_2 w_1)^2 \\
&= v_2^2 w_3^2 - 2 v_2 w_2 v_3 w_3 + v_3^2 w_2^2 + v_3^2 w_1^2 - 2 v_1 w_1 v_3 w_3 + v_1^2 w_3^2 + v_1^2 w_2^2 - 2 v_1 w_1 v_2 w_2 + v_2^2 w_1^2 \\
&= v_1^2 (w_2^2 + w_3^2) + v_2^2 (w_1^2 + w_3^2) + v_3^2 (w_1^2 + w_2^2) - 2(v_1 w_1 v_2 w_2 + v_1 w_1 v_3 w_3 + v_2 w_2 v_3 w_3)
\end{aligned}
\]

and now adding and subtracting $v_1^2 w_1^2$, $v_2^2 w_2^2$, and $v_3^2 w_3^2$ on the right side gives

\[
\begin{aligned}
&= v_1^2 (w_1^2 + w_2^2 + w_3^2) + v_2^2 (w_1^2 + w_2^2 + w_3^2) + v_3^2 (w_1^2 + w_2^2 + w_3^2) \\
&\quad - \left( v_1^2 w_1^2 + v_2^2 w_2^2 + v_3^2 w_3^2 + 2(v_1 w_1 v_2 w_2 + v_1 w_1 v_3 w_3 + v_2 w_2 v_3 w_3) \right) \\
&= (v_1^2 + v_2^2 + v_3^2)(w_1^2 + w_2^2 + w_3^2) \\
&\quad - \left( (v_1 w_1)^2 + (v_2 w_2)^2 + (v_3 w_3)^2 + 2(v_1 w_1)(v_2 w_2) + 2(v_1 w_1)(v_3 w_3) + 2(v_2 w_2)(v_3 w_3) \right)
\end{aligned}
\]

so using $(a + b + c)^2 = a^2 + b^2 + c^2 + 2ab + 2ac + 2bc$ for the subtracted term gives

\[
\begin{aligned}
&= (v_1^2 + v_2^2 + v_3^2)(w_1^2 + w_2^2 + w_3^2) - (v_1 w_1 + v_2 w_2 + v_3 w_3)^2 \\
&= \|\mathbf{v}\|^2 \|\mathbf{w}\|^2 - (\mathbf{v} \cdot \mathbf{w})^2 \\
&= \|\mathbf{v}\|^2 \|\mathbf{w}\|^2 \left( 1 - \frac{(\mathbf{v} \cdot \mathbf{w})^2}{\|\mathbf{v}\|^2 \|\mathbf{w}\|^2} \right) \text{ , since } \|\mathbf{v}\| > 0 \text{ and } \|\mathbf{w}\| > 0 \text{, so by Theorem 1.6} \\
&= \|\mathbf{v}\|^2 \|\mathbf{w}\|^2 (1 - \cos^2\theta) \text{ , where } \theta \text{ is the angle between } \mathbf{v} \text{ and } \mathbf{w} \text{, so}
\end{aligned}
\]

\[
\|\mathbf{v} \times \mathbf{w}\|^2 = \|\mathbf{v}\|^2 \|\mathbf{w}\|^2 \sin^2\theta \text{ , and since } 0^\circ \le \theta \le 180^\circ \text{, then } \sin\theta \ge 0 \text{, so we have:}
\]

If $\theta$ is the angle between nonzero vectors $\mathbf{v}$ and $\mathbf{w}$ in $\mathbb{R}^3$, then

\[
\|\mathbf{v} \times \mathbf{w}\| = \|\mathbf{v}\|\,\|\mathbf{w}\| \sin\theta \tag{1.11}
\]
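The key intermediate step of the derivation, $\|\mathbf{v} \times \mathbf{w}\|^2 = \|\mathbf{v}\|^2 \|\mathbf{w}\|^2 - (\mathbf{v} \cdot \mathbf{w})^2$, involves only sums and products, so with integer components it can be checked exactly. The Python sketch below does that and then compares both sides of formula (1.11) in floating point; the function names and the sample vectors are illustrative assumptions.

```python
import math

def dot(v, w):
    return sum(a * b for a, b in zip(v, w))

def cross(v, w):
    v1, v2, v3 = v
    w1, w2, w3 = w
    return (v2 * w3 - v3 * w2, v3 * w1 - v1 * w3, v1 * w2 - v2 * w1)

def squared_magnitude_sides(v, w):
    """Return (||v x w||^2, ||v||^2 ||w||^2 - (v . w)^2) for comparison."""
    c = cross(v, w)
    lhs = dot(c, c)
    rhs = dot(v, v) * dot(w, w) - dot(v, w) ** 2
    return lhs, rhs

def magnitude_sides(v, w):
    """Return (||v x w||, ||v|| ||w|| sin(theta)) for nonzero v and w."""
    nv, nw = math.sqrt(dot(v, v)), math.sqrt(dot(w, w))
    cos_theta = dot(v, w) / (nv * nw)
    sin_theta = math.sqrt(max(0.0, 1.0 - cos_theta ** 2))  # sin >= 0 on [0, 180] degrees
    c = cross(v, w)
    return math.sqrt(dot(c, c)), nv * nw * sin_theta

v, w = (1, 3, 25), (-7, 8, 15)
print(squared_magnitude_sides(v, w))
print(magnitude_sides(v, w))
```

The first pair should agree exactly; the second pair agrees only up to floating-point rounding.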

It may seem strange to bother with the above formula, when the magnitude of the

cross product can be calculated directly, like for any other vector. The formula is more

useful for its applications in geometry, as in the following example.

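Example 1.8. Let $\triangle PQR$ and $PQRS$ be a triangle and parallelogram, respectively, as shown in Figure 1.4.3.

[Figure 1.4.3: triangle △PQR and parallelogram PQRS, with base b, height h, angle θ, and sides v, w]

Think of the triangle as existing in $\mathbb{R}^3$, and identify the sides $QR$ and $QP$ with vectors $\mathbf{v}$ and $\mathbf{w}$, respectively, in $\mathbb{R}^3$. Let $\theta$ be the angle between $\mathbf{v}$ and $\mathbf{w}$. The area $A_{PQR}$ of $\triangle PQR$ is $\frac{1}{2}bh$, where $b$ is the base of the triangle and $h$ is the height. So we see that

\[
b = \|\mathbf{v}\| \quad \text{and} \quad h = \|\mathbf{w}\| \sin\theta
\]

\[
\begin{aligned}
A_{PQR} &= \tfrac{1}{2}\,\|\mathbf{v}\|\,\|\mathbf{w}\| \sin\theta \\
&= \tfrac{1}{2}\,\|\mathbf{v} \times \mathbf{w}\|
\end{aligned}
\]

So since the area $A_{PQRS}$ of the parallelogram $PQRS$ is twice the area of the triangle $\triangle PQR$, then

\[
A_{PQRS} = \|\mathbf{v}\|\,\|\mathbf{w}\| \sin\theta
\]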

By the discussion in Example 1.8, we have proved the following theorem:

Theorem 1.13. Area of triangles and parallelograms

(a) The area $A$ of a triangle with adjacent sides $\mathbf{v}$, $\mathbf{w}$ (as vectors in $\mathbb{R}^3$) is:

\[
A = \tfrac{1}{2}\,\|\mathbf{v} \times \mathbf{w}\|
\]

(b) The area $A$ of a parallelogram with adjacent sides $\mathbf{v}$, $\mathbf{w}$ (as vectors in $\mathbb{R}^3$) is:

\[
A = \|\mathbf{v} \times \mathbf{w}\|
\]

It may seem at first glance that since the formulas derived in Example 1.8 were

for the adjacent sides QP and QR only, then the more general statements in Theorem

1.13 that the formulas hold for any adjacent sides are not justified. We would get a

different formula for the area if we had picked PQ and PR as the adjacent sides, but it

can be shown (see Exercise 26) that the different formulas would yield the same value,

so the choice of adjacent sides indeed does not matter, and Theorem 1.13 is valid.

Theorem 1.13 makes it simpler to calculate the area of a triangle in 3-dimensional

space than by using traditional geometric methods.

Example 1.9. Calculate the area of the triangle $\triangle PQR$, where $P = (2, 4, -7)$, $Q = (3, 7, 18)$, and $R = (-5, 12, 8)$.

[Figure 1.4.4: triangle △PQR with v = PQ and w = PR]

Solution: Let $\mathbf{v} = \overrightarrow{PQ}$ and $\mathbf{w} = \overrightarrow{PR}$, as in Figure 1.4.4. Then $\mathbf{v} = (3, 7, 18) - (2, 4, -7) = (1, 3, 25)$ and $\mathbf{w} = (-5, 12, 8) - (2, 4, -7) = (-7, 8, 15)$, so the area $A$ of the triangle $\triangle PQR$ is

\[
\begin{aligned}
A &= \tfrac{1}{2}\,\|\mathbf{v} \times \mathbf{w}\| = \tfrac{1}{2}\,\|(1, 3, 25) \times (-7, 8, 15)\| \\
&= \tfrac{1}{2}\,\big\| ((3)(15) - (25)(8),\ (25)(-7) - (1)(15),\ (1)(8) - (3)(-7)) \big\| \\
&= \tfrac{1}{2}\,\big\| (-155, -190, 29) \big\| \\
&= \tfrac{1}{2}\sqrt{(-155)^2 + (-190)^2 + 29^2} = \tfrac{1}{2}\sqrt{60966} \\
A &\approx 123.46
\end{aligned}
\]
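Theorem 1.13 gives a direct recipe for areas from vertex coordinates. The Python sketch below follows the steps of Example 1.9: form two side vectors from a common vertex, take their cross product, and halve its magnitude. The names `triangle_area` and `parallelogram_area` are hypothetical, and points are assumed to be 3-tuples (a point in the plane would be given z-coordinate 0).

```python
import math

def sub(p, q):
    """Vector from q to p, computed componentwise as p - q."""
    return tuple(a - b for a, b in zip(p, q))

def cross(v, w):
    v1, v2, v3 = v
    w1, w2, w3 = w
    return (v2 * w3 - v3 * w2, v3 * w1 - v1 * w3, v1 * w2 - v2 * w1)

def norm(v):
    return math.sqrt(sum(a * a for a in v))

def parallelogram_area(v, w):
    """Area of the parallelogram with adjacent sides v, w (Theorem 1.13(b))."""
    return norm(cross(v, w))

def triangle_area(P, Q, R):
    """Area of triangle PQR using sides PQ and PR (Theorem 1.13(a))."""
    return 0.5 * parallelogram_area(sub(Q, P), sub(R, P))

# Points from Example 1.9
print(round(triangle_area((2, 4, -7), (3, 7, 18), (-5, 12, 8)), 2))
```

Any of the three vertices could serve as the common initial point; the text notes that the choice of adjacent sides does not change the resulting area.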

Example 1.10. Calculate the area of the parallelogram $PQRS$, where $P = (1, 1)$, $Q = (2, 3)$, $R = (5, 4)$, and $S = (4, 2)$.

[Figure 1.4.5: parallelogram PQRS in the xy-plane with v = SP and w = SR]

Solution: Let $\mathbf{v} = \overrightarrow{SP}$ and $\mathbf{w} = \overrightarrow{SR}$, as in Figure 1.4.5. Then $\mathbf{v} = (1, 1) - (4, 2) = (-3, -1)$ and $\mathbf{w} = (5, 4) - (4, 2) = (1, 2)$. But these are vectors in $\mathbb{R}^2$, and the cross product is only defined for vectors in $\mathbb{R}^3$. However, $\mathbb{R}^2$ can be thought of as the subset of $\mathbb{R}^3$ such that the $z$-coordinate is always 0. So we can write $\mathbf{v} = (-3, -1, 0)$ and $\mathbf{w} = (1, 2, 0)$. Then the area $A$ of $PQRS$ is

\[
\begin{aligned}
A &= \|\mathbf{v} \times \mathbf{w}\| = \big\| (-3, -1, 0) \times (1, 2, 0) \big\| \\
&= \big\| ((-1)(0) - (0)(2),\ (0)(1) - (-3)(0),\ (-3)(2) - (-1)(1)) \big\| \\
&= \big\| (0, 0, -5) \big\| \\
A &= 5
\end{aligned}
\]

The following theorem summarizes the basic properties of the cross product.

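Theorem 1.14. For any vectors $\mathbf{u}$, $\mathbf{v}$, $\mathbf{w}$ in $\mathbb{R}^3$, and scalar $k$, we have

(a) $\mathbf{v} \times \mathbf{w} = -\mathbf{w} \times \mathbf{v}$ Anticommutative Law

(b) $\mathbf{u} \times (\mathbf{v} + \mathbf{w}) = \mathbf{u} \times \mathbf{v} + \mathbf{u} \times \mathbf{w}$ Distributive Law

(c) $(\mathbf{u} + \mathbf{v}) \times \mathbf{w} = \mathbf{u} \times \mathbf{w} + \mathbf{v} \times \mathbf{w}$ Distributive Law

(d) $(k\mathbf{v}) \times \mathbf{w} = \mathbf{v} \times (k\mathbf{w}) = k(\mathbf{v} \times \mathbf{w})$ Associative Law

(e) $\mathbf{v} \times \mathbf{0} = \mathbf{0} = \mathbf{0} \times \mathbf{v}$

(f) $\mathbf{v} \times \mathbf{v} = \mathbf{0}$

(g) $\mathbf{v} \times \mathbf{w} = \mathbf{0}$ if and only if $\mathbf{v} \parallel \mathbf{w}$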

Proof: The proofs of properties (b)-(f) are straightforward. We will prove parts (a)

and (g) and leave the rest to the reader as exercises.

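[Figure 1.4.6: v × w and w × v point in opposite directions]

(a) By the definition of the cross product and scalar multiplication, we have:

\[
\begin{aligned}
\mathbf{v} \times \mathbf{w} &= (v_2 w_3 - v_3 w_2,\ v_3 w_1 - v_1 w_3,\ v_1 w_2 - v_2 w_1) \\
&= -(v_3 w_2 - v_2 w_3,\ v_1 w_3 - v_3 w_1,\ v_2 w_1 - v_1 w_2) \\
&= -(w_2 v_3 - w_3 v_2,\ w_3 v_1 - w_1 v_3,\ w_1 v_2 - w_2 v_1) \\
&= -\mathbf{w} \times \mathbf{v}
\end{aligned}
\]

Note that this says that $\mathbf{v} \times \mathbf{w}$ and $\mathbf{w} \times \mathbf{v}$ have the same magnitude but opposite direction (see Figure 1.4.6).

(g) If either $\mathbf{v}$ or $\mathbf{w}$ is $\mathbf{0}$ then $\mathbf{v} \times \mathbf{w} = \mathbf{0}$ by part (e), and either $\mathbf{v} = \mathbf{0} = 0\mathbf{w}$ or $\mathbf{w} = \mathbf{0} = 0\mathbf{v}$, so $\mathbf{v}$ and $\mathbf{w}$ are scalar multiples, i.e. they are parallel.

If both $\mathbf{v}$ and $\mathbf{w}$ are nonzero, and $\theta$ is the angle between them, then by formula (1.11), $\mathbf{v} \times \mathbf{w} = \mathbf{0}$ if and only if $\|\mathbf{v}\|\,\|\mathbf{w}\| \sin\theta = 0$, which is true if and only if $\sin\theta = 0$ (since $\|\mathbf{v}\| > 0$ and $\|\mathbf{w}\| > 0$). So since $0^\circ \le \theta \le 180^\circ$, then $\sin\theta = 0$ if and only if $\theta = 0^\circ$ or $180^\circ$. But the angle between $\mathbf{v}$ and $\mathbf{w}$ is $0^\circ$ or $180^\circ$ if and only if $\mathbf{v} \parallel \mathbf{w}$. QED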

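Example 1.11. Adding to Example 1.7, we have

\[
\begin{aligned}
\mathbf{i} \times \mathbf{j} &= \mathbf{k} & \mathbf{j} \times \mathbf{k} &= \mathbf{i} & \mathbf{k} \times \mathbf{i} &= \mathbf{j} \\
\mathbf{j} \times \mathbf{i} &= -\mathbf{k} & \mathbf{k} \times \mathbf{j} &= -\mathbf{i} & \mathbf{i} \times \mathbf{k} &= -\mathbf{j}
\end{aligned}
\]

\[
\mathbf{i} \times \mathbf{i} = \mathbf{j} \times \mathbf{j} = \mathbf{k} \times \mathbf{k} = \mathbf{0}
\]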

Recall from geometry that a parallelepiped is a 3-dimensional solid with 6 faces, all of which are parallelograms.⁶

⁶ An equivalent definition of a parallelepiped is: the collection of all scalar combinations $k_1\mathbf{v}_1 + k_2\mathbf{v}_2 + k_3\mathbf{v}_3$ of some vectors $\mathbf{v}_1$, $\mathbf{v}_2$, $\mathbf{v}_3$ in $\mathbb{R}^3$, where $0 \le k_1, k_2, k_3 \le 1$.

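Example 1.12. Volume of a parallelepiped: Let the vectors $\mathbf{u}$, $\mathbf{v}$, $\mathbf{w}$ in $\mathbb{R}^3$ represent adjacent sides of a parallelepiped $P$, as in Figure 1.4.7. Show that the volume of $P$ is the scalar triple product $\mathbf{u} \cdot (\mathbf{v} \times \mathbf{w})$.

[Figure 1.4.7: Parallelepiped P with sides u, v, w, height h, and angle θ between u and v × w]

Solution: Recall that the volume of a parallelepiped is the area $A$ of the base parallelogram times the height $h$. By Theorem 1.13(b), the area $A$ of the base parallelogram is $\|\mathbf{v} \times \mathbf{w}\|$. And we can see that since $\mathbf{v} \times \mathbf{w}$ is perpendicular to the base parallelogram determined by $\mathbf{v}$ and $\mathbf{w}$, then the height $h$ is $\|\mathbf{u}\| \cos\theta$, where $\theta$ is the angle between $\mathbf{u}$ and $\mathbf{v} \times \mathbf{w}$. By Theorem 1.6 we know that

\[
\cos\theta = \frac{\mathbf{u} \cdot (\mathbf{v} \times \mathbf{w})}{\|\mathbf{u}\|\,\|\mathbf{v} \times \mathbf{w}\|}. \text{ Hence,}
\]

\[
\begin{aligned}
\mathrm{vol}(P) &= A\,h \\
&= \|\mathbf{v} \times \mathbf{w}\|\ \frac{\|\mathbf{u}\|\ \mathbf{u} \cdot (\mathbf{v} \times \mathbf{w})}{\|\mathbf{u}\|\,\|\mathbf{v} \times \mathbf{w}\|} \\
&= \mathbf{u} \cdot (\mathbf{v} \times \mathbf{w})
\end{aligned}
\]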

In Example 1.12 the height $h$ of the parallelepiped is $\|\mathbf{u}\| \cos\theta$, and not $-\|\mathbf{u}\| \cos\theta$, because the vector $\mathbf{u}$ is on the same side of the base parallelogram's plane as the vector $\mathbf{v} \times \mathbf{w}$ (so that $\cos\theta > 0$). Since the volume is the same no matter which base and height we use, then repeating the same steps using the base determined by $\mathbf{u}$ and $\mathbf{v}$ (since $\mathbf{w}$ is on the same side of that base's plane as $\mathbf{u} \times \mathbf{v}$), the volume is $\mathbf{w} \cdot (\mathbf{u} \times \mathbf{v})$. Repeating this with the base determined by $\mathbf{w}$ and $\mathbf{u}$, we have the following result:

For any vectors $\mathbf{u}$, $\mathbf{v}$, $\mathbf{w}$ in $\mathbb{R}^3$,

\[
\mathbf{u} \cdot (\mathbf{v} \times \mathbf{w}) = \mathbf{w} \cdot (\mathbf{u} \times \mathbf{v}) = \mathbf{v} \cdot (\mathbf{w} \times \mathbf{u}) \tag{1.12}
\]

(Note that the equalities hold trivially if any of the vectors are $\mathbf{0}$.)

Since $\mathbf{v} \times \mathbf{w} = -\mathbf{w} \times \mathbf{v}$ for any vectors $\mathbf{v}$, $\mathbf{w}$ in $\mathbb{R}^3$, then picking the wrong order for the three adjacent sides in the scalar triple product in formula (1.12) will give you the negative of the volume of the parallelepiped. So taking the absolute value of the scalar triple product for any order of the three adjacent sides will always give the volume:

Theorem 1.15. If vectors $\mathbf{u}$, $\mathbf{v}$, $\mathbf{w}$ in $\mathbb{R}^3$ represent any three adjacent sides of a parallelepiped, then the volume of the parallelepiped is $|\mathbf{u} \cdot (\mathbf{v} \times \mathbf{w})|$.

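Another type of triple product is the vector triple product $\mathbf{u} \times (\mathbf{v} \times \mathbf{w})$. The proof of the following theorem is left as an exercise for the reader:

Theorem 1.16. For any vectors $\mathbf{u}$, $\mathbf{v}$, $\mathbf{w}$ in $\mathbb{R}^3$,

\[
\mathbf{u} \times (\mathbf{v} \times \mathbf{w}) = (\mathbf{u} \cdot \mathbf{w})\mathbf{v} - (\mathbf{u} \cdot \mathbf{v})\mathbf{w} \tag{1.13}
\]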

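An examination of the formula in Theorem 1.16 gives some idea of the geometry of the vector triple product. By the right side of formula (1.13), we see that $\mathbf{u} \times (\mathbf{v} \times \mathbf{w})$ is a scalar combination of $\mathbf{v}$ and $\mathbf{w}$, and hence lies in the plane containing $\mathbf{v}$ and $\mathbf{w}$ (i.e. $\mathbf{u} \times (\mathbf{v} \times \mathbf{w})$, $\mathbf{v}$ and $\mathbf{w}$ are coplanar). This makes sense since, by Theorem 1.11, $\mathbf{u} \times (\mathbf{v} \times \mathbf{w})$ is perpendicular to both $\mathbf{u}$ and $\mathbf{v} \times \mathbf{w}$. In particular, being perpendicular to $\mathbf{v} \times \mathbf{w}$ means that $\mathbf{u} \times (\mathbf{v} \times \mathbf{w})$ lies in the plane containing $\mathbf{v}$ and $\mathbf{w}$, since that plane is itself perpendicular to $\mathbf{v} \times \mathbf{w}$. But then how is $\mathbf{u} \times (\mathbf{v} \times \mathbf{w})$ also perpendicular to $\mathbf{u}$, which could be any vector? The following example may help to see how this works.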

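Example 1.13. Find $\mathbf{u} \times (\mathbf{v} \times \mathbf{w})$ for $\mathbf{u} = (1, 2, 4)$, $\mathbf{v} = (2, 2, 0)$, $\mathbf{w} = (1, 3, 0)$.

Solution: Since $\mathbf{u} \cdot \mathbf{v} = 6$ and $\mathbf{u} \cdot \mathbf{w} = 7$, then

\[
\begin{aligned}
\mathbf{u} \times (\mathbf{v} \times \mathbf{w}) &= (\mathbf{u} \cdot \mathbf{w})\mathbf{v} - (\mathbf{u} \cdot \mathbf{v})\mathbf{w} \\
&= 7\,(2, 2, 0) - 6\,(1, 3, 0) = (14, 14, 0) - (6, 18, 0) \\
&= (8, -4, 0)
\end{aligned}
\]

Note that $\mathbf{v}$ and $\mathbf{w}$ lie in the $xy$-plane, and that $\mathbf{u} \times (\mathbf{v} \times \mathbf{w})$ also lies in that plane. Also, $\mathbf{u} \times (\mathbf{v} \times \mathbf{w})$ is perpendicular to both $\mathbf{u}$ and $\mathbf{v} \times \mathbf{w} = (0, 0, 4)$ (see Figure 1.4.8).

[Figure 1.4.8: u, v, w, v × w, and u × (v × w)]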
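The two routes to the vector triple product in Example 1.13 (taking two cross products directly, or using the right side of formula (1.13)) can be compared numerically. The Python sketch below does so; the function names are illustrative, and agreement on sample vectors is a sanity check rather than a proof of Theorem 1.16.

```python
def dot(v, w):
    return sum(a * b for a, b in zip(v, w))

def cross(v, w):
    v1, v2, v3 = v
    w1, w2, w3 = w
    return (v2 * w3 - v3 * w2, v3 * w1 - v1 * w3, v1 * w2 - v2 * w1)

def triple_direct(u, v, w):
    """u x (v x w), computed with two cross products."""
    return cross(u, cross(v, w))

def triple_by_formula(u, v, w):
    """(u . w) v - (u . v) w, the right side of formula (1.13)."""
    a, b = dot(u, w), dot(u, v)
    return tuple(a * vi - b * wi for vi, wi in zip(v, w))

u, v, w = (1, 2, 4), (2, 2, 0), (1, 3, 0)
result = triple_direct(u, v, w)
print(result, triple_by_formula(u, v, w))

# Perpendicular to both u and v x w, as noted in the example
print(dot(result, u), dot(result, cross(v, w)))
```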

For vectors $\mathbf{v} = v_1\mathbf{i} + v_2\mathbf{j} + v_3\mathbf{k}$ and $\mathbf{w} = w_1\mathbf{i} + w_2\mathbf{j} + w_3\mathbf{k}$ in component form, the cross product is written as: $\mathbf{v} \times \mathbf{w} = (v_2 w_3 - v_3 w_2)\mathbf{i} + (v_3 w_1 - v_1 w_3)\mathbf{j} + (v_1 w_2 - v_2 w_1)\mathbf{k}$. It is often easier to use the component form for the cross product, because it can be represented as a determinant. We will not go too deeply into the theory of determinants⁷; we will just cover what is essential for our purposes.

⁷ See ANTON and RORRES for a fuller development.

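A $2 \times 2$ matrix is an array of two rows and two columns of scalars, written as

\[
\begin{bmatrix} a & b \\ c & d \end{bmatrix} \quad \text{or} \quad \begin{pmatrix} a & b \\ c & d \end{pmatrix}
\]

where $a$, $b$, $c$, $d$ are scalars. The determinant of such a matrix, written as

\[
\begin{vmatrix} a & b \\ c & d \end{vmatrix} \quad \text{or} \quad \det \begin{bmatrix} a & b \\ c & d \end{bmatrix},
\]

is the scalar defined by the following formula:

\[
\begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc
\]

It may help to remember this formula as being the product of the scalars on the downward diagonal minus the product of the scalars on the upward diagonal.

Example 1.14.

\[
\begin{vmatrix} 1 & 2 \\ 3 & 4 \end{vmatrix} = (1)(4) - (2)(3) = 4 - 6 = -2
\]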

A $3 \times 3$ matrix is an array of three rows and three columns of scalars, written as

\[
\begin{bmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{bmatrix} \quad \text{or} \quad \begin{pmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{pmatrix},
\]

and its determinant is given by the formula:

\[
\begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix}
= a_1 \begin{vmatrix} b_2 & b_3 \\ c_2 & c_3 \end{vmatrix}
- a_2 \begin{vmatrix} b_1 & b_3 \\ c_1 & c_3 \end{vmatrix}
+ a_3 \begin{vmatrix} b_1 & b_2 \\ c_1 & c_2 \end{vmatrix} \tag{1.14}
\]

One way to remember the above formula is the following: multiply each scalar in the first row by the determinant of the $2 \times 2$ matrix that remains after removing the row and column that contain that scalar, then sum those products up, putting alternating plus and minus signs in front of each (starting with a plus).

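Example 1.15.

\[
\begin{vmatrix} 1 & 0 & 2 \\ 4 & -1 & 3 \\ 1 & 0 & 2 \end{vmatrix}
= 1 \begin{vmatrix} -1 & 3 \\ 0 & 2 \end{vmatrix}
- 0 \begin{vmatrix} 4 & 3 \\ 1 & 2 \end{vmatrix}
+ 2 \begin{vmatrix} 4 & -1 \\ 1 & 0 \end{vmatrix}
= 1(-2 - 0) - 0(8 - 3) + 2(0 + 1) = 0
\]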
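The first-row expansion in formula (1.14) is short enough to write out directly. Below is a minimal Python sketch, assuming a matrix is given as a list of three rows; the function names `det2` and `det3` are illustrative, and this is a teaching transcription of the formula rather than a general-purpose determinant routine.

```python
def det2(a, b, c, d):
    """Determinant of the 2 x 2 matrix [[a, b], [c, d]]: ad - bc."""
    return a * d - b * c

def det3(m):
    """Determinant of a 3 x 3 matrix by expansion along the first row (1.14)."""
    (a1, a2, a3), (b1, b2, b3), (c1, c2, c3) = m
    return (a1 * det2(b2, b3, c2, c3)
            - a2 * det2(b1, b3, c1, c3)
            + a3 * det2(b1, b2, c1, c2))

print(det2(1, 2, 3, 4))                            # Example 1.14
print(det3([[1, 0, 2], [4, -1, 3], [1, 0, 2]]))    # Example 1.15
```

In Example 1.15 the first and third rows are identical, which is consistent with the determinant coming out to zero.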


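We defined the determinant as a scalar, derived from algebraic operations on scalar entries in a matrix. However, if we put three vectors in the first row of a $3 \times 3$ matrix, then the definition still makes sense, since we would be performing scalar multiplication on those three vectors (they would be multiplied by the $2 \times 2$ scalar determinants as before). This gives us a determinant that is now a vector, and lets us write the cross product of $\mathbf{v} = v_1\mathbf{i} + v_2\mathbf{j} + v_3\mathbf{k}$ and $\mathbf{w} = w_1\mathbf{i} + w_2\mathbf{j} + w_3\mathbf{k}$ as a determinant:

\[
\begin{aligned}
\mathbf{v} \times \mathbf{w} &= \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ v_1 & v_2 & v_3 \\ w_1 & w_2 & w_3 \end{vmatrix}
= \begin{vmatrix} v_2 & v_3 \\ w_2 & w_3 \end{vmatrix} \mathbf{i}
- \begin{vmatrix} v_1 & v_3 \\ w_1 & w_3 \end{vmatrix} \mathbf{j}
+ \begin{vmatrix} v_1 & v_2 \\ w_1 & w_2 \end{vmatrix} \mathbf{k} \\
&= (v_2 w_3 - v_3 w_2)\mathbf{i} + (v_3 w_1 - v_1 w_3)\mathbf{j} + (v_1 w_2 - v_2 w_1)\mathbf{k}
\end{aligned}
\]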

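Example 1.16. Let $\mathbf{v} = 4\,\mathbf{i} - \mathbf{j} + 3\,\mathbf{k}$ and $\mathbf{w} = \mathbf{i} + 2\,\mathbf{k}$. Then

\[
\mathbf{v} \times \mathbf{w} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ 4 & -1 & 3 \\ 1 & 0 & 2 \end{vmatrix}
= \begin{vmatrix} -1 & 3 \\ 0 & 2 \end{vmatrix} \mathbf{i}
- \begin{vmatrix} 4 & 3 \\ 1 & 2 \end{vmatrix} \mathbf{j}
+ \begin{vmatrix} 4 & -1 \\ 1 & 0 \end{vmatrix} \mathbf{k}
= -2\,\mathbf{i} - 5\,\mathbf{j} + \mathbf{k}
\]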

The scalar triple product can also be written as a determinant. In fact, by Example 1.12, the following theorem provides an alternate definition of the determinant of a $3 \times 3$ matrix as the volume (or negative volume) of a parallelepiped whose adjacent sides are the rows of the matrix. The proof is left as an exercise for the reader.

Theorem 1.17. For any vectors $\mathbf{u} = (u_1, u_2, u_3)$, $\mathbf{v} = (v_1, v_2, v_3)$, $\mathbf{w} = (w_1, w_2, w_3)$ in $\mathbb{R}^3$:

\[
\mathbf{u} \cdot (\mathbf{v} \times \mathbf{w}) = \begin{vmatrix} u_1 & u_2 & u_3 \\ v_1 & v_2 & v_3 \\ w_1 & w_2 & w_3 \end{vmatrix} \tag{1.15}
\]

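Example 1.17. Find the volume of the parallelepiped with adjacent sides $\mathbf{u} = (2, 1, 3)$, $\mathbf{v} = (-1, 3, 2)$, $\mathbf{w} = (1, 1, -2)$ (see Figure 1.4.9).

[Figure 1.4.9: parallelepiped P with adjacent sides u, v, w]

Solution: By Theorem 1.15, the volume of the parallelepiped $P$ is the absolute value of the scalar triple product of the three adjacent sides (in any order). By Theorem 1.17,

\[
\begin{aligned}
\mathbf{u} \cdot (\mathbf{v} \times \mathbf{w}) &= \begin{vmatrix} 2 & 1 & 3 \\ -1 & 3 & 2 \\ 1 & 1 & -2 \end{vmatrix}
= 2 \begin{vmatrix} 3 & 2 \\ 1 & -2 \end{vmatrix}
- 1 \begin{vmatrix} -1 & 2 \\ 1 & -2 \end{vmatrix}
+ 3 \begin{vmatrix} -1 & 3 \\ 1 & 1 \end{vmatrix} \\
&= 2(-8) - 1(0) + 3(-4) = -28 \text{, so}
\end{aligned}
\]

\[
\mathrm{vol}(P) = |-28| = 28.
\]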
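The volume computation of Example 1.17 can also be carried out with the dot and cross products directly, which gives an independent check on the determinant expansion. The Python sketch below uses illustrative names (`scalar_triple`, `parallelepiped_volume`) and also evaluates the three orderings that appear in formula (1.12).

```python
def dot(v, w):
    return sum(a * b for a, b in zip(v, w))

def cross(v, w):
    v1, v2, v3 = v
    w1, w2, w3 = w
    return (v2 * w3 - v3 * w2, v3 * w1 - v1 * w3, v1 * w2 - v2 * w1)

def scalar_triple(u, v, w):
    """Scalar triple product u . (v x w)."""
    return dot(u, cross(v, w))

def parallelepiped_volume(u, v, w):
    """Volume with adjacent sides u, v, w (Theorem 1.15)."""
    return abs(scalar_triple(u, v, w))

u, v, w = (2, 1, 3), (-1, 3, 2), (1, 1, -2)
print(scalar_triple(u, v, w))
print(parallelepiped_volume(u, v, w))

# The cyclic orderings from formula (1.12) give the same signed value,
# while swapping two of the vectors flips the sign.
print(scalar_triple(w, u, v), scalar_triple(v, w, u), scalar_triple(u, w, v))
```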


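Interchanging the dot and cross products can be useful in proving vector identities:

Example 1.18. Prove: $(\mathbf{u} \times \mathbf{v}) \cdot (\mathbf{w} \times \mathbf{z}) = \begin{vmatrix} \mathbf{u} \cdot \mathbf{w} & \mathbf{u} \cdot \mathbf{z} \\ \mathbf{v} \cdot \mathbf{w} & \mathbf{v} \cdot \mathbf{z} \end{vmatrix}$ for all vectors $\mathbf{u}$, $\mathbf{v}$, $\mathbf{w}$, $\mathbf{z}$ in $\mathbb{R}^3$.

Solution: Let $\mathbf{x} = \mathbf{u} \times \mathbf{v}$. Then

\[
\begin{aligned}
(\mathbf{u} \times \mathbf{v}) \cdot (\mathbf{w} \times \mathbf{z}) &= \mathbf{x} \cdot (\mathbf{w} \times \mathbf{z}) \\
&= \mathbf{w} \cdot (\mathbf{z} \times \mathbf{x}) \quad \text{(by formula (1.12))} \\
&= \mathbf{w} \cdot (\mathbf{z} \times (\mathbf{u} \times \mathbf{v})) \\
&= \mathbf{w} \cdot ((\mathbf{z} \cdot \mathbf{v})\mathbf{u} - (\mathbf{z} \cdot \mathbf{u})\mathbf{v}) \quad \text{(by Theorem 1.16)} \\
&= (\mathbf{z} \cdot \mathbf{v})(\mathbf{w} \cdot \mathbf{u}) - (\mathbf{z} \cdot \mathbf{u})(\mathbf{w} \cdot \mathbf{v}) \\
&= (\mathbf{u} \cdot \mathbf{w})(\mathbf{v} \cdot \mathbf{z}) - (\mathbf{u} \cdot \mathbf{z})(\mathbf{v} \cdot \mathbf{w}) \quad \text{(by commutativity of the dot product)} \\
&= \begin{vmatrix} \mathbf{u} \cdot \mathbf{w} & \mathbf{u} \cdot \mathbf{z} \\ \mathbf{v} \cdot \mathbf{w} & \mathbf{v} \cdot \mathbf{z} \end{vmatrix}
\end{aligned}
\]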
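As a sanity check on the identity of Example 1.18, both sides can be evaluated for sample integer vectors, where the arithmetic is exact. The Python sketch below does this; the function names and the choice of test vectors are assumptions made for illustration, and agreement on examples does not replace the proof above.

```python
def dot(v, w):
    return sum(a * b for a, b in zip(v, w))

def cross(v, w):
    v1, v2, v3 = v
    w1, w2, w3 = w
    return (v2 * w3 - v3 * w2, v3 * w1 - v1 * w3, v1 * w2 - v2 * w1)

def lhs(u, v, w, z):
    """(u x v) . (w x z)"""
    return dot(cross(u, v), cross(w, z))

def rhs(u, v, w, z):
    """The 2 x 2 determinant (u . w)(v . z) - (u . z)(v . w)."""
    return dot(u, w) * dot(v, z) - dot(u, z) * dot(v, w)

samples = [
    ((1, 1, 1), (3, 0, 2), (2, 2, 2), (2, 1, 4)),
    ((1, 0, 0), (0, 1, 0), (1, 0, 0), (0, 1, 0)),
    ((2, -1, 5), (0, 4, -3), (7, 1, 1), (-2, 6, 0)),
]
for u, v, w, z in samples:
    print(lhs(u, v, w, z), rhs(u, v, w, z))
```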


Exercises

A

For Exercises 1-6, calculate v××× w.

1. v = (5, 1, −2), w = (4, −4, 3) 2. v = (7, 2, −10), w = (2, 6, 4)

3. v = (2, 1, 4), w = (1, −2, 0) 4. v = (1, 3, 2), w = (7, 2, −10)

5. v = −i + 2 j + k, w = −3 i + 6 j + 3 k 6. v = i, w = 3 i + 2 j + 4k

For Exercises 7-8, calculate the area of the triangle △PQR.

7. P = (5, 1, −2), Q = (4, −4, 3), R = (2, 4, 0) 8. P = (4, 0, 2), Q = (2, 1, 5), R = (−1, 0, −1)

For Exercises 9-10, calculate the area of the parallelogram PQRS .

9. P = (2, 1, 3), Q = (1, 4, 5), R = (2, 5, 3), S = (3, 2, 1)

10. P = (−2, −2), Q = (1, 4), R = (6, 6), S = (3, 0)

For Exercises 11-12, find the volume of the parallelepiped with adjacent sides u, v, w.

11. u = (1, 1, 3), v = (2, 1, 4), w = (5, 1, −2) 12. u = (1, 3, 2), v = (7, 2, −10), w = (1, 0, 1)

For Exercises 13-14, calculate u··· (v××× w) and u××× (v××× w).

13. u = (1, 1, 1), v = (3, 0, 2), w = (2, 2, 2) 14. u = (1, 0, 2), v = (−1, 0, 3), w = (2, 0, −2)

15. Calculate (u××× v) ··· (w××× z) for u = (1, 1, 1), v = (3, 0, 2), w = (2, 2, 2), z = (2, 1, 4).

30 CHAPTER 1. VECTORS IN EUCLIDEAN SPACE

B

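16. If v and w are unit vectors in $\mathbb{R}^3$, under what condition(s) would $\mathbf{v} \times \mathbf{w}$ also be a unit vector in $\mathbb{R}^3$? Justify your answer.

17. Show that if $\mathbf{v} \times \mathbf{w} = \mathbf{0}$ for all w in $\mathbb{R}^3$, then $\mathbf{v} = \mathbf{0}$.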

18. Prove Theorem 1.14(b). 19. Prove Theorem 1.14(c).

20. Prove Theorem 1.14(d). 21. Prove Theorem 1.14(e).

22. Prove Theorem 1.14(f). 23. Prove Theorem 1.16.

24. Prove Theorem 1.17. (Hint: Expand both sides of the equation.)

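25. Prove the following for all vectors v, w in $\mathbb{R}^3$:

(a) $\|\mathbf{v} \times \mathbf{w}\|^2 + |\mathbf{v} \cdot \mathbf{w}|^2 = \|\mathbf{v}\|^2 \, \|\mathbf{w}\|^2$

(b) If $\mathbf{v} \cdot \mathbf{w} = 0$ and $\mathbf{v} \times \mathbf{w} = \mathbf{0}$, then $\mathbf{v} = \mathbf{0}$ or $\mathbf{w} = \mathbf{0}$.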

C

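26. Prove that in Example 1.8 the formula for the area of the triangle △PQR yields the same value no matter which two adjacent sides are chosen. To do this, show that $\frac{1}{2}\|\mathbf{u} \times (-\mathbf{w})\| = \frac{1}{2}\|\mathbf{v} \times \mathbf{w}\|$, where $\mathbf{u} = \overrightarrow{PR}$, $-\mathbf{w} = \overrightarrow{PQ}$, and $\mathbf{v} = \overrightarrow{QR}$, $\mathbf{w} = \overrightarrow{QP}$ as before. Similarly, show that $\frac{1}{2}\|(-\mathbf{u}) \times (-\mathbf{v})\| = \frac{1}{2}\|\mathbf{v} \times \mathbf{w}\|$, where $-\mathbf{u} = \overrightarrow{RP}$ and $-\mathbf{v} = \overrightarrow{RQ}$.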

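27. Consider the vector equation $\mathbf{a} \times \mathbf{x} = \mathbf{b}$ in $\mathbb{R}^3$, where $\mathbf{a} \ne \mathbf{0}$. Show that:

(a) $\mathbf{a} \cdot \mathbf{b} = 0$

(b) $\mathbf{x} = \dfrac{\mathbf{b} \times \mathbf{a}}{\|\mathbf{a}\|^2} + k\mathbf{a}$ is a solution to the equation, for any scalar k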

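28. Prove the Jacobi identity: $\mathbf{u} \times (\mathbf{v} \times \mathbf{w}) + \mathbf{v} \times (\mathbf{w} \times \mathbf{u}) + \mathbf{w} \times (\mathbf{u} \times \mathbf{v}) = \mathbf{0}$

29. Show that u, v, w lie in the same plane in $\mathbb{R}^3$ if and only if $\mathbf{u} \cdot (\mathbf{v} \times \mathbf{w}) = 0$.

30. For all vectors u, v, w, z in $\mathbb{R}^3$, show that

$$(\mathbf{u} \times \mathbf{v}) \times (\mathbf{w} \times \mathbf{z}) = (\mathbf{z} \cdot (\mathbf{u} \times \mathbf{v}))\mathbf{w} - (\mathbf{w} \cdot (\mathbf{u} \times \mathbf{v}))\mathbf{z}$$

and that

$$(\mathbf{u} \times \mathbf{v}) \times (\mathbf{w} \times \mathbf{z}) = (\mathbf{u} \cdot (\mathbf{w} \times \mathbf{z}))\mathbf{v} - (\mathbf{v} \cdot (\mathbf{w} \times \mathbf{z}))\mathbf{u}$$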

Why do both equations make sense geometrically?


1.5 Lines and Planes

Now that we know how to perform some operations on vectors, we can start to deal

with some familiar geometric objects, like lines and planes, in the language of vectors.

The reason for doing this is simple: using vectors makes it easier to study objects in

3-dimensional Euclidean space. We will first consider lines.

Line through a point, parallel to a vector

Let $P = (x_0, y_0, z_0)$ be a point in $\mathbb{R}^3$, let $\mathbf{v} = (a, b, c)$ be a nonzero vector, and let L be the
line through P which is parallel to v (see Figure 1.5.1).

[Figure 1.5.1: the line L through $P(x_0, y_0, z_0)$, showing the vectors r, v, tv and r + tv for t > 0 and t < 0]

Let $\mathbf{r} = (x_0, y_0, z_0)$ be the vector pointing from the origin to P. Since multiplying the
vector v by a scalar t lengthens or shrinks v while preserving its direction if t > 0, and
reversing its direction if t < 0, then we see from Figure 1.5.1 that every point on the
line L can be obtained by adding the vector tv to the vector r for some scalar t. That
is, as t varies over all real numbers, the vector r + tv will point to every point on L. We
can summarize the vector representation of L as follows:

For a point $P = (x_0, y_0, z_0)$ and nonzero vector v in $\mathbb{R}^3$, the line L through P parallel
to v is given by

$$\mathbf{r} + t\mathbf{v}, \quad \text{for } -\infty < t < \infty \tag{1.16}$$

where $\mathbf{r} = (x_0, y_0, z_0)$ is the vector pointing to P.

Note that we used the correspondence between a vector and its terminal point.

Since $\mathbf{v} = (a, b, c)$, then the terminal point of the vector $\mathbf{r} + t\mathbf{v}$ is $(x_0 + at, y_0 + bt, z_0 + ct)$.

We then get the parametric representation of L with the parameter t:

For a point $P = (x_0, y_0, z_0)$ and nonzero vector $\mathbf{v} = (a, b, c)$ in $\mathbb{R}^3$, the line L through P
parallel to v consists of all points (x, y, z) given by

$$x = x_0 + at, \quad y = y_0 + bt, \quad z = z_0 + ct, \quad \text{for } -\infty < t < \infty \tag{1.17}$$

Note that in both representations we get the point P on L by letting t = 0.


In formula (1.17), if $a \ne 0$, then we can solve for the parameter t: $t = (x - x_0)/a$. We
can also solve for t in terms of y and in terms of z if neither b nor c, respectively, is
zero: $t = (y - y_0)/b$ and $t = (z - z_0)/c$. These three values all equal the same value t, so we
can write the following system of equalities, called the symmetric representation of L:

For a point $P = (x_0, y_0, z_0)$ and vector $\mathbf{v} = (a, b, c)$ in $\mathbb{R}^3$ with a, b and c all nonzero, the
line L through P parallel to v consists of all points (x, y, z) given by the equations

$$\frac{x - x_0}{a} = \frac{y - y_0}{b} = \frac{z - z_0}{c} \tag{1.18}$$

[Figure 1.5.2: the line L lying in the plane $x = x_0$]

What if, say, a = 0 in the above scenario? We cannot divide
by zero, but we do know that $x = x_0 + at$, and so $x = x_0 + 0t = x_0$.
Then the symmetric representation of L would be:

$$x = x_0, \quad \frac{y - y_0}{b} = \frac{z - z_0}{c} \tag{1.19}$$

Note that this says that the line L lies in the plane $x = x_0$, which
is parallel to the yz-plane (see Figure 1.5.2). Similar equations
can be derived for the cases when b = 0 or c = 0.

You may have noticed that the vector representation of L in formula (1.16) is more

compact than the parametric and symmetric formulas. That is an advantage of using

vector notation. Technically, though, the vector representation gives us the vectors

whose terminal points make up the line L, not just L itself. So you have to remember
to identify the vectors r + tv with their terminal points. On the other hand, the

parametric representation always gives just the points on L and nothing else.

Example 1.19. Write the line L through the point P = (2, 3, 5) and parallel to the

vector v = (4, −1, 6), in the following forms: (a) vector, (b) parametric, (c) symmetric.

Lastly: (d) find two points on L distinct from P.

Solution: (a) Let r = (2, 3, 5). Then by formula (1.16), L is given by:

r + tv = (2, 3, 5) + t(4, −1, 6), for − ∞ < t < ∞

(b) L consists of the points (x, y, z) such that

x = 2 + 4t, y = 3 − t, z = 5 + 6t, for − ∞ < t < ∞

(c) L consists of the points (x, y, z) such that

$$\frac{x - 2}{4} = \frac{y - 3}{-1} = \frac{z - 5}{6}$$

(d) Letting t = 1 and t = 2 in part (b) yields the points (6, 2, 11) and (10, 1, 17) on L.
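The parametric form translates directly into a small routine that generates points on a line. The sketch below is illustrative only: the function names are hypothetical, and the membership test (checking that the vector from P to a candidate point has zero cross product with v) is a convenience chosen here because, unlike the symmetric form, it does not require a, b and c to be nonzero. The test is exact only for integer or rational inputs.

```python
def cross(v, w):
    """Return the cross product v x w of two 3-vectors given as tuples."""
    return (v[1] * w[2] - v[2] * w[1],
            v[2] * w[0] - v[0] * w[2],
            v[0] * w[1] - v[1] * w[0])


def point_on_line(p, v, t):
    """Terminal point of r + t v, where r points to p (formula (1.17))."""
    return tuple(pi + t * vi for pi, vi in zip(p, v))


def is_on_line(q, p, v):
    """q is on the line through p parallel to v iff (q - p) x v = 0."""
    w = tuple(qi - pi for qi, pi in zip(q, p))
    return cross(w, v) == (0, 0, 0)


P, v = (2, 3, 5), (4, -1, 6)          # data of Example 1.19
points = [point_on_line(P, v, t) for t in (0, 1, 2)]
print(points)
```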


Line through two points

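[Figure 1.5.3: the line L through $P_1(x_1, y_1, z_1)$ and $P_2(x_2, y_2, z_2)$, showing the vectors $\mathbf{r}_1$, $\mathbf{r}_2$, $\mathbf{r}_2 - \mathbf{r}_1$ and $\mathbf{r}_1 + t(\mathbf{r}_2 - \mathbf{r}_1)$]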

Let $P_1 = (x_1, y_1, z_1)$ and $P_2 = (x_2, y_2, z_2)$ be distinct points in $\mathbb{R}^3$, and let L be the line through $P_1$ and $P_2$.
Let $\mathbf{r}_1 = (x_1, y_1, z_1)$ and $\mathbf{r}_2 = (x_2, y_2, z_2)$ be the vectors pointing to $P_1$ and $P_2$, respectively. Then as we can
see from Figure 1.5.3, $\mathbf{r}_2 - \mathbf{r}_1$ is the vector from $P_1$ to $P_2$. So if we multiply the vector $\mathbf{r}_2 - \mathbf{r}_1$ by a scalar t
and add it to the vector $\mathbf{r}_1$, we will get the entire line L as t varies over all real numbers. The following is
a summary of the vector, parametric, and symmetric forms for the line L:

Let $P_1 = (x_1, y_1, z_1)$, $P_2 = (x_2, y_2, z_2)$ be distinct points in $\mathbb{R}^3$, and let $\mathbf{r}_1 = (x_1, y_1, z_1)$,
$\mathbf{r}_2 = (x_2, y_2, z_2)$. Then the line L through $P_1$ and $P_2$ has the following representations:

Vector:

$$\mathbf{r}_1 + t(\mathbf{r}_2 - \mathbf{r}_1), \quad \text{for } -\infty < t < \infty \tag{1.20}$$

Parametric:

$$x = x_1 + (x_2 - x_1)t, \quad y = y_1 + (y_2 - y_1)t, \quad z = z_1 + (z_2 - z_1)t, \quad \text{for } -\infty < t < \infty \tag{1.21}$$

Symmetric:

$$\frac{x - x_1}{x_2 - x_1} = \frac{y - y_1}{y_2 - y_1} = \frac{z - z_1}{z_2 - z_1} \quad (\text{if } x_1 \ne x_2,\ y_1 \ne y_2, \text{ and } z_1 \ne z_2) \tag{1.22}$$

Example 1.20. Write the line L through the points $P_1 = (-3, 1, -4)$ and $P_2 = (4, 4, -6)$ in
parametric form.

Solution: By formula (1.21), L consists of the points (x, y, z) such that

x = −3 + 7t, y = 1 + 3t, z = −4 − 2t, for − ∞ < t < ∞
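As a small illustration of formula (1.21), the sketch below builds the direction vector from two points and returns a function of the parameter t. The name `line_through` is hypothetical; exact rational arithmetic is used so that the checks are not affected by rounding.

```python
from fractions import Fraction


def line_through(p1, p2):
    """Return (direction, point) for the line through distinct points p1, p2.

    direction is r2 - r1, and point(t) is the terminal point of
    r1 + t (r2 - r1), so point(0) is p1 and point(1) is p2.
    """
    if p1 == p2:
        raise ValueError("the two points must be distinct")
    direction = tuple(b - a for a, b in zip(p1, p2))

    def point(t):
        return tuple(a + t * d for a, d in zip(p1, direction))

    return direction, point


# Data of Example 1.20.
direction, point = line_through((-3, 1, -4), (4, 4, -6))
midpoint = point(Fraction(1, 2))
print(direction, point(0), point(1), midpoint)
```

Values of t between 0 and 1 give points on the segment from $P_1$ to $P_2$; for instance t = 1/2 gives its midpoint.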

Distance between a point and a line

[Figure 1.5.4: the point P, the point Q on L, the vectors v and w, the angle θ between them, and the distance d from P to L]

Let L be a line in $\mathbb{R}^3$ in vector form as r + tv (for −∞ < t < ∞),
and let P be a point not on L. The distance d from P to L is the
length of the line segment from P to L which is perpendicular
to L (see Figure 1.5.4). Pick a point Q on L, and let w be the
vector from Q to P. If θ is the angle between w and v, then
$d = \|\mathbf{w}\| \sin\theta$. So since $\|\mathbf{v} \times \mathbf{w}\| = \|\mathbf{v}\| \, \|\mathbf{w}\| \sin\theta$ and $\mathbf{v} \ne \mathbf{0}$, then:

$$d = \frac{\|\mathbf{v} \times \mathbf{w}\|}{\|\mathbf{v}\|} \tag{1.23}$$


Example 1.21. Find the distance d from the point P = (1, 1, 1) to the line L in Example

1.20.

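Solution: From Example 1.20, we see that we can represent L in vector form as r + tv,
for r = (−3, 1, −4) and v = (7, 3, −2). Since the point Q = (−3, 1, −4) is on L, then for
$\mathbf{w} = \overrightarrow{QP} = (1, 1, 1) - (-3, 1, -4) = (4, 0, 5)$, we have:

$$\mathbf{v} \times \mathbf{w} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ 7 & 3 & -2 \\ 4 & 0 & 5 \end{vmatrix} = \begin{vmatrix} 3 & -2 \\ 0 & 5 \end{vmatrix} \mathbf{i} - \begin{vmatrix} 7 & -2 \\ 4 & 5 \end{vmatrix} \mathbf{j} + \begin{vmatrix} 7 & 3 \\ 4 & 0 \end{vmatrix} \mathbf{k} = 15\,\mathbf{i} - 43\,\mathbf{j} - 12\,\mathbf{k}, \text{ so}$$

$$d = \frac{\|\mathbf{v} \times \mathbf{w}\|}{\|\mathbf{v}\|} = \frac{\|15\,\mathbf{i} - 43\,\mathbf{j} - 12\,\mathbf{k}\|}{\|(7, 3, -2)\|} = \frac{\sqrt{15^2 + (-43)^2 + (-12)^2}}{\sqrt{7^2 + 3^2 + (-2)^2}} = \frac{\sqrt{2218}}{\sqrt{62}} \approx 5.98$$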
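Formula (1.23) is easy to turn into a short routine. The sketch below reproduces the computation of Example 1.21; the helper names are illustrative, and the arguments are a point P, any point Q on the line, and the direction vector v.

```python
import math


def cross(v, w):
    """Return the cross product v x w of two 3-vectors given as tuples."""
    return (v[1] * w[2] - v[2] * w[1],
            v[2] * w[0] - v[0] * w[2],
            v[0] * w[1] - v[1] * w[0])


def norm(v):
    """Euclidean length of a vector."""
    return math.sqrt(sum(a * a for a in v))


def distance_point_line(p, q, v):
    """Distance from p to the line through q parallel to v (formula (1.23))."""
    w = tuple(pi - qi for pi, qi in zip(p, q))   # vector from Q to P
    return norm(cross(v, w)) / norm(v)


d = distance_point_line((1, 1, 1), (-3, 1, -4), (7, 3, -2))
print(round(d, 2))
```

Choosing a different point Q on the same line changes w but not the result, since the component of w along v does not contribute to the cross product.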

It is clear that two lines $L_1$ and $L_2$, represented in vector form as $\mathbf{r}_1 + s\mathbf{v}_1$ and $\mathbf{r}_2 + t\mathbf{v}_2$,
respectively, are parallel (denoted as $L_1 \parallel L_2$) if $\mathbf{v}_1$ and $\mathbf{v}_2$ are parallel. Also, $L_1$ and $L_2$
are perpendicular (denoted as $L_1 \perp L_2$) if $\mathbf{v}_1$ and $\mathbf{v}_2$ are perpendicular.

[Figure 1.5.5: skew lines $L_1$ and $L_2$]

In 2-dimensional space, two lines are either identical, parallel, or intersecting. In
3-dimensional space, there is an additional possibility: two lines can be skew, that is,
they do not intersect but they are not parallel. However, even though they are not parallel,
skew lines are on parallel planes (see Figure 1.5.5).

To determine whether two lines in $\mathbb{R}^3$ intersect, it is often easier
to use the parametric representation of the lines. In this case, you
should use different parameter variables (usually s and t) for the lines, since the values
of the parameters may not be the same at the point of intersection. Setting the
two (x, y, z) triples equal will result in a system of 3 equations in 2 unknowns (s and t).

Example 1.22. Find the point of intersection (if any) of the following lines:

$$\frac{x + 1}{3} = \frac{y - 2}{2} = \frac{z - 1}{-1} \quad \text{and} \quad x + 3 = \frac{y - 8}{-3} = \frac{z + 3}{2}$$

Solution: First we write the lines in parametric form, with parameters s and t:

x = −1 + 3s, y = 2 + 2s, z = 1 − s and x = −3 + t, y = 8 − 3t, z = −3 + 2t

The lines intersect when (−1 + 3s, 2 + 2s, 1 − s) = (−3 + t, 8 − 3t, −3 + 2t) for some s, t:

−1 + 3s = −3 + t : ⇒ t = 2 + 3s

2 + 2s = 8 − 3t : ⇒ 2 + 2s = 8 − 3(2 + 3s) = 2 − 9s ⇒ 2s = −9s ⇒ s = 0 ⇒ t = 2 + 3(0) = 2

1 − s = −3 + 2t : 1 − 0 = −3 + 2(2) ⇒ 1 = 1 (Note that we had to check this.)

Letting s = 0 in the equations for the first line, or letting t = 2 in the equations for the

second line, gives the point of intersection (−1, 2, 1).
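The procedure in Example 1.22 (solve two of the three equations for s and t, then check the third) can be sketched in code. The version below is one possible implementation, not the only one: it uses Cramer's rule on the first pair of coordinates whose 2 × 2 determinant is nonzero, and exact rational arithmetic so that the final check is not disturbed by rounding. The function name is hypothetical, inputs are assumed to be integers or rationals, and lines with parallel direction vectors (including identical lines) simply return `None`.

```python
from fractions import Fraction


def intersect_lines(p1, v1, p2, v2):
    """Intersection point of p1 + s*v1 and p2 + t*v2, or None if there is none.

    Lines with parallel direction vectors are not handled and give None.
    """
    d = [b - a for a, b in zip(p1, p2)]          # right-hand side p2 - p1
    for i, j in ((0, 1), (0, 2), (1, 2)):
        # Solve  s*v1[i] - t*v2[i] = d[i],  s*v1[j] - t*v2[j] = d[j].
        det = v2[i] * v1[j] - v1[i] * v2[j]
        if det != 0:
            s = Fraction(v2[i] * d[j] - d[i] * v2[j], det)
            t = Fraction(v1[i] * d[j] - v1[j] * d[i], det)
            a = tuple(p + s * v for p, v in zip(p1, v1))
            b = tuple(p + t * v for p, v in zip(p2, v2))
            # Comparing all three coordinates checks the remaining equation.
            return a if a == b else None
    return None


# The two lines of Example 1.22 in parametric form.
hit = intersect_lines((-1, 2, 1), (3, 2, -1), (-3, 8, -3), (1, -3, 2))
# The x-axis and a line parallel to the y-axis at height z = 1 are skew.
miss = intersect_lines((0, 0, 0), (1, 0, 0), (0, 1, 1), (0, 1, 0))
print(hit, miss)
```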


We will now consider planes in 3-dimensional Euclidean space.

Plane through a point, perpendicular to a vector

Let P be a plane in $\mathbb{R}^3$, and suppose it contains a point $P_0 = (x_0, y_0, z_0)$. Let $\mathbf{n} = (a, b, c)$
be a nonzero vector which is perpendicular to the plane P. Such a vector is called a
normal vector (or just a normal) to the plane. Now let (x, y, z) be any point in the
plane P. Then the vector $\mathbf{r} = (x - x_0, y - y_0, z - z_0)$ lies in the plane P (see Figure 1.5.6).
So if $\mathbf{r} \ne \mathbf{0}$, then $\mathbf{r} \perp \mathbf{n}$ and hence $\mathbf{n} \cdot \mathbf{r} = 0$. And if $\mathbf{r} = \mathbf{0}$ then we still have $\mathbf{n} \cdot \mathbf{r} = 0$.

[Figure 1.5.6: The plane P, showing the points $(x_0, y_0, z_0)$ and (x, y, z), the normal vector n, and the vector r]

Conversely, if (x, y, z) is any point in $\mathbb{R}^3$ such that $\mathbf{r} = (x - x_0, y - y_0, z - z_0) \ne \mathbf{0}$ and
$\mathbf{n} \cdot \mathbf{r} = 0$, then $\mathbf{r} \perp \mathbf{n}$ and so (x, y, z) lies in P. This proves the following theorem:

Theorem 1.18. Let P be a plane in $\mathbb{R}^3$, let $(x_0, y_0, z_0)$ be a point in P, and let $\mathbf{n} = (a, b, c)$
be a nonzero vector which is perpendicular to P. Then P consists of the points (x, y, z)
satisfying the vector equation:

$$\mathbf{n} \cdot \mathbf{r} = 0 \tag{1.24}$$

where $\mathbf{r} = (x - x_0, y - y_0, z - z_0)$, or equivalently:

$$a(x - x_0) + b(y - y_0) + c(z - z_0) = 0 \tag{1.25}$$

The above equation is called the point-normal form of the plane P.

Example 1.23. Find the equation of the plane P containing the point (−3, 1, 3) and

perpendicular to the vector n = (2, 4, 8).

Solution: By formula (1.25), the plane P consists of all points (x, y, z) such that:

2(x + 3) + 4(y − 1) + 8(z − 3) = 0

If we multiply out the terms in formula (1.25) and combine the constant terms, we

get an equation of the plane in normal form:

ax + by + cz + d = 0 (1.26)

For example, the normal form of the plane in Example 1.23 is 2x + 4y + 8z − 22 = 0.
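Multiplying out formula (1.25) shows that the constant term in the normal form is $d = -(ax_0 + by_0 + cz_0)$, that is, $d = -\mathbf{n} \cdot (x_0, y_0, z_0)$. A minimal sketch of this conversion follows; the function name and the tuple (a, b, c, d) used to represent a plane are conventions of this illustration only.

```python
def normal_form(point, normal):
    """Coefficients (a, b, c, d) of ax + by + cz + d = 0 for the plane
    through `point` with nonzero normal vector `normal`."""
    a, b, c = normal
    x0, y0, z0 = point
    d = -(a * x0 + b * y0 + c * z0)   # constant term from expanding (1.25)
    return (a, b, c, d)


# Data of Example 1.23: point (-3, 1, 3), normal (2, 4, 8).
coefficients = normal_form((-3, 1, 3), (2, 4, 8))
print(coefficients)
```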


Plane containing three noncollinear points

In 2-dimensional and 3-dimensional space, two points determine a line. Two points
do not determine a plane in $\mathbb{R}^3$. In fact, three collinear points (i.e. all on the same
line) do not determine a plane; an infinite number of planes would contain the line
on which those three points lie. However, three noncollinear points do determine a
plane. For if Q, R and S are noncollinear points in $\mathbb{R}^3$, then $\overrightarrow{QR}$ and $\overrightarrow{QS}$ are nonzero
vectors which are not parallel (by noncollinearity), and so their cross product $\overrightarrow{QR} \times \overrightarrow{QS}$
is perpendicular to both $\overrightarrow{QR}$ and $\overrightarrow{QS}$. So $\overrightarrow{QR}$ and $\overrightarrow{QS}$ (and hence Q, R and S) lie in the
plane through the point Q with normal vector $\mathbf{n} = \overrightarrow{QR} \times \overrightarrow{QS}$ (see Figure 1.5.7).

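[Figure 1.5.7: Noncollinear points Q, R, S, with the vectors $\overrightarrow{QR}$, $\overrightarrow{QS}$ and the normal $\mathbf{n} = \overrightarrow{QR} \times \overrightarrow{QS}$]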

Example 1.24. Find the equation of the plane P containing the points (2, 1, 3), (1, −1, 2)

and (3, 2, 1).

Solution: Let Q = (2, 1, 3), R = (1, −1, 2) and S = (3, 2, 1). Then for the vectors
$\overrightarrow{QR} = (-1, -2, -1)$ and $\overrightarrow{QS} = (1, 1, -2)$, the plane P has a normal vector

$$\mathbf{n} = \overrightarrow{QR} \times \overrightarrow{QS} = (-1, -2, -1) \times (1, 1, -2) = (5, -3, 1)$$

So using formula (1.25) with the point Q (we could also use R or S ), the plane P consists

of all points (x, y, z) such that:

5(x − 2) − 3(y − 1) + (z − 3) = 0

or in normal form,

5x − 3y + z − 10 = 0
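The steps of Example 1.24 (form two edge vectors, take their cross product as the normal, then use the point-normal form) can be sketched as follows. The function name is hypothetical, and a plane is again represented by its normal-form coefficients (a, b, c, d).

```python
def cross(v, w):
    """Return the cross product v x w of two 3-vectors given as tuples."""
    return (v[1] * w[2] - v[2] * w[1],
            v[2] * w[0] - v[0] * w[2],
            v[0] * w[1] - v[1] * w[0])


def plane_through_points(q, r, s):
    """Coefficients (a, b, c, d) of the plane ax + by + cz + d = 0 through
    three noncollinear points q, r, s."""
    qr = tuple(b - a for a, b in zip(q, r))
    qs = tuple(b - a for a, b in zip(q, s))
    n = cross(qr, qs)                      # normal vector QR x QS
    if n == (0, 0, 0):
        raise ValueError("the three points are collinear")
    d = -sum(ni * qi for ni, qi in zip(n, q))
    return (*n, d)


Q, R, S = (2, 1, 3), (1, -1, 2), (3, 2, 1)
plane = plane_through_points(Q, R, S)
print(plane)
```

Using R or S in place of Q when computing d gives the same constant, because all three points satisfy the equation.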

We mentioned earlier that skew lines in $\mathbb{R}^3$ lie on separate, parallel planes. So

two skew lines do not determine a plane. But two (nonidentical) lines which either

intersect or are parallel do determine a plane. In both cases, to find the equation of

the plane that contains those two lines, simply pick from the two lines a total of three

noncollinear points (i.e. one point from one line and two points from the other), then

use the technique above, as in Example 1.24, to write the equation. We will leave

examples of this as exercises for the reader.


Distance between a point and a plane

The distance between a point in $\mathbb{R}^3$ and a plane is the length of the line segment
from that point to the plane which is perpendicular to the plane. The following theorem
gives a formula for that distance.

Theorem 1.19. Let $Q = (x_0, y_0, z_0)$ be a point in $\mathbb{R}^3$, and let P be a plane with normal
form ax + by + cz + d = 0 that does not contain Q. Then the distance D from Q to P is:

$$D = \frac{|ax_0 + by_0 + cz_0 + d|}{\sqrt{a^2 + b^2 + c^2}} \tag{1.27}$$

Proof: Let R = (x, y, z) be any point in the plane P (so that ax + by + cz + d = 0) and
let $\mathbf{r} = \overrightarrow{RQ} = (x_0 - x, y_0 - y, z_0 - z)$. Then $\mathbf{r} \ne \mathbf{0}$ since Q does not lie in P. From the
normal form equation for P, we know that $\mathbf{n} = (a, b, c)$ is a normal vector for P. Now,
any plane divides $\mathbb{R}^3$ into two disjoint parts. Assume that n points toward the side
of P where the point Q is located. Place n so that its initial point is at R, and let θ be
the angle between r and n. Then $0^\circ < \theta < 90^\circ$, so $\cos\theta > 0$. Thus, the distance D is
$\cos\theta \, \|\mathbf{r}\| = |\cos\theta| \, \|\mathbf{r}\|$ (see Figure 1.5.8).

[Figure 1.5.8: the point Q, the point R in the plane P, the vectors n and r, the angle θ, and the distance D]

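By Theorem 1.6 in Section 1.3, we know that $\cos\theta = \dfrac{\mathbf{n} \cdot \mathbf{r}}{\|\mathbf{n}\| \, \|\mathbf{r}\|}$, so

$$\begin{aligned}
D = |\cos\theta| \, \|\mathbf{r}\| &= \frac{|\mathbf{n} \cdot \mathbf{r}|}{\|\mathbf{n}\| \, \|\mathbf{r}\|} \, \|\mathbf{r}\| = \frac{|\mathbf{n} \cdot \mathbf{r}|}{\|\mathbf{n}\|} = \frac{|a(x_0 - x) + b(y_0 - y) + c(z_0 - z)|}{\sqrt{a^2 + b^2 + c^2}} \\
&= \frac{|ax_0 + by_0 + cz_0 - (ax + by + cz)|}{\sqrt{a^2 + b^2 + c^2}} = \frac{|ax_0 + by_0 + cz_0 - (-d)|}{\sqrt{a^2 + b^2 + c^2}} = \frac{|ax_0 + by_0 + cz_0 + d|}{\sqrt{a^2 + b^2 + c^2}}
\end{aligned}$$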

If n points away from the side of P where the point Q is located, then $90^\circ < \theta < 180^\circ$
and so $\cos\theta < 0$. The distance D is then $|\cos\theta| \, \|\mathbf{r}\|$, and thus repeating the same
argument as above still gives the same result. QED

Example 1.25. Find the distance D from (2, 4, −5) to the plane from Example 1.24.

Solution: Recall that the plane is given by 5x − 3y + z − 10 = 0. So

$$D = \frac{|5(2) - 3(4) + 1(-5) - 10|}{\sqrt{5^2 + (-3)^2 + 1^2}} = \frac{|-17|}{\sqrt{35}} = \frac{17}{\sqrt{35}} \approx 2.87$$
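Formula (1.27) can be evaluated mechanically. The sketch below repeats Example 1.25; the function name and the (a, b, c, d) representation of the plane are assumptions of this illustration.

```python
import math


def distance_point_plane(q, plane):
    """Distance from the point q to the plane ax + by + cz + d = 0,
    where plane = (a, b, c, d) and (a, b, c) is nonzero (formula (1.27))."""
    a, b, c, d = plane
    x0, y0, z0 = q
    return abs(a * x0 + b * y0 + c * z0 + d) / math.sqrt(a * a + b * b + c * c)


D = distance_point_plane((2, 4, -5), (5, -3, 1, -10))
print(round(D, 2))
```

For a point that lies in the plane the numerator vanishes, so the routine returns 0, which agrees with the geometric meaning even though Theorem 1.19 is stated for points not in the plane.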


Line of intersection of two planes

[Figure 1.5.9: two planes intersecting in a line L]

Note that two planes are parallel if they have normal vectors

that are parallel, and the planes are perpendicular if their normal

vectors are perpendicular. If two planes do intersect, they do so in

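a line (see Figure 1.5.9). Suppose that two planes $P_1$ and $P_2$ with
normal vectors $\mathbf{n}_1$ and $\mathbf{n}_2$, respectively, intersect in a line L. Since
$\mathbf{n}_1 \times \mathbf{n}_2 \perp \mathbf{n}_1$, then $\mathbf{n}_1 \times \mathbf{n}_2$ is parallel to the plane $P_1$. Likewise,
$\mathbf{n}_1 \times \mathbf{n}_2 \perp \mathbf{n}_2$ means that $\mathbf{n}_1 \times \mathbf{n}_2$ is also parallel to $P_2$. Thus, $\mathbf{n}_1 \times \mathbf{n}_2$
is parallel to the intersection of $P_1$ and $P_2$, i.e. $\mathbf{n}_1 \times \mathbf{n}_2$ is parallel to L. Thus, we can
write L in the following vector form:

$$L : \mathbf{r} + t(\mathbf{n}_1 \times \mathbf{n}_2), \quad \text{for } -\infty < t < \infty \tag{1.28}$$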

where r is any vector pointing to a point belonging to both planes. To find a point in

both planes, find a common solution (x, y, z) to the two normal form equations of the

planes. This can often be made easier by setting one of the coordinate variables to

zero, which leaves you to solve two equations in just two unknowns.

Example 1.26. Find the line of intersection L of the planes 5x − 3y + z − 10 = 0 and

2x + 4y − z + 3 = 0.

Solution: The plane 5x − 3y + z − 10 = 0 has normal vector $\mathbf{n}_1 = (5, -3, 1)$ and the
plane 2x + 4y − z + 3 = 0 has normal vector $\mathbf{n}_2 = (2, 4, -1)$. Since $\mathbf{n}_1$ and $\mathbf{n}_2$ are not scalar

multiples, then the two planes are not parallel and hence will intersect. A point (x, y, z)

on both planes will satisfy the following system of two equations in three unknowns:

5x − 3y + z − 10 = 0

2x + 4y − z + 3 = 0

Set x = 0 (why is that a good choice?). Then the above equations are reduced to:

−3y + z − 10 = 0

4y − z + 3 = 0

The second equation gives z = 4y + 3; substituting that into the first equation gives
y = 7. Then z = 31, and so the point (0, 7, 31) is on L. Since $\mathbf{n}_1 \times \mathbf{n}_2 = (-1, 7, 26)$, then L is
given by:

$$\mathbf{r} + t(\mathbf{n}_1 \times \mathbf{n}_2) = (0, 7, 31) + t(-1, 7, 26), \quad \text{for } -\infty < t < \infty$$

or in parametric form:

x = −t, y = 7 + 7t, z = 31 + 26t, for − ∞ < t < ∞
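The method of Example 1.26 can be sketched as a routine that returns a point on L together with the direction $\mathbf{n}_1 \times \mathbf{n}_2$. The rule used below for deciding which coordinate to set to zero is one possible choice and is an assumption of this sketch: it picks a coordinate whose component of $\mathbf{n}_1 \times \mathbf{n}_2$ is nonzero, which guarantees that the remaining two equations in two unknowns have a unique solution. Planes are given as (a, b, c, d) with integer or rational coefficients.

```python
from fractions import Fraction


def cross(v, w):
    """Return the cross product v x w of two 3-vectors given as tuples."""
    return (v[1] * w[2] - v[2] * w[1],
            v[2] * w[0] - v[0] * w[2],
            v[0] * w[1] - v[1] * w[0])


def plane_intersection(plane1, plane2):
    """Return (point, direction) for the line where two planes meet,
    or None if their normal vectors are parallel."""
    n1, d1 = plane1[:3], plane1[3]
    n2, d2 = plane2[:3], plane2[3]
    direction = cross(n1, n2)
    if direction == (0, 0, 0):
        return None
    # Set coordinate k to zero, where the direction has a nonzero component.
    k = next(idx for idx in range(3) if direction[idx] != 0)
    i, j = [idx for idx in range(3) if idx != k]
    # Solve n1[i]*p + n1[j]*q = -d1 and n2[i]*p + n2[j]*q = -d2 by Cramer's rule.
    det = n1[i] * n2[j] - n1[j] * n2[i]
    point = [Fraction(0)] * 3
    point[i] = Fraction(-d1 * n2[j] + n1[j] * d2, det)
    point[j] = Fraction(-n1[i] * d2 + n2[i] * d1, det)
    return tuple(point), direction


result = plane_intersection((5, -3, 1, -10), (2, 4, -1, 3))
print(result)
```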


Exercises

A

For Exercises 1-4, write the line L through the point P and parallel to the vector v in

the following forms: (a) vector, (b) parametric, and (c) symmetric.

1. P = (2, 3, −2), v = (5, 4, −3) 2. P = (3, −1, 2), v = (2, 8, 1)

3. P = (2, 1, 3), v = (1, 0, 1) 4. P = (0, 0, 0), v = (7, 2, −10)

For Exercises 5-6, write the line L through the points $P_1$ and $P_2$ in parametric form.

5. $P_1 = (1, -2, -3)$, $P_2 = (3, 5, 5)$ 6. $P_1 = (4, 1, 5)$, $P_2 = (-2, 1, 3)$

For Exercises 7-8, find the distance d from the point P to the line L.

7. P = (1, −1, −1), L : x = −2 − 2t, y = 4t, z = 7 + t

8. P = (0, 0, 0), L : x = 3 + 2t, y = 4 + 3t, z = 5 + 4t

For Exercises 9-10, find the point of intersection (if any) of the given lines.

9. x = 7 + 3s, y = −4 − 3s, z = −7 − 5s and x = 1 + 6t, y = 2 + t, z = 3 − 2t

10. $\dfrac{x - 6}{4} = y + 3 = z$ and $\dfrac{x - 11}{3} = \dfrac{y - 14}{-6} = \dfrac{z + 9}{2}$

For Exercises 11-12, write the normal form of the plane P containing the point Q and

perpendicular to the vector n.

11. Q = (5, 1, −2), n = (4, −4, 3) 12. Q = (6, −2, 0), n = (2, 6, 4)

For Exercises 13-14, write the normal form of the plane containing the given points.

13. (1, 0, 3), (1, 2, −1), (6, 1, 6) 14. (−3, 1, −3), (4, −4, 3), (0, 0, 1)

15. Write the normal form of the plane containing the lines from Exercise 9.

16. Write the normal form of the plane containing the lines from Exercise 10.

For Exercises 17-18, find the distance D from the point Q to the plane P.

17. Q = (4, 1, 2), P : 3x − y − 5z + 8 = 0 18. Q = (0, 2, 0), P : −5x + 2y − 7z + 1 = 0

For Exercises 19-20, find the line of intersection (if any) of the given planes.

19. x + 3y + 2z − 6 = 0, 2x − y + z + 2 = 0 20. 3x + y − 5z = 0, x + 2y + z + 4 = 0

B

21. Find the point(s) of intersection (if any) of the line $\dfrac{x - 6}{4} = y + 3 = z$ with the plane
x + 3y + 2z − 6 = 0. (Hint: Put the equations of the line into the equation of the plane.)


1.6 Surfaces

In the previous section we discussed planes in Euclidean space. A plane is an example
of a surface, which we will define informally⁸ as the solution set of the equation
F(x, y, z) = 0 in $\mathbb{R}^3$, for some real-valued function F. For example, a plane given by
ax + by + cz + d = 0 is the solution set of F(x, y, z) = 0 for the function F(x, y, z) = ax + by + cz + d.

Surfaces are 2-dimensional. The plane is the simplest surface, since it is “flat”. In this

section we will look at some surfaces that are more complex, the most important of

which are the sphere and the cylinder.

Definition 1.9. A sphere S is the set of all points (x, y, z) in $\mathbb{R}^3$ which are a fixed
distance r (called the radius) from a fixed point $P_0 = (x_0, y_0, z_0)$ (called the center of
the sphere):

$$S = \{\, (x, y, z) : (x - x_0)^2 + (y - y_0)^2 + (z - z_0)^2 = r^2 \,\} \tag{1.29}$$

Using vector notation, this can be written in the equivalent form:

$$S = \{\, \mathbf{x} : \|\mathbf{x} - \mathbf{x}_0\| = r \,\} \tag{1.30}$$

where $\mathbf{x} = (x, y, z)$ and $\mathbf{x}_0 = (x_0, y_0, z_0)$ are vectors.

Figure 1.6.1 illustrates the vectorial approach to spheres.

[Figure 1.6.1: Spheres in $\mathbb{R}^3$. (a) radius r, center (0, 0, 0), with $\|\mathbf{x}\| = r$; (b) radius r, center $(x_0, y_0, z_0)$, with $\|\mathbf{x} - \mathbf{x}_0\| = r$]

Note in Figure 1.6.1(a) that the intersection of the sphere with the xy-plane is a
circle of radius r (i.e. a great circle, given by $x^2 + y^2 = r^2$ as a subset of $\mathbb{R}^2$). Similarly
for the intersections with the xz-plane and the yz-plane. In general, a plane intersects
a sphere either at a single point or in a circle.

⁸ See O’NEILL for a deeper and more rigorous discussion of surfaces.


Example 1.27. Find the intersection of the sphere $x^2 + y^2 + z^2 = 169$ with the plane
z = 12.

Solution: The sphere is centered at the origin and has
radius $13 = \sqrt{169}$, so it does intersect the plane z = 12.
Putting z = 12 into the equation of the sphere gives

$$x^2 + y^2 + 12^2 = 169$$

$$x^2 + y^2 = 169 - 144 = 25 = 5^2$$

which is a circle of radius 5 centered at (0, 0, 12), parallel
to the xy-plane (see Figure 1.6.2).

[Figure 1.6.2: the sphere and its intersection with the plane z = 12]

If the equation in formula (1.29) is multiplied out, we get an equation of the form:

$$x^2 + y^2 + z^2 + ax + by + cz + d = 0 \tag{1.31}$$

for some constants a, b, c and d. Conversely, an equation of this form may describe a

sphere, which can be determined by completing the square for the x, y and z variables.

Example 1.28. Is $2x^2 + 2y^2 + 2z^2 - 8x + 4y - 16z + 10 = 0$ the equation of a sphere?

Solution: Dividing both sides of the equation by 2 gives

$$x^2 + y^2 + z^2 - 4x + 2y - 8z + 5 = 0$$

$$(x^2 - 4x + 4) + (y^2 + 2y + 1) + (z^2 - 8z + 16) + 5 - 4 - 1 - 16 = 0$$

$$(x - 2)^2 + (y + 1)^2 + (z - 4)^2 = 16$$

which is a sphere of radius 4 centered at (2, −1, 4).
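Completing the square in equation (1.31) gives center $(-a/2, -b/2, -c/2)$ and squared radius $(a^2 + b^2 + c^2)/4 - d$, so the equation describes a sphere exactly when that last quantity is positive. The sketch below applies this to Example 1.28. The function name and its leading-coefficient argument A (the common coefficient of $x^2$, $y^2$, $z^2$, assumed nonzero) are conventions of this illustration.

```python
from fractions import Fraction


def sphere_from_equation(A, a, b, c, d):
    """Center and squared radius for A(x^2 + y^2 + z^2) + ax + by + cz + d = 0.

    The equation is that of a sphere only if the returned squared radius
    is positive (zero gives a single point, negative gives no points).
    """
    # Divide through by A to reach the form of equation (1.31).
    a, b, c, d = (Fraction(k, A) for k in (a, b, c, d))
    center = (-a / 2, -b / 2, -c / 2)
    r_squared = (a * a + b * b + c * c) / 4 - d
    return center, r_squared


# 2x^2 + 2y^2 + 2z^2 - 8x + 4y - 16z + 10 = 0 from Example 1.28.
center, r_squared = sphere_from_equation(2, -8, 4, -16, 10)
print(center, r_squared)
```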

Example 1.29. Find the point(s) of intersection (if any) of the sphere from Example
1.28 and the line x = 3 + t, y = 1 + 2t, z = 3 − t.

Solution: Put the equations of the line into the equation of the sphere, which was
$(x - 2)^2 + (y + 1)^2 + (z - 4)^2 = 16$, and solve for t:

$$(3 + t - 2)^2 + (1 + 2t + 1)^2 + (3 - t - 4)^2 = 16$$

$$(t + 1)^2 + (2t + 2)^2 + (-t - 1)^2 = 16$$

$$6t^2 + 12t - 10 = 0$$

The quadratic formula gives the solutions $t = -1 \pm \frac{4}{\sqrt{6}}$. Putting those two values into
the equations of the line gives the following two points of intersection:

$$\left( 2 + \frac{4}{\sqrt{6}},\ -1 + \frac{8}{\sqrt{6}},\ 4 - \frac{4}{\sqrt{6}} \right) \quad \text{and} \quad \left( 2 - \frac{4}{\sqrt{6}},\ -1 - \frac{8}{\sqrt{6}},\ 4 + \frac{4}{\sqrt{6}} \right)$$
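The substitution in Example 1.29 always produces a quadratic in t. Writing the line as p + tv and the center as c, the quadratic is $(\mathbf{v} \cdot \mathbf{v})t^2 + 2(\mathbf{v} \cdot (\mathbf{p} - \mathbf{c}))t + \|\mathbf{p} - \mathbf{c}\|^2 - r^2 = 0$. The sketch below solves it in floating point; the function name is hypothetical, and a negative discriminant is reported as an empty list (no intersection).

```python
import math


def dot(v, w):
    """Return the dot product of two vectors of equal length."""
    return sum(a * b for a, b in zip(v, w))


def line_sphere_intersection(p, v, center, radius):
    """Return a list of (t, point) pairs where the line p + t v meets the sphere."""
    w = tuple(pi - ci for pi, ci in zip(p, center))
    A = dot(v, v)
    B = 2 * dot(v, w)
    C = dot(w, w) - radius ** 2
    disc = B * B - 4 * A * C
    if disc < 0:
        return []
    # A set removes the duplicate root when the line is tangent (disc == 0).
    roots = sorted({(-B + sign * math.sqrt(disc)) / (2 * A) for sign in (1, -1)})
    return [(t, tuple(pi + t * vi for pi, vi in zip(p, v))) for t in roots]


# Line and sphere of Example 1.29.
hits = line_sphere_intersection((3, 1, 3), (1, 2, -1), (2, -1, 4), 4)
for t, point in hits:
    print(t, point)
```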


If two spheres intersect, they do so either at a single point or in a circle.

Example 1.30. Find the intersection (if any) of the spheres $x^2 + y^2 + z^2 = 25$ and
$x^2 + y^2 + (z - 2)^2 = 16$.

Solution: For any point (x, y, z) on both spheres, we see that

$$x^2 + y^2 + z^2 = 25 \ \Rightarrow\ x^2 + y^2 = 25 - z^2, \text{ and}$$

$$x^2 + y^2 + (z - 2)^2 = 16 \ \Rightarrow\ x^2 + y^2 = 16 - (z - 2)^2, \text{ so}$$

$$16 - (z - 2)^2 = 25 - z^2 \ \Rightarrow\ 4z - 4 = 9 \ \Rightarrow\ z = 13/4$$

$$\Rightarrow\ x^2 + y^2 = 25 - (13/4)^2 = 231/16$$

∴ The intersection is the circle $x^2 + y^2 = \frac{231}{16}$ of radius $\frac{\sqrt{231}}{4} \approx 3.8$ centered at $(0, 0, \frac{13}{4})$.
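The elimination in Example 1.30 generalizes to two spheres with arbitrary centers. The sketch below is a standard generalization rather than the method of the text: with D the distance between the centers, the circle of intersection lies in a plane at signed distance $h = (D^2 + r_1^2 - r_2^2)/(2D)$ from the first center along the line of centers, and has radius $\sqrt{r_1^2 - h^2}$. The function name is hypothetical, and concentric spheres are not handled.

```python
import math


def sphere_sphere_circle(c1, r1, c2, r2):
    """Return (center, radius) of the circle where two spheres meet,
    or None if they do not intersect (concentric spheres also give None)."""
    axis = tuple(b - a for a, b in zip(c1, c2))      # from center 1 to center 2
    D = math.sqrt(sum(k * k for k in axis))
    if D == 0:
        return None
    h = (D * D + r1 * r1 - r2 * r2) / (2 * D)        # offset along the axis
    rho_squared = r1 * r1 - h * h
    if rho_squared < 0:
        return None
    center = tuple(a + (h / D) * k for a, k in zip(c1, axis))
    return center, math.sqrt(rho_squared)            # radius 0 means tangency


# The spheres of Example 1.30.
circle = sphere_sphere_circle((0, 0, 0), 5, (0, 0, 2), 4)
print(circle)
```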

The cylinders that we will consider are right circular cylinders. These are cylinders
obtained by moving a line L along a circle C in $\mathbb{R}^3$ in a way so that L is always perpendicular
to the plane containing C. We will only consider the cases where the plane
containing C is parallel to one of the three coordinate planes (see Figure 1.6.3).

[Figure 1.6.3: Cylinders in $\mathbb{R}^3$. (a) $x^2 + y^2 = r^2$, any z; (b) $x^2 + z^2 = r^2$, any y; (c) $y^2 + z^2 = r^2$, any x]

For example, the equation of a cylinder whose base circle C lies in the xy-plane and

is centered at (a, b, 0) and has radius r is

$$(x - a)^2 + (y - b)^2 = r^2, \tag{1.32}$$

where the value of the z coordinate is unrestricted. Similar equations can be written
when the base circle lies in one of the other coordinate planes. A plane intersects a
right circular cylinder in a circle, ellipse, or one or two lines, depending on whether
that plane is parallel, oblique⁹, or perpendicular, respectively, to the plane containing
C. The intersection of a surface with a plane is called the trace of the surface.

⁹ i.e. at an angle strictly between 0° and 90°.


The equations of spheres and cylinders are examples of second-degree equations in
$\mathbb{R}^3$, i.e. equations of the form

$$Ax^2 + By^2 + Cz^2 + Dxy + Exz + Fyz + Gx + Hy + Iz + J = 0 \tag{1.33}$$

for some constants A, B, . . . , J. If the above equation is not that of a sphere, cylinder,
plane, line or point, then the resulting surface is called a quadric surface.

[Figure 1.6.4: Ellipsoid, with semi-axes a, b, c along the coordinate axes]

One type of quadric surface is the ellipsoid,

given by an equation of the form:

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1 \tag{1.34}$$

In the case where a = b = c, this is just a sphere.

In general, an ellipsoid is egg-shaped (think of

an ellipse rotated around its major axis). Its

traces in the coordinate planes are ellipses.

Two other types of quadric surfaces are the hyperboloid of one sheet, given by

an equation of the form:

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} - \frac{z^2}{c^2} = 1 \tag{1.35}$$

and the hyperboloid of two sheets, whose equation has the form:

$$\frac{x^2}{a^2} - \frac{y^2}{b^2} - \frac{z^2}{c^2} = 1 \tag{1.36}$$

[Figure 1.6.5: Hyperboloid of one sheet]

[Figure 1.6.6: Hyperboloid of two sheets]


For the hyperboloid of one sheet, the trace in any plane parallel to the xy-plane is

an ellipse. The traces in the planes parallel to the xz- or yz-planes are hyperbolas (see

Figure 1.6.5), except for the special cases x = ±a and y = ±b; in those planes the traces

are pairs of intersecting lines (see Exercise 8).

For the hyperboloid of two sheets, the trace in any plane parallel to the xy- or xz-plane
is a hyperbola (see Figure 1.6.6). There is no trace in the yz-plane. In any plane
parallel to the yz-plane for which |x| > |a|, the trace is an ellipse.

[Figure 1.6.7: Paraboloid]

The elliptic paraboloid is another type of quadric surface,
whose equation has the form:

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} = \frac{z}{c} \tag{1.37}$$

The traces in planes parallel to the xy-plane are ellipses,
though in the xy-plane itself the trace is a single point. The
traces in planes parallel to the xz- or yz-planes are parabolas.
Figure 1.6.7 shows the case where c > 0. When c < 0 the
surface is turned downward. In the case where a = b, the
surface is called a paraboloid of revolution, which is often
used as a reflecting surface, e.g. in vehicle headlights.¹⁰

A more complicated quadric surface is the hyperbolic paraboloid, given by:

$$\frac{x^2}{a^2} - \frac{y^2}{b^2} = \frac{z}{c} \tag{1.38}$$

[Figure 1.6.8: Hyperbolic paraboloid, plotted over the x, y and z axes]

¹⁰ For a discussion of this see pp. 157-158 in HECHT.


The hyperbolic paraboloid can be tricky to draw; using graphing software on a computer
can make it easier. For example, Figure 1.6.8 was created using the free Gnuplot
package (see Appendix C). It shows the graph of the hyperbolic paraboloid $z = y^2 - x^2$,
which is the special case where a = b = 1 and c = −1 in equation (1.38). The mesh lines
on the surface are the traces in planes parallel to the coordinate planes. So we see
that the traces in planes parallel to the xz-plane are parabolas pointing upward, while
the traces in planes parallel to the yz-plane are parabolas pointing downward. Also,
notice that the traces in planes parallel to the xy-plane are hyperbolas, though in the
xy-plane itself the trace is a pair of intersecting lines through the origin. This is true
in general when c < 0 in equation (1.38). When c > 0, the surface would be similar to
that in Figure 1.6.8, only rotated 90° around the z-axis and the nature of the traces in
planes parallel to the xz- or yz-planes would be reversed.

[Figure 1.6.9: Elliptic cone]

The last type of quadric surface that we will consider is

the elliptic cone, which has an equation of the form:

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} - \frac{z^2}{c^2} = 0 \tag{1.39}$$

The traces in planes parallel to the xy-plane are ellipses,

except in the xy-plane itself where the trace is a single

point. The traces in planes parallel to the xz- or yz-planes

are hyperbolas, except in the xz- and yz-planes themselves

where the traces are pairs of intersecting lines.

Notice that every point on the elliptic cone is on a line

which lies entirely on the surface; in Figure 1.6.9 these

lines all go through the origin. This makes the elliptic

cone an example of a ruled surface. The cylinder is also a ruled surface.

What may not be as obvious is that both the hyperboloid of one sheet and the hyperbolic
paraboloid are ruled surfaces. In fact, on both surfaces there are two lines

through each point on the surface (see Exercises 11-12). Such surfaces are called

doubly ruled surfaces, and the pairs of lines are called a regulus.

It is clear that for each of the six types of quadric surfaces that we discussed, the
surface can be translated away from the origin (e.g. by replacing $x^2$ by $(x - x_0)^2$ in
its equation). It can be proved¹¹ that every quadric surface can be translated and/or
rotated so that its equation matches one of the six types that we described. For example,
z = 2xy is a case of equation (1.33) with “mixed” variables, e.g. with $D \ne 0$ so
that we get an xy term. This equation does not match any of the types we considered.
However, by rotating the x- and y-axes by 45° in the xy-plane by means of the coordinate
transformation $x = (x' - y')/\sqrt{2}$, $y = (x' + y')/\sqrt{2}$, $z = z'$, then z = 2xy becomes
the hyperbolic paraboloid $z' = (x')^2 - (y')^2$ in the $(x', y', z')$ coordinate system. That is,
z = 2xy is a hyperbolic paraboloid as in equation (1.38), but rotated 45° in the xy-plane.
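The claim about z = 2xy follows from $2xy = 2 \cdot \frac{(x' - y')(x' + y')}{2} = (x')^2 - (y')^2$. As a sanity check, the sketch below samples random points $(x', y')$, applies the coordinate transformation, and compares the two sides numerically. The function name, the sampling range and the tolerance are arbitrary choices of this illustration.

```python
import math
import random


def rotate_45(xp, yp):
    """Map (x', y') to (x, y) via x = (x' - y')/sqrt(2), y = (x' + y')/sqrt(2)."""
    s = math.sqrt(2)
    return (xp - yp) / s, (xp + yp) / s


random.seed(1)  # fixed seed so the check is reproducible
max_error = 0.0
for _ in range(1000):
    xp, yp = random.uniform(-10, 10), random.uniform(-10, 10)
    x, y = rotate_45(xp, yp)
    # z = 2xy in the old coordinates should equal (x')^2 - (y')^2.
    max_error = max(max_error, abs(2 * x * y - (xp ** 2 - yp ** 2)))

print(max_error < 1e-9)
```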

¹¹ See Ch. 7 in POGORELOV.


Exercises

A

For Exercises 1-4, determine if the given equation describes a sphere. If so, find its

radius and center.

1. $x^2 + y^2 + z^2 - 4x - 6y - 10z + 37 = 0$ 2. $x^2 + y^2 + z^2 + 2x - 2y - 8z + 19 = 0$

3. $2x^2 + 2y^2 + 2z^2 + 4x + 4y + 4z - 44 = 0$ 4. $x^2 + y^2 - z^2 + 12x + 2y - 4z + 32 = 0$

5. Find the point(s) of intersection of the sphere $(x - 3)^2 + (y + 1)^2 + (z - 3)^2 = 9$ and the
line x = −1 + 2t, y = −2 − 3t, z = 3 + t.

B

6. Find the intersection of the spheres $x^2 + y^2 + z^2 = 9$ and $(x - 4)^2 + (y + 2)^2 + (z - 4)^2 = 9$.

7. Find the intersection of the sphere $x^2 + y^2 + z^2 = 9$ and the cylinder $x^2 + y^2 = 4$.

8. Find the trace of the hyperboloid of one sheet $\frac{x^2}{a^2} + \frac{y^2}{b^2} - \frac{z^2}{c^2} = 1$ in the plane $x = a$, and the trace in the plane $y = b$.

9. Find the trace of the hyperbolic paraboloid $\frac{x^2}{a^2} - \frac{y^2}{b^2} = \frac{z}{c}$ in the xy-plane.

C

10. It can be shown that any four noncoplanar points (i.e. points that do not lie in the same plane) determine a sphere.[12] Find the equation of the sphere that passes through the points (0, 0, 0), (0, 0, 2), (1, −4, 3) and (0, −1, 3). (Hint: Equation (1.31))

11. Show that the hyperboloid of one sheet is a doubly ruled surface, i.e. each point on the surface is on two lines lying entirely on the surface. (Hint: Write equation (1.35) as $\frac{x^2}{a^2} - \frac{z^2}{c^2} = 1 - \frac{y^2}{b^2}$, factor each side. Recall that two planes intersect in a line.)

12. Show that the hyperbolic paraboloid is a doubly ruled surface. (Hint: Exercise 11)

[Figure 1.6.10: the sphere S, with the points (0, 0, 2), (a, b, c) and (x, y, 0)]

13. Let S be the sphere with radius 1 centered at (0, 0, 1), and let $S^*$ be S without the “north pole” point (0, 0, 2). Let (a, b, c) be an arbitrary point on $S^*$. Then the line passing through (0, 0, 2) and (a, b, c) intersects the xy-plane at some point (x, y, 0), as in Figure 1.6.10. Find this point (x, y, 0) in terms of a, b and c.

(Note: Every point in the xy-plane can be matched with a point on $S^*$, and vice versa, in this manner. This method is called stereographic projection, which essentially identifies all of $\mathbb{R}^2$ with a “punctured” sphere.)

[12] See WELCHONS and KRICKENBERGER, p. 160, for a proof.


1.7 Curvilinear Coordinates

[Figure 1.7.1]

The Cartesian coordinates of a point (x, y, z) are determined by following a family of straight paths from the origin: first along the x-axis, then parallel to the y-axis, then parallel to the z-axis, as in Figure 1.7.1. In curvilinear coordinate systems, these paths can be curved. The two types of curvilinear coordinates which we will consider are cylindrical and spherical coordinates. Instead of referencing a point in terms of sides of a rectangular parallelepiped, as with Cartesian coordinates, we will think of the point as lying on a cylinder or sphere. Cylindrical coordinates are often used when there is symmetry around the z-axis, while spherical coordinates are useful when there is symmetry about the origin.

Let $P = (x, y, z)$ be a point in Cartesian coordinates in $\mathbb{R}^3$, and let $P_0 = (x, y, 0)$ be the projection of P upon the xy-plane. Treating (x, y) as a point in $\mathbb{R}^2$, let (r, θ) be its polar coordinates (see Figure 1.7.2). Let ρ be the length of the line segment from the origin to P, and let φ be the angle between that line segment and the positive z-axis (see Figure 1.7.3). φ is called the zenith angle. Then the cylindrical coordinates (r, θ, z) and the spherical coordinates (ρ, θ, φ) of P(x, y, z) are defined as follows:

[Figure 1.7.2: Cylindrical coordinates]

Cylindrical coordinates $(r, \theta, z)$:

$$x = r\cos\theta \qquad\qquad r = \sqrt{x^2 + y^2}$$
$$y = r\sin\theta \qquad\qquad \theta = \tan^{-1}\left(\frac{y}{x}\right)$$
$$z = z \qquad\qquad z = z$$

where $0 \le \theta \le \pi$ if $y \ge 0$ and $\pi < \theta < 2\pi$ if $y < 0$

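[Figure 1.7.3: Spherical coordinates]

Spherical coordinates $(\rho, \theta, \phi)$:

$$x = \rho\sin\phi\cos\theta \qquad\qquad \rho = \sqrt{x^2 + y^2 + z^2}$$
$$y = \rho\sin\phi\sin\theta \qquad\qquad \theta = \tan^{-1}\left(\frac{y}{x}\right)$$
$$z = \rho\cos\phi \qquad\qquad \phi = \cos^{-1}\left(\frac{z}{\sqrt{x^2 + y^2 + z^2}}\right)$$

where $0 \le \theta \le \pi$ if $y \ge 0$ and $\pi < \theta < 2\pi$ if $y < 0$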

Both θ and φ are measured in radians. Note that r ≥ 0, 0 ≤ θ < 2π, ρ ≥ 0 and 0 ≤ φ ≤ π.

Also, θ is undefined when (x, y) = (0, 0), and φ is undefined when (x, y, z) = (0, 0, 0).


Example 1.31. Convert the point (−2, −2, 1) from Cartesian coordinates to (a) cylindrical and (b) spherical coordinates.

Solution: (a) $r = \sqrt{(-2)^2 + (-2)^2} = 2\sqrt{2}$, $\theta = \tan^{-1}\left(\frac{-2}{-2}\right) = \tan^{-1}(1) = \frac{5\pi}{4}$, since $y = -2 < 0$.

∴ $(r, \theta, z) = \left(2\sqrt{2}, \frac{5\pi}{4}, 1\right)$

(b) $\rho = \sqrt{(-2)^2 + (-2)^2 + 1^2} = \sqrt{9} = 3$, $\phi = \cos^{-1}\left(\frac{1}{3}\right) \approx 1.23$ radians.

∴ $(\rho, \theta, \phi) = \left(3, \frac{5\pi}{4}, 1.23\right)$
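The conversion formulas translate directly into code. The sketch below is an illustration rather than part of the text: the function names are hypothetical, and it uses `math.atan2` (reduced modulo 2π) in place of the quadrant rule stated with $\tan^{-1}(y/x)$, which should give the same θ in $[0, 2\pi)$ whenever (x, y) ≠ (0, 0).

```python
import math

def to_cylindrical(x, y, z):
    """Return (r, theta, z) with r >= 0 and 0 <= theta < 2*pi."""
    r = math.hypot(x, y)
    # atan2 handles all four quadrants; the modulo shifts negative angles into [0, 2*pi).
    # Note: the text leaves theta undefined when (x, y) = (0, 0); atan2 returns 0 there.
    theta = math.atan2(y, x) % (2 * math.pi)
    return r, theta, z

def to_spherical(x, y, z):
    """Return (rho, theta, phi) with rho > 0, 0 <= theta < 2*pi, 0 <= phi <= pi."""
    rho = math.sqrt(x * x + y * y + z * z)
    if rho == 0.0:
        raise ValueError("phi is undefined at the origin")
    theta = math.atan2(y, x) % (2 * math.pi)
    phi = math.acos(z / rho)
    return rho, theta, phi

# The point from Example 1.31
print(to_cylindrical(-2, -2, 1))
print(to_spherical(-2, -2, 1))
```

For the point (−2, −2, 1) this should reproduce r = 2√2, θ = 5π/4, ρ = 3 and φ = cos⁻¹(1/3) up to floating-point rounding.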

For cylindrical coordinates (r, θ, z), and constants $r_0$, $\theta_0$ and $z_0$, we see from Figure 1.7.4 that the surface $r = r_0$ is a cylinder of radius $r_0$ centered along the z-axis, the surface $\theta = \theta_0$ is a half-plane emanating from the z-axis, and the surface $z = z_0$ is a plane parallel to the xy-plane.

[Figure 1.7.4: Cylindrical coordinate surfaces: (a) $r = r_0$, (b) $\theta = \theta_0$, (c) $z = z_0$]

For spherical coordinates (ρ, θ, φ), and constants $\rho_0$, $\theta_0$ and $\phi_0$, we see from Figure 1.7.5 that the surface $\rho = \rho_0$ is a sphere of radius $\rho_0$ centered at the origin, the surface $\theta = \theta_0$ is a half-plane emanating from the z-axis, and the surface $\phi = \phi_0$ is a circular cone whose vertex is at the origin.

[Figure 1.7.5: Spherical coordinate surfaces: (a) $\rho = \rho_0$, (b) $\theta = \theta_0$, (c) $\phi = \phi_0$]

Figures 1.7.4(a) and 1.7.5(a) show how these coordinate systems got their names.


Sometimes the equation of a surface in Cartesian coordinates can be transformed

into a simpler equation in some other coordinate system, as in the following example.

Example 1.32. Write the equation of the cylinder $x^2 + y^2 = 4$ in cylindrical coordinates.

Solution: Since $r = \sqrt{x^2 + y^2}$, then the equation in cylindrical coordinates is $r = 2$.

Using spherical coordinates to write the equation of a sphere does not necessarily

make the equation simpler, if the sphere is not centered at the origin.

Example 1.33. Write the equation $(x - 2)^2 + (y - 1)^2 + z^2 = 9$ in spherical coordinates.

Solution: Multiplying the equation out gives

$$x^2 + y^2 + z^2 - 4x - 2y + 5 = 9 \text{ , so we get}$$
$$\rho^2 - 4\rho\sin\phi\cos\theta - 2\rho\sin\phi\sin\theta - 4 = 0 \text{ , or}$$
$$\rho^2 - 2\sin\phi\,(2\cos\theta + \sin\theta)\,\rho - 4 = 0$$

after combining terms. Note that this actually makes it more difficult to figure out what the surface is, as opposed to the Cartesian equation where you could immediately identify the surface as a sphere of radius 3 centered at (2, 1, 0).

Example 1.34. Describe the surface given by θ = z in cylindrical coordinates.

Solution: This surface is called a helicoid. As the (vertical) z coordinate increases, so does the angle θ, while the radius r is unrestricted. So this sweeps out a (ruled!) surface shaped like a spiral staircase, where the spiral has an infinite radius. Figure 1.7.6 shows a section of this surface restricted to 0 ≤ z ≤ 4π and 0 ≤ r ≤ 2.

[Figure 1.7.6: Helicoid θ = z]


Exercises

A

For Exercises 1-4, find the (a) cylindrical and (b) spherical coordinates of the point

whose Cartesian coordinates are given.

1. $(2, 2\sqrt{3}, -1)$    2. $(-5, 5, 6)$    3. $(\sqrt{21}, -\sqrt{7}, 0)$    4. $(0, \sqrt{2}, 2)$

For Exercises 5-7, write the given equation in (a) cylindrical and (b) spherical coordinates.

5. $x^2 + y^2 + z^2 = 25$    6. $x^2 + y^2 = 2y$    7. $x^2 + y^2 + 9z^2 = 36$

B

8. Describe the intersection of the surfaces whose equations in spherical coordinates are $\theta = \frac{\pi}{2}$ and $\phi = \frac{\pi}{4}$.

9. Show that for $a \neq 0$, the equation $\rho = 2a\sin\phi\cos\theta$ in spherical coordinates describes a sphere centered at (a, 0, 0) with radius |a|.

C

10. Let P = (a, θ, φ) be a point in spherical coordinates, with a > 0 and 0 < φ < π. Then

P lies on the sphere ρ = a. Since 0 < φ < π, the line segment from the origin to P

can be extended to intersect the cylinder given by r = a (in cylindrical coordinates).

Find the cylindrical coordinates of that point of intersection.

11. Let $P_1$ and $P_2$ be points whose spherical coordinates are $(\rho_1, \theta_1, \phi_1)$ and $(\rho_2, \theta_2, \phi_2)$, respectively. Let $\mathbf{v}_1$ be the vector from the origin to $P_1$, and let $\mathbf{v}_2$ be the vector from the origin to $P_2$. For the angle γ between $\mathbf{v}_1$ and $\mathbf{v}_2$, show that

$$\cos\gamma = \cos\phi_1\cos\phi_2 + \sin\phi_1\sin\phi_2\cos(\theta_2 - \theta_1).$$

This formula is used in electrodynamics to prove the addition theorem for spherical harmonics, which provides a general expression for the electrostatic potential at a point due to a unit charge. See pp. 100-102 in JACKSON.

12. Show that the distance d between the points $P_1$ and $P_2$ with cylindrical coordinates $(r_1, \theta_1, z_1)$ and $(r_2, \theta_2, z_2)$, respectively, is

$$d = \sqrt{r_1^2 + r_2^2 - 2r_1r_2\cos(\theta_2 - \theta_1) + (z_2 - z_1)^2}\,.$$

13. Show that the distance d between the points $P_1$ and $P_2$ with spherical coordinates $(\rho_1, \theta_1, \phi_1)$ and $(\rho_2, \theta_2, \phi_2)$, respectively, is

$$d = \sqrt{\rho_1^2 + \rho_2^2 - 2\rho_1\rho_2\,[\sin\phi_1\sin\phi_2\cos(\theta_2 - \theta_1) + \cos\phi_1\cos\phi_2]}\,.$$


1.8 Vector-Valued Functions

Now that we are familiar with vectors and their operations, we can begin discussing

functions whose values are vectors.

Definition 1.10. A vector-valued function of a real variable is a rule that associates a vector $\mathbf{f}(t)$ with a real number t, where t is in some subset D of $\mathbb{R}^1$ (called the domain of $\mathbf{f}$). We write $\mathbf{f} : D \to \mathbb{R}^3$ to denote that $\mathbf{f}$ is a mapping of D into $\mathbb{R}^3$.

For example, $\mathbf{f}(t) = t\,\mathbf{i} + t^2\,\mathbf{j} + t^3\,\mathbf{k}$ is a vector-valued function in $\mathbb{R}^3$, defined for all real numbers t. We would write $\mathbf{f} : \mathbb{R} \to \mathbb{R}^3$. At t = 1 the value of the function is the vector $\mathbf{i} + \mathbf{j} + \mathbf{k}$, which in Cartesian coordinates has the terminal point (1, 1, 1).

A vector-valued function of a real variable can be written in component form as

$$\mathbf{f}(t) = f_1(t)\,\mathbf{i} + f_2(t)\,\mathbf{j} + f_3(t)\,\mathbf{k}$$

or in the form

$$\mathbf{f}(t) = (f_1(t), f_2(t), f_3(t))$$

for some real-valued functions $f_1(t)$, $f_2(t)$, $f_3(t)$, called the component functions of $\mathbf{f}$. The first form is often used when emphasizing that $\mathbf{f}(t)$ is a vector, and the second form is useful when considering just the terminal points of the vectors. By identifying vectors with their terminal points, a curve in space can be written as a vector-valued function.

[Figure 1.8.1: the helix, with the points $\mathbf{f}(0)$ and $\mathbf{f}(2\pi)$ marked]

Example 1.35. Define $\mathbf{f} : \mathbb{R} \to \mathbb{R}^3$ by $\mathbf{f}(t) = (\cos t, \sin t, t)$. This is the equation of a helix (see Figure 1.8.1). As the value of t increases, the terminal points of $\mathbf{f}(t)$ trace out a curve spiraling upward. For each t, the x- and y-coordinates of $\mathbf{f}(t)$ are $x = \cos t$ and $y = \sin t$, so

$$x^2 + y^2 = \cos^2 t + \sin^2 t = 1.$$

Thus, the curve lies on the surface of the right circular cylinder $x^2 + y^2 = 1$.

It may help to think of vector-valued functions of a real variable in $\mathbb{R}^3$ as a generalization of the parametric functions in $\mathbb{R}^2$ which you learned about in single-variable calculus. Much of the theory of real-valued functions of a single real variable can be applied to vector-valued functions of a real variable. Since each of the three component functions is real-valued, it will sometimes be the case that results from single-variable calculus can simply be applied to each of the component functions to yield a similar result for the vector-valued function. However, there are times when such generalizations do not hold (see Exercise 13). The concept of a limit, though, can be extended naturally to vector-valued functions, as in the following definition.


Definition 1.11. Let $\mathbf{f}(t)$ be a vector-valued function, let a be a real number and let $\mathbf{c}$ be a vector. Then we say that the limit of $\mathbf{f}(t)$ as t approaches a equals $\mathbf{c}$, written as $\lim_{t \to a} \mathbf{f}(t) = \mathbf{c}$, if $\lim_{t \to a} \|\mathbf{f}(t) - \mathbf{c}\| = 0$. If $\mathbf{f}(t) = (f_1(t), f_2(t), f_3(t))$, then

$$\lim_{t \to a} \mathbf{f}(t) = \left(\lim_{t \to a} f_1(t),\ \lim_{t \to a} f_2(t),\ \lim_{t \to a} f_3(t)\right)$$

provided that all three limits on the right side exist.

The above definition shows that continuity and the derivative of vector-valued functions can also be defined in terms of its component functions.

Definition 1.12. Let $\mathbf{f}(t) = (f_1(t), f_2(t), f_3(t))$ be a vector-valued function, and let a be a real number in its domain. Then $\mathbf{f}(t)$ is continuous at a if $\lim_{t \to a} \mathbf{f}(t) = \mathbf{f}(a)$. Equivalently, $\mathbf{f}(t)$ is continuous at a if and only if $f_1(t)$, $f_2(t)$, and $f_3(t)$ are continuous at a.

The derivative of $\mathbf{f}(t)$ at a, denoted by $\mathbf{f}'(a)$ or $\frac{d\mathbf{f}}{dt}(a)$, is the limit

$$\mathbf{f}'(a) = \lim_{h \to 0} \frac{\mathbf{f}(a + h) - \mathbf{f}(a)}{h}$$

if that limit exists. Equivalently, $\mathbf{f}'(a) = (f_1'(a), f_2'(a), f_3'(a))$, if the component derivatives exist. We say that $\mathbf{f}(t)$ is differentiable at a if $\mathbf{f}'(a)$ exists.

Recall that the derivative of a real-valued function of a single variable is a real

number, representing the slope of the tangent line to the graph of the function at a

point. Similarly, the derivative of a vector-valued function is a tangent vector to the

curve in space which the function represents, and it lies on the tangent line to the

curve (see Figure 1.8.2).

[Figure 1.8.2: Tangent vector $\mathbf{f}'(a)$ and tangent line $L = \mathbf{f}(a) + s\,\mathbf{f}'(a)$]

Example 1.36. Let $\mathbf{f}(t) = (\cos t, \sin t, t)$. Then $\mathbf{f}'(t) = (-\sin t, \cos t, 1)$ for all t. The tangent line L to the curve at $\mathbf{f}(2\pi) = (1, 0, 2\pi)$ is $L = \mathbf{f}(2\pi) + s\,\mathbf{f}'(2\pi) = (1, 0, 2\pi) + s(0, 1, 1)$, or in parametric form: $x = 1$, $y = s$, $z = 2\pi + s$ for $-\infty < s < \infty$.
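The componentwise definition of the derivative can be illustrated numerically. The sketch below is an assumption-laden illustration, not the author's method: it approximates $\mathbf{f}'(a)$ with a central difference in each component (the helper names `derivative` and `tangent_line` are hypothetical) and then evaluates points on the tangent line $\mathbf{f}(a) + s\,\mathbf{f}'(a)$ for the helix of Example 1.36.

```python
import math

def f(t):
    """The helix from Example 1.36."""
    return (math.cos(t), math.sin(t), t)

def derivative(func, a, h=1e-6):
    """Central-difference approximation to func'(a), one component at a time."""
    plus = func(a + h)
    minus = func(a - h)
    return tuple((p - m) / (2 * h) for p, m in zip(plus, minus))

def tangent_line(func, a, s):
    """Point with parameter s on the line f(a) + s f'(a)."""
    base = func(a)
    direction = derivative(func, a)
    return tuple(b + s * d for b, d in zip(base, direction))

print(derivative(f, 2 * math.pi))       # close to (0, 1, 1)
print(tangent_line(f, 2 * math.pi, 1))  # close to (1, 1, 2*pi + 1)
```

The printed values are approximations; the exact derivative is still $(-\sin t, \cos t, 1)$ as computed in the example.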


A scalar function is a real-valued function. Note that if u(t) is a scalar function

and f(t) is a vector-valued function, then their product, defined by (u f)(t) = u(t) f(t) for

all t, is a vector-valued function (since the product of a scalar with a vector is a vector).

The basic properties of derivatives of vector-valued functions are summarized in the

following theorem.

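Theorem 1.20. Let $\mathbf{f}(t)$ and $\mathbf{g}(t)$ be differentiable vector-valued functions, let u(t) be a differentiable scalar function, let k be a scalar, and let $\mathbf{c}$ be a constant vector. Then

(a) $\dfrac{d}{dt}(\mathbf{c}) = \mathbf{0}$

(b) $\dfrac{d}{dt}(k\mathbf{f}) = k\,\dfrac{d\mathbf{f}}{dt}$

(c) $\dfrac{d}{dt}(\mathbf{f} + \mathbf{g}) = \dfrac{d\mathbf{f}}{dt} + \dfrac{d\mathbf{g}}{dt}$

(d) $\dfrac{d}{dt}(\mathbf{f} - \mathbf{g}) = \dfrac{d\mathbf{f}}{dt} - \dfrac{d\mathbf{g}}{dt}$

(e) $\dfrac{d}{dt}(u\,\mathbf{f}) = \dfrac{du}{dt}\,\mathbf{f} + u\,\dfrac{d\mathbf{f}}{dt}$

(f) $\dfrac{d}{dt}(\mathbf{f} \cdot \mathbf{g}) = \dfrac{d\mathbf{f}}{dt} \cdot \mathbf{g} + \mathbf{f} \cdot \dfrac{d\mathbf{g}}{dt}$

(g) $\dfrac{d}{dt}(\mathbf{f} \times \mathbf{g}) = \dfrac{d\mathbf{f}}{dt} \times \mathbf{g} + \mathbf{f} \times \dfrac{d\mathbf{g}}{dt}$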

Proof: The proofs of parts (a)-(e) follow easily by differentiating the component functions and using the rules for derivatives from single-variable calculus. We will prove part (f), and leave the proof of part (g) as an exercise for the reader.

(f) Write $\mathbf{f}(t) = (f_1(t), f_2(t), f_3(t))$ and $\mathbf{g}(t) = (g_1(t), g_2(t), g_3(t))$, where the component functions $f_1(t)$, $f_2(t)$, $f_3(t)$, $g_1(t)$, $g_2(t)$, $g_3(t)$ are all differentiable real-valued functions. Then

$$\begin{aligned}
\frac{d}{dt}(\mathbf{f}(t) \cdot \mathbf{g}(t)) &= \frac{d}{dt}\bigl(f_1(t)\,g_1(t) + f_2(t)\,g_2(t) + f_3(t)\,g_3(t)\bigr)\\
&= \frac{d}{dt}\bigl(f_1(t)\,g_1(t)\bigr) + \frac{d}{dt}\bigl(f_2(t)\,g_2(t)\bigr) + \frac{d}{dt}\bigl(f_3(t)\,g_3(t)\bigr)\\
&= \frac{df_1}{dt}(t)\,g_1(t) + f_1(t)\,\frac{dg_1}{dt}(t) + \frac{df_2}{dt}(t)\,g_2(t) + f_2(t)\,\frac{dg_2}{dt}(t) + \frac{df_3}{dt}(t)\,g_3(t) + f_3(t)\,\frac{dg_3}{dt}(t)\\
&= \left(\frac{df_1}{dt}(t), \frac{df_2}{dt}(t), \frac{df_3}{dt}(t)\right) \cdot (g_1(t), g_2(t), g_3(t))\\
&\quad + (f_1(t), f_2(t), f_3(t)) \cdot \left(\frac{dg_1}{dt}(t), \frac{dg_2}{dt}(t), \frac{dg_3}{dt}(t)\right)\\
&= \frac{d\mathbf{f}}{dt}(t) \cdot \mathbf{g}(t) + \mathbf{f}(t) \cdot \frac{d\mathbf{g}}{dt}(t) \quad\text{for all } t. \qquad\text{QED}
\end{aligned}$$
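The product rules in parts (f) and (g) can be sanity-checked numerically. The sketch below is only an illustration under stated assumptions: the sample functions $\mathbf{f}(t) = (t, t^2, t^3)$ and $\mathbf{g}(t) = (\cos t, \sin t, t)$, and all helper names, are choices made here, not taken from the text. It compares a central-difference derivative of the product with the right-hand side of each rule.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

# Sample functions (an assumption of this sketch) with their exact derivatives.
def f(t):  return (t, t ** 2, t ** 3)
def df(t): return (1.0, 2 * t, 3 * t ** 2)
def g(t):  return (math.cos(t), math.sin(t), t)
def dg(t): return (-math.sin(t), math.cos(t), 1.0)

def dot_rule_gap(t, h=1e-5):
    """|numerical d/dt (f . g)  -  (f' . g + f . g')| at t."""
    lhs = (dot(f(t + h), g(t + h)) - dot(f(t - h), g(t - h))) / (2 * h)
    rhs = dot(df(t), g(t)) + dot(f(t), dg(t))
    return abs(lhs - rhs)

def cross_rule_gap(t, h=1e-5):
    """Largest componentwise gap between numerical d/dt (f x g) and f' x g + f x g'."""
    hi = cross(f(t + h), g(t + h))
    lo = cross(f(t - h), g(t - h))
    lhs = tuple((a - b) / (2 * h) for a, b in zip(hi, lo))
    rhs = tuple(a + b for a, b in zip(cross(df(t), g(t)), cross(f(t), dg(t))))
    return max(abs(a - b) for a, b in zip(lhs, rhs))

print(dot_rule_gap(0.7), cross_rule_gap(0.7))
```

Both gaps should be tiny (limited by the finite-difference step), which is consistent with the theorem but of course does not prove it.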


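Example 1.37. Suppose $\mathbf{f}(t)$ is differentiable. Find the derivative of $\|\mathbf{f}(t)\|$.

Solution: Since $\|\mathbf{f}(t)\|$ is a real-valued function of t, then by the Chain Rule for real-valued functions, we know that

$$\frac{d}{dt}\|\mathbf{f}(t)\|^2 = 2\,\|\mathbf{f}(t)\|\,\frac{d}{dt}\|\mathbf{f}(t)\|.$$

But $\|\mathbf{f}(t)\|^2 = \mathbf{f}(t) \cdot \mathbf{f}(t)$, so $\frac{d}{dt}\|\mathbf{f}(t)\|^2 = \frac{d}{dt}(\mathbf{f}(t) \cdot \mathbf{f}(t))$. Hence, we have

$$\begin{aligned}
2\,\|\mathbf{f}(t)\|\,\frac{d}{dt}\|\mathbf{f}(t)\| &= \frac{d}{dt}(\mathbf{f}(t) \cdot \mathbf{f}(t)) = \mathbf{f}'(t) \cdot \mathbf{f}(t) + \mathbf{f}(t) \cdot \mathbf{f}'(t) \quad\text{by Theorem 1.20(f), so}\\
&= 2\,\mathbf{f}'(t) \cdot \mathbf{f}(t),
\end{aligned}$$

so if $\|\mathbf{f}(t)\| \neq 0$ then

$$\frac{d}{dt}\|\mathbf{f}(t)\| = \frac{\mathbf{f}'(t) \cdot \mathbf{f}(t)}{\|\mathbf{f}(t)\|}.$$

We know that $\|\mathbf{f}(t)\|$ is constant if and only if $\frac{d}{dt}\|\mathbf{f}(t)\| = 0$ for all t. Also, $\mathbf{f}(t) \perp \mathbf{f}'(t)$ if and only if $\mathbf{f}'(t) \cdot \mathbf{f}(t) = 0$. Thus, the above example shows this important fact:

If $\|\mathbf{f}(t)\| \neq 0$, then $\|\mathbf{f}(t)\|$ is constant if and only if $\mathbf{f}(t) \perp \mathbf{f}'(t)$ for all t.

This means that if a curve lies completely on a sphere (or circle) centered at the origin, then the tangent vector $\mathbf{f}'(t)$ is always perpendicular to the position vector $\mathbf{f}(t)$.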

Example 1.38. The spherical spiral

$$\mathbf{f}(t) = \left(\frac{\cos t}{\sqrt{1 + a^2t^2}},\ \frac{\sin t}{\sqrt{1 + a^2t^2}},\ \frac{-at}{\sqrt{1 + a^2t^2}}\right), \quad\text{for } a \neq 0.$$

Figure 1.8.3 shows the graph of the curve when a = 0.2. In the exercises, the reader will be asked to show that this curve lies on the sphere $x^2 + y^2 + z^2 = 1$ and to verify directly that $\mathbf{f}'(t) \cdot \mathbf{f}(t) = 0$ for all t.
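Before attempting the proof, both claims can be probed numerically. The sketch below is a hedged illustration only (the helper names and the sample values of t are assumptions of this sketch); it evaluates the spiral for a = 0.2, checks that its norm is 1, and checks that a central-difference tangent vector is nearly perpendicular to the position vector.

```python
import math

A = 0.2  # the value of a used in Figure 1.8.3

def spiral(t, a=A):
    """Spherical spiral from Example 1.38."""
    s = math.sqrt(1 + a * a * t * t)
    return (math.cos(t) / s, math.sin(t) / s, -a * t / s)

def norm(v):
    return math.sqrt(sum(c * c for c in v))

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

def derivative(func, t, h=1e-6):
    """Central-difference approximation to func'(t)."""
    plus, minus = func(t + h), func(t - h)
    return tuple((p - m) / (2 * h) for p, m in zip(plus, minus))

for t in (-3.0, 0.0, 1.5, 10.0):
    print(t, norm(spiral(t)), dot(derivative(spiral, t), spiral(t)))
```

The second printed column should be 1 up to rounding and the third close to 0. This is numerical evidence, not a substitute for the exercise.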

[Figure 1.8.3: Spherical spiral with a = 0.2]


Just as in single-variable calculus, higher-order derivatives of vector-valued functions are obtained by repeatedly differentiating the (first) derivative of the function:

$$\mathbf{f}''(t) = \frac{d}{dt}\mathbf{f}'(t)\,, \quad \mathbf{f}'''(t) = \frac{d}{dt}\mathbf{f}''(t)\,, \quad \ldots\,, \quad \frac{d^n\mathbf{f}}{dt^n} = \frac{d}{dt}\left(\frac{d^{n-1}\mathbf{f}}{dt^{n-1}}\right) \quad (\text{for } n = 2, 3, 4, \ldots)$$

We can use vector-valued functions to represent physical quantities, such as velocity, acceleration, force, momentum, etc. For example, let the real variable t represent time elapsed from some initial time (t = 0), and suppose that an object of constant mass m is subjected to some force so that it moves in space, with its position (x, y, z) at time t a function of t. That is, x = x(t), y = y(t), z = z(t) for some real-valued functions x(t), y(t), z(t). Call $\mathbf{r}(t) = (x(t), y(t), z(t))$ the position vector of the object. We can define various physical quantities associated with the object as follows:[13]

position: $\mathbf{r}(t) = (x(t), y(t), z(t))$

velocity: $\mathbf{v}(t) = \dot{\mathbf{r}}(t) = \mathbf{r}'(t) = \dfrac{d\mathbf{r}}{dt} = (x'(t), y'(t), z'(t))$

acceleration: $\mathbf{a}(t) = \dot{\mathbf{v}}(t) = \mathbf{v}'(t) = \dfrac{d\mathbf{v}}{dt} = \ddot{\mathbf{r}}(t) = \mathbf{r}''(t) = \dfrac{d^2\mathbf{r}}{dt^2} = (x''(t), y''(t), z''(t))$

momentum: $\mathbf{p}(t) = m\,\mathbf{v}(t)$

force: $\mathbf{F}(t) = \dot{\mathbf{p}}(t) = \mathbf{p}'(t) = \dfrac{d\mathbf{p}}{dt}$ (Newton’s Second Law of Motion)

The magnitude $\|\mathbf{v}(t)\|$ of the velocity vector is called the speed of the object. Note that since the mass m is a constant, the force equation becomes the familiar $\mathbf{F}(t) = m\,\mathbf{a}(t)$.

Example 1.39. Let $\mathbf{r}(t) = (5\cos t, 3\sin t, 4\sin t)$ be the position vector of an object at time t ≥ 0. Find its (a) velocity and (b) acceleration vectors.

Solution: (a) $\mathbf{v}(t) = \dot{\mathbf{r}}(t) = (-5\sin t, 3\cos t, 4\cos t)$

(b) $\mathbf{a}(t) = \dot{\mathbf{v}}(t) = (-5\cos t, -3\sin t, -4\sin t)$

Note that $\|\mathbf{r}(t)\| = \sqrt{25\cos^2 t + 25\sin^2 t} = 5$ for all t, so by Example 1.37 we know that $\mathbf{r}(t) \cdot \dot{\mathbf{r}}(t) = 0$ for all t (which we can verify from part (a)). In fact, $\|\mathbf{v}(t)\| = 5$ for all t also. And not only does $\mathbf{r}(t)$ lie on the sphere of radius 5 centered at the origin, but perhaps not so obvious is that it lies completely within a circle of radius 5 centered at the origin. Also, note that $\mathbf{a}(t) = -\mathbf{r}(t)$. It turns out (see Exercise 16) that whenever an object moves in a circle with constant speed, the acceleration vector will point in the opposite direction of the position vector (i.e. towards the center of the circle).
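One way to see the circle claim is to split $\mathbf{r}(t)$ along two perpendicular unit vectors. The vectors $\mathbf{e}_1$ and $\mathbf{e}_2$ below are auxiliary notation introduced for this sketch and do not appear in the original text:

```latex
\begin{align*}
\mathbf{r}(t) &= (5\cos t,\ 3\sin t,\ 4\sin t)
              = 5\cos t\,\mathbf{e}_1 + 5\sin t\,\mathbf{e}_2 ,\\
\mathbf{e}_1 &= (1, 0, 0), \qquad
\mathbf{e}_2 = \left(0, \tfrac{3}{5}, \tfrac{4}{5}\right),\\
\|\mathbf{e}_1\| &= \|\mathbf{e}_2\| = 1, \qquad
\mathbf{e}_1 \cdot \mathbf{e}_2 = 0 .
\end{align*}
```

Since $\mathbf{e}_1$ and $\mathbf{e}_2$ are orthonormal, $\mathbf{r}(t)$ traces the circle of radius 5 centered at the origin in the plane they span, which is the plane $4y - 3z = 0$.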

[13] We will often use the older dot notation for derivatives when physics is involved.


Recall from Section 1.5 that if $\mathbf{r}_1$, $\mathbf{r}_2$ are position vectors to distinct points then $\mathbf{r}_1 + t(\mathbf{r}_2 - \mathbf{r}_1)$ represents a line through those two points as t varies over all real numbers. That vector sum can be written as $(1 - t)\mathbf{r}_1 + t\,\mathbf{r}_2$. So the function $\mathbf{l}(t) = (1 - t)\mathbf{r}_1 + t\,\mathbf{r}_2$ is a line through the terminal points of $\mathbf{r}_1$ and $\mathbf{r}_2$, and when t is restricted to the interval [0, 1] it is the line segment between the points, with $\mathbf{l}(0) = \mathbf{r}_1$ and $\mathbf{l}(1) = \mathbf{r}_2$.

In general, a function of the form $\mathbf{f}(t) = (a_1t + b_1,\ a_2t + b_2,\ a_3t + b_3)$ represents a line in $\mathbb{R}^3$. A function of the form $\mathbf{f}(t) = (a_1t^2 + b_1t + c_1,\ a_2t^2 + b_2t + c_2,\ a_3t^2 + b_3t + c_3)$ represents a (possibly degenerate) parabola in $\mathbb{R}^3$.

Example 1.40. Bézier curves are used in Computer Aided Design (CAD) to approximate the shape of a polygonal path in space (called the Bézier polygon or control polygon). For instance, given three points (or position vectors) $\mathbf{b}_0$, $\mathbf{b}_1$, $\mathbf{b}_2$ in $\mathbb{R}^3$, define

$$\begin{aligned}
\mathbf{b}_0^1(t) &= (1 - t)\,\mathbf{b}_0 + t\,\mathbf{b}_1\\
\mathbf{b}_1^1(t) &= (1 - t)\,\mathbf{b}_1 + t\,\mathbf{b}_2\\
\mathbf{b}_0^2(t) &= (1 - t)\,\mathbf{b}_0^1(t) + t\,\mathbf{b}_1^1(t)\\
&= (1 - t)^2\,\mathbf{b}_0 + 2t(1 - t)\,\mathbf{b}_1 + t^2\,\mathbf{b}_2
\end{aligned}$$

for all real t. For t in the interval [0, 1], we see that $\mathbf{b}_0^1(t)$ is the line segment between $\mathbf{b}_0$ and $\mathbf{b}_1$, and $\mathbf{b}_1^1(t)$ is the line segment between $\mathbf{b}_1$ and $\mathbf{b}_2$. The function $\mathbf{b}_0^2(t)$ is the Bézier curve for the points $\mathbf{b}_0$, $\mathbf{b}_1$, $\mathbf{b}_2$. Note from the last formula that the curve is a parabola that goes through $\mathbf{b}_0$ (when t = 0) and $\mathbf{b}_2$ (when t = 1).

As an example, let $\mathbf{b}_0 = (0, 0, 0)$, $\mathbf{b}_1 = (1, 2, 3)$, and $\mathbf{b}_2 = (4, 5, 2)$. Then the explicit formula for the Bézier curve is $\mathbf{b}_0^2(t) = (2t + 2t^2,\ 4t + t^2,\ 6t - 4t^2)$, as shown in Figure 1.8.4, where the line segments are $\mathbf{b}_0^1(t)$ and $\mathbf{b}_1^1(t)$, and the curve is $\mathbf{b}_0^2(t)$.
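The repeated linear interpolation in this example can be sketched in a few lines of code. This is a minimal illustration, not the text's own implementation: the function names `lerp`, `de_casteljau` and `explicit` are hypothetical. It evaluates the curve for the control points above and compares the result with the explicit formula.

```python
def lerp(p, q, t):
    """Linear interpolation (1 - t) p + t q between two points."""
    return tuple((1 - t) * a + t * b for a, b in zip(p, q))

def de_casteljau(points, t):
    """Evaluate the Bezier curve of the control points at t by repeated interpolation."""
    pts = [tuple(p) for p in points]
    while len(pts) > 1:
        # Each pass replaces n points by the n - 1 interpolants of neighbouring pairs.
        pts = [lerp(pts[i], pts[i + 1], t) for i in range(len(pts) - 1)]
    return pts[0]

def explicit(t):
    """Explicit formula from Example 1.40 for b0=(0,0,0), b1=(1,2,3), b2=(4,5,2)."""
    return (2 * t + 2 * t * t, 4 * t + t * t, 6 * t - 4 * t * t)

control = [(0, 0, 0), (1, 2, 3), (4, 5, 2)]
for t in (0.0, 0.25, 0.5, 1.0):
    print(t, de_casteljau(control, t), explicit(t))
```

The same `de_casteljau` function works unchanged for more control points, since each pass of the loop reduces the number of points by one.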

[Figure 1.8.4: Bézier curve approximation for three points (0,0,0), (1,2,3), (4,5,2)]


In general, the polygonal path determined by n ≥ 3 noncollinear points in $\mathbb{R}^3$ can be used to define the Bézier curve recursively by a process called repeated linear interpolation. This curve will be a vector-valued function whose components are polynomials of degree n − 1, and its formula is given by de Casteljau’s algorithm.[14] In the exercises, the reader will be given the algorithm for the case of n = 4 points and asked to write the explicit formula for the Bézier curve for the four points shown in Figure 1.8.5.

[Figure 1.8.5: Bézier curve approximation for four points (0,0,0), (0,1,1), (2,3,0), (4,5,2)]


Exercises

A

For Exercises 1-4, calculate $\mathbf{f}'(t)$ and find the tangent line at $\mathbf{f}(0)$.

1. $\mathbf{f}(t) = (t + 1,\ t^2 + 1,\ t^3 + 1)$    2. $\mathbf{f}(t) = (e^t + 1,\ e^{2t} + 1,\ e^{t^2} + 1)$

3. $\mathbf{f}(t) = (\cos 2t,\ \sin 2t,\ t)$    4. $\mathbf{f}(t) = (\sin 2t,\ 2\sin^2 t,\ 2\cos t)$

For Exercises 5-6, find the velocity v(t) and acceleration a(t) of an object with the given

position vector r(t).

5. r(t) = (t, t − sin t, 1 − cos t) 6. r(t) = (3 cos t, 2 sin t, 1)

B

7. Let $\mathbf{f}(t) = \left(\dfrac{\cos t}{\sqrt{1 + a^2t^2}},\ \dfrac{\sin t}{\sqrt{1 + a^2t^2}},\ \dfrac{-at}{\sqrt{1 + a^2t^2}}\right)$, with $a \neq 0$.

(a) Show that $\|\mathbf{f}(t)\| = 1$ for all t.

(b) Show directly that $\mathbf{f}'(t) \cdot \mathbf{f}(t) = 0$ for all t.

8. If $\mathbf{f}'(t) = \mathbf{0}$ for all t in some interval (a, b), show that $\mathbf{f}(t)$ is a constant vector in (a, b).

[14] See pp. 27-30 in FARIN.


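9. For a constant vector $\mathbf{c} \neq \mathbf{0}$, the function $\mathbf{f}(t) = t\,\mathbf{c}$ represents a line parallel to $\mathbf{c}$.

(a) What kind of curve does $\mathbf{g}(t) = t^3\,\mathbf{c}$ represent? Explain.

(b) What kind of curve does $\mathbf{h}(t) = e^t\,\mathbf{c}$ represent? Explain.

(c) Compare $\mathbf{f}'(0)$ and $\mathbf{g}'(0)$. Given your answer to part (a), how do you explain the difference in the two derivatives?

10. Show that $\dfrac{d}{dt}\left(\mathbf{f} \times \dfrac{d\mathbf{f}}{dt}\right) = \mathbf{f} \times \dfrac{d^2\mathbf{f}}{dt^2}$.

11. Let a particle of (constant) mass m have position vector $\mathbf{r}(t)$, velocity $\mathbf{v}(t)$, acceleration $\mathbf{a}(t)$ and momentum $\mathbf{p}(t)$ at time t. The angular momentum $\mathbf{L}(t)$ of the particle with respect to the origin at time t is defined as $\mathbf{L}(t) = \mathbf{r}(t) \times \mathbf{p}(t)$. If $\mathbf{F}(t)$ is the force acting on the particle at time t, then define the torque $\mathbf{N}(t)$ acting on the particle with respect to the origin as $\mathbf{N}(t) = \mathbf{r}(t) \times \mathbf{F}(t)$. Show that $\mathbf{L}'(t) = \mathbf{N}(t)$.

12. Show that $\dfrac{d}{dt}\bigl(\mathbf{f} \cdot (\mathbf{g} \times \mathbf{h})\bigr) = \dfrac{d\mathbf{f}}{dt} \cdot (\mathbf{g} \times \mathbf{h}) + \mathbf{f} \cdot \left(\dfrac{d\mathbf{g}}{dt} \times \mathbf{h}\right) + \mathbf{f} \cdot \left(\mathbf{g} \times \dfrac{d\mathbf{h}}{dt}\right)$.

13. The Mean Value Theorem does not hold for vector-valued functions: Show that for $\mathbf{f}(t) = (\cos t, \sin t, t)$, there is no t in the interval (0, 2π) such that

$$\mathbf{f}'(t) = \frac{\mathbf{f}(2\pi) - \mathbf{f}(0)}{2\pi - 0}.$$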

C

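14. The Bézier curve $\mathbf{b}_0^3(t)$ for four noncollinear points $\mathbf{b}_0$, $\mathbf{b}_1$, $\mathbf{b}_2$, $\mathbf{b}_3$ in $\mathbb{R}^3$ is defined by the following algorithm (going from the left column to the right):

$$\begin{array}{lll}
\mathbf{b}_0^1(t) = (1 - t)\,\mathbf{b}_0 + t\,\mathbf{b}_1 & \mathbf{b}_0^2(t) = (1 - t)\,\mathbf{b}_0^1(t) + t\,\mathbf{b}_1^1(t) & \mathbf{b}_0^3(t) = (1 - t)\,\mathbf{b}_0^2(t) + t\,\mathbf{b}_1^2(t)\\
\mathbf{b}_1^1(t) = (1 - t)\,\mathbf{b}_1 + t\,\mathbf{b}_2 & \mathbf{b}_1^2(t) = (1 - t)\,\mathbf{b}_1^1(t) + t\,\mathbf{b}_2^1(t) & \\
\mathbf{b}_2^1(t) = (1 - t)\,\mathbf{b}_2 + t\,\mathbf{b}_3 & &
\end{array}$$

(a) Show that $\mathbf{b}_0^3(t) = (1 - t)^3\,\mathbf{b}_0 + 3t(1 - t)^2\,\mathbf{b}_1 + 3t^2(1 - t)\,\mathbf{b}_2 + t^3\,\mathbf{b}_3$.

(b) Write the explicit formula (as in Example 1.40) for the Bézier curve for the points $\mathbf{b}_0 = (0, 0, 0)$, $\mathbf{b}_1 = (0, 1, 1)$, $\mathbf{b}_2 = (2, 3, 0)$, $\mathbf{b}_3 = (4, 5, 2)$.

15. Let $\mathbf{r}(t)$ be the position vector for a particle moving in $\mathbb{R}^3$. Show that

$$\frac{d}{dt}\bigl(\mathbf{r} \times (\mathbf{v} \times \mathbf{r})\bigr) = \|\mathbf{r}\|^2\,\mathbf{a} + (\mathbf{r} \cdot \mathbf{v})\,\mathbf{v} - (\|\mathbf{v}\|^2 + \mathbf{r} \cdot \mathbf{a})\,\mathbf{r}.$$

16. Let $\mathbf{r}(t)$ be the position vector in $\mathbb{R}^3$ for a particle that moves with constant speed c > 0 in a circle of radius a > 0 in the xy-plane. Show that $\mathbf{a}(t)$ points in the opposite direction as $\mathbf{r}(t)$ for all t. (Hint: Use Example 1.37 to show that $\mathbf{r}(t) \perp \mathbf{v}(t)$ and $\mathbf{a}(t) \perp \mathbf{v}(t)$, and hence $\mathbf{a}(t) \parallel \mathbf{r}(t)$.)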

17. Prove Theorem 1.20(g).


1.9 Arc Length

Let $\mathbf{r}(t) = (x(t), y(t), z(t))$ be the position vector of an object moving in $\mathbb{R}^3$. Since $\|\mathbf{v}(t)\|$ is the speed of the object at time t, it seems natural to define the distance s traveled by the object from time t = a to t = b as the definite integral

$$s = \int_a^b \|\mathbf{v}(t)\|\,dt = \int_a^b \sqrt{x'(t)^2 + y'(t)^2 + z'(t)^2}\;dt\,, \tag{1.40}$$

which is analogous to the case from single-variable calculus for parametric functions in $\mathbb{R}^2$. This is indeed how we will define the distance traveled and, in general, the arc length of a curve in $\mathbb{R}^3$.

Definition 1.13. Let $\mathbf{f}(t) = (x(t), y(t), z(t))$ be a curve in $\mathbb{R}^3$ whose domain includes the interval [a, b]. Suppose that in the interval (a, b) the first derivative of each component function x(t), y(t) and z(t) exists and is continuous, and that no section of the curve is repeated. Then the arc length L of the curve from t = a to t = b is

$$L = \int_a^b \|\mathbf{f}'(t)\|\,dt = \int_a^b \sqrt{x'(t)^2 + y'(t)^2 + z'(t)^2}\;dt \tag{1.41}$$

A real-valued function whose first derivative is continuous is called continuously differentiable (or a $C^1$ function), and a function whose derivatives of all orders are continuous is called smooth (or a $C^\infty$ function). All the functions we will consider will be smooth. A smooth curve $\mathbf{f}(t)$ is one whose derivative $\mathbf{f}'(t)$ is never the zero vector and whose component functions are all smooth.

Note that we did not prove that the formula in the above definition actually gives the length of a section of a curve. A rigorous proof requires dealing with some subtleties, normally glossed over in calculus texts, which are beyond the scope of this book.[15]

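Example 1.41. Find the length L of the helix $\mathbf{f}(t) = (\cos t, \sin t, t)$ from t = 0 to t = 2π.

Solution: By formula (1.41), we have

$$\begin{aligned}
L &= \int_0^{2\pi} \sqrt{(-\sin t)^2 + (\cos t)^2 + 1^2}\;dt = \int_0^{2\pi} \sqrt{\sin^2 t + \cos^2 t + 1}\;dt = \int_0^{2\pi} \sqrt{2}\;dt\\
&= \sqrt{2}\,(2\pi - 0) = 2\sqrt{2}\,\pi
\end{aligned}$$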
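When the integral in (1.41) has no convenient antiderivative, it can be approximated numerically. The sketch below is one possible approach under stated assumptions: it uses a simple midpoint rule and expects the derivative $\mathbf{f}'(t)$ to be supplied as a function; the names `speed` and `arc_length` are hypothetical. For the helix the speed is constant, so the result should agree with $2\sqrt{2}\,\pi$ up to rounding.

```python
import math

def helix_derivative(t):
    """f'(t) for the helix f(t) = (cos t, sin t, t)."""
    return (-math.sin(t), math.cos(t), 1.0)

def speed(deriv, t):
    """||f'(t)||, the integrand of formula (1.41)."""
    return math.sqrt(sum(c * c for c in deriv(t)))

def arc_length(deriv, a, b, n=1000):
    """Midpoint-rule approximation of the integral of ||f'(t)|| over [a, b]."""
    h = (b - a) / n
    return h * sum(speed(deriv, a + (k + 0.5) * h) for k in range(n))

print(arc_length(helix_derivative, 0.0, 2 * math.pi))
print(2 * math.sqrt(2) * math.pi)
```

For curves with non-constant speed, a larger `n` (or a more accurate quadrature rule) may be needed to reach the same accuracy.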

Similar to the case in $\mathbb{R}^2$, if there are values of t in the interval [a, b] where the derivative of a component function is not continuous then it is often possible to partition [a, b] into subintervals where all the component functions are continuously differentiable (except at the endpoints, which can be ignored). The sum of the arc lengths over the subintervals will be the arc length over [a, b].

[15] In particular, Duhamel’s principle is needed. See the proof in TAYLOR and MANN, § 14.2 and § 18.2.


Notice that the curve traced out by the function $\mathbf{f}(t) = (\cos t, \sin t, t)$ from Example 1.41 is also traced out by the function $\mathbf{g}(t) = (\cos 2t, \sin 2t, 2t)$. For example, over the interval [0, π], $\mathbf{g}(t)$ traces out the same section of the curve as $\mathbf{f}(t)$ does over the interval [0, 2π]. Intuitively, this says that $\mathbf{g}(t)$ traces the curve twice as fast as $\mathbf{f}(t)$. This makes sense since, viewing the functions as position vectors and their derivatives as velocity vectors, the speeds of $\mathbf{f}(t)$ and $\mathbf{g}(t)$ are $\|\mathbf{f}'(t)\| = \sqrt{2}$ and $\|\mathbf{g}'(t)\| = 2\sqrt{2}$, respectively. We say that $\mathbf{g}(t)$ and $\mathbf{f}(t)$ are different parametrizations of the same curve.

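Definition 1.14. Let C be a smooth curve in $\mathbb{R}^3$ represented by a function $\mathbf{f}(t)$ defined on an interval [a, b], and let α : [c, d] → [a, b] be a smooth one-to-one mapping of an interval [c, d] onto [a, b]. Then the function $\mathbf{g} : [c, d] \to \mathbb{R}^3$ defined by $\mathbf{g}(s) = \mathbf{f}(\alpha(s))$ is a parametrization of C with parameter s. If α is strictly increasing on [c, d] then we say that $\mathbf{g}(s)$ is equivalent to $\mathbf{f}(t)$.

$$s \in [c, d] \ \xrightarrow{\ \alpha\ }\ t \in [a, b] \ \xrightarrow{\ \mathbf{f}\ }\ \mathbf{f}(t) \in \mathbb{R}^3, \qquad \mathbf{g}(s) = \mathbf{f}(\alpha(s)) = \mathbf{f}(t)$$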

Note that the differentiability of $\mathbf{g}(s)$ follows from a version of the Chain Rule for vector-valued functions (the proof is left as an exercise):

Theorem 1.21. Chain Rule: If $\mathbf{f}(t)$ is a differentiable vector-valued function of t, and t = α(s) is a differentiable scalar function of s, then $\mathbf{f}(s) = \mathbf{f}(\alpha(s))$ is a differentiable vector-valued function of s, and

$$\frac{d\mathbf{f}}{ds} = \frac{d\mathbf{f}}{dt}\,\frac{dt}{ds} \tag{1.42}$$

for any s where the composite function $\mathbf{f}(\alpha(s))$ is defined.

Example 1.42. The following are all equivalent parametrizations of the same curve:

f(t) = (cos t, sin t, t) for t in [0, 2π]

g(s) = (cos 2s, sin 2s, 2s) for s in [0, π]

h(s) = (cos 2πs, sin 2πs, 2πs) for s in [0, 1]

To see that $\mathbf{g}(s)$ is equivalent to $\mathbf{f}(t)$, define α : [0, π] → [0, 2π] by α(s) = 2s. Then α is smooth, one-to-one, maps [0, π] onto [0, 2π], and is strictly increasing (since α′(s) = 2 > 0 for all s). Likewise, defining α : [0, 1] → [0, 2π] by α(s) = 2πs shows that $\mathbf{h}(s)$ is equivalent to $\mathbf{f}(t)$.
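Equivalent parametrizations trace the same curve, so they should give the same arc length. The sketch below checks this numerically for the three parametrizations of Example 1.42. It is an illustration only: instead of evaluating the integral (1.41), it sums the lengths of the chords of an inscribed polygon, and the helper name `polygon_length` is hypothetical.

```python
import math

def f(t): return (math.cos(t), math.sin(t), t)                               # t in [0, 2*pi]
def g(s): return (math.cos(2 * s), math.sin(2 * s), 2 * s)                   # s in [0, pi]
def h(s): return (math.cos(2 * math.pi * s), math.sin(2 * math.pi * s), 2 * math.pi * s)  # s in [0, 1]

def polygon_length(func, a, b, n=10_000):
    """Sum of chord lengths between n + 1 equally spaced parameter values."""
    total = 0.0
    prev = func(a)
    for k in range(1, n + 1):
        cur = func(a + (b - a) * k / n)
        total += math.dist(prev, cur)
        prev = cur
    return total

print(polygon_length(f, 0.0, 2 * math.pi))
print(polygon_length(g, 0.0, math.pi))
print(polygon_length(h, 0.0, 1.0))
```

All three values should be close to $2\sqrt{2}\,\pi$, slightly from below, since chords are never longer than the arcs they span.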


A curve can have many parametrizations, with different speeds, so which one is the best to use? In some situations the arc length parametrization can be useful. The idea behind this is to replace the parameter t, for any given smooth parametrization $\mathbf{f}(t)$ defined on [a, b], by the parameter s given by

$$s = s(t) = \int_a^t \|\mathbf{f}'(u)\|\,du. \tag{1.43}$$

In terms of motion along a curve, s is the distance traveled along the curve after

time t has elapsed. So the new parameter will be distance instead of time. There

is a natural correspondence between s and t: from a starting point on the curve, the

distance traveled along the curve (in one direction) is uniquely determined by the

amount of time elapsed, and vice versa.

Since s is the arc length of the curve over the interval [a, t] for each t in [a, b], then it is a function of t. By the Fundamental Theorem of Calculus, its derivative is

$$s'(t) = \frac{ds}{dt} = \frac{d}{dt}\int_a^t \|\mathbf{f}'(u)\|\,du = \|\mathbf{f}'(t)\| \quad\text{for all } t \text{ in } [a, b].$$

Since $\mathbf{f}(t)$ is smooth, then $\|\mathbf{f}'(t)\| > 0$ for all t in [a, b]. Thus s′(t) > 0 and hence s(t) is strictly increasing on the interval [a, b]. Recall that this means that s is a one-to-one mapping of the interval [a, b] onto the interval [s(a), s(b)]. But we see that

$$s(a) = \int_a^a \|\mathbf{f}'(u)\|\,du = 0 \quad\text{and}\quad s(b) = \int_a^b \|\mathbf{f}'(u)\|\,du = L = \text{arc length from } t = a \text{ to } t = b$$

[Figure 1.9.1: t = α(s), showing the mappings s(t) : [a, b] → [0, L] and α(s) : [0, L] → [a, b]]

So the function s : [a, b] → [0, L] is a one-to-one, differentiable mapping onto the interval [0, L]. From single-variable calculus, we know that this means that there exists an inverse function α : [0, L] → [a, b] that is differentiable and the inverse of s : [a, b] → [0, L]. That is, for each t in [a, b] there is a unique s in [0, L] such that s = s(t) and t = α(s). And we know that the derivative of α is

$$\alpha'(s) = \frac{1}{s'(\alpha(s))} = \frac{1}{\|\mathbf{f}'(\alpha(s))\|}$$

So define the arc length parametrization f : [0, L] →

3

by

f(s) = f(α(s)) for all s in [0, L].

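Then f(s) is smooth, by the Chain Rule. In fact, f(s) has unit speed:

$$\mathbf{f}'(s) = \mathbf{f}'(\alpha(s))\,\alpha'(s) \quad\text{by the Chain Rule, so}$$

$$\mathbf{f}'(s) = \mathbf{f}'(\alpha(s))\,\frac{1}{\|\mathbf{f}'(\alpha(s))\|}, \quad\text{so}$$

$$\|\mathbf{f}'(s)\| = 1 \quad\text{for all } s \text{ in } [0, L].$$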

So the arc length parametrization traverses the curve at a “normal” rate.


In practice, parametrizing a curve f(t) by arc length requires you to evaluate the integral $s = \int_a^t \|\mathbf{f}'(u)\|\,du$ in some closed form (as a function of t) so that you could then solve for t in terms of s. If that can be done, you would then substitute the expression for t in terms of s (which we called α(s)) into the formula for f(t) to get f(s).

Example 1.43. Parametrize the helix f(t) = (cos t, sin t, t), for t in [0, 2π], by arc length.

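Solution: By Example 1.41 and formula (1.43), we have

$$s = \int_0^t \|\mathbf{f}'(u)\|\,du = \int_0^t \sqrt{2}\,du = \sqrt{2}\,t \quad\text{for all } t \text{ in } [0, 2\pi].$$

So we can solve for t in terms of s: t = α(s) = s/√2.

$$\therefore\ \mathbf{f}(s) = \left(\cos\frac{s}{\sqrt{2}},\ \sin\frac{s}{\sqrt{2}},\ \frac{s}{\sqrt{2}}\right) \quad\text{for all } s \text{ in } [0, 2\sqrt{2}\,\pi].$$

Note that ‖f′(s)‖ = 1.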
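The unit-speed claim in Example 1.43 can be checked numerically. The following is only a sketch: the helper names `helix`, `helix_by_arc_length` and `speed` are made up for this illustration, and the speed is approximated by a central difference rather than computed exactly.

```python
import math

def helix(t):
    """Original parametrization f(t) = (cos t, sin t, t)."""
    return (math.cos(t), math.sin(t), t)

def helix_by_arc_length(s):
    """Arc length parametrization f(alpha(s)) with alpha(s) = s / sqrt(2)."""
    return helix(s / math.sqrt(2))

def speed(curve, u, h=1e-6):
    """Approximate ||curve'(u)|| by a central difference (an approximation)."""
    p, q = curve(u - h), curve(u + h)
    return math.dist(p, q) / (2 * h)

L = 2 * math.sqrt(2) * math.pi  # arc length of the helix over [0, 2*pi]
for s in (0.5, 2.0, L - 0.5):
    print(f"s = {s:.3f}: speed = {speed(helix_by_arc_length, s):.6f}")
print(f"speed of f(t) at t = 1: {speed(helix, 1.0):.6f}")
```

The printed speeds for the arc length parametrization should be close to 1, while the original parametrization has speed close to √2.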

Arc length plays an important role when discussing curvature and moving frame fields, in the field of mathematics known as differential geometry.[16] The methods involve using an arc length parametrization, which often leads to an integral that is either difficult or impossible to evaluate in a simple closed form. The simple integral in Example 1.43 is the exception, not the norm. In general, arc length parametrizations are more useful for theoretical purposes than for practical computations.[17] Curvature and moving frame fields can be defined without using arc length, which makes their computation much easier, and these definitions can be shown to be equivalent to those using arc length. We will leave this to the exercises.

The arc length for curves given in other coordinate systems can also be calculated:

Theorem 1.22. Suppose that r = r(t), θ = θ(t) and z = z(t) are the cylindrical coordinates of a curve f(t), for t in [a, b]. Then the arc length L of the curve over [a, b] is

$$L = \int_a^b \sqrt{r'(t)^2 + r(t)^2\,\theta'(t)^2 + z'(t)^2}\;dt \qquad (1.44)$$

Proof: The Cartesian coordinates (x(t), y(t), z(t)) of a point on the curve are given by

x(t) = r(t) cos θ(t), y(t) = r(t) sin θ(t), z(t) = z(t)

so differentiating the above expressions for x(t) and y(t) with respect to t gives

x′(t) = r′(t) cos θ(t) − r(t)θ′(t) sin θ(t), y′(t) = r′(t) sin θ(t) + r(t)θ′(t) cos θ(t)

and so

$$\begin{aligned}
x'(t)^2 + y'(t)^2 &= \bigl(r'(t)\cos\theta(t) - r(t)\theta'(t)\sin\theta(t)\bigr)^2 + \bigl(r'(t)\sin\theta(t) + r(t)\theta'(t)\cos\theta(t)\bigr)^2 \\
&= r'(t)^2(\cos^2\theta + \sin^2\theta) + r(t)^2\theta'(t)^2(\cos^2\theta + \sin^2\theta) \\
&\quad - 2r'(t)r(t)\theta'(t)\cos\theta\sin\theta + 2r'(t)r(t)\theta'(t)\cos\theta\sin\theta \\
&= r'(t)^2 + r(t)^2\theta'(t)^2,
\end{aligned}$$

and so

$$L = \int_a^b \sqrt{x'(t)^2 + y'(t)^2 + z'(t)^2}\;dt = \int_a^b \sqrt{r'(t)^2 + r(t)^2\,\theta'(t)^2 + z'(t)^2}\;dt \qquad \text{QED}$$

[16] See O’NEILL for an introduction to elementary differential geometry.

[17] For example, the usual parametrizations of Bézier curves, which we discussed in Section 1.8, are polynomial functions in ℝ³. This makes their computation relatively simple, which, in CAD, is desirable. But their arc length parametrizations are not only not polynomials, they are in fact usually impossible to calculate at all.

Example 1.44. Find the arc length L of the curve whose cylindrical coordinates are r = e^t, θ = t and z = e^t, for t over the interval [0, 1].

Solution: Since r′(t) = e^t, θ′(t) = 1 and z′(t) = e^t, then

$$\begin{aligned}
L &= \int_0^1 \sqrt{r'(t)^2 + r(t)^2\,\theta'(t)^2 + z'(t)^2}\;dt \\
&= \int_0^1 \sqrt{e^{2t} + e^{2t}(1) + e^{2t}}\;dt \\
&= \int_0^1 e^{t}\sqrt{3}\;dt = \sqrt{3}\,(e - 1)
\end{aligned}$$
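As a sanity check on Example 1.44, formula (1.44) can also be evaluated numerically. This sketch uses composite Simpson's rule, which is a choice made here and not a method from the text; the function name `cylindrical_arc_length` and its parameters are hypothetical.

```python
import math

def cylindrical_arc_length(r, dr, dtheta, dz, a, b, n=1000):
    """Approximate formula (1.44) with composite Simpson's rule (n must be even)."""
    def integrand(t):
        return math.sqrt(dr(t) ** 2 + r(t) ** 2 * dtheta(t) ** 2 + dz(t) ** 2)
    h = (b - a) / n
    total = integrand(a) + integrand(b)
    for k in range(1, n):
        # Simpson weights: 4 at odd interior nodes, 2 at even interior nodes
        total += (4 if k % 2 else 2) * integrand(a + k * h)
    return total * h / 3

# r = e^t, theta = t, z = e^t on [0, 1], as in Example 1.44
approx = cylindrical_arc_length(math.exp, math.exp, lambda t: 1.0, math.exp, 0.0, 1.0)
exact = math.sqrt(3) * (math.e - 1)
print(approx, exact)
```

The two printed values should agree to many decimal places, which supports the closed form √3(e − 1).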


Exercises

A

For Exercises 1-3, calculate the arc length of f(t) over the given interval.

1. f(t) = (3 cos 2t, 3 sin 2t, 3t) on [0, π/2]

2. f(t) = ((t² + 1) cos t, (t² + 1) sin t, 2√2 t) on [0, 1]

3. f(t) = (2 cos 3t, 2 sin 3t, 2t^{3/2}) on [0, 1]

4. Parametrize the curve from Exercise 1 by arc length.

5. Parametrize the curve from Exercise 3 by arc length.

B

6. Let f(t) be a differentiable curve such that f(t) ≠ 0 for all t. Show that

$$\frac{d}{dt}\left(\frac{\mathbf{f}(t)}{\|\mathbf{f}(t)\|}\right) = \frac{\mathbf{f}(t) \times (\mathbf{f}'(t) \times \mathbf{f}(t))}{\|\mathbf{f}(t)\|^3}.$$


Exercises 7-9 develop the moving frame field T, N, B at a point on a curve.

7. Let f(t) be a smooth curve such that f′(t) ≠ 0 for all t. Then we can define the unit tangent vector T by

$$\mathbf{T}(t) = \frac{\mathbf{f}'(t)}{\|\mathbf{f}'(t)\|}.$$

Show that

$$\mathbf{T}'(t) = \frac{\mathbf{f}'(t) \times (\mathbf{f}''(t) \times \mathbf{f}'(t))}{\|\mathbf{f}'(t)\|^3}.$$

8. Continuing Exercise 7, assume that f′(t) and f″(t) are not parallel. Then T′(t) ≠ 0 so we can define the unit principal normal vector N by

$$\mathbf{N}(t) = \frac{\mathbf{T}'(t)}{\|\mathbf{T}'(t)\|}.$$

Show that

$$\mathbf{N}(t) = \frac{\mathbf{f}'(t) \times (\mathbf{f}''(t) \times \mathbf{f}'(t))}{\|\mathbf{f}'(t)\|\,\|\mathbf{f}''(t) \times \mathbf{f}'(t)\|}.$$

9. Continuing Exercise 8, the unit binormal vector B is defined by

B(t) = T(t) × N(t).

Show that

$$\mathbf{B}(t) = \frac{\mathbf{f}'(t) \times \mathbf{f}''(t)}{\|\mathbf{f}'(t) \times \mathbf{f}''(t)\|}.$$

Note: The vectors T(t), N(t) and B(t) form a right-handed system of mutually perpendicular unit vectors (called orthonormal vectors) at each point on the curve f(t).

10. Continuing Exercise 9, the curvature κ is defined by

$$\kappa(t) = \frac{\|\mathbf{T}'(t)\|}{\|\mathbf{f}'(t)\|} = \frac{\|\mathbf{f}'(t) \times (\mathbf{f}''(t) \times \mathbf{f}'(t))\|}{\|\mathbf{f}'(t)\|^4}.$$

Show that

$$\kappa(t) = \frac{\|\mathbf{f}'(t) \times \mathbf{f}''(t)\|}{\|\mathbf{f}'(t)\|^3} \quad\text{and that}\quad \mathbf{T}'(t) = \|\mathbf{f}'(t)\|\,\kappa(t)\,\mathbf{N}(t).$$

Note: κ(t) gives a sense of how “curved” the curve f(t) is at each point.

11. Find T, N, B and κ at each point of the helix f(t) = (cos t, sin t, t).

12. Show that the arc length L of a curve whose spherical coordinates are ρ = ρ(t), θ = θ(t) and φ = φ(t) for t in an interval [a, b] is

$$L = \int_a^b \sqrt{\rho'(t)^2 + (\rho(t)^2 \sin^2\phi(t))\,\theta'(t)^2 + \rho(t)^2\,\phi'(t)^2}\;dt.$$

2 Functions of Several Variables

2.1 Functions of Two or Three Variables

In Section 1.8 we discussed vector-valued functions of a single real variable. We will now examine real-valued functions of a point (or vector) in ℝ² or ℝ³. For the most part these functions will be defined on sets of points in ℝ², but there will be times when we will use points in ℝ³, and there will also be times when it will be convenient to think of the points as vectors (or terminal points of vectors).

A real-valued function f defined on a subset D of ℝ² is a rule that assigns to each point (x, y) in D a real number f(x, y). The largest possible set D in ℝ² on which f is defined is called the domain of f, and the range of f is the set of all real numbers f(x, y) as (x, y) varies over the domain D. A similar definition holds for functions f(x, y, z) defined on points (x, y, z) in ℝ³.

Example 2.1. The domain of the function

f(x, y) = xy

is all of ℝ², and the range of f is all of ℝ.

Example 2.2. The domain of the function

$$f(x, y) = \frac{1}{x - y}$$

is all of ℝ² except the points (x, y) for which x = y. That is, the domain is the set D = {(x, y) : x ≠ y}. The range of f is all real numbers except 0.

Example 2.3. The domain of the function

$$f(x, y) = \sqrt{1 - x^2 - y^2}$$

is the set D = {(x, y) : x² + y² ≤ 1}, since the quantity inside the square root is nonnegative if and only if 1 − (x² + y²) ≥ 0. We see that D consists of all points on and inside the unit circle in ℝ² (D is sometimes called the closed unit disk). The range of f is the interval [0, 1] in ℝ.


Example 2.4. The domain of the function

$$f(x, y, z) = e^{x+y-z}$$

is all of ℝ³, and the range of f is all positive real numbers.

A function f(x, y) defined in ℝ² is often written as z = f(x, y), as was mentioned in Section 1.1, so that the graph of f(x, y) is the set {(x, y, z) : z = f(x, y)} in ℝ³. So we see that this graph is a surface in ℝ³, since it satisfies an equation of the form F(x, y, z) = 0 (namely, F(x, y, z) = f(x, y) − z). The traces of this surface in the planes z = c, where c varies over ℝ, are called the level curves of the function. Equivalently, the level curves are the solution sets of the equations f(x, y) = c, for c in ℝ. Level curves are often projected onto the xy-plane to give an idea of the various “elevation” levels of the surface (as is done in topography).

Example 2.5. The graph of the function

$$f(x, y) = \frac{\sin\sqrt{x^2 + y^2}}{\sqrt{x^2 + y^2}}$$

is shown below. Note that the level curves (shown both on the surface and projected onto the xy-plane) are groups of concentric circles.

[Figure 2.1.1: The function $f(x, y) = \frac{\sin\sqrt{x^2 + y^2}}{\sqrt{x^2 + y^2}}$, plotted for x and y from −10 to 10, with z from −0.4 to 1]

You may be wondering what happens to the function in Example 2.5 at the point

(x, y) = (0, 0), since both the numerator and denominator are 0 at that point. The

function is not defined at (0, 0), but the limit of the function exists (and equals 1) as

(x, y) approaches (0, 0). We will now state explicitly what is meant by the limit of a

function of two variables.


Definition 2.1. Let (a, b) be a point in ℝ², and let f(x, y) be a real-valued function defined on some set containing (a, b) (but not necessarily defined at (a, b) itself). Then we say that the limit of f(x, y) equals L as (x, y) approaches (a, b), written as

$$\lim_{(x,y)\to(a,b)} f(x, y) = L, \qquad (2.1)$$

if given any ε > 0, there exists a δ > 0 such that

$$|f(x, y) - L| < \epsilon \quad\text{whenever}\quad 0 < \sqrt{(x - a)^2 + (y - b)^2} < \delta.$$

A similar definition can be made for functions of three variables. The idea behind the above definition is that the values of f(x, y) can get arbitrarily close to L (i.e. within ε of L) if we pick (x, y) sufficiently close to (a, b) (i.e. inside a circle centered at (a, b) with some sufficiently small radius δ).

If you recall the “epsilon-delta” proofs of limits of real-valued functions of a single

variable, you may remember how awkward they can be, and how they can usually

only be done easily for simple functions. In general, the multivariable cases are at

least equally awkward to go through, so we will not bother with such proofs. Instead,

we will simply state that when the function f (x, y) is given by a single formula and is

defined at the point (a, b) (e.g. is not some indeterminate form like 0/0) then you can

just substitute (x, y) = (a, b) into the formula for f (x, y) to find the limit.

Example 2.6.

$$\lim_{(x,y)\to(1,2)} \frac{xy}{x^2 + y^2} = \frac{(1)(2)}{1^2 + 2^2} = \frac{2}{5}$$

since $f(x, y) = \frac{xy}{x^2 + y^2}$ is properly defined at the point (1, 2).

The major difference between limits in one variable and limits in two or more variables has to do with how a point is approached. In the single-variable case, the statement “x → a” means that x gets closer to the value a from two possible directions along the real number line (see Figure 2.1.2(a)). In two dimensions, however, (x, y) can approach a point (a, b) along an infinite number of paths (see Figure 2.1.2(b)).

[Figure 2.1.2: “Approaching” a point in different dimensions. (a) x → a in ℝ. (b) (x, y) → (a, b) in ℝ².]


Example 2.7.

$$\lim_{(x,y)\to(0,0)} \frac{xy}{x^2 + y^2} \quad\text{does not exist}$$

Note that we cannot simply substitute (x, y) = (0, 0) into the function, since doing so gives an indeterminate form 0/0. To show that the limit does not exist, we will show that the function approaches different values as (x, y) approaches (0, 0) along different paths in ℝ². To see this, suppose that (x, y) → (0, 0) along the positive x-axis, so that y = 0 along that path. Then

$$f(x, y) = \frac{xy}{x^2 + y^2} = \frac{x \cdot 0}{x^2 + 0^2} = 0$$

along that path (since x > 0 in the denominator). But if (x, y) → (0, 0) along the straight line y = x through the origin, for x > 0, then we see that

$$f(x, y) = \frac{xy}{x^2 + y^2} = \frac{x^2}{x^2 + x^2} = \frac{1}{2}$$

which means that f(x, y) approaches different values as (x, y) → (0, 0) along different paths. Hence the limit does not exist.
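The path dependence in Example 2.7 is easy to observe numerically. The sketch below evaluates the function at points approaching the origin along lines y = mx; the helper name `along_line` is an assumption made for this illustration.

```python
def f(x, y):
    """The function from Example 2.7, defined away from the origin."""
    return (x * y) / (x * x + y * y)

def along_line(slope, x):
    """Value of f at the point (x, slope * x) on the line y = slope * x."""
    return f(x, slope * x)

for x in (1e-1, 1e-3, 1e-6):
    print(f"x = {x:g}: along y = 0 -> {along_line(0.0, x)}, along y = x -> {along_line(1.0, x)}")
```

Along y = 0 the values stay at 0, while along y = x they stay at 1/2, no matter how close x gets to 0. More generally, on the line y = mx the function equals m/(1 + m²), so each slope gives its own value.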

Limits of real-valued multivariable functions obey the same algebraic rules as in

the single-variable case, as shown in the following theorem, which we state without

proof.

Theorem 2.1. Suppose that $\lim_{(x,y)\to(a,b)} f(x, y)$ and $\lim_{(x,y)\to(a,b)} g(x, y)$ both exist, and that k is some scalar. Then:

(a) $\lim_{(x,y)\to(a,b)} [f(x, y) \pm g(x, y)] = \left[\lim_{(x,y)\to(a,b)} f(x, y)\right] \pm \left[\lim_{(x,y)\to(a,b)} g(x, y)\right]$

(b) $\lim_{(x,y)\to(a,b)} k\,f(x, y) = k\left[\lim_{(x,y)\to(a,b)} f(x, y)\right]$

(c) $\lim_{(x,y)\to(a,b)} [f(x, y)\,g(x, y)] = \left[\lim_{(x,y)\to(a,b)} f(x, y)\right]\left[\lim_{(x,y)\to(a,b)} g(x, y)\right]$

(d) $\lim_{(x,y)\to(a,b)} \frac{f(x, y)}{g(x, y)} = \frac{\lim_{(x,y)\to(a,b)} f(x, y)}{\lim_{(x,y)\to(a,b)} g(x, y)}$ if $\lim_{(x,y)\to(a,b)} g(x, y) \ne 0$

(e) If |f(x, y) − L| ≤ g(x, y) for all (x, y) and if $\lim_{(x,y)\to(a,b)} g(x, y) = 0$, then $\lim_{(x,y)\to(a,b)} f(x, y) = L$.

Note that in part (e), it suffices to have |f(x, y) − L| ≤ g(x, y) for all (x, y) “sufficiently close” to (a, b) (but excluding (a, b) itself).


Example 2.8. Show that

$$\lim_{(x,y)\to(0,0)} \frac{y^4}{x^2 + y^2} = 0.$$

Since substituting (x, y) = (0, 0) into the function gives the indeterminate form 0/0, we need an alternate method for evaluating this limit. We will use Theorem 2.1(e). First, notice that $y^4 = \left(\sqrt{y^2}\right)^4$ and so $0 \le y^4 \le \left(\sqrt{x^2 + y^2}\right)^4$ for all (x, y). But $\left(\sqrt{x^2 + y^2}\right)^4 = (x^2 + y^2)^2$. Thus, for all (x, y) ≠ (0, 0) we have

$$\left|\frac{y^4}{x^2 + y^2}\right| \le \frac{(x^2 + y^2)^2}{x^2 + y^2} = x^2 + y^2 \to 0 \quad\text{as } (x, y) \to (0, 0).$$

Therefore $\lim_{(x,y)\to(0,0)} \frac{y^4}{x^2 + y^2} = 0$.
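The bound used in Example 2.8 can be illustrated by sampling the function on small circles around the origin. This is a numerical sketch only, not a proof; the helper name `max_on_circle` and the sample count are choices made here.

```python
import math

def f(x, y):
    """The function from Example 2.8, defined away from the origin."""
    return y ** 4 / (x ** 2 + y ** 2)

def max_on_circle(radius, samples=360):
    """Largest sampled value of |f| on the circle x^2 + y^2 = radius^2."""
    return max(
        abs(f(radius * math.cos(2 * math.pi * k / samples),
              radius * math.sin(2 * math.pi * k / samples)))
        for k in range(samples)
    )

for radius in (1.0, 1e-1, 1e-2, 1e-3):
    print(f"radius = {radius:g}: max |f| = {max_on_circle(radius):.3e}, bound x^2 + y^2 = {radius ** 2:.3e}")
```

On each circle the sampled maximum stays at or below x² + y² (up to rounding), and both shrink to 0 with the radius, which is the squeeze argument of Theorem 2.1(e).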

Continuity can be defined similarly as in the single-variable case.

Definition 2.2. A real-valued function f(x, y) with domain D in ℝ² is continuous at the point (a, b) in D if $\lim_{(x,y)\to(a,b)} f(x, y) = f(a, b)$. We say that f(x, y) is a continuous function if it is continuous at every point in its domain D.

Unless indicated otherwise, you can assume that all the functions we deal with are continuous. In fact, we can modify the function from Example 2.8 so that it is continuous on all of ℝ².

Example 2.9. Define a function f(x, y) on all of ℝ² as follows:

$$f(x, y) = \begin{cases} 0 & \text{if } (x, y) = (0, 0) \\[4pt] \dfrac{y^4}{x^2 + y^2} & \text{if } (x, y) \ne (0, 0) \end{cases}$$

Then f(x, y) is well-defined for all (x, y) in ℝ² (i.e. there are no indeterminate forms for any (x, y)), and we see that

$$\lim_{(x,y)\to(a,b)} f(x, y) = \frac{b^4}{a^2 + b^2} = f(a, b) \quad\text{for } (a, b) \ne (0, 0).$$

So since

$$\lim_{(x,y)\to(0,0)} f(x, y) = 0 = f(0, 0) \quad\text{by Example 2.8,}$$

then f(x, y) is continuous on all of ℝ².


Exercises

A

For Exercises 1-6, state the domain and range of the given function.

1. $f(x, y) = x^2 + y^2 - 1$

2. $f(x, y) = \frac{1}{x^2 + y^2}$

3. $f(x, y) = \sqrt{x^2 + y^2 - 4}$

4. $f(x, y) = \frac{x^2 + 1}{y}$

5. $f(x, y, z) = \sin(xyz)$

6. $f(x, y, z) = \sqrt{(x - 1)(yz - 1)}$

For Exercises 7-18, evaluate the given limit.

7. $\lim_{(x,y)\to(0,0)} \cos(xy)$

8. $\lim_{(x,y)\to(0,0)} e^{xy}$

9. $\lim_{(x,y)\to(0,0)} \frac{x^2 - y^2}{x^2 + y^2}$

10. $\lim_{(x,y)\to(0,0)} \frac{xy^2}{x^2 + y^4}$

11. $\lim_{(x,y)\to(1,-1)} \frac{x^2 - 2xy + y^2}{x - y}$

12. $\lim_{(x,y)\to(0,0)} \frac{xy^2}{x^2 + y^2}$

13. $\lim_{(x,y)\to(1,1)} \frac{x^2 - y^2}{x - y}$

14. $\lim_{(x,y)\to(0,0)} \frac{x^2 - 2xy + y^2}{x - y}$

15. $\lim_{(x,y)\to(0,0)} \frac{y^4 \sin(xy)}{x^2 + y^2}$

16. $\lim_{(x,y)\to(0,0)} (x^2 + y^2)\cos\left(\frac{1}{xy}\right)$

17. $\lim_{(x,y)\to(0,0)} \frac{x}{y}$

18. $\lim_{(x,y)\to(0,0)} \cos\left(\frac{1}{xy}\right)$

B

19. Show that $f(x, y) = \frac{1}{2\pi\sigma^2}\,e^{-(x^2 + y^2)/2\sigma^2}$, for σ > 0, is constant on the circle of radius r > 0 centered at the origin. This function is called a Gaussian blur, and is used as a filter in image processing software to produce a “blurred” effect.

20. Suppose that f(x, y) ≤ f(y, x) for all (x, y) in ℝ². Show that f(x, y) = f(y, x) for all (x, y) in ℝ².

21. Use the substitution $r = \sqrt{x^2 + y^2}$ to show that

$$\lim_{(x,y)\to(0,0)} \frac{\sin\sqrt{x^2 + y^2}}{\sqrt{x^2 + y^2}} = 1.$$

(Hint: You will need to use L’Hôpital’s Rule for single-variable limits.)

C

22. Prove Theorem 2.1(a) in the case of addition. (Hint: Use Definition 2.1.)

23. Prove Theorem 2.1(b).


2.2 Partial Derivatives

Now that we have an idea of what functions of several variables are, and what a limit

of such a function is, we can start to develop an idea of a derivative of a function of

two or more variables. We will start with the notion of a partial derivative.

Definition 2.3. Let f(x, y) be a real-valued function with domain D in ℝ², and let (a, b) be a point in D. Then the partial derivative of f at (a, b) with respect to x, denoted by $\frac{\partial f}{\partial x}(a, b)$, is defined as

$$\frac{\partial f}{\partial x}(a, b) = \lim_{h\to 0} \frac{f(a + h, b) - f(a, b)}{h} \qquad (2.2)$$

and the partial derivative of f at (a, b) with respect to y, denoted by $\frac{\partial f}{\partial y}(a, b)$, is defined as

$$\frac{\partial f}{\partial y}(a, b) = \lim_{h\to 0} \frac{f(a, b + h) - f(a, b)}{h}. \qquad (2.3)$$

Note: The symbol ∂ is pronounced “del”.[1]

Recall that the derivative of a function f (x) can be interpreted as the rate of change

of that function in the (positive) x direction. From the definitions above, we can see

that the partial derivative of a function f (x, y) with respect to x or y is the rate of

change of f (x, y) in the (positive) x or y direction, respectively. What this means is

that the partial derivative of a function f (x, y) with respect to x can be calculated by

treating the y variable as a constant, and then simply differentiating f (x, y) as if it were

a function of x alone, using the usual rules from single-variable calculus. Likewise,

the partial derivative of f (x, y) with respect to y is obtained by treating the x variable

as a constant and then differentiating f (x, y) as if it were a function of y alone.

Example 2.10. Find $\frac{\partial f}{\partial x}(x, y)$ and $\frac{\partial f}{\partial y}(x, y)$ for the function f(x, y) = x²y + y³.

Solution: Treating y as a constant and differentiating f(x, y) with respect to x gives

$$\frac{\partial f}{\partial x}(x, y) = 2xy$$

and treating x as a constant and differentiating f(x, y) with respect to y gives

$$\frac{\partial f}{\partial y}(x, y) = x^2 + 3y^2.$$
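The partial derivatives in Example 2.10 can be compared against difference quotients in the spirit of definitions (2.2) and (2.3). This sketch uses central differences, a common numerical variant and not the one-sided quotient of the definition; the helper names `partial_x` and `partial_y` and the test point (1, 2) are assumptions made here.

```python
def f(x, y):
    """The function from Example 2.10."""
    return x ** 2 * y + y ** 3

def partial_x(g, a, b, h=1e-6):
    """Central-difference estimate of dg/dx at (a, b), holding y = b fixed."""
    return (g(a + h, b) - g(a - h, b)) / (2 * h)

def partial_y(g, a, b, h=1e-6):
    """Central-difference estimate of dg/dy at (a, b), holding x = a fixed."""
    return (g(a, b + h) - g(a, b - h)) / (2 * h)

a, b = 1.0, 2.0
print(partial_x(f, a, b), 2 * a * b)            # estimate vs. 2xy
print(partial_y(f, a, b), a ** 2 + 3 * b ** 2)  # estimate vs. x^2 + 3y^2
```

Each estimate only varies one argument of f, which mirrors the rule of treating the other variable as a constant.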

[1] It is not a Greek letter. The symbol was first used by the mathematicians A. Clairaut and L. Euler around 1740, to distinguish it from the letter d used for the “usual” derivative.

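We will often simply write $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$ instead of $\frac{\partial f}{\partial x}(x, y)$ and $\frac{\partial f}{\partial y}(x, y)$.

Example 2.11. Find $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$ for the function $f(x, y) = \frac{\sin(xy^2)}{x^2 + 1}$.

Solution: Treating y as a constant and differentiating f(x, y) with respect to x gives

$$\frac{\partial f}{\partial x} = \frac{(x^2 + 1)(y^2\cos(xy^2)) - (2x)\sin(xy^2)}{(x^2 + 1)^2}$$

and treating x as a constant and differentiating f(x, y) with respect to y gives

$$\frac{\partial f}{\partial y} = \frac{2xy\cos(xy^2)}{x^2 + 1}.$$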

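Since both $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$ are themselves functions of x and y, we can take their partial derivatives with respect to x and y. This yields the higher-order partial derivatives:

$$\begin{aligned}
\frac{\partial^2 f}{\partial x^2} &= \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial x}\right) &\qquad \frac{\partial^2 f}{\partial y^2} &= \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial y}\right) \\
\frac{\partial^2 f}{\partial y\,\partial x} &= \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial x}\right) &\qquad \frac{\partial^2 f}{\partial x\,\partial y} &= \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial y}\right) \\
\frac{\partial^3 f}{\partial x^3} &= \frac{\partial}{\partial x}\left(\frac{\partial^2 f}{\partial x^2}\right) &\qquad \frac{\partial^3 f}{\partial y^3} &= \frac{\partial}{\partial y}\left(\frac{\partial^2 f}{\partial y^2}\right) \\
\frac{\partial^3 f}{\partial y\,\partial x^2} &= \frac{\partial}{\partial y}\left(\frac{\partial^2 f}{\partial x^2}\right) &\qquad \frac{\partial^3 f}{\partial x\,\partial y^2} &= \frac{\partial}{\partial x}\left(\frac{\partial^2 f}{\partial y^2}\right) \\
\frac{\partial^3 f}{\partial y^2\,\partial x} &= \frac{\partial}{\partial y}\left(\frac{\partial^2 f}{\partial y\,\partial x}\right) &\qquad \frac{\partial^3 f}{\partial x^2\,\partial y} &= \frac{\partial}{\partial x}\left(\frac{\partial^2 f}{\partial x\,\partial y}\right) \\
\frac{\partial^3 f}{\partial x\,\partial y\,\partial x} &= \frac{\partial}{\partial x}\left(\frac{\partial^2 f}{\partial y\,\partial x}\right) &\qquad \frac{\partial^3 f}{\partial y\,\partial x\,\partial y} &= \frac{\partial}{\partial y}\left(\frac{\partial^2 f}{\partial x\,\partial y}\right) \\
&\ \,\vdots
\end{aligned}$$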

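Example 2.12. Find the partial derivatives $\frac{\partial f}{\partial x}$, $\frac{\partial f}{\partial y}$, $\frac{\partial^2 f}{\partial x^2}$, $\frac{\partial^2 f}{\partial y^2}$, $\frac{\partial^2 f}{\partial y\,\partial x}$ and $\frac{\partial^2 f}{\partial x\,\partial y}$ for the function $f(x, y) = e^{x^2 y} + xy^3$.

Solution: Proceeding as before, we have

$$\frac{\partial f}{\partial x} = 2xy\,e^{x^2 y} + y^3 \qquad\qquad \frac{\partial f}{\partial y} = x^2 e^{x^2 y} + 3xy^2$$

$$\frac{\partial^2 f}{\partial x^2} = \frac{\partial}{\partial x}\left(2xy\,e^{x^2 y} + y^3\right) = 2y\,e^{x^2 y} + 4x^2 y^2 e^{x^2 y}$$

$$\frac{\partial^2 f}{\partial y^2} = \frac{\partial}{\partial y}\left(x^2 e^{x^2 y} + 3xy^2\right) = x^4 e^{x^2 y} + 6xy$$

$$\frac{\partial^2 f}{\partial y\,\partial x} = \frac{\partial}{\partial y}\left(2xy\,e^{x^2 y} + y^3\right) = 2x\,e^{x^2 y} + 2x^3 y\,e^{x^2 y} + 3y^2$$

$$\frac{\partial^2 f}{\partial x\,\partial y} = \frac{\partial}{\partial x}\left(x^2 e^{x^2 y} + 3xy^2\right) = 2x\,e^{x^2 y} + 2x^3 y\,e^{x^2 y} + 3y^2$$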
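The two mixed partial derivatives in Example 2.12 can be cross-checked numerically. This sketch differentiates the first partials from the example by central differences and compares both results with the closed form found there; the helper names and the test point (0.5, 1) are assumptions made for this illustration.

```python
import math

def fx(x, y):
    """df/dx from Example 2.12."""
    return 2 * x * y * math.exp(x ** 2 * y) + y ** 3

def fy(x, y):
    """df/dy from Example 2.12."""
    return x ** 2 * math.exp(x ** 2 * y) + 3 * x * y ** 2

def mixed_closed_form(x, y):
    """The common value of both mixed partials found in Example 2.12."""
    e = math.exp(x ** 2 * y)
    return 2 * x * e + 2 * x ** 3 * y * e + 3 * y ** 2

def d_dx(g, x, y, h=1e-5):
    """Central-difference estimate of dg/dx."""
    return (g(x + h, y) - g(x - h, y)) / (2 * h)

def d_dy(g, x, y, h=1e-5):
    """Central-difference estimate of dg/dy."""
    return (g(x, y + h) - g(x, y - h)) / (2 * h)

x0, y0 = 0.5, 1.0
print(d_dy(fx, x0, y0), d_dx(fy, x0, y0), mixed_closed_form(x0, y0))
```

All three printed numbers should agree to several decimal places: differentiating first in x and then in y gives the same result as the opposite order.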

Higher-order partial derivatives that are taken with respect to different variables, such as $\frac{\partial^2 f}{\partial y\,\partial x}$ and $\frac{\partial^2 f}{\partial x\,\partial y}$, are called mixed partial derivatives. Notice in the above example that $\frac{\partial^2 f}{\partial y\,\partial x} = \frac{\partial^2 f}{\partial x\,\partial y}$. It turns out that this will usually be the case. Specifically, whenever both $\frac{\partial^2 f}{\partial y\,\partial x}$ and $\frac{\partial^2 f}{\partial x\,\partial y}$ are continuous at a point (a, b), then they are equal at that point.[2] All the functions we will deal with will have continuous partial derivatives of all orders, so you can assume in the remainder of the text that

$$\frac{\partial^2 f}{\partial y\,\partial x} = \frac{\partial^2 f}{\partial x\,\partial y} \quad\text{for all } (x, y) \text{ in the domain of } f.$$

In other words, it doesn’t matter in which order you take partial derivatives. This applies even to mixed partial derivatives of order 3 or higher.

The notation for partial derivatives varies. All of the following are equivalent:

∂f/∂x : f_x(x, y) , f_1(x, y) , D_x(x, y) , D_1(x, y)

∂f/∂y : f_y(x, y) , f_2(x, y) , D_y(x, y) , D_2(x, y)

∂²f/∂x² : f_xx(x, y) , f_11(x, y) , D_xx(x, y) , D_11(x, y)

∂²f/∂y² : f_yy(x, y) , f_22(x, y) , D_yy(x, y) , D_22(x, y)

∂²f/∂y∂x : f_xy(x, y) , f_12(x, y) , D_xy(x, y) , D_12(x, y)

∂²f/∂x∂y : f_yx(x, y) , f_21(x, y) , D_yx(x, y) , D_21(x, y)

[2] See pp. 214-216 in TAYLOR and MANN for a proof.

Exercises

A

For Exercises 1-16, find $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$.

1. $f(x, y) = x^2 + y^2$

2. $f(x, y) = \cos(x + y)$

3. $f(x, y) = \sqrt{x^2 + y + 4}$

4. $f(x, y) = \frac{x + 1}{y + 1}$

5. $f(x, y) = e^{xy} + xy$

6. $f(x, y) = x^2 - y^2 + 6xy + 4x - 8y + 2$

7. $f(x, y) = x^4$

8. $f(x, y) = x + 2y$

9. $f(x, y) = \sqrt{x^2 + y^2}$

10. $f(x, y) = \sin(x + y)$

11. $f(x, y) = \sqrt[3]{x^2 + y + 4}$

12. $f(x, y) = \frac{xy + 1}{x + y}$

13. $f(x, y) = e^{-(x^2 + y^2)}$

14. $f(x, y) = \ln(xy)$

15. $f(x, y) = \sin(xy)$

16. $f(x, y) = \tan(x + y)$

For Exercises 17-26, find $\frac{\partial^2 f}{\partial x^2}$, $\frac{\partial^2 f}{\partial y^2}$ and $\frac{\partial^2 f}{\partial y\,\partial x}$ (use Exercises 1-8, 14, 15).

17. $f(x, y) = x^2 + y^2$

18. $f(x, y) = \cos(x + y)$

19. $f(x, y) = \sqrt{x^2 + y + 4}$

20. $f(x, y) = \frac{x + 1}{y + 1}$

21. $f(x, y) = e^{xy} + xy$

22. $f(x, y) = x^2 - y^2 + 6xy + 4x - 8y + 2$

23. $f(x, y) = x^4$

24. $f(x, y) = x + 2y$

25. $f(x, y) = \ln(xy)$

26. $f(x, y) = \sin(xy)$

B

27. Show that the function f(x, y) = sin(x + y) + cos(x − y) satisfies the wave equation

$$\frac{\partial^2 f}{\partial x^2} - \frac{\partial^2 f}{\partial y^2} = 0.$$

The wave equation is an example of a partial differential equation.

28. Let u and v be twice-differentiable functions of a single variable, and let c ≠ 0 be a constant. Show that f(x, y) = u(x + cy) + v(x − cy) is a solution of the general one-dimensional wave equation[3]

$$\frac{\partial^2 f}{\partial x^2} - \frac{1}{c^2}\,\frac{\partial^2 f}{\partial y^2} = 0.$$

[3] Conversely, it turns out that any solution must be of this form. See Ch. 1 in WEINBERGER.


2.3 Tangent Plane to a Surface

In the previous section we mentioned that the partial derivatives $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$ can be thought of as the rate of change of a function z = f(x, y) in the positive x and y directions, respectively. Recall that the derivative $\frac{dy}{dx}$ of a function y = f(x) has a geometric meaning, namely as the slope of the tangent line to the graph of f at the point (x, f(x)) in ℝ². There is a similar geometric meaning to the partial derivatives $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$ of a function z = f(x, y): given a point (a, b) in the domain D of f(x, y), the trace of the surface described by z = f(x, y) in the plane y = b is a curve in ℝ³ through the point (a, b, f(a, b)), and the slope of the tangent line L_x to that curve at that point is $\frac{\partial f}{\partial x}(a, b)$. Similarly, $\frac{\partial f}{\partial y}(a, b)$ is the slope of the tangent line L_y to the trace of the surface z = f(x, y) in the plane x = a (see Figure 2.3.1).

[Figure 2.3.1: Partial derivatives as slopes. (a) Tangent line L_x in the plane y = b, with slope ∂f/∂x (a, b). (b) Tangent line L_y in the plane x = a, with slope ∂f/∂y (a, b).]

Since the derivative $\frac{dy}{dx}$ of a function y = f(x) is used to find the tangent line to the graph of f (which is a curve in ℝ²), you might expect that partial derivatives can be used to define a tangent plane to the graph of a surface z = f(x, y). This indeed turns out to be the case. First, we need a definition of a tangent plane. The intuitive idea is that a tangent plane “just touches” a surface at a point. The formal definition mimics the intuitive notion of a tangent line to a curve.

Definition 2.4. Let z = f(x, y) be the equation of a surface S in ℝ³, and let P = (a, b, c) be a point on S. Let T be a plane which contains the point P, and let Q = (x, y, z) represent a generic point on the surface S. If the (acute) angle between the vector $\overrightarrow{PQ}$ and the plane T approaches zero as the point Q approaches P along the surface S, then we call T the tangent plane to S at P.

Note that since two lines in ℝ³ determine a plane, then the two tangent lines to the surface z = f(x, y) in the x and y directions described in Figure 2.3.1 are contained in the tangent plane at that point, if the tangent plane exists at that point. The existence of those two tangent lines does not by itself guarantee the existence of the tangent plane. It is possible that if we take the trace of the surface in the plane x − y = 0 (which makes a 45° angle with the positive x-axis), the resulting curve in that plane may have a tangent line which is not in the plane determined by the other two tangent lines, or it may not have a tangent line at all at that point. Luckily, it turns out[4] that if $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$ exist in a region around a point (a, b) and are continuous at (a, b) then the tangent plane to the surface z = f(x, y) will exist at the point (a, b, f(a, b)). In this text, those conditions will always hold.

[Figure 2.3.2: Tangent plane T to the surface z = f(x, y) at the point (a, b, f(a, b)), containing the tangent lines L_x and L_y]

Suppose that we want an equation of the tangent plane T to the surface z = f(x, y) at a point (a, b, f(a, b)). Let L_x and L_y be the tangent lines to the traces of the surface in the planes y = b and x = a, respectively (as in Figure 2.3.2), and suppose that the conditions for T to exist do hold. Then the equation for T is

A(x − a) + B(y − b) + C(z − f(a, b)) = 0 (2.4)

where n = (A, B, C) is a normal vector to the plane T. Since T contains the lines L_x and L_y, then all we need are vectors v_x and v_y that are parallel to L_x and L_y, respectively, and then let n = v_x × v_y.

[Figure 2.3.3: The vector v_x = (1, 0, ∂f/∂x (a, b)) in the xz-plane, with horizontal component 1 and vertical component ∂f/∂x (a, b)]

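Since the slope of L_x is $\frac{\partial f}{\partial x}(a, b)$, then the vector $\mathbf{v}_x = (1, 0, \frac{\partial f}{\partial x}(a, b))$ is parallel to L_x (since v_x lies in the xz-plane and lies in a line with slope $\frac{\frac{\partial f}{\partial x}(a, b)}{1} = \frac{\partial f}{\partial x}(a, b)$. See Figure 2.3.3). Similarly, the vector $\mathbf{v}_y = (0, 1, \frac{\partial f}{\partial y}(a, b))$ is parallel to L_y. Hence, the vector

$$\mathbf{n} = \mathbf{v}_x \times \mathbf{v}_y = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ 1 & 0 & \frac{\partial f}{\partial x}(a, b) \\ 0 & 1 & \frac{\partial f}{\partial y}(a, b) \end{vmatrix} = -\frac{\partial f}{\partial x}(a, b)\,\mathbf{i} - \frac{\partial f}{\partial y}(a, b)\,\mathbf{j} + \mathbf{k}$$

is normal to the plane T. Thus the equation of T is

$$-\frac{\partial f}{\partial x}(a, b)\,(x - a) - \frac{\partial f}{\partial y}(a, b)\,(y - b) + z - f(a, b) = 0. \qquad (2.5)$$

Multiplying both sides by −1, we have the following result:

The equation of the tangent plane to the surface z = f(x, y) at the point (a, b, f(a, b)) is

$$\frac{\partial f}{\partial x}(a, b)\,(x - a) + \frac{\partial f}{\partial y}(a, b)\,(y - b) - z + f(a, b) = 0 \qquad (2.6)$$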

[4] See TAYLOR and MANN, § 6.4.

Example 2.13. Find the equation of the tangent plane to the surface z = x² + y² at the point (1, 2, 5).

Solution: For the function f(x, y) = x² + y², we have $\frac{\partial f}{\partial x} = 2x$ and $\frac{\partial f}{\partial y} = 2y$, so the equation of the tangent plane at the point (1, 2, 5) is

2(1)(x − 1) + 2(2)(y − 2) − z + 5 = 0 , or

2x + 4y − z − 5 = 0 .
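One way to see that the plane in Example 2.13 really is tangent is to compare it with the surface near the point of tangency. The sketch below solves formula (2.6) for z; the function name `tangent_plane` is hypothetical and the partial derivatives 2x and 2y are taken from the example.

```python
def f(x, y):
    """The surface z = x^2 + y^2 from Example 2.13."""
    return x ** 2 + y ** 2

def tangent_plane(a, b):
    """Return z(x, y) for the tangent plane at (a, b, f(a, b)), from formula (2.6)."""
    fx, fy = 2 * a, 2 * b  # partial derivatives of f evaluated at (a, b)
    def z(x, y):
        return f(a, b) + fx * (x - a) + fy * (y - b)
    return z

plane = tangent_plane(1.0, 2.0)
print(plane(1.0, 2.0))                        # height of the plane at the point of tangency
print(plane(0.0, 0.0))                        # equals -5, matching z = 2x + 4y - 5
print(f(1.01, 2.01) - plane(1.01, 2.01))      # small gap between surface and plane nearby
```

The gap between the surface and the plane near (1, 2) is of second order in the distance from that point, which is what "just touches" means in practice.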

In a similar fashion, it can be shown that if a surface is defined implicitly by an equation of the form F(x, y, z) = 0, then the tangent plane to the surface at a point (a, b, c) is given by the equation

$$\frac{\partial F}{\partial x}(a, b, c)\,(x - a) + \frac{\partial F}{\partial y}(a, b, c)\,(y - b) + \frac{\partial F}{\partial z}(a, b, c)\,(z - c) = 0. \qquad (2.7)$$

Note that formula (2.6) is the special case of formula (2.7) where F(x, y, z) = f(x, y) − z.

Example 2.14. Find the equation of the tangent plane to the surface x² + y² + z² = 9 at the point (2, 2, −1).

Solution: For the function F(x, y, z) = x² + y² + z² − 9, we have $\frac{\partial F}{\partial x} = 2x$, $\frac{\partial F}{\partial y} = 2y$, and $\frac{\partial F}{\partial z} = 2z$, so the equation of the tangent plane at (2, 2, −1) is

2(2)(x − 2) + 2(2)(y − 2) + 2(−1)(z + 1) = 0 , or

2x + 2y − z − 9 = 0 .
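Formula (2.7) is mechanical enough to script. The sketch below expands it into the form Ax + By + Cz + D = 0 for the sphere of Example 2.14; the function names `gradient_F` and `tangent_plane_coefficients` are made up for this illustration.

```python
def gradient_F(x, y, z):
    """Partial derivatives of F(x, y, z) = x^2 + y^2 + z^2 - 9."""
    return (2 * x, 2 * y, 2 * z)

def tangent_plane_coefficients(point):
    """Coefficients (A, B, C, D) of A x + B y + C z + D = 0, from formula (2.7)."""
    a, b, c = point
    A, B, C = gradient_F(a, b, c)
    D = -(A * a + B * b + C * c)  # constant term after expanding (2.7)
    return (A, B, C, D)

# (4, 4, -2, -18), i.e. 2x + 2y - z - 9 = 0 after dividing by 2
print(tangent_plane_coefficients((2, 2, -1)))
```

Dividing the coefficients by 2 recovers the equation found in the example.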


Exercises

A

For Exercises 1-6, find the equation of the tangent plane to the surface z = f(x, y) at the point P.

1. f(x, y) = x² + y³, P = (1, 1, 2)

2. f(x, y) = xy, P = (1, −1, −1)

3. f(x, y) = x²y, P = (−1, 1, 1)

4. f(x, y) = xe^y, P = (1, 0, 1)

5. f(x, y) = x + 2y, P = (2, 1, 4)

6. $f(x, y) = \sqrt{x^2 + y^2}$, P = (3, 4, 5)

For Exercises 7-10, find the equation of the tangent plane to the given surface at the point P.

7. $\frac{x^2}{4} + \frac{y^2}{9} + \frac{z^2}{16} = 1$, $P = \left(1, 2, \frac{2\sqrt{11}}{3}\right)$

8. x² + y² + z² = 9, P = (0, 0, 3)

9. x² + y² − z² = 0, P = (3, 4, 5)

10. x² + y² = 4, P = (√3, 1, 0)


2.4 Directional Derivatives and the Gradient

For a function z = f(x, y), we learned that the partial derivatives $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$ represent the (instantaneous) rate of change of f in the positive x and y directions, respectively. What about other directions? It turns out that we can find the rate of change in any direction using a more general type of derivative called a directional derivative.

Definition 2.5. Let f(x, y) be a real-valued function with domain D in ℝ², and let (a, b) be a point in D. Let v be a unit vector in ℝ². Then the directional derivative of f at (a, b) in the direction of v, denoted by D_v f(a, b), is defined as

$$D_{\mathbf{v}} f(a, b) = \lim_{h\to 0} \frac{f((a, b) + h\mathbf{v}) - f(a, b)}{h} \qquad (2.8)$$

Notice in the definition that we seem to be treating the point (a, b) as a vector, since we are adding the vector hv to it. But this is just the usual idea of identifying vectors with their terminal points, which the reader should be used to by now. If we were to write the vector v as v = (v₁, v₂), then

$$D_{\mathbf{v}} f(a, b) = \lim_{h\to 0} \frac{f(a + hv_1, b + hv_2) - f(a, b)}{h}. \qquad (2.9)$$

From this we can immediately recognize that the partial derivatives $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$ are special cases of the directional derivative with v = i = (1, 0) and v = j = (0, 1), respectively. That is, $\frac{\partial f}{\partial x} = D_{\mathbf{i}} f$ and $\frac{\partial f}{\partial y} = D_{\mathbf{j}} f$. Since there are many vectors with the same direction, we use a unit vector in the definition, as that represents a “standard” vector for a given direction.

If $f(x, y)$ has continuous partial derivatives $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$ (which will always be the case in this text), then there is a simple formula for the directional derivative:

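Theorem 2.2. Let $f(x, y)$ be a real-valued function with domain $D$ in $\mathbb{R}^2$ such that the partial derivatives $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$ exist and are continuous in $D$. Let $(a, b)$ be a point in $D$, and let $\mathbf{v} = (v_1, v_2)$ be a unit vector in $\mathbb{R}^2$. Then

\[ D_{\mathbf{v}} f(a, b) = v_1 \frac{\partial f}{\partial x}(a, b) + v_2 \frac{\partial f}{\partial y}(a, b) . \tag{2.10} \]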

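Proof: Note that if $\mathbf{v} = \mathbf{i} = (1, 0)$ then the above formula reduces to $D_{\mathbf{v}} f(a, b) = \frac{\partial f}{\partial x}(a, b)$, which we know is true since $D_{\mathbf{i}} f = \frac{\partial f}{\partial x}$, as we noted earlier. Similarly, for $\mathbf{v} = \mathbf{j} = (0, 1)$ the formula reduces to $D_{\mathbf{v}} f(a, b) = \frac{\partial f}{\partial y}(a, b)$, which is true since $D_{\mathbf{j}} f = \frac{\partial f}{\partial y}$. The cases $\mathbf{v} = -\mathbf{i}$ and $\mathbf{v} = -\mathbf{j}$ follow in the same way, since replacing $h$ by $-h$ in the definition shows that $D_{-\mathbf{v}} f = -D_{\mathbf{v}} f$. So since $\pm\mathbf{i}$ and $\pm\mathbf{j}$ are the only unit vectors in $\mathbb{R}^2$ with a zero component, we need only show the formula holds for unit vectors $\mathbf{v} = (v_1, v_2)$ with $v_1 \neq 0$ and $v_2 \neq 0$. So fix such a vector $\mathbf{v}$ and fix a number $h \neq 0$. Then

\[ f(a + hv_1, b + hv_2) - f(a, b) = f(a + hv_1, b + hv_2) - f(a + hv_1, b) + f(a + hv_1, b) - f(a, b) . \tag{2.11} \]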

Since $h \neq 0$ and $v_2 \neq 0$, then $hv_2 \neq 0$ and thus any number $c$ between $b$ and $b + hv_2$ can be written as $c = b + \alpha h v_2$ for some number $0 < \alpha < 1$. So since the function $f(a + hv_1, y)$ is a real-valued function of $y$ (since $a + hv_1$ is a fixed number), then the Mean Value Theorem from single-variable calculus can be applied to the function $g(y) = f(a + hv_1, y)$ on the interval $[b, b + hv_2]$ (or $[b + hv_2, b]$ if one of $h$ or $v_2$ is negative) to find a number $0 < \alpha < 1$ such that

\[ \frac{\partial f}{\partial y}(a + hv_1, b + \alpha h v_2) = g'(b + \alpha h v_2) = \frac{g(b + hv_2) - g(b)}{b + hv_2 - b} = \frac{f(a + hv_1, b + hv_2) - f(a + hv_1, b)}{hv_2} \]

and so

\[ f(a + hv_1, b + hv_2) - f(a + hv_1, b) = hv_2 \frac{\partial f}{\partial y}(a + hv_1, b + \alpha h v_2) . \]

By a similar argument, there exists a number $0 < \beta < 1$ such that

\[ f(a + hv_1, b) - f(a, b) = hv_1 \frac{\partial f}{\partial x}(a + \beta h v_1, b) . \]

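Thus, by equation (2.11), we have

\[ \begin{aligned} \frac{f(a + hv_1, b + hv_2) - f(a, b)}{h} &= \frac{hv_2 \frac{\partial f}{\partial y}(a + hv_1, b + \alpha h v_2) + hv_1 \frac{\partial f}{\partial x}(a + \beta h v_1, b)}{h} \\ &= v_2 \frac{\partial f}{\partial y}(a + hv_1, b + \alpha h v_2) + v_1 \frac{\partial f}{\partial x}(a + \beta h v_1, b) \end{aligned} \]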

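so by formula (2.9) we have

\[ \begin{aligned} D_{\mathbf{v}} f(a, b) &= \lim_{h \to 0} \frac{f(a + hv_1, b + hv_2) - f(a, b)}{h} \\ &= \lim_{h \to 0} \left[ v_2 \frac{\partial f}{\partial y}(a + hv_1, b + \alpha h v_2) + v_1 \frac{\partial f}{\partial x}(a + \beta h v_1, b) \right] \\ &= v_2 \frac{\partial f}{\partial y}(a, b) + v_1 \frac{\partial f}{\partial x}(a, b) \quad \text{by the continuity of } \tfrac{\partial f}{\partial x} \text{ and } \tfrac{\partial f}{\partial y}, \text{ so} \\ D_{\mathbf{v}} f(a, b) &= v_1 \frac{\partial f}{\partial x}(a, b) + v_2 \frac{\partial f}{\partial y}(a, b) \end{aligned} \]

after reversing the order of summation. QED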

Note that $D_{\mathbf{v}} f(a, b) = \mathbf{v} \cdot \left( \frac{\partial f}{\partial x}(a, b), \frac{\partial f}{\partial y}(a, b) \right)$. The second vector has a special name:


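Definition 2.6. For a real-valued function $f(x, y)$, the gradient of $f$, denoted by $\nabla f$, is the vector

\[ \nabla f = \left( \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y} \right) \tag{2.12} \]

in $\mathbb{R}^2$. For a real-valued function $f(x, y, z)$, the gradient is the vector

\[ \nabla f = \left( \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, \frac{\partial f}{\partial z} \right) \tag{2.13} \]

in $\mathbb{R}^3$. The symbol $\nabla$ is pronounced “del”.⁵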

Corollary 2.3. $D_{\mathbf{v}} f = \mathbf{v} \cdot \nabla f$

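Example 2.15. Find the directional derivative of $f(x, y) = xy^2 + x^3 y$ at the point $(1, 2)$ in the direction of $\mathbf{v} = \left( \frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}} \right)$.

Solution: We see that $\nabla f = (y^2 + 3x^2 y, \; 2xy + x^3)$, so

\[ D_{\mathbf{v}} f(1, 2) = \mathbf{v} \cdot \nabla f(1, 2) = \left( \frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}} \right) \cdot \left( 2^2 + 3(1)^2(2), \; 2(1)(2) + 1^3 \right) = \frac{15}{\sqrt{2}} \]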
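The computation in Example 2.15 can also be checked numerically. The sketch below is only an illustration (the class and method names are invented here, not taken from the text): it evaluates the gradient formula (2.10) and compares it with the difference quotient from (2.9) for a small, arbitrarily chosen value of h.

```java
// Sketch: directional derivative of f(x,y) = x*y^2 + x^3*y (Example 2.15).
// Class and method names are illustrative assumptions, not from the text.
class DirectionalDerivativeDemo {
    static double f(double x, double y) {
        return x * y * y + x * x * x * y;
    }

    // Gradient of f, computed by hand: (y^2 + 3x^2 y, 2xy + x^3)
    static double[] gradient(double x, double y) {
        return new double[] { y * y + 3 * x * x * y, 2 * x * y + x * x * x };
    }

    // Formula (2.10): D_v f = v . grad f, for a unit vector v = (v1, v2)
    static double viaGradient(double x, double y, double v1, double v2) {
        double[] g = gradient(x, y);
        return v1 * g[0] + v2 * g[1];
    }

    // Difference quotient from formula (2.9), with a small nonzero h
    static double viaQuotient(double x, double y, double v1, double v2, double h) {
        return (f(x + h * v1, y + h * v2) - f(x, y)) / h;
    }

    public static void main(String[] args) {
        double s = 1 / Math.sqrt(2); // both components of v
        System.out.println("v . grad f         = " + viaGradient(1, 2, s, s));
        System.out.println("quotient, h = 1e-6 = " + viaQuotient(1, 2, s, s, 1e-6));
        System.out.println("15 / sqrt(2)       = " + 15 / Math.sqrt(2));
    }
}
```

The difference quotient only approximates the limit, so it should agree with the other two values to several decimal places rather than exactly.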

A real-valued function $z = f(x, y)$ whose partial derivatives $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$ exist and are continuous is called continuously differentiable. Assume that $f(x, y)$ is such a function and that $\nabla f \neq \mathbf{0}$. Let $c$ be a real number in the range of $f$ and let $\mathbf{v}$ be a unit vector in $\mathbb{R}^2$ which is tangent to the level curve $f(x, y) = c$ (see Figure 2.4.1).

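Figure 2.4.1 (the level curve $f(x, y) = c$ in the $xy$-plane, with a tangent vector $\mathbf{v}$ and the gradient $\nabla f$ drawn at a point of the curve)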

⁵ Sometimes the notation grad($f$) is used instead of $\nabla f$.


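The value of $f(x, y)$ is constant along a level curve, so since $\mathbf{v}$ is a tangent vector to this curve, then the rate of change of $f$ in the direction of $\mathbf{v}$ is 0, i.e. $D_{\mathbf{v}} f = 0$. But we know that $D_{\mathbf{v}} f = \mathbf{v} \cdot \nabla f = \lVert \mathbf{v} \rVert \, \lVert \nabla f \rVert \cos\theta$, where $\theta$ is the angle between $\mathbf{v}$ and $\nabla f$. So since $\lVert \mathbf{v} \rVert = 1$ then $D_{\mathbf{v}} f = \lVert \nabla f \rVert \cos\theta$. So since $\nabla f \neq \mathbf{0}$ then $D_{\mathbf{v}} f = 0 \Rightarrow \cos\theta = 0 \Rightarrow \theta = 90^\circ$. In other words, $\nabla f \perp \mathbf{v}$, which means that $\nabla f$ is normal to the level curve.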
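As a quick sanity check of this argument, it may help to work it out for one concrete function. The function $f(x, y) = x^2 + y^2$ and the point $(a, b)$ below are chosen here purely for illustration; they are not one of the text's examples.

```latex
% Level curve of f(x,y) = x^2 + y^2 through a point (a,b) \neq (0,0):
% the circle x^2 + y^2 = c with c = a^2 + b^2 > 0.
\[
  \mathbf{v} = \frac{1}{\sqrt{c}}\,(-b,\, a)
  \quad\text{(unit tangent to the circle at } (a,b)\text{)},
  \qquad
  \nabla f(a,b) = (2a,\, 2b)
\]
\[
  D_{\mathbf{v}} f(a,b) = \mathbf{v} \cdot \nabla f(a,b)
  = \frac{1}{\sqrt{c}} \bigl( (-b)(2a) + (a)(2b) \bigr) = 0
\]
```

Here the gradient points radially outward from the origin, which is indeed perpendicular to the circle at every point.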

In general, for any unit vector $\mathbf{v}$ in $\mathbb{R}^2$, we still have $D_{\mathbf{v}} f = \lVert \nabla f \rVert \cos\theta$, where $\theta$ is the angle between $\mathbf{v}$ and $\nabla f$. At a fixed point $(x, y)$ the length $\lVert \nabla f \rVert$ is fixed, and the value of $D_{\mathbf{v}} f$ then varies as $\theta$ varies. The largest value that $D_{\mathbf{v}} f$ can take is when $\cos\theta = 1$ ($\theta = 0^\circ$), while the smallest value occurs when $\cos\theta = -1$ ($\theta = 180^\circ$). In other words, the value of the function $f$ increases the fastest in the direction of $\nabla f$ (since $\theta = 0^\circ$ in that case), and the value of $f$ decreases the fastest in the direction of $-\nabla f$ (since $\theta = 180^\circ$ in that case). We have thus proved the following theorem:

Theorem 2.4. Let $f(x, y)$ be a continuously differentiable real-valued function, with $\nabla f \neq \mathbf{0}$. Then:

(a) The gradient ∇f is normal to any level curve f (x, y) = c.

(b) The value of f (x, y) increases the fastest in the direction of ∇f .

(c) The value of f (x, y) decreases the fastest in the direction of −∇f .

Example 2.16. In which direction does the function $f(x, y) = xy^2 + x^3 y$ increase the fastest from the point $(1, 2)$? In which direction does it decrease the fastest?

Solution: Since $\nabla f = (y^2 + 3x^2 y, \; 2xy + x^3)$, then $\nabla f(1, 2) = (10, 5) \neq \mathbf{0}$. A unit vector in that direction is $\mathbf{v} = \frac{\nabla f}{\lVert \nabla f \rVert} = \left( \frac{2}{\sqrt{5}}, \frac{1}{\sqrt{5}} \right)$. Thus, $f$ increases the fastest in the direction of $\left( \frac{2}{\sqrt{5}}, \frac{1}{\sqrt{5}} \right)$ and decreases the fastest in the direction of $\left( \frac{-2}{\sqrt{5}}, \frac{-1}{\sqrt{5}} \right)$.
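Normalizing the gradient, as in Example 2.16, is easy to do in code. The following is a minimal sketch under the assumption that the gradient has already been worked out by hand; the class and method names are hypothetical, introduced only for this illustration.

```java
// Sketch: direction of fastest increase/decrease for f(x,y) = x*y^2 + x^3*y.
// Names are illustrative assumptions, not from the text.
class SteepestDirectionDemo {
    // Gradient of f, computed by hand: (y^2 + 3x^2 y, 2xy + x^3)
    static double[] gradient(double x, double y) {
        return new double[] { y * y + 3 * x * x * y, 2 * x * y + x * x * x };
    }

    // Scale a nonzero vector in the plane to unit length
    static double[] unit(double[] w) {
        double norm = Math.hypot(w[0], w[1]);
        if (norm == 0) {
            // Theorem 2.4 assumes a nonzero gradient
            throw new IllegalArgumentException("zero vector has no direction");
        }
        return new double[] { w[0] / norm, w[1] / norm };
    }

    public static void main(String[] args) {
        double[] up = unit(gradient(1, 2)); // should be (2/sqrt(5), 1/sqrt(5))
        System.out.println("fastest increase: (" + up[0] + ", " + up[1] + ")");
        System.out.println("fastest decrease: (" + (-up[0]) + ", " + (-up[1]) + ")");
    }
}
```

The explicit check for a zero gradient mirrors the hypothesis of Theorem 2.4: at a point where the gradient vanishes there is no single direction of fastest increase.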

Though we proved Theorem 2.4 for functions of two variables, a similar argument can be used to show that it also applies to functions of three or more variables. Likewise, the directional derivative in the three-dimensional case can also be defined by the formula $D_{\mathbf{v}} f = \mathbf{v} \cdot \nabla f$.

Example 2.17. The temperature $T$ of a solid is given by the function $T(x, y, z) = e^{-x} + e^{-2y} + e^{4z}$, where $x, y, z$ are space coordinates relative to the center of the solid. In which direction from the point $(1, 1, 1)$ will the temperature decrease the fastest?

Solution: Since $\nabla T = (-e^{-x}, -2e^{-2y}, 4e^{4z})$, then the temperature will decrease the fastest in the direction of $-\nabla T(1, 1, 1) = (e^{-1}, 2e^{-2}, -4e^{4})$.


Exercises

A

For Exercises 1-10, compute the gradient ∇f .

1. $f(x, y) = x^2 + y^2 - 1$

2. $f(x, y) = \frac{1}{x^2 + y^2}$

3. $f(x, y) = \sqrt{x^2 + y^2 + 4}$

4. $f(x, y) = x^2 e^y$

5. $f(x, y) = \ln(xy)$

6. $f(x, y) = 2x + 5y$

7. $f(x, y, z) = \sin(xyz)$

8. $f(x, y, z) = x^2 e^{yz}$

9. $f(x, y, z) = x^2 + y^2 + z^2$

10. $f(x, y, z) = \sqrt{x^2 + y^2 + z^2}$

For Exercises 11-14, find the directional derivative of $f$ at the point $P$ in the direction of $\mathbf{v} = \left( \frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}} \right)$.

11. $f(x, y) = x^2 + y^2 - 1$, $P = (1, 1)$

12. $f(x, y) = \frac{1}{x^2 + y^2}$, $P = (1, 1)$

13. $f(x, y) = \sqrt{x^2 + y^2 + 4}$, $P = (1, 1)$

14. $f(x, y) = x^2 e^y$, $P = (1, 1)$

For Exercises 15-16, find the directional derivative of $f$ at the point $P$ in the direction of $\mathbf{v} = \left( \frac{1}{\sqrt{3}}, \frac{1}{\sqrt{3}}, \frac{1}{\sqrt{3}} \right)$.

15. $f(x, y, z) = \sin(xyz)$, $P = (1, 1, 1)$

16. $f(x, y, z) = x^2 e^{yz}$, $P = (1, 1, 1)$

17. Repeat Example 2.16 at the point (2, 3).

18. Repeat Example 2.17 at the point (3, 1, 2).

B

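For Exercises 19-26, let $f(x, y)$ and $g(x, y)$ be continuously differentiable real-valued functions, let $c$ be a constant, and let $\mathbf{v}$ be a unit vector in $\mathbb{R}^2$. Show that:

19. $\nabla(cf) = c\,\nabla f$

20. $\nabla(f + g) = \nabla f + \nabla g$

21. $\nabla(fg) = f\,\nabla g + g\,\nabla f$

22. $\nabla(f/g) = \frac{g\,\nabla f - f\,\nabla g}{g^2}$ if $g(x, y) \neq 0$

23. $D_{-\mathbf{v}} f = -D_{\mathbf{v}} f$

24. $D_{\mathbf{v}}(cf) = c\,D_{\mathbf{v}} f$

25. $D_{\mathbf{v}}(f + g) = D_{\mathbf{v}} f + D_{\mathbf{v}} g$

26. $D_{\mathbf{v}}(fg) = f\,D_{\mathbf{v}} g + g\,D_{\mathbf{v}} f$

27. The function $r(x, y) = \sqrt{x^2 + y^2}$ is the length of the position vector $\mathbf{r} = x\,\mathbf{i} + y\,\mathbf{j}$ for each point $(x, y)$ in $\mathbb{R}^2$. Show that $\nabla r = \frac{1}{r}\,\mathbf{r}$ when $(x, y) \neq (0, 0)$, and that $\nabla(r^2) = 2\,\mathbf{r}$.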


2.5 Maxima and Minima

The gradient can be used to find extreme points of real-valued functions of several

variables, that is, points where the function has a local maximum or local minimum.

We will consider only functions of two variables; functions of three or more variables

require methods using linear algebra.

Definition 2.7. Let $f(x, y)$ be a real-valued function, and let $(a, b)$ be a point in the domain of $f$. We say that $f$ has a local maximum at $(a, b)$ if $f(x, y) \le f(a, b)$ for all $(x, y)$ inside some disk of positive radius centered at $(a, b)$, i.e. there is some sufficiently small $r > 0$ such that $f(x, y) \le f(a, b)$ for all $(x, y)$ for which $(x - a)^2 + (y - b)^2 < r^2$. Likewise, we say that $f$ has a local minimum at $(a, b)$ if $f(x, y) \ge f(a, b)$ for all $(x, y)$ inside some disk of positive radius centered at $(a, b)$.

If f (x, y) ≤ f (a, b) for all (x, y) in the domain of f , then f has a global maximum at

(a, b). If f (x, y) ≥ f (a, b) for all (x, y) in the domain of f , then f has a global minimum

at (a, b).

Suppose that $(a, b)$ is a local maximum point for $f(x, y)$, and that the first-order partial derivatives of $f$ exist at $(a, b)$. We know that $f(a, b)$ is the largest value of $f(x, y)$ as $(x, y)$ goes in all directions from the point $(a, b)$, in some sufficiently small disk centered at $(a, b)$. In particular, $f(a, b)$ is the largest value of $f$ in the $x$ direction (around the point $(a, b)$), that is, the single-variable function $g(x) = f(x, b)$ has a local maximum at $x = a$. So we know that $g'(a) = 0$. Since $g'(x) = \frac{\partial f}{\partial x}(x, b)$, then $\frac{\partial f}{\partial x}(a, b) = 0$. Similarly, $f(a, b)$ is the largest value of $f$ near $(a, b)$ in the $y$ direction and so $\frac{\partial f}{\partial y}(a, b) = 0$.

We thus have the following theorem:

Theorem 2.5. Let $f(x, y)$ be a real-valued function such that both $\frac{\partial f}{\partial x}(a, b)$ and $\frac{\partial f}{\partial y}(a, b)$ exist. Then a necessary condition for $f(x, y)$ to have a local maximum or minimum at $(a, b)$ is that $\nabla f(a, b) = \mathbf{0}$.

Note: Theorem 2.5 can be extended to apply to functions of three or more variables.

A point $(a, b)$ where $\nabla f(a, b) = \mathbf{0}$ is called a critical point for the function $f(x, y)$. So given a function $f(x, y)$, to find the critical points of $f$ you have to solve the equations $\frac{\partial f}{\partial x}(x, y) = 0$ and $\frac{\partial f}{\partial y}(x, y) = 0$ simultaneously for $(x, y)$. Similar to the single-variable case, the necessary condition that $\nabla f(a, b) = \mathbf{0}$ is not always sufficient to guarantee that a critical point is a local maximum or minimum.

Example 2.18. The function $f(x, y) = xy$ has a critical point at $(0, 0)$: $\frac{\partial f}{\partial x} = y = 0 \Rightarrow y = 0$, and $\frac{\partial f}{\partial y} = x = 0 \Rightarrow x = 0$, so $(0, 0)$ is the only critical point. But clearly $f$ does not have a local maximum or minimum at $(0, 0)$ since any disk around $(0, 0)$ contains points $(x, y)$ where the values of $x$ and $y$ have the same sign (so that $f(x, y) = xy > 0 = f(0, 0)$) and different signs (so that $f(x, y) = xy < 0 = f(0, 0)$). In fact, along the path $y = x$ in $\mathbb{R}^2$, $f(x, y) = x^2$, which has a local minimum at $(0, 0)$, while along the path $y = -x$ we have $f(x, y) = -x^2$, which has a local maximum at $(0, 0)$. So $(0, 0)$ is an example of a saddle point, i.e. it is a local maximum in one direction and a local minimum in another direction. The graph of $f(x, y)$ is shown in Figure 2.5.1, which is a hyperbolic paraboloid.

Figure 2.5.1 $f(x, y) = xy$, saddle point at $(0, 0)$

The following theorem gives sufficient conditions for a critical point to be a local maximum or minimum of a smooth function (i.e. a function whose partial derivatives of all orders exist and are continuous), which we will not prove here.⁶

Theorem 2.6. Let $f(x, y)$ be a smooth real-valued function, with a critical point at $(a, b)$ (i.e. $\nabla f(a, b) = \mathbf{0}$). Define

\[ D = \frac{\partial^2 f}{\partial x^2}(a, b) \, \frac{\partial^2 f}{\partial y^2}(a, b) - \left( \frac{\partial^2 f}{\partial y \, \partial x}(a, b) \right)^2 \]

Then

(a) if $D > 0$ and $\frac{\partial^2 f}{\partial x^2}(a, b) > 0$, then $f$ has a local minimum at $(a, b)$

(b) if $D > 0$ and $\frac{\partial^2 f}{\partial x^2}(a, b) < 0$, then $f$ has a local maximum at $(a, b)$

(c) if $D < 0$, then $f$ has neither a local minimum nor a local maximum at $(a, b)$

(d) if $D = 0$, then the test fails.

⁶ See TAYLOR and MANN, § 7.6.


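If condition (c) holds, then $(a, b)$ is a saddle point. Note that the assumption that $f(x, y)$ is smooth means that

\[ D = \begin{vmatrix} \dfrac{\partial^2 f}{\partial x^2}(a, b) & \dfrac{\partial^2 f}{\partial y \, \partial x}(a, b) \\[2ex] \dfrac{\partial^2 f}{\partial x \, \partial y}(a, b) & \dfrac{\partial^2 f}{\partial y^2}(a, b) \end{vmatrix} \]

since $\frac{\partial^2 f}{\partial y \, \partial x} = \frac{\partial^2 f}{\partial x \, \partial y}$. Also, if $D > 0$ then

\[ \frac{\partial^2 f}{\partial x^2}(a, b) \, \frac{\partial^2 f}{\partial y^2}(a, b) = D + \left( \frac{\partial^2 f}{\partial y \, \partial x}(a, b) \right)^2 > 0 , \]

and so $\frac{\partial^2 f}{\partial x^2}(a, b)$ and $\frac{\partial^2 f}{\partial y^2}(a, b)$ have the same sign. This means that in parts (a) and (b) of the theorem one can replace $\frac{\partial^2 f}{\partial x^2}(a, b)$ by $\frac{\partial^2 f}{\partial y^2}(a, b)$ if desired.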

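Example 2.19. Find all local maxima and minima of $f(x, y) = x^2 + xy + y^2 - 3x$.

Solution: First find the critical points, i.e. where $\nabla f = \mathbf{0}$. Since

\[ \frac{\partial f}{\partial x} = 2x + y - 3 \quad \text{and} \quad \frac{\partial f}{\partial y} = x + 2y \]

then the critical points $(x, y)$ are the common solutions of the equations

\[ \begin{aligned} 2x + y - 3 &= 0 \\ x + 2y &= 0 \end{aligned} \]

which has the unique solution $(x, y) = (2, -1)$. So $(2, -1)$ is the only critical point.

To use Theorem 2.6, we need the second-order partial derivatives:

\[ \frac{\partial^2 f}{\partial x^2} = 2 , \quad \frac{\partial^2 f}{\partial y^2} = 2 , \quad \frac{\partial^2 f}{\partial y \, \partial x} = 1 \]

and so

\[ D = \frac{\partial^2 f}{\partial x^2}(2, -1) \, \frac{\partial^2 f}{\partial y^2}(2, -1) - \left( \frac{\partial^2 f}{\partial y \, \partial x}(2, -1) \right)^2 = (2)(2) - 1^2 = 3 > 0 \]

and $\frac{\partial^2 f}{\partial x^2}(2, -1) = 2 > 0$. Thus, $(2, -1)$ is a local minimum.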

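Example 2.20. Find all local maxima and minima of $f(x, y) = xy - x^3 - y^2$.

Solution: First find the critical points, i.e. where $\nabla f = \mathbf{0}$. Since

\[ \frac{\partial f}{\partial x} = y - 3x^2 \quad \text{and} \quad \frac{\partial f}{\partial y} = x - 2y \]

then the critical points $(x, y)$ are the common solutions of the equations

\[ \begin{aligned} y - 3x^2 &= 0 \\ x - 2y &= 0 \end{aligned} \]

The first equation yields $y = 3x^2$; substituting that into the second equation yields $x - 6x^2 = 0$, which has the solutions $x = 0$ and $x = \frac{1}{6}$. So $x = 0 \Rightarrow y = 3(0)^2 = 0$ and $x = \frac{1}{6} \Rightarrow y = 3\left(\frac{1}{6}\right)^2 = \frac{1}{12}$.

So the critical points are $(x, y) = (0, 0)$ and $(x, y) = \left( \frac{1}{6}, \frac{1}{12} \right)$.

To use Theorem 2.6, we need the second-order partial derivatives:

\[ \frac{\partial^2 f}{\partial x^2} = -6x , \quad \frac{\partial^2 f}{\partial y^2} = -2 , \quad \frac{\partial^2 f}{\partial y \, \partial x} = 1 \]

So

\[ D = \frac{\partial^2 f}{\partial x^2}(0, 0) \, \frac{\partial^2 f}{\partial y^2}(0, 0) - \left( \frac{\partial^2 f}{\partial y \, \partial x}(0, 0) \right)^2 = (-6(0))(-2) - 1^2 = -1 < 0 \]

and thus $(0, 0)$ is a saddle point. Also,

\[ D = \frac{\partial^2 f}{\partial x^2}\left( \tfrac{1}{6}, \tfrac{1}{12} \right) \frac{\partial^2 f}{\partial y^2}\left( \tfrac{1}{6}, \tfrac{1}{12} \right) - \left( \frac{\partial^2 f}{\partial y \, \partial x}\left( \tfrac{1}{6}, \tfrac{1}{12} \right) \right)^2 = \left( -6\left( \tfrac{1}{6} \right) \right)(-2) - 1^2 = 1 > 0 \]

and $\frac{\partial^2 f}{\partial x^2}\left( \frac{1}{6}, \frac{1}{12} \right) = -1 < 0$. Thus, $\left( \frac{1}{6}, \frac{1}{12} \right)$ is a local maximum.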

Example 2.21. Find all local maxima and minima of $f(x, y) = (x - 2)^4 + (x - 2y)^2$.

Solution: First find the critical points, i.e. where $\nabla f = \mathbf{0}$. Since

\[ \frac{\partial f}{\partial x} = 4(x - 2)^3 + 2(x - 2y) \quad \text{and} \quad \frac{\partial f}{\partial y} = -4(x - 2y) \]

then the critical points $(x, y)$ are the common solutions of the equations

\[ \begin{aligned} 4(x - 2)^3 + 2(x - 2y) &= 0 \\ -4(x - 2y) &= 0 \end{aligned} \]

The second equation yields $x = 2y$; substituting that into the first equation yields $4(2y - 2)^3 = 0$, which has the solution $y = 1$, and so $x = 2(1) = 2$. Thus, $(2, 1)$ is the only critical point.

To use Theorem 2.6, we need the second-order partial derivatives:

\[ \frac{\partial^2 f}{\partial x^2} = 12(x - 2)^2 + 2 , \quad \frac{\partial^2 f}{\partial y^2} = 8 , \quad \frac{\partial^2 f}{\partial y \, \partial x} = -4 \]

So

\[ D = \frac{\partial^2 f}{\partial x^2}(2, 1) \, \frac{\partial^2 f}{\partial y^2}(2, 1) - \left( \frac{\partial^2 f}{\partial y \, \partial x}(2, 1) \right)^2 = (2)(8) - (-4)^2 = 0 \]

and so the test fails. What can be done in this situation? Sometimes it is possible to examine the function to see directly the nature of a critical point. In our case, we see that $f(x, y) \ge 0$ for all $(x, y)$, since $f(x, y)$ is the sum of fourth and second powers of numbers and hence must be nonnegative. But we also see that $f(2, 1) = 0$. Thus $f(x, y) \ge 0 = f(2, 1)$ for all $(x, y)$, and hence $(2, 1)$ is in fact a global minimum for $f$.
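The case analysis in Theorem 2.6 is mechanical enough to put into a few lines of code. The sketch below is an illustration only: the class and method names are made up here, the second-order partials are assumed to have been evaluated by hand at the critical point, and the exact comparison with zero is reasonable only because the inputs in these examples are exact small numbers (with rounded floating-point inputs a tolerance would be needed).

```java
// Sketch: the second derivative test of Theorem 2.6 as a small helper.
// Names are illustrative assumptions, not from the text.
class SecondDerivativeTestDemo {
    // fxx, fyy, fxy are the second-order partials evaluated at a critical point
    static String classify(double fxx, double fyy, double fxy) {
        double d = fxx * fyy - fxy * fxy; // the quantity D of Theorem 2.6
        if (d > 0) {
            // D > 0 forces fxx to be nonzero, so only its sign matters
            return fxx > 0 ? "local minimum" : "local maximum";
        }
        if (d < 0) {
            return "saddle point";
        }
        return "test fails";
    }

    public static void main(String[] args) {
        // Example 2.19 at (2,-1): fxx = 2, fyy = 2, fxy = 1
        System.out.println("(2,-1):     " + classify(2, 2, 1));
        // Example 2.20 at (0,0): fxx = 0, fyy = -2, fxy = 1
        System.out.println("(0,0):      " + classify(0, -2, 1));
        // Example 2.20 at (1/6,1/12): fxx = -1, fyy = -2, fxy = 1
        System.out.println("(1/6,1/12): " + classify(-1, -2, 1));
        // Example 2.21 at (2,1): fxx = 2, fyy = 8, fxy = -4
        System.out.println("(2,1):      " + classify(2, 8, -4));
    }
}
```

For the point $(2, 1)$ of Example 2.21 the helper can only report that the test fails; deciding that it is a global minimum still requires the direct argument given in the example.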

Example 2.22. Find all local maxima and minima of $f(x, y) = (x^2 + y^2)e^{-(x^2 + y^2)}$.

Solution: First find the critical points, i.e. where $\nabla f = \mathbf{0}$. Since

\[ \begin{aligned} \frac{\partial f}{\partial x} &= 2x(1 - (x^2 + y^2))e^{-(x^2 + y^2)} \\ \frac{\partial f}{\partial y} &= 2y(1 - (x^2 + y^2))e^{-(x^2 + y^2)} \end{aligned} \]

then the critical points are $(0, 0)$ and all points $(x, y)$ on the unit circle $x^2 + y^2 = 1$.

To use Theorem 2.6, we need the second-order partial derivatives:

\[ \begin{aligned} \frac{\partial^2 f}{\partial x^2} &= 2[1 - (x^2 + y^2) - 2x^2 - 2x^2(1 - (x^2 + y^2))]e^{-(x^2 + y^2)} \\ \frac{\partial^2 f}{\partial y^2} &= 2[1 - (x^2 + y^2) - 2y^2 - 2y^2(1 - (x^2 + y^2))]e^{-(x^2 + y^2)} \\ \frac{\partial^2 f}{\partial y \, \partial x} &= -4xy[2 - (x^2 + y^2)]e^{-(x^2 + y^2)} \end{aligned} \]

At $(0, 0)$, we have $D = 4 > 0$ and $\frac{\partial^2 f}{\partial x^2}(0, 0) = 2 > 0$, so $(0, 0)$ is a local minimum. However, for points $(x, y)$ on the unit circle $x^2 + y^2 = 1$, we have

\[ D = (-4x^2 e^{-1})(-4y^2 e^{-1}) - (-4xy e^{-1})^2 = 0 \]

and so the test fails. If we look at the graph of $f(x, y)$, as shown in Figure 2.5.2, it looks like we might have a local maximum for $(x, y)$ on the unit circle $x^2 + y^2 = 1$. If we switch to using polar coordinates $(r, \theta)$ instead of $(x, y)$ in $\mathbb{R}^2$, where $r^2 = x^2 + y^2$, then we see that we can write $f(x, y)$ as a function $g(r)$ of the variable $r$ alone: $g(r) = r^2 e^{-r^2}$. Then $g'(r) = 2r(1 - r^2)e^{-r^2}$, so it has a critical point at $r = 1$, and we can check that $g''(1) = -4e^{-1} < 0$, so the Second Derivative Test from single-variable calculus says that $r = 1$ is a local maximum. But $r = 1$ corresponds to the unit circle $x^2 + y^2 = 1$. Thus, the points $(x, y)$ on the unit circle $x^2 + y^2 = 1$ are local maximum points for $f$.

Figure 2.5.2 $f(x, y) = (x^2 + y^2)e^{-(x^2 + y^2)}$

Exercises

A

For Exercises 1-10, find all local maxima and minima of the function f (x, y).

1. $f(x, y) = x^3 - 3x + y^2$

2. $f(x, y) = x^3 - 12x + y^2 + 8y$

3. $f(x, y) = x^3 - 3x + y^3 - 3y$

4. $f(x, y) = x^3 + 3x^2 + y^3 - 3y^2$

5. $f(x, y) = 2x^3 + 6xy + 3y^2$

6. $f(x, y) = 2x^3 - 6xy + y^2$

7. $f(x, y) = \sqrt{x^2 + y^2}$

8. $f(x, y) = x + 2y$

9. $f(x, y) = 4x^2 - 4xy + 2y^2 + 10x - 6y$

10. $f(x, y) = -4x^2 + 4xy - 2y^2 + 16x - 12y$

B

11. For a rectangular solid of volume 1000 cubic meters, find the dimensions that will

minimize the surface area. (Hint: Use the volume condition to write the surface

area as a function of just two variables.)

12. Prove that if $(a, b)$ is a local maximum or local minimum point for a smooth function $f(x, y)$, then the tangent plane to the surface $z = f(x, y)$ at the point $(a, b, f(a, b))$ is parallel to the $xy$-plane. (Hint: Use Theorem 2.5.)

C

13. Find three positive numbers $x, y, z$ whose sum is 10 such that $x^2 y^2 z$ is a maximum.


2.6 Unconstrained Optimization: Numerical Methods

The types of problems that we solved in the previous section were examples of unconstrained optimization problems. That is, we tried to find local (and perhaps even global) maximum and minimum points of real-valued functions $f(x, y)$, where the points $(x, y)$ could be any points in the domain of $f$. The method we used required us to find the critical points of $f$, which meant having to solve the equation $\nabla f = \mathbf{0}$, which in general is a system of two equations in two unknowns ($x$ and $y$). While this was relatively simple for the examples we did, in general this will not be the case. If the equations involve polynomials in $x$ and $y$ of degree three or higher, or complicated expressions involving trigonometric, exponential, or logarithmic functions, then solving even one such equation, let alone two, could be impossible by elementary means.⁷

For example, if one of the equations that had to be solved was

\[ x^3 + 9x - 2 = 0 , \]

you may have a hard time getting the exact solutions. Trial and error would not help much, especially since the only real solution⁸ turns out to be $\sqrt[3]{\sqrt{28} + 1} - \sqrt[3]{\sqrt{28} - 1}$. In a situation such as this, the only choice may be to find a solution using some numerical method which gives a sequence of numbers which converge to the actual solution. One example is Newton’s method for solving equations $f(x) = 0$, which you probably learned in single-variable calculus. In this section we will describe another method of Newton for finding critical points of real-valued functions of two variables.
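To see the single-variable Newton's method at work on the cubic above, here is a minimal sketch. The class and method names, the starting point 0, and the iteration count of 20 are all choices made here for illustration; the iteration used is the standard one, $x_{n+1} = x_n - g(x_n)/g'(x_n)$.

```java
// Sketch: single-variable Newton's method applied to x^3 + 9x - 2 = 0.
// Names, starting point and iteration count are illustrative assumptions.
class CubicNewtonDemo {
    static double g(double x) {
        return x * x * x + 9 * x - 2;
    }

    // g'(x) = 3x^2 + 9, which is never zero, so the division below is safe
    static double gPrime(double x) {
        return 3 * x * x + 9;
    }

    // Repeat x -> x - g(x)/g'(x) a fixed number of times
    static double newton(double x0, int iterations) {
        double x = x0;
        for (int n = 0; n < iterations; n++) {
            x = x - g(x) / gPrime(x);
        }
        return x;
    }

    // The exact real root quoted in the text
    static double closedForm() {
        double s = Math.sqrt(28);
        return Math.cbrt(s + 1) - Math.cbrt(s - 1);
    }

    public static void main(String[] args) {
        System.out.println("Newton's method: " + newton(0, 20));
        System.out.println("closed form:     " + closedForm());
    }
}
```

Both printed values should agree to many decimal places, which illustrates how a numerical method can stand in for an exact formula that is awkward to find by hand.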

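Let $f(x, y)$ be a smooth real-valued function, and define

\[ D(x, y) = \frac{\partial^2 f}{\partial x^2}(x, y) \, \frac{\partial^2 f}{\partial y^2}(x, y) - \left( \frac{\partial^2 f}{\partial y \, \partial x}(x, y) \right)^2 . \]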

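Newton’s algorithm: Pick an initial point $(x_0, y_0)$. For $n = 0, 1, 2, 3, \ldots$, define:

\[ x_{n+1} = x_n - \frac{\begin{vmatrix} \dfrac{\partial^2 f}{\partial y^2}(x_n, y_n) & \dfrac{\partial^2 f}{\partial x \, \partial y}(x_n, y_n) \\[2ex] \dfrac{\partial f}{\partial y}(x_n, y_n) & \dfrac{\partial f}{\partial x}(x_n, y_n) \end{vmatrix}}{D(x_n, y_n)} , \qquad y_{n+1} = y_n - \frac{\begin{vmatrix} \dfrac{\partial^2 f}{\partial x^2}(x_n, y_n) & \dfrac{\partial^2 f}{\partial x \, \partial y}(x_n, y_n) \\[2ex] \dfrac{\partial f}{\partial x}(x_n, y_n) & \dfrac{\partial f}{\partial y}(x_n, y_n) \end{vmatrix}}{D(x_n, y_n)} \tag{2.14} \]

Then the sequence of points $(x_n, y_n)_{n=1}^{\infty}$ converges to a critical point. If there are several critical points, then you will have to try different initial points to find them.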
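For readers who know a little matrix algebra, formulas (2.14) can be written more compactly. The subscript shorthand ($f_x$, $f_{xy}$, and so on) and the symbol $H$ for the matrix of second-order partial derivatives are introduced here only for illustration and are not notation from the text; every quantity is evaluated at $(x_n, y_n)$, and the equality of the mixed partials relies on $f$ being smooth.

```latex
\[
  H = \begin{pmatrix} f_{xx} & f_{xy} \\ f_{xy} & f_{yy} \end{pmatrix},
  \qquad \det H = f_{xx} f_{yy} - f_{xy}^2 = D,
  \qquad
  H^{-1} = \frac{1}{D} \begin{pmatrix} f_{yy} & -f_{xy} \\ -f_{xy} & f_{xx} \end{pmatrix}
\]
\[
  \begin{pmatrix} x_{n+1} \\ y_{n+1} \end{pmatrix}
  = \begin{pmatrix} x_n \\ y_n \end{pmatrix} - H^{-1} \begin{pmatrix} f_x \\ f_y \end{pmatrix}
  = \begin{pmatrix} x_n \\ y_n \end{pmatrix}
    - \frac{1}{D} \begin{pmatrix} f_{yy} f_x - f_{xy} f_y \\ f_{xx} f_y - f_{xy} f_x \end{pmatrix}
\]
```

Expanding the two determinants in (2.14) gives exactly the two entries of the last vector, so the two descriptions agree, and the requirement $D \neq 0$ is the same as requiring that $H$ be invertible.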

⁷ This is also a problem for the equivalent method (the Second Derivative Test) in single-variable calculus, though one that is not usually emphasized.

⁸ There are also two nonreal, complex number solutions. Cubic polynomial equations in one variable can be solved using Cardan’s formulas, which are not quite as simple as the familiar quadratic formula. See USPENSKY for more details. There are formulas for solving polynomial equations of degree 4, but it can be proved that there is no general formula in terms of radicals for solving polynomial equations of degree five or higher.


Example 2.23. Find all local maxima and minima of $f(x, y) = x^3 - xy - x + xy^3 - y^4$.

Solution: First calculate the necessary partial derivatives:

\[ \frac{\partial f}{\partial x} = 3x^2 - y - 1 + y^3 , \quad \frac{\partial f}{\partial y} = -x + 3xy^2 - 4y^3 \]

\[ \frac{\partial^2 f}{\partial x^2} = 6x , \quad \frac{\partial^2 f}{\partial y^2} = 6xy - 12y^2 , \quad \frac{\partial^2 f}{\partial y \, \partial x} = -1 + 3y^2 \]

Notice that solving $\nabla f = \mathbf{0}$ would involve solving two third-degree polynomial equations in $x$ and $y$, which in this case cannot be done easily.

We need to pick an initial point $(x_0, y_0)$ for our algorithm. Looking at the graph of $z = f(x, y)$ over a large region may help (see Figure 2.6.1 below), though it may be hard to tell where the critical points are.

Figure 2.6.1 $f(x, y) = x^3 - xy - x + xy^3 - y^4$ for $-20 \le x \le 20$ and $-20 \le y \le 20$

Notice in the formulas (2.14) that we divide by $D$, so we should pick an initial point where $D$ is not zero. And we can see that $D(0, 0) = (0)(0) - (-1)^2 = -1 \neq 0$, so take $(0, 0)$ as our initial point. Since it may take a large number of iterations of Newton’s algorithm to be sure that we are close enough to the actual critical point, and since the computations are quite tedious, we will let a computer do the computing. For this, we will write a simple program, using the Java programming language, which will take a given initial point as a parameter and then perform 100 iterations of Newton’s algorithm. In each iteration the new point will be printed, so that we can see if there is convergence. The full code is shown in Listing 2.1.


```java
//Program to find the critical points of f(x,y)=x^3-xy-x+xy^3-y^4
public class newton {
    public static void main(String[] args) {
        //Get the initial point (x,y) as command-line parameters
        double x = Double.parseDouble(args[0]); //Initial x value
        double y = Double.parseDouble(args[1]); //Initial y value
        System.out.println("Initial point: (" + x + "," + y + ")");
        //Go through 100 iterations of Newton's algorithm
        for (int n=1; n<=100; n++) {
            double D = fxx(x,y)*fyy(x,y) - Math.pow(fxy(x,y),2);
            double xn = x; double yn = y; //The current x and y values
            if (D == 0) { //We can not divide by 0
                System.out.println("Error: D = 0 at iteration n = " + n);
                System.exit(0); //End the program
            } else { //Calculate the new values for x and y
                x = xn - (fyy(xn,yn)*fx(xn,yn) - fxy(xn,yn)*fy(xn,yn))/D;
                y = yn - (fxx(xn,yn)*fy(xn,yn) - fxy(xn,yn)*fx(xn,yn))/D;
                System.out.println("n = " + n + ": (" + x + "," + y + ")");
            }
        }
    }

    //Below are the parts specific to the function f
    //The first partial derivative of f wrt x: 3x^2-y-1+y^3
    public static double fx(double x, double y) {
        return 3*Math.pow(x,2) - y - 1 + Math.pow(y,3);
    }

    //The first partial derivative of f wrt y: -x+3xy^2-4y^3
    public static double fy(double x, double y) {
        return -x + 3*x*Math.pow(y,2) - 4*Math.pow(y,3);
    }

    //The second partial derivative of f wrt x: 6x
    public static double fxx(double x, double y) {
        return 6*x;
    }

    //The second partial derivative of f wrt y: 6xy-12y^2
    public static double fyy(double x, double y) {
        return 6*x*y - 12*Math.pow(y,2);
    }

    //The mixed second partial derivative of f wrt x and y: -1+3y^2
    public static double fxy(double x, double y) {
        return -1 + 3*Math.pow(y,2);
    }
}
```

Listing 2.1 Program listing for newton.java


To use this program, you should first save the code in Listing 2.1 in a plain text file called newton.java. You will need the Java Development Kit⁹ to compile the code. In the directory where newton.java is saved, run this command at a command prompt to compile the code:

javac newton.java

Then run the program with the initial point (0, 0) with this command:

java newton 0 0

Below is the output of the program using (0, 0) as the initial point, truncated to show

the first 10 lines and the last 5 lines:

java newton 0 0

Initial point: (0.0,0.0)

n = 1: (0.0,-1.0)

n = 2: (1.0,-0.5)

n = 3: (0.6065857885615251,-0.44194107452339687)

n = 4: (0.484506572966545,-0.405341511995805)

n = 5: (0.47123972682634485,-0.3966334583092305)

n = 6: (0.47113558510349535,-0.39636450001936047)

n = 7: (0.4711356343449705,-0.3963643379632247)

n = 8: (0.4711356343449874,-0.39636433796318005)

n = 9: (0.4711356343449874,-0.39636433796318005)

n = 10: (0.4711356343449874,-0.39636433796318005)

...

n = 96: (0.4711356343449874,-0.39636433796318005)

n = 97: (0.4711356343449874,-0.39636433796318005)

n = 98: (0.4711356343449874,-0.39636433796318005)

n = 99: (0.4711356343449874,-0.39636433796318005)

n = 100: (0.4711356343449874,-0.39636433796318005)

As you can see, we appear to have converged fairly quickly (after only 8 iterations) to what appears to be an actual critical point (up to Java’s level of precision), namely the point $(0.4711356343449874, -0.39636433796318005)$. It is easy to confirm that $\nabla f = \mathbf{0}$ at this point, either by evaluating $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$ at the point ourselves or by modifying our program to also print the values of the partial derivatives at the point. It turns out that both partial derivatives are indeed close enough to zero to be considered zero:

\[ \begin{aligned} \frac{\partial f}{\partial x}(0.4711356343449874, -0.39636433796318005) &= 4.85722573273506 \times 10^{-17} \\ \frac{\partial f}{\partial y}(0.4711356343449874, -0.39636433796318005) &= -8.326672684688674 \times 10^{-17} \end{aligned} \]

We also have $D(0.4711356343449874, -0.39636433796318005) = -8.776075636032301 < 0$, so by Theorem 2.6 we know that $(0.4711356343449874, -0.39636433796318005)$ is a saddle point.
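One way to carry out the confirmation just described is a small separate program that evaluates the first partial derivatives and D at the reported point. This is only a sketch: the class name and method names are hypothetical, and it uses plain multiplication instead of Math.pow, so the tiny printed residuals may differ in their last digits from the values quoted in the text.

```java
// Sketch: check that the point found by Newton's algorithm is a critical
// point of f(x,y) = x^3 - xy - x + xy^3 - y^4, and evaluate D there.
// Class and method names are illustrative assumptions, not from the text.
class CriticalPointCheck {
    static double fx(double x, double y) {
        return 3 * x * x - y - 1 + y * y * y;
    }

    static double fy(double x, double y) {
        return -x + 3 * x * y * y - 4 * y * y * y;
    }

    // D = fxx * fyy - fxy^2, with the second-order partials computed by hand
    static double d(double x, double y) {
        double fxx = 6 * x;
        double fyy = 6 * x * y - 12 * y * y;
        double fxy = -1 + 3 * y * y;
        return fxx * fyy - fxy * fxy;
    }

    public static void main(String[] args) {
        double x = 0.4711356343449874;
        double y = -0.39636433796318005;
        System.out.println("df/dx = " + fx(x, y));
        System.out.println("df/dy = " + fy(x, y));
        System.out.println("D     = " + d(x, y));
    }
}
```

Because floating-point arithmetic is inexact, the first two printed values should be judged by their size (on the order of $10^{-16}$ or smaller) rather than compared with zero exactly.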

⁹ Available for free at http://java.sun.com/javase/downloads


Since $\nabla f$ consists of cubic polynomials, it seems likely that there may be three critical points. The computer program makes experimenting with other initial points easy, and trying different values does indeed lead to different sequences which converge:

java newton -1 -1

Initial point: (-1.0,-1.0)

n = 1: (-0.5,-0.5)

n = 2: (-0.49295774647887325,-0.08450704225352113)

n = 3: (-0.1855674752461383,-1.2047647348546167)

n = 4: (-0.4540060574531383,-0.8643989895639324)

n = 5: (-0.3672160534444,-0.5426077421319053)

n = 6: (-0.4794622222856417,-0.24529117721011612)

n = 7: (0.11570743992954591,-2.4319791238981274)

n = 8: (-0.05837851765533317,-1.6536079835854451)

n = 9: (-0.129841298650007,-1.121516233310142)

n = 10: (-1.004453014967208,-0.9206128022529645)

n = 11: (-0.5161209914612475,-0.4176293491131443)

n = 12: (-0.5788664043863884,0.2918236503332734)

n = 13: (-0.6985177124230715,0.49848120123515316)

n = 14: (-0.6733618916578702,0.4345777963475479)

n = 15: (-0.6704392913413444,0.4252025996474051)

n = 16: (-0.6703832679150286,0.4250147307973365)

n = 17: (-0.6703832459238701,0.42501465652421205)

n = 18: (-0.6703832459238667,0.4250146565242004)

n = 19: (-0.6703832459238667,0.42501465652420045)

n = 20: (-0.6703832459238667,0.42501465652420045)

...

n = 98: (-0.6703832459238667,0.42501465652420045)

n = 99: (-0.6703832459238667,0.42501465652420045)

n = 100: (-0.6703832459238667,0.42501465652420045)

Again, it is easy to confirm that both $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$ vanish at the point $(-0.6703832459238667, 0.42501465652420045)$, which means it is a critical point. And

\[ \begin{aligned} D(-0.6703832459238667, 0.42501465652420045) &= 15.3853578526055 > 0 \\ \frac{\partial^2 f}{\partial x^2}(-0.6703832459238667, 0.42501465652420045) &= -4.0222994755432 < 0 \end{aligned} \]

so we know that $(-0.6703832459238667, 0.42501465652420045)$ is a local maximum. An idea of what the graph of $f$ looks like near that point is shown in Figure 2.6.2, which does suggest a local maximum around that point.

Finally, running the computer program with the initial point $(-5, -5)$ yields the critical point $(-7.540962756992551, -5.595509445899435)$, with $D < 0$ at that point, which makes it a saddle point.

Figure 2.6.2 $f(x, y) = x^3 - xy - x + xy^3 - y^4$ for $-1 \le x \le 0$ and $0 \le y \le 1$, with the point $(-0.67, 0.42, 0.57)$ marked

We can summarize our findings for the function $f(x, y) = x^3 - xy - x + xy^3 - y^4$:

(0.4711356343449874, −0.39636433796318005) : saddle point

(−0.6703832459238667, 0.42501465652420045) : local maximum

(−7.540962756992551, −5.595509445899435) : saddle point

The derivation of Newton’s algorithm, and the proof that it converges (given a “reasonable” choice for the initial point) requires techniques beyond the scope of this text. See RALSTON and RABINOWITZ for more detail and for discussion of other numerical methods. Our description of Newton’s algorithm is the special two-variable case of a more general algorithm that can be applied to functions of $n \ge 2$ variables.

In the case of functions which have a global maximum or minimum, Newton’s algo-

rithm can be used to find those points. In general, global maxima and minima tend

to be more interesting than local versions, at least in practical applications. A maxi-

mization problem can always be turned into a minimization problem (why?), so a large

number of methods have been developed to find the global minimum of functions of

any number of variables. This field of study is called nonlinear programming. Many of

these methods are based on the steepest descent technique, which is based on an idea

that we discussed in Section 2.4. Recall that the negative gradient −∇f gives the di-

rection of the fastest rate of decrease of a function f . The crux of the steepest descent

idea, then, is that starting from some initial point, you move a certain amount in the


direction of −∇f at that point. Wherever that takes you becomes your new point, and

you then just keep repeating that procedure until eventually (hopefully) you reach the

point where f has its smallest value. There is a “pure” steepest descent method, and a

multitude of variations on it that improve the rate of convergence, ease of calculation,

etc. In fact, Newton’s algorithm can be interpreted as a modified steepest descent

method. For more discussion of this, and of nonlinear programming in general, see

BAZARAA, SHERALI and SHETTY.
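To make the steepest descent idea concrete, here is a minimal Python sketch of the "pure" version with a fixed step size. The test function f(x, y) = (x − 1)² + 2(y + 3)², the step size 0.1, and the iteration count are all assumptions chosen for illustration; they do not come from the text, and practical methods choose the step size more carefully.

```python
def grad_f(x, y):
    # Gradient of the illustrative function f(x, y) = (x - 1)^2 + 2(y + 3)^2,
    # whose global minimum is at (1, -3).
    return 2 * (x - 1), 4 * (y + 3)

def steepest_descent(x, y, step=0.1, iterations=200):
    """Repeatedly move a fixed fraction of the way along -grad f."""
    for _ in range(iterations):
        gx, gy = grad_f(x, y)
        x, y = x - step * gx, y - step * gy
    return x, y

x_min, y_min = steepest_descent(0.0, 0.0)
print(x_min, y_min)  # should be very close to (1, -3)
```

For this particular function each step shrinks the distance to the minimum by a constant factor, so the iterates converge; for other functions a fixed step may be too large or too small.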

Exercises

C

1. Recall Example 2.21 from the previous section, where we showed that the point

(2, 1) was a global minimum for the function $f(x, y) = (x - 2)^4 + (x - 2y)^2$. Notice

that our computer program can be modified fairly easily to use this function (just

change the return values in the fx, fy, fxx, fyy and fxy function definitions to use the

appropriate partial derivative). Either modify that program or write one of your

own in a programming language of your choice to show that Newton’s algorithm

does lead to the point (2, 1). First use the initial point (0, 3), then use the initial

point (3, 2), and compare the results. Make sure that your program attempts to do

100 iterations of the algorithm. Did anything strange happen when your program

ran? If so, how do you explain it? (Hint: Something strange should happen.)

2. There is a version of Newton’s algorithm for solving a system of two equations

\[ f_1(x, y) = 0 \quad\text{and}\quad f_2(x, y) = 0 , \]

where $f_1(x, y)$ and $f_2(x, y)$ are smooth real-valued functions:

Pick an initial point $(x_0, y_0)$. For n = 0, 1, 2, 3, . . . , define:

\[
x_{n+1} = x_n - \frac{\begin{vmatrix} f_1(x_n, y_n) & f_2(x_n, y_n) \\[2pt] \dfrac{\partial f_1}{\partial y}(x_n, y_n) & \dfrac{\partial f_2}{\partial y}(x_n, y_n) \end{vmatrix}}{D(x_n, y_n)} , \qquad
y_{n+1} = y_n + \frac{\begin{vmatrix} f_1(x_n, y_n) & f_2(x_n, y_n) \\[2pt] \dfrac{\partial f_1}{\partial x}(x_n, y_n) & \dfrac{\partial f_2}{\partial x}(x_n, y_n) \end{vmatrix}}{D(x_n, y_n)} ,
\]

where

\[ D(x_n, y_n) = \frac{\partial f_1}{\partial x}(x_n, y_n)\,\frac{\partial f_2}{\partial y}(x_n, y_n) - \frac{\partial f_1}{\partial y}(x_n, y_n)\,\frac{\partial f_2}{\partial x}(x_n, y_n) . \]

Then the sequence of points $(x_n, y_n)_{n=1}^{\infty}$ converges to a solution. Write a computer program that uses this algorithm to find approximate solutions to the system of equations

\[ f_1(x, y) = \sin(xy) - x - y = 0 \quad\text{and}\quad f_2(x, y) = e^{2x} - 2x + 3y = 0 . \]

Show that you get two different solutions when using (0, 0) and (1, 1) for the initial point $(x_0, y_0)$.


2.7 Constrained Optimization: Lagrange Multipliers

In Sections 2.5 and 2.6 we were concerned with finding maxima and minima of func-

tions without any constraints on the variables (other than being in the domain of the

function). What would we do if there were constraints on the variables? The following

example illustrates a simple case of this type of problem.

Example 2.24. For a rectangle whose perimeter is 20 m, find the dimensions that will

maximize the area.

Solution: The area A of a rectangle with width x and height y is A = xy. The perimeter

P of the rectangle is then given by the formula P = 2x +2y. Since we are given that the

perimeter P = 20, this problem can be stated as:

Maximize : f (x, y) = xy

given : 2x + 2y = 20

The reader is probably familiar with a simple method, using single-variable calculus, for solving this problem. Since we must have 2x + 2y = 20, then we can solve for, say, y in terms of x using that equation. This gives y = 10 − x, which we then substitute into f to get $f(x, y) = xy = x(10 - x) = 10x - x^2$. This is now a function of x alone, so we now just have to maximize the function $f(x) = 10x - x^2$ on the interval [0, 10]. Since $f'(x) = 10 - 2x = 0 \Rightarrow x = 5$ and $f''(5) = -2 < 0$, then the Second Derivative Test tells us that x = 5 is a local maximum for f, and hence x = 5 must be the global maximum on the interval [0, 10] (since f = 0 at the endpoints of the interval). So since y = 10 − x = 5, then the maximum area occurs for a rectangle whose width and height both are 5 m.

Notice in the above example that the ease of the solution depended on being able to

solve for one variable in terms of the other in the equation 2x + 2y = 20. But what if

that were not possible (which is often the case)? In this section we will use a general

method, called the Lagrange multiplier method¹⁰, for solving constrained optimization problems:

Maximize (or minimize) : f (x, y) (or f (x, y, z))

given : g(x, y) = c (or g(x, y, z) = c) for some constant c

The equation g(x, y) = c is called the constraint equation, and we say that x and y are

constrained by g(x, y) = c. Points (x, y) which are maxima or minima of f (x, y) with the

condition that they satisfy the constraint equation g(x, y) = c are called constrained

maximum or constrained minimum points, respectively. Similar definitions hold for

functions of three variables.

The Lagrange multiplier method for solving such problems can now be stated:

¹⁰ Named after the French mathematician Joseph Louis Lagrange (1736-1813).


Theorem 2.7. Let f (x, y) and g(x, y) be smooth functions, and suppose that c is a scalar

constant such that ∇g(x, y) ≠ 0 for all (x, y) that satisfy the equation g(x, y) = c. Then to

solve the constrained optimization problem

Maximize (or minimize) : f (x, y)

given : g(x, y) = c ,

find the points (x, y) that solve the equation ∇f (x, y) = λ∇g(x, y) for some constant λ

(the number λ is called the Lagrange multiplier). If there is a constrained maximum

or minimum, then it must be such a point.

A rigorous proof of the above theorem requires use of the Implicit Function Theorem, which is beyond the scope of this text.¹¹ Note that the theorem only gives a necessary condition for a point to be a constrained maximum or minimum. Whether a point (x, y) that satisfies ∇f (x, y) = λ∇g(x, y) for some λ actually is a constrained maximum or minimum can sometimes be determined by the nature of the problem itself. For instance, in Example 2.24 it was clear that there had to be a global maximum.

So how can you tell when a point that satisfies the condition in Theorem 2.7 really is a constrained maximum or minimum? The answer is that it depends on the constraint function g(x, y), together with any implicit constraints. It can be shown¹² that if the constraint equation g(x, y) = c (plus any hidden constraints) describes a bounded set B in $\mathbb{R}^2$, then the constrained maximum or minimum of f (x, y) will occur either at a point (x, y) satisfying ∇f (x, y) = λ∇g(x, y) or at a “boundary” point of the set B.

In Example 2.24 the constraint equation 2x + 2y = 20 describes a line in $\mathbb{R}^2$, which by itself is not bounded. However, there are “hidden” constraints, due to the nature of the problem, namely 0 ≤ x, y ≤ 10, which cause that line to be restricted to a line segment in $\mathbb{R}^2$ (including the endpoints of that line segment), which is bounded.

Example 2.25. For a rectangle whose perimeter is 20 m, use the Lagrange multiplier

method to find the dimensions that will maximize the area.

Solution: As we saw in Example 2.24, with x and y representing the width and height,

respectively, of the rectangle, this problem can be stated as:

Maximize : f (x, y) = xy

given : g(x, y) = 2x + 2y = 20

Then solving the equation ∇f (x, y) = λ∇g(x, y) for some λ means solving the equations $\dfrac{\partial f}{\partial x} = \lambda \dfrac{\partial g}{\partial x}$ and $\dfrac{\partial f}{\partial y} = \lambda \dfrac{\partial g}{\partial y}$, namely:

\[ y = 2\lambda , \qquad x = 2\lambda \]

¹¹ See TAYLOR and MANN, § 6.8 for more detail.

¹² Again, see TAYLOR and MANN.

The general idea is to solve for λ in both equations, then set those expressions equal

(since they both equal λ) to solve for x and y. Doing this we get

\[ \frac{y}{2} = \lambda = \frac{x}{2} \quad\Rightarrow\quad x = y , \]

so now substitute either of the expressions for x or y into the constraint equation to

solve for x and y:

20 = g(x, y) = 2x + 2y = 2x + 2x = 4x ⇒ x = 5 ⇒ y = 5

There must be a maximum area, since the minimum area is 0 and f (5, 5) = 25 >

0, so the point (5, 5) that we found (called a constrained critical point) must be the

constrained maximum.

∴ The maximum area occurs for a rectangle whose width and height both are 5 m.

Example 2.26. Find the points on the circle $x^2 + y^2 = 80$ which are closest to and farthest from the point (1, 2).

Solution: The distance d from any point (x, y) to the point (1, 2) is

\[ d = \sqrt{(x - 1)^2 + (y - 2)^2} , \]

and minimizing the distance is equivalent to minimizing the square of the distance. Thus the problem can be stated as:

Maximize (and minimize) : $f(x, y) = (x - 1)^2 + (y - 2)^2$

given : $g(x, y) = x^2 + y^2 = 80$

Solving ∇f (x, y) = λ∇g(x, y) means solving the following equations:

2(x − 1) = 2λx ,

2(y − 2) = 2λy

Note that x ≠ 0 since otherwise we would get −2 = 0 in the first equation. Similarly, y ≠ 0. So we can solve both equations for λ as follows:

\[ \frac{x - 1}{x} = \lambda = \frac{y - 2}{y} \quad\Rightarrow\quad xy - y = xy - 2x \quad\Rightarrow\quad y = 2x \]


[Figure 2.7.1: the circle $x^2 + y^2 = 80$ in the xy-plane, with the points (4, 8), (1, 2) and (−4, −8) marked]

Substituting this into $g(x, y) = x^2 + y^2 = 80$ yields $5x^2 = 80$, so x = ±4. So the two constrained critical points are (4, 8) and (−4, −8). Since f (4, 8) = 45 and f (−4, −8) = 125, and since there must be points on the circle closest to and farthest from (1, 2), then it must be the case that (4, 8) is the point on the circle closest to (1, 2) and (−4, −8) is the farthest from (1, 2) (see Figure 2.7.1).

Notice that since the constraint equation $x^2 + y^2 = 80$ describes a circle, which is a bounded set in $\mathbb{R}^2$, then we were guaranteed that the constrained critical points we found were indeed the constrained maximum and minimum.
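As a sanity check on Example 2.26, the circle can be sampled directly and the squared distance compared at each sample. This is only a brute-force sketch in Python, not the Lagrange multiplier method itself; the sample count n = 100000 is an arbitrary choice made for illustration.

```python
import math

def f(x, y):
    # Squared distance from (x, y) to the point (1, 2).
    return (x - 1) ** 2 + (y - 2) ** 2

r = math.sqrt(80)   # radius of the circle x^2 + y^2 = 80
n = 100_000         # number of sample points on the circle (arbitrary)

samples = [
    (r * math.cos(2 * math.pi * k / n), r * math.sin(2 * math.pi * k / n))
    for k in range(n)
]

closest = min(samples, key=lambda p: f(*p))
farthest = max(samples, key=lambda p: f(*p))

print("closest :", closest, f(*closest))
print("farthest:", farthest, f(*farthest))
```

The sampled closest and farthest points should land near (4, 8) and (−4, −8), with squared distances near 45 and 125, in agreement with the constrained critical points found above.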

The Lagrange multiplier method can be extended to functions of three variables.

Example 2.27.

Maximize (and minimize) : $f(x, y, z) = x + z$

given : $g(x, y, z) = x^2 + y^2 + z^2 = 1$

Solution: Solve the equation ∇f (x, y, z) = λ∇g(x, y, z):

\[
\begin{aligned}
1 &= 2\lambda x \\
0 &= 2\lambda y \\
1 &= 2\lambda z
\end{aligned}
\]

The first equation implies λ ≠ 0 (otherwise we would have 1 = 0), so we can divide by λ in the second equation to get y = 0 and we can divide by λ in the first and third equations to get $x = \frac{1}{2\lambda} = z$. Substituting these expressions into the constraint equation $g(x, y, z) = x^2 + y^2 + z^2 = 1$ yields the constrained critical points $\left(\frac{1}{\sqrt{2}}, 0, \frac{1}{\sqrt{2}}\right)$ and $\left(\frac{-1}{\sqrt{2}}, 0, \frac{-1}{\sqrt{2}}\right)$. Since $f\left(\frac{1}{\sqrt{2}}, 0, \frac{1}{\sqrt{2}}\right) > f\left(\frac{-1}{\sqrt{2}}, 0, \frac{-1}{\sqrt{2}}\right)$, and since the constraint equation $x^2 + y^2 + z^2 = 1$ describes a sphere (which is bounded) in $\mathbb{R}^3$, then $\left(\frac{1}{\sqrt{2}}, 0, \frac{1}{\sqrt{2}}\right)$ is the constrained maximum point and $\left(\frac{-1}{\sqrt{2}}, 0, \frac{-1}{\sqrt{2}}\right)$ is the constrained minimum point.

So far we have not attached any significance to the value of the Lagrange multiplier

λ. We needed λ only to find the constrained critical points, but made no use of its value.

It turns out that λ gives an approximation of the change in the value of the function

f (x, y) that we wish to maximize or minimize, when the constant c in the constraint

equation g(x, y) = c is changed by 1.


For example, in Example 2.25 we showed that the constrained optimization problem

Maximize : f (x, y) = xy

given : g(x, y) = 2x + 2y = 20

had the solution (x, y) = (5, 5), and that λ = x/2 = y/2. Thus, λ = 2.5. In a similar

fashion we could show that the constrained optimization problem

Maximize : f (x, y) = xy

given : g(x, y) = 2x + 2y = 21

has the solution (x, y) = (5.25, 5.25). So we see that the value of f (x, y) at the constrained

maximum increased from f (5, 5) = 25 to f (5.25, 5.25) = 27.5625, i.e. it increased by

2.5625 when we increased the value of c in the constraint equation g(x, y) = c from

c = 20 to c = 21. Notice that λ = 2.5 is close to 2.5625, that is,

λ ≈ ∆f = f (new max. pt) − f (old max. pt) .
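The comparison above can be reproduced for any value of c. The Python sketch below assumes the closed-form solution obtained by repeating the steps of Example 2.25 with 20 replaced by c (so x = y = c/4 and λ = x/2 = c/8); the function name `max_area` and the small increment h are hypothetical choices made for illustration.

```python
def max_area(c):
    """Constrained maximum of f(x, y) = xy subject to 2x + 2y = c.

    Following Example 2.25 with 20 replaced by c: y = 2*lam and x = 2*lam,
    so x = y = c/4 and lam = x/2.
    """
    x = y = c / 4
    lam = x / 2
    return x * y, lam

area_20, lam_20 = max_area(20)
area_21, _ = max_area(21)

print("lambda at c = 20      :", lam_20)             # 2.5
print("change in f, c 20->21 :", area_21 - area_20)  # 2.5625

# With a smaller change h in c, the change in f per unit change in c
# moves closer to lambda.
h = 0.01
rate = (max_area(20 + h)[0] - area_20) / h
print("rate of change for h = 0.01:", rate)
```

The last line illustrates why λ is only an approximation of the change when c changes by a full unit: λ matches the rate of change of the maximum value with respect to c, and the approximation improves as the change in c shrinks.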

Finally, note that solving the equation ∇f (x, y) = λ∇g(x, y) means having to solve a system of two (possibly nonlinear) equations in three unknowns (x, y and λ), together with the constraint equation g(x, y) = c, which as we have seen before, may not be possible to do. And the 3-variable case can get even more complicated. All of this somewhat restricts the usefulness of Lagrange’s method to relatively simple functions. Luckily there are many numerical methods for solving constrained optimization problems, though we will not discuss them here.¹³

Exercises

A

1. Find the constrained maxima and minima of $f(x, y) = 2x + y$ given that $x^2 + y^2 = 4$.

2. Find the constrained maxima and minima of $f(x, y) = xy$ given that $x^2 + 3y^2 = 6$.

3. Find the points on the circle $x^2 + y^2 = 100$ which are closest to and farthest from the point (2, 3).

B

4. Find the constrained maxima and minima of $f(x, y, z) = x + y^2 + 2z$ given that $4x^2 + 9y^2 - 36z^2 = 36$.

5. Find the volume of the largest rectangular parallelepiped that can be inscribed in the ellipsoid

\[ \frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1 . \]

¹³ See BAZARAA, SHERALI and SHETTY.

3 Multiple Integrals

3.1 Double Integrals

In single-variable calculus, differentiation and integration are thought of as inverse

operations. For instance, to integrate a function f (x) it is necessary to find the an-

tiderivative of f , that is, another function F(x) whose derivative is f (x). Is there a

similar way of defining integration of real-valued functions of two or more variables?

The answer is yes, as we will see shortly. Recall also that the definite integral of a

nonnegative function f (x) ≥ 0 represented the area “under” the curve y = f (x). As

we will now see, the double integral of a nonnegative real-valued function f (x, y) ≥ 0

represents the volume “under” the surface z = f (x, y).

Let f (x, y) be a continuous function such that f (x, y) ≥ 0 for all (x, y) on the rectangle R = {(x, y) : a ≤ x ≤ b, c ≤ y ≤ d} in $\mathbb{R}^2$. We will often write this as R = [a, b] × [c, d].

For any number x∗ in the interval [a, b], slice the surface z = f (x, y) with the plane

x = x∗ parallel to the yz-plane. Then the trace of the surface in that plane is the curve

f (x∗, y), where x∗ is fixed and only y varies. The area A under that curve (i.e. the area

of the region between the curve and the xy-plane) as y varies over the interval [c, d]

then depends only on the value of x∗. So using the variable x instead of x∗, let A(x) be

that area (see Figure 3.1.1).

[Figure 3.1.1: The area A(x) varies with x; the surface z = f (x, y) over the rectangle R, with a ≤ x ≤ b and c ≤ y ≤ d]

Then $A(x) = \int_c^d f(x, y)\,dy$ since we are treating x as fixed, and only y varies. This

makes sense since for a fixed x the function f (x, y) is a continuous function of y over

the interval [c, d], so we know that the area under the curve is the definite integral.


The area A(x) is a function of x, so by the “slice” or cross-section method from single-

variable calculus we know that the volume V of the solid under the surface z = f (x, y)

but above the xy-plane over the rectangle R is the integral over [a, b] of that cross-

sectional area A(x):

\[ V = \int_a^b A(x)\,dx = \int_a^b \left[ \int_c^d f(x, y)\,dy \right] dx \tag{3.1} \]

We will always refer to this volume as “the volume under the surface”. The above

expression uses what are called iterated integrals. First the function f (x, y) is inte-

grated as a function of y, treating the variable x as a constant (this is called integrat-

ing with respect to y). That is what occurs in the “inner” integral between the square

brackets in equation (3.1). This is the first iterated integral. Once that integration

is performed, the result is then an expression involving only x, which can then be

integrated with respect to x. That is what occurs in the “outer” integral above (the sec-

ond iterated integral). The final result is then a number (the volume). This process

of going through two iterations of integrals is called double integration, and the last

expression in equation (3.1) is called a double integral.

Notice that integrating f (x, y) with respect to y is the inverse operation of taking the

partial derivative of f (x, y) with respect to y. Also, we could just as easily have taken

the area of cross-sections under the surface which were parallel to the xz-plane, which

would then depend only on the variable y, so that the volume V would be

\[ V = \int_c^d \left[ \int_a^b f(x, y)\,dx \right] dy . \tag{3.2} \]

It turns out that in general¹ the order of the iterated integrals does not matter. Also,

we will usually discard the brackets and simply write

\[ V = \int_c^d \int_a^b f(x, y)\,dx\,dy , \tag{3.3} \]

where it is understood that the fact that dx is written before dy means that the func-

tion f (x, y) is first integrated with respect to x using the “inner” limits of integration a

and b, and then the resulting function is integrated with respect to y using the “outer”

limits of integration c and d. This order of integration can be changed if it is more

convenient.

Example 3.1. Find the volume V under the plane z = 8x + 6y over the rectangle R =

[0, 1] × [0, 2].

¹ Due to Fubini’s Theorem. See Ch. 18 in TAYLOR and MANN.


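Solution: We see that f (x, y) = 8x + 6y ≥ 0 for 0 ≤ x ≤ 1 and 0 ≤ y ≤ 2, so:

\[
\begin{aligned}
V &= \int_0^2 \int_0^1 (8x + 6y)\,dx\,dy \\
  &= \int_0^2 \left( 4x^2 + 6xy \,\Big|_{x=0}^{x=1} \right) dy \\
  &= \int_0^2 (4 + 6y)\,dy \\
  &= 4y + 3y^2 \,\Big|_0^2 \\
  &= 20
\end{aligned}
\]

Suppose we had switched the order of integration. We can verify that we still get the same answer:

\[
\begin{aligned}
V &= \int_0^1 \int_0^2 (8x + 6y)\,dy\,dx \\
  &= \int_0^1 \left( 8xy + 3y^2 \,\Big|_{y=0}^{y=2} \right) dx \\
  &= \int_0^1 (16x + 12)\,dx \\
  &= 8x^2 + 12x \,\Big|_0^1 \\
  &= 20
\end{aligned}
\]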
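The two computations in Example 3.1 can be mimicked numerically by nesting a one-variable integration rule inside itself, once in each order. The Python sketch below uses the midpoint rule; the helper name `midpoint_integral` and the number of subintervals are assumptions made for illustration, and any other one-variable rule could be substituted.

```python
def midpoint_integral(g, a, b, n=200):
    """Approximate the integral of g over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(g(a + (k + 0.5) * h) for k in range(n))

def f(x, y):
    return 8 * x + 6 * y

# Inner integral with respect to x (y held fixed), then outer with respect to y.
dx_first = midpoint_integral(
    lambda y: midpoint_integral(lambda x: f(x, y), 0, 1), 0, 2
)

# Inner integral with respect to y (x held fixed), then outer with respect to x.
dy_first = midpoint_integral(
    lambda x: midpoint_integral(lambda y: f(x, y), 0, 2), 0, 1
)

print(dx_first, dy_first)  # both should be approximately 20
```

The nesting mirrors the brackets in equations (3.1) and (3.2): the inner call treats one variable as a constant and returns a function of the other, which the outer call then integrates.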

Example 3.2. Find the volume V under the surface $z = e^{x+y}$ over the rectangle R = [2, 3] × [1, 2].

Solution: We know that $f(x, y) = e^{x+y} > 0$ for all (x, y), so

\[
\begin{aligned}
V &= \int_1^2 \int_2^3 e^{x+y}\,dx\,dy \\
  &= \int_1^2 \left( e^{x+y} \,\Big|_{x=2}^{x=3} \right) dy \\
  &= \int_1^2 (e^{y+3} - e^{y+2})\,dy \\
  &= e^{y+3} - e^{y+2} \,\Big|_1^2 \\
  &= e^5 - e^4 - (e^4 - e^3) = e^5 - 2e^4 + e^3
\end{aligned}
\]

Recall that for a general function f (x), the integral $\int_a^b f(x)\,dx$ represents the difference of the area below the curve y = f (x) but above the x-axis when f (x) ≥ 0, and the


area above the curve but below the x-axis when f (x) ≤ 0. Similarly, the double inte-

gral of any continuous function f (x, y) represents the difference of the volume below

the surface z = f (x, y) but above the xy-plane when f (x, y) ≥ 0, and the volume above

the surface but below the xy-plane when f (x, y) ≤ 0. Thus, our method of double inte-

gration by means of iterated integrals can be used to evaluate the double integral of

any continuous function over a rectangle, regardless of whether f (x, y) ≥ 0 or not.

Example 3.3. Evaluate $\int_0^{2\pi} \int_0^{\pi} \sin(x + y)\,dx\,dy$.

Solution: Note that f (x, y) = sin(x + y) is both positive and negative over the rectangle [0, π] × [0, 2π]. We can still evaluate the double integral:

\[
\begin{aligned}
\int_0^{2\pi} \int_0^{\pi} \sin(x + y)\,dx\,dy &= \int_0^{2\pi} \left( -\cos(x + y) \,\Big|_{x=0}^{x=\pi} \right) dy \\
  &= \int_0^{2\pi} (-\cos(y + \pi) + \cos y)\,dy \\
  &= -\sin(y + \pi) + \sin y \,\Big|_0^{2\pi} \\
  &= -\sin 3\pi + \sin 2\pi - (-\sin \pi + \sin 0) \\
  &= 0
\end{aligned}
\]

Exercises

A

For Exercises 1-4, find the volume under the surface z = f (x, y) over the rectangle R.

1. $f(x, y) = 4xy$, R = [0, 1] × [0, 1]

2. $f(x, y) = e^{x+y}$, R = [0, 1] × [−1, 1]

3. $f(x, y) = x^3 + y^2$, R = [0, 1] × [0, 1]

4. $f(x, y) = x^4 + xy + y^3$, R = [1, 2] × [0, 2]

For Exercises 5-12, evaluate the given double integral.

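5. $\int_0^1 \int_1^2 (1 - y)x^2\,dx\,dy$

6. $\int_0^1 \int_0^2 x(x + y)\,dx\,dy$

7. $\int_0^2 \int_0^1 (x + 2)\,dx\,dy$

8. $\int_{-1}^2 \int_{-1}^1 x(xy + \sin x)\,dx\,dy$

9. $\int_0^{\pi/2} \int_0^1 xy\cos(x^2 y)\,dx\,dy$

10. $\int_0^{\pi} \int_0^{\pi/2} \sin x \cos(y - \pi)\,dx\,dy$

11. $\int_0^2 \int_1^4 xy\,dx\,dy$

12. $\int_{-1}^1 \int_{-1}^2 1\,dx\,dy$

13. Let M be a constant. Show that $\int_c^d \int_a^b M\,dx\,dy = M(d - c)(b - a)$.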


3.2 Double Integrals Over a General Region

In the previous section we got an idea of what a double integral over a rectangle represents. We can now define the double integral of a real-valued function f (x, y) over more general regions in $\mathbb{R}^2$.

Suppose that we have a region R in the xy-plane that is bounded on the left by the vertical line x = a, bounded on the right by the vertical line x = b (where a < b), bounded below by a curve $y = g_1(x)$, and bounded above by a curve $y = g_2(x)$, as in Figure 3.2.1(a). We will assume that $g_1(x)$ and $g_2(x)$ do not intersect on the open interval (a, b) (they could intersect at the endpoints x = a and x = b, though).

[Figure 3.2.1: Double integral over a nonrectangular region R. (a) Vertical slice: $\int_a^b \int_{g_1(x)}^{g_2(x)} f(x, y)\,dy\,dx$. (b) Horizontal slice: $\int_c^d \int_{h_1(y)}^{h_2(y)} f(x, y)\,dx\,dy$]

Then using the slice method from the previous section, the double integral of a real-valued function f (x, y) over the region R, denoted by $\iint_R f(x, y)\,dA$, is given by

\[ \iint_R f(x, y)\,dA = \int_a^b \left[ \int_{g_1(x)}^{g_2(x)} f(x, y)\,dy \right] dx \tag{3.4} \]

This means that we take vertical slices in the region R between the curves $y = g_1(x)$ and $y = g_2(x)$. The symbol dA is sometimes called an area element or infinitesimal, with the A signifying area. Note that f (x, y) is first integrated with respect to y, with functions of x as the limits of integration. This makes sense since the result of the first iterated integral will have to be a function of x alone, which then allows us to take the second iterated integral with respect to x.

Similarly, if we have a region R in the xy-plane that is bounded on the left by a curve $x = h_1(y)$, bounded on the right by a curve $x = h_2(y)$, bounded below by the horizontal


line y = c, and bounded above by the horizontal line y = d (where c < d), as in Figure 3.2.1(b) (assuming that $h_1(y)$ and $h_2(y)$ do not intersect on the open interval (c, d)), then taking horizontal slices gives

\[ \iint_R f(x, y)\,dA = \int_c^d \left[ \int_{h_1(y)}^{h_2(y)} f(x, y)\,dx \right] dy \tag{3.5} \]

Notice that these definitions include the case when the region R is a rectangle.

Also, if f (x, y) ≥ 0 for all (x, y) in the region R, then $\iint_R f(x, y)\,dA$ is the volume under the surface z = f (x, y) over the region R.

Example 3.4. Find the volume V under the plane z = 8x + 6y over the region $R = \{(x, y) : 0 \le x \le 1,\ 0 \le y \le 2x^2\}$.

[Figure 3.2.2: the region R in the xy-plane below the curve $y = 2x^2$ for 0 ≤ x ≤ 1]

Solution: The region R is shown in Figure 3.2.2. Using vertical slices we get:

\[
\begin{aligned}
V &= \iint_R (8x + 6y)\,dA \\
  &= \int_0^1 \left[ \int_0^{2x^2} (8x + 6y)\,dy \right] dx \\
  &= \int_0^1 \left( 8xy + 3y^2 \,\Big|_{y=0}^{y=2x^2} \right) dx \\
  &= \int_0^1 (16x^3 + 12x^4)\,dx \\
  &= 4x^4 + \tfrac{12}{5}x^5 \,\Big|_0^1 = 4 + \tfrac{12}{5} = \tfrac{32}{5} = 6.4
\end{aligned}
\]

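[Figure 3.2.3: the same region R described with horizontal slices, bounded on the left by the curve $x = \sqrt{y/2}$ and on the right by the line x = 1, for 0 ≤ y ≤ 2]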

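We get the same answer using horizontal slices (see Figure 3.2.3):

\[
\begin{aligned}
V &= \iint_R (8x + 6y)\,dA \\
  &= \int_0^2 \left[ \int_{\sqrt{y/2}}^1 (8x + 6y)\,dx \right] dy \\
  &= \int_0^2 \left( 4x^2 + 6xy \,\Big|_{x=\sqrt{y/2}}^{x=1} \right) dy \\
  &= \int_0^2 \left( 4 + 6y - \left( 2y + \tfrac{6}{\sqrt{2}}\, y\sqrt{y} \right) \right) dy = \int_0^2 \left( 4 + 4y - 3\sqrt{2}\, y^{3/2} \right) dy \\
  &= 4y + 2y^2 - \tfrac{6\sqrt{2}}{5}\, y^{5/2} \,\Big|_0^2 = 8 + 8 - \tfrac{6\sqrt{2}\sqrt{32}}{5} = 16 - \tfrac{48}{5} = \tfrac{32}{5} = 6.4
\end{aligned}
\]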
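Both slicing orders in Example 3.4 can also be checked numerically, with the inner limits of integration passed in as functions of the outer variable. This is a minimal Python sketch using a nested midpoint rule; the helper name `midpoint_integral` and the subinterval count are assumptions made for illustration.

```python
import math

def midpoint_integral(g, a, b, n=400):
    """Approximate the integral of g over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(g(a + (k + 0.5) * h) for k in range(n))

def f(x, y):
    return 8 * x + 6 * y

# Vertical slices: for each x in [0, 1], y runs from 0 to 2x^2.
vertical = midpoint_integral(
    lambda x: midpoint_integral(lambda y: f(x, y), 0, 2 * x**2), 0, 1
)

# Horizontal slices: for each y in [0, 2], x runs from sqrt(y/2) to 1.
horizontal = midpoint_integral(
    lambda y: midpoint_integral(lambda x: f(x, y), math.sqrt(y / 2), 1), 0, 2
)

print(vertical, horizontal)  # both should be close to 32/5 = 6.4
```

The only change from the rectangular case is that the inner limits depend on the outer variable, exactly as in equations (3.4) and (3.5).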


Example 3.5. Find the volume V of the solid bounded by the three coordinate planes

and the plane 2x + y + 4z = 4.

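[Figure 3.2.4: (a) the solid bounded by the coordinate planes and the plane 2x + y + 4z = 4, with intercepts (2, 0, 0), (0, 4, 0) and (0, 0, 1); (b) the region R in the xy-plane below the line y = −2x + 4, with intercepts x = 2 and y = 4]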

Solution: The solid is shown in Figure 3.2.4(a) with a typical vertical slice. The volume V is given by $\iint_R f(x, y)\,dA$, where $f(x, y) = z = \frac{1}{4}(4 - 2x - y)$ and the region R, shown in Figure 3.2.4(b), is R = {(x, y) : 0 ≤ x ≤ 2, 0 ≤ y ≤ −2x + 4}. Using vertical slices in R gives

\[
\begin{aligned}
V &= \iint_R \tfrac{1}{4}(4 - 2x - y)\,dA \\
  &= \int_0^2 \left[ \int_0^{-2x+4} \tfrac{1}{4}(4 - 2x - y)\,dy \right] dx \\
  &= \int_0^2 \left( -\tfrac{1}{8}(4 - 2x - y)^2 \,\Big|_{y=0}^{y=-2x+4} \right) dx \\
  &= \int_0^2 \tfrac{1}{8}(4 - 2x)^2\,dx \\
  &= -\tfrac{1}{48}(4 - 2x)^3 \,\Big|_0^2 = \tfrac{64}{48} = \tfrac{4}{3}
\end{aligned}
\]

For a general region R, which may not be one of the types of regions we have considered so far, the double integral $\iint_R f(x, y)\,dA$ is defined as follows. Assume that f (x, y) is a nonnegative real-valued function and that R is a bounded region in $\mathbb{R}^2$, so it can be enclosed in some rectangle [a, b] × [c, d]. Then divide that rectangle into a grid of subrectangles. Only consider the subrectangles that are enclosed completely within the region R, as shown by the shaded subrectangles in Figure 3.2.5(a). In any such subrectangle $[x_i, x_{i+1}] \times [y_j, y_{j+1}]$, pick a point $(x_{i*}, y_{j*})$. Then the volume under the surface z = f (x, y) over that subrectangle is approximately $f(x_{i*}, y_{j*})\,\Delta x_i\,\Delta y_j$, where $\Delta x_i = x_{i+1} - x_i$, $\Delta y_j = y_{j+1} - y_j$, and $f(x_{i*}, y_{j*})$ is the height and $\Delta x_i\,\Delta y_j$ is the base area of a parallelepiped, as shown in Figure 3.2.5(b). Then the total volume under the surface is approximately the sum of the volumes of all such parallelepipeds, namely

\[ \sum_j \sum_i f(x_{i*}, y_{j*})\,\Delta x_i\,\Delta y_j , \tag{3.6} \]

where the summation occurs over the indices of the subrectangles inside R. If we take smaller and smaller subrectangles, so that the length of the largest diagonal of the subrectangles goes to 0, then the subrectangles begin to fill more and more of the region R, and so the above sum approaches the actual volume under the surface z = f (x, y) over the region R. We then define $\iint_R f(x, y)\,dA$ as the limit of that double summation (the limit is taken over all subdivisions of the rectangle [a, b] × [c, d] as the largest diagonal of the subrectangles goes to 0).

[Figure 3.2.5: Double integral over a general region R. (a) Subrectangles inside the region R, with a typical subrectangle $[x_i, x_{i+1}] \times [y_j, y_{j+1}]$ and sample point $(x_{i*}, y_{j*})$. (b) Parallelepiped over a subrectangle, with volume $f(x_{i*}, y_{j*})\,\Delta x_i\,\Delta y_j$]
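The definition by subrectangles can be tried out directly. The Python sketch below applies it to the region and function of Example 3.4, an arbitrary choice made for illustration: the enclosing rectangle is [0, 1] × [0, 2], the grid is uniform with n divisions in each direction, and the sample point in each subrectangle is taken to be its center. The function name `inside_sum` is hypothetical.

```python
def f(x, y):
    return 8 * x + 6 * y

def inside_sum(n):
    """Sum f(x*, y*) dx dy over the grid cells of [0, 1] x [0, 2] that lie
    entirely inside R = {(x, y) : 0 <= x <= 1, 0 <= y <= 2x^2}."""
    dx, dy = 1.0 / n, 2.0 / n
    total = 0.0
    for i in range(n):
        x_lo = i * dx
        # y = 2x^2 is increasing on [0, 1], so over this column of cells the
        # curve is lowest at the left edge x_lo.
        top = 2 * x_lo**2
        for j in range(n):
            y_hi = (j + 1) * dy
            if y_hi > top:
                break  # this cell and every cell above it stick out of R
            # Sample point (x*, y*) = center of the cell.
            total += f(x_lo + dx / 2, y_hi - dy / 2) * dx * dy
    return total

for n in (10, 50, 400):
    print(n, inside_sum(n))
```

As the grid is refined, the sums increase toward the value 32/5 = 6.4 found in Example 3.4; they stay below it here because f ≥ 0 and the subrectangles inside R never cover all of R.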

A similar definition can be made for a function f (x, y) that is not necessarily always nonnegative: just replace each mention of volume by the negative volume in the description above when f (x, y) < 0. In the case of a region of the type shown in Figure 3.2.1, using the definition of the Riemann integral from single-variable calculus, our definition of $\iint_R f(x, y)\,dA$ reduces to a sequence of two iterated integrals.

Finally, the region R does not have to be bounded. We can evaluate improper double

integrals (i.e. over an unbounded region, or over a region which contains points where

the function f (x, y) is not defined) as a sequence of iterated improper single-variable

integrals.


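Example 3.6. Evaluate $\int_1^{\infty} \int_0^{1/x^2} 2y\,dy\,dx$.

Solution:

\[
\begin{aligned}
\int_1^{\infty} \int_0^{1/x^2} 2y\,dy\,dx &= \int_1^{\infty} \left( y^2 \,\Big|_{y=0}^{y=1/x^2} \right) dx \\
  &= \int_1^{\infty} x^{-4}\,dx = -\tfrac{1}{3}x^{-3} \,\Big|_1^{\infty} = 0 - \left(-\tfrac{1}{3}\right) = \tfrac{1}{3}
\end{aligned}
\]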

Exercises

A

For Exercises 1-6, evaluate the given double integral.

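1. $\int_0^1 \int_{\sqrt{x}}^1 24x^2 y\,dy\,dx$

2. $\int_0^{\pi} \int_0^y \sin x\,dx\,dy$

3. $\int_1^2 \int_0^{\ln x} 4x\,dy\,dx$

4. $\int_0^2 \int_0^{2y} e^{y^2}\,dx\,dy$

5. $\int_0^{\pi/2} \int_0^y \cos x \sin y\,dx\,dy$

6. $\int_0^{\infty} \int_0^{\infty} xy\,e^{-(x^2 + y^2)}\,dx\,dy$

7. $\int_0^2 \int_0^y 1\,dx\,dy$

8. $\int_0^1 \int_0^{x^2} 2\,dy\,dx$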

9. Find the volume V of the solid bounded by the three coordinate planes and the

plane x + y + z = 1.

10. Find the volume V of the solid bounded by the three coordinate planes and the

plane 3x + 2y + 5z = 6.

B

11. Explain why the double integral $\iint_R 1\,dA$ gives the area of the region R. For simplicity, you can assume that R is a region of the type shown in Figure 3.2.1(a).

C

12. Prove that the volume of a tetrahedron with mutually perpendicular adjacent sides of lengths a, b, and c, as in Figure 3.2.6, is $\frac{abc}{6}$. (Hint: Mimic Example 3.5, and recall from Section 1.5 how three noncollinear points determine a plane.)

[Figure 3.2.6: a tetrahedron with mutually perpendicular adjacent sides of lengths a, b, and c]

13. Show how Exercise 12 can be used to solve Exercise 10.


3.3 Triple Integrals

Our definition of a double integral of a real-valued function f (x, y) over a region R in $\mathbb{R}^2$ can be extended to define a triple integral of a real-valued function f (x, y, z) over a solid S in $\mathbb{R}^3$. We simply proceed as before: the solid S can be enclosed in some rectangular parallelepiped, which is then divided into subparallelepipeds. In each subparallelepiped inside S, with sides of lengths ∆x, ∆y and ∆z, pick a point $(x_*, y_*, z_*)$. Then define the triple integral of f (x, y, z) over S, denoted by $\iiint_S f(x, y, z)\,dV$, by

\[ \iiint_S f(x, y, z)\,dV = \lim \sum \sum \sum f(x_*, y_*, z_*)\,\Delta x\,\Delta y\,\Delta z , \tag{3.7} \]

where the limit is over all divisions of the rectangular parallelepiped enclosing S into subparallelepipeds whose largest diagonal is going to 0, and the triple summation is over all the subparallelepipeds inside S. It can be shown that this limit does not depend on the choice of the rectangular parallelepiped enclosing S. The symbol dV is often called the volume element.

Physically, what does the triple integral represent? We saw that a double integral could be thought of as the volume under a two-dimensional surface. It turns out that the triple integral simply generalizes this idea: it can be thought of as representing the hypervolume under a three-dimensional hypersurface w = f (x, y, z) whose graph lies in $\mathbb{R}^4$. In general, the word “volume” is often used as a general term to signify the same concept for any n-dimensional object (e.g. length in $\mathbb{R}^1$, area in $\mathbb{R}^2$). It may be hard to get a grasp on the concept of the “volume” of a four-dimensional object, but at least we now know how to calculate that volume!

In the case where S is a rectangular parallelepiped $[x_1, x_2] \times [y_1, y_2] \times [z_1, z_2]$, that is, $S = \{(x, y, z) : x_1 \le x \le x_2,\ y_1 \le y \le y_2,\ z_1 \le z \le z_2\}$, the triple integral is a sequence of three iterated integrals, namely

\[ \iiint_S f(x, y, z)\,dV = \int_{z_1}^{z_2} \int_{y_1}^{y_2} \int_{x_1}^{x_2} f(x, y, z)\,dx\,dy\,dz , \tag{3.8} \]

where the order of integration does not matter. This is the simplest case.

A more complicated case is where S is a solid which is bounded below by a surface $z = g_1(x, y)$, bounded above by a surface $z = g_2(x, y)$, y is bounded between two curves $h_1(x)$ and $h_2(x)$, and x varies between a and b. Then

\[ \iiint_S f(x, y, z)\,dV = \int_a^b \int_{h_1(x)}^{h_2(x)} \int_{g_1(x,y)}^{g_2(x,y)} f(x, y, z)\,dz\,dy\,dx . \tag{3.9} \]

Notice in this case that the first iterated integral will result in a function of x and y

(since its limits of integration are functions of x and y), which then leaves you with a


double integral of a type that we learned how to evaluate in Section 3.2. There are, of

course, many variations on this case (for example, changing the roles of the variables

x, y, z), so as you can probably tell, triple integrals can be quite tricky. At this point,

just learning how to evaluate a triple integral, regardless of what it represents, is the

most important thing. We will see some other ways in which triple integrals are used

later in the text.

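Example 3.7. Evaluate $\displaystyle\int_0^3\int_0^2\int_0^1 (xy + z)\,dx\,dy\,dz$.

Solution:

$$\begin{aligned}
\int_0^3\int_0^2\int_0^1 (xy + z)\,dx\,dy\,dz &= \int_0^3\int_0^2 \left( \tfrac{1}{2}x^2 y + xz \,\Big|_{x=0}^{x=1} \right) dy\,dz \\
&= \int_0^3\int_0^2 \left( \tfrac{1}{2}y + z \right) dy\,dz \\
&= \int_0^3 \left( \tfrac{1}{4}y^2 + yz \,\Big|_{y=0}^{y=2} \right) dz \\
&= \int_0^3 (1 + 2z)\,dz \\
&= z + z^2 \,\Big|_0^3 = 12
\end{aligned}$$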
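The value found in Example 3.7 can be sanity-checked numerically. The sketch below is not from the text: the class name `Example37Check` and the subdivision count `n` are illustrative assumptions. It approximates the integral with a midpoint Riemann sum over the box $[0,1] \times [0,2] \times [0,3]$. Because the integrand $xy + z$ is linear in each variable separately, the midpoint sum should agree with 12 up to rounding error.

```java
// Midpoint-rule check of Example 3.7 (illustrative sketch, not from the text).
class Example37Check {
    // Integrand f(x, y, z) = xy + z from Example 3.7.
    static double f(double x, double y, double z) {
        return x * y + z;
    }

    // Approximates the integral of f over [0,1] x [0,2] x [0,3]
    // using n midpoint subintervals in each coordinate direction.
    static double integrate(int n) {
        double hx = 1.0 / n, hy = 2.0 / n, hz = 3.0 / n;
        double sum = 0.0;
        for (int k = 0; k < n; k++) {
            double z = (k + 0.5) * hz;
            for (int j = 0; j < n; j++) {
                double y = (j + 0.5) * hy;
                for (int i = 0; i < n; i++) {
                    double x = (i + 0.5) * hx;
                    sum += f(x, y, z);
                }
            }
        }
        // Each small box has volume hx * hy * hz.
        return sum * hx * hy * hz;
    }

    public static void main(String[] args) {
        System.out.println("Midpoint estimate: " + integrate(50));
    }
}
```

A deterministic sum like this is only practical for small, simple regions; it is meant as a check on the hand computation, not a replacement for it.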

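Example 3.8. Evaluate $\displaystyle\int_0^1\int_0^{1-x}\int_0^{2-x-y} (x + y + z)\,dz\,dy\,dx$.

Solution:

$$\begin{aligned}
\int_0^1\int_0^{1-x}\int_0^{2-x-y} (x + y + z)\,dz\,dy\,dx &= \int_0^1\int_0^{1-x} \left( (x + y)z + \tfrac{1}{2}z^2 \,\Big|_{z=0}^{z=2-x-y} \right) dy\,dx \\
&= \int_0^1\int_0^{1-x} \left( (x + y)(2 - x - y) + \tfrac{1}{2}(2 - x - y)^2 \right) dy\,dx \\
&= \int_0^1\int_0^{1-x} \left( 2 - \tfrac{1}{2}x^2 - xy - \tfrac{1}{2}y^2 \right) dy\,dx \\
&= \int_0^1 \left( 2y - \tfrac{1}{2}x^2 y - \tfrac{1}{2}xy^2 - \tfrac{1}{6}y^3 \,\Big|_{y=0}^{y=1-x} \right) dx \\
&= \int_0^1 \left( \tfrac{11}{6} - 2x + \tfrac{1}{6}x^3 \right) dx \\
&= \tfrac{11}{6}x - x^2 + \tfrac{1}{24}x^4 \,\Big|_0^1 = \tfrac{7}{8}
\end{aligned}$$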
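When the limits of integration depend on the outer variables, as in Example 3.8, a numerical check has to recompute the inner step sizes inside each loop. The sketch below illustrates this with a nested midpoint rule; the class name `Example38Check` and the parameter `n` are assumptions made for illustration, not part of the text. The result should be close to $7/8 = 0.875$.

```java
// Nested midpoint-rule check of Example 3.8 (illustrative sketch, not from the text).
class Example38Check {
    // Approximates the iterated integral of (x + y + z) over
    // 0 <= x <= 1, 0 <= y <= 1 - x, 0 <= z <= 2 - x - y,
    // using n midpoint subintervals for each of the three integrals.
    static double integrate(int n) {
        double hx = 1.0 / n;
        double total = 0.0;
        for (int i = 0; i < n; i++) {
            double x = (i + 0.5) * hx;
            double hy = (1.0 - x) / n;            // the y-limits depend on x
            double inner = 0.0;
            for (int j = 0; j < n; j++) {
                double y = (j + 0.5) * hy;
                double hz = (2.0 - x - y) / n;    // the z-limits depend on x and y
                double innermost = 0.0;
                for (int k = 0; k < n; k++) {
                    double z = (k + 0.5) * hz;
                    innermost += (x + y + z) * hz;
                }
                inner += innermost * hy;
            }
            total += inner * hx;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println("Estimate: " + integrate(100) + " (exact value 7/8 = 0.875)");
    }
}
```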

112 CHAPTER 3. MULTIPLE INTEGRALS

Note that the volume $V$ of a solid in $\mathbb{R}^3$ is given by

$$V = \iiint_S 1\,dV . \tag{3.10}$$

Since the function being integrated is the constant 1, then the above triple integral reduces to a double integral of the types that we considered in the previous section if the solid is bounded above by some surface $z = f(x, y)$ and bounded below by the $xy$-plane $z = 0$. There are many other possibilities. For example, the solid could be bounded below and above by surfaces $z = g_1(x, y)$ and $z = g_2(x, y)$, respectively, with $y$ bounded between two curves $h_1(x)$ and $h_2(x)$, and $x$ varies between $a$ and $b$. Then

$$V = \iiint_S 1\,dV = \int_a^b\int_{h_1(x)}^{h_2(x)}\int_{g_1(x,y)}^{g_2(x,y)} 1\,dz\,dy\,dx = \int_a^b\int_{h_1(x)}^{h_2(x)} \left( g_2(x, y) - g_1(x, y) \right) dy\,dx$$

just like in equation (3.9). See Exercise 10 for an example.

Exercises

A

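For Exercises 1-8, evaluate the given triple integral.

1. $\displaystyle\int_0^3\int_0^2\int_0^1 xyz\,dx\,dy\,dz$

2. $\displaystyle\int_0^1\int_0^x\int_0^y xyz\,dz\,dy\,dx$

3. $\displaystyle\int_0^\pi\int_0^x\int_0^{xy} x^2 \sin z\,dz\,dy\,dx$

4. $\displaystyle\int_0^1\int_0^z\int_0^y z e^{y^2}\,dx\,dy\,dz$

5. $\displaystyle\int_1^e\int_0^y\int_0^{1/y} x^2 z\,dx\,dz\,dy$

6. $\displaystyle\int_1^2\int_0^{y^2}\int_0^{z^2} yz\,dx\,dz\,dy$

7. $\displaystyle\int_1^2\int_2^4\int_0^3 1\,dx\,dy\,dz$

8. $\displaystyle\int_0^1\int_0^{1-x}\int_0^{1-x-y} 1\,dz\,dy\,dx$

9. Let $M$ be a constant. Show that $\displaystyle\int_{z_1}^{z_2}\int_{y_1}^{y_2}\int_{x_1}^{x_2} M\,dx\,dy\,dz = M(z_2 - z_1)(y_2 - y_1)(x_2 - x_1)$.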

B

10. Find the volume V of the solid S bounded by the three coordinate planes, bounded

above by the plane x + y + z = 2, and bounded below by the plane z = x + y.

C

11. Show that $\displaystyle\int_a^b\int_a^z\int_a^y f(x)\,dx\,dy\,dz = \int_a^b \frac{(b - x)^2}{2} f(x)\,dx$. (Hint: Think of how changing the order of integration in the triple integral changes the limits of integration.)

3.4 Numerical Approximation of Multiple Integrals

As you have seen, calculating multiple integrals is tricky even for simple functions and regions. For complicated functions, it may not be possible to evaluate one of the iterated integrals in a simple closed form. Luckily there are numerical methods for approximating the value of a multiple integral. The method we will discuss is called the Monte Carlo method. The idea behind it is based on the concept of the average value of a function, which you learned in single-variable calculus. Recall that for a continuous function $f(x)$, the average value $\bar{f}$ of $f$ over an interval $[a, b]$ is defined as

$$\bar{f} = \frac{1}{b - a}\int_a^b f(x)\,dx . \tag{3.11}$$

The quantity $b - a$ is the length of the interval $[a, b]$, which can be thought of as the “volume” of the interval. Applying the same reasoning to functions of two or three variables, we define the average value of $f(x, y)$ over a region $R$ to be

$$\bar{f} = \frac{1}{A(R)}\iint_R f(x, y)\,dA , \tag{3.12}$$

where $A(R)$ is the area of the region $R$, and we define the average value of $f(x, y, z)$ over a solid $S$ to be

$$\bar{f} = \frac{1}{V(S)}\iiint_S f(x, y, z)\,dV , \tag{3.13}$$

where $V(S)$ is the volume of the solid $S$. Thus, for example, we have

$$\iint_R f(x, y)\,dA = A(R)\,\bar{f} . \tag{3.14}$$

The average value of $f(x, y)$ over $R$ can be thought of as representing the sum of all the values of $f$ divided by the number of points in $R$. Unfortunately there are an infinite number (in fact, uncountably many) points in any region, i.e. they can not be listed in a discrete sequence. But what if we took a very large number $N$ of random points in the region $R$ (which can be generated by a computer) and then took the average of the values of $f$ for those points, and used that average as the value of $\bar{f}$? This is exactly what the Monte Carlo method does. So in formula (3.14) the approximation we get is

$$\iint_R f(x, y)\,dA \approx A(R)\,\bar{f} \pm A(R)\sqrt{\frac{\overline{f^2} - (\bar{f})^2}{N}} , \tag{3.15}$$

where

$$\bar{f} = \frac{\sum_{i=1}^{N} f(x_i, y_i)}{N} \quad\text{and}\quad \overline{f^2} = \frac{\sum_{i=1}^{N} \left( f(x_i, y_i) \right)^2}{N} , \tag{3.16}$$

with the sums taken over the $N$ random points $(x_1, y_1), \ldots, (x_N, y_N)$. The ± “error term” in formula (3.15) does not really provide hard bounds on the approximation. It represents a single standard deviation from the expected value of the integral. That is, it provides a likely bound on the error. Due to its use of random points, the Monte Carlo method is an example of a probabilistic method (as opposed to deterministic methods such as Newton’s method, which use a specific formula for generating points).

For example, we can use formula (3.15) to approximate the volume V under the

plane z = 8x + 6y over the rectangle R = [0, 1] × [0, 2]. In Example 3.1 in Section 3.1,

we showed that the actual volume is 20. Below is a code listing (montecarlo.java) for

a Java program that calculates the volume, using a number of points N that is passed

on the command line as a parameter.

//Program to approximate the double integral of f(x,y)=8x+6y
//over the rectangle [0,1]x[0,2].
public class montecarlo {
  public static void main(String[] args) {
    //Get the number N of random points as a command-line parameter
    int N = Integer.parseInt(args[0]);
    double x = 0;      //x-coordinate of a random point
    double y = 0;      //y-coordinate of a random point
    double f = 0.0;    //Value of f at a random point
    double mf = 0.0;   //Mean of the values of f
    double mf2 = 0.0;  //Mean of the values of f^2
    for (int i=0; i<N; i++) {   //Get the random coordinates
      x = Math.random();        //x is between 0 and 1
      y = 2 * Math.random();    //y is between 0 and 2
      f = 8*x + 6*y;            //Value of the function
      mf = mf + f;              //Add to the sum of the f values
      mf2 = mf2 + f*f;          //Add to the sum of the f^2 values
    }
    mf = mf/N;     //Compute the mean of the f values
    mf2 = mf2/N;   //Compute the mean of the f^2 values
    System.out.println("N = " + N + ": integral = " + vol()*mf + " +/- "
      + vol()*Math.sqrt((mf2 - Math.pow(mf,2))/N));   //Print the result
  }
  //The volume of the rectangle [0,1]x[0,2]
  public static double vol() {
    return 1*2;
  }
}

Listing 3.1 Program listing for montecarlo.java

The results of running this program with various numbers of random points (e.g.

java montecarlo 100) are shown below:

N = 10: 19.36543087722646 +/- 2.7346060413546147

N = 100: 21.334419561385353 +/- 0.7547037194998519

N = 1000: 19.807662237526227 +/- 0.26701709691370235

N = 10000: 20.080975812043256 +/- 0.08378816229769506

N = 100000: 20.009403854556716 +/- 0.026346782289498317

N = 1000000: 20.000866994982314 +/- 0.008321168748642816

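As you can see, the approximation is fairly good. As $N \to \infty$, it can be shown that the Monte Carlo approximation converges to the actual volume, with the error term in formula (3.15) shrinking on the order of $O(1/\sqrt{N})$, in computational complexity terminology. The output above reflects this: each tenfold increase in $N$ reduces the error term by a factor of roughly $\sqrt{10} \approx 3.2$.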

In the above example the region $R$ was a rectangle. To use the Monte Carlo method for a nonrectangular (bounded) region $R$, only a slight modification is needed. Pick a rectangle $\tilde{R}$ that encloses $R$, and generate random points in that rectangle as before. Then use those points in the calculation of $\bar{f}$ only if they are inside $R$. There is no need to calculate the area of $R$ for formula (3.15) in this case, since the exclusion of points not inside $R$ allows you to use the area of the rectangle $\tilde{R}$ instead, similar to before.

For instance, in Example 3.4 we showed that the volume under the surface $z = 8x + 6y$ over the nonrectangular region $R = \{(x, y) : 0 \le x \le 1,\ 0 \le y \le 2x^2\}$ is 6.4. Since the rectangle $\tilde{R} = [0, 1] \times [0, 2]$ contains $R$, we can use the same program as before, with the only change being a check to see if $y < 2x^2$ for a random point $(x, y)$ in $[0, 1] \times [0, 2]$. Listing 3.2 below contains the code (montecarlo2.java):

//Program to approximate the double integral of f(x,y)=8x+6y over the
//region bounded by x=0, x=1, y=0, and y=2x^2
public class montecarlo2 {
  public static void main(String[] args) {
    //Get the number N of random points as a command-line parameter
    int N = Integer.parseInt(args[0]);
    double x = 0;      //x-coordinate of a random point
    double y = 0;      //y-coordinate of a random point
    double f = 0.0;    //Value of f at a random point
    double mf = 0.0;   //Mean of the values of f
    double mf2 = 0.0;  //Mean of the values of f^2
    for (int i=0; i<N; i++) {   //Get the random coordinates
      x = Math.random();        //x is between 0 and 1
      y = 2 * Math.random();    //y is between 0 and 2
      if (y < 2*Math.pow(x,2)) {   //The point is in the region
        f = 8*x + 6*y;             //Value of the function
        mf = mf + f;               //Add to the sum of the f values
        mf2 = mf2 + f*f;           //Add to the sum of the f^2 values
      }
    }
    mf = mf/N;     //Compute the mean of the f values
    mf2 = mf2/N;   //Compute the mean of the f^2 values
    System.out.println("N = " + N + ": integral = " + vol()*mf +
      " +/- " + vol()*Math.sqrt((mf2 - Math.pow(mf,2))/N));
  }
  //The volume of the rectangle [0,1]x[0,2]
  public static double vol() {
    return 1*2;
  }
}

Listing 3.2 Program listing for montecarlo2.java

The results of running the program with various numbers of random points (e.g.

java montecarlo2 1000) are shown below:

N = 10: integral = 6.95747529014894 +/- 2.9185131565120592

N = 100: integral = 6.3149056229650355 +/- 0.9549009662159909

N = 1000: integral = 6.477032813858756 +/- 0.31916837260973624

N = 10000: integral = 6.349975080015089 +/- 0.10040086346895105

N = 100000: integral = 6.440184132811864 +/- 0.03200476870881392

N = 1000000: integral = 6.417050897922222 +/- 0.01009454409789472

To use the Monte Carlo method to evaluate triple integrals, you will need to generate random triples $(x, y, z)$ in a parallelepiped, instead of random pairs $(x, y)$ in a rectangle, and use the volume of the parallelepiped instead of the area of a rectangle in formula (3.15) (see Exercise 2). For a more detailed discussion of numerical integration methods, see PRESS et al.
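As a rough illustration of the three-variable case, the sketch below applies the same idea to the integral from Example 3.7, whose exact value is 12. It is not one of the text's listings: the class name `MonteCarlo3D`, the method `estimate`, and the fixed seed are assumptions introduced here, and `java.util.Random` with a seed is used instead of `Math.random()` only so that runs are repeatable.

```java
import java.util.Random;

// Monte Carlo estimate of a triple integral over a box (illustrative sketch).
class MonteCarlo3D {
    // Returns {estimate, one-standard-deviation error term} for the integral of
    // f(x, y, z) = xy + z over the box [0,1] x [0,2] x [0,3], using N random points.
    static double[] estimate(int N, long seed) {
        Random rng = new Random(seed);
        double volume = 1.0 * 2.0 * 3.0;   // volume of the box, replacing A(R) in (3.15)
        double sumF = 0.0;                 // running sum of the f values
        double sumF2 = 0.0;                // running sum of the f^2 values
        for (int i = 0; i < N; i++) {
            double x = rng.nextDouble();         // x is between 0 and 1
            double y = 2.0 * rng.nextDouble();   // y is between 0 and 2
            double z = 3.0 * rng.nextDouble();   // z is between 0 and 3
            double f = x * y + z;
            sumF += f;
            sumF2 += f * f;
        }
        double mf = sumF / N;     // mean of the f values
        double mf2 = sumF2 / N;   // mean of the f^2 values
        double err = volume * Math.sqrt((mf2 - mf * mf) / N);
        return new double[] { volume * mf, err };
    }

    public static void main(String[] args) {
        double[] r = estimate(100000, 42L);
        System.out.println("integral = " + r[0] + " +/- " + r[1]);
    }
}
```

For a solid that is not a box, the same in-region test used in Listing 3.2 would presumably carry over, with points outside the solid contributing nothing to the sums.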

Exercises

C

1. Write a program that uses the Monte Carlo method to approximate the double integral $\iint_R e^{xy}\,dA$, where $R = [0, 1] \times [0, 1]$. Show the program output for $N = 10$, 100, 1000, 10000, 100000 and 1000000 random points.

2. Write a program that uses the Monte Carlo method to approximate the triple integral $\iiint_S e^{xyz}\,dV$, where $S = [0, 1] \times [0, 1] \times [0, 1]$. Show the program output for $N = 10$, 100, 1000, 10000, 100000 and 1000000 random points.

3. Repeat Exercise 1 with the region $R = \{(x, y) : -1 \le x \le 1,\ 0 \le y \le x^2\}$.

4. Repeat Exercise 2 with the solid $S = \{(x, y, z) : 0 \le x \le 1,\ 0 \le y \le 1,\ 0 \le z \le 1 - x - y\}$.

5. Use the Monte Carlo method to approximate the volume of a sphere of radius 1.

6. Use the Monte Carlo method to approximate the volume of the ellipsoid $\dfrac{x^2}{9} + \dfrac{y^2}{4} + \dfrac{z^2}{1} = 1$.

3.5 Change of Variables in Multiple Integrals

Given the difficulty of evaluating multiple integrals, the reader may be wondering if

it is possible to simplify those integrals using a suitable substitution for the variables.

The answer is yes, though it is a bit more complicated than the substitution method

which you learned in single-variable calculus.

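Recall that if you are given, for example, the definite integral

$$\int_1^2 x^3 \sqrt{x^2 - 1}\,dx ,$$

then you would make the substitution

$$u = x^2 - 1 \;\Rightarrow\; x^2 = u + 1 , \qquad du = 2x\,dx ,$$

which changes the limits of integration

$$x = 1 \;\Rightarrow\; u = 0 , \qquad x = 2 \;\Rightarrow\; u = 3 ,$$

so that we get

$$\begin{aligned}
\int_1^2 x^3 \sqrt{x^2 - 1}\,dx &= \int_1^2 \tfrac{1}{2}x^2 \cdot 2x \sqrt{x^2 - 1}\,dx \\
&= \int_0^3 \tfrac{1}{2}(u + 1)\sqrt{u}\,du \\
&= \tfrac{1}{2}\int_0^3 \left( u^{3/2} + u^{1/2} \right) du , \text{ which can be easily integrated to give} \\
&= \frac{14\sqrt{3}}{5} .
\end{aligned}$$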
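The equality of the two sides of this substitution can be checked numerically. The following sketch is an illustration only (the class `SubstitutionCheck` and its method names are assumptions, not from the text). It applies a simple midpoint rule to the original integral in $x$ and to the transformed integral in $u$, and compares both with $14\sqrt{3}/5 \approx 4.8497$.

```java
import java.util.function.DoubleUnaryOperator;

// Numerical check of the single-variable substitution example (illustrative sketch).
class SubstitutionCheck {
    // Midpoint-rule approximation of the integral of f over [a, b] with n subintervals.
    static double midpoint(DoubleUnaryOperator f, double a, double b, int n) {
        double h = (b - a) / n;
        double sum = 0.0;
        for (int i = 0; i < n; i++) {
            sum += f.applyAsDouble(a + (i + 0.5) * h);
        }
        return sum * h;
    }

    // Original integral: x^3 * sqrt(x^2 - 1) over [1, 2].
    static double inX(int n) {
        return midpoint(x -> x * x * x * Math.sqrt(x * x - 1.0), 1.0, 2.0, n);
    }

    // After substituting u = x^2 - 1: (1/2)(u + 1) * sqrt(u) over [0, 3].
    static double inU(int n) {
        return midpoint(u -> 0.5 * (u + 1.0) * Math.sqrt(u), 0.0, 3.0, n);
    }

    // Closed-form value 14*sqrt(3)/5 obtained in the text.
    static double exact() {
        return 14.0 * Math.sqrt(3.0) / 5.0;
    }

    public static void main(String[] args) {
        System.out.println("x-integral: " + inX(100000));
        System.out.println("u-integral: " + inU(100000));
        System.out.println("exact:      " + exact());
    }
}
```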

Let us take a different look at what happened when we did that substitution, which will give some motivation for how substitution works in multiple integrals. First, we let $u = x^2 - 1$. On the interval of integration $[1, 2]$, the function $x \mapsto x^2 - 1$ is strictly increasing (and maps $[1, 2]$ onto $[0, 3]$) and hence has an inverse function (defined on the interval $[0, 3]$). That is, on $[0, 3]$ we can define $x$ as a function of $u$, namely

$$x = g(u) = \sqrt{u + 1} .$$

Then substituting that expression for $x$ into the function $f(x) = x^3\sqrt{x^2 - 1}$ gives

$$f(x) = f(g(u)) = (u + 1)^{3/2}\sqrt{u} ,$$

and we see that

$$\frac{dx}{du} = g'(u) \;\Rightarrow\; dx = g'(u)\,du , \qquad dx = \tfrac{1}{2}(u + 1)^{-1/2}\,du ,$$

so since

$$g(0) = 1 \;\Rightarrow\; 0 = g^{-1}(1) , \qquad g(3) = 2 \;\Rightarrow\; 3 = g^{-1}(2) ,$$

then performing the substitution as we did earlier gives

$$\begin{aligned}
\int_1^2 f(x)\,dx &= \int_1^2 x^3\sqrt{x^2 - 1}\,dx \\
&= \int_0^3 \tfrac{1}{2}(u + 1)\sqrt{u}\,du , \text{ which can be written as} \\
&= \int_0^3 (u + 1)^{3/2}\sqrt{u} \cdot \tfrac{1}{2}(u + 1)^{-1/2}\,du , \text{ which means} \\
\int_1^2 f(x)\,dx &= \int_{g^{-1}(1)}^{g^{-1}(2)} f(g(u))\,g'(u)\,du .
\end{aligned}$$

In general, if $x = g(u)$ is a one-to-one, differentiable function from an interval $[c, d]$ (which you can think of as being on the “$u$-axis”) onto an interval $[a, b]$ (on the $x$-axis), which means that $g'(u) \ne 0$ on the interval $(c, d)$, so that $a = g(c)$ and $b = g(d)$, then $c = g^{-1}(a)$ and $d = g^{-1}(b)$, and

$$\int_a^b f(x)\,dx = \int_{g^{-1}(a)}^{g^{-1}(b)} f(g(u))\,g'(u)\,du . \tag{3.17}$$

This is called the change of variable formula for integrals of single-variable functions, and it is what you were implicitly using when doing integration by substitution. This formula turns out to be a special case of a more general formula which can be used to evaluate multiple integrals. We will state the formulas for double and triple integrals involving real-valued functions of two and three variables, respectively. We will assume that all the functions involved are continuously differentiable and that the regions and solids involved all have “reasonable” boundaries. The proof of the following theorem is beyond the scope of the text.[2]

[2] See TAYLOR and MANN, § 15.32 and § 15.62 for all the details.

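Theorem 3.1. Change of Variables Formula for Multiple Integrals

Let $x = x(u, v)$ and $y = y(u, v)$ define a one-to-one mapping of a region $R'$ in the $uv$-plane onto a region $R$ in the $xy$-plane such that the determinant

$$J(u, v) = \begin{vmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial x}{\partial v} \\[2mm] \dfrac{\partial y}{\partial u} & \dfrac{\partial y}{\partial v} \end{vmatrix} \tag{3.18}$$

is never 0 in $R'$. Then

$$\iint_R f(x, y)\,dA(x, y) = \iint_{R'} f(x(u, v), y(u, v))\,|J(u, v)|\,dA(u, v) . \tag{3.19}$$

We use the notation $dA(x, y)$ and $dA(u, v)$ to denote the area element in the $(x, y)$ and $(u, v)$ coordinates, respectively.

Similarly, if $x = x(u, v, w)$, $y = y(u, v, w)$ and $z = z(u, v, w)$ define a one-to-one mapping of a solid $S'$ in $uvw$-space onto a solid $S$ in $xyz$-space such that the determinant

$$J(u, v, w) = \begin{vmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial x}{\partial v} & \dfrac{\partial x}{\partial w} \\[2mm] \dfrac{\partial y}{\partial u} & \dfrac{\partial y}{\partial v} & \dfrac{\partial y}{\partial w} \\[2mm] \dfrac{\partial z}{\partial u} & \dfrac{\partial z}{\partial v} & \dfrac{\partial z}{\partial w} \end{vmatrix} \tag{3.20}$$

is never 0 in $S'$, then

$$\iiint_S f(x, y, z)\,dV(x, y, z) = \iiint_{S'} f(x(u, v, w), y(u, v, w), z(u, v, w))\,|J(u, v, w)|\,dV(u, v, w) . \tag{3.21}$$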

The determinant $J(u, v)$ in formula (3.18) is called the Jacobian of $x$ and $y$ with respect to $u$ and $v$, and is sometimes written as

$$J(u, v) = \frac{\partial(x, y)}{\partial(u, v)} . \tag{3.22}$$

Similarly, the Jacobian $J(u, v, w)$ of three variables is sometimes written as

$$J(u, v, w) = \frac{\partial(x, y, z)}{\partial(u, v, w)} . \tag{3.23}$$

Notice that formula (3.19) is saying that $dA(x, y) = |J(u, v)|\,dA(u, v)$, which you can think of as a two-variable version of the relation $dx = g'(u)\,du$ in the single-variable case.

The following example shows how the change of variables formula is used.

Example 3.9. Evaluate $\displaystyle\iint_R e^{\frac{x-y}{x+y}}\,dA$, where $R = \{(x, y) : x \ge 0,\ y \ge 0,\ x + y \le 1\}$.

Solution: First, note that evaluating this double integral without using substitution is probably impossible, at least in a closed form. By looking at the numerator and denominator of the exponent of $e$, we will try the substitution $u = x - y$ and $v = x + y$. To use the change of variables formula (3.19), we need to write both $x$ and $y$ in terms of $u$ and $v$. So solving for $x$ and $y$ gives $x = \tfrac{1}{2}(u + v)$ and $y = \tfrac{1}{2}(v - u)$. In Figure 3.5.1 below, we see how the mapping $x = x(u, v) = \tfrac{1}{2}(u + v)$, $y = y(u, v) = \tfrac{1}{2}(v - u)$ maps the region $R'$ onto $R$ in a one-to-one manner.

[Figure 3.5.1 The regions $R$ and $R'$: $R$ is the triangle in the $xy$-plane bounded by the coordinate axes and the line $x + y = 1$; $R'$ is the triangle in the $uv$-plane bounded by the lines $u = v$, $u = -v$ and $v = 1$.]

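Now we see that

$$J(u, v) = \begin{vmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial x}{\partial v} \\[2mm] \dfrac{\partial y}{\partial u} & \dfrac{\partial y}{\partial v} \end{vmatrix} = \begin{vmatrix} \tfrac{1}{2} & \tfrac{1}{2} \\[1mm] -\tfrac{1}{2} & \tfrac{1}{2} \end{vmatrix} = \tfrac{1}{2} \;\Rightarrow\; |J(u, v)| = \left| \tfrac{1}{2} \right| = \tfrac{1}{2} ,$$

so using horizontal slices in $R'$, we have

$$\begin{aligned}
\iint_R e^{\frac{x-y}{x+y}}\,dA &= \iint_{R'} f(x(u, v), y(u, v))\,|J(u, v)|\,dA \\
&= \int_0^1\int_{-v}^{v} e^{\frac{u}{v}}\,\tfrac{1}{2}\,du\,dv \\
&= \int_0^1 \left( \tfrac{v}{2} e^{\frac{u}{v}} \,\Big|_{u=-v}^{u=v} \right) dv \\
&= \int_0^1 \tfrac{v}{2}\left( e - e^{-1} \right) dv \\
&= \tfrac{v^2}{4}\left( e - e^{-1} \right) \Big|_0^1 = \tfrac{1}{4}\left( e - \tfrac{1}{e} \right) = \frac{e^2 - 1}{4e}
\end{aligned}$$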
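To see the change of variables at work numerically, the sketch below estimates the integral of Example 3.9 twice: once directly over the triangle $R$ in the $xy$-plane, and once over $R'$ in the $uv$-plane with the factor $|J(u, v)| = \tfrac{1}{2}$. Both should be close to $(e^2 - 1)/(4e) \approx 0.5876$. The class `Example39Check` and its methods are hypothetical names used only for this illustration; the midpoint rule is chosen because its sample points never land on the origin, where the integrand is undefined.

```java
// Numerical comparison of both sides of the change of variables in Example 3.9
// (illustrative sketch, not from the text).
class Example39Check {
    // Direct estimate over R: 0 <= x <= 1, 0 <= y <= 1 - x,
    // integrand exp((x - y)/(x + y)).
    static double inXY(int n) {
        double hx = 1.0 / n;
        double total = 0.0;
        for (int i = 0; i < n; i++) {
            double x = (i + 0.5) * hx;
            double hy = (1.0 - x) / n;
            double inner = 0.0;
            for (int j = 0; j < n; j++) {
                double y = (j + 0.5) * hy;
                inner += Math.exp((x - y) / (x + y)) * hy;
            }
            total += inner * hx;
        }
        return total;
    }

    // Estimate over R': 0 <= v <= 1, -v <= u <= v,
    // integrand exp(u/v) times the Jacobian factor 1/2.
    static double inUV(int n) {
        double hv = 1.0 / n;
        double total = 0.0;
        for (int j = 0; j < n; j++) {
            double v = (j + 0.5) * hv;
            double hu = 2.0 * v / n;
            double inner = 0.0;
            for (int i = 0; i < n; i++) {
                double u = -v + (i + 0.5) * hu;
                inner += 0.5 * Math.exp(u / v) * hu;
            }
            total += inner * hv;
        }
        return total;
    }

    // Closed-form value (e^2 - 1)/(4e) from the text.
    static double exact() {
        return (Math.E * Math.E - 1.0) / (4.0 * Math.E);
    }

    public static void main(String[] args) {
        System.out.println("xy-plane estimate: " + inXY(1000));
        System.out.println("uv-plane estimate: " + inUV(1000));
        System.out.println("exact value:       " + exact());
    }
}
```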

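The change of variables formula can be used to evaluate double integrals in polar coordinates. Letting

$$x = x(r, \theta) = r\cos\theta \quad\text{and}\quad y = y(r, \theta) = r\sin\theta ,$$

we have

$$J(r, \theta) = \begin{vmatrix} \dfrac{\partial x}{\partial r} & \dfrac{\partial x}{\partial \theta} \\[2mm] \dfrac{\partial y}{\partial r} & \dfrac{\partial y}{\partial \theta} \end{vmatrix} = \begin{vmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{vmatrix} = r\cos^2\theta + r\sin^2\theta = r \;\Rightarrow\; |J(r, \theta)| = |r| = r ,$$

so we have the following formula:

Double Integral in Polar Coordinates

$$\iint_R f(x, y)\,dx\,dy = \iint_{R'} f(r\cos\theta, r\sin\theta)\,r\,dr\,d\theta , \tag{3.24}$$

where the mapping $x = r\cos\theta$, $y = r\sin\theta$ maps the region $R'$ in the $r\theta$-plane onto the region $R$ in the $xy$-plane in a one-to-one manner.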
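As a quick sanity check of formula (3.24), take $f(x, y) = 1$ over the disk $x^2 + y^2 \le a^2$, where the radius $a > 0$ is an illustrative choice not taken from the text. The formula should then reproduce the familiar area of a disk:

```latex
\iint_R 1\,dx\,dy
  = \int_0^{2\pi}\int_0^{a} r\,dr\,d\theta
  = \int_0^{2\pi} \frac{a^2}{2}\,d\theta
  = \pi a^2 .
```

Without the extra factor $r$ from the Jacobian, the same limits would give $2\pi a$, which is a length rather than an area.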

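Example 3.10. Find the volume $V$ inside the paraboloid $z = x^2 + y^2$ for $0 \le z \le 1$.

[Figure 3.5.2 $z = x^2 + y^2$: the paraboloid over the unit disk $x^2 + y^2 = 1$, with axes $x$, $y$, $z$.]

Solution: Using vertical slices, we see that

$$V = \iint_R (1 - z)\,dA = \iint_R \left( 1 - (x^2 + y^2) \right) dA ,$$

where $R = \{(x, y) : x^2 + y^2 \le 1\}$ is the unit disk in $\mathbb{R}^2$ (see Figure 3.5.2). In polar coordinates $(r, \theta)$ we know that $x^2 + y^2 = r^2$ and that the unit disk $R$ is the set $R' = \{(r, \theta) : 0 \le r \le 1,\ 0 \le \theta \le 2\pi\}$. Thus,

$$\begin{aligned}
V &= \int_0^{2\pi}\int_0^1 (1 - r^2)\,r\,dr\,d\theta \\
&= \int_0^{2\pi}\int_0^1 (r - r^3)\,dr\,d\theta \\
&= \int_0^{2\pi} \left( \tfrac{r^2}{2} - \tfrac{r^4}{4} \,\Big|_{r=0}^{r=1} \right) d\theta \\
&= \int_0^{2\pi} \tfrac{1}{4}\,d\theta = \frac{\pi}{2}
\end{aligned}$$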

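Example 3.11. Find the volume $V$ inside the cone $z = \sqrt{x^2 + y^2}$ for $0 \le z \le 1$.

[Figure 3.5.3 $z = \sqrt{x^2 + y^2}$: the cone over the unit disk $x^2 + y^2 = 1$, with axes $x$, $y$, $z$.]

Solution: Using vertical slices, we see that

$$V = \iint_R (1 - z)\,dA = \iint_R \left( 1 - \sqrt{x^2 + y^2} \right) dA ,$$

where $R = \{(x, y) : x^2 + y^2 \le 1\}$ is the unit disk in $\mathbb{R}^2$ (see Figure 3.5.3). In polar coordinates $(r, \theta)$ we know that $\sqrt{x^2 + y^2} = r$ and that the unit disk $R$ is the set $R' = \{(r, \theta) : 0 \le r \le 1,\ 0 \le \theta \le 2\pi\}$. Thus,

$$\begin{aligned}
V &= \int_0^{2\pi}\int_0^1 (1 - r)\,r\,dr\,d\theta \\
&= \int_0^{2\pi}\int_0^1 (r - r^2)\,dr\,d\theta \\
&= \int_0^{2\pi} \left( \tfrac{r^2}{2} - \tfrac{r^3}{3} \,\Big|_{r=0}^{r=1} \right) d\theta \\
&= \int_0^{2\pi} \tfrac{1}{6}\,d\theta = \frac{\pi}{3}
\end{aligned}$$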

In a similar fashion, it can be shown (see Exercises 5-6) that triple integrals in cylindrical and spherical coordinates take the following forms:

Triple Integral in Cylindrical Coordinates

$$\iiint_S f(x, y, z)\,dx\,dy\,dz = \iiint_{S'} f(r\cos\theta, r\sin\theta, z)\,r\,dr\,d\theta\,dz , \tag{3.25}$$

where the mapping $x = r\cos\theta$, $y = r\sin\theta$, $z = z$ maps the solid $S'$ in $r\theta z$-space onto the solid $S$ in $xyz$-space in a one-to-one manner.

Triple Integral in Spherical Coordinates

$$\iiint_S f(x, y, z)\,dx\,dy\,dz = \iiint_{S'} f(\rho\sin\phi\cos\theta, \rho\sin\phi\sin\theta, \rho\cos\phi)\,\rho^2\sin\phi\,d\rho\,d\phi\,d\theta , \tag{3.26}$$

where the mapping $x = \rho\sin\phi\cos\theta$, $y = \rho\sin\phi\sin\theta$, $z = \rho\cos\phi$ maps the solid $S'$ in $\rho\phi\theta$-space onto the solid $S$ in $xyz$-space in a one-to-one manner.

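Example 3.12. For $a > 0$, find the volume $V$ inside the sphere $S$ given by $x^2 + y^2 + z^2 = a^2$.

Solution: We see that $S$ is the set $\rho = a$ in spherical coordinates, so

$$\begin{aligned}
V = \iiint_S 1\,dV &= \int_0^{2\pi}\int_0^{\pi}\int_0^{a} 1\,\rho^2\sin\phi\,d\rho\,d\phi\,d\theta \\
&= \int_0^{2\pi}\int_0^{\pi} \left( \tfrac{\rho^3}{3} \,\Big|_{\rho=0}^{\rho=a} \right) \sin\phi\,d\phi\,d\theta = \int_0^{2\pi}\int_0^{\pi} \tfrac{a^3}{3}\sin\phi\,d\phi\,d\theta \\
&= \int_0^{2\pi} \left( -\tfrac{a^3}{3}\cos\phi \,\Big|_{\phi=0}^{\phi=\pi} \right) d\theta = \int_0^{2\pi} \tfrac{2a^3}{3}\,d\theta = \frac{4\pi a^3}{3} .
\end{aligned}$$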

Exercises

A

1. Find the volume $V$ inside the paraboloid $z = x^2 + y^2$ for $0 \le z \le 4$.

2. Find the volume $V$ inside the cone $z = \sqrt{x^2 + y^2}$ for $0 \le z \le 3$.

B

3. Find the volume $V$ of the solid inside both $x^2 + y^2 + z^2 = 4$ and $x^2 + y^2 = 1$.

4. Find the volume $V$ inside both the sphere $x^2 + y^2 + z^2 = 1$ and the cone $z = \sqrt{x^2 + y^2}$.

5. Prove formula (3.25).

6. Prove formula (3.26).

7. Evaluate $\displaystyle\iint_R \sin\left( \frac{x + y}{2} \right) \cos\left( \frac{x - y}{2} \right) dA$, where $R$ is the triangle with vertices $(0, 0)$, $(2, 0)$ and $(1, 1)$. (Hint: Use the change of variables $u = (x + y)/2$, $v = (x - y)/2$.)

8. Find the volume of the solid bounded by $z = x^2 + y^2$ and $z^2 = 4(x^2 + y^2)$.

9. Find the volume inside the elliptic cylinder $\dfrac{x^2}{a^2} + \dfrac{y^2}{b^2} = 1$ for $0 \le z \le 2$.

C

10. Show that the volume inside the ellipsoid $\dfrac{x^2}{a^2} + \dfrac{y^2}{b^2} + \dfrac{z^2}{c^2} = 1$ is $\dfrac{4\pi abc}{3}$. (Hint: Use the change of variables $x = au$, $y = bv$, $z = cw$, then consider Example 3.12.)

11. Show that the Beta function, defined by

$$B(x, y) = \int_0^1 t^{x-1}(1 - t)^{y-1}\,dt , \quad\text{for } x > 0,\ y > 0,$$

satisfies the relation $B(y, x) = B(x, y)$ for $x > 0$, $y > 0$.

12. Using the substitution $t = u/(u + 1)$, show that the Beta function can be written as

$$B(x, y) = \int_0^\infty \frac{u^{x-1}}{(u + 1)^{x+y}}\,du , \quad\text{for } x > 0,\ y > 0.$$

3.6 Application: Center of Mass

[Figure 3.6.1 Center of mass of $R$: the region $R$ under the curve $y = f(x)$ between $x = a$ and $x = b$, with the point $(\bar{x}, \bar{y})$ marked.]

Recall from single-variable calculus that for a region $R = \{(x, y) : a \le x \le b,\ 0 \le y \le f(x)\}$ in $\mathbb{R}^2$ that represents a thin, flat plate (see Figure 3.6.1), where $f(x)$ is a continuous function on $[a, b]$, the center of mass of $R$ has coordinates $(\bar{x}, \bar{y})$ given by

$$\bar{x} = \frac{M_y}{M} \quad\text{and}\quad \bar{y} = \frac{M_x}{M} ,$$

where

$$M_x = \int_a^b \frac{(f(x))^2}{2}\,dx , \quad M_y = \int_a^b x f(x)\,dx , \quad M = \int_a^b f(x)\,dx , \tag{3.27}$$

assuming that $R$ has uniform density, i.e. the mass of $R$ is uniformly distributed over the region. In this case the area $M$ of the region is considered the mass of $R$ (the density is constant, and taken as 1 for simplicity).

In the general case where the density of a region (or lamina) $R$ is a continuous function $\delta = \delta(x, y)$ of the coordinates $(x, y)$ of points inside $R$ (where $R$ can be any region in $\mathbb{R}^2$) the coordinates $(\bar{x}, \bar{y})$ of the center of mass of $R$ are given by

$$\bar{x} = \frac{M_y}{M} \quad\text{and}\quad \bar{y} = \frac{M_x}{M} , \tag{3.28}$$

where

$$M_y = \iint_R x\,\delta(x, y)\,dA , \quad M_x = \iint_R y\,\delta(x, y)\,dA , \quad M = \iint_R \delta(x, y)\,dA . \tag{3.29}$$

The quantities $M_x$ and $M_y$ are called the moments (or first moments) of the region $R$ about the $x$-axis and $y$-axis, respectively. The quantity $M$ is the mass of the region $R$. To see this, think of taking a small rectangle inside $R$ with dimensions $\Delta x$ and $\Delta y$ close to 0. The mass of that rectangle is approximately $\delta(x_*, y_*)\,\Delta x\,\Delta y$, for some point $(x_*, y_*)$ in that rectangle. Then the mass of $R$ is the limit of the sums of the masses of all such rectangles inside $R$ as the diagonals of the rectangles approach 0, which is the double integral $\iint_R \delta(x, y)\,dA$.

Note that the formulas in (3.27) represent a special case when $\delta(x, y) = 1$ throughout $R$ in the formulas in (3.29).

Example 3.13. Find the center of mass of the region $R = \{(x, y) : 0 \le x \le 1,\ 0 \le y \le 2x^2\}$, if the density function at $(x, y)$ is $\delta(x, y) = x + y$.

[Figure 3.6.2: the region $R$ under the curve $y = 2x^2$ for $0 \le x \le 1$.]

Solution: The region $R$ is shown in Figure 3.6.2. We have

$$\begin{aligned}
M &= \iint_R \delta(x, y)\,dA = \int_0^1\int_0^{2x^2} (x + y)\,dy\,dx \\
&= \int_0^1 \left( xy + \tfrac{y^2}{2} \,\Big|_{y=0}^{y=2x^2} \right) dx = \int_0^1 (2x^3 + 2x^4)\,dx \\
&= \tfrac{x^4}{2} + \tfrac{2x^5}{5} \,\Big|_0^1 = \frac{9}{10}
\end{aligned}$$

and

$$\begin{aligned}
M_x &= \iint_R y\,\delta(x, y)\,dA = \int_0^1\int_0^{2x^2} y(x + y)\,dy\,dx \\
&= \int_0^1 \left( \tfrac{xy^2}{2} + \tfrac{y^3}{3} \,\Big|_{y=0}^{y=2x^2} \right) dx = \int_0^1 \left( 2x^5 + \tfrac{8x^6}{3} \right) dx \\
&= \tfrac{x^6}{3} + \tfrac{8x^7}{21} \,\Big|_0^1 = \frac{5}{7} ,
\end{aligned}$$

$$\begin{aligned}
M_y &= \iint_R x\,\delta(x, y)\,dA = \int_0^1\int_0^{2x^2} x(x + y)\,dy\,dx \\
&= \int_0^1 \left( x^2 y + \tfrac{xy^2}{2} \,\Big|_{y=0}^{y=2x^2} \right) dx = \int_0^1 (2x^4 + 2x^5)\,dx \\
&= \tfrac{2x^5}{5} + \tfrac{x^6}{3} \,\Big|_0^1 = \frac{11}{15} ,
\end{aligned}$$

so the center of mass $(\bar{x}, \bar{y})$ is given by

$$\bar{x} = \frac{M_y}{M} = \frac{11/15}{9/10} = \frac{22}{27} , \qquad \bar{y} = \frac{M_x}{M} = \frac{5/7}{9/10} = \frac{50}{63} .$$

Note how this center of mass is a little further towards the upper corner of the region $R$ than when the density is uniform (it is easy to use the formulas in (3.27) to show that $(\bar{x}, \bar{y}) = \left( \tfrac{3}{4}, \tfrac{3}{5} \right)$ in that case). This makes sense since the density function $\delta(x, y) = x + y$ increases as $(x, y)$ approaches that upper corner, where there is quite a bit of area.
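The three double integrals in Example 3.13 can be approximated together in a single pass, since each small rectangle contributes a mass $\delta\,\Delta x\,\Delta y$ and the moments are just that mass weighted by $x$ or $y$. The sketch below follows that reasoning with a midpoint sum; the class name `CenterOfMassCheck` and the grid size `n` are assumptions for illustration. The output should be near $M = 0.9$, $\bar{x} = 22/27 \approx 0.8148$ and $\bar{y} = 50/63 \approx 0.7937$.

```java
// Midpoint-sum check of the center of mass in Example 3.13 (illustrative sketch).
class CenterOfMassCheck {
    // Returns {M, xbar, ybar} for R = {0 <= x <= 1, 0 <= y <= 2x^2}
    // with density delta(x, y) = x + y, using an n-by-n midpoint grid.
    static double[] centerOfMass(int n) {
        double hx = 1.0 / n;
        double m = 0.0;    // total mass M
        double mx = 0.0;   // moment about the x-axis (weights by y)
        double my = 0.0;   // moment about the y-axis (weights by x)
        for (int i = 0; i < n; i++) {
            double x = (i + 0.5) * hx;
            double hy = 2.0 * x * x / n;   // the strip runs from y = 0 to y = 2x^2
            for (int j = 0; j < n; j++) {
                double y = (j + 0.5) * hy;
                double dm = (x + y) * hx * hy;   // mass of one small rectangle
                m += dm;
                mx += y * dm;
                my += x * dm;
            }
        }
        return new double[] { m, my / m, mx / m };
    }

    public static void main(String[] args) {
        double[] r = centerOfMass(1000);
        System.out.println("M = " + r[0] + ", xbar = " + r[1] + ", ybar = " + r[2]);
    }
}
```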

In the special case where the density function δ(x, y) is a constant function on the

region R, the center of mass ( ¯ x, ¯ y) is called the centroid of R.

The formulas for the center of mass of a region in $\mathbb{R}^2$ can be generalized to a solid $S$ in $\mathbb{R}^3$. Let $S$ be a solid with a continuous mass density function $\delta(x, y, z)$ at any point $(x, y, z)$ in $S$. Then the center of mass of $S$ has coordinates $(\bar{x}, \bar{y}, \bar{z})$, where

$$\bar{x} = \frac{M_{yz}}{M} , \quad \bar{y} = \frac{M_{xz}}{M} , \quad \bar{z} = \frac{M_{xy}}{M} , \tag{3.30}$$

where

$$M_{yz} = \iiint_S x\,\delta(x, y, z)\,dV , \quad M_{xz} = \iiint_S y\,\delta(x, y, z)\,dV , \quad M_{xy} = \iiint_S z\,\delta(x, y, z)\,dV , \tag{3.31}$$

$$M = \iiint_S \delta(x, y, z)\,dV . \tag{3.32}$$

In this case, $M_{yz}$, $M_{xz}$ and $M_{xy}$ are called the moments (or first moments) of $S$ around the $yz$-plane, $xz$-plane and $xy$-plane, respectively. Also, $M$ is the mass of $S$.

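Example 3.14. Find the center of mass of the solid $S = \{(x, y, z) : z \ge 0,\ x^2 + y^2 + z^2 \le a^2\}$, if the density function at $(x, y, z)$ is $\delta(x, y, z) = 1$.

[Figure 3.6.3: the upper hemisphere of radius $a$, with the center of mass $(\bar{x}, \bar{y}, \bar{z})$ marked on the $z$-axis.]

Solution: The solid $S$ is just the upper hemisphere inside the sphere of radius $a$ centered at the origin (see Figure 3.6.3). So since the density function is a constant and $S$ is symmetric about the $z$-axis, then it is clear that $\bar{x} = 0$ and $\bar{y} = 0$, so we need only find $\bar{z}$. We have

$$M = \iiint_S \delta(x, y, z)\,dV = \iiint_S 1\,dV = \text{Volume}(S) .$$

But since the volume of $S$ is half the volume of the sphere of radius $a$, which we know by Example 3.12 is $\frac{4\pi a^3}{3}$, then $M = \frac{2\pi a^3}{3}$. And

$$\begin{aligned}
M_{xy} &= \iiint_S z\,\delta(x, y, z)\,dV \\
&= \iiint_S z\,dV , \text{ which in spherical coordinates is} \\
&= \int_0^{2\pi}\int_0^{\pi/2}\int_0^{a} (\rho\cos\phi)\,\rho^2\sin\phi\,d\rho\,d\phi\,d\theta \\
&= \int_0^{2\pi}\int_0^{\pi/2} \sin\phi\cos\phi \left( \int_0^a \rho^3\,d\rho \right) d\phi\,d\theta \\
&= \int_0^{2\pi}\int_0^{\pi/2} \tfrac{a^4}{4}\sin\phi\cos\phi\,d\phi\,d\theta \\
&= \int_0^{2\pi}\int_0^{\pi/2} \tfrac{a^4}{8}\sin 2\phi\,d\phi\,d\theta \quad (\text{since } \sin 2\phi = 2\sin\phi\cos\phi) \\
&= \int_0^{2\pi} \left( -\tfrac{a^4}{16}\cos 2\phi \,\Big|_{\phi=0}^{\phi=\pi/2} \right) d\theta \\
&= \int_0^{2\pi} \tfrac{a^4}{8}\,d\theta = \frac{\pi a^4}{4} ,
\end{aligned}$$

so

$$\bar{z} = \frac{M_{xy}}{M} = \frac{\pi a^4 / 4}{2\pi a^3 / 3} = \frac{3a}{8} .$$

Thus, the center of mass of $S$ is $(\bar{x}, \bar{y}, \bar{z}) = \left( 0, 0, \tfrac{3a}{8} \right)$.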

Exercises

A

For Exercises 1-5, find the center of mass of the region $R$ with the given density function $\delta(x, y)$.

1. $R = \{(x, y) : 0 \le x \le 2,\ 0 \le y \le 4\}$, $\delta(x, y) = 2y$

2. $R = \{(x, y) : 0 \le x \le 1,\ 0 \le y \le x^2\}$, $\delta(x, y) = x + y$

3. $R = \{(x, y) : y \ge 0,\ x^2 + y^2 \le a^2\}$, $\delta(x, y) = 1$

4. $R = \{(x, y) : y \ge 0,\ x \ge 0,\ 1 \le x^2 + y^2 \le 4\}$, $\delta(x, y) = \sqrt{x^2 + y^2}$

5. $R = \{(x, y) : y \ge 0,\ x^2 + y^2 \le 1\}$, $\delta(x, y) = y$

B

For Exercises 6-10, find the center of mass of the solid $S$ with the given density function $\delta(x, y, z)$.

6. $S = \{(x, y, z) : 0 \le x \le 1,\ 0 \le y \le 1,\ 0 \le z \le 1\}$, $\delta(x, y, z) = xyz$

7. $S = \{(x, y, z) : z \ge 0,\ x^2 + y^2 + z^2 \le a^2\}$, $\delta(x, y, z) = x^2 + y^2 + z^2$

8. $S = \{(x, y, z) : x \ge 0,\ y \ge 0,\ z \ge 0,\ x^2 + y^2 + z^2 \le a^2\}$, $\delta(x, y, z) = 1$

9. $S = \{(x, y, z) : 0 \le x \le 1,\ 0 \le y \le 1,\ 0 \le z \le 1\}$, $\delta(x, y, z) = x^2 + y^2 + z^2$

10. $S = \{(x, y, z) : 0 \le x \le 1,\ 0 \le y \le 1,\ 0 \le z \le 1 - x - y\}$, $\delta(x, y, z) = 1$

3.7 Application: Probability and Expected Value

In this section we will briefly discuss some applications of multiple integrals in the

field of probability theory. In particular we will see ways in which multiple integrals

can be used to calculate probabilities and expected values.

Probability

Suppose that you have a standard six-sided (fair) die, and you let a variable $X$ represent the value rolled. Then the probability of rolling a 3, written as $P(X = 3)$, is $\tfrac{1}{6}$, since there are six sides on the die and each one is equally likely to be rolled, and hence in particular the 3 has a one out of six chance of being rolled. Likewise the probability of rolling at most a 3, written as $P(X \le 3)$, is $\tfrac{3}{6} = \tfrac{1}{2}$, since of the six numbers on the die, there are three equally likely numbers (1, 2, and 3) that are less than or equal to 3. Note that $P(X \le 3) = P(X = 1) + P(X = 2) + P(X = 3)$. We call $X$ a discrete random variable on the sample space (or probability space) $\Omega$ consisting of all possible outcomes. In our case, $\Omega = \{1, 2, 3, 4, 5, 6\}$. An event $A$ is a subset of the sample space. For example, in the case of the die, the event $X \le 3$ is the set $\{1, 2, 3\}$.

Now let $X$ be a variable representing a random real number in the interval $(0, 1)$. Note that the set of all real numbers between 0 and 1 is not a discrete (or countable) set of values, i.e. it can not be put into a one-to-one correspondence with the set of positive integers.[3] In this case, for any real number $x$ in $(0, 1)$, it makes no sense to consider $P(X = x)$ since it must be 0 (why?). Instead, we consider the probability $P(X \le x)$, which is given by $P(X \le x) = x$. The reasoning is this: the interval $(0, 1)$ has length 1, and for $x$ in $(0, 1)$ the interval $(0, x)$ has length $x$. So since $X$ represents a random number in $(0, 1)$, and hence is uniformly distributed over $(0, 1)$, then

$$P(X \le x) = \frac{\text{length of } (0, x)}{\text{length of } (0, 1)} = \frac{x}{1} = x .$$

We call $X$ a continuous random variable on the sample space $\Omega = (0, 1)$. An event $A$ is a subset of the sample space. For example, in our case the event $X \le x$ is the set $(0, x)$.

[3] For a proof see p. 9-10 in KAMKE, E., Theory of Sets, New York: Dover, 1950.

In the case of a discrete random variable, we saw how the probability of an event was the sum of the probabilities of the individual outcomes comprising that event (e.g. $P(X \le 3) = P(X = 1) + P(X = 2) + P(X = 3)$ in the die example). For a continuous random variable, the probability of an event will instead be the integral of a function, which we will now describe.

Let $X$ be a continuous real-valued random variable on a sample space $\Omega$ in $\mathbb{R}$. For simplicity, let $\Omega = (a, b)$. Define the distribution function $F$ of $X$ as

$$F(x) = P(X \le x) , \quad\text{for } -\infty < x < \infty \tag{3.33}$$

$$\phantom{F(x)} = \begin{cases} 1, & \text{for } x \ge b \\ P(X \le x), & \text{for } a < x < b \\ 0, & \text{for } x \le a . \end{cases} \tag{3.34}$$

Suppose that there is a nonnegative, continuous real-valued function $f$ on $\mathbb{R}$ such that

$$F(x) = \int_{-\infty}^{x} f(y)\,dy , \quad\text{for } -\infty < x < \infty , \tag{3.35}$$

and

$$\int_{-\infty}^{\infty} f(x)\,dx = 1 . \tag{3.36}$$

Then we call $f$ the probability density function (or p.d.f. for short) for $X$. We thus have

$$P(X \le x) = \int_a^x f(y)\,dy , \quad\text{for } a < x < b . \tag{3.37}$$

Also, by the Fundamental Theorem of Calculus, we have

$$F'(x) = f(x) , \quad\text{for } -\infty < x < \infty. \tag{3.38}$$

Example 3.15. Let X represent a randomly selected real number in the interval (0, 1).

We say that X has the uniform distribution on (0, 1), with distribution function

F(x) = P(X ≤ x) =

¸

¸

¸

¸

¸

¸

¸

¸

¸

¸

¸

¸

1, for x ≥ 1

x, for 0 < x < 1

0, for x ≤ 0 ,

(3.39)

and probability density function

\[ f(x) = F'(x) = \begin{cases} 1, & \text{for } 0 < x < 1 \\ 0, & \text{elsewhere.} \end{cases} \tag{3.40} \]

In general, if X represents a randomly selected real number in an interval (a, b), then

X has the uniform distribution function

\[ F(x) = P(X \le x) = \begin{cases} 1, & \text{for } x \ge b \\ \dfrac{x - a}{b - a}, & \text{for } a < x < b \\ 0, & \text{for } x \le a , \end{cases} \tag{3.41} \]

and probability density function

\[ f(x) = F'(x) = \begin{cases} \dfrac{1}{b - a}, & \text{for } a < x < b \\ 0, & \text{elsewhere.} \end{cases} \tag{3.42} \]

130 CHAPTER 3. MULTIPLE INTEGRALS

Example 3.16. A famous distribution function is given by the standard normal distribution, whose probability density function f is

\[ f(x) = \frac{1}{\sqrt{2\pi}} \, e^{-x^2/2} , \quad \text{for } -\infty < x < \infty . \tag{3.43} \]

This is often called a “bell curve”, and is used widely in statistics. Since we are claiming that f is a p.d.f., we should have

\[ \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}} \, e^{-x^2/2} \, dx = 1 \tag{3.44} \]

by formula (3.36), which is equivalent to

\[ \int_{-\infty}^{\infty} e^{-x^2/2} \, dx = \sqrt{2\pi} . \tag{3.45} \]

We can use a double integral in polar coordinates to verify this integral. First,

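\begin{align*}
\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} e^{-(x^2 + y^2)/2} \, dx \, dy &= \int_{-\infty}^{\infty} e^{-y^2/2} \left( \int_{-\infty}^{\infty} e^{-x^2/2} \, dx \right) dy \\
&= \left( \int_{-\infty}^{\infty} e^{-x^2/2} \, dx \right) \left( \int_{-\infty}^{\infty} e^{-y^2/2} \, dy \right) \\
&= \left( \int_{-\infty}^{\infty} e^{-x^2/2} \, dx \right)^{2}
\end{align*}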

since the same function is being integrated twice in the middle equation, just with

different variables. But using polar coordinates, we see that

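\begin{align*}
\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} e^{-(x^2 + y^2)/2} \, dx \, dy &= \int_{0}^{2\pi} \int_{0}^{\infty} e^{-r^2/2} \, r \, dr \, d\theta \\
&= \int_{0}^{2\pi} \left( -e^{-r^2/2} \,\Big|_{r=0}^{r=\infty} \right) d\theta \\
&= \int_{0}^{2\pi} \left( 0 - (-e^{0}) \right) d\theta = \int_{0}^{2\pi} 1 \, d\theta = 2\pi ,
\end{align*}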

and so

\[ \left( \int_{-\infty}^{\infty} e^{-x^2/2} \, dx \right)^{2} = 2\pi , \quad \text{and hence} \quad \int_{-\infty}^{\infty} e^{-x^2/2} \, dx = \sqrt{2\pi} . \]
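As a numerical sanity check on the value $\sqrt{2\pi}$, the integral in (3.45) can also be approximated directly. The following is a minimal Python sketch and not part of the original text: the function name `gaussian_integral`, the cutoff at |x| = 10 (beyond which the integrand is negligible) and the number of steps are all illustrative assumptions.

```python
import math

def gaussian_integral(half_width=10.0, steps=200_000):
    """Midpoint-rule approximation of the integral of exp(-x^2/2)
    over [-half_width, half_width]."""
    dx = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps):
        x = -half_width + (i + 0.5) * dx  # midpoint of the i-th subinterval
        total += math.exp(-x * x / 2.0)
    return total * dx

approx = gaussian_integral()
print(approx, math.sqrt(2.0 * math.pi))
```

The two printed numbers should agree to many decimal places, which is consistent with the polar-coordinates argument.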

In addition to individual random variables, we can consider jointly distributed random variables. For this, we will let X, Y and Z be three real-valued continuous random variables defined on the same sample space Ω in $\mathbb{R}$ (the discussion for two random variables is similar). Then the joint distribution function F of X, Y and Z is given by

F(x, y, z) = P(X ≤ x, Y ≤ y, Z ≤ z) , for −∞ < x, y, z < ∞. (3.46)

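If there is a nonnegative, continuous real-valued function f on $\mathbb{R}^3$ such that

\[ F(x, y, z) = \int_{-\infty}^{z} \int_{-\infty}^{y} \int_{-\infty}^{x} f(u, v, w) \, du \, dv \, dw , \quad \text{for } -\infty < x, y, z < \infty \tag{3.47} \]

and

\[ \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x, y, z) \, dx \, dy \, dz = 1 , \tag{3.48} \]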

then we call f the joint probability density function (or joint p.d.f. for short) for X, Y and Z. In general, for $a_1 < b_1$, $a_2 < b_2$, $a_3 < b_3$, we have

\[ P(a_1 < X \le b_1 , \; a_2 < Y \le b_2 , \; a_3 < Z \le b_3) = \int_{a_3}^{b_3} \int_{a_2}^{b_2} \int_{a_1}^{b_1} f(x, y, z) \, dx \, dy \, dz , \tag{3.49} \]

with the ≤ and < symbols interchangeable in any combination. A triple integral, then,

can be thought of as representing a probability (for a function f which is a p.d.f.).

Example 3.17. Let a, b, and c be real numbers selected randomly from the interval (0, 1). What is the probability that the equation $ax^2 + bx + c = 0$ has at least one real solution x?

[Figure 3.7.1: Region $R = R_1 \cup R_2$ in the ac-plane, bounded above by the curve $c = \frac{1}{4a}$]

Solution: We know by the quadratic formula that there is at least one real solution if $b^2 - 4ac \ge 0$. So we need to calculate $P(b^2 - 4ac \ge 0)$. We will use three jointly distributed random variables to do this. First, since 0 < a, b, c < 1, we have

\[ b^2 - 4ac \ge 0 \;\Leftrightarrow\; 0 < 4ac \le b^2 < 1 \;\Leftrightarrow\; 0 < 2\sqrt{a}\sqrt{c} \le b < 1 , \]

where the last relation holds for all 0 < a, c < 1 such that

\[ 0 < 4ac < 1 \;\Leftrightarrow\; 0 < c < \frac{1}{4a} . \]

Considering a, b and c as real variables, the region R in the ac-plane where the above relation holds is given by $R = \{(a, c) : 0 < a < 1, \; 0 < c < 1, \; 0 < c < \frac{1}{4a}\}$, which we can see is a union of two regions $R_1$ and $R_2$, as in Figure 3.7.1 above.

Now let X, Y and Z be continuous random variables, each representing a randomly

selected real number from the interval (0, 1) (think of X, Y and Z representing a, b

and c, respectively). Then, similar to how we showed that f (x) = 1 is the p.d.f. of the

uniform distribution on (0, 1), it can be shown that f (x, y, z) = 1 for x, y, z in (0, 1)

(0 elsewhere) is the joint p.d.f. of X, Y and Z. Now,

\[ P(b^2 - 4ac \ge 0) = P((a, c) \in R , \; 2\sqrt{a}\sqrt{c} \le b < 1) , \]

so this probability is the triple integral of f(a, b, c) = 1 as b varies from $2\sqrt{a}\sqrt{c}$ to 1 and as (a, c) varies over the region R. Since R can be divided into two regions $R_1$ and $R_2$, then the required triple integral can be split into a sum of two triple integrals, using vertical slices in R:

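\begin{align*}
P(b^2 - 4ac \ge 0) &= \underbrace{\int_{0}^{1/4} \int_{0}^{1}}_{R_1} \int_{2\sqrt{a}\sqrt{c}}^{1} 1 \, db \, dc \, da + \underbrace{\int_{1/4}^{1} \int_{0}^{1/4a}}_{R_2} \int_{2\sqrt{a}\sqrt{c}}^{1} 1 \, db \, dc \, da \\
&= \int_{0}^{1/4} \int_{0}^{1} (1 - 2\sqrt{a}\sqrt{c}) \, dc \, da + \int_{1/4}^{1} \int_{0}^{1/4a} (1 - 2\sqrt{a}\sqrt{c}) \, dc \, da \\
&= \int_{0}^{1/4} \left( c - \tfrac{4}{3}\sqrt{a} \, c^{3/2} \,\Big|_{c=0}^{c=1} \right) da + \int_{1/4}^{1} \left( c - \tfrac{4}{3}\sqrt{a} \, c^{3/2} \,\Big|_{c=0}^{c=1/4a} \right) da \\
&= \int_{0}^{1/4} \left( 1 - \tfrac{4}{3}\sqrt{a} \right) da + \int_{1/4}^{1} \frac{1}{12a} \, da \\
&= a - \tfrac{8}{9} a^{3/2} \,\Big|_{0}^{1/4} + \tfrac{1}{12} \ln a \,\Big|_{1/4}^{1} \\
&= \left( \tfrac{1}{4} - \tfrac{1}{9} \right) + \left( 0 - \tfrac{1}{12} \ln \tfrac{1}{4} \right) = \tfrac{5}{36} + \tfrac{1}{12} \ln 4 \\
P(b^2 - 4ac \ge 0) &= \frac{5 + 3 \ln 4}{36} \approx 0.2544
\end{align*}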

In other words, the equation $ax^2 + bx + c = 0$ has about a 25% chance of being solved!
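The value $(5 + 3\ln 4)/36$ can be checked empirically by simulation. The sketch below is not from the original text; it assumes Python's standard `random` module as the source of uniform numbers, and the function name `estimate_real_root_probability`, the trial count and the seed are illustrative choices.

```python
import math
import random

def estimate_real_root_probability(trials=200_000, seed=0):
    """Estimate P(b^2 - 4ac >= 0) for a, b, c drawn uniformly from (0, 1)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        a, b, c = rng.random(), rng.random(), rng.random()
        if b * b - 4.0 * a * c >= 0.0:  # discriminant test from the quadratic formula
            hits += 1
    return hits / trials

exact = (5.0 + 3.0 * math.log(4.0)) / 36.0
estimate = estimate_real_root_probability()
print(estimate, exact)
```

With this many trials the estimate should typically land within a few thousandths of the exact value.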

Expected Value

The expected value EX of a random variable X can be thought of as the “average”

value of X as it varies over its sample space. If X is a discrete random variable, then

\[ EX = \sum_{x} x \, P(X = x) , \tag{3.50} \]

with the sum being taken over all elements x of the sample space. For example, if X

represents the number rolled on a six-sided die, then

\[ EX = \sum_{x=1}^{6} x \, P(X = x) = \sum_{x=1}^{6} x \cdot \frac{1}{6} = 3.5 \tag{3.51} \]

is the expected value of X, which is the average of the integers 1 through 6.

If X is a real-valued continuous random variable with p.d.f. f , then

\[ EX = \int_{-\infty}^{\infty} x \, f(x) \, dx . \tag{3.52} \]

For example, if X has the uniform distribution on the interval (0, 1), then its p.d.f. is

\[ f(x) = \begin{cases} 1, & \text{for } 0 < x < 1 \\ 0, & \text{elsewhere,} \end{cases} \tag{3.53} \]

and so

\[ EX = \int_{-\infty}^{\infty} x \, f(x) \, dx = \int_{0}^{1} x \, dx = \frac{1}{2} . \tag{3.54} \]

For a pair of jointly distributed, real-valued continuous random variables X and Y

with joint p.d.f. f (x, y), the expected values of X and Y are given by

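\[ EX = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} x \, f(x, y) \, dx \, dy \quad \text{and} \quad EY = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} y \, f(x, y) \, dx \, dy , \tag{3.55} \]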

respectively.

Example 3.18. If you were to pick n > 2 random real numbers from the interval (0, 1),

what are the expected values for the smallest and largest of those numbers?

Solution: Let $U_1, \ldots, U_n$ be n continuous random variables, each representing a randomly selected real number from (0, 1), i.e. each has the uniform distribution on (0, 1). Define random variables X and Y by

\[ X = \min(U_1, \ldots, U_n) \quad \text{and} \quad Y = \max(U_1, \ldots, U_n) . \]

Then it can be shown⁴ that the joint p.d.f. of X and Y is

\[ f(x, y) = \begin{cases} n(n - 1)(y - x)^{n-2}, & \text{for } 0 \le x \le y \le 1 \\ 0, & \text{elsewhere.} \end{cases} \tag{3.56} \]

Thus, the expected value of X is

\begin{align*}
EX &= \int_{0}^{1} \int_{x}^{1} n(n - 1) \, x \, (y - x)^{n-2} \, dy \, dx \\
&= \int_{0}^{1} \left( n x (y - x)^{n-1} \,\Big|_{y=x}^{y=1} \right) dx \\
&= \int_{0}^{1} n x (1 - x)^{n-1} \, dx , \quad \text{so integration by parts yields} \\
&= -x(1 - x)^{n} - \frac{1}{n + 1} (1 - x)^{n+1} \,\Big|_{0}^{1} \\
EX &= \frac{1}{n + 1} ,
\end{align*}

and similarly (see Exercise 3) it can be shown that

\[ EY = \int_{0}^{1} \int_{0}^{y} n(n - 1) \, y \, (y - x)^{n-2} \, dx \, dy = \frac{n}{n + 1} . \]

4 See Ch. 6 in HOEL, PORT and STONE.

So, for example, if you were to repeatedly take samples of n = 3 random real numbers from (0, 1), and each time store the minimum and maximum values in the sample, then the average of the minimums would approach $\frac{1}{4}$ and the average of the maximums would approach $\frac{3}{4}$ as the number of samples grows. It would be relatively simple (see Exercise 4) to write a computer program to test this.
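One possible shape for such a program is sketched here in Python. This is only an illustration and not the author's solution to Exercise 4: the function name `average_min_max`, the sample count and the seed are assumptions, and Python's standard `random` module stands in for "random real numbers from (0, 1)".

```python
import random

def average_min_max(n=3, samples=200_000, seed=0):
    """Average the minimum and maximum of n uniform draws over many samples."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    sum_min = 0.0
    sum_max = 0.0
    for _ in range(samples):
        draws = [rng.random() for _ in range(n)]
        sum_min += min(draws)
        sum_max += max(draws)
    return sum_min / samples, sum_max / samples

avg_min, avg_max = average_min_max()
print(avg_min, avg_max)
```

For n = 3 the two averages should come out close to 1/(n + 1) and n/(n + 1); calling `average_min_max(n=4)` gives the corresponding experiment for four numbers.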

Exercises

B

1. Evaluate the integral $\int_{-\infty}^{\infty} e^{-x^2} \, dx$ using anything you have learned so far.

2. For σ > 0 and µ > 0, evaluate $\int_{-\infty}^{\infty} \frac{1}{\sigma\sqrt{2\pi}} \, e^{-(x - \mu)^2 / 2\sigma^2} \, dx$.

3. Show that $EY = \frac{n}{n+1}$ in Example 3.18.

C

4. Write a computer program (in the language of your choice) that verifies the results

in Example 3.18 for the case n = 3 by taking large numbers of samples.

5. Repeat Exercise 4 for the case when n = 4.

6. For continuous random variables X, Y with joint p.d.f. f(x, y), define the second moments $E(X^2)$ and $E(Y^2)$ by

\[ E(X^2) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} x^2 f(x, y) \, dx \, dy \quad \text{and} \quad E(Y^2) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} y^2 f(x, y) \, dx \, dy , \]

and the variances Var(X) and Var(Y) by

\[ \mathrm{Var}(X) = E(X^2) - (EX)^2 \quad \text{and} \quad \mathrm{Var}(Y) = E(Y^2) - (EY)^2 . \]

Find Var(X) and Var(Y) for X and Y as in Example 3.18.

7. Continuing Exercise 6, the correlation ρ between X and Y is defined as

\[ \rho = \frac{E(XY) - (EX)(EY)}{\sqrt{\mathrm{Var}(X) \, \mathrm{Var}(Y)}} , \]

where $E(XY) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} xy \, f(x, y) \, dx \, dy$. Find ρ for X and Y as in Example 3.18.

(Note: The quantity E(XY) − (EX)(EY) is called the covariance of X and Y.)

8. In Example 3.17 would the answer change if the interval (0, 100) is used instead of

(0, 1)? Explain.

4 Line and Surface Integrals

4.1 Line Integrals

In single-variable calculus you learned how to integrate a real-valued function f(x) over an interval [a, b] in $\mathbb{R}^1$. This integral (usually called a Riemann integral) can be thought of as an integral over a path in $\mathbb{R}^1$, since an interval (or collection of intervals) is really the only kind of “path” in $\mathbb{R}^1$. You may also recall that if f(x) represented the force applied along the x-axis to an object at position x in [a, b], then the work W done in moving that object from position x = a to x = b was defined as the integral:

\[ W = \int_{a}^{b} f(x) \, dx \]

In this section, we will see how to define the integral of a function (either real-valued or vector-valued) of two variables over a general path (i.e. a curve) in $\mathbb{R}^2$. This definition will be motivated by the physical notion of work. We will begin with real-valued functions of two variables.

In physics, the intuitive idea of work is that

Work = Force × Distance .

Suppose that we want to find the total amount W of work done in moving an object along a curve C in $\mathbb{R}^2$ with a smooth parametrization x = x(t), y = y(t), a ≤ t ≤ b, with a force f(x, y) which varies with the position (x, y) of the object and is applied in the direction of motion along C (see Figure 4.1.1 below).

[Figure 4.1.1: Curve C : x = x(t), y = y(t) for t in [a, b], with $\Delta s_i \approx \sqrt{\Delta x_i^2 + \Delta y_i^2}$ between $t = t_i$ and $t = t_{i+1}$]

We will assume for now that the function f (x, y) is continuous and real-valued, so

we only consider the magnitude of the force. Partition the interval [a, b] as follows:

\[ a = t_0 < t_1 < t_2 < \cdots < t_{n-1} < t_n = b , \quad \text{for some integer } n \ge 2 \]

As we can see from Figure 4.1.1, over a typical subinterval $[t_i, t_{i+1}]$ the distance $\Delta s_i$ traveled along the curve is approximately $\sqrt{\Delta x_i^2 + \Delta y_i^2}$, by the Pythagorean Theorem.

Thus, if the subinterval is small enough then the work done in moving the object along that piece of the curve is approximately

\[ \text{Force} \times \text{Distance} \approx f(x_{i*}, y_{i*}) \sqrt{\Delta x_i^2 + \Delta y_i^2} , \tag{4.1} \]

where $(x_{i*}, y_{i*}) = (x(t_{i*}), y(t_{i*}))$ for some $t_{i*}$ in $[t_i, t_{i+1}]$, and so

\[ W \approx \sum_{i=0}^{n-1} f(x_{i*}, y_{i*}) \sqrt{\Delta x_i^2 + \Delta y_i^2} \tag{4.2} \]

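is approximately the total amount of work done over the entire curve. But since

\[ \sqrt{\Delta x_i^2 + \Delta y_i^2} = \sqrt{ \left( \frac{\Delta x_i}{\Delta t_i} \right)^{2} + \left( \frac{\Delta y_i}{\Delta t_i} \right)^{2} } \; \Delta t_i , \]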

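where $\Delta t_i = t_{i+1} - t_i$, then

\[ W \approx \sum_{i=0}^{n-1} f(x_{i*}, y_{i*}) \sqrt{ \left( \frac{\Delta x_i}{\Delta t_i} \right)^{2} + \left( \frac{\Delta y_i}{\Delta t_i} \right)^{2} } \; \Delta t_i . \tag{4.3} \]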

Taking the limit of that sum as the length of the largest subinterval goes to 0, the sum over all subintervals becomes the integral from t = a to t = b, $\frac{\Delta x_i}{\Delta t_i}$ and $\frac{\Delta y_i}{\Delta t_i}$ become $x'(t)$ and $y'(t)$, respectively, and $f(x_{i*}, y_{i*})$ becomes f(x(t), y(t)), so that

\[ W = \int_{a}^{b} f(x(t), y(t)) \sqrt{x'(t)^2 + y'(t)^2} \, dt . \tag{4.4} \]

The integral on the right side of the above equation gives us our idea of how to

define, for any real-valued function f (x, y), the integral of f (x, y) along the curve C,

called a line integral:

Definition 4.1. For a real-valued function f(x, y) and a curve C in $\mathbb{R}^2$, parametrized by x = x(t), y = y(t), a ≤ t ≤ b, the line integral of f(x, y) along C with respect to arc length s is

\[ \int_C f(x, y) \, ds = \int_{a}^{b} f(x(t), y(t)) \sqrt{x'(t)^2 + y'(t)^2} \, dt . \tag{4.5} \]

The symbol ds is the differential of the arc length function

\[ s = s(t) = \int_{a}^{t} \sqrt{x'(u)^2 + y'(u)^2} \, du , \tag{4.6} \]

which you may recognize from Section 1.9 as the length of the curve C over the interval [a, t], for all t in [a, b]. That is,

\[ ds = s'(t) \, dt = \sqrt{x'(t)^2 + y'(t)^2} \, dt , \tag{4.7} \]

by the Fundamental Theorem of Calculus.

For a general real-valued function f(x, y), what does the line integral $\int_C f(x, y) \, ds$ represent? The preceding discussion of ds gives us a clue. You can think of differentials as infinitesimal lengths. So if you think of f(x, y) as the height of a picket fence along C, then f(x, y) ds can be thought of as approximately the area of a section of that fence over some infinitesimally small section of the curve, and thus the line integral $\int_C f(x, y) \, ds$ is the total area of that picket fence (see Figure 4.1.2).

[Figure 4.1.2: Area of shaded rectangle = height × width ≈ f(x, y) ds]

Example 4.1. Use a line integral to show that the lateral surface area A of a right

circular cylinder of radius r and height h is 2πrh.

[Figure 4.1.3: Cylinder of radius r and height h = f(x, y) over the base circle $C : x^2 + y^2 = r^2$]

Solution: We will use the right circular cylinder with base circle C given by $x^2 + y^2 = r^2$ and with height h in the positive z direction (see Figure 4.1.3). Parametrize C as follows:

\[ x = x(t) = r \cos t , \quad y = y(t) = r \sin t , \quad 0 \le t \le 2\pi \]

Let f(x, y) = h for all (x, y). Then

\begin{align*}
A = \int_C f(x, y) \, ds &= \int_{a}^{b} f(x(t), y(t)) \sqrt{x'(t)^2 + y'(t)^2} \, dt \\
&= \int_{0}^{2\pi} h \sqrt{(-r \sin t)^2 + (r \cos t)^2} \, dt \\
&= h \int_{0}^{2\pi} r \sqrt{\sin^2 t + \cos^2 t} \, dt \\
&= rh \int_{0}^{2\pi} 1 \, dt = 2\pi r h
\end{align*}
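The approximation $\Delta s_i \approx \sqrt{\Delta x_i^2 + \Delta y_i^2}$ used to motivate the definition also gives a direct way to estimate a line integral with respect to arc length numerically. The Python sketch below is illustrative only: the helper name `scalar_line_integral`, the midpoint sampling of f, and the sample values of the radius and height are assumptions, not something taken from the text.

```python
import math

def scalar_line_integral(f, x, y, a, b, steps=100_000):
    """Approximate the line integral of f along (x(t), y(t)), a <= t <= b,
    with respect to arc length, by summing f * (chord length) over a partition."""
    dt = (b - a) / steps
    total = 0.0
    for i in range(steps):
        t0 = a + i * dt
        t1 = t0 + dt
        tm = 0.5 * (t0 + t1)  # sample point inside [t0, t1]
        ds = math.hypot(x(t1) - x(t0), y(t1) - y(t0))  # sqrt(dx^2 + dy^2)
        total += f(x(tm), y(tm)) * ds
    return total

radius, height = 2.0, 5.0  # arbitrary sample cylinder
area = scalar_line_integral(
    lambda x, y: height,            # constant "fence height" f(x, y) = h
    lambda t: radius * math.cos(t),
    lambda t: radius * math.sin(t),
    0.0, 2.0 * math.pi,
)
print(area, 2.0 * math.pi * radius * height)
```

Both printed values should agree closely, matching the lateral area 2πrh found in Example 4.1.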

Note in Example 4.1 that if we had traversed the circle C twice, i.e. let t vary from

0 to 4π, then we would have gotten an area of 4πrh, i.e. twice the desired area, even

though the curve itself is still the same (namely, a circle of radius r). Also, notice

that we traversed the circle in the counter-clockwise direction. If we had gone in the

clockwise direction, using the parametrization

x = x(t) = r cos(2π − t) , y = y(t) = r sin(2π − t) , 0 ≤ t ≤ 2π , (4.8)

then it is easy to verify (see Exercise 12) that the value of the line integral is unchanged.

In general, it can be shown (see Exercise 15) that reversing the direction in which a curve C is traversed leaves $\int_C f(x, y) \, ds$ unchanged, for any f(x, y). If a curve C has a parametrization x = x(t), y = y(t), a ≤ t ≤ b, then denote by −C the same curve as C but traversed in the opposite direction. Then −C is parametrized by

\[ x = x(a + b - t) , \quad y = y(a + b - t) , \quad a \le t \le b , \tag{4.9} \]

and we have

\[ \int_C f(x, y) \, ds = \int_{-C} f(x, y) \, ds . \tag{4.10} \]

Notice that our definition of the line integral was with respect to the arc length parameter s. We can also define

\[ \int_C f(x, y) \, dx = \int_{a}^{b} f(x(t), y(t)) \, x'(t) \, dt \tag{4.11} \]

as the line integral of f(x, y) along C with respect to x, and

\[ \int_C f(x, y) \, dy = \int_{a}^{b} f(x(t), y(t)) \, y'(t) \, dt \tag{4.12} \]

as the line integral of f(x, y) along C with respect to y.

In the derivation of the formula for a line integral, we used the idea of work as force multiplied by distance. However, we know that force is actually a vector. So it would be helpful to develop a vector form for a line integral. For this, suppose that we have a function $\mathbf{f}(x, y)$ defined on $\mathbb{R}^2$ by

\[ \mathbf{f}(x, y) = P(x, y) \, \mathbf{i} + Q(x, y) \, \mathbf{j} \]

for some continuous real-valued functions P(x, y) and Q(x, y) on $\mathbb{R}^2$. Such a function $\mathbf{f}$ is called a vector field on $\mathbb{R}^2$. It is defined at points in $\mathbb{R}^2$, and its values are vectors in $\mathbb{R}^2$. For a curve C with a smooth parametrization x = x(t), y = y(t), a ≤ t ≤ b, let

\[ \mathbf{r}(t) = x(t) \, \mathbf{i} + y(t) \, \mathbf{j} \]

be the position vector for a point (x(t), y(t)) on C. Then $\mathbf{r}'(t) = x'(t) \, \mathbf{i} + y'(t) \, \mathbf{j}$ and so

\begin{align*}
\int_C P(x, y) \, dx + \int_C Q(x, y) \, dy &= \int_{a}^{b} P(x(t), y(t)) \, x'(t) \, dt + \int_{a}^{b} Q(x(t), y(t)) \, y'(t) \, dt \\
&= \int_{a}^{b} \left( P(x(t), y(t)) \, x'(t) + Q(x(t), y(t)) \, y'(t) \right) dt \\
&= \int_{a}^{b} \mathbf{f}(x(t), y(t)) \cdot \mathbf{r}'(t) \, dt
\end{align*}

by definition of $\mathbf{f}(x, y)$. Notice that the function $\mathbf{f}(x(t), y(t)) \cdot \mathbf{r}'(t)$ is a real-valued function on [a, b], so the last integral on the right looks somewhat similar to our earlier definition of a line integral. This leads us to the following definition:

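Definition 4.2. For a vector field $\mathbf{f}(x, y) = P(x, y) \, \mathbf{i} + Q(x, y) \, \mathbf{j}$ and a curve C with a smooth parametrization x = x(t), y = y(t), a ≤ t ≤ b, the line integral of $\mathbf{f}$ along C is

\begin{align}
\int_C \mathbf{f} \cdot d\mathbf{r} &= \int_C P(x, y) \, dx + \int_C Q(x, y) \, dy \tag{4.13} \\
&= \int_{a}^{b} \mathbf{f}(x(t), y(t)) \cdot \mathbf{r}'(t) \, dt , \tag{4.14}
\end{align}

where $\mathbf{r}(t) = x(t) \, \mathbf{i} + y(t) \, \mathbf{j}$ is the position vector for points on C.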

We use the notation $d\mathbf{r} = \mathbf{r}'(t) \, dt = dx \, \mathbf{i} + dy \, \mathbf{j}$ to denote the differential of the vector-valued function $\mathbf{r}$. The line integral in Definition 4.2 is often called a line integral of a vector field to distinguish it from the line integral in Definition 4.1 which is called a line integral of a scalar field. For convenience we will often write

\[ \int_C P(x, y) \, dx + \int_C Q(x, y) \, dy = \int_C P(x, y) \, dx + Q(x, y) \, dy , \]

where it is understood that the line integral along C is being applied to both P and Q. The quantity P(x, y) dx + Q(x, y) dy is known as a differential form. For a real-valued function F(x, y), the differential of F is $dF = \frac{\partial F}{\partial x} \, dx + \frac{\partial F}{\partial y} \, dy$. A differential form P(x, y) dx + Q(x, y) dy is called exact if it equals dF for some function F(x, y).

Recall that if the points on a curve C have position vector $\mathbf{r}(t) = x(t) \, \mathbf{i} + y(t) \, \mathbf{j}$, then $\mathbf{r}'(t)$ is a tangent vector to C at the point (x(t), y(t)) in the direction of increasing t (which we call the direction of C). Since C is a smooth curve, then $\mathbf{r}'(t) \ne \mathbf{0}$ on [a, b] and hence

\[ \mathbf{T}(t) = \frac{\mathbf{r}'(t)}{\lVert \mathbf{r}'(t) \rVert} \]

is the unit tangent vector to C at (x(t), y(t)). Putting Definitions 4.1 and 4.2 together we get the following theorem:

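Theorem 4.1. For a vector field $\mathbf{f}(x, y) = P(x, y) \, \mathbf{i} + Q(x, y) \, \mathbf{j}$ and a curve C with a smooth parametrization x = x(t), y = y(t), a ≤ t ≤ b and position vector $\mathbf{r}(t) = x(t) \, \mathbf{i} + y(t) \, \mathbf{j}$,

\[ \int_C \mathbf{f} \cdot d\mathbf{r} = \int_C \mathbf{f} \cdot \mathbf{T} \, ds , \tag{4.15} \]

where $\mathbf{T}(t) = \frac{\mathbf{r}'(t)}{\lVert \mathbf{r}'(t) \rVert}$ is the unit tangent vector to C at (x(t), y(t)).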

If the vector field $\mathbf{f}(x, y)$ represents the force moving an object along a curve C, then the work W done by this force is

\[ W = \int_C \mathbf{f} \cdot \mathbf{T} \, ds = \int_C \mathbf{f} \cdot d\mathbf{r} . \tag{4.16} \]

Example 4.2. Evaluate $\int_C (x^2 + y^2) \, dx + 2xy \, dy$, where:

(a) C : x = t , y = 2t , 0 ≤ t ≤ 1

(b) C : x = t , $y = 2t^2$ , 0 ≤ t ≤ 1

[Figure 4.1.4: The two curves from (0, 0) to (1, 2)]

Solution: Figure 4.1.4 shows both curves.

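(a) Since $x'(t) = 1$ and $y'(t) = 2$, then

\begin{align*}
\int_C (x^2 + y^2) \, dx + 2xy \, dy &= \int_{0}^{1} \left( (x(t)^2 + y(t)^2) \, x'(t) + 2x(t) y(t) \, y'(t) \right) dt \\
&= \int_{0}^{1} \left( (t^2 + 4t^2)(1) + 2t(2t)(2) \right) dt \\
&= \int_{0}^{1} 13t^2 \, dt \\
&= \frac{13t^3}{3} \,\Big|_{0}^{1} = \frac{13}{3}
\end{align*}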

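(b) Since $x'(t) = 1$ and $y'(t) = 4t$, then

\begin{align*}
\int_C (x^2 + y^2) \, dx + 2xy \, dy &= \int_{0}^{1} \left( (x(t)^2 + y(t)^2) \, x'(t) + 2x(t) y(t) \, y'(t) \right) dt \\
&= \int_{0}^{1} \left( (t^2 + 4t^4)(1) + 2t(2t^2)(4t) \right) dt \\
&= \int_{0}^{1} (t^2 + 20t^4) \, dt \\
&= \frac{t^3}{3} + 4t^5 \,\Big|_{0}^{1} = \frac{1}{3} + 4 = \frac{13}{3}
\end{align*}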

So in both cases, if the vector field $\mathbf{f}(x, y) = (x^2 + y^2) \, \mathbf{i} + 2xy \, \mathbf{j}$ represents the force moving an object from (0, 0) to (1, 2) along the given curve C, then the work done is $\frac{13}{3}$. This may lead you to think that work (and more generally, the line integral of a vector field) is independent of the path taken. However, as we will see in the next section, this is not always the case.
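The two computations in Example 4.2 can be reproduced numerically by replacing dx and dy with the small increments of x and y over each subinterval of the parameter. The Python sketch below is an illustration under that assumption; the helper name `vector_line_integral` and the step count are hypothetical choices, not part of the text.

```python
def vector_line_integral(P, Q, x, y, a, b, steps=100_000):
    """Approximate the line integral of P dx + Q dy along (x(t), y(t)),
    a <= t <= b, by summing P*dx + Q*dy over a partition of [a, b]."""
    dt = (b - a) / steps
    total = 0.0
    for i in range(steps):
        t0 = a + i * dt
        t1 = t0 + dt
        tm = 0.5 * (t0 + t1)  # sample point inside [t0, t1]
        xm, ym = x(tm), y(tm)
        total += P(xm, ym) * (x(t1) - x(t0)) + Q(xm, ym) * (y(t1) - y(t0))
    return total

def P(x, y):
    return x * x + y * y  # coefficient of dx

def Q(x, y):
    return 2.0 * x * y    # coefficient of dy

line_a = vector_line_integral(P, Q, lambda t: t, lambda t: 2.0 * t, 0.0, 1.0)
line_b = vector_line_integral(P, Q, lambda t: t, lambda t: 2.0 * t * t, 0.0, 1.0)
print(line_a, line_b, 13.0 / 3.0)
```

Both estimates should be close to 13/3. Swapping in a different field or curve only requires changing `P`, `Q` and the two parametrization functions.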

Although we defined line integrals over a single smooth curve, if C is a piecewise smooth curve, that is

\[ C = C_1 \cup C_2 \cup \ldots \cup C_n \]

is the union of smooth curves $C_1, \ldots, C_n$, then we can define

\[ \int_C \mathbf{f} \cdot d\mathbf{r} = \int_{C_1} \mathbf{f} \cdot d\mathbf{r}_1 + \int_{C_2} \mathbf{f} \cdot d\mathbf{r}_2 + \ldots + \int_{C_n} \mathbf{f} \cdot d\mathbf{r}_n \]

where each $\mathbf{r}_i$ is the position vector of the curve $C_i$.

Example 4.3. Evaluate $\int_C (x^2 + y^2) \, dx + 2xy \, dy$, where C is the polygonal path from (0, 0) to (0, 2) to (1, 2).

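[Figure 4.1.5: Polygonal path $C = C_1 \cup C_2$ from (0, 0) to (0, 2) to (1, 2)]

Solution: Write $C = C_1 \cup C_2$, where $C_1$ is the curve given by x = 0, y = t, 0 ≤ t ≤ 2 and $C_2$ is the curve given by x = t, y = 2, 0 ≤ t ≤ 1 (see Figure 4.1.5). Then

\begin{align*}
\int_C (x^2 + y^2) \, dx + 2xy \, dy &= \int_{C_1} (x^2 + y^2) \, dx + 2xy \, dy + \int_{C_2} (x^2 + y^2) \, dx + 2xy \, dy \\
&= \int_{0}^{2} \left( (0^2 + t^2)(0) + 2(0)t(1) \right) dt + \int_{0}^{1} \left( (t^2 + 4)(1) + 2t(2)(0) \right) dt \\
&= \int_{0}^{2} 0 \, dt + \int_{0}^{1} (t^2 + 4) \, dt \\
&= \frac{t^3}{3} + 4t \,\Big|_{0}^{1} = \frac{1}{3} + 4 = \frac{13}{3}
\end{align*}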

Line integral notation varies quite a bit. For example, in physics it is common to see the notation $\int_{a}^{b} \mathbf{f} \cdot d\mathbf{l}$, where it is understood that the limits of integration a and b are for the underlying parameter t of the curve, and the letter l signifies length. Also, the formulation $\int_C \mathbf{f} \cdot \mathbf{T} \, ds$ from Theorem 4.1 is often preferred in physics since it emphasizes the idea of integrating the tangential component $\mathbf{f} \cdot \mathbf{T}$ of $\mathbf{f}$ in the direction of $\mathbf{T}$ (i.e. in the direction of C), which is a useful physical interpretation of line integrals.

Exercises

A

For Exercises 1-4, calculate $\int_C f(x, y) \, ds$ for the given function f(x, y) and curve C.

1. f(x, y) = xy; C : x = cos t, y = sin t, 0 ≤ t ≤ π/2

2. $f(x, y) = \frac{x}{x^2 + 1}$; C : x = t, y = 0, 0 ≤ t ≤ 1

3. f(x, y) = 2x + y; C: polygonal path from (0, 0) to (3, 0) to (3, 2)

4. $f(x, y) = x + y^2$; C: path from (2, 0) counterclockwise along the circle $x^2 + y^2 = 4$ to the point (−2, 0) and then back to (2, 0) along the x-axis

5. Use a line integral to find the lateral surface area of the part of the cylinder $x^2 + y^2 = 4$ below the plane x + 2y + z = 6 and above the xy-plane.

For Exercises 6-11, calculate $\int_C \mathbf{f} \cdot d\mathbf{r}$ for the given vector field $\mathbf{f}(x, y)$ and curve C.

6. $\mathbf{f}(x, y) = \mathbf{i} - \mathbf{j}$; C : x = 3t, y = 2t, 0 ≤ t ≤ 1

7. $\mathbf{f}(x, y) = y \, \mathbf{i} - x \, \mathbf{j}$; C : x = cos t, y = sin t, 0 ≤ t ≤ 2π

8. $\mathbf{f}(x, y) = x \, \mathbf{i} + y \, \mathbf{j}$; C : x = cos t, y = sin t, 0 ≤ t ≤ 2π

9. $\mathbf{f}(x, y) = (x^2 - y) \, \mathbf{i} + (x - y^2) \, \mathbf{j}$; C : x = cos t, y = sin t, 0 ≤ t ≤ 2π

10. $\mathbf{f}(x, y) = xy^2 \, \mathbf{i} + xy^3 \, \mathbf{j}$; C : the polygonal path from (0, 0) to (1, 0) to (0, 1) to (0, 0)

11. $\mathbf{f}(x, y) = (x^2 + y^2) \, \mathbf{i}$; C : x = 2 + cos t, y = sin t, 0 ≤ t ≤ 2π

B

12. Verify that the value of the line integral in Example 4.1 is unchanged using the

parametrization of the circle C given in formulas (4.8).

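13. Show that if $\mathbf{f} \perp \mathbf{r}'(t)$ at each point $\mathbf{r}(t)$ along a smooth curve C, then $\int_C \mathbf{f} \cdot d\mathbf{r} = 0$.

14. Show that if $\mathbf{f}$ points in the same direction as $\mathbf{r}'(t)$ at each point $\mathbf{r}(t)$ along a smooth curve C, then $\int_C \mathbf{f} \cdot d\mathbf{r} = \int_C \lVert \mathbf{f} \rVert \, ds$.

C

15. Prove that $\int_C f(x, y) \, ds = \int_{-C} f(x, y) \, ds$. (Hint: Use formulas (4.9).)

16. Let C be a smooth curve with arc length L, and suppose that $\mathbf{f}(x, y) = P(x, y) \, \mathbf{i} + Q(x, y) \, \mathbf{j}$ is a vector field such that $\lVert \mathbf{f}(x, y) \rVert \le M$ for all (x, y) on C. Show that $\left| \int_C \mathbf{f} \cdot d\mathbf{r} \right| \le ML$. (Hint: Recall that $\left| \int_{a}^{b} g(x) \, dx \right| \le \int_{a}^{b} |g(x)| \, dx$ for Riemann integrals.)

17. Prove that the Riemann integral $\int_{a}^{b} f(x) \, dx$ is a special case of a line integral.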

4.2 Properties of Line Integrals

We know from the previous section that for line integrals of real-valued functions

(scalar fields), reversing the direction in which the integral is taken along a curve

does not change the value of the line integral:

\[ \int_C f(x, y) \, ds = \int_{-C} f(x, y) \, ds \tag{4.17} \]

For line integrals of vector fields, however, the value does change. To see this, let

f(x, y) = P(x, y) i + Q(x, y) j be a vector field, with P and Q continuously differentiable

functions. Let C be a smooth curve parametrized by x = x(t), y = y(t), a ≤ t ≤ b,

with position vector r(t) = x(t) i + y(t) j (we will usually abbreviate this by saying that

C : r(t) = x(t) i + y(t) j is a smooth curve). We know that the curve −C traversed in the

opposite direction is parametrized by x = x(a + b − t), y = y(a + b − t), a ≤ t ≤ b. Then

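\begin{align*}
\int_{-C} P(x, y) \, dx &= \int_{a}^{b} P(x(a + b - t), y(a + b - t)) \, \frac{d}{dt} \left( x(a + b - t) \right) dt \\
&= \int_{a}^{b} P(x(a + b - t), y(a + b - t)) \, (-x'(a + b - t)) \, dt \quad \text{(by the Chain Rule)} \\
&= \int_{b}^{a} P(x(u), y(u)) \, (-x'(u)) \, (-du) \quad \text{(by letting } u = a + b - t \text{)} \\
&= \int_{b}^{a} P(x(u), y(u)) \, x'(u) \, du \\
&= -\int_{a}^{b} P(x(u), y(u)) \, x'(u) \, du , \quad \text{since } \int_{b}^{a} = -\int_{a}^{b} , \text{ so} \\
\int_{-C} P(x, y) \, dx &= -\int_C P(x, y) \, dx
\end{align*}

since we are just using a different letter (u) for the line integral along C. A similar argument shows that

\[ \int_{-C} Q(x, y) \, dy = -\int_C Q(x, y) \, dy , \]

and hence

\begin{align*}
\int_{-C} \mathbf{f} \cdot d\mathbf{r} &= \int_{-C} P(x, y) \, dx + \int_{-C} Q(x, y) \, dy \\
&= -\int_C P(x, y) \, dx + -\int_C Q(x, y) \, dy \\
&= -\left( \int_C P(x, y) \, dx + \int_C Q(x, y) \, dy \right)
\end{align*}

\[ \int_{-C} \mathbf{f} \cdot d\mathbf{r} = -\int_C \mathbf{f} \cdot d\mathbf{r} . \tag{4.18} \]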

The above formula can be interpreted in terms of the work done by a force f(x, y)

(treated as a vector) moving an object along a curve C: the total work performed

moving the object along C from its initial point to its terminal point, and then back

to the initial point moving backwards along the same path, is zero. This is because

when force is considered as a vector, direction is accounted for.

The preceding discussion shows the importance of always taking the direction of

the curve into account when using line integrals of vector fields. For this reason, the

curves in line integrals are sometimes referred to as directed curves or oriented curves.

Recall that our definition of a line integral required that we have a parametrization x = x(t), y = y(t), a ≤ t ≤ b for the curve C. But as we know, any curve has infinitely many parametrizations. So could we get a different value for a line integral using some other parametrization of C, say, $x = \tilde{x}(u)$, $y = \tilde{y}(u)$, c ≤ u ≤ d? If so, this would mean that our definition is not well-defined. Luckily, it turns out that the value of a line integral of a vector field is unchanged as long as the direction of the curve C is preserved by whatever parametrization is chosen:

Theorem 4.2. Let $\mathbf{f}(x, y) = P(x, y) \, \mathbf{i} + Q(x, y) \, \mathbf{j}$ be a vector field, and let C be a smooth curve parametrized by x = x(t), y = y(t), a ≤ t ≤ b. Suppose that t = α(u) for c ≤ u ≤ d, such that a = α(c), b = α(d), and $\alpha'(u) > 0$ on the open interval (c, d) (i.e. α(u) is strictly increasing on [c, d]). Then $\int_C \mathbf{f} \cdot d\mathbf{r}$ has the same value for the parametrizations x = x(t), y = y(t), a ≤ t ≤ b and $x = \tilde{x}(u) = x(\alpha(u))$, $y = \tilde{y}(u) = y(\alpha(u))$, c ≤ u ≤ d.

Proof: Since α(u) is strictly increasing and maps [c, d] onto [a, b], then we know that t = α(u) has an inverse function $u = \alpha^{-1}(t)$ defined on [a, b] such that $c = \alpha^{-1}(a)$, $d = \alpha^{-1}(b)$, and $\frac{du}{dt} = \frac{1}{\alpha'(u)}$. Also, $dt = \alpha'(u) \, du$, and by the Chain Rule

\[ \tilde{x}'(u) = \frac{d\tilde{x}}{du} = \frac{d}{du} \left( x(\alpha(u)) \right) = \frac{dx}{dt} \frac{dt}{du} = x'(t) \, \alpha'(u) \quad \Rightarrow \quad x'(t) = \frac{\tilde{x}'(u)}{\alpha'(u)} \]

so making the substitution t = α(u) gives

\begin{align*}
\int_{a}^{b} P(x(t), y(t)) \, x'(t) \, dt &= \int_{\alpha^{-1}(a)}^{\alpha^{-1}(b)} P(x(\alpha(u)), y(\alpha(u))) \, \frac{\tilde{x}'(u)}{\alpha'(u)} \, (\alpha'(u) \, du) \\
&= \int_{c}^{d} P(\tilde{x}(u), \tilde{y}(u)) \, \tilde{x}'(u) \, du ,
\end{align*}

which shows that $\int_C P(x, y) \, dx$ has the same value for both parametrizations. A similar argument shows that $\int_C Q(x, y) \, dy$ has the same value for both parametrizations, and hence $\int_C \mathbf{f} \cdot d\mathbf{r}$ has the same value. QED

Notice that the condition $\alpha'(u) > 0$ in Theorem 4.2 means that the two parametrizations move along C in the same direction. That was not the case with the “reverse” parametrization for −C: for u = a + b − t we have t = α(u) = a + b − u ⇒ $\alpha'(u) = -1 < 0$.

Example 4.4. Evaluate the line integral $\int_C (x^2 + y^2) \, dx + 2xy \, dy$ from Example 4.2, Section 4.1, along the curve C : x = t, $y = 2t^2$, 0 ≤ t ≤ 1, where t = sin u for 0 ≤ u ≤ π/2.

Solution: First, we notice that 0 = sin 0, 1 = sin(π/2), and $\frac{dt}{du} = \cos u > 0$ on (0, π/2). So by Theorem 4.2 we know that if C is parametrized by

\[ x = \sin u , \quad y = 2 \sin^2 u , \quad 0 \le u \le \pi/2 \]

then $\int_C (x^2 + y^2) \, dx + 2xy \, dy$ should have the same value as we found in Example 4.2, namely $\frac{13}{3}$. And we can indeed verify this:

\begin{align*}
\int_C (x^2 + y^2) \, dx + 2xy \, dy &= \int_{0}^{\pi/2} \left( (\sin^2 u + (2 \sin^2 u)^2) \cos u + 2(\sin u)(2 \sin^2 u) \, 4 \sin u \cos u \right) du \\
&= \int_{0}^{\pi/2} \left( \sin^2 u + 20 \sin^4 u \right) \cos u \, du \\
&= \frac{\sin^3 u}{3} + 4 \sin^5 u \,\Big|_{0}^{\pi/2} = \frac{1}{3} + 4 = \frac{13}{3}
\end{align*}

In other words, the line integral is unchanged whether t or u is the parameter for C.
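Both properties discussed in this section, the sign change under reversal in formula (4.18) and the invariance under a direction-preserving change of parameter in Theorem 4.2, can be observed numerically for the integral in Example 4.4. The Python sketch below is only illustrative; the helper name `vector_line_integral` and the step count are assumptions, and dx, dy are approximated by increments of x and y over each subinterval.

```python
import math

def vector_line_integral(P, Q, x, y, a, b, steps=100_000):
    """Approximate the line integral of P dx + Q dy along (x(t), y(t)),
    a <= t <= b, by summing P*dx + Q*dy over a partition of [a, b]."""
    dt = (b - a) / steps
    total = 0.0
    for i in range(steps):
        t0 = a + i * dt
        t1 = t0 + dt
        tm = 0.5 * (t0 + t1)
        xm, ym = x(tm), y(tm)
        total += P(xm, ym) * (x(t1) - x(t0)) + Q(xm, ym) * (y(t1) - y(t0))
    return total

def P(x, y):
    return x * x + y * y

def Q(x, y):
    return 2.0 * x * y

# Reparametrized curve: x = sin u, y = 2 sin^2 u, 0 <= u <= pi/2 (same direction).
same_direction = vector_line_integral(
    P, Q, lambda u: math.sin(u), lambda u: 2.0 * math.sin(u) ** 2, 0.0, math.pi / 2.0
)

# Reversed curve -C: x = x(a + b - t) = 1 - t, y = 2(1 - t)^2, 0 <= t <= 1.
reversed_direction = vector_line_integral(
    P, Q, lambda t: 1.0 - t, lambda t: 2.0 * (1.0 - t) ** 2, 0.0, 1.0
)

print(same_direction, reversed_direction)
```

The first value should be close to 13/3 and the second close to −13/3.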

By a closed curve, we mean a curve C whose initial point and terminal point are

the same, i.e. for C : x = x(t), y = y(t), a ≤ t ≤ b, we have (x(a), y(a)) = (x(b), y(b)).

[Figure 4.2.1: Closed vs nonclosed curves. (a) Closed: the points t = a and t = b coincide. (b) Not closed: the points t = a and t = b are distinct.]

A simple closed curve is a closed curve which does not intersect itself. Note that any closed curve can be regarded as a union of simple closed curves (think of the loops in a figure eight). We use the special notation

\[ \oint_C f(x, y) \, ds \quad \text{and} \quad \oint_C \mathbf{f} \cdot d\mathbf{r} \]

to denote line integrals of scalar and vector fields, respectively, along closed curves. In some older texts you may see the notation ∳ or ∲ to indicate a line integral traversing a closed curve in a counterclockwise or clockwise direction, respectively.

So far, the examples we have seen of line integrals (e.g. Example 4.2) have had the

same value for different curves joining the initial point to the terminal point. That

is, the line integral has been independent of the path joining the two points. As we

mentioned before, this is not always the case. The following theorem gives a necessary

and sufficient condition for this path independence:

Theorem 4.3. In a region R, the line integral $\int_C \mathbf{f} \cdot d\mathbf{r}$ is independent of the path between any two points in R if and only if $\oint_C \mathbf{f} \cdot d\mathbf{r} = 0$ for every closed curve C which is contained in R.

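Proof: Suppose that $\oint_C \mathbf{f} \cdot d\mathbf{r} = 0$ for every closed curve C which is contained in R. Let $P_1$ and $P_2$ be two distinct points in R. Let $C_1$ be a curve in R going from $P_1$ to $P_2$, and let $C_2$ be another curve in R going from $P_1$ to $P_2$, as in Figure 4.2.2.

[Figure 4.2.2: Two curves $C_1$ and $C_2$ from $P_1$ to $P_2$]

Then $C = C_1 \cup -C_2$ is a closed curve in R (from $P_1$ to $P_1$), and so $\oint_C \mathbf{f} \cdot d\mathbf{r} = 0$. Thus,

\begin{align*}
0 &= \oint_C \mathbf{f} \cdot d\mathbf{r} \\
&= \int_{C_1} \mathbf{f} \cdot d\mathbf{r} + \int_{-C_2} \mathbf{f} \cdot d\mathbf{r} \\
&= \int_{C_1} \mathbf{f} \cdot d\mathbf{r} - \int_{C_2} \mathbf{f} \cdot d\mathbf{r} , \quad \text{and so}
\end{align*}

$\int_{C_1} \mathbf{f} \cdot d\mathbf{r} = \int_{C_2} \mathbf{f} \cdot d\mathbf{r}$. This proves path independence.

Conversely, suppose that the line integral $\int_C \mathbf{f} \cdot d\mathbf{r}$ is independent of the path between any two points in R. Let C be a closed curve contained in R. Let $P_1$ and $P_2$ be two distinct points on C. Let $C_1$ be a part of the curve C that goes from $P_1$ to $P_2$, and let $C_2$ be the remaining part of C that goes from $P_1$ to $P_2$, again as in Figure 4.2.2. Then by path independence we have

\begin{align*}
\int_{C_1} \mathbf{f} \cdot d\mathbf{r} &= \int_{C_2} \mathbf{f} \cdot d\mathbf{r} \\
\int_{C_1} \mathbf{f} \cdot d\mathbf{r} - \int_{C_2} \mathbf{f} \cdot d\mathbf{r} &= 0 \\
\int_{C_1} \mathbf{f} \cdot d\mathbf{r} + \int_{-C_2} \mathbf{f} \cdot d\mathbf{r} &= 0 , \quad \text{so} \\
\oint_C \mathbf{f} \cdot d\mathbf{r} &= 0
\end{align*}

since $C = C_1 \cup -C_2$. QED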

Clearly, the above theorem does not give a practical way to determine path independence, since it is impossible to check the line integrals around all possible closed curves in a region. What it mostly does is give an idea of the way in which line integrals behave, and how seemingly unrelated line integrals can be related (in this case, a specific line integral between two points and all line integrals around closed curves).

For a more practical method for determining path independence, we first need a

version of the Chain Rule for multivariable functions:

Theorem 4.4. (Chain Rule) If z = f(x, y) is a continuously differentiable function of x and y, and both x = x(t) and y = y(t) are differentiable functions of t, then z is a differentiable function of t, and

$$\frac{dz}{dt} = \frac{\partial z}{\partial x}\frac{dx}{dt} + \frac{\partial z}{\partial y}\frac{dy}{dt} \tag{4.19}$$

at all points where the derivatives on the right are defined.

The proof is virtually identical to the proof of Theorem 2.2 from Section 2.4 (which uses the Mean Value Theorem), so we omit it.¹ We will now use this Chain Rule to prove the following sufficient condition for path independence of line integrals:

Theorem 4.5. Let $\mathbf{f}(x, y) = P(x, y)\,\mathbf{i} + Q(x, y)\,\mathbf{j}$ be a vector field in some region R, with P and Q continuously differentiable functions on R. Let C be a smooth curve in R parametrized by x = x(t), y = y(t), a ≤ t ≤ b. Suppose that there is a real-valued function F(x, y) such that $\nabla F = \mathbf{f}$ on R. Then

$$\int_C \mathbf{f} \cdot d\mathbf{r} = F(B) - F(A) , \tag{4.20}$$

where A = (x(a), y(a)) and B = (x(b), y(b)) are the endpoints of C. Thus, the line integral is independent of the path between its endpoints, since it depends only on the values of F at those endpoints.

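Proof: By definition of $\int_C \mathbf{f} \cdot d\mathbf{r}$, we have

$$\begin{aligned}
\int_C \mathbf{f} \cdot d\mathbf{r} &= \int_a^b \bigl( P(x(t), y(t))\,x'(t) + Q(x(t), y(t))\,y'(t) \bigr)\,dt \\
&= \int_a^b \left( \frac{\partial F}{\partial x}\frac{dx}{dt} + \frac{\partial F}{\partial y}\frac{dy}{dt} \right) dt \quad \text{(since } \nabla F = \mathbf{f} \Rightarrow \tfrac{\partial F}{\partial x} = P \text{ and } \tfrac{\partial F}{\partial y} = Q\text{)} \\
&= \int_a^b F'(x(t), y(t))\,dt \quad \text{(by the Chain Rule in Theorem 4.4)} \\
&= F(x(t), y(t)) \,\Big|_a^b = F(B) - F(A)
\end{aligned}$$

by the Fundamental Theorem of Calculus. QED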

¹ See TAYLOR and MANN, § 6.5.


Theorem 4.5 can be thought of as the line integral version of the Fundamental

Theorem of Calculus. A real-valued function F(x, y) such that ∇F(x, y) = f(x, y) is called

a potential for f. A conservative vector field is one which has a potential.

Example 4.5. Recall from Examples 4.2 and 4.3 in Section 4.1 that the line integral $\int_C (x^2 + y^2)\,dx + 2xy\,dy$ was found to have the value $\frac{13}{3}$ for three different curves C going from the point (0, 0) to the point (1, 2). Use Theorem 4.5 to show that this line integral is indeed path independent.

Solution: We need to find a real-valued function F(x, y) such that

$$\frac{\partial F}{\partial x} = x^2 + y^2 \quad \text{and} \quad \frac{\partial F}{\partial y} = 2xy .$$

Suppose that $\frac{\partial F}{\partial x} = x^2 + y^2$. Then we must have $F(x, y) = \frac{1}{3}x^3 + xy^2 + g(y)$ for some function g(y). So $\frac{\partial F}{\partial y} = 2xy + g'(y)$ satisfies the condition $\frac{\partial F}{\partial y} = 2xy$ if $g'(y) = 0$, i.e. g(y) = K, where K is a constant. Since any choice for K will do (why?), we pick K = 0. Thus, a potential F(x, y) for $\mathbf{f}(x, y) = (x^2 + y^2)\,\mathbf{i} + 2xy\,\mathbf{j}$ exists, namely

$$F(x, y) = \frac{1}{3}x^3 + xy^2 .$$

Hence the line integral $\int_C (x^2 + y^2)\,dx + 2xy\,dy$ is path independent.

Note that we can also verify that the value of the line integral of $\mathbf{f}$ along any curve C going from (0, 0) to (1, 2) will always be $\frac{13}{3}$, since by Theorem 4.5

$$\int_C \mathbf{f} \cdot d\mathbf{r} = F(1, 2) - F(0, 0) = \frac{1}{3}(1)^3 + (1)(2)^2 - (0 + 0) = \frac{1}{3} + 4 = \frac{13}{3} .$$
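The path independence in Example 4.5 can also be checked numerically. The following is only a sketch: the helper `line_integral` and the two test paths (the line y = 2x and the parabola y = 2x²) are illustrative choices made here, not taken from the text. Both approximations should agree with F(1, 2) − F(0, 0).

```python
def line_integral(P, Q, x, y, dx, dy, a, b, n=20000):
    """Midpoint-rule approximation of the line integral of P dx + Q dy
    along the curve (x(t), y(t)), a <= t <= b (hypothetical helper)."""
    h = (b - a) / n
    total = 0.0
    for k in range(n):
        t = a + (k + 0.5) * h
        total += (P(x(t), y(t)) * dx(t) + Q(x(t), y(t)) * dy(t)) * h
    return total

P = lambda x, y: x**2 + y**2
Q = lambda x, y: 2 * x * y
F = lambda x, y: x**3 / 3 + x * y**2   # the potential found in Example 4.5

# Two assumed paths from (0, 0) to (1, 2): the line y = 2x and the parabola y = 2x^2.
line = line_integral(P, Q, lambda t: t, lambda t: 2 * t,
                     lambda t: 1.0, lambda t: 2.0, 0.0, 1.0)
parabola = line_integral(P, Q, lambda t: t, lambda t: 2 * t**2,
                         lambda t: 1.0, lambda t: 4 * t, 0.0, 1.0)

print(line, parabola, F(1, 2) - F(0, 0))
```

All three printed values should be close to 13/3, which is what Theorem 4.5 predicts for any path between these endpoints.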

A consequence of Theorem 4.5 in the special case where C is a closed curve, so that

the endpoints A and B are the same point, is the following important corollary:

Corollary 4.6. If a vector field $\mathbf{f}$ has a potential in a region R, then $\oint_C \mathbf{f} \cdot d\mathbf{r} = 0$ for any closed curve C in R (i.e. $\oint_C \nabla F \cdot d\mathbf{r} = 0$ for any real-valued function F(x, y)).

Example 4.6. Evaluate $\oint_C x\,dx + y\,dy$ for C : x = 2 cos t, y = 3 sin t, 0 ≤ t ≤ 2π.

Solution: The vector field $\mathbf{f}(x, y) = x\,\mathbf{i} + y\,\mathbf{j}$ has a potential F(x, y):

$$\frac{\partial F}{\partial x} = x \;\Rightarrow\; F(x, y) = \frac{1}{2}x^2 + g(y) , \text{ so}$$

$$\frac{\partial F}{\partial y} = y \;\Rightarrow\; g'(y) = y \;\Rightarrow\; g(y) = \frac{1}{2}y^2 + K$$

for any constant K, so $F(x, y) = \frac{1}{2}x^2 + \frac{1}{2}y^2$ is a potential for $\mathbf{f}(x, y)$. Thus,

$$\oint_C x\,dx + y\,dy = \oint_C \mathbf{f} \cdot d\mathbf{r} = 0$$

by Corollary 4.6, since the curve C is closed (it is the ellipse $\frac{x^2}{4} + \frac{y^2}{9} = 1$).
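As a sanity check on Example 4.6, the integral around the ellipse can be approximated directly from the parametrization. This is a minimal sketch; the function name `closed_line_integral` and the step count are assumptions made for illustration.

```python
import math

def closed_line_integral(P, Q, x, y, dx, dy, n=10000):
    """Midpoint-rule approximation of the integral of P dx + Q dy over the
    closed curve (x(t), y(t)), 0 <= t <= 2*pi (hypothetical helper)."""
    h = 2 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        total += (P(x(t), y(t)) * dx(t) + Q(x(t), y(t)) * dy(t)) * h
    return total

# The ellipse of Example 4.6: x = 2 cos t, y = 3 sin t.
value = closed_line_integral(
    lambda x, y: x, lambda x, y: y,
    lambda t: 2 * math.cos(t), lambda t: 3 * math.sin(t),
    lambda t: -2 * math.sin(t), lambda t: 3 * math.cos(t),
)
print(value)
```

The printed value should be zero up to rounding error, in agreement with Corollary 4.6.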


Exercises

A

1. Evaluate $\oint_C (x^2 + y^2)\,dx + 2xy\,dy$ for C : x = cos t, y = sin t, 0 ≤ t ≤ 2π.

2. Evaluate $\int_C (x^2 + y^2)\,dx + 2xy\,dy$ for C : x = cos t, y = sin t, 0 ≤ t ≤ π.

3. Is there a potential F(x, y) for $\mathbf{f}(x, y) = y\,\mathbf{i} - x\,\mathbf{j}$? If so, find one.

4. Is there a potential F(x, y) for $\mathbf{f}(x, y) = x\,\mathbf{i} - y\,\mathbf{j}$? If so, find one.

5. Is there a potential F(x, y) for $\mathbf{f}(x, y) = xy^2\,\mathbf{i} + x^3 y\,\mathbf{j}$? If so, find one.

B

6. Let $\mathbf{f}(x, y)$ and $\mathbf{g}(x, y)$ be vector fields, let a and b be constants, and let C be a curve in $\mathbb{R}^2$. Show that

$$\int_C (a\,\mathbf{f} \pm b\,\mathbf{g}) \cdot d\mathbf{r} = a \int_C \mathbf{f} \cdot d\mathbf{r} \pm b \int_C \mathbf{g} \cdot d\mathbf{r} .$$

7. Let C be a curve whose arc length is L. Show that $\int_C 1\,ds = L$.

8. Let f(x, y) and g(x, y) be continuously differentiable real-valued functions in a region R. Show that

$$\oint_C f\,\nabla g \cdot d\mathbf{r} = -\oint_C g\,\nabla f \cdot d\mathbf{r}$$

for any closed curve C in R. (Hint: Use Exercise 21 in Section 2.4.)

9. Let $\mathbf{f}(x, y) = \dfrac{-y}{x^2 + y^2}\,\mathbf{i} + \dfrac{x}{x^2 + y^2}\,\mathbf{j}$ for all (x, y) ≠ (0, 0), and C : x = cos t, y = sin t, 0 ≤ t ≤ 2π.

(a) Show that $\mathbf{f} = \nabla F$, for $F(x, y) = \tan^{-1}(y/x)$.

(b) Show that $\oint_C \mathbf{f} \cdot d\mathbf{r} = 2\pi$. Does this contradict Corollary 4.6? Explain.

C

10. Let g(x) and h(y) be differentiable functions, and let f(x, y) = h(y) i + g(x) j. Can f

have a potential F(x, y)? If so, find it. You may assume that F would be smooth.

(Hint: Consider the mixed partial derivatives of F.)


4.3 Green’s Theorem

We will now see a way of evaluating the line integral of a smooth vector field around a simple closed curve. A vector field $\mathbf{f}(x, y) = P(x, y)\,\mathbf{i} + Q(x, y)\,\mathbf{j}$ is smooth if its component functions P(x, y) and Q(x, y) are smooth. We will use Green’s Theorem (sometimes called Green’s Theorem in the plane) to relate the line integral around a closed curve with a double integral over the region inside the curve:

Theorem 4.7. (Green’s Theorem) Let R be a region in $\mathbb{R}^2$ whose boundary is a simple closed curve C which is piecewise smooth. Let $\mathbf{f}(x, y) = P(x, y)\,\mathbf{i} + Q(x, y)\,\mathbf{j}$ be a smooth vector field defined on both R and C. Then

$$\oint_C \mathbf{f} \cdot d\mathbf{r} = \iint_R \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA , \tag{4.21}$$

where C is traversed so that R is always on the left side of C.

Proof: We will prove the theorem in the case for a simple region R, that is, where the boundary curve C can be written as $C = C_1 \cup C_2$ in two distinct ways:

$$C_1 = \text{the curve } y = y_1(x) \text{ from the point } X_1 \text{ to the point } X_2 \tag{4.22}$$

$$C_2 = \text{the curve } y = y_2(x) \text{ from the point } X_2 \text{ to the point } X_1 , \tag{4.23}$$

where $X_1$ and $X_2$ are the points on C farthest to the left and right, respectively; and

$$C_1 = \text{the curve } x = x_1(y) \text{ from the point } Y_2 \text{ to the point } Y_1 \tag{4.24}$$

$$C_2 = \text{the curve } x = x_2(y) \text{ from the point } Y_1 \text{ to the point } Y_2 , \tag{4.25}$$

where $Y_1$ and $Y_2$ are the lowest and highest points, respectively, on C. See Figure 4.3.1.

[Figure 4.3.1: a simple region R with boundary C, showing the curves $y = y_1(x)$, $y = y_2(x)$, $x = x_1(y)$, $x = x_2(y)$, the extreme points $X_1$, $X_2$, $Y_1$, $Y_2$, and the bounds a ≤ x ≤ b, c ≤ y ≤ d]

Integrate P(x, y) around C using the representation $C = C_1 \cup C_2$ given by (4.22) and (4.23). Since $y = y_1(x)$ along $C_1$ (as x goes from a to b) and $y = y_2(x)$ along $C_2$ (as x goes from b to a), as we see from Figure 4.3.1, then we have

$$\begin{aligned}
\oint_C P(x, y)\,dx &= \int_{C_1} P(x, y)\,dx + \int_{C_2} P(x, y)\,dx \\
&= \int_a^b P(x, y_1(x))\,dx + \int_b^a P(x, y_2(x))\,dx \\
&= \int_a^b P(x, y_1(x))\,dx - \int_a^b P(x, y_2(x))\,dx \\
&= -\int_a^b \bigl( P(x, y_2(x)) - P(x, y_1(x)) \bigr)\,dx \\
&= -\int_a^b \left( P(x, y) \,\Big|_{y = y_1(x)}^{y = y_2(x)} \right) dx \\
&= -\int_a^b \int_{y_1(x)}^{y_2(x)} \frac{\partial P(x, y)}{\partial y}\,dy\,dx \quad \text{(by the Fundamental Theorem of Calculus)} \\
&= -\iint_R \frac{\partial P}{\partial y}\,dA .
\end{aligned}$$

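Likewise, integrate Q(x, y) around C using the representation $C = C_1 \cup C_2$ given by (4.24) and (4.25). Since $x = x_1(y)$ along $C_1$ (as y goes from d to c) and $x = x_2(y)$ along $C_2$ (as y goes from c to d), as we see from Figure 4.3.1, then we have

$$\begin{aligned}
\oint_C Q(x, y)\,dy &= \int_{C_1} Q(x, y)\,dy + \int_{C_2} Q(x, y)\,dy \\
&= \int_d^c Q(x_1(y), y)\,dy + \int_c^d Q(x_2(y), y)\,dy \\
&= -\int_c^d Q(x_1(y), y)\,dy + \int_c^d Q(x_2(y), y)\,dy \\
&= \int_c^d \bigl( Q(x_2(y), y) - Q(x_1(y), y) \bigr)\,dy \\
&= \int_c^d \left( Q(x, y) \,\Big|_{x = x_1(y)}^{x = x_2(y)} \right) dy \\
&= \int_c^d \int_{x_1(y)}^{x_2(y)} \frac{\partial Q(x, y)}{\partial x}\,dx\,dy \quad \text{(by the Fundamental Theorem of Calculus)} \\
&= \iint_R \frac{\partial Q}{\partial x}\,dA ,
\end{aligned}$$

and so

$$\begin{aligned}
\oint_C \mathbf{f} \cdot d\mathbf{r} &= \oint_C P(x, y)\,dx + \oint_C Q(x, y)\,dy \\
&= -\iint_R \frac{\partial P}{\partial y}\,dA + \iint_R \frac{\partial Q}{\partial x}\,dA \\
&= \iint_R \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA .
\end{aligned}$$

QED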

Though we proved Green’s Theorem only for a simple region R, the theorem can also be proved for more general regions (say, a union of simple regions).²

Example 4.7. Evaluate $\oint_C (x^2 + y^2)\,dx + 2xy\,dy$, where C is the boundary (traversed counterclockwise) of the region $R = \{ (x, y) : 0 \le x \le 1,\ 2x^2 \le y \le 2x \}$.

[Figure 4.3.2: the shaded region R between $y = 2x^2$ and $y = 2x$, with boundary C running between (0, 0) and (1, 2)]

Solution: R is the shaded region in Figure 4.3.2. By Green’s Theorem, for $P(x, y) = x^2 + y^2$ and $Q(x, y) = 2xy$, we have

$$\begin{aligned}
\oint_C (x^2 + y^2)\,dx + 2xy\,dy &= \iint_R \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA \\
&= \iint_R (2y - 2y)\,dA = \iint_R 0\,dA = 0 .
\end{aligned}$$

We actually already knew that the answer was zero. Recall from Example 4.5 in Section 4.2 that the vector field $\mathbf{f}(x, y) = (x^2 + y^2)\,\mathbf{i} + 2xy\,\mathbf{j}$ has a potential function $F(x, y) = \frac{1}{3}x^3 + xy^2$, and so $\oint_C \mathbf{f} \cdot d\mathbf{r} = 0$ by Corollary 4.6.
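Example 4.7 has a zero integrand, so it may help to see both sides of Green’s Theorem agree when they are nonzero. The sketch below uses a field and region chosen here purely for illustration (P = −y³, Q = x³ on the unit disk, which do not appear in the text), and approximates each side with a midpoint rule.

```python
import math

def boundary_integral(n=20000):
    """Integral of P dx + Q dy around the unit circle, counterclockwise,
    for the illustrative field P = -y**3, Q = x**3."""
    h = 2 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        x, y = math.cos(t), math.sin(t)
        dx, dy = -math.sin(t), math.cos(t)
        total += (-y**3 * dx + x**3 * dy) * h
    return total

def area_integral(n=2000):
    """Double integral of dQ/dx - dP/dy = 3x^2 + 3y^2 = 3r^2 over the unit
    disk, in polar coordinates (dA = r dr dtheta)."""
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        r = (k + 0.5) * h
        total += 3 * r**2 * r * h   # the integrand does not depend on theta
    return 2 * math.pi * total

lhs = boundary_integral()
rhs = area_integral()
print(lhs, rhs)
```

Both values should be close to 3π/2, the exact value of either side for this particular field.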

Example 4.8. Let $\mathbf{f}(x, y) = P(x, y)\,\mathbf{i} + Q(x, y)\,\mathbf{j}$, where

$$P(x, y) = \frac{-y}{x^2 + y^2} \quad \text{and} \quad Q(x, y) = \frac{x}{x^2 + y^2} ,$$

and let $R = \{ (x, y) : 0 < x^2 + y^2 \le 1 \}$. For the boundary curve $C : x^2 + y^2 = 1$, traversed counterclockwise, it was shown in Exercise 9(b) in Section 4.2 that $\oint_C \mathbf{f} \cdot d\mathbf{r} = 2\pi$. But

$$\frac{\partial Q}{\partial x} = \frac{y^2 - x^2}{(x^2 + y^2)^2} = \frac{\partial P}{\partial y} \;\Rightarrow\; \iint_R \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA = \iint_R 0\,dA = 0 .$$

This would seem to contradict Green’s Theorem. However, note that R is not the entire region enclosed by C, since the point (0, 0) is not contained in R. That is, R has a “hole” at the origin, so Green’s Theorem does not apply.

² See TAYLOR and MANN, § 15.31 for a discussion of some of the difficulties involved when the boundary curve is “complicated”.


[Figure 4.3.3: The annulus R, bounded by the outer circle $C_1$ of radius 1 (counterclockwise) and the inner circle $C_2$ of radius 1/2 (clockwise)]

If we modify the region R to be the annulus $R = \{ (x, y) : 1/4 \le x^2 + y^2 \le 1 \}$ (see Figure 4.3.3), and take the “boundary” C of R to be $C = C_1 \cup C_2$, where $C_1$ is the unit circle $x^2 + y^2 = 1$ traversed counterclockwise and $C_2$ is the circle $x^2 + y^2 = 1/4$ traversed clockwise, then it can be shown (see Exercise 9) that

$$\oint_C \mathbf{f} \cdot d\mathbf{r} = 0 .$$

We would still have $\iint_R \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA = 0$, so for this R we would have

$$\oint_C \mathbf{f} \cdot d\mathbf{r} = \iint_R \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA ,$$

which shows that Green’s Theorem holds for the annular region R.
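The two boundary integrals in Example 4.8 can be approximated numerically. This is a sketch only, not a proof: the helper `circle_integral` and its `direction` parameter are names introduced here for illustration.

```python
import math

def circle_integral(radius, direction, n=20000):
    """Integral of f . dr around the circle x^2 + y^2 = radius^2 for the field
    of Example 4.8; direction = +1 is counterclockwise, -1 is clockwise."""
    h = 2 * math.pi / n
    total = 0.0
    for k in range(n):
        t = direction * (k + 0.5) * h
        x, y = radius * math.cos(t), radius * math.sin(t)
        # Derivatives of x and y with respect to the step parameter s, where t = direction * s.
        dx = -radius * math.sin(t) * direction
        dy = radius * math.cos(t) * direction
        P = -y / (x**2 + y**2)
        Q = x / (x**2 + y**2)
        total += (P * dx + Q * dy) * h
    return total

outer = circle_integral(1.0, +1)    # C1: unit circle, counterclockwise
inner = circle_integral(0.5, -1)    # C2: radius 1/2, clockwise
print(outer, inner, outer + inner)
```

The outer integral should be close to 2π and the inner one close to −2π, so their sum over the full boundary of the annulus is close to 0, matching the double integral.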

It turns out that Green’s Theorem can be extended to multiply connected regions,

that is, regions like the annulus in Example 4.8, which have one or more regions cut

out from the interior, as opposed to discrete points being cut out. For such regions, the

“outer” boundary and the “inner” boundaries are traversed so that R is always on the

left side.

[Figure 4.3.4: Multiply connected regions. (a) Region R with one hole, with boundaries $C_1$, $C_2$ and subregions $R_1$, $R_2$. (b) Region R with two holes, with boundaries $C_1$, $C_2$, $C_3$ and subregions $R_1$, $R_2$.]

The intuitive idea for why Green’s Theorem holds for multiply connected regions is shown in Figure 4.3.4 above. The idea is to cut “slits” between the boundaries of a multiply connected region R so that R is divided into subregions which do not have any “holes”. For example, in Figure 4.3.4(a) the region R is the union of the regions $R_1$ and $R_2$, which are divided by the slits indicated by the dashed lines. Those slits are part of the boundary of both $R_1$ and $R_2$, and we traverse them in the manner indicated by the arrows. Notice that along each slit the boundary of $R_1$ is traversed in the opposite direction as that of $R_2$, which means that the line integrals of $\mathbf{f}$ along those slits cancel each other out. Since $R_1$ and $R_2$ do not have holes in them, then Green’s Theorem holds in each subregion, so that

$$\oint_{\text{bdy of } R_1} \mathbf{f} \cdot d\mathbf{r} = \iint_{R_1} \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA \quad \text{and} \quad \oint_{\text{bdy of } R_2} \mathbf{f} \cdot d\mathbf{r} = \iint_{R_2} \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA .$$

But since the line integrals along the slits cancel out, we have

$$\oint_{C_1 \cup C_2} \mathbf{f} \cdot d\mathbf{r} = \oint_{\text{bdy of } R_1} \mathbf{f} \cdot d\mathbf{r} + \oint_{\text{bdy of } R_2} \mathbf{f} \cdot d\mathbf{r} ,$$

and so

$$\oint_{C_1 \cup C_2} \mathbf{f} \cdot d\mathbf{r} = \iint_{R_1} \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA + \iint_{R_2} \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA = \iint_R \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA ,$$

which shows that Green’s Theorem holds in the region R. A similar argument shows that the theorem holds in the region with two holes shown in Figure 4.3.4(b).

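We know from Corollary 4.6 that when a smooth vector field $\mathbf{f}(x, y) = P(x, y)\,\mathbf{i} + Q(x, y)\,\mathbf{j}$ on a region R (whose boundary is a piecewise smooth, simple closed curve C) has a potential in R, then $\oint_C \mathbf{f} \cdot d\mathbf{r} = 0$. And if the potential F(x, y) is smooth in R, then $\frac{\partial F}{\partial x} = P$ and $\frac{\partial F}{\partial y} = Q$, and so we know that

$$\frac{\partial^2 F}{\partial y\,\partial x} = \frac{\partial^2 F}{\partial x\,\partial y} \;\Rightarrow\; \frac{\partial P}{\partial y} = \frac{\partial Q}{\partial x} \text{ in } R.$$

Conversely, if $\frac{\partial P}{\partial y} = \frac{\partial Q}{\partial x}$ in R then

$$\oint_C \mathbf{f} \cdot d\mathbf{r} = \iint_R \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA = \iint_R 0\,dA = 0 .$$

For a simply connected region R (i.e. a region with no holes), the following can be shown:

The following statements are equivalent for a simply connected region R in $\mathbb{R}^2$:

(a) $\mathbf{f}(x, y) = P(x, y)\,\mathbf{i} + Q(x, y)\,\mathbf{j}$ has a smooth potential F(x, y) in R

(b) $\int_C \mathbf{f} \cdot d\mathbf{r}$ is independent of the path for any curve C in R

(c) $\oint_C \mathbf{f} \cdot d\mathbf{r} = 0$ for every simple closed curve C in R

(d) $\frac{\partial P}{\partial y} = \frac{\partial Q}{\partial x}$ in R (in this case, the differential form P dx + Q dy is exact)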


Exercises

A

For Exercises 1-4, use Green’s Theorem to evaluate the given line integral around the curve C, traversed counterclockwise.

1. $\oint_C (x^2 - y^2)\,dx + 2xy\,dy$; C is the boundary of $R = \{ (x, y) : 0 \le x \le 1,\ 2x^2 \le y \le 2x \}$

2. $\oint_C x^2 y\,dx + 2xy\,dy$; C is the boundary of $R = \{ (x, y) : 0 \le x \le 1,\ x^2 \le y \le x \}$

3. $\oint_C 2y\,dx - 3x\,dy$; C is the circle $x^2 + y^2 = 1$

4. $\oint_C (e^{x^2} + y^2)\,dx + (e^{y^2} + x^2)\,dy$; C is the boundary of the triangle with vertices (0, 0), (4, 0) and (0, 4)

5. Is there a potential F(x, y) for $\mathbf{f}(x, y) = (y^2 + 3x^2)\,\mathbf{i} + 2xy\,\mathbf{j}$? If so, find one.

6. Is there a potential F(x, y) for $\mathbf{f}(x, y) = (x^3 \cos(xy) + 2x \sin(xy))\,\mathbf{i} + x^2 y \cos(xy)\,\mathbf{j}$? If so, find one.

7. Is there a potential F(x, y) for $\mathbf{f}(x, y) = (8xy + 3)\,\mathbf{i} + 4(x^2 + y)\,\mathbf{j}$? If so, find one.

8. Show that for any constants a, b and any closed simple curve C, $\oint_C a\,dx + b\,dy = 0$.

B

9. For the vector field $\mathbf{f}$ as in Example 4.8, show directly that $\oint_C \mathbf{f} \cdot d\mathbf{r} = 0$, where C is the boundary of the annulus $R = \{ (x, y) : 1/4 \le x^2 + y^2 \le 1 \}$ traversed so that R is always on the left.

10. Evaluate $\oint_C e^x \sin y\,dx + (y^3 + e^x \cos y)\,dy$, where C is the boundary of the rectangle with vertices (1, −1), (1, 1), (−1, 1) and (−1, −1), traversed counterclockwise.

C

11. For a region R bounded by a simple closed curve C, show that the area A of R is

$$A = -\oint_C y\,dx = \oint_C x\,dy = \frac{1}{2} \oint_C x\,dy - y\,dx ,$$

where C is traversed so that R is always on the left. (Hint: Use Green’s Theorem and the fact that $A = \iint_R 1\,dA$.)


4.4 Surface Integrals and the Divergence Theorem

In Section 4.1 we learned how to integrate along a curve. We will now learn how to perform integration over a surface in $\mathbb{R}^3$, such as a sphere or a paraboloid. Recall from Section 1.8 how we identified points (x, y, z) on a curve C in $\mathbb{R}^3$, parametrized by x = x(t), y = y(t), z = z(t), a ≤ t ≤ b, with the terminal points of the position vector $\mathbf{r}(t) = x(t)\,\mathbf{i} + y(t)\,\mathbf{j} + z(t)\,\mathbf{k}$ for t in [a, b].

The idea behind a parametrization of a curve is that it “transforms” a subset of $\mathbb{R}^1$ (normally an interval [a, b]) into a curve in $\mathbb{R}^2$ or $\mathbb{R}^3$ (see Figure 4.4.1).

[Figure 4.4.1: Parametrization of a curve C in $\mathbb{R}^3$, mapping t in [a, b] to the point (x(t), y(t), z(t)) with position vector $\mathbf{r}(t)$]

Similar to how we used a parametrization of a curve to define the line integral along the curve, we will use a parametrization of a surface to define a surface integral. We will use two variables, u and v, to parametrize a surface Σ in $\mathbb{R}^3$: x = x(u, v), y = y(u, v), z = z(u, v), for (u, v) in some region R in $\mathbb{R}^2$ (see Figure 4.4.2).

[Figure 4.4.2: Parametrization of a surface Σ in $\mathbb{R}^3$, mapping (u, v) in a region R of $\mathbb{R}^2$ to the point with position vector $\mathbf{r}(u, v)$]

In this case, the position vector of a point on the surface Σ is given by the vector-

valued function

r(u, v) = x(u, v)i + y(u, v)j + z(u, v)k for (u, v) in R.


Since $\mathbf{r}(u, v)$ is a function of two variables, define the partial derivatives $\frac{\partial \mathbf{r}}{\partial u}$ and $\frac{\partial \mathbf{r}}{\partial v}$ for (u, v) in R by

$$\frac{\partial \mathbf{r}}{\partial u}(u, v) = \frac{\partial x}{\partial u}(u, v)\,\mathbf{i} + \frac{\partial y}{\partial u}(u, v)\,\mathbf{j} + \frac{\partial z}{\partial u}(u, v)\,\mathbf{k} , \text{ and}$$

$$\frac{\partial \mathbf{r}}{\partial v}(u, v) = \frac{\partial x}{\partial v}(u, v)\,\mathbf{i} + \frac{\partial y}{\partial v}(u, v)\,\mathbf{j} + \frac{\partial z}{\partial v}(u, v)\,\mathbf{k} .$$

The parametrization of Σ can be thought of as “transforming” a region in $\mathbb{R}^2$ (in the uv-plane) into a 2-dimensional surface in $\mathbb{R}^3$. This parametrization of the surface is sometimes called a patch, based on the idea of “patching” the region R onto Σ in the grid-like manner shown in Figure 4.4.2.

In fact, those gridlines in R lead us to how we will define a surface integral over Σ. Along the vertical gridlines in R, the variable u is constant. So those lines get mapped to curves on Σ, and the variable u is constant along the position vector $\mathbf{r}(u, v)$. Thus, the tangent vector to those curves at a point (u, v) is $\frac{\partial \mathbf{r}}{\partial v}$. Similarly, the horizontal gridlines in R get mapped to curves on Σ whose tangent vectors are $\frac{\partial \mathbf{r}}{\partial u}$.

Now take a point (u, v) in R as, say, the lower left corner of one of the rectangular grid

sections in R, as shown in Figure 4.4.2. Suppose that this rectangle has a small width

and height of ∆u and ∆v, respectively. The corner points of that rectangle are (u, v),

(u +∆u, v), (u +∆u, v +∆v) and (u, v +∆v). So the area of that rectangle is A = ∆u ∆v. Then

that rectangle gets mapped by the parametrization onto some section of the surface

Σ which, for ∆u and ∆v small enough, will have a surface area (call it dσ) that is very

close to the area of the parallelogram which has adjacent sides r(u + ∆u, v) − r(u, v)

(corresponding to the line segment from (u, v) to (u + ∆u, v) in R) and r(u, v + ∆v) − r(u, v)

(corresponding to the line segment from (u, v) to (u, v + ∆v) in R). But by combining our

usual notion of a partial derivative (see Definition 2.3 in Section 2.2) with that of the

derivative of a vector-valued function (see Definition 1.12 in Section 1.8) applied to a

function of two variables, we have

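$$\frac{\partial \mathbf{r}}{\partial u} \approx \frac{\mathbf{r}(u + \Delta u, v) - \mathbf{r}(u, v)}{\Delta u} , \text{ and } \frac{\partial \mathbf{r}}{\partial v} \approx \frac{\mathbf{r}(u, v + \Delta v) - \mathbf{r}(u, v)}{\Delta v} ,$$

and so the surface area element dσ is approximately

$$\bigl\| (\mathbf{r}(u + \Delta u, v) - \mathbf{r}(u, v)) \times (\mathbf{r}(u, v + \Delta v) - \mathbf{r}(u, v)) \bigr\| \approx \left\| \left( \Delta u\,\frac{\partial \mathbf{r}}{\partial u} \right) \times \left( \Delta v\,\frac{\partial \mathbf{r}}{\partial v} \right) \right\| = \left\| \frac{\partial \mathbf{r}}{\partial u} \times \frac{\partial \mathbf{r}}{\partial v} \right\| \Delta u\,\Delta v$$

by Theorem 1.13 in Section 1.4. Thus, the total surface area S of Σ is approximately the sum of all the quantities $\left\| \frac{\partial \mathbf{r}}{\partial u} \times \frac{\partial \mathbf{r}}{\partial v} \right\| \Delta u\,\Delta v$, summed over the rectangles in R. Taking the limit of that sum as the diagonal of the largest rectangle goes to 0 gives

$$S = \iint_R \left\| \frac{\partial \mathbf{r}}{\partial u} \times \frac{\partial \mathbf{r}}{\partial v} \right\| du\,dv . \tag{4.26}$$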


We will write the double integral on the right using the special notation

$$\iint_\Sigma d\sigma = \iint_R \left\| \frac{\partial \mathbf{r}}{\partial u} \times \frac{\partial \mathbf{r}}{\partial v} \right\| du\,dv . \tag{4.27}$$

This is a special case of a surface integral over the surface Σ, where the surface area element dσ can be thought of as 1 dσ. Replacing 1 by a general real-valued function f(x, y, z) defined in $\mathbb{R}^3$, we have the following:

Definition 4.3. Let Σ be a surface in $\mathbb{R}^3$ parametrized by x = x(u, v), y = y(u, v), z = z(u, v), for (u, v) in some region R in $\mathbb{R}^2$. Let $\mathbf{r}(u, v) = x(u, v)\,\mathbf{i} + y(u, v)\,\mathbf{j} + z(u, v)\,\mathbf{k}$ be the position vector for any point on Σ, and let f(x, y, z) be a real-valued function defined on some subset of $\mathbb{R}^3$ that contains Σ. The surface integral of f(x, y, z) over Σ is

$$\iint_\Sigma f(x, y, z)\,d\sigma = \iint_R f(x(u, v), y(u, v), z(u, v)) \left\| \frac{\partial \mathbf{r}}{\partial u} \times \frac{\partial \mathbf{r}}{\partial v} \right\| du\,dv . \tag{4.28}$$

In particular, the surface area S of Σ is

$$S = \iint_\Sigma 1\,d\sigma . \tag{4.29}$$

Example 4.9. A torus T is a surface obtained by revolving a circle of radius a in the yz-plane around the z-axis, where the circle’s center is at a distance b from the z-axis (0 < a < b), as in Figure 4.4.3. Find the surface area of T.

[Figure 4.4.3: (a) Circle $(y - b)^2 + z^2 = a^2$ in the yz-plane, with angle u measured at its center. (b) Torus T, with angle v measured from the positive x-axis and a point (x, y, z) on T.]

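Solution: For any point on the circle, the line segment from the center of the circle to that point makes an angle u with the y-axis in the positive y direction (see Figure 4.4.3(a)). And as the circle revolves around the z-axis, the line segment from the origin to the center of that circle sweeps out an angle v with the positive x-axis (see Figure 4.4.3(b)). Thus, the torus can be parametrized as:

$$x = (b + a \cos u) \cos v , \quad y = (b + a \cos u) \sin v , \quad z = a \sin u , \quad 0 \le u \le 2\pi , \quad 0 \le v \le 2\pi$$

So for the position vector

$$\begin{aligned}
\mathbf{r}(u, v) &= x(u, v)\,\mathbf{i} + y(u, v)\,\mathbf{j} + z(u, v)\,\mathbf{k} \\
&= (b + a \cos u) \cos v\,\mathbf{i} + (b + a \cos u) \sin v\,\mathbf{j} + a \sin u\,\mathbf{k}
\end{aligned}$$

we see that

$$\frac{\partial \mathbf{r}}{\partial u} = -a \sin u \cos v\,\mathbf{i} - a \sin u \sin v\,\mathbf{j} + a \cos u\,\mathbf{k}$$

$$\frac{\partial \mathbf{r}}{\partial v} = -(b + a \cos u) \sin v\,\mathbf{i} + (b + a \cos u) \cos v\,\mathbf{j} + 0\,\mathbf{k} ,$$

and so computing the cross product gives

$$\frac{\partial \mathbf{r}}{\partial u} \times \frac{\partial \mathbf{r}}{\partial v} = -a(b + a \cos u) \cos v \cos u\,\mathbf{i} - a(b + a \cos u) \sin v \cos u\,\mathbf{j} - a(b + a \cos u) \sin u\,\mathbf{k} ,$$

which has magnitude

$$\left\| \frac{\partial \mathbf{r}}{\partial u} \times \frac{\partial \mathbf{r}}{\partial v} \right\| = a(b + a \cos u) .$$

Thus, the surface area of T is

$$\begin{aligned}
S &= \iint_\Sigma 1\,d\sigma \\
&= \int_0^{2\pi} \int_0^{2\pi} \left\| \frac{\partial \mathbf{r}}{\partial u} \times \frac{\partial \mathbf{r}}{\partial v} \right\| du\,dv \\
&= \int_0^{2\pi} \int_0^{2\pi} a(b + a \cos u)\,du\,dv \\
&= \int_0^{2\pi} \left( abu + a^2 \sin u \,\Big|_{u=0}^{u=2\pi} \right) dv \\
&= \int_0^{2\pi} 2\pi ab\,dv \\
&= 4\pi^2 ab
\end{aligned}$$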
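The torus computation can be cross-checked numerically by summing ‖∂r/∂u × ∂r/∂v‖ Δu Δv over a grid, exactly as in the derivation of (4.26). The sketch below assumes sample values a = 1 and b = 2 (chosen here for illustration only); `cross` and `torus_area` are hypothetical helper names.

```python
import math

def cross(p, q):
    """Cross product of two 3-vectors given as tuples."""
    return (p[1] * q[2] - p[2] * q[1],
            p[2] * q[0] - p[0] * q[2],
            p[0] * q[1] - p[1] * q[0])

def torus_area(a, b, n=200):
    """Midpoint-rule approximation of the surface area of the torus,
    using the partial derivatives of r(u, v) from Example 4.9."""
    h = 2 * math.pi / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * h
        for j in range(n):
            v = (j + 0.5) * h
            r_u = (-a * math.sin(u) * math.cos(v),
                   -a * math.sin(u) * math.sin(v),
                   a * math.cos(u))
            r_v = (-(b + a * math.cos(u)) * math.sin(v),
                   (b + a * math.cos(u)) * math.cos(v),
                   0.0)
            normal = cross(r_u, r_v)
            total += math.sqrt(sum(c * c for c in normal)) * h * h
    return total

area = torus_area(1.0, 2.0)            # assumed sample radii a = 1, b = 2
print(area, 4 * math.pi**2 * 1.0 * 2.0)
```

The two printed numbers should agree closely, consistent with the formula S = 4π²ab.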

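Since $\frac{\partial \mathbf{r}}{\partial u}$ and $\frac{\partial \mathbf{r}}{\partial v}$ are tangent to the surface Σ (i.e. lie in the tangent plane to Σ at each point on Σ), their cross product $\frac{\partial \mathbf{r}}{\partial u} \times \frac{\partial \mathbf{r}}{\partial v}$ is perpendicular to the tangent plane to the surface at each point of Σ. Thus,

$$\iint_\Sigma f(x, y, z)\,d\sigma = \iint_R f(x(u, v), y(u, v), z(u, v))\,\|\mathbf{n}\|\,du\,dv ,$$

where $\mathbf{n} = \frac{\partial \mathbf{r}}{\partial u} \times \frac{\partial \mathbf{r}}{\partial v}$. We say that $\mathbf{n}$ is a normal vector to Σ.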

[Figure 4.4.4: outward normal vectors on a sphere]

Recall that normal vectors to a plane can point in two

opposite directions. By an outward unit normal vector

to a surface Σ, we will mean the unit vector that is normal

to Σ and points away from the “top” (or “outer” part) of the

surface. This is a hazy definition, but the picture in Figure

4.4.4 gives a better idea of what outward normal vectors

look like, in the case of a sphere. With this idea in mind,

we make the following definition of a surface integral of a

3-dimensional vector field over a surface:

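Definition 4.4. Let Σ be a surface in $\mathbb{R}^3$ and let $\mathbf{f}(x, y, z) = f_1(x, y, z)\,\mathbf{i} + f_2(x, y, z)\,\mathbf{j} + f_3(x, y, z)\,\mathbf{k}$ be a vector field defined on some subset of $\mathbb{R}^3$ that contains Σ. The surface integral of $\mathbf{f}$ over Σ is

$$\iint_\Sigma \mathbf{f} \cdot d\boldsymbol{\sigma} = \iint_\Sigma \mathbf{f} \cdot \mathbf{n}\,d\sigma , \tag{4.30}$$

where, at any point on Σ, $\mathbf{n}$ is the outward unit normal vector to Σ.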

Note in the above definition that the dot product inside the integral on the right is

a real-valued function, and hence we can use Definition 4.3 to evaluate the integral.

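Example 4.10. Evaluate the surface integral $\iint_\Sigma \mathbf{f} \cdot d\boldsymbol{\sigma}$, where $\mathbf{f}(x, y, z) = yz\,\mathbf{i} + xz\,\mathbf{j} + xy\,\mathbf{k}$ and Σ is the part of the plane x + y + z = 1 with x ≥ 0, y ≥ 0, and z ≥ 0, with the outward unit normal $\mathbf{n}$ pointing in the positive z direction (see Figure 4.4.5).

[Figure 4.4.5: the triangular surface Σ on the plane x + y + z = 1 in the first octant, with unit normal $\mathbf{n}$]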

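Solution: Since the vector $\mathbf{v} = (1, 1, 1)$ is normal to the plane x + y + z = 1 (why?), then dividing $\mathbf{v}$ by its length yields the outward unit normal vector $\mathbf{n} = \left( \frac{1}{\sqrt{3}}, \frac{1}{\sqrt{3}}, \frac{1}{\sqrt{3}} \right)$. We now need to parametrize Σ. As we can see from Figure 4.4.5, projecting Σ onto the xy-plane yields a triangular region $R = \{ (x, y) : 0 \le x \le 1,\ 0 \le y \le 1 - x \}$. Thus, using (u, v) instead of (x, y), we see that

$$x = u , \quad y = v , \quad z = 1 - (u + v) , \quad \text{for } 0 \le u \le 1 ,\ 0 \le v \le 1 - u$$

is a parametrization of Σ over R (since z = 1 − (x + y) on Σ). So on Σ,

$$\begin{aligned}
\mathbf{f} \cdot \mathbf{n} &= (yz, xz, xy) \cdot \left( \frac{1}{\sqrt{3}}, \frac{1}{\sqrt{3}}, \frac{1}{\sqrt{3}} \right) = \frac{1}{\sqrt{3}} (yz + xz + xy) \\
&= \frac{1}{\sqrt{3}} ((x + y)z + xy) = \frac{1}{\sqrt{3}} ((u + v)(1 - (u + v)) + uv) \\
&= \frac{1}{\sqrt{3}} ((u + v) - (u + v)^2 + uv)
\end{aligned}$$

for (u, v) in R, and for $\mathbf{r}(u, v) = x(u, v)\,\mathbf{i} + y(u, v)\,\mathbf{j} + z(u, v)\,\mathbf{k} = u\,\mathbf{i} + v\,\mathbf{j} + (1 - (u + v))\,\mathbf{k}$ we have

$$\frac{\partial \mathbf{r}}{\partial u} \times \frac{\partial \mathbf{r}}{\partial v} = (1, 0, -1) \times (0, 1, -1) = (1, 1, 1) \;\Rightarrow\; \left\| \frac{\partial \mathbf{r}}{\partial u} \times \frac{\partial \mathbf{r}}{\partial v} \right\| = \sqrt{3} .$$

Thus, integrating over R using vertical slices (e.g. as indicated by the dashed line in Figure 4.4.5) gives

$$\begin{aligned}
\iint_\Sigma \mathbf{f} \cdot d\boldsymbol{\sigma} &= \iint_\Sigma \mathbf{f} \cdot \mathbf{n}\,d\sigma \\
&= \iint_R (\mathbf{f}(x(u, v), y(u, v), z(u, v)) \cdot \mathbf{n}) \left\| \frac{\partial \mathbf{r}}{\partial u} \times \frac{\partial \mathbf{r}}{\partial v} \right\| dv\,du \\
&= \int_0^1 \int_0^{1-u} \frac{1}{\sqrt{3}} ((u + v) - (u + v)^2 + uv)\,\sqrt{3}\,dv\,du \\
&= \int_0^1 \left( \frac{(u + v)^2}{2} - \frac{(u + v)^3}{3} + \frac{uv^2}{2} \,\Bigg|_{v=0}^{v=1-u} \right) du \\
&= \int_0^1 \left( \frac{1}{6} + \frac{u}{2} - \frac{3u^2}{2} + \frac{5u^3}{6} \right) du \\
&= \frac{u}{6} + \frac{u^2}{4} - \frac{u^3}{2} + \frac{5u^4}{24} \,\Bigg|_0^1 = \frac{1}{8} .
\end{aligned}$$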
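The value 1/8 in Example 4.10 can be checked by approximating the double integral over the triangular region R directly. This is a rough sketch using a midpoint rule; the function name `flux_example_4_10` and the grid size are choices made here, not part of the text.

```python
import math

def flux_example_4_10(n=400):
    """Midpoint-rule approximation of the integral of (f . n) * ||r_u x r_v||
    over the triangle 0 <= u <= 1, 0 <= v <= 1 - u from Example 4.10."""
    total = 0.0
    du = 1.0 / n
    for i in range(n):
        u = (i + 0.5) * du
        dv = (1.0 - u) / n          # each vertical slice has its own v-step
        for j in range(n):
            v = (j + 0.5) * dv
            f_dot_n = ((u + v) - (u + v)**2 + u * v) / math.sqrt(3)
            total += f_dot_n * math.sqrt(3) * dv * du
    return total

approx = flux_example_4_10()
print(approx)
```

The printed value should be close to 0.125, matching the exact answer 1/8.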

Computing surface integrals can often be tedious, especially when the formula for the outward unit normal vector at each point of Σ changes. The following theorem provides an easier way in the case when Σ is a closed surface, that is, when Σ encloses a bounded solid in $\mathbb{R}^3$. For example, spheres, cubes, and ellipsoids are closed surfaces, but planes and paraboloids are not.


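Theorem 4.8. (Divergence Theorem) Let Σ be a closed surface in $\mathbb{R}^3$ which bounds a solid S, and let $\mathbf{f}(x, y, z) = f_1(x, y, z)\,\mathbf{i} + f_2(x, y, z)\,\mathbf{j} + f_3(x, y, z)\,\mathbf{k}$ be a vector field defined on some subset of $\mathbb{R}^3$ that contains Σ. Then

$$\iint_\Sigma \mathbf{f} \cdot d\boldsymbol{\sigma} = \iiint_S \operatorname{div} \mathbf{f}\,dV , \tag{4.31}$$

where

$$\operatorname{div} \mathbf{f} = \frac{\partial f_1}{\partial x} + \frac{\partial f_2}{\partial y} + \frac{\partial f_3}{\partial z} \tag{4.32}$$

is called the divergence of $\mathbf{f}$.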

The proof of the Divergence Theorem is very similar to the proof of Green’s Theorem, i.e. it is first proved for the simple case when the solid S is bounded above by one surface, bounded below by another surface, and bounded laterally by one or more surfaces. The proof can then be extended to more general solids.³

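Example 4.11. Evaluate $\iint_\Sigma \mathbf{f} \cdot d\boldsymbol{\sigma}$, where $\mathbf{f}(x, y, z) = x\,\mathbf{i} + y\,\mathbf{j} + z\,\mathbf{k}$ and Σ is the unit sphere $x^2 + y^2 + z^2 = 1$.

Solution: We see that div $\mathbf{f}$ = 1 + 1 + 1 = 3, so

$$\begin{aligned}
\iint_\Sigma \mathbf{f} \cdot d\boldsymbol{\sigma} &= \iiint_S \operatorname{div} \mathbf{f}\,dV = \iiint_S 3\,dV \\
&= 3 \iiint_S 1\,dV = 3 \operatorname{vol}(S) = 3 \cdot \frac{4\pi (1)^3}{3} = 4\pi .
\end{aligned}$$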
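The answer 4π can be compared against a direct evaluation of the surface integral, without the Divergence Theorem. The sketch below assumes the standard spherical-coordinate parametrization of the unit sphere, for which the outward unit normal at (x, y, z) is (x, y, z) itself and the area element is sin φ dφ dθ; `sphere_flux` is a hypothetical helper name.

```python
import math

def sphere_flux(f, n_phi=400, n_theta=100):
    """Flux of f through the unit sphere, using x = sin(phi)cos(theta),
    y = sin(phi)sin(theta), z = cos(phi) and a midpoint rule."""
    d_phi = math.pi / n_phi
    d_theta = 2 * math.pi / n_theta
    total = 0.0
    for i in range(n_phi):
        phi = (i + 0.5) * d_phi
        for j in range(n_theta):
            theta = (j + 0.5) * d_theta
            x = math.sin(phi) * math.cos(theta)
            y = math.sin(phi) * math.sin(theta)
            z = math.cos(phi)
            fx, fy, fz = f(x, y, z)
            f_dot_n = fx * x + fy * y + fz * z    # n = (x, y, z) on the unit sphere
            total += f_dot_n * math.sin(phi) * d_phi * d_theta
    return total

flux = sphere_flux(lambda x, y, z: (x, y, z))
print(flux, 4 * math.pi)
```

The two printed values should agree to several decimal places, as the Divergence Theorem predicts.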

In physical applications, the surface integral $\iint_\Sigma \mathbf{f} \cdot d\boldsymbol{\sigma}$ is often referred to as the flux of $\mathbf{f}$ through the surface Σ. For example, if $\mathbf{f}$ represents the velocity field of a fluid, then the flux is the net quantity of fluid to flow through the surface Σ per unit time. A positive flux means there is a net flow out of the surface (i.e. in the direction of the outward unit normal vector $\mathbf{n}$), while a negative flux indicates a net flow inward (in the direction of $-\mathbf{n}$).

The term divergence comes from interpreting div $\mathbf{f}$ as a measure of how much a vector field “diverges” from a point. This is best seen by using another definition of div $\mathbf{f}$ which is equivalent⁴ to the definition given by formula (4.32). Namely, for a point (x, y, z) in $\mathbb{R}^3$,

$$\operatorname{div} \mathbf{f}(x, y, z) = \lim_{V \to 0} \frac{1}{V} \iint_\Sigma \mathbf{f} \cdot d\boldsymbol{\sigma} , \tag{4.33}$$

³ See TAYLOR and MANN, § 15.6 for the details.

⁴ See SCHEY, p. 36-39, for an intuitive discussion of this.


where V is the volume enclosed by a closed surface Σ around the point (x, y, z). In the

limit, V → 0 means that we take smaller and smaller closed surfaces around (x, y, z),

which means that the volumes they enclose are going to zero. It can be shown that this

limit is independent of the shapes of those surfaces. Notice that the limit being taken

is of the ratio of the flux through a surface to the volume enclosed by that surface,

which gives a rough measure of the flow “leaving” a point, as we mentioned. Vector

fields which have zero divergence are often called solenoidal fields.
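Formula (4.33) can be illustrated numerically by computing the flux through a small cube around a point and dividing by the cube's volume. In this sketch the field f = (x²y, yz, z²), the point (1, 2, 3), and the cube side length are all assumptions chosen here for illustration; by formula (4.32) this field has div f = 2xy + 3z, which equals 13 at that point.

```python
def f(x, y, z):
    """Illustrative vector field (an assumption, not from the text)."""
    return (x * x * y, y * z, z * z)

def flux_through_cube(center, h, m=10):
    """Approximate outward flux of f through the surface of the cube of side h
    centered at `center`, using an m-by-m midpoint grid on each face."""
    s = h / m
    total = 0.0
    for axis in range(3):
        for sign in (+1, -1):            # the two faces perpendicular to this axis
            for i in range(m):
                for j in range(m):
                    p = list(center)
                    p[axis] += sign * h / 2
                    p[(axis + 1) % 3] += -h / 2 + (i + 0.5) * s
                    p[(axis + 2) % 3] += -h / 2 + (j + 0.5) * s
                    # Outward unit normal is sign * e_axis, so f . n = sign * f[axis].
                    total += sign * f(*p)[axis] * s * s
    return total

def div_estimate(center, h):
    """Ratio of flux to enclosed volume, as in formula (4.33)."""
    return flux_through_cube(center, h) / h**3

estimate = div_estimate((1.0, 2.0, 3.0), 0.01)
print(estimate)
```

The printed ratio should be close to 13, the value given by formula (4.32) at the chosen point.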

The following theorem is a simple consequence of formula (4.33).

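Theorem 4.9. If the flux of a vector field $\mathbf{f}$ is zero through every closed surface containing a given point, then div $\mathbf{f}$ = 0 at that point.

Proof: By formula (4.33), at the given point (x, y, z) we have

$$\begin{aligned}
\operatorname{div} \mathbf{f}(x, y, z) &= \lim_{V \to 0} \frac{1}{V} \iint_\Sigma \mathbf{f} \cdot d\boldsymbol{\sigma} \quad \text{for closed surfaces } \Sigma \text{ containing } (x, y, z) \text{, so} \\
&= \lim_{V \to 0} \frac{1}{V} (0) \quad \text{by our assumption that the flux through each } \Sigma \text{ is zero, so} \\
&= \lim_{V \to 0} 0 \\
&= 0 .
\end{aligned}$$

QED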

Lastly, we note that sometimes the notation $\oiint_\Sigma f(x, y, z)\,d\sigma$ and $\oiint_\Sigma \mathbf{f} \cdot d\boldsymbol{\sigma}$ is used to denote surface integrals of scalar and vector fields, respectively, over closed surfaces. Especially in physics texts, it is common to see simply $\oint_\Sigma$ instead of $\oiint_\Sigma$.


Exercises

A

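For Exercises 1-4, use the Divergence Theorem to evaluate the surface integral $\iint_\Sigma \mathbf{f} \cdot d\boldsymbol{\sigma}$ of the given vector field $\mathbf{f}(x, y, z)$ over the surface Σ.

1. $\mathbf{f}(x, y, z) = x\,\mathbf{i} + 2y\,\mathbf{j} + 3z\,\mathbf{k}$, Σ : $x^2 + y^2 + z^2 = 9$

2. $\mathbf{f}(x, y, z) = x\,\mathbf{i} + y\,\mathbf{j} + z\,\mathbf{k}$, Σ : boundary of the solid cube $S = \{ (x, y, z) : 0 \le x, y, z \le 1 \}$

3. $\mathbf{f}(x, y, z) = x^3\,\mathbf{i} + y^3\,\mathbf{j} + z^3\,\mathbf{k}$, Σ : $x^2 + y^2 + z^2 = 1$

4. $\mathbf{f}(x, y, z) = 2\,\mathbf{i} + 3\,\mathbf{j} + 5\,\mathbf{k}$, Σ : $x^2 + y^2 + z^2 = 1$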


B

5. Show that the flux of any constant vector field through any closed surface is zero.

6. Evaluate the surface integral from Exercise 2 without using the Divergence Theorem, i.e. using only Definition 4.3, as in Example 4.10. Note that there will be a different outward unit normal vector to each of the six faces of the cube.

7. Evaluate the surface integral $\iint_\Sigma \mathbf{f} \cdot d\boldsymbol{\sigma}$, where $\mathbf{f}(x, y, z) = x^2\,\mathbf{i} + xy\,\mathbf{j} + z\,\mathbf{k}$ and Σ is the part of the plane 6x + 3y + 2z = 6 with x ≥ 0, y ≥ 0, and z ≥ 0, with the outward unit normal $\mathbf{n}$ pointing in the positive z direction.

8. Use a surface integral to show that the surface area of a sphere of radius r is $4\pi r^2$. (Hint: Use spherical coordinates to parametrize the sphere.)

9. Use a surface integral to show that the surface area of a right circular cone of radius R and height h is $\pi R \sqrt{h^2 + R^2}$. (Hint: Use the parametrization x = r cos θ, y = r sin θ, $z = \frac{h}{R} r$, for 0 ≤ r ≤ R and 0 ≤ θ ≤ 2π.)

10. The ellipsoid $\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1$ can be parametrized using ellipsoidal coordinates

$$x = a \sin\phi \cos\theta , \quad y = b \sin\phi \sin\theta , \quad z = c \cos\phi , \quad \text{for } 0 \le \theta \le 2\pi \text{ and } 0 \le \phi \le \pi.$$

Show that the surface area S of the ellipsoid is

$$S = \int_0^{\pi} \int_0^{2\pi} \sin\phi \sqrt{a^2 b^2 \cos^2\phi + c^2 (a^2 \sin^2\theta + b^2 \cos^2\theta) \sin^2\phi}\;d\theta\,d\phi .$$

(Note: The above double integral cannot be evaluated by elementary means. For specific values of a, b and c it can be evaluated using numerical methods. An alternative is to express the surface area in terms of elliptic integrals.⁵)

C

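11. Use Definition 4.3 to prove that the surface area S over a region R in $\mathbb{R}^2$ of a surface z = f(x, y) is given by the formula

$$S = \iint_R \sqrt{1 + \left( \frac{\partial f}{\partial x} \right)^2 + \left( \frac{\partial f}{\partial y} \right)^2}\;dA .$$

(Hint: Think of the parametrization of the surface.)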

⁵ BOWMAN, F., Introduction to Elliptic Functions, with Applications, New York: Dover, 1961, § III.7.


4.5 Stokes’ Theorem

So far the only types of line integrals which we have discussed are those along curves in $\mathbb{R}^2$. But the definitions and properties which were covered in Sections 4.1 and 4.2 can easily be extended to include functions of three variables, so that we can now discuss line integrals along curves in $\mathbb{R}^3$.

Definition 4.5. For a real-valued function $f(x, y, z)$ and a curve $C$ in $\mathbb{R}^3$, parametrized by $x = x(t)$, $y = y(t)$, $z = z(t)$, $a \le t \le b$, the line integral of $f(x, y, z)$ along $C$ with respect to arc length $s$ is

$$\int_C f(x, y, z)\,ds = \int_a^b f(x(t), y(t), z(t)) \sqrt{x'(t)^2 + y'(t)^2 + z'(t)^2}\;dt . \tag{4.34}$$

The line integral of $f(x, y, z)$ along $C$ with respect to $x$ is

$$\int_C f(x, y, z)\,dx = \int_a^b f(x(t), y(t), z(t))\,x'(t)\,dt . \tag{4.35}$$

The line integral of $f(x, y, z)$ along $C$ with respect to $y$ is

$$\int_C f(x, y, z)\,dy = \int_a^b f(x(t), y(t), z(t))\,y'(t)\,dt . \tag{4.36}$$

The line integral of $f(x, y, z)$ along $C$ with respect to $z$ is

$$\int_C f(x, y, z)\,dz = \int_a^b f(x(t), y(t), z(t))\,z'(t)\,dt . \tag{4.37}$$

Similar to the two-variable case, if $f(x, y, z) \ge 0$ then the line integral $\int_C f(x, y, z)\,ds$ can be thought of as the total area of the “picket fence” of height $f(x, y, z)$ at each point along the curve $C$ in $\mathbb{R}^3$.

Vector fields in $\mathbb{R}^3$ are defined in a similar fashion to those in $\mathbb{R}^2$, which allows us to define the line integral of a vector field along a curve in $\mathbb{R}^3$.

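Definition 4.6. For a vector field $\mathbf{f}(x, y, z) = P(x, y, z)\,\mathbf{i} + Q(x, y, z)\,\mathbf{j} + R(x, y, z)\,\mathbf{k}$ and a curve $C$ in $\mathbb{R}^3$ with a smooth parametrization $x = x(t)$, $y = y(t)$, $z = z(t)$, $a \le t \le b$, the line integral of $\mathbf{f}$ along $C$ is

$$\int_C \mathbf{f} \cdot d\mathbf{r} = \int_C P(x, y, z)\,dx + \int_C Q(x, y, z)\,dy + \int_C R(x, y, z)\,dz \tag{4.38}$$

$$= \int_a^b \mathbf{f}(x(t), y(t), z(t)) \cdot \mathbf{r}'(t)\,dt , \tag{4.39}$$

where $\mathbf{r}(t) = x(t)\,\mathbf{i} + y(t)\,\mathbf{j} + z(t)\,\mathbf{k}$ is the position vector for points on $C$.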

166 CHAPTER 4. LINE AND SURFACE INTEGRALS

Similar to the two-variable case, if $\mathbf{f}(x, y, z)$ represents the force applied to an object at a point $(x, y, z)$ then the line integral $\int_C \mathbf{f} \cdot d\mathbf{r}$ represents the work done by that force in moving the object along the curve $C$ in $\mathbb{R}^3$.

Some of the most important results we will need for line integrals in $\mathbb{R}^3$ are stated below without proof (the proofs are similar to their two-variable equivalents).

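Theorem 4.10. For a vector field $\mathbf{f}(x, y, z) = P(x, y, z)\,\mathbf{i} + Q(x, y, z)\,\mathbf{j} + R(x, y, z)\,\mathbf{k}$ and a curve $C$ with a smooth parametrization $x = x(t)$, $y = y(t)$, $z = z(t)$, $a \le t \le b$ and position vector $\mathbf{r}(t) = x(t)\,\mathbf{i} + y(t)\,\mathbf{j} + z(t)\,\mathbf{k}$,

$$\int_C \mathbf{f} \cdot d\mathbf{r} = \int_C \mathbf{f} \cdot \mathbf{T}\,ds , \tag{4.40}$$

where $\mathbf{T}(t) = \dfrac{\mathbf{r}'(t)}{\lVert \mathbf{r}'(t) \rVert}$ is the unit tangent vector to $C$ at $(x(t), y(t), z(t))$.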

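Theorem 4.11. (Chain Rule) If $w = f(x, y, z)$ is a continuously differentiable function of $x$, $y$, and $z$, and $x = x(t)$, $y = y(t)$ and $z = z(t)$ are differentiable functions of $t$, then $w$ is a differentiable function of $t$, and

$$\frac{dw}{dt} = \frac{\partial w}{\partial x}\frac{dx}{dt} + \frac{\partial w}{\partial y}\frac{dy}{dt} + \frac{\partial w}{\partial z}\frac{dz}{dt} . \tag{4.41}$$

Also, if $x = x(t_1, t_2)$, $y = y(t_1, t_2)$ and $z = z(t_1, t_2)$ are continuously differentiable functions of $(t_1, t_2)$, then⁶

$$\frac{\partial w}{\partial t_1} = \frac{\partial w}{\partial x}\frac{\partial x}{\partial t_1} + \frac{\partial w}{\partial y}\frac{\partial y}{\partial t_1} + \frac{\partial w}{\partial z}\frac{\partial z}{\partial t_1} \tag{4.42}$$

and

$$\frac{\partial w}{\partial t_2} = \frac{\partial w}{\partial x}\frac{\partial x}{\partial t_2} + \frac{\partial w}{\partial y}\frac{\partial y}{\partial t_2} + \frac{\partial w}{\partial z}\frac{\partial z}{\partial t_2} . \tag{4.43}$$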

Theorem 4.12. Let $\mathbf{f}(x, y, z) = P(x, y, z)\,\mathbf{i} + Q(x, y, z)\,\mathbf{j} + R(x, y, z)\,\mathbf{k}$ be a vector field in some solid $S$, with $P$, $Q$ and $R$ continuously differentiable functions on $S$. Let $C$ be a smooth curve in $S$ parametrized by $x = x(t)$, $y = y(t)$, $z = z(t)$, $a \le t \le b$. Suppose that there is a real-valued function $F(x, y, z)$ such that $\nabla F = \mathbf{f}$ on $S$. Then

$$\int_C \mathbf{f} \cdot d\mathbf{r} = F(B) - F(A) , \tag{4.44}$$

where $A = (x(a), y(a), z(a))$ and $B = (x(b), y(b), z(b))$ are the endpoints of $C$.

Corollary 4.13. If a vector field $\mathbf{f}$ has a potential in a solid $S$, then $\oint_C \mathbf{f} \cdot d\mathbf{r} = 0$ for any closed curve $C$ in $S$ (i.e. $\oint_C \nabla F \cdot d\mathbf{r} = 0$ for any real-valued function $F(x, y, z)$).

⁶ See TAYLOR and MANN, § 6.5 for a proof.


Example 4.12. Let $f(x, y, z) = z$ and let $C$ be the curve in $\mathbb{R}^3$ parametrized by

$$x = t \sin t , \quad y = t \cos t , \quad z = t , \quad 0 \le t \le 8\pi .$$

Evaluate $\int_C f(x, y, z)\,ds$. (Note: $C$ is called a conical helix. See Figure 4.5.1.)

Solution: Since $x'(t) = \sin t + t \cos t$, $y'(t) = \cos t - t \sin t$, and $z'(t) = 1$, we have

$$\begin{aligned}
x'(t)^2 + y'(t)^2 + z'(t)^2 &= (\sin^2 t + 2t \sin t \cos t + t^2 \cos^2 t) + (\cos^2 t - 2t \sin t \cos t + t^2 \sin^2 t) + 1 \\
&= t^2 (\sin^2 t + \cos^2 t) + \sin^2 t + \cos^2 t + 1 \\
&= t^2 + 2 ,
\end{aligned}$$

so since $f(x(t), y(t), z(t)) = z(t) = t$ along the curve $C$, then

$$\begin{aligned}
\int_C f(x, y, z)\,ds &= \int_0^{8\pi} f(x(t), y(t), z(t)) \sqrt{x'(t)^2 + y'(t)^2 + z'(t)^2}\;dt \\
&= \int_0^{8\pi} t \sqrt{t^2 + 2}\;dt \\
&= \left[ \tfrac{1}{3} (t^2 + 2)^{3/2} \right]_0^{8\pi} = \tfrac{1}{3} \left( (64\pi^2 + 2)^{3/2} - 2\sqrt{2} \right) .
\end{aligned}$$

Figure 4.5.1 Conical helix C (plotted on x, y, z axes, from t = 0 to t = 8π)
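The closed form in Example 4.12 can be sanity-checked numerically. The sketch below is only an illustration, not part of the text's method: it assumes a composite Simpson's rule (the helper name `simpson` and the subdivision count are choices made here) and applies formula (4.34) directly to the conical helix.

```python
import math

def simpson(g, a, b, n=20000):
    """Composite Simpson's rule for the integral of g over [a, b]; n must be even."""
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3

def integrand(t):
    # Conical helix of Example 4.12: x = t sin t, y = t cos t, z = t
    dx = math.sin(t) + t * math.cos(t)
    dy = math.cos(t) - t * math.sin(t)
    dz = 1.0
    speed = math.sqrt(dx * dx + dy * dy + dz * dz)  # analytically sqrt(t^2 + 2)
    return t * speed                                # f = z = t along C

numeric = simpson(integrand, 0.0, 8 * math.pi)
exact = ((64 * math.pi ** 2 + 2) ** 1.5 - 2 * math.sqrt(2)) / 3
print(numeric, exact)
```

The two printed values should agree to many digits, which is a quick way to catch an algebra slip in the arc length factor.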

Example 4.13. Let $\mathbf{f}(x, y, z) = x\,\mathbf{i} + y\,\mathbf{j} + 2z\,\mathbf{k}$ be a vector field in $\mathbb{R}^3$. Using the same curve $C$ from Example 4.12, evaluate $\int_C \mathbf{f} \cdot d\mathbf{r}$.

Solution: It is easy to see that $F(x, y, z) = \frac{x^2}{2} + \frac{y^2}{2} + z^2$ is a potential for $\mathbf{f}(x, y, z)$ (i.e. $\nabla F = \mathbf{f}$). So by Theorem 4.12 we know that

$$\begin{aligned}
\int_C \mathbf{f} \cdot d\mathbf{r} &= F(B) - F(A) , \text{ where } A = (x(0), y(0), z(0)) \text{ and } B = (x(8\pi), y(8\pi), z(8\pi)), \text{ so} \\
&= F(8\pi \sin 8\pi, \, 8\pi \cos 8\pi, \, 8\pi) - F(0 \sin 0, \, 0 \cos 0, \, 0) \\
&= F(0, 8\pi, 8\pi) - F(0, 0, 0) \\
&= 0 + \frac{(8\pi)^2}{2} + (8\pi)^2 - (0 + 0 + 0) = 96\pi^2 .
\end{aligned}$$
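As a cross-check of Example 4.13, the line integral can also be computed straight from formula (4.39) and compared with the potential difference $F(B) - F(A)$ from Theorem 4.12. This is a minimal sketch under the assumption that a composite Simpson's rule is accurate enough here; the helper names `simpson`, `r`, `dr`, `field` and `potential` are illustrative and do not come from the text.

```python
import math

def simpson(g, a, b, n=20000):
    """Composite Simpson's rule for the integral of g over [a, b]; n must be even."""
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3

def r(t):
    # Conical helix from Example 4.12
    return (t * math.sin(t), t * math.cos(t), t)

def dr(t):
    # Derivative r'(t) of the parametrization
    return (math.sin(t) + t * math.cos(t), math.cos(t) - t * math.sin(t), 1.0)

def field(x, y, z):
    # f = x i + y j + 2z k
    return (x, y, 2 * z)

def potential(x, y, z):
    # F = x^2/2 + y^2/2 + z^2, so that grad F = f
    return x * x / 2 + y * y / 2 + z * z

def integrand(t):
    # f(r(t)) . r'(t), as in formula (4.39)
    return sum(p * q for p, q in zip(field(*r(t)), dr(t)))

a, b = 0.0, 8 * math.pi
line_integral = simpson(integrand, a, b)
potential_difference = potential(*r(b)) - potential(*r(a))
print(line_integral, potential_difference, 96 * math.pi ** 2)
```

All three printed numbers should coincide up to rounding, illustrating that for a field with a potential only the endpoints of $C$ matter.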

We will now discuss a generalization of Green’s Theorem in $\mathbb{R}^2$ to orientable surfaces in $\mathbb{R}^3$, called Stokes’ Theorem. A surface $\Sigma$ in $\mathbb{R}^3$ is orientable if there is a continuous vector field $\mathbf{N}$ in $\mathbb{R}^3$ such that $\mathbf{N}$ is nonzero and normal to $\Sigma$ (i.e. perpendicular to the tangent plane) at each point of $\Sigma$. We say that such an $\mathbf{N}$ is a normal vector field.

Figure 4.5.2 (axes x, y, z; normal vectors N and −N on the sphere)

For example, the unit sphere $x^2 + y^2 + z^2 = 1$ is orientable, since the continuous vector field $\mathbf{N}(x, y, z) = x\,\mathbf{i} + y\,\mathbf{j} + z\,\mathbf{k}$ is nonzero and normal to the sphere at each point. In fact, $-\mathbf{N}(x, y, z)$ is another normal vector field (see Figure 4.5.2). We see in this case that $\mathbf{N}(x, y, z)$ is what we have called an outward normal vector, and $-\mathbf{N}(x, y, z)$ is an inward normal vector. These “outward” and “inward” normal vector fields on the sphere correspond to an “outer” and “inner” side, respectively, of the sphere. That is, we say that the sphere is a two-sided surface. Roughly, “two-sided” means “orientable”. Other examples of two-sided, and hence orientable, surfaces are cylinders, paraboloids, ellipsoids, and planes.

You may be wondering what kind of surface would not have two sides. An example is the Möbius strip, which is constructed by taking a thin rectangle and connecting its ends at the opposite corners, resulting in a “twisted” strip (see Figure 4.5.3).

Figure 4.5.3 Möbius strip: (a) Connect A to A and B to B along the ends; (b) Not orientable

If you imagine walking along a line down the center of the Möbius strip, as in Figure 4.5.3(b), then you arrive back at the same place from which you started but upside down! That is, your orientation changed even though your motion was continuous along that center line. Informally, thinking of your vertical direction as a normal vector field along the strip, there is a discontinuity at your starting point (and, in fact, at every point) since your vertical direction takes two different values there. The Möbius strip has only one side, and hence is nonorientable.⁷
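The “upside down” effect can be seen numerically. The sketch below assumes one standard parametrization of the Möbius strip, $\mathbf{r}(u, v) = \cos u\,(1 + v \cos\tfrac{u}{2})\,\mathbf{i} + \sin u\,(1 + v \cos\tfrac{u}{2})\,\mathbf{j} + v \sin\tfrac{u}{2}\,\mathbf{k}$, and estimates the unit normal $\mathbf{r}_u \times \mathbf{r}_v$ by central differences (the function names and step size are illustrative choices). Following the center line $v = 0$ once around, from $u = 0$ to $u = 2\pi$, returns to the same point but with the normal reversed.

```python
import math

def r(u, v):
    # Assumed Moebius strip parametrization, 0 <= u <= 2*pi, -1/2 <= v <= 1/2
    w = 1 + v * math.cos(u / 2)
    return (math.cos(u) * w, math.sin(u) * w, v * math.sin(u / 2))

def unit_normal(u, v, h=1e-6):
    """Unit vector along r_u x r_v, with partial derivatives from central differences."""
    ru = [(p - q) / (2 * h) for p, q in zip(r(u + h, v), r(u - h, v))]
    rv = [(p - q) / (2 * h) for p, q in zip(r(u, v + h), r(u, v - h))]
    n = (ru[1] * rv[2] - ru[2] * rv[1],
         ru[2] * rv[0] - ru[0] * rv[2],
         ru[0] * rv[1] - ru[1] * rv[0])
    length = math.sqrt(sum(c * c for c in n))
    return tuple(c / length for c in n)

start = unit_normal(0.0, 0.0)          # normal when the walk begins
end = unit_normal(2 * math.pi, 0.0)    # normal after one trip around the center line
print(r(0.0, 0.0), r(2 * math.pi, 0.0))
print(start, end)
```

The two printed points agree, while the two normals are negatives of each other, so no continuous choice of normal vector field exists along the center line.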

For an orientable surface Σ which has a boundary curve C, pick a unit normal vector

n such that if you walked along C with your head pointing in the direction of n, then

the surface would be on your left. We say in this situation that n is a positive unit

normal vector and that C is traversed n-positively. We can now state Stokes’ Theorem:

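Theorem 4.14. (Stokes’ Theorem) Let $\Sigma$ be an orientable surface in $\mathbb{R}^3$ whose boundary is a simple closed curve $C$, and let $\mathbf{f}(x, y, z) = P(x, y, z)\,\mathbf{i} + Q(x, y, z)\,\mathbf{j} + R(x, y, z)\,\mathbf{k}$ be a smooth vector field defined on some subset of $\mathbb{R}^3$ that contains $\Sigma$. Then

$$\oint_C \mathbf{f} \cdot d\mathbf{r} = \iint_\Sigma (\operatorname{curl} \mathbf{f}) \cdot \mathbf{n}\,d\sigma , \tag{4.45}$$

where

$$\operatorname{curl} \mathbf{f} = \left( \frac{\partial R}{\partial y} - \frac{\partial Q}{\partial z} \right) \mathbf{i} + \left( \frac{\partial P}{\partial z} - \frac{\partial R}{\partial x} \right) \mathbf{j} + \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) \mathbf{k} , \tag{4.46}$$

$\mathbf{n}$ is a positive unit normal vector over $\Sigma$, and $C$ is traversed $\mathbf{n}$-positively.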

Proof: As the general case is beyond the scope of this text, we will prove the theorem only for the special case where $\Sigma$ is the graph of $z = z(x, y)$ for some smooth real-valued function $z(x, y)$, with $(x, y)$ varying over a region $D$ in $\mathbb{R}^2$.

Figure 4.5.4 (the surface Σ : z = z(x, y) with normal n and boundary curve C, above the region D with boundary curve C_D in the xy-plane)

Projecting $\Sigma$ onto the $xy$-plane, we see that the closed curve $C$ (the boundary curve of $\Sigma$) projects onto a closed curve $C_D$ which is the boundary curve of $D$ (see Figure 4.5.4). Assuming that $C$ has a smooth parametrization, its projection $C_D$ in the $xy$-plane also has a smooth parametrization, say

$$C_D : \; x = x(t) , \quad y = y(t) , \quad a \le t \le b ,$$

and so $C$ can be parametrized (in $\mathbb{R}^3$) as

$$C : \; x = x(t) , \quad y = y(t) , \quad z = z(x(t), y(t)) , \quad a \le t \le b ,$$

since the curve $C$ is part of the surface $z = z(x, y)$. Now, by the Chain Rule (Theorem 4.4 in Section 4.2), for $z = z(x(t), y(t))$ as a function of $t$, we know that

$$z'(t) = \frac{\partial z}{\partial x}\,x'(t) + \frac{\partial z}{\partial y}\,y'(t) ,$$

⁷ For further discussion of orientability, see O’NEILL, § IV.7.


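and so

$$\begin{aligned}
\oint_C \mathbf{f} \cdot d\mathbf{r} &= \oint_C P(x, y, z)\,dx + Q(x, y, z)\,dy + R(x, y, z)\,dz \\
&= \int_a^b \left( P\,x'(t) + Q\,y'(t) + R \left( \frac{\partial z}{\partial x}\,x'(t) + \frac{\partial z}{\partial y}\,y'(t) \right) \right) dt \\
&= \int_a^b \left( \left( P + R\,\frac{\partial z}{\partial x} \right) x'(t) + \left( Q + R\,\frac{\partial z}{\partial y} \right) y'(t) \right) dt \\
&= \oint_{C_D} \tilde{P}(x, y)\,dx + \tilde{Q}(x, y)\,dy ,
\end{aligned}$$

where

$$\begin{aligned}
\tilde{P}(x, y) &= P(x, y, z(x, y)) + R(x, y, z(x, y))\,\frac{\partial z}{\partial x}(x, y) , \text{ and} \\
\tilde{Q}(x, y) &= Q(x, y, z(x, y)) + R(x, y, z(x, y))\,\frac{\partial z}{\partial y}(x, y)
\end{aligned}$$

for $(x, y)$ in $D$. Thus, by Green’s Theorem applied to the region $D$, we have

$$\oint_C \mathbf{f} \cdot d\mathbf{r} = \iint_D \left( \frac{\partial \tilde{Q}}{\partial x} - \frac{\partial \tilde{P}}{\partial y} \right) dA . \tag{4.47}$$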

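Now,

$$\frac{\partial \tilde{Q}}{\partial x} = \frac{\partial}{\partial x} \left( Q(x, y, z(x, y)) + R(x, y, z(x, y))\,\frac{\partial z}{\partial y}(x, y) \right) ,$$

so by the Product Rule we get

$$\frac{\partial \tilde{Q}}{\partial x} = \frac{\partial}{\partial x} \bigl( Q(x, y, z(x, y)) \bigr) + \left( \frac{\partial}{\partial x} R(x, y, z(x, y)) \right) \frac{\partial z}{\partial y}(x, y) + R(x, y, z(x, y))\,\frac{\partial}{\partial x} \left( \frac{\partial z}{\partial y}(x, y) \right) .$$

By formula (4.42) in Theorem 4.11, we have

$$\begin{aligned}
\frac{\partial}{\partial x} \bigl( Q(x, y, z(x, y)) \bigr) &= \frac{\partial Q}{\partial x}\frac{\partial x}{\partial x} + \frac{\partial Q}{\partial y}\frac{\partial y}{\partial x} + \frac{\partial Q}{\partial z}\frac{\partial z}{\partial x} \\
&= \frac{\partial Q}{\partial x} \cdot 1 + \frac{\partial Q}{\partial y} \cdot 0 + \frac{\partial Q}{\partial z}\frac{\partial z}{\partial x} \\
&= \frac{\partial Q}{\partial x} + \frac{\partial Q}{\partial z}\frac{\partial z}{\partial x} .
\end{aligned}$$

Similarly,

$$\frac{\partial}{\partial x} \bigl( R(x, y, z(x, y)) \bigr) = \frac{\partial R}{\partial x} + \frac{\partial R}{\partial z}\frac{\partial z}{\partial x} .$$

Thus,

$$\begin{aligned}
\frac{\partial \tilde{Q}}{\partial x} &= \frac{\partial Q}{\partial x} + \frac{\partial Q}{\partial z}\frac{\partial z}{\partial x} + \left( \frac{\partial R}{\partial x} + \frac{\partial R}{\partial z}\frac{\partial z}{\partial x} \right) \frac{\partial z}{\partial y} + R(x, y, z(x, y))\,\frac{\partial^2 z}{\partial x\,\partial y} \\
&= \frac{\partial Q}{\partial x} + \frac{\partial Q}{\partial z}\frac{\partial z}{\partial x} + \frac{\partial R}{\partial x}\frac{\partial z}{\partial y} + \frac{\partial R}{\partial z}\frac{\partial z}{\partial x}\frac{\partial z}{\partial y} + R\,\frac{\partial^2 z}{\partial x\,\partial y} .
\end{aligned}$$

In a similar fashion, we can calculate

$$\frac{\partial \tilde{P}}{\partial y} = \frac{\partial P}{\partial y} + \frac{\partial P}{\partial z}\frac{\partial z}{\partial y} + \frac{\partial R}{\partial y}\frac{\partial z}{\partial x} + \frac{\partial R}{\partial z}\frac{\partial z}{\partial y}\frac{\partial z}{\partial x} + R\,\frac{\partial^2 z}{\partial y\,\partial x} .$$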

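So subtracting gives

$$\frac{\partial \tilde{Q}}{\partial x} - \frac{\partial \tilde{P}}{\partial y} = \left( \frac{\partial Q}{\partial z} - \frac{\partial R}{\partial y} \right) \frac{\partial z}{\partial x} + \left( \frac{\partial R}{\partial x} - \frac{\partial P}{\partial z} \right) \frac{\partial z}{\partial y} + \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) \tag{4.48}$$

since $\dfrac{\partial^2 z}{\partial x\,\partial y} = \dfrac{\partial^2 z}{\partial y\,\partial x}$ by the smoothness of $z = z(x, y)$. Hence, by equation (4.47),

$$\oint_C \mathbf{f} \cdot d\mathbf{r} = \iint_D \left( - \left( \frac{\partial R}{\partial y} - \frac{\partial Q}{\partial z} \right) \frac{\partial z}{\partial x} - \left( \frac{\partial P}{\partial z} - \frac{\partial R}{\partial x} \right) \frac{\partial z}{\partial y} + \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) \right) dA \tag{4.49}$$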

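after factoring out a $-1$ from the terms in the first two products in equation (4.48). Now, recall from Section 2.3 (see p.76) that the vector $\mathbf{N} = -\frac{\partial z}{\partial x}\,\mathbf{i} - \frac{\partial z}{\partial y}\,\mathbf{j} + \mathbf{k}$ is normal to the tangent plane to the surface $z = z(x, y)$ at each point of $\Sigma$. Thus,

$$\mathbf{n} = \frac{\mathbf{N}}{\lVert \mathbf{N} \rVert} = \frac{-\frac{\partial z}{\partial x}\,\mathbf{i} - \frac{\partial z}{\partial y}\,\mathbf{j} + \mathbf{k}}{\sqrt{1 + \left( \frac{\partial z}{\partial x} \right)^2 + \left( \frac{\partial z}{\partial y} \right)^2}}$$

is in fact a positive unit normal vector to $\Sigma$ (see Figure 4.5.4). Hence, using the parametrization $\mathbf{r}(x, y) = x\,\mathbf{i} + y\,\mathbf{j} + z(x, y)\,\mathbf{k}$, for $(x, y)$ in $D$, of the surface $\Sigma$, we have $\frac{\partial \mathbf{r}}{\partial x} = \mathbf{i} + \frac{\partial z}{\partial x}\,\mathbf{k}$ and $\frac{\partial \mathbf{r}}{\partial y} = \mathbf{j} + \frac{\partial z}{\partial y}\,\mathbf{k}$, and so

$$\left\lVert \frac{\partial \mathbf{r}}{\partial x} \times \frac{\partial \mathbf{r}}{\partial y} \right\rVert = \sqrt{1 + \left( \frac{\partial z}{\partial x} \right)^2 + \left( \frac{\partial z}{\partial y} \right)^2} .$$

So we see that using formula (4.46) for $\operatorname{curl} \mathbf{f}$, we have

$$\begin{aligned}
\iint_\Sigma (\operatorname{curl} \mathbf{f}) \cdot \mathbf{n}\,d\sigma &= \iint_D (\operatorname{curl} \mathbf{f}) \cdot \mathbf{n} \, \left\lVert \frac{\partial \mathbf{r}}{\partial x} \times \frac{\partial \mathbf{r}}{\partial y} \right\rVert dA \\
&= \iint_D \left( \left( \frac{\partial R}{\partial y} - \frac{\partial Q}{\partial z} \right) \mathbf{i} + \left( \frac{\partial P}{\partial z} - \frac{\partial R}{\partial x} \right) \mathbf{j} + \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) \mathbf{k} \right) \cdot \left( -\frac{\partial z}{\partial x}\,\mathbf{i} - \frac{\partial z}{\partial y}\,\mathbf{j} + \mathbf{k} \right) dA \\
&= \iint_D \left( - \left( \frac{\partial R}{\partial y} - \frac{\partial Q}{\partial z} \right) \frac{\partial z}{\partial x} - \left( \frac{\partial P}{\partial z} - \frac{\partial R}{\partial x} \right) \frac{\partial z}{\partial y} + \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) \right) dA ,
\end{aligned}$$

which, upon comparing to equation (4.49), proves the Theorem. QED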


Note: The condition in Stokes’ Theorem that the surface $\Sigma$ have a (continuously varying) positive unit normal vector $\mathbf{n}$ and a boundary curve $C$ traversed $\mathbf{n}$-positively can be expressed more precisely as follows: if $\mathbf{r}(t)$ is the position vector for $C$ and $\mathbf{T}(t) = \mathbf{r}'(t) / \lVert \mathbf{r}'(t) \rVert$ is the unit tangent vector to $C$, then the vectors $\mathbf{T}$, $\mathbf{n}$, $\mathbf{T} \times \mathbf{n}$ form a right-handed system.

Also, it should be noted that Stokes’ Theorem holds even when the boundary curve $C$ is piecewise smooth.

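Example 4.14. Verify Stokes’ Theorem for $\mathbf{f}(x, y, z) = z\,\mathbf{i} + x\,\mathbf{j} + y\,\mathbf{k}$ when $\Sigma$ is the paraboloid $z = x^2 + y^2$ such that $z \le 1$ (see Figure 4.5.5).

Figure 4.5.5 $z = x^2 + y^2$ (the surface Σ with normal n and boundary curve C at height z = 1)

Solution: The positive unit normal vector to the surface $z = z(x, y) = x^2 + y^2$ is

$$\mathbf{n} = \frac{-\frac{\partial z}{\partial x}\,\mathbf{i} - \frac{\partial z}{\partial y}\,\mathbf{j} + \mathbf{k}}{\sqrt{1 + \left( \frac{\partial z}{\partial x} \right)^2 + \left( \frac{\partial z}{\partial y} \right)^2}} = \frac{-2x\,\mathbf{i} - 2y\,\mathbf{j} + \mathbf{k}}{\sqrt{1 + 4x^2 + 4y^2}} ,$$

and $\operatorname{curl} \mathbf{f} = (1 - 0)\,\mathbf{i} + (1 - 0)\,\mathbf{j} + (1 - 0)\,\mathbf{k} = \mathbf{i} + \mathbf{j} + \mathbf{k}$, so

$$(\operatorname{curl} \mathbf{f}) \cdot \mathbf{n} = (-2x - 2y + 1) / \sqrt{1 + 4x^2 + 4y^2} .$$

Since $\Sigma$ can be parametrized as $\mathbf{r}(x, y) = x\,\mathbf{i} + y\,\mathbf{j} + (x^2 + y^2)\,\mathbf{k}$ for $(x, y)$ in the region $D = \{ (x, y) : x^2 + y^2 \le 1 \}$, then

$$\begin{aligned}
\iint_\Sigma (\operatorname{curl} \mathbf{f}) \cdot \mathbf{n}\,d\sigma &= \iint_D (\operatorname{curl} \mathbf{f}) \cdot \mathbf{n} \, \left\lVert \frac{\partial \mathbf{r}}{\partial x} \times \frac{\partial \mathbf{r}}{\partial y} \right\rVert dA \\
&= \iint_D \frac{-2x - 2y + 1}{\sqrt{1 + 4x^2 + 4y^2}} \sqrt{1 + 4x^2 + 4y^2}\;dA \\
&= \iint_D (-2x - 2y + 1)\,dA , \text{ so switching to polar coordinates gives} \\
&= \int_0^{2\pi} \int_0^1 (-2r \cos\theta - 2r \sin\theta + 1)\,r\,dr\,d\theta \\
&= \int_0^{2\pi} \int_0^1 (-2r^2 \cos\theta - 2r^2 \sin\theta + r)\,dr\,d\theta \\
&= \int_0^{2\pi} \left( -\frac{2r^3}{3} \cos\theta - \frac{2r^3}{3} \sin\theta + \frac{r^2}{2} \;\Big|_{r=0}^{r=1} \right) d\theta \\
&= \int_0^{2\pi} \left( -\frac{2}{3} \cos\theta - \frac{2}{3} \sin\theta + \frac{1}{2} \right) d\theta \\
&= -\frac{2}{3} \sin\theta + \frac{2}{3} \cos\theta + \frac{1}{2}\,\theta \;\Big|_0^{2\pi} = \pi .
\end{aligned}$$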

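The boundary curve $C$ is the unit circle $x^2 + y^2 = 1$ lying in the plane $z = 1$ (see Figure 4.5.5), which can be parametrized as $x = \cos t$, $y = \sin t$, $z = 1$ for $0 \le t \le 2\pi$. So

$$\begin{aligned}
\oint_C \mathbf{f} \cdot d\mathbf{r} &= \int_0^{2\pi} \bigl( (1)(-\sin t) + (\cos t)(\cos t) + (\sin t)(0) \bigr)\,dt \\
&= \int_0^{2\pi} \left( -\sin t + \frac{1 + \cos 2t}{2} \right) dt \quad \left( \text{here we used } \cos^2 t = \frac{1 + \cos 2t}{2} \right) \\
&= \cos t + \frac{t}{2} + \frac{\sin 2t}{4} \;\Big|_0^{2\pi} = \pi .
\end{aligned}$$

So we see that $\oint_C \mathbf{f} \cdot d\mathbf{r} = \iint_\Sigma (\operatorname{curl} \mathbf{f}) \cdot \mathbf{n}\,d\sigma$, as predicted by Stokes’ Theorem.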

The line integral in the preceding example was far simpler to calculate than the

surface integral, but this will not always be the case.
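Both sides of Stokes’ Theorem in Example 4.14 can also be approximated numerically. The following sketch is only a check under stated assumptions: it uses the midpoint rule (a choice made here, with illustrative function names and grid sizes), the reduced integrand $(-2x - 2y + 1)$ over the unit disk in polar coordinates for the surface side, and the parametrization $x = \cos t$, $y = \sin t$, $z = 1$ for the line side.

```python
import math

def surface_side(nr=200, nt=200):
    """Midpoint rule for the double integral of (-2x - 2y + 1) over the unit disk (polar form)."""
    dr, dt = 1.0 / nr, 2 * math.pi / nt
    total = 0.0
    for i in range(nr):
        r = (i + 0.5) * dr
        for j in range(nt):
            t = (j + 0.5) * dt
            x, y = r * math.cos(t), r * math.sin(t)
            # curl f = (1, 1, 1) dotted with N = (-2x, -2y, 1); extra factor r from dA = r dr dt
            total += (-2 * x - 2 * y + 1) * r * dr * dt
    return total

def line_side(n=2000):
    """Midpoint rule for the line integral of f = (z, x, y) around the circle at height z = 1."""
    dt = 2 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * dt
        x, y, z = math.cos(t), math.sin(t), 1.0
        dx, dy, dz = -math.sin(t), math.cos(t), 0.0
        total += (z * dx + x * dy + y * dz) * dt
    return total

print(surface_side(), line_side(), math.pi)
```

Both approximations should print values very close to $\pi$, matching the hand computation.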

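Example 4.15. Let $\Sigma$ be the elliptic paraboloid $z = \frac{x^2}{4} + \frac{y^2}{9}$ for $z \le 1$, and let $C$ be its boundary curve. Calculate $\oint_C \mathbf{f} \cdot d\mathbf{r}$ for $\mathbf{f}(x, y, z) = (9xz + 2y)\,\mathbf{i} + (2x + y^2)\,\mathbf{j} + (-2y^2 + 2z)\,\mathbf{k}$, where $C$ is traversed counterclockwise.

Solution: The surface is similar to the one in Example 4.14, except now the boundary curve $C$ is the ellipse $\frac{x^2}{4} + \frac{y^2}{9} = 1$ lying in the plane $z = 1$. In this case, using Stokes’ Theorem is easier than computing the line integral directly. As in Example 4.14, at each point $(x, y, z(x, y))$ on the surface $z = z(x, y) = \frac{x^2}{4} + \frac{y^2}{9}$ the vector

$$\mathbf{n} = \frac{-\frac{\partial z}{\partial x}\,\mathbf{i} - \frac{\partial z}{\partial y}\,\mathbf{j} + \mathbf{k}}{\sqrt{1 + \left( \frac{\partial z}{\partial x} \right)^2 + \left( \frac{\partial z}{\partial y} \right)^2}} = \frac{-\frac{x}{2}\,\mathbf{i} - \frac{2y}{9}\,\mathbf{j} + \mathbf{k}}{\sqrt{1 + \frac{x^2}{4} + \frac{4y^2}{81}}}$$

is a positive unit normal vector to $\Sigma$. And calculating the curl of $\mathbf{f}$ gives

$$\operatorname{curl} \mathbf{f} = (-4y - 0)\,\mathbf{i} + (9x - 0)\,\mathbf{j} + (2 - 2)\,\mathbf{k} = -4y\,\mathbf{i} + 9x\,\mathbf{j} + 0\,\mathbf{k} ,$$

so

$$(\operatorname{curl} \mathbf{f}) \cdot \mathbf{n} = \frac{(-4y)\left(-\frac{x}{2}\right) + (9x)\left(-\frac{2y}{9}\right) + (0)(1)}{\sqrt{1 + \frac{x^2}{4} + \frac{4y^2}{81}}} = \frac{2xy - 2xy + 0}{\sqrt{1 + \frac{x^2}{4} + \frac{4y^2}{81}}} = 0 ,$$

and so by Stokes’ Theorem

$$\oint_C \mathbf{f} \cdot d\mathbf{r} = \iint_\Sigma (\operatorname{curl} \mathbf{f}) \cdot \mathbf{n}\,d\sigma = \iint_\Sigma 0\,d\sigma = 0 .$$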
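For comparison, the direct line integral in Example 4.15 can be approximated numerically. The sketch below assumes the counterclockwise parametrization $x = 2\cos t$, $y = 3\sin t$, $z = 1$, $0 \le t \le 2\pi$, of the boundary ellipse (this parametrization and the midpoint rule are choices made here, not taken from the text).

```python
import math

def f(x, y, z):
    # Vector field of Example 4.15
    return (9 * x * z + 2 * y, 2 * x + y * y, -2 * y * y + 2 * z)

def circulation(n=4000):
    """Midpoint rule for the line integral of f around the ellipse x^2/4 + y^2/9 = 1 at z = 1."""
    dt = 2 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * dt
        x, y, z = 2 * math.cos(t), 3 * math.sin(t), 1.0
        dx, dy, dz = -2 * math.sin(t), 3 * math.cos(t), 0.0
        P, Q, R = f(x, y, z)
        total += (P * dx + Q * dy + R * dz) * dt
    return total

print(circulation())
```

The printed value should be zero up to rounding error, in agreement with the result obtained from Stokes’ Theorem.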


In physical applications, for a simple closed curve $C$ the line integral $\oint_C \mathbf{f} \cdot d\mathbf{r}$ is often called the circulation of $\mathbf{f}$ around $C$. For example, if $\mathbf{E}$ represents the electrostatic field due to a point charge, then it turns out⁸ that $\operatorname{curl} \mathbf{E} = \mathbf{0}$, which means that the circulation $\oint_C \mathbf{E} \cdot d\mathbf{r} = 0$ by Stokes’ Theorem. Vector fields which have zero curl are often called irrotational fields.

In fact, the term curl was created by the 19th century Scottish physicist James Clerk Maxwell in his study of electromagnetism, where it is used extensively. In physics, the curl is interpreted as a measure of circulation density. This is best seen by using another definition of $\operatorname{curl} \mathbf{f}$ which is equivalent⁹ to the definition given by formula (4.46). Namely, for a point $(x, y, z)$ in $\mathbb{R}^3$,

$$\mathbf{n} \cdot (\operatorname{curl} \mathbf{f})(x, y, z) = \lim_{S \to 0} \frac{1}{S} \oint_C \mathbf{f} \cdot d\mathbf{r} , \tag{4.50}$$

where $S$ is the surface area of a surface $\Sigma$ containing the point $(x, y, z)$ and with a simple closed boundary curve $C$ and positive unit normal vector $\mathbf{n}$ at $(x, y, z)$. In the limit, think of the curve $C$ shrinking to the point $(x, y, z)$, which causes $\Sigma$, the surface it bounds, to have smaller and smaller surface area. That ratio of circulation to surface area in the limit is what makes the curl a rough measure of circulation density (i.e. circulation per unit area).

Figure 4.5.6 Curl and rotation (the field f in the xy-plane)

An idea of how the curl of a vector field is related to rotation is shown in Figure 4.5.6. Suppose we have a vector field $\mathbf{f}(x, y, z)$ which is always parallel to the $xy$-plane at each point $(x, y, z)$ and that the vectors grow larger the further the point $(x, y, z)$ is from the $y$-axis. For example, $\mathbf{f}(x, y, z) = (1 + x^2)\,\mathbf{j}$. Think of the vector field as representing the flow of water, and imagine dropping two wheels with paddles into that water flow, as in Figure 4.5.6. Since the flow is stronger (i.e. the magnitude of $\mathbf{f}$ is larger) as you move away from the $y$-axis, then such a wheel would rotate counterclockwise if it were dropped to the right of the $y$-axis, and it would rotate clockwise if it were dropped to the left of the $y$-axis. In both cases the curl would be nonzero ($\operatorname{curl} \mathbf{f}(x, y, z) = 2x\,\mathbf{k}$ in our example) and would obey the right-hand rule, that is, $\operatorname{curl} \mathbf{f}(x, y, z)$ points in the direction of your thumb as you cup your right hand in the direction of the rotation of the wheel. So the curl points outward (in the positive $z$-direction) if $x > 0$ and points inward (in the negative $z$-direction) if $x < 0$. Notice that if all the vectors had the same direction and the same magnitude, then the wheels would not rotate and hence there would be no curl (which is why such fields are called irrotational, meaning no rotation).
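The circulation-density formula (4.50) can be illustrated with the field $\mathbf{f} = (1 + x^2)\,\mathbf{j}$ from the paddle-wheel discussion. The sketch below is an illustration under assumptions chosen here: $\Sigma$ is taken to be a small horizontal disk of radius `eps` centered at $(x_0, y_0, 0)$ with $\mathbf{n} = \mathbf{k}$, and the circulation around its boundary circle is approximated by the midpoint rule. The ratio of circulation to area should be close to the $\mathbf{k}$-component of the curl, $2x_0$.

```python
import math

def f(x, y, z):
    # Flow parallel to the y-axis, stronger away from the y-axis
    return (0.0, 1 + x * x, 0.0)

def circulation_density(x0, y0, eps=0.01, n=2000):
    """Circulation of f around a counterclockwise circle of radius eps, divided by the disk area."""
    dt = 2 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * dt
        x, y = x0 + eps * math.cos(t), y0 + eps * math.sin(t)
        dx, dy = -eps * math.sin(t), eps * math.cos(t)
        P, Q, _ = f(x, y, 0.0)
        total += (P * dx + Q * dy) * dt
    return total / (math.pi * eps ** 2)

print(circulation_density(1.5, 0.0))    # right of the y-axis: positive, counterclockwise
print(circulation_density(-1.5, 0.0))   # left of the y-axis: negative, clockwise
```

The sign change between the two printed values mirrors the opposite rotation of the two paddle wheels.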

⁸ See Ch. 2 in REITZ, MILFORD and CHRISTY.

⁹ See SCHEY, p. 78-81, for the derivation.


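Finally, by Stokes’ Theorem, we know that if $C$ is a simple closed curve in some solid region $S$ in $\mathbb{R}^3$ and if $\mathbf{f}(x, y, z)$ is a smooth vector field such that $\operatorname{curl} \mathbf{f} = \mathbf{0}$ in $S$, then

$$\oint_C \mathbf{f} \cdot d\mathbf{r} = \iint_\Sigma (\operatorname{curl} \mathbf{f}) \cdot \mathbf{n}\,d\sigma = \iint_\Sigma \mathbf{0} \cdot \mathbf{n}\,d\sigma = \iint_\Sigma 0\,d\sigma = 0 ,$$

where $\Sigma$ is any orientable surface inside $S$ whose boundary is $C$ (such a surface is sometimes called a capping surface for $C$). So similar to the two-variable case, we have a three-dimensional version of a result from Section 4.3, for solid regions in $\mathbb{R}^3$ which are simply connected (i.e. regions having no holes):

The following statements are equivalent for a simply connected solid region $S$ in $\mathbb{R}^3$:

(a) $\mathbf{f}(x, y, z) = P(x, y, z)\,\mathbf{i} + Q(x, y, z)\,\mathbf{j} + R(x, y, z)\,\mathbf{k}$ has a smooth potential $F(x, y, z)$ in $S$

(b) $\int_C \mathbf{f} \cdot d\mathbf{r}$ is independent of the path for any curve $C$ in $S$

(c) $\oint_C \mathbf{f} \cdot d\mathbf{r} = 0$ for every simple closed curve $C$ in $S$

(d) $\frac{\partial R}{\partial y} = \frac{\partial Q}{\partial z}$, $\frac{\partial P}{\partial z} = \frac{\partial R}{\partial x}$, and $\frac{\partial Q}{\partial x} = \frac{\partial P}{\partial y}$ in $S$ (i.e. $\operatorname{curl} \mathbf{f} = \mathbf{0}$ in $S$)

Part (d) is also a way of saying that the differential form $P\,dx + Q\,dy + R\,dz$ is exact.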

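Example 4.16. Determine if the vector field $\mathbf{f}(x, y, z) = xyz\,\mathbf{i} + xz\,\mathbf{j} + xy\,\mathbf{k}$ has a potential in $\mathbb{R}^3$.

Solution: Since $\mathbb{R}^3$ is simply connected, we just need to check whether $\operatorname{curl} \mathbf{f} = \mathbf{0}$ throughout $\mathbb{R}^3$, that is,

$$\frac{\partial R}{\partial y} = \frac{\partial Q}{\partial z} , \quad \frac{\partial P}{\partial z} = \frac{\partial R}{\partial x} , \quad \text{and} \quad \frac{\partial Q}{\partial x} = \frac{\partial P}{\partial y}$$

throughout $\mathbb{R}^3$, where $P(x, y, z) = xyz$, $Q(x, y, z) = xz$, and $R(x, y, z) = xy$. But we see that

$$\frac{\partial P}{\partial z} = xy , \quad \frac{\partial R}{\partial x} = y \quad \Rightarrow \quad \frac{\partial P}{\partial z} \ne \frac{\partial R}{\partial x} \text{ for some } (x, y, z) \text{ in } \mathbb{R}^3 .$$

Thus, $\mathbf{f}(x, y, z)$ does not have a potential in $\mathbb{R}^3$.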
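Condition (d) can be spot-checked numerically before doing any algebra. The sketch below is an illustrative helper, not part of the text: it approximates the curl of formula (4.46) at a single point with central differences (the function name `curl`, the step size and the sample point are assumptions made here). A nonzero result at any point rules out a potential; a zero result at a few points is only suggestive, not a proof.

```python
def curl(F, p, h=1e-5):
    """Central-difference estimate of curl F at the point p = (x, y, z)."""
    def d(i, j):
        # Partial derivative of component i of F with respect to variable j
        hi, lo = list(p), list(p)
        hi[j] += h
        lo[j] -= h
        return (F(*hi)[i] - F(*lo)[i]) / (2 * h)
    # (R_y - Q_z, P_z - R_x, Q_x - P_y), as in formula (4.46)
    return (d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1))

def example_field(x, y, z):
    # f = xyz i + xz j + xy k from Example 4.16
    return (x * y * z, x * z, x * y)

def gradient_field(x, y, z):
    # f = x i + y j + 2z k from Example 4.13, which has a potential
    return (x, y, 2 * z)

point = (2.0, 3.0, 5.0)
print(curl(example_field, point))    # nonzero, so no potential
print(curl(gradient_field, point))   # approximately (0, 0, 0)
```

At the sample point the first field has curl close to $(0, 3, -5)$, consistent with $\frac{\partial P}{\partial z} - \frac{\partial R}{\partial x} = xy - y$.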


Exercises

A

For Exercises 1-3, calculate $\int_C f(x, y, z)\,ds$ for the given function $f(x, y, z)$ and curve $C$.

1. $f(x, y, z) = z$; $C$ : $x = \cos t$, $y = \sin t$, $z = t$, $0 \le t \le 2\pi$

2. $f(x, y, z) = \frac{x}{y} + y + 2yz$; $C$ : $x = t^2$, $y = t$, $z = 1$, $1 \le t \le 2$

3. $f(x, y, z) = z^2$; $C$ : $x = t \sin t$, $y = t \cos t$, $z = \frac{2\sqrt{2}}{3}\,t^{3/2}$, $0 \le t \le 1$

For Exercises 4-9, calculate $\int_C \mathbf{f} \cdot d\mathbf{r}$ for the given vector field $\mathbf{f}(x, y, z)$ and curve $C$.

4. $\mathbf{f}(x, y, z) = \mathbf{i} - \mathbf{j} + \mathbf{k}$; $C$ : $x = 3t$, $y = 2t$, $z = t$, $0 \le t \le 1$

5. $\mathbf{f}(x, y, z) = y\,\mathbf{i} - x\,\mathbf{j} + z\,\mathbf{k}$; $C$ : $x = \cos t$, $y = \sin t$, $z = t$, $0 \le t \le 2\pi$

6. $\mathbf{f}(x, y, z) = x\,\mathbf{i} + y\,\mathbf{j} + z\,\mathbf{k}$; $C$ : $x = \cos t$, $y = \sin t$, $z = 2$, $0 \le t \le 2\pi$

7. $\mathbf{f}(x, y, z) = (y - 2z)\,\mathbf{i} + xy\,\mathbf{j} + (2xz + y)\,\mathbf{k}$; $C$ : $x = t$, $y = 2t$, $z = t^2 - 1$, $0 \le t \le 1$

8. $\mathbf{f}(x, y, z) = yz\,\mathbf{i} + xz\,\mathbf{j} + xy\,\mathbf{k}$; $C$ : the polygonal path from $(0, 0, 0)$ to $(1, 0, 0)$ to $(1, 2, 0)$

9. $\mathbf{f}(x, y, z) = xy\,\mathbf{i} + (z - x)\,\mathbf{j} + 2yz\,\mathbf{k}$; $C$ : the polygonal path from $(0, 0, 0)$ to $(1, 0, 0)$ to $(1, 2, 0)$ to $(1, 2, -2)$

For Exercises 10-13, state whether or not the vector field $\mathbf{f}(x, y, z)$ has a potential in $\mathbb{R}^3$ (you do not need to find the potential itself).

10. $\mathbf{f}(x, y, z) = y\,\mathbf{i} - x\,\mathbf{j} + z\,\mathbf{k}$

11. $\mathbf{f}(x, y, z) = a\,\mathbf{i} + b\,\mathbf{j} + c\,\mathbf{k}$ ($a$, $b$, $c$ constant)

12. $\mathbf{f}(x, y, z) = (x + y)\,\mathbf{i} + x\,\mathbf{j} + z^2\,\mathbf{k}$

13. $\mathbf{f}(x, y, z) = xy\,\mathbf{i} - (x - yz^2)\,\mathbf{j} + y^2 z\,\mathbf{k}$

B

For Exercises 14-15, verify Stokes’ Theorem for the given vector field $\mathbf{f}(x, y, z)$ and surface $\Sigma$.

14. $\mathbf{f}(x, y, z) = 2y\,\mathbf{i} - x\,\mathbf{j} + z\,\mathbf{k}$; $\Sigma$ : $x^2 + y^2 + z^2 = 1$, $z \ge 0$

15. $\mathbf{f}(x, y, z) = xy\,\mathbf{i} + xz\,\mathbf{j} + yz\,\mathbf{k}$; $\Sigma$ : $z = x^2 + y^2$, $z \le 1$

16. Construct a Möbius strip from a piece of paper, then draw a line down its center (like the dotted line in Figure 4.5.3(b)). Cut the Möbius strip along that center line completely around the strip. How many surfaces does this result in? How would you describe them? Are they orientable?

17. Use Gnuplot (see Appendix C) to plot the Möbius strip parametrized as:

$$\mathbf{r}(u, v) = \cos u \left( 1 + v \cos \tfrac{u}{2} \right) \mathbf{i} + \sin u \left( 1 + v \cos \tfrac{u}{2} \right) \mathbf{j} + v \sin \tfrac{u}{2}\,\mathbf{k} , \quad 0 \le u \le 2\pi , \quad -\tfrac{1}{2} \le v \le \tfrac{1}{2}$$

C

18. Let $\Sigma$ be a closed surface and $\mathbf{f}(x, y, z)$ a smooth vector field. Show that $\iint_\Sigma (\operatorname{curl} \mathbf{f}) \cdot \mathbf{n}\,d\sigma = 0$. (Hint: Split $\Sigma$ in half.)

19. Show that Green’s Theorem is a special case of Stokes’ Theorem.


4.6 Gradient, Divergence, Curl and Laplacian

In this final section we will establish some relationships between the gradient, diver-

gence and curl, and we will also introduce a new quantity called the Laplacian. We

will then show how to write these quantities in cylindrical and spherical coordinates.

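For a real-valued function $f(x, y, z)$ on $\mathbb{R}^3$, the gradient $\nabla f(x, y, z)$ is a vector-valued function on $\mathbb{R}^3$, that is, its value at a point $(x, y, z)$ is the vector

$$\nabla f(x, y, z) = \left( \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, \frac{\partial f}{\partial z} \right) = \frac{\partial f}{\partial x}\,\mathbf{i} + \frac{\partial f}{\partial y}\,\mathbf{j} + \frac{\partial f}{\partial z}\,\mathbf{k}$$

in $\mathbb{R}^3$, where each of the partial derivatives is evaluated at the point $(x, y, z)$. So in this way, you can think of the symbol $\nabla$ as being “applied” to a real-valued function $f$ to produce a vector $\nabla f$.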

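It turns out that the divergence and curl can also be expressed in terms of the symbol $\nabla$. This is done by thinking of $\nabla$ as a vector in $\mathbb{R}^3$, namely

$$\nabla = \frac{\partial}{\partial x}\,\mathbf{i} + \frac{\partial}{\partial y}\,\mathbf{j} + \frac{\partial}{\partial z}\,\mathbf{k} . \tag{4.51}$$

Here, the symbols $\frac{\partial}{\partial x}$, $\frac{\partial}{\partial y}$ and $\frac{\partial}{\partial z}$ are to be thought of as “partial derivative operators” that will get “applied” to a real-valued function, say $f(x, y, z)$, to produce the partial derivatives $\frac{\partial f}{\partial x}$, $\frac{\partial f}{\partial y}$ and $\frac{\partial f}{\partial z}$. For instance, $\frac{\partial}{\partial x}$ “applied” to $f(x, y, z)$ produces $\frac{\partial f}{\partial x}$.

Is $\nabla$ really a vector? Strictly speaking, no, since $\frac{\partial}{\partial x}$, $\frac{\partial}{\partial y}$ and $\frac{\partial}{\partial z}$ are not actual numbers. But it helps to think of $\nabla$ as a vector, especially with the divergence and curl, as we will soon see. The process of “applying” $\frac{\partial}{\partial x}$, $\frac{\partial}{\partial y}$, $\frac{\partial}{\partial z}$ to a real-valued function $f(x, y, z)$ is normally thought of as multiplying the quantities:

$$\left( \frac{\partial}{\partial x} \right) (f) = \frac{\partial f}{\partial x} , \quad \left( \frac{\partial}{\partial y} \right) (f) = \frac{\partial f}{\partial y} , \quad \left( \frac{\partial}{\partial z} \right) (f) = \frac{\partial f}{\partial z}$$

For this reason, $\nabla$ is often referred to as the “del operator”, since it “operates” on functions.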

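For example, it is often convenient to write the divergence $\operatorname{div} \mathbf{f}$ as $\nabla \cdot \mathbf{f}$, since for a vector field $\mathbf{f}(x, y, z) = f_1(x, y, z)\,\mathbf{i} + f_2(x, y, z)\,\mathbf{j} + f_3(x, y, z)\,\mathbf{k}$, the dot product of $\mathbf{f}$ with $\nabla$ (thought of as a vector) makes sense:

$$\begin{aligned}
\nabla \cdot \mathbf{f} &= \left( \frac{\partial}{\partial x}\,\mathbf{i} + \frac{\partial}{\partial y}\,\mathbf{j} + \frac{\partial}{\partial z}\,\mathbf{k} \right) \cdot \bigl( f_1(x, y, z)\,\mathbf{i} + f_2(x, y, z)\,\mathbf{j} + f_3(x, y, z)\,\mathbf{k} \bigr) \\
&= \left( \frac{\partial}{\partial x} \right) (f_1) + \left( \frac{\partial}{\partial y} \right) (f_2) + \left( \frac{\partial}{\partial z} \right) (f_3) \\
&= \frac{\partial f_1}{\partial x} + \frac{\partial f_2}{\partial y} + \frac{\partial f_3}{\partial z} \\
&= \operatorname{div} \mathbf{f}
\end{aligned}$$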


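We can also write $\operatorname{curl} \mathbf{f}$ in terms of $\nabla$, namely as $\nabla \times \mathbf{f}$, since for a vector field $\mathbf{f}(x, y, z) = P(x, y, z)\,\mathbf{i} + Q(x, y, z)\,\mathbf{j} + R(x, y, z)\,\mathbf{k}$, we have:

$$\begin{aligned}
\nabla \times \mathbf{f} &= \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \frac{\partial}{\partial x} & \frac{\partial}{\partial y} & \frac{\partial}{\partial z} \\ P(x, y, z) & Q(x, y, z) & R(x, y, z) \end{vmatrix} \\
&= \left( \frac{\partial R}{\partial y} - \frac{\partial Q}{\partial z} \right) \mathbf{i} - \left( \frac{\partial R}{\partial x} - \frac{\partial P}{\partial z} \right) \mathbf{j} + \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) \mathbf{k} \\
&= \left( \frac{\partial R}{\partial y} - \frac{\partial Q}{\partial z} \right) \mathbf{i} + \left( \frac{\partial P}{\partial z} - \frac{\partial R}{\partial x} \right) \mathbf{j} + \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) \mathbf{k} \\
&= \operatorname{curl} \mathbf{f}
\end{aligned}$$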

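For a real-valued function $f(x, y, z)$, the gradient $\nabla f(x, y, z) = \frac{\partial f}{\partial x}\,\mathbf{i} + \frac{\partial f}{\partial y}\,\mathbf{j} + \frac{\partial f}{\partial z}\,\mathbf{k}$ is a vector field, so we can take its divergence:

$$\begin{aligned}
\operatorname{div} \nabla f &= \nabla \cdot \nabla f \\
&= \left( \frac{\partial}{\partial x}\,\mathbf{i} + \frac{\partial}{\partial y}\,\mathbf{j} + \frac{\partial}{\partial z}\,\mathbf{k} \right) \cdot \left( \frac{\partial f}{\partial x}\,\mathbf{i} + \frac{\partial f}{\partial y}\,\mathbf{j} + \frac{\partial f}{\partial z}\,\mathbf{k} \right) \\
&= \frac{\partial}{\partial x} \left( \frac{\partial f}{\partial x} \right) + \frac{\partial}{\partial y} \left( \frac{\partial f}{\partial y} \right) + \frac{\partial}{\partial z} \left( \frac{\partial f}{\partial z} \right) \\
&= \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} + \frac{\partial^2 f}{\partial z^2}
\end{aligned}$$

Note that this is a real-valued function, to which we will give a special name: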

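Definition 4.7. For a real-valued function $f(x, y, z)$, the Laplacian of $f$, denoted by $\Delta f$, is given by

$$\Delta f(x, y, z) = \nabla \cdot \nabla f = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} + \frac{\partial^2 f}{\partial z^2} . \tag{4.52}$$

Often the notation $\nabla^2 f$ is used for the Laplacian instead of $\Delta f$, using the convention $\nabla^2 = \nabla \cdot \nabla$.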

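Example 4.17. Let $\mathbf{r}(x, y, z) = x\,\mathbf{i} + y\,\mathbf{j} + z\,\mathbf{k}$ be the position vector field on $\mathbb{R}^3$. Then $\lVert \mathbf{r}(x, y, z) \rVert^2 = \mathbf{r} \cdot \mathbf{r} = x^2 + y^2 + z^2$ is a real-valued function. Find

(a) the gradient of $\lVert \mathbf{r} \rVert^2$

(b) the divergence of $\mathbf{r}$

(c) the curl of $\mathbf{r}$

(d) the Laplacian of $\lVert \mathbf{r} \rVert^2$

Solution: (a) $\nabla \lVert \mathbf{r} \rVert^2 = 2x\,\mathbf{i} + 2y\,\mathbf{j} + 2z\,\mathbf{k} = 2\,\mathbf{r}$

(b) $\nabla \cdot \mathbf{r} = \frac{\partial}{\partial x}(x) + \frac{\partial}{\partial y}(y) + \frac{\partial}{\partial z}(z) = 1 + 1 + 1 = 3$

(c)

$$\nabla \times \mathbf{r} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \frac{\partial}{\partial x} & \frac{\partial}{\partial y} & \frac{\partial}{\partial z} \\ x & y & z \end{vmatrix} = (0 - 0)\,\mathbf{i} - (0 - 0)\,\mathbf{j} + (0 - 0)\,\mathbf{k} = \mathbf{0}$$

(d) $\Delta \lVert \mathbf{r} \rVert^2 = \frac{\partial^2}{\partial x^2}(x^2 + y^2 + z^2) + \frac{\partial^2}{\partial y^2}(x^2 + y^2 + z^2) + \frac{\partial^2}{\partial z^2}(x^2 + y^2 + z^2) = 2 + 2 + 2 = 6$

Note that we could have calculated $\Delta \lVert \mathbf{r} \rVert^2$ another way, using the $\nabla$ notation along with parts (a) and (b):

$$\Delta \lVert \mathbf{r} \rVert^2 = \nabla \cdot \nabla \lVert \mathbf{r} \rVert^2 = \nabla \cdot 2\,\mathbf{r} = 2\,\nabla \cdot \mathbf{r} = 2(3) = 6$$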
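The answers in Example 4.17 can be reproduced with a few lines of finite-difference code. The sketch below is an illustration only: the helper names (`partial`, `gradient`, `divergence`, `laplacian`), the step size and the sample point are assumptions made here, and the Laplacian is computed literally as the divergence of the gradient, following $\Delta f = \nabla \cdot \nabla f$.

```python
def partial(g, p, j, h=1e-3):
    """Central-difference estimate of the partial derivative of g in variable j at point p."""
    hi, lo = list(p), list(p)
    hi[j] += h
    lo[j] -= h
    return (g(*hi) - g(*lo)) / (2 * h)

def gradient(g, p):
    # grad g = (g_x, g_y, g_z)
    return tuple(partial(g, p, j) for j in range(3))

def divergence(F, p):
    # div F = d(F_1)/dx + d(F_2)/dy + d(F_3)/dz
    return sum(partial(lambda *q, j=j: F(*q)[j], p, j) for j in range(3))

def laplacian(g, p):
    # Laplacian as the divergence of the gradient
    return divergence(lambda *q: gradient(g, q), p)

def position(x, y, z):
    return (x, y, z)

def squared_norm(x, y, z):
    return x * x + y * y + z * z

p = (1.0, 2.0, 3.0)
print(gradient(squared_norm, p))    # approximately 2r = (2, 4, 6)
print(divergence(position, p))      # approximately 3
print(laplacian(squared_norm, p))   # approximately 6
```

Because the functions here are polynomials of degree at most two, the central differences are exact up to rounding, so the printed values match parts (a), (b) and (d) closely.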

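Notice that in Example 4.17 if we take the curl of the gradient of $\lVert \mathbf{r} \rVert^2$ we get

$$\nabla \times (\nabla \lVert \mathbf{r} \rVert^2) = \nabla \times 2\,\mathbf{r} = 2\,\nabla \times \mathbf{r} = 2\,\mathbf{0} = \mathbf{0} .$$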

The following theorem shows that this will be the case in general:

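Theorem 4.15. For any smooth real-valued function $f(x, y, z)$, $\nabla \times (\nabla f) = \mathbf{0}$.

Proof: We see by the smoothness of $f$ that

$$\begin{aligned}
\nabla \times (\nabla f) &= \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \frac{\partial}{\partial x} & \frac{\partial}{\partial y} & \frac{\partial}{\partial z} \\ \frac{\partial f}{\partial x} & \frac{\partial f}{\partial y} & \frac{\partial f}{\partial z} \end{vmatrix} \\
&= \left( \frac{\partial^2 f}{\partial y\,\partial z} - \frac{\partial^2 f}{\partial z\,\partial y} \right) \mathbf{i} - \left( \frac{\partial^2 f}{\partial x\,\partial z} - \frac{\partial^2 f}{\partial z\,\partial x} \right) \mathbf{j} + \left( \frac{\partial^2 f}{\partial x\,\partial y} - \frac{\partial^2 f}{\partial y\,\partial x} \right) \mathbf{k} = \mathbf{0} ,
\end{aligned}$$

since the mixed partial derivatives in each component are equal. QED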

Corollary 4.16. If a vector field $\mathbf{f}(x, y, z)$ has a potential, then $\operatorname{curl} \mathbf{f} = \mathbf{0}$.

Another way of stating Theorem 4.15 is that gradients are irrotational. Also, notice that in Example 4.17 if we take the divergence of the curl of $\mathbf{r}$ we trivially get

$$\nabla \cdot (\nabla \times \mathbf{r}) = \nabla \cdot \mathbf{0} = 0 .$$

The following theorem shows that this will be the case in general:

Theorem 4.17. For any smooth vector field $\mathbf{f}(x, y, z)$, $\nabla \cdot (\nabla \times \mathbf{f}) = 0$.

The proof is straightforward and left as an exercise for the reader.


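Corollary 4.18. The flux of the curl of a smooth vector field $\mathbf{f}(x, y, z)$ through any closed surface is zero.

Proof: Let $\Sigma$ be a closed surface which bounds a solid $S$. The flux of $\nabla \times \mathbf{f}$ through $\Sigma$ is

$$\begin{aligned}
\iint_\Sigma (\nabla \times \mathbf{f}) \cdot d\boldsymbol{\sigma} &= \iiint_S \nabla \cdot (\nabla \times \mathbf{f})\,dV \quad \text{(by the Divergence Theorem)} \\
&= \iiint_S 0\,dV \quad \text{(by Theorem 4.17)} \\
&= 0 . \quad \text{QED}
\end{aligned}$$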

There is another method for proving Theorem 4.15 which can be useful, and is often used in physics. Namely, if the surface integral $\iint_\Sigma f(x, y, z)\,d\sigma = 0$ for all surfaces $\Sigma$ in some solid region (usually all of $\mathbb{R}^3$), then we must have $f(x, y, z) = 0$ throughout that region. The proof is not trivial, and physicists do not usually bother to prove it. But the result is true, and can also be applied to double and triple integrals.

For instance, to prove Theorem 4.15, assume that $f(x, y, z)$ is a smooth real-valued function on $\mathbb{R}^3$. Let $C$ be a simple closed curve in $\mathbb{R}^3$ and let $\Sigma$ be any capping surface for $C$ (i.e. $\Sigma$ is orientable and its boundary is $C$). Since $\nabla f$ is a vector field, then

$$\begin{aligned}
\iint_\Sigma (\nabla \times (\nabla f)) \cdot \mathbf{n}\,d\sigma &= \oint_C \nabla f \cdot d\mathbf{r} \quad \text{by Stokes’ Theorem, so} \\
&= 0 \quad \text{by Corollary 4.13.}
\end{aligned}$$

Since the choice of $\Sigma$ was arbitrary, then we must have $(\nabla \times (\nabla f)) \cdot \mathbf{n} = 0$ throughout $\mathbb{R}^3$, where $\mathbf{n}$ is any unit vector. Using $\mathbf{i}$, $\mathbf{j}$ and $\mathbf{k}$ in place of $\mathbf{n}$, we see that we must have $\nabla \times (\nabla f) = \mathbf{0}$ in $\mathbb{R}^3$, which completes the proof.

Example 4.18. A system of electric charges has a charge density $\rho(x, y, z)$ and produces an electrostatic field $\mathbf{E}(x, y, z)$ at points $(x, y, z)$ in space. Gauss’ Law states that

$$\iint_\Sigma \mathbf{E} \cdot d\boldsymbol{\sigma} = 4\pi \iiint_S \rho\,dV$$

for any closed surface $\Sigma$ which encloses the charges, with $S$ being the solid region enclosed by $\Sigma$. Show that $\nabla \cdot \mathbf{E} = 4\pi\rho$. This is one of Maxwell’s Equations.¹⁰

Solution: By the Divergence Theorem, we have

$$\begin{aligned}
\iiint_S \nabla \cdot \mathbf{E}\,dV &= \iint_\Sigma \mathbf{E} \cdot d\boldsymbol{\sigma} \\
&= 4\pi \iiint_S \rho\,dV \quad \text{by Gauss’ Law, so combining the integrals gives} \\
\iiint_S (\nabla \cdot \mathbf{E} - 4\pi\rho)\,dV &= 0 , \text{ so} \\
\nabla \cdot \mathbf{E} - 4\pi\rho &= 0 \quad \text{since } \Sigma \text{ and hence } S \text{ was arbitrary, so} \\
\nabla \cdot \mathbf{E} &= 4\pi\rho .
\end{aligned}$$

¹⁰ In Gaussian (or CGS) units.

Often (especially in physics) it is convenient to use other coordinate systems when

dealing with quantities such as the gradient, divergence, curl and Laplacian. We will

present the formulas for these in cylindrical and spherical coordinates.

Recall from Section 1.7 that a point $(x, y, z)$ can be represented in cylindrical coordinates $(r, \theta, z)$, where $x = r \cos\theta$, $y = r \sin\theta$, $z = z$. At each point $(r, \theta, z)$, let $\mathbf{e}_r$, $\mathbf{e}_\theta$, $\mathbf{e}_z$ be unit vectors in the direction of increasing $r$, $\theta$, $z$, respectively (see Figure 4.6.1). Then $\mathbf{e}_r$, $\mathbf{e}_\theta$, $\mathbf{e}_z$ form an orthonormal set of vectors. Note, by the right-hand rule, that $\mathbf{e}_z \times \mathbf{e}_r = \mathbf{e}_\theta$.

Figure 4.6.1 Orthonormal vectors $\mathbf{e}_r$, $\mathbf{e}_\theta$, $\mathbf{e}_z$ in cylindrical coordinates

Figure 4.6.2 Orthonormal vectors $\mathbf{e}_\rho$, $\mathbf{e}_\theta$, $\mathbf{e}_\phi$ in spherical coordinates

Similarly, a point $(x, y, z)$ can be represented in spherical coordinates $(\rho, \theta, \phi)$, where $x = \rho \sin\phi \cos\theta$, $y = \rho \sin\phi \sin\theta$, $z = \rho \cos\phi$. At each point $(\rho, \theta, \phi)$, let $\mathbf{e}_\rho$, $\mathbf{e}_\theta$, $\mathbf{e}_\phi$ be unit vectors in the direction of increasing $\rho$, $\theta$, $\phi$, respectively (see Figure 4.6.2). Then the vectors $\mathbf{e}_\rho$, $\mathbf{e}_\theta$, $\mathbf{e}_\phi$ are orthonormal. By the right-hand rule, we see that $\mathbf{e}_\theta \times \mathbf{e}_\rho = \mathbf{e}_\phi$.

We can now summarize the expressions for the gradient, divergence, curl and Laplacian in Cartesian, cylindrical and spherical coordinates in the following tables:

182 CHAPTER 4. LINE AND SURFACE INTEGRALS

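Cartesian (x, y, z): Scalar function F; Vector field $\mathbf{f} = f_1\,\mathbf{i} + f_2\,\mathbf{j} + f_3\,\mathbf{k}$

gradient: $$\nabla F = \frac{\partial F}{\partial x}\,\mathbf{i} + \frac{\partial F}{\partial y}\,\mathbf{j} + \frac{\partial F}{\partial z}\,\mathbf{k}$$

divergence: $$\nabla \cdot \mathbf{f} = \frac{\partial f_1}{\partial x} + \frac{\partial f_2}{\partial y} + \frac{\partial f_3}{\partial z}$$

curl: $$\nabla \times \mathbf{f} = \left(\frac{\partial f_3}{\partial y} - \frac{\partial f_2}{\partial z}\right)\mathbf{i} + \left(\frac{\partial f_1}{\partial z} - \frac{\partial f_3}{\partial x}\right)\mathbf{j} + \left(\frac{\partial f_2}{\partial x} - \frac{\partial f_1}{\partial y}\right)\mathbf{k}$$

Laplacian: $$\Delta F = \frac{\partial^2 F}{\partial x^2} + \frac{\partial^2 F}{\partial y^2} + \frac{\partial^2 F}{\partial z^2}$$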

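Cylindrical (r, θ, z): Scalar function F; Vector field $\mathbf{f} = f_r\,\mathbf{e}_r + f_\theta\,\mathbf{e}_\theta + f_z\,\mathbf{e}_z$

gradient: $$\nabla F = \frac{\partial F}{\partial r}\,\mathbf{e}_r + \frac{1}{r}\frac{\partial F}{\partial \theta}\,\mathbf{e}_\theta + \frac{\partial F}{\partial z}\,\mathbf{e}_z$$

divergence: $$\nabla \cdot \mathbf{f} = \frac{1}{r}\frac{\partial}{\partial r}(r f_r) + \frac{1}{r}\frac{\partial f_\theta}{\partial \theta} + \frac{\partial f_z}{\partial z}$$

curl: $$\nabla \times \mathbf{f} = \left(\frac{1}{r}\frac{\partial f_z}{\partial \theta} - \frac{\partial f_\theta}{\partial z}\right)\mathbf{e}_r + \left(\frac{\partial f_r}{\partial z} - \frac{\partial f_z}{\partial r}\right)\mathbf{e}_\theta + \frac{1}{r}\left(\frac{\partial}{\partial r}(r f_\theta) - \frac{\partial f_r}{\partial \theta}\right)\mathbf{e}_z$$

Laplacian: $$\Delta F = \frac{1}{r}\frac{\partial}{\partial r}\left(r\,\frac{\partial F}{\partial r}\right) + \frac{1}{r^2}\frac{\partial^2 F}{\partial \theta^2} + \frac{\partial^2 F}{\partial z^2}$$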

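Spherical (ρ, θ, φ): Scalar function F; Vector field $\mathbf{f} = f_\rho\,\mathbf{e}_\rho + f_\theta\,\mathbf{e}_\theta + f_\phi\,\mathbf{e}_\phi$

gradient: $$\nabla F = \frac{\partial F}{\partial \rho}\,\mathbf{e}_\rho + \frac{1}{\rho \sin\phi}\frac{\partial F}{\partial \theta}\,\mathbf{e}_\theta + \frac{1}{\rho}\frac{\partial F}{\partial \phi}\,\mathbf{e}_\phi$$

divergence: $$\nabla \cdot \mathbf{f} = \frac{1}{\rho^2}\frac{\partial}{\partial \rho}(\rho^2 f_\rho) + \frac{1}{\rho \sin\phi}\frac{\partial f_\theta}{\partial \theta} + \frac{1}{\rho \sin\phi}\frac{\partial}{\partial \phi}(\sin\phi\, f_\phi)$$

curl: $$\nabla \times \mathbf{f} = \frac{1}{\rho \sin\phi}\left(\frac{\partial}{\partial \phi}(\sin\phi\, f_\theta) - \frac{\partial f_\phi}{\partial \theta}\right)\mathbf{e}_\rho + \frac{1}{\rho}\left(\frac{\partial}{\partial \rho}(\rho f_\phi) - \frac{\partial f_\rho}{\partial \phi}\right)\mathbf{e}_\theta + \left(\frac{1}{\rho \sin\phi}\frac{\partial f_\rho}{\partial \theta} - \frac{1}{\rho}\frac{\partial}{\partial \rho}(\rho f_\theta)\right)\mathbf{e}_\phi$$

Laplacian: $$\Delta F = \frac{1}{\rho^2}\frac{\partial}{\partial \rho}\left(\rho^2\,\frac{\partial F}{\partial \rho}\right) + \frac{1}{\rho^2 \sin^2\phi}\frac{\partial^2 F}{\partial \theta^2} + \frac{1}{\rho^2 \sin\phi}\frac{\partial}{\partial \phi}\left(\sin\phi\,\frac{\partial F}{\partial \phi}\right)$$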

The derivation of the above formulas for cylindrical and spherical coordinates is straightforward but extremely tedious. The basic idea is to take the Cartesian equivalent of the quantity in question and to substitute into that formula using the appropriate coordinate transformation. As an example, we will derive the formula for the gradient in spherical coordinates.


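Goal: Show that the gradient of a real-valued function F(ρ, θ, φ) in spherical coordinates is:

$$\nabla F = \frac{\partial F}{\partial \rho}\,\mathbf{e}_\rho + \frac{1}{\rho \sin\phi}\frac{\partial F}{\partial \theta}\,\mathbf{e}_\theta + \frac{1}{\rho}\frac{\partial F}{\partial \phi}\,\mathbf{e}_\phi$$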

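Idea: In the Cartesian gradient formula $\nabla F(x, y, z) = \frac{\partial F}{\partial x}\,\mathbf{i} + \frac{\partial F}{\partial y}\,\mathbf{j} + \frac{\partial F}{\partial z}\,\mathbf{k}$, put the Cartesian basis vectors i, j, k in terms of the spherical coordinate basis vectors $\mathbf{e}_\rho$, $\mathbf{e}_\theta$, $\mathbf{e}_\phi$ and functions of ρ, θ and φ. Then put the partial derivatives $\frac{\partial F}{\partial x}$, $\frac{\partial F}{\partial y}$, $\frac{\partial F}{\partial z}$ in terms of $\frac{\partial F}{\partial \rho}$, $\frac{\partial F}{\partial \theta}$, $\frac{\partial F}{\partial \phi}$ and functions of ρ, θ and φ.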

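Step 1: Get formulas for $\mathbf{e}_\rho$, $\mathbf{e}_\theta$, $\mathbf{e}_\phi$ in terms of i, j, k.

We can see from Figure 4.6.2 that the unit vector $\mathbf{e}_\rho$ in the ρ direction at a general point (ρ, θ, φ) is $\mathbf{e}_\rho = \frac{\mathbf{r}}{\|\mathbf{r}\|}$, where $\mathbf{r} = x\,\mathbf{i} + y\,\mathbf{j} + z\,\mathbf{k}$ is the position vector of the point in Cartesian coordinates. Thus,

$$\mathbf{e}_\rho = \frac{\mathbf{r}}{\|\mathbf{r}\|} = \frac{x\,\mathbf{i} + y\,\mathbf{j} + z\,\mathbf{k}}{\sqrt{x^2 + y^2 + z^2}} ,$$

so using x = ρ sin φ cos θ, y = ρ sin φ sin θ, z = ρ cos φ, and $\rho = \sqrt{x^2 + y^2 + z^2}$, we get:

$$\mathbf{e}_\rho = \sin\phi \cos\theta\,\mathbf{i} + \sin\phi \sin\theta\,\mathbf{j} + \cos\phi\,\mathbf{k}$$

Now, since the angle θ is measured in the xy-plane, the unit vector $\mathbf{e}_\theta$ in the θ direction must be parallel to the xy-plane. That is, $\mathbf{e}_\theta$ is of the form $a\,\mathbf{i} + b\,\mathbf{j} + 0\,\mathbf{k}$. To figure out what a and b are, note that since $\mathbf{e}_\theta \perp \mathbf{e}_\rho$, then in particular $\mathbf{e}_\theta \perp \mathbf{e}_\rho$ when $\mathbf{e}_\rho$ is in the xy-plane. That occurs when the angle φ is π/2. Putting φ = π/2 into the formula for $\mathbf{e}_\rho$ gives $\mathbf{e}_\rho = \cos\theta\,\mathbf{i} + \sin\theta\,\mathbf{j} + 0\,\mathbf{k}$, and we see that a vector perpendicular to that is $-\sin\theta\,\mathbf{i} + \cos\theta\,\mathbf{j} + 0\,\mathbf{k}$. Since this vector is also a unit vector and points in the (positive) θ direction, it must be $\mathbf{e}_\theta$:

$$\mathbf{e}_\theta = -\sin\theta\,\mathbf{i} + \cos\theta\,\mathbf{j} + 0\,\mathbf{k}$$

Lastly, since $\mathbf{e}_\phi = \mathbf{e}_\theta \times \mathbf{e}_\rho$, we get:

$$\mathbf{e}_\phi = \cos\phi \cos\theta\,\mathbf{i} + \cos\phi \sin\theta\,\mathbf{j} - \sin\phi\,\mathbf{k}$$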
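The three formulas from Step 1 can be checked numerically. The following is only a sketch in Python: the function names and the sample angles are arbitrary choices, not part of the text.

```python
import math

def spherical_basis(theta, phi):
    """Cartesian components of e_rho, e_theta, e_phi at angles (theta, phi)."""
    st, ct = math.sin(theta), math.cos(theta)
    sp, cp = math.sin(phi), math.cos(phi)
    e_rho = (sp * ct, sp * st, cp)
    e_theta = (-st, ct, 0.0)
    e_phi = (cp * ct, cp * st, -sp)
    return e_rho, e_theta, e_phi

def dot(a, b):
    """Dot product of two 3-tuples."""
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    """Cross product a x b of two 3-tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def max_abs_diff(a, b):
    """Largest componentwise difference between two 3-tuples."""
    return max(abs(x - y) for x, y in zip(a, b))

if __name__ == "__main__":
    e_rho, e_theta, e_phi = spherical_basis(theta=0.9, phi=1.2)
    # Orthonormality: unit lengths and vanishing pairwise dot products.
    print(dot(e_rho, e_rho), dot(e_theta, e_theta), dot(e_phi, e_phi))
    print(dot(e_rho, e_theta), dot(e_rho, e_phi), dot(e_theta, e_phi))
    # e_theta x e_rho should agree with e_phi up to rounding error.
    print(max_abs_diff(cross(e_theta, e_rho), e_phi))
```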

Step 2: Use the three formulas from Step 1 to solve for i, j, k in terms of $\mathbf{e}_\rho$, $\mathbf{e}_\theta$, $\mathbf{e}_\phi$.

This comes down to solving a system of three equations in three unknowns. There are many ways of doing this, but we will do it by combining the formulas for $\mathbf{e}_\rho$ and $\mathbf{e}_\phi$ to eliminate k, which will give us an equation involving just i and j. This, with the formula for $\mathbf{e}_\theta$, will then leave us with a system of two equations in two unknowns (i and j), which we will use to solve first for j then for i. Lastly, we will solve for k.

First, note that

$$\sin\phi\,\mathbf{e}_\rho + \cos\phi\,\mathbf{e}_\phi = \cos\theta\,\mathbf{i} + \sin\theta\,\mathbf{j}$$


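so that

$$\sin\theta\,(\sin\phi\,\mathbf{e}_\rho + \cos\phi\,\mathbf{e}_\phi) + \cos\theta\,\mathbf{e}_\theta = (\sin^2\theta + \cos^2\theta)\,\mathbf{j} = \mathbf{j} ,$$

and so:

$$\mathbf{j} = \sin\phi \sin\theta\,\mathbf{e}_\rho + \cos\theta\,\mathbf{e}_\theta + \cos\phi \sin\theta\,\mathbf{e}_\phi$$

Likewise, we see that

$$\cos\theta\,(\sin\phi\,\mathbf{e}_\rho + \cos\phi\,\mathbf{e}_\phi) - \sin\theta\,\mathbf{e}_\theta = (\cos^2\theta + \sin^2\theta)\,\mathbf{i} = \mathbf{i} ,$$

and so:

$$\mathbf{i} = \sin\phi \cos\theta\,\mathbf{e}_\rho - \sin\theta\,\mathbf{e}_\theta + \cos\phi \cos\theta\,\mathbf{e}_\phi$$

Lastly, we see that:

$$\mathbf{k} = \cos\phi\,\mathbf{e}_\rho - \sin\phi\,\mathbf{e}_\phi$$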
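As a cross-check on the elimination in Step 2 (this is an alternative shortcut, not the method used in the text), the Step 1 formulas can be written in matrix form. Because $\mathbf{e}_\rho$, $\mathbf{e}_\theta$, $\mathbf{e}_\phi$ are orthonormal, the coefficient matrix A below is orthogonal, so its inverse is its transpose. The symbol A is introduced here only for illustration.

```latex
\begin{pmatrix} \mathbf{e}_\rho \\ \mathbf{e}_\theta \\ \mathbf{e}_\phi \end{pmatrix}
= A \begin{pmatrix} \mathbf{i} \\ \mathbf{j} \\ \mathbf{k} \end{pmatrix},
\qquad
A = \begin{pmatrix}
\sin\phi\cos\theta & \sin\phi\sin\theta & \cos\phi \\
-\sin\theta & \cos\theta & 0 \\
\cos\phi\cos\theta & \cos\phi\sin\theta & -\sin\phi
\end{pmatrix},
\qquad
A A^{T} = I
\;\Longrightarrow\;
\begin{pmatrix} \mathbf{i} \\ \mathbf{j} \\ \mathbf{k} \end{pmatrix}
= A^{T} \begin{pmatrix} \mathbf{e}_\rho \\ \mathbf{e}_\theta \\ \mathbf{e}_\phi \end{pmatrix}
```

Reading off the rows of $A^{T}$ (the columns of A) reproduces the expressions for i, j and k obtained in Step 2; for instance, the third column $(\cos\phi,\ 0,\ -\sin\phi)$ gives $\mathbf{k} = \cos\phi\,\mathbf{e}_\rho - \sin\phi\,\mathbf{e}_\phi$.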

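Step 3: Get formulas for $\frac{\partial F}{\partial \rho}$, $\frac{\partial F}{\partial \theta}$, $\frac{\partial F}{\partial \phi}$ in terms of $\frac{\partial F}{\partial x}$, $\frac{\partial F}{\partial y}$, $\frac{\partial F}{\partial z}$.

By the Chain Rule, we have

$$\frac{\partial F}{\partial \rho} = \frac{\partial F}{\partial x}\frac{\partial x}{\partial \rho} + \frac{\partial F}{\partial y}\frac{\partial y}{\partial \rho} + \frac{\partial F}{\partial z}\frac{\partial z}{\partial \rho} ,$$

$$\frac{\partial F}{\partial \theta} = \frac{\partial F}{\partial x}\frac{\partial x}{\partial \theta} + \frac{\partial F}{\partial y}\frac{\partial y}{\partial \theta} + \frac{\partial F}{\partial z}\frac{\partial z}{\partial \theta} ,$$

$$\frac{\partial F}{\partial \phi} = \frac{\partial F}{\partial x}\frac{\partial x}{\partial \phi} + \frac{\partial F}{\partial y}\frac{\partial y}{\partial \phi} + \frac{\partial F}{\partial z}\frac{\partial z}{\partial \phi} ,$$

which yields:

$$\frac{\partial F}{\partial \rho} = \sin\phi \cos\theta\,\frac{\partial F}{\partial x} + \sin\phi \sin\theta\,\frac{\partial F}{\partial y} + \cos\phi\,\frac{\partial F}{\partial z}$$

$$\frac{\partial F}{\partial \theta} = -\rho \sin\phi \sin\theta\,\frac{\partial F}{\partial x} + \rho \sin\phi \cos\theta\,\frac{\partial F}{\partial y}$$

$$\frac{\partial F}{\partial \phi} = \rho \cos\phi \cos\theta\,\frac{\partial F}{\partial x} + \rho \cos\phi \sin\theta\,\frac{\partial F}{\partial y} - \rho \sin\phi\,\frac{\partial F}{\partial z}$$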

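Step 4: Use the three formulas from Step 3 to solve for $\frac{\partial F}{\partial x}$, $\frac{\partial F}{\partial y}$, $\frac{\partial F}{\partial z}$ in terms of $\frac{\partial F}{\partial \rho}$, $\frac{\partial F}{\partial \theta}$, $\frac{\partial F}{\partial \phi}$.

Again, this involves solving a system of three equations in three unknowns. Using a similar process of elimination as in Step 2, we get:

$$\frac{\partial F}{\partial x} = \frac{1}{\rho \sin\phi}\left(\rho \sin^2\phi \cos\theta\,\frac{\partial F}{\partial \rho} - \sin\theta\,\frac{\partial F}{\partial \theta} + \sin\phi \cos\phi \cos\theta\,\frac{\partial F}{\partial \phi}\right)$$

$$\frac{\partial F}{\partial y} = \frac{1}{\rho \sin\phi}\left(\rho \sin^2\phi \sin\theta\,\frac{\partial F}{\partial \rho} + \cos\theta\,\frac{\partial F}{\partial \theta} + \sin\phi \cos\phi \sin\theta\,\frac{\partial F}{\partial \phi}\right)$$

$$\frac{\partial F}{\partial z} = \frac{1}{\rho}\left(\rho \cos\phi\,\frac{\partial F}{\partial \rho} - \sin\phi\,\frac{\partial F}{\partial \phi}\right)$$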


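Step 5: Substitute the formulas for i, j, k from Step 2 and the formulas for $\frac{\partial F}{\partial x}$, $\frac{\partial F}{\partial y}$, $\frac{\partial F}{\partial z}$ from Step 4 into the Cartesian gradient formula $\nabla F(x, y, z) = \frac{\partial F}{\partial x}\,\mathbf{i} + \frac{\partial F}{\partial y}\,\mathbf{j} + \frac{\partial F}{\partial z}\,\mathbf{k}$.

Doing this last step is perhaps the most tedious, since it involves simplifying 3 × 3 + 3 × 3 + 2 × 2 = 22 terms! Namely,

$$\begin{aligned}
\nabla F ={}& \frac{1}{\rho \sin\phi}\left(\rho \sin^2\phi \cos\theta\,\frac{\partial F}{\partial \rho} - \sin\theta\,\frac{\partial F}{\partial \theta} + \sin\phi \cos\phi \cos\theta\,\frac{\partial F}{\partial \phi}\right)(\sin\phi \cos\theta\,\mathbf{e}_\rho - \sin\theta\,\mathbf{e}_\theta + \cos\phi \cos\theta\,\mathbf{e}_\phi) \\
&+ \frac{1}{\rho \sin\phi}\left(\rho \sin^2\phi \sin\theta\,\frac{\partial F}{\partial \rho} + \cos\theta\,\frac{\partial F}{\partial \theta} + \sin\phi \cos\phi \sin\theta\,\frac{\partial F}{\partial \phi}\right)(\sin\phi \sin\theta\,\mathbf{e}_\rho + \cos\theta\,\mathbf{e}_\theta + \cos\phi \sin\theta\,\mathbf{e}_\phi) \\
&+ \frac{1}{\rho}\left(\rho \cos\phi\,\frac{\partial F}{\partial \rho} - \sin\phi\,\frac{\partial F}{\partial \phi}\right)(\cos\phi\,\mathbf{e}_\rho - \sin\phi\,\mathbf{e}_\phi) ,
\end{aligned}$$

which we see has 8 terms involving $\mathbf{e}_\rho$, 6 terms involving $\mathbf{e}_\theta$, and 8 terms involving $\mathbf{e}_\phi$. But the algebra is straightforward and yields the desired result:

$$\nabla F = \frac{\partial F}{\partial \rho}\,\mathbf{e}_\rho + \frac{1}{\rho \sin\phi}\frac{\partial F}{\partial \theta}\,\mathbf{e}_\theta + \frac{1}{\rho}\frac{\partial F}{\partial \phi}\,\mathbf{e}_\phi$$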
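Since the algebra in Step 5 is long, a numerical spot check of the final formula can be reassuring. The sketch below is an illustration only: the sample function F, the evaluation point, and the use of central finite differences for the spherical partial derivatives are all assumptions made for this check, not part of the derivation.

```python
import math

def to_cartesian(rho, theta, phi):
    """Spherical (rho, theta, phi) to Cartesian (x, y, z), as in the text."""
    return (rho * math.sin(phi) * math.cos(theta),
            rho * math.sin(phi) * math.sin(theta),
            rho * math.cos(phi))

def spherical_basis(theta, phi):
    """Cartesian components of e_rho, e_theta, e_phi from Step 1."""
    st, ct = math.sin(theta), math.cos(theta)
    sp, cp = math.sin(phi), math.cos(phi)
    return (sp * ct, sp * st, cp), (-st, ct, 0.0), (cp * ct, cp * st, -sp)

def F(x, y, z):
    """Hypothetical sample scalar function, chosen only for the check."""
    return x * y + x * z * z

def grad_cartesian(x, y, z):
    """Exact Cartesian gradient of the sample F above."""
    return (y + z * z, x, 2.0 * x * z)

def grad_spherical_formula(rho, theta, phi, h=1e-6):
    """Evaluate the spherical gradient formula, returning Cartesian components."""
    def G(r, t, p):
        # F expressed as a function of the spherical coordinates.
        return F(*to_cartesian(r, t, p))
    # Central differences approximate the three spherical partial derivatives.
    dG_drho = (G(rho + h, theta, phi) - G(rho - h, theta, phi)) / (2 * h)
    dG_dtheta = (G(rho, theta + h, phi) - G(rho, theta - h, phi)) / (2 * h)
    dG_dphi = (G(rho, theta, phi + h) - G(rho, theta, phi - h)) / (2 * h)
    e_rho, e_theta, e_phi = spherical_basis(theta, phi)
    c_rho = dG_drho
    c_theta = dG_dtheta / (rho * math.sin(phi))
    c_phi = dG_dphi / rho
    return tuple(c_rho * e_rho[i] + c_theta * e_theta[i] + c_phi * e_phi[i]
                 for i in range(3))

if __name__ == "__main__":
    rho, theta, phi = 2.0, 0.8, 1.1  # arbitrary point away from the z-axis
    print(grad_cartesian(*to_cartesian(rho, theta, phi)))
    print(grad_spherical_formula(rho, theta, phi))
```

The two printed vectors should agree to several decimal places. The check must avoid points with sin φ = 0, where the θ term of the formula is undefined.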

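Example 4.19. In Example 4.17 we showed that $\nabla \|\mathbf{r}\|^2 = 2\,\mathbf{r}$ and $\Delta \|\mathbf{r}\|^2 = 6$, where $\mathbf{r}(x, y, z) = x\,\mathbf{i} + y\,\mathbf{j} + z\,\mathbf{k}$ in Cartesian coordinates. Verify that we get the same answers if we switch to spherical coordinates.

Solution: Since $\|\mathbf{r}\|^2 = x^2 + y^2 + z^2 = \rho^2$ in spherical coordinates, let $F(\rho, \theta, \phi) = \rho^2$ (so that $F(\rho, \theta, \phi) = \|\mathbf{r}\|^2$). The gradient of F in spherical coordinates is

$$\begin{aligned}
\nabla F &= \frac{\partial F}{\partial \rho}\,\mathbf{e}_\rho + \frac{1}{\rho \sin\phi}\frac{\partial F}{\partial \theta}\,\mathbf{e}_\theta + \frac{1}{\rho}\frac{\partial F}{\partial \phi}\,\mathbf{e}_\phi \\
&= 2\rho\,\mathbf{e}_\rho + \frac{1}{\rho \sin\phi}(0)\,\mathbf{e}_\theta + \frac{1}{\rho}(0)\,\mathbf{e}_\phi \\
&= 2\rho\,\mathbf{e}_\rho = 2\rho\,\frac{\mathbf{r}}{\|\mathbf{r}\|} , \text{ as we showed earlier, so} \\
&= 2\rho\,\frac{\mathbf{r}}{\rho} = 2\,\mathbf{r} , \text{ as expected.}
\end{aligned}$$

And the Laplacian is

$$\begin{aligned}
\Delta F &= \frac{1}{\rho^2}\frac{\partial}{\partial \rho}\left(\rho^2\,\frac{\partial F}{\partial \rho}\right) + \frac{1}{\rho^2 \sin^2\phi}\frac{\partial^2 F}{\partial \theta^2} + \frac{1}{\rho^2 \sin\phi}\frac{\partial}{\partial \phi}\left(\sin\phi\,\frac{\partial F}{\partial \phi}\right) \\
&= \frac{1}{\rho^2}\frac{\partial}{\partial \rho}(\rho^2 \cdot 2\rho) + \frac{1}{\rho^2 \sin^2\phi}(0) + \frac{1}{\rho^2 \sin\phi}\frac{\partial}{\partial \phi}(\sin\phi\,(0)) \\
&= \frac{1}{\rho^2}\frac{\partial}{\partial \rho}(2\rho^3) + 0 + 0 \\
&= \frac{1}{\rho^2}(6\rho^2) = 6 , \text{ as expected.}
\end{aligned}$$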
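For functions that depend on ρ alone, as in Example 4.19, only the first term of the spherical Laplacian survives, so the result is easy to confirm numerically. The sketch below assumes a purely radial F and approximates the derivatives by central differences; the function name `radial_laplacian` and the step size are illustrative choices.

```python
def radial_laplacian(F, rho, h=1e-3):
    """Approximate (1/rho^2) d/drho (rho^2 dF/drho) for a radial function F(rho)."""
    def flux(r):
        # r^2 * F'(r), with F'(r) estimated by a central difference.
        return r * r * (F(r + h) - F(r - h)) / (2 * h)
    # Differentiate the flux once more, again by a central difference.
    return (flux(rho + h) - flux(rho - h)) / (2 * h * rho * rho)

if __name__ == "__main__":
    rho = 1.5  # arbitrary sample radius
    # F = rho^2 as in Example 4.19: the value should be close to 6.
    print(radial_laplacian(lambda r: r ** 2, rho))
    # F = rho^3: the value should be close to 12 * rho.
    print(radial_laplacian(lambda r: r ** 3, rho), 12 * rho)
```

The second case corresponds to $(x^2 + y^2 + z^2)^{3/2} = \rho^3$, whose Laplacian is $\frac{1}{\rho^2}\frac{\partial}{\partial \rho}(3\rho^4) = 12\rho$.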


Exercises

A

For Exercises 1-6, find the Laplacian of the function f(x, y, z) in Cartesian coordinates.

1. $f(x, y, z) = x + y + z$   2. $f(x, y, z) = x^5$   3. $f(x, y, z) = (x^2 + y^2 + z^2)^{3/2}$

4. $f(x, y, z) = e^{x+y+z}$   5. $f(x, y, z) = x^3 + y^3 + z^3$   6. $f(x, y, z) = e^{-x^2-y^2-z^2}$

7. Find the Laplacian of the function in Exercise 3 in spherical coordinates.

8. Find the Laplacian of the function in Exercise 6 in spherical coordinates.

9. Let $f(x, y, z) = \dfrac{z}{x^2 + y^2}$ in Cartesian coordinates. Find ∇f in cylindrical coordinates.

10. For $\mathbf{f}(r, \theta, z) = r\,\mathbf{e}_r + z \sin\theta\,\mathbf{e}_\theta + rz\,\mathbf{e}_z$ in cylindrical coordinates, find div f and curl f.

11. For $\mathbf{f}(\rho, \theta, \phi) = \mathbf{e}_\rho + \rho \cos\theta\,\mathbf{e}_\theta + \rho\,\mathbf{e}_\phi$ in spherical coordinates, find div f and curl f.

B

For Exercises 12-23, prove the given formula ($r = \|\mathbf{r}\|$ is the length of the position vector field $\mathbf{r}(x, y, z) = x\,\mathbf{i} + y\,\mathbf{j} + z\,\mathbf{k}$).

12. $\nabla(1/r) = -\mathbf{r}/r^3$   13. $\Delta(1/r) = 0$   14. $\nabla \cdot (\mathbf{r}/r^3) = 0$   15. $\nabla(\ln r) = \mathbf{r}/r^2$

16. div (F + G) = div F + div G   17. curl (F + G) = curl F + curl G

18. div (f F) = f div F + F · ∇f   19. div (F × G) = G · curl F − F · curl G

20. div (∇f × ∇g) = 0   21. curl (f F) = f curl F + (∇f) × F

22. curl (curl F) = ∇(div F) − ∆F   23. ∆(f g) = f ∆g + g ∆f + 2(∇f · ∇g)

C

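24. Prove Theorem 4.17.

25. Derive the gradient formula in cylindrical coordinates: $\nabla F = \dfrac{\partial F}{\partial r}\,\mathbf{e}_r + \dfrac{1}{r}\dfrac{\partial F}{\partial \theta}\,\mathbf{e}_\theta + \dfrac{\partial F}{\partial z}\,\mathbf{e}_z$

26. Use $\mathbf{f} = u\,\nabla v$ in the Divergence Theorem to prove:

(a) Green’s first identity: $\displaystyle\iiint_S (u\,\Delta v + (\nabla u) \cdot (\nabla v))\,dV = \iint_\Sigma (u\,\nabla v) \cdot d\boldsymbol{\sigma}$

(b) Green’s second identity: $\displaystyle\iiint_S (u\,\Delta v - v\,\Delta u)\,dV = \iint_\Sigma (u\,\nabla v - v\,\nabla u) \cdot d\boldsymbol{\sigma}$

27. Suppose that ∆u = 0 (i.e. u is harmonic) over $\mathbb{R}^3$. Define the normal derivative $\frac{\partial u}{\partial n}$ of u over a closed surface Σ with outward unit normal vector n by $\frac{\partial u}{\partial n} = D_{\mathbf{n}} u = \mathbf{n} \cdot \nabla u$. Show that $\displaystyle\iint_\Sigma \frac{\partial u}{\partial n}\,d\sigma = 0$. (Hint: Use Green’s second identity.)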

Bibliography

Abbott, E.A., Flatland, 7th edition. New York: Dover Publications, Inc., 1952

Classic tale about a creature living in a 2-dimensional world who encounters a higher-dimensional creature, with lots of humor thrown in.

Anton, H. and C. Rorres, Elementary Linear Algebra: Applications Version, 8th edition. New York: John Wiley & Sons, 2000

Standard treatment of elementary linear algebra.

Bazaraa, M.S., H.D. Sherali and C.M. Shetty, Nonlinear Programming: Theory and

Algorithms, 2nd edition. New York: John Wiley & Sons, 1993

Thorough treatment of nonlinear optimization.

Farin, G., Curves and Surfaces for Computer Aided Geometric Design: A Practical

Guide, 2nd edition. San Diego, CA: Academic Press, 1990

An intermediate-level book on curve and surface design.

Hecht, E., Optics, 2nd edition. Reading, MA: Addison-Wesley Publishing Co., 1987

An intermediate-level book on optics, covering a wide range of topics.

Hoel, P.G., S.C. Port and C.J. Stone, Introduction to Probability Theory, Boston, MA: Houghton Mifflin Co., 1971

An excellent introduction to elementary, calculus-based probability theory. Lots of good

exercises.

Jackson, J.D., Classical Electrodynamics, 2nd edition. New York: John Wiley & Sons,

1975

An advanced book on electromagnetism, famous for being intimidating. Most of the

mathematics will be understandable after reading the present book.

Marion, J.B., Classical Dynamics of Particles and Systems, 2nd edition. New York:

Academic Press, 1970

Standard intermediate-level treatment of classical mechanics. Very thorough.

O’Neill, B., Elementary Differential Geometry, New York: Academic Press, 1966

Intermediate-level book on differential geometry, with a modern approach based on differential forms.


Pogorelov, A.V., Analytical Geometry, Moscow: Mir Publishers, 1980

An intermediate/advanced book on analytic geometry.

Press, W.H., S.A. Teukolsky, W.T. Vetterling and B.P. Flannery, Numerical Recipes

in FORTRAN: The Art of Scientific Computing, 2nd edition. Cambridge, UK:

Cambridge University Press, 1992

An excellent source of information on numerical methods for solving a wide variety of

problems. Though all the examples are in the FORTRAN programming language, the code

is clear enough to implement in the language of your choice.

Protter, M.H. and C.B. Morrey, Analytic Geometry, 2nd edition. Reading, MA:

Addison-Wesley Publishing Co., 1975

Thorough treatment of elementary analytic geometry, with a rigor not found in most

recent books.

Ralston, A. and P. Rabinowitz, A First Course in Numerical Analysis, 2nd edition.

New York: McGraw-Hill, 1978

Standard treatment of elementary numerical analysis.

Reitz, J.R., F.J. Milford and R.W. Christy, Foundations of Electromagnetic Theory,

3rd edition. Reading, MA: Addison-Wesley Publishing Co., 1979

Intermediate text on electromagnetism.

Schey, H.M., Div, Grad, Curl, and All That: An Informal Text on Vector Calculus, New

York: W.W. Norton & Co., 1973

Very intuitive approach to the subject, from a physicist’s viewpoint. Highly recommended.

Taylor, A.E. and W.R. Mann, Advanced Calculus, 2nd edition. New York: John Wiley

& Sons, 1972

Excellent treatment of n-dimensional calculus. A good book to study after the present

book. Many intriguing exercises.

Uspensky, J.V., Theory of Equations, New York: McGraw-Hill, 1948

A classic on the subject, discussing many interesting topics.

Weinberger, H.F., A First Course in Partial Differential Equations, New York: John

Wiley & Sons, 1965

A good introduction to the vast subject of partial differential equations.

Welchons, A.M. and W.R. Krickenberger, Solid Geometry, Boston, MA: Ginn & Co.,

1936

A very thorough treatment of 3-dimensional geometry from an elementary perspective,

includes many topics which (sadly) do not seem to be taught anymore.

Appendix A

Answers and Hints to Selected Exercises

Chapter 1

Section 1.1 (p. 8)

1. (a) $\sqrt{5}$ (b) $\sqrt{5}$ (c) $\sqrt{17}$ (d) 1 (e) $2\sqrt{17}$   2. Yes   3. No

Section 1.2 (p. 14)

1. (a) (−4, 4, −3) (b) (2, 6, −1) (c) $\left(\frac{-1}{\sqrt{30}}, \frac{5}{\sqrt{30}}, \frac{-2}{\sqrt{30}}\right)$ (d) $\frac{\sqrt{41}}{2}$ (e) $\frac{\sqrt{41}}{2}$ (f) (14, −6, 8) (g) (−7, 3, −4) (h) (−1, −6, 1) (i) (−2, −4, 2) (j) No.

3. No. $\|\mathbf{v}\| + \|\mathbf{w}\|$ is larger.

Section 1.3 (p. 18)

1. 10   3. 73.4°   5. 90°   7. 0°   9. Yes, since v · w = 0.

11. $|\mathbf{v} \cdot \mathbf{w}| = 0 < \sqrt{21}\,\sqrt{5} = \|\mathbf{v}\|\,\|\mathbf{w}\|$

13. $\|\mathbf{v} + \mathbf{w}\| = \sqrt{26} < \sqrt{21} + \sqrt{5} = \|\mathbf{v}\| + \|\mathbf{w}\|$

15. Hint: use Definition 1.6.   24. Hint: See Theorem 1.10(c).

Section 1.4 (p. 29)

1. (−5, −23, −24)   3. (8, 4, −5)   5. 0   7. 16.72   9. $4\sqrt{5}$   11. 9   13. 0 and (8, −10, 2)   15. 14

Section 1.5 (p. 39)

1. (a) (2, 3, 2) + t(5, 4, −3) (b) x = 2 + 5t, y = 3 + 4t, z = 2 − 3t (c) $\frac{x-2}{5} = \frac{y-3}{4} = \frac{z-2}{-3}$

3. (a) (2, 1, 3) + t(1, 0, 1) (b) x = 2 + t, y = 1, z = 3 + t (c) x − 2 = z − 3, y = 1

5. x = 1 + 2t, y = −2 + 7t, z = −3 + 8t   7. 7.65   9. (1, 2, 3)

11. 4x − 4y + 3z − 10 = 0   13. x − 2y − z + 2 = 0   15. 11x − 24y + 21z − 26 = 0   17. $9/\sqrt{35}$

19. x = 5t, y = 2 + 3t, z = −7t   21. (10, −2, 1)

Section 1.6 (p. 46)

1. radius: 1, center: (2, 3, 5)   3. radius: 5, center: (−1, −1, −1)   5. No intersection.

7. circle $x^2 + y^2 = 4$ in the planes $z = \pm\sqrt{5}$

9. lines $\frac{x}{a} = \frac{y}{b}$, z = 0 and $\frac{x}{a} = -\frac{y}{b}$, z = 0   13. $\left(\frac{2a}{2-c}, \frac{2b}{2-c}, 0\right)$

Section 1.7 (p. 50)

1. (a) $(4, \frac{\pi}{3}, -1)$ (b) $(\sqrt{17}, \frac{\pi}{3}, 1.816)$

3. (a) $(2\sqrt{7}, \frac{11\pi}{6}, 0)$ (b) $(2\sqrt{7}, \frac{11\pi}{6}, \frac{\pi}{2})$

5. (a) $r^2 + z^2 = 25$ (b) ρ = 5   7. (a) $r^2 + 9z^2 = 36$ (b) $\rho^2(1 + 8\cos^2\phi) = 36$

10. (a, θ, a cot φ)   12. Hint: Use the distance formula for Cartesian coordinates.

Section 1.8 (p. 57)

1. $\mathbf{f}'(t) = (1, 2t, 3t^2)$; x = 1 + t, y = z = 1

3. $\mathbf{f}'(t) = (-2\sin 2t, 2\cos 2t, 1)$; x = 1, y = 2t, z = t

5. v(t) = (1, 1 − cos t, sin t), a(t) = (0, sin t, cos t)

9. (a) Line parallel to c (b) Half-line parallel to c (c) Hint: Think of the functions as position vectors.

15. Hint: Theorem 1.16

Section 1.9 (p. 63)

1. $\frac{3\pi\sqrt{5}}{2}$   3. $\frac{2}{27}(13^{3/2} - 8)$   5. Replace t by $\frac{\left(\frac{27s+16}{2}\right)^{2/3} - 4}{9}$

6. Hint: Use Theorem 1.20(e), Example 1.37, and Theorem 1.16   7. Hint: Use Exercise 6.

9. Hint: Use $\mathbf{f}'(t) = \|\mathbf{f}'(t)\|\,\mathbf{T}$, differentiate that to get $\mathbf{f}''(t)$, put those expressions into $\mathbf{f}'(t) \times \mathbf{f}''(t)$, then write $\mathbf{T}'(t)$ in terms of N(t).

11. $\mathbf{T}(t) = \frac{1}{\sqrt{2}}(-\sin t, \cos t, 1)$, $\mathbf{N}(t) = (-\cos t, -\sin t, 0)$, $\mathbf{B}(t) = \frac{1}{\sqrt{2}}(\sin t, -\cos t, 1)$, κ(t) = 1/2

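Chapter 2

Section 2.1 (p. 70)

1. domain: $\mathbb{R}^2$, range: [−1, ∞)   3. domain: $\{(x, y) : x^2 + y^2 \ge 4\}$, range: [0, ∞)

5. domain: $\mathbb{R}^3$, range: [−1, 1]   7. 1   9. does not exist   11. 2   13. 2   15. 0   17. does not exist

Section 2.2 (p. 74)

1. $\frac{\partial f}{\partial x} = 2x$, $\frac{\partial f}{\partial y} = 2y$

3. $\frac{\partial f}{\partial x} = x(x^2 + y + 4)^{-1/2}$, $\frac{\partial f}{\partial y} = \frac{1}{2}(x^2 + y + 4)^{-1/2}$

5. $\frac{\partial f}{\partial x} = ye^{xy} + y$, $\frac{\partial f}{\partial y} = xe^{xy} + x$

7. $\frac{\partial f}{\partial x} = 4x^3$, $\frac{\partial f}{\partial y} = 0$

9. $\frac{\partial f}{\partial x} = x(x^2 + y^2)^{-1/2}$, $\frac{\partial f}{\partial y} = y(x^2 + y^2)^{-1/2}$

11. $\frac{\partial f}{\partial x} = \frac{2x}{3}(x^2 + y + 4)^{-2/3}$, $\frac{\partial f}{\partial y} = \frac{1}{3}(x^2 + y + 4)^{-2/3}$

13. $\frac{\partial f}{\partial x} = -2xe^{-(x^2+y^2)}$, $\frac{\partial f}{\partial y} = -2ye^{-(x^2+y^2)}$

15. $\frac{\partial f}{\partial x} = y\cos(xy)$, $\frac{\partial f}{\partial y} = x\cos(xy)$

17. $\frac{\partial^2 f}{\partial x^2} = 2$, $\frac{\partial^2 f}{\partial y^2} = 2$, $\frac{\partial^2 f}{\partial x\,\partial y} = 0$

19. $\frac{\partial^2 f}{\partial x^2} = (y + 4)(x^2 + y + 4)^{-3/2}$, $\frac{\partial^2 f}{\partial y^2} = -\frac{1}{4}(x^2 + y + 4)^{-3/2}$, $\frac{\partial^2 f}{\partial x\,\partial y} = -\frac{1}{2}x(x^2 + y + 4)^{-3/2}$

21. $\frac{\partial^2 f}{\partial x^2} = y^2 e^{xy}$, $\frac{\partial^2 f}{\partial y^2} = x^2 e^{xy}$, $\frac{\partial^2 f}{\partial x\,\partial y} = (1 + xy)e^{xy} + 1$

23. $\frac{\partial^2 f}{\partial x^2} = 12x^2$, $\frac{\partial^2 f}{\partial y^2} = 0$, $\frac{\partial^2 f}{\partial x\,\partial y} = 0$

25. $\frac{\partial^2 f}{\partial x^2} = -x^{-2}$, $\frac{\partial^2 f}{\partial y^2} = -y^{-2}$, $\frac{\partial^2 f}{\partial x\,\partial y} = 0$

Section 2.3 (p. 77)

1. 2x + 3y − z − 3 = 0   3. −2x + y − z − 2 = 0   5. x + 2y = z

7. $\frac{1}{2}(x - 1) + \frac{4}{9}(y - 2) + \frac{\sqrt{11}}{12}\left(z - \frac{2\sqrt{11}}{3}\right) = 0$   9. 3x + 4y − 5z = 0

Section 2.4 (p. 82)

1. (2x, 2y)   3. $\left(\frac{x}{\sqrt{x^2+y^2+4}}, \frac{y}{\sqrt{x^2+y^2+4}}\right)$   5. (1/x, 1/y)

7. (yz cos(xyz), xz cos(xyz), xy cos(xyz))   9. (2x, 2y, 2z)   11. $2\sqrt{2}$   13. $\frac{1}{\sqrt{3}}$   15. $\sqrt{3}\cos(1)$

17. increase: (45, 20), decrease: (−45, −20)

Section 2.5 (p. 88)

1. local min. (1, 0); saddle pt. (−1, 0)

3. local min. (1, 1); local max. (−1, −1); saddle pts. (1, −1), (−1, 1)

5. local min. (1, −1); saddle pt. (0, 0)   7. local min. (0, 0)   9. local min. (−1, 1/2)

11. width = height = depth = 10   13. x = y = 4, z = 2

Section 2.6 (p. 95)

2. $(x_0, y_0) = (0, 0)$: → (0.2858, −0.3998); $(x_0, y_0) = (1, 1)$: → (1.03256, −1.94037)

Section 2.7 (p. 100)

1. min. $\left(\frac{-4}{\sqrt{5}}, \frac{-2}{\sqrt{5}}\right)$; max. $\left(\frac{4}{\sqrt{5}}, \frac{2}{\sqrt{5}}\right)$

3. min. $\left(\frac{20}{\sqrt{13}}, \frac{30}{\sqrt{13}}\right)$; max. $\left(-\frac{20}{\sqrt{13}}, -\frac{30}{\sqrt{13}}\right)$

4. min. $\left(\frac{-9}{\sqrt{5}}, 0, \frac{2}{\sqrt{5}}\right)$; max. $\left(\frac{9}{8}, \frac{\sqrt{59}}{4}, \frac{-1}{4}\right)$

5. $\frac{8abc}{3\sqrt{3}}$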

Chapter 3

Section 3.1 (p. 104)

1. 1   3. $\frac{7}{12}$   5. $\frac{7}{6}$   7. 5   9. $\frac{1}{2}$   11. 15

Section 3.2 (p. 109)

1. 1   3. 8 ln 2 − 3   5. $\frac{\pi}{4}$   6. $\frac{1}{4}$   7. 2   9. $\frac{1}{6}$   10. $\frac{6}{5}$

Section 3.3 (p. 112)

1. $\frac{9}{2}$   3. $(2\cos(\pi^2) + \pi^4 - 2)/4$   5. $\frac{1}{6}$   7. 6   10. $\frac{1}{3}$

Section 3.4 (p. 116)

1. The values should converge to ≈ 1.318. (Hint: In Java the exponential function $e^x$ can be obtained with Math.exp(x). Other languages have similar functions, otherwise use e = 2.7182818284590455 in your program.)

2. ≈ 1.146   3. ≈ 0.705   4. ≈ 0.168

Section 3.5 (p. 123)

1. 8π   3. $\frac{4\pi}{3}(8 - 3^{3/2})$   7. $1 - \frac{\sin 2}{2}$   9. 2πab

Section 3.6 (p. 127)

1. (1, 8/3)   3. $(0, \frac{4a}{3\pi})$   5. (0, 3π/16)   7. (0, 0, 5a/12)   9. (7/12, 7/12, 7/12)

Section 3.7 (p. 134)

1. $\sqrt{\pi}$   2. 1   6. Both are $\frac{n}{(n+1)^2(n+2)}$   7. $\frac{1}{n}$

Chapter 4

Section 4.1 (p. 142)

1. 1/2   3. 23   5. 24π   7. −2π   9. 2π   11. 4π

Section 4.2 (p. 149)

1. 0   3. No   4. Yes. $F(x, y) = \frac{x^2}{2} - \frac{y^2}{2}$   5. No   9. (b) No. Hint: Think of how F is defined.   10. Yes. F(x, y) = axy + bx + cy + d

Section 4.3 (p. 155)

1. 16/15   3. −5π   5. Yes. $F(x, y) = xy^2 + x^3$   7. Yes. $F(x, y) = 4x^2 y + 2y^2 + 3x$

Section 4.4 (p. 163)

1. 216π   2. 3   3. 12π/5   7. 15/4

Section 4.5 (p. 175)

1. $2\sqrt{2}\,\pi^2$   2. $(17\sqrt{17} - 5\sqrt{5})/3$   3. 2/5   4. 1   5. 2π(π − 1)   7. 67/15   9. 6   11. Yes   13. No

19. Hint: Think of how a vector field f(x, y) = P(x, y) i + Q(x, y) j in $\mathbb{R}^2$ can be extended in a natural way to be a vector field in $\mathbb{R}^3$.

Section 4.6 (p. 186)

1. 0   3. $12\sqrt{x^2 + y^2 + z^2}$   5. 6(x + y + z)   7. 12ρ   8. $(4\rho^2 - 6)e^{-\rho^2}$

9. $-\frac{2z}{r^3}\,\mathbf{e}_r + \frac{1}{r^2}\,\mathbf{e}_z$

11. div f = $\frac{2}{\rho} - \frac{\sin\theta}{\sin\phi} + \cot\phi$; curl f = $\cot\phi \cos\theta\,\mathbf{e}_\rho + 2\,\mathbf{e}_\theta - 2\cos\theta\,\mathbf{e}_\phi$

25. Hint: Start by showing that $\mathbf{e}_r = \cos\theta\,\mathbf{i} + \sin\theta\,\mathbf{j}$, $\mathbf{e}_\theta = -\sin\theta\,\mathbf{i} + \cos\theta\,\mathbf{j}$, $\mathbf{e}_z = \mathbf{k}$.

Appendix B

Proof of the Right-Hand Rule for the Cross Product

We will prove the right-hand rule for the cross product of two vectors in ℝ³.

For any vectors v and w in ℝ³, define a new vector, n(v, w), as follows:

1. If v and w are nonzero and not parallel, and θ is the angle between them, then n(v, w) is the vector in ℝ³ such that:

(a) the magnitude of n(v, w) is ‖v‖ ‖w‖ sin θ,

(b) n(v, w) is perpendicular to the plane containing v and w, and

(c) v, w, n(v, w) form a right-handed system.

2. If v and w are nonzero and parallel, then n(v, w) = 0.

3. If either v or w is 0, then n(v, w) = 0.

The goal is to show that n(v, w) = v × w for all v, w in ℝ³, which would prove the right-hand rule for the cross product (by part 1(c) of our definition). To do this, we will perform the following steps:

Step 1: Show that n(v, w) = v × w if v and w are any two of the basis vectors i, j, k.

This was already shown in Example 1.11 in Section 1.4.

Step 2: Show that n(av, bw) = ab(v × w) for any scalars a, b if v and w are any two of the basis vectors i, j, k.

If either a = 0 or b = 0 then n(av, bw) = 0 = ab(v × w), so the result holds. So assume that a ≠ 0 and b ≠ 0. Let v and w be any two of the basis vectors i, j, k. For example, we will show that the result holds for v = i and w = k (the other possibilities follow in a similar fashion).

For av = ai and bw = bk, the angle θ between av and bw is 90°. Hence the magnitude of n(av, bw), by definition, is ‖ai‖ ‖bk‖ sin 90° = |ab|. Also, by definition, n(av, bw) is perpendicular to the plane containing ai and bk, namely, the xz-plane. Thus, n(av, bw) must be a scalar multiple of j. Since its magnitude is |ab|, then n(av, bw) must be either |ab|j or −|ab|j.

There are four possibilities for the combinations of signs for a and b. We will consider the case when a > 0 and b > 0 (the other three possibilities are handled similarly).


In this case, n(av, bw) must be either abj or −abj. Now, since i, j, k form a right-handed system, then i, k, j form a left-handed system, and so i, k, −j form a right-handed system. Thus, ai, bk, −abj form a right-handed system (since a > 0, b > 0, and ab > 0). So since, by definition, ai, bk, n(ai, bk) form a right-handed system, and since n(ai, bk) has to be either abj or −abj, this means that we must have n(ai, bk) = −abj.

But we know that ai × bk = ab(i × k) = ab(−j) = −abj. Therefore, n(ai, bk) = ab(i × k), which is what we needed to show.

∴ n(av, bw) = ab(v × w)

Step 3: Show that n(u, v + w) = n(u, v) + n(u, w) for any vectors u, v, w.

If u = 0 then the result holds trivially since n(u, v + w), n(u, v) and n(u, w) are all the zero vector. If v = 0, then the result follows easily since n(u, v + w) = n(u, 0 + w) = n(u, w) = 0 + n(u, w) = n(u, 0) + n(u, w) = n(u, v) + n(u, w). A similar argument shows that the result holds if w = 0.

So now assume that u, v and w are all nonzero vectors. We will describe a geometric construction of n(u, v), which is shown in the figure below. Let P be a plane perpendicular to u. Multiply the vector v by the positive scalar ‖u‖, then project the vector ‖u‖v straight down onto the plane P. You can think of this projection vector (denoted by proj_P ‖u‖v) as the shadow of the vector ‖u‖v on the plane P, with the light source directly overhead the terminal point of ‖u‖v. If θ is the angle between u and v, then we see that proj_P ‖u‖v has magnitude ‖u‖ ‖v‖ sin θ, which is the magnitude of n(u, v). So rotating proj_P ‖u‖v by 90° in a counter-clockwise direction in the plane P gives a vector whose magnitude is the same as that of n(u, v) and which is perpendicular to proj_P ‖u‖v (and hence perpendicular to v). Since this vector is in P then it is also perpendicular to u. And we can see that u, v and this vector form a right-handed system. Hence this vector must be n(u, v). Note that this holds even if u ∥ v, since in that case θ = 0° and so sin θ = 0 which means that n(u, v) has magnitude 0, which is what we would expect.

[Figure: construction of n(u, v), showing u, v, ‖u‖v, its projection proj_P ‖u‖v onto the plane P, the angle θ, and n(u, v)]

Now apply this same geometric construction to get n(u, w) and n(u, v + w). Since ‖u‖(v + w) is the sum of the vectors ‖u‖v and ‖u‖w, then the projection vector proj_P ‖u‖(v + w) is the sum of the projection vectors proj_P ‖u‖v and proj_P ‖u‖w (to see this, using the shadow analogy again and the parallelogram rule for vector addition, think of how projecting a parallelogram onto a plane gives you a parallelogram in that plane). So then rotating all three projection vectors by 90° in a counter-clockwise direction in the plane P preserves that sum (see the figure below), which means that n(u, v + w) = n(u, v) + n(u, w).

[Figure: the vectors u, v, w, v + w, the scaled vectors ‖u‖v, ‖u‖w, ‖u‖(v + w), their projections onto the plane P, and the rotated vectors n(u, v), n(u, w), n(u, v + w)]

Step 4: Show that n(w, v) = −n(v, w) for any vectors v, w.

If v and w are nonzero and parallel, or if either is 0, then n(w, v) = 0 = −n(v, w), so the result holds. So assume that v and w are nonzero and not parallel. Then n(w, v) has magnitude ‖w‖ ‖v‖ sin θ, which is the same as the magnitude of n(v, w), and hence is the same as the magnitude of −n(v, w). By definition, n(v, w) is perpendicular to the plane containing w and v, and hence so is −n(v, w). Also, v, w, n(v, w) form a right-handed system, and so w, v, n(v, w) form a left-handed system, and hence w, v, −n(v, w) form a right-handed system. Thus, we have shown that −n(v, w) is a vector with the same magnitude as n(w, v) and is perpendicular to the plane containing w and v, and that w, v, −n(v, w) form a right-handed system. So by definition this means that −n(v, w) must be n(w, v).

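Step 5: Show that n(v, w) = v × w for all vectors v, w.

Write $\mathbf{v} = v_1\,\mathbf{i} + v_2\,\mathbf{j} + v_3\,\mathbf{k}$ and $\mathbf{w} = w_1\,\mathbf{i} + w_2\,\mathbf{j} + w_3\,\mathbf{k}$. Then by Steps 3 and 4, we have

$$\begin{aligned}
\mathbf{n}(\mathbf{v}, \mathbf{w}) &= \mathbf{n}(v_1\mathbf{i} + v_2\mathbf{j} + v_3\mathbf{k},\ w_1\mathbf{i} + w_2\mathbf{j} + w_3\mathbf{k}) \\
&= \mathbf{n}(v_1\mathbf{i} + v_2\mathbf{j} + v_3\mathbf{k},\ w_1\mathbf{i}) + \mathbf{n}(v_1\mathbf{i} + v_2\mathbf{j} + v_3\mathbf{k},\ w_2\mathbf{j} + w_3\mathbf{k}) \\
&= \mathbf{n}(v_1\mathbf{i} + v_2\mathbf{j} + v_3\mathbf{k},\ w_1\mathbf{i}) + \mathbf{n}(v_1\mathbf{i} + v_2\mathbf{j} + v_3\mathbf{k},\ w_2\mathbf{j}) + \mathbf{n}(v_1\mathbf{i} + v_2\mathbf{j} + v_3\mathbf{k},\ w_3\mathbf{k}) \\
&= -\mathbf{n}(w_1\mathbf{i},\ v_1\mathbf{i} + v_2\mathbf{j} + v_3\mathbf{k}) - \mathbf{n}(w_2\mathbf{j},\ v_1\mathbf{i} + v_2\mathbf{j} + v_3\mathbf{k}) - \mathbf{n}(w_3\mathbf{k},\ v_1\mathbf{i} + v_2\mathbf{j} + v_3\mathbf{k}).
\end{aligned}$$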

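We can use Steps 1 and 2 (together with Step 3 to split the second argument) to evaluate the three terms on the right side of the last equation above:

$$\begin{aligned}
-\mathbf{n}(w_1\mathbf{i},\ v_1\mathbf{i} + v_2\mathbf{j} + v_3\mathbf{k}) &= -\mathbf{n}(w_1\mathbf{i}, v_1\mathbf{i}) - \mathbf{n}(w_1\mathbf{i}, v_2\mathbf{j}) - \mathbf{n}(w_1\mathbf{i}, v_3\mathbf{k}) \\
&= -v_1 w_1\,\mathbf{n}(\mathbf{i}, \mathbf{i}) - v_2 w_1\,\mathbf{n}(\mathbf{i}, \mathbf{j}) - v_3 w_1\,\mathbf{n}(\mathbf{i}, \mathbf{k}) \\
&= -v_1 w_1\,(\mathbf{i} \times \mathbf{i}) - v_2 w_1\,(\mathbf{i} \times \mathbf{j}) - v_3 w_1\,(\mathbf{i} \times \mathbf{k}) \\
&= -v_1 w_1\,\mathbf{0} - v_2 w_1\,\mathbf{k} - v_3 w_1\,(-\mathbf{j}) \\
-\mathbf{n}(w_1\mathbf{i},\ v_1\mathbf{i} + v_2\mathbf{j} + v_3\mathbf{k}) &= -v_2 w_1\,\mathbf{k} + v_3 w_1\,\mathbf{j}
\end{aligned}$$

Similarly, we can calculate

$$-\mathbf{n}(w_2\mathbf{j},\ v_1\mathbf{i} + v_2\mathbf{j} + v_3\mathbf{k}) = v_1 w_2\,\mathbf{k} - v_3 w_2\,\mathbf{i}$$

and

$$-\mathbf{n}(w_3\mathbf{k},\ v_1\mathbf{i} + v_2\mathbf{j} + v_3\mathbf{k}) = -v_1 w_3\,\mathbf{j} + v_2 w_3\,\mathbf{i} .$$

Thus, putting it all together, we have

$$\begin{aligned}
\mathbf{n}(\mathbf{v}, \mathbf{w}) &= -v_2 w_1\,\mathbf{k} + v_3 w_1\,\mathbf{j} + v_1 w_2\,\mathbf{k} - v_3 w_2\,\mathbf{i} - v_1 w_3\,\mathbf{j} + v_2 w_3\,\mathbf{i} \\
&= (v_2 w_3 - v_3 w_2)\,\mathbf{i} + (v_3 w_1 - v_1 w_3)\,\mathbf{j} + (v_1 w_2 - v_2 w_1)\,\mathbf{k} \\
&= \mathbf{v} \times \mathbf{w} \quad \text{by definition of the cross product.}
\end{aligned}$$

∴ n(v, w) = v × w for all vectors v, w.

So since v, w, n(v, w) form a right-handed system, then v, w, v × w form a right-handed system, which completes the proof.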
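The conclusion can also be spot-checked computationally. The sketch below takes as an assumption the usual algebraic test for orientation, namely that three vectors a, b, c form a right-handed system when the determinant of the matrix with rows a, b, c is positive; the appendix itself argues geometrically and does not use this test. For c = v × w that determinant equals ‖v × w‖², which is positive whenever v and w are not parallel. The sample vectors are arbitrary.

```python
def cross(v, w):
    """Cross product v x w using the component formula from Step 5."""
    return (v[1] * w[2] - v[2] * w[1],
            v[2] * w[0] - v[0] * w[2],
            v[0] * w[1] - v[1] * w[0])

def det3(a, b, c):
    """Determinant of the 3x3 matrix whose rows are a, b, c."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))

if __name__ == "__main__":
    v = (1, 2, 3)    # arbitrary non-parallel sample vectors
    w = (-2, 0, 5)
    n = cross(v, w)
    # Positive determinant: v, w, v x w pass the orientation test assumed above.
    print(n, det3(v, w, n))
    # The basis-vector case used in Step 2: i x k should be -j.
    print(cross((1, 0, 0), (0, 0, 1)))
```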

Appendix C

3D Graphing with Gnuplot

Gnuplot is a free, open-source software package for producing a variety of graphs.

Versions are available for many operating systems. Below is a very brief tutorial on

how to use Gnuplot to graph functions of several variables.

INSTALLATION

1. Go to http://www.gnuplot.info/download.html and follow the links to download the latest version for your operating system. For Windows, you should get the Zip file with a name such as gp420win32.zip, which is version 4.2.0. All the examples we will discuss require at least version 4.2.0.

2. Install the downloaded file. For example, in Windows you would unzip the Zip file

you downloaded in Step 1 into some folder (use the “Use folder names” option if

extracting with WinZip).

RUNNING GNUPLOT

1. In Windows, run wgnuplot.exe from the folder (or bin folder) where you installed Gnuplot. In Linux, just type gnuplot in a terminal window.

2. You should now get a Gnuplot terminal with a gnuplot> command prompt. In

Windows this will appear in a new window, while in Linux it will appear in the

terminal window where the gnuplot command was run. For Windows, if the font

is unreadable you can change it by right-clicking on the text part of the Gnuplot

window and selecting the “Choose Font..” option. For example, the font “Courier”,

style “Regular”, size “12” is usually a good choice (that choice can be saved for

future sessions by right-clicking in the Gnuplot window again and selecting the

option to update wgnuplot.ini).

3. At the gnuplot> command prompt you can now run graphing commands, which

we will now describe.

GRAPHING FUNCTIONS

The usual way to create 3D graphs in Gnuplot is with the splot command:

splot <range> <comma-separated list of functions>


For a function z = f (x, y), <range> is the range of x and y values (and optionally the

range of z values) over which to plot. To specify an x range and a y range, use an

expression of the form [a : b][c : d], for some numbers a < b and c < d. This will cause

the graph to be plotted for a ≤ x ≤ b and c ≤ y ≤ d.

Function definitions use the x and y variables in combination with mathematical operators, listed below:

Symbol    Operation        Example      Result
+         Addition         2 + 3        5
-         Subtraction      3 - 2        1
*         Multiplication   2*3          6
/         Division         4/2          2
**        Power            2**3         2^3 = 8
exp(x)    e^x              exp(2)       e^2
log(x)    ln x             log(2)       ln 2
sin(x)    sin x            sin(pi/2)    1
cos(x)    cos x            cos(pi)      -1
tan(x)    tan x            tan(pi/4)    1

Example C.1. To graph the function $z = 2x^2 + y^2$ from x = −1 to x = 1 and from y = −2 to y = 2, type this at the gnuplot> prompt:

splot [-1:1][-2:2] 2*x**2 + y**2

The result is shown below:

[Figure: Gnuplot surface plot of 2*(x**2) + y**2 for −1 ≤ x ≤ 1, −2 ≤ y ≤ 2, with z ranging from 0 to 7]

Note that we had to type 2*x**2 to multiply 2 times $x^2$. For clarity, parentheses can be used to make sure the operations are being performed in the correct order:

splot [-1:1][-2:2] 2*(x**2) + y**2

In the above example, to also plot the function $z = e^{x+y}$ on the same graph, put a comma after the first function then append the new function:

splot [-1:1][-2:2] 2*(x**2) + y**2, exp(x+y)

By default, the x-axis and y-axis are not shown in the graph. To display the axes, use

this command before the splot command:

set zeroaxis

Also, by default the x- and y-axes are switched from their usual position. To show the

axes with the orientation which we have used throughout the text, use this command:

set view 60, 120, 1, 1

Also, to label the axes, use these commands:

set xlabel "x"

set ylabel "y"

set zlabel "z"

To show the level curves of the surface z = f (x, y) on both the surface and projected

onto the xy-plane, use this command:

set contour both

The default mesh size for the grid on the surface is 10 units. To get more of a colored/shaded surface, increase the mesh size (to, say, 25) like this:

set isosamples 25

Putting all this together, we get the following graph with these commands:

set zeroaxis

set view 60, 120, 1, 1

set xlabel "x"

set ylabel "y"

set zlabel "z"

set contour both

set isosamples 25

splot [-1:1][-2:2] 2*(x**2) + y**2, exp(x+y)

[Figure: Gnuplot plot of the surfaces 2*(x**2) + y**2 and exp(x+y) with labeled x-, y- and z-axes and contours; the key lists levels 1 to 6 for 2*(x**2) + y**2 and levels 5, 10, 15, 20 for exp(x+y)]

The numbers listed below the functions in the key in the upper right corner of the

graph are the “levels” of the level curves of the corresponding surface. That is, they

are the numbers c such that f (x, y) = c. If you do not want the function key displayed,

it can be turned off with this command: unset key
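Gnuplot picks the levels automatically, but you can also choose them yourself with the set cntrparam command. The sketch below assumes you want the level curves f (x, y) = 1, 2 and 4 for the first surface; those three values are picked only for illustration:

```gnuplot
# Draw level curves on the surface and on the xy-plane
set contour both

# Use hand-picked levels c = 1, 2, 4 (illustrative values)
set cntrparam levels discrete 1, 2, 4

set isosamples 25
splot [-1:1][-2:2] 2*(x**2) + y**2
```

The key should then list exactly those levels under the function.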

PARAMETRIC FUNCTIONS

Gnuplot has the ability to graph surfaces given in various parametric forms. For

example, for a surface parametrized in cylindrical coordinates

x = r cos θ, y = r sin θ, z = z

you would do the following:

set mapping cylindrical

set parametric

splot [a : b][c : d] v*cos(u),v*sin(u),f(u,v)

where the variable u represents θ, with a ≤ u ≤ b, the variable v represents r, with

c ≤ v ≤ d, and z = f (u, v) is some function of u and v.

Example C.2. The graph of the helicoid z = θ in Example 1.34 from Section 1.7 (p. 49)

was created using the following commands:

set mapping cylindrical

set parametric

set view 60, 120, 1, 1

set xyplane 0

set xlabel "x"

set ylabel "y"

set zlabel "z"

unset key

set isosamples 15

splot [0 : 4*pi][0 : 2] v*cos(u),v*sin(u),u

The command set xyplane 0 places the xy-plane at the bottom of the z-range, which here

is z = 0 (by default Gnuplot draws the xy-plane some distance below the lowest z

value). Looking at the graph, you will see that r varies

from 0 to 2, and θ varies from 0 to 4π.
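The same approach works for other parametrizations. As a rough sketch (not one of the graphs from the text), the unit sphere in spherical coordinates x = sin φ cos θ, y = sin φ sin θ, z = cos φ could be drawn by letting u represent θ and v represent φ:

```gnuplot
# Unit sphere: u plays the role of theta (0 to 2*pi), v the role of phi (0 to pi)
set parametric
set view 60, 120, 1, 1
set xlabel "x"
set ylabel "y"
set zlabel "z"
unset key
set isosamples 25
splot [0:2*pi][0:pi] sin(v)*cos(u), sin(v)*sin(u), cos(v)
```

Depending on the shape of the plot window, the sphere may look slightly stretched, since the axes are not necessarily drawn to the same scale.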

PRINTING AND SAVING

In Windows, to print a graph from Gnuplot, right-click on the titlebar of the graph’s

window, select “Options” and then the “Print..” option. To save a graph as, say, a PNG

file, go to the File menu on the main Gnuplot menubar, select “Output Device ...”,

enter png in the Terminal type? textfield, and click OK. Then, in the File menu again, select

the “Output ...” option, enter a filename (say, graph.png) in the Output filename?

textfield, and click OK. Now run your splot command again and you should see a file called

graph.png in the current directory (usually the directory where wgnuplot.exe is located,

though you can change that setting using the “Change Directory ...” option in

the File menu).

In Linux, to save the graph as a file called graph.png, you would issue the following

commands:

set terminal png

set output 'graph.png'

and then run your splot command. There are many terminal types (which determine

the output format). Run the command set terminal to see all the possible types. In

Linux, the postscript terminal type is popular, since the print quality is high and

there are many PostScript viewers available.
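Put together, a complete session for saving a graph might look like the sketch below. It assumes your copy of Gnuplot was built with the png terminal, and it reuses the paraboloid from the earlier example. The final set output command, given with no filename, closes the file so that the image is written out completely:

```gnuplot
# Send the output to a PNG file instead of the screen
set terminal png
set output 'graph.png'

# Any splot command can go here; this one is the earlier paraboloid example
splot [-1:1][-2:2] 2*(x**2) + y**2

# With no filename, set output closes graph.png
set output
```

After this, later plots are not written to graph.png; to see graphs on the screen again, set the terminal back to the one you were using before.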

To quit Gnuplot, type quit at the gnuplot> command prompt.

GNU Free Documentation License

Version 1.2, November 2002

Copyright © 2000,2001,2002 Free Software Foundation, Inc.

51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA

Everyone is permitted to copy and distribute verbatim copies of this license

document, but changing it is not allowed.

Preamble

The purpose of this License is to make a manual, textbook, or other functional

and useful document "free" in the sense of freedom: to assure everyone the effec-

tive freedom to copy and redistribute it, with or without modifying it, either com-

mercially or noncommercially. Secondarily, this License preserves for the author and

publisher a way to get credit for their work, while not being considered responsible

for modifications made by others.

This License is a kind of "copyleft", which means that derivative works of the doc-

ument must themselves be free in the same sense. It complements the GNU General

Public License, which is a copyleft license designed for free software.

We have designed this License in order to use it for manuals for free software,

because free software needs free documentation: a free program should come with

manuals providing the same freedoms that the software does. But this License is not

limited to software manuals; it can be used for any textual work, regardless of subject

matter or whether it is published as a printed book. We recommend this License

principally for works whose purpose is instruction or reference.

1. APPLICABILITY AND DEFINITIONS

This License applies to any manual or other work, in any medium, that contains

a notice placed by the copyright holder saying it can be distributed under the terms

of this License. Such a notice grants a world-wide, royalty-free license, unlimited

in duration, to use that work under the conditions stated herein. The "Document",

below, refers to any such manual or work. Any member of the public is a licensee, and

is addressed as "you". You accept the license if you copy, modify or distribute the work

in a way requiring permission under copyright law.

A "Modified Version" of the Document means any work containing the Document

or a portion of it, either copied verbatim, or with modifications and/or translated into

another language.

A "Secondary Section" is a named appendix or a front-matter section of the Doc-

ument that deals exclusively with the relationship of the publishers or authors of the

Document to the Document’s overall subject (or to related matters) and contains noth-

ing that could fall directly within that overall subject. (Thus, if the Document is in

part a textbook of mathematics, a Secondary Section may not explain any mathemat-

ics.) The relationship could be a matter of historical connection with the subject or

with related matters, or of legal, commercial, philosophical, ethical or political posi-

tion regarding them.

The "Invariant Sections" are certain Secondary Sections whose titles are desig-

nated, as being those of Invariant Sections, in the notice that says that the Document

is released under this License. If a section does not fit the above definition of Sec-

ondary then it is not allowed to be designated as Invariant. The Document may con-

tain zero Invariant Sections. If the Document does not identify any Invariant Sections

then there are none.

The "Cover Texts" are certain short passages of text that are listed, as Front-Cover

Texts or Back-Cover Texts, in the notice that says that the Document is released under

this License. A Front-Cover Text may be at most 5 words, and a Back-Cover Text may

be at most 25 words.

A "Transparent" copy of the Document means a machine-readable copy, repre-

sented in a format whose specification is available to the general public, that is suit-

able for revising the document straightforwardly with generic text editors or (for im-

ages composed of pixels) generic paint programs or (for drawings) some widely avail-

able drawing editor, and that is suitable for input to text formatters or for automatic

translation to a variety of formats suitable for input to text formatters. A copy made in

an otherwise Transparent file format whose markup, or absence of markup, has been

arranged to thwart or discourage subsequent modification by readers is not Transpar-

ent. An image format is not Transparent if used for any substantial amount of text.

A copy that is not "Transparent" is called "Opaque".

Examples of suitable formats for Transparent copies include plain ASCII without

markup, Texinfo input format, LaTeX input format, SGML or XML using a publicly

available DTD, and standard-conforming simple HTML, PostScript or PDF designed

for human modification. Examples of transparent image formats include PNG, XCF

and JPG. Opaque formats include proprietary formats that can be read and edited

only by proprietary word processors, SGML or XML for which the DTD and/or processing

tools are not generally available, and the machine-generated HTML, PostScript

or PDF produced by some word processors for output purposes only.

The "Title Page" means, for a printed book, the title page itself, plus such following

pages as are needed to hold, legibly, the material this License requires to appear in the

title page. For works in formats which do not have any title page as such, "Title Page"

means the text near the most prominent appearance of the work’s title, preceding the

beginning of the body of the text.

A section "Entitled XYZ" means a named subunit of the Document whose title

either is precisely XYZ or contains XYZ in parentheses following text that translates

XYZ in another language. (Here XYZ stands for a specific section name mentioned be-

low, such as "Acknowledgments", "Dedications", "Endorsements", or "History".)

To "Preserve the Title" of such a section when you modify the Document means that

it remains a section "Entitled XYZ" according to this definition.

The Document may include Warranty Disclaimers next to the notice which states

that this License applies to the Document. These Warranty Disclaimers are consid-

ered to be included by reference in this License, but only as regards disclaiming war-

ranties: any other implication that these Warranty Disclaimers may have is void and

has no effect on the meaning of this License.

2. VERBATIM COPYING

You may copy and distribute the Document in any medium, either commercially or

noncommercially, provided that this License, the copyright notices, and the license

notice saying this License applies to the Document are reproduced in all copies, and

that you add no other conditions whatsoever to those of this License. You may not

use technical measures to obstruct or control the reading or further copying of the

copies you make or distribute. However, you may accept compensation in exchange

for copies. If you distribute a large enough number of copies you must also follow the

conditions in section 3.

You may also lend copies, under the same conditions stated above, and you may

publicly display copies.

3. COPYING IN QUANTITY

If you publish printed copies (or copies in media that commonly have printed cov-

ers) of the Document, numbering more than 100, and the Document’s license notice

requires Cover Texts, you must enclose the copies in covers that carry, clearly and

legibly, all these Cover Texts: Front-Cover Texts on the front cover, and Back-Cover

Texts on the back cover. Both covers must also clearly and legibly identify you as the

publisher of these copies. The front cover must present the full title with all words

of the title equally prominent and visible. You may add other material on the covers

in addition. Copying with changes limited to the covers, as long as they preserve the

title of the Document and satisfy these conditions, can be treated as verbatim copying

in other respects.

If the required texts for either cover are too voluminous to fit legibly, you should put

the first ones listed (as many as fit reasonably) on the actual cover, and continue the

rest onto adjacent pages.

If you publish or distribute Opaque copies of the Document numbering more than

100, you must either include a machine-readable Transparent copy along with each

Opaque copy, or state in or with each Opaque copy a computer-network location from

which the general network-using public has access to download using public-standard

network protocols a complete Transparent copy of the Document, free of added mate-

rial. If you use the latter option, you must take reasonably prudent steps, when you

begin distribution of Opaque copies in quantity, to ensure that this Transparent copy

will remain thus accessible at the stated location until at least one year after the last

time you distribute an Opaque copy (directly or through your agents or retailers) of

that edition to the public.

It is requested, but not required, that you contact the authors of the Document well

before redistributing any large number of copies, to give them a chance to provide you

with an updated version of the Document.

4. MODIFICATIONS

You may copy and distribute a Modified Version of the Document under the condi-

tions of sections 2 and 3 above, provided that you release the Modified Version under

precisely this License, with the Modified Version filling the role of the Document, thus

licensing distribution and modification of the Modified Version to whoever possesses

a copy of it. In addition, you must do these things in the Modified Version:

A. Use in the Title Page (and on the covers, if any) a title distinct from that of the

Document, and from those of previous versions (which should, if there were any,

be listed in the History section of the Document). You may use the same title as a

previous version if the original publisher of that version gives permission.

B. List on the Title Page, as authors, one or more persons or entities responsible for

authorship of the modifications in the Modified Version, together with at least five

of the principal authors of the Document (all of its principal authors, if it has fewer

than five), unless they release you from this requirement.

C. State on the Title page the name of the publisher of the Modified Version, as the

publisher.

D. Preserve all the copyright notices of the Document.

E. Add an appropriate copyright notice for your modifications adjacent to the other

copyright notices.

F. Include, immediately after the copyright notices, a license notice giving the public

permission to use the Modified Version under the terms of this License, in the form

shown in the Addendum below.

G. Preserve in that license notice the full lists of Invariant Sections and required

Cover Texts given in the Document’s license notice.

H. Include an unaltered copy of this License.

I. Preserve the section Entitled "History", Preserve its Title, and add to it an item

stating at least the title, year, new authors, and publisher of the Modified Version

as given on the Title Page. If there is no section Entitled "History" in the Docu-

ment, create one stating the title, year, authors, and publisher of the Document as

given on its Title Page, then add an item describing the Modified Version as stated

in the previous sentence.

J. Preserve the network location, if any, given in the Document for public access to

a Transparent copy of the Document, and likewise the network locations given in

the Document for previous versions it was based on. These may be placed in the

"History" section. You may omit a network location for a work that was published

at least four years before the Document itself, or if the original publisher of the

version it refers to gives permission.

K. For any section Entitled "Acknowledgments" or "Dedications", Preserve the Title

of the section, and preserve in the section all the substance and tone of each of the

contributor acknowledgments and/or dedications given therein.

L. Preserve all the Invariant Sections of the Document, unaltered in their text and

in their titles. Section numbers or the equivalent are not considered part of the

section titles.

M. Delete any section Entitled "Endorsements". Such a section may not be included

in the Modified Version.

N. Do not retitle any existing section to be Entitled "Endorsements" or to conflict in

title with any Invariant Section.

O. Preserve any Warranty Disclaimers.

If the Modified Version includes new front-matter sections or appendices that qual-

ify as Secondary Sections and contain no material copied from the Document, you

may at your option designate some or all of these sections as invariant. To do this,

add their titles to the list of Invariant Sections in the Modified Version’s license notice.

These titles must be distinct from any other section titles.

You may add a section Entitled "Endorsements", provided it contains nothing but

endorsements of your Modified Version by various parties–for example, statements of

peer review or that the text has been approved by an organization as the authoritative

definition of a standard.

You may add a passage of up to five words as a Front-Cover Text, and a passage of up

to 25 words as a Back-Cover Text, to the end of the list of Cover Texts in the Modified

Version. Only one passage of Front-Cover Text and one of Back-Cover Text may be

added by (or through arrangements made by) any one entity. If the Document already

includes a cover text for the same cover, previously added by you or by arrangement

made by the same entity you are acting on behalf of, you may not add another; but

you may replace the old one, on explicit permission from the previous publisher that

added the old one.

The author(s) and publisher(s) of the Document do not by this License give per-

mission to use their names for publicity for or to assert or imply endorsement of any

Modified Version.

5. COMBINING DOCUMENTS

You may combine the Document with other documents released under this License,

under the terms defined in section 4 above for modified versions, provided that you

include in the combination all of the Invariant Sections of all of the original docu-

ments, unmodified, and list them all as Invariant Sections of your combined work in

its license notice, and that you preserve all their Warranty Disclaimers.

The combined work need only contain one copy of this License, and multiple iden-

tical Invariant Sections may be replaced with a single copy. If there are multiple

Invariant Sections with the same name but different contents, make the title of each

such section unique by adding at the end of it, in parentheses, the name of the original

author or publisher of that section if known, or else a unique number. Make the same

adjustment to the section titles in the list of Invariant Sections in the license notice

of the combined work.

In the combination, you must combine any sections Entitled "History" in the vari-

ous original documents, forming one section Entitled "History"; likewise combine any

sections Entitled "Acknowledgments", and any sections Entitled "Dedications". You

must delete all sections Entitled "Endorsements".

6. COLLECTIONS OF DOCUMENTS

You may make a collection consisting of the Document and other documents re-

leased under this License, and replace the individual copies of this License in the

various documents with a single copy that is included in the collection, provided that

you follow the rules of this License for verbatim copying of each of the documents in

all other respects.

You may extract a single document from such a collection, and distribute it individ-

ually under this License, provided you insert a copy of this License into the extracted

document, and follow this License in all other respects regarding verbatim copying of

that document.

7. AGGREGATION WITH INDEPENDENT WORKS

A compilation of the Document or its derivatives with other separate and indepen-

dent documents or works, in or on a volume of a storage or distribution medium, is

called an "aggregate" if the copyright resulting from the compilation is not used to

limit the legal rights of the compilation’s users beyond what the individual works per-

mit. When the Document is included in an aggregate, this License does not apply to

the other works in the aggregate which are not themselves derivative works of the

Document.

If the Cover Text requirement of section 3 is applicable to these copies of the Doc-

ument, then if the Document is less than one half of the entire aggregate, the Doc-

ument’s Cover Texts may be placed on covers that bracket the Document within the

aggregate, or the electronic equivalent of covers if the Document is in electronic form.

Otherwise they must appear on printed covers that bracket the whole aggregate.

8. TRANSLATION

Translation is considered a kind of modification, so you may distribute translations

of the Document under the terms of section 4. Replacing Invariant Sections with

translations requires special permission from their copyright holders, but you may

include translations of some or all Invariant Sections in addition to the original ver-

sions of these Invariant Sections. You may include a translation of this License, and

all the license notices in the Document, and any Warranty Disclaimers, provided that

you also include the original English version of this License and the original versions

of those notices and disclaimers. In case of a disagreement between the translation

and the original version of this License or a notice or disclaimer, the original version

will prevail.

If a section in the Document is Entitled "Acknowledgments", "Dedications", or "His-

tory", the requirement (section 4) to Preserve its Title (section 1) will typically require

changing the actual title.

9. TERMINATION

You may not copy, modify, sublicense, or distribute the Document except as ex-

pressly provided for under this License. Any other attempt to copy, modify, sublicense

or distribute the Document is void, and will automatically terminate your rights un-

der this License. However, parties who have received copies, or rights, from you under

this License will not have their licenses terminated so long as such parties remain in

full compliance.

10. FUTURE REVISIONS OF THIS LICENSE

The Free Software Foundation may publish new, revised versions of the GNU Free

Documentation License from time to time. Such new versions will be similar in spirit

to the present version, but may differ in detail to address new problems or concerns.

See http://www.gnu.org/copyleft/.

Each version of the License is given a distinguishing version number. If the Doc-

ument specifies that a particular numbered version of this License "or any later ver-

sion" applies to it, you have the option of following the terms and conditions either of

that specified version or of any later version that has been published (not as a draft)

by the Free Software Foundation. If the Document does not specify a version number

of this License, you may choose any version ever published (not as a draft) by the Free

Software Foundation.

ADDENDUM: How to use this License for your documents

To use this License in a document you have written, include a copy of the License in

the document and put the following copyright and license notices just after the title

page:

Copyright © YEAR YOUR NAME. Permission is granted to copy, distribute

and/or modify this document under the terms of the GNU Free Documenta-

tion License, Version 1.2 or any later version published by the Free Software

Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-

Cover Texts. A copy of the license is included in the section entitled "GNU Free

Documentation License".

If you have Invariant Sections, Front-Cover Texts and Back-Cover Texts, replace

the "with...Texts." line with this:

with the Invariant Sections being LIST THEIR TITLES, with the Front-Cover

Texts being LIST, and with the Back-Cover Texts being LIST.

If you have Invariant Sections without Cover Texts, or some other combination of

the three, merge those two alternatives to suit the situation.

If your document contains nontrivial examples of program code, we recommend

releasing these examples in parallel under your choice of free software license, such

as the GNU General Public License, to permit their use in free software.

History

This section contains the revision history of the book. For persons making modifications

to the book, please record the pertinent information here, following the format in the

first item below.

1. VERSION: 1.0

Date: 2008-01-04

Author(s): Michael Corral

Title: Vector Calculus

Modification(s): Initial version

Index

Symbols

D . . . . . . . . . . 84

$M_x$, $M_y$ . . . . . . . . . . 124

$M_{xy}$, $M_{xz}$, $M_{yz}$ . . . . . . . . . . 126

∆ . . . . . . . . . . 178

$\mathbb{R}^2$ . . . . . . . . . . 1

$\mathbb{R}^3$ . . . . . . . . . . 1

$\bar{x}$ . . . . . . . . . . 124

$\bar{y}$ . . . . . . . . . . 124

$\bar{z}$ . . . . . . . . . . 126

δ(x, y) . . . . . . . . . . 124

$\frac{\partial(x, y, z)}{\partial(u, v, w)}$ . . . . . . . . . . 119

$\frac{\partial f}{\partial x}$ . . . . . . . . . . 71

$\iiint_S$ . . . . . . . . . . 110

$\iint$ . . . . . . . . . . 102

$\iint_R$ . . . . . . . . . . 105

$\int_C$ . . . . . . . . . . 136, 139

$C^1$, $C^\infty$ . . . . . . . . . . 59

∇ . . . . . . . . . . 80, 177

$\nabla^2$ . . . . . . . . . . 178

$\oiint_\Sigma$ . . . . . . . . . . 163

$\oint_C$ . . . . . . . . . . 145

∂ . . . . . . . . . . 71

$D_v f$ . . . . . . . . . . 78

$e_r$, $e_\theta$, $e_z$, $e_\rho$, $e_\phi$ . . . . . . . . . . 181

dr . . . . . . . . . . 139

i, j, k . . . . . . . . . . 12

A

acceleration. . . . . . . . . . . . . . . . . . . . . . . 2, 55

angle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15

annulus . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153

arc length . . . . . . . . . . . . . . . . . . . . . . . . . . . 59

area element . . . . . . . . . . . . . . . . . . . . . . . 105

average value . . . . . . . . . . . . . . . . . . . . . . 113

B

Bézier curve . . . . . . . . . . . . . . . . . . . . . . . . . 56

Beta function . . . . . . . . . . . . . . . . . . . . . . 123

C

capping surface . . . . . . . . . . . . . . . . . . . . 175

Cauchy-Schwarz Inequality . . . . . . . . . 17

center of mass. . . . . . . . . . . . . . . . . . . . . . 124

centroid . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125

Chain Rule . . . . . . . . . . . . . . . . . . . . . 60, 147

change of variable. . . . . . . . . . . . . 117, 119

circulation . . . . . . . . . . . . . . . . . . . . . . . . . 174

closed curve . . . . . . . . . . . . . . . . . . . . . . . . 145

closed surface . . . . . . . . . . . . . . . . . . . . . . 161

collinear . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36

conical helix. . . . . . . . . . . . . . . . . . . . . . . . 167

conservative field . . . . . . . . . . . . . . . . . . 148

constrained critical point. . . . . . . . . . . . 96

continuity . . . . . . . . . . . . . . . . . . . . . . . 52, 69

continuously differentiable . . . . . . 59, 80

coordinates . . . . . . . . . . . . . . . . . . . . . . . . . . . 1

Cartesian. . . . . . . . . . . . . . . . . . . . . . . . . 1

curvilinear . . . . . . . . . . . . . . . . . . . . . . 47

cylindrical . . . . . . . . . . . . . . . . . . 47, 182

ellipsoidal . . . . . . . . . . . . . . . . . . . . . . 164

left-handed . . . . . . . . . . . . . . . . . . . . . . . 2

polar . . . . . . . . . . . . . . . . . . . . . . . 47, 121

rectangular . . . . . . . . . . . . . . . . . . . . . . . 1

right-handed . . . . . . . . . . . . . . . . . . . . . 2

spherical . . . . . . . . . . . . . . . . . . . 47, 182

coplanar . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26

correlation . . . . . . . . . . . . . . . . . . . . . . . . . 134

covariance. . . . . . . . . . . . . . . . . . . . . . . . . . 134

critical point. . . . . . . . . . . . . . . . . . . . . . . . . 83

cross product . . . . . . . . . . . . . . . . . . . . . . . . 20

curl . . . . . . . . . . . . . . . . . . . . . . 169, 178, 182

curvature. . . . . . . . . . . . . . . . . . . . . . . . . . . . 62

cylinder . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42

D

density . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124

derivative. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3

directional. . . . . . . . . . . . . . . . . . . . . . . 78

mixed partial . . . . . . . . . . . . . . . . . . . . 73

partial . . . . . . . . . . . . . . . . . . . . . . . . . . . 71

vector-valued function . . . . . . . . . . 52

determinant . . . . . . . . . . . . . . . . . . . . . . . . . 26

differential . . . . . . . . . . . . . . . . . . . . . . . . . 139

differential form . . . . . . . . . . . . . . . . . . . 139

directed curve. . . . . . . . . . . . . . . . . . . . . . 144

direction angles . . . . . . . . . . . . . . . . . . . . . 19

direction cosines. . . . . . . . . . . . . . . . . . . . . 19

directional derivative . . . . . . . . . . . . . . . 78

distance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6

between points. . . . . . . . . . . . . . . . . 6, 7

from point to line. . . . . . . . . . . . . . . . 33

point to plane. . . . . . . . . . . . 37, 41, 42

distribution function. . . . . . . . . . . . . . . 129

joint . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131

normal . . . . . . . . . . . . . . . . . . . . . . . . . 130

divergence . . . . . . . . . . . . . . . 162, 177, 182

Divergence Theorem. . . . . . . . . . . . . . . 162

dot product . . . . . . . . . . . . . . . . . . . . . . . . . . 15

double integral . . . . . . . . . . . . . . . . 102, 105

polar coordinates . . . . . . . . . . . . . . 121

doubly ruled surface. . . . . . . . . . . . . . . . .45

E

ellipsoid . . . . . . . . . . . . . . . . . . . 43, 123, 164

elliptic cone . . . . . . . . . . . . . . . . . . . . . . . . . 45

elliptic paraboloid . . . . . . . . . . . . . . . . . . . 44

Euclidean space . . . . . . . . . . . . . . . . . . . . . . 1

exact differential form. . . 139, 154, 175

expected value . . . . . . . . . . . . . . . . . . . . . 132

extreme point . . . . . . . . . . . . . . . . . . . . . . . 83

F

flux. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 162

force . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55

function. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1

continuous . . . . . . . . . . . . . . . . . . . . . . 69

scalar. . . . . . . . . . . . . . . . . . . . . . . . . . . .53

vector-valued. . . . . . . . . . . . . . . . . . . . 51

G

Gaussian blur . . . . . . . . . . . . . . . . . . . . . . . 70

global maximum . . . . . . . . . . . . . . . . . . . . 83

global minimum. . . . . . . . . . . . . . . . . . . . . 83

gradient . . . . . . . . . . . . . . . . . . . . . . . . 80, 182

Green’s identities . . . . . . . . . . . . . . . . . . 186

Green’s Theorem. . . . . . . . . . . . . . . . . . . 150

H

harmonic . . . . . . . . . . . . . . . . . . . . . . . . . . . 186

helicoid. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49

helix . . . . . . . . . . . . . . . . . . . . . . . . 51, 59, 167

hyperbolic paraboloid . . . . . . . . . . . . . . . 44

hyperboloid. . . . . . . . . . . . . . . . . . . . . . . . . . 43

one sheet . . . . . . . . . . . . . . . . . . . . . . . . 43

two sheets. . . . . . . . . . . . . . . . . . . . . . . 43

hypersurface . . . . . . . . . . . . . . . . . . . . . . . 110

hypervolume . . . . . . . . . . . . . . . . . . . . . . . 110

I

improper integral . . . . . . . . . . . . . . . . . . 108

integral . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59

double. . . . . . . . . . . . . . . . . . . . . 102, 105

improper . . . . . . . . . . . . . . . . . . . . . . . 108

iterated . . . . . . . . . . . . . . . . . . . . . . . . 102

multiple. . . . . . . . . . . . . . . . . . . . . . . . 101

surface . . . . . . . . . . . . . . . . . . . . 156, 158

triple. . . . . . . . . . . . . . . . . . . . . . . . . . . 110

irrotational . . . . . . . . . . . . . . . . . . . . . . . . . 174

iterated integral . . . . . . . . . . . . . . . . . . . 102

J

Jacobi identity . . . . . . . . . . . . . . . . . . . . . . 30

Jacobian . . . . . . . . . . . . . . . . . . . . . . . . . . . 119

joint distribution. . . . . . . . . . . . . . . . . . . 131

L

Lagrange multiplier. . . . . . . . . . . . . . . . . 96

lamina . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124

Laplacian . . . . . . . . . . . . . . . . . . . . . 178, 182

level curve. . . . . . . . . . . . . . . . . . . . . . . . . . . 66

limit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67

vector-valued function . . . . . . . . . . 52

line . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31

intersection of planes . . . . . . . . . . . 38

parallel . . . . . . . . . . . . . . . . . . . . . . . . . . 34

parametric representation . . . . . . 31

perpendicular . . . . . . . . . . . . . . . . . . . 34

skew. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34

symmetric representation. . . . . . . 32

through two points . . . . . . . . . . . . . . 33

vector representation . . . . . . . . . . . 31

line integral . . . . . . . . . . . . . . . . . . . 136, 139

local maximum. . . . . . . . . . . . . . . . . . . . . . 83

local minimum . . . . . . . . . . . . . . . . . . . . . . 83

M

mass . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124

matrix. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27

mixed partial derivative. . . . . . . . . . . . . 73

Möbius strip . . . . . . . . . . . . . . . . . . . . . . . 168

moment . . . . . . . . . . . . . . . . . . . . . . . 124, 126

momentum. . . . . . . . . . . . . . . . . . . . . . . . . . 55

Monte Carlo method . . . . . . . . . . . . . . . 113

moving frame fields . . . . . . . . . . . . . . . . . 62

multiple integral . . . . . . . . . . . . . . . . . . . 101

multiply connected. . . . . . . . . . . . . . . . . 153

N

n-positive direction . . . . . . . . . . . . . . . . 169

Newton’s algorithm . . . . . . . . . . . . . . . . . 89

normal derivative . . . . . . . . . . . . . . . . . . 186

normal to a curve . . . . . . . . . . . . . . . . . . . 81

normal vector field. . . . . . . . . . . . . . . . . 168

O

orientable . . . . . . . . . . . . . . . . . . . . . . . . . . 168

orthonormal vectors . . . . . . . . . . . . . . . . . 64

outward normal . . . . . . . . . . . . . . . . . . . . 160

P

paraboloid. . . . . . . . . . . . . . . . . . . . . . . . . . . 44

elliptic. . . . . . . . . . . . . . . . . . . . . . . . . . . 44

hyperbolic . . . . . . . . . . . . . . . . . . . 44, 84

of revolution. . . . . . . . . . . . . . . . . . . . . 44

parallelepiped . . . . . . . . . . . . . . . . . . . . . . . 24

volume . . . . . . . . . . . . . . . . . . . . . . . . . . 25

parameter . . . . . . . . . . . . . . . . . . . . . . . 31, 60

parametrization. . . . . . . . . . . . . . . . . . . . . 60

partial derivative. . . . . . . . . . . . . . . . . . . . 71

partial differential equation. . . . . . . . . 74

path independence. . . . . . . 146, 154, 175

piecewise smooth curve . . . . . . . . . . . . 141

plane . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35

coordinate . . . . . . . . . . . . . . . . . . . . . . . . 1

Euclidean. . . . . . . . . . . . . . . . . . . . . . . . . 1

in space . . . . . . . . . . . . . . . . . . . . . . . . . 35

line of intersection . . . . . . . . . . . . . . 38

normal form. . . . . . . . . . . . . . . . . . . . . 35

normal vector . . . . . . . . . . . . . . . . . . . 35

point-normal form. . . . . . . . . . . . . . . 35

tangent. . . . . . . . . . . . . . . . . . . . . . . . . . 75

through three points . . . . . . . . . . . . 36

position vector . . . . . . . . . . . . . . 54, 55, 139

potential . . . . . . . . . . . . . . . . . . . . . . . . . . . 148

probability . . . . . . . . . . . . . . . . . . . . . . . . . 128

probability density function. . . . . . . . 129

projection. . . . . . . . . . . . . . . . . . . . . . . . . . . . 19

Q

quadric surface. . . . . . . . . . . . . . . . . . . . . . 43

R

random variable . . . . . . . . . . . . . . . . . . . 128

Riemann integral . . . . . . . . . . . . . . . . . . 135

right-hand rule. . . . . . . . . . . . . . . . . 21, 192

ruled surface . . . . . . . . . . . . . . . . . . . . . . . . 45

S

saddle point . . . . . . . . . . . . . . . . . . . . . . . . . 85

sample space. . . . . . . . . . . . . . . . . . . . . . . 128

scalar . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9

combination. . . . . . . . . . . . . . . . . . . . . 12

scalar function . . . . . . . . . . . . . . . . . . . . . . 53

scalar triple product. . . . . . . . . . . . . . . . . 25

Second Derivative Test . . . . . . . . . . . . . . 84

second moment . . . . . . . . . . . . . . . . . . . . 134

second-degree equation . . . . . . . . . . . . . 43

simple closed curve . . . . . . . . . . . . . . . . 145

simply connected. . . . . . . . . . . . . . 154, 175

smooth function . . . . . . . . . . . . . . . . . 59, 84

solenoidal . . . . . . . . . . . . . . . . . . . . . . . . . . 163

span. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18

sphere . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40

spherical spiral . . . . . . . . . . . . . . . . . . . . . . 54

standard normal distribution . . . . . . 130

steepest descent . . . . . . . . . . . . . . . . . . . . . 95

stereographic projection. . . . . . . . . . . . . 46

Stokes’ Theorem . . . . . . . . . . . . . . 168, 169

surface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40

doubly ruled. . . . . . . . . . . . . . . . . . . . . 45

orientable . . . . . . . . . . . . . . . . . . . . . . 168

ruled . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45

two-sided . . . . . . . . . . . . . . . . . . . . . . 168

surface integral . . . . . . . . . . . . . . . 156, 158

T

tangent plane . . . . . . . . . . . . . . . . . . . . . . . 75

torus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158

trace. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42

triangle inequality . . . . . . . . . . . . . . . . . . 18

triple integral . . . . . . . . . . . . . . . . . . . . . . 110

cylindrical coordinates . . . . . . . . . 122

spherical coordinates . . . . . . . . . . 122

U

uniform density . . . . . . . . . . . . . . . . . . . . 124

uniform distribution . . . . . . . . . . . . . . . 129

uniformly distributed . . . . . . . . . . . . . . 128

unit disk. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65

V

variance. . . . . . . . . . . . . . . . . . . . . . . . . . . . 134

vector . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3

addition . . . . . . . . . . . . . . . . . . . . . . . . . . 9

angle between. . . . . . . . . . . . . . . . . . . 15

basis . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12

components . . . . . . . . . . . . . . . . . . . . . 13

direction. . . . . . . . . . . . . . . . . . . . . . . . . . 3

magnitude . . . . . . . . . . . . . . . . . . . . . 3, 7

normal . . . . . . . . . . . . . . . . . . . . . 35, 160

normalized . . . . . . . . . . . . . . . . . . . . . . 12

parallel . . . . . . . . . . . . . . . . . . . . . . . . . . . 9

perpendicular . . . . . . . . . . . . . . . 16, 17

positive unit normal . . . . . . . . . . . 169

principal normal N. . . . . . . . . . . . . . 64

scalar multiplication . . . . . . . . . . . . . 9

subtraction. . . . . . . . . . . . . . . . . . . . . . 10

tangent. . . . . . . . . . . . . . . . . . . . . . . . . . 52

translation. . . . . . . . . . . . . . . . . . . . . 5, 9

unit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12

unit binormal B. . . . . . . . . . . . . . . . . 64

unit tangent T . . . . . . . . . . . . . . . . . . 64

zero . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3, 4

vector field . . . . . . . . . . . . . . . . . . . . . . . . . 138

normal . . . . . . . . . . . . . . . . . . . . . . . . . 168

smooth. . . . . . . . . . . . . . . . . . . . . . . . . 150

vector triple product. . . . . . . . . . . . . . . . .25

velocity . . . . . . . . . . . . . . . . . . . . . . . . . . . 2, 55

volume element . . . . . . . . . . . . . . . . . . . . 110

W

wave equation. . . . . . . . . . . . . . . . . . . . . . . 74

work . . . . . . . . . . . . . . . . . . . . . . . . . . 135, 166

Z

zenith angle . . . . . . . . . . . . . . . . . . . . . . . . . 47

Vector Calculus

Michael Corral

Schoolcraft College

About the author: Michael Corral is an Adjunct Faculty member of the Department of Mathematics at Schoolcraft College. He received a B.A. in Mathematics from the University of California at Berkeley, and received an M.A. in Mathematics and an M.S. in Industrial & Operations Engineering from the University of Michigan.

This text was typeset in LaTeX 2ε with the KOMA-Script bundle, using the GNU Emacs text editor on a Fedora Linux system. The graphics were created using MetaPost, PGF, and Gnuplot.

Copyright © 2008 Michael Corral. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the section entitled “GNU Free Documentation License”.

Preface

This book covers calculus in two and three variables. It is suitable for a one-semester course, normally known as “Vector Calculus”, “Multivariable Calculus”, or simply “Calculus III”. The prerequisites are the standard courses in single-variable calculus (a.k.a. Calculus I and II).

I have tried to be somewhat rigorous about proving results. But while it is important for students to see full-blown proofs - since that is how mathematics works - too much rigor and emphasis on proofs can impede the flow of learning for the vast majority of the audience at this level. If I were to rate the level of rigor in the book on a scale of 1 to 10, with 1 being completely informal and 10 being completely rigorous, I would rate it as a 5.

There are 420 exercises throughout the text, which in my experience are more than enough for a semester course in this subject. There are exercises at the end of each section, divided into three categories: A, B and C. The A exercises are mostly of a routine computational nature, the B exercises are slightly more involved, and the C exercises usually require some effort or insight to solve. A crude way of describing A, B and C would be “Easy”, “Moderate” and “Challenging”, respectively. However, many of the B exercises are easy and not all the C exercises are difficult.

There are a few exercises that require the student to write his or her own computer program to solve some numerical approximation problems (e.g. the Monte Carlo method for approximating multiple integrals, in Section 3.4). The code samples in the text are in the Java programming language, hopefully with enough comments so that the reader can figure out what is being done even without knowing Java. Those exercises do not mandate the use of Java, so students are free to implement the solutions using the language of their choice. While it would have been simple to use a scripting language like Python, and perhaps even easier with a functional programming language (such as Haskell or Scheme), Java was chosen due to its ubiquity, relatively clear syntax, and easy availability for multiple platforms.

Answers and hints to most odd-numbered and some even-numbered exercises are provided in Appendix A. Appendix B contains a proof of the right-hand rule for the cross product, which seems to have virtually disappeared from calculus texts over the last few decades. Appendix C contains a brief tutorial on Gnuplot for graphing functions of two variables.

This book is released under the GNU Free Documentation License (GFDL), which allows others to not only copy and distribute the book but also to modify it. For more details, see the included copy of the GFDL. So that there is no ambiguity on this matter, anyone can make as many copies of this book as desired and distribute it as desired, without needing my permission. The PDF version will always be freely available to the public at no cost (go to http://www.mecmath.net). Feel free to contact me at mcorral@schoolcraft.edu for any questions on this or any other matter involving the book (e.g. comments, suggestions, corrections, etc.). I welcome your input.

Finally, I would like to thank my students in Math 240 for being the guinea pigs for the initial draft of this book, and for finding the numerous errors and typos it contained.

January 2008
MICHAEL CORRAL

Contents

Preface . . . . . iii

1 Vectors in Euclidean Space . . . . . 1
1.1 Introduction . . . . . 1
1.2 Vector Algebra . . . . . 9
1.3 Dot Product . . . . . 15
1.4 Cross Product . . . . . 20
1.5 Lines and Planes . . . . . 31
1.6 Surfaces . . . . . 40
1.7 Curvilinear Coordinates . . . . . 47
1.8 Vector-Valued Functions . . . . . 51
1.9 Arc Length . . . . . 59

2 Functions of Several Variables . . . . . 65
2.1 Functions of Two or Three Variables . . . . . 65
2.2 Partial Derivatives . . . . . 71
2.3 Tangent Plane to a Surface . . . . . 75
2.4 Directional Derivatives and the Gradient . . . . . 78
2.5 Maxima and Minima . . . . . 83
2.6 Unconstrained Optimization: Numerical Methods . . . . . 89
2.7 Constrained Optimization: Lagrange Multipliers . . . . . 96

3 Multiple Integrals . . . . . 101
3.1 Double Integrals . . . . . 101
3.2 Double Integrals Over a General Region . . . . . 105
3.3 Triple Integrals . . . . . 110
3.4 Numerical Approximation of Multiple Integrals . . . . . 113
3.5 Change of Variables in Multiple Integrals . . . . . 117
3.6 Application: Center of Mass . . . . . 124
3.7 Application: Probability and Expected Value . . . . . 128

4 Line and Surface Integrals . . . . . 135
4.1 Line Integrals . . . . . 135
4.2 Properties of Line Integrals . . . . . 143
4.3 Green’s Theorem . . . . . 150
4.4 Surface Integrals and the Divergence Theorem . . . . . 156
4.5 Stokes’ Theorem . . . . . 165
4.6 Gradient, Divergence, Curl and Laplacian . . . . . 177

Bibliography . . . . . 187
Appendix A: Answers and Hints to Selected Exercises . . . . . 189
Appendix B: Proof of the Right-Hand Rule for the Cross Product . . . . . 192
Appendix C: 3D Graphing with Gnuplot . . . . . 196
GNU Free Documentation License . . . . . 201
History . . . . . 209
Index . . . . . 210

1 Vectors in Euclidean Space

1.1 Introduction

In single-variable calculus, the functions that one encounters are functions of a variable (usually $x$ or $t$) that varies over some subset of the real number line (which we denote by $\mathbb{R}$). For such a function, say, $y = f(x)$, the graph of the function $f$ consists of the points $(x, y) = (x, f(x))$. These points lie in the Euclidean plane, which, in the Cartesian or rectangular coordinate system, consists of all ordered pairs of real numbers $(a, b)$. We use the word “Euclidean” to denote a system in which all the usual rules of Euclidean geometry hold. We denote the Euclidean plane by $\mathbb{R}^2$; the “2” represents the number of dimensions of the plane. The Euclidean plane has two perpendicular coordinate axes: the $x$-axis and the $y$-axis.

In vector (or multivariable) calculus, we will deal with functions of two or three variables (usually $x, y$ or $x, y, z$, respectively). The graph of a function of two variables, say, $z = f(x, y)$, lies in Euclidean space, which in the Cartesian coordinate system consists of all ordered triples of real numbers $(a, b, c)$. Since Euclidean space is 3-dimensional, we denote it by $\mathbb{R}^3$. The graph of $f$ consists of the points $(x, y, z) = (x, y, f(x, y))$.

The 3-dimensional coordinate system of Euclidean space can be represented on a flat surface, such as this page or a blackboard, only by giving the illusion of three dimensions, in the manner shown in Figure 1.1.1. Euclidean space has three mutually perpendicular coordinate axes ($x$, $y$ and $z$), and three mutually perpendicular coordinate planes: the $xy$-plane, $yz$-plane and $xz$-plane (see Figure 1.1.2).

[Figure 1.1.1: the point $P(a, b, c)$ plotted against the $x$-, $y$- and $z$-axes]

[Figure 1.1.2: the $xy$-plane, $yz$-plane and $xz$-plane]

The coordinate system shown in Figure 1.1.1 is known as a right-handed coordinate system, because it is possible, using the right hand, to point the index finger in the positive direction of the $x$-axis, the middle finger in the positive direction of the $y$-axis, and the thumb in the positive direction of the $z$-axis, as in Figure 1.1.3.

[Figure 1.1.3 Right-handed coordinate system]

An equivalent way of defining a right-handed system is if you can point your thumb upwards in the positive $z$-axis direction while using the remaining four fingers to rotate the $x$-axis towards the $y$-axis. Doing the same thing with the left hand is what defines a left-handed coordinate system. Notice that switching the $x$- and $y$-axes in a right-handed system results in a left-handed system, and that rotating either type of system does not change its “handedness”. Throughout the book we will use a right-handed system.

For functions of three variables, the graphs exist in 4-dimensional space (i.e. $\mathbb{R}^4$), which we can not see in our 3-dimensional space, let alone simulate in 2-dimensional space. So we can only think of 4-dimensional space abstractly. For an entertaining discussion of this subject, see the book by ABBOTT.¹

So far, we have discussed the position of an object in 2-dimensional or 3-dimensional space. But what about something such as the velocity of the object, or its acceleration? Or the gravitational force acting on the object? These phenomena all seem to involve motion and direction in some way. This is where the idea of a vector comes in.

¹ One thing you will learn is why a 4-dimensional creature would be able to reach inside an egg and remove the yolk without cracking the shell!

You have already dealt with velocity and acceleration in single-variable calculus. For example, for motion along a straight line, if $y = f(t)$ gives the displacement of an object after time $t$, then $dy/dt = f'(t)$ is the velocity of the object at time $t$. The derivative $f'(t)$ is just a number, which is positive if the object is moving in an agreed-upon “positive” direction, and negative if it moves in the opposite of that direction. So you can think of that number, which was called the velocity of the object, as having two components: a magnitude, indicated by a nonnegative number, preceded by a direction, indicated by a plus or minus symbol (representing motion in the positive direction or the negative direction, respectively), i.e. $f'(t) = \pm a$ for some number $a \ge 0$. Then $a$ is the magnitude of the velocity (normally called the speed of the object), and the $\pm$ represents the direction of the velocity (though the $+$ is usually omitted for the positive direction).

For motion along a straight line, i.e. in a 1-dimensional space, the velocities are also contained in that 1-dimensional space, since they are just numbers. For general motion along a curve in 2- or 3-dimensional space, however, velocity will need to be represented by a multidimensional object which should have both a magnitude and a direction. A geometric object which has those features is an arrow, which in elementary geometry is called a “directed line segment”. This is the motivation for how we will define a vector.

Definition 1.1. A (nonzero) vector is a directed line segment drawn from a point $P$ (called its initial point) to a point $Q$ (called its terminal point), with $P$ and $Q$ being distinct points. The vector is denoted by $\overrightarrow{PQ}$. Its magnitude is the length of the line segment, denoted by $\|\overrightarrow{PQ}\|$, and its direction is the same as that of the directed line segment. The zero vector is just a point, and it is denoted by $\mathbf{0}$.

To indicate the direction of a vector, we draw an arrow from its initial point to its terminal point. We will often denote a vector by a single bold-faced letter (e.g. $\mathbf{v}$) and use the terms “magnitude” and “length” interchangeably. Note that our definition could apply to systems with any number of dimensions (see Figure 1.1.4 (a)-(c)).

[Figure 1.1.4 Vectors in different dimensions: (a) One dimension, (b) Two dimensions, (c) Three dimensions]

A few things need to be noted about the zero vector. Our motivation for what a vector is included the notions of magnitude and direction. What is the magnitude of the zero vector? We define it to be zero, i.e. $\|\mathbf{0}\| = 0$. This agrees with the definition of the zero vector as just a point, which has zero length. What about the direction of the zero vector? A single point really has no well-defined direction. Notice that we were careful to only define the direction of a nonzero vector, which is well-defined since the initial and terminal points are distinct. Not everyone agrees on the direction of the zero vector. Some contend that the zero vector has arbitrary direction (i.e. can take any direction), some say that it has indeterminate direction (i.e. the direction can not be determined), while others say that it has no direction. Our definition of the zero vector, however, does not require it to have a direction, and we will leave it at that.²

Now that we know what a vector is, we need a way of determining when two vectors are equal. This leads us to the following definition.

Definition 1.2. Two nonzero vectors are equal if they have the same magnitude and the same direction. Any vector with zero magnitude is equal to the zero vector.

By this definition, vectors with the same magnitude and direction but with different initial points would be equal. For example, in Figure 1.1.5 the vectors $\mathbf{u}$, $\mathbf{v}$ and $\mathbf{w}$ all have the same magnitude $\sqrt{5}$ (by the Pythagorean Theorem). And we see that $\mathbf{u}$ and $\mathbf{w}$ are parallel, since they lie on lines having the same slope $\frac{1}{2}$, and they point in the same direction. So $\mathbf{u} = \mathbf{w}$, even though they have different initial points. We also see that $\mathbf{v}$ is parallel to $\mathbf{u}$ but points in the opposite direction. So $\mathbf{u} \ne \mathbf{v}$.

[Figure 1.1.5: the vectors $\mathbf{u}$, $\mathbf{v}$ and $\mathbf{w}$ in the $xy$-plane]

So we can see that there are an infinite number of vectors for a given magnitude and direction, those vectors all being equal and differing only by their initial and terminal points. Is there a single vector which we can choose to represent all those equal vectors? The answer is yes, and is suggested by the vector $\mathbf{w}$ in Figure 1.1.5.

² In the subject of linear algebra there is a more abstract way of defining a vector where the concept of “direction” is not really used. See ANTON and RORRES.

Unless otherwise indicated, when speaking of “the vector” with a given magnitude and direction, we will mean the one whose initial point is at the origin of the coordinate system.

Thinking of vectors as starting from the origin provides a way of dealing with vectors in a standard way, since every coordinate system has an origin. But there will be times when it is convenient to consider a different initial point for a vector (for example, when adding vectors, which we will do in the next section).

Another advantage of using the origin as the initial point is that it provides an easy correspondence between a vector and its terminal point.

Example 1.1. Let $\mathbf{v}$ be the vector in $\mathbb{R}^3$ whose initial point is at the origin and whose terminal point is $(3, 4, 5)$. Though the point $(3, 4, 5)$ and the vector $\mathbf{v}$ are different objects, it is convenient to write $\mathbf{v} = (3, 4, 5)$. When doing this, it is understood that the initial point of $\mathbf{v}$ is at the origin $(0, 0, 0)$ and the terminal point is $(3, 4, 5)$.

[Figure 1.1.6 Correspondence between points and vectors: (a) The point $(3, 4, 5)$, (b) The vector $(3, 4, 5)$]

Unless otherwise stated, when we refer to vectors as $\mathbf{v} = (a, b)$ in $\mathbb{R}^2$ or $\mathbf{v} = (a, b, c)$ in $\mathbb{R}^3$, we mean vectors in Cartesian coordinates starting at the origin. Also, we will write the zero vector $\mathbf{0}$ in $\mathbb{R}^2$ and $\mathbb{R}^3$ as $(0, 0)$ and $(0, 0, 0)$, respectively.

The point-vector correspondence provides an easy way to check if two vectors are equal, without having to determine their magnitude and direction. Similar to seeing if two points are the same, you are now seeing if the terminal points of vectors starting at the origin are the same. For each vector, find the (unique!) vector it equals whose initial point is the origin. Then compare the coordinates of the terminal points of these “new” vectors: if those coordinates are the same, then the original vectors are equal. To get the “new” vectors starting at the origin, you translate each vector to start at the origin by subtracting the coordinates of the original initial point from the original terminal point. The resulting point will be the terminal point of the “new” vector whose initial point is the origin. Do this for each original vector then compare.
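The translate-then-compare procedure just described can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the class name `VectorEquality`, the method names, the sample points and the comparison tolerance are all hypothetical choices made for this sketch and do not come from the text.

```java
import java.util.Arrays;

/** Sketch: decide whether two directed line segments represent the same vector. */
final class VectorEquality {

    /**
     * Translates the vector running from initial to terminal so that it
     * starts at the origin, and returns the terminal point of the translated
     * vector (terminal minus initial, coordinate by coordinate).
     */
    static double[] translateToOrigin(double[] initial, double[] terminal) {
        if (initial.length != terminal.length) {
            throw new IllegalArgumentException("points must have the same dimension");
        }
        double[] result = new double[initial.length];
        for (int i = 0; i < initial.length; i++) {
            result[i] = terminal[i] - initial[i];
        }
        return result;
    }

    /** True if the vector from p to q equals the vector from r to s, up to a small tolerance. */
    static boolean sameVector(double[] p, double[] q, double[] r, double[] s) {
        double[] v = translateToOrigin(p, q);
        double[] w = translateToOrigin(r, s);
        if (v.length != w.length) {
            return false;
        }
        final double tolerance = 1e-9; // assumed value, guards against rounding error
        for (int i = 0; i < v.length; i++) {
            if (Math.abs(v[i] - w[i]) > tolerance) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Illustrative points in the plane (not taken from the text).
        double[] p = {1, 2}, q = {4, 6};
        double[] r = {-2, 0}, s = {1, 4};
        System.out.println(Arrays.toString(translateToOrigin(p, q)));
        System.out.println(Arrays.toString(translateToOrigin(r, s)));
        System.out.println(sameVector(p, q, r, s));
    }
}
```

Both segments in `main` translate to the same terminal point, so the sketch reports them as equal vectors. Comparing with a tolerance rather than exact equality is a design choice for floating-point coordinates; with integer coordinates an exact comparison would do.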

Example 1.2. Consider the vectors $\overrightarrow{PQ}$ and $\overrightarrow{RS}$ in $\mathbb{R}^3$, where $P = (2, 1, 5)$, $Q = (3, 5, 7)$, $R = (1, -3, -2)$ and $S = (2, 1, 0)$. Does $\overrightarrow{PQ} = \overrightarrow{RS}$?

Solution: The vector $\overrightarrow{PQ}$ is equal to the vector $\mathbf{v}$ with initial point $(0, 0, 0)$ and terminal point $Q - P = (3, 5, 7) - (2, 1, 5) = (3 - 2, 5 - 1, 7 - 5) = (1, 4, 2)$.

Similarly, $\overrightarrow{RS}$ is equal to the vector $\mathbf{w}$ with initial point $(0, 0, 0)$ and terminal point $S - R = (2, 1, 0) - (1, -3, -2) = (2 - 1, 1 - (-3), 0 - (-2)) = (1, 4, 2)$.

So $\overrightarrow{PQ} = \mathbf{v} = (1, 4, 2)$ and $\overrightarrow{RS} = \mathbf{w} = (1, 4, 2)$.

$\therefore \overrightarrow{PQ} = \overrightarrow{RS}$

[Figure 1.1.7: $\overrightarrow{PQ}$ translated to $\mathbf{v}$ and $\overrightarrow{RS}$ translated to $\mathbf{w}$, with $\mathbf{v} = \mathbf{w} = (1, 4, 2)$]

Recall the distance formula for points in the Euclidean plane:

For points $P = (x_1, y_1)$, $Q = (x_2, y_2)$ in $\mathbb{R}^2$, the distance $d$ between $P$ and $Q$ is:

$$d = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2} \tag{1.1}$$

By this formula, we have the following result:

For a vector $\overrightarrow{PQ}$ in $\mathbb{R}^2$ with initial point $P = (x_1, y_1)$ and terminal point $Q = (x_2, y_2)$, the magnitude of $\overrightarrow{PQ}$ is:

$$\|\overrightarrow{PQ}\| = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2} \tag{1.2}$$

Finding the magnitude of a vector $\mathbf{v} = (a, b)$ in $\mathbb{R}^2$ is a special case of formula (1.2) with $P = (0, 0)$ and $Q = (a, b)$:

For a vector $\mathbf{v} = (a, b)$ in $\mathbb{R}^2$, the magnitude of $\mathbf{v}$ is:

$$\|\mathbf{v}\| = \sqrt{a^2 + b^2} \tag{1.3}$$

To calculate the magnitude of vectors in $\mathbb{R}^3$, we need a distance formula for points in Euclidean space (we will postpone the proof until the next section):

Theorem 1.1. The distance $d$ between points $P = (x_1, y_1, z_1)$ and $Q = (x_2, y_2, z_2)$ in $\mathbb{R}^3$ is:

$$d = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2} \tag{1.4}$$

The proof will use the following result:

Theorem 1.2. For a vector $\mathbf{v} = (a, b, c)$ in $\mathbb{R}^3$, the magnitude of $\mathbf{v}$ is:

$$\|\mathbf{v}\| = \sqrt{a^2 + b^2 + c^2} \tag{1.5}$$

Proof: There are four cases to consider:

Case 1: $a = b = c = 0$. Then $\mathbf{v} = \mathbf{0}$, so $\|\mathbf{v}\| = 0 = \sqrt{0^2 + 0^2 + 0^2} = \sqrt{a^2 + b^2 + c^2}$.

Case 2: exactly two of $a, b, c$ are 0. Without loss of generality, we assume that $a = b = 0$ and $c \ne 0$ (the other two possibilities are handled in a similar manner). Then $\mathbf{v} = (0, 0, c)$, which is a vector of length $|c|$ along the $z$-axis. So $\|\mathbf{v}\| = |c| = \sqrt{c^2} = \sqrt{0^2 + 0^2 + c^2} = \sqrt{a^2 + b^2 + c^2}$.

Case 3: exactly one of $a, b, c$ is 0. Without loss of generality, we assume that $a = 0$, $b \ne 0$ and $c \ne 0$ (the other two possibilities are handled in a similar manner). Then $\mathbf{v} = (0, b, c)$, which is a vector in the $yz$-plane, so by the Pythagorean Theorem we have $\|\mathbf{v}\| = \sqrt{b^2 + c^2} = \sqrt{0^2 + b^2 + c^2} = \sqrt{a^2 + b^2 + c^2}$.

Case 4: none of $a, b, c$ are 0. Without loss of generality, we can assume that $a, b, c$ are all positive (the other seven possibilities are handled in a similar manner). Consider the points $P = (0, 0, 0)$, $Q = (a, b, c)$, $R = (a, b, 0)$, and $S = (a, 0, 0)$, as shown in Figure 1.1.8. Applying the Pythagorean Theorem to the right triangle $\triangle PSR$ gives $|PR|^2 = a^2 + b^2$. A second application of the Pythagorean Theorem, this time to the right triangle $\triangle PQR$, gives $\|\mathbf{v}\| = |PQ| = \sqrt{|PR|^2 + |QR|^2} = \sqrt{a^2 + b^2 + c^2}$.

This proves the theorem. QED

[Figure 1.1.8: the vector $\mathbf{v}$ from $P$ to $Q(a, b, c)$, with the points $R$ and $S$ in the $xy$-plane]
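Formulas (1.3) and (1.5) have the same shape (the square root of the sum of the squared components), and the distance formulas (1.1) and (1.4) are the same computation applied to the coordinate differences. A minimal Java sketch of that observation follows; the class name `VectorMagnitude`, the method names and the sample values are assumptions made for illustration, not code from the text.

```java
/** Sketch: magnitude of a vector and distance between two points. */
final class VectorMagnitude {

    /** Magnitude of the vector with the given components, starting at the origin. */
    static double magnitude(double... components) {
        double sumOfSquares = 0.0;
        for (double c : components) {
            sumOfSquares += c * c;
        }
        return Math.sqrt(sumOfSquares);
    }

    /** Distance between points p and q, computed as the magnitude of q - p. */
    static double distance(double[] p, double[] q) {
        if (p.length != q.length) {
            throw new IllegalArgumentException("points must have the same dimension");
        }
        double[] difference = new double[p.length];
        for (int i = 0; i < p.length; i++) {
            difference[i] = q[i] - p[i];
        }
        return magnitude(difference);
    }

    public static void main(String[] args) {
        // Illustrative values (not taken from the text).
        System.out.println(magnitude(3, 4));     // a vector in the plane
        System.out.println(magnitude(2, 3, 6));  // a vector in space
        System.out.println(distance(new double[]{1, 2, 3}, new double[]{3, 5, 9}));
    }
}
```

Writing `distance` in terms of `magnitude` mirrors the way formula (1.2) is obtained from formula (1.1): the distance between two points is the magnitude of the vector joining them.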

Example 1.3. Calculate the following:

(a) The magnitude of the vector $\overrightarrow{PQ}$ in $\mathbb{R}^2$ with $P = (-1, 2)$ and $Q = (5, 5)$.
Solution: By formula (1.2), $\|\overrightarrow{PQ}\| = \sqrt{(5 - (-1))^2 + (5 - 2)^2} = \sqrt{36 + 9} = \sqrt{45} = 3\sqrt{5}$.

(b) The magnitude of the vector $\mathbf{v} = (8, 3)$ in $\mathbb{R}^2$.
Solution: By formula (1.3), $\|\mathbf{v}\| = \sqrt{8^2 + 3^2} = \sqrt{73}$.

(c) The distance between the points $P = (2, -1, 4)$ and $Q = (4, 2, -3)$ in $\mathbb{R}^3$.
Solution: By formula (1.4), the distance $d = \sqrt{(4 - 2)^2 + (2 - (-1))^2 + (-3 - 4)^2} = \sqrt{4 + 9 + 49} = \sqrt{62}$.

(d) The magnitude of the vector $\mathbf{v} = (5, 8, -2)$ in $\mathbb{R}^3$.
Solution: By formula (1.5), $\|\mathbf{v}\| = \sqrt{5^2 + 8^2 + (-2)^2} = \sqrt{25 + 64 + 4} = \sqrt{93}$.

Exercises

A

1. Calculate the magnitudes of the following vectors:
(a) $\mathbf{v} = (2, -1)$ (b) $\mathbf{v} = (2, -1, 0)$ (c) $\mathbf{v} = (3, 2, -2)$ (d) $\mathbf{v} = (0, 0, 1)$ (e) $\mathbf{v} = (6, 4, -4)$

2. For the points $P = (1, -1, 1)$, $Q = (2, -2, 2)$, $R = (2, 0, 1)$, $S = (3, -1, 2)$, does $\overrightarrow{PQ} = \overrightarrow{RS}$?

3. For the points $P = (0, 0, 0)$, $Q = (1, 3, 2)$, $R = (1, 0, 1)$, $S = (2, 3, 4)$, does $\overrightarrow{PQ} = \overrightarrow{RS}$?

B

4. Let $\mathbf{v} = (a, b, c)$ and $\mathbf{w} = (3a, 3b, 3c)$ be vectors in $\mathbb{R}^3$. Show that $\|\mathbf{w}\| = 3\|\mathbf{v}\|$.

5. Let $\mathbf{v} = (1, 0, 0)$ and $\mathbf{w} = (a, 0, 0)$ be vectors in $\mathbb{R}^3$. Show that $\|\mathbf{w}\| = |a|\,\|\mathbf{v}\|$.

C

6. Though we will see a simple proof of Theorem 1.1 in the next section, it is possible to prove it using methods similar to those in the proof of Theorem 1.2. Prove the special case of Theorem 1.1 where the points $P = (x_1, y_1, z_1)$ and $Q = (x_2, y_2, z_2)$ satisfy the following conditions: $x_2 > x_1 > 0$, $y_2 > y_1 > 0$, and $z_2 > z_1 > 0$. (Hint: Think of Case 4 in the proof of Theorem 1.2, and consider Figure 1.1.9.)

[Figure 1.1.9: the points $P(x_1, y_1, z_1)$, $Q(x_2, y_2, z_2)$, $R(x_2, y_2, z_1)$, $S(x_1, y_1, 0)$, $T(x_2, y_2, 0)$ and $U(x_2, y_1, 0)$]

1.2 Vector Algebra

Now that we know what vectors are, we can start to perform some of the usual algebraic operations on them (e.g. addition, subtraction). Before doing that, we will introduce the notion of a scalar.

Definition 1.3. A scalar is a quantity that can be represented by a single number.

For our purposes, scalars will always be real numbers.³ Examples of scalar quantities are mass, electric charge, and speed (not velocity).⁴ We can now define scalar multiplication of a vector.

Definition 1.4. For a scalar $k$ and a nonzero vector $\mathbf{v}$, the scalar multiple of $\mathbf{v}$ by $k$, denoted by $k\mathbf{v}$, is the vector whose magnitude is $|k|\,\|\mathbf{v}\|$, points in the same direction as $\mathbf{v}$ if $k > 0$, points in the opposite direction as $\mathbf{v}$ if $k < 0$, and is the zero vector $\mathbf{0}$ if $k = 0$. For the zero vector $\mathbf{0}$, we define $k\mathbf{0} = \mathbf{0}$ for any scalar $k$.

Two vectors $\mathbf{v}$ and $\mathbf{w}$ are parallel (denoted by $\mathbf{v} \parallel \mathbf{w}$) if one is a scalar multiple of the other. You can think of scalar multiplication of a vector as stretching or shrinking the vector, and as flipping the vector in the opposite direction if the scalar is a negative number (see Figure 1.2.1).

[Figure 1.2.1: the vectors $\mathbf{v}$, $2\mathbf{v}$, $3\mathbf{v}$, $0.5\mathbf{v}$, $-\mathbf{v}$ and $-2\mathbf{v}$]

Recall that translating a nonzero vector means that the initial point of the vector is changed but the magnitude and direction are preserved. We are now ready to define the sum of two vectors.

Definition 1.5. The sum of vectors $\mathbf{v}$ and $\mathbf{w}$, denoted by $\mathbf{v} + \mathbf{w}$, is obtained by translating $\mathbf{w}$ so that its initial point is at the terminal point of $\mathbf{v}$; the initial point of $\mathbf{v} + \mathbf{w}$ is the initial point of $\mathbf{v}$, and its terminal point is the new terminal point of $\mathbf{w}$.

³ The term scalar was invented by 19th century Irish mathematician, physicist and astronomer William Rowan Hamilton, to convey the sense of something that could be represented by a point on a scale or graduated ruler. The word vector comes from Latin, where it means “carrier”.

⁴ An alternate definition of scalars and vectors, used in physics, is that under certain types of coordinate transformations (e.g. rotations), a quantity that is not affected is a scalar, while a quantity that is affected (in a certain way) is a vector. See MARION for details.

Intuitively, adding w to v means tacking on w to the end of v (see Figure 1.2.2).

Figure 1.2.2 Adding vectors v and w: (a) Vectors v and w; (b) Translate w to the end of v; (c) The sum v + w

Notice that our definition is valid for the zero vector (which is just a point, and hence can be translated), and so we see that v + 0 = v = 0 + v for any vector v. In particular, 0 + 0 = 0. Also, it is easy to see that v + (−v) = 0, as we would expect. In general, since the scalar multiple −v = −1 v is a well-defined vector, we can define vector subtraction as follows: v − w = v + (−w). See Figure 1.2.3.

Figure 1.2.3 Subtracting vectors v and w: (a) Vectors v and w; (b) Translate −w to the end of v; (c) The difference v − w

Figure 1.2.4 shows the use of “geometric proofs” of various laws of vector algebra, that is, it uses laws from elementary geometry to prove statements about vectors. For example, (a) shows that v + w = w + v for any vectors v, w. And (c) shows how you can think of v − w as the vector that is tacked on to the end of w to add up to v.

Figure 1.2.4 “Geometric” vector algebra: (a) Add vectors; (b) Subtract vectors; (c) Combined add/subtract

Notice that we have temporarily abandoned the practice of starting vectors at the origin. In fact, we have not even mentioned coordinates in this section so far. Since we will deal mostly with Cartesian coordinates in this book, the following two theorems are useful for performing vector algebra on vectors in $\mathbb{R}^2$ and $\mathbb{R}^3$ starting at the origin.

Theorem 1.3. Let $\mathbf{v} = (v_1, v_2)$, $\mathbf{w} = (w_1, w_2)$ be vectors in $\mathbb{R}^2$, and let k be a scalar. Then
(a) $k\mathbf{v} = (kv_1, kv_2)$
(b) $\mathbf{v} + \mathbf{w} = (v_1 + w_1, v_2 + w_2)$

Proof: (a) Without loss of generality, we assume that $v_1, v_2 > 0$ (the other possibilities are handled in a similar manner). If $k = 0$ then $k\mathbf{v} = 0\mathbf{v} = \mathbf{0} = (0, 0) = (0v_1, 0v_2) = (kv_1, kv_2)$, which is what we needed to show. If $k \neq 0$, then $(kv_1, kv_2)$ lies on a line with slope $\frac{kv_2}{kv_1} = \frac{v_2}{v_1}$, which is the same as the slope of the line on which v (and hence kv) lies, and $(kv_1, kv_2)$ points in the same direction on that line as kv. Also, by formula (1.3) the magnitude of $(kv_1, kv_2)$ is

$$\sqrt{(kv_1)^2 + (kv_2)^2} = \sqrt{k^2 v_1^2 + k^2 v_2^2} = \sqrt{k^2 (v_1^2 + v_2^2)} = |k| \sqrt{v_1^2 + v_2^2} = |k|\,\lVert \mathbf{v} \rVert .$$

So kv and $(kv_1, kv_2)$ have the same magnitude and direction. This proves (a).

(b) Without loss of generality, we assume that $v_1, v_2, w_1, w_2 > 0$ (the other possibilities are handled in a similar manner). From Figure 1.2.5, we see that when translating w to start at the end of v, the new terminal point of w is $(v_1 + w_1, v_2 + w_2)$, so by the definition of v + w this must be the terminal point of v + w. This proves (b). QED

Figure 1.2.5

Theorem 1.4. Let $\mathbf{v} = (v_1, v_2, v_3)$, $\mathbf{w} = (w_1, w_2, w_3)$ be vectors in $\mathbb{R}^3$, let k be a scalar. Then
(a) $k\mathbf{v} = (kv_1, kv_2, kv_3)$
(b) $\mathbf{v} + \mathbf{w} = (v_1 + w_1, v_2 + w_2, v_3 + w_3)$

The following theorem summarizes the basic laws of vector algebra.

Theorem 1.5. For any vectors u, v, w, and scalars k, l, we have
(a) v + w = w + v   (Commutative Law)
(b) u + (v + w) = (u + v) + w   (Associative Law)
(c) v + 0 = v = 0 + v   (Additive Identity)
(d) v + (−v) = 0   (Additive Inverse)
(e) k(lv) = (kl)v   (Associative Law)
(f) k(v + w) = kv + kw   (Distributive Law)
(g) (k + l)v = kv + lv   (Distributive Law)

Proof: (a) We already presented a geometric proof of this in Figure 1.2.4(a).

(b) To illustrate the difference between analytic proofs and geometric proofs in vector algebra, we will present both types here. For the analytic proof, we will use vectors in $\mathbb{R}^3$ (the proof for $\mathbb{R}^2$ is similar).

Let $\mathbf{u} = (u_1, u_2, u_3)$, $\mathbf{v} = (v_1, v_2, v_3)$, $\mathbf{w} = (w_1, w_2, w_3)$ be vectors in $\mathbb{R}^3$. Then

\begin{align*}
\mathbf{u} + (\mathbf{v} + \mathbf{w}) &= (u_1, u_2, u_3) + ((v_1, v_2, v_3) + (w_1, w_2, w_3)) \\
&= (u_1, u_2, u_3) + (v_1 + w_1, v_2 + w_2, v_3 + w_3) && \text{by Theorem 1.4(b)} \\
&= (u_1 + (v_1 + w_1), u_2 + (v_2 + w_2), u_3 + (v_3 + w_3)) && \text{by Theorem 1.4(b)} \\
&= ((u_1 + v_1) + w_1, (u_2 + v_2) + w_2, (u_3 + v_3) + w_3) && \text{by properties of real numbers} \\
&= (u_1 + v_1, u_2 + v_2, u_3 + v_3) + (w_1, w_2, w_3) && \text{by Theorem 1.4(b)} \\
&= (\mathbf{u} + \mathbf{v}) + \mathbf{w}
\end{align*}

This completes the analytic proof of (b). Figure 1.2.6 provides the geometric proof.

Figure 1.2.6 Associative Law for vector addition: u + (v + w) = (u + v) + w

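(c) We already discussed this earlier in this section, right after the definition of vector addition.

(d) We already discussed this earlier in this section as well.

(e) We will prove this for a vector $\mathbf{v} = (v_1, v_2, v_3)$ in $\mathbb{R}^3$ (the proof for $\mathbb{R}^2$ is similar):

\begin{align*}
k(l\mathbf{v}) &= k(lv_1, lv_2, lv_3) && \text{by Theorem 1.4(a)} \\
&= (klv_1, klv_2, klv_3) && \text{by Theorem 1.4(a)} \\
&= (kl)(v_1, v_2, v_3) && \text{by Theorem 1.4(a)} \\
&= (kl)\mathbf{v}
\end{align*}

(f) and (g): Left as exercises for the reader. QED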
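The componentwise formulas of Theorems 1.3 and 1.4 translate directly into code. The following is a minimal sketch, not part of the original text: the helper names `scale` and `add` are hypothetical, vectors are represented as plain tuples, and the sample vectors use integer components so that the laws of Theorem 1.5 can be checked with exact equality.

```python
from typing import Sequence, Tuple

Vector = Tuple[float, ...]


def scale(k: float, v: Sequence[float]) -> Vector:
    """Scalar multiple kv, computed componentwise (Theorems 1.3(a), 1.4(a))."""
    return tuple(k * vi for vi in v)


def add(v: Sequence[float], w: Sequence[float]) -> Vector:
    """Sum v + w, computed componentwise (Theorems 1.3(b), 1.4(b))."""
    if len(v) != len(w):
        raise ValueError("vectors must have the same dimension")
    return tuple(vi + wi for vi, wi in zip(v, w))


# Spot-check the laws of Theorem 1.5 on arbitrary sample vectors.
u, v, w = (1, 2, 3), (4, -5, 6), (-7, 8, 9)
k, l = 2, -3
zero = (0, 0, 0)

assert add(v, w) == add(w, v)                                # (a)
assert add(u, add(v, w)) == add(add(u, v), w)                # (b)
assert add(v, zero) == v == add(zero, v)                     # (c)
assert add(v, scale(-1, v)) == zero                          # (d)
assert scale(k, scale(l, v)) == scale(k * l, v)              # (e)
assert scale(k, add(v, w)) == add(scale(k, v), scale(k, w))  # (f)
assert scale(k + l, v) == add(scale(k, v), scale(l, v))      # (g)
```

A numerical check on a few vectors is of course not a proof; it only illustrates what each law says in coordinates.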

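A unit vector is a vector with magnitude 1. Notice that for any nonzero vector v, the vector $\frac{\mathbf{v}}{\lVert \mathbf{v} \rVert}$ is a unit vector which points in the same direction as v, since $\frac{1}{\lVert \mathbf{v} \rVert} > 0$ and $\left\lVert \frac{\mathbf{v}}{\lVert \mathbf{v} \rVert} \right\rVert = \frac{\lVert \mathbf{v} \rVert}{\lVert \mathbf{v} \rVert} = 1$. Dividing a nonzero vector v by $\lVert \mathbf{v} \rVert$ is often called normalizing v.

There are specific unit vectors which we will often use, called the basis vectors: $\mathbf{i} = (1, 0, 0)$, $\mathbf{j} = (0, 1, 0)$, and $\mathbf{k} = (0, 0, 1)$ in $\mathbb{R}^3$; $\mathbf{i} = (1, 0)$ and $\mathbf{j} = (0, 1)$ in $\mathbb{R}^2$. These are useful for several reasons: they are mutually perpendicular, since they lie on distinct coordinate axes; they are all unit vectors: $\lVert \mathbf{i} \rVert = \lVert \mathbf{j} \rVert = \lVert \mathbf{k} \rVert = 1$; every vector can be written as a unique scalar combination of the basis vectors: $\mathbf{v} = (a, b) = a\,\mathbf{i} + b\,\mathbf{j}$ in $\mathbb{R}^2$, $\mathbf{v} = (a, b, c) = a\,\mathbf{i} + b\,\mathbf{j} + c\,\mathbf{k}$ in $\mathbb{R}^3$. See Figure 1.2.7.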

Figure 1.2.7 Basis vectors in different dimensions: (a) $\mathbb{R}^2$; (b) $\mathbf{v} = a\,\mathbf{i} + b\,\mathbf{j}$; (c) $\mathbb{R}^3$; (d) $\mathbf{v} = a\,\mathbf{i} + b\,\mathbf{j} + c\,\mathbf{k}$

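When a vector $\mathbf{v} = (a, b, c)$ is written as $\mathbf{v} = a\,\mathbf{i} + b\,\mathbf{j} + c\,\mathbf{k}$, we say that v is in component form, and that a, b, and c are the i, j, and k components, respectively, of v. We have:

\begin{align*}
\mathbf{v} = v_1\mathbf{i} + v_2\mathbf{j} + v_3\mathbf{k},\ k \text{ a scalar} &\implies k\mathbf{v} = kv_1\mathbf{i} + kv_2\mathbf{j} + kv_3\mathbf{k} \\
\mathbf{v} = v_1\mathbf{i} + v_2\mathbf{j} + v_3\mathbf{k},\ \mathbf{w} = w_1\mathbf{i} + w_2\mathbf{j} + w_3\mathbf{k} &\implies \mathbf{v} + \mathbf{w} = (v_1 + w_1)\mathbf{i} + (v_2 + w_2)\mathbf{j} + (v_3 + w_3)\mathbf{k} \\
\mathbf{v} = v_1\mathbf{i} + v_2\mathbf{j} + v_3\mathbf{k} &\implies \lVert \mathbf{v} \rVert = \sqrt{v_1^2 + v_2^2 + v_3^2}
\end{align*}

Example 1.4. Let $\mathbf{v} = (2, 1, -1)$ and $\mathbf{w} = (3, -4, 2)$ in $\mathbb{R}^3$.

(a) Find v − w.
Solution: $\mathbf{v} - \mathbf{w} = (2 - 3, 1 - (-4), -1 - 2) = (-1, 5, -3)$

(b) Find 3v + 2w.
Solution: $3\mathbf{v} + 2\mathbf{w} = (6, 3, -3) + (6, -8, 4) = (12, -5, 1)$

(c) Write v and w in component form.
Solution: $\mathbf{v} = 2\,\mathbf{i} + \mathbf{j} - \mathbf{k}$, $\mathbf{w} = 3\,\mathbf{i} - 4\,\mathbf{j} + 2\,\mathbf{k}$

(d) Find the vector u such that u + v = w.
Solution: By Theorem 1.5, $\mathbf{u} = \mathbf{w} - \mathbf{v} = -(\mathbf{v} - \mathbf{w}) = -(-1, 5, -3) = (1, -5, 3)$, by part (a).

(e) Find the vector u such that u + v + w = 0.
Solution: By Theorem 1.5, $\mathbf{u} = -\mathbf{w} - \mathbf{v} = -(3, -4, 2) - (2, 1, -1) = (-5, 3, -1)$.

(f) Find the vector u such that $2\mathbf{u} + \mathbf{i} - 2\,\mathbf{j} = \mathbf{k}$.
Solution: $2\mathbf{u} = -\mathbf{i} + 2\,\mathbf{j} + \mathbf{k} \implies \mathbf{u} = -\frac{1}{2}\,\mathbf{i} + \mathbf{j} + \frac{1}{2}\,\mathbf{k}$

(g) Find the unit vector $\frac{\mathbf{v}}{\lVert \mathbf{v} \rVert}$.
Solution: $\frac{\mathbf{v}}{\lVert \mathbf{v} \rVert} = \frac{1}{\sqrt{2^2 + 1^2 + (-1)^2}}\,(2, 1, -1) = \left( \frac{2}{\sqrt{6}}, \frac{1}{\sqrt{6}}, \frac{-1}{\sqrt{6}} \right)$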
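The computations in Example 1.4 can be reproduced with a few lines of code. This is a sketch under stated assumptions, not something taken from the text: the helper names (`scale`, `add`, `sub`, `norm`, `normalize`) are hypothetical, and vectors are again stored as tuples.

```python
import math


def scale(k, v):
    """Scalar multiple kv, componentwise."""
    return tuple(k * x for x in v)


def add(v, w):
    """Sum v + w, componentwise."""
    return tuple(a + b for a, b in zip(v, w))


def sub(v, w):
    """Difference v - w, defined as v + (-w)."""
    return add(v, scale(-1, w))


def norm(v):
    """Magnitude of v: square root of the sum of squared components."""
    return math.sqrt(sum(x * x for x in v))


def normalize(v):
    """Unit vector in the direction of a nonzero vector v."""
    n = norm(v)
    if n == 0:
        raise ValueError("the zero vector cannot be normalized")
    return scale(1 / n, v)


v, w = (2, 1, -1), (3, -4, 2)

print(sub(v, w))                        # part (a): (-1, 5, -3)
print(add(scale(3, v), scale(2, w)))    # part (b): (12, -5, 1)
print(sub(w, v))                        # part (d): (1, -5, 3)
print(sub(scale(-1, w), v))             # part (e): (-5, 3, -1)

unit = normalize(v)                     # part (g)
print(math.isclose(norm(unit), 1.0))    # True
```

Defining `sub` through `add` and `scale` mirrors the definition v − w = v + (−w) rather than subtracting components directly; both give the same result.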


We can now easily prove Theorem 1.1 from the previous section. The distance d between two points $P = (x_1, y_1, z_1)$ and $Q = (x_2, y_2, z_2)$ in $\mathbb{R}^3$ is the same as the length of the vector w − v, where the vectors v and w are defined as $\mathbf{v} = (x_1, y_1, z_1)$ and $\mathbf{w} = (x_2, y_2, z_2)$ (see Figure 1.2.8). So since $\mathbf{w} - \mathbf{v} = (x_2 - x_1, y_2 - y_1, z_2 - z_1)$, then

$$d = \lVert \mathbf{w} - \mathbf{v} \rVert = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2}$$

by Theorem 1.2.

Figure 1.2.8 Proof of Theorem 1.1: $d = \lVert \mathbf{w} - \mathbf{v} \rVert$

Exercises

A

1. Let $\mathbf{v} = (-1, 5, -2)$ and $\mathbf{w} = (3, 1, 1)$.
(a) Find v − w.
(b) Find v + w.
(c) Find $\frac{\mathbf{v}}{\lVert \mathbf{v} \rVert}$.
(d) Find $\left\lVert \frac{1}{2}(\mathbf{v} + \mathbf{w}) \right\rVert$.
(e) Find $\left\lVert \frac{1}{2}(\mathbf{v} - \mathbf{w}) \right\rVert$.
(f) Find −2 v + 4 w.
(g) Find v − 2 w.
(h) Find the vector u such that u + v + w = i.
(i) Find the vector u such that u + v + w = 2 j + k.
(j) Is there a scalar m such that m(v + 2 w) = k? If so, find it.

2. For the vectors v and w from Exercise 1, is $\lVert \mathbf{v} - \mathbf{w} \rVert = \lVert \mathbf{v} \rVert - \lVert \mathbf{w} \rVert$? If not, which quantity is larger?

3. For the vectors v and w from Exercise 1, is $\lVert \mathbf{v} + \mathbf{w} \rVert = \lVert \mathbf{v} \rVert + \lVert \mathbf{w} \rVert$? If not, which quantity is larger?

B

4. Prove Theorem 1.5(f) for $\mathbb{R}^3$.

5. Prove Theorem 1.5(g) for $\mathbb{R}^3$.

C

6. We know that every vector in $\mathbb{R}^3$ can be written as a scalar combination of the vectors i, j, and k. Can every vector in $\mathbb{R}^3$ be written as a scalar combination of just i and j, i.e. for any vector v in $\mathbb{R}^3$, are there scalars m, n such that v = m i + n j? Justify your answer.

1.3 Dot Product

You may have noticed that while we did define multiplication of a vector by a scalar in the previous section on vector algebra, we did not define multiplication of a vector by a vector. We will now see one type of multiplication of vectors, called the dot product.

Definition 1.6. Let $\mathbf{v} = (v_1, v_2, v_3)$ and $\mathbf{w} = (w_1, w_2, w_3)$ be vectors in $\mathbb{R}^3$. The dot product of v and w, denoted by $\mathbf{v} \cdot \mathbf{w}$, is given by:

$$\mathbf{v} \cdot \mathbf{w} = v_1 w_1 + v_2 w_2 + v_3 w_3 \tag{1.6}$$

Similarly, for vectors $\mathbf{v} = (v_1, v_2)$ and $\mathbf{w} = (w_1, w_2)$ in $\mathbb{R}^2$, the dot product is:

$$\mathbf{v} \cdot \mathbf{w} = v_1 w_1 + v_2 w_2 \tag{1.7}$$

Notice that the dot product of two vectors is a scalar, not a vector. So the associative law that holds for multiplication of numbers and for addition of vectors (see Theorem 1.5(b),(e)), does not hold for the dot product of vectors. Why? Because for vectors u, v, w, the dot product $\mathbf{u} \cdot \mathbf{v}$ is a scalar, and so $(\mathbf{u} \cdot \mathbf{v}) \cdot \mathbf{w}$ is not defined since the left side of that dot product (the part in parentheses) is a scalar and not a vector.

For vectors $\mathbf{v} = v_1\mathbf{i} + v_2\mathbf{j} + v_3\mathbf{k}$ and $\mathbf{w} = w_1\mathbf{i} + w_2\mathbf{j} + w_3\mathbf{k}$ in component form, the dot product is still $\mathbf{v} \cdot \mathbf{w} = v_1 w_1 + v_2 w_2 + v_3 w_3$.

Also notice that we defined the dot product in an analytic way, i.e. by referencing vector coordinates. There is a geometric way of defining the dot product, which we will now develop as a consequence of the analytic definition.

Definition 1.7. The angle between two nonzero vectors with the same initial point is the smallest angle between them.

We do not define the angle between the zero vector and any other vector. Any two nonzero vectors with the same initial point have two angles between them: θ and 360° − θ. We will always choose the smallest nonnegative angle θ between them, so that 0° ≤ θ ≤ 180°. See Figure 1.3.1.

Figure 1.3.1 Angle between vectors: (a) 0° < θ < 180°; (b) θ = 180°; (c) θ = 0°

We can now take a more geometric view of the dot product by establishing a relationship between the dot product of two vectors and the angle between them.

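Theorem 1.6. Let v, w be nonzero vectors, and let θ be the angle between them. Then

$$\cos\theta = \frac{\mathbf{v} \cdot \mathbf{w}}{\lVert \mathbf{v} \rVert\,\lVert \mathbf{w} \rVert} \tag{1.8}$$

Proof: We will prove the theorem for vectors in $\mathbb{R}^3$ (the proof for $\mathbb{R}^2$ is similar). Let $\mathbf{v} = (v_1, v_2, v_3)$ and $\mathbf{w} = (w_1, w_2, w_3)$. By the Law of Cosines (see Figure 1.3.2), we have

$$\lVert \mathbf{v} - \mathbf{w} \rVert^2 = \lVert \mathbf{v} \rVert^2 + \lVert \mathbf{w} \rVert^2 - 2\,\lVert \mathbf{v} \rVert\,\lVert \mathbf{w} \rVert \cos\theta \tag{1.9}$$

(note that equation (1.9) holds even for the “degenerate” cases θ = 0° and 180°).

Figure 1.3.2

Since $\mathbf{v} - \mathbf{w} = (v_1 - w_1, v_2 - w_2, v_3 - w_3)$, expanding $\lVert \mathbf{v} - \mathbf{w} \rVert^2$ in equation (1.9) gives

\begin{align*}
\lVert \mathbf{v} \rVert^2 + \lVert \mathbf{w} \rVert^2 - 2\,\lVert \mathbf{v} \rVert\,\lVert \mathbf{w} \rVert \cos\theta &= (v_1 - w_1)^2 + (v_2 - w_2)^2 + (v_3 - w_3)^2 \\
&= (v_1^2 - 2v_1 w_1 + w_1^2) + (v_2^2 - 2v_2 w_2 + w_2^2) + (v_3^2 - 2v_3 w_3 + w_3^2) \\
&= (v_1^2 + v_2^2 + v_3^2) + (w_1^2 + w_2^2 + w_3^2) - 2(v_1 w_1 + v_2 w_2 + v_3 w_3) \\
&= \lVert \mathbf{v} \rVert^2 + \lVert \mathbf{w} \rVert^2 - 2(\mathbf{v} \cdot \mathbf{w}),
\end{align*}

so $-2\,\lVert \mathbf{v} \rVert\,\lVert \mathbf{w} \rVert \cos\theta = -2(\mathbf{v} \cdot \mathbf{w})$, so since $\mathbf{v} \neq \mathbf{0}$ and $\mathbf{w} \neq \mathbf{0}$ then

$$\cos\theta = \frac{\mathbf{v} \cdot \mathbf{w}}{\lVert \mathbf{v} \rVert\,\lVert \mathbf{w} \rVert},$$

since $\lVert \mathbf{v} \rVert > 0$ and $\lVert \mathbf{w} \rVert > 0$. QED

Example 1.5. Find the angle θ between the vectors $\mathbf{v} = (2, 1, -1)$ and $\mathbf{w} = (3, -4, 1)$.
Solution: Since $\mathbf{v} \cdot \mathbf{w} = (2)(3) + (1)(-4) + (-1)(1) = 1$, $\lVert \mathbf{v} \rVert = \sqrt{6}$, and $\lVert \mathbf{w} \rVert = \sqrt{26}$, then

$$\cos\theta = \frac{\mathbf{v} \cdot \mathbf{w}}{\lVert \mathbf{v} \rVert\,\lVert \mathbf{w} \rVert} = \frac{1}{\sqrt{6}\,\sqrt{26}} = \frac{1}{2\sqrt{39}} \approx 0.08 \implies \theta = 85.41^\circ$$

Two nonzero vectors are perpendicular if the angle between them is 90°. Since cos 90° = 0, we have the following important corollary to Theorem 1.6:

Corollary 1.7. Two nonzero vectors v and w are perpendicular if and only if $\mathbf{v} \cdot \mathbf{w} = 0$.

We will write $\mathbf{v} \perp \mathbf{w}$ to indicate that v and w are perpendicular.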
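Formula (1.8) gives a direct recipe for computing the angle between two nonzero vectors. The sketch below is an illustration only: the function names `dot`, `norm`, and `angle_degrees` are hypothetical, and the clamping step is an added safeguard against floating-point rounding, not something the theorem itself requires.

```python
import math


def dot(v, w):
    """Dot product: sum of products of corresponding components."""
    return sum(a * b for a, b in zip(v, w))


def norm(v):
    """Magnitude of v."""
    return math.sqrt(dot(v, v))


def angle_degrees(v, w):
    """Angle between nonzero vectors v and w, in degrees, from formula (1.8)."""
    nv, nw = norm(v), norm(w)
    if nv == 0 or nw == 0:
        raise ValueError("the angle is not defined for the zero vector")
    cos_theta = dot(v, w) / (nv * nw)
    # Rounding can push the ratio slightly outside [-1, 1]; clamp it.
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.degrees(math.acos(cos_theta))


# Example 1.5: v = (2, 1, -1), w = (3, -4, 1)
theta = angle_degrees((2, 1, -1), (3, -4, 1))
print(round(theta, 2))  # 85.41
```

Because `math.acos` returns a value between 0 and π, the result automatically lies in the range 0° ≤ θ ≤ 180° used in the definition of the angle between vectors.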

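Since cos θ > 0 for 0° ≤ θ < 90° and cos θ < 0 for 90° < θ ≤ 180°, we also have:

Corollary 1.8. If θ is the angle between nonzero vectors v and w, then

$$\mathbf{v} \cdot \mathbf{w} \text{ is } \begin{cases} > 0 & \text{for } 0^\circ \le \theta < 90^\circ \\ 0 & \text{for } \theta = 90^\circ \\ < 0 & \text{for } 90^\circ < \theta \le 180^\circ \end{cases}$$

By Corollary 1.8, the dot product can be thought of as a way of telling if the angle between two vectors is acute, obtuse, or a right angle, depending on whether the dot product is positive, negative, or zero, respectively. See Figure 1.3.3.

Figure 1.3.3 Sign of the dot product & angle between vectors: (a) $\mathbf{v} \cdot \mathbf{w} > 0$, 0° ≤ θ < 90°; (b) $\mathbf{v} \cdot \mathbf{w} < 0$, 90° < θ ≤ 180°; (c) $\mathbf{v} \cdot \mathbf{w} = 0$, θ = 90°

Example 1.6. Are the vectors $\mathbf{v} = (-1, 5, -2)$ and $\mathbf{w} = (3, 1, 1)$ perpendicular?
Solution: Yes, $\mathbf{v} \perp \mathbf{w}$ since $\mathbf{v} \cdot \mathbf{w} = (-1)(3) + (5)(1) + (-2)(1) = 0$.

The following theorem summarizes the basic properties of the dot product.

Theorem 1.9. For any vectors u, v, w, and scalar k, we have
(a) $\mathbf{v} \cdot \mathbf{w} = \mathbf{w} \cdot \mathbf{v}$   (Commutative Law)
(b) $(k\mathbf{v}) \cdot \mathbf{w} = \mathbf{v} \cdot (k\mathbf{w}) = k(\mathbf{v} \cdot \mathbf{w})$   (Associative Law)
(c) $\mathbf{v} \cdot \mathbf{0} = 0 = \mathbf{0} \cdot \mathbf{v}$
(d) $\mathbf{u} \cdot (\mathbf{v} + \mathbf{w}) = \mathbf{u} \cdot \mathbf{v} + \mathbf{u} \cdot \mathbf{w}$   (Distributive Law)
(e) $(\mathbf{u} + \mathbf{v}) \cdot \mathbf{w} = \mathbf{u} \cdot \mathbf{w} + \mathbf{v} \cdot \mathbf{w}$   (Distributive Law)
(f) $|\mathbf{v} \cdot \mathbf{w}| \le \lVert \mathbf{v} \rVert\,\lVert \mathbf{w} \rVert$   (Cauchy-Schwarz Inequality)[5]

Proof: The proofs of parts (a)-(e) are straightforward applications of the definition of the dot product, and are left to the reader as exercises. We will prove part (f).

(f) If either $\mathbf{v} = \mathbf{0}$ or $\mathbf{w} = \mathbf{0}$, then $\mathbf{v} \cdot \mathbf{w} = 0$ by part (c), and so the inequality holds trivially. So assume that v and w are nonzero vectors. Then by Theorem 1.6,

$$\mathbf{v} \cdot \mathbf{w} = \cos\theta\,\lVert \mathbf{v} \rVert\,\lVert \mathbf{w} \rVert, \text{ so } |\mathbf{v} \cdot \mathbf{w}| = |\cos\theta|\,\lVert \mathbf{v} \rVert\,\lVert \mathbf{w} \rVert, \text{ so } |\mathbf{v} \cdot \mathbf{w}| \le \lVert \mathbf{v} \rVert\,\lVert \mathbf{w} \rVert \text{ since } |\cos\theta| \le 1.$$

QED

5 Also known as the Cauchy-Schwarz-Buniakovski Inequality.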
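The sign test of Corollary 1.8 and the Cauchy-Schwarz Inequality are both easy to check numerically. The following is a small sketch with hypothetical helper names (`dot`, `norm`, `angle_range`); it assumes vectors with integer components so that a dot product of exactly 0 can be detected with `==`, which would not be reliable for general floating-point input.

```python
import math


def dot(v, w):
    """Dot product: sum of products of corresponding components."""
    return sum(a * b for a, b in zip(v, w))


def norm(v):
    """Magnitude of v."""
    return math.sqrt(dot(v, v))


def angle_range(v, w):
    """Range of the angle between nonzero v and w, read off the sign of v . w."""
    if not any(v) or not any(w):
        raise ValueError("the angle is not defined for the zero vector")
    d = dot(v, w)
    if d > 0:
        return "0 <= theta < 90"
    if d < 0:
        return "90 < theta <= 180"
    return "theta = 90"


# Example 1.6: the dot product is 0, so the vectors are perpendicular.
v, w = (-1, 5, -2), (3, 1, 1)
print(dot(v, w))          # 0
print(angle_range(v, w))  # theta = 90

# Cauchy-Schwarz on another pair: |v . w| <= ||v|| ||w||
a, b = (2, 1, -1), (3, -4, 1)
print(abs(dot(a, b)) <= norm(a) * norm(b))  # True
```

Note that the sign test needs no square roots or inverse cosines, which is why it is often preferred when only the type of angle matters.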

Let v = −3 i − 2 j − k and w = 6 i + 4 j + 2 k. we have the following fact: If u ⊥ v and u ⊥ w. Thus.9. which proves (b). so subtracting w from both sides gives v − w ≤ v − w . If nonzero vectors v and w are parallel. l. so since a ≤ |a| for any real number a. 3). Another way of saying this is with the familiar statement “the shortest distance between two points is a straight line.9(f) we have 2 w + w = ( v + w )2 and so v + w ≤ v + w after taking square roots of both sides.3. l. For Exercises 3-8. QED The Triangle Inequality gets its name from the fact that in any triangle.18 CHAPTER 1. 2. w.” ¨ v+w w v Figure 1. we see that if u · v = 0 and u · w = 0. we have (a) v 2 = v · v (b) v + w ≤ v + w Triangle Inequality (c) v − w ≥ v − w Proof: (a) Left as an exercise for the reader. the collection of all scalar combinations kv + lw is called the span of v and w. Let v = (5. Calculate v · w. if they are not parallel. so by Theorem 1. then their span is a plane. . then v = w + (v − w) ≤ w + v − w by the Triangle Inequality. we have . −4. the most important of which is the Triangle Inequality.9. The dot product can be used to derive properties of the magnitudes of vectors.4 A Exercises © 1. as given in the following theorem: Theorem 1.4). (c) Since v = w + (v − w).10. VECTORS IN EUCLIDEAN SPACE Using Theorem 1. no one side is longer than the sum of the lengths of the other two sides (see Figure 1. then u ⊥ (kv + lw) for all scalars k. (b) By part (a) and Theorem 1. Calculate v · w. we have v+w 2 = v ≤ v ≤ v = (v + w) · (v + w) = v · v + v · w + w · v + w · w 2 2 2 + 2 |v · w| + w +2 v + 2(v · w) + w 2 2 . find the angle θ between the vectors v and w. So what we showed above is that a vector which is perpendicular to two other vectors is also perpendicular to their span. then their span is a line. −2) and w = (4.3. 1. For vectors v and w. then u · (kv + lw) = k(u · v) + l(u · w) = k(0) + l(0) = 0 for all scalars k. For any vectors v.

(Note: α.1. 3) and w = (−2. γ are often called the direction angles of v. 22. 16.9(d). w from Exercise 6. Let α.3 Dot Product 3. verify the Triangle Inequality v + w ≤ v + w .9(e). −2). w (Hint: Consider the angle between v and w. 11.5 w 26. 6. Prove Theorem 1. 4). w = (1. 2. the projection of v onto w (sometimes written as pro jw v) is the vector u along the same line L as w whose terminal point is obtained by dropping a perpendicular line from the terminal point of v to L (see Figure 1. v = (2.9(c). 10. For nonzero vectors v and w. v = (5. v = (4. v = − i + 2 j + k. w = (8. w. and cos α. 3) 5.) v L u Figure 1. Prove or give a counterexample: If u · v = u · w. 4) and w = (0. cos γ are called the direction cosines. Show that cos2 α + cos2 β + cos2 γ = 1. 2. then w = 0. j. Prove that v − w ≤ v − w for all v. C 21.) . 0. v = (7. 4.9(b). 17. 4) 6. 0) 7. −2) 8. w = −3 i + 6 j + 3 k 4. and k. β. Prove Theorem 1. Prove or give a counterexample: If v · w = 0 for all v. B Note: Consider only vectors in 15. w = 3 i + 2 j + 4k 19 9. for Exercises 15-25. 2. 1. For v. w from Exercise 5. respectively. 14. 19. v = i. w . verify the Cauchy-Schwarz Inequality |v · w| ≤ v 12. Is v ⊥ w? Justify your answer. cos β. For v. Prove Theorem 1. 4. verify the Cauchy-Schwarz Inequality |v · w| ≤ v 13. verify the Triangle Inequality v + w ≤ v + w .10(a).3. 4). w = (2. w from Exercise 5. Show that |v · w| u = . For v.3. then v = w. −10). −4. −2. β. −1). Let v = (8. 25. Is v ⊥ w? Justify your answer. 24. and γ be the angles between a nonzero vector v in 3 and the vectors i. Prove Theorem 1. 1. For v. w from Exercise 6. Prove Theorem 1. w = (4. then v = w. −1). Prove or give a counterexample: If u · v = u · w for all u. 23.9(a).5). 3 w . 20. Prove Theorem 1. 18. Let v = (6. 1.

1.4 Cross Product

In Section 1.3 we defined the dot product, which gave a way of multiplying two vectors. The resulting product, however, was a scalar, not a vector. In this section we will define a product of two vectors that does result in another vector. This product, called the cross product, is only defined for vectors in $\mathbb{R}^3$. The definition may appear strange and lacking motivation, but we will see the geometric basis for it shortly.

Definition 1.8. Let $\mathbf{v} = (v_1, v_2, v_3)$ and $\mathbf{w} = (w_1, w_2, w_3)$ be vectors in $\mathbb{R}^3$. The cross product of v and w, denoted by $\mathbf{v} \times \mathbf{w}$, is the vector in $\mathbb{R}^3$ given by:

$$\mathbf{v} \times \mathbf{w} = (v_2 w_3 - v_3 w_2,\ v_3 w_1 - v_1 w_3,\ v_1 w_2 - v_2 w_1) \tag{1.10}$$

Example 1.7. Find $\mathbf{i} \times \mathbf{j}$.
Solution: Since $\mathbf{i} = (1, 0, 0)$ and $\mathbf{j} = (0, 1, 0)$, then

\begin{align*}
\mathbf{i} \times \mathbf{j} &= ((0)(0) - (0)(1),\ (0)(0) - (1)(0),\ (1)(1) - (0)(0)) \\
&= (0, 0, 1) \\
&= \mathbf{k}
\end{align*}

Similarly it can be shown that $\mathbf{j} \times \mathbf{k} = \mathbf{i}$ and $\mathbf{k} \times \mathbf{i} = \mathbf{j}$.

Figure 1.4.1 ($\mathbf{k} = \mathbf{i} \times \mathbf{j}$)

In the above example, the cross product of the given vectors was perpendicular to both those vectors. It turns out that this will always be the case.

Theorem 1.11. If the cross product $\mathbf{v} \times \mathbf{w}$ of two nonzero vectors v and w is also a nonzero vector, then it is perpendicular to both v and w.

Proof: We will show that $(\mathbf{v} \times \mathbf{w}) \cdot \mathbf{v} = 0$:

\begin{align*}
(\mathbf{v} \times \mathbf{w}) \cdot \mathbf{v} &= (v_2 w_3 - v_3 w_2,\ v_3 w_1 - v_1 w_3,\ v_1 w_2 - v_2 w_1) \cdot (v_1, v_2, v_3) \\
&= v_2 w_3 v_1 - v_3 w_2 v_1 + v_3 w_1 v_2 - v_1 w_3 v_2 + v_1 w_2 v_3 - v_2 w_1 v_3 \\
&= v_1 v_2 w_3 - v_1 v_2 w_3 + w_1 v_2 v_3 - w_1 v_2 v_3 + v_1 w_2 v_3 - v_1 w_2 v_3 \\
&= 0, \text{ after rearranging the terms.}
\end{align*}

∴ $\mathbf{v} \times \mathbf{w} \perp \mathbf{v}$ by Corollary 1.7. The proof that $\mathbf{v} \times \mathbf{w} \perp \mathbf{w}$ is similar. QED

As a consequence of the above theorem and Theorem 1.9, we have the following:

Corollary 1.12. If the cross product $\mathbf{v} \times \mathbf{w}$ of two nonzero vectors v and w is also a nonzero vector, then it is perpendicular to the span of v and w.
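Formula (1.10) and the perpendicularity statement of Theorem 1.11 can be checked with a short sketch. The helper names `cross` and `dot` are hypothetical, and the second pair of vectors is the pair v = 4i − j + 3k, w = i + 2k that appears later in the text's determinant example.

```python
def dot(v, w):
    """Dot product: sum of products of corresponding components."""
    return sum(a * b for a, b in zip(v, w))


def cross(v, w):
    """Cross product of two vectors in R^3, following formula (1.10)."""
    v1, v2, v3 = v
    w1, w2, w3 = w
    return (v2 * w3 - v3 * w2,
            v3 * w1 - v1 * w3,
            v1 * w2 - v2 * w1)


i, j, k = (1, 0, 0), (0, 1, 0), (0, 0, 1)

# Example 1.7 and the two similar identities.
assert cross(i, j) == k
assert cross(j, k) == i
assert cross(k, i) == j

# Theorem 1.11: v x w is perpendicular to both v and w.
v, w = (4, -1, 3), (1, 0, 2)
c = cross(v, w)
print(c)                     # (-2, -5, 1)
print(dot(c, v), dot(c, w))  # 0 0
```

With integer components the two dot products come out as exactly 0; with floating-point components one would instead compare against a small tolerance.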

so we have: . there are two possible directions for v× w.4.6 v 2 w 2 w 2 (1 − cos2 θ) . since v > 0 and w > 0. for nonzero vectors v. and since 0◦ ≤ θ ≤ 180◦ . It turns out (see Appendix B) that the direction of v × w is given by the right-hand rule. v2 w2 . so by Theorem 1. then sin θ ≥ 0. Recall from Section 1. As shown in Figure × 1. and v2 w2 on the right side gives 1 1 2 2 3 3 2 = v2 (w2 + w2 + w2 ) + v2 (w2 + w2 + w2 ) + v3 (w2 + w2 + w2 ) 2 3 1 2 3 2 1 2 3 1 1 2 − (v1 w2 + v2 w2 + v2 w2 + 2(v1 w1 v2 w2 + v1 w1 v3 w3 + v2 w2 v3 w3 )) 3 3 1 2 2 2 = v2 w2 − 2v2 w2 v3 w3 + v2 w2 + v2 w1 − 2v1 w1 v3 w3 + v2 w2 + v2 w2 − 2v1 w1 v2 w2 + v2 w2 2 3 3 2 3 1 3 1 2 2 1 2 = (v1 + v2 + v2 )(w2 + w2 + w2 ) 2 3 1 2 3 − ((v1 w1 )2 + (v2 w2 )2 + (v3 w3 )2 + 2(v1 w1 )(v2 w2 ) + 2(v1 w1 )(v3 w3 ) + 2(v2 w2 )(v3 w3 )) so using (a + b + c)2 = a2 + b2 + c2 + 2ab + 2ac + 2bc for the subtracted term gives 2 = (v1 + v2 + v2 )(w2 + w2 + w2 ) − (v1 w1 + v2 w2 + v3 w3 )2 2 3 1 2 3 2 = v w 2 = v = v v× w 2 2 2 2 w = v w (v · w)2 .1 that this means that you can point your thumb upwards in the direction of v × w while rotating v towards w with the remaining four fingers. z v θ w x 0 P v× w y −v × w Direction of v × w Figure 1. nonparallel vectors v. so the above corollary shows that v × w is perpendicular to that plane. where θ is the angle between v and w. v × w form a right-handed system. one the opposite of the other. w in 3 is a plane P.4 Cross Product 21 The span of any two nonzero.4.1.2. so 2 − (v · w)2 1− 2 sin2 θ . the vectors v. w. w: v× w 2 = (v2 w3 − v3 w2 )2 + (v3 w1 − v1 w3 )2 + (v1 w2 − v2 w1 )2 = v2 (w2 + w2 ) + v2 (w2 + w2 ) + v2 (w2 + w2 ) − 2(v1 w1 v2 w2 + v1 w1 v3 w3 + v2 w2 v3 w3 ) 1 2 3 2 1 3 3 1 2 and now adding and subtracting v2 w2 . that is.2 We will now derive a formula for the magnitude of v × w.

VECTORS IN EUCLIDEAN SPACE If θ is the angle between nonzero vectors v and w in v× w = v w sin θ 3. w (as vectors in A = v× w is: . w (as vectors in A= 1 v× w 2 3) 3) is: (b) The area A of a parallelogram with adjacent sides v. respectively. respectively. where b is the base of the triangle and h is the height. in 3 . Area of triangles and parallelograms (a) The area A of a triangle with adjacent sides v.3.4. Let θ be the angle between v and w.8. Let △PQR and PQRS be a triangle and parallelogram. and identify the sides QR and QP with vectors v and w. then By the discussion in Example 1. So we see that b= v and h = w sin θ 1 v w sin θ 2 1 v× w = 2 of the parallelogram PQRS is twice the area of the triangle APQR = APQRS = v w sin θ So since the area APQRS △PQR.13. Example 1.8. we have proved the following theorem: Theorem 1. as shown in Figure 1. then (1.3 Q S w θ v R P h S Think of the triangle as existing in 3 . as in the following example. The area APQR of 1 △PQR is 2 bh.22 CHAPTER 1. The formula is more useful for its applications in geometry. P h θ Q b R Figure 1. when the magnitude of the cross product can be calculated directly. like for any other vector.11) It may seem strange to bother with the above formula.4.

as in Figure 1. −7) = (−7. 4 2 . 12.5 5 A=5 .4. and Theorem 1. 4. 18). 1) − (4. and R = (−5.13 is valid.4 Cross Product 23 It may seem at first glance that since the formulas derived in Example 1. 12. R = (5.4. 12. 2.1. 2).13 makes it simpler to calculate the area of a triangle in 3-dimensional space than by using traditional geometric methods. 8) − (2. −190. so the choice of adjacent sides indeed does not matter. 2 can be thought of as the subset of 3 such that the z-coordinate is always 0. and the cross product is only But these are vectors in Q w 3 defined for vectors in 3 . 25) × (−7. 2. Theorem 1. 0) = (0.9. We would get a different formula for the area if we had picked PQ and PR as the adjacent sides. − − → − − → y Solution: Let v = S P and w = S R. Q = (3. then the more general statements in Theorem 1. − − → − − → z Solution: Let v = PQ and w = PR. 8). 3.4. 2) = (1. 7. 3).8 were for the adjacent sides QP and QR only. Then the v 1 P area A of PQRS is x A = v × w = (−3.13 that the formulas hold for any adjacent sides are not justified. where P = (2. Example 1. 18) − (2. 1). −1. 4.10. −5) 0 1 = ((−1)(0) − (0)(2). −1) and w = (5. 0.5. 29) 2 1 √ 1 (−155)2 + (−190)2 + 292 = 60966 = 2 2 A ≈ 123. 8. Calculate the area of the parallelogram PQRS . 4) − (4. Then R v = (1. Q(3.4 w y Example 1. 18) Then v = (3. (−3)(2) − (−1)(1)) 2 3 4 Figure 1. 2) = (−3. 7. and S = (4. 0) × (1.4. 4. 2). where P = (1. 4. −7). 3. but it can be shown (see Exercise 26) that the different formulas would yield the same value.46 A= v 0 x P(2. 8. 4). 0) and w = (1. However. −1. as in Figure 1. so the area A of the triangle △PQR is R(−5. 0). −7) = (1. (1)(8) − (3)(−7)) = 2 1 = (−155. 15) 2 2 1 ((3)(15) − (25)(8). 8) 1 1 v× w = (1. (25)(−7) − (1)(15). Q = (2. Calculate the area of the triangle △PQR. −7) Figure 1. 25) and w = (−5.4. 7. 15). 2 S So we can write v = (−3. (0)(1) − (−3)(0).

6). If both v and w are nonzero. and either v = 0 = 0w or w = 0 = 0v. v2 w1 − v1 w2 ) = −(w2 v3 − w3 v2 .e. we have: v × w = (v2 w3 − v3 w2 . v2 . Figure 1.24 CHAPTER 1. k2 . (a) By the definition of the cross product and scalar multiplication. v1 w2 − v2 w1 ) z v v× w y w x 0 = −(v3 w2 − v2 w3 .6 6 An equivalent definition of a parallelepiped is: the collection of all scalar combinations k1 v1 +k2 v2 +k3 v3 of some vectors v1 . which is true if and only if sin θ = 0 (since v > 0 and w > 0). v3 in 3 . Theorem 1. But the angle between v and w is 0◦ or 180◦ if and only if v w. QED Example 1. and θ is the angle between them. then sin θ = 0 if and only if θ = 0◦ or 180◦ . all of which are parallelograms. VECTORS IN EUCLIDEAN SPACE The following theorem summarizes the basic properties of the cross product. so v and w are scalar multiples. we have (a) v × w = −w × v Anticommutative Law (b) u × (v + w) = u × v + u × w Distributive Law (c) (u + v) × w = u × w + v × w Distributive Law (d) (kv) × w = v × (kw) = k(v × w) (e) v × 0 = 0 = 0 × v (f) v × v = 0 (g) v × w = 0 if and only if v w Associative Law Proof: The proofs of properties (b)-(f) are straightforward. w in 3 . then by formula (1.14. v.4. w1 v2 − w2 v1 ) = −w × v w× v Note that this says that v × w and w × v have the same magnitude but opposite direction (see Figure 1. and scalar k. v × w = 0 if and only if v w sin θ = 0. where 0 ≤ k1 . We will prove parts (a) and (g) and leave the rest to the reader as exercises. . they are parallel. For any vectors u.7.6 × (g) If either v or w is 0 then v× w = 0 by part (e). v1 w3 − v3 w1 . i. So since 0◦ ≤ θ ≤ 180◦ . v3 w1 − v1 w3 .11. Adding to Example 1. w3 v1 − w1 v3 .11). we have i× j = k j × i = −k j× k = i k × j = −i k× i = j i × k = −j i× i = j× j = k× k = 0 Recall from geometry that a parallelepiped is a 3-dimensional solid with 6 faces. k3 ≤ 1.4.

And we can see that since v× w is θ perpendicular to the base parallelogram dew termined by v and w.7 In Example 1. then the volume of the parallelepiped is |u · (v × w)|. v.12) will give you the negative of the volume of the parallelepiped. By Theorem 1.4.15. Another type of triple product is the vector triple product u × (v × w). the volume is w · (u × v). u · (v × w) = w · (u × v) = v · (w × u) (Note that the equalities hold trivially if any of the vectors are 0. Repeating this with the base determined by w and u. because the vector u is on the same side of the base parallelogram’s plane as the vector × v× w (so that cos θ > 0).12. then repeating the same steps using the base determined by u and v (since w is on the same side of that base’s plane as u × v).12) Since v × w = −w × v for any vectors v.4 Cross Product 25 Example 1.4. we have the following result: For any vectors u. By Theorem u 1. then picking the wrong order for the three adjacent sides in the scalar triple product in formula (1. Volume of a parallelepiped: Let the vectors u. Hence.1.v × w allelepiped is the area A of the base parallelogram times the height h. Solution: Recall that the volume of a par. So taking the absolute value of the scalar triple product for any order of the three adjacent sides will always give the volume: Theorem 1. v.) (1. w in 3 . as in Figure 1.6 we know that u · (v × w) . the area A of the base parallelogram × × is v× w .12 the height h of the parallelepiped is u cos θ. w in 3 represent adjacent sides of a parallelepiped P. u v× w vol(P) = A h u u · (v × w) = v× w u v× w = u · (v × w) cos θ = h v Parallelepiped P Figure 1. Since the volume is the same no matter which base and height we use.7.13(b). w in 3. where θ is the angle between u and v × w. If vectors u. and not − u cos θ. then the height h is u cos θ. v. w in 3 represent any three adjacent sides of a parallelepiped. Show that the volume of P is the scalar triple product u · (v × w). 
The proof of the following theorem is left as an exercise for the reader: .

w = (1. which could be any vector? The following example may help to see how this works. u × (v × w).4. the cross product is written as: v × w = (v2 w3 − v3 w2 )i + (v3 w1 − v1 w3 )j + (v1 w2 − v2 w1 )k. because it can be represented as a determinant. Solution: Since u · v = 6 and u · w = 7. −4. But then how is u × (v × w) also perpendicular to u.4.8 For vectors v = v1 i + v2 j + v3 k and w = w1 i + w2 j + w3 k in component form. since that plane is itself perpendicular to v × w. being perpendicular to v × w means that u × (v × w) lies in the plane containing v and w. 18.8). VECTORS IN EUCLIDEAN SPACE 3. then u × (v × w) = (u · w)v − (u · v)w = (8. 7 See A NTON and R ORRES for a fuller development. 0) = (14. v× w z u y 0 u × (v × w) v w x Figure 1. . For any vectors u. v and w are coplanar). v = (2. 0). We will not go too deeply into the theory of determinants7 . v. 0) − 6 (1.11. 2.26 CHAPTER 1. 4) (see Figure 1. 3. 0) − (6.e. and hence lies in the plane containing v and w (i. 0) = 7 (2. Theorem 1. and that u × (v × w) also lies in that plane. It is often easier to use the component form for the cross product. This makes sense since. In particular. By the right side of formula (1. w in u × (v × w) = (u · w)v − (u · v)w (1.13. 0) Note that v and w lie in the xy-plane. 3. 4). by Theorem 1. 14. 0. 0). u × (v × w) is perpendicular to both u and v × w. we see that u × (v × w) is a scalar combination of v and w. we will just cover what is essential for our purposes. Example 1. 2.16. Find u × (v × w) for u = (1.13). u × (v × w) is perpendicular to both u and v × w = (0.16 gives some idea of the geometry of the vector triple product.13) An examination of the formula in Theorem 1. 2. Also.

written as a b c d or a b c d 27 where a.14. c. putting alternating plus and minus signs in front of each (starting with a plus). The determinant of such a matrix. 1 0 2 −1 3 4 −1 3 = 1 0 2 1 0 2 − 0 4 3 1 2 + 2 4 −1 = 1(−2 − 0) − 0(8 − 3) + 2(0 + 1) = 0 1 0 . written as a b c d or det a b .15. Example 1. written as a1 a2 a3 a1 a2 a3 b1 b2 b3 or b1 b2 b3 .14) and its determinant is given by the formula: One way to remember the above formula is the following: multiply each scalar in the first row by the determinant of the 2 × 2 matrix that remains after removing the row and column that contain that scalar. then sum those products up. Example 1.4 Cross Product A 2 × 2 matrix is an array of two rows and two columns of scalars. d are scalars.1. b. c1 c2 c3 c1 c2 c3 a1 a2 a3 b b b b b b b1 b2 b3 = a1 2 3 − a2 1 3 + a3 1 2 c1 c2 c1 c3 c2 c3 c1 c2 c3 (1. 1 2 = (1)(4) − (2)(3) = 4 − 6 = −2 3 4 A 3 × 3 matrix is an array of three rows and three columns of scalars. c d is the scalar defined by the following formula: a b = ad − bc c d It may help to remember this formula as being the product of the scalars on the downward diagonal minus the product of the scalars on the upward diagonal.

Then i j k −1 3 × w = 4 −1 3 = v i − 0 2 1 0 2 4 3 j + 1 2 4 −1 k = −2 i − 5 j + k 1 0 The scalar triple product can also be written as a determinant. This gives us a determinant that is now a vector. In fact. u2 .12. derived from algebraic operations on scalar entries in a matrix. 3. Find the volume of the parallelepiped with adjacent sides u = (2. v3 ). 1. 1.28 CHAPTER 1. 2 1 3 2 u · (v × w) = −1 3 1 1 −2 3 2 =2 1 −2 −1 2 − 1 1 −2 −1 3 + 3 1 1 u 0 x w Figure 1. u3 ). Theorem 1.9). The proof is left as an exercise for the reader. 3).15. v2 .15) Example 1. VECTORS IN EUCLIDEAN SPACE We defined the determinant as a scalar. v = (v1 . Solution: By Theorem 1. −2) (see Figure 1. w2 . so . w = (w1 . by Example 1.16. v = (−1. 2).17. By Theorem 1.17. if we put three vectors in the first row of a 3 × 3 matrix.4. the volume of the parallelepiped P is the absolute value of the scalar triple product of the three adjacent sides (in any order). For any vectors u = (u1 . = 2(−8) − 1(0) + 3(−4) = −28.4. w3 ) in u1 u2 u3 u · (v × w) = v1 v2 v3 w1 w2 w3 3: (1. Let v = 4 i − j + 3 k and w = i + 2 k. However.17. and lets us write the cross product of v = v1 i + v2 j + v3 k and w = w1 i + w2 j + w3 k as a determinant: i j k v v3 v v3 v v2 × w = v1 v2 v3 = 2 i − 1 v j + 1 k w2 w3 w1 w3 w1 w2 w1 w2 w3 = (v2 w3 − v3 w2 )i + (v3 w1 − v1 w3 )j + (v1 w2 − v2 w1 )k Example 1. the following theorem provides an alternate definition of the determinant of a 3 × 3 matrix as the volume (or negative volume) of a parallelepiped whose adjacent sides are the rows of the matrix. since we would be performing scalar multiplication on those three vectors (they would be multiplied by the 2 × 2 scalar determinants as before).9 P z v y vol(P) = |−28| = 28. w = (1. then the definition still makes sense.

3). calculate the area of the parallelogram PQRS . −10) 6. 1) 10. 13. 9. w = (1. w = (1. 2. v = (7. 3). (u × v) · (w × z) = x · (w × z) = w · (z × x) (by formula (1. 2. 4) 4. 1. v = (2. calculate u · (v × w) and u × (v × w). S = (3. v = i. 0.4 Cross Product 29 Interchanging the dot and cross products can be useful in proving vector identities: Example 1. 1. −2) 12. calculate the area of the triangle △PQR. u = (1. −2. −4. w = (7. 4). 2). w = (2. 0. 4). P = (4. 5). 1. R = (−1.1. u = (1. u = (1. 1. 1. 0) Exercises © ¨ 2. w = (2. z in v·w v·z 3. find the volume of the parallelepiped with adjacent sides u. 6.12)) = w · (z × (u × v)) = w · ((z · v)u − (z · u)v) (by Theorem 1. 0) 8. w = −3 i + 6 j + 3 k For Exercises 7-8. 0. 0. v = (3. 2) 14. −2). 6). Q = (2. −2) 15. Prove: (u × v) · (w × z) = Solution: Let x = u × v. v = − i + 2 j + k. u = (1. 2. w = (2. 5). −2). 11. 3). Q = (4. z = (2. 1. 2). −10). 0. 3. . v. 1. 2. 1. 2. −2). 0. v = (7. −10). Calculate (u × v) · (w × z) for u = (1. w = (2. w = (4. 5. v. 3. 4. R = (6. 1) For Exercises 13-14. A For Exercises 1-6. 3).18. 2). 4). 0. R = (2. −4. 2). w = 3 i + 2 j + 4k 5. w = (5. 2. 0. w. 3) 3. Q = (1. 1). v = (3. S = (3. 1. −1) For Exercises 9-10. Q = (1. 2). 1. 4). 1. v = (2. P = (−2. calculate v × w.16) = (z · v)(w · u) − (z · u)(w · v) = u·w u·z v· w v· z = (u · w)(v · z) − (u · z)(v · w) (by commutativity of the dot product). 7. w. v = (1. 2). 1). 0) For Exercises 11-12. 2). P = (5. v = (−1. P = (2. 3). 4. Then u· w u· z for all vectors u. R = (2. v = (5. 1.

(Hint: Expand both sides of the equation.14(e). Prove Theorem 1. 21. v. z in 3. If v and w are unit vectors in 3 . For all vectors u.14(c). Prove the Jacobi identity: u × (v × w) + v × (w × u) + w × (u × v) = 0 29. Show that if v × w = 0 for all w in 18. −w = PQ. 3. then v = 0 or w = 0. To do this. w. 24. where u = PR. Show that u.8 the formula for the area of the triangle △PQR yields the same value no matter which two adjacent sides are chosen. w in (a) v × w 2 3: (b) If v · w = 0 and v × w = 0. 17. show that 2 (−u) × (−v) = 2 v × w . then v = 0.17. 23. 3 if and only if u · (v × w) = 0. Consider the vector equation a × x = b in 3. Prove that in Example 1. Prove Theorem 1. and v = QR.14(b). 1 1 Similarly.30 CHAPTER 1. w = QP as before. 19. w lie in the same plane in 30.14(d). under what condition(s) would v × w also be a unit vector in 3 ? Justify your answer. Prove the following for all vectors v. 20. VECTORS IN EUCLIDEAN SPACE B 16. Prove Theorem 1. Prove Theorem 1. + |v · w|2 = v 2 w 2 C 26.14(f). Prove Theorem 1. 27. where a 0. show that (u × v) × (w × z) = (z · (u × v))w − (w · (u × v))z and that (u × v) × (w × z) = (u · (w × z))v − (v · (w × z))u Why do both equations make sense geometrically? . Show that: (a) a · b = 0 b× a (b) x = + ka is a solution to the equation. Prove Theorem 1. for any scalar k a 2 28. Prove Theorem 1.16. v. where −u = RP and −v = RQ. show that 1 1 2 u × (−w) = 2 v × w .) 25. 22.

1.5 Lines and Planes

Now that we know how to perform some operations on vectors, we can start to deal with some familiar geometric objects, like lines and planes, in the language of vectors. The reason for doing this is simple: using vectors makes it easier to study objects in 3-dimensional Euclidean space. We will first consider lines.

Line through a point, parallel to a vector

Let $P = (x_0, y_0, z_0)$ be a point in $\mathbb{R}^3$, let $\mathbf{v} = (a, b, c)$ be a nonzero vector, and let $L$ be the line through $P$ which is parallel to $\mathbf{v}$ (see Figure 1.5.1).

[Figure 1.5.1: the line L through P(x0, y0, z0) parallel to v, showing r and r + tv for t > 0 and t < 0]

Let $\mathbf{r} = (x_0, y_0, z_0)$ be the vector pointing from the origin to $P$. Since multiplying the vector $\mathbf{v}$ by a scalar $t$ lengthens or shrinks $\mathbf{v}$ while preserving its direction if $t > 0$, and reversing its direction if $t < 0$, then we see from Figure 1.5.1 that every point on the line $L$ can be obtained by adding the vector $t\mathbf{v}$ to the vector $\mathbf{r}$ for some scalar $t$. That is, as $t$ varies over all real numbers, the vector $\mathbf{r} + t\mathbf{v}$ will point to every point on $L$. We can summarize the vector representation of $L$ as follows:

For a point $P = (x_0, y_0, z_0)$ and nonzero vector $\mathbf{v}$ in $\mathbb{R}^3$, the line $L$ through $P$ parallel to $\mathbf{v}$ is given by
\[ \mathbf{r} + t\mathbf{v}, \quad \text{for } -\infty < t < \infty \tag{1.16} \]
where $\mathbf{r} = (x_0, y_0, z_0)$ is the vector pointing to $P$.

Note that we used the correspondence between a vector and its terminal point. Since $\mathbf{v} = (a, b, c)$, then the terminal point of the vector $\mathbf{r} + t\mathbf{v}$ is $(x_0 + at, y_0 + bt, z_0 + ct)$. We then get the parametric representation of $L$ with the parameter $t$:

For a point $P = (x_0, y_0, z_0)$ and nonzero vector $\mathbf{v} = (a, b, c)$ in $\mathbb{R}^3$, the line $L$ through $P$ parallel to $\mathbf{v}$ consists of all points $(x, y, z)$ given by
\[ x = x_0 + at, \quad y = y_0 + bt, \quad z = z_0 + ct, \quad \text{for } -\infty < t < \infty \tag{1.17} \]

Note that in both representations we get the point $P$ on $L$ by letting $t = 0$.

In formula (1.17), if $a \neq 0$, then we can solve for the parameter $t$: $t = (x - x_0)/a$. We can also solve for $t$ in terms of $y$ and in terms of $z$ if neither $b$ nor $c$, respectively, is zero: $t = (y - y_0)/b$ and $t = (z - z_0)/c$. These three values all equal the same value $t$, so we can write the following system of equalities, called the symmetric representation of $L$:

For a point $P = (x_0, y_0, z_0)$ and vector $\mathbf{v} = (a, b, c)$ in $\mathbb{R}^3$ with $a$, $b$ and $c$ all nonzero, the line $L$ through $P$ parallel to $\mathbf{v}$ consists of all points $(x, y, z)$ given by the equations
\[ \frac{x - x_0}{a} = \frac{y - y_0}{b} = \frac{z - z_0}{c} \tag{1.18} \]

What if, say, $a = 0$ in the above scenario? We cannot divide by zero, but we do know that $x = x_0 + at$, and so $x = x_0 + 0t = x_0$. Then the symmetric representation of $L$ would be:
\[ x = x_0, \quad \frac{y - y_0}{b} = \frac{z - z_0}{c} \tag{1.19} \]
Note that this says that the line $L$ lies in the plane $x = x_0$, which is parallel to the $yz$-plane (see Figure 1.5.2). Similar equations can be derived for the cases when $b = 0$ or $c = 0$.

[Figure 1.5.2: the line L in the plane x = x0]

You may have noticed that the vector representation of $L$ in formula (1.16) is more compact than the parametric and symmetric formulas. That is an advantage of using vector notation. Technically, though, the vector representation gives us the vectors whose terminal points make up the line $L$, not just $L$ itself. So you have to remember to identify the vectors $\mathbf{r} + t\mathbf{v}$ with their terminal points. On the other hand, the parametric representation always gives just the points on $L$ and nothing else.

Example 1.19. Write the line $L$ through the point $P = (2, 3, 5)$ and parallel to the vector $\mathbf{v} = (4, -1, 6)$, in the following forms: (a) vector, (b) parametric, (c) symmetric. Lastly: (d) find two points on $L$ distinct from $P$.

Solution: (a) Let $\mathbf{r} = (2, 3, 5)$. Then by formula (1.16), $L$ is given by:
\[ \mathbf{r} + t\mathbf{v} = (2, 3, 5) + t(4, -1, 6), \quad \text{for } -\infty < t < \infty \]
(b) $L$ consists of the points $(x, y, z)$ such that
\[ x = 2 + 4t, \quad y = 3 - t, \quad z = 5 + 6t, \quad \text{for } -\infty < t < \infty \]
(c) $L$ consists of the points $(x, y, z)$ such that
\[ \frac{x - 2}{4} = \frac{y - 3}{-1} = \frac{z - 5}{6} \]
(d) Letting $t = 1$ and $t = 2$ in part (b) yields the points $(6, 2, 11)$ and $(10, 1, 17)$ on $L$.
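The parametric and symmetric representations can be checked against each other numerically. The sketch below (in Python; the helper names `point_on_line` and `symmetric_ratios` are hypothetical, introduced only for this illustration) generates the two points from part (d) of Example 1.19 and confirms that the three ratios in the symmetric form (1.18) agree at each of them.

```python
from fractions import Fraction


def point_on_line(r, v, t):
    """Terminal point of r + t*v, i.e. the parametric form (1.17)."""
    return tuple(r_i + t * v_i for r_i, v_i in zip(r, v))


def symmetric_ratios(point, r, v):
    """The ratios (x - x0)/a, (y - y0)/b, (z - z0)/c from the symmetric form (1.18).

    Assumes a, b and c are all nonzero. The three ratios are equal exactly
    when the point lies on the line, and their common value is the parameter t.
    """
    return [Fraction(p_i - r_i, v_i) for p_i, r_i, v_i in zip(point, r, v)]


r, v = (2, 3, 5), (4, -1, 6)
for t in (1, 2):
    p = point_on_line(r, v, t)
    print(t, p, symmetric_ratios(p, r, v))
```

Exact rational arithmetic (`Fraction`) is used so that the equality test in the symmetric form is not affected by rounding.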

Line through two points

Let $P_1 = (x_1, y_1, z_1)$ and $P_2 = (x_2, y_2, z_2)$ be distinct points in $\mathbb{R}^3$, and let $L$ be the line through $P_1$ and $P_2$. Let $\mathbf{r}_1 = (x_1, y_1, z_1)$ and $\mathbf{r}_2 = (x_2, y_2, z_2)$ be the vectors pointing to $P_1$ and $P_2$, respectively. Then as we can see from Figure 1.5.3, $\mathbf{r}_2 - \mathbf{r}_1$ is the vector from $P_1$ to $P_2$. So if we multiply the vector $\mathbf{r}_2 - \mathbf{r}_1$ by a scalar $t$ and add it to the vector $\mathbf{r}_1$, we will get the entire line $L$ as $t$ varies over all real numbers. The following is a summary of the vector, parametric, and symmetric forms for the line $L$:

[Figure 1.5.3: the line L through P1(x1, y1, z1) and P2(x2, y2, z2), showing r1, r2, r2 − r1 and r1 + t(r2 − r1)]

Let $P_1 = (x_1, y_1, z_1)$, $P_2 = (x_2, y_2, z_2)$ be distinct points in $\mathbb{R}^3$, and let $\mathbf{r}_1 = (x_1, y_1, z_1)$, $\mathbf{r}_2 = (x_2, y_2, z_2)$. Then the line $L$ through $P_1$ and $P_2$ has the following representations:

Vector:
\[ \mathbf{r}_1 + t(\mathbf{r}_2 - \mathbf{r}_1), \quad \text{for } -\infty < t < \infty \tag{1.20} \]
Parametric:
\[ x = x_1 + (x_2 - x_1)t, \quad y = y_1 + (y_2 - y_1)t, \quad z = z_1 + (z_2 - z_1)t, \quad \text{for } -\infty < t < \infty \tag{1.21} \]
Symmetric:
\[ \frac{x - x_1}{x_2 - x_1} = \frac{y - y_1}{y_2 - y_1} = \frac{z - z_1}{z_2 - z_1} \quad (\text{if } x_1 \neq x_2,\ y_1 \neq y_2,\ \text{and } z_1 \neq z_2) \tag{1.22} \]

Example 1.20. Write the line $L$ through the points $P_1 = (-3, 1, -4)$ and $P_2 = (4, 4, -6)$ in parametric form.

Solution: By formula (1.21), $L$ consists of the points $(x, y, z)$ such that
\[ x = -3 + 7t, \quad y = 1 + 3t, \quad z = -4 - 2t, \quad \text{for } -\infty < t < \infty \]

Distance between a point and a line

Let $L$ be a line in $\mathbb{R}^3$ in vector form as $\mathbf{r} + t\mathbf{v}$ (for $-\infty < t < \infty$), and let $P$ be a point not on $L$. The distance $d$ from $P$ to $L$ is the length of the line segment from $P$ to $L$ which is perpendicular to $L$ (see Figure 1.5.4). Pick a point $Q$ on $L$, and let $\mathbf{w}$ be the vector from $Q$ to $P$. If $\theta$ is the angle between $\mathbf{w}$ and $\mathbf{v}$, then $d = \|\mathbf{w}\| \sin\theta$. So since $\|\mathbf{v} \times \mathbf{w}\| = \|\mathbf{v}\|\,\|\mathbf{w}\| \sin\theta$ and $\mathbf{v} \neq \mathbf{0}$, then:
\[ d = \frac{\|\mathbf{v} \times \mathbf{w}\|}{\|\mathbf{v}\|} \tag{1.23} \]

[Figure 1.5.4: the distance d from P to the line L, with Q on L, w from Q to P, and the angle θ between w and v]

Example 1.21. Find the distance $d$ from the point $P = (1, 1, 1)$ to the line $L$ in Example 1.20.

Solution: From Example 1.20, we see that we can represent $L$ in vector form as $\mathbf{r} + t\mathbf{v}$, for $\mathbf{r} = (-3, 1, -4)$ and $\mathbf{v} = (7, 3, -2)$. Since the point $Q = (-3, 1, -4)$ is on $L$, then for $\mathbf{w} = \overrightarrow{QP} = (1, 1, 1) - (-3, 1, -4) = (4, 0, 5)$, we have:
\[ \mathbf{v} \times \mathbf{w} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ 7 & 3 & -2 \\ 4 & 0 & 5 \end{vmatrix} = \begin{vmatrix} 3 & -2 \\ 0 & 5 \end{vmatrix}\mathbf{i} - \begin{vmatrix} 7 & -2 \\ 4 & 5 \end{vmatrix}\mathbf{j} + \begin{vmatrix} 7 & 3 \\ 4 & 0 \end{vmatrix}\mathbf{k} = 15\mathbf{i} - 43\mathbf{j} - 12\mathbf{k}, \]
so
\[ d = \frac{\|\mathbf{v} \times \mathbf{w}\|}{\|\mathbf{v}\|} = \frac{\|15\mathbf{i} - 43\mathbf{j} - 12\mathbf{k}\|}{\|(7, 3, -2)\|} = \frac{\sqrt{15^2 + (-43)^2 + (-12)^2}}{\sqrt{7^2 + 3^2 + (-2)^2}} = \frac{\sqrt{2218}}{\sqrt{62}} \approx 5.98 \]

It is clear that two lines $L_1$ and $L_2$, represented in vector form as $\mathbf{r}_1 + s\mathbf{v}_1$ and $\mathbf{r}_2 + t\mathbf{v}_2$, respectively, are parallel (denoted as $L_1 \parallel L_2$) if $\mathbf{v}_1$ and $\mathbf{v}_2$ are parallel. Also, $L_1$ and $L_2$ are perpendicular (denoted as $L_1 \perp L_2$) if $\mathbf{v}_1$ and $\mathbf{v}_2$ are perpendicular.

In 2-dimensional space, two lines are either identical, parallel, or they intersect. In 3-dimensional space, there is an additional possibility: two lines can be skew, that is, they do not intersect but they are not parallel. However, even though they are not parallel, skew lines are on parallel planes (see Figure 1.5.5).

[Figure 1.5.5: skew lines L1 and L2]

To determine whether two lines in $\mathbb{R}^3$ intersect, it is often easier to use the parametric representation of the lines. In this case, you should use different parameter variables (usually $s$ and $t$) for the lines, since the values of the parameters may not be the same at the point of intersection. Setting the two $(x, y, z)$ triples equal will result in a system of 3 equations in 2 unknowns ($s$ and $t$).

Example 1.22. Find the point of intersection (if any) of the following lines:
\[ \frac{x + 1}{3} = \frac{y - 2}{2} = \frac{z - 1}{-1} \quad\text{and}\quad x + 3 = \frac{y - 8}{-3} = \frac{z + 3}{2} \]

Solution: First we write the lines in parametric form, with parameters $s$ and $t$:
\[ x = -1 + 3s, \quad y = 2 + 2s, \quad z = 1 - s \quad\text{and}\quad x = -3 + t, \quad y = 8 - 3t, \quad z = -3 + 2t \]
The lines intersect when $(-1 + 3s, 2 + 2s, 1 - s) = (-3 + t, 8 - 3t, -3 + 2t)$ for some $s$, $t$:
\[ -1 + 3s = -3 + t \ \Rightarrow\ t = 2 + 3s \]
\[ 2 + 2s = 8 - 3t \ \Rightarrow\ 2 + 2s = 8 - 3(2 + 3s) = 2 - 9s \ \Rightarrow\ 2s = -9s \ \Rightarrow\ s = 0 \ \Rightarrow\ t = 2 + 3(0) = 2 \]
\[ 1 - s = -3 + 2t:\quad 1 - 0 = -3 + 2(2) \ \Rightarrow\ 1 = 1 \quad \text{(Note that we had to check this.)} \]
Letting $s = 0$ in the equations for the first line, or letting $t = 2$ in the equations for the second line, gives the point of intersection $(-1, 2, 1)$.
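Formula (1.23) for the distance from a point to a line is easy to evaluate by machine. The following Python sketch (the function names are our own, chosen for illustration) recomputes the distance from Example 1.21.

```python
import math


def cross(v, w):
    """Cross product of two 3-dimensional vectors given as tuples."""
    return (v[1] * w[2] - v[2] * w[1],
            v[2] * w[0] - v[0] * w[2],
            v[0] * w[1] - v[1] * w[0])


def norm(v):
    """Euclidean length of a vector."""
    return math.sqrt(sum(c * c for c in v))


def point_line_distance(p, q, v):
    """Distance from the point p to the line through q with direction v.

    Implements d = ||v x w|| / ||v|| with w the vector from q to p.
    Assumes v is nonzero.
    """
    w = tuple(p_i - q_i for p_i, q_i in zip(p, q))
    return norm(cross(v, w)) / norm(v)


d = point_line_distance((1, 1, 1), (-3, 1, -4), (7, 3, -2))
print(round(d, 2))  # 5.98
```

Any point $Q$ on the line may be used for `q`; the result does not depend on that choice.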

We will now consider planes in 3-dimensional Euclidean space.

Plane through a point, perpendicular to a vector

Let $P$ be a plane in $\mathbb{R}^3$, and suppose it contains a point $P_0 = (x_0, y_0, z_0)$. Let $\mathbf{n} = (a, b, c)$ be a nonzero vector which is perpendicular to the plane $P$. Such a vector is called a normal vector (or just a normal) to the plane. Now let $(x, y, z)$ be any point in the plane $P$. Then the vector $\mathbf{r} = (x - x_0, y - y_0, z - z_0)$ lies in the plane $P$ (see Figure 1.5.6). So if $\mathbf{r} \neq \mathbf{0}$, then $\mathbf{r} \perp \mathbf{n}$ and hence $\mathbf{n} \cdot \mathbf{r} = 0$. And if $\mathbf{r} = \mathbf{0}$ then we still have $\mathbf{n} \cdot \mathbf{r} = 0$.

[Figure 1.5.6: The plane P, with normal n and the vector r from (x0, y0, z0) to (x, y, z)]

Conversely, if $(x, y, z)$ is any point in $\mathbb{R}^3$ such that $\mathbf{r} = (x - x_0, y - y_0, z - z_0) \neq \mathbf{0}$ and $\mathbf{n} \cdot \mathbf{r} = 0$, then $\mathbf{r} \perp \mathbf{n}$ and so $(x, y, z)$ lies in $P$. This proves the following theorem:

Theorem 1.18. Let $P$ be a plane in $\mathbb{R}^3$, let $(x_0, y_0, z_0)$ be a point in $P$, and let $\mathbf{n} = (a, b, c)$ be a nonzero vector which is perpendicular to $P$. Then $P$ consists of the points $(x, y, z)$ satisfying the vector equation:
\[ \mathbf{n} \cdot \mathbf{r} = 0 \tag{1.24} \]
where $\mathbf{r} = (x - x_0, y - y_0, z - z_0)$, or equivalently:
\[ a(x - x_0) + b(y - y_0) + c(z - z_0) = 0 \tag{1.25} \]
The above equation is called the point-normal form of the plane $P$.

Example 1.23. Find the equation of the plane $P$ containing the point $(-3, 1, 3)$ and perpendicular to the vector $\mathbf{n} = (2, 4, 8)$.

Solution: By formula (1.25), the plane $P$ consists of all points $(x, y, z)$ such that:
\[ 2(x + 3) + 4(y - 1) + 8(z - 3) = 0 \]

If we multiply out the terms in formula (1.25) and combine the constant terms, we get an equation of the plane in normal form:
\[ ax + by + cz + d = 0 \tag{1.26} \]
For example, the normal form of the plane in Example 1.23 is $2x + 4y + 8z - 22 = 0$.

Plane containing three noncollinear points

In 2-dimensional and 3-dimensional space, two points determine a line. Two points do not determine a plane in $\mathbb{R}^3$. In fact, three collinear points (i.e. all on the same line) do not determine a plane; an infinite number of planes would contain the line on which those three points lie. However, three noncollinear points do determine a plane. For if $Q$, $R$ and $S$ are noncollinear points in $\mathbb{R}^3$, then $\overrightarrow{QR}$ and $\overrightarrow{QS}$ are nonzero vectors which are not parallel (by noncollinearity), and so their cross product $\overrightarrow{QR} \times \overrightarrow{QS}$ is perpendicular to both $\overrightarrow{QR}$ and $\overrightarrow{QS}$. So $\overrightarrow{QR}$ and $\overrightarrow{QS}$ (and hence $Q$, $R$ and $S$) lie in the plane through the point $Q$ with normal vector $\mathbf{n} = \overrightarrow{QR} \times \overrightarrow{QS}$ (see Figure 1.5.7).

[Figure 1.5.7: Noncollinear points Q, R, S, with n = QR × QS]

Example 1.24. Find the equation of the plane $P$ containing the points $(2, 1, 3)$, $(1, -1, 2)$ and $(3, 2, 1)$.

Solution: Let $Q = (2, 1, 3)$, $R = (1, -1, 2)$ and $S = (3, 2, 1)$. Then for the vectors $\overrightarrow{QR} = (-1, -2, -1)$ and $\overrightarrow{QS} = (1, 1, -2)$, the plane $P$ has a normal vector
\[ \mathbf{n} = \overrightarrow{QR} \times \overrightarrow{QS} = (-1, -2, -1) \times (1, 1, -2) = (5, -3, 1) \]
So using formula (1.25) with the point $Q$ (we could also use $R$ or $S$), the plane $P$ consists of all points $(x, y, z)$ such that:
\[ 5(x - 2) - 3(y - 1) + (z - 3) = 0 \]
or in normal form,
\[ 5x - 3y + z - 10 = 0 \]

We mentioned earlier that skew lines in $\mathbb{R}^3$ lie on separate, parallel planes. So two skew lines do not determine a plane. But two (nonidentical) lines which either intersect or are parallel do determine a plane. In both cases, to find the equation of the plane that contains those two lines, simply pick from the two lines a total of three noncollinear points (i.e. one point from one line and two points from the other), then use the technique above, as in Example 1.24, to write the equation. We will leave examples of this as exercises for the reader.
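The three-point technique can be sketched in code as follows. This is an illustration only, written in Python with a hypothetical helper `plane_through`; it returns the coefficients $(a, b, c, d)$ of the normal form $ax + by + cz + d = 0$ and reproduces Example 1.24.

```python
def cross(v, w):
    """Cross product of two 3-dimensional vectors given as tuples."""
    return (v[1] * w[2] - v[2] * w[1],
            v[2] * w[0] - v[0] * w[2],
            v[0] * w[1] - v[1] * w[0])


def plane_through(q, r, s):
    """Coefficients (a, b, c, d) of the plane ax + by + cz + d = 0 through q, r, s.

    The normal is n = QR x QS; expanding the point-normal form at q gives
    d = -(n . q). Raises ValueError if the three points are collinear.
    """
    qr = tuple(r_i - q_i for r_i, q_i in zip(r, q))
    qs = tuple(s_i - q_i for s_i, q_i in zip(s, q))
    n = cross(qr, qs)
    if n == (0, 0, 0):
        raise ValueError("the three points are collinear")
    d = -sum(n_i * q_i for n_i, q_i in zip(n, q))
    return n + (d,)


print(plane_through((2, 1, 3), (1, -1, 2), (3, 2, 1)))  # (5, -3, 1, -10)
```

Choosing a different base point or a different order of the points may scale or negate the coefficients, but it describes the same plane.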

Distance between a point and a plane

The distance between a point in $\mathbb{R}^3$ and a plane is the length of the line segment from that point to the plane which is perpendicular to the plane. The following theorem gives a formula for that distance.

Theorem 1.19. Let $Q = (x_0, y_0, z_0)$ be a point in $\mathbb{R}^3$, and let $P$ be a plane with normal form $ax + by + cz + d = 0$ that does not contain $Q$. Then the distance $D$ from $Q$ to $P$ is:
\[ D = \frac{|ax_0 + by_0 + cz_0 + d|}{\sqrt{a^2 + b^2 + c^2}} \tag{1.27} \]

Proof: Let $R = (x, y, z)$ be any point in the plane $P$ (so that $ax + by + cz + d = 0$) and let $\mathbf{r} = \overrightarrow{RQ} = (x_0 - x, y_0 - y, z_0 - z)$. Then $\mathbf{r} \neq \mathbf{0}$ since $Q$ does not lie in $P$. From the normal form equation for $P$, we know that $\mathbf{n} = (a, b, c)$ is a normal vector for $P$. Now, any plane divides $\mathbb{R}^3$ into two disjoint parts. Assume that $\mathbf{n}$ points toward the side of $P$ where the point $Q$ is located. Place $\mathbf{n}$ so that its initial point is at $R$, and let $\theta$ be the angle between $\mathbf{r}$ and $\mathbf{n}$. Then $0^\circ < \theta < 90^\circ$, so $\cos\theta > 0$. Thus, the distance $D$ is $\cos\theta\,\|\mathbf{r}\| = |\cos\theta|\,\|\mathbf{r}\|$ (see Figure 1.5.8).

[Figure 1.5.8: the distance D from Q to the plane P, with normal n at R and the angle θ between r and n]

By Theorem 1.6 in Section 1.3, we know that $\cos\theta = \dfrac{\mathbf{n} \cdot \mathbf{r}}{\|\mathbf{n}\|\,\|\mathbf{r}\|}$, so
\[ D = |\cos\theta|\,\|\mathbf{r}\| = \frac{|\mathbf{n} \cdot \mathbf{r}|}{\|\mathbf{n}\|\,\|\mathbf{r}\|}\,\|\mathbf{r}\| = \frac{|\mathbf{n} \cdot \mathbf{r}|}{\|\mathbf{n}\|} = \frac{|a(x_0 - x) + b(y_0 - y) + c(z_0 - z)|}{\sqrt{a^2 + b^2 + c^2}} \]
\[ = \frac{|ax_0 + by_0 + cz_0 - (ax + by + cz)|}{\sqrt{a^2 + b^2 + c^2}} = \frac{|ax_0 + by_0 + cz_0 - (-d)|}{\sqrt{a^2 + b^2 + c^2}} = \frac{|ax_0 + by_0 + cz_0 + d|}{\sqrt{a^2 + b^2 + c^2}} \]
If $\mathbf{n}$ points away from the side of $P$ where the point $Q$ is located, then $90^\circ < \theta < 180^\circ$ and so $\cos\theta < 0$. The distance $D$ is then $|\cos\theta|\,\|\mathbf{r}\|$, and thus repeating the same argument as above still gives the same result. QED

Example 1.25. Find the distance $D$ from $(2, 4, -5)$ to the plane from Example 1.24.

Solution: Recall that the plane is given by $5x - 3y + z - 10 = 0$. So
\[ D = \frac{|5(2) - 3(4) + 1(-5) - 10|}{\sqrt{5^2 + (-3)^2 + 1^2}} = \frac{|-17|}{\sqrt{35}} = \frac{17}{\sqrt{35}} \approx 2.87 \]
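Formula (1.27) is a one-line computation. As a small sketch (Python, with a function name of our own choosing), the distance in Example 1.25 can be checked like this:

```python
import math


def point_plane_distance(q, plane):
    """Distance from the point q = (x0, y0, z0) to the plane ax + by + cz + d = 0.

    `plane` is the coefficient tuple (a, b, c, d); (a, b, c) must be nonzero.
    """
    a, b, c, d = plane
    x0, y0, z0 = q
    return abs(a * x0 + b * y0 + c * z0 + d) / math.sqrt(a * a + b * b + c * c)


print(round(point_plane_distance((2, 4, -5), (5, -3, 1, -10)), 2))  # 2.87
```

The formula also returns 0 for a point that lies in the plane, which is consistent with the geometric meaning even though the theorem is stated for points not in the plane.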

y. Figure 1. 7. z) on both planes will satisfy the following system of two equations in three unknowns: 5x − 3y + z − 10 = 0 2x + 4y − z + 3 = 0 Set x = 0 (why is that a good choice?).5.e. Example 1. y. Find the line of intersection L of the planes 5x − 3y + z − 10 = 0 and 2x + 4y − z + 3 = 0. 7. 31) + t(−1. and the planes are perpendicular if their normal vectors are perpendicular. 1) and the plane 2x + 4y − z + 3 = 0 has normal vector n2 = (2. 26). and so the point (0. intersect in a line L. A point (x.5. n1 × n2 is parallel to the intersection of P1 and P2 . i. Since n1 and n2 are not scalar multiples. z) to the two normal form equations of the planes.9 n1 × n2 ⊥ n2 means that n1 × n2 is also parallel to P2 . Thus. If two planes do intersect. for − ∞ < t < ∞ . for − ∞ < t < ∞ or in parametric form: x = −t. 4.26. 7. which leaves you to solve two equations in just two unknowns. they do so in L a line (see Figure 1. then the two planes are not parallel and hence will intersect. Solution: The plane 5x − 3y + z − 10 = 0 has normal vector n1 = (5. Likewise. Since n1 × n2 = (−1. we can write L in the following vector form: L : r + t(n1 × n2 ) . then n1 × n2 is parallel to the plane P1 . for − ∞ < t < ∞ (1. Since n1 × n2 ⊥ n1 . This can often be made easier by setting one of the coordinate variables to zero. 31) is on L. respectively. n1 × n2 is parallel to L. z = 31 + 26t. Then the above equations are reduced to: −3y + z − 10 = 0 4y − z + 3 = 0 The second equation gives z = 4y + 3. Suppose that two planes P1 and P2 with normal vectors n1 and n2 . y = 7 + 7t. substituting that into the first equation gives y = 7.38 CHAPTER 1. then L is given by: r + t(n1 × n2 ) = (0. 7. VECTORS IN EUCLIDEAN SPACE Line of intersection of two planes Note that two planes are parallel if they have normal vectors that are parallel. find a common solution (x. 26). −3. Thus. To find a point in both planes. −1). 
Then z = 31.9).28) where r is any vector pointing to a point belonging to both planes.
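The procedure of Example 1.26 (take $\mathbf{n}_1 \times \mathbf{n}_2$ as the direction, then set $x = 0$ to find a point) can be sketched in Python as below. The function name `plane_intersection_line` is hypothetical, and the sketch only handles the case where setting $x = 0$ works, which happens exactly when the first component of $\mathbf{n}_1 \times \mathbf{n}_2$ is nonzero; otherwise one of the other coordinates would have to be set to zero instead.

```python
from fractions import Fraction


def cross(v, w):
    """Cross product of two 3-dimensional vectors given as tuples."""
    return (v[1] * w[2] - v[2] * w[1],
            v[2] * w[0] - v[0] * w[2],
            v[0] * w[1] - v[1] * w[0])


def plane_intersection_line(plane1, plane2):
    """Return (point, direction) for the line where two planes meet.

    Each plane is a coefficient tuple (a, b, c, d) of ax + by + cz + d = 0.
    The point is found by setting x = 0 and solving the remaining 2x2
    system for y and z with Cramer's rule.
    """
    a1, b1, c1, d1 = plane1
    a2, b2, c2, d2 = plane2
    direction = cross((a1, b1, c1), (a2, b2, c2))
    det = b1 * c2 - c1 * b2          # equals the first component of n1 x n2
    if det == 0:
        raise ValueError("x = 0 does not give a unique point; set y or z to 0 instead")
    y = Fraction(-d1 * c2 + c1 * d2, det)
    z = Fraction(-b1 * d2 + d1 * b2, det)
    return (Fraction(0), y, z), direction


point, direction = plane_intersection_line((5, -3, 1, -10), (2, 4, -1, 3))
print(tuple(int(c) for c in point), direction)  # (0, 7, 31) (-1, 7, 26)
```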

2). z = 3 − 2t x−6 x − 11 y − 14 z + 9 = y + 3 = z and = = 4 3 −6 2 For Exercises 11-12. P2 = (−2. 2. y = 4 + 3t. −1). 2). y = −4 − 3s. 0. v = (2. v = (1. P1 = (1. −1.) . write the normal form of the plane P containing the point Q and perpendicular to the vector n. 3) For Exercises 5-6. P2 = (3. 12. 4) 14. (b) parametric. 2x − y + z + 2 = 0 18. P = (3. x + 3y + 2z − 6 = 0. 0. 5. 3x + y − 5z = 0. 6. 1. −3). 3). 1. Write the normal form of the plane containing the lines from Exercise 10. 5. 0). P1 = (4. 1) 4.1. −2). 3. 1. Q = (0. 1. x + 2y + z + 4 = 0 For Exercises 19-20. (1. 2. v = (5. 1. (0. (−3. P : 3x − y − 5z + 8 = 0 19. write the line L through the point P and parallel to the vector v in the following forms: (a) vector. L : x = −2 − 2t. and (c) symmetric. find the line of intersection (if any) of the given planes. B 21. y = 2 + t. 0. x = 7 + 3s. Q = (6. Write the normal form of the plane containing the lines from Exercise 9. 1) 2. −4. −4. −10) 6. −2. 5). (6. 15. −2).5 Lines and Planes 39 A Exercises © ¨ For Exercises 1-4. 6) For Exercises 13-14. 1. Find the point(s) of intersection (if any) of the line x−6 = y + 3 = z with the plane 4 x + 3y + 2z − 6 = 0. 2. 8. 1) 11. 16. 3). P = (2. (4. z = 5 + 4t For Exercises 9-10. y = 4t. x = 1 + 6t. find the point of intersection (if any) of the given lines. 4. 17. 1. 0). Q = (5. 0). P = (0. 5) For Exercises 7-8. Q = (4. find the distance D from the point Q to the plane P. z = −7 − 5s and 10. 0. −2. 0. −1). 1. 9. 0). −3) 3. For Exercises 17-18. L : x = 3 + 2t. z = 7 + t 8. −3). v = (7. (1. −1. P = (2. 7. 3). n = (4. write the normal form of the plane containing the given points. write the line L through the points P1 and P2 in parametric form. n = (2. (Hint: Put the equations of the line into the equation of the plane. 3) 13. P = (0. P = (1. find the distance d from the point P to the line L. P : −5x + 2y − 7z + 1 = 0 20.

1.6 Surfaces

In the previous section we discussed planes in Euclidean space. A plane is an example of a surface, which we will define informally[8] as the solution set of the equation $F(x, y, z) = 0$ in $\mathbb{R}^3$, for some real-valued function $F$. For example, a plane given by $ax + by + cz + d = 0$ is the solution set of $F(x, y, z) = 0$ for the function $F(x, y, z) = ax + by + cz + d$. Surfaces are 2-dimensional. The plane is the simplest surface, since it is "flat". In this section we will look at some surfaces that are more complex, the most important of which are the sphere and the cylinder.

Definition 1.9. A sphere $S$ is the set of all points $(x, y, z)$ in $\mathbb{R}^3$ which are a fixed distance $r$ (called the radius) from a fixed point $P_0 = (x_0, y_0, z_0)$ (called the center of the sphere):
\[ S = \{\, (x, y, z) : (x - x_0)^2 + (y - y_0)^2 + (z - z_0)^2 = r^2 \,\} \tag{1.29} \]
Using vector notation, this can be written in the equivalent form:
\[ S = \{\, \mathbf{x} : \|\mathbf{x} - \mathbf{x}_0\| = r \,\} \tag{1.30} \]
where $\mathbf{x} = (x, y, z)$ and $\mathbf{x}_0 = (x_0, y_0, z_0)$ are vectors.

Figure 1.6.1 illustrates the vectorial approach to spheres.

[Figure 1.6.1: Spheres in R^3. (a) radius r, center (0, 0, 0): ‖x‖ = r; (b) radius r, center (x0, y0, z0): ‖x − x0‖ = r]

Note in Figure 1.6.1(a) that the intersection of the sphere with the $xy$-plane is a circle of radius $r$ (i.e. a great circle, given by $x^2 + y^2 = r^2$ as a subset of $\mathbb{R}^2$). Similarly for the intersections with the $xz$-plane and the $yz$-plane. In general, a plane intersects a sphere either at a single point or in a circle.

[8] See O'NEILL for a deeper and more rigorous discussion of surfaces.

Example 1.27. Find the intersection of the sphere $x^2 + y^2 + z^2 = 169$ with the plane $z = 12$.

Solution: The sphere is centered at the origin and has radius $13 = \sqrt{169}$, so it does intersect the plane $z = 12$. Putting $z = 12$ into the equation of the sphere gives
\[ x^2 + y^2 + 12^2 = 169 \]
\[ x^2 + y^2 = 169 - 144 = 25 = 5^2 \]
which is a circle of radius 5 centered at $(0, 0, 12)$, parallel to the $xy$-plane (see Figure 1.6.2).

[Figure 1.6.2: the sphere x^2 + y^2 + z^2 = 169 and the plane z = 12]

If the equation in formula (1.29) is multiplied out, we get an equation of the form:
\[ x^2 + y^2 + z^2 + ax + by + cz + d = 0 \tag{1.31} \]
for some constants $a$, $b$, $c$ and $d$. Conversely, an equation of this form may describe a sphere, which can be determined by completing the square for the $x$, $y$ and $z$ variables.

Example 1.28. Is $2x^2 + 2y^2 + 2z^2 - 8x + 4y - 16z + 10 = 0$ the equation of a sphere?

Solution: Dividing both sides of the equation by 2 gives
\[ x^2 + y^2 + z^2 - 4x + 2y - 8z + 5 = 0 \]
\[ (x^2 - 4x + 4) + (y^2 + 2y + 1) + (z^2 - 8z + 16) + 5 - 4 - 1 - 16 = 0 \]
\[ (x - 2)^2 + (y + 1)^2 + (z - 4)^2 = 16 \]
which is a sphere of radius 4 centered at $(2, -1, 4)$.

Example 1.29. Find the point(s) of intersection (if any) of the sphere from Example 1.28 and the line $x = 3 + t$, $y = 1 + 2t$, $z = 3 - t$.

Solution: Put the equations of the line into the equation of the sphere, which was $(x - 2)^2 + (y + 1)^2 + (z - 4)^2 = 16$, and solve for $t$:
\[ (3 + t - 2)^2 + (1 + 2t + 1)^2 + (3 - t - 4)^2 = 16 \]
\[ (t + 1)^2 + (2t + 2)^2 + (-t - 1)^2 = 16 \]
\[ 6t^2 + 12t - 10 = 0 \]
The quadratic formula gives the solutions $t = -1 \pm \frac{4}{\sqrt{6}}$. Putting those two values into the equations of the line gives the following two points of intersection:
\[ \left( 2 + \frac{4}{\sqrt{6}},\ -1 + \frac{8}{\sqrt{6}},\ 4 - \frac{4}{\sqrt{6}} \right) \quad\text{and}\quad \left( 2 - \frac{4}{\sqrt{6}},\ -1 - \frac{8}{\sqrt{6}},\ 4 + \frac{4}{\sqrt{6}} \right) \]
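The substitution method of Example 1.29 works for any sphere and any line: putting $\mathbf{p} + t\mathbf{v}$ into $\|\mathbf{x} - \mathbf{x}_0\|^2 = r^2$ always gives a quadratic in $t$. The sketch below is one way to code that in Python (the function name `sphere_line_points` is our own); it uses floating-point arithmetic, so the results are approximate.

```python
import math


def sphere_line_points(center, radius, p, v):
    """Points where the line p + t*v meets the sphere ||x - center|| = radius.

    Substituting the line into the sphere equation gives a*t^2 + b*t + c = 0
    with a = v.v, b = 2 v.w, c = w.w - radius^2, where w = p - center.
    Returns a list of 0, 1 or 2 points. Assumes v is nonzero.
    """
    w = tuple(p_i - c_i for p_i, c_i in zip(p, center))
    a = sum(v_i * v_i for v_i in v)
    b = 2 * sum(v_i * w_i for v_i, w_i in zip(v, w))
    c = sum(w_i * w_i for w_i in w) - radius ** 2
    disc = b * b - 4 * a * c
    if disc < 0:
        return []                      # the line misses the sphere
    root = math.sqrt(disc)
    ts = sorted({(-b - root) / (2 * a), (-b + root) / (2 * a)})
    return [tuple(p_i + t * v_i for p_i, v_i in zip(p, v)) for t in ts]


for point in sphere_line_points((2, -1, 4), 4, (3, 1, 3), (1, 2, -1)):
    print(tuple(round(coord, 4) for coord in point))
```

For the data of Example 1.29 the coefficients come out as $a = 6$, $b = 12$, $c = -10$, matching the quadratic $6t^2 + 12t - 10 = 0$ found by hand.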


If two spheres intersect, they do so either at a single point or in a circle.

Example 1.30. Find the intersection (if any) of the spheres $x^2 + y^2 + z^2 = 25$ and $x^2 + y^2 + (z - 2)^2 = 16$.

Solution: For any point $(x, y, z)$ on both spheres, we see that
\[ x^2 + y^2 + z^2 = 25 \ \Rightarrow\ x^2 + y^2 = 25 - z^2, \text{ and} \]
\[ x^2 + y^2 + (z - 2)^2 = 16 \ \Rightarrow\ x^2 + y^2 = 16 - (z - 2)^2, \text{ so} \]
\[ 16 - (z - 2)^2 = 25 - z^2 \ \Rightarrow\ 4z - 4 = 9 \ \Rightarrow\ z = 13/4 \]
\[ \Rightarrow\ x^2 + y^2 = 25 - (13/4)^2 = 231/16 \]
$\therefore$ The intersection is the circle $x^2 + y^2 = \frac{231}{16}$ of radius $\frac{\sqrt{231}}{4} \approx 3.8$ centered at $(0, 0, \frac{13}{4})$.
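The elimination used in Example 1.30 generalizes to any two spheres. Subtracting the two sphere equations shows that the common points lie in a plane perpendicular to the line joining the centers, at distance $h = (d^2 + r_1^2 - r_2^2)/(2d)$ from the first center, where $d$ is the distance between the centers. The following Python sketch is an illustration under that derivation (the function name is hypothetical, and concentric spheres are simply reported as having no circle of intersection).

```python
import math


def sphere_sphere_circle(c1, r1, c2, r2):
    """Circle in which two spheres meet, as (center, radius), or None.

    The circle lies in the plane through `center` perpendicular to the line
    joining the sphere centers c1 and c2. A radius of 0 means the spheres
    touch at a single point.
    """
    axis = tuple(b - a for a, b in zip(c1, c2))
    d = math.sqrt(sum(u * u for u in axis))
    if d == 0:
        return None                    # concentric spheres are not handled here
    h = (d * d + r1 * r1 - r2 * r2) / (2 * d)
    radius_squared = r1 * r1 - h * h
    if radius_squared < 0:
        return None                    # the spheres do not intersect
    center = tuple(a + h * u / d for a, u in zip(c1, axis))
    return center, math.sqrt(radius_squared)


center, radius = sphere_sphere_circle((0, 0, 0), 5, (0, 0, 2), 4)
print(center, round(radius, 2))  # (0.0, 0.0, 3.25) 3.8
```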

The cylinders that we will consider are right circular cylinders. These are cylinders obtained by moving a line $L$ along a circle $C$ in $\mathbb{R}^3$ in a way so that $L$ is always perpendicular to the plane containing $C$. We will only consider the cases where the plane containing $C$ is parallel to one of the three coordinate planes (see Figure 1.6.3).

[Figure 1.6.3: Cylinders in R^3. (a) x^2 + y^2 = r^2, any z; (b) x^2 + z^2 = r^2, any y; (c) y^2 + z^2 = r^2, any x]

For example, the equation of a cylinder whose base circle $C$ lies in the $xy$-plane and is centered at $(a, b, 0)$ and has radius $r$ is
\[ (x - a)^2 + (y - b)^2 = r^2, \tag{1.32} \]
where the value of the $z$ coordinate is unrestricted. Similar equations can be written when the base circle lies in one of the other coordinate planes. A plane intersects a right circular cylinder in a circle, ellipse, or one or two lines, depending on whether that plane is parallel, oblique[9], or perpendicular, respectively, to the plane containing $C$. The intersection of a surface with a plane is called the trace of the surface.

[9] i.e. at an angle strictly between $0^\circ$ and $90^\circ$.


The equations of spheres and cylinders are examples of second-degree equations in $\mathbb{R}^3$, i.e. equations of the form
\[ Ax^2 + By^2 + Cz^2 + Dxy + Exz + Fyz + Gx + Hy + Iz + J = 0 \tag{1.33} \]
for some constants $A, B, \ldots, J$. If the above equation is not that of a sphere, cylinder, plane, line or point, then the resulting surface is called a quadric surface.

One type of quadric surface is the ellipsoid, given by an equation of the form:
\[ \frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1 \tag{1.34} \]
In the case where $a = b = c$, this is just a sphere. In general, an ellipsoid is egg-shaped (think of an ellipse rotated around its major axis). Its traces in the coordinate planes are ellipses.

[Figure 1.6.4: Ellipsoid, with semi-axes a, b, c along the x-, y- and z-axes]

Two other types of quadric surfaces are the hyperboloid of one sheet, given by an equation of the form:
\[ \frac{x^2}{a^2} + \frac{y^2}{b^2} - \frac{z^2}{c^2} = 1 \tag{1.35} \]
and the hyperboloid of two sheets, whose equation has the form:
\[ \frac{x^2}{a^2} - \frac{y^2}{b^2} - \frac{z^2}{c^2} = 1 \tag{1.36} \]

[Figure 1.6.5: Hyperboloid of one sheet]

[Figure 1.6.6: Hyperboloid of two sheets]


For the hyperboloid of one sheet, the trace in any plane parallel to the $xy$-plane is an ellipse. The traces in the planes parallel to the $xz$- or $yz$-planes are hyperbolas (see Figure 1.6.5), except for the special cases $x = \pm a$ and $y = \pm b$; in those planes the traces are pairs of intersecting lines (see Exercise 8).

For the hyperboloid of two sheets, the trace in any plane parallel to the $xy$- or $xz$-plane is a hyperbola (see Figure 1.6.6). There is no trace in the $yz$-plane. In any plane parallel to the $yz$-plane for which $|x| > |a|$, the trace is an ellipse.

The elliptic paraboloid is another type of quadric surface, whose equation has the form:
\[ \frac{x^2}{a^2} + \frac{y^2}{b^2} = \frac{z}{c} \tag{1.37} \]

The traces in planes parallel to the $xy$-plane are ellipses, though in the $xy$-plane itself the trace is a single point. The traces in planes parallel to the $xz$- or $yz$-planes are parabolas. Figure 1.6.7 shows the case where $c > 0$. When $c < 0$ the surface is turned downward. In the case where $a = b$, the surface is called a paraboloid of revolution, which is often used as a reflecting surface, e.g. in vehicle headlights.[10]

[Figure 1.6.7: Paraboloid]

A more complicated quadric surface is the hyperbolic paraboloid, given by:
\[ \frac{x^2}{a^2} - \frac{y^2}{b^2} = \frac{z}{c} \tag{1.38} \]

[Figure 1.6.8: Hyperbolic paraboloid, plotted over x and y with vertical axis z]

[10] For a discussion of this see pp. 157-158 in HECHT.


The hyperbolic paraboloid can be tricky to draw; using graphing software on a computer can make it easier. For example, Figure 1.6.8 was created using the free Gnuplot package (see Appendix C). It shows the graph of the hyperbolic paraboloid $z = y^2 - x^2$, which is the special case where $a = b = 1$ and $c = -1$ in equation (1.38). The mesh lines on the surface are the traces in planes parallel to the coordinate planes. So we see that the traces in planes parallel to the $xz$-plane ($y$ constant, so $z = y^2 - x^2$ as a function of $x$) are parabolas that open downward, with the vertex at the top, while the traces in planes parallel to the $yz$-plane ($x$ constant) are parabolas that open upward. Also, notice that the traces in planes parallel to the $xy$-plane are hyperbolas, though in the $xy$-plane itself the trace is a pair of intersecting lines through the origin. This is true in general when $c < 0$ in equation (1.38). When $c > 0$, the surface would be similar to that in Figure 1.6.8, only rotated $90^\circ$ around the $z$-axis and the nature of the traces in planes parallel to the $xz$- or $yz$-planes would be reversed.

The last type of quadric surface that we will consider is the elliptic cone, which has an equation of the form:
\[ \frac{x^2}{a^2} + \frac{y^2}{b^2} - \frac{z^2}{c^2} = 0 \tag{1.39} \]

The traces in planes parallel to the $xy$-plane are ellipses, except in the $xy$-plane itself where the trace is a single point. The traces in planes parallel to the $xz$- or $yz$-planes are hyperbolas, except in the $xz$- and $yz$-planes themselves where the traces are pairs of intersecting lines.

[Figure 1.6.9: Elliptic cone]

Notice that every point on the elliptic cone is on a line which lies entirely on the surface; in Figure 1.6.9 these lines all go through the origin. This makes the elliptic cone an example of a ruled surface. The cylinder is also a ruled surface.

What may not be as obvious is that both the hyperboloid of one sheet and the hyperbolic paraboloid are ruled surfaces. In fact, on both surfaces there are two lines through each point on the surface (see Exercises 11-12). Such surfaces are called doubly ruled surfaces, and the pairs of lines are called a regulus.

It is clear that for each of the six types of quadric surfaces that we discussed, the surface can be translated away from the origin (e.g. by replacing $x^2$ by $(x - x_0)^2$ in its equation). It can be proved[11] that every quadric surface can be translated and/or rotated so that its equation matches one of the six types that we described. For example, $z = 2xy$ is a case of equation (1.33) with "mixed" variables, e.g. with $D \neq 0$ so that we get an $xy$ term. This equation does not match any of the types we considered. However, by rotating the $x$- and $y$-axes by $45^\circ$ in the $xy$-plane by means of the coordinate transformation $x = (x' - y')/\sqrt{2}$, $y = (x' + y')/\sqrt{2}$, $z = z'$, then $z = 2xy$ becomes the hyperbolic paraboloid $z' = (x')^2 - (y')^2$ in the $(x', y', z')$ coordinate system. That is, $z = 2xy$ is a hyperbolic paraboloid as in equation (1.38), but rotated $45^\circ$ in the $xy$-plane.

[11] See Ch. 7 in POGORELOV.
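The claim that the $45^\circ$ rotation turns $z = 2xy$ into $z' = (x')^2 - (y')^2$ follows from expanding $2 \cdot \frac{x' - y'}{\sqrt{2}} \cdot \frac{x' + y'}{\sqrt{2}} = (x')^2 - (y')^2$. As a quick numerical sanity check (a sketch only, in Python, with a helper name of our own), one can compare both sides at a few sample points:

```python
import math


def rotate45(xp, yp):
    """Map rotated coordinates (x', y') to the original (x, y)."""
    x = (xp - yp) / math.sqrt(2)
    y = (xp + yp) / math.sqrt(2)
    return x, y


samples = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (2.5, -1.5), (-3.0, 4.0)]
for xp, yp in samples:
    x, y = rotate45(xp, yp)
    # z computed from the original equation and from the rotated one
    print((xp, yp), round(2 * x * y, 10), round(xp ** 2 - yp ** 2, 10))
```

Agreement at sample points is not a proof, of course, but it is a convenient check against algebra slips in a change of coordinates.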

Exercises

A

For Exercises 1-4, determine if the given equation describes a sphere. If so, find its radius and center.

1. $2x^2 + 2y^2 + 2z^2 + 4x + 4y + 4z - 44 = 0$
2. $x^2 + y^2 + z^2 - 4x - 6y - 10z + 37 = 0$
3. $x^2 + y^2 + z^2 + 2x - 2y - 8z + 19 = 0$
4. $x^2 + y^2 - z^2 + 12x + 2y - 4z + 32 = 0$

5. Find the point(s) of intersection of the sphere $(x - 3)^2 + (y + 1)^2 + (z - 3)^2 = 9$ and the line $x = -1 + 2t$, $y = -2 - 3t$, $z = 3 + t$.

B

6. Find the intersection of the spheres $x^2 + y^2 + z^2 = 9$ and $(x - 4)^2 + (y + 2)^2 + (z - 4)^2 = 9$.

7. Find the intersection of the sphere $x^2 + y^2 + z^2 = 9$ and the cylinder $x^2 + y^2 = 4$.

8. Find the trace of the hyperboloid of one sheet $\frac{x^2}{a^2} + \frac{y^2}{b^2} - \frac{z^2}{c^2} = 1$ in the plane $x = a$, and the trace in the plane $y = b$.

9. Find the trace of the hyperbolic paraboloid $\frac{x^2}{a^2} - \frac{y^2}{b^2} = \frac{z}{c}$ in the $xy$-plane.

C

10. It can be shown that any four noncoplanar points (i.e. points that do not lie in the same plane) determine a sphere.[12] Find the equation of the sphere that passes through the points $(0, 0, 0)$, $(0, 0, 2)$, $(1, -4, 3)$ and $(0, -1, 3)$. (Hint: Equation (1.31))

11. Show that the hyperboloid of one sheet is a doubly ruled surface, i.e. each point on the surface is on two lines lying entirely on the surface. (Hint: Write equation (1.35) as $\frac{x^2}{a^2} - \frac{z^2}{c^2} = 1 - \frac{y^2}{b^2}$, factor each side. Recall that two planes intersect in a line.)

12. Show that the hyperbolic paraboloid is a doubly ruled surface. (Hint: Exercise 11)

13. Let $S$ be the sphere with radius 1 centered at $(0, 0, 1)$, and let $S^*$ be $S$ without the "north pole" point $(0, 0, 2)$. Let $(a, b, c)$ be an arbitrary point on $S^*$. Then the line passing through $(0, 0, 2)$ and $(a, b, c)$ intersects the $xy$-plane at some point $(x, y, 0)$, as in Figure 1.6.10. Find this point $(x, y, 0)$ in terms of $a$, $b$ and $c$. (Note: Every point in the $xy$-plane can be matched with a point on $S^*$, and vice versa, in this manner. This method is called stereographic projection, which essentially identifies all of $\mathbb{R}^2$ with a "punctured" sphere.)

[Figure 1.6.10: the sphere S, with the line through (0, 0, 2) and (a, b, c) meeting the xy-plane at (x, y, 0)]

[12] See WELCHONS and KRICKENBERGER, p. 160, for a proof.

1.7 Curvilinear Coordinates

The Cartesian coordinates of a point $(x, y, z)$ are determined by following a family of straight paths from the origin: first along the x-axis, then parallel to the y-axis, then parallel to the z-axis, as in Figure 1.7.1. In curvilinear coordinate systems, these paths can be curved. The two types of curvilinear coordinates which we will consider are cylindrical and spherical coordinates. Instead of referencing a point in terms of sides of a rectangular parallelepiped, as with Cartesian coordinates, we will think of the point as lying on a cylinder or sphere. Cylindrical coordinates are often used when there is symmetry around the z-axis, while spherical coordinates are useful when there is symmetry about the origin.

Figure 1.7.1

Let $P = (x, y, z)$ be a point in Cartesian coordinates in $\mathbb{R}^3$, and let $P_0 = (x, y, 0)$ be the projection of $P$ upon the xy-plane. Treating $(x, y)$ as a point in $\mathbb{R}^2$, let $(r, \theta)$ be its polar coordinates (see Figure 1.7.2). Let $\rho$ be the length of the line segment from the origin to $P$, and let $\phi$ be the angle between that line segment and the positive z-axis (see Figure 1.7.3). $\phi$ is called the zenith angle. Then the cylindrical coordinates $(r, \theta, z)$ and the spherical coordinates $(\rho, \theta, \phi)$ of $P(x, y, z)$ are defined as follows:

Cylindrical coordinates $(r, \theta, z)$:
\[
\begin{aligned}
x &= r\cos\theta & r &= \sqrt{x^2 + y^2} \\
y &= r\sin\theta & \theta &= \tan^{-1}\left(\frac{y}{x}\right) \\
z &= z & z &= z
\end{aligned}
\]
where $0 \le \theta \le \pi$ if $y \ge 0$ and $\pi < \theta < 2\pi$ if $y < 0$

Figure 1.7.2 Cylindrical coordinates

Spherical coordinates $(\rho, \theta, \phi)$:
\[
\begin{aligned}
x &= \rho\sin\phi\cos\theta & \rho &= \sqrt{x^2 + y^2 + z^2} \\
y &= \rho\sin\phi\sin\theta & \theta &= \tan^{-1}\left(\frac{y}{x}\right) \\
z &= \rho\cos\phi & \phi &= \cos^{-1}\left(\frac{z}{\sqrt{x^2 + y^2 + z^2}}\right)
\end{aligned}
\]
where $0 \le \theta \le \pi$ if $y \ge 0$ and $\pi < \theta < 2\pi$ if $y < 0$

Figure 1.7.3 Spherical coordinates

Both $\theta$ and $\phi$ are measured in radians. Note that $r \ge 0$, $0 \le \theta < 2\pi$, $\rho \ge 0$ and $0 \le \phi \le \pi$. Also, $\theta$ is undefined when $(x, y) = (0, 0)$, and $\phi$ is undefined when $(x, y, z) = (0, 0, 0)$.

Example 1.31. Convert the point (−2, −2, 1) from Cartesian coordinates to (a) cylindrical and (b) spherical coordinates.

Solution: (a) $r = \sqrt{(-2)^2 + (-2)^2} = 2\sqrt{2}$, $\theta = \tan^{-1}\left(\frac{-2}{-2}\right) = \tan^{-1}(1) = \frac{5\pi}{4}$, since $y = -2 < 0$.
$\therefore (r, \theta, z) = \left(2\sqrt{2}, \frac{5\pi}{4}, 1\right)$

(b) $\rho = \sqrt{(-2)^2 + (-2)^2 + 1^2} = \sqrt{9} = 3$, $\phi = \cos^{-1}\left(\frac{1}{3}\right) \approx 1.23$ radians.
$\therefore (\rho, \theta, \phi) = \left(3, \frac{5\pi}{4}, 1.23\right)$

For cylindrical coordinates $(r, \theta, z)$, and constants $r_0$, $\theta_0$ and $z_0$, we see from Figure 1.7.4 that the surface $r = r_0$ is a cylinder of radius $r_0$ centered along the z-axis, the surface $\theta = \theta_0$ is a half-plane emanating from the z-axis, and the surface $z = z_0$ is a plane parallel to the xy-plane.

Figure 1.7.4 Cylindrical coordinate surfaces: (a) $r = r_0$, (b) $\theta = \theta_0$, (c) $z = z_0$

For spherical coordinates $(\rho, \theta, \phi)$, and constants $\rho_0$, $\theta_0$ and $\phi_0$, we see from Figure 1.7.5 that the surface $\rho = \rho_0$ is a sphere of radius $\rho_0$ centered at the origin, the surface $\theta = \theta_0$ is a half-plane emanating from the z-axis, and the surface $\phi = \phi_0$ is a circular cone whose vertex is at the origin.

Figure 1.7.5 Spherical coordinate surfaces: (a) $\rho = \rho_0$, (b) $\theta = \theta_0$, (c) $\phi = \phi_0$

Figures 1.7.4(a) and 1.7.5(a) show how these coordinate systems got their names.
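The conversion in Example 1.31 can be sketched in code. The following is a minimal illustration, not part of the original text: the function names `to_cylindrical` and `to_spherical` are hypothetical, and the sketch assumes that `math.atan2`, reduced modulo 2π, is an acceptable way to pick the angle θ in [0, 2π) with the quadrant rule stated in the definitions.

```python
import math

def to_cylindrical(x, y, z):
    """Return (r, theta, z) with 0 <= theta < 2*pi."""
    if x == 0 and y == 0:
        raise ValueError("theta is undefined when (x, y) = (0, 0)")
    r = math.hypot(x, y)
    # atan2 returns an angle in (-pi, pi]; reduce it to [0, 2*pi).
    theta = math.atan2(y, x) % (2 * math.pi)
    return r, theta, z

def to_spherical(x, y, z):
    """Return (rho, theta, phi) with 0 <= theta < 2*pi and 0 <= phi <= pi."""
    if x == 0 and y == 0:
        raise ValueError("theta is undefined when (x, y) = (0, 0)")
    rho = math.sqrt(x * x + y * y + z * z)
    theta = math.atan2(y, x) % (2 * math.pi)
    phi = math.acos(z / rho)  # zenith angle, measured from the positive z-axis
    return rho, theta, phi

if __name__ == "__main__":
    print(to_cylindrical(-2, -2, 1))
    print(to_spherical(-2, -2, 1))
```

Using `atan2` avoids the separate case analysis on the sign of y that the inverse tangent formula requires when done by hand.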

Sometimes the equation of a surface in Cartesian coordinates can be transformed into a simpler equation in some other coordinate system, as in the following example.

Example 1.32. Write the equation of the cylinder $x^2 + y^2 = 4$ in cylindrical coordinates.

Solution: Since $r = \sqrt{x^2 + y^2}$, then the equation in cylindrical coordinates is $r = 2$.

Using spherical coordinates to write the equation of a sphere does not necessarily make the equation simpler, if the sphere is not centered at the origin.

Example 1.33. Write the equation $(x - 2)^2 + (y - 1)^2 + z^2 = 9$ in spherical coordinates.

Solution: Multiplying the equation out gives
\[
x^2 + y^2 + z^2 - 4x - 2y + 5 = 9 ,
\]
so we get
\[
\rho^2 - 4\rho\sin\phi\cos\theta - 2\rho\sin\phi\sin\theta - 4 = 0 ,
\]
or
\[
\rho^2 - 2\sin\phi\,(2\cos\theta + \sin\theta)\,\rho - 4 = 0
\]
after combining terms. Note that this actually makes it more difficult to figure out what the surface is, as opposed to the Cartesian equation where you could immediately identify the surface as a sphere of radius 3 centered at (2, 1, 0).

Example 1.34. Describe the surface given by $\theta = z$ in cylindrical coordinates.

Solution: This surface is called a helicoid. As the (vertical) z coordinate increases, so does the angle $\theta$, while the radius $r$ is unrestricted. So this sweeps out a (ruled!) surface shaped like a spiral staircase, where the spiral has an infinite radius. Figure 1.7.6 shows a section of this surface restricted to $0 \le z \le 4\pi$ and $0 \le r \le 2$.

Figure 1.7.6 Helicoid $\theta = z$
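As a sanity check on Example 1.33, the Cartesian and spherical forms of the sphere equation should agree at every point. The sketch below is an illustration under that assumption: the helper names are hypothetical, and it compares the residual of $(x - 2)^2 + (y - 1)^2 + z^2 - 9$ with the residual of $\rho^2 - 4\rho\sin\phi\cos\theta - 2\rho\sin\phi\sin\theta - 4$ at a few arbitrarily chosen sample points.

```python
import math

def spherical_to_cartesian(rho, theta, phi):
    """Convert spherical (rho, theta, phi) to Cartesian (x, y, z)."""
    x = rho * math.sin(phi) * math.cos(theta)
    y = rho * math.sin(phi) * math.sin(theta)
    z = rho * math.cos(phi)
    return x, y, z

def cartesian_residual(x, y, z):
    """Left side minus right side of (x-2)^2 + (y-1)^2 + z^2 = 9."""
    return (x - 2) ** 2 + (y - 1) ** 2 + z ** 2 - 9

def spherical_residual(rho, theta, phi):
    """Left side of rho^2 - 4 rho sin(phi) cos(theta) - 2 rho sin(phi) sin(theta) - 4 = 0."""
    s = math.sin(phi)
    return rho ** 2 - 4 * rho * s * math.cos(theta) - 2 * rho * s * math.sin(theta) - 4

if __name__ == "__main__":
    # Sample points are arbitrary; they need not lie on the sphere,
    # because the two residuals are the same function of the point.
    for rho, theta, phi in [(1.0, 0.3, 1.1), (2.5, 4.0, 0.2), (5.0, 1.7, 2.9)]:
        x, y, z = spherical_to_cartesian(rho, theta, phi)
        print(cartesian_residual(x, y, z), spherical_residual(rho, theta, phi))
```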

Exercises

A

For Exercises 1-4, find the (a) cylindrical and (b) spherical coordinates of the point whose Cartesian coordinates are given.

1. $(2, 2\sqrt{3}, -1)$
2. $(-5, 5, 6)$
3. $(\sqrt{21}, -\sqrt{7}, 0)$
4. $(0, \sqrt{2}, 2)$

For Exercises 5-7, write the given equation in (a) cylindrical and (b) spherical coordinates.

5. $x^2 + y^2 + z^2 = 25$
6. $x^2 + y^2 = 2y$
7. $x^2 + y^2 + 9z^2 = 36$

B

8. Describe the intersection of the surfaces whose equations in spherical coordinates are $\theta = \frac{\pi}{2}$ and $\phi = \frac{\pi}{4}$.

9. Show that for $a \ne 0$, the equation $\rho = 2a\sin\phi\cos\theta$ in spherical coordinates describes a sphere centered at $(a, 0, 0)$ with radius $|a|$.

C

10. Let $P = (a, \theta, \phi)$ be a point in spherical coordinates, with $a > 0$ and $0 < \phi < \pi$. Then $P$ lies on the sphere $\rho = a$. Since $0 < \phi < \pi$, the line segment from the origin to $P$ can be extended to intersect the cylinder given by $r = a$ (in cylindrical coordinates). Find the cylindrical coordinates of that point of intersection.

11. Let $P_1$ and $P_2$ be points whose spherical coordinates are $(\rho_1, \theta_1, \phi_1)$ and $(\rho_2, \theta_2, \phi_2)$, respectively. Let $\mathbf{v}_1$ be the vector from the origin to $P_1$, and let $\mathbf{v}_2$ be the vector from the origin to $P_2$. For the angle $\gamma$ between $\mathbf{v}_1$ and $\mathbf{v}_2$, show that
\[
\cos\gamma = \cos\phi_1\cos\phi_2 + \sin\phi_1\sin\phi_2\cos(\theta_2 - \theta_1).
\]
This formula is used in electrodynamics to prove the addition theorem for spherical harmonics, which provides a general expression for the electrostatic potential at a point due to a unit charge. See pp. 100-102 in JACKSON.

12. Show that the distance $d$ between the points $P_1$ and $P_2$ with cylindrical coordinates $(r_1, \theta_1, z_1)$ and $(r_2, \theta_2, z_2)$, respectively, is
\[
d = \sqrt{r_1^2 + r_2^2 - 2r_1 r_2\cos(\theta_2 - \theta_1) + (z_2 - z_1)^2}.
\]

13. Show that the distance $d$ between the points $P_1$ and $P_2$ with spherical coordinates $(\rho_1, \theta_1, \phi_1)$ and $(\rho_2, \theta_2, \phi_2)$, respectively, is
\[
d = \sqrt{\rho_1^2 + \rho_2^2 - 2\rho_1\rho_2\,[\sin\phi_1\sin\phi_2\cos(\theta_2 - \theta_1) + \cos\phi_1\cos\phi_2]}.
\]

1.8 Vector-Valued Functions

Now that we are familiar with vectors and their operations, we can begin discussing functions whose values are vectors.

Definition 1.10. A vector-valued function of a real variable is a rule that associates a vector $\mathbf{f}(t)$ with a real number $t$, where $t$ is in some subset $D$ of $\mathbb{R}^1$ (called the domain of $\mathbf{f}$). We write $\mathbf{f} : D \to \mathbb{R}^3$ to denote that $\mathbf{f}$ is a mapping of $D$ into $\mathbb{R}^3$.

For example, $\mathbf{f}(t) = t\mathbf{i} + t^2\mathbf{j} + t^3\mathbf{k}$ is a vector-valued function in $\mathbb{R}^3$, defined for all real numbers $t$. We would write $\mathbf{f} : \mathbb{R} \to \mathbb{R}^3$. At $t = 1$ the value of the function is the vector $\mathbf{i} + \mathbf{j} + \mathbf{k}$, which in Cartesian coordinates has the terminal point (1, 1, 1).

A vector-valued function of a real variable can be written in component form as
\[
\mathbf{f}(t) = f_1(t)\mathbf{i} + f_2(t)\mathbf{j} + f_3(t)\mathbf{k}
\]
or in the form
\[
\mathbf{f}(t) = (f_1(t), f_2(t), f_3(t))
\]
for some real-valued functions $f_1(t)$, $f_2(t)$, $f_3(t)$, called the component functions of $\mathbf{f}$. The first form is often used when emphasizing that $\mathbf{f}(t)$ is a vector, and the second form is useful when considering just the terminal points of the vectors. By identifying vectors with their terminal points, a curve in space can be written as a vector-valued function.

Example 1.35. Define $\mathbf{f} : \mathbb{R} \to \mathbb{R}^3$ by $\mathbf{f}(t) = (\cos t, \sin t, t)$. This is the equation of a helix (see Figure 1.8.1). As the value of $t$ increases, the terminal points of $\mathbf{f}(t)$ trace out a curve spiraling upward. For each $t$, the x- and y-coordinates of $\mathbf{f}(t)$ are $x = \cos t$ and $y = \sin t$, so
\[
x^2 + y^2 = \cos^2 t + \sin^2 t = 1.
\]
Thus, the curve lies on the surface of the right circular cylinder $x^2 + y^2 = 1$.

Figure 1.8.1

It may help to think of vector-valued functions of a real variable in $\mathbb{R}^3$ as a generalization of the parametric functions in $\mathbb{R}^2$ which you learned about in single-variable calculus. Much of the theory of real-valued functions of a single real variable can be applied to vector-valued functions of a real variable. Since each of the three component functions is real-valued, it will sometimes be the case that results from single-variable calculus can simply be applied to each of the component functions to yield a similar result for the vector-valued function. However, there are times when such generalizations do not hold (see Exercise 13). The concept of a limit, though, can be extended naturally to vector-valued functions, as in the following definition.

Definition 1.11. Let $\mathbf{f}(t)$ be a vector-valued function, let $a$ be a real number and let $\mathbf{c}$ be a vector. Then we say that the limit of $\mathbf{f}(t)$ as $t$ approaches $a$ equals $\mathbf{c}$, written as $\lim_{t \to a} \mathbf{f}(t) = \mathbf{c}$, if $\lim_{t \to a} \|\mathbf{f}(t) - \mathbf{c}\| = 0$. If $\mathbf{f}(t) = (f_1(t), f_2(t), f_3(t))$, then
\[
\lim_{t \to a} \mathbf{f}(t) = \left(\lim_{t \to a} f_1(t), \lim_{t \to a} f_2(t), \lim_{t \to a} f_3(t)\right)
\]
provided that all three limits on the right side exist.

The above definition shows that continuity and the derivative of vector-valued functions can also be defined in terms of their component functions.

Definition 1.12. Let $\mathbf{f}(t) = (f_1(t), f_2(t), f_3(t))$ be a vector-valued function, and let $a$ be a real number in its domain. Then $\mathbf{f}(t)$ is continuous at $a$ if $\lim_{t \to a} \mathbf{f}(t) = \mathbf{f}(a)$. Equivalently, $\mathbf{f}(t)$ is continuous at $a$ if and only if $f_1(t)$, $f_2(t)$, and $f_3(t)$ are continuous at $a$.
The derivative of $\mathbf{f}(t)$ at $a$, denoted by $\mathbf{f}'(a)$ or $\frac{d\mathbf{f}}{dt}(a)$, is the limit
\[
\mathbf{f}'(a) = \lim_{h \to 0} \frac{\mathbf{f}(a + h) - \mathbf{f}(a)}{h}
\]
if that limit exists. Equivalently, $\mathbf{f}'(a) = (f_1'(a), f_2'(a), f_3'(a))$, if the component derivatives exist. We say that $\mathbf{f}(t)$ is differentiable at $a$ if $\mathbf{f}'(a)$ exists.

Recall that the derivative of a real-valued function of a single variable is a real number, representing the slope of the tangent line to the graph of the function at a point. Similarly, the derivative of a vector-valued function is a tangent vector to the curve in space which the function represents, and it lies on the tangent line to the curve (see Figure 1.8.2).

Figure 1.8.2 Tangent vector $\mathbf{f}'(a)$ and tangent line $L = \mathbf{f}(a) + s\mathbf{f}'(a)$

Example 1.36. Let $\mathbf{f}(t) = (\cos t, \sin t, t)$. Then $\mathbf{f}'(t) = (-\sin t, \cos t, 1)$ for all $t$. The tangent line $L$ to the curve at $\mathbf{f}(2\pi) = (1, 0, 2\pi)$ is $L = \mathbf{f}(2\pi) + s\mathbf{f}'(2\pi) = (1, 0, 2\pi) + s(0, 1, 1)$, or in parametric form: $x = 1$, $y = s$, $z = 2\pi + s$ for $-\infty < s < \infty$.
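The componentwise derivative and the tangent line of Example 1.36 can be illustrated numerically. This is a minimal sketch, assuming a central difference quotient is an adequate stand-in for the limit in Definition 1.12; the names `central_difference` and `tangent_line` are hypothetical.

```python
import math

def f(t):
    """The helix f(t) = (cos t, sin t, t)."""
    return (math.cos(t), math.sin(t), t)

def f_prime(t):
    """Derivative taken component by component."""
    return (-math.sin(t), math.cos(t), 1.0)

def central_difference(func, a, h=1e-6):
    """Approximate func'(a) by (func(a + h) - func(a - h)) / (2h), per component."""
    forward, backward = func(a + h), func(a - h)
    return tuple((p - q) / (2 * h) for p, q in zip(forward, backward))

def tangent_line(a, s):
    """Point f(a) + s f'(a) on the tangent line at parameter value a."""
    return tuple(p + s * d for p, d in zip(f(a), f_prime(a)))

if __name__ == "__main__":
    a = 2 * math.pi
    print(central_difference(f, a))  # should be close to f_prime(a)
    print(tangent_line(a, 3.0))      # a point on the line x = 1, y = s, z = 2*pi + s
```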

A scalar function is a real-valued function. Note that if $u(t)$ is a scalar function and $\mathbf{f}(t)$ is a vector-valued function, then their product, defined by $(u\,\mathbf{f})(t) = u(t)\,\mathbf{f}(t)$ for all $t$, is a vector-valued function (since the product of a scalar with a vector is a vector).

The basic properties of derivatives of vector-valued functions are summarized in the following theorem.

Theorem 1.20. Let $\mathbf{f}(t)$ and $\mathbf{g}(t)$ be differentiable vector-valued functions, let $u(t)$ be a differentiable scalar function, let $k$ be a scalar, and let $\mathbf{c}$ be a constant vector. Then

(a) $\frac{d}{dt}(\mathbf{c}) = \mathbf{0}$
(b) $\frac{d}{dt}(k\mathbf{f}) = k\frac{d\mathbf{f}}{dt}$
(c) $\frac{d}{dt}(\mathbf{f} + \mathbf{g}) = \frac{d\mathbf{f}}{dt} + \frac{d\mathbf{g}}{dt}$
(d) $\frac{d}{dt}(\mathbf{f} - \mathbf{g}) = \frac{d\mathbf{f}}{dt} - \frac{d\mathbf{g}}{dt}$
(e) $\frac{d}{dt}(u\,\mathbf{f}) = \frac{du}{dt}\mathbf{f} + u\frac{d\mathbf{f}}{dt}$
(f) $\frac{d}{dt}(\mathbf{f} \cdot \mathbf{g}) = \frac{d\mathbf{f}}{dt} \cdot \mathbf{g} + \mathbf{f} \cdot \frac{d\mathbf{g}}{dt}$
(g) $\frac{d}{dt}(\mathbf{f} \times \mathbf{g}) = \frac{d\mathbf{f}}{dt} \times \mathbf{g} + \mathbf{f} \times \frac{d\mathbf{g}}{dt}$

Proof: The proofs of parts (a)-(e) follow easily by differentiating the component functions and using the rules for derivatives from single-variable calculus. We will prove part (f), and leave the proof of part (g) as an exercise for the reader.

(f) Write $\mathbf{f}(t) = (f_1(t), f_2(t), f_3(t))$ and $\mathbf{g}(t) = (g_1(t), g_2(t), g_3(t))$, where the component functions $f_1(t)$, $f_2(t)$, $f_3(t)$, $g_1(t)$, $g_2(t)$, $g_3(t)$ are all differentiable real-valued functions. Then
\[
\begin{aligned}
\frac{d}{dt}(\mathbf{f}(t) \cdot \mathbf{g}(t)) &= \frac{d}{dt}\bigl(f_1(t)g_1(t) + f_2(t)g_2(t) + f_3(t)g_3(t)\bigr) \\
&= \frac{d}{dt}(f_1(t)g_1(t)) + \frac{d}{dt}(f_2(t)g_2(t)) + \frac{d}{dt}(f_3(t)g_3(t)) \\
&= \frac{df_1}{dt}(t)\,g_1(t) + f_1(t)\frac{dg_1}{dt}(t) + \frac{df_2}{dt}(t)\,g_2(t) + f_2(t)\frac{dg_2}{dt}(t) + \frac{df_3}{dt}(t)\,g_3(t) + f_3(t)\frac{dg_3}{dt}(t) \\
&= \left(\frac{df_1}{dt}(t), \frac{df_2}{dt}(t), \frac{df_3}{dt}(t)\right) \cdot (g_1(t), g_2(t), g_3(t)) + (f_1(t), f_2(t), f_3(t)) \cdot \left(\frac{dg_1}{dt}(t), \frac{dg_2}{dt}(t), \frac{dg_3}{dt}(t)\right) \\
&= \frac{d\mathbf{f}}{dt}(t) \cdot \mathbf{g}(t) + \mathbf{f}(t) \cdot \frac{d\mathbf{g}}{dt}(t) \quad \text{for all } t.
\end{aligned}
\]
QED

Example 1.37. Suppose $\mathbf{f}(t)$ is differentiable. Find the derivative of $\|\mathbf{f}(t)\|$.

Solution: Since $\|\mathbf{f}(t)\|$ is a real-valued function of $t$, then by the Chain Rule for real-valued functions, we know that $\frac{d}{dt}\|\mathbf{f}(t)\|^2 = 2\|\mathbf{f}(t)\|\frac{d}{dt}\|\mathbf{f}(t)\|$.
But $\|\mathbf{f}(t)\|^2 = \mathbf{f}(t) \cdot \mathbf{f}(t)$, so $\frac{d}{dt}\|\mathbf{f}(t)\|^2 = \frac{d}{dt}(\mathbf{f}(t) \cdot \mathbf{f}(t))$. Hence, we have
\[
\begin{aligned}
2\|\mathbf{f}(t)\|\frac{d}{dt}\|\mathbf{f}(t)\| &= \frac{d}{dt}(\mathbf{f}(t) \cdot \mathbf{f}(t)) = \mathbf{f}'(t) \cdot \mathbf{f}(t) + \mathbf{f}(t) \cdot \mathbf{f}'(t) \quad \text{by Theorem 1.20(f), so} \\
&= 2\,\mathbf{f}'(t) \cdot \mathbf{f}(t),
\end{aligned}
\]
so if $\|\mathbf{f}(t)\| \ne 0$ then
\[
\frac{d}{dt}\|\mathbf{f}(t)\| = \frac{\mathbf{f}'(t) \cdot \mathbf{f}(t)}{\|\mathbf{f}(t)\|}.
\]

We know that $\|\mathbf{f}(t)\|$ is constant if and only if $\frac{d}{dt}\|\mathbf{f}(t)\| = 0$ for all $t$. Also, $\mathbf{f}(t) \perp \mathbf{f}'(t)$ if and only if $\mathbf{f}'(t) \cdot \mathbf{f}(t) = 0$. Thus, the above example shows this important fact:

If $\|\mathbf{f}(t)\| \ne 0$, then $\|\mathbf{f}(t)\|$ is constant if and only if $\mathbf{f}(t) \perp \mathbf{f}'(t)$ for all $t$.

This means that if a curve lies completely on a sphere (or circle) centered at the origin, then the tangent vector $\mathbf{f}'(t)$ is always perpendicular to the position vector $\mathbf{f}(t)$.

Example 1.38. The spherical spiral
\[
\mathbf{f}(t) = \left(\frac{\cos t}{\sqrt{1 + a^2t^2}}, \frac{\sin t}{\sqrt{1 + a^2t^2}}, \frac{-at}{\sqrt{1 + a^2t^2}}\right), \quad \text{for } a \ne 0.
\]
Figure 1.8.3 shows the graph of the curve when $a = 0.2$. In the exercises, the reader will be asked to show that this curve lies on the sphere $x^2 + y^2 + z^2 = 1$ and to verify directly that $\mathbf{f}'(t) \cdot \mathbf{f}(t) = 0$ for all $t$.

Figure 1.8.3 Spherical spiral with $a = 0.2$
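The fact from Example 1.37 can be checked numerically on the spherical spiral of Example 1.38. This is a rough sketch, not a proof: it assumes a = 0.2 as in Figure 1.8.3, approximates the derivative with a central difference, and the helper names are hypothetical.

```python
import math

def spiral(t, a=0.2):
    """Spherical spiral f(t) from Example 1.38."""
    d = math.sqrt(1 + a * a * t * t)
    return (math.cos(t) / d, math.sin(t) / d, -a * t / d)

def derivative(func, t, h=1e-6):
    """Central difference approximation of func'(t), per component."""
    forward, backward = func(t + h), func(t - h)
    return tuple((p - q) / (2 * h) for p, q in zip(forward, backward))

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

def norm(u):
    return math.sqrt(dot(u, u))

if __name__ == "__main__":
    for t in (-3.0, 0.0, 1.5, 10.0):
        # The norm should be 1 and the dot product should be near 0.
        print(t, norm(spiral(t)), dot(derivative(spiral, t), spiral(t)))
```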

Just as in single-variable calculus, higher-order derivatives of vector-valued functions are obtained by repeatedly differentiating the (first) derivative of the function:
\[
\mathbf{f}''(t) = \frac{d}{dt}\mathbf{f}'(t), \quad \mathbf{f}'''(t) = \frac{d}{dt}\mathbf{f}''(t), \quad \ldots, \quad \frac{d^n\mathbf{f}}{dt^n} = \frac{d}{dt}\left(\frac{d^{n-1}\mathbf{f}}{dt^{n-1}}\right) \quad (\text{for } n = 2, 3, 4, \ldots)
\]

We can use vector-valued functions to represent physical quantities, such as velocity, acceleration, force, momentum, etc. For example, let the real variable $t$ represent time elapsed from some initial time ($t = 0$), and suppose that an object of constant mass $m$ is subjected to some force so that it moves in space, with its position $(x, y, z)$ at time $t$ a function of $t$. That is, $x = x(t)$, $y = y(t)$, $z = z(t)$ for some real-valued functions $x(t)$, $y(t)$, $z(t)$. Call $\mathbf{r}(t) = (x(t), y(t), z(t))$ the position vector of the object. We can define various physical quantities associated with the object as follows:[13]

position: $\mathbf{r}(t) = (x(t), y(t), z(t))$
velocity: $\mathbf{v}(t) = \dot{\mathbf{r}}(t) = \mathbf{r}'(t) = \frac{d\mathbf{r}}{dt} = (x'(t), y'(t), z'(t))$
acceleration: $\mathbf{a}(t) = \dot{\mathbf{v}}(t) = \mathbf{v}'(t) = \frac{d\mathbf{v}}{dt} = \ddot{\mathbf{r}}(t) = \mathbf{r}''(t) = \frac{d^2\mathbf{r}}{dt^2} = (x''(t), y''(t), z''(t))$
momentum: $\mathbf{p}(t) = m\mathbf{v}(t)$
force: $\mathbf{F}(t) = \dot{\mathbf{p}}(t) = \mathbf{p}'(t) = \frac{d\mathbf{p}}{dt}$ (Newton's Second Law of Motion)

The magnitude $\|\mathbf{v}(t)\|$ of the velocity vector is called the speed of the object. Note that since the mass $m$ is a constant, the force equation becomes the familiar $\mathbf{F}(t) = m\mathbf{a}(t)$.

Example 1.39. Let $\mathbf{r}(t) = (5\cos t, 3\sin t, 4\sin t)$ be the position vector of an object at time $t \ge 0$. Find its (a) velocity and (b) acceleration vectors.

Solution: (a) $\mathbf{v}(t) = \dot{\mathbf{r}}(t) = (-5\sin t, 3\cos t, 4\cos t)$
(b) $\mathbf{a}(t) = \dot{\mathbf{v}}(t) = (-5\cos t, -3\sin t, -4\sin t)$

Note that $\|\mathbf{r}(t)\| = \sqrt{25\cos^2 t + 25\sin^2 t} = 5$ for all $t$, so by Example 1.37 we know that $\mathbf{r}(t) \cdot \dot{\mathbf{r}}(t) = 0$ for all $t$ (which we can verify from part (a)). In fact, $\|\mathbf{v}(t)\| = 5$ for all $t$ also. And not only does $\mathbf{r}(t)$ lie on the sphere of radius 5 centered at the origin, but perhaps not so obvious is that it lies completely within a circle of radius 5 centered at the origin. Also, note that $\mathbf{a}(t) = -\mathbf{r}(t)$. It turns out (see Exercise 16) that whenever an object moves in a circle with constant speed, the acceleration vector will point in the opposite direction of the position vector (i.e. towards the center of the circle).

[13] We will often use the older dot notation for derivatives when physics is involved.
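The observations made after Example 1.39 can be spot-checked with a few lines of code. This sketch simply evaluates the position, velocity and acceleration formulas from that example at sample times; the function names are illustrative and not from the original text.

```python
import math

def position(t):
    return (5 * math.cos(t), 3 * math.sin(t), 4 * math.sin(t))

def velocity(t):
    return (-5 * math.sin(t), 3 * math.cos(t), 4 * math.cos(t))

def acceleration(t):
    return (-5 * math.cos(t), -3 * math.sin(t), -4 * math.sin(t))

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

def norm(u):
    return math.sqrt(dot(u, u))

if __name__ == "__main__":
    for t in (0.0, 0.7, 2.0, 5.5):
        r, v, a = position(t), velocity(t), acceleration(t)
        # Expect ||r|| = 5, ||v|| = 5, r . v = 0, and a = -r at every time t.
        print(norm(r), norm(v), dot(r, v), tuple(p + q for p, q in zip(a, r)))
```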

Recall from Section 1.5 that if $\mathbf{r}_1$, $\mathbf{r}_2$ are position vectors to distinct points then $\mathbf{r}_1 + t(\mathbf{r}_2 - \mathbf{r}_1)$ represents a line through those two points as $t$ varies over all real numbers. That vector sum can be written as $(1 - t)\mathbf{r}_1 + t\mathbf{r}_2$. So the function $\mathbf{l}(t) = (1 - t)\mathbf{r}_1 + t\mathbf{r}_2$ is a line through the terminal points of $\mathbf{r}_1$ and $\mathbf{r}_2$, and when $t$ is restricted to the interval [0, 1] it is the line segment between the points, with $\mathbf{l}(0) = \mathbf{r}_1$ and $\mathbf{l}(1) = \mathbf{r}_2$.

In general, a function of the form $\mathbf{f}(t) = (a_1t + b_1, a_2t + b_2, a_3t + b_3)$ represents a line in $\mathbb{R}^3$. A function of the form $\mathbf{f}(t) = (a_1t^2 + b_1t + c_1, a_2t^2 + b_2t + c_2, a_3t^2 + b_3t + c_3)$ represents a (possibly degenerate) parabola in $\mathbb{R}^3$.

Example 1.40. Bézier curves are used in Computer Aided Design (CAD) to approximate the shape of a polygonal path in space (called the Bézier polygon or control polygon). For instance, given three points (or position vectors) $\mathbf{b}_0$, $\mathbf{b}_1$, $\mathbf{b}_2$ in $\mathbb{R}^3$, define
\[
\begin{aligned}
\mathbf{b}_0^1(t) &= (1 - t)\mathbf{b}_0 + t\mathbf{b}_1 \\
\mathbf{b}_1^1(t) &= (1 - t)\mathbf{b}_1 + t\mathbf{b}_2 \\
\mathbf{b}_0^2(t) &= (1 - t)\mathbf{b}_0^1(t) + t\mathbf{b}_1^1(t) \\
&= (1 - t)^2\mathbf{b}_0 + 2t(1 - t)\mathbf{b}_1 + t^2\mathbf{b}_2
\end{aligned}
\]
for all real $t$. For $t$ in the interval [0, 1], we see that $\mathbf{b}_0^1(t)$ is the line segment between $\mathbf{b}_0$ and $\mathbf{b}_1$, and $\mathbf{b}_1^1(t)$ is the line segment between $\mathbf{b}_1$ and $\mathbf{b}_2$. The function $\mathbf{b}_0^2(t)$ is the Bézier curve for the points $\mathbf{b}_0$, $\mathbf{b}_1$, $\mathbf{b}_2$. Note from the last formula that the curve is a parabola that goes through $\mathbf{b}_0$ (when $t = 0$) and $\mathbf{b}_2$ (when $t = 1$).

As an example, let $\mathbf{b}_0 = (0, 0, 0)$, $\mathbf{b}_1 = (1, 2, 3)$, and $\mathbf{b}_2 = (4, 5, 2)$. Then the explicit formula for the Bézier curve is $\mathbf{b}_0^2(t) = (2t + 2t^2, 4t + t^2, 6t - 4t^2)$, as shown in Figure 1.8.4, where the line segments are $\mathbf{b}_0^1(t)$ and $\mathbf{b}_1^1(t)$, and the curve is $\mathbf{b}_0^2(t)$.

Figure 1.8.4 Bézier curve approximation for three points
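The repeated linear interpolation in Example 1.40 translates directly into a short routine. The sketch below is one possible way to write it (the names `lerp`, `de_casteljau` and `explicit_quadratic` are hypothetical); it evaluates the curve for the three control points of the example and compares the result with the explicit formula $(2t + 2t^2, 4t + t^2, 6t - 4t^2)$.

```python
def lerp(p, q, t):
    """Linear interpolation (1 - t) p + t q between two points."""
    return tuple((1 - t) * pi + t * qi for pi, qi in zip(p, q))

def de_casteljau(points, t):
    """Evaluate the Bezier curve of the control points at t by repeated interpolation."""
    pts = [tuple(p) for p in points]
    while len(pts) > 1:
        # Each pass replaces n points by the n - 1 interpolated points between neighbors.
        pts = [lerp(pts[i], pts[i + 1], t) for i in range(len(pts) - 1)]
    return pts[0]

def explicit_quadratic(t):
    """Explicit formula from Example 1.40 for b0 = (0,0,0), b1 = (1,2,3), b2 = (4,5,2)."""
    return (2 * t + 2 * t * t, 4 * t + t * t, 6 * t - 4 * t * t)

if __name__ == "__main__":
    control = [(0, 0, 0), (1, 2, 3), (4, 5, 2)]
    for t in (0.0, 0.25, 0.5, 1.0):
        print(t, de_casteljau(control, t), explicit_quadratic(t))
```

Because the routine loops until one point remains, the same function also handles four or more control points.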

In general, the polygonal path determined by $n \ge 3$ noncollinear points in $\mathbb{R}^3$ can be used to define the Bézier curve recursively by a process called repeated linear interpolation. This curve will be a vector-valued function whose components are polynomials of degree $n - 1$, and its formula is given by de Casteljau's algorithm.[14] In the exercises, the reader will be given the algorithm for the case of $n = 4$ points and asked to write the explicit formula for the Bézier curve for the four points shown in Figure 1.8.5.

Figure 1.8.5 Bézier curve approximation for four points

Exercises

A

For Exercises 1-4, calculate $\mathbf{f}'(t)$ and find the tangent line at $\mathbf{f}(0)$.

1. $\mathbf{f}(t) = (t + 1, t^2 + 1, t^3 + 1)$
2. $\mathbf{f}(t) = (e^t + 1, e^{2t} + 1, e^{t^2} + 1)$
3. $\mathbf{f}(t) = (\cos 2t, \sin 2t, t)$
4. $\mathbf{f}(t) = (\sin 2t, 2\sin^2 t, 2\cos t)$

For Exercises 5-6, find the velocity $\mathbf{v}(t)$ and acceleration $\mathbf{a}(t)$ of an object with the given position vector $\mathbf{r}(t)$.

5. $\mathbf{r}(t) = (t, t - \sin t, 1 - \cos t)$
6. $\mathbf{r}(t) = (3\cos t, 2\sin t, 1)$

B

7. Let $\mathbf{f}(t) = \left(\frac{\cos t}{\sqrt{1 + a^2t^2}}, \frac{\sin t}{\sqrt{1 + a^2t^2}}, \frac{-at}{\sqrt{1 + a^2t^2}}\right)$, with $a \ne 0$.
(a) Show that $\|\mathbf{f}(t)\| = 1$ for all $t$.
(b) Show directly that $\mathbf{f}'(t) \cdot \mathbf{f}(t) = 0$ for all $t$.

8. If $\mathbf{f}'(t) = \mathbf{0}$ for all $t$ in some interval $(a, b)$, show that $\mathbf{f}(t)$ is a constant vector in $(a, b)$.

[14] See pp. 27-30 in FARIN.

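9. For a constant vector $\mathbf{c} \ne \mathbf{0}$, the function $\mathbf{f}(t) = t\mathbf{c}$ represents a line parallel to $\mathbf{c}$.
(a) What kind of curve does $\mathbf{g}(t) = t^3\mathbf{c}$ represent? Explain.
(b) What kind of curve does $\mathbf{h}(t) = e^t\mathbf{c}$ represent? Explain.
(c) Compare $\mathbf{f}'(0)$ and $\mathbf{g}'(0)$. Given your answer to part (a), how do you explain the difference in the two derivatives?

10. Show that $\frac{d}{dt}\left(\mathbf{f} \times \frac{d\mathbf{f}}{dt}\right) = \mathbf{f} \times \frac{d^2\mathbf{f}}{dt^2}$.

11. Let a particle of (constant) mass $m$ have position vector $\mathbf{r}(t)$, velocity $\mathbf{v}(t)$, acceleration $\mathbf{a}(t)$ and momentum $\mathbf{p}(t)$ at time $t$. The angular momentum $\mathbf{L}(t)$ of the particle with respect to the origin at time $t$ is defined as $\mathbf{L}(t) = \mathbf{r}(t) \times \mathbf{p}(t)$. If $\mathbf{F}(t)$ is the force acting on the particle at time $t$, then define the torque $\mathbf{N}(t)$ acting on the particle with respect to the origin as $\mathbf{N}(t) = \mathbf{r}(t) \times \mathbf{F}(t)$. Show that $\mathbf{L}'(t) = \mathbf{N}(t)$.

12. Show that $\frac{d}{dt}(\mathbf{f} \cdot (\mathbf{g} \times \mathbf{h})) = \frac{d\mathbf{f}}{dt} \cdot (\mathbf{g} \times \mathbf{h}) + \mathbf{f} \cdot \left(\frac{d\mathbf{g}}{dt} \times \mathbf{h}\right) + \mathbf{f} \cdot \left(\mathbf{g} \times \frac{d\mathbf{h}}{dt}\right)$.

13. The Mean Value Theorem does not hold for vector-valued functions: Show that for $\mathbf{f}(t) = (\cos t, \sin t, t)$, there is no $t$ in the interval $(0, 2\pi)$ such that
\[
\mathbf{f}'(t) = \frac{\mathbf{f}(2\pi) - \mathbf{f}(0)}{2\pi - 0}.
\]

C

14. The Bézier curve $\mathbf{b}_0^3(t)$ for four noncollinear points $\mathbf{b}_0$, $\mathbf{b}_1$, $\mathbf{b}_2$, $\mathbf{b}_3$ in $\mathbb{R}^3$ is defined by the following algorithm (going from the left column to the right):
\[
\begin{aligned}
\mathbf{b}_0^1(t) &= (1 - t)\mathbf{b}_0 + t\mathbf{b}_1 & \mathbf{b}_0^2(t) &= (1 - t)\mathbf{b}_0^1(t) + t\mathbf{b}_1^1(t) & \mathbf{b}_0^3(t) &= (1 - t)\mathbf{b}_0^2(t) + t\mathbf{b}_1^2(t) \\
\mathbf{b}_1^1(t) &= (1 - t)\mathbf{b}_1 + t\mathbf{b}_2 & \mathbf{b}_1^2(t) &= (1 - t)\mathbf{b}_1^1(t) + t\mathbf{b}_2^1(t) \\
\mathbf{b}_2^1(t) &= (1 - t)\mathbf{b}_2 + t\mathbf{b}_3
\end{aligned}
\]
(a) Show that $\mathbf{b}_0^3(t) = (1 - t)^3\mathbf{b}_0 + 3t(1 - t)^2\mathbf{b}_1 + 3t^2(1 - t)\mathbf{b}_2 + t^3\mathbf{b}_3$.
(b) Write the explicit formula (as in Example 1.40) for the Bézier curve for the points $\mathbf{b}_0 = (0, 0, 0)$, $\mathbf{b}_1 = (0, 1, 1)$, $\mathbf{b}_2 = (2, 3, 0)$, $\mathbf{b}_3 = (4, 5, 2)$.

15. Let $\mathbf{r}(t)$ be the position vector for a particle moving in $\mathbb{R}^3$. Show that
\[
\frac{d}{dt}(\mathbf{r} \times (\mathbf{v} \times \mathbf{r})) = \|\mathbf{r}\|^2\mathbf{a} + (\mathbf{r} \cdot \mathbf{v})\mathbf{v} - (\|\mathbf{v}\|^2 + \mathbf{r} \cdot \mathbf{a})\mathbf{r}.
\]

16. Let $\mathbf{r}(t)$ be the position vector in $\mathbb{R}^3$ for a particle that moves with constant speed $c > 0$ in a circle of radius $a > 0$ in the xy-plane. Show that $\mathbf{a}(t)$ points in the opposite direction as $\mathbf{r}(t)$ for all $t$. (Hint: Use Example 1.37 to show that $\mathbf{r}(t) \perp \mathbf{v}(t)$ and $\mathbf{a}(t) \perp \mathbf{v}(t)$, and hence $\mathbf{a}(t) \parallel \mathbf{r}(t)$.)

17. Prove Theorem 1.20(g).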

1.9 Arc Length

Let $\mathbf{r}(t) = (x(t), y(t), z(t))$ be the position vector of an object moving in $\mathbb{R}^3$. Since $\|\mathbf{v}(t)\|$ is the speed of the object at time $t$, it seems natural to define the distance $s$ traveled by the object from time $t = a$ to $t = b$ as the definite integral
\[
s = \int_a^b \|\mathbf{v}(t)\|\,dt = \int_a^b \sqrt{x'(t)^2 + y'(t)^2 + z'(t)^2}\,dt, \tag{1.40}
\]
which is analogous to the case from single-variable calculus for parametric functions in $\mathbb{R}^2$. This is indeed how we will define the distance traveled and, in general, the arc length of a curve in $\mathbb{R}^3$.

Definition 1.13. Let $\mathbf{f}(t) = (x(t), y(t), z(t))$ be a curve in $\mathbb{R}^3$ whose domain includes the interval $[a, b]$. Suppose that in the interval $(a, b)$ the first derivative of each component function $x(t)$, $y(t)$ and $z(t)$ exists and is continuous, and that no section of the curve is repeated. Then the arc length $L$ of the curve from $t = a$ to $t = b$ is
\[
L = \int_a^b \|\mathbf{f}'(t)\|\,dt = \int_a^b \sqrt{x'(t)^2 + y'(t)^2 + z'(t)^2}\,dt \tag{1.41}
\]

A real-valued function whose first derivative is continuous is called continuously differentiable (or a $C^1$ function), and a function whose derivatives of all orders are continuous is called smooth (or a $C^\infty$ function). All the functions we will consider will be smooth. A smooth curve $\mathbf{f}(t)$ is one whose derivative $\mathbf{f}'(t)$ is never the zero vector and whose component functions are all smooth.

Note that we did not prove that the formula in the above definition actually gives the length of a section of a curve. A rigorous proof requires dealing with some subtleties, normally glossed over in calculus texts, which are beyond the scope of this book.[15]

Example 1.41. Find the length $L$ of the helix $\mathbf{f}(t) = (\cos t, \sin t, t)$ from $t = 0$ to $t = 2\pi$.

Solution: By formula (1.41), we have
\[
\begin{aligned}
L &= \int_0^{2\pi} \sqrt{(-\sin t)^2 + (\cos t)^2 + 1^2}\,dt = \int_0^{2\pi} \sqrt{\sin^2 t + \cos^2 t + 1}\,dt = \int_0^{2\pi} \sqrt{2}\,dt \\
&= \sqrt{2}\,(2\pi - 0) = 2\sqrt{2}\,\pi
\end{aligned}
\]

Similar to the case in $\mathbb{R}^2$, if there are values of $t$ in the interval $[a, b]$ where the derivative of a component function is not continuous then it is often possible to partition $[a, b]$ into subintervals where all the component functions are continuously differentiable (except at the endpoints, which can be ignored). The sum of the arc lengths over the subintervals will be the arc length over $[a, b]$.

[15] In particular, Duhamel's principle is needed. See the proof in TAYLOR and MANN, § 14.2 and § 18.2.
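When the integral in formula (1.41) has no convenient closed form, it can be approximated numerically. The following sketch assumes composite Simpson's rule as the quadrature method (a choice made here for illustration, not one taken from the text) and checks it against the exact value $2\sqrt{2}\,\pi$ from Example 1.41; the function names are hypothetical.

```python
import math

def speed(fprime, t):
    """Norm of the derivative vector f'(t)."""
    return math.sqrt(sum(c * c for c in fprime(t)))

def arc_length(fprime, a, b, n=1000):
    """Approximate the integral of ||f'(t)|| over [a, b] with composite Simpson's rule."""
    if n % 2 != 0:
        raise ValueError("n must be even for Simpson's rule")
    h = (b - a) / n
    total = speed(fprime, a) + speed(fprime, b)
    for i in range(1, n):
        weight = 4 if i % 2 == 1 else 2
        total += weight * speed(fprime, a + i * h)
    return total * h / 3

def helix_prime(t):
    """Derivative of the helix f(t) = (cos t, sin t, t)."""
    return (-math.sin(t), math.cos(t), 1.0)

if __name__ == "__main__":
    print(arc_length(helix_prime, 0.0, 2 * math.pi))
    print(2 * math.sqrt(2) * math.pi)
```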

Notice that the curve traced out by the function $\mathbf{f}(t) = (\cos t, \sin t, t)$ from Example 1.41 is also traced out by the function $\mathbf{g}(t) = (\cos 2t, \sin 2t, 2t)$. For example, over the interval $[0, \pi]$, $\mathbf{g}(t)$ traces out the same section of the curve as $\mathbf{f}(t)$ does over the interval $[0, 2\pi]$. Intuitively, this says that $\mathbf{g}(t)$ traces the curve twice as fast as $\mathbf{f}(t)$. This makes sense since, viewing the functions as position vectors and their derivatives as velocity vectors, the speeds of $\mathbf{f}(t)$ and $\mathbf{g}(t)$ are $\|\mathbf{f}'(t)\| = \sqrt{2}$ and $\|\mathbf{g}'(t)\| = 2\sqrt{2}$, respectively. We say that $\mathbf{g}(t)$ and $\mathbf{f}(t)$ are different parametrizations of the same curve.

Definition 1.14. Let $C$ be a smooth curve in $\mathbb{R}^3$ represented by a function $\mathbf{f}(t)$ defined on an interval $[a, b]$, and let $\alpha : [c, d] \to [a, b]$ be a smooth one-to-one mapping of an interval $[c, d]$ onto $[a, b]$. Then the function $\mathbf{g} : [c, d] \to \mathbb{R}^3$ defined by $\mathbf{g}(s) = \mathbf{f}(\alpha(s))$ is a parametrization of $C$ with parameter $s$. If $\alpha$ is strictly increasing on $[c, d]$ then we say that $\mathbf{g}(s)$ is equivalent to $\mathbf{f}(t)$.

[Diagram: $s$ in $[c, d]$ is mapped by $\alpha$ to $t$ in $[a, b]$, which is mapped by $\mathbf{f}$ to $\mathbf{f}(t)$ in $\mathbb{R}^3$, so that $\mathbf{g}(s) = \mathbf{f}(\alpha(s)) = \mathbf{f}(t)$.]

Note that the differentiability of $\mathbf{g}(s)$ follows from a version of the Chain Rule for vector-valued functions (the proof is left as an exercise):

Theorem 1.21. Chain Rule: If $\mathbf{f}(t)$ is a differentiable vector-valued function of $t$, and $t = \alpha(s)$ is a differentiable scalar function of $s$, then $\mathbf{f}(s) = \mathbf{f}(\alpha(s))$ is a differentiable vector-valued function of $s$, and
\[
\frac{d\mathbf{f}}{ds} = \frac{d\mathbf{f}}{dt}\frac{dt}{ds} \tag{1.42}
\]
for any $s$ where the composite function $\mathbf{f}(\alpha(s))$ is defined.

Example 1.42. The following are all equivalent parametrizations of the same curve:

$\mathbf{f}(t) = (\cos t, \sin t, t)$ for $t$ in $[0, 2\pi]$
$\mathbf{g}(s) = (\cos 2s, \sin 2s, 2s)$ for $s$ in $[0, \pi]$
$\mathbf{h}(s) = (\cos 2\pi s, \sin 2\pi s, 2\pi s)$ for $s$ in $[0, 1]$

To see that $\mathbf{g}(s)$ is equivalent to $\mathbf{f}(t)$, define $\alpha : [0, \pi] \to [0, 2\pi]$ by $\alpha(s) = 2s$. Then $\alpha$ is smooth, one-to-one, maps $[0, \pi]$ onto $[0, 2\pi]$, and is strictly increasing (since $\alpha'(s) = 2 > 0$ for all $s$). Likewise, defining $\alpha : [0, 1] \to [0, 2\pi]$ by $\alpha(s) = 2\pi s$ shows that $\mathbf{h}(s)$ is equivalent to $\mathbf{f}(t)$.

A curve can have many parametrizations, with different speeds, so which one is the best to use? In some situations the arc length parametrization can be useful. The idea behind this is to replace the parameter $t$, for any given smooth parametrization $\mathbf{f}(t)$ defined on $[a, b]$, by the parameter $s$ given by
\[
s = s(t) = \int_a^t \|\mathbf{f}'(u)\|\,du. \tag{1.43}
\]
In terms of motion along a curve, $s$ is the distance traveled along the curve after time $t$ has elapsed. So the new parameter will be distance instead of time. There is a natural correspondence between $s$ and $t$: from a starting point on the curve, the distance traveled along the curve (in one direction) is uniquely determined by the amount of time elapsed, and vice versa.

Since $s$ is the arc length of the curve over the interval $[a, t]$ for each $t$ in $[a, b]$, then it is a function of $t$. By the Fundamental Theorem of Calculus, its derivative is
\[
s'(t) = \frac{ds}{dt} = \frac{d}{dt}\int_a^t \|\mathbf{f}'(u)\|\,du = \|\mathbf{f}'(t)\| \quad \text{for all } t \text{ in } [a, b].
\]
Since $\mathbf{f}(t)$ is smooth, then $\|\mathbf{f}'(t)\| > 0$ for all $t$ in $[a, b]$. Thus $s'(t) > 0$ and hence $s(t)$ is strictly increasing on the interval $[a, b]$. Recall that this means that $s$ is a one-to-one mapping of the interval $[a, b]$ onto the interval $[s(a), s(b)]$. But we see that
\[
s(a) = \int_a^a \|\mathbf{f}'(u)\|\,du = 0 \quad \text{and} \quad s(b) = \int_a^b \|\mathbf{f}'(u)\|\,du = L = \text{arc length from } t = a \text{ to } t = b
\]
So the function $s : [a, b] \to [0, L]$ is a one-to-one, differentiable mapping onto the interval $[0, L]$. From single-variable calculus, we know that this means that there exists an inverse function $\alpha : [0, L] \to [a, b]$ that is differentiable and the inverse of $s : [a, b] \to [0, L]$. That is, for each $t$ in $[a, b]$ there is a unique $s$ in $[0, L]$ such that $s = s(t)$ and $t = \alpha(s)$. And we know that the derivative of $\alpha$ is
\[
\alpha'(s) = \frac{1}{s'(\alpha(s))} = \frac{1}{\|\mathbf{f}'(\alpha(s))\|}
\]

Figure 1.9.1 $t = \alpha(s)$

So define the arc length parametrization $\mathbf{f} : [0, L] \to \mathbb{R}^3$ by
\[
\mathbf{f}(s) = \mathbf{f}(\alpha(s)) \quad \text{for all } s \text{ in } [0, L].
\]
Then $\mathbf{f}(s)$ is smooth, by the Chain Rule. In fact, $\mathbf{f}(s)$ has unit speed:
\[
\mathbf{f}'(s) = \mathbf{f}'(\alpha(s))\,\alpha'(s) \ \text{ by the Chain Rule, so } \ \mathbf{f}'(s) = \mathbf{f}'(\alpha(s))\,\frac{1}{\|\mathbf{f}'(\alpha(s))\|}, \ \text{ so}
\]
\[
\|\mathbf{f}'(s)\| = 1 \quad \text{for all } s \text{ in } [0, L].
\]
So the arc length parametrization traverses the curve at a "normal" rate.
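The construction above can also be carried out numerically. The sketch below is only an illustration under stated assumptions: it approximates $s(t)$ with composite Simpson's rule, inverts it by bisection (which relies on $s(t)$ being strictly increasing, as shown above), and uses the helix $(\cos t, \sin t, t)$ on $[0, 2\pi]$ as a test curve. The names `alpha` and `arc_length_param` are hypothetical, and `alpha` assumes the requested $s$ lies in $[0, L]$.

```python
import math

def speed(fprime, t):
    return math.sqrt(sum(c * c for c in fprime(t)))

def arc_length(fprime, a, t, n=200):
    """Approximate s(t), the integral of ||f'(u)|| over [a, t], by Simpson's rule (n even)."""
    h = (t - a) / n
    total = speed(fprime, a) + speed(fprime, t)
    for i in range(1, n):
        total += (4 if i % 2 == 1 else 2) * speed(fprime, a + i * h)
    return total * h / 3

def alpha(fprime, a, b, s, tol=1e-12):
    """Invert s(t) by bisection: find t in [a, b] with arc_length(fprime, a, t) = s."""
    lo, hi = a, b
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if arc_length(fprime, a, mid) < s:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def arc_length_param(f, fprime, a, b, s):
    """Point f(alpha(s)) on the curve at arc length s from f(a)."""
    return f(alpha(fprime, a, b, s))

def helix(t):
    return (math.cos(t), math.sin(t), t)

def helix_prime(t):
    return (-math.sin(t), math.cos(t), 1.0)

if __name__ == "__main__":
    s = 3.0
    print(alpha(helix_prime, 0.0, 2 * math.pi, s), s / math.sqrt(2))
    print(arc_length_param(helix, helix_prime, 0.0, 2 * math.pi, s))
```

For the helix the speed is the constant $\sqrt{2}$, so the inverse should come out close to $s/\sqrt{2}$; for curves whose speed varies, the same routine applies but no such closed form is expected.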

In practice, parametrizing a curve $\mathbf{f}(t)$ by arc length requires you to evaluate the integral $s = \int_a^t \|\mathbf{f}'(u)\|\,du$ in some closed form (as a function of $t$) so that you could then solve for $t$ in terms of $s$. If that can be done, you would then substitute the expression for $t$ in terms of $s$ (which we called $\alpha(s)$) into the formula for $\mathbf{f}(t)$ to get $\mathbf{f}(s)$.

Example 1.43. Parametrize the helix $\mathbf{f}(t) = (\cos t, \sin t, t)$, for $t$ in $[0, 2\pi]$, by arc length.

Solution: By Example 1.41 and formula (1.43), we have
\[
s = \int_0^t \|\mathbf{f}'(u)\|\,du = \int_0^t \sqrt{2}\,du = \sqrt{2}\,t \quad \text{for all } t \text{ in } [0, 2\pi].
\]
So we can solve for $t$ in terms of $s$: $t = \alpha(s) = \frac{s}{\sqrt{2}}$.
\[
\therefore\ \mathbf{f}(s) = \left(\cos\frac{s}{\sqrt{2}}, \sin\frac{s}{\sqrt{2}}, \frac{s}{\sqrt{2}}\right) \quad \text{for all } s \text{ in } [0, 2\sqrt{2}\,\pi].
\]
Note that $\|\mathbf{f}'(s)\| = 1$.

Arc length plays an important role when discussing curvature and moving frame fields, in the field of mathematics known as differential geometry.[16] The methods involve using an arc length parametrization, which often leads to an integral that is either difficult or impossible to evaluate in a simple closed form. The simple integral in Example 1.43 is the exception, not the norm. In general, arc length parametrizations are more useful for theoretical purposes than for practical computations.[17] Curvature and moving frame fields can be defined without using arc length, which makes their computation much easier, and these definitions can be shown to be equivalent to those using arc length. We will leave this to the exercises.

The arc length for curves given in other coordinate systems can also be calculated:

Theorem 1.22. Suppose that $r = r(t)$, $\theta = \theta(t)$ and $z = z(t)$ are the cylindrical coordinates of a curve $\mathbf{f}(t)$, for $t$ in $[a, b]$. Then the arc length $L$ of the curve over $[a, b]$ is
\[
L = \int_a^b \sqrt{r'(t)^2 + r(t)^2\theta'(t)^2 + z'(t)^2}\,dt \tag{1.44}
\]

Proof: The Cartesian coordinates $(x(t), y(t), z(t))$ of a point on the curve are given by
\[
x(t) = r(t)\cos\theta(t), \quad y(t) = r(t)\sin\theta(t), \quad z(t) = z(t)
\]
so differentiating the above expressions for $x(t)$ and $y(t)$ with respect to $t$ gives
\[
x'(t) = r'(t)\cos\theta(t) - r(t)\theta'(t)\sin\theta(t), \quad y'(t) = r'(t)\sin\theta(t) + r(t)\theta'(t)\cos\theta(t)
\]

[16] See O'NEILL for an introduction to elementary differential geometry.
[17] For example, the usual parametrizations of Bézier curves, which we discussed in Section 1.8, are polynomial functions in $\mathbb{R}^3$. This makes their computation relatively simple, which, in CAD, is desirable. But their arc length parametrizations are not only not polynomials, they are in fact usually impossible to calculate at all.

and so
\[
\begin{aligned}
x'(t)^2 + y'(t)^2 &= (r'(t)\cos\theta(t) - r(t)\theta'(t)\sin\theta(t))^2 + (r'(t)\sin\theta(t) + r(t)\theta'(t)\cos\theta(t))^2 \\
&= r'(t)^2(\cos^2\theta + \sin^2\theta) + r(t)^2\theta'(t)^2(\cos^2\theta + \sin^2\theta) \\
&\quad - 2r'(t)r(t)\theta'(t)\cos\theta\sin\theta + 2r'(t)r(t)\theta'(t)\cos\theta\sin\theta \\
&= r'(t)^2 + r(t)^2\theta'(t)^2,
\end{aligned}
\]
and so
\[
L = \int_a^b \sqrt{x'(t)^2 + y'(t)^2 + z'(t)^2}\,dt = \int_a^b \sqrt{r'(t)^2 + r(t)^2\theta'(t)^2 + z'(t)^2}\,dt
\]
QED

Example 1.44. Find the arc length $L$ of the curve whose cylindrical coordinates are $r = e^t$, $\theta = t$ and $z = e^t$, for $t$ over the interval [0, 1].

Solution: Since $r'(t) = e^t$, $\theta'(t) = 1$ and $z'(t) = e^t$, then
\[
\begin{aligned}
L &= \int_0^1 \sqrt{r'(t)^2 + r(t)^2\theta'(t)^2 + z'(t)^2}\,dt \\
&= \int_0^1 \sqrt{e^{2t} + e^{2t}(1) + e^{2t}}\,dt \\
&= \int_0^1 \sqrt{3}\,e^t\,dt = \sqrt{3}\,(e - 1)
\end{aligned}
\]

Exercises

A

For Exercises 1-3, calculate the arc length of $\mathbf{f}(t)$ over the given interval.

1. $\mathbf{f}(t) = (3\cos 2t, 3\sin 2t, 3t)$ on $[0, \pi/2]$
2. $\mathbf{f}(t) = ((t^2 + 1)\cos t, (t^2 + 1)\sin t, 2\sqrt{2}\,t)$ on [0, 1]
3. $\mathbf{f}(t) = (2\cos 3t, 2\sin 3t, 2t^{3/2})$ on [0, 1]

4. Parametrize the curve from Exercise 1 by arc length.

5. Parametrize the curve from Exercise 3 by arc length.

B

6. Let $\mathbf{f}(t)$ be a differentiable curve such that $\mathbf{f}(t) \ne \mathbf{0}$ for all $t$. Show that
\[
\frac{d}{dt}\left(\frac{\mathbf{f}(t)}{\|\mathbf{f}(t)\|}\right) = \frac{\mathbf{f}(t) \times (\mathbf{f}'(t) \times \mathbf{f}(t))}{\|\mathbf{f}(t)\|^3}.
\]

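Exercises 7-9 develop the moving frame field $\mathbf{T}$, $\mathbf{N}$, $\mathbf{B}$ at a point on a curve.

7. Let $\mathbf{f}(t)$ be a smooth curve such that $\mathbf{f}'(t) \ne \mathbf{0}$ for all $t$. Then we can define the unit tangent vector $\mathbf{T}$ by
\[ \mathbf{T}(t) = \frac{\mathbf{f}'(t)}{\|\mathbf{f}'(t)\|} . \]
Show that
\[ \mathbf{T}'(t) = \frac{\mathbf{f}'(t) \times (\mathbf{f}''(t) \times \mathbf{f}'(t))}{\|\mathbf{f}'(t)\|^3} . \]

8. Continuing Exercise 7, assume that $\mathbf{f}'(t)$ and $\mathbf{f}''(t)$ are not parallel. Then $\mathbf{T}'(t) \ne \mathbf{0}$ so we can define the unit principal normal vector $\mathbf{N}$ by
\[ \mathbf{N}(t) = \frac{\mathbf{T}'(t)}{\|\mathbf{T}'(t)\|} . \]
Show that
\[ \mathbf{N}(t) = \frac{\mathbf{f}'(t) \times (\mathbf{f}''(t) \times \mathbf{f}'(t))}{\|\mathbf{f}'(t)\|\ \|\mathbf{f}''(t) \times \mathbf{f}'(t)\|} . \]

9. Continuing Exercise 8, the unit binormal vector $\mathbf{B}$ is defined by $\mathbf{B}(t) = \mathbf{T}(t) \times \mathbf{N}(t)$. Show that
\[ \mathbf{B}(t) = \frac{\mathbf{f}'(t) \times \mathbf{f}''(t)}{\|\mathbf{f}'(t) \times \mathbf{f}''(t)\|} . \]
Note: The vectors $\mathbf{T}(t)$, $\mathbf{N}(t)$ and $\mathbf{B}(t)$ form a right-handed system of mutually perpendicular unit vectors (called orthonormal vectors) at each point on the curve $\mathbf{f}(t)$.

10. Continuing Exercise 9, the curvature $\kappa$ is defined by
\[ \kappa(t) = \frac{\|\mathbf{T}'(t)\|}{\|\mathbf{f}'(t)\|} = \frac{\|\mathbf{f}'(t) \times (\mathbf{f}''(t) \times \mathbf{f}'(t))\|}{\|\mathbf{f}'(t)\|^4} . \]
Show that
\[ \kappa(t) = \frac{\|\mathbf{f}'(t) \times \mathbf{f}''(t)\|}{\|\mathbf{f}'(t)\|^3} \]
and that $\mathbf{T}'(t) = \|\mathbf{f}'(t)\|\,\kappa(t)\,\mathbf{N}(t)$.
Note: $\kappa(t)$ gives a sense of how "curved" the curve $\mathbf{f}(t)$ is at each point.

11. Find $\mathbf{T}$, $\mathbf{N}$, $\mathbf{B}$ and $\kappa$ at each point of the helix $\mathbf{f}(t) = (\cos t, \sin t, t)$.

12. Show that the arc length $L$ of a curve whose spherical coordinates are $\rho = \rho(t)$, $\theta = \theta(t)$ and $\phi = \phi(t)$ for $t$ in an interval $[a, b]$ is
\[ L = \int_a^b \sqrt{\rho'(t)^2 + (\rho(t)^2 \sin^2\phi(t))\,\theta'(t)^2 + \rho(t)^2\,\phi'(t)^2}\;dt . \]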
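As a numerical cross-check of formula (1.44) from Theorem 1.22, the following sketch approximates the arc length of the curve in Example 1.44 and compares it with the closed form $\sqrt{3}\,(e - 1)$. It is only an illustration, not part of the text's method: the helper name `arc_length_cylindrical`, the central-difference step `h` and the use of Simpson's rule are all assumptions chosen for this sketch.

```python
import math

def arc_length_cylindrical(r, theta, z, a, b, n=2000, h=1e-6):
    """Approximate formula (1.44) for a curve given by r(t), theta(t), z(t).

    Derivatives are estimated by central differences (an assumption of this
    sketch) and the integral by the composite Simpson's rule with n even.
    """
    def d(g, t):
        # Central-difference estimate of g'(t).
        return (g(t + h) - g(t - h)) / (2 * h)

    def speed(t):
        # Integrand of (1.44): sqrt(r'^2 + r^2 theta'^2 + z'^2).
        return math.sqrt(d(r, t) ** 2 + (r(t) * d(theta, t)) ** 2 + d(z, t) ** 2)

    step = (b - a) / n
    total = speed(a) + speed(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * speed(a + i * step)
    return total * step / 3

# Example 1.44: r = e^t, theta = t, z = e^t on [0, 1].
L = arc_length_cylindrical(math.exp, lambda t: t, math.exp, 0.0, 1.0)
exact = math.sqrt(3) * (math.e - 1)
print(L, exact)
```

The two printed values should agree to several decimal places. The same helper can be pointed at other cylindrical parametrizations whose integrals have no simple closed form.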

2 Functions of Several Variables

2.1 Functions of Two or Three Variables

In Section 1.8 we discussed vector-valued functions of a single real variable. We will now examine real-valued functions of a point (or vector) in $\mathbb{R}^2$ or $\mathbb{R}^3$. For the most part these functions will be defined on sets of points in $\mathbb{R}^2$, but there will be times when we will use points in $\mathbb{R}^3$, and there will also be times when it will be convenient to think of the points as vectors (or terminal points of vectors).

A real-valued function $f$ defined on a subset $D$ of $\mathbb{R}^2$ is a rule that assigns to each point $(x, y)$ in $D$ a real number $f(x, y)$. The largest possible set $D$ in $\mathbb{R}^2$ on which $f$ is defined is called the domain of $f$, and the range of $f$ is the set of all real numbers $f(x, y)$ as $(x, y)$ varies over the domain $D$. A similar definition holds for functions $f(x, y, z)$ defined on points $(x, y, z)$ in $\mathbb{R}^3$.

Example 2.1. The domain of the function $f(x, y) = xy$ is all of $\mathbb{R}^2$, and the range of $f$ is all of $\mathbb{R}$.

Example 2.2. The domain of the function
\[ f(x, y) = \frac{1}{x - y} \]
is all of $\mathbb{R}^2$ except the points $(x, y)$ for which $x = y$. That is, the domain is the set $D = \{(x, y) : x \ne y\}$. The range of $f$ is all real numbers except 0.

Example 2.3. The domain of the function
\[ f(x, y) = \sqrt{1 - x^2 - y^2} \]
is the set $D = \{(x, y) : x^2 + y^2 \le 1\}$, since the quantity inside the square root is nonnegative if and only if $1 - (x^2 + y^2) \ge 0$. We see that $D$ consists of all points on and inside the unit circle in $\mathbb{R}^2$ ($D$ is sometimes called the closed unit disk). The range of $f$ is the interval $[0, 1]$ in $\mathbb{R}$.

Example 2.4. The domain of the function $f(x, y, z) = e^{x+y-z}$ is all of $\mathbb{R}^3$, and the range of $f$ is all positive real numbers.

A function $f(x, y)$ defined in $\mathbb{R}^2$ is often written as $z = f(x, y)$, as was mentioned in Section 1.1, so that the graph of $f(x, y)$ is the set $\{(x, y, z) : z = f(x, y)\}$ in $\mathbb{R}^3$. So we see that this graph is a surface in $\mathbb{R}^3$, since it satisfies an equation of the form $F(x, y, z) = 0$ (namely, $F(x, y, z) = f(x, y) - z$). The traces of this surface in the planes $z = c$, where $c$ varies over $\mathbb{R}$, are called the level curves of the function. Equivalently, the level curves are the solution sets of the equations $f(x, y) = c$, for $c$ in $\mathbb{R}$. Level curves are often projected onto the $xy$-plane to give an idea of the various "elevation" levels of the surface (as is done in topography).

Example 2.5. The graph of the function
\[ f(x, y) = \frac{\sin\sqrt{x^2 + y^2}}{\sqrt{x^2 + y^2}} \]
is shown below. Note that the level curves (shown both on the surface and projected onto the $xy$-plane) are groups of concentric circles.

[Figure 2.1.1: The function $f(x, y) = \dfrac{\sin\sqrt{x^2+y^2}}{\sqrt{x^2+y^2}}$]

You may be wondering what happens to the function in Example 2.5 at the point $(x, y) = (0, 0)$, since both the numerator and denominator are 0 at that point. The function is not defined at $(0, 0)$, but the limit of the function exists (and equals 1) as $(x, y)$ approaches $(0, 0)$. We will now state explicitly what is meant by the limit of a function of two variables.

Definition 2.1. Let $(a, b)$ be a point in $\mathbb{R}^2$, and let $f(x, y)$ be a real-valued function defined on some set containing $(a, b)$ (but not necessarily defined at $(a, b)$ itself). Then we say that the limit of $f(x, y)$ equals $L$ as $(x, y)$ approaches $(a, b)$, written as
\[ \lim_{(x,y)\to(a,b)} f(x, y) = L , \qquad (2.1) \]
if given any $\epsilon > 0$, there exists a $\delta > 0$ such that
\[ |f(x, y) - L| < \epsilon \quad \text{whenever} \quad 0 < \sqrt{(x - a)^2 + (y - b)^2} < \delta . \]

A similar definition can be made for functions of three variables. The idea behind the above definition is that the values of $f(x, y)$ can get arbitrarily close to $L$ (i.e. within $\epsilon$ of $L$) if we pick $(x, y)$ sufficiently close to $(a, b)$ (i.e. inside a circle centered at $(a, b)$ with some sufficiently small radius $\delta$).

If you recall the "epsilon-delta" proofs of limits of real-valued functions of a single variable, you may remember how awkward they can be, and how they can usually only be done easily for simple functions. In general, the multivariable cases are at least equally awkward to go through, so we will not bother with such proofs. Instead, we will simply state that when the function $f(x, y)$ is given by a single formula and is defined at the point $(a, b)$ (e.g. is not some indeterminate form like 0/0) then you can just substitute $(x, y) = (a, b)$ into the formula for $f(x, y)$ to find the limit.

Example 2.6.
\[ \lim_{(x,y)\to(1,2)} \frac{xy}{x^2 + y^2} = \frac{(1)(2)}{1^2 + 2^2} = \frac{2}{5} \]
since $f(x, y) = \dfrac{xy}{x^2 + y^2}$ is properly defined at the point $(1, 2)$.

The major difference between limits in one variable and limits in two or more variables has to do with how a point is approached. In the single-variable case, the statement "$x \to a$" means that $x$ gets closer to the value $a$ from two possible directions along the real number line (see Figure 2.1.2(a)). In two dimensions, however, $(x, y)$ can approach a point $(a, b)$ along an infinite number of paths (see Figure 2.1.2(b)).

[Figure 2.1.2: "Approaching" a point in different dimensions. (a) $x \to a$ in $\mathbb{R}$; (b) $(x, y) \to (a, b)$ in $\mathbb{R}^2$]

Example 2.7.
\[ \lim_{(x,y)\to(0,0)} \frac{xy}{x^2 + y^2} \quad \text{does not exist} \]
Note that we can not simply substitute $(x, y) = (0, 0)$ into the function, since doing so gives an indeterminate form 0/0. To show that the limit does not exist, we will show that the function approaches different values as $(x, y)$ approaches $(0, 0)$ along different paths in $\mathbb{R}^2$. To see this, suppose that $(x, y) \to (0, 0)$ along the positive $x$-axis, so that $y = 0$ along that path. Then
\[ f(x, y) = \frac{xy}{x^2 + y^2} = \frac{x \cdot 0}{x^2 + 0^2} = 0 \]
along that path (since $x > 0$ in the denominator). But if $(x, y) \to (0, 0)$ along the straight line $y = x$ through the origin, for $x > 0$, then we see that
\[ f(x, y) = \frac{xy}{x^2 + y^2} = \frac{x^2}{x^2 + x^2} = \frac{1}{2} \]
which means that $f(x, y)$ approaches different values as $(x, y) \to (0, 0)$ along different paths. Hence the limit does not exist.

Limits of real-valued multivariable functions obey the same algebraic rules as in the single-variable case, as shown in the following theorem, which we state without proof.

Theorem 2.1. Suppose that $\lim\limits_{(x,y)\to(a,b)} f(x, y)$ and $\lim\limits_{(x,y)\to(a,b)} g(x, y)$ both exist, and that $k$ is some scalar. Then:
(a) $\lim\limits_{(x,y)\to(a,b)} [f(x, y) \pm g(x, y)] = \lim\limits_{(x,y)\to(a,b)} f(x, y) \pm \lim\limits_{(x,y)\to(a,b)} g(x, y)$
(b) $\lim\limits_{(x,y)\to(a,b)} k\,f(x, y) = k \lim\limits_{(x,y)\to(a,b)} f(x, y)$
(c) $\lim\limits_{(x,y)\to(a,b)} [f(x, y)\,g(x, y)] = \Bigl( \lim\limits_{(x,y)\to(a,b)} f(x, y) \Bigr) \Bigl( \lim\limits_{(x,y)\to(a,b)} g(x, y) \Bigr)$
(d) $\lim\limits_{(x,y)\to(a,b)} \dfrac{f(x, y)}{g(x, y)} = \dfrac{\lim\limits_{(x,y)\to(a,b)} f(x, y)}{\lim\limits_{(x,y)\to(a,b)} g(x, y)}$ if $\lim\limits_{(x,y)\to(a,b)} g(x, y) \ne 0$
(e) If $|f(x, y) - L| \le g(x, y)$ for all $(x, y)$ and if $\lim\limits_{(x,y)\to(a,b)} g(x, y) = 0$, then $\lim\limits_{(x,y)\to(a,b)} f(x, y) = L$.

Note that in part (e), it suffices to have $|f(x, y) - L| \le g(x, y)$ for all $(x, y)$ "sufficiently close" to $(a, b)$ (but excluding $(a, b)$ itself).
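The path argument of Example 2.7 can also be seen numerically. The sketch below evaluates $f(x, y) = xy/(x^2 + y^2)$ at points approaching the origin along the positive $x$-axis and along the line $y = x$; the helper name `values_along_path` and the particular sample points are choices made for this illustration only.

```python
def f(x, y):
    # The function from Example 2.7; it is undefined at (0, 0).
    return x * y / (x**2 + y**2)

def values_along_path(path, ts):
    # Evaluate f at the points path(t) for each parameter value t.
    return [f(*path(t)) for t in ts]

# Parameter values shrinking toward 0 (never equal to 0).
ts = [10.0 ** (-k) for k in range(1, 8)]

along_x_axis = values_along_path(lambda t: (t, 0.0), ts)   # path y = 0, x > 0
along_diagonal = values_along_path(lambda t: (t, t), ts)   # path y = x, x > 0

print(along_x_axis)
print(along_diagonal)
```

The values stay at 0 along the first path and at 1/2 along the second, no matter how close the points get to the origin. A numerical experiment like this can suggest that a limit fails to exist, but agreement along a few paths would not prove that a limit does exist.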

Example 2.8. Show that
\[ \lim_{(x,y)\to(0,0)} \frac{y^4}{x^2 + y^2} = 0 . \]
Since substituting $(x, y) = (0, 0)$ into the function gives the indeterminate form 0/0, we need an alternate method for evaluating this limit. We will use Theorem 2.1(e). First, notice that $y^4 = \bigl(\sqrt{y^2}\bigr)^4$ and so $0 \le y^4 \le \bigl(\sqrt{x^2 + y^2}\bigr)^4$ for all $(x, y)$. But $\bigl(\sqrt{x^2 + y^2}\bigr)^4 = (x^2 + y^2)^2$. Thus, for all $(x, y) \ne (0, 0)$ we have
\[ \left| \frac{y^4}{x^2 + y^2} \right| \le \frac{(x^2 + y^2)^2}{x^2 + y^2} = x^2 + y^2 \to 0 \quad \text{as } (x, y) \to (0, 0). \]
Therefore $\lim\limits_{(x,y)\to(0,0)} \dfrac{y^4}{x^2 + y^2} = 0$.

Continuity can be defined similarly as in the single-variable case.

Definition 2.2. A real-valued function $f(x, y)$ with domain $D$ in $\mathbb{R}^2$ is continuous at the point $(a, b)$ in $D$ if $\lim\limits_{(x,y)\to(a,b)} f(x, y) = f(a, b)$. We say that $f(x, y)$ is a continuous function if it is continuous at every point in its domain $D$.

Unless indicated otherwise, you can assume that all the functions we deal with are continuous. In fact, we can modify the function from Example 2.8 so that it is continuous on all of $\mathbb{R}^2$.

Example 2.9. Define a function $f(x, y)$ on all of $\mathbb{R}^2$ as follows:
\[ f(x, y) = \begin{cases} 0 & \text{if } (x, y) = (0, 0) \\[4pt] \dfrac{y^4}{x^2 + y^2} & \text{if } (x, y) \ne (0, 0) \end{cases} \]
Then $f(x, y)$ is well-defined for all $(x, y)$ in $\mathbb{R}^2$ (i.e. there are no indeterminate forms for any $(x, y)$), and we see that
\[ \lim_{(x,y)\to(a,b)} f(x, y) = \frac{b^4}{a^2 + b^2} = f(a, b) \quad \text{for } (a, b) \ne (0, 0). \]
So since
\[ \lim_{(x,y)\to(0,0)} f(x, y) = 0 = f(0, 0) \]
by Example 2.8, then $f(x, y)$ is continuous on all of $\mathbb{R}^2$.
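The bound used in Example 2.8 can be checked numerically as well. The following sketch samples $y^4/(x^2 + y^2)$ on circles of shrinking radius $r$ around the origin and records the largest sampled value on each circle, which by the inequality above should not exceed $r^2$. The helper name `max_on_circle`, the number of samples and the chosen radii are assumptions of this illustration.

```python
import math

def f(x, y):
    # The function from Example 2.8, for (x, y) != (0, 0).
    return y**4 / (x**2 + y**2)

def max_on_circle(radius, samples=360):
    # Largest sampled value of f on the circle x^2 + y^2 = radius^2.
    worst = 0.0
    for k in range(samples):
        angle = 2 * math.pi * k / samples
        x, y = radius * math.cos(angle), radius * math.sin(angle)
        worst = max(worst, f(x, y))
    return worst

radii = [10.0 ** (-k) for k in range(1, 6)]
maxima = [max_on_circle(r) for r in radii]

for r, m in zip(radii, maxima):
    # Each sampled maximum should be bounded by r^2 = x^2 + y^2.
    print(r, m, r**2)
```

The sampled maxima shrink like $r^2$, in line with the squeeze argument of Theorem 2.1(e). This is evidence only, since a finite sample cannot replace the inequality itself.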

Exercises

A

For Exercises 1-6, state the domain and range of the given function.
1. $f(x, y) = x^2 + y^2 - 1$
2. $f(x, y) = \dfrac{1}{x^2 + y^2}$
3. $f(x, y) = \sqrt{x^2 + y^2 - 4}$
4. $f(x, y) = \dfrac{x^2 + 1}{y}$
5. $f(x, y, z) = \sin(xyz)$
6. $f(x, y, z) = \sqrt{(x - 1)(yz - 1)}$

For Exercises 7-18, evaluate the given limit.
7. $\lim\limits_{(x,y)\to(0,0)} \cos(xy)$
8. $\lim\limits_{(x,y)\to(0,0)} e^{xy}$
9. $\lim\limits_{(x,y)\to(0,0)} \dfrac{x^2 - y^2}{x^2 + y^2}$
10. $\lim\limits_{(x,y)\to(0,0)} \dfrac{xy^2}{x^2 + y^4}$
11. $\lim\limits_{(x,y)\to(1,-1)} \dfrac{x^2 - 2xy + y^2}{x - y}$
12. $\lim\limits_{(x,y)\to(0,0)} \dfrac{xy^2}{x^2 + y^2}$
13. $\lim\limits_{(x,y)\to(1,1)} \dfrac{x^2 - y^2}{x - y}$
14. $\lim\limits_{(x,y)\to(0,0)} \dfrac{x^2 - 2xy + y^2}{x - y}$
15. $\lim\limits_{(x,y)\to(0,0)} \dfrac{y^4 \sin(xy)}{x^2 + y^2}$
16. $\lim\limits_{(x,y)\to(0,0)} (x^2 + y^2) \cos\left( \dfrac{1}{xy} \right)$
17. $\lim\limits_{(x,y)\to(0,0)} \dfrac{x}{y}$
18. $\lim\limits_{(x,y)\to(0,0)} \cos\left( \dfrac{1}{xy} \right)$

B

19. Show that $f(x, y) = \dfrac{1}{2\pi\sigma^2}\, e^{-(x^2 + y^2)/2\sigma^2}$, for $\sigma > 0$, is constant on the circle of radius $r > 0$ centered at the origin. This function is called a Gaussian blur, and is used as a filter in image processing software to produce a "blurred" effect.
20. Suppose that $f(x, y) \le f(y, x)$ for all $(x, y)$ in $\mathbb{R}^2$. Show that $f(x, y) = f(y, x)$ for all $(x, y)$ in $\mathbb{R}^2$.
21. Use the substitution $r = \sqrt{x^2 + y^2}$ to show that
\[ \lim_{(x,y)\to(0,0)} \frac{\sin\sqrt{x^2 + y^2}}{\sqrt{x^2 + y^2}} = 1 . \]
(Hint: You will need to use L'Hôpital's Rule for single-variable limits.)

C

22. Prove Theorem 2.1(a) in the case of addition. (Hint: Use Definition 2.1.)
23. Prove Theorem 2.1(b).

2.2 Partial Derivatives

Now that we have an idea of what functions of several variables are, and what a limit of such a function is, we can start to develop an idea of a derivative of a function of two or more variables. We will start with the notion of a partial derivative.

Definition 2.3. Let $f(x, y)$ be a real-valued function with domain $D$ in $\mathbb{R}^2$, and let $(a, b)$ be a point in $D$. Then the partial derivative of $f$ at $(a, b)$ with respect to $x$, denoted by $\dfrac{\partial f}{\partial x}(a, b)$, is defined as
\[ \frac{\partial f}{\partial x}(a, b) = \lim_{h \to 0} \frac{f(a + h, b) - f(a, b)}{h} \qquad (2.2) \]
and the partial derivative of $f$ at $(a, b)$ with respect to $y$, denoted by $\dfrac{\partial f}{\partial y}(a, b)$, is defined as
\[ \frac{\partial f}{\partial y}(a, b) = \lim_{h \to 0} \frac{f(a, b + h) - f(a, b)}{h} . \qquad (2.3) \]
Note: The symbol $\partial$ is pronounced "del".[1]

Recall that the derivative of a function $f(x)$ can be interpreted as the rate of change of that function in the (positive) $x$ direction. From the definitions above, we can see that the partial derivative of a function $f(x, y)$ with respect to $x$ or $y$ is the rate of change of $f(x, y)$ in the (positive) $x$ or $y$ direction, respectively. What this means is that the partial derivative of a function $f(x, y)$ with respect to $x$ can be calculated by treating the $y$ variable as a constant, and then simply differentiating $f(x, y)$ as if it were a function of $x$ alone, using the usual rules from single-variable calculus. Likewise, the partial derivative of $f(x, y)$ with respect to $y$ is obtained by treating the $x$ variable as a constant and then differentiating $f(x, y)$ as if it were a function of $y$ alone.

Example 2.10. Find $\dfrac{\partial f}{\partial x}(x, y)$ and $\dfrac{\partial f}{\partial y}(x, y)$ for the function $f(x, y) = x^2 y + y^3$.

Solution: Treating $y$ as a constant and differentiating $f(x, y)$ with respect to $x$ gives
\[ \frac{\partial f}{\partial x}(x, y) = 2xy \]
and treating $x$ as a constant and differentiating $f(x, y)$ with respect to $y$ gives
\[ \frac{\partial f}{\partial y}(x, y) = x^2 + 3y^2 . \]

[1] It is not a Greek letter. The symbol was first used by the mathematicians A. Clairaut and L. Euler around 1740, to distinguish it from the letter $d$ used for the "usual" derivative.
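To connect the limit definitions (2.2) and (2.3) with the formulas found in Example 2.10, the sketch below evaluates the difference quotients for $f(x, y) = x^2 y + y^3$ at the point $(1, 2)$ for a decreasing sequence of $h$ values. The point $(1, 2)$, the step sizes and the helper names `quotient_x` and `quotient_y` are choices made for this illustration.

```python
def f(x, y):
    # The function from Example 2.10.
    return x**2 * y + y**3

def quotient_x(f, a, b, h):
    # Difference quotient from definition (2.2).
    return (f(a + h, b) - f(a, b)) / h

def quotient_y(f, a, b, h):
    # Difference quotient from definition (2.3).
    return (f(a, b + h) - f(a, b)) / h

a, b = 1.0, 2.0
steps = [10.0 ** (-k) for k in range(1, 7)]

approx_x = [quotient_x(f, a, b, h) for h in steps]
approx_y = [quotient_y(f, a, b, h) for h in steps]

exact_x = 2 * a * b            # df/dx = 2xy from Example 2.10
exact_y = a**2 + 3 * b**2      # df/dy = x^2 + 3y^2 from Example 2.10

print(approx_x, exact_x)
print(approx_y, exact_y)
```

As $h$ shrinks, the quotients settle toward the values given by the formulas $2xy$ and $x^2 + 3y^2$. For very small $h$ floating-point rounding would eventually dominate, which is why the sketch stops at $h = 10^{-6}$.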

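We will often simply write $\dfrac{\partial f}{\partial x}$ and $\dfrac{\partial f}{\partial y}$ instead of $\dfrac{\partial f}{\partial x}(x, y)$ and $\dfrac{\partial f}{\partial y}(x, y)$.

Example 2.11. Find $\dfrac{\partial f}{\partial x}$ and $\dfrac{\partial f}{\partial y}$ for the function $f(x, y) = \dfrac{\sin(xy^2)}{x^2 + 1}$.

Solution: Treating $y$ as a constant and differentiating $f(x, y)$ with respect to $x$ gives
\[ \frac{\partial f}{\partial x} = \frac{(x^2 + 1)(y^2 \cos(xy^2)) - (2x)\sin(xy^2)}{(x^2 + 1)^2} \]
and treating $x$ as a constant and differentiating $f(x, y)$ with respect to $y$ gives
\[ \frac{\partial f}{\partial y} = \frac{2xy \cos(xy^2)}{x^2 + 1} . \]

Since both $\dfrac{\partial f}{\partial x}$ and $\dfrac{\partial f}{\partial y}$ are themselves functions of $x$ and $y$, we can take their partial derivatives with respect to $x$ and $y$. This yields the higher-order partial derivatives:
\[ \begin{aligned} \frac{\partial^2 f}{\partial x^2} &= \frac{\partial}{\partial x}\left( \frac{\partial f}{\partial x} \right) & \frac{\partial^2 f}{\partial y^2} &= \frac{\partial}{\partial y}\left( \frac{\partial f}{\partial y} \right) \\ \frac{\partial^2 f}{\partial y\,\partial x} &= \frac{\partial}{\partial y}\left( \frac{\partial f}{\partial x} \right) & \frac{\partial^2 f}{\partial x\,\partial y} &= \frac{\partial}{\partial x}\left( \frac{\partial f}{\partial y} \right) \\ \frac{\partial^3 f}{\partial x^3} &= \frac{\partial}{\partial x}\left( \frac{\partial^2 f}{\partial x^2} \right) & \frac{\partial^3 f}{\partial y^3} &= \frac{\partial}{\partial y}\left( \frac{\partial^2 f}{\partial y^2} \right) \\ \frac{\partial^3 f}{\partial y\,\partial x^2} &= \frac{\partial}{\partial y}\left( \frac{\partial^2 f}{\partial x^2} \right) & \frac{\partial^3 f}{\partial x\,\partial y^2} &= \frac{\partial}{\partial x}\left( \frac{\partial^2 f}{\partial y^2} \right) \\ \frac{\partial^3 f}{\partial y^2\,\partial x} &= \frac{\partial}{\partial y}\left( \frac{\partial^2 f}{\partial y\,\partial x} \right) & \frac{\partial^3 f}{\partial x^2\,\partial y} &= \frac{\partial}{\partial x}\left( \frac{\partial^2 f}{\partial x\,\partial y} \right) \\ \frac{\partial^3 f}{\partial x\,\partial y\,\partial x} &= \frac{\partial}{\partial x}\left( \frac{\partial^2 f}{\partial y\,\partial x} \right) & \frac{\partial^3 f}{\partial y\,\partial x\,\partial y} &= \frac{\partial}{\partial y}\left( \frac{\partial^2 f}{\partial x\,\partial y} \right) \\ &\ \ \vdots & &\ \ \vdots \end{aligned} \]

Example 2.12. Find the partial derivatives $\dfrac{\partial f}{\partial x}$, $\dfrac{\partial f}{\partial y}$, $\dfrac{\partial^2 f}{\partial x^2}$, $\dfrac{\partial^2 f}{\partial y^2}$, $\dfrac{\partial^2 f}{\partial y\,\partial x}$ and $\dfrac{\partial^2 f}{\partial x\,\partial y}$ for the function $f(x, y) = e^{x^2 y} + xy^3$.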

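Solution: Proceeding as before, we have
\[ \begin{aligned} \frac{\partial f}{\partial x} &= 2xy\,e^{x^2 y} + y^3 & \frac{\partial f}{\partial y} &= x^2 e^{x^2 y} + 3xy^2 \\ \frac{\partial^2 f}{\partial x^2} &= \frac{\partial}{\partial x}\bigl( 2xy\,e^{x^2 y} + y^3 \bigr) & \frac{\partial^2 f}{\partial y^2} &= \frac{\partial}{\partial y}\bigl( x^2 e^{x^2 y} + 3xy^2 \bigr) \\ &= 2y\,e^{x^2 y} + 4x^2 y^2 e^{x^2 y} & &= x^4 e^{x^2 y} + 6xy \\ \frac{\partial^2 f}{\partial y\,\partial x} &= \frac{\partial}{\partial y}\bigl( 2xy\,e^{x^2 y} + y^3 \bigr) & \frac{\partial^2 f}{\partial x\,\partial y} &= \frac{\partial}{\partial x}\bigl( x^2 e^{x^2 y} + 3xy^2 \bigr) \\ &= 2x\,e^{x^2 y} + 2x^3 y\,e^{x^2 y} + 3y^2 & &= 2x\,e^{x^2 y} + 2x^3 y\,e^{x^2 y} + 3y^2 \end{aligned} \]

Higher-order partial derivatives that are taken with respect to different variables, such as $\dfrac{\partial^2 f}{\partial y\,\partial x}$ and $\dfrac{\partial^2 f}{\partial x\,\partial y}$, are called mixed partial derivatives. Notice in the above example that $\dfrac{\partial^2 f}{\partial y\,\partial x} = \dfrac{\partial^2 f}{\partial x\,\partial y}$. It turns out that this will usually be the case. Specifically, whenever both $\dfrac{\partial^2 f}{\partial y\,\partial x}$ and $\dfrac{\partial^2 f}{\partial x\,\partial y}$ are continuous at a point $(a, b)$, then they are equal at that point.[2] All the functions we will deal with will have continuous partial derivatives of all orders, so you can assume in the remainder of the text that
\[ \frac{\partial^2 f}{\partial y\,\partial x} = \frac{\partial^2 f}{\partial x\,\partial y} \quad \text{for all } (x, y) \text{ in the domain of } f. \]
In other words, it doesn't matter in which order you take partial derivatives. This applies even to mixed partial derivatives of order 3 or higher.

The notation for partial derivatives varies. All of the following are equivalent:
\[ \begin{aligned} \frac{\partial f}{\partial x} &: \ f_x(x, y),\ f_1(x, y),\ D_x(x, y),\ D_1(x, y) \\ \frac{\partial f}{\partial y} &: \ f_y(x, y),\ f_2(x, y),\ D_y(x, y),\ D_2(x, y) \\ \frac{\partial^2 f}{\partial x^2} &: \ f_{xx}(x, y),\ f_{11}(x, y),\ D_{xx}(x, y),\ D_{11}(x, y) \\ \frac{\partial^2 f}{\partial y^2} &: \ f_{yy}(x, y),\ f_{22}(x, y),\ D_{yy}(x, y),\ D_{22}(x, y) \\ \frac{\partial^2 f}{\partial y\,\partial x} &: \ f_{xy}(x, y),\ f_{12}(x, y),\ D_{xy}(x, y),\ D_{12}(x, y) \\ \frac{\partial^2 f}{\partial x\,\partial y} &: \ f_{yx}(x, y),\ f_{21}(x, y),\ D_{yx}(x, y),\ D_{21}(x, y) \end{aligned} \]

[2] See pp. 214-216 in Taylor and Mann for a proof.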
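The equality of the two mixed partial derivatives in Example 2.12 can be spot-checked numerically. In the sketch below, the first-order partials are taken from the solution above, and each is then differentiated once more by a central difference (a numerical stand-in for the exact derivative, chosen for this illustration). The sample point $(0.5, 1.5)$, the step size and the function names are likewise assumptions of the sketch.

```python
import math

def f_x(x, y):
    # df/dx for f(x, y) = e^(x^2 y) + x y^3, from Example 2.12.
    return 2 * x * y * math.exp(x**2 * y) + y**3

def f_y(x, y):
    # df/dy for the same function.
    return x**2 * math.exp(x**2 * y) + 3 * x * y**2

def f_mixed(x, y):
    # Closed form found in Example 2.12 for both mixed partials.
    e = math.exp(x**2 * y)
    return 2 * x * e + 2 * x**3 * y * e + 3 * y**2

def central_diff(g, t, h=1e-5):
    # Central-difference estimate of g'(t) for a single-variable function g.
    return (g(t + h) - g(t - h)) / (2 * h)

a, b = 0.5, 1.5
d_dy_of_fx = central_diff(lambda y: f_x(a, y), b)   # approximates d/dy (df/dx)
d_dx_of_fy = central_diff(lambda x: f_y(x, b), a)   # approximates d/dx (df/dy)

print(d_dy_of_fx, d_dx_of_fy, f_mixed(a, b))
```

Both numerical estimates should agree with each other and with the closed form to many decimal places, as expected for a function whose second-order partial derivatives are continuous.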

28. ∂x2 ∂y2 15. f (x. f (x. y) = x + 2y 26. f (x. y) = x2 + y + 4 and ∂2 f ∂y ∂x (use Exercises 1-8. y) = xy + 1 x+y x2 + y + 4 2 +y2 ) 13. y) = x + 2y 10. Show that the function f (x. ¨ 2. f (x. f (x. f (x. 15). f (x. find 17. 1 in W EINBERGER. y) = x2 + y + 4 ∂f ∂x Exercises © and ∂f ∂y . y) = ln(xy) 16. y) = e−(x 14. f (x. y) = u(x + cy) + v(x − cy) is a solution of the general one-dimensional wave equation3 ∂2 f 1 ∂2 f − 2 2 =0. y) = x2 − y2 + 6xy + 4x − 8y + 2 8. y) = x2 − y2 + 6xy + 4x − 8y + 2 24. f (x. f (x. . f (x. y) = sin(x + y) + cos(x − y) satisfies the wave equation ∂2 f ∂2 f − =0. 14. y) = x+1 y+1 5. f (x. y) = cos(x + y) 4. y) = cos(x + y) 20. f (x. y) = x+1 y+1 21. find 1. f (x. f (x. FUNCTIONS OF SEVERAL VARIABLES A For Exercises 1-16. f (x. f (x. Show that f (x. f (x. y) = x4 25. and let c 0 be a constant. y) = e xy + xy 7. f (x. y) = x4 9. f (x. y) = sin(x + y) 12. y) = sin(xy) For Exercises 17-26. y) = x2 + y2 19. y) = 11. See Ch. f (x. f (x. y) = x2 + y2 3 6. y) = tan(x + y) ∂2 f ∂2 f . f (x. y) = sin(xy) B 27. y) = x2 + y2 3. ∂x2 c ∂y 3 Conversely. f (x. 18. y) = e xy + xy 23. it turns out that any solution must be of this form. Let u and v be twice-differentiable functions of a single variable. ∂x2 ∂y2 The wave equation is an example of a partial differential equation.74 CHAPTER 2. f (x. f (x. y) = ln(xy) 22.

2.3 Tangent Plane to a Surface

In the previous section we mentioned that the partial derivatives $\dfrac{\partial f}{\partial x}$ and $\dfrac{\partial f}{\partial y}$ can be thought of as the rate of change of a function $z = f(x, y)$ in the positive $x$ and $y$ directions, respectively. Recall that the derivative $\dfrac{dy}{dx}$ of a function $y = f(x)$ has a geometric meaning, namely as the slope of the tangent line to the graph of $f$ at the point $(x, f(x))$ in $\mathbb{R}^2$. There is a similar geometric meaning to the partial derivatives $\dfrac{\partial f}{\partial x}$ and $\dfrac{\partial f}{\partial y}$ of a function $z = f(x, y)$: given a point $(a, b)$ in the domain $D$ of $f(x, y)$, the trace of the surface described by $z = f(x, y)$ in the plane $y = b$ is a curve in $\mathbb{R}^3$ through the point $(a, b, f(a, b))$, and the slope of the tangent line $L_x$ to that curve at that point is $\dfrac{\partial f}{\partial x}(a, b)$. Similarly, $\dfrac{\partial f}{\partial y}(a, b)$ is the slope of the tangent line $L_y$ to the trace of the surface $z = f(x, y)$ in the plane $x = a$ (see Figure 2.3.1).

[Figure 2.3.1: Partial derivatives as slopes. (a) Tangent line $L_x$ in the plane $y = b$; (b) Tangent line $L_y$ in the plane $x = a$]

Since the derivative $\dfrac{dy}{dx}$ of a function $y = f(x)$ is used to find the tangent line to the graph of $f$ (which is a curve in $\mathbb{R}^2$), you might expect that partial derivatives can be used to define a tangent plane to the graph of a surface $z = f(x, y)$. This indeed turns out to be the case. First, we need a definition of a tangent plane. The intuitive idea is that a tangent plane "just touches" a surface at a point. The formal definition mimics the intuitive notion of a tangent line to a curve.

Definition 2.4. Let $z = f(x, y)$ be the equation of a surface $S$ in $\mathbb{R}^3$, and let $P = (a, b, c)$ be a point on $S$. Let $T$ be a plane which contains the point $P$, and let $Q = (x, y, z)$ represent a generic point on the surface $S$. If the (acute) angle between the vector $\overrightarrow{PQ}$ and the plane $T$ approaches zero as the point $Q$ approaches $P$ along the surface $S$, then we call $T$ the tangent plane to $S$ at $P$.

Note that since two lines in $\mathbb{R}^3$ determine a plane, then the two tangent lines to the surface $z = f(x, y)$ in the $x$ and $y$ directions described in Figure 2.3.1 are contained in the tangent plane at that point, if the tangent plane exists at that point. The existence of those two tangent lines does not by itself guarantee the existence of the tangent plane. It is possible that if we take the trace of the surface in the plane $x - y = 0$ (which makes a $45^\circ$ angle with the positive $x$-axis), the resulting curve in that plane may have a tangent line which is not in the plane determined by the other two tangent lines, or it may not have a tangent line at all at that point. Luckily, it turns out[4] that if $\dfrac{\partial f}{\partial x}$ and $\dfrac{\partial f}{\partial y}$ exist in a region around a point $(a, b)$ and are continuous at $(a, b)$ then the tangent plane to the surface $z = f(x, y)$ will exist at the point $(a, b, f(a, b))$. In this text, those conditions will always hold.

Suppose that we want an equation of the tangent plane $T$ to the surface $z = f(x, y)$ at a point $(a, b, f(a, b))$. Let $L_x$ and $L_y$ be the tangent lines to the traces of the surface in the planes $y = b$ and $x = a$, respectively (as in Figure 2.3.2), and suppose that the conditions for $T$ to exist do hold. Then the equation for $T$ is
\[ A(x - a) + B(y - b) + C(z - f(a, b)) = 0 \qquad (2.4) \]
where $\mathbf{n} = (A, B, C)$ is a normal vector to the plane $T$. Since $T$ contains the lines $L_x$ and $L_y$, then all we need are vectors $\mathbf{v}_x$ and $\mathbf{v}_y$ that are parallel to $L_x$ and $L_y$, respectively, and then let $\mathbf{n} = \mathbf{v}_x \times \mathbf{v}_y$.

[Figure 2.3.2: Tangent plane]

Since the slope of $L_x$ is $\dfrac{\partial f}{\partial x}(a, b)$, then the vector $\mathbf{v}_x = \bigl(1, 0, \dfrac{\partial f}{\partial x}(a, b)\bigr)$ is parallel to $L_x$ (since $\mathbf{v}_x$ lies in the $xz$-plane and lies in a line with slope $\dfrac{\frac{\partial f}{\partial x}(a, b)}{1} = \dfrac{\partial f}{\partial x}(a, b)$. See Figure 2.3.3). Similarly, the vector $\mathbf{v}_y = \bigl(0, 1, \dfrac{\partial f}{\partial y}(a, b)\bigr)$ is parallel to $L_y$. Hence, the vector
\[ \mathbf{n} = \mathbf{v}_x \times \mathbf{v}_y = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ 1 & 0 & \frac{\partial f}{\partial x}(a, b) \\ 0 & 1 & \frac{\partial f}{\partial y}(a, b) \end{vmatrix} = -\frac{\partial f}{\partial x}(a, b)\,\mathbf{i} - \frac{\partial f}{\partial y}(a, b)\,\mathbf{j} + \mathbf{k} \]
is normal to the plane $T$. Thus the equation of $T$ is
\[ -\frac{\partial f}{\partial x}(a, b)\,(x - a) - \frac{\partial f}{\partial y}(a, b)\,(y - b) + z - f(a, b) = 0 . \qquad (2.5) \]

[Figure 2.3.3: The vector $\mathbf{v}_x = \bigl(1, 0, \frac{\partial f}{\partial x}(a, b)\bigr)$ in the $xz$-plane, with slope $\frac{\partial f}{\partial x}(a, b)$]

Multiplying both sides by $-1$, we have the following result:

The equation of the tangent plane to the surface $z = f(x, y)$ at the point $(a, b, f(a, b))$ is
\[ \frac{\partial f}{\partial x}(a, b)\,(x - a) + \frac{\partial f}{\partial y}(a, b)\,(y - b) - z + f(a, b) = 0 \qquad (2.6) \]

[4] See Taylor and Mann, § 6.4.

Example 2.13. Find the equation of the tangent plane to the surface $z = x^2 + y^2$ at the point $(1, 2, 5)$.

Solution: For the function $f(x, y) = x^2 + y^2$, we have $\dfrac{\partial f}{\partial x} = 2x$ and $\dfrac{\partial f}{\partial y} = 2y$, so the equation of the tangent plane at the point $(1, 2, 5)$ is
\[ 2(1)(x - 1) + 2(2)(y - 2) - z + 5 = 0 , \text{ or} \]
\[ 2x + 4y - z - 5 = 0 . \]

In a similar fashion, it can be shown that if a surface is defined implicitly by an equation of the form $F(x, y, z) = 0$, then the tangent plane to the surface at a point $(a, b, c)$ is given by the equation
\[ \frac{\partial F}{\partial x}(a, b, c)\,(x - a) + \frac{\partial F}{\partial y}(a, b, c)\,(y - b) + \frac{\partial F}{\partial z}(a, b, c)\,(z - c) = 0 . \qquad (2.7) \]
Note that formula (2.6) is the special case of formula (2.7) where $F(x, y, z) = f(x, y) - z$.

Example 2.14. Find the equation of the tangent plane to the surface $x^2 + y^2 + z^2 = 9$ at the point $(2, 2, -1)$.

Solution: For the function $F(x, y, z) = x^2 + y^2 + z^2 - 9$, we have $\dfrac{\partial F}{\partial x} = 2x$, $\dfrac{\partial F}{\partial y} = 2y$, and $\dfrac{\partial F}{\partial z} = 2z$, so the equation of the tangent plane at $(2, 2, -1)$ is
\[ 2(2)(x - 2) + 2(2)(y - 2) + 2(-1)(z + 1) = 0 , \text{ or} \]
\[ 2x + 2y - z - 9 = 0 . \]

Exercises

A

For Exercises 1-6, find the equation of the tangent plane to the surface $z = f(x, y)$ at the point $P$.
1. $f(x, y) = x^2 + y^3$, $P = (1, 1, 2)$
2. $f(x, y) = xy$, $P = (1, -1, -1)$
3. $f(x, y) = x^2 y$, $P = (-1, 1, 1)$
4. $f(x, y) = xe^y$, $P = (1, 0, 1)$
5. $f(x, y) = x + 2y$, $P = (2, 1, 4)$
6. $f(x, y) = \sqrt{x^2 + y^2}$, $P = (3, 4, 5)$

For Exercises 7-10, find the equation of the tangent plane to the given surface at the point $P$.
7. $\dfrac{x^2}{4} + \dfrac{y^2}{9} + \dfrac{z^2}{16} = 1$, $P = \left( 1, 2, \dfrac{2\sqrt{11}}{3} \right)$
8. $x^2 + y^2 + z^2 = 9$, $P = (0, 0, 3)$
9. $x^2 + y^2 - z^2 = 0$, $P = (3, 4, 5)$
10. $x^2 + y^2 = 4$, $P = (\sqrt{3}, 1, 0)$
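Formula (2.6) translates directly into a small computation. The sketch below returns the coefficients $(A, B, C, D)$ of the tangent plane written as $Ax + By + Cz + D = 0$, given a function and its two first-order partial derivatives, and applies it to Example 2.13. The function name `tangent_plane` and the coefficient ordering are conventions assumed for this illustration; the partial derivatives are supplied by hand rather than computed.

```python
def tangent_plane(f, f_x, f_y, a, b):
    """Coefficients (A, B, C, D) of the plane A*x + B*y + C*z + D = 0.

    Expanding formula (2.6),
        f_x(a,b)(x - a) + f_y(a,b)(y - b) - z + f(a,b) = 0,
    gives A = f_x(a,b), B = f_y(a,b), C = -1 and
    D = f(a,b) - a*f_x(a,b) - b*f_y(a,b).
    """
    A = f_x(a, b)
    B = f_y(a, b)
    C = -1.0
    D = f(a, b) - A * a - B * b
    return (A, B, C, D)

# Example 2.13: z = x^2 + y^2 at the point (1, 2, 5).
plane = tangent_plane(
    lambda x, y: x**2 + y**2,   # f
    lambda x, y: 2 * x,         # df/dx
    lambda x, y: 2 * y,         # df/dy
    1.0, 2.0,
)
print(plane)
```

For Example 2.13 the coefficients correspond to the plane $2x + 4y - z - 5 = 0$ found above. Substituting the point of tangency $(1, 2, 5)$ into the plane equation is a quick sanity check that it lies on the plane.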

2.4 Directional Derivatives and the Gradient

For a function $z = f(x, y)$, we learned that the partial derivatives $\dfrac{\partial f}{\partial x}$ and $\dfrac{\partial f}{\partial y}$ represent the (instantaneous) rate of change of $f$ in the positive $x$ and $y$ directions, respectively. What about other directions? It turns out that we can find the rate of change in any direction using a more general type of derivative called a directional derivative.

Definition 2.5. Let $f(x, y)$ be a real-valued function with domain $D$ in $\mathbb{R}^2$, and let $(a, b)$ be a point in $D$. Let $\mathbf{v}$ be a unit vector in $\mathbb{R}^2$. Then the directional derivative of $f$ at $(a, b)$ in the direction of $\mathbf{v}$, denoted by $D_{\mathbf{v}} f(a, b)$, is defined as
\[ D_{\mathbf{v}} f(a, b) = \lim_{h \to 0} \frac{f((a, b) + h\mathbf{v}) - f(a, b)}{h} \qquad (2.8) \]

Notice in the definition that we seem to be treating the point $(a, b)$ as a vector, since we are adding the vector $h\mathbf{v}$ to it. But this is just the usual idea of identifying vectors with their terminal points, which the reader should be used to by now. If we were to write the vector $\mathbf{v}$ as $\mathbf{v} = (v_1, v_2)$, then
\[ D_{\mathbf{v}} f(a, b) = \lim_{h \to 0} \frac{f(a + hv_1, b + hv_2) - f(a, b)}{h} . \qquad (2.9) \]
From this we can immediately recognize that the partial derivatives $\dfrac{\partial f}{\partial x}$ and $\dfrac{\partial f}{\partial y}$ are special cases of the directional derivative with $\mathbf{v} = \mathbf{i} = (1, 0)$ and $\mathbf{v} = \mathbf{j} = (0, 1)$, respectively. That is, $\dfrac{\partial f}{\partial x} = D_{\mathbf{i}} f$ and $\dfrac{\partial f}{\partial y} = D_{\mathbf{j}} f$. Since there are many vectors with the same direction, we use a unit vector in the definition, as that represents a "standard" vector for a given direction.

If $f(x, y)$ has continuous partial derivatives $\dfrac{\partial f}{\partial x}$ and $\dfrac{\partial f}{\partial y}$ (which will always be the case in this text), then there is a simple formula for the directional derivative:

Theorem 2.2. Let $f(x, y)$ be a real-valued function with domain $D$ in $\mathbb{R}^2$ such that the partial derivatives $\dfrac{\partial f}{\partial x}$ and $\dfrac{\partial f}{\partial y}$ exist and are continuous in $D$. Let $(a, b)$ be a point in $D$, and let $\mathbf{v} = (v_1, v_2)$ be a unit vector in $\mathbb{R}^2$. Then
\[ D_{\mathbf{v}} f(a, b) = v_1 \frac{\partial f}{\partial x}(a, b) + v_2 \frac{\partial f}{\partial y}(a, b) . \qquad (2.10) \]

Proof: Note that if $\mathbf{v} = \mathbf{i} = (1, 0)$ then the above formula reduces to $D_{\mathbf{v}} f(a, b) = \dfrac{\partial f}{\partial x}(a, b)$, which we know is true since $D_{\mathbf{i}} f = \dfrac{\partial f}{\partial x}$, as we noted earlier. Similarly, for $\mathbf{v} = \mathbf{j} = (0, 1)$ the formula reduces to $D_{\mathbf{v}} f(a, b) = \dfrac{\partial f}{\partial y}(a, b)$, which is true since $D_{\mathbf{j}} f = \dfrac{\partial f}{\partial y}$. So since $\mathbf{i} = (1, 0)$ and $\mathbf{j} = (0, 1)$ are the only unit vectors in $\mathbb{R}^2$ with a zero component, then we need only show the formula holds for unit vectors $\mathbf{v} = (v_1, v_2)$ with $v_1 \ne 0$ and $v_2 \ne 0$. So fix such a vector $\mathbf{v}$ and fix a number $h \ne 0$. Then
\[ f(a + hv_1, b + hv_2) - f(a, b) = f(a + hv_1, b + hv_2) - f(a + hv_1, b) + f(a + hv_1, b) - f(a, b) . \qquad (2.11) \]
Since $h \ne 0$ and $v_2 \ne 0$, then $hv_2 \ne 0$ and thus any number $c$ between $b$ and $b + hv_2$ can be written as $c = b + \alpha h v_2$ for some number $0 < \alpha < 1$. So since the function $f(a + hv_1, y)$ is a real-valued function of $y$ (since $a + hv_1$ is a fixed number), then the Mean Value Theorem from single-variable calculus can be applied to the function $g(y) = f(a + hv_1, y)$ on the interval $[b, b + hv_2]$ (or $[b + hv_2, b]$ if one of $h$ or $v_2$ is negative) to find a number $0 < \alpha < 1$ such that
\[ \frac{\partial f}{\partial y}(a + hv_1, b + \alpha h v_2) = g'(b + \alpha h v_2) = \frac{g(b + hv_2) - g(b)}{b + hv_2 - b} = \frac{f(a + hv_1, b + hv_2) - f(a + hv_1, b)}{hv_2} \]
and so
\[ f(a + hv_1, b + hv_2) - f(a + hv_1, b) = hv_2 \frac{\partial f}{\partial y}(a + hv_1, b + \alpha h v_2) . \]
By a similar argument, there exists a number $0 < \beta < 1$ such that
\[ f(a + hv_1, b) - f(a, b) = hv_1 \frac{\partial f}{\partial x}(a + \beta h v_1, b) . \]
Thus, by equation (2.11), we have
\[ \begin{aligned} \frac{f(a + hv_1, b + hv_2) - f(a, b)}{h} &= \frac{hv_2 \frac{\partial f}{\partial y}(a + hv_1, b + \alpha h v_2) + hv_1 \frac{\partial f}{\partial x}(a + \beta h v_1, b)}{h} \\ &= v_2 \frac{\partial f}{\partial y}(a + hv_1, b + \alpha h v_2) + v_1 \frac{\partial f}{\partial x}(a + \beta h v_1, b) \end{aligned} \]
so by formula (2.9) we have
\[ \begin{aligned} D_{\mathbf{v}} f(a, b) &= \lim_{h \to 0} \frac{f(a + hv_1, b + hv_2) - f(a, b)}{h} \\ &= \lim_{h \to 0} \left[ v_2 \frac{\partial f}{\partial y}(a + hv_1, b + \alpha h v_2) + v_1 \frac{\partial f}{\partial x}(a + \beta h v_1, b) \right] \\ &= v_2 \frac{\partial f}{\partial y}(a, b) + v_1 \frac{\partial f}{\partial x}(a, b) \quad \text{by the continuity of } \frac{\partial f}{\partial x} \text{ and } \frac{\partial f}{\partial y}, \text{ so} \\ D_{\mathbf{v}} f(a, b) &= v_1 \frac{\partial f}{\partial x}(a, b) + v_2 \frac{\partial f}{\partial y}(a, b) \end{aligned} \]
after reversing the order of summation. QED

Note that $D_{\mathbf{v}} f(a, b) = \mathbf{v} \cdot \left( \dfrac{\partial f}{\partial x}(a, b), \dfrac{\partial f}{\partial y}(a, b) \right)$. The second vector has a special name:

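Definition 2.6. For a real-valued function $f(x, y)$, the gradient of $f$, denoted by $\nabla f$, is the vector
\[ \nabla f = \left( \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y} \right) \qquad (2.12) \]
in $\mathbb{R}^2$. For a real-valued function $f(x, y, z)$, the gradient is the vector
\[ \nabla f = \left( \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, \frac{\partial f}{\partial z} \right) \qquad (2.13) \]
in $\mathbb{R}^3$. The symbol $\nabla$ is pronounced "del".[5]

Corollary 2.3. $D_{\mathbf{v}} f = \mathbf{v} \cdot \nabla f$

Example 2.15. Find the directional derivative of $f(x, y) = xy^2 + x^3 y$ at the point $(1, 2)$ in the direction of $\mathbf{v} = \left( \frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}} \right)$.

Solution: We see that $\nabla f = (y^2 + 3x^2 y,\ 2xy + x^3)$, so
\[ D_{\mathbf{v}} f(1, 2) = \mathbf{v} \cdot \nabla f(1, 2) = \left( \tfrac{1}{\sqrt{2}}, \tfrac{1}{\sqrt{2}} \right) \cdot \bigl( 2^2 + 3(1)^2(2),\ 2(1)(2) + 1^3 \bigr) = \frac{15}{\sqrt{2}} \]

A real-valued function $z = f(x, y)$ whose partial derivatives $\dfrac{\partial f}{\partial x}$ and $\dfrac{\partial f}{\partial y}$ exist and are continuous is called continuously differentiable. Assume that $f(x, y)$ is such a function and that $\nabla f \ne \mathbf{0}$. Let $c$ be a real number in the range of $f$ and let $\mathbf{v}$ be a unit vector in $\mathbb{R}^2$ which is tangent to the level curve $f(x, y) = c$ (see Figure 2.4.1).

[Figure 2.4.1: The gradient $\nabla f$ and a unit tangent vector $\mathbf{v}$ at a point on the level curve $f(x, y) = c$]

[5] Sometimes the notation $\text{grad}(f)$ is used instead of $\nabla f$.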

The value of $f(x, y)$ is constant along a level curve, so since $\mathbf{v}$ is a tangent vector to this curve, then the rate of change of $f$ in the direction of $\mathbf{v}$ is 0, i.e. $D_{\mathbf{v}} f = 0$. But we know that $D_{\mathbf{v}} f = \mathbf{v} \cdot \nabla f = \|\mathbf{v}\|\,\|\nabla f\| \cos\theta$, where $\theta$ is the angle between $\mathbf{v}$ and $\nabla f$. So since $\|\mathbf{v}\| = 1$ then $D_{\mathbf{v}} f = \|\nabla f\| \cos\theta$. So since $\nabla f \ne \mathbf{0}$ then $D_{\mathbf{v}} f = 0 \Rightarrow \cos\theta = 0 \Rightarrow \theta = 90^\circ$. In other words, $\nabla f \perp \mathbf{v}$, which means that $\nabla f$ is normal to the level curve.

In general, for any unit vector $\mathbf{v}$ in $\mathbb{R}^2$, we still have $D_{\mathbf{v}} f = \|\nabla f\| \cos\theta$, where $\theta$ is the angle between $\mathbf{v}$ and $\nabla f$. At a fixed point $(x, y)$ the length $\|\nabla f\|$ is fixed, and the value of $D_{\mathbf{v}} f$ then varies as $\theta$ varies. The largest value that $D_{\mathbf{v}} f$ can take is when $\cos\theta = 1$ ($\theta = 0^\circ$), while the smallest value occurs when $\cos\theta = -1$ ($\theta = 180^\circ$). In other words, the value of the function $f$ increases the fastest in the direction of $\nabla f$ (since $\theta = 0^\circ$ in that case), and the value of $f$ decreases the fastest in the direction of $-\nabla f$ (since $\theta = 180^\circ$ in that case). We have thus proved the following theorem:

Theorem 2.4. Let $f(x, y)$ be a continuously differentiable real-valued function, with $\nabla f \ne \mathbf{0}$. Then:
(a) The gradient $\nabla f$ is normal to any level curve $f(x, y) = c$.
(b) The value of $f(x, y)$ increases the fastest in the direction of $\nabla f$.
(c) The value of $f(x, y)$ decreases the fastest in the direction of $-\nabla f$.

Example 2.16. In which direction does the function $f(x, y) = xy^2 + x^3 y$ increase the fastest from the point $(1, 2)$? In which direction does it decrease the fastest?

Solution: Since $\nabla f = (y^2 + 3x^2 y,\ 2xy + x^3)$, then $\nabla f(1, 2) = (10, 5) \ne \mathbf{0}$. A unit vector in that direction is $\mathbf{v} = \dfrac{\nabla f}{\|\nabla f\|} = \left( \frac{2}{\sqrt{5}}, \frac{1}{\sqrt{5}} \right)$. Thus, $f$ increases the fastest in the direction of $\left( \frac{2}{\sqrt{5}}, \frac{1}{\sqrt{5}} \right)$ and decreases the fastest in the direction of $\left( \frac{-2}{\sqrt{5}}, \frac{-1}{\sqrt{5}} \right)$.

Though we proved Theorem 2.4 for functions of two variables, a similar argument can be used to show that it also applies to functions of three or more variables. Likewise, the directional derivative in the three-dimensional case can also be defined by the formula $D_{\mathbf{v}} f = \mathbf{v} \cdot \nabla f$.

Example 2.17. The temperature $T$ of a solid is given by the function $T(x, y, z) = e^{-x} + e^{-2y} + e^{4z}$, where $x$, $y$, $z$ are space coordinates relative to the center of the solid. In which direction from the point $(1, 1, 1)$ will the temperature decrease the fastest?

Solution: Since $\nabla T = (-e^{-x},\ -2e^{-2y},\ 4e^{4z})$, then the temperature will decrease the fastest in the direction of $-\nabla T(1, 1, 1) = (e^{-1},\ 2e^{-2},\ -4e^{4})$.

Exercises

A

For Exercises 1-10, compute the gradient $\nabla f$.
1. $f(x, y) = x^2 + y^2 - 1$
2. $f(x, y) = \dfrac{1}{x^2 + y^2}$
3. $f(x, y) = \sqrt{x^2 + y^2 + 4}$
4. $f(x, y) = x^2 e^y$
5. $f(x, y) = \ln(xy)$
6. $f(x, y) = 2x + 5y$
7. $f(x, y, z) = \sin(xyz)$
8. $f(x, y, z) = x^2 e^{yz}$
9. $f(x, y, z) = x^2 + y^2 + z^2$
10. $f(x, y, z) = \sqrt{x^2 + y^2 + z^2}$

For Exercises 11-14, find the directional derivative of $f$ at the point $P$ in the direction of $\mathbf{v} = \left( \frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}} \right)$.
11. $f(x, y) = x^2 + y^2 - 1$, $P = (1, 1)$
12. $f(x, y) = \dfrac{1}{x^2 + y^2}$, $P = (1, 1)$
13. $f(x, y) = \sqrt{x^2 + y^2 + 4}$, $P = (1, 1)$
14. $f(x, y) = x^2 e^y$, $P = (1, 1)$

For Exercises 15-16, find the directional derivative of $f$ at the point $P$ in the direction of $\mathbf{v} = \left( \frac{1}{\sqrt{3}}, \frac{1}{\sqrt{3}}, \frac{1}{\sqrt{3}} \right)$.
15. $f(x, y, z) = \sin(xyz)$, $P = (1, 1, 1)$
16. $f(x, y, z) = x^2 e^{yz}$, $P = (1, 1, 1)$

17. Repeat Example 2.16 at the point $(2, 3)$.
18. Repeat Example 2.17 at the point $(3, 1, 2)$.

B

For Exercises 19-26, let $f(x, y)$ and $g(x, y)$ be continuously differentiable real-valued functions, let $c$ be a constant, and let $\mathbf{v}$ be a unit vector in $\mathbb{R}^2$. Show that:
19. $\nabla(cf) = c\,\nabla f$
20. $\nabla(f + g) = \nabla f + \nabla g$
21. $\nabla(fg) = f\,\nabla g + g\,\nabla f$
22. $\nabla(f/g) = \dfrac{g\,\nabla f - f\,\nabla g}{g^2}$ if $g(x, y) \ne 0$
23. $D_{-\mathbf{v}} f = -D_{\mathbf{v}} f$
24. $D_{\mathbf{v}}(cf) = c\,D_{\mathbf{v}} f$
25. $D_{\mathbf{v}}(f + g) = D_{\mathbf{v}} f + D_{\mathbf{v}} g$
26. $D_{\mathbf{v}}(fg) = f\,D_{\mathbf{v}} g + g\,D_{\mathbf{v}} f$

27. The function $r(x, y) = \sqrt{x^2 + y^2}$ is the length of the position vector $\mathbf{r} = x\,\mathbf{i} + y\,\mathbf{j}$ for each point $(x, y)$ in $\mathbb{R}^2$. Show that $\nabla r = \dfrac{1}{r}\,\mathbf{r}$ when $(x, y) \ne (0, 0)$, and that $\nabla(r^2) = 2\,\mathbf{r}$.

2.5 Maxima and Minima

The gradient can be used to find extreme points of real-valued functions of several variables, that is, points where the function has a local maximum or local minimum. We will consider only functions of two variables; functions of three or more variables require methods using linear algebra.

Definition 2.7. Let $f(x, y)$ be a real-valued function, and let $(a, b)$ be a point in the domain of $f$. We say that $f$ has a local maximum at $(a, b)$ if $f(x, y) \le f(a, b)$ for all $(x, y)$ inside some disk of positive radius centered at $(a, b)$, i.e. there is some sufficiently small $r > 0$ such that $f(x, y) \le f(a, b)$ for all $(x, y)$ for which $(x - a)^2 + (y - b)^2 < r^2$.
Likewise, we say that $f$ has a local minimum at $(a, b)$ if $f(x, y) \ge f(a, b)$ for all $(x, y)$ inside some disk of positive radius centered at $(a, b)$.
If $f(x, y) \le f(a, b)$ for all $(x, y)$ in the domain of $f$, then $f$ has a global maximum at $(a, b)$. If $f(x, y) \ge f(a, b)$ for all $(x, y)$ in the domain of $f$, then $f$ has a global minimum at $(a, b)$.

Suppose that $(a, b)$ is a local maximum point for $f(x, y)$, and that the first-order partial derivatives of $f$ exist at $(a, b)$. We know that $f(a, b)$ is the largest value of $f(x, y)$ as $(x, y)$ goes in all directions from the point $(a, b)$, in some sufficiently small disk centered at $(a, b)$. In particular, $f(a, b)$ is the largest value of $f$ in the $x$ direction (around the point $(a, b)$), that is, the single-variable function $g(x) = f(x, b)$ has a local maximum at $x = a$. So we know that $g'(a) = 0$. Since $g'(x) = \frac{\partial f}{\partial x}(x, b)$, then $\frac{\partial f}{\partial x}(a, b) = 0$. Similarly, $f(a, b)$ is the largest value of $f$ near $(a, b)$ in the $y$ direction and so $\frac{\partial f}{\partial y}(a, b) = 0$. We thus have the following theorem:

Theorem 2.5. Let $f(x, y)$ be a real-valued function such that both $\frac{\partial f}{\partial x}(a, b)$ and $\frac{\partial f}{\partial y}(a, b)$ exist. Then a necessary condition for $f(x, y)$ to have a local maximum or minimum at $(a, b)$ is that $\nabla f(a, b) = \mathbf{0}$.

Note: Theorem 2.5 can be extended to apply to functions of three or more variables.

A point $(a, b)$ where $\nabla f(a, b) = \mathbf{0}$ is called a critical point for the function $f(x, y)$. So given a function $f(x, y)$, to find the critical points of $f$ you have to solve the equations

$$\frac{\partial f}{\partial x}(x, y) = 0 \quad\text{and}\quad \frac{\partial f}{\partial y}(x, y) = 0$$

simultaneously for $(x, y)$. Similar to the single-variable case, the necessary condition that $\nabla f(a, b) = \mathbf{0}$ is not always sufficient to guarantee that a critical point is a local maximum or minimum.

Example 2.18. The function $f(x, y) = xy$ has a critical point at $(0, 0)$: $\frac{\partial f}{\partial x} = y = 0 \Rightarrow y = 0$, and $\frac{\partial f}{\partial y} = x = 0 \Rightarrow x = 0$, so $(0, 0)$ is the only critical point. But clearly $f$ does not have a local maximum or minimum at $(0, 0)$ since any disk around $(0, 0)$ contains points $(x, y)$ where the values of $x$ and $y$ have the same sign (so that $f(x, y) = xy > 0 = f(0, 0)$) and different signs (so that $f(x, y) = xy < 0 = f(0, 0)$). In fact, along the path $y = x$

in $\mathbb{R}^2$, $f(x, y) = x^2$, which has a local minimum at $(0, 0)$, while along the path $y = -x$ we have $f(x, y) = -x^2$, which has a local maximum at $(0, 0)$. So $(0, 0)$ is an example of a saddle point, i.e. it is a local maximum in one direction and a local minimum in another direction. The graph of $f(x, y)$ is shown in Figure 2.5.1, which is a hyperbolic paraboloid.

[Figure 2.5.1: $f(x, y) = xy$, saddle point at $(0, 0)$]

The following theorem gives sufficient conditions for a critical point to be a local maximum or minimum of a smooth function (i.e. a function whose partial derivatives of all orders exist and are continuous), which we will not prove here.⁶

Theorem 2.6. Let $f(x, y)$ be a smooth real-valued function, with a critical point at $(a, b)$ (i.e. $\nabla f(a, b) = \mathbf{0}$). Define

$$D = \frac{\partial^2 f}{\partial x^2}(a, b)\,\frac{\partial^2 f}{\partial y^2}(a, b) - \left(\frac{\partial^2 f}{\partial y\,\partial x}(a, b)\right)^2 .$$

Then
(a) if $D > 0$ and $\frac{\partial^2 f}{\partial x^2}(a, b) > 0$, then $f$ has a local minimum at $(a, b)$
(b) if $D > 0$ and $\frac{\partial^2 f}{\partial x^2}(a, b) < 0$, then $f$ has a local maximum at $(a, b)$
(c) if $D < 0$, then $f$ has neither a local minimum nor a local maximum at $(a, b)$
(d) if $D = 0$, then the test fails.

⁶ See TAYLOR and MANN, § 7.6.
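The four cases of Theorem 2.6 can be sketched as a small Java method. This is only an illustration: the class name `CriticalPointClassifier` and the method `classify` are hypothetical, and the caller is assumed to supply the second-order partial derivatives already evaluated at a critical point.

```java
// Sketch of the test in Theorem 2.6. All names here are illustrative assumptions.
public class CriticalPointClassifier {
    // fxx, fyy, fxy: second-order partial derivatives evaluated at a critical point (a, b).
    public static String classify(double fxx, double fyy, double fxy) {
        double D = fxx * fyy - fxy * fxy;
        if (D > 0 && fxx > 0) return "local minimum";
        if (D > 0 && fxx < 0) return "local maximum";
        if (D < 0) return "neither (saddle point)";
        return "test fails"; // D == 0; note that D > 0 forces fxx != 0
    }

    public static void main(String[] args) {
        // Example 2.18: f(x,y) = xy at (0,0) has fxx = 0, fyy = 0, fxy = 1, so D = -1 < 0.
        System.out.println(classify(0, 0, 1));
    }
}
```

For the function of Example 2.18 this reports case (c), in agreement with the saddle point found there. Comparing a floating-point `D` with zero is fragile, so a tolerance would be a reasonable refinement in practice.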

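If condition (c) holds, then $(a, b)$ is a saddle point. Note that the assumption that $f(x, y)$ is smooth means that

$$D = \begin{vmatrix} \dfrac{\partial^2 f}{\partial x^2}(a, b) & \dfrac{\partial^2 f}{\partial y\,\partial x}(a, b) \\[2ex] \dfrac{\partial^2 f}{\partial x\,\partial y}(a, b) & \dfrac{\partial^2 f}{\partial y^2}(a, b) \end{vmatrix}$$

since $\frac{\partial^2 f}{\partial y\,\partial x} = \frac{\partial^2 f}{\partial x\,\partial y}$. Also, if $D > 0$ then

$$\frac{\partial^2 f}{\partial x^2}(a, b)\,\frac{\partial^2 f}{\partial y^2}(a, b) = D + \left(\frac{\partial^2 f}{\partial y\,\partial x}(a, b)\right)^2 > 0 ,$$

and so $\frac{\partial^2 f}{\partial x^2}(a, b)$ and $\frac{\partial^2 f}{\partial y^2}(a, b)$ have the same sign. This means that in parts (a) and (b) of the theorem one can replace $\frac{\partial^2 f}{\partial x^2}(a, b)$ by $\frac{\partial^2 f}{\partial y^2}(a, b)$ if desired.

Example 2.19. Find all local maxima and minima of $f(x, y) = x^2 + xy + y^2 - 3x$.
Solution: First find the critical points, i.e. where $\nabla f = \mathbf{0}$. Since

$$\frac{\partial f}{\partial x} = 2x + y - 3 \quad\text{and}\quad \frac{\partial f}{\partial y} = x + 2y$$

then the critical points $(x, y)$ are the common solutions of the equations

$$2x + y - 3 = 0$$
$$x + 2y = 0$$

which has the unique solution $(x, y) = (2, -1)$. So $(2, -1)$ is the only critical point.
To use Theorem 2.6, we need the second-order partial derivatives:

$$\frac{\partial^2 f}{\partial x^2} = 2 , \quad \frac{\partial^2 f}{\partial y^2} = 2 , \quad \frac{\partial^2 f}{\partial y\,\partial x} = 1$$

and so

$$D = \frac{\partial^2 f}{\partial x^2}(2, -1)\,\frac{\partial^2 f}{\partial y^2}(2, -1) - \left(\frac{\partial^2 f}{\partial y\,\partial x}(2, -1)\right)^2 = (2)(2) - 1^2 = 3 > 0$$

and $\frac{\partial^2 f}{\partial x^2}(2, -1) = 2 > 0$. Thus, $(2, -1)$ is a local minimum.

Example 2.20. Find all local maxima and minima of $f(x, y) = xy - x^3 - y^2$.
Solution: First find the critical points, i.e. where $\nabla f = \mathbf{0}$. Since

$$\frac{\partial f}{\partial x} = y - 3x^2 \quad\text{and}\quad \frac{\partial f}{\partial y} = x - 2y$$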

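then the critical points $(x, y)$ are the common solutions of the equations

$$y - 3x^2 = 0$$
$$x - 2y = 0$$

The first equation yields $y = 3x^2$; substituting that into the second equation yields $x - 6x^2 = 0$, which has the solutions $x = 0$ and $x = \frac{1}{6}$. So $x = 0 \Rightarrow y = 3(0) = 0$ and $x = \frac{1}{6} \Rightarrow y = 3\left(\frac{1}{6}\right)^2 = \frac{1}{12}$.
So the critical points are $(x, y) = (0, 0)$ and $(x, y) = \left(\frac{1}{6}, \frac{1}{12}\right)$.
To use Theorem 2.6, we need the second-order partial derivatives:

$$\frac{\partial^2 f}{\partial x^2} = -6x , \quad \frac{\partial^2 f}{\partial y^2} = -2 , \quad \frac{\partial^2 f}{\partial y\,\partial x} = 1$$

So

$$D = \frac{\partial^2 f}{\partial x^2}(0, 0)\,\frac{\partial^2 f}{\partial y^2}(0, 0) - \left(\frac{\partial^2 f}{\partial y\,\partial x}(0, 0)\right)^2 = (-6(0))(-2) - 1^2 = -1 < 0$$

and thus $(0, 0)$ is a saddle point. Also,

$$D = \frac{\partial^2 f}{\partial x^2}\left(\tfrac{1}{6}, \tfrac{1}{12}\right)\frac{\partial^2 f}{\partial y^2}\left(\tfrac{1}{6}, \tfrac{1}{12}\right) - \left(\frac{\partial^2 f}{\partial y\,\partial x}\left(\tfrac{1}{6}, \tfrac{1}{12}\right)\right)^2 = \left(-6\left(\tfrac{1}{6}\right)\right)(-2) - 1^2 = 1 > 0$$

and $\frac{\partial^2 f}{\partial x^2}\left(\frac{1}{6}, \frac{1}{12}\right) = -1 < 0$. Thus, $\left(\frac{1}{6}, \frac{1}{12}\right)$ is a local maximum.

Example 2.21. Find all local maxima and minima of $f(x, y) = (x - 2)^4 + (x - 2y)^2$.
Solution: First find the critical points, i.e. where $\nabla f = \mathbf{0}$. Since

$$\frac{\partial f}{\partial x} = 4(x - 2)^3 + 2(x - 2y) \quad\text{and}\quad \frac{\partial f}{\partial y} = -4(x - 2y)$$

then the critical points $(x, y)$ are the common solutions of the equations

$$4(x - 2)^3 + 2(x - 2y) = 0$$
$$-4(x - 2y) = 0$$

The second equation yields $x = 2y$; substituting that into the first equation yields $4(2y - 2)^3 = 0$, which has the solution $y = 1$, and so $x = 2(1) = 2$. Thus, $(2, 1)$ is the only critical point.
To use Theorem 2.6, we need the second-order partial derivatives:

$$\frac{\partial^2 f}{\partial x^2} = 12(x - 2)^2 + 2 , \quad \frac{\partial^2 f}{\partial y^2} = 8 , \quad \frac{\partial^2 f}{\partial y\,\partial x} = -4$$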

so (0.22. y). y). But we also see that f (2. we have D = 4 > 0 and ∂ 2 (0. y) = (x2 + y2 )e−(x Solution: First find the critical points. so the Second Derivative Test from single-variable calculus says that r = 1 is a local maximum. 1) = 0. y) on the unit circle x2 + y2 = 1. 0) and all points (x. y) on the unit circle x2 + y2 = 1. we have 2 D = (−4x2 e−1 )(−4y2 e−1 ) − (−4xye−1 )2 = 0 and so the test fails. y). ∂x for points (x. If we look at the graph of f (x. where ∇ f = 0. Find all local maxima and minima of f (x. and hence (2. and we can check that g ′′ (1) = −4e−1 < 0. y) ≥ 0 = f (2. 1) is in fact a global minimum for f . as shown in Figure 2. In our case. Thus f (x. y) on the unit circle x2 + y2 = 1 are local maximum points for f .6. If we switch to using polar coordinates (r. Example 2.2. y) as a function g(r) of the variable r alone: g(r) = r2 e−r . i. y) on the unit circle x2 + y2 = 1.5. Since 2 2 ∂f = 2x(1 − (x2 + y2 ))e−(x +y ) ∂x 2 2 ∂f = 2y(1 − (x2 + y2 ))e−(x +y ) ∂y 2 +y2 ) . What can be done in this situation? Sometimes it is possible to examine the function to see directly the nature of a critical point. 0) = 2 > 0. θ) instead of (x. since f (x. we see that f (x. . so it has a critical point at r = 1. y) is the sum of fourth and second powers of numbers and hence must be nonnegative. 0). But r = 1 corresponds to the unit circle x2 + y2 = 1. 0) is a local minimum. where r2 = x2 + y2 . then the critical points are (0. 1) for all (x. To use Theorem 2. Thus.5 Maxima and Minima So D= 87 ∂2 f ∂2 f ∂2 f (2. it looks like we might have a local maximum for (x. 1) ∂y ∂x ∂x2 ∂y 2 = (2)(8) − (−4)2 = 0 and so the test fails. y) ≥ 0 for all (x. 2 Then g ′ (r) = 2r(1 − r2 )e−r . y) in 2 .2.e. However. 1) − (2. the points (x. 1) 2 (2. then 2 we see that we can write f (x. 
we need the second-order partial derivatives: 2 2 ∂2 f = 2[1 − (x2 + y2 ) − 2x2 − 2x2 (1 − (x2 + y2 ))]e−(x +y ) 2 ∂x 2 2 ∂2 f = 2[1 − (x2 + y2 ) − 2y2 − 2y2 (1 − (x2 + y2 ))]e−(x +y ) 2 ∂y 2 2 ∂2 f = −4xy[2 − (x2 + y2 )]e−(x +y ) ∂y ∂x f At (0.
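The value of $g''(1)$ quoted in Example 2.22 can be checked by differentiating $g'(r)$ once more. The computation below is a worked detail added for illustration; it uses only the product rule and the function $g(r) = r^2 e^{-r^2}$ from the example.

```latex
\begin{aligned}
g'(r)  &= 2r(1 - r^2)\,e^{-r^2} = (2r - 2r^3)\,e^{-r^2} \\
g''(r) &= (2 - 6r^2)\,e^{-r^2} + (2r - 2r^3)(-2r)\,e^{-r^2} \\
       &= (2 - 10r^2 + 4r^4)\,e^{-r^2} \\
g''(1) &= (2 - 10 + 4)\,e^{-1} = -4e^{-1} < 0
\end{aligned}
```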

FUNCTIONS OF SEVERAL VARIABLES 0.1 0.4 0. y) = (x2 + y2 )e−( x 2 +y2 ) A 1. y. f (x. y) = x3 − 12x + y2 + 8y 4. find the dimensions that will minimize the surface area.15 z 0. f (x. y) = −4x2 + 4xy − 2y2 + 16x − 12y 9. (Hint: Use Theorem 2.05 0 -3 -2 -1 0 1 y 2 3 3 2 1 -3 -2 -1 0 x Figure 2. then the tangent plane to the surface z = f (x. b)) is parallel to the xy-plane. y). f (x. f (x. f (x. y) = x3 + 3x2 + y3 − 3y2 6.3 0.2 f (x. y) at the point (a. y) = 2x3 − 6xy + y2 8. y) = x + 2y 10. f (x.5. find all local maxima and minima of the function f (x. f (x. f (x. Find three positive numbers x.35 0. z whose sum is 10 such that x2 y2 z is a maximum.) 12. y) = 2x3 + 6xy + 3y2 7. 2.88 CHAPTER 2. y). (Hint: Use the volume condition to write the surface area as a function of just two variables. f (a. f (x.25 0.) C 13. y) = x2 + y2 Exercises © ¨ For Exercises 1-10.2 0. b.5. y) = 4x2 − 4xy + 2y2 + 10x − 6y B 11. y) = x3 − 3x + y2 3. f (x. . b) is a local maximum or local minimum point for a smooth function f (x. Prove that if (a. For a rectangular solid of volume 1000 cubic meters. y) = x3 − 3x + y3 − 3y 5.

2.6 Unconstrained Optimization: Numerical Methods

The types of problems that we solved in the previous section were examples of unconstrained optimization problems. That is, we tried to find local (and perhaps even global) maximum and minimum points of real-valued functions $f(x, y)$, where the points $(x, y)$ could be any points in the domain of $f$. The method we used required us to find the critical points of $f$, which meant having to solve the equation $\nabla f = \mathbf{0}$, which in general is a system of two equations in two unknowns ($x$ and $y$). While this was relatively simple for the examples we did, in general this will not be the case. If the equations involve polynomials in $x$ and $y$ of degree three or higher, or complicated expressions involving trigonometric, exponential, or logarithmic functions, then solving even one such equation, let alone two, could be impossible by elementary means.⁷

For example, if one of the equations that had to be solved was

$$x^3 + 9x - 2 = 0 ,$$

you may have a hard time getting the exact solutions. Trial and error would not help much, especially since the only real solution⁸ turns out to be $\sqrt[3]{\sqrt{28} + 1} - \sqrt[3]{\sqrt{28} - 1}$. In a situation such as this, the only choice may be to find a solution using some numerical method which gives a sequence of numbers which converge to the actual solution. For example, Newton's method for solving equations $f(x) = 0$, which you probably learned in single-variable calculus. In this section we will describe another method of Newton for finding critical points of real-valued functions of two variables.

Let $f(x, y)$ be a smooth real-valued function, and define

$$D(x, y) = \frac{\partial^2 f}{\partial x^2}(x, y)\,\frac{\partial^2 f}{\partial y^2}(x, y) - \left(\frac{\partial^2 f}{\partial y\,\partial x}(x, y)\right)^2 .$$

Newton's algorithm: Pick an initial point $(x_0, y_0)$. For $n = 0, 1, 2, 3, \ldots$, define:

$$x_{n+1} = x_n - \frac{\begin{vmatrix} \dfrac{\partial^2 f}{\partial y^2}(x_n, y_n) & \dfrac{\partial^2 f}{\partial x\,\partial y}(x_n, y_n) \\[2ex] \dfrac{\partial f}{\partial y}(x_n, y_n) & \dfrac{\partial f}{\partial x}(x_n, y_n) \end{vmatrix}}{D(x_n, y_n)} , \qquad y_{n+1} = y_n - \frac{\begin{vmatrix} \dfrac{\partial^2 f}{\partial x^2}(x_n, y_n) & \dfrac{\partial^2 f}{\partial x\,\partial y}(x_n, y_n) \\[2ex] \dfrac{\partial f}{\partial x}(x_n, y_n) & \dfrac{\partial f}{\partial y}(x_n, y_n) \end{vmatrix}}{D(x_n, y_n)} \tag{2.14}$$

Then the sequence of points $\{(x_n, y_n)\}_{n=1}^{\infty}$ converges to a critical point. If there are several critical points, then you will have to try different initial points to find them.

⁷ This is also a problem for the equivalent method (the Second Derivative Test) in single-variable calculus, though one that is not usually emphasized.
⁸ There are also two nonreal, complex number solutions. Cubic polynomial equations in one variable can be solved using Cardan's formulas, which are not quite as simple as the familiar quadratic formula. There are formulas for solving polynomial equations of degree 4, but it can be proved that there is no general formula for solving equations for polynomials of degree five or higher. See USPENSKY for more details.
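One way to read the formulas (2.14), offered as a sketch and not as the derivation (which the text does not give), is as a single matrix step: subtract from the current point the inverse of the matrix of second-order partial derivatives applied to the gradient. The symbol $H$ below is notation introduced here for illustration, and every partial derivative is evaluated at $(x_n, y_n)$.

```latex
\begin{aligned}
H &= \begin{pmatrix} f_{xx} & f_{xy} \\ f_{xy} & f_{yy} \end{pmatrix}, \qquad
H^{-1} = \frac{1}{D}\begin{pmatrix} f_{yy} & -f_{xy} \\ -f_{xy} & f_{xx} \end{pmatrix}, \qquad
D = f_{xx} f_{yy} - f_{xy}^2 \\[1ex]
\begin{pmatrix} x_{n+1} \\ y_{n+1} \end{pmatrix}
&= \begin{pmatrix} x_n \\ y_n \end{pmatrix} - H^{-1}\begin{pmatrix} f_x \\ f_y \end{pmatrix}
= \begin{pmatrix} x_n - \dfrac{f_{yy} f_x - f_{xy} f_y}{D} \\[2ex] y_n - \dfrac{f_{xx} f_y - f_{xy} f_x}{D} \end{pmatrix}
\end{aligned}
```

Expanding the two determinants in (2.14) gives the same numerators, $f_{yy} f_x - f_{xy} f_y$ and $f_{xx} f_y - f_{xy} f_x$, which also explains why the algorithm requires $D \neq 0$.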

Example 2.23. Find all local maxima and minima of $f(x, y) = x^3 - xy - x + xy^3 - y^4$.
Solution: First calculate the necessary partial derivatives:

$$\frac{\partial f}{\partial x} = 3x^2 - y - 1 + y^3 , \quad \frac{\partial f}{\partial y} = -x + 3xy^2 - 4y^3$$
$$\frac{\partial^2 f}{\partial x^2} = 6x , \quad \frac{\partial^2 f}{\partial y^2} = 6xy - 12y^2 , \quad \frac{\partial^2 f}{\partial y\,\partial x} = -1 + 3y^2$$

Notice that solving $\nabla f = \mathbf{0}$ would involve solving two third-degree polynomial equations in $x$ and $y$, which in this case can not be done easily.
We need to pick an initial point $(x_0, y_0)$ for our algorithm. Looking at the graph of $z = f(x, y)$ over a large region may help (see Figure 2.6.1 below), though it may be hard to tell where the critical points are.

[Figure 2.6.1: $f(x, y) = x^3 - xy - x + xy^3 - y^4$ for $-20 \le x \le 20$ and $-20 \le y \le 20$]

Notice in the formulas (2.14) that we divide by $D$, so we should pick an initial point where $D$ is not zero. And we can see that $D(0, 0) = (0)(0) - (-1)^2 = -1 \ne 0$, so take $(0, 0)$ as our initial point. Since it may take a large number of iterations of Newton's algorithm to be sure that we are close enough to the actual critical point, and since the computations are quite tedious, we will let a computer do the computing. For this, we will write a simple program, using the Java programming language, which will take a given initial point as a parameter and then perform 100 iterations of Newton's algorithm. In each iteration the new point will be printed, so that we can see if there is convergence. The full code is shown in Listing 2.1.

//Program to find the critical points of f(x,y)=x^3-xy-x+xy^3-y^4
public class newton {
   public static void main(String[] args) {
      //Get the initial point (x,y) as command-line parameters
      double x = Double.parseDouble(args[0]); //Initial x value
      double y = Double.parseDouble(args[1]); //Initial y value
      System.out.println("Initial point: (" + x + "," + y + ")");
      //Go through 100 iterations of Newton's algorithm
      for (int n=1; n<=100; n++) {
         double D = fxx(x,y)*fyy(x,y) - Math.pow(fxy(x,y),2);
         double xn = x; double yn = y; //The current x and y values
         if (D == 0) { //We can not divide by 0
            System.out.println("Error: D = 0 at iteration n = " + n);
            System.exit(0); //End the program
         } else { //Calculate the new values for x and y
            x = xn - (fyy(xn,yn)*fx(xn,yn) - fxy(xn,yn)*fy(xn,yn))/D;
            y = yn - (fxx(xn,yn)*fy(xn,yn) - fxy(xn,yn)*fx(xn,yn))/D;
            System.out.println("n = " + n + ": (" + x + "," + y + ")");
         }
      }
   }
   //Below are the parts specific to the function f
   //The first partial derivative of f wrt x: 3x^2-y-1+y^3
   public static double fx(double x, double y) {
      return 3*Math.pow(x,2) - y - 1 + Math.pow(y,3);
   }
   //The first partial derivative of f wrt y: -x+3xy^2-4y^3
   public static double fy(double x, double y) {
      return -x + 3*x*Math.pow(y,2) - 4*Math.pow(y,3);
   }
   //The second partial derivative of f wrt x: 6x
   public static double fxx(double x, double y) {
      return 6*x;
   }
   //The second partial derivative of f wrt y: 6xy-12y^2
   public static double fyy(double x, double y) {
      return 6*x*y - 12*Math.pow(y,2);
   }
   //The mixed second partial derivative of f wrt x and y: -1+3y^2
   public static double fxy(double x, double y) {
      return -1 + 3*Math.pow(y,2);
   }
}

Listing 2.1 Program listing for newton.java

To use this program, you should first save the code in Listing 2.1 in a plain text file called newton.java. You will need the Java Development Kit⁹ to compile the code. In the directory where newton.java is saved, run this command at a command prompt to compile the code:

   javac newton.java

Then run the program with the initial point (0, 0) with this command:

   java newton 0 0

Below is the output of the program using (0, 0) as the initial point, truncated to show the first 10 lines and the last 5 lines:

   java newton 0 0
   Initial point: (0.0,0.0)
   n = 1: (0.0,-1.0)
   n = 2: (1.0,-0.5)
   n = 3: (0.6065857885615251,-0.44194107452339687)
   n = 4: (0.484506572966545,-0.405341511995805)
   n = 5: (0.47123972682634485,-0.3966334583092305)
   n = 6: (0.47113558510349535,-0.39636450001936047)
   n = 7: (0.4711356343449705,-0.3963643379632247)
   n = 8: (0.4711356343449874,-0.39636433796318005)
   n = 9: (0.4711356343449874,-0.39636433796318005)
   n = 10: (0.4711356343449874,-0.39636433796318005)
   ...
   n = 96: (0.4711356343449874,-0.39636433796318005)
   n = 97: (0.4711356343449874,-0.39636433796318005)
   n = 98: (0.4711356343449874,-0.39636433796318005)
   n = 99: (0.4711356343449874,-0.39636433796318005)
   n = 100: (0.4711356343449874,-0.39636433796318005)

As you can see, we appear to have converged fairly quickly (after only 8 iterations) to what appears to be an actual critical point (up to Java's level of precision), namely the point (0.4711356343449874, −0.39636433796318005). It is easy to confirm that $\nabla f = \mathbf{0}$ at this point, either by evaluating $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$ at the point ourselves or by modifying our program to also print the values of the partial derivatives at the point. It turns out that both partial derivatives are indeed close enough to zero to be considered zero:

$$\frac{\partial f}{\partial x}(0.4711356343449874, -0.39636433796318005) = 4.85722573273506 \times 10^{-17}$$
$$\frac{\partial f}{\partial y}(0.4711356343449874, -0.39636433796318005) = -8.326672684688674 \times 10^{-17}$$

We also have $D(0.4711356343449874, -0.39636433796318005) = -8.776075636032301 < 0$, so by Theorem 2.6 we know that (0.4711356343449874, −0.39636433796318005) is a saddle point.

⁹ Available for free at http://java.sun.com/javase/downloads
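The check described above, evaluating the partial derivatives and $D$ at the reported point, can be sketched as a small standalone program. The class name `GradientCheck` is a hypothetical choice made for this illustration; the formulas are the partial derivatives from Example 2.23, and the point is the one printed by the program.

```java
// Sketch: evaluate the gradient and D of f(x,y)=x^3-xy-x+xy^3-y^4 at a candidate point.
// The class and method names are illustrative assumptions, not part of Listing 2.1.
public class GradientCheck {
    public static double fx(double x, double y) { return 3*x*x - y - 1 + y*y*y; }
    public static double fy(double x, double y) { return -x + 3*x*y*y - 4*y*y*y; }
    public static double D(double x, double y) {
        double fxx = 6*x;
        double fyy = 6*x*y - 12*y*y;
        double fxy = -1 + 3*y*y;
        return fxx*fyy - fxy*fxy;
    }

    public static void main(String[] args) {
        double x = 0.4711356343449874, y = -0.39636433796318005;
        // Both partial derivatives should be at rounding-error level; D should be negative.
        System.out.println("df/dx = " + fx(x, y));
        System.out.println("df/dy = " + fy(x, y));
        System.out.println("D     = " + D(x, y));
    }
}
```

Because this sketch multiplies directly instead of calling `Math.pow`, the last few digits it prints may differ from the values quoted in the text, but the conclusion (gradient essentially zero, $D < 0$) should be the same.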

Since $\nabla f$ consists of cubic polynomials, it seems likely that there may be three critical points. The computer program makes experimenting with other initial points easy, and trying different values does indeed lead to different sequences which converge:

   java newton -1 -1
   Initial point: (-1.0,-1.0)
   n = 1: (-0.5,-0.5)
   n = 2: (-0.49295774647887325,-0.08450704225352113)
   n = 3: (-0.1855674752461383,-1.2047647348546167)
   n = 4: (-0.4540060574531383,-0.8643989895639324)
   n = 5: (-0.3672160534444,-0.5426077421319053)
   n = 6: (-0.4794622222856417,-0.24529117721011612)
   n = 7: (0.11570743992954591,-2.4319791238981274)
   n = 8: (-0.05837851765533317,-1.6536079835854451)
   n = 9: (-0.129841298650007,-1.121516233310142)
   n = 10: (-1.004453014967208,-0.9206128022529645)
   n = 11: (-0.5161209914612475,-0.4176293491131443)
   n = 12: (-0.5788664043863884,0.2918236503332734)
   n = 13: (-0.6985177124230715,0.49848120123515316)
   n = 14: (-0.6733618916578702,0.4345777963475479)
   n = 15: (-0.6704392913413444,0.4252025996474051)
   n = 16: (-0.6703832679150286,0.4250147307973365)
   n = 17: (-0.6703832459238701,0.42501465652421205)
   n = 18: (-0.6703832459238667,0.4250146565242004)
   n = 19: (-0.6703832459238667,0.42501465652420045)
   n = 20: (-0.6703832459238667,0.42501465652420045)
   ...
   n = 98: (-0.6703832459238667,0.42501465652420045)
   n = 99: (-0.6703832459238667,0.42501465652420045)
   n = 100: (-0.6703832459238667,0.42501465652420045)

Again, it is easy to confirm that both $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$ vanish at the point (−0.6703832459238667, 0.42501465652420045), which means it is a critical point. And

$$D(-0.6703832459238667, 0.42501465652420045) = 15.3853578526055 > 0$$
$$\frac{\partial^2 f}{\partial x^2}(-0.6703832459238667, 0.42501465652420045) = -4.0222994755432 < 0$$

so we know that (−0.6703832459238667, 0.42501465652420045) is a local maximum. An idea of what the graph of $f$ looks like near that point is shown in Figure 2.6.2, which does suggest a local maximum around that point.
Finally, running the computer program with the initial point (−5, −5) yields the critical point (−7.540962756992551, −5.595509445899435), with $D < 0$ at that point, which makes it a saddle point.
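Since both runs above settle within about twenty iterations, one possible variation is to stop as soon as successive points are closer than a chosen tolerance, instead of always running 100 iterations. The sketch below is an assumption-laden illustration and not the author's program: the class `NewtonWithTolerance`, the method `solve`, and the stopping rule are all hypothetical, while the update formulas are those of (2.14) for the function of Example 2.23.

```java
// Sketch: Newton's algorithm for f(x,y)=x^3-xy-x+xy^3-y^4 with a stopping tolerance.
// Names and the stopping rule are illustrative assumptions, not taken from Listing 2.1.
public class NewtonWithTolerance {
    public static double fx(double x, double y)  { return 3*x*x - y - 1 + y*y*y; }
    public static double fy(double x, double y)  { return -x + 3*x*y*y - 4*y*y*y; }
    public static double fxx(double x, double y) { return 6*x; }
    public static double fyy(double x, double y) { return 6*x*y - 12*y*y; }
    public static double fxy(double x, double y) { return -1 + 3*y*y; }

    // Returns {x, y, iterations used}, or null if D = 0 is encountered.
    public static double[] solve(double x, double y, double tol, int maxIter) {
        for (int n = 1; n <= maxIter; n++) {
            double D = fxx(x, y)*fyy(x, y) - fxy(x, y)*fxy(x, y);
            if (D == 0) return null; // cannot divide by 0
            double xNew = x - (fyy(x, y)*fx(x, y) - fxy(x, y)*fy(x, y)) / D;
            double yNew = y - (fxx(x, y)*fy(x, y) - fxy(x, y)*fx(x, y)) / D;
            double step = Math.hypot(xNew - x, yNew - y); // distance moved this iteration
            x = xNew;
            y = yNew;
            if (step < tol) return new double[] { x, y, n };
        }
        return new double[] { x, y, maxIter };
    }

    public static void main(String[] args) {
        double[] r = solve(-1, -1, 1e-12, 100);
        System.out.println("(" + r[0] + "," + r[1] + ") after " + (int) r[2] + " iterations");
    }
}
```

A small step is only a heuristic sign of convergence, so it is still worth confirming that the gradient is essentially zero at the returned point.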

[Figure 2.6.2: $f(x, y) = x^3 - xy - x + xy^3 - y^4$ for $-1 \le x \le 0$ and $0 \le y \le 1$, with the point (−0.67, 0.42, 0.57) marked]

We can summarize our findings for the function $f(x, y) = x^3 - xy - x + xy^3 - y^4$:

   (0.4711356343449874, −0.39636433796318005) : saddle point
   (−0.6703832459238667, 0.42501465652420045) : local maximum
   (−7.540962756992551, −5.595509445899435) : saddle point

The derivation of Newton's algorithm, and the proof that it converges (given a "reasonable" choice for the initial point), requires techniques beyond the scope of this text. See RALSTON and RABINOWITZ for more detail and for discussion of other numerical methods. Our description of Newton's algorithm is the special two-variable case of a more general algorithm that can be applied to functions of $n \ge 2$ variables.

In the case of functions which have a global maximum or minimum, Newton's algorithm can be used to find those points. In general, global maxima and minima tend to be more interesting than local versions, at least in practical applications. A maximization problem can always be turned into a minimization problem (why?), so a large number of methods have been developed to find the global minimum of functions of any number of variables. This field of study is called nonlinear programming. Many of these methods are based on the steepest descent technique, which is based on an idea that we discussed in Section 2.4. Recall that the negative gradient $-\nabla f$ gives the direction of the fastest rate of decrease of a function $f$. The crux of the steepest descent idea, then, is that starting from some initial point, you move a certain amount in the

direction of $-\nabla f$ at that point. Wherever that takes you becomes your new point, and you then just keep repeating that procedure until eventually (hopefully) you reach the point where $f$ has its smallest value. There is a "pure" steepest descent method, and a multitude of variations on it that improve the rate of convergence, ease of calculation, etc. In fact, Newton's algorithm can be interpreted as a modified steepest descent method. For more discussion of this, and of nonlinear programming in general, see BAZARAA, SHERALI and SHETTY.

Exercises

C

1. Recall Example 2.21 from the previous section, where we showed that the point (2, 1) was a global minimum for the function $f(x, y) = (x - 2)^4 + (x - 2y)^2$. Notice that our computer program can be modified fairly easily to use this function (just change the return values in the fx, fy, fxx, fyy and fxy function definitions to use the appropriate partial derivative). Either modify that program or write one of your own in a programming language of your choice to show that Newton's algorithm does lead to the point (2, 1). First use the initial point (0, 3), then use the initial point (3, 2), and compare the results. Make sure that your program attempts to do 100 iterations of the algorithm. Did anything strange happen when your program ran? If so, how do you explain it? (Hint: Something strange should happen.)

2. There is a version of Newton's algorithm for solving a system of two equations

$$f_1(x, y) = 0 \quad\text{and}\quad f_2(x, y) = 0 ,$$

where $f_1(x, y)$ and $f_2(x, y)$ are smooth real-valued functions:
Pick an initial point $(x_0, y_0)$. For $n = 0, 1, 2, 3, \ldots$, define:

$$x_{n+1} = x_n - \frac{\begin{vmatrix} f_1(x_n, y_n) & f_2(x_n, y_n) \\[1ex] \dfrac{\partial f_1}{\partial y}(x_n, y_n) & \dfrac{\partial f_2}{\partial y}(x_n, y_n) \end{vmatrix}}{D(x_n, y_n)} , \qquad y_{n+1} = y_n + \frac{\begin{vmatrix} f_1(x_n, y_n) & f_2(x_n, y_n) \\[1ex] \dfrac{\partial f_1}{\partial x}(x_n, y_n) & \dfrac{\partial f_2}{\partial x}(x_n, y_n) \end{vmatrix}}{D(x_n, y_n)} ,$$

where

$$D(x_n, y_n) = \frac{\partial f_1}{\partial x}(x_n, y_n)\,\frac{\partial f_2}{\partial y}(x_n, y_n) - \frac{\partial f_1}{\partial y}(x_n, y_n)\,\frac{\partial f_2}{\partial x}(x_n, y_n) .$$

Then the sequence of points $\{(x_n, y_n)\}_{n=1}^{\infty}$ converges to a solution. Write a computer program that uses this algorithm to find approximate solutions to the system of equations

$$f_1(x, y) = \sin(xy) - x - y = 0 \quad\text{and}\quad f_2(x, y) = e^{2x} - 2x + 3y = 0 .$$

Show that you get two different solutions when using (0, 0) and (1, 1) for the initial point $(x_0, y_0)$.

2.7 Constrained Optimization: Lagrange Multipliers

In Sections 2.5 and 2.6 we were concerned with finding maxima and minima of functions without any constraints on the variables (other than being in the domain of the function). What would we do if there were constraints on the variables? The following example illustrates a simple case of this type of problem.

Example 2.24. For a rectangle whose perimeter is 20 m, find the dimensions that will maximize the area.
Solution: The area $A$ of a rectangle with width $x$ and height $y$ is $A = xy$. The perimeter $P$ of the rectangle is then given by the formula $P = 2x + 2y$. Since we are given that the perimeter $P = 20$, this problem can be stated as:

   Maximize : $f(x, y) = xy$
   given : $2x + 2y = 20$

The reader is probably familiar with a simple method, using single-variable calculus, for solving this problem. Since we must have $2x + 2y = 20$, then we can solve for, say, $y$ in terms of $x$ using that equation. This gives $y = 10 - x$, which we then substitute into $f$ to get $f(x, y) = xy = x(10 - x) = 10x - x^2$. This is now a function of $x$ alone, so we now just have to maximize the function $f(x) = 10x - x^2$ on the interval $[0, 10]$. Since $f'(x) = 10 - 2x = 0 \Rightarrow x = 5$ and $f''(5) = -2 < 0$, then the Second Derivative Test tells us that $x = 5$ is a local maximum for $f$, and hence $x = 5$ must be the global maximum on the interval $[0, 10]$ (since $f = 0$ at the endpoints of the interval). So since $y = 10 - x = 5$, then the maximum area occurs for a rectangle whose width and height both are 5 m.

Notice in the above example that the ease of the solution depended on being able to solve for one variable in terms of the other in the equation $2x + 2y = 20$. But what if that were not possible (which is often the case)? In this section we will use a general method, called the Lagrange multiplier method¹⁰, for solving constrained optimization problems:

   Maximize (or minimize) : $f(x, y)$ (or $f(x, y, z)$)
   given : $g(x, y) = c$ (or $g(x, y, z) = c$) for some constant $c$

The equation $g(x, y) = c$ is called the constraint equation, and we say that $x$ and $y$ are constrained by $g(x, y) = c$. Points $(x, y)$ which are maxima or minima of $f(x, y)$ with the condition that they satisfy the constraint equation $g(x, y) = c$ are called constrained maximum or constrained minimum points, respectively. Similar definitions hold for functions of three variables.
The Lagrange multiplier method for solving such problems can now be stated:

¹⁰ Named after the French mathematician Joseph Louis Lagrange (1736-1813).

Theorem 2.7. Let $f(x, y)$ and $g(x, y)$ be smooth functions, and suppose that $c$ is a scalar constant such that $\nabla g(x, y) \ne \mathbf{0}$ for all $(x, y)$ that satisfy the equation $g(x, y) = c$. Then to solve the constrained optimization problem

   Maximize (or minimize) : $f(x, y)$
   given : $g(x, y) = c$ ,

find the points $(x, y)$ that solve the equation $\nabla f(x, y) = \lambda \nabla g(x, y)$ for some constant $\lambda$ (the number $\lambda$ is called the Lagrange multiplier). If there is a constrained maximum or minimum, then it must be such a point.

A rigorous proof of the above theorem requires use of the Implicit Function Theorem, which is beyond the scope of this text.¹¹ Note that the theorem only gives a necessary condition for a point to be a constrained maximum or minimum. Whether a point $(x, y)$ that satisfies $\nabla f(x, y) = \lambda \nabla g(x, y)$ for some $\lambda$ actually is a constrained maximum or minimum can sometimes be determined by the nature of the problem itself. For instance, in Example 2.24 it was clear that there had to be a global maximum.

So how can you tell when a point that satisfies the condition in Theorem 2.7 really is a constrained maximum or minimum? The answer is that it depends on the constraint function $g(x, y)$, together with any implicit constraints. It can be shown¹² that if the constraint equation $g(x, y) = c$ (plus any hidden constraints) describes a bounded set $B$ in $\mathbb{R}^2$, then the constrained maximum or minimum of $f(x, y)$ will occur either at a point $(x, y)$ satisfying $\nabla f(x, y) = \lambda \nabla g(x, y)$ or at a "boundary" point of the set $B$.

In Example 2.24 the constraint equation $2x + 2y = 20$ describes a line in $\mathbb{R}^2$, which by itself is not bounded. However, there are "hidden" constraints, due to the nature of the problem, namely $0 \le x, y \le 10$, which cause that line to be restricted to a line segment in $\mathbb{R}^2$ (including the endpoints of that line segment), which is bounded.

Example 2.25. For a rectangle whose perimeter is 20 m, use the Lagrange multiplier method to find the dimensions that will maximize the area.
Solution: As we saw in Example 2.24, with $x$ and $y$ representing the width and height, respectively, of the rectangle, this problem can be stated as:

   Maximize : $f(x, y) = xy$
   given : $g(x, y) = 2x + 2y = 20$

Then solving the equation $\nabla f(x, y) = \lambda \nabla g(x, y)$ for some $\lambda$ means solving the equations

¹¹ See TAYLOR and MANN, § 6.8 for more detail.
¹² Again, see TAYLOR and MANN.

$\frac{\partial f}{\partial x} = \lambda \frac{\partial g}{\partial x}$ and $\frac{\partial f}{\partial y} = \lambda \frac{\partial g}{\partial y}$, namely:

$$y = 2\lambda ,$$
$$x = 2\lambda$$

The general idea is to solve for $\lambda$ in both equations, then set those expressions equal (since they both equal $\lambda$) to solve for $x$ and $y$. Doing this we get

$$\frac{y}{2} = \lambda = \frac{x}{2} \quad\Rightarrow\quad x = y ,$$

so now substitute either of the expressions for $x$ or $y$ into the constraint equation to solve for $x$ and $y$:

$$20 = g(x, y) = 2x + 2y = 2x + 2x = 4x \quad\Rightarrow\quad x = 5 \quad\Rightarrow\quad y = 5$$

There must be a maximum area, since the minimum area is 0 and $f(5, 5) = 25 > 0$, so the point $(5, 5)$ that we found (called a constrained critical point) must be the constrained maximum.
∴ The maximum area occurs for a rectangle whose width and height both are 5 m.

Example 2.26. Find the points on the circle $x^2 + y^2 = 80$ which are closest to and farthest from the point $(1, 2)$.
Solution: The distance $d$ from any point $(x, y)$ to the point $(1, 2)$ is

$$d = \sqrt{(x - 1)^2 + (y - 2)^2} ,$$

and minimizing the distance is equivalent to minimizing the square of the distance. Thus the problem can be stated as:

   Maximize (and minimize) : $f(x, y) = (x - 1)^2 + (y - 2)^2$
   given : $g(x, y) = x^2 + y^2 = 80$

Solving $\nabla f(x, y) = \lambda \nabla g(x, y)$ means solving the following equations:

$$2(x - 1) = 2\lambda x ,$$
$$2(y - 2) = 2\lambda y$$

Note that $x \ne 0$ since otherwise we would get $-2 = 0$ in the first equation. Similarly, $y \ne 0$. So we can solve both equations for $\lambda$ as follows:

$$\frac{x - 1}{x} = \lambda = \frac{y - 2}{y} \quad\Rightarrow\quad xy - y = xy - 2x \quad\Rightarrow\quad y = 2x$$

Example 2. 0. So far we have not attached any significance to the value of the Lagrange multiplier λ. So the two constrained critical points are (4. when the constant c in the constraint equation g(x. √ 2 2 is the is the constrained minimum point. but made no use of its value. 2) 0 x (−4. 8) (1. It turns out that λ gives an approximation of the change in the value of the function f (x. 2).7 Constrained Optimization: Lagrange Multipliers Substituting this into g(x. z) = x2 + y2 + z2 = 1 yields the constrained critical points and −1 √ . We needed λ only to find the constrained critical points.27. −8) is the farthest from (1. 8) is the point on the circle closest to (1. −1 √ 2 2 . y. 0. −8) = 125. 0. 8) and (−4. 2) and (−4. y. √ 2 2 −1 √ . Notice that since the constraint equation x2 + y2 = 80 describes a circle.1 The Lagrange multiplier method can be extended to functions of three variables. which is a bounded set in 2 . −8). 8) = 45 and f (−4. √ 2 2 1 1 √ . then it must be the case that (4. Since f (4. . 0. z) = x2 + y2 + z2 = 1 Solution: Solve the equation ∇ f (x. Substituting these expressions into the constraint equation g(x. z) = λ∇g(x.2. y) that we wish to maximize or minimize. x2 + y2 = 80 y 99 (4. −8) Figure 2. 2) (see Figure 2. Since f > f . then we were guaranteed that the constrained critical points we found were indeed the constrained maximum and minimum. z): 1 = 2λx 0 = 2λy 1 = 2λz The first equation implies λ 0 (otherwise we would have 1 = 0). so x = ±4.7. 0. and since the constraint equation 3. y. x2 + y2 + z2 = 1 describes a sphere (which is bounded) in constrained maximum point and −1 √ . y) = x2 + y2 = 80 yields 5x2 = 80. so we can divide by λ in the second equation to get y = 0 and we can divide by λ in the first and 1 third equations to get x = 2λ = z. 0. and since there must be points on the circle closest to and farthest from (1. −1 √ 2 2 then 1 1 √ . y) = c is changed by 1. y. y. z) = x + z given : g(x.1).7. −1 √ 2 2 1 1 √ . 
Maximize (and minimize) : f (x.

For example, in Example 2.25 we showed that the constrained optimization problem

Maximize : $f(x, y) = xy$
given : $g(x, y) = 2x + 2y = 20$

had the solution $(x, y) = (5, 5)$, and that $\lambda = x/2 = y/2$. Thus, $\lambda = 2.5$. In a similar fashion we could show that the constrained optimization problem

Maximize : $f(x, y) = xy$
given : $g(x, y) = 2x + 2y = 21$

has the solution $(x, y) = (5.25, 5.25)$. So we see that the value of $f(x, y)$ at the constrained maximum increased from $f(5, 5) = 25$ to $f(5.25, 5.25) = 27.5625$, i.e. it increased by 2.5625 when we increased the value of $c$ in the constraint equation $g(x, y) = c$ from $c = 20$ to $c = 21$. Notice that $\lambda = 2.5$ is close to 2.5625, that is,

$$\lambda \approx \Delta f = f(\text{new max. pt}) - f(\text{old max. pt}) .$$

Finally, note that solving the equation $\nabla f(x, y) = \lambda \nabla g(x, y)$ means having to solve a system of two (possibly nonlinear) equations in three unknowns, which as we have seen before, may not be possible to do. And the 3-variable case can get even more complicated. All of this somewhat restricts the usefulness of Lagrange's method to relatively simple functions. Luckily there are many numerical methods for solving constrained optimization problems, though we will not discuss them here.[13]

Exercises

A

1. Find the constrained maxima and minima of $f(x, y) = 2x + y$ given that $x^2 + y^2 = 4$.

2. Find the constrained maxima and minima of $f(x, y) = xy$ given that $x^2 + 3y^2 = 6$.

3. Find the points on the circle $x^2 + y^2 = 100$ which are closest to and farthest from the point $(2, 3)$.

B

4. Find the constrained maxima and minima of $f(x, y, z) = x + y^2 + 2z$ given that $4x^2 + 9y^2 - 36z^2 = 36$.

5. Find the volume of the largest rectangular parallelepiped that can be inscribed in the ellipsoid

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1 .$$

[13] See BAZARAA, SHERALI and SHETTY.
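The relation between $\lambda$ and the change in the constrained maximum can be checked with a few lines of arithmetic. The sketch below is only an illustration: the class and method names are hypothetical, and it relies on the closed-form solution $x = y = c/4$ of the problem "maximize $xy$ given $2x + 2y = c$", which follows from the Lagrange conditions $y = 2\lambda$, $x = 2\lambda$ used in this section.

```java
// Illustrative sketch (names are hypothetical, not from the text).
final class MultiplierCheck {

    // Constrained maximum of f(x,y) = xy subject to 2x + 2y = c.
    // The Lagrange conditions y = 2*lambda, x = 2*lambda force x = y = c/4.
    static double maxArea(double c) {
        double x = c / 4.0;
        double y = c / 4.0;
        return x * y;
    }

    // Value of the multiplier at the constrained maximum: lambda = x/2 = y/2.
    static double lambda(double c) {
        return (c / 4.0) / 2.0;
    }

    public static void main(String[] args) {
        double deltaF = maxArea(21) - maxArea(20);
        System.out.println("lambda at c = 20 : " + lambda(20));
        System.out.println("f(new) - f(old)  : " + deltaF);
    }
}
```

Running it compares $\lambda = 2.5$ at $c = 20$ with the actual increase $27.5625 - 25 = 2.5625$ in the maximum area.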

**3 Multiple Integrals**

**3.1 Double Integrals**

In single-variable calculus, differentiation and integration are thought of as inverse operations. For instance, to integrate a function $f(x)$ it is necessary to find the antiderivative of $f$, that is, another function $F(x)$ whose derivative is $f(x)$. Is there a similar way of defining integration of real-valued functions of two or more variables? The answer is yes, as we will see shortly. Recall also that the definite integral of a nonnegative function $f(x) \ge 0$ represented the area "under" the curve $y = f(x)$. As we will now see, the double integral of a nonnegative real-valued function $f(x, y) \ge 0$ represents the volume "under" the surface $z = f(x, y)$.

Let $f(x, y)$ be a continuous function such that $f(x, y) \ge 0$ for all $(x, y)$ on the rectangle $R = \{(x, y) : a \le x \le b,\ c \le y \le d\}$ in $\mathbb{R}^2$. We will often write this as $R = [a, b] \times [c, d]$. For any number $x^*$ in the interval $[a, b]$, slice the surface $z = f(x, y)$ with the plane $x = x^*$ parallel to the $yz$-plane. Then the trace of the surface in that plane is the curve $f(x^*, y)$, where $x^*$ is fixed and only $y$ varies. The area $A$ under that curve (i.e. the area of the region between the curve and the $xy$-plane) as $y$ varies over the interval $[c, d]$ then depends only on the value of $x^*$. So using the variable $x$ instead of $x^*$, let $A(x)$ be that area (see Figure 3.1.1).

[Figure 3.1.1: The area $A(x)$ varies with $x$]

Then $A(x) = \int_c^d f(x, y)\,dy$ since we are treating $x$ as fixed, and only $y$ varies. This makes sense since for a fixed $x$ the function $f(x, y)$ is a continuous function of $y$ over the interval $[c, d]$, so we know that the area under the curve is the definite integral.

The area $A(x)$ is a function of $x$, so by the "slice" or cross-section method from single-variable calculus we know that the volume $V$ of the solid under the surface $z = f(x, y)$ but above the $xy$-plane over the rectangle $R$ is the integral over $[a, b]$ of that cross-sectional area $A(x)$:

$$V = \int_a^b A(x)\,dx = \int_a^b \left[ \int_c^d f(x, y)\,dy \right] dx \tag{3.1}$$

We will always refer to this volume as "the volume under the surface". The above expression uses what are called iterated integrals. First the function $f(x, y)$ is integrated as a function of $y$, treating the variable $x$ as a constant (this is called integrating with respect to $y$). That is what occurs in the "inner" integral between the square brackets in equation (3.1). This is the first iterated integral. Once that integration is performed, the result is then an expression involving only $x$, which can then be integrated with respect to $x$. That is what occurs in the "outer" integral above (the second iterated integral). The final result is then a number (the volume). This process of going through two iterations of integrals is called double integration, and the last expression in equation (3.1) is called a double integral.

Notice that integrating $f(x, y)$ with respect to $y$ is the inverse operation of taking the partial derivative of $f(x, y)$ with respect to $y$. Also, we could just as easily have taken the area of cross-sections under the surface which were parallel to the $xz$-plane, which would then depend only on the variable $y$, so that the volume $V$ would be

$$V = \int_c^d \left[ \int_a^b f(x, y)\,dx \right] dy . \tag{3.2}$$

It turns out that in general[1] the order of the iterated integrals does not matter. Also, we will usually discard the brackets and simply write

$$V = \int_c^d \int_a^b f(x, y)\,dx\,dy , \tag{3.3}$$

where it is understood that the fact that $dx$ is written before $dy$ means that the function $f(x, y)$ is first integrated with respect to $x$ using the "inner" limits of integration $a$ and $b$, and then the resulting function is integrated with respect to $y$ using the "outer" limits of integration $c$ and $d$. This order of integration can be changed if it is more convenient.

[1] due to Fubini's Theorem. See Ch. 18 in TAYLOR and MANN.

Example 3.1. Find the volume $V$ under the plane $z = 8x + 6y$ over the rectangle $R = [0, 1] \times [0, 2]$.

Solution: We see that $f(x, y) = 8x + 6y \ge 0$ for $0 \le x \le 1$ and $0 \le y \le 2$, so:

$$V = \int_0^2 \int_0^1 (8x + 6y)\,dx\,dy = \int_0^2 \left( 4x^2 + 6xy \,\Big|_{x=0}^{x=1} \right) dy = \int_0^2 (4 + 6y)\,dy = 4y + 3y^2 \,\Big|_0^2 = 20$$

Suppose we had switched the order of integration. We can verify that we still get the same answer:

$$V = \int_0^1 \int_0^2 (8x + 6y)\,dy\,dx = \int_0^1 \left( 8xy + 3y^2 \,\Big|_{y=0}^{y=2} \right) dx = \int_0^1 (16x + 12)\,dx = 8x^2 + 12x \,\Big|_0^1 = 20$$

Example 3.2. Find the volume $V$ under the surface $z = e^{x+y}$ over the rectangle $R = [2, 3] \times [1, 2]$.

Solution: We know that $f(x, y) = e^{x+y} > 0$ for all $(x, y)$, so

$$V = \int_1^2 \int_2^3 e^{x+y}\,dx\,dy = \int_1^2 \left( e^{x+y} \,\Big|_{x=2}^{x=3} \right) dy = \int_1^2 (e^{y+3} - e^{y+2})\,dy = e^{y+3} - e^{y+2} \,\Big|_1^2 = e^5 - e^4 - (e^4 - e^3) = e^5 - 2e^4 + e^3$$

Recall that for a general function $f(x)$, the integral $\int_a^b f(x)\,dx$ represents the difference of the area below the curve $y = f(x)$ but above the $x$-axis when $f(x) \ge 0$, and the area above the curve but below the $x$-axis when $f(x) \le 0$. Similarly, the double integral of any continuous function $f(x, y)$ represents the difference of the volume below the surface $z = f(x, y)$ but above the $xy$-plane when $f(x, y) \ge 0$, and the volume above the surface but below the $xy$-plane when $f(x, y) \le 0$. Thus, our method of double integration by means of iterated integrals can be used to evaluate the double integral of any continuous function over a rectangle, regardless of whether $f(x, y) \ge 0$ or not.

Example 3.3. Evaluate $\int_0^{2\pi} \int_0^{\pi} \sin(x + y)\,dx\,dy$.

Solution: Note that $f(x, y) = \sin(x + y)$ is both positive and negative over the rectangle $[0, \pi] \times [0, 2\pi]$. We can still evaluate the double integral:

$$\int_0^{2\pi} \int_0^{\pi} \sin(x + y)\,dx\,dy = \int_0^{2\pi} \left( -\cos(x + y) \,\Big|_{x=0}^{x=\pi} \right) dy = \int_0^{2\pi} (-\cos(y + \pi) + \cos y)\,dy$$

$$= -\sin(y + \pi) + \sin y \,\Big|_0^{2\pi} = -\sin 3\pi + \sin 2\pi - (-\sin \pi + \sin 0) = 0$$

Exercises

A

For Exercises 1-4, find the volume under the surface $z = f(x, y)$ over the rectangle $R$.

1. $f(x, y) = 4xy$, $R = [0, 1] \times [0, 1]$

2. $f(x, y) = e^{x+y}$, $R = [0, 1] \times [-1, 1]$

3. $f(x, y) = x^3 + y^2$, $R = [0, 1] \times [0, 1]$

4. $f(x, y) = x^4 + xy + y^3$, $R = [1, 2] \times [0, 2]$

For Exercises 5-12, evaluate the given double integral.

5. $\int_0^1 \int_1^2 (1 - y)x^2\,dx\,dy$

6. $\int_0^1 \int_0^2 x(x + y)\,dx\,dy$

7. $\int_0^2 \int_0^1 (x + 2)\,dx\,dy$

8. $\int_{-1}^2 \int_{-1}^1 x(xy + \sin x)\,dx\,dy$

9. $\int_0^{\pi/2} \int_0^1 xy \cos(x^2 y)\,dx\,dy$

10. $\int_0^{\pi} \int_0^{\pi/2} \sin x \cos(y - \pi)\,dx\,dy$

11. $\int_0^2 \int_1^4 xy\,dx\,dy$

12. $\int_{-1}^1 \int_{-1}^2 1\,dx\,dy$

13. Let $M$ be a constant. Show that $\int_c^d \int_a^b M\,dx\,dy = M(d - c)(b - a)$.
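The iterated integrals of Examples 3.1 to 3.3 can be sanity-checked numerically. The following is a minimal sketch, assuming a midpoint rule with the same number of subintervals in each direction; the class and method names are hypothetical and the approach is only an illustration of "integrate in $x$, then in $y$", not a method used in the text.

```java
import java.util.function.DoubleBinaryOperator;

// Illustrative sketch (not from the text): midpoint-rule approximation of an
// iterated integral over a rectangle [a,b] x [c,d].
final class IteratedIntegral {

    // Integrates f first in x over [a,b] (inner integral), then in y over
    // [c,d] (outer integral), using n midpoint subintervals in each direction.
    static double overRectangle(DoubleBinaryOperator f, double a, double b,
                                double c, double d, int n) {
        double dx = (b - a) / n;
        double dy = (d - c) / n;
        double outer = 0.0;
        for (int j = 0; j < n; j++) {
            double y = c + (j + 0.5) * dy;
            double inner = 0.0;  // approximates the cross-sectional area at this y
            for (int i = 0; i < n; i++) {
                double x = a + (i + 0.5) * dx;
                inner += f.applyAsDouble(x, y) * dx;
            }
            outer += inner * dy;
        }
        return outer;
    }

    static double example31(int n) {
        return overRectangle((x, y) -> 8 * x + 6 * y, 0, 1, 0, 2, n);
    }

    static double example32(int n) {
        return overRectangle((x, y) -> Math.exp(x + y), 2, 3, 1, 2, n);
    }

    static double example33(int n) {
        return overRectangle((x, y) -> Math.sin(x + y), 0, Math.PI, 0, 2 * Math.PI, n);
    }

    public static void main(String[] args) {
        int n = 200;
        System.out.println("Example 3.1: " + example31(n));
        System.out.println("Example 3.2: " + example32(n));
        System.out.println("Example 3.3: " + example33(n));
    }
}
```

The three printed values should be close to $20$, $e^5 - 2e^4 + e^3$ and $0$, respectively; the agreement improves as `n` grows.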

**3.2 Double Integrals Over a General Region**

In the previous section we got an idea of what a double integral over a rectangle represents. We can now define the double integral of a real-valued function $f(x, y)$ over more general regions in $\mathbb{R}^2$.

Suppose that we have a region $R$ in the $xy$-plane that is bounded on the left by the vertical line $x = a$, bounded on the right by the vertical line $x = b$ (where $a < b$), bounded below by a curve $y = g_1(x)$, and bounded above by a curve $y = g_2(x)$, as in Figure 3.2.1(a). We will assume that $g_1(x)$ and $g_2(x)$ do not intersect on the open interval $(a, b)$ (they could intersect at the endpoints $x = a$ and $x = b$, though).

[Figure 3.2.1: Double integral over a nonrectangular region $R$. (a) Vertical slice: $\int_a^b \int_{g_1(x)}^{g_2(x)} f(x, y)\,dy\,dx$. (b) Horizontal slice: $\int_c^d \int_{h_1(y)}^{h_2(y)} f(x, y)\,dx\,dy$]

Then using the slice method from the previous section, the double integral of a real-valued function $f(x, y)$ over the region $R$, denoted by $\iint_R f(x, y)\,dA$, is given by

$$\iint_R f(x, y)\,dA = \int_a^b \left[ \int_{g_1(x)}^{g_2(x)} f(x, y)\,dy \right] dx \tag{3.4}$$

This means that we take vertical slices in the region $R$ between the curves $y = g_1(x)$ and $y = g_2(x)$. The symbol $dA$ is sometimes called an area element or infinitesimal, with the $A$ signifying area. Note that $f(x, y)$ is first integrated with respect to $y$, with functions of $x$ as the limits of integration. This makes sense since the result of the first iterated integral will have to be a function of $x$ alone, which then allows us to take the second iterated integral with respect to $x$.

Similarly, if we have a region $R$ in the $xy$-plane that is bounded on the left by a curve $x = h_1(y)$, bounded on the right by a curve $x = h_2(y)$, bounded below by the horizontal line $y = c$, and bounded above by the horizontal line $y = d$ (where $c < d$), as in Figure 3.2.1(b) (assuming that $h_1(y)$ and $h_2(y)$ do not intersect on the open interval $(c, d)$), then taking horizontal slices gives

$$\iint_R f(x, y)\,dA = \int_c^d \left[ \int_{h_1(y)}^{h_2(y)} f(x, y)\,dx \right] dy \tag{3.5}$$

Notice that these definitions include the case when the region $R$ is a rectangle. Also, if $f(x, y) \ge 0$ for all $(x, y)$ in the region $R$, then $\iint_R f(x, y)\,dA$ is the volume under the surface $z = f(x, y)$ over the region $R$.

Example 3.4. Find the volume $V$ under the plane $z = 8x + 6y$ over the region $R = \{(x, y) : 0 \le x \le 1,\ 0 \le y \le 2x^2\}$.

Solution: The region $R$ is shown in Figure 3.2.2. Using vertical slices we get:

[Figure 3.2.2: the region $R$ under the curve $y = 2x^2$ for $0 \le x \le 1$]

$$V = \iint_R (8x + 6y)\,dA = \int_0^1 \left[ \int_0^{2x^2} (8x + 6y)\,dy \right] dx = \int_0^1 \left( 8xy + 3y^2 \,\Big|_{y=0}^{y=2x^2} \right) dx$$

$$= \int_0^1 (16x^3 + 12x^4)\,dx = 4x^4 + \tfrac{12}{5} x^5 \,\Big|_0^1 = 4 + \tfrac{12}{5} = \tfrac{32}{5} = 6.4$$

We get the same answer using horizontal slices (see Figure 3.2.3):

[Figure 3.2.3: the region $R$ to the right of the curve $x = \sqrt{y/2}$ for $0 \le y \le 2$]

$$V = \iint_R (8x + 6y)\,dA = \int_0^2 \left[ \int_{\sqrt{y/2}}^1 (8x + 6y)\,dx \right] dy = \int_0^2 \left( 4x^2 + 6xy \,\Big|_{x=\sqrt{y/2}}^{x=1} \right) dy$$

$$= \int_0^2 \left( 4 + 6y - \left( 2y + \tfrac{6}{\sqrt{2}}\, y\sqrt{y} \right) \right) dy = \int_0^2 \left( 4 + 4y - 3\sqrt{2}\, y^{3/2} \right) dy$$

$$= 4y + 2y^2 - \tfrac{6\sqrt{2}}{5}\, y^{5/2} \,\Big|_0^2 = 8 + 8 - \tfrac{6\sqrt{2}\sqrt{32}}{5} = 16 - \tfrac{48}{5} = \tfrac{32}{5} = 6.4$$
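Both slicing orders in Example 3.4 can be reproduced numerically. The sketch below is an illustration under stated assumptions: it uses a simple midpoint rule, and the class, the `slices` helper and its parameters are hypothetical names introduced here. The outer variable runs over a fixed interval, and the inner limits are supplied as functions of the outer variable, exactly as in formulas (3.4) and (3.5).

```java
import java.util.function.DoubleBinaryOperator;
import java.util.function.DoubleUnaryOperator;

// Illustrative sketch (not from the text): iterated integral over a region
// whose inner limits depend on the outer variable.
final class GeneralRegionIntegral {

    // The outer variable s runs over [a,b]; for each s the inner variable t
    // runs from lower(s) to upper(s). The integrand is called as f(s, t).
    static double slices(DoubleBinaryOperator f, double a, double b,
                         DoubleUnaryOperator lower, DoubleUnaryOperator upper, int n) {
        double h = (b - a) / n;
        double total = 0.0;
        for (int i = 0; i < n; i++) {
            double s = a + (i + 0.5) * h;
            double lo = lower.applyAsDouble(s);
            double hi = upper.applyAsDouble(s);
            double k = (hi - lo) / n;
            double inner = 0.0;  // integral along one slice
            for (int j = 0; j < n; j++) {
                double t = lo + (j + 0.5) * k;
                inner += f.applyAsDouble(s, t) * k;
            }
            total += inner * h;
        }
        return total;
    }

    // Vertical slices: x in [0,1], y from 0 to 2x^2.
    static double vertical(int n) {
        return slices((x, y) -> 8 * x + 6 * y, 0, 1, x -> 0.0, x -> 2 * x * x, n);
    }

    // Horizontal slices: y in [0,2], x from sqrt(y/2) to 1.
    static double horizontal(int n) {
        return slices((y, x) -> 8 * x + 6 * y, 0, 2, y -> Math.sqrt(y / 2), y -> 1.0, n);
    }

    public static void main(String[] args) {
        System.out.println("vertical slices:   " + vertical(400));
        System.out.println("horizontal slices: " + horizontal(400));
    }
}
```

Both values should be close to $32/5 = 6.4$, which is a quick way to catch a mistake in the limits of integration when switching the order of slicing.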

Example 3.5. Find the volume $V$ of the solid bounded by the three coordinate planes and the plane $2x + y + 4z = 4$.

[Figure 3.2.4: (a) the solid under the plane $2x + y + 4z = 4$, with intercepts $(2, 0, 0)$, $(0, 4, 0)$ and $(0, 0, 1)$; (b) the region $R$ under the line $y = -2x + 4$]

Solution: The solid is shown in Figure 3.2.4(a) with a typical vertical slice. The volume $V$ is given by $\iint_R f(x, y)\,dA$, where $f(x, y) = z = \frac{1}{4}(4 - 2x - y)$ and the region $R$, shown in Figure 3.2.4(b), is $R = \{(x, y) : 0 \le x \le 2,\ 0 \le y \le -2x + 4\}$. Using vertical slices in $R$ gives

$$V = \iint_R \tfrac{1}{4}(4 - 2x - y)\,dA = \int_0^2 \left[ \int_0^{-2x+4} \tfrac{1}{4}(4 - 2x - y)\,dy \right] dx = \int_0^2 \left( -\tfrac{1}{8}(4 - 2x - y)^2 \,\Big|_{y=0}^{y=-2x+4} \right) dx$$

$$= \int_0^2 \tfrac{1}{8}(4 - 2x)^2\,dx = -\tfrac{1}{48}(4 - 2x)^3 \,\Big|_0^2 = \tfrac{64}{48} = \tfrac{4}{3}$$

For a general region $R$, which may not be one of the types of regions we have considered so far, the double integral $\iint_R f(x, y)\,dA$ is defined as follows. Assume that $f(x, y)$ is a nonnegative real-valued function and that $R$ is a bounded region in $\mathbb{R}^2$, so it can be enclosed in some rectangle $[a, b] \times [c, d]$. Then divide that rectangle into a grid of subrectangles. Only consider the subrectangles that are enclosed completely within the region $R$, as shown by the shaded subrectangles in Figure 3.2.5(a). In any such subrectangle $[x_i, x_{i+1}] \times [y_j, y_{j+1}]$, pick a point $(x_{i*}, y_{j*})$. Then the volume under the surface $z = f(x, y)$ over that subrectangle is approximately $f(x_{i*}, y_{j*})\,\Delta x_i\,\Delta y_j$, where $\Delta x_i = x_{i+1} - x_i$, $\Delta y_j = y_{j+1} - y_j$, and $f(x_{i*}, y_{j*})$ is the height and $\Delta x_i\,\Delta y_j$ is the base area of a parallelepiped, as shown in Figure 3.2.5(b). Then the total volume under the surface is approximately the sum of the volumes of all such parallelepipeds, namely

$$\sum_j \sum_i f(x_{i*}, y_{j*})\,\Delta x_i\,\Delta y_j , \tag{3.6}$$

where the summation occurs over the indices of the subrectangles inside $R$. If we take smaller and smaller subrectangles, so that the length of the largest diagonal of the subrectangles goes to 0, then the subrectangles begin to fill more and more of the region $R$, and so the above sum approaches the actual volume under the surface $z = f(x, y)$ over the region $R$. We then define $\iint_R f(x, y)\,dA$ as the limit of that double summation (the limit is taken over all subdivisions of the rectangle $[a, b] \times [c, d]$ as the largest diagonal of the subrectangles goes to 0).

[Figure 3.2.5: Double integral over a general region $R$. (a) Subrectangles inside the region $R$. (b) Parallelepiped over a subrectangle, with volume $f(x_{i*}, y_{j*})\,\Delta x_i\,\Delta y_j$]

A similar definition can be made for a function $f(x, y)$ that is not necessarily always nonnegative: just replace each mention of volume by the negative volume in the description above when $f(x, y) < 0$. In the case of a region of the type shown in Figure 3.2.1, using the definition of the Riemann integral from single-variable calculus, our definition of $\iint_R f(x, y)\,dA$ reduces to a sequence of two iterated integrals.

Finally, the region $R$ does not have to be bounded. We can evaluate improper double integrals (i.e. over an unbounded region, or over a region which contains points where the function $f(x, y)$ is not defined) as a sequence of iterated improper single-variable integrals.

Example 3.6. Evaluate $\int_1^{\infty} \int_0^{1/x^2} 2y\,dy\,dx$.

Solution:

$$\int_1^{\infty} \int_0^{1/x^2} 2y\,dy\,dx = \int_1^{\infty} \left( y^2 \,\Big|_{y=0}^{y=1/x^2} \right) dx = \int_1^{\infty} x^{-4}\,dx = -\tfrac{1}{3} x^{-3} \,\Big|_1^{\infty} = 0 - \left( -\tfrac{1}{3} \right) = \tfrac{1}{3}$$

Exercises

A

For Exercises 1-6, evaluate the given double integral.

1. $\int_0^1 \int_{\sqrt{x}}^1 24x^2 y\,dy\,dx$

2. $\int_0^{\pi} \int_0^y \sin x\,dx\,dy$

3. $\int_1^2 \int_0^{\ln x} 4x\,dy\,dx$

4. $\int_0^2 \int_0^{2y} e^{y^2}\,dx\,dy$

5. $\int_0^{\pi/2} \int_0^y \cos x \sin y\,dx\,dy$

6. $\int_0^{\infty} \int_0^{\infty} xy\,e^{-(x^2+y^2)}\,dx\,dy$

7. $\int_0^2 \int_0^y 1\,dx\,dy$

8. $\int_0^1 \int_0^{x^2} 2\,dy\,dx$

9. Find the volume $V$ of the solid bounded by the three coordinate planes and the plane $x + y + z = 1$.

10. Find the volume $V$ of the solid bounded by the three coordinate planes and the plane $3x + 2y + 5z = 6$.

B

11. Explain why the double integral $\iint_R 1\,dA$ gives the area of the region $R$. For simplicity, you can assume that $R$ is a region of the type shown in Figure 3.2.1(a).

C

12. Prove that the volume of a tetrahedron with mutually perpendicular adjacent sides of lengths $a$, $b$, and $c$, as in Figure 3.2.6, is $\frac{abc}{6}$. (Hint: Mimic Example 3.5, and recall from Section 1.5 how three noncollinear points determine a plane.)

[Figure 3.2.6: tetrahedron with mutually perpendicular sides of lengths $a$, $b$, $c$]

13. Show how Exercise 12 can be used to solve Exercise 10.

**3.3 Triple Integrals**

Our definition of a double integral of a real-valued function $f(x, y)$ over a region $R$ in $\mathbb{R}^2$ can be extended to define a triple integral of a real-valued function $f(x, y, z)$ over a solid $S$ in $\mathbb{R}^3$. We simply proceed as before: the solid $S$ can be enclosed in some rectangular parallelepiped, which is then divided into subparallelepipeds. In each subparallelepiped inside $S$, with sides of lengths $\Delta x$, $\Delta y$ and $\Delta z$, pick a point $(x_*, y_*, z_*)$. Then define the triple integral of $f(x, y, z)$ over $S$, denoted by $\iiint_S f(x, y, z)\,dV$, by

$$\iiint_S f(x, y, z)\,dV = \lim \sum \sum \sum f(x_*, y_*, z_*)\,\Delta x\,\Delta y\,\Delta z , \tag{3.7}$$

where the limit is over all divisions of the rectangular parallelepiped enclosing $S$ into subparallelepipeds whose largest diagonal is going to 0, and the triple summation is over all the subparallelepipeds inside $S$. It can be shown that this limit does not depend on the choice of the rectangular parallelepiped enclosing $S$. The symbol $dV$ is often called the volume element.

Physically, what does the triple integral represent? We saw that a double integral could be thought of as the volume under a two-dimensional surface. It turns out that the triple integral simply generalizes this idea: it can be thought of as representing the hypervolume under a three-dimensional hypersurface $w = f(x, y, z)$ whose graph lies in $\mathbb{R}^4$. In general, the word "volume" is often used as a general term to signify the same concept for any $n$-dimensional object (e.g. length in $\mathbb{R}^1$, area in $\mathbb{R}^2$). It may be hard to get a grasp on the concept of the "volume" of a four-dimensional object, but at least we now know how to calculate that volume!

In the case where $S$ is a rectangular parallelepiped $[x_1, x_2] \times [y_1, y_2] \times [z_1, z_2]$, that is, $S = \{(x, y, z) : x_1 \le x \le x_2,\ y_1 \le y \le y_2,\ z_1 \le z \le z_2\}$, the triple integral is a sequence of three iterated integrals, namely

$$\iiint_S f(x, y, z)\,dV = \int_{z_1}^{z_2} \int_{y_1}^{y_2} \int_{x_1}^{x_2} f(x, y, z)\,dx\,dy\,dz , \tag{3.8}$$

where the order of integration does not matter. This is the simplest case.

A more complicated case is where $S$ is a solid which is bounded below by a surface $z = g_1(x, y)$, bounded above by a surface $z = g_2(x, y)$, $y$ is bounded between two curves $h_1(x)$ and $h_2(x)$, and $x$ varies between $a$ and $b$. Then

$$\iiint_S f(x, y, z)\,dV = \int_a^b \int_{h_1(x)}^{h_2(x)} \int_{g_1(x,y)}^{g_2(x,y)} f(x, y, z)\,dz\,dy\,dx . \tag{3.9}$$

Notice in this case that the first iterated integral will result in a function of $x$ and $y$ (since its limits of integration are functions of $x$ and $y$), which then leaves you with a double integral of a type that we learned how to evaluate in Section 3.2. There are, of course, many variations on this case (for example, changing the roles of the variables $x$, $y$, $z$), so as you can probably tell, triple integrals can be quite tricky. At this point, just learning how to evaluate a triple integral, regardless of what it represents, is the most important thing. We will see some other ways in which triple integrals are used later in the text.

Example 3.7. Evaluate $\int_0^3 \int_0^2 \int_0^1 (xy + z)\,dx\,dy\,dz$.

Solution:

$$\int_0^3 \int_0^2 \int_0^1 (xy + z)\,dx\,dy\,dz = \int_0^3 \int_0^2 \left( \tfrac{1}{2} x^2 y + xz \,\Big|_{x=0}^{x=1} \right) dy\,dz = \int_0^3 \int_0^2 \left( \tfrac{1}{2} y + z \right) dy\,dz$$

$$= \int_0^3 \left( \tfrac{1}{4} y^2 + yz \,\Big|_{y=0}^{y=2} \right) dz = \int_0^3 (1 + 2z)\,dz = z + z^2 \,\Big|_0^3 = 12$$

Example 3.8. Evaluate $\int_0^1 \int_0^{1-x} \int_0^{2-x-y} (x + y + z)\,dz\,dy\,dx$.

Solution:

$$\int_0^1 \int_0^{1-x} \int_0^{2-x-y} (x + y + z)\,dz\,dy\,dx = \int_0^1 \int_0^{1-x} \left( (x + y)z + \tfrac{1}{2} z^2 \,\Big|_{z=0}^{z=2-x-y} \right) dy\,dx$$

$$= \int_0^1 \int_0^{1-x} \left( (x + y)(2 - x - y) + \tfrac{1}{2}(2 - x - y)^2 \right) dy\,dx = \int_0^1 \int_0^{1-x} \left( 2 - \tfrac{1}{2} x^2 - xy - \tfrac{1}{2} y^2 \right) dy\,dx$$

$$= \int_0^1 \left( 2y - \tfrac{1}{2} x^2 y - \tfrac{1}{2} x y^2 - \tfrac{1}{6} y^3 \,\Big|_{y=0}^{y=1-x} \right) dx = \int_0^1 \left( \tfrac{11}{6} - 2x + \tfrac{1}{6} x^3 \right) dx = \tfrac{11}{6} x - x^2 + \tfrac{1}{24} x^4 \,\Big|_0^1 = \tfrac{7}{8}$$
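Formula (3.9) translates directly into three nested loops, which gives a way to check Examples 3.7 and 3.8 numerically. This is a rough sketch under stated assumptions: it uses the midpoint rule in each variable, and the class, the `Integrand` interface and the method names are hypothetical, introduced only for this illustration.

```java
import java.util.function.DoubleBinaryOperator;
import java.util.function.DoubleUnaryOperator;

// Illustrative sketch (not from the text): midpoint-rule version of the
// iterated integral in formula (3.9).
final class TripleIntegral {

    // Hypothetical helper type for a function of three variables.
    interface Integrand {
        double at(double x, double y, double z);
    }

    // x in [a,b], y between h1(x) and h2(x), z between g1(x,y) and g2(x,y).
    static double integrate(Integrand f, double a, double b,
                            DoubleUnaryOperator h1, DoubleUnaryOperator h2,
                            DoubleBinaryOperator g1, DoubleBinaryOperator g2, int n) {
        double dx = (b - a) / n;
        double total = 0.0;
        for (int i = 0; i < n; i++) {
            double x = a + (i + 0.5) * dx;
            double yLo = h1.applyAsDouble(x);
            double yHi = h2.applyAsDouble(x);
            double dy = (yHi - yLo) / n;
            double overY = 0.0;
            for (int j = 0; j < n; j++) {
                double y = yLo + (j + 0.5) * dy;
                double zLo = g1.applyAsDouble(x, y);
                double zHi = g2.applyAsDouble(x, y);
                double dz = (zHi - zLo) / n;
                double overZ = 0.0;  // innermost integral, in z
                for (int k = 0; k < n; k++) {
                    double z = zLo + (k + 0.5) * dz;
                    overZ += f.at(x, y, z) * dz;
                }
                overY += overZ * dy;
            }
            total += overY * dx;
        }
        return total;
    }

    // Example 3.7: a box, so all limits are constants. The value does not
    // depend on the order of integration, so x is taken as the outer variable.
    static double example37(int n) {
        return integrate((x, y, z) -> x * y + z, 0, 1,
                x -> 0.0, x -> 2.0, (x, y) -> 0.0, (x, y) -> 3.0, n);
    }

    // Example 3.8: limits depend on the outer variables.
    static double example38(int n) {
        return integrate((x, y, z) -> x + y + z, 0, 1,
                x -> 0.0, x -> 1 - x, (x, y) -> 0.0, (x, y) -> 2 - x - y, n);
    }

    public static void main(String[] args) {
        System.out.println("Example 3.7: " + example37(60));
        System.out.println("Example 3.8: " + example38(60));
    }
}
```

The printed values should be close to $12$ and $7/8 = 0.875$. Replacing the integrand by the constant 1 would approximate the volume of the solid itself.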


Note that the volume $V$ of a solid $S$ in $\mathbb{R}^3$ is given by

$$V = \iiint_S 1\,dV . \tag{3.10}$$

Since the function being integrated is the constant 1, then the above triple integral reduces to a double integral of the types that we considered in the previous section if the solid is bounded above by some surface $z = f(x, y)$ and bounded below by the $xy$-plane $z = 0$. There are many other possibilities. For example, the solid could be bounded below and above by surfaces $z = g_1(x, y)$ and $z = g_2(x, y)$, respectively, with $y$ bounded between two curves $h_1(x)$ and $h_2(x)$, and $x$ varying between $a$ and $b$. Then

$$V = \iiint_S 1\,dV = \int_a^b \int_{h_1(x)}^{h_2(x)} \int_{g_1(x,y)}^{g_2(x,y)} 1\,dz\,dy\,dx = \int_a^b \int_{h_1(x)}^{h_2(x)} \left( g_2(x, y) - g_1(x, y) \right) dy\,dx$$

just like in equation (3.9). See Exercise 10 for an example.

Exercises

A

For Exercises 1-8, evaluate the given triple integral.

1. $\int_0^3 \int_0^2 \int_0^1 xyz\,dx\,dy\,dz$

2. $\int_0^1 \int_0^x \int_0^y xyz\,dz\,dy\,dx$

3. $\int_0^{\pi} \int_0^x \int_0^{xy} x^2 \sin z\,dz\,dy\,dx$

4. $\int_0^1 \int_0^z \int_0^y z e^{y^2}\,dx\,dy\,dz$

5. $\int_1^e \int_0^y \int_0^{1/y} x^2 z\,dx\,dz\,dy$

6. $\int_1^2 \int_0^{y^2} \int_0^{z^2} yz\,dx\,dz\,dy$

7. $\int_1^2 \int_2^4 \int_0^3 1\,dx\,dy\,dz$

8. $\int_0^1 \int_0^{1-x} \int_0^{1-x-y} 1\,dz\,dy\,dx$

9. Let $M$ be a constant. Show that $\int_{z_1}^{z_2} \int_{y_1}^{y_2} \int_{x_1}^{x_2} M\,dx\,dy\,dz = M(z_2 - z_1)(y_2 - y_1)(x_2 - x_1)$.

B

10. Find the volume $V$ of the solid $S$ bounded by the three coordinate planes, bounded above by the plane $x + y + z = 2$, and bounded below by the plane $z = x + y$.

C

11. Show that $\int_a^b \int_a^z \int_a^y f(x)\,dx\,dy\,dz = \int_a^b \frac{(b - x)^2}{2} f(x)\,dx$. (Hint: Think of how changing the order of integration in the triple integral changes the limits of integration.)

**3.4 Numerical Approximation of Multiple Integrals**

As you have seen, calculating multiple integrals is tricky even for simple functions and regions. For complicated functions, it may not be possible to evaluate one of the iterated integrals in a simple closed form. Luckily there are numerical methods for approximating the value of a multiple integral. The method we will discuss is called the Monte Carlo method. The idea behind it is based on the concept of the average value of a function, which you learned in single-variable calculus. Recall that for a continuous function $f(x)$, the average value $\bar{f}$ of $f$ over an interval $[a, b]$ is defined as

$$\bar{f} = \frac{1}{b - a} \int_a^b f(x)\,dx . \tag{3.11}$$

The quantity $b - a$ is the length of the interval $[a, b]$, which can be thought of as the "volume" of the interval. Applying the same reasoning to functions of two or three variables, we define the average value of $f(x, y)$ over a region $R$ to be

$$\bar{f} = \frac{1}{A(R)} \iint_R f(x, y)\,dA , \tag{3.12}$$

where $A(R)$ is the area of the region $R$, and we define the average value of $f(x, y, z)$ over a solid $S$ to be

$$\bar{f} = \frac{1}{V(S)} \iiint_S f(x, y, z)\,dV , \tag{3.13}$$

where $V(S)$ is the volume of the solid $S$. Thus, for example, we have

$$\iint_R f(x, y)\,dA = A(R)\,\bar{f} . \tag{3.14}$$

The average value of $f(x, y)$ over $R$ can be thought of as representing the sum of all the values of $f$ divided by the number of points in $R$. Unfortunately there are an infinite number (in fact, uncountably many) points in any region, i.e. they cannot be listed in a discrete sequence. But what if we took a very large number $N$ of random points in the region $R$ (which can be generated by a computer) and then took the average of the values of $f$ for those points, and used that average as the value of $\bar{f}$? This is exactly what the Monte Carlo method does. So in formula (3.14) the approximation we get is

$$\iint_R f(x, y)\,dA \approx A(R)\,\bar{f} \pm A(R) \sqrt{\frac{\overline{f^2} - (\bar{f})^2}{N}} , \tag{3.15}$$

where

$$\bar{f} = \frac{\sum_{i=1}^{N} f(x_i, y_i)}{N} \quad\text{and}\quad \overline{f^2} = \frac{\sum_{i=1}^{N} (f(x_i, y_i))^2}{N} , \tag{3.16}$$

with the sums taken over the $N$ random points $(x_1, y_1), \ldots, (x_N, y_N)$. The ± "error term" in formula (3.15) does not really provide hard bounds on the approximation. It represents a single standard deviation from the expected value of the integral. That is, it provides a likely bound on the error. Due to its use of random points, the Monte Carlo method is an example of a probabilistic method (as opposed to deterministic methods such as Newton's method, which use a specific formula for generating points).

For example, we can use formula (3.15) to approximate the volume $V$ under the plane $z = 8x + 6y$ over the rectangle $R = [0, 1] \times [0, 2]$. In Example 3.1 in Section 3.1, we showed that the actual volume is 20. Below is a code listing (montecarlo.java) for a Java program that calculates the volume, using a number of points $N$ that is passed on the command line as a parameter.

//Program to approximate the double integral of f(x,y)=8x+6y
//over the rectangle [0,1]x[0,2].
public class montecarlo {
   public static void main(String[] args) {
      //Get the number N of random points as a command-line parameter
      int N = Integer.parseInt(args[0]);
      double x = 0;  //x-coordinate of a random point
      double y = 0;  //y-coordinate of a random point
      double f = 0.0;  //Value of f at a random point
      double mf = 0.0;  //Mean of the values of f
      double mf2 = 0.0;  //Mean of the values of f^2
      for (int i=0; i<N; i++) {  //Get the random coordinates
         x = Math.random();  //x is between 0 and 1
         y = 2 * Math.random();  //y is between 0 and 2
         f = 8*x + 6*y;  //Value of the function
         mf = mf + f;  //Add to the sum of the f values
         mf2 = mf2 + f*f;  //Add to the sum of the f^2 values
      }
      mf = mf/N;  //Compute the mean of the f values
      mf2 = mf2/N;  //Compute the mean of the f^2 values
      System.out.println("N = " + N + ": integral = " + vol()*mf + " +/- "
         + vol()*Math.sqrt((mf2 - Math.pow(mf,2))/N));  //Print the result
   }
   //The volume of the rectangle [0,1]x[0,2]
   public static double vol() {
      return 1*2;
   }
}

Listing 3.1 Program listing for montecarlo.java

The results of running this program with various numbers of random points (e.g. java montecarlo 100) are shown below:

N = 10: 19.36543087722646 +/- 2.7346060413546147
N = 100: 21.334419561385353 +/- 0.7547037194998519
N = 1000: 19.807662237526227 +/- 0.26701709691370235
N = 10000: 20.080975812043256 +/- 0.08378816229769506
N = 100000: 20.009403854556716 +/- 0.026346782289498317
N = 1000000: 20.000866994982314 +/- 0.008321168748642816

As you can see, the approximation is fairly good. As $N \to \infty$, it can be shown that the Monte Carlo approximation converges to the actual volume, with the error shrinking on the order of $O(1/\sqrt{N})$, in computational complexity terminology.

In the above example the region $R$ was a rectangle. To use the Monte Carlo method for a nonrectangular (bounded) region $R$, only a slight modification is needed. Pick a rectangle $\tilde{R}$ that encloses $R$, and generate random points in that rectangle as before. Then use those points in the calculation of $\bar{f}$ only if they are inside $R$. There is no need to calculate the area of $R$ for formula (3.15) in this case, since the exclusion of points not inside $R$ allows you to use the area of the rectangle $\tilde{R}$ instead, similar to before.

For instance, in Example 3.4 we showed that the volume under the surface $z = 8x + 6y$ over the nonrectangular region $R = \{(x, y) : 0 \le x \le 1,\ 0 \le y \le 2x^2\}$ is 6.4. Since the rectangle $\tilde{R} = [0, 1] \times [0, 2]$ contains $R$, we can use the same program as before, with the only change being a check to see if $y < 2x^2$ for a random point $(x, y)$ in $[0, 1] \times [0, 2]$. Listing 3.2 below contains the code (montecarlo2.java):

//Program to approximate the double integral of f(x,y)=8x+6y over the
//region bounded by x=0, x=1, y=0, and y=2x^2
public class montecarlo2 {
   public static void main(String[] args) {
      //Get the number N of random points as a command-line parameter
      int N = Integer.parseInt(args[0]);
      double x = 0;  //x-coordinate of a random point
      double y = 0;  //y-coordinate of a random point
      double f = 0.0;  //Value of f at a random point
      double mf = 0.0;  //Mean of the values of f
      double mf2 = 0.0;  //Mean of the values of f^2
      for (int i=0; i<N; i++) {  //Get the random coordinates
         x = Math.random();  //x is between 0 and 1
         y = 2 * Math.random();  //y is between 0 and 2
         if (y < 2*Math.pow(x,2)) {  //The point is in the region
            f = 8*x + 6*y;  //Value of the function
            mf = mf + f;  //Add to the sum of the f values
            mf2 = mf2 + f*f;  //Add to the sum of the f^2 values
         }
      }
      mf = mf/N;  //Compute the mean of the f values
      mf2 = mf2/N;  //Compute the mean of the f^2 values
      System.out.println("N = " + N + ": integral = " + vol()*mf +
         " +/- " + vol()*Math.sqrt((mf2 - Math.pow(mf,2))/N));
   }
   //The volume of the rectangle [0,1]x[0,2]
   public static double vol() {
      return 1*2;
   }
}

Listing 3.2 Program listing for montecarlo2.java

The results of running the program with various numbers of random points (e.g. java montecarlo2 1000) are shown below:

N = 10: integral = 6.95747529014894 +/- 2.9185131565120592
N = 100: integral = 6.3149056229650355 +/- 0.9549009662159909
N = 1000: integral = 6.477032813858756 +/- 0.31916837260973624
N = 10000: integral = 6.349975080015089 +/- 0.10040086346895105
N = 100000: integral = 6.440184132811864 +/- 0.03200476870881392
N = 1000000: integral = 6.417050897922222 +/- 0.01009454409789472

To use the Monte Carlo method to evaluate triple integrals, you will need to generate random triples $(x, y, z)$ in a parallelepiped, instead of random pairs $(x, y)$ in a rectangle, and use the volume of the parallelepiped instead of the area of a rectangle in formula (3.15) (see Exercise 2). For a more detailed discussion of numerical integration methods, see PRESS et al.

Exercises

C

1. Write a program that uses the Monte Carlo method to approximate the double integral $\iint_R e^{xy}\,dA$, where $R = [0, 1] \times [0, 1]$. Show the program output for $N = 10$, 100, 1000, 10000, 100000 and 1000000 random points.

2. Write a program that uses the Monte Carlo method to approximate the triple integral $\iiint_S e^{xyz}\,dV$, where $S = [0, 1] \times [0, 1] \times [0, 1]$. Show the program output for $N = 10$, 100, 1000, 10000, 100000 and 1000000 random points.

3. Repeat Exercise 1 with the region $R = \{(x, y) : -1 \le x \le 1,\ 0 \le y \le x^2\}$.

4. Repeat Exercise 2 with the solid $S = \{(x, y, z) : 0 \le x \le 1,\ 0 \le y \le 1,\ 0 \le z \le 1 - x - y\}$.

5. Use the Monte Carlo method to approximate the volume of a sphere of radius 1.

6. Use the Monte Carlo method to approximate the volume of the ellipsoid $\frac{x^2}{9} + \frac{y^2}{4} + \frac{z^2}{1} = 1$.
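The remark about triple integrals can be illustrated on an integral whose value is already known from Example 3.8, namely the integral of $x + y + z$ over the solid $0 \le x \le 1$, $0 \le y \le 1 - x$, $0 \le z \le 2 - x - y$, which equals $7/8$. The sketch below is a hedged illustration, not one of the text's listings: the class name is hypothetical, the enclosing box $[0, 1] \times [0, 1] \times [0, 2]$ is a choice made here, and a seeded `java.util.Random` is used instead of `Math.random()` so that runs are repeatable.

```java
import java.util.Random;

// Illustrative sketch (not from the text): Monte Carlo estimate of the triple
// integral of f(x,y,z) = x + y + z over the solid of Example 3.8.
final class MonteCarloTriple {

    // Returns {estimate, one-standard-deviation error term}, following the
    // pattern of formula (3.15) with the box volume in place of the area.
    static double[] estimate(int N, long seed) {
        Random rng = new Random(seed);
        double boxVolume = 1.0 * 1.0 * 2.0;  // volume of [0,1] x [0,1] x [0,2]
        double mf = 0.0;   // running sum, then mean, of the f values
        double mf2 = 0.0;  // running sum, then mean, of the f^2 values
        for (int i = 0; i < N; i++) {
            double x = rng.nextDouble();        // x is between 0 and 1
            double y = rng.nextDouble();        // y is between 0 and 1
            double z = 2.0 * rng.nextDouble();  // z is between 0 and 2
            if (y <= 1 - x && z <= 2 - x - y) { // the point is inside the solid
                double f = x + y + z;
                mf += f;
                mf2 += f * f;
            }
        }
        mf /= N;   // points outside the solid contribute 0 to both means
        mf2 /= N;
        double value = boxVolume * mf;
        double error = boxVolume * Math.sqrt((mf2 - mf * mf) / N);
        return new double[] { value, error };
    }

    public static void main(String[] args) {
        for (int N = 10; N <= 1000000; N *= 10) {
            double[] r = estimate(N, 12345L);
            System.out.println("N = " + N + ": integral = " + r[0] + " +/- " + r[1]);
        }
    }
}
```

As with the double-integral listings, the estimates should settle near $0.875$ as $N$ grows, with the error term shrinking roughly by a factor of $\sqrt{10}$ for each tenfold increase in $N$.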

**3.5 Change of Variables in Multiple Integrals**

Given the difficulty of evaluating multiple integrals, the reader may be wondering if it is possible to simplify those integrals using a suitable substitution for the variables. The answer is yes, though it is a bit more complicated than the substitution method which you learned in single-variable calculus.

Recall that if you are given, for example, the definite integral

$$\int_1^2 x^3 \sqrt{x^2 - 1}\,dx ,$$

then you would make the substitution

$$u = x^2 - 1 \quad\Rightarrow\quad x^2 = u + 1 , \qquad du = 2x\,dx$$

which changes the limits of integration

$$x = 1 \Rightarrow u = 0 , \qquad x = 2 \Rightarrow u = 3$$

so that we get

$$\int_1^2 x^3 \sqrt{x^2 - 1}\,dx = \int_1^2 \tfrac{1}{2} x^2 \cdot 2x \sqrt{x^2 - 1}\,dx = \int_0^3 \tfrac{1}{2}(u + 1)\sqrt{u}\,du = \tfrac{1}{2} \int_0^3 \left( u^{3/2} + u^{1/2} \right) du ,$$

which can be easily integrated to give $\frac{14\sqrt{3}}{5}$.

Let us take a different look at what happened when we did that substitution, which will give some motivation for how substitution works in multiple integrals. First, we let $u = x^2 - 1$. On the interval of integration $[1, 2]$, the function $x \mapsto x^2 - 1$ is strictly increasing (and maps $[1, 2]$ onto $[0, 3]$) and hence has an inverse function (defined on the interval $[0, 3]$). That is, on $[0, 3]$ we can define $x$ as a function of $u$, namely

$$x = g(u) = \sqrt{u + 1} .$$

Then substituting that expression for $x$ into the function $f(x) = x^3 \sqrt{x^2 - 1}$ gives

$$f(x) = f(g(u)) = (u + 1)^{3/2} \sqrt{u} ,$$

and we see that

$$\frac{dx}{du} = g'(u) \quad\Rightarrow\quad dx = g'(u)\,du , \qquad dx = \tfrac{1}{2}(u + 1)^{-1/2}\,du ,$$

so since

$$g(0) = 1 \Rightarrow 0 = g^{-1}(1) , \qquad g(3) = 2 \Rightarrow 3 = g^{-1}(2)$$

then performing the substitution as we did earlier gives

$$\int_1^2 f(x)\,dx = \int_1^2 x^3 \sqrt{x^2 - 1}\,dx = \int_0^3 \tfrac{1}{2}(u + 1)\sqrt{u}\,du ,$$

which can be written as

$$= \int_0^3 (u + 1)^{3/2} \sqrt{u} \cdot \tfrac{1}{2}(u + 1)^{-1/2}\,du ,$$

which means

$$\int_1^2 f(x)\,dx = \int_{g^{-1}(1)}^{g^{-1}(2)} f(g(u))\,g'(u)\,du .$$

In general, if $x = g(u)$ is a one-to-one, differentiable function from an interval $[c, d]$ (which you can think of as being on the "$u$-axis") onto an interval $[a, b]$ (on the $x$-axis), which means that $g'(u) \ne 0$ on the interval $(c, d)$, so that $a = g(c)$ and $b = g(d)$, then $c = g^{-1}(a)$ and $d = g^{-1}(b)$, and

$$\int_a^b f(x)\,dx = \int_{g^{-1}(a)}^{g^{-1}(b)} f(g(u))\,g'(u)\,du . \tag{3.17}$$

This is called the change of variable formula for integrals of single-variable functions, and it is what you were implicitly using when doing integration by substitution. This formula turns out to be a special case of a more general formula which can be used to evaluate multiple integrals. We will state the formulas for double and triple integrals involving real-valued functions of two and three variables, respectively. We will assume that all the functions involved are continuously differentiable and that the regions and solids involved all have "reasonable" boundaries. The proof of the following theorem is beyond the scope of the text.[2]

[2] See TAYLOR and MANN, § 15.32 and § 15.62 for all the details.

Theorem 3.1. Change of Variables Formula for Multiple Integrals

Let $x = x(u, v)$ and $y = y(u, v)$ define a one-to-one mapping of a region $R'$ in the uv-plane onto a region $R$ in the xy-plane such that the determinant
\[ J(u, v) = \begin{vmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial x}{\partial v} \\[2ex] \dfrac{\partial y}{\partial u} & \dfrac{\partial y}{\partial v} \end{vmatrix} \tag{3.18} \]
is never 0 in $R'$. Then
\[ \iint_R f(x, y)\, dA(x, y) = \iint_{R'} f(x(u, v), y(u, v))\, |J(u, v)|\, dA(u, v) . \tag{3.19} \]
We use the notation $dA(x, y)$ and $dA(u, v)$ to denote the area element in the $(x, y)$ and $(u, v)$ coordinates, respectively.

Similarly, if $x = x(u, v, w)$, $y = y(u, v, w)$ and $z = z(u, v, w)$ define a one-to-one mapping of a solid $S'$ in uvw-space onto a solid $S$ in xyz-space such that the determinant
\[ J(u, v, w) = \begin{vmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial x}{\partial v} & \dfrac{\partial x}{\partial w} \\[2ex] \dfrac{\partial y}{\partial u} & \dfrac{\partial y}{\partial v} & \dfrac{\partial y}{\partial w} \\[2ex] \dfrac{\partial z}{\partial u} & \dfrac{\partial z}{\partial v} & \dfrac{\partial z}{\partial w} \end{vmatrix} \tag{3.20} \]
is never 0 in $S'$, then
\[ \iiint_S f(x, y, z)\, dV(x, y, z) = \iiint_{S'} f(x(u, v, w), y(u, v, w), z(u, v, w))\, |J(u, v, w)|\, dV(u, v, w) . \tag{3.21} \]

The determinant $J(u, v)$ in formula (3.18) is called the Jacobian of $x$ and $y$ with respect to $u$ and $v$, and is sometimes written as
\[ J(u, v) = \frac{\partial(x, y)}{\partial(u, v)} . \tag{3.22} \]
Similarly, the Jacobian $J(u, v, w)$ of three variables is sometimes written as
\[ J(u, v, w) = \frac{\partial(x, y, z)}{\partial(u, v, w)} . \tag{3.23} \]
Notice that formula (3.19) is saying that $dA(x, y) = |J(u, v)|\, dA(u, v)$, which you can think of as a two-variable version of the relation $dx = g'(u)\, du$ in the single-variable case.

The following example shows how the change of variables formula is used.

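Example 3.9. Evaluate $\iint_R e^{\frac{x-y}{x+y}}\, dA$, where $R = \{(x, y) : x \ge 0,\ y \ge 0,\ x + y \le 1\}$.

Solution: First, note that evaluating this double integral without using substitution is probably impossible, at least in a closed form. By looking at the numerator and denominator of the exponent of $e$, we will try the substitution $u = x - y$ and $v = x + y$. To use the change of variables formula (3.19), we need to write both $x$ and $y$ in terms of $u$ and $v$. So solving for $x$ and $y$ gives $x = \frac{1}{2}(u + v)$ and $y = \frac{1}{2}(v - u)$. In Figure 3.5.1 below, we see how the mapping $x = x(u, v) = \frac{1}{2}(u + v)$, $y = y(u, v) = \frac{1}{2}(v - u)$ maps the region $R'$ onto $R$ in a one-to-one manner.

[Figure 3.5.1: The regions $R$ and $R'$. $R$ is the triangle in the xy-plane bounded by the coordinate axes and the line $x + y = 1$; $R'$ is the triangle in the uv-plane bounded by the lines $u = -v$, $u = v$ and $v = 1$.]

Now we see that
\[ J(u, v) = \begin{vmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial x}{\partial v} \\[2ex] \dfrac{\partial y}{\partial u} & \dfrac{\partial y}{\partial v} \end{vmatrix} = \begin{vmatrix} \tfrac{1}{2} & \tfrac{1}{2} \\[1ex] -\tfrac{1}{2} & \tfrac{1}{2} \end{vmatrix} = \frac{1}{2} \;\Rightarrow\; |J(u, v)| = \left| \frac{1}{2} \right| = \frac{1}{2} , \]
so using horizontal slices in $R'$, we have
\begin{align*}
\iint_R e^{\frac{x-y}{x+y}}\, dA &= \iint_{R'} f(x(u, v), y(u, v))\, |J(u, v)|\, dA \\
&= \int_0^1 \int_{-v}^{v} e^{u/v}\, \tfrac{1}{2}\, du\, dv \\
&= \int_0^1 \left[ \tfrac{v}{2}\, e^{u/v} \right]_{u=-v}^{u=v} dv \\
&= \int_0^1 \tfrac{v}{2} \left( e - e^{-1} \right) dv \\
&= \left[ \tfrac{v^2}{4} \left( e - e^{-1} \right) \right]_0^1 = \frac{1}{4} \left( e - \frac{1}{e} \right) = \frac{e^2 - 1}{4e} .
\end{align*}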
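As a numerical sanity check of Example 3.9, the transformed integral over $R'$ can be approximated with a simple midpoint rule and compared with the closed form $(e^2 - 1)/(4e)$. The sketch below is only an illustration and not part of the original text; the function name `integrate_uv` and the grid size are arbitrary choices.

```python
import math

def integrate_uv(n=400):
    """Midpoint-rule estimate of the integral of e^(u/v) * |J| over
    R' = {(u, v) : 0 <= v <= 1, -v <= u <= v}, with |J| = 1/2."""
    total = 0.0
    dv = 1.0 / n
    for j in range(n):
        v = (j + 0.5) * dv        # midpoint in v, so v is never 0
        du = 2.0 * v / n          # the horizontal slice [-v, v] has length 2v
        for i in range(n):
            u = -v + (i + 0.5) * du
            total += math.exp(u / v) * 0.5 * du * dv
    return total

exact = (math.e ** 2 - 1) / (4 * math.e)
approx = integrate_uv()
print(approx, exact)
```

The two printed values should agree to several decimal places, which supports both the Jacobian factor $\frac{1}{2}$ and the limits used for the horizontal slices.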

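The change of variables formula can be used to evaluate double integrals in polar coordinates. Letting
\[ x = x(r, \theta) = r \cos\theta \quad \text{and} \quad y = y(r, \theta) = r \sin\theta , \]
we have
\[ J(r, \theta) = \begin{vmatrix} \dfrac{\partial x}{\partial r} & \dfrac{\partial x}{\partial \theta} \\[2ex] \dfrac{\partial y}{\partial r} & \dfrac{\partial y}{\partial \theta} \end{vmatrix} = \begin{vmatrix} \cos\theta & -r \sin\theta \\ \sin\theta & r \cos\theta \end{vmatrix} = r \cos^2\theta + r \sin^2\theta = r \;\Rightarrow\; |J(r, \theta)| = |r| = r , \]
so we have the following formula:

Double Integral in Polar Coordinates
\[ \iint_R f(x, y)\, dx\, dy = \iint_{R'} f(r \cos\theta, r \sin\theta)\, r\, dr\, d\theta , \tag{3.24} \]
where the mapping $x = r \cos\theta$, $y = r \sin\theta$ maps the region $R'$ in the rθ-plane onto the region $R$ in the xy-plane in a one-to-one manner.

Example 3.10. Find the volume $V$ inside the paraboloid $z = x^2 + y^2$ for $0 \le z \le 1$.

Solution: Using vertical slices, we see that
\[ V = \iint_R (1 - z)\, dA = \iint_R \left( 1 - (x^2 + y^2) \right) dA , \]
where $R = \{(x, y) : x^2 + y^2 \le 1\}$ is the unit disk in $\mathbb{R}^2$ (see Figure 3.5.2). In polar coordinates $(r, \theta)$ we know that $x^2 + y^2 = r^2$ and that the unit disk $R$ is the set $R' = \{(r, \theta) : 0 \le r \le 1,\ 0 \le \theta \le 2\pi\}$. Thus,
\begin{align*}
V &= \int_0^{2\pi} \int_0^1 (1 - r^2)\, r\, dr\, d\theta = \int_0^{2\pi} \int_0^1 (r - r^3)\, dr\, d\theta \\
&= \int_0^{2\pi} \left[ \frac{r^2}{2} - \frac{r^4}{4} \right]_{r=0}^{r=1} d\theta = \int_0^{2\pi} \frac{1}{4}\, d\theta = \frac{\pi}{2} .
\end{align*}

[Figure 3.5.2: $z = x^2 + y^2$]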

Example 3.11. Find the volume $V$ inside the cone $z = \sqrt{x^2 + y^2}$ for $0 \le z \le 1$.

Solution: Using vertical slices, we see that
\[ V = \iint_R (1 - z)\, dA = \iint_R \left( 1 - \sqrt{x^2 + y^2} \right) dA , \]
where $R = \{(x, y) : x^2 + y^2 \le 1\}$ is the unit disk in $\mathbb{R}^2$ (see Figure 3.5.3). In polar coordinates $(r, \theta)$ we know that $\sqrt{x^2 + y^2} = r$ and that the unit disk $R$ is the set $R' = \{(r, \theta) : 0 \le r \le 1,\ 0 \le \theta \le 2\pi\}$. Thus,
\begin{align*}
V &= \int_0^{2\pi} \int_0^1 (1 - r)\, r\, dr\, d\theta = \int_0^{2\pi} \int_0^1 (r - r^2)\, dr\, d\theta \\
&= \int_0^{2\pi} \left[ \frac{r^2}{2} - \frac{r^3}{3} \right]_{r=0}^{r=1} d\theta = \int_0^{2\pi} \frac{1}{6}\, d\theta = \frac{\pi}{3} .
\end{align*}

[Figure 3.5.3: $z = \sqrt{x^2 + y^2}$]

In a similar fashion, it can be shown (see Exercises 5-6) that triple integrals in cylindrical and spherical coordinates take the following forms:

Triple Integral in Cylindrical Coordinates
\[ \iiint_S f(x, y, z)\, dx\, dy\, dz = \iiint_{S'} f(r \cos\theta, r \sin\theta, z)\, r\, dr\, d\theta\, dz , \tag{3.25} \]
where the mapping $x = r \cos\theta$, $y = r \sin\theta$, $z = z$ maps the solid $S'$ in rθz-space onto the solid $S$ in xyz-space in a one-to-one manner.

Triple Integral in Spherical Coordinates
\[ \iiint_S f(x, y, z)\, dx\, dy\, dz = \iiint_{S'} f(\rho \sin\phi \cos\theta, \rho \sin\phi \sin\theta, \rho \cos\phi)\, \rho^2 \sin\phi\, d\rho\, d\phi\, d\theta , \tag{3.26} \]
where the mapping $x = \rho \sin\phi \cos\theta$, $y = \rho \sin\phi \sin\theta$, $z = \rho \cos\phi$ maps the solid $S'$ in ρφθ-space onto the solid $S$ in xyz-space in a one-to-one manner.

Example 3.12. For $a > 0$, find the volume $V$ inside the sphere $S = x^2 + y^2 + z^2 = a^2$.

Solution: We see that $S$ is the set $\rho = a$ in spherical coordinates, so
\begin{align*}
V &= \iiint_S 1\, dV = \int_0^{2\pi} \int_0^{\pi} \int_0^a 1\, \rho^2 \sin\phi\, d\rho\, d\phi\, d\theta \\
&= \int_0^{2\pi} \int_0^{\pi} \left[ \frac{\rho^3}{3} \right]_{\rho=0}^{\rho=a} \sin\phi\, d\phi\, d\theta = \int_0^{2\pi} \int_0^{\pi} \frac{a^3}{3} \sin\phi\, d\phi\, d\theta \\
&= \int_0^{2\pi} \left[ -\frac{a^3}{3} \cos\phi \right]_{\phi=0}^{\phi=\pi} d\theta = \int_0^{2\pi} \frac{2a^3}{3}\, d\theta = \frac{4\pi a^3}{3} .
\end{align*}

Exercises

A

1. Find the volume $V$ inside the paraboloid $z = x^2 + y^2$ for $0 \le z \le 4$.
2. Find the volume $V$ inside the cone $z = \sqrt{x^2 + y^2}$ for $0 \le z \le 3$.

B

3. Find the volume $V$ of the solid inside both $x^2 + y^2 + z^2 = 4$ and $x^2 + y^2 = 1$.
4. Find the volume $V$ inside both the sphere $x^2 + y^2 + z^2 = 1$ and the cone $z = \sqrt{x^2 + y^2}$.
5. Prove formula (3.25).
6. Prove formula (3.26).
7. Evaluate $\iint_R \sin\left( \frac{x+y}{2} \right) \cos\left( \frac{x-y}{2} \right) dA$, where $R$ is the triangle with vertices $(0, 0)$, $(2, 0)$ and $(1, 1)$. (Hint: Use the change of variables $u = (x + y)/2$, $v = (x - y)/2$.)
8. Find the volume inside the elliptic cylinder $\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1$ for $0 \le z \le 2$.
9. Show that the volume inside the ellipsoid $\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1$ is $\frac{4\pi abc}{3}$. (Hint: Use the change of variables $x = au$, $y = bv$, $z = cw$, then consider Example 3.12.)

C

10. Find the volume of the solid bounded by $z = x^2 + y^2$ and $z^2 = 4(x^2 + y^2)$.
11. Show that the Beta function, defined by
\[ B(x, y) = \int_0^1 t^{x-1} (1 - t)^{y-1}\, dt , \quad \text{for } x > 0,\ y > 0, \]
satisfies the relation $B(y, x) = B(x, y)$ for $x > 0$, $y > 0$.
12. Using the substitution $t = u/(u + 1)$, show that the Beta function can be written as
\[ B(x, y) = \int_0^{\infty} \frac{u^{x-1}}{(u + 1)^{x+y}}\, du , \quad \text{for } x > 0,\ y > 0. \]

3.6 Application: Center of Mass

Recall from single-variable calculus that for a region $R = \{(x, y) : a \le x \le b,\ 0 \le y \le f(x)\}$ in $\mathbb{R}^2$ that represents a thin, flat plate (see Figure 3.6.1), where $f(x)$ is a continuous function on $[a, b]$, the center of mass of $R$ has coordinates $(\bar{x}, \bar{y})$ given by
\[ \bar{x} = \frac{M_y}{M} \quad \text{and} \quad \bar{y} = \frac{M_x}{M} , \]
where
\[ M_x = \int_a^b \frac{(f(x))^2}{2}\, dx , \quad M_y = \int_a^b x f(x)\, dx , \quad M = \int_a^b f(x)\, dx , \tag{3.27} \]
assuming that $R$ has uniform density, i.e. the mass of $R$ is uniformly distributed over the region. In this case the area $M$ of the region is considered the mass of $R$ (the density is constant, and taken as 1 for simplicity).

[Figure 3.6.1: Center of mass of $R$]

In the general case where the density of a region (or lamina) $R$ is a continuous function $\delta = \delta(x, y)$ of the coordinates $(x, y)$ of points inside $R$ (where $R$ can be any region in $\mathbb{R}^2$) the coordinates $(\bar{x}, \bar{y})$ of the center of mass of $R$ are given by
\[ \bar{x} = \frac{M_y}{M} \quad \text{and} \quad \bar{y} = \frac{M_x}{M} , \tag{3.28} \]
where
\[ M_y = \iint_R x\, \delta(x, y)\, dA , \quad M_x = \iint_R y\, \delta(x, y)\, dA , \quad M = \iint_R \delta(x, y)\, dA . \tag{3.29} \]
The quantities $M_x$ and $M_y$ are called the moments (or first moments) of the region $R$ about the x-axis and y-axis, respectively. The quantity $M$ is the mass of the region $R$. To see this, think of taking a small rectangle inside $R$ with dimensions $\Delta x$ and $\Delta y$ close to 0. The mass of that rectangle is approximately $\delta(x_*, y_*)\, \Delta x\, \Delta y$, for some point $(x_*, y_*)$ in that rectangle. Then the mass of $R$ is the limit of the sums of the masses of all such rectangles inside $R$ as the diagonals of the rectangles approach 0, which is the double integral $\iint_R \delta(x, y)\, dA$.

Note that the formulas in (3.27) represent a special case when $\delta(x, y) = 1$ throughout $R$ in the formulas in (3.29).

Example 3.13. Find the center of mass of the region $R = \{(x, y) : 0 \le x \le 1,\ 0 \le y \le 2x^2\}$, if the density function at $(x, y)$ is $\delta(x, y) = x + y$.

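Solution: The region $R$ is shown in Figure 3.6.2. We have
\begin{align*}
M &= \iint_R \delta(x, y)\, dA = \int_0^1 \int_0^{2x^2} (x + y)\, dy\, dx \\
&= \int_0^1 \left[ xy + \frac{y^2}{2} \right]_{y=0}^{y=2x^2} dx = \int_0^1 (2x^3 + 2x^4)\, dx = \left[ \frac{x^4}{2} + \frac{2x^5}{5} \right]_0^1 = \frac{9}{10}
\end{align*}
and
\begin{align*}
M_x &= \iint_R y\, \delta(x, y)\, dA = \int_0^1 \int_0^{2x^2} y(x + y)\, dy\, dx \\
&= \int_0^1 \left[ \frac{xy^2}{2} + \frac{y^3}{3} \right]_{y=0}^{y=2x^2} dx = \int_0^1 \left( 2x^5 + \frac{8x^6}{3} \right) dx = \left[ \frac{x^6}{3} + \frac{8x^7}{21} \right]_0^1 = \frac{5}{7} , \\
M_y &= \iint_R x\, \delta(x, y)\, dA = \int_0^1 \int_0^{2x^2} x(x + y)\, dy\, dx \\
&= \int_0^1 \left[ x^2 y + \frac{xy^2}{2} \right]_{y=0}^{y=2x^2} dx = \int_0^1 (2x^4 + 2x^5)\, dx = \left[ \frac{2x^5}{5} + \frac{x^6}{3} \right]_0^1 = \frac{11}{15} ,
\end{align*}
so the center of mass $(\bar{x}, \bar{y})$ is given by
\[ \bar{x} = \frac{M_y}{M} = \frac{11/15}{9/10} = \frac{22}{27} , \qquad \bar{y} = \frac{M_x}{M} = \frac{5/7}{9/10} = \frac{50}{63} . \]

[Figure 3.6.2: the region $R$ under $y = 2x^2$ for $0 \le x \le 1$]

Note how this center of mass is a little further towards the upper corner of the region $R$ than when the density is uniform (it is easy to use the formulas in (3.27) to show that $(\bar{x}, \bar{y}) = \left( \frac{3}{4}, \frac{3}{5} \right)$ in that case). This makes sense since the density function $\delta(x, y) = x + y$ increases as $(x, y)$ approaches that upper corner, where there is quite a bit of area.

In the special case where the density function $\delta(x, y)$ is a constant function on the region $R$, the center of mass $(\bar{x}, \bar{y})$ is called the centroid of $R$.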
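The moments in Example 3.13 can also be checked numerically. The following sketch, which is an illustration and not part of the original text, approximates the three double integrals in (3.29) with a midpoint rule over $R = \{0 \le x \le 1,\ 0 \le y \le 2x^2\}$; the helper name `moments` and the grid size are assumptions made for this example.

```python
def moments(density, n=400):
    """Midpoint-rule estimates of (M, M_x, M_y) over the region
    R = {(x, y) : 0 <= x <= 1, 0 <= y <= 2x^2} for a given density."""
    M = Mx = My = 0.0
    dx = 1.0 / n
    for i in range(n):
        x = (i + 0.5) * dx
        dy = (2.0 * x * x) / n      # vertical slice [0, 2x^2] split into n pieces
        for j in range(n):
            y = (j + 0.5) * dy
            w = density(x, y) * dx * dy   # approximate mass of one small rectangle
            M += w
            Mx += y * w                   # moment about the x-axis
            My += x * w                   # moment about the y-axis
    return M, Mx, My

M, Mx, My = moments(lambda x, y: x + y)
x_bar, y_bar = My / M, Mx / M
print(x_bar, y_bar)      # compare with 22/27 and 50/63

# Uniform density, for comparison with the centroid (3/4, 3/5).
M1, Mx1, My1 = moments(lambda x, y: 1.0)
print(My1 / M1, Mx1 / M1)
```

Passing a different `density` function is enough to explore how the center of mass shifts toward the heavier part of the region.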

The formulas for the center of mass of a region in $\mathbb{R}^2$ can be generalized to a solid $S$ in $\mathbb{R}^3$. Let $S$ be a solid with a continuous mass density function $\delta(x, y, z)$ at any point $(x, y, z)$ in $S$. Then the center of mass of $S$ has coordinates $(\bar{x}, \bar{y}, \bar{z})$, where
\[ \bar{x} = \frac{M_{yz}}{M} , \quad \bar{y} = \frac{M_{xz}}{M} , \quad \bar{z} = \frac{M_{xy}}{M} , \tag{3.30} \]
where
\[ M_{yz} = \iiint_S x\, \delta(x, y, z)\, dV , \quad M_{xz} = \iiint_S y\, \delta(x, y, z)\, dV , \quad M_{xy} = \iiint_S z\, \delta(x, y, z)\, dV , \tag{3.31} \]
\[ M = \iiint_S \delta(x, y, z)\, dV . \tag{3.32} \]
In this case, $M_{yz}$, $M_{xz}$ and $M_{xy}$ are called the moments (or first moments) of $S$ around the yz-plane, xz-plane and xy-plane, respectively. Also, $M$ is the mass of $S$.

Example 3.14. Find the center of mass of the solid $S = \{(x, y, z) : z \ge 0,\ x^2 + y^2 + z^2 \le a^2\}$, if the density function at $(x, y, z)$ is $\delta(x, y, z) = 1$.

Solution: The solid $S$ is just the upper hemisphere inside the sphere of radius $a$ centered at the origin (see Figure 3.6.3). So since the density function is a constant and $S$ is symmetric about the z-axis, then it is clear that $\bar{x} = 0$ and $\bar{y} = 0$, so we need only find $\bar{z}$. We have
\[ M = \iiint_S \delta(x, y, z)\, dV = \iiint_S 1\, dV = \text{Volume}(S) . \]
But since the volume of $S$ is half the volume of the sphere of radius $a$, which we know by Example 3.12 is $\frac{4\pi a^3}{3}$, then $M = \frac{2\pi a^3}{3}$. And
\begin{align*}
M_{xy} &= \iiint_S z\, \delta(x, y, z)\, dV = \iiint_S z\, dV , \text{ which in spherical coordinates is} \\
&= \int_0^{2\pi} \int_0^{\pi/2} \int_0^a (\rho \cos\phi)\, \rho^2 \sin\phi\, d\rho\, d\phi\, d\theta \\
&= \int_0^{2\pi} \int_0^{\pi/2} \sin\phi \cos\phi \left( \int_0^a \rho^3\, d\rho \right) d\phi\, d\theta \\
&= \int_0^{2\pi} \int_0^{\pi/2} \frac{a^4}{4} \sin\phi \cos\phi\, d\phi\, d\theta
\end{align*}

[Figure 3.6.3: the upper hemisphere $S$ of radius $a$ with its center of mass $(\bar{x}, \bar{y}, \bar{z})$]

\begin{align*}
M_{xy} &= \int_0^{2\pi} \int_0^{\pi/2} \frac{a^4}{8} \sin 2\phi\, d\phi\, d\theta \quad \text{(since } \sin 2\phi = 2 \sin\phi \cos\phi \text{)} \\
&= \int_0^{2\pi} \left[ -\frac{a^4}{16} \cos 2\phi \right]_{\phi=0}^{\phi=\pi/2} d\theta = \int_0^{2\pi} \frac{a^4}{8}\, d\theta = \frac{\pi a^4}{4} ,
\end{align*}
so
\[ \bar{z} = \frac{M_{xy}}{M} = \frac{\pi a^4 / 4}{2\pi a^3 / 3} = \frac{3a}{8} . \]
Thus, the center of mass of $S$ is $(\bar{x}, \bar{y}, \bar{z}) = \left( 0, 0, \frac{3a}{8} \right)$.

Exercises

A

For Exercises 1-5, find the center of mass of the region $R$ with the given density function $\delta(x, y)$.

1. $R = \{(x, y) : 0 \le x \le 2,\ 0 \le y \le 4\}$, $\delta(x, y) = 2y$
2. $R = \{(x, y) : 0 \le x \le 1,\ 0 \le y \le x^2\}$, $\delta(x, y) = x + y$
3. $R = \{(x, y) : y \ge 0,\ x^2 + y^2 \le a^2\}$, $\delta(x, y) = 1$
4. $R = \{(x, y) : y \ge 0,\ x \ge 0,\ 1 \le x^2 + y^2 \le 4\}$, $\delta(x, y) = \sqrt{x^2 + y^2}$
5. $R = \{(x, y) : y \ge 0,\ x^2 + y^2 \le 1\}$, $\delta(x, y) = y$

B

For Exercises 6-10, find the center of mass of the solid $S$ with the given density function $\delta(x, y, z)$.

6. $S = \{(x, y, z) : 0 \le x \le 1,\ 0 \le y \le 1,\ 0 \le z \le 1\}$, $\delta(x, y, z) = xyz$
7. $S = \{(x, y, z) : z \ge 0,\ x^2 + y^2 + z^2 \le a^2\}$, $\delta(x, y, z) = x^2 + y^2 + z^2$
8. $S = \{(x, y, z) : x \ge 0,\ y \ge 0,\ z \ge 0,\ x^2 + y^2 + z^2 \le a^2\}$, $\delta(x, y, z) = 1$
9. $S = \{(x, y, z) : 0 \le x \le 1,\ 0 \le y \le 1,\ 0 \le z \le 1\}$, $\delta(x, y, z) = x^2 + y^2 + z^2$
10. $S = \{(x, y, z) : 0 \le x \le 1,\ 0 \le y \le 1,\ 0 \le z \le 1 - x - y\}$, $\delta(x, y, z) = 1$

3.7 Application: Probability and Expected Value

In this section we will briefly discuss some applications of multiple integrals in the field of probability theory. In particular we will see ways in which multiple integrals can be used to calculate probabilities and expected values.

Probability

Suppose that you have a standard six-sided (fair) die, and you let a variable $X$ represent the value rolled. Then the probability of rolling a 3, written as $P(X = 3)$, is $\frac{1}{6}$, since there are six sides on the die and each one is equally likely to be rolled, and hence in particular the 3 has a one out of six chance of being rolled. Likewise the probability of rolling at most a 3, written as $P(X \le 3)$, is $\frac{3}{6} = \frac{1}{2}$, since of the six numbers on the die, there are three equally likely numbers (1, 2, and 3) that are less than or equal to 3. Note that $P(X \le 3) = P(X = 1) + P(X = 2) + P(X = 3)$. We call $X$ a discrete random variable on the sample space (or probability space) $\Omega$ consisting of all possible outcomes. In our case, $\Omega = \{1, 2, 3, 4, 5, 6\}$. An event $A$ is a subset of the sample space. For example, in the case of the die, the event $X \le 3$ is the set $\{1, 2, 3\}$.

Now let $X$ be a variable representing a random real number in the interval $(0, 1)$. Note that the set of all real numbers between 0 and 1 is not a discrete (or countable) set of values, i.e. it can not be put into a one-to-one correspondence with the set of positive integers.³ In this case, for any real number $x$ in $(0, 1)$, it makes no sense to consider $P(X = x)$ since it must be 0 (why?). Instead, we consider the probability $P(X \le x)$, which is given by $P(X \le x) = x$. The reasoning is this: the interval $(0, 1)$ has length 1, and for $x$ in $(0, 1)$ the interval $(0, x)$ has length $x$. So since $X$ represents a random number in $(0, 1)$, and hence is uniformly distributed over $(0, 1)$, then
\[ P(X \le x) = \frac{\text{length of } (0, x)}{\text{length of } (0, 1)} = \frac{x}{1} = x . \]
We call $X$ a continuous random variable on the sample space $\Omega = (0, 1)$. An event $A$ is a subset of the sample space. For example, in our case the event $X \le x$ is the set $(0, x)$.

In the case of a discrete random variable, we saw how the probability of an event was the sum of the probabilities of the individual outcomes comprising that event (e.g. $P(X \le 3) = P(X = 1) + P(X = 2) + P(X = 3)$ in the die example). For a continuous random variable, the probability of an event will instead be the integral of a function, which we will now describe.

Let $X$ be a continuous real-valued random variable on a sample space $\Omega$ in $\mathbb{R}$. For simplicity, let $\Omega = (a, b)$. Define the distribution function $F$ of $X$ as
\begin{align*}
F(x) &= P(X \le x) , \quad \text{for } -\infty < x < \infty \tag{3.33} \\
&= \begin{cases} 1, & \text{for } x \ge b \\ P(X \le x), & \text{for } a < x < b \\ 0, & \text{for } x \le a . \end{cases} \tag{3.34}
\end{align*}
Suppose that there is a nonnegative, continuous real-valued function $f$ on $\mathbb{R}$ such that
\[ F(x) = \int_{-\infty}^{x} f(y)\, dy , \quad \text{for } -\infty < x < \infty , \tag{3.35} \]
and
\[ \int_{-\infty}^{\infty} f(x)\, dx = 1 . \tag{3.36} \]
Then we call $f$ the probability density function (or p.d.f. for short) for $X$. We thus have
\[ P(X \le x) = \int_a^x f(y)\, dy , \quad \text{for } a < x < b . \tag{3.37} \]
Also, by the Fundamental Theorem of Calculus, we have
\[ F'(x) = f(x) , \quad \text{for } -\infty < x < \infty . \tag{3.38} \]

Example 3.15. Let $X$ represent a randomly selected real number in the interval $(0, 1)$. We say that $X$ has the uniform distribution on $(0, 1)$, with distribution function
\[ F(x) = P(X \le x) = \begin{cases} 1, & \text{for } x \ge 1 \\ x, & \text{for } 0 < x < 1 \\ 0, & \text{for } x \le 0 , \end{cases} \tag{3.39} \]
and probability density function
\[ f(x) = F'(x) = \begin{cases} 1, & \text{for } 0 < x < 1 \\ 0, & \text{elsewhere.} \end{cases} \tag{3.40} \]
In general, if $X$ represents a randomly selected real number in an interval $(a, b)$, then $X$ has the uniform distribution function
\[ F(x) = P(X \le x) = \begin{cases} 1, & \text{for } x \ge b \\ \dfrac{x - a}{b - a}, & \text{for } a < x < b \\ 0, & \text{for } x \le a \end{cases} \tag{3.41} \]
and probability density function
\[ f(x) = F'(x) = \begin{cases} \dfrac{1}{b - a}, & \text{for } a < x < b \\ 0, & \text{elsewhere.} \end{cases} \tag{3.42} \]

³ For a proof see p. 9-10 in Kamke, E., Theory of Sets, New York: Dover, 1950.

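Example 3.16. A famous distribution function is given by the standard normal distribution, whose probability density function $f$ is
\[ f(x) = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2} , \quad \text{for } -\infty < x < \infty . \tag{3.43} \]
This is often called a "bell curve", and is used widely in statistics. Since we are claiming that $f$ is a p.d.f., we should have
\[ \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}\, dx = 1 \tag{3.44} \]
by formula (3.36), which is equivalent to
\[ \int_{-\infty}^{\infty} e^{-x^2/2}\, dx = \sqrt{2\pi} . \tag{3.45} \]
We can use a double integral in polar coordinates to verify this integral. First,
\begin{align*}
\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} e^{-(x^2 + y^2)/2}\, dx\, dy &= \int_{-\infty}^{\infty} e^{-y^2/2} \left( \int_{-\infty}^{\infty} e^{-x^2/2}\, dx \right) dy \\
&= \left( \int_{-\infty}^{\infty} e^{-x^2/2}\, dx \right) \left( \int_{-\infty}^{\infty} e^{-y^2/2}\, dy \right) \\
&= \left( \int_{-\infty}^{\infty} e^{-x^2/2}\, dx \right)^2
\end{align*}
since the same function is being integrated twice in the middle equation, just with different variables. But using polar coordinates, we see that
\begin{align*}
\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} e^{-(x^2 + y^2)/2}\, dx\, dy &= \int_0^{2\pi} \int_0^{\infty} e^{-r^2/2}\, r\, dr\, d\theta \\
&= \int_0^{2\pi} \left[ -e^{-r^2/2} \right]_{r=0}^{r=\infty} d\theta \\
&= \int_0^{2\pi} \left( 0 - (-e^0) \right) d\theta = \int_0^{2\pi} 1\, d\theta = 2\pi ,
\end{align*}
and so
\[ \left( \int_{-\infty}^{\infty} e^{-x^2/2}\, dx \right)^2 = 2\pi , \quad \text{and hence} \quad \int_{-\infty}^{\infty} e^{-x^2/2}\, dx = \sqrt{2\pi} . \]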
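Formula (3.45) is easy to test numerically. The sketch below is an illustration only; the function name, the truncation of the real line to $[-10, 10]$ and the number of subintervals are assumptions. It applies a midpoint rule to $e^{-x^2/2}$ and compares the result with $\sqrt{2\pi}$. Truncating is harmless here because the integrand is smaller than $e^{-50}$ outside that interval.

```python
import math

def gaussian_integral(limit=10.0, n=20000):
    """Midpoint-rule estimate of the integral of e^(-x^2/2) over [-limit, limit]."""
    h = 2.0 * limit / n
    total = 0.0
    for k in range(n):
        x = -limit + (k + 0.5) * h
        total += math.exp(-x * x / 2.0)
    return total * h

approx = gaussian_integral()
target = math.sqrt(2.0 * math.pi)
print(approx, target)
```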


In addition to individual random variables, we can consider jointly distributed random variables. For this, we will let $X$, $Y$ and $Z$ be three real-valued continuous random variables defined on the same sample space $\Omega$ in $\mathbb{R}$ (the discussion for two random variables is similar). Then the joint distribution function $F$ of $X$, $Y$ and $Z$ is given by
\[ F(x, y, z) = P(X \le x,\ Y \le y,\ Z \le z) , \quad \text{for } -\infty < x, y, z < \infty . \tag{3.46} \]
If there is a nonnegative, continuous real-valued function $f$ on $\mathbb{R}^3$ such that
\[ F(x, y, z) = \int_{-\infty}^{z} \int_{-\infty}^{y} \int_{-\infty}^{x} f(u, v, w)\, du\, dv\, dw , \quad \text{for } -\infty < x, y, z < \infty \tag{3.47} \]
and
\[ \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x, y, z)\, dx\, dy\, dz = 1 , \tag{3.48} \]

then we call $f$ the joint probability density function (or joint p.d.f. for short) for $X$, $Y$ and $Z$. In general, for $a_1 < b_1$, $a_2 < b_2$, $a_3 < b_3$, we have
\[ P(a_1 < X \le b_1,\ a_2 < Y \le b_2,\ a_3 < Z \le b_3) = \int_{a_3}^{b_3} \int_{a_2}^{b_2} \int_{a_1}^{b_1} f(x, y, z)\, dx\, dy\, dz , \tag{3.49} \]

with the $\le$ and $<$ symbols interchangeable in any combination. A triple integral, then, can be thought of as representing a probability (for a function $f$ which is a p.d.f.).

Example 3.17. Let $a$, $b$, and $c$ be real numbers selected randomly from the interval $(0, 1)$. What is the probability that the equation $ax^2 + bx + c = 0$ has at least one real solution $x$?

Solution: We know by the quadratic formula that there is at least one real solution if $b^2 - 4ac \ge 0$. So we need to calculate $P(b^2 - 4ac \ge 0)$. We will use three jointly distributed random variables to do this. First, since $0 < a, b, c < 1$, we have
\[ b^2 - 4ac \ge 0 \;\Leftrightarrow\; 0 < 4ac \le b^2 < 1 \;\Leftrightarrow\; 0 < 2\sqrt{a}\sqrt{c} \le b < 1 , \]
where the last relation holds for all $0 < a, c < 1$ such that
\[ 0 < 4ac < 1 \;\Leftrightarrow\; 0 < c < \frac{1}{4a} . \]

[Figure 3.7.1: Region $R = R_1 \cup R_2$ in the ac-plane, below the curve $c = \frac{1}{4a}$; $R_1$ lies over $0 < a < \frac{1}{4}$ and $R_2$ over $\frac{1}{4} < a < 1$.]

Considering $a$, $b$ and $c$ as real variables, the region $R$ in the ac-plane where the above relation holds is given by $R = \{(a, c) : 0 < a < 1,\ 0 < c < 1,\ 0 < c < \frac{1}{4a}\}$, which we can see is a union of two regions $R_1$ and $R_2$, as in Figure 3.7.1 above. Now let $X$, $Y$ and $Z$ be continuous random variables, each representing a randomly selected real number from the interval $(0, 1)$ (think of $X$, $Y$ and $Z$ representing $a$, $b$ and $c$, respectively). Then, similar to how we showed that $f(x) = 1$ is the p.d.f. of the


uniform distribution on $(0, 1)$, it can be shown that $f(x, y, z) = 1$ for $x, y, z$ in $(0, 1)$ (0 elsewhere) is the joint p.d.f. of $X$, $Y$ and $Z$. Now,
\[ P(b^2 - 4ac \ge 0) = P\left( (a, c) \in R,\ 2\sqrt{a}\sqrt{c} \le b < 1 \right) , \]
so this probability is the triple integral of $f(a, b, c) = 1$ as $b$ varies from $2\sqrt{a}\sqrt{c}$ to 1 and as $(a, c)$ varies over the region $R$. Since $R$ can be divided into two regions $R_1$ and $R_2$, then the required triple integral can be split into a sum of two triple integrals, using vertical slices in $R$:

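\begin{align*}
P(b^2 - 4ac \ge 0) &= \underbrace{\int_0^{1/4} \int_0^1}_{R_1} \int_{2\sqrt{a}\sqrt{c}}^{1} 1\, db\, dc\, da + \underbrace{\int_{1/4}^{1} \int_0^{1/4a}}_{R_2} \int_{2\sqrt{a}\sqrt{c}}^{1} 1\, db\, dc\, da \\
&= \int_0^{1/4} \int_0^1 \left( 1 - 2\sqrt{a}\sqrt{c} \right) dc\, da + \int_{1/4}^{1} \int_0^{1/4a} \left( 1 - 2\sqrt{a}\sqrt{c} \right) dc\, da \\
&= \int_0^{1/4} \left[ c - \tfrac{4}{3} \sqrt{a}\, c^{3/2} \right]_{c=0}^{c=1} da + \int_{1/4}^{1} \left[ c - \tfrac{4}{3} \sqrt{a}\, c^{3/2} \right]_{c=0}^{c=1/4a} da \\
&= \int_0^{1/4} \left( 1 - \tfrac{4}{3} \sqrt{a} \right) da + \int_{1/4}^{1} \frac{1}{12a}\, da \\
&= \left[ a - \tfrac{8}{9}\, a^{3/2} \right]_0^{1/4} + \left[ \tfrac{1}{12} \ln a \right]_{1/4}^{1} \\
&= \left( \frac{1}{4} - \frac{1}{9} \right) + \left( 0 - \frac{1}{12} \ln \frac{1}{4} \right) = \frac{5}{36} + \frac{1}{12} \ln 4 \\
P(b^2 - 4ac \ge 0) &= \frac{5 + 3 \ln 4}{36} \approx 0.2544
\end{align*}
In other words, the equation $ax^2 + bx + c = 0$ has about a 25% chance of being solved!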
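The value $\frac{5 + 3 \ln 4}{36}$ from Example 3.17 can be checked with a simple Monte Carlo experiment: draw $a$, $b$, $c$ independently and uniformly from $(0, 1)$ many times and record how often the discriminant is nonnegative. The sketch below is an illustration under those assumptions; the function name, the number of trials and the fixed seed are arbitrary choices, and the estimate will differ slightly from the exact value because of sampling noise.

```python
import math
import random

def estimate_real_root_probability(trials=200_000, seed=0):
    """Fraction of random (a, b, c) in the unit cube with b^2 - 4ac >= 0."""
    rng = random.Random(seed)     # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        a, b, c = rng.random(), rng.random(), rng.random()
        if b * b - 4.0 * a * c >= 0.0:
            hits += 1
    return hits / trials

exact = (5 + 3 * math.log(4)) / 36
estimate = estimate_real_root_probability()
print(estimate, exact)
```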

Expected Value

The expected value $EX$ of a random variable $X$ can be thought of as the "average" value of $X$ as it varies over its sample space. If $X$ is a discrete random variable, then
\[ EX = \sum_x x\, P(X = x) , \tag{3.50} \]
with the sum being taken over all elements $x$ of the sample space. For example, if $X$ represents the number rolled on a six-sided die, then
\[ EX = \sum_{x=1}^{6} x\, P(X = x) = \sum_{x=1}^{6} x \cdot \frac{1}{6} = 3.5 \tag{3.51} \]
is the expected value of $X$, which is the average of the integers 1 through 6.

If $X$ is a real-valued continuous random variable with p.d.f. $f$, then
\[ EX = \int_{-\infty}^{\infty} x f(x)\, dx . \tag{3.52} \]
For example, if $X$ has the uniform distribution on the interval $(0, 1)$, then its p.d.f. is
\[ f(x) = \begin{cases} 1, & \text{for } 0 < x < 1 \\ 0, & \text{elsewhere,} \end{cases} \tag{3.53} \]
and so
\[ EX = \int_{-\infty}^{\infty} x f(x)\, dx = \int_0^1 x\, dx = \frac{1}{2} . \tag{3.54} \]
For a pair of jointly distributed, real-valued continuous random variables $X$ and $Y$ with joint p.d.f. $f(x, y)$, the expected values of $X$ and $Y$ are given by
\[ EX = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} x f(x, y)\, dx\, dy \quad \text{and} \quad EY = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} y f(x, y)\, dx\, dy , \tag{3.55} \]
respectively.

Example 3.18. If you were to pick $n > 2$ random real numbers from the interval $(0, 1)$, what are the expected values for the smallest and largest of those numbers?

Solution: Let $U_1, \ldots, U_n$ be $n$ continuous random variables, each representing a randomly selected real number from $(0, 1)$, i.e. each has the uniform distribution on $(0, 1)$. Define random variables $X$ and $Y$ by
\[ X = \min(U_1, \ldots, U_n) \quad \text{and} \quad Y = \max(U_1, \ldots, U_n) . \]
Then it can be shown⁴ that the joint p.d.f. of $X$ and $Y$ is
\[ f(x, y) = \begin{cases} n(n - 1)(y - x)^{n-2}, & \text{for } 0 \le x \le y \le 1 \\ 0, & \text{elsewhere.} \end{cases} \tag{3.56} \]
Thus, the expected value of $X$ is
\begin{align*}
EX &= \int_0^1 \int_x^1 n(n - 1)\, x\, (y - x)^{n-2}\, dy\, dx \\
&= \int_0^1 \left[ n x (y - x)^{n-1} \right]_{y=x}^{y=1} dx \\
&= \int_0^1 n x (1 - x)^{n-1}\, dx , \text{ so integration by parts yields} \\
&= \left[ -x(1 - x)^n - \frac{1}{n + 1} (1 - x)^{n+1} \right]_0^1 \\
EX &= \frac{1}{n + 1} ,
\end{align*}

⁴ See Ch. 6 in Hoel, Port and Stone.


and similarly (see Exercise 3) it can be shown that
\[ EY = \int_0^1 \int_0^y n(n - 1)\, y\, (y - x)^{n-2}\, dx\, dy = \frac{n}{n + 1} . \]
So, for example, if you were to repeatedly take samples of $n = 3$ random real numbers from $(0, 1)$, and each time store the minimum and maximum values in the sample, then the average of the minimums would approach $\frac{1}{4}$ and the average of the maximums would approach $\frac{3}{4}$ as the number of samples grows. It would be relatively simple (see Exercise 4) to write a computer program to test this.
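Separately from the sampling experiment suggested above, the double integrals for $EX$ and $EY$ can be evaluated numerically straight from the joint p.d.f. (3.56). The sketch below is an illustration only; the function name and grid size are assumptions, and it is meant for $n \ge 3$ as in the example. It applies a midpoint rule over the triangle $0 \le x \le y \le 1$ and compares the results with $\frac{1}{n+1}$ and $\frac{n}{n+1}$.

```python
def expected_min_max(n, grid=300):
    """Midpoint-rule estimates of EX and EY for the joint density
    f(x, y) = n (n - 1) (y - x)^(n - 2) on 0 <= x <= y <= 1 (n >= 3)."""
    h = 1.0 / grid
    ex = ey = 0.0
    for i in range(grid):
        x = (i + 0.5) * h
        for j in range(i, grid):          # only cells whose midpoint has y >= x
            y = (j + 0.5) * h
            density = n * (n - 1) * (y - x) ** (n - 2)
            ex += x * density * h * h
            ey += y * density * h * h
    return ex, ey

ex3, ey3 = expected_min_max(3)
print(ex3, ey3)      # compare with 1/4 and 3/4
ex5, ey5 = expected_min_max(5)
print(ex5, ey5)      # compare with 1/6 and 5/6
```

The density vanishes on the diagonal $y = x$ when $n \ge 3$, so treating the diagonal cells by their midpoints introduces only a small error.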

Exercises

B

1. Evaluate the integral $\int_{-\infty}^{\infty} e^{-x^2}\, dx$ using anything you have learned so far.
2. For $\sigma > 0$ and $\mu > 0$, evaluate $\int_{-\infty}^{\infty} \frac{1}{\sigma \sqrt{2\pi}}\, e^{-(x - \mu)^2 / 2\sigma^2}\, dx$.
3. Show that $EY = \frac{n}{n + 1}$ in Example 3.18.

C

4. Write a computer program (in the language of your choice) that verifies the results in Example 3.18 for the case $n = 3$ by taking large numbers of samples.
5. Repeat Exercise 4 for the case when $n = 4$.
6. For continuous random variables $X$, $Y$ with joint p.d.f. $f(x, y)$, define the second moments $E(X^2)$ and $E(Y^2)$ by
\[ E(X^2) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} x^2 f(x, y)\, dx\, dy \quad \text{and} \quad E(Y^2) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} y^2 f(x, y)\, dx\, dy , \]
and the variances $\text{Var}(X)$ and $\text{Var}(Y)$ by
\[ \text{Var}(X) = E(X^2) - (EX)^2 \quad \text{and} \quad \text{Var}(Y) = E(Y^2) - (EY)^2 . \]
Find $\text{Var}(X)$ and $\text{Var}(Y)$ for $X$ and $Y$ as in Example 3.18.
7. Continuing Exercise 6, the correlation $\rho$ between $X$ and $Y$ is defined as
\[ \rho = \frac{E(XY) - (EX)(EY)}{\sqrt{\text{Var}(X)\, \text{Var}(Y)}} , \]
where $E(XY) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} xy\, f(x, y)\, dx\, dy$. Find $\rho$ for $X$ and $Y$ as in Example 3.18. (Note: The quantity $E(XY) - (EX)(EY)$ is called the covariance of $X$ and $Y$.)
8. In Example 3.17 would the answer change if the interval $(0, 100)$ is used instead of $(0, 1)$? Explain.

4 Line and Surface Integrals

4.1 Line Integrals

In single-variable calculus you learned how to integrate a real-valued function $f(x)$ over an interval $[a, b]$ in $\mathbb{R}^1$. This integral (usually called a Riemann integral) can be thought of as an integral over a path in $\mathbb{R}^1$, since an interval (or collection of intervals) is really the only kind of "path" in $\mathbb{R}^1$. You may also recall that if $f(x)$ represented the force applied along the x-axis to an object at position $x$ in $[a, b]$, then the work $W$ done in moving that object from position $x = a$ to $x = b$ was defined as the integral:
\[ W = \int_a^b f(x)\, dx \]

In this section, we will see how to define the integral of a function (either real-valued or vector-valued) of two variables over a general path (i.e. a curve) in $\mathbb{R}^2$. This definition will be motivated by the physical notion of work. We will begin with real-valued functions of two variables.

In physics, the intuitive idea of work is that
\[ \text{Work} = \text{Force} \times \text{Distance} . \]
Suppose that we want to find the total amount $W$ of work done in moving an object along a curve $C$ in $\mathbb{R}^2$ with a smooth parametrization $x = x(t)$, $y = y(t)$, $a \le t \le b$, with a force $f(x, y)$ which varies with the position $(x, y)$ of the object and is applied in the direction of motion along $C$ (see Figure 4.1.1 below).

[Figure 4.1.1: Curve $C : x = x(t),\ y = y(t)$ for $t$ in $[a, b]$, showing a piece of the curve between $t = t_i$ and $t = t_{i+1}$ with $\Delta s_i \approx \sqrt{\Delta x_i^2 + \Delta y_i^2}$]

We will assume for now that the function $f(x, y)$ is continuous and real-valued, so we only consider the magnitude of the force. Partition the interval $[a, b]$ as follows:
\[ a = t_0 < t_1 < t_2 < \cdots < t_{n-1} < t_n = b , \quad \text{for some integer } n \ge 2 \]

As we can see from Figure 4.1.1, over a typical subinterval $[t_i, t_{i+1}]$ the distance $\Delta s_i$ traveled along the curve is approximately $\sqrt{\Delta x_i^2 + \Delta y_i^2}$, by the Pythagorean Theorem. Thus, if the subinterval is small enough then the work done in moving the object along that piece of the curve is approximately
\[ \text{Force} \times \text{Distance} \approx f(x_{i*}, y_{i*}) \sqrt{\Delta x_i^2 + \Delta y_i^2} , \tag{4.1} \]
where $(x_{i*}, y_{i*}) = (x(t_{i*}), y(t_{i*}))$ for some $t_{i*}$ in $[t_i, t_{i+1}]$, and so
\[ W \approx \sum_{i=0}^{n-1} f(x_{i*}, y_{i*}) \sqrt{\Delta x_i^2 + \Delta y_i^2} \tag{4.2} \]
is approximately the total amount of work done over the entire curve. But since
\[ \sqrt{\Delta x_i^2 + \Delta y_i^2} = \sqrt{\left( \frac{\Delta x_i}{\Delta t_i} \right)^2 + \left( \frac{\Delta y_i}{\Delta t_i} \right)^2}\; \Delta t_i , \]
where $\Delta t_i = t_{i+1} - t_i$, then
\[ W \approx \sum_{i=0}^{n-1} f(x_{i*}, y_{i*}) \sqrt{\left( \frac{\Delta x_i}{\Delta t_i} \right)^2 + \left( \frac{\Delta y_i}{\Delta t_i} \right)^2}\; \Delta t_i . \tag{4.3} \]
Taking the limit of that sum as the length of the largest subinterval goes to 0, the sum over all subintervals becomes the integral from $t = a$ to $t = b$, $\frac{\Delta x_i}{\Delta t_i}$ and $\frac{\Delta y_i}{\Delta t_i}$ become $x'(t)$ and $y'(t)$, respectively, and $f(x_{i*}, y_{i*})$ becomes $f(x(t), y(t))$, so that
\[ W = \int_a^b f(x(t), y(t)) \sqrt{x'(t)^2 + y'(t)^2}\; dt . \tag{4.4} \]
The integral on the right side of the above equation gives us our idea of how to define, for any real-valued function $f(x, y)$, the integral of $f(x, y)$ along the curve $C$, called a line integral:

Definition 4.1. For a real-valued function $f(x, y)$ and a curve $C$ in $\mathbb{R}^2$, parametrized by $x = x(t)$, $y = y(t)$, $a \le t \le b$, the line integral of $f(x, y)$ along $C$ with respect to arc length $s$ is
\[ \int_C f(x, y)\, ds = \int_a^b f(x(t), y(t)) \sqrt{x'(t)^2 + y'(t)^2}\; dt . \tag{4.5} \]
The symbol $ds$ is the differential of the arc length function
\[ s = s(t) = \int_a^t \sqrt{x'(u)^2 + y'(u)^2}\; du , \tag{4.6} \]

which you may recognize from Section 1.9 as the length of the curve $C$ over the interval $[a, t]$, for all $t$ in $[a, b]$. That is,
\[ ds = s'(t)\, dt = \sqrt{x'(t)^2 + y'(t)^2}\; dt , \tag{4.7} \]
by the Fundamental Theorem of Calculus.

For a general real-valued function $f(x, y)$, what does the line integral $\int_C f(x, y)\, ds$ represent? The preceding discussion of $ds$ gives us a clue. You can think of differentials as infinitesimal lengths. So if you think of $f(x, y)$ as the height of a picket fence along $C$, then $f(x, y)\, ds$ can be thought of as approximately the area of a section of that fence over some infinitesimally small section of the curve, and thus the line integral $\int_C f(x, y)\, ds$ is the total area of that picket fence (see Figure 4.1.2).

[Figure 4.1.2: Area of shaded rectangle = height × width ≈ $f(x, y)\, ds$]

Example 4.1. Use a line integral to show that the lateral surface area $A$ of a right circular cylinder of radius $r$ and height $h$ is $2\pi r h$.

Solution: We will use the right circular cylinder with base circle $C$ given by $x^2 + y^2 = r^2$ and with height $h$ in the positive z direction (see Figure 4.1.3). Parametrize $C$ as follows:
\[ x = x(t) = r \cos t , \quad y = y(t) = r \sin t , \quad 0 \le t \le 2\pi \]

[Figure 4.1.3: cylinder of height $h = f(x, y)$ over the circle $C : x^2 + y^2 = r^2$]

Let $f(x, y) = h$ for all $(x, y)$. Then
\begin{align*}
A &= \int_C f(x, y)\, ds = \int_a^b f(x(t), y(t)) \sqrt{x'(t)^2 + y'(t)^2}\; dt \\
&= \int_0^{2\pi} h \sqrt{(-r \sin t)^2 + (r \cos t)^2}\; dt \\
&= h \int_0^{2\pi} r \sqrt{\sin^2 t + \cos^2 t}\; dt \\
&= rh \int_0^{2\pi} 1\, dt = 2\pi r h .
\end{align*}

Note in Example 4.1 that if we had traversed the circle $C$ twice, i.e. let $t$ vary from 0 to $4\pi$, then we would have gotten an area of $4\pi r h$, i.e. twice the desired area, even though the curve itself is still the same (namely, a circle of radius $r$). Also, notice that we traversed the circle in the counter-clockwise direction. If we had gone in the clockwise direction, using the parametrization
\[ x = x(t) = r \cos(2\pi - t) , \quad y = y(t) = r \sin(2\pi - t) , \quad 0 \le t \le 2\pi , \tag{4.8} \]
then it is easy to verify (see Exercise 12) that the value of the line integral is unchanged.

In general, it can be shown (see Exercise 15) that reversing the direction in which a curve $C$ is traversed leaves $\int_C f(x, y)\, ds$ unchanged, for any $f(x, y)$. If a curve $C$ has a parametrization $x = x(t)$, $y = y(t)$, $a \le t \le b$, then denote by $-C$ the same curve as $C$ but traversed in the opposite direction. Then $-C$ is parametrized by
\[ x = x(a + b - t) , \quad y = y(a + b - t) , \quad a \le t \le b , \tag{4.9} \]
and we have
\[ \int_C f(x, y)\, ds = \int_{-C} f(x, y)\, ds . \tag{4.10} \]

Notice that our definition of the line integral was with respect to the arc length parameter $s$. We can also define
\[ \int_C f(x, y)\, dx = \int_a^b f(x(t), y(t))\, x'(t)\, dt \tag{4.11} \]
as the line integral of $f(x, y)$ along $C$ with respect to $x$, and
\[ \int_C f(x, y)\, dy = \int_a^b f(x(t), y(t))\, y'(t)\, dt \tag{4.12} \]
as the line integral of $f(x, y)$ along $C$ with respect to $y$.

In the derivation of the formula for a line integral, we used the idea of work as force multiplied by distance. However, we know that force is actually a vector. So it would be helpful to develop a vector form for a line integral. For this, suppose that we have a function $\mathbf{f}(x, y)$ defined on $\mathbb{R}^2$ by
\[ \mathbf{f}(x, y) = P(x, y)\, \mathbf{i} + Q(x, y)\, \mathbf{j} \]
for some continuous real-valued functions $P(x, y)$ and $Q(x, y)$ on $\mathbb{R}^2$. Such a function $\mathbf{f}$ is called a vector field on $\mathbb{R}^2$. It is defined at points in $\mathbb{R}^2$, and its values are vectors in $\mathbb{R}^2$. For a curve $C$ with a smooth parametrization $x = x(t)$, $y = y(t)$, $a \le t \le b$, let
\[ \mathbf{r}(t) = x(t)\, \mathbf{i} + y(t)\, \mathbf{j} \]

y) = P(x. y(t)) · r ′ (t) is a real-valued function on [a. y) dy (4. Recall that if the points on a curve C have position vector r(t) = x(t) i+y(t) j. For a vector field f(x. Then r ′ (t) = x ′ (t) i + y ′ (t) j and so b b 139 P(x. y(t)) on C. y = y(t). where it is understood that the line integral along C is being applied to both P and Q.2 is often called a line integral of a vector field to distinguish it from the line integral in Definition 4. A differential form ∂x ∂y P(x. This leads us to the following definition: Definition 4. y) dx + C C Q(x. where r(t) = x(t) i + y(t) j is the position vector for points on C. The line integral in Definition 4. y) dy = C P(x. y(t)) · r ′ (t) dt .2. then r ′ (t) 0 on [a. For convenience we will often write P(x. y) j and a curve C with a smooth parametrization x = x(t).1 and 4. y) dy . y(t)). y(t)) y ′ (t)) dt f(x(t). y). y) dx + Q(x. the differential of F is dF = ∂F dx + ∂F dy. y) dx + C b a C Q(x. Since C is a smooth curve. y(t)) x ′ (t) + Q(x(t).4. b]. then r ′ (t) is a tangent vector to C at the point (x(t). a ≤ t ≤ b. so the last integral on the right looks somewhat similar to our earlier definition of a line integral.2 together we get the following theorem: . y). y(t)) x ′ (t) dt + a Q(x(t). y(t)) y ′ (t) dt = a b (P(x(t). We use the notation dr = r ′ (t) dt = dx i + dy j to denote the differential of the vectorvalued function r. y) dy is called exact if it equals dF for some function F(x. The quantity P(x. For a realvalued function F(x. y) dy is known as a differential form.13) (4. y) dx + C C Q(x. y) dx + Q(x.1 which is called a line integral of a scalar field. y) dx + Q(x. y) i + Q(x. y(t)) · r ′ (t) dt = a by definition of f(x. y).1 Line Integrals be the position vector for a point (x(t). Notice that the function f(x(t). y(t)) in the direction of increasing t (which we call the direction of C).14) f(x(t). y) dy = a b P(x(t). the line integral of f along C is C f · dr = = P(x. 
b] and hence T(t) = r ′ (t) r ′ (t) is the unit tangent vector to C at (x(t). Putting Definitions 4.

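Theorem 4.1. For a vector field $\mathbf{f}(x,y) = P(x,y)\,\mathbf{i} + Q(x,y)\,\mathbf{j}$ and a curve $C$ with a smooth parametrization $x = x(t)$, $y = y(t)$, $a \le t \le b$ and position vector $\mathbf{r}(t) = x(t)\,\mathbf{i} + y(t)\,\mathbf{j}$,
\[ \int_C \mathbf{f} \cdot d\mathbf{r} = \int_C \mathbf{f} \cdot \mathbf{T}\,ds, \tag{4.15} \]
where $\mathbf{T}(t) = \frac{\mathbf{r}'(t)}{\lVert \mathbf{r}'(t) \rVert}$ is the unit tangent vector to $C$ at $(x(t), y(t))$.

If the vector field $\mathbf{f}(x,y)$ represents the force moving an object along a curve $C$, then the work $W$ done by this force is
\[ W = \int_C \mathbf{f} \cdot \mathbf{T}\,ds = \int_C \mathbf{f} \cdot d\mathbf{r}. \tag{4.16} \]

Example 4.2. Evaluate $\int_C (x^2 + y^2)\,dx + 2xy\,dy$, where:
(a) $C : x = t$, $y = 2t$, $0 \le t \le 1$
(b) $C : x = t$, $y = 2t^2$, $0 \le t \le 1$

[Figure 4.1.4: the curves (a) and (b), both running from $(0,0)$ to $(1,2)$]

Solution: Figure 4.1.4 shows both curves.
(a) Since $x'(t) = 1$ and $y'(t) = 2$, then
\[ \begin{aligned} \int_C (x^2 + y^2)\,dx + 2xy\,dy &= \int_0^1 \bigl( (x(t)^2 + y(t)^2)\,x'(t) + 2x(t)y(t)\,y'(t) \bigr)\,dt \\ &= \int_0^1 \bigl( (t^2 + 4t^2)(1) + 2t(2t)(2) \bigr)\,dt \\ &= \int_0^1 13t^2\,dt = \frac{13t^3}{3}\,\Big|_0^1 = \frac{13}{3}. \end{aligned} \]
(b) Since $x'(t) = 1$ and $y'(t) = 4t$, then
\[ \begin{aligned} \int_C (x^2 + y^2)\,dx + 2xy\,dy &= \int_0^1 \bigl( (x(t)^2 + y(t)^2)\,x'(t) + 2x(t)y(t)\,y'(t) \bigr)\,dt \\ &= \int_0^1 \bigl( (t^2 + 4t^4)(1) + 2t(2t^2)(4t) \bigr)\,dt \\ &= \int_0^1 (t^2 + 20t^4)\,dt = \Bigl( \frac{t^3}{3} + 4t^5 \Bigr)\Big|_0^1 = \frac{1}{3} + 4 = \frac{13}{3}. \end{aligned} \]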
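The two computations in Example 4.2 can also be checked numerically. The following is only a sketch: the helper name `line_integral` and the choice of composite Simpson's rule are assumptions made for illustration, not something taken from the text. It approximates $\int_a^b \bigl(P\,x'(t) + Q\,y'(t)\bigr)\,dt$ for each parametrization.

```python
def line_integral(P, Q, x, y, dx, dy, a, b, n=1000):
    """Approximate the line integral of P dx + Q dy along the curve
    x = x(t), y = y(t), a <= t <= b, using composite Simpson's rule.

    dx and dy are the derivatives x'(t) and y'(t); n must be even.
    (Hypothetical helper, written only for this illustration.)
    """
    def g(t):
        # Integrand P(x(t), y(t)) x'(t) + Q(x(t), y(t)) y'(t)
        return P(x(t), y(t)) * dx(t) + Q(x(t), y(t)) * dy(t)

    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3


def P(x, y):
    return x**2 + y**2


def Q(x, y):
    return 2 * x * y


# (a) x = t, y = 2t, 0 <= t <= 1
work_a = line_integral(P, Q, lambda t: t, lambda t: 2 * t,
                       lambda t: 1.0, lambda t: 2.0, 0.0, 1.0)

# (b) x = t, y = 2t^2, 0 <= t <= 1
work_b = line_integral(P, Q, lambda t: t, lambda t: 2 * t**2,
                       lambda t: 1.0, lambda t: 4 * t, 0.0, 1.0)

print(work_a, work_b)
```

Both printed values should agree with $13/3$ up to quadrature error.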

So in both cases, if the vector field $\mathbf{f}(x,y) = (x^2 + y^2)\,\mathbf{i} + 2xy\,\mathbf{j}$ represents the force moving an object from $(0,0)$ to $(1,2)$ along the given curve $C$, then the work done is $\frac{13}{3}$. This may lead you to think that work (and more generally, the line integral of a vector field) is independent of the path taken. However, as we will see in the next section, this is not always the case.

Although we defined line integrals over a single smooth curve, if $C$ is a piecewise smooth curve, that is $C = C_1 \cup C_2 \cup \dots \cup C_n$ is the union of smooth curves $C_1, \dots, C_n$, then we can define
\[ \int_C \mathbf{f} \cdot d\mathbf{r} = \int_{C_1} \mathbf{f} \cdot d\mathbf{r}_1 + \int_{C_2} \mathbf{f} \cdot d\mathbf{r}_2 + \dots + \int_{C_n} \mathbf{f} \cdot d\mathbf{r}_n \]
where each $\mathbf{r}_i$ is the position vector of the curve $C_i$.

Example 4.3. Evaluate $\int_C (x^2 + y^2)\,dx + 2xy\,dy$, where $C$ is the polygonal path from $(0,0)$ to $(0,2)$ to $(1,2)$.

[Figure 4.1.5: the polygonal path $C = C_1 \cup C_2$ from $(0,0)$ to $(0,2)$ to $(1,2)$]

Solution: Write $C = C_1 \cup C_2$, where $C_1$ is the curve given by $x = 0$, $y = t$, $0 \le t \le 2$ and $C_2$ is the curve given by $x = t$, $y = 2$, $0 \le t \le 1$ (see Figure 4.1.5). Then
\[ \begin{aligned} \int_C (x^2 + y^2)\,dx + 2xy\,dy &= \int_{C_1} (x^2 + y^2)\,dx + 2xy\,dy + \int_{C_2} (x^2 + y^2)\,dx + 2xy\,dy \\ &= \int_0^2 \bigl( (0^2 + t^2)(0) + 2(0)t(1) \bigr)\,dt + \int_0^1 \bigl( (t^2 + 4)(1) + 2t(2)(0) \bigr)\,dt \\ &= \int_0^2 0\,dt + \int_0^1 (t^2 + 4)\,dt \\ &= \Bigl( \frac{t^3}{3} + 4t \Bigr)\Big|_0^1 = \frac{1}{3} + 4 = \frac{13}{3}. \end{aligned} \]

Line integral notation varies quite a bit. For example, in physics it is common to see the notation $\int_a^b \mathbf{f} \cdot d\mathbf{l}$, where it is understood that the limits of integration $a$ and $b$ are for the underlying parameter $t$ of the curve, and the letter $\mathbf{l}$ signifies length. Also, the formulation $\int_C \mathbf{f} \cdot \mathbf{T}\,ds$ from Theorem 4.1 is often preferred in physics since it emphasizes the idea of integrating the tangential component $\mathbf{f} \cdot \mathbf{T}$ of $\mathbf{f}$ in the direction of $\mathbf{T}$ (i.e. in the direction of $C$), which is a useful physical interpretation of line integrals.
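The tangential-component formulation $\int_C \mathbf{f} \cdot \mathbf{T}\,ds$ and the piecewise definition can be sketched together in a few lines. The helper names `simpson` and `work_tangential` below are hypothetical, and Simpson's rule is just one convenient quadrature choice. The code builds $\mathbf{T} = \mathbf{r}'/\lVert \mathbf{r}' \rVert$ and $ds = \lVert \mathbf{r}' \rVert\,dt$ explicitly, then adds the contributions of the two segments of the polygonal path from Example 4.3.

```python
from math import hypot


def simpson(g, a, b, n=1000):
    """Composite Simpson's rule for the integral of g over [a, b] (n even)."""
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3


def work_tangential(f, r, dr, a, b):
    """Approximate the integral of f . T ds along the smooth curve r(t),
    a <= t <= b, where T = r'/|r'| and ds = |r'| dt.
    (Hypothetical helper for this illustration.)
    """
    def g(t):
        x, y = r(t)
        dx, dy = dr(t)
        speed = hypot(dx, dy)             # |r'(t)|, nonzero on a smooth curve
        tx, ty = dx / speed, dy / speed   # unit tangent vector T(t)
        fx, fy = f(x, y)
        return (fx * tx + fy * ty) * speed  # (f . T) ds/dt
    return simpson(g, a, b)


def f(x, y):
    return (x**2 + y**2, 2 * x * y)


# C1: x = 0, y = t, 0 <= t <= 2
w1 = work_tangential(f, lambda t: (0.0, t), lambda t: (0.0, 1.0), 0.0, 2.0)
# C2: x = t, y = 2, 0 <= t <= 1
w2 = work_tangential(f, lambda t: (t, 2.0), lambda t: (1.0, 0.0), 0.0, 1.0)

total_work = w1 + w2
print(w1, w2, total_work)
```

The first segment contributes nothing because $\mathbf{f}$ is perpendicular to $\mathbf{T}$ along the $y$-axis, so the total should match $13/3$ up to quadrature error.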

Exercises

A
For Exercises 1-4, calculate $\int_C f(x,y)\,ds$ for the given function $f(x,y)$ and curve $C$.
1. $f(x,y) = xy$; $C : x = \cos t$, $y = \sin t$, $0 \le t \le \pi/2$
2. $f(x,y) = \dfrac{x}{x^2 + 1}$; $C : x = t$, $y = 0$, $0 \le t \le 1$
3. $f(x,y) = 2x + y$; $C$: polygonal path from $(0,0)$ to $(3,0)$ to $(3,2)$
4. $f(x,y) = x + y^2$; $C$: path from $(2,0)$ counterclockwise along the circle $x^2 + y^2 = 4$ to the point $(-2,0)$ and then back to $(2,0)$ along the $x$-axis
5. Use a line integral to find the lateral surface area of the part of the cylinder $x^2 + y^2 = 4$ below the plane $x + 2y + z = 6$ and above the $xy$-plane.

For Exercises 6-11, calculate $\int_C \mathbf{f} \cdot d\mathbf{r}$ for the given vector field $\mathbf{f}(x,y)$ and curve $C$.
6. $\mathbf{f}(x,y) = \mathbf{i} - \mathbf{j}$; $C : x = 3t$, $y = 2t$, $0 \le t \le 1$
7. $\mathbf{f}(x,y) = y\,\mathbf{i} - x\,\mathbf{j}$; $C : x = \cos t$, $y = \sin t$, $0 \le t \le 2\pi$
8. $\mathbf{f}(x,y) = x\,\mathbf{i} + y\,\mathbf{j}$; $C : x = \cos t$, $y = \sin t$, $0 \le t \le 2\pi$
9. $\mathbf{f}(x,y) = (x^2 - y)\,\mathbf{i} + (x - y^2)\,\mathbf{j}$; $C : x = \cos t$, $y = \sin t$, $0 \le t \le 2\pi$
10. $\mathbf{f}(x,y) = xy^2\,\mathbf{i} + xy^3\,\mathbf{j}$; $C$: the polygonal path from $(0,0)$ to $(1,0)$ to $(0,1)$ to $(0,0)$
11. $\mathbf{f}(x,y) = (x^2 + y^2)\,\mathbf{i}$; $C : x = 2 + \cos t$, $y = \sin t$, $0 \le t \le 2\pi$

B
12. Verify that the value of the line integral in Example 4.1 is unchanged using the parametrization of the circle $C$ given in formulas (4.8).
13. Show that if $\mathbf{f} \perp \mathbf{r}'(t)$ at each point $\mathbf{r}(t)$ along a smooth curve $C$, then $\int_C \mathbf{f} \cdot d\mathbf{r} = 0$.
14. Show that if $\mathbf{f}$ points in the same direction as $\mathbf{r}'(t)$ at each point $\mathbf{r}(t)$ along a smooth curve $C$, then $\int_C \mathbf{f} \cdot d\mathbf{r} = \int_C \lVert \mathbf{f} \rVert\,ds$.

C
15. Prove that $\int_C f(x,y)\,ds = \int_{-C} f(x,y)\,ds$. (Hint: Use formulas (4.9).)
16. Let $C$ be a smooth curve with arc length $L$, and suppose that $\mathbf{f}(x,y) = P(x,y)\,\mathbf{i} + Q(x,y)\,\mathbf{j}$ is a vector field such that $\lVert \mathbf{f}(x,y) \rVert \le M$ for all $(x,y)$ on $C$. Show that $\left| \int_C \mathbf{f} \cdot d\mathbf{r} \right| \le ML$. (Hint: Recall that $\left| \int_a^b g(x)\,dx \right| \le \int_a^b |g(x)|\,dx$ for Riemann integrals.)
17. Prove that the Riemann integral $\int_a^b f(x)\,dx$ is a special case of a line integral.

4.2 Properties of Line Integrals

We know from the previous section that for line integrals of real-valued functions (scalar fields), reversing the direction in which the integral is taken along a curve does not change the value of the line integral:
\[ \int_C f(x,y)\,ds = \int_{-C} f(x,y)\,ds \tag{4.17} \]
For line integrals of vector fields, however, the value does change. To see this, let $\mathbf{f}(x,y) = P(x,y)\,\mathbf{i} + Q(x,y)\,\mathbf{j}$ be a vector field, with $P$ and $Q$ continuously differentiable functions. Let $C$ be a smooth curve parametrized by $x = x(t)$, $y = y(t)$, $a \le t \le b$, with position vector $\mathbf{r}(t) = x(t)\,\mathbf{i} + y(t)\,\mathbf{j}$ (we will usually abbreviate this by saying that $C : \mathbf{r}(t) = x(t)\,\mathbf{i} + y(t)\,\mathbf{j}$ is a smooth curve). We know that the curve $-C$ traversed in the opposite direction is parametrized by $x = x(a + b - t)$, $y = y(a + b - t)$, $a \le t \le b$. Then
\[ \begin{aligned} \int_{-C} P(x,y)\,dx &= \int_a^b P(x(a+b-t), y(a+b-t))\,\frac{d}{dt}\bigl( x(a+b-t) \bigr)\,dt \\ &= \int_a^b P(x(a+b-t), y(a+b-t))\,\bigl( -x'(a+b-t) \bigr)\,dt \quad \text{(by the Chain Rule)} \\ &= \int_b^a P(x(u), y(u))\,(-x'(u))\,(-du) \quad \text{(by letting } u = a + b - t\text{)} \\ &= \int_b^a P(x(u), y(u))\,x'(u)\,du \\ &= -\int_a^b P(x(u), y(u))\,x'(u)\,du, \quad \text{since } \int_b^a = -\int_a^b, \end{aligned} \]
so
\[ \int_{-C} P(x,y)\,dx = -\int_C P(x,y)\,dx \]
since we are just using a different letter ($u$) for the line integral along $C$. A similar argument shows that
\[ \int_{-C} Q(x,y)\,dy = -\int_C Q(x,y)\,dy, \]
and hence
\[ \begin{aligned} \int_{-C} \mathbf{f} \cdot d\mathbf{r} &= \int_{-C} P(x,y)\,dx + \int_{-C} Q(x,y)\,dy \\ &= -\int_C P(x,y)\,dx - \int_C Q(x,y)\,dy \\ &= -\left( \int_C P(x,y)\,dx + \int_C Q(x,y)\,dy \right), \end{aligned} \]
that is,
\[ \int_{-C} \mathbf{f} \cdot d\mathbf{r} = -\int_C \mathbf{f} \cdot d\mathbf{r}. \tag{4.18} \]

The above formula can be interpreted in terms of the work done by a force $\mathbf{f}(x,y)$ (treated as a vector) moving an object along a curve $C$: the total work performed moving the object along $C$ from its initial point to its terminal point, and then back to the initial point moving backwards along the same path, is zero. This is because when force is considered as a vector, direction is accounted for.

The preceding discussion shows the importance of always taking the direction of the curve into account when using line integrals of vector fields. For this reason, the curves in line integrals are sometimes referred to as directed curves or oriented curves.

Recall that our definition of a line integral required that we have a parametrization $x = x(t)$, $y = y(t)$, $a \le t \le b$ for the curve $C$. But as we know, any curve has infinitely many parametrizations. So could we get a different value for a line integral using some other parametrization of $C$, say, $x = \tilde{x}(u)$, $y = \tilde{y}(u)$, $c \le u \le d$? If so, this would mean that our definition is not well-defined. Luckily, it turns out that the value of a line integral of a vector field is unchanged as long as the direction of the curve $C$ is preserved by whatever parametrization is chosen:

Theorem 4.2. Let $\mathbf{f}(x,y) = P(x,y)\,\mathbf{i} + Q(x,y)\,\mathbf{j}$ be a vector field, and let $C$ be a smooth curve parametrized by $x = x(t)$, $y = y(t)$, $a \le t \le b$. Suppose that $t = \alpha(u)$ for $c \le u \le d$, such that $a = \alpha(c)$, $b = \alpha(d)$, and $\alpha'(u) > 0$ on the open interval $(c,d)$ (i.e. $\alpha(u)$ is strictly increasing on $[c,d]$). Then $\int_C \mathbf{f} \cdot d\mathbf{r}$ has the same value for the parametrizations $x = x(t)$, $y = y(t)$, $a \le t \le b$ and $x = \tilde{x}(u) = x(\alpha(u))$, $y = \tilde{y}(u) = y(\alpha(u))$, $c \le u \le d$.

Proof: Since $\alpha(u)$ is strictly increasing and maps $[c,d]$ onto $[a,b]$, then we know that $t = \alpha(u)$ has an inverse function $u = \alpha^{-1}(t)$ defined on $[a,b]$ such that $c = \alpha^{-1}(a)$, $d = \alpha^{-1}(b)$, and $\frac{du}{dt} = \frac{1}{\alpha'(u)}$. Also, $dt = \alpha'(u)\,du$, and by the Chain Rule
\[ \tilde{x}'(u) = \frac{d\tilde{x}}{du} = \frac{d}{du}\bigl( x(\alpha(u)) \bigr) = \frac{dx}{dt}\,\frac{dt}{du} = x'(t)\,\alpha'(u) \quad \Rightarrow \quad x'(t) = \frac{\tilde{x}'(u)}{\alpha'(u)} \]
so making the substitution $t = \alpha(u)$ gives
\[ \begin{aligned} \int_a^b P(x(t), y(t))\,x'(t)\,dt &= \int_{\alpha^{-1}(a)}^{\alpha^{-1}(b)} P(x(\alpha(u)), y(\alpha(u)))\,\frac{\tilde{x}'(u)}{\alpha'(u)}\,\bigl( \alpha'(u)\,du \bigr) \\ &= \int_c^d P(\tilde{x}(u), \tilde{y}(u))\,\tilde{x}'(u)\,du, \end{aligned} \]
which shows that $\int_C P(x,y)\,dx$ has the same value for both parametrizations. A similar argument shows that $\int_C Q(x,y)\,dy$ has the same value for both parametrizations, and hence $\int_C \mathbf{f} \cdot d\mathbf{r}$ has the same value. QED

Notice that the condition $\alpha'(u) > 0$ in Theorem 4.2 means that the two parametrizations move along $C$ in the same direction. That was not the case with the "reverse" parametrization for $-C$: for $u = a + b - t$ we have $t = \alpha(u) = a + b - u \Rightarrow \alpha'(u) = -1 < 0$.
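A small numerical experiment can make the role of the sign of $\alpha'(u)$ concrete. The sketch below is an illustration under stated assumptions: the field $\mathbf{f} = (x^2 + y^2)\,\mathbf{i} + 2xy\,\mathbf{j}$ and the curve $x = t$, $y = 2t^2$, $0 \le t \le 1$ are borrowed from the earlier examples, while the substitution $t = u^2$, the helper names, and the use of Simpson's rule are choices made here. It compares the original parametrization with one that preserves direction ($\alpha'(u) = 2u > 0$ on $(0,1)$) and one that reverses it ($t = 1 - u$, $\alpha'(u) = -1$).

```python
def simpson(g, a, b, n=2000):
    """Composite Simpson's rule for the integral of g over [a, b] (n even)."""
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3


def integrand(x, y, dx, dy):
    """P dx + Q dy for P = x^2 + y^2 and Q = 2xy, per unit of the parameter."""
    return (x**2 + y**2) * dx + (2 * x * y) * dy


# Original parametrization: x = t, y = 2t^2, 0 <= t <= 1
original = simpson(lambda t: integrand(t, 2 * t**2, 1.0, 4 * t), 0.0, 1.0)

# Same direction: t = alpha(u) = u^2, so x = u^2, y = 2u^4, 0 <= u <= 1
same_direction = simpson(
    lambda u: integrand(u**2, 2 * u**4, 2 * u, 8 * u**3), 0.0, 1.0)

# Opposite direction: t = alpha(u) = 1 - u, so x = 1 - u, y = 2(1 - u)^2
opposite_direction = simpson(
    lambda u: integrand(1 - u, 2 * (1 - u)**2, -1.0, -4 * (1 - u)), 0.0, 1.0)

print(original, same_direction, opposite_direction)
```

The first two values should coincide, as Theorem 4.2 predicts, and the third should be their negative, in line with formula (4.18).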

Example 4.4. Evaluate the line integral $\int_C (x^2 + y^2)\,dx + 2xy\,dy$ from Example 4.2, Section 4.1, along the curve $C : x = t$, $y = 2t^2$, $0 \le t \le 1$, where $t = \sin u$ for $0 \le u \le \pi/2$.

Solution: First, we notice that $0 = \sin 0$, $1 = \sin(\pi/2)$, and $\frac{dt}{du} = \cos u > 0$ on $(0, \pi/2)$. So by Theorem 4.2 we know that if $C$ is parametrized by
\[ x = \sin u, \quad y = 2\sin^2 u, \quad 0 \le u \le \pi/2 \]
then $\int_C (x^2 + y^2)\,dx + 2xy\,dy$ should have the same value as we found in Example 4.2, namely $\frac{13}{3}$. And we can indeed verify this:
\[ \begin{aligned} \int_C (x^2 + y^2)\,dx + 2xy\,dy &= \int_0^{\pi/2} \Bigl( \bigl( \sin^2 u + (2\sin^2 u)^2 \bigr) \cos u + 2(\sin u)(2\sin^2 u)\,4\sin u \cos u \Bigr)\,du \\ &= \int_0^{\pi/2} \bigl( \sin^2 u + 20\sin^4 u \bigr) \cos u\,du \\ &= \Bigl( \frac{\sin^3 u}{3} + 4\sin^5 u \Bigr)\Big|_0^{\pi/2} = \frac{1}{3} + 4 = \frac{13}{3} \end{aligned} \]
In other words, the line integral is unchanged whether $t$ or $u$ is the parameter for $C$.

By a closed curve, we mean a curve $C$ whose initial point and terminal point are the same, i.e. for $C : x = x(t)$, $y = y(t)$, $a \le t \le b$, we have $(x(a), y(a)) = (x(b), y(b))$.

[Figure 4.2.1 Closed vs nonclosed curves: (a) Closed, (b) Not closed]

A simple closed curve is a closed curve which does not intersect itself. Note that any closed curve can be regarded as a union of simple closed curves (think of the loops in a figure eight). We use the special notation
\[ \oint_C f(x,y)\,ds \quad \text{and} \quad \oint_C \mathbf{f} \cdot d\mathbf{r} \]
to denote line integrals of scalar and vector fields, respectively, along closed curves. In some older texts you may see the notation $\ointctrclockwise$ or $\ointclockwise$ to indicate a line integral traversing a closed curve in a counterclockwise or clockwise direction, respectively.

So far, the examples we have seen of line integrals (e.g. Example 4.2) have had the same value for different curves joining the initial point to the terminal point. That is, the line integral has been independent of the path joining the two points. As we mentioned before, this is not always the case. The following theorem gives a necessary and sufficient condition for this path independence:

Theorem 4.3. In a region $R$, the line integral $\int_C \mathbf{f} \cdot d\mathbf{r}$ is independent of the path between any two points in $R$ if and only if $\oint_C \mathbf{f} \cdot d\mathbf{r} = 0$ for every closed curve $C$ which is contained in $R$.

Proof: Suppose that $\oint_C \mathbf{f} \cdot d\mathbf{r} = 0$ for every closed curve $C$ which is contained in $R$. Let $P_1$ and $P_2$ be two distinct points in $R$. Let $C_1$ be a curve in $R$ going from $P_1$ to $P_2$, and let $C_2$ be another curve in $R$ going from $P_1$ to $P_2$, as in Figure 4.2.2.

[Figure 4.2.2: two curves $C_1$ and $C_2$ from $P_1$ to $P_2$]

Then $C = C_1 \cup -C_2$ is a closed curve in $R$ (from $P_1$ to $P_1$), and so $\oint_C \mathbf{f} \cdot d\mathbf{r} = 0$. Thus,
\[ \begin{aligned} 0 &= \oint_C \mathbf{f} \cdot d\mathbf{r} \\ &= \int_{C_1} \mathbf{f} \cdot d\mathbf{r} + \int_{-C_2} \mathbf{f} \cdot d\mathbf{r} \\ &= \int_{C_1} \mathbf{f} \cdot d\mathbf{r} - \int_{C_2} \mathbf{f} \cdot d\mathbf{r}, \end{aligned} \]
and so $\int_{C_1} \mathbf{f} \cdot d\mathbf{r} = \int_{C_2} \mathbf{f} \cdot d\mathbf{r}$. This proves path independence.

Conversely, suppose that the line integral $\int_C \mathbf{f} \cdot d\mathbf{r}$ is independent of the path between any two points in $R$. Let $C$ be a closed curve contained in $R$. Let $P_1$ and $P_2$ be two distinct points on $C$. Let $C_1$ be a part of the curve $C$ that goes from $P_1$ to $P_2$, and let $C_2$ be the remaining part of $C$ that goes from $P_1$ to $P_2$, again as in Figure 4.2.2. Then by path independence we have
\[ \begin{aligned} \int_{C_1} \mathbf{f} \cdot d\mathbf{r} &= \int_{C_2} \mathbf{f} \cdot d\mathbf{r} \\ \int_{C_1} \mathbf{f} \cdot d\mathbf{r} - \int_{C_2} \mathbf{f} \cdot d\mathbf{r} &= 0 \\ \int_{C_1} \mathbf{f} \cdot d\mathbf{r} + \int_{-C_2} \mathbf{f} \cdot d\mathbf{r} &= 0, \text{ so} \\ \oint_C \mathbf{f} \cdot d\mathbf{r} &= 0 \end{aligned} \]
since $C = C_1 \cup -C_2$. QED

Clearly, the above theorem does not give a practical way to determine path independence, since it is impossible to check the line integrals around all possible closed curves in a region. What it mostly does is give an idea of the way in which line integrals behave, and how seemingly unrelated line integrals can be related (in this case, a specific line integral between two points and all line integrals around closed curves).

For a more practical method for determining path independence, we first need a version of the Chain Rule for multivariable functions:

Theorem 4.4. (Chain Rule) If $z = f(x,y)$ is a continuously differentiable function of $x$ and $y$, and both $x = x(t)$ and $y = y(t)$ are differentiable functions of $t$, then $z$ is a differentiable function of $t$, and
\[ \frac{dz}{dt} = \frac{\partial z}{\partial x}\,\frac{dx}{dt} + \frac{\partial z}{\partial y}\,\frac{dy}{dt} \tag{4.19} \]
at all points where the derivatives on the right are defined.

The proof is virtually identical to the proof of Theorem 2.2 from Section 2.4 (which uses the Mean Value Theorem), so we omit it.¹ We will now use this Chain Rule to prove the following sufficient condition for path independence of line integrals:

Theorem 4.5. Let $\mathbf{f}(x,y) = P(x,y)\,\mathbf{i} + Q(x,y)\,\mathbf{j}$ be a vector field in some region $R$, with $P$ and $Q$ continuously differentiable functions on $R$. Let $C$ be a smooth curve in $R$ parametrized by $x = x(t)$, $y = y(t)$, $a \le t \le b$. Suppose that there is a real-valued function $F(x,y)$ such that $\nabla F = \mathbf{f}$ on $R$. Then
\[ \int_C \mathbf{f} \cdot d\mathbf{r} = F(B) - F(A), \tag{4.20} \]
where $A = (x(a), y(a))$ and $B = (x(b), y(b))$ are the endpoints of $C$. Thus, the line integral is independent of the path between its endpoints, since it depends only on the values of $F$ at those endpoints.

Proof: By definition of $\int_C \mathbf{f} \cdot d\mathbf{r}$, we have
\[ \begin{aligned} \int_C \mathbf{f} \cdot d\mathbf{r} &= \int_a^b \bigl( P(x(t),y(t))\,x'(t) + Q(x(t),y(t))\,y'(t) \bigr)\,dt \\ &= \int_a^b \left( \frac{\partial F}{\partial x}\,\frac{dx}{dt} + \frac{\partial F}{\partial y}\,\frac{dy}{dt} \right) dt \quad \Bigl(\text{since } \nabla F = \mathbf{f} \Rightarrow \frac{\partial F}{\partial x} = P \text{ and } \frac{\partial F}{\partial y} = Q\Bigr) \\ &= \int_a^b F'(x(t), y(t))\,dt \quad \text{(by the Chain Rule in Theorem 4.4)} \\ &= F(x(t), y(t))\,\Big|_a^b = F(B) - F(A) \end{aligned} \]
by the Fundamental Theorem of Calculus. QED

¹ See TAYLOR and MANN, § 6.5.
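Theorem 4.5 lends itself to a quick numerical sanity check. The sketch below rests on assumptions chosen for illustration: the function $F(x,y) = \frac{1}{3}x^3 + xy^2$ is used as a potential (differentiating it gives $\nabla F = (x^2 + y^2)\,\mathbf{i} + 2xy\,\mathbf{j}$), and the "wavy" path $x = t$, $y = 2\sin(\pi t/2)$ is a hypothetical curve invented here, as are the helper names. Both paths run from $A = (0,0)$ to $B = (1,2)$.

```python
from math import sin, cos, pi


def simpson(g, a, b, n=2000):
    """Composite Simpson's rule for the integral of g over [a, b] (n even)."""
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3


def F(x, y):
    """Assumed potential: F(x, y) = x^3/3 + x y^2."""
    return x**3 / 3 + x * y**2


def f(x, y):
    """Gradient of F: (dF/dx, dF/dy) = (x^2 + y^2, 2xy)."""
    return (x**2 + y**2, 2 * x * y)


def line_integral(x, y, dx, dy, a, b):
    """Approximate the line integral of f along x = x(t), y = y(t)."""
    def g(t):
        fx, fy = f(x(t), y(t))
        return fx * dx(t) + fy * dy(t)
    return simpson(g, a, b)


# Straight segment from (0, 0) to (1, 2): x = t, y = 2t
straight = line_integral(lambda t: t, lambda t: 2 * t,
                         lambda t: 1.0, lambda t: 2.0, 0.0, 1.0)

# Hypothetical wavy path with the same endpoints: x = t, y = 2 sin(pi t / 2)
wavy = line_integral(lambda t: t, lambda t: 2 * sin(pi * t / 2),
                     lambda t: 1.0, lambda t: pi * cos(pi * t / 2), 0.0, 1.0)

endpoint_difference = F(1.0, 2.0) - F(0.0, 0.0)
print(straight, wavy, endpoint_difference)
```

All three printed numbers should agree up to quadrature error, since each line integral depends only on the values of $F$ at the endpoints.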

Theorem 4.5 can be thought of as the line integral version of the Fundamental Theorem of Calculus. A real-valued function $F(x,y)$ such that $\nabla F(x,y) = \mathbf{f}(x,y)$ is called a potential for $\mathbf{f}$. A conservative vector field is one which has a potential.

Example 4.5. Recall from Examples 4.2 and 4.3 in Section 4.1 that the line integral $\int_C (x^2 + y^2)\,dx + 2xy\,dy$ was found to have the value $\frac{13}{3}$ for three different curves $C$ going from the point $(0,0)$ to the point $(1,2)$. Use Theorem 4.5 to show that this line integral is indeed path independent.

Solution: We need to find a real-valued function $F(x,y)$ such that
\[ \frac{\partial F}{\partial x} = x^2 + y^2 \quad \text{and} \quad \frac{\partial F}{\partial y} = 2xy. \]
Suppose that $\frac{\partial F}{\partial x} = x^2 + y^2$. Then we must have $F(x,y) = \frac{1}{3}x^3 + xy^2 + g(y)$ for some function $g(y)$. So $\frac{\partial F}{\partial y} = 2xy + g'(y)$ satisfies the condition $\frac{\partial F}{\partial y} = 2xy$ if $g'(y) = 0$, i.e. $g(y) = K$, where $K$ is a constant. Since any choice for $K$ will do (why?), we pick $K = 0$. Thus, a potential $F(x,y)$ for $\mathbf{f}(x,y) = (x^2 + y^2)\,\mathbf{i} + 2xy\,\mathbf{j}$ exists, namely
\[ F(x,y) = \frac{1}{3}x^3 + xy^2. \]
Hence the line integral $\int_C (x^2 + y^2)\,dx + 2xy\,dy$ is path independent.

Note that we can also verify that the value of the line integral of $\mathbf{f}$ along any curve $C$ going from $(0,0)$ to $(1,2)$ will always be $\frac{13}{3}$, since by Theorem 4.5
\[ \int_C \mathbf{f} \cdot d\mathbf{r} = F(1,2) - F(0,0) = \frac{1}{3}(1)^3 + (1)(2)^2 - (0 + 0) = \frac{1}{3} + 4 = \frac{13}{3}. \]

A consequence of Theorem 4.5 in the special case where $C$ is a closed curve, so that the endpoints $A$ and $B$ are the same point, is the following important corollary:

Corollary 4.6. If a vector field $\mathbf{f}$ has a potential in a region $R$, then $\oint_C \mathbf{f} \cdot d\mathbf{r} = 0$ for any closed curve $C$ in $R$ (i.e. $\oint_C \nabla F \cdot d\mathbf{r} = 0$ for any real-valued function $F(x,y)$).

Example 4.6. Evaluate $\oint_C x\,dx + y\,dy$ for $C : x = 2\cos t$, $y = 3\sin t$, $0 \le t \le 2\pi$.

Solution: The vector field $\mathbf{f}(x,y) = x\,\mathbf{i} + y\,\mathbf{j}$ has a potential $F(x,y)$:
\[ \frac{\partial F}{\partial x} = x \;\Rightarrow\; F(x,y) = \frac{1}{2}x^2 + g(y), \text{ so} \]
\[ \frac{\partial F}{\partial y} = y \;\Rightarrow\; g'(y) = y \;\Rightarrow\; g(y) = \frac{1}{2}y^2 + K \]

for any constant $K$, so $F(x,y) = \frac{1}{2}x^2 + \frac{1}{2}y^2$ is a potential for $\mathbf{f}(x,y)$. Thus,
\[ \oint_C x\,dx + y\,dy = \oint_C \mathbf{f} \cdot d\mathbf{r} = 0 \]
by Corollary 4.6, since the curve $C$ is closed (it is the ellipse $\frac{x^2}{4} + \frac{y^2}{9} = 1$).

Exercises

A
1. Evaluate $\oint_C (x^2 + y^2)\,dx + 2xy\,dy$ for $C : x = \cos t$, $y = \sin t$, $0 \le t \le 2\pi$.
2. Evaluate $\int_C (x^2 + y^2)\,dx + 2xy\,dy$ for $C : x = \cos t$, $y = \sin t$, $0 \le t \le \pi$.
3. Is there a potential $F(x,y)$ for $\mathbf{f}(x,y) = y\,\mathbf{i} - x\,\mathbf{j}$? If so, find one.
4. Is there a potential $F(x,y)$ for $\mathbf{f}(x,y) = x\,\mathbf{i} - y\,\mathbf{j}$? If so, find one.
5. Is there a potential $F(x,y)$ for $\mathbf{f}(x,y) = xy^2\,\mathbf{i} + x^3y\,\mathbf{j}$? If so, find one.

B
6. Let $\mathbf{f}(x,y)$ and $\mathbf{g}(x,y)$ be vector fields, let $a$ and $b$ be constants, and let $C$ be a curve in $\mathbb{R}^2$. Show that
\[ \int_C (a\,\mathbf{f} \pm b\,\mathbf{g}) \cdot d\mathbf{r} = a \int_C \mathbf{f} \cdot d\mathbf{r} \pm b \int_C \mathbf{g} \cdot d\mathbf{r}. \]
7. Let $C$ be a curve whose arc length is $L$. Show that $\int_C 1\,ds = L$.
8. Let $f(x,y)$ and $g(x,y)$ be continuously differentiable real-valued functions in a region $R$. Show that
\[ \oint_C f\,\nabla g \cdot d\mathbf{r} = -\oint_C g\,\nabla f \cdot d\mathbf{r} \]
for any closed curve $C$ in $R$. (Hint: Use Exercise 21 in Section 2.4.)
9. Let $\mathbf{f}(x,y) = \dfrac{-y}{x^2 + y^2}\,\mathbf{i} + \dfrac{x}{x^2 + y^2}\,\mathbf{j}$ for all $(x,y) \ne (0,0)$, and $C : x = \cos t$, $y = \sin t$, $0 \le t \le 2\pi$.
(a) Show that $\mathbf{f} = \nabla F$, for $F(x,y) = \tan^{-1}(y/x)$.
(b) Show that $\oint_C \mathbf{f} \cdot d\mathbf{r} = 2\pi$. Does this contradict Corollary 4.6? Explain.

C
10. Let $g(x)$ and $h(y)$ be differentiable functions, and let $\mathbf{f}(x,y) = h(y)\,\mathbf{i} + g(x)\,\mathbf{j}$. Can $\mathbf{f}$ have a potential $F(x,y)$? If so, find it. You may assume that $F$ would be smooth. (Hint: Consider the mixed partial derivatives of $F$.)

4.3 Green's Theorem

We will now see a way of evaluating the line integral of a smooth vector field around a simple closed curve. A vector field $\mathbf{f}(x,y) = P(x,y)\,\mathbf{i} + Q(x,y)\,\mathbf{j}$ is smooth if its component functions $P(x,y)$ and $Q(x,y)$ are smooth. We will use Green's Theorem (sometimes called Green's Theorem in the plane) to relate the line integral around a closed curve with a double integral over the region inside the curve:

Theorem 4.7. (Green's Theorem) Let $R$ be a region in $\mathbb{R}^2$ whose boundary is a simple closed curve $C$ which is piecewise smooth. Let $\mathbf{f}(x,y) = P(x,y)\,\mathbf{i} + Q(x,y)\,\mathbf{j}$ be a smooth vector field defined on both $R$ and $C$. Then
\[ \oint_C \mathbf{f} \cdot d\mathbf{r} = \iint_R \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA, \tag{4.21} \]
where $C$ is traversed so that $R$ is always on the left side of $C$.

Proof: We will prove the theorem in the case for a simple region $R$, that is, where the boundary curve $C$ can be written as $C = C_1 \cup C_2$ in two distinct ways:
\begin{align} C_1 &= \text{the curve } y = y_1(x) \text{ from the point } X_1 \text{ to the point } X_2 \tag{4.22} \\ C_2 &= \text{the curve } y = y_2(x) \text{ from the point } X_2 \text{ to the point } X_1, \tag{4.23} \end{align}
where $X_1$ and $X_2$ are the points on $C$ farthest to the left and right, respectively; and
\begin{align} C_1 &= \text{the curve } x = x_1(y) \text{ from the point } Y_2 \text{ to the point } Y_1 \tag{4.24} \\ C_2 &= \text{the curve } x = x_2(y) \text{ from the point } Y_1 \text{ to the point } Y_2, \tag{4.25} \end{align}
where $Y_1$ and $Y_2$ are the lowest and highest points, respectively, on $C$. See Figure 4.3.1.

[Figure 4.3.1: a simple region $R$ with boundary $C$, bounded below and above by $y = y_1(x)$ and $y = y_2(x)$ for $a \le x \le b$, and on the left and right by $x = x_1(y)$ and $x = x_2(y)$ for $c \le y \le d$]

Integrate $P(x,y)$ around $C$ using the representation $C = C_1 \cup C_2$ given by (4.22) and (4.23). Since $y = y_1(x)$ along $C_1$ (as $x$ goes from $a$ to $b$) and $y = y_2(x)$ along $C_2$ (as $x$ goes from $b$ to $a$), as we see from Figure 4.3.1, then we have
\[ \begin{aligned} \oint_C P(x,y)\,dx &= \int_{C_1} P(x,y)\,dx + \int_{C_2} P(x,y)\,dx \\ &= \int_a^b P(x, y_1(x))\,dx + \int_b^a P(x, y_2(x))\,dx \\ &= \int_a^b P(x, y_1(x))\,dx - \int_a^b P(x, y_2(x))\,dx \\ &= -\int_a^b \bigl( P(x, y_2(x)) - P(x, y_1(x)) \bigr)\,dx \\ &= -\int_a^b \Bigl( P(x,y)\,\Big|_{y = y_1(x)}^{y = y_2(x)} \Bigr)\,dx \\ &= -\int_a^b \int_{y_1(x)}^{y_2(x)} \frac{\partial P(x,y)}{\partial y}\,dy\,dx \quad \text{(by the Fundamental Theorem of Calculus)} \\ &= -\iint_R \frac{\partial P}{\partial y}\,dA. \end{aligned} \]
Likewise, integrate $Q(x,y)$ around $C$ using the representation $C = C_1 \cup C_2$ given by (4.24) and (4.25). Since $x = x_1(y)$ along $C_1$ (as $y$ goes from $d$ to $c$) and $x = x_2(y)$ along $C_2$ (as $y$ goes from $c$ to $d$), as we see from Figure 4.3.1, then we have
\[ \begin{aligned} \oint_C Q(x,y)\,dy &= \int_{C_1} Q(x,y)\,dy + \int_{C_2} Q(x,y)\,dy \\ &= \int_d^c Q(x_1(y), y)\,dy + \int_c^d Q(x_2(y), y)\,dy \\ &= -\int_c^d Q(x_1(y), y)\,dy + \int_c^d Q(x_2(y), y)\,dy \\ &= \int_c^d \bigl( Q(x_2(y), y) - Q(x_1(y), y) \bigr)\,dy \\ &= \int_c^d \Bigl( Q(x,y)\,\Big|_{x = x_1(y)}^{x = x_2(y)} \Bigr)\,dy \\ &= \int_c^d \int_{x_1(y)}^{x_2(y)} \frac{\partial Q(x,y)}{\partial x}\,dx\,dy \quad \text{(by the Fundamental Theorem of Calculus)} \\ &= \iint_R \frac{\partial Q}{\partial x}\,dA, \end{aligned} \]
and so

\[ \begin{aligned} \oint_C \mathbf{f} \cdot d\mathbf{r} &= \oint_C P(x,y)\,dx + \oint_C Q(x,y)\,dy \\ &= -\iint_R \frac{\partial P}{\partial y}\,dA + \iint_R \frac{\partial Q}{\partial x}\,dA \\ &= \iint_R \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA. \end{aligned} \]
QED

Though we proved Green's Theorem only for a simple region $R$, the theorem can also be proved for more general regions (say, a union of simple regions).²

Example 4.7. Evaluate $\oint_C (x^2 + y^2)\,dx + 2xy\,dy$, where $C$ is the boundary (traversed counterclockwise) of the region $R = \{ (x,y) : 0 \le x \le 1,\ 2x^2 \le y \le 2x \}$.

[Figure 4.3.2: the region $R$ between $y = 2x^2$ and $y = 2x$, with boundary $C$ passing through $(0,0)$ and $(1,2)$]

Solution: $R$ is the shaded region in Figure 4.3.2. By Green's Theorem, for $P(x,y) = x^2 + y^2$ and $Q(x,y) = 2xy$, we have
\[ \begin{aligned} \oint_C (x^2 + y^2)\,dx + 2xy\,dy &= \iint_R \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA \\ &= \iint_R (2y - 2y)\,dA = \iint_R 0\,dA = 0. \end{aligned} \]
We actually already knew that the answer was zero. Recall from Example 4.5 in Section 4.2 that the vector field $\mathbf{f}(x,y) = (x^2 + y^2)\,\mathbf{i} + 2xy\,\mathbf{j}$ has a potential function $F(x,y) = \frac{1}{3}x^3 + xy^2$, and so $\oint_C \mathbf{f} \cdot d\mathbf{r} = 0$ by Corollary 4.6.

Example 4.8. Let $\mathbf{f}(x,y) = P(x,y)\,\mathbf{i} + Q(x,y)\,\mathbf{j}$, where
\[ P(x,y) = \frac{-y}{x^2 + y^2} \quad \text{and} \quad Q(x,y) = \frac{x}{x^2 + y^2}, \]
and let $R = \{ (x,y) : 0 < x^2 + y^2 \le 1 \}$. For the boundary curve $C : x^2 + y^2 = 1$, traversed counterclockwise, it was shown in Exercise 9(b) in Section 4.2 that $\oint_C \mathbf{f} \cdot d\mathbf{r} = 2\pi$. But
\[ \frac{\partial Q}{\partial x} = \frac{y^2 - x^2}{(x^2 + y^2)^2} = \frac{\partial P}{\partial y} \quad \Rightarrow \quad \iint_R \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA = \iint_R 0\,dA = 0. \]
This would seem to contradict Green's Theorem. However, note that $R$ is not the entire region enclosed by $C$, since the point $(0,0)$ is not contained in $R$. That is, $R$ has a "hole" at the origin, so Green's Theorem does not apply.

² See TAYLOR and MANN, § 15.31 for a discussion of some of the difficulties involved when the boundary curve is "complicated".
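The apparent paradox in Example 4.8 can be reproduced numerically. The following sketch makes several illustrative assumptions: the helper names `circulation` and `green_integrand` are invented here, Simpson's rule is used for the boundary integral, and central finite differences with step `h` stand in for the exact partial derivatives. It evaluates the line integral around the unit circle and samples $\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}$ at a few arbitrarily chosen points away from the origin.

```python
from math import cos, sin, pi


def P(x, y):
    return -y / (x * x + y * y)


def Q(x, y):
    return x / (x * x + y * y)


def circulation(radius=1.0, n=2000):
    """Simpson approximation of the closed line integral of P dx + Q dy
    around the circle of the given radius, traversed counterclockwise."""
    def g(t):
        x, y = radius * cos(t), radius * sin(t)
        dx, dy = -radius * sin(t), radius * cos(t)
        return P(x, y) * dx + Q(x, y) * dy

    h = 2 * pi / n
    total = g(0.0) + g(2 * pi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(k * h)
    return total * h / 3


def green_integrand(x, y, h=1e-5):
    """Central-difference estimate of dQ/dx - dP/dy at (x, y)."""
    dQdx = (Q(x + h, y) - Q(x - h, y)) / (2 * h)
    dPdy = (P(x, y + h) - P(x, y - h)) / (2 * h)
    return dQdx - dPdy


loop = circulation()
# Sample points chosen arbitrarily inside the punctured disk
samples = [green_integrand(0.3, 0.4),
           green_integrand(-0.5, 0.2),
           green_integrand(0.1, -0.7)]
print(loop, samples)
```

The boundary integral comes out close to $2\pi$ while the sampled integrand is essentially zero, which is consistent with the text's point that the hypotheses of Green's Theorem fail because of the hole at the origin.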

If we modify the region $R$ to be the annulus $R = \{ (x,y) : 1/4 \le x^2 + y^2 \le 1 \}$ (see Figure 4.3.3), and take the "boundary" $C$ of $R$ to be $C = C_1 \cup C_2$, where $C_1$ is the unit circle $x^2 + y^2 = 1$ traversed counterclockwise and $C_2$ is the circle $x^2 + y^2 = 1/4$ traversed clockwise, then it can be shown (see Exercise 8) that
\[ \oint_C \mathbf{f} \cdot d\mathbf{r} = 0. \]

[Figure 4.3.3 The annulus $R$, with outer boundary $C_1$ of radius 1 and inner boundary $C_2$ of radius 1/2]

We would still have $\iint_R \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA = 0$, so for this $R$ we would have
\[ \oint_C \mathbf{f} \cdot d\mathbf{r} = \iint_R \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA, \]
which shows that Green's Theorem holds for the annular region $R$.

It turns out that Green's Theorem can be extended to multiply connected regions, that is, regions like the annulus in Example 4.8, which have one or more regions cut out from the interior, as opposed to discrete points being cut out. For such regions, the "outer" boundary and the "inner" boundaries are traversed so that $R$ is always on the left side.

[Figure 4.3.4 Multiply connected regions: (a) Region $R$ with one hole, (b) Region $R$ with two holes]

The intuitive idea for why Green's Theorem holds for multiply connected regions is shown in Figure 4.3.4 above. The idea is to cut "slits" between the boundaries of a multiply connected region $R$ so that $R$ is divided into subregions which do not have any "holes". For example, in Figure 4.3.4(a) the region $R$ is the union of the regions $R_1$ and $R_2$, which are divided by the slits indicated by the dashed lines. Those slits are part of the boundary of both $R_1$ and $R_2$, and we traverse them in the manner indicated by

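the arrows. Notice that along each slit the boundary of $R_1$ is traversed in the opposite direction as that of $R_2$, which means that the line integrals of $\mathbf{f}$ along those slits cancel each other out. Since $R_1$ and $R_2$ do not have holes in them, then Green's Theorem holds in each subregion, so that
\[ \oint_{\text{bdy of } R_1} \mathbf{f} \cdot d\mathbf{r} = \iint_{R_1} \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA \quad \text{and} \quad \oint_{\text{bdy of } R_2} \mathbf{f} \cdot d\mathbf{r} = \iint_{R_2} \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA. \]
But since the line integrals along the slits cancel out, we have
\[ \oint_{C_1 \cup C_2} \mathbf{f} \cdot d\mathbf{r} = \oint_{\text{bdy of } R_1} \mathbf{f} \cdot d\mathbf{r} + \oint_{\text{bdy of } R_2} \mathbf{f} \cdot d\mathbf{r}, \]
and so
\[ \oint_{C_1 \cup C_2} \mathbf{f} \cdot d\mathbf{r} = \iint_{R_1} \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA + \iint_{R_2} \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA = \iint_R \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA, \]
which shows that Green's Theorem holds in the region $R$. A similar argument shows that the theorem holds in the region with two holes shown in Figure 4.3.4(b).

We know from Corollary 4.6 that when a smooth vector field $\mathbf{f}(x,y) = P(x,y)\,\mathbf{i} + Q(x,y)\,\mathbf{j}$ on a region $R$ (whose boundary is a piecewise smooth, simple closed curve $C$) has a potential in $R$, then $\oint_C \mathbf{f} \cdot d\mathbf{r} = 0$. And if the potential $F(x,y)$ is smooth in $R$, then $\frac{\partial F}{\partial x} = P$ and $\frac{\partial F}{\partial y} = Q$, and so we know that
\[ \frac{\partial^2 F}{\partial y\,\partial x} = \frac{\partial^2 F}{\partial x\,\partial y} \quad \Rightarrow \quad \frac{\partial P}{\partial y} = \frac{\partial Q}{\partial x} \text{ in } R. \]
Conversely, if $\frac{\partial P}{\partial y} = \frac{\partial Q}{\partial x}$ in $R$ then
\[ \oint_C \mathbf{f} \cdot d\mathbf{r} = \iint_R \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA = \iint_R 0\,dA = 0. \]
For a simply connected region $R$ (i.e. a region with no holes), the following can be shown:

The following statements are equivalent for a simply connected region $R$ in $\mathbb{R}^2$:
(a) $\mathbf{f}(x,y) = P(x,y)\,\mathbf{i} + Q(x,y)\,\mathbf{j}$ has a smooth potential $F(x,y)$ in $R$
(b) $\int_C \mathbf{f} \cdot d\mathbf{r}$ is independent of the path for any curve $C$ in $R$
(c) $\oint_C \mathbf{f} \cdot d\mathbf{r} = 0$ for every simple closed curve $C$ in $R$
(d) $\frac{\partial P}{\partial y} = \frac{\partial Q}{\partial x}$ in $R$ (in this case, the differential form $P\,dx + Q\,dy$ is exact)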

Exercises

A
For Exercises 1-4, use Green's Theorem to evaluate the given line integral around the curve $C$, traversed counterclockwise.
1. $\oint_C (x^2 - y^2)\,dx + 2xy\,dy$; $C$ is the boundary of $R = \{ (x,y) : 0 \le x \le 1,\ 2x^2 \le y \le 2x \}$
2. $\oint_C x^2 y\,dx + 2xy\,dy$; $C$ is the boundary of $R = \{ (x,y) : 0 \le x \le 1,\ x^2 \le y \le x \}$
3. $\oint_C 2y\,dx - 3x\,dy$; $C$ is the circle $x^2 + y^2 = 1$
4. $\oint_C (e^{x^2} + y^2)\,dx + (e^{y^2} + x^2)\,dy$; $C$ is the boundary of the triangle with vertices $(0,0)$, $(4,0)$ and $(0,4)$
5. Is there a potential $F(x,y)$ for $\mathbf{f}(x,y) = (y^2 + 3x^2)\,\mathbf{i} + 2xy\,\mathbf{j}$? If so, find one.
6. Is there a potential $F(x,y)$ for $\mathbf{f}(x,y) = (x^3 \cos(xy) + 2x \sin(xy))\,\mathbf{i} + x^2 y \cos(xy)\,\mathbf{j}$? If so, find one.
7. Is there a potential $F(x,y)$ for $\mathbf{f}(x,y) = (8xy + 3)\,\mathbf{i} + 4(x^2 + y)\,\mathbf{j}$? If so, find one.
8. Show that for any constants $a$, $b$ and any closed simple curve $C$, $\oint_C a\,dx + b\,dy = 0$.

B
9. For the vector field $\mathbf{f}$ as in Example 4.8, show directly that $\oint_C \mathbf{f} \cdot d\mathbf{r} = 0$, where $C$ is the boundary of the annulus $R = \{ (x,y) : 1/4 \le x^2 + y^2 \le 1 \}$ traversed so that $R$ is always on the left.
10. Evaluate $\oint_C e^x \sin y\,dx + (y^3 + e^x \cos y)\,dy$, where $C$ is the boundary of the rectangle with vertices $(1,-1)$, $(1,1)$, $(-1,1)$ and $(-1,-1)$, traversed counterclockwise.

C
11. For a region $R$ bounded by a simple closed curve $C$, show that the area $A$ of $R$ is
\[ A = -\oint_C y\,dx = \oint_C x\,dy = \frac{1}{2} \oint_C x\,dy - y\,dx, \]
where $C$ is traversed so that $R$ is always on the left. (Hint: Use Green's Theorem and the fact that $A = \iint_R 1\,dA$.)

4.4 Surface Integrals and the Divergence Theorem

In Section 4.1 we learned how to integrate along a curve. We will now learn how to perform integration over a surface in $\mathbb{R}^3$, such as a sphere or a paraboloid. Recall from Section 1.8 how we identified points $(x,y,z)$ on a curve $C$ in $\mathbb{R}^3$, parametrized by $x = x(t)$, $y = y(t)$, $z = z(t)$, $a \le t \le b$, with the terminal points of the position vector
\[ \mathbf{r}(t) = x(t)\,\mathbf{i} + y(t)\,\mathbf{j} + z(t)\,\mathbf{k} \quad \text{for } t \text{ in } [a,b]. \]

[Figure 4.4.1 Parametrization of a curve $C$ in $\mathbb{R}^3$: the interval $[a,b]$ on the $t$-axis is mapped by $x = x(t)$, $y = y(t)$, $z = z(t)$ to the curve from $(x(a),y(a),z(a))$ to $(x(b),y(b),z(b))$]

The idea behind a parametrization of a curve is that it "transforms" a subset of $\mathbb{R}^1$ (normally an interval $[a,b]$) into a curve in $\mathbb{R}^2$ or $\mathbb{R}^3$ (see Figure 4.4.1).

Similar to how we used a parametrization of a curve to define the line integral along the curve, we will use a parametrization of a surface to define a surface integral. We will use two variables, $u$ and $v$, to parametrize a surface $\Sigma$ in $\mathbb{R}^3$: $x = x(u,v)$, $y = y(u,v)$, $z = z(u,v)$, for $(u,v)$ in some region $R$ in $\mathbb{R}^2$ (see Figure 4.4.2).

[Figure 4.4.2 Parametrization of a surface $\Sigma$ in $\mathbb{R}^3$: the region $R$ in the $uv$-plane is mapped by $x = x(u,v)$, $y = y(u,v)$, $z = z(u,v)$ onto $\Sigma$]

In this case, the position vector of a point on the surface $\Sigma$ is given by the vector-valued function
\[ \mathbf{r}(u,v) = x(u,v)\,\mathbf{i} + y(u,v)\,\mathbf{j} + z(u,v)\,\mathbf{k} \quad \text{for } (u,v) \text{ in } R. \]

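Since $\mathbf{r}(u,v)$ is a function of two variables, define the partial derivatives $\frac{\partial \mathbf{r}}{\partial u}$ and $\frac{\partial \mathbf{r}}{\partial v}$ for $(u,v)$ in $R$ by
\[ \frac{\partial \mathbf{r}}{\partial u}(u,v) = \frac{\partial x}{\partial u}(u,v)\,\mathbf{i} + \frac{\partial y}{\partial u}(u,v)\,\mathbf{j} + \frac{\partial z}{\partial u}(u,v)\,\mathbf{k}, \text{ and} \]
\[ \frac{\partial \mathbf{r}}{\partial v}(u,v) = \frac{\partial x}{\partial v}(u,v)\,\mathbf{i} + \frac{\partial y}{\partial v}(u,v)\,\mathbf{j} + \frac{\partial z}{\partial v}(u,v)\,\mathbf{k}. \]
The parametrization of $\Sigma$ can be thought of as "transforming" a region in $\mathbb{R}^2$ (in the $uv$-plane) into a 2-dimensional surface in $\mathbb{R}^3$. This parametrization of the surface is sometimes called a patch, based on the idea of "patching" the region $R$ onto $\Sigma$ in the grid-like manner shown in Figure 4.4.2.

In fact, those gridlines in $R$ lead us to how we will define a surface integral over $\Sigma$. Along the vertical gridlines in $R$, the variable $u$ is constant. So those lines get mapped to curves on $\Sigma$, and the variable $u$ is constant along the position vector $\mathbf{r}(u,v)$. Thus, the tangent vector to those curves at a point $(u,v)$ is $\frac{\partial \mathbf{r}}{\partial v}$. Similarly, the horizontal gridlines in $R$ get mapped to curves on $\Sigma$ whose tangent vectors are $\frac{\partial \mathbf{r}}{\partial u}$.

Now take a point $(u,v)$ in $R$ as, say, the lower left corner of one of the rectangular grid sections in $R$, as shown in Figure 4.4.2. Suppose that this rectangle has a small width and height of $\Delta u$ and $\Delta v$, respectively. The corner points of that rectangle are $(u,v)$, $(u + \Delta u, v)$, $(u + \Delta u, v + \Delta v)$ and $(u, v + \Delta v)$. So the area of that rectangle is $A = \Delta u\,\Delta v$. Then that rectangle gets mapped by the parametrization onto some section of the surface $\Sigma$ which, for $\Delta u$ and $\Delta v$ small enough, will have a surface area (call it $d\sigma$) that is very close to the area of the parallelogram which has adjacent sides $\mathbf{r}(u + \Delta u, v) - \mathbf{r}(u,v)$ (corresponding to the line segment from $(u,v)$ to $(u + \Delta u, v)$ in $R$) and $\mathbf{r}(u, v + \Delta v) - \mathbf{r}(u,v)$ (corresponding to the line segment from $(u,v)$ to $(u, v + \Delta v)$ in $R$). But by combining our usual notion of a partial derivative (see Definition 2.3 in Section 2.2) with that of the derivative of a vector-valued function (see Definition 1.12 in Section 1.8) applied to a function of two variables, we have
\[ \frac{\partial \mathbf{r}}{\partial u} \approx \frac{\mathbf{r}(u + \Delta u, v) - \mathbf{r}(u,v)}{\Delta u}, \quad \text{and} \quad \frac{\partial \mathbf{r}}{\partial v} \approx \frac{\mathbf{r}(u, v + \Delta v) - \mathbf{r}(u,v)}{\Delta v}, \]
and so the surface area element $d\sigma$ is approximately
\[ \bigl\lVert (\mathbf{r}(u + \Delta u, v) - \mathbf{r}(u,v)) \times (\mathbf{r}(u, v + \Delta v) - \mathbf{r}(u,v)) \bigr\rVert \approx \left\lVert \Bigl( \Delta u\,\frac{\partial \mathbf{r}}{\partial u} \Bigr) \times \Bigl( \Delta v\,\frac{\partial \mathbf{r}}{\partial v} \Bigr) \right\rVert = \left\lVert \frac{\partial \mathbf{r}}{\partial u} \times \frac{\partial \mathbf{r}}{\partial v} \right\rVert \Delta u\,\Delta v \]
by Theorem 1.13 in Section 1.4. Thus, the total surface area $S$ of $\Sigma$ is approximately the sum of all the quantities $\left\lVert \frac{\partial \mathbf{r}}{\partial u} \times \frac{\partial \mathbf{r}}{\partial v} \right\rVert \Delta u\,\Delta v$, summed over the rectangles in $R$. Taking the limit of that sum as the diagonal of the largest rectangle goes to 0 gives
\[ S = \iint_R \left\lVert \frac{\partial \mathbf{r}}{\partial u} \times \frac{\partial \mathbf{r}}{\partial v} \right\rVert du\,dv. \tag{4.26} \]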
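The parallelogram argument above translates almost line for line into a Riemann-sum computation. The sketch below is an illustration, not the text's own method: the function names are hypothetical, the grid is uniform, and the unit sphere with the parametrization $\mathbf{r}(u,v) = (\sin u \cos v, \sin u \sin v, \cos u)$, $0 \le u \le \pi$, $0 \le v \le 2\pi$, is an example chosen here because its area $4\pi$ is well known. For each grid rectangle, it forms the two side vectors $\mathbf{r}(u + \Delta u, v) - \mathbf{r}(u,v)$ and $\mathbf{r}(u, v + \Delta v) - \mathbf{r}(u,v)$ and adds up the areas of the resulting parallelograms.

```python
from math import sin, cos, pi, sqrt


def sub(a, b):
    """Componentwise difference a - b of two 3-vectors."""
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def cross(a, b):
    """Cross product a x b of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def norm(a):
    """Euclidean length of a 3-vector."""
    return sqrt(a[0] ** 2 + a[1] ** 2 + a[2] ** 2)


def surface_area(r, u0, u1, v0, v1, n=200):
    """Sum the areas of the parallelograms spanned by
    r(u + du, v) - r(u, v) and r(u, v + dv) - r(u, v)
    over an n-by-n grid on [u0, u1] x [v0, v1]."""
    du = (u1 - u0) / n
    dv = (v1 - v0) / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            u = u0 + i * du
            v = v0 + j * dv
            corner = r(u, v)
            side_u = sub(r(u + du, v), corner)
            side_v = sub(r(u, v + dv), corner)
            total += norm(cross(side_u, side_v))
    return total


def sphere(u, v):
    """Unit sphere (illustrative choice): u is the polar angle, v the azimuth."""
    return (sin(u) * cos(v), sin(u) * sin(v), cos(u))


area = surface_area(sphere, 0.0, pi, 0.0, 2 * pi)
print(area, 4 * pi)
```

Refining the grid (larger `n`) should bring the sum closer to $4\pi$, mirroring the limit that produces formula (4.26).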

We will write the double integral on the right using the special notation
\[ \iint_\Sigma d\sigma = \iint_R \left\lVert \frac{\partial \mathbf{r}}{\partial u} \times \frac{\partial \mathbf{r}}{\partial v} \right\rVert du\,dv. \tag{4.27} \]
This is a special case of a surface integral over the surface $\Sigma$, where the surface area element $d\sigma$ can be thought of as $1\,d\sigma$. Replacing 1 by a general real-valued function $f(x,y,z)$ defined in $\mathbb{R}^3$, we have the following:

Definition 4.3. Let $\Sigma$ be a surface in $\mathbb{R}^3$ parametrized by $x = x(u,v)$, $y = y(u,v)$, $z = z(u,v)$, for $(u,v)$ in some region $R$ in $\mathbb{R}^2$. Let $\mathbf{r}(u,v) = x(u,v)\,\mathbf{i} + y(u,v)\,\mathbf{j} + z(u,v)\,\mathbf{k}$ be the position vector for any point on $\Sigma$, and let $f(x,y,z)$ be a real-valued function defined on some subset of $\mathbb{R}^3$ that contains $\Sigma$. The surface integral of $f(x,y,z)$ over $\Sigma$ is
\[ \iint_\Sigma f(x,y,z)\,d\sigma = \iint_R f(x(u,v), y(u,v), z(u,v)) \left\lVert \frac{\partial \mathbf{r}}{\partial u} \times \frac{\partial \mathbf{r}}{\partial v} \right\rVert du\,dv. \tag{4.28} \]
In particular, the surface area $S$ of $\Sigma$ is
\[ S = \iint_\Sigma 1\,d\sigma. \tag{4.29} \]

Example 4.9. A torus $T$ is a surface obtained by revolving a circle of radius $a$ in the $yz$-plane around the $z$-axis, where the circle's center is at a distance $b$ from the $z$-axis ($0 < a < b$), as in Figure 4.4.3. Find the surface area of $T$.

[Figure 4.4.3: (a) Circle $(y - b)^2 + z^2 = a^2$ in the $yz$-plane, (b) Torus $T$]

Solution: For any point on the circle, the line segment from the center of the circle to that point makes an angle $u$ with the $y$-axis in the positive $y$ direction (see Figure

Thus.3(b)). v) = x(u. the line segment from the origin to the center of that circle sweeps out an angle v with the positive x-axis (see Figure 4. ∂u ∂v which has magnitude ∂r ∂r = a(b + a cos u) . 0 ≤ v ≤ 2π 1 dσ ∂r ∂r du dv × ∂u ∂v a(b + a cos u) du dv 0 2π 0 = 0 = = 0 2π abu + a2 sin u 2πab dv 0 u=2π u=0 dv = = 4π2 ab Since ∂r and ∂r are tangent to the surface Σ (i. the surface area of T is S = Σ 2π 2π 0 2π 2π y = (b + a cos u) sin v .4. lie in the tangent plane to Σ at each ∂u ∂v point on Σ).3(a)). then their cross product ∂r × ∂r is perpendicular to the tangent plane to ∂u ∂v . v)k = (b + a cos u) cos v i + (b + a cos u) sin v j + a sin u k we see that ∂r = −a sin u cos v i − a sin u sin v j + a cos u k ∂u ∂r = −(b + a cos u) sin v i + (b + a cos u) cos v j + 0k . So for the position vector r(u. v)j + z(u.4. 0 ≤ u ≤ 2π .4 Surface Integrals and the Divergence Theorem 159 4.e. v)i + y(u. z = a sin u . And as the circle revolves around the z-axis. the torus can be parametrized as: x = (b + a cos u) cos v .4. × ∂u ∂v Thus. ∂v and so computing the cross product gives ∂r ∂r × = −a(b + a cos u) cos v cos u i − a(b + a cos u) sin v cos u j − a(b + a cos u) sin u k .

y.160 CHAPTER 4. Thus. then dividing v by its length yields the outward unit normal vector n = 1 1 1 √ . y. we make the following definition of a surface integral of a 3-dimensional vector field over a surface: y 0 x Figure 4.4. √ 3 3 3 z 1 Σ 0 1 x 1 x+y+z=1 Figure 4. Solution: Since the vector v = (1. 1) is normal to the plane x + y + z = 1 (why?). and hence we can use Definition 4. y) : 0 ≤ x ≤ 1.4. for 0 ≤ u ≤ 1. v). y(u. z) = f1 (x. With this idea in mind. (4. f (x. As we can see from Figure 4. z(u. y. but the picture in Figure 4. Evaluate the surface integral Σ f · dσ. v). v) instead of (x. This is a hazy definition. Let Σ be a surface in 3 and let f(x. y ≥ 0.30) where. using (u.4.4. y = v. we see that x = u. Thus. By an outward unit normal vector to a surface Σ. y. in the case of a sphere. z) = yzi + xzj + xyk and Σ is the part of the plane x + y + z = 1 with x ≥ 0.4 Definition 4. Note in the above definition that the dot product inside the integral on the right is a real-valued function. y.5).5 n y . with the outward unit normal n pointing in the positive z direction (see Figure 4. z)j + f3 (x. LINE AND SURFACE INTEGRALS the surface at each point of Σ. Example 4. z)k be a vector field defined on some subset of 3 that contains Σ.5. We say that n is a normal vector to Σ. v)) n dσ . z Recall that normal vectors to a plane can point in two opposite directions. n is the outward unit normal vector to Σ.4. projecting Σ onto the xy-plane yields a triangular region R = { (x. we will mean the unit vector that is normal to Σ and points away from the “top” (or “outer” part) of the surface. where n = ∂r ∂u × ∂r ∂v . and z ≥ 0. 0 ≤ v ≤ 1 − u .3 to evaluate the integral. 1. at any point on Σ. We now need to parametrize Σ. 0 ≤ y ≤ 1 − x }. √ . z) dσ = Σ R f (x(u. z = 1 − (u + v). The surface integral of f over Σ is f · dσ = Σ Σ f · n dσ . z)i + f2 (x. y. y).4 gives a better idea of what outward normal vectors look like.10.4. where f(x.

For example. √ = √ (yz + xz + xy) 3 3 3 3 1 1 = √ ((x + y)z + xy) = √ ((u + v)(1 − (u + v)) + uv) 3 3 1 = √ ((u + v) − (u + v)2 + uv) 3 161 for (u. −1) × (0. when Σ encloses a bounded solid in 3 . that is. √ . v). integrating over R using vertical slices (e. especially when the formula for the outward unit normal vector at each point of Σ changes. v). as indicated by the dashed line in Figure 4.g.4. and ellipsoids are closed surfaces. 1 1 1 1 f · n = (yz. 1) ∂u ∂v ⇒ √ ∂r ∂r × = 3. v)k = ui + vj + (1 − (u + v))k we have ∂r ∂r × = (1. 1. 0. ∂u ∂v Thus. 1. So on Σ. The following theorem provides an easier way in the case when Σ is a closed surface. xy) · √ . but planes and paraboloids are not.4 Surface Integrals and the Divergence Theorem is a parametrization of Σ over R (since z = 1 − (x + y) on Σ). and for r(u. v) = x(u. cubes. z(u. y(u.4. v) in R.5) gives f · dσ = Σ Σ f · n dσ (f(x(u. v)) · n) R 1 1−u = = ∂r ∂r × dv du ∂u ∂v √ 1 √ ((u + v) − (u + v)2 + uv) 3 dv du 0 0 3 v=1−u 1 2 (u + v)3 uv2 (u + v) du = − + 2 3 2 0 v=0 1 = 0 1 u + − + 6 2 2 6 1 3u2 5u3 du u u2 u3 5u4 = + − + 6 4 2 24 = 0 1 . v)j + z(u. spheres. v)i + y(u. xz. 8 Computing surface integrals can often be tedious. −1) = (1. .
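As a rough numerical cross-check of the value $\frac18$ obtained in Example 4.10, the sketch below sums $\mathbf{f}\cdot\bigl(\frac{\partial\mathbf{r}}{\partial u} \times \frac{\partial\mathbf{r}}{\partial v}\bigr)\,\Delta v\,\Delta u$ over the triangle $R$ with a midpoint rule. It relies on the fact that $\mathbf{f}\cdot\mathbf{n}\,\bigl\|\frac{\partial\mathbf{r}}{\partial u} \times \frac{\partial\mathbf{r}}{\partial v}\bigr\| = \mathbf{f}\cdot(1,1,1)$ for this surface. The function name `flux_through_plane` and the grid size are illustrative assumptions, not something taken from the text.

```python
def flux_through_plane(n=400):
    """Midpoint-rule estimate of the flux of f = (yz, xz, xy) through the
    part of x + y + z = 1 in the first octant, using r(u, v) = (u, v, 1-u-v)."""
    normal = (1.0, 1.0, 1.0)   # r_u x r_v = (1, 0, -1) x (0, 1, -1)
    du = 1.0 / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * du
        dv = (1.0 - u) / n     # vertical slice: v runs from 0 to 1 - u
        for j in range(n):
            v = (j + 0.5) * dv
            x, y, z = u, v, 1.0 - u - v
            f = (y * z, x * z, x * y)
            total += sum(fc * nc for fc, nc in zip(f, normal)) * dv * du
    return total

flux = flux_through_plane()
print(flux)   # should be close to 1/8
```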

Theorem 4.8. (Divergence Theorem) Let $\Sigma$ be a closed surface in $\mathbb{R}^3$ which bounds a solid $S$, and let $\mathbf{f}(x,y,z) = f_1(x,y,z)\,\mathbf{i} + f_2(x,y,z)\,\mathbf{j} + f_3(x,y,z)\,\mathbf{k}$ be a vector field defined on some subset of $\mathbb{R}^3$ that contains $\Sigma$. Then
\[
\iint_\Sigma \mathbf{f}\cdot d\boldsymbol{\sigma} = \iiint_S \operatorname{div}\mathbf{f}\,dV ~, \tag{4.31}
\]
where
\[
\operatorname{div}\mathbf{f} = \frac{\partial f_1}{\partial x} + \frac{\partial f_2}{\partial y} + \frac{\partial f_3}{\partial z} \tag{4.32}
\]
is called the divergence of $\mathbf{f}$.

The proof of the Divergence Theorem is very similar to the proof of Green’s Theorem, i.e. it is first proved for the simple case when the solid $S$ is bounded above by one surface, bounded below by another surface, and bounded laterally by one or more surfaces. The proof can then be extended to more general solids. [3]

[3] See TAYLOR and MANN, § 15.6 for the details.

Example 4.11. Evaluate $\iint_\Sigma \mathbf{f}\cdot d\boldsymbol{\sigma}$, where $\mathbf{f}(x,y,z) = x\,\mathbf{i} + y\,\mathbf{j} + z\,\mathbf{k}$ and $\Sigma$ is the unit sphere $x^2 + y^2 + z^2 = 1$.

Solution: We see that $\operatorname{div}\mathbf{f} = 1 + 1 + 1 = 3$, so
\[
\iint_\Sigma \mathbf{f}\cdot d\boldsymbol{\sigma} = \iiint_S \operatorname{div}\mathbf{f}\,dV = \iiint_S 3\,dV = 3\iiint_S 1\,dV = 3\operatorname{vol}(S) = 3\cdot\frac{4\pi(1)^3}{3} = 4\pi ~.
\]

In physical applications, the surface integral $\iint_\Sigma \mathbf{f}\cdot d\boldsymbol{\sigma}$ is often referred to as the flux of $\mathbf{f}$ through the surface $\Sigma$. For example, if $\mathbf{f}$ represents the velocity field of a fluid, then the flux is the net quantity of fluid to flow through the surface $\Sigma$ per unit time. A positive flux means there is a net flow out of the surface (i.e. in the direction of the outward unit normal vector $\mathbf{n}$), while a negative flux indicates a net flow inward (in the direction of $-\mathbf{n}$).

The term divergence comes from interpreting $\operatorname{div}\mathbf{f}$ as a measure of how much a vector field “diverges” from a point. This is best seen by using another definition of $\operatorname{div}\mathbf{f}$ which is equivalent [4] to the definition given by formula (4.32). Namely, for a point $(x,y,z)$ in $\mathbb{R}^3$,
\[
\operatorname{div}\mathbf{f}(x,y,z) = \lim_{V\to0} \frac1V \iint_\Sigma \mathbf{f}\cdot d\boldsymbol{\sigma} ~, \tag{4.33}
\]
where $V$ is the volume enclosed by a closed surface $\Sigma$ around the point $(x,y,z)$. In the limit, $V \to 0$ means that we take smaller and smaller closed surfaces around $(x,y,z)$, which means that the volumes they enclose are going to zero. It can be shown that this limit is independent of the shapes of those surfaces. Notice that the limit being taken is of the ratio of the flux through a surface to the volume enclosed by that surface, which gives a rough measure of the flow “leaving” a point, as we mentioned. Vector fields which have zero divergence are often called solenoidal fields.

[4] See SCHEY, p. 36-39, for an intuitive discussion of this.

The following theorem is a simple consequence of formula (4.33).

Theorem 4.9. If the flux of a vector field $\mathbf{f}$ is zero through every closed surface containing a given point, then $\operatorname{div}\mathbf{f} = 0$ at that point.

Proof: By formula (4.33), at the given point $(x,y,z)$ we have
\[
\begin{aligned}
\operatorname{div}\mathbf{f}(x,y,z) &= \lim_{V\to0} \frac1V \iint_\Sigma \mathbf{f}\cdot d\boldsymbol{\sigma} \quad\text{for closed surfaces } \Sigma \text{ containing } (x,y,z)\text{, so} \\
&= \lim_{V\to0} \frac1V (0) \quad\text{by our assumption that the flux through each } \Sigma \text{ is zero, so} \\
&= \lim_{V\to0} 0 \\
&= 0 ~. \qquad\text{QED}
\end{aligned}
\]

Lastly, we note that sometimes the notation $\oiint_\Sigma f(x,y,z)\,d\sigma$ and $\oiint_\Sigma \mathbf{f}\cdot d\boldsymbol{\sigma}$ is used to denote surface integrals of scalar and vector fields, respectively, over closed surfaces. Especially in physics texts, it is common to see simply $\oint_\Sigma$ instead of $\oiint_\Sigma$.

Exercises

A

For Exercises 1-4, use the Divergence Theorem to evaluate the surface integral $\iint_\Sigma \mathbf{f}\cdot d\boldsymbol{\sigma}$ of the given vector field $\mathbf{f}(x,y,z)$ over the surface $\Sigma$.

1. $\mathbf{f}(x,y,z) = x\,\mathbf{i} + 2y\,\mathbf{j} + 3z\,\mathbf{k}$, $\Sigma : x^2 + y^2 + z^2 = 9$
2. $\mathbf{f}(x,y,z) = x\,\mathbf{i} + y\,\mathbf{j} + z\,\mathbf{k}$, $\Sigma$ : boundary of the solid cube $S = \{\, (x,y,z) : 0 \le x, y, z \le 1 \,\}$
3. $\mathbf{f}(x,y,z) = x^3\,\mathbf{i} + y^3\,\mathbf{j} + z^3\,\mathbf{k}$, $\Sigma : x^2 + y^2 + z^2 = 1$
4. $\mathbf{f}(x,y,z) = 2\,\mathbf{i} + 3\,\mathbf{j} + 5\,\mathbf{k}$, $\Sigma : x^2 + y^2 + z^2 = 1$

B

5. Show that the flux of any constant vector field through any closed surface is zero.
6. Evaluate the surface integral from Exercise 2 without using the Divergence Theorem, i.e. using only Definition 4.3, as in Example 4.10. Note that there will be a different outward unit normal vector to each of the six faces of the cube.
7. Evaluate the surface integral $\iint_\Sigma \mathbf{f}\cdot d\boldsymbol{\sigma}$, where $\mathbf{f}(x,y,z) = x^2\,\mathbf{i} + xy\,\mathbf{j} + z\,\mathbf{k}$ and $\Sigma$ is the part of the plane $6x + 3y + 2z = 6$ with $x \ge 0$, $y \ge 0$, and $z \ge 0$, with the outward unit normal $\mathbf{n}$ pointing in the positive $z$ direction.
8. Use a surface integral to show that the surface area of a sphere of radius $r$ is $4\pi r^2$. (Hint: Use spherical coordinates to parametrize the sphere.)
9. Use a surface integral to show that the surface area of a right circular cone of radius $R$ and height $h$ is $\pi R\sqrt{h^2 + R^2}$. (Hint: Use the parametrization $x = r\cos\theta$, $y = r\sin\theta$, $z = \frac{h}{R}\,r$, for $0 \le r \le R$ and $0 \le \theta \le 2\pi$.)
10. The ellipsoid $\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1$ can be parametrized using ellipsoidal coordinates
\[
x = a\sin\phi\cos\theta ~,\quad y = b\sin\phi\sin\theta ~,\quad z = c\cos\phi ~,\quad \text{for } 0 \le \theta \le 2\pi \text{ and } 0 \le \phi \le \pi .
\]
Show that the surface area $S$ of the ellipsoid is
\[
S = \int_0^{\pi}\!\!\int_0^{2\pi} \sin\phi\,\sqrt{a^2 b^2\cos^2\phi + c^2 (a^2\sin^2\theta + b^2\cos^2\theta)\sin^2\phi}\;d\theta\,d\phi ~.
\]
(Note: The above double integral cannot be evaluated by elementary means. For specific values of $a$, $b$ and $c$ it can be evaluated using numerical methods. An alternative is to express the surface area in terms of elliptic integrals. [5])

C

11. Use Definition 4.3 to prove that the surface area $S$ over a region $R$ in $\mathbb{R}^2$ of a surface $z = f(x,y)$ is given by the formula
\[
S = \iint_R \sqrt{1 + \Bigl(\frac{\partial f}{\partial x}\Bigr)^2 + \Bigl(\frac{\partial f}{\partial y}\Bigr)^2}\;dA ~.
\]
(Hint: Think of the parametrization of the surface.)

[5] BOWMAN, F., Introduction to Elliptic Functions, with Applications. New York: Dover, 1961, § III.7.

4.5 Stokes’ Theorem

So far the only types of line integrals which we have discussed are those along curves in $\mathbb{R}^2$. But the definitions and properties which were covered in Sections 4.1 and 4.2 can easily be extended to include functions of three variables, so that we can now discuss line integrals along curves in $\mathbb{R}^3$.

Definition 4.5. For a real-valued function $f(x,y,z)$ and a curve $C$ in $\mathbb{R}^3$, parametrized by $x = x(t)$, $y = y(t)$, $z = z(t)$, $a \le t \le b$, the line integral of $f(x,y,z)$ along $C$ with respect to arc length $s$ is
\[
\int_C f(x,y,z)\,ds = \int_a^b f(x(t), y(t), z(t)) \sqrt{x'(t)^2 + y'(t)^2 + z'(t)^2}\;dt ~. \tag{4.34}
\]
The line integral of $f(x,y,z)$ along $C$ with respect to $x$ is
\[
\int_C f(x,y,z)\,dx = \int_a^b f(x(t), y(t), z(t))\,x'(t)\,dt ~. \tag{4.35}
\]
The line integral of $f(x,y,z)$ along $C$ with respect to $y$ is
\[
\int_C f(x,y,z)\,dy = \int_a^b f(x(t), y(t), z(t))\,y'(t)\,dt ~. \tag{4.36}
\]
The line integral of $f(x,y,z)$ along $C$ with respect to $z$ is
\[
\int_C f(x,y,z)\,dz = \int_a^b f(x(t), y(t), z(t))\,z'(t)\,dt ~. \tag{4.37}
\]

Similar to the two-variable case, if $f(x,y,z) \ge 0$ then the line integral $\int_C f(x,y,z)\,ds$ can be thought of as the total area of the “picket fence” of height $f(x,y,z)$ at each point along the curve $C$ in $\mathbb{R}^3$.

Vector fields in $\mathbb{R}^3$ are defined in a similar fashion to those in $\mathbb{R}^2$, which allows us to define the line integral of a vector field along a curve in $\mathbb{R}^3$.

Definition 4.6. For a vector field $\mathbf{f}(x,y,z) = P(x,y,z)\,\mathbf{i} + Q(x,y,z)\,\mathbf{j} + R(x,y,z)\,\mathbf{k}$ and a curve $C$ in $\mathbb{R}^3$ with a smooth parametrization $x = x(t)$, $y = y(t)$, $z = z(t)$, $a \le t \le b$, the line integral of $\mathbf{f}$ along $C$ is
\[
\int_C \mathbf{f}\cdot d\mathbf{r} = \int_C P(x,y,z)\,dx + \int_C Q(x,y,z)\,dy + \int_C R(x,y,z)\,dz \tag{4.38}
\]
\[
= \int_a^b \mathbf{f}(x(t), y(t), z(t))\cdot\mathbf{r}'(t)\,dt ~, \tag{4.39}
\]
where $\mathbf{r}(t) = x(t)\,\mathbf{i} + y(t)\,\mathbf{j} + z(t)\,\mathbf{k}$ is the position vector for points on $C$.
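Formulas (4.34) and (4.39) translate directly into midpoint-rule sums over the parameter interval. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, the derivative $\mathbf{r}'(t)$ is supplied by hand rather than computed, and the test curve is the conical helix $x = t\sin t$, $y = t\cos t$, $z = t$, $0 \le t \le 8\pi$, with the scalar function $f = z$ and the vector field $\mathbf{f} = x\,\mathbf{i} + y\,\mathbf{j} + 2z\,\mathbf{k}$.

```python
import math

def line_integral_ds(f, r, dr, a, b, n=20000):
    """Midpoint-rule estimate of formula (4.34): integral of f along C w.r.t. arc length."""
    h = (b - a) / n
    total = 0.0
    for k in range(n):
        t = a + (k + 0.5) * h
        speed = math.sqrt(sum(c * c for c in dr(t)))   # sqrt(x'^2 + y'^2 + z'^2)
        total += f(*r(t)) * speed * h
    return total

def line_integral_dr(F, r, dr, a, b, n=20000):
    """Midpoint-rule estimate of formula (4.39): integral of F(r(t)) . r'(t) dt."""
    h = (b - a) / n
    total = 0.0
    for k in range(n):
        t = a + (k + 0.5) * h
        total += sum(Fc * dc for Fc, dc in zip(F(*r(t)), dr(t))) * h
    return total

def helix(t):
    """Test curve (conical helix): x = t sin t, y = t cos t, z = t."""
    return (t * math.sin(t), t * math.cos(t), t)

def helix_prime(t):
    """Derivative of the test curve, entered by hand."""
    return (math.sin(t) + t * math.cos(t), math.cos(t) - t * math.sin(t), 1.0)

b = 8.0 * math.pi
scalar = line_integral_ds(lambda x, y, z: z, helix, helix_prime, 0.0, b)
work = line_integral_dr(lambda x, y, z: (x, y, 2.0 * z), helix, helix_prime, 0.0, b)

# Closed forms obtained by direct integration, for comparison:
print(scalar, ((64.0 * math.pi ** 2 + 2.0) ** 1.5 - 2.0 * math.sqrt(2.0)) / 3.0)
print(work, 96.0 * math.pi ** 2)
```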

Similar to the two-variable case, if $\mathbf{f}(x,y,z)$ represents the force applied to an object at a point $(x,y,z)$ then the line integral $\int_C \mathbf{f}\cdot d\mathbf{r}$ represents the work done by that force in moving the object along the curve $C$ in $\mathbb{R}^3$.

Some of the most important results we will need for line integrals in $\mathbb{R}^3$ are stated below without proof (the proofs are similar to their two-variable equivalents).

Theorem 4.10. For a vector field $\mathbf{f}(x,y,z) = P(x,y,z)\,\mathbf{i} + Q(x,y,z)\,\mathbf{j} + R(x,y,z)\,\mathbf{k}$ and a curve $C$ with a smooth parametrization $x = x(t)$, $y = y(t)$, $z = z(t)$, $a \le t \le b$ and position vector $\mathbf{r}(t) = x(t)\,\mathbf{i} + y(t)\,\mathbf{j} + z(t)\,\mathbf{k}$,
\[
\int_C \mathbf{f}\cdot d\mathbf{r} = \int_C \mathbf{f}\cdot\mathbf{T}\,ds ~, \tag{4.40}
\]
where $\mathbf{T}(t) = \frac{\mathbf{r}'(t)}{\|\mathbf{r}'(t)\|}$ is the unit tangent vector to $C$ at $(x(t), y(t), z(t))$.

Theorem 4.11. (Chain Rule) If $w = f(x,y,z)$ is a continuously differentiable function of $x$, $y$, and $z$, and $x = x(t)$, $y = y(t)$ and $z = z(t)$ are differentiable functions of $t$, then $w$ is a differentiable function of $t$, and
\[
\frac{dw}{dt} = \frac{\partial w}{\partial x}\frac{dx}{dt} + \frac{\partial w}{\partial y}\frac{dy}{dt} + \frac{\partial w}{\partial z}\frac{dz}{dt} ~. \tag{4.41}
\]
Also, if $x = x(t_1,t_2)$, $y = y(t_1,t_2)$ and $z = z(t_1,t_2)$ are continuously differentiable functions of $(t_1,t_2)$, then [6]
\[
\frac{\partial w}{\partial t_1} = \frac{\partial w}{\partial x}\frac{\partial x}{\partial t_1} + \frac{\partial w}{\partial y}\frac{\partial y}{\partial t_1} + \frac{\partial w}{\partial z}\frac{\partial z}{\partial t_1} \tag{4.42}
\]
and
\[
\frac{\partial w}{\partial t_2} = \frac{\partial w}{\partial x}\frac{\partial x}{\partial t_2} + \frac{\partial w}{\partial y}\frac{\partial y}{\partial t_2} + \frac{\partial w}{\partial z}\frac{\partial z}{\partial t_2} ~. \tag{4.43}
\]

Theorem 4.12. Let $\mathbf{f}(x,y,z) = P(x,y,z)\,\mathbf{i} + Q(x,y,z)\,\mathbf{j} + R(x,y,z)\,\mathbf{k}$ be a vector field in some solid $S$, with $P$, $Q$ and $R$ continuously differentiable functions on $S$. Let $C$ be a smooth curve in $S$ parametrized by $x = x(t)$, $y = y(t)$, $z = z(t)$, $a \le t \le b$. Suppose that there is a real-valued function $F(x,y,z)$ such that $\nabla F = \mathbf{f}$ on $S$. Then
\[
\int_C \mathbf{f}\cdot d\mathbf{r} = F(B) - F(A) ~, \tag{4.44}
\]
where $A = (x(a), y(a), z(a))$ and $B = (x(b), y(b), z(b))$ are the endpoints of $C$.

Corollary 4.13. If a vector field $\mathbf{f}$ has a potential in a solid $S$, then $\oint_C \mathbf{f}\cdot d\mathbf{r} = 0$ for any closed curve $C$ in $S$ (i.e. $\oint_C \nabla F\cdot d\mathbf{r} = 0$ for any real-valued function $F(x,y,z)$).

[6] See TAYLOR and MANN, § 6.5 for a proof.

y. See Figure 4. z) ds = C 0 8π f (x(t). y ′ (t) = cos t − t sin t. (Note: C is called a conical helix.1).12. y(t). y. so since f (x(t).5 Stokes’ Theorem Example 4. y. 0 ≤ t ≤ 8π . z(t)) = z(t) = t along the curve C. y(t). we have x ′ (t)2 + y ′ (t)2 + z ′ (t)2 = (sin2 t + 2t sin t cos t + t2 cos2 t) + (cos2 t − 2t sin t cos t + t2 sin2 t) + 1 = t2 (sin2 t + cos2 t) + sin2 t + cos2 t + 1 = t2 + 2 .12. Evaluate C 3 167 parametrized by y = t cos t . . and z ′ (t) = 1. z) = x i + y j + 2z k be a vector field in curve C from Example 4. Solution: It is easy to see that F(x.4. f (x. Solution: Since x ′ (t) = sin t + t cos t. z(t)) t 0 x ′ (t)2 + y ′ (t)2 + z ′ (t)2 dt = t2 + 2 dt 8π 1 = (t2 + 2)3/2 3 = 0 √ 1 (64π2 + 2)3/2 − 2 2 . z) (i. z) ds. Using the same + y2 2 + z2 is a potential for f(x. Let f (x. y. evaluate C f · dr.13.5. z) = z and let C be the curve in x = t sin t . then 8π f (x. y.5. y. 3 30 25 20 15 z 10 5 0-25 -20 -15 -10 -5 y -25 -20 -15 -10 x t = 8π t=0 5 0 -5 0 5 10 15 20 25 30 25 20 15 10 Figure 4.e. z) = x2 2 3. z=t. Let f(x.1 Conical helix C Example 4.

2 we say that the sphere is a two-sided surface.3). of the sphere. and hence orientable.3(b). paraboloids. Other examples of two-sided. called Stokes’ Theorem. z) is what we have called an outward normal vector. We say that such an N is a normal vector field. since N the continuous vector field N(x. y. −N(x. and planes. y. resulting in a “twisted” strip (see Figure 4.5. “twosided” means “orientable”. 8π cos 8π. and 0 −N(x.5. the unit sphere x2 +y2 +z2 = 1 is orientable. then you arrive back at the same place from which you started but upside down! That is. z) is an inward normal vector. z) = x i+y j+z k is nonzero and −N normal to the sphere at each point. 8π) − F(0 sin 0. z(0)) and B = (x(8π). So by Theorem 4.e. y. 0) (8π)2 + (8π)2 − (0 + 0 + 0) = 96π2 . which is constructed by taking a thin rectangle and connecting its ends at the opposite corners. y(0). 8π. An example is the Möbius strip. z) is another normal vector field (see Figure 4. A surface Σ in 3 is orientable if there is a continuous vector field N in 3 such that N is nonzero and normal to Σ (i. 0 cos 0. your orientation changed even though your motion was continuous A . These “outward” and “inward” normal vector fields on the sphere correspond to an “outer” and “inner” side. x Figure 4. 0) =0+ = F(8π sin 8π. 2 We will now discuss a generalization of Green’s Theorem in 2 to orientable surfaces in 3 . ellipsoids. Roughly.2). so = F(0. surfaces are cylinders. We see in this case that y N(x. In fact. A A B → → −→ B A (a) Connect A to A and B to B along the ends (b) Not orientable Figure 4. You may be wondering what kind of surface would not have two sides. 0. where A = (x(0). LINE AND SURFACE INTEGRALS ∇F = f). 8π) − F(0. as in Figure 4. y(8π).5. z(8π)). z For example.5.12 we know that C f · dr = F(B) − F(A) .3 Möbius strip If you imagine walking along a line down the center of the Möbius strip.5. perpendicular to the tangent plane) at each point of Σ. respectively. y. That is.168 CHAPTER 4.

2). z = z(x(t). say C D : x = x(t) . in fact. Projecting Σ onto the xy-plane. § IV. y). z) = P(x. thinking of your vertical direction as a normal vector field along the strip.5 Stokes’ Theorem 169 along that center line. y) n C y 0 x D (x. y).5. . y(t)) as a function of t. (Stokes’ Theorem) Let Σ be an orientable surface in 3 whose boundary is a simple closed curve C. y. at every point) since your vertical direction takes two different values there. pick a unit normal vector n such that if you walked along C with your head pointing in the direction of n. y(t)) .4 in Section 4. The Möbius strip has only one side. and C is traversed n-positively. z)k be a smooth vector field defined on some subset of 3 that contains Σ. Informally. since the curve C is part of the surface z = z(x. z)j + R(x. and so C can be parametrized (in 3) z Σ : z = z(x. and let f(x. z)i + Q(x. y = y(t) .4). its projection C D in the xy-plane also has a smooth parametrization. we see that the closed curve C (the boundary curve of Σ) projects onto a closed curve C D which is the boundary curve of D (see Figure 4.7 For an orientable surface Σ which has a boundary curve C. for z = z(x(t). we know that z ′ (t) = 7 ∂z ′ ∂z ′ x (t) + y (t) . We say in this situation that n is a positive unit normal vector and that C is traversed n-positively. with (x. ∂x ∂y For further discussion of orientability. Assuming that C has a smooth parametrization. see O’N EILL. y = y(t) . (4. y. y) CD Figure 4. y.4. by the Chain Rule (Theorem 4.45) where curl f = ∂P ∂R ∂Q ∂P ∂R ∂Q − i+ − j+ − k. and hence is nonorientable.7. there is a discontinuity at your starting point (and. a ≤ t ≤ b .4 as C : x = x(t) .46) n is a positive unit normal vector over Σ. Proof: As the general case is beyond the scope of this text. a ≤ t ≤ b . y) for some smooth real-valued function z(x. ∂y ∂z ∂z ∂x ∂x ∂y (4. Now. y) varying over a region D in 2 .14. Then f · dr = Σ C (curl f ) · n dσ .5. 
we will prove the theorem only for the special case where Σ is the graph of z = z(x. We can now state Stokes’ Theorem: Theorem 4. y. then the surface would be on your left.

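Substituting this expression for $z'(t)$ into the line integral of $\mathbf{f}$ along $C$ gives
\[
\begin{aligned}
\oint_C \mathbf{f}\cdot d\mathbf{r} &= \int_C P(x,y,z)\,dx + Q(x,y,z)\,dy + R(x,y,z)\,dz \\
&= \int_a^b \Bigl( P\,x'(t) + Q\,y'(t) + R\Bigl( \frac{\partial z}{\partial x}\,x'(t) + \frac{\partial z}{\partial y}\,y'(t) \Bigr) \Bigr)\,dt \\
&= \int_a^b \Bigl( \Bigl( P + R\,\frac{\partial z}{\partial x} \Bigr) x'(t) + \Bigl( Q + R\,\frac{\partial z}{\partial y} \Bigr) y'(t) \Bigr)\,dt \\
&= \int_{C_D} \tilde{P}(x,y)\,dx + \tilde{Q}(x,y)\,dy ~,
\end{aligned}
\]
where
\[
\tilde{P}(x,y) = P(x,y,z(x,y)) + R(x,y,z(x,y))\,\frac{\partial z}{\partial x}(x,y) ~, \text{ and}
\]
\[
\tilde{Q}(x,y) = Q(x,y,z(x,y)) + R(x,y,z(x,y))\,\frac{\partial z}{\partial y}(x,y)
\]
for $(x,y)$ in $D$. Thus, by Green’s Theorem applied to the region $D$, we have
\[
\oint_C \mathbf{f}\cdot d\mathbf{r} = \iint_D \Bigl( \frac{\partial\tilde{Q}}{\partial x} - \frac{\partial\tilde{P}}{\partial y} \Bigr)\,dA ~. \tag{4.47}
\]
Thus,
\[
\begin{aligned}
\frac{\partial\tilde{Q}}{\partial x} &= \frac{\partial}{\partial x}\Bigl( Q(x,y,z(x,y)) + R(x,y,z(x,y))\,\frac{\partial z}{\partial y}(x,y) \Bigr) ~,\ \text{so by the Product Rule we get} \\
&= \frac{\partial}{\partial x}\bigl( Q(x,y,z(x,y)) \bigr) + \Bigl( \frac{\partial}{\partial x} R(x,y,z(x,y)) \Bigr)\frac{\partial z}{\partial y}(x,y) + R(x,y,z(x,y))\,\frac{\partial}{\partial x}\Bigl( \frac{\partial z}{\partial y}(x,y) \Bigr) ~.
\end{aligned}
\]
Now, by formula (4.42) in Theorem 4.11, we have
\[
\begin{aligned}
\frac{\partial}{\partial x}\bigl( Q(x,y,z(x,y)) \bigr) &= \frac{\partial Q}{\partial x}\frac{\partial x}{\partial x} + \frac{\partial Q}{\partial y}\frac{\partial y}{\partial x} + \frac{\partial Q}{\partial z}\frac{\partial z}{\partial x} \\
&= \frac{\partial Q}{\partial x}\cdot 1 + \frac{\partial Q}{\partial y}\cdot 0 + \frac{\partial Q}{\partial z}\frac{\partial z}{\partial x} \\
&= \frac{\partial Q}{\partial x} + \frac{\partial Q}{\partial z}\frac{\partial z}{\partial x} ~.
\end{aligned}
\]
Similarly,
\[
\frac{\partial}{\partial x}\bigl( R(x,y,z(x,y)) \bigr) = \frac{\partial R}{\partial x} + \frac{\partial R}{\partial z}\frac{\partial z}{\partial x} ~.
\]
Thus,
\[
\begin{aligned}
\frac{\partial\tilde{Q}}{\partial x} &= \frac{\partial Q}{\partial x} + \frac{\partial Q}{\partial z}\frac{\partial z}{\partial x} + \Bigl( \frac{\partial R}{\partial x} + \frac{\partial R}{\partial z}\frac{\partial z}{\partial x} \Bigr)\frac{\partial z}{\partial y} + R(x,y,z(x,y))\,\frac{\partial^2 z}{\partial x\,\partial y} \\
&= \frac{\partial Q}{\partial x} + \frac{\partial Q}{\partial z}\frac{\partial z}{\partial x} + \frac{\partial R}{\partial x}\frac{\partial z}{\partial y} + \frac{\partial R}{\partial z}\frac{\partial z}{\partial x}\frac{\partial z}{\partial y} + R\,\frac{\partial^2 z}{\partial x\,\partial y} ~.
\end{aligned}
\]
In a similar fashion, we can calculate
\[
\begin{aligned}
\frac{\partial\tilde{P}}{\partial y} &= \frac{\partial P}{\partial y} + \frac{\partial P}{\partial z}\frac{\partial z}{\partial y} + \Bigl( \frac{\partial R}{\partial y} + \frac{\partial R}{\partial z}\frac{\partial z}{\partial y} \Bigr)\frac{\partial z}{\partial x} + R(x,y,z(x,y))\,\frac{\partial^2 z}{\partial y\,\partial x} \\
&= \frac{\partial P}{\partial y} + \frac{\partial P}{\partial z}\frac{\partial z}{\partial y} + \frac{\partial R}{\partial y}\frac{\partial z}{\partial x} + \frac{\partial R}{\partial z}\frac{\partial z}{\partial y}\frac{\partial z}{\partial x} + R\,\frac{\partial^2 z}{\partial y\,\partial x} ~.
\end{aligned}
\]
So subtracting gives
\[
\frac{\partial\tilde{Q}}{\partial x} - \frac{\partial\tilde{P}}{\partial y} = \Bigl( \frac{\partial Q}{\partial z} - \frac{\partial R}{\partial y} \Bigr)\frac{\partial z}{\partial x} + \Bigl( \frac{\partial R}{\partial x} - \frac{\partial P}{\partial z} \Bigr)\frac{\partial z}{\partial y} + \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \tag{4.48}
\]
since $\frac{\partial^2 z}{\partial x\,\partial y} = \frac{\partial^2 z}{\partial y\,\partial x}$ by the smoothness of $z = z(x,y)$. Hence, by equation (4.47),
\[
\oint_C \mathbf{f}\cdot d\mathbf{r} = \iint_D \Bigl( -\Bigl( \frac{\partial R}{\partial y} - \frac{\partial Q}{\partial z} \Bigr)\frac{\partial z}{\partial x} - \Bigl( \frac{\partial P}{\partial z} - \frac{\partial R}{\partial x} \Bigr)\frac{\partial z}{\partial y} + \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \Bigr)\,dA \tag{4.49}
\]
after factoring out a $-1$ from the terms in the first two products in equation (4.48).

Now, recall from Section 2.3 (see p. 76) that the vector $\mathbf{N} = -\frac{\partial z}{\partial x}\,\mathbf{i} - \frac{\partial z}{\partial y}\,\mathbf{j} + \mathbf{k}$ is normal to the tangent plane to the surface $z = z(x,y)$ at each point of $\Sigma$. Thus,
\[
\mathbf{n} = \frac{\mathbf{N}}{\|\mathbf{N}\|} = \frac{ -\frac{\partial z}{\partial x}\,\mathbf{i} - \frac{\partial z}{\partial y}\,\mathbf{j} + \mathbf{k} }{ \sqrt{ 1 + \bigl( \frac{\partial z}{\partial x} \bigr)^2 + \bigl( \frac{\partial z}{\partial y} \bigr)^2 } }
\]
is in fact a positive unit normal vector to $\Sigma$ (see Figure 4.5.4). Hence, using the parametrization $\mathbf{r}(x,y) = x\,\mathbf{i} + y\,\mathbf{j} + z(x,y)\,\mathbf{k}$, for $(x,y)$ in $D$, of the surface $\Sigma$, we have $\frac{\partial\mathbf{r}}{\partial x} = \mathbf{i} + \frac{\partial z}{\partial x}\,\mathbf{k}$ and $\frac{\partial\mathbf{r}}{\partial y} = \mathbf{j} + \frac{\partial z}{\partial y}\,\mathbf{k}$, and so
\[
\Bigl\| \frac{\partial\mathbf{r}}{\partial x} \times \frac{\partial\mathbf{r}}{\partial y} \Bigr\| = \sqrt{ 1 + \Bigl( \frac{\partial z}{\partial x} \Bigr)^2 + \Bigl( \frac{\partial z}{\partial y} \Bigr)^2 } ~.
\]
So we see that using formula (4.46) for $\operatorname{curl}\mathbf{f}$, we have
\[
\begin{aligned}
\iint_\Sigma (\operatorname{curl}\mathbf{f})\cdot\mathbf{n}\,d\sigma &= \iint_D (\operatorname{curl}\mathbf{f})\cdot\mathbf{n}\,\Bigl\| \frac{\partial\mathbf{r}}{\partial x} \times \frac{\partial\mathbf{r}}{\partial y} \Bigr\| \,dA \\
&= \iint_D \Bigl( \Bigl( \frac{\partial R}{\partial y} - \frac{\partial Q}{\partial z} \Bigr)\mathbf{i} + \Bigl( \frac{\partial P}{\partial z} - \frac{\partial R}{\partial x} \Bigr)\mathbf{j} + \Bigl( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \Bigr)\mathbf{k} \Bigr)\cdot\Bigl( -\frac{\partial z}{\partial x}\,\mathbf{i} - \frac{\partial z}{\partial y}\,\mathbf{j} + \mathbf{k} \Bigr)\,dA \\
&= \iint_D \Bigl( -\Bigl( \frac{\partial R}{\partial y} - \frac{\partial Q}{\partial z} \Bigr)\frac{\partial z}{\partial x} - \Bigl( \frac{\partial P}{\partial z} - \frac{\partial R}{\partial x} \Bigr)\frac{\partial z}{\partial y} + \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \Bigr)\,dA ~,
\end{aligned}
\]
which, upon comparing to equation (4.49), proves the Theorem. QED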

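Note: The condition in Stokes’ Theorem that the surface $\Sigma$ have a (continuously varying) positive unit normal vector $\mathbf{n}$ and a boundary curve $C$ traversed $\mathbf{n}$-positively can be expressed more precisely as follows: if $\mathbf{r}(t)$ is the position vector for $C$ and $\mathbf{T}(t) = \mathbf{r}'(t)/\|\mathbf{r}'(t)\|$ is the unit tangent vector to $C$, then the vectors $\mathbf{T}$, $\mathbf{n}$, $\mathbf{T}\times\mathbf{n}$ form a right-handed system.

Also, it should be noted that Stokes’ Theorem holds even when the boundary curve $C$ is piecewise smooth.

Example 4.14. Verify Stokes’ Theorem for $\mathbf{f}(x,y,z) = z\,\mathbf{i} + x\,\mathbf{j} + y\,\mathbf{k}$ when $\Sigma$ is the paraboloid $z = x^2 + y^2$ such that $z \le 1$ (see Figure 4.5.5).

[Figure 4.5.5: $z = x^2 + y^2$]

Solution: The positive unit normal vector to the surface $z = z(x,y) = x^2 + y^2$ is
\[
\mathbf{n} = \frac{ -\frac{\partial z}{\partial x}\,\mathbf{i} - \frac{\partial z}{\partial y}\,\mathbf{j} + \mathbf{k} }{ \sqrt{ 1 + \bigl( \frac{\partial z}{\partial x} \bigr)^2 + \bigl( \frac{\partial z}{\partial y} \bigr)^2 } } = \frac{ -2x\,\mathbf{i} - 2y\,\mathbf{j} + \mathbf{k} }{ \sqrt{1 + 4x^2 + 4y^2} } ~,
\]
and $\operatorname{curl}\mathbf{f} = (1 - 0)\,\mathbf{i} + (1 - 0)\,\mathbf{j} + (1 - 0)\,\mathbf{k} = \mathbf{i} + \mathbf{j} + \mathbf{k}$, so
\[
(\operatorname{curl}\mathbf{f})\cdot\mathbf{n} = (-2x - 2y + 1)/\sqrt{1 + 4x^2 + 4y^2} ~.
\]
Since $\Sigma$ can be parametrized as $\mathbf{r}(x,y) = x\,\mathbf{i} + y\,\mathbf{j} + (x^2 + y^2)\,\mathbf{k}$ for $(x,y)$ in the region $D = \{\, (x,y) : x^2 + y^2 \le 1 \,\}$, then
\[
\begin{aligned}
\iint_\Sigma (\operatorname{curl}\mathbf{f})\cdot\mathbf{n}\,d\sigma &= \iint_D (\operatorname{curl}\mathbf{f})\cdot\mathbf{n}\,\Bigl\| \frac{\partial\mathbf{r}}{\partial x} \times \frac{\partial\mathbf{r}}{\partial y} \Bigr\| \,dA \\
&= \iint_D \frac{-2x - 2y + 1}{\sqrt{1 + 4x^2 + 4y^2}}\,\sqrt{1 + 4x^2 + 4y^2}\;dA \\
&= \iint_D (-2x - 2y + 1)\,dA ~,\ \text{so switching to polar coordinates gives} \\
&= \int_0^{2\pi}\!\!\int_0^1 (-2r\cos\theta - 2r\sin\theta + 1)\,r\,dr\,d\theta \\
&= \int_0^{2\pi}\!\!\int_0^1 (-2r^2\cos\theta - 2r^2\sin\theta + r)\,dr\,d\theta \\
&= \int_0^{2\pi} \Bigl( -\frac{2r^3}{3}\cos\theta - \frac{2r^3}{3}\sin\theta + \frac{r^2}{2} \,\Big|_{r=0}^{r=1} \Bigr)\,d\theta \\
&= \int_0^{2\pi} \Bigl( -\frac23\cos\theta - \frac23\sin\theta + \frac12 \Bigr)\,d\theta \\
&= -\frac23\sin\theta + \frac23\cos\theta + \frac12\theta \,\Big|_0^{2\pi} = \pi ~.
\end{aligned}
\]
The boundary curve $C$ is the unit circle $x^2 + y^2 = 1$ lying in the plane $z = 1$ (see Figure 4.5.5), which can be parametrized as $x = \cos t$, $y = \sin t$, $z = 1$ for $0 \le t \le 2\pi$. So
\[
\begin{aligned}
\oint_C \mathbf{f}\cdot d\mathbf{r} &= \int_0^{2\pi} \bigl( (1)(-\sin t) + (\cos t)(\cos t) + (\sin t)(0) \bigr)\,dt \\
&= \int_0^{2\pi} \Bigl( -\sin t + \frac{1 + \cos 2t}{2} \Bigr)\,dt \quad \Bigl( \text{here we used } \cos^2 t = \frac{1 + \cos 2t}{2} \Bigr) \\
&= \cos t + \frac{t}{2} + \frac{\sin 2t}{4} \,\Big|_0^{2\pi} = \pi ~.
\end{aligned}
\]
So we see that $\oint_C \mathbf{f}\cdot d\mathbf{r} = \iint_\Sigma (\operatorname{curl}\mathbf{f})\cdot\mathbf{n}\,d\sigma$, as predicted by Stokes’ Theorem.

The line integral in the preceding example was far simpler to calculate than the surface integral, but this will not always be the case.

Example 4.15. Let $\Sigma$ be the elliptic paraboloid $z = \frac{x^2}{4} + \frac{y^2}{9}$ for $z \le 1$, and let $C$ be its boundary curve. Calculate $\oint_C \mathbf{f}\cdot d\mathbf{r}$ for $\mathbf{f}(x,y,z) = (9xz + 2y)\,\mathbf{i} + (2x + y^2)\,\mathbf{j} + (-2y^2 + 2z)\,\mathbf{k}$, where $C$ is traversed counterclockwise.

Solution: The surface is similar to the one in Example 4.14, except now the boundary curve $C$ is the ellipse $\frac{x^2}{4} + \frac{y^2}{9} = 1$ lying in the plane $z = 1$. In this case, using Stokes’ Theorem is easier than computing the line integral directly. As in Example 4.14, at each point $(x, y, z(x,y))$ on the surface $z = z(x,y) = \frac{x^2}{4} + \frac{y^2}{9}$ the vector
\[
\mathbf{n} = \frac{ -\frac{\partial z}{\partial x}\,\mathbf{i} - \frac{\partial z}{\partial y}\,\mathbf{j} + \mathbf{k} }{ \sqrt{ 1 + \bigl( \frac{\partial z}{\partial x} \bigr)^2 + \bigl( \frac{\partial z}{\partial y} \bigr)^2 } } = \frac{ -\frac{x}{2}\,\mathbf{i} - \frac{2y}{9}\,\mathbf{j} + \mathbf{k} }{ \sqrt{ 1 + \frac{x^2}{4} + \frac{4y^2}{81} } }
\]
is a positive unit normal vector to $\Sigma$. And calculating the curl of $\mathbf{f}$ gives
\[
\operatorname{curl}\mathbf{f} = (-4y - 0)\,\mathbf{i} + (9x - 0)\,\mathbf{j} + (2 - 2)\,\mathbf{k} = -4y\,\mathbf{i} + 9x\,\mathbf{j} + 0\,\mathbf{k} ~,
\]
so
\[
(\operatorname{curl}\mathbf{f})\cdot\mathbf{n} = \frac{ (-4y)(-\frac{x}{2}) + (9x)(-\frac{2y}{9}) + (0)(1) }{ \sqrt{ 1 + \frac{x^2}{4} + \frac{4y^2}{81} } } = \frac{ 2xy - 2xy + 0 }{ \sqrt{ 1 + \frac{x^2}{4} + \frac{4y^2}{81} } } = 0 ~,
\]
and so by Stokes’ Theorem
\[
\oint_C \mathbf{f}\cdot d\mathbf{r} = \iint_\Sigma (\operatorname{curl}\mathbf{f})\cdot\mathbf{n}\,d\sigma = \iint_\Sigma 0\,d\sigma = 0 ~.
\]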
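The two sides of Stokes’ Theorem for $\mathbf{f} = z\,\mathbf{i} + x\,\mathbf{j} + y\,\mathbf{k}$ on the paraboloid $z = x^2 + y^2$, $z \le 1$, can also be compared numerically. The sketch below is an illustration only: the function names and grid sizes are assumptions, and the curl $\mathbf{i} + \mathbf{j} + \mathbf{k}$ and the unnormalized normal $(-2x, -2y, 1)$ are entered by hand rather than derived by the code.

```python
import math

def circulation(n=2000):
    """Line integral of f = (z, x, y) around x = cos t, y = sin t, z = 1."""
    h = 2.0 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        x, y, z = math.cos(t), math.sin(t), 1.0
        dx, dy, dz = -math.sin(t), math.cos(t), 0.0   # r'(t)
        total += (z * dx + x * dy + y * dz) * h
    return total

def curl_flux(n=400):
    """Surface integral of (curl f) . n over z = x^2 + y^2, z <= 1.

    (curl f) . n times ||r_x x r_y|| equals (1, 1, 1) . (-2x, -2y, 1),
    integrated over the unit disc D in polar coordinates.
    """
    dr = 1.0 / n
    dth = 2.0 * math.pi / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * dr
        for j in range(n):
            th = (j + 0.5) * dth
            x, y = r * math.cos(th), r * math.sin(th)
            curl_dot_N = 1.0 * (-2.0 * x) + 1.0 * (-2.0 * y) + 1.0 * 1.0
            total += curl_dot_N * r * dr * dth          # dA = r dr dtheta
    return total

line_side = circulation()
surface_side = curl_flux()
print(line_side, surface_side, math.pi)   # both estimates should be close to pi
```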

y. y. For example.6 Curl and rotation would rotate counterclockwise if it were dropped to the right of the y-axis. and imagine dropping two wheels with paddles x into that water flow. In the limit. circulation per unit area). z) = (1 + x2 ) j. z) = 2x k in our example) and would obey the right-hand rule. the term curl was created by the 19th century Scottish physicist James Clerk Maxwell in his study of electromagnetism. for a simple closed curve C the line integral C f· dr is often called the circulation of f around C. z). y. y.6. then it turns out8 that curl E = 0. then the wheels would not rotate and hence there would be no curl (which is why such fields are called irrotational. where it is used extensively. as in Figure 4. meaning no rotation). y. y. z) which is always parallel f to the xy-plane at each point (x. z) = lim 1 S →0 S f · dr . Namely. See S CHEY. then such a wheel Figure 4. y. For example. This is best seen by using another definition of curl f which is equivalent9 to the definition given by formula (4. So the curl points outward (in the positive z-direction) if x > 0 and points inward (in the negative z-direction) if x < 0.e. z) in 3 . which causes Σ. curl f(x.6. . for the derivation. In fact. In physics. z). z) points in the direction of your thumb as you cup your right hand in the direction of the rotation of the wheel. to have smaller and smaller surface area. y.174 CHAPTER 4. and it would rotate clockwise if it were dropped to the left of the y-axis. the surface it bounds. y. if E represents the electrostatic field due to a point charge. 78-81. y An idea of how the curl of a vector field is related to rotation is shown in Figure 4. that is. which means that the circulation C E · dr = 0 by Stokes’ Theorem. 2 in R EITZ. M ILFORD and C HRISTY. y.5. z) is from the y-axis. z) and with a simple closed boundary curve C and positive unit normal vector n at (x. f(x. the magnitude of f is larger) as you move away from the y-axis.5.46). 
Suppose we have a vector field f(x. Since the 0 flow is stronger (i.e. p. Vector fields which have zero curl are often called irrotational fields. 8 9 See Ch. think of the curve C shrinking to the point (x. the curl is interpreted as a measure of circulation density.50) C where S is the surface area of a surface Σ containing the point (x.5. That ratio of circulation to surface area in the limit is what makes the curl a rough measure of circulation density (i. z) and that the vectors grow larger the further the point (x. In both cases the curl would be nonzero (curl f(x. for a point (x. (4. n · (curl f )(x. Think of the vector field as representing the flow of water. y. Notice that if all the vectors had the same direction and the same magnitude. LINE AND SURFACE INTEGRALS · In physical applications.
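The circulation-density reading of formula (4.50) can be tried out on the paddle-wheel field $\mathbf{f}(x,y,z) = (1 + x^2)\,\mathbf{j}$. The sketch below is a rough illustration with assumed names and an assumed choice of curve: it takes $C$ to be a small counterclockwise circle of radius `eps` in the $xy$-plane (so $\mathbf{n} = \mathbf{k}$), estimates the circulation with a midpoint rule, and divides by the enclosed area $\pi\,\texttt{eps}^2$. The result should be close to the $\mathbf{k}$-component $2x$ of $\operatorname{curl}\mathbf{f}$ at the center of the circle.

```python
import math

def f(x, y, z):
    """The example field f = (1 + x^2) j."""
    return (0.0, 1.0 + x * x, 0.0)

def circulation_density(x0, y0, eps, n=2000):
    """Circulation of f around a small circle centered at (x0, y0, 0),
    divided by the area of the disc it bounds (compare formula (4.50))."""
    h = 2.0 * math.pi / n
    circ = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        x = x0 + eps * math.cos(t)
        y = y0 + eps * math.sin(t)
        dx, dy = -eps * math.sin(t), eps * math.cos(t)   # r'(t), counterclockwise
        fx, fy, _ = f(x, y, 0.0)
        circ += (fx * dx + fy * dy) * h
    return circ / (math.pi * eps * eps)

right = circulation_density(1.5, 0.0, 1e-2)    # right of the y-axis: positive (counterclockwise)
left = circulation_density(-1.5, 0.0, 1e-2)    # left of the y-axis: negative (clockwise)
print(right, left)   # compare with 2x at x = 1.5 and x = -1.5
```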

Finally, by Stokes’ Theorem, we know that if $C$ is a simple closed curve in some solid region $S$ in $\mathbb{R}^3$ and if $\mathbf{f}(x,y,z)$ is a smooth vector field such that $\operatorname{curl}\mathbf{f} = \mathbf{0}$ in $S$, then
\[
\oint_C \mathbf{f}\cdot d\mathbf{r} = \iint_\Sigma (\operatorname{curl}\mathbf{f})\cdot\mathbf{n}\,d\sigma = \iint_\Sigma \mathbf{0}\cdot\mathbf{n}\,d\sigma = \iint_\Sigma 0\,d\sigma = 0 ~,
\]
where $\Sigma$ is any orientable surface inside $S$ whose boundary is $C$ (such a surface is sometimes called a capping surface for $C$). So similar to the two-variable case, we have a three-dimensional version of a result from Section 4.3, for solid regions in $\mathbb{R}^3$ which are simply connected (i.e. regions having no holes):

The following statements are equivalent for a simply connected solid region $S$ in $\mathbb{R}^3$:

(a) $\mathbf{f}(x,y,z) = P(x,y,z)\,\mathbf{i} + Q(x,y,z)\,\mathbf{j} + R(x,y,z)\,\mathbf{k}$ has a smooth potential $F(x,y,z)$ in $S$
(b) $\int_C \mathbf{f}\cdot d\mathbf{r}$ is independent of the path for any curve $C$ in $S$
(c) $\oint_C \mathbf{f}\cdot d\mathbf{r} = 0$ for every simple closed curve $C$ in $S$
(d) $\frac{\partial R}{\partial y} = \frac{\partial Q}{\partial z}$, $\frac{\partial P}{\partial z} = \frac{\partial R}{\partial x}$, and $\frac{\partial Q}{\partial x} = \frac{\partial P}{\partial y}$ in $S$ (i.e. $\operatorname{curl}\mathbf{f} = \mathbf{0}$ in $S$)

Part (d) is also a way of saying that the differential form $P\,dx + Q\,dy + R\,dz$ is exact.

Example 4.16. Determine if the vector field $\mathbf{f}(x,y,z) = xyz\,\mathbf{i} + xz\,\mathbf{j} + xy\,\mathbf{k}$ has a potential in $\mathbb{R}^3$.

Solution: Since $\mathbb{R}^3$ is simply connected, we just need to check whether $\operatorname{curl}\mathbf{f} = \mathbf{0}$ throughout $\mathbb{R}^3$, that is,
\[
\frac{\partial R}{\partial y} = \frac{\partial Q}{\partial z} ~,\quad \frac{\partial P}{\partial z} = \frac{\partial R}{\partial x} ~,\quad\text{and}\quad \frac{\partial Q}{\partial x} = \frac{\partial P}{\partial y}
\]
throughout $\mathbb{R}^3$, where $P(x,y,z) = xyz$, $Q(x,y,z) = xz$, and $R(x,y,z) = xy$. But we see that
\[
\frac{\partial P}{\partial z} = xy ~,\quad \frac{\partial R}{\partial x} = y \quad\Rightarrow\quad \frac{\partial P}{\partial z} \ne \frac{\partial R}{\partial x} \ \text{for some } (x,y,z) \text{ in } \mathbb{R}^3 .
\]
Thus, $\mathbf{f}(x,y,z)$ does not have a potential in $\mathbb{R}^3$.

Exercises

A

For Exercises 1-3, calculate $\int_C f(x,y,z)\,ds$ for the given function $f(x,y,z)$ and curve $C$.

1. $f(x,y,z) = z$; $C : x = \cos t$, $y = \sin t$, $z = t$, $0 \le t \le 2\pi$
2. $f(x,y,z) = \frac{x}{y} + y + 2yz$; $C : x = t^2$, $y = t$, $z = 1$, $1 \le t \le 2$
3. $f(x,y,z) = z^2$; $C : x = t\sin t$, $y = t\cos t$, $z = \frac{2\sqrt2}{3}\,t^{3/2}$, $0 \le t \le 1$

For Exercises 4-9, calculate $\int_C \mathbf{f}\cdot d\mathbf{r}$ for the given vector field $\mathbf{f}(x,y,z)$ and curve $C$.

4. $\mathbf{f}(x,y,z) = \mathbf{i} - \mathbf{j} + \mathbf{k}$; $C : x = 3t$, $y = 2t$, $z = t$, $0 \le t \le 1$
5. $\mathbf{f}(x,y,z) = y\,\mathbf{i} - x\,\mathbf{j} + z\,\mathbf{k}$; $C : x = \cos t$, $y = \sin t$, $z = t$, $0 \le t \le 2\pi$
6. $\mathbf{f}(x,y,z) = x\,\mathbf{i} + y\,\mathbf{j} + z\,\mathbf{k}$; $C : x = \cos t$, $y = \sin t$, $z = 2$, $0 \le t \le 2\pi$
7. $\mathbf{f}(x,y,z) = (y - 2z)\,\mathbf{i} + xy\,\mathbf{j} + (2xz + y)\,\mathbf{k}$; $C : x = t$, $y = 2t$, $z = t^2 - 1$, $0 \le t \le 1$
8. $\mathbf{f}(x,y,z) = yz\,\mathbf{i} + xz\,\mathbf{j} + xy\,\mathbf{k}$; $C$ : the polygonal path from $(0,0,0)$ to $(1,0,0)$ to $(1,2,0)$
9. $\mathbf{f}(x,y,z) = xy\,\mathbf{i} + (z - x)\,\mathbf{j} + 2yz\,\mathbf{k}$; $C$ : the polygonal path from $(0,0,0)$ to $(1,0,0)$ to $(1,2,0)$ to $(1,2,-2)$

For Exercises 10-13, state whether or not the vector field $\mathbf{f}(x,y,z)$ has a potential in $\mathbb{R}^3$ (you do not need to find the potential itself).

10. $\mathbf{f}(x,y,z) = (x + y)\,\mathbf{i} + x\,\mathbf{j} + z^2\,\mathbf{k}$
11. $\mathbf{f}(x,y,z) = y\,\mathbf{i} - x\,\mathbf{j} + z\,\mathbf{k}$
12. $\mathbf{f}(x,y,z) = a\,\mathbf{i} + b\,\mathbf{j} + c\,\mathbf{k}$ ($a$, $b$, $c$ constant)
13. $\mathbf{f}(x,y,z) = xy\,\mathbf{i} - (x - yz^2)\,\mathbf{j} + y^2 z\,\mathbf{k}$

B

For Exercises 14-15, verify Stokes’ Theorem for the given vector field $\mathbf{f}(x,y,z)$ and surface $\Sigma$.

14. $\mathbf{f}(x,y,z) = 2y\,\mathbf{i} - x\,\mathbf{j} + z\,\mathbf{k}$; $\Sigma : x^2 + y^2 + z^2 = 1$, $z \ge 0$
15. $\mathbf{f}(x,y,z) = xy\,\mathbf{i} + xz\,\mathbf{j} + yz\,\mathbf{k}$; $\Sigma : z = x^2 + y^2$, $z \le 1$
16. Construct a Möbius strip from a piece of paper, then draw a line down its center (like the dotted line in Figure 4.5.3(b)). Cut the Möbius strip along that center line completely around the strip. How many surfaces does this result in? How would you describe them? Are they orientable?
17. Use Gnuplot (see Appendix C) to plot the Möbius strip parametrized as:
\[
\mathbf{r}(u,v) = \cos u\,\bigl(1 + v\cos\tfrac{u}{2}\bigr)\,\mathbf{i} + \sin u\,\bigl(1 + v\cos\tfrac{u}{2}\bigr)\,\mathbf{j} + v\sin\tfrac{u}{2}\,\mathbf{k} ~,\quad 0 \le u \le 2\pi ~,\ -\tfrac12 \le v \le \tfrac12
\]

C

18. Let $\Sigma$ be a closed surface and $\mathbf{f}(x,y,z)$ a smooth vector field. Show that $\iint_\Sigma (\operatorname{curl}\mathbf{f})\cdot\mathbf{n}\,d\sigma = 0$. (Hint: Split $\Sigma$ in half.)
19. Show that Green’s Theorem is a special case of Stokes’ Theorem.

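4.6 Gradient, Divergence, Curl and Laplacian

In this final section we will establish some relationships between the gradient, divergence and curl, and we will also introduce a new quantity called the Laplacian. We will then show how to write these quantities in cylindrical and spherical coordinates.

For a real-valued function $f(x, y, z)$ on $\mathbb{R}^3$, the gradient $\nabla f(x, y, z)$ is a vector-valued function on $\mathbb{R}^3$, that is, its value at a point $(x, y, z)$ is the vector

$$\nabla f(x, y, z) = \left( \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, \frac{\partial f}{\partial z} \right) = \frac{\partial f}{\partial x}\,\mathbf{i} + \frac{\partial f}{\partial y}\,\mathbf{j} + \frac{\partial f}{\partial z}\,\mathbf{k}$$

in $\mathbb{R}^3$, where each of the partial derivatives is evaluated at the point $(x, y, z)$. So in this way, you can think of the symbol $\nabla$ as being “applied” to a real-valued function $f$ to produce a vector $\nabla f$.

It turns out that the divergence and curl can also be expressed in terms of the symbol $\nabla$. This is done by thinking of $\nabla$ as a vector in $\mathbb{R}^3$, namely

$$\nabla = \frac{\partial}{\partial x}\,\mathbf{i} + \frac{\partial}{\partial y}\,\mathbf{j} + \frac{\partial}{\partial z}\,\mathbf{k}. \qquad (4.51)$$

Here, the symbols $\frac{\partial}{\partial x}$, $\frac{\partial}{\partial y}$ and $\frac{\partial}{\partial z}$ are to be thought of as “partial derivative operators” that will get “applied” to a real-valued function, say $f(x, y, z)$, to produce the partial derivatives $\frac{\partial f}{\partial x}$, $\frac{\partial f}{\partial y}$ and $\frac{\partial f}{\partial z}$. For instance, $\frac{\partial}{\partial x}$ “applied” to $f(x, y, z)$ produces $\frac{\partial f}{\partial x}$.

Is $\nabla$ really a vector? Strictly speaking, no, since $\frac{\partial}{\partial x}$, $\frac{\partial}{\partial y}$ and $\frac{\partial}{\partial z}$ are not actual numbers. But it helps to think of $\nabla$ as a vector, especially with the divergence and curl, as we will soon see. The process of “applying” $\frac{\partial}{\partial x}$, $\frac{\partial}{\partial y}$, $\frac{\partial}{\partial z}$ to a real-valued function $f(x, y, z)$ is normally thought of as multiplying the quantities:

$$\left(\frac{\partial}{\partial x}\right)(f) = \frac{\partial f}{\partial x}, \qquad \left(\frac{\partial}{\partial y}\right)(f) = \frac{\partial f}{\partial y}, \qquad \left(\frac{\partial}{\partial z}\right)(f) = \frac{\partial f}{\partial z}$$

For this reason, $\nabla$ is often referred to as the “del operator”, since it “operates” on functions.

For example, it is often convenient to write the divergence $\operatorname{div} \mathbf{f}$ as $\nabla \cdot \mathbf{f}$, since for a vector field $\mathbf{f}(x, y, z) = f_1(x, y, z)\,\mathbf{i} + f_2(x, y, z)\,\mathbf{j} + f_3(x, y, z)\,\mathbf{k}$, the dot product of $\mathbf{f}$ with $\nabla$ (thought of as a vector) makes sense:

$$\begin{aligned}
\nabla \cdot \mathbf{f} &= \left( \frac{\partial}{\partial x}\,\mathbf{i} + \frac{\partial}{\partial y}\,\mathbf{j} + \frac{\partial}{\partial z}\,\mathbf{k} \right) \cdot \bigl( f_1(x, y, z)\,\mathbf{i} + f_2(x, y, z)\,\mathbf{j} + f_3(x, y, z)\,\mathbf{k} \bigr) \\
&= \left(\frac{\partial}{\partial x}\right)(f_1) + \left(\frac{\partial}{\partial y}\right)(f_2) + \left(\frac{\partial}{\partial z}\right)(f_3) \\
&= \frac{\partial f_1}{\partial x} + \frac{\partial f_2}{\partial y} + \frac{\partial f_3}{\partial z} \\
&= \operatorname{div} \mathbf{f}
\end{aligned}$$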
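As a numerical illustration of the divergence formula above, the following minimal sketch approximates each partial derivative by a central difference. The function name `divergence_at`, the step size `h`, and the sample field are assumptions made for this example only; they do not come from the text.

```python
def divergence_at(f, x, y, z, h=1e-5):
    """Approximate div f = d(f1)/dx + d(f2)/dy + d(f3)/dz at (x, y, z).

    f maps (x, y, z) to a 3-tuple (f1, f2, f3); h is a small step size.
    """
    df1_dx = (f(x + h, y, z)[0] - f(x - h, y, z)[0]) / (2 * h)
    df2_dy = (f(x, y + h, z)[1] - f(x, y - h, z)[1]) / (2 * h)
    df3_dz = (f(x, y, z + h)[2] - f(x, y, z - h)[2]) / (2 * h)
    return df1_dx + df2_dy + df3_dz


def sample_field(x, y, z):
    # Hypothetical field for illustration: f = x^2 i + x y j + z k,
    # whose exact divergence is 2x + x + 1 = 3x + 1.
    return (x * x, x * y, z)


print(divergence_at(sample_field, 1.0, 2.0, 3.0))  # close to 3(1) + 1 = 4
```

Central differences are used here because their error shrinks like $h^2$; any other difference scheme would illustrate the same formula.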

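We can also write $\operatorname{curl} \mathbf{f}$ in terms of $\nabla$, namely as $\nabla \times \mathbf{f}$, since for a vector field $\mathbf{f}(x, y, z) = P(x, y, z)\,\mathbf{i} + Q(x, y, z)\,\mathbf{j} + R(x, y, z)\,\mathbf{k}$, we have:

$$\begin{aligned}
\nabla \times \mathbf{f} &= \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\[2pt] \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\[6pt] P(x, y, z) & Q(x, y, z) & R(x, y, z) \end{vmatrix} \\
&= \left( \frac{\partial R}{\partial y} - \frac{\partial Q}{\partial z} \right)\mathbf{i} - \left( \frac{\partial R}{\partial x} - \frac{\partial P}{\partial z} \right)\mathbf{j} + \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right)\mathbf{k} \\
&= \left( \frac{\partial R}{\partial y} - \frac{\partial Q}{\partial z} \right)\mathbf{i} + \left( \frac{\partial P}{\partial z} - \frac{\partial R}{\partial x} \right)\mathbf{j} + \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right)\mathbf{k} \\
&= \operatorname{curl} \mathbf{f}
\end{aligned}$$

For a real-valued function $f(x, y, z)$, the gradient $\nabla f(x, y, z) = \frac{\partial f}{\partial x}\,\mathbf{i} + \frac{\partial f}{\partial y}\,\mathbf{j} + \frac{\partial f}{\partial z}\,\mathbf{k}$ is a vector field, so we can take its divergence:

$$\begin{aligned}
\operatorname{div} \nabla f = \nabla \cdot \nabla f &= \left( \frac{\partial}{\partial x}\,\mathbf{i} + \frac{\partial}{\partial y}\,\mathbf{j} + \frac{\partial}{\partial z}\,\mathbf{k} \right) \cdot \left( \frac{\partial f}{\partial x}\,\mathbf{i} + \frac{\partial f}{\partial y}\,\mathbf{j} + \frac{\partial f}{\partial z}\,\mathbf{k} \right) \\
&= \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial x}\right) + \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial y}\right) + \frac{\partial}{\partial z}\left(\frac{\partial f}{\partial z}\right) \\
&= \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} + \frac{\partial^2 f}{\partial z^2}
\end{aligned}$$

Note that this is a real-valued function, to which we will give a special name:

Definition 4.7. For a real-valued function $f(x, y, z)$, the Laplacian of $f$, denoted by $\Delta f$, is given by

$$\Delta f(x, y, z) = \nabla \cdot \nabla f = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} + \frac{\partial^2 f}{\partial z^2}. \qquad (4.52)$$

Often the notation $\nabla^2 f$ is used for the Laplacian instead of $\Delta f$, using the convention $\nabla^2 = \nabla \cdot \nabla$.

Example 4.17. Let $\mathbf{r}(x, y, z) = x\,\mathbf{i} + y\,\mathbf{j} + z\,\mathbf{k}$ be the position vector field on $\mathbb{R}^3$. Then $\lVert \mathbf{r}(x, y, z) \rVert^2 = \mathbf{r} \cdot \mathbf{r} = x^2 + y^2 + z^2$ is a real-valued function. Find
(a) the gradient of $\lVert \mathbf{r} \rVert^2$
(b) the divergence of $\mathbf{r}$
(c) the curl of $\mathbf{r}$
(d) the Laplacian of $\lVert \mathbf{r} \rVert^2$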

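Solution:
(a) $\nabla \lVert \mathbf{r} \rVert^2 = 2x\,\mathbf{i} + 2y\,\mathbf{j} + 2z\,\mathbf{k} = 2\,\mathbf{r}$

(b) $\nabla \cdot \mathbf{r} = \frac{\partial}{\partial x}(x) + \frac{\partial}{\partial y}(y) + \frac{\partial}{\partial z}(z) = 1 + 1 + 1 = 3$

(c) $\nabla \times \mathbf{r} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \frac{\partial}{\partial x} & \frac{\partial}{\partial y} & \frac{\partial}{\partial z} \\ x & y & z \end{vmatrix} = (0 - 0)\,\mathbf{i} - (0 - 0)\,\mathbf{j} + (0 - 0)\,\mathbf{k} = \mathbf{0}$

(d) $\Delta \lVert \mathbf{r} \rVert^2 = \frac{\partial^2}{\partial x^2}(x^2 + y^2 + z^2) + \frac{\partial^2}{\partial y^2}(x^2 + y^2 + z^2) + \frac{\partial^2}{\partial z^2}(x^2 + y^2 + z^2) = 2 + 2 + 2 = 6$

Note that we could have calculated $\Delta \lVert \mathbf{r} \rVert^2$ another way, using the $\nabla$ notation along with parts (a) and (b):

$$\Delta \lVert \mathbf{r} \rVert^2 = \nabla \cdot \nabla \lVert \mathbf{r} \rVert^2 = \nabla \cdot 2\,\mathbf{r} = 2\,\nabla \cdot \mathbf{r} = 2(3) = 6$$

Notice that in Example 4.17 if we take the curl of the gradient of $\lVert \mathbf{r} \rVert^2$ we get

$$\nabla \times (\nabla \lVert \mathbf{r} \rVert^2) = \nabla \times 2\,\mathbf{r} = 2\,\nabla \times \mathbf{r} = 2\,\mathbf{0} = \mathbf{0}.$$

The following theorem shows that this will be the case in general:

Theorem 4.15. For any smooth real-valued function $f(x, y, z)$, $\nabla \times (\nabla f) = \mathbf{0}$.

Proof: We see by the smoothness of $f$ that

$$\begin{aligned}
\nabla \times (\nabla f) &= \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\[2pt] \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\[6pt] \dfrac{\partial f}{\partial x} & \dfrac{\partial f}{\partial y} & \dfrac{\partial f}{\partial z} \end{vmatrix} \\
&= \left( \frac{\partial^2 f}{\partial y\,\partial z} - \frac{\partial^2 f}{\partial z\,\partial y} \right)\mathbf{i} - \left( \frac{\partial^2 f}{\partial x\,\partial z} - \frac{\partial^2 f}{\partial z\,\partial x} \right)\mathbf{j} + \left( \frac{\partial^2 f}{\partial x\,\partial y} - \frac{\partial^2 f}{\partial y\,\partial x} \right)\mathbf{k} = \mathbf{0},
\end{aligned}$$

since the mixed partial derivatives in each component are equal. QED

Corollary 4.16. If a vector field $\mathbf{f}(x, y, z)$ has a potential, then $\operatorname{curl} \mathbf{f} = \mathbf{0}$.

Another way of stating Theorem 4.15 is that gradients are irrotational. Also, notice that in Example 4.17 if we take the divergence of the curl of $\mathbf{r}$ we trivially get

$$\nabla \cdot (\nabla \times \mathbf{r}) = \nabla \cdot \mathbf{0} = 0.$$

The following theorem shows that this will be the case in general:

Theorem 4.17. For any smooth vector field $\mathbf{f}(x, y, z)$, $\nabla \cdot (\nabla \times \mathbf{f}) = 0$.

The proof is straightforward and left as an exercise for the reader.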
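The results of Example 4.17 and the two identities just stated can be spot-checked numerically. The sketch below builds the gradient, divergence, curl and Laplacian from central differences. All helper names and the step size `H` are assumptions of this example, and a check at one sample point is only a sanity test, not a proof.

```python
H = 1e-3  # finite-difference step (an assumption; tune as needed)


def partial(g, point, axis, h=H):
    """Central-difference partial derivative of a scalar function g at point."""
    plus, minus = list(point), list(point)
    plus[axis] += h
    minus[axis] -= h
    return (g(*plus) - g(*minus)) / (2 * h)


def gradient(F):
    """Return grad F as a function (x, y, z) -> 3-tuple."""
    return lambda x, y, z: tuple(partial(F, (x, y, z), a) for a in range(3))


def divergence(f):
    """Return div f as a scalar function of (x, y, z)."""
    def div(x, y, z):
        return sum(partial(lambda *q: f(*q)[a], (x, y, z), a) for a in range(3))
    return div


def curl(f):
    """Return curl f = (R_y - Q_z, P_z - R_x, Q_x - P_y) for f = (P, Q, R)."""
    def d(component, axis, p):
        return partial(lambda *q: f(*q)[component], p, axis)

    def c(x, y, z):
        p = (x, y, z)
        return (d(2, 1, p) - d(1, 2, p),
                d(0, 2, p) - d(2, 0, p),
                d(1, 0, p) - d(0, 1, p))
    return c


def laplacian(F):
    """Return the Laplacian of F, computed as div(grad F)."""
    return divergence(gradient(F))


def r(x, y, z):
    return (x, y, z)                # position vector field r


def r_squared(x, y, z):
    return x * x + y * y + z * z    # ||r||^2


p = (1.0, -2.0, 0.5)                # arbitrary sample point
print(gradient(r_squared)(*p))          # close to 2r = (2, -4, 1)
print(divergence(r)(*p))                # close to 3
print(curl(r)(*p))                      # close to (0, 0, 0)
print(laplacian(r_squared)(*p))         # close to 6
print(curl(gradient(r_squared))(*p))    # close to (0, 0, 0)
print(divergence(curl(r))(*p))          # close to 0
```

Nesting finite differences, as `laplacian` does, amplifies rounding error, which is why the step is kept moderately large here.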

Corollary 4.18. The flux of the curl of a smooth vector field $\mathbf{f}(x, y, z)$ through any closed surface is zero.

Proof: Let $\Sigma$ be a closed surface which bounds a solid $S$. The flux of $\nabla \times \mathbf{f}$ through $\Sigma$ is

$$\begin{aligned}
\iint_\Sigma (\nabla \times \mathbf{f}) \cdot d\boldsymbol{\sigma} &= \iiint_S \nabla \cdot (\nabla \times \mathbf{f})\, dV \quad \text{(by the Divergence Theorem)} \\
&= \iiint_S 0\, dV \quad \text{(by Theorem 4.17)} \\
&= 0. \qquad \text{QED}
\end{aligned}$$

There is another method for proving Theorem 4.15 which can be useful, and is often used in physics. Namely, if the surface integral $\iint_\Sigma f(x, y, z)\, d\sigma = 0$ for all surfaces $\Sigma$ in some solid region (usually all of $\mathbb{R}^3$), then we must have $f(x, y, z) = 0$ throughout that region. The proof is not trivial, and physicists do not usually bother to prove it. But the result is true, and can also be applied to double and triple integrals.

For instance, to prove Theorem 4.15, assume that $f(x, y, z)$ is a smooth real-valued function on $\mathbb{R}^3$. Let $C$ be a simple closed curve in $\mathbb{R}^3$ and let $\Sigma$ be any capping surface for $C$ (i.e. $\Sigma$ is orientable and its boundary is $C$). Since $\nabla f$ is a vector field, then

$$\iint_\Sigma (\nabla \times (\nabla f)) \cdot \mathbf{n}\, d\sigma = \oint_C \nabla f \cdot d\mathbf{r} \quad \text{by Stokes' Theorem, so}$$

$$= 0 \quad \text{by Corollary 4.13.}$$

Since the choice of $\Sigma$ was arbitrary, then we must have $(\nabla \times (\nabla f)) \cdot \mathbf{n} = 0$ throughout $\mathbb{R}^3$, where $\mathbf{n}$ is any unit vector. Using $\mathbf{i}$, $\mathbf{j}$ and $\mathbf{k}$ in place of $\mathbf{n}$, we see that we must have $\nabla \times (\nabla f) = \mathbf{0}$ in $\mathbb{R}^3$, which completes the proof.

Example 4.18. A system of electric charges has a charge density $\rho(x, y, z)$ and produces an electrostatic field $\mathbf{E}(x, y, z)$ at points $(x, y, z)$ in space. Gauss’ Law states that

$$\iint_\Sigma \mathbf{E} \cdot d\boldsymbol{\sigma} = 4\pi \iiint_S \rho\, dV$$

for any closed surface $\Sigma$ which encloses the charges, with $S$ being the solid region enclosed by $\Sigma$. Show that $\nabla \cdot \mathbf{E} = 4\pi\rho$. This is one of Maxwell’s Equations (in Gaussian, or CGS, units).

Solution: By the Divergence Theorem, we have

$$\begin{aligned}
\iiint_S \nabla \cdot \mathbf{E}\, dV &= \iint_\Sigma \mathbf{E} \cdot d\boldsymbol{\sigma} \\
&= 4\pi \iiint_S \rho\, dV \quad \text{by Gauss' Law, so combining the integrals gives}
\end{aligned}$$

$$\iiint_S (\nabla \cdot \mathbf{E} - 4\pi\rho)\, dV = 0, \quad \text{so}$$

$$\nabla \cdot \mathbf{E} - 4\pi\rho = 0 \quad \text{since } \Sigma \text{ and hence } S \text{ was arbitrary, so}$$

$$\nabla \cdot \mathbf{E} = 4\pi\rho.$$

Often (especially in physics) it is convenient to use other coordinate systems when dealing with quantities such as the gradient, divergence, curl and Laplacian. We will present the formulas for these in cylindrical and spherical coordinates.

Recall from Section 1.7 that a point $(x, y, z)$ can be represented in cylindrical coordinates $(r, \theta, z)$, where $x = r\cos\theta$, $y = r\sin\theta$, $z = z$. At each point $(r, \theta, z)$, let $\mathbf{e}_r$, $\mathbf{e}_\theta$, $\mathbf{e}_z$ be unit vectors in the direction of increasing $r$, $\theta$, $z$, respectively (see Figure 4.6.1). Then $\mathbf{e}_r$, $\mathbf{e}_\theta$, $\mathbf{e}_z$ form an orthonormal set of vectors. Note, by the right-hand rule, that $\mathbf{e}_z \times \mathbf{e}_r = \mathbf{e}_\theta$.

Figure 4.6.1 Orthonormal vectors $\mathbf{e}_r$, $\mathbf{e}_\theta$, $\mathbf{e}_z$ in cylindrical coordinates

Figure 4.6.2 Orthonormal vectors $\mathbf{e}_\rho$, $\mathbf{e}_\theta$, $\mathbf{e}_\phi$ in spherical coordinates

Similarly, a point $(x, y, z)$ can be represented in spherical coordinates $(\rho, \theta, \phi)$, where $x = \rho\sin\phi\cos\theta$, $y = \rho\sin\phi\sin\theta$, $z = \rho\cos\phi$. At each point $(\rho, \theta, \phi)$, let $\mathbf{e}_\rho$, $\mathbf{e}_\theta$, $\mathbf{e}_\phi$ be unit vectors in the direction of increasing $\rho$, $\theta$, $\phi$, respectively (see Figure 4.6.2). Then the vectors $\mathbf{e}_\rho$, $\mathbf{e}_\theta$, $\mathbf{e}_\phi$ are orthonormal. By the right-hand rule, we see that $\mathbf{e}_\theta \times \mathbf{e}_\rho = \mathbf{e}_\phi$.

We can now summarize the expressions for the gradient, divergence, curl and Laplacian in Cartesian, cylindrical and spherical coordinates in the following tables:

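Cartesian $(x, y, z)$: Scalar function $F$; Vector field $\mathbf{f} = f_1\,\mathbf{i} + f_2\,\mathbf{j} + f_3\,\mathbf{k}$

gradient: $\nabla F = \dfrac{\partial F}{\partial x}\,\mathbf{i} + \dfrac{\partial F}{\partial y}\,\mathbf{j} + \dfrac{\partial F}{\partial z}\,\mathbf{k}$

divergence: $\nabla \cdot \mathbf{f} = \dfrac{\partial f_1}{\partial x} + \dfrac{\partial f_2}{\partial y} + \dfrac{\partial f_3}{\partial z}$

curl: $\nabla \times \mathbf{f} = \left( \dfrac{\partial f_3}{\partial y} - \dfrac{\partial f_2}{\partial z} \right)\mathbf{i} + \left( \dfrac{\partial f_1}{\partial z} - \dfrac{\partial f_3}{\partial x} \right)\mathbf{j} + \left( \dfrac{\partial f_2}{\partial x} - \dfrac{\partial f_1}{\partial y} \right)\mathbf{k}$

Laplacian: $\Delta F = \dfrac{\partial^2 F}{\partial x^2} + \dfrac{\partial^2 F}{\partial y^2} + \dfrac{\partial^2 F}{\partial z^2}$

Cylindrical $(r, \theta, z)$: Scalar function $F$; Vector field $\mathbf{f} = f_r\,\mathbf{e}_r + f_\theta\,\mathbf{e}_\theta + f_z\,\mathbf{e}_z$

gradient: $\nabla F = \dfrac{\partial F}{\partial r}\,\mathbf{e}_r + \dfrac{1}{r}\dfrac{\partial F}{\partial \theta}\,\mathbf{e}_\theta + \dfrac{\partial F}{\partial z}\,\mathbf{e}_z$

divergence: $\nabla \cdot \mathbf{f} = \dfrac{1}{r}\dfrac{\partial}{\partial r}(r f_r) + \dfrac{1}{r}\dfrac{\partial f_\theta}{\partial \theta} + \dfrac{\partial f_z}{\partial z}$

curl: $\nabla \times \mathbf{f} = \left( \dfrac{1}{r}\dfrac{\partial f_z}{\partial \theta} - \dfrac{\partial f_\theta}{\partial z} \right)\mathbf{e}_r + \left( \dfrac{\partial f_r}{\partial z} - \dfrac{\partial f_z}{\partial r} \right)\mathbf{e}_\theta + \dfrac{1}{r}\left( \dfrac{\partial}{\partial r}(r f_\theta) - \dfrac{\partial f_r}{\partial \theta} \right)\mathbf{e}_z$

Laplacian: $\Delta F = \dfrac{1}{r}\dfrac{\partial}{\partial r}\left( r\,\dfrac{\partial F}{\partial r} \right) + \dfrac{1}{r^2}\dfrac{\partial^2 F}{\partial \theta^2} + \dfrac{\partial^2 F}{\partial z^2}$

Spherical $(\rho, \theta, \phi)$: Scalar function $F$; Vector field $\mathbf{f} = f_\rho\,\mathbf{e}_\rho + f_\theta\,\mathbf{e}_\theta + f_\phi\,\mathbf{e}_\phi$

gradient: $\nabla F = \dfrac{\partial F}{\partial \rho}\,\mathbf{e}_\rho + \dfrac{1}{\rho\sin\phi}\dfrac{\partial F}{\partial \theta}\,\mathbf{e}_\theta + \dfrac{1}{\rho}\dfrac{\partial F}{\partial \phi}\,\mathbf{e}_\phi$

divergence: $\nabla \cdot \mathbf{f} = \dfrac{1}{\rho^2}\dfrac{\partial}{\partial \rho}(\rho^2 f_\rho) + \dfrac{1}{\rho\sin\phi}\dfrac{\partial f_\theta}{\partial \theta} + \dfrac{1}{\rho\sin\phi}\dfrac{\partial}{\partial \phi}(\sin\phi\, f_\phi)$

curl: $\nabla \times \mathbf{f} = \dfrac{1}{\rho\sin\phi}\left( \dfrac{\partial}{\partial \phi}(\sin\phi\, f_\theta) - \dfrac{\partial f_\phi}{\partial \theta} \right)\mathbf{e}_\rho + \dfrac{1}{\rho}\left( \dfrac{\partial}{\partial \rho}(\rho f_\phi) - \dfrac{\partial f_\rho}{\partial \phi} \right)\mathbf{e}_\theta + \left( \dfrac{1}{\rho\sin\phi}\dfrac{\partial f_\rho}{\partial \theta} - \dfrac{1}{\rho}\dfrac{\partial}{\partial \rho}(\rho f_\theta) \right)\mathbf{e}_\phi$

Laplacian: $\Delta F = \dfrac{1}{\rho^2}\dfrac{\partial}{\partial \rho}\left( \rho^2\,\dfrac{\partial F}{\partial \rho} \right) + \dfrac{1}{\rho^2\sin^2\phi}\dfrac{\partial^2 F}{\partial \theta^2} + \dfrac{1}{\rho^2\sin\phi}\dfrac{\partial}{\partial \phi}\left( \sin\phi\,\dfrac{\partial F}{\partial \phi} \right)$

The derivation of the above formulas for cylindrical and spherical coordinates is straightforward but extremely tedious. The basic idea is to take the Cartesian equivalent of the quantity in question and to substitute into that formula using the appropriate coordinate transformation. As an example, we will derive the formula for the gradient in spherical coordinates.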

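Goal: Show that the gradient of a real-valued function $F(\rho, \theta, \phi)$ in spherical coordinates is:

$$\nabla F = \frac{\partial F}{\partial \rho}\,\mathbf{e}_\rho + \frac{1}{\rho\sin\phi}\frac{\partial F}{\partial \theta}\,\mathbf{e}_\theta + \frac{1}{\rho}\frac{\partial F}{\partial \phi}\,\mathbf{e}_\phi$$

Idea: In the Cartesian gradient formula $\nabla F(x, y, z) = \frac{\partial F}{\partial x}\,\mathbf{i} + \frac{\partial F}{\partial y}\,\mathbf{j} + \frac{\partial F}{\partial z}\,\mathbf{k}$, put the Cartesian basis vectors $\mathbf{i}$, $\mathbf{j}$, $\mathbf{k}$ in terms of the spherical coordinate basis vectors $\mathbf{e}_\rho$, $\mathbf{e}_\theta$, $\mathbf{e}_\phi$ and functions of $\rho$, $\theta$ and $\phi$. Then put the partial derivatives $\frac{\partial F}{\partial x}$, $\frac{\partial F}{\partial y}$, $\frac{\partial F}{\partial z}$ in terms of $\frac{\partial F}{\partial \rho}$, $\frac{\partial F}{\partial \theta}$, $\frac{\partial F}{\partial \phi}$ and functions of $\rho$, $\theta$ and $\phi$.

Step 1: Get formulas for $\mathbf{e}_\rho$, $\mathbf{e}_\theta$, $\mathbf{e}_\phi$ in terms of $\mathbf{i}$, $\mathbf{j}$, $\mathbf{k}$.

We can see from Figure 4.6.2 that the unit vector $\mathbf{e}_\rho$ in the $\rho$ direction at a general point $(\rho, \theta, \phi)$ is $\mathbf{e}_\rho = \frac{\mathbf{r}}{\lVert \mathbf{r} \rVert}$, where $\mathbf{r} = x\,\mathbf{i} + y\,\mathbf{j} + z\,\mathbf{k}$ is the position vector of the point in Cartesian coordinates. Thus,

$$\mathbf{e}_\rho = \frac{\mathbf{r}}{\lVert \mathbf{r} \rVert} = \frac{x\,\mathbf{i} + y\,\mathbf{j} + z\,\mathbf{k}}{\sqrt{x^2 + y^2 + z^2}},$$

so using $x = \rho\sin\phi\cos\theta$, $y = \rho\sin\phi\sin\theta$, $z = \rho\cos\phi$, and $\rho = \sqrt{x^2 + y^2 + z^2}$, we get:

$$\mathbf{e}_\rho = \sin\phi\cos\theta\,\mathbf{i} + \sin\phi\sin\theta\,\mathbf{j} + \cos\phi\,\mathbf{k}$$

Now, since the angle $\theta$ is measured in the $xy$-plane, then the unit vector $\mathbf{e}_\theta$ in the $\theta$ direction must be parallel to the $xy$-plane. That is, $\mathbf{e}_\theta$ is of the form $a\,\mathbf{i} + b\,\mathbf{j} + 0\,\mathbf{k}$. To figure out what $a$ and $b$ are, note that since $\mathbf{e}_\theta \perp \mathbf{e}_\rho$, then in particular $\mathbf{e}_\theta \perp \mathbf{e}_\rho$ when $\mathbf{e}_\rho$ is in the $xy$-plane. That occurs when the angle $\phi$ is $\pi/2$. Putting $\phi = \pi/2$ into the formula for $\mathbf{e}_\rho$ gives $\mathbf{e}_\rho = \cos\theta\,\mathbf{i} + \sin\theta\,\mathbf{j} + 0\,\mathbf{k}$, and we see that a vector perpendicular to that is $-\sin\theta\,\mathbf{i} + \cos\theta\,\mathbf{j} + 0\,\mathbf{k}$. Since this vector is also a unit vector and points in the (positive) $\theta$ direction, it must be $\mathbf{e}_\theta$:

$$\mathbf{e}_\theta = -\sin\theta\,\mathbf{i} + \cos\theta\,\mathbf{j} + 0\,\mathbf{k}$$

Lastly, since $\mathbf{e}_\phi = \mathbf{e}_\theta \times \mathbf{e}_\rho$, we get:

$$\mathbf{e}_\phi = \cos\phi\cos\theta\,\mathbf{i} + \cos\phi\sin\theta\,\mathbf{j} - \sin\phi\,\mathbf{k}$$

Step 2: Use the three formulas from Step 1 to solve for $\mathbf{i}$, $\mathbf{j}$, $\mathbf{k}$ in terms of $\mathbf{e}_\rho$, $\mathbf{e}_\theta$, $\mathbf{e}_\phi$.

This comes down to solving a system of three equations in three unknowns. There are many ways of doing this, but we will do it by combining the formulas for $\mathbf{e}_\rho$ and $\mathbf{e}_\phi$ to eliminate $\mathbf{k}$, which will give us an equation involving just $\mathbf{i}$ and $\mathbf{j}$. This, with the formula for $\mathbf{e}_\theta$, will then leave us with a system of two equations in two unknowns ($\mathbf{i}$ and $\mathbf{j}$), which we will use to solve first for $\mathbf{j}$ then for $\mathbf{i}$. Lastly, we will solve for $\mathbf{k}$.

First, note that

$$\sin\phi\,\mathbf{e}_\rho + \cos\phi\,\mathbf{e}_\phi = \cos\theta\,\mathbf{i} + \sin\theta\,\mathbf{j}$$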

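so that

$$\sin\theta\,(\sin\phi\,\mathbf{e}_\rho + \cos\phi\,\mathbf{e}_\phi) + \cos\theta\,\mathbf{e}_\theta = (\sin^2\theta + \cos^2\theta)\,\mathbf{j} = \mathbf{j},$$

and so:

$$\mathbf{j} = \sin\phi\sin\theta\,\mathbf{e}_\rho + \cos\theta\,\mathbf{e}_\theta + \cos\phi\sin\theta\,\mathbf{e}_\phi$$

Likewise, we see that

$$\cos\theta\,(\sin\phi\,\mathbf{e}_\rho + \cos\phi\,\mathbf{e}_\phi) - \sin\theta\,\mathbf{e}_\theta = (\cos^2\theta + \sin^2\theta)\,\mathbf{i} = \mathbf{i},$$

and so:

$$\mathbf{i} = \sin\phi\cos\theta\,\mathbf{e}_\rho - \sin\theta\,\mathbf{e}_\theta + \cos\phi\cos\theta\,\mathbf{e}_\phi$$

Lastly, we see that:

$$\mathbf{k} = \cos\phi\,\mathbf{e}_\rho - \sin\phi\,\mathbf{e}_\phi$$

Step 3: Get formulas for $\frac{\partial F}{\partial \rho}$, $\frac{\partial F}{\partial \theta}$, $\frac{\partial F}{\partial \phi}$ in terms of $\frac{\partial F}{\partial x}$, $\frac{\partial F}{\partial y}$, $\frac{\partial F}{\partial z}$.

By the Chain Rule, we have

$$\begin{aligned}
\frac{\partial F}{\partial \rho} &= \frac{\partial F}{\partial x}\frac{\partial x}{\partial \rho} + \frac{\partial F}{\partial y}\frac{\partial y}{\partial \rho} + \frac{\partial F}{\partial z}\frac{\partial z}{\partial \rho}, \\
\frac{\partial F}{\partial \theta} &= \frac{\partial F}{\partial x}\frac{\partial x}{\partial \theta} + \frac{\partial F}{\partial y}\frac{\partial y}{\partial \theta} + \frac{\partial F}{\partial z}\frac{\partial z}{\partial \theta}, \\
\frac{\partial F}{\partial \phi} &= \frac{\partial F}{\partial x}\frac{\partial x}{\partial \phi} + \frac{\partial F}{\partial y}\frac{\partial y}{\partial \phi} + \frac{\partial F}{\partial z}\frac{\partial z}{\partial \phi},
\end{aligned}$$

which yields:

$$\begin{aligned}
\frac{\partial F}{\partial \rho} &= \sin\phi\cos\theta\,\frac{\partial F}{\partial x} + \sin\phi\sin\theta\,\frac{\partial F}{\partial y} + \cos\phi\,\frac{\partial F}{\partial z} \\
\frac{\partial F}{\partial \theta} &= -\rho\sin\phi\sin\theta\,\frac{\partial F}{\partial x} + \rho\sin\phi\cos\theta\,\frac{\partial F}{\partial y} \\
\frac{\partial F}{\partial \phi} &= \rho\cos\phi\cos\theta\,\frac{\partial F}{\partial x} + \rho\cos\phi\sin\theta\,\frac{\partial F}{\partial y} - \rho\sin\phi\,\frac{\partial F}{\partial z}
\end{aligned}$$

Step 4: Use the three formulas from Step 3 to solve for $\frac{\partial F}{\partial x}$, $\frac{\partial F}{\partial y}$, $\frac{\partial F}{\partial z}$ in terms of $\frac{\partial F}{\partial \rho}$, $\frac{\partial F}{\partial \theta}$, $\frac{\partial F}{\partial \phi}$.

Again, this involves solving a system of three equations in three unknowns. Using a similar process of elimination as in Step 2, we get:

$$\begin{aligned}
\frac{\partial F}{\partial x} &= \frac{1}{\rho\sin\phi}\left( \rho\sin^2\phi\cos\theta\,\frac{\partial F}{\partial \rho} - \sin\theta\,\frac{\partial F}{\partial \theta} + \sin\phi\cos\phi\cos\theta\,\frac{\partial F}{\partial \phi} \right) \\
\frac{\partial F}{\partial y} &= \frac{1}{\rho\sin\phi}\left( \rho\sin^2\phi\sin\theta\,\frac{\partial F}{\partial \rho} + \cos\theta\,\frac{\partial F}{\partial \theta} + \sin\phi\cos\phi\sin\theta\,\frac{\partial F}{\partial \phi} \right) \\
\frac{\partial F}{\partial z} &= \frac{1}{\rho}\left( \rho\cos\phi\,\frac{\partial F}{\partial \rho} - \sin\phi\,\frac{\partial F}{\partial \phi} \right)
\end{aligned}$$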
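Before the final substitution, the basis-vector formulas from Steps 1 and 2 can be spot-checked numerically. The sketch below evaluates $\mathbf{e}_\rho$, $\mathbf{e}_\theta$, $\mathbf{e}_\phi$ at a pair of sample angles, confirms that $\mathbf{e}_\theta \times \mathbf{e}_\rho$ reproduces $\mathbf{e}_\phi$, and rebuilds $\mathbf{i}$, $\mathbf{j}$, $\mathbf{k}$ from the Step 2 expressions. The function names and the sample angles are assumptions of this example.

```python
import math


def spherical_basis(theta, phi):
    """Return (e_rho, e_theta, e_phi) as Cartesian 3-tuples (Step 1 formulas)."""
    st, ct = math.sin(theta), math.cos(theta)
    sp, cp = math.sin(phi), math.cos(phi)
    e_rho = (sp * ct, sp * st, cp)
    e_theta = (-st, ct, 0.0)
    e_phi = (cp * ct, cp * st, -sp)
    return e_rho, e_theta, e_phi


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def combine(coeffs, vectors):
    """Linear combination sum(c_k * v_k) of 3-vectors."""
    return tuple(sum(c * v[n] for c, v in zip(coeffs, vectors)) for n in range(3))


def cartesian_basis_from_spherical(theta, phi):
    """Rebuild i, j, k from e_rho, e_theta, e_phi using the Step 2 formulas."""
    st, ct = math.sin(theta), math.cos(theta)
    sp, cp = math.sin(phi), math.cos(phi)
    basis = spherical_basis(theta, phi)
    i = combine((sp * ct, -st, cp * ct), basis)
    j = combine((sp * st, ct, cp * st), basis)
    k = combine((cp, 0.0, -sp), basis)
    return i, j, k


theta, phi = 0.7, 1.1  # arbitrary sample angles in radians
e_rho, e_theta, e_phi = spherical_basis(theta, phi)
print(cross(e_theta, e_rho))   # should agree with e_phi
print(e_phi)
print(cartesian_basis_from_spherical(theta, phi))  # close to (1,0,0), (0,1,0), (0,0,1)
```

Agreement at one pair of angles does not replace the algebra above, but a sign error in any of the six formulas would show up immediately.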

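Step 5: Substitute the formulas for $\mathbf{i}$, $\mathbf{j}$, $\mathbf{k}$ from Step 2 and the formulas for $\frac{\partial F}{\partial x}$, $\frac{\partial F}{\partial y}$, $\frac{\partial F}{\partial z}$ from Step 4 into the Cartesian gradient formula $\nabla F(x, y, z) = \frac{\partial F}{\partial x}\,\mathbf{i} + \frac{\partial F}{\partial y}\,\mathbf{j} + \frac{\partial F}{\partial z}\,\mathbf{k}$.

Doing this last step is perhaps the most tedious, since it involves simplifying $3 \times 3 + 3 \times 3 + 2 \times 2 = 22$ terms! Namely,

$$\begin{aligned}
\nabla F ={}& \frac{1}{\rho\sin\phi}\left( \rho\sin^2\phi\cos\theta\,\frac{\partial F}{\partial \rho} - \sin\theta\,\frac{\partial F}{\partial \theta} + \sin\phi\cos\phi\cos\theta\,\frac{\partial F}{\partial \phi} \right)(\sin\phi\cos\theta\,\mathbf{e}_\rho - \sin\theta\,\mathbf{e}_\theta + \cos\phi\cos\theta\,\mathbf{e}_\phi) \\
&+ \frac{1}{\rho\sin\phi}\left( \rho\sin^2\phi\sin\theta\,\frac{\partial F}{\partial \rho} + \cos\theta\,\frac{\partial F}{\partial \theta} + \sin\phi\cos\phi\sin\theta\,\frac{\partial F}{\partial \phi} \right)(\sin\phi\sin\theta\,\mathbf{e}_\rho + \cos\theta\,\mathbf{e}_\theta + \cos\phi\sin\theta\,\mathbf{e}_\phi) \\
&+ \frac{1}{\rho}\left( \rho\cos\phi\,\frac{\partial F}{\partial \rho} - \sin\phi\,\frac{\partial F}{\partial \phi} \right)(\cos\phi\,\mathbf{e}_\rho - \sin\phi\,\mathbf{e}_\phi),
\end{aligned}$$

which we see has 8 terms involving $\mathbf{e}_\rho$, 6 terms involving $\mathbf{e}_\theta$, and 8 terms involving $\mathbf{e}_\phi$. But the algebra is straightforward and yields the desired result:

$$\nabla F = \frac{\partial F}{\partial \rho}\,\mathbf{e}_\rho + \frac{1}{\rho\sin\phi}\frac{\partial F}{\partial \theta}\,\mathbf{e}_\theta + \frac{1}{\rho}\frac{\partial F}{\partial \phi}\,\mathbf{e}_\phi$$

Example 4.19. In Example 4.17 we showed that $\nabla \lVert \mathbf{r} \rVert^2 = 2\,\mathbf{r}$ and $\Delta \lVert \mathbf{r} \rVert^2 = 6$, where $\mathbf{r}(x, y, z) = x\,\mathbf{i} + y\,\mathbf{j} + z\,\mathbf{k}$ in Cartesian coordinates. Verify that we get the same answers if we switch to spherical coordinates.

Solution: Since $\lVert \mathbf{r} \rVert^2 = x^2 + y^2 + z^2 = \rho^2$ in spherical coordinates, let $F(\rho, \theta, \phi) = \rho^2$ (so that $F(\rho, \theta, \phi) = \lVert \mathbf{r} \rVert^2$). The gradient of $F$ in spherical coordinates is

$$\begin{aligned}
\nabla F &= \frac{\partial F}{\partial \rho}\,\mathbf{e}_\rho + \frac{1}{\rho\sin\phi}\frac{\partial F}{\partial \theta}\,\mathbf{e}_\theta + \frac{1}{\rho}\frac{\partial F}{\partial \phi}\,\mathbf{e}_\phi \\
&= 2\rho\,\mathbf{e}_\rho + \frac{1}{\rho\sin\phi}(0)\,\mathbf{e}_\theta + \frac{1}{\rho}(0)\,\mathbf{e}_\phi \\
&= 2\rho\,\mathbf{e}_\rho = 2\rho\,\frac{\mathbf{r}}{\lVert \mathbf{r} \rVert}, \quad \text{as we showed earlier, so} \\
&= 2\rho\,\frac{\mathbf{r}}{\rho} = 2\,\mathbf{r}, \quad \text{as expected.}
\end{aligned}$$

And the Laplacian is

$$\begin{aligned}
\Delta F &= \frac{1}{\rho^2}\frac{\partial}{\partial \rho}\left( \rho^2\,\frac{\partial F}{\partial \rho} \right) + \frac{1}{\rho^2\sin^2\phi}\frac{\partial^2 F}{\partial \theta^2} + \frac{1}{\rho^2\sin\phi}\frac{\partial}{\partial \phi}\left( \sin\phi\,\frac{\partial F}{\partial \phi} \right) \\
&= \frac{1}{\rho^2}\frac{\partial}{\partial \rho}(\rho^2\, 2\rho) + \frac{1}{\rho^2\sin^2\phi}(0) + \frac{1}{\rho^2\sin\phi}\frac{\partial}{\partial \phi}(\sin\phi\,(0)) \\
&= \frac{1}{\rho^2}\frac{\partial}{\partial \rho}(2\rho^3) + 0 + 0 \\
&= \frac{1}{\rho^2}(6\rho^2) = 6, \quad \text{as expected.}
\end{aligned}$$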
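The Laplacian computation in Example 4.19 can also be checked numerically. The sketch below evaluates the spherical Laplacian formula term by term with central differences. The function name, the step size `h`, and the second test function $F = \rho^2 \sin\phi\cos\theta$ (which equals $\rho x$, whose Cartesian Laplacian works out to $4x/\rho = 4\sin\phi\cos\theta$) are assumptions chosen for illustration.

```python
import math


def spherical_laplacian(F, rho, theta, phi, h=1e-4):
    """Approximate the spherical Laplacian of F(rho, theta, phi):

        (1/rho^2) d/drho(rho^2 dF/drho)
        + 1/(rho^2 sin^2 phi) d^2F/dtheta^2
        + 1/(rho^2 sin phi) d/dphi(sin phi dF/dphi)

    Inner first derivatives are taken at half steps so that the outer
    difference is centered at the evaluation point.
    """
    def dF_drho(r):
        return (F(r + h / 2, theta, phi) - F(r - h / 2, theta, phi)) / h

    def dF_dphi(p):
        return (F(rho, theta, p + h / 2) - F(rho, theta, p - h / 2)) / h

    radial = ((rho + h / 2) ** 2 * dF_drho(rho + h / 2)
              - (rho - h / 2) ** 2 * dF_drho(rho - h / 2)) / (h * rho ** 2)
    second_theta = (F(rho, theta + h, phi) - 2 * F(rho, theta, phi)
                    + F(rho, theta - h, phi)) / h ** 2
    azimuthal = second_theta / (rho ** 2 * math.sin(phi) ** 2)
    polar = ((math.sin(phi + h / 2) * dF_dphi(phi + h / 2)
              - math.sin(phi - h / 2) * dF_dphi(phi - h / 2))
             / (h * rho ** 2 * math.sin(phi)))
    return radial + azimuthal + polar


def rho_squared(rho, theta, phi):
    return rho ** 2                      # F from Example 4.19


def rho_times_x(rho, theta, phi):
    # Hypothetical extra test function: rho * x = rho^2 sin(phi) cos(theta)
    return rho ** 2 * math.sin(phi) * math.cos(theta)


point = (1.5, 0.7, 1.1)                  # arbitrary (rho, theta, phi)
print(spherical_laplacian(rho_squared, *point))   # close to 6
print(spherical_laplacian(rho_times_x, *point))   # close to 4 sin(phi) cos(theta)
print(4 * math.sin(point[2]) * math.cos(point[1]))
```

The formula is singular where $\sin\phi = 0$ or $\rho = 0$, so sample points should stay away from the $z$-axis and the origin.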

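Exercises

A

For Exercises 1-6, find the Laplacian of the function $f(x, y, z)$ in Cartesian coordinates.

1. $f(x, y, z) = x + y + z$
2. $f(x, y, z) = x^5$
3. $f(x, y, z) = (x^2 + y^2 + z^2)^{3/2}$
4. $f(x, y, z) = e^{x+y+z}$
5. $f(x, y, z) = x^3 + y^3 + z^3$
6. $f(x, y, z) = e^{-x^2 - y^2 - z^2}$
7. Find the Laplacian of the function in Exercise 3 in spherical coordinates.
8. Find the Laplacian of the function in Exercise 6 in spherical coordinates.
9. Let $f(x, y, z) = \dfrac{z}{x^2 + y^2}$ in Cartesian coordinates. Find $\nabla f$ in cylindrical coordinates.
10. For $\mathbf{f}(r, \theta, z) = r\,\mathbf{e}_r + z\sin\theta\,\mathbf{e}_\theta + rz\,\mathbf{e}_z$ in cylindrical coordinates, find $\operatorname{div} \mathbf{f}$ and $\operatorname{curl} \mathbf{f}$.
11. For $\mathbf{f}(\rho, \theta, \phi) = \mathbf{e}_\rho + \rho\cos\theta\,\mathbf{e}_\theta + \rho\,\mathbf{e}_\phi$ in spherical coordinates, find $\operatorname{div} \mathbf{f}$ and $\operatorname{curl} \mathbf{f}$.

B

For Exercises 12-23, prove the given formula ($r = \lVert \mathbf{r} \rVert$ is the length of the position vector field $\mathbf{r}(x, y, z) = x\,\mathbf{i} + y\,\mathbf{j} + z\,\mathbf{k}$).

12. $\nabla (1/r) = -\mathbf{r}/r^3$
13. $\Delta (1/r) = 0$
14. $\nabla \cdot (\mathbf{r}/r^3) = 0$
15. $\nabla (\ln r) = \mathbf{r}/r^2$
16. $\operatorname{div}(\mathbf{F} + \mathbf{G}) = \operatorname{div} \mathbf{F} + \operatorname{div} \mathbf{G}$
17. $\operatorname{curl}(\mathbf{F} + \mathbf{G}) = \operatorname{curl} \mathbf{F} + \operatorname{curl} \mathbf{G}$
18. $\operatorname{div}(f\,\mathbf{F}) = f \operatorname{div} \mathbf{F} + \mathbf{F} \cdot \nabla f$
19. $\operatorname{div}(\mathbf{F} \times \mathbf{G}) = \mathbf{G} \cdot \operatorname{curl} \mathbf{F} - \mathbf{F} \cdot \operatorname{curl} \mathbf{G}$
20. $\operatorname{div}(\nabla f \times \nabla g) = 0$
21. $\operatorname{curl}(f\,\mathbf{F}) = f \operatorname{curl} \mathbf{F} + (\nabla f) \times \mathbf{F}$
22. $\operatorname{curl}(\operatorname{curl} \mathbf{F}) = \nabla(\operatorname{div} \mathbf{F}) - \Delta \mathbf{F}$
23. $\Delta (f g) = f\,\Delta g + g\,\Delta f + 2(\nabla f \cdot \nabla g)$

C

24. Prove Theorem 4.17.
25. Derive the gradient formula in cylindrical coordinates: $\nabla F = \dfrac{\partial F}{\partial r}\,\mathbf{e}_r + \dfrac{1}{r}\dfrac{\partial F}{\partial \theta}\,\mathbf{e}_\theta + \dfrac{\partial F}{\partial z}\,\mathbf{e}_z$
26. Use $\mathbf{f} = u\,\nabla v$ in the Divergence Theorem to prove:
(a) Green’s first identity: $\displaystyle\iiint_S (u\,\Delta v + (\nabla u) \cdot (\nabla v))\, dV = \iint_\Sigma (u\,\nabla v) \cdot d\boldsymbol{\sigma}$
(b) Green’s second identity: $\displaystyle\iiint_S (u\,\Delta v - v\,\Delta u)\, dV = \iint_\Sigma (u\,\nabla v - v\,\nabla u) \cdot d\boldsymbol{\sigma}$
27. Suppose that $\Delta u = 0$ (i.e. $u$ is harmonic) over $\mathbb{R}^3$. Define the normal derivative $\frac{\partial u}{\partial n}$ of $u$ over a closed surface $\Sigma$ with outward unit normal vector $\mathbf{n}$ by $\frac{\partial u}{\partial n} = D_{\mathbf{n}} u = \mathbf{n} \cdot \nabla u$. Show that $\displaystyle\iint_\Sigma \frac{\partial u}{\partial n}\, d\sigma = 0$. (Hint: Use Green’s second identity.)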

Bibliography

Abbott, E.A., Flatland, 7th edition. New York: Dover Publications, Inc., 1952
    Classic tale about a creature living in a 2-dimensional world who encounters a higher-dimensional creature, with lots of humor thrown in.

Anton, H. and C. Rorres, Elementary Linear Algebra: Applications Version, 8th edition. New York: John Wiley & Sons, 2000
    Standard treatment of elementary linear algebra.

Bazaraa, M.S., H.D. Sherali and C.M. Shetty, Nonlinear Programming: Theory and Algorithms, 2nd edition. New York: John Wiley & Sons, 1993
    Thorough treatment of nonlinear optimization.

Farin, G., Curves and Surfaces for Computer Aided Geometric Design: A Practical Guide, 2nd edition. San Diego, CA: Academic Press, 1990
    An intermediate-level book on curve and surface design.

Hecht, E., Optics, 2nd edition. Reading, MA: Addison-Wesley Publishing Co., 1987
    An intermediate-level book on optics, covering a wide range of topics.

Hoel, P.G., S.C. Port and C.J. Stone, Introduction to Probability Theory. Boston, MA: Houghton Mifflin Co., 1971
    An excellent introduction to elementary, calculus-based probability theory. Lots of good exercises.

Jackson, J.D., Classical Electrodynamics, 2nd edition. New York: John Wiley & Sons, 1975
    An advanced book on electromagnetism, famous for being intimidating. Most of the mathematics will be understandable after reading the present book.

Marion, J.B., Classical Dynamics of Particles and Systems, 2nd edition. New York: Academic Press, 1970
    Standard intermediate-level treatment of classical mechanics. Very thorough.

O’Neill, B., Elementary Differential Geometry. New York: Academic Press, 1966
    Intermediate-level book on differential geometry, with a modern approach based on differential forms.

Pogorelov, A.V., Analytical Geometry. Moscow: Mir Publishers, 1980
    An intermediate/advanced book on analytic geometry.

Press, W.H., S.A. Teukolsky, W.T. Vetterling and B.P. Flannery, Numerical Recipes in FORTRAN: The Art of Scientific Computing, 2nd edition. Cambridge, UK: Cambridge University Press, 1992
    An excellent source of information on numerical methods for solving a wide variety of problems. Though all the examples are in the FORTRAN programming language, the code is clear enough to implement in the language of your choice.

Protter, M.H. and C.B. Morrey, Analytic Geometry, 2nd edition. Reading, MA: Addison-Wesley Publishing Co., 1975
    Thorough treatment of elementary analytic geometry, with a rigor not found in most recent books.

Ralston, A. and P. Rabinowitz, A First Course in Numerical Analysis, 2nd edition. New York: McGraw-Hill, 1978
    Standard treatment of elementary numerical analysis.

Reitz, J.R., F.J. Milford and R.W. Christy, Foundations of Electromagnetic Theory, 3rd edition. Reading, MA: Addison-Wesley Publishing Co., 1979
    Intermediate text on electromagnetism.

Schey, H.M., Div, Grad, Curl, and All That: An Informal Text on Vector Calculus. New York: W.W. Norton & Co., 1973
    Very intuitive approach to the subject, from a physicist’s viewpoint. Highly recommended. A good book to study after the present book.

Taylor, A.E. and W.R. Mann, Advanced Calculus, 2nd edition. New York: John Wiley & Sons, 1972
    Excellent treatment of n-dimensional calculus. A good book to study after the present book.

Uspensky, J.V., Theory of Equations. New York: McGraw-Hill, 1948
    A classic on the subject, discussing many interesting topics.

Weinberger, H.F., A First Course in Partial Differential Equations. New York: John Wiley & Sons, 1965
    A good introduction to the vast subject of partial differential equations.

Welchons, A.M. and W.R. Krickenberger, Solid Geometry. Boston, MA: Ginn & Co., 1936
    A very thorough treatment of 3-dimensional geometry from an elementary perspective, includes many topics which (sadly) do not seem to be taught anymore. Many intriguing exercises.

z = −3 + 8t 7. −4. 14) 1. 0◦ 9. since v · w = 0. x = 1 + t. lines a = b . 3) 11. 7. (a) r2 + 9z2 = 36 (b) ρ2 (1 + 8 cos2 φ) = 36 10. 16. 0 and (8. −6. Hint: See Theorem 1. −3) −2 −1 (c) √ . v + w is larger. 11π . 10 3.√ −1) 6. sin t).6 (p. 2) + t(5. 8) (g) (−7. 1) Section 1. radius: 1.816) 3 3 √ √ 11π 3.8 (p. z = 0 and a = − b . y = 2t. 11x − 24y + 21z − 26 = 0 17. Hint: use Definition 1. 3. 73. Section 1. 2−c . (a. (a) Line parallel to c (b) Half-line Section 1. No (d) 1 (b) x = 2 + t. 50) √ 1. −6. (1. z = 3 + t (c) x − 2 = z − 3. a cot φ) 12. 9/ 35 19. θ. 41 (e) (d) 2 √ 41 2 Section 1. 1. No intersection. (10.4 (p. √ 30 30 30 (b) (2. 3. x = 1. (a) (−4. 1 − cos t. 2 cos 2t. (a) (4. Yes 3. π . z = −7t 21.6. 2. v + w = 26 < 21 + 5 = v + w 15. (a) 5 √ (e) 2 17 √ √ (b) 5 (c) 17 2. −3) y = 3 + 4t. (a) r2 + z2 = 25 (b) ρ = 5 7. 39) 1. y = 1. −2. √ 7. 1) Chapter 1 Section 1. π . cos t) 9. 18) 1. 4 5 11. (a) (2 7.3 (p. −5) 5. −1) (b) ( 17. y = 2+3t. 2) 3. f ′ (t) = (1. Section 1. x = 1 + 2t. π ) 6 2 5. 2t. 4. −10.4◦ 5. 1. y = −2 + 7t. 1). radius: 5. 6 . (a) (2. z = 2 − 3t (c) x−2 5 (b) x = 2 + 5t. (−5.5 (p. 1) (i) (−2. −4) (h) (−1. 90◦ 7. = y−3 = z−2 4 −3 189 . (8.10(c). 0 (f) (14. 24.65 9. √5 .2 (p.72 9. 2) 15. (j) No. 46) 1. 0) (b) (2 7. Hint: Use the distance formula for Cartesian coordinates. v(t) = (1.7 (p. y = z = 1 3. 57) 1. Section 1. 0 √ 7. −23. 4x − 4y + 3z − 10 = 0 13. −24) 3. 9 13. 14 Section 1. center: (2. Yes. −1.Appendix A Answers and Hints to Selected Exercises 3. 2−c . 4. No. 8) √ 1. −1) 5. y = 1 5. 29) 1. (a) (2. center: (−1.1 (p. √ √ 11. f ′ (t) = (−2 sin 2t. 5) 3. 0. circle x2 + y2 = 4 in the planes z = ± 5 y y x x 9. 3. 4. 3) + t(1. x = 5t. z = 0 2a 2b 13. |v · w| = 0 < 21 √5 = v w √ √ 13. z = t 5. 3t2 ). sin t. a(t) = (0. x − 2y − z + 2 = 0 √ 15.

16 Section 1. ∂ f = 12x2 . 3. x2 +y2 +4 √ y x2 +y2 +4 ) 5. 0) = x(x2 + y + 4)−1/2 . Example 1. (−1. 0) 7. y) : x2 + y2 ≥ 4}. (0. −1). (2x. ∂f ∂x ∂f −(x2 +y2 ) 15. then write 11 1 5. N(t) = (− cos t. (1/x. ∂y 3 ∂x ∂f ∂x = 2x. (−1. ∂ f = 4x3 . Hint: Theorem 1. xz cos(xyz). 2 −4 9 6. does not exist Section 2. ∂f ∂y = 2y 3. ∂x 3 2 2 ∂f = 1 (x2 + y + 4)−2/3 13. (−1. 1).1 (p. ∂ f = x(x2 + y2 )−1/2 .3 (p.20(e). 1. ∂y2 ∂2 f 1 2 −3/2 ∂x ∂y = − 2 x(x + y + 4) 2f 2f 21. Hint: Use f ′ (t) = f(t) T. (1. 0). (2x. local max. −1. saddle pt. 63) 1.4 (p. − sin t. 15. ∂x ∂y = 0 ∂y2 parallel to c (c) Hint: Think of the functions as position vectors. range: [0. −0. increase: (45. 88) Section 2. 5. cos t. 0 17. width = height 9. √ 3π 5 2 3. 1) 5. t by 27s+16 2 2 3/2 27 (13 2/3 − 8) 5.6 (p. ∂2 f ∂2 f 2 −3/2 . ∞) 15. ∂y2 = 2.03256. 1) : → (1. z = 2 11.9 (p. ∂ f = y(x2 + y2 )−1/2 ∂x ∂y = depth=10 13.2858. 74) 1. saddle pt. does not exist 11. ∂x ∂y 2 ∂2 f xy + 1 23. local min. ∂y2 ∂x ∂2 f ∂2 f = −y−2 . 1/y) 7. ( √ x . −20) 9. ∂y = 2 (x + y + 4) ∂x ∂f (1. 2z) 11. ∂ 2 = x2 e xy . (x0 .5 (p.94037) . 82) 1.190 Appendix A: Answers and Hints to Selected Exercises ∂2 f 1 = − 4 (x2 + y + 4)−3/2 . 77) 9. 9 T 11. ∂x ∂y = (1 + xy)e ∂x2 2f ∂2 f ∂2 f = 0.37. ∂y = −2ye ∂x ∂2 f ∂2 f ∂f ∂y = x cos(xy) 17. y0 ) = (0. x = y = 4.2 (p. 2 13. 1). 0). 2y. Hint: Use κ(t) = 1/2 Chapter 2 Section 2. (0. 20). local min. 0) = xe xy + x 7. y0 ) = (1. −2x + y − z − 2 = 0 √ pressions into f ′ (t) × f ′′ (t). local min.3998). and Theorem 1. −1).16 7. 1 decrease: (−45. 2 (x − 1) + 4 (y − 2) + 12 (z − ′ (t) in terms of N(t). 70) 2. (1. 3x + 4y − 5z = 0 2 1 √ (sin t. ∂ 2 = −x−2 . xy cos(xyz)) √ 1 1. x + 2y = z 7. saddle pts. differentiate that to get f ′′ (t). (1. 2x + 3y − z − 3 = 0 3. 1/2) 11. T(t) = √ 2 11 1 √ (− sin t. ∂ f = 0 ∂y ∂x ∂y 9. ∞) 3. ∂ 2 = y2 e xy . 1] 7. −1). 3 ) = 0 9. 2y) 3. ∂ f = −2xe−(x +y ) . ∂ f = 2x (x2 + y + 4)−2/3 . 2 15. ∂x2 = 2. ∂f 1 2 −1/2 5. 2 2 13. √3 √ {(x. Hint: Use Exercise 6. 
∂ f = y cos(xy). domain: 9. put those ex1. ∂x ∂y = 0 25. (−1. local min. − cos t. Section 2. ∂x ∂y = 0 19. ∂ f = ye xy + y. domain: 3 . 1). domain: range: [−1. Section 2. (yz cos(xyz). ∂x2 = (y + 4)(x + y + 4) Section 2. 95) 2. Replace B(t) = Theorem 1. 3 cos(1) 17. range: [−1. (x0 . 0) : → (0. local min.

√ π 2.5 (p. 142) 1. 9 3. otherwise use e = 2. (0. 0 3. √ 13 13 −9 2 √ . min. 1 4 7. 23 5. Yes 13.3 (p.318. (b) No. 12 x2 + y2 + z2 5. Hint: Think of how a vector field f(x. Hint: Start by showing that er = cos θ i + sin θ j. Yes. 2π Chapter 3 Section 3.705 4.2 (p. 6 Section 4. 163) 1. 134) 2 4 √ . div f = ρ − sin φ + cot φ. 10. 1 3.146 3. 12π/5 7.7 (p. 2π(π − 1) 7. 1 n 20 30 √ . 1 − sin 2 2 − 33/2 ) 9. min. 15 1. max. 0. y) = 4x2 y + 2y2 + 3x Section 3. −2π 9. − √ . 3 3.2 (p. √ 5 5 Section 3. 2 9. Section 3.1 (p. 16/15 3.exp(x). 1. 6 5 π 4 6. Yes. −5π 5. F(x. No 9. y) i+ Q(x. 109) 1.4 (p. 6(x + y + z) 2 7. 1 5.) 2.191 Section 2.6 (p. 155) 1.4 (p. 5a/12) 9. Section 4. 116) 1. F(x. 1 3 1 6 7. 3. 12ρ 8. Section 3. F(x. 112) 1.1 (p. No 4. (0. Both are n (n+1)2 (n+2) 7. eθ = − sin θ i + cos θ j. 2 2 π2 2. 8/3) 3. 7 12 7 6 1 2 Section 4. 4. Hint: Think of how F is defined. 104) 1. 216π 2. 5 9. Yes. max. 0.6 (p. (Hint: In Java the exponential function e x can be obtained with Math. y) = P(x. (1. 0 3. (2 cos(π2 ) + π4 − 2)/4 5. (7/12. 8abc √ 3 3 −4 −2 √ . 1 3. 4π 3 (8 Section 4.7 (p. 127) 4a 1. y) j in 2 can be extended in a natural way to be a vector field in 3 . 1/2 3. 15/4 Section 3. 5. . 4π 7. min. 149) 11. F(x. 186) 7. ez = k.168 Section 4. 7/12. (17 17 − 5 5)/3 3.3 (p. The values should converge to ≈ 1. (4ρ2 − 6)e−ρ 9. ≈ 0. ≈ 1. √ 5 5 30 20 . curl f = cot φ cos θ eρ + 2eθ − 2 cos θ eφ 25. 1 6. 8π 3. y) = x − y2 2 5.7182818284590455 in your program. 67/15 9. √ 5 5 13 13 √ 59 −1 4 . y) = xy2 + x3 7. 3π ) 5. − √ . (0. 7/12) 1. y) = axy + bx + cy + d 1 6 2 2 5. 2πab Section 3. ≈ 0. Chapter 4 Section 4. 175) √ √ √ 1.5 (p. 9 8. 100) 1. − 2z er + r12 ez r3 sin θ 2 11. 4 . Yes. 10. No 19. 8 ln 2 − 3 5. 2 10. Other languages have similar functions. 123) 1. 6 11. 7. max. 3π/16) 7. 2/5 4. 24π 11.

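Appendix B

Proof of the Right-Hand Rule for the Cross Product

We will prove the right-hand rule for the cross product of two vectors in $\mathbb{R}^3$.

For any vectors $\mathbf{v}$ and $\mathbf{w}$ in $\mathbb{R}^3$, define a new vector, $\mathbf{n}(\mathbf{v}, \mathbf{w})$, as follows:

1. If $\mathbf{v}$ and $\mathbf{w}$ are nonzero and not parallel, and $\theta$ is the angle between them, then $\mathbf{n}(\mathbf{v}, \mathbf{w})$ is the vector in $\mathbb{R}^3$ such that:
   (a) the magnitude of $\mathbf{n}(\mathbf{v}, \mathbf{w})$ is $\lVert \mathbf{v} \rVert \lVert \mathbf{w} \rVert \sin\theta$,
   (b) $\mathbf{n}(\mathbf{v}, \mathbf{w})$ is perpendicular to the plane containing $\mathbf{v}$ and $\mathbf{w}$, and
   (c) $\mathbf{v}$, $\mathbf{w}$, $\mathbf{n}(\mathbf{v}, \mathbf{w})$ form a right-handed system.
2. If $\mathbf{v}$ and $\mathbf{w}$ are nonzero and parallel, then $\mathbf{n}(\mathbf{v}, \mathbf{w}) = \mathbf{0}$.
3. If either $\mathbf{v}$ or $\mathbf{w}$ is $\mathbf{0}$, then $\mathbf{n}(\mathbf{v}, \mathbf{w}) = \mathbf{0}$.

The goal is to show that $\mathbf{n}(\mathbf{v}, \mathbf{w}) = \mathbf{v} \times \mathbf{w}$ for all $\mathbf{v}$, $\mathbf{w}$ in $\mathbb{R}^3$, which would prove the right-hand rule for the cross product (by part 1(c) of our definition). To do this, we will perform the following steps:

Step 1: Show that $\mathbf{n}(\mathbf{v}, \mathbf{w}) = \mathbf{v} \times \mathbf{w}$ if $\mathbf{v}$ and $\mathbf{w}$ are any two of the basis vectors $\mathbf{i}$, $\mathbf{j}$, $\mathbf{k}$.

This was already shown in Example 1.11 in Section 1.4.

Step 2: Show that $\mathbf{n}(a\mathbf{v}, b\mathbf{w}) = ab(\mathbf{v} \times \mathbf{w})$ for any scalars $a$, $b$ if $\mathbf{v}$ and $\mathbf{w}$ are any two of the basis vectors $\mathbf{i}$, $\mathbf{j}$, $\mathbf{k}$.

If either $a = 0$ or $b = 0$ then $\mathbf{n}(a\mathbf{v}, b\mathbf{w}) = \mathbf{0} = ab(\mathbf{v} \times \mathbf{w})$, so the result holds. So assume that $a \ne 0$ and $b \ne 0$. Let $\mathbf{v}$ and $\mathbf{w}$ be any two of the basis vectors $\mathbf{i}$, $\mathbf{j}$, $\mathbf{k}$. For example, we will show that the result holds for $\mathbf{v} = \mathbf{i}$ and $\mathbf{w} = \mathbf{k}$ (the other possibilities follow in a similar fashion).

For $a\mathbf{v} = a\mathbf{i}$ and $b\mathbf{w} = b\mathbf{k}$, the angle $\theta$ between $a\mathbf{v}$ and $b\mathbf{w}$ is $90^\circ$. Hence the magnitude of $\mathbf{n}(a\mathbf{v}, b\mathbf{w})$, by definition, is $\lVert a\mathbf{i} \rVert \lVert b\mathbf{k} \rVert \sin 90^\circ = |ab|$. Also, by definition, $\mathbf{n}(a\mathbf{v}, b\mathbf{w})$ is perpendicular to the plane containing $a\mathbf{i}$ and $b\mathbf{k}$, namely, the $xz$-plane. Thus, $\mathbf{n}(a\mathbf{v}, b\mathbf{w})$ must be a scalar multiple of $\mathbf{j}$. Since its magnitude is $|ab|$, then $\mathbf{n}(a\mathbf{v}, b\mathbf{w})$ must be either $|ab|\,\mathbf{j}$ or $-|ab|\,\mathbf{j}$.

There are four possibilities for the combinations of signs for $a$ and $b$. We will consider the case when $a > 0$ and $b > 0$ (the other three possibilities are handled similarly).

In this case, $\mathbf{n}(a\mathbf{v}, b\mathbf{w})$ must be either $ab\,\mathbf{j}$ or $-ab\,\mathbf{j}$. Now, since $\mathbf{i}$, $\mathbf{j}$, $\mathbf{k}$ form a right-handed system, then $\mathbf{i}$, $\mathbf{k}$, $\mathbf{j}$ form a left-handed system, and so $\mathbf{i}$, $\mathbf{k}$, $-\mathbf{j}$ form a right-handed system. Thus, $a\mathbf{i}$, $b\mathbf{k}$, $-ab\,\mathbf{j}$ form a right-handed system (since $a > 0$, $b > 0$, and $ab > 0$). So since, by definition, $a\mathbf{i}$, $b\mathbf{k}$, $\mathbf{n}(a\mathbf{i}, b\mathbf{k})$ form a right-handed system, and since $\mathbf{n}(a\mathbf{i}, b\mathbf{k})$ has to be either $ab\,\mathbf{j}$ or $-ab\,\mathbf{j}$, this means that we must have $\mathbf{n}(a\mathbf{i}, b\mathbf{k}) = -ab\,\mathbf{j}$.

But we know that $a\mathbf{i} \times b\mathbf{k} = ab(\mathbf{i} \times \mathbf{k}) = ab(-\mathbf{j}) = -ab\,\mathbf{j}$. Therefore, $\mathbf{n}(a\mathbf{i}, b\mathbf{k}) = ab(\mathbf{i} \times \mathbf{k})$, which is what we needed to show.

$\therefore\ \mathbf{n}(a\mathbf{v}, b\mathbf{w}) = ab(\mathbf{v} \times \mathbf{w})$

Step 3: Show that $\mathbf{n}(\mathbf{u}, \mathbf{v} + \mathbf{w}) = \mathbf{n}(\mathbf{u}, \mathbf{v}) + \mathbf{n}(\mathbf{u}, \mathbf{w})$ for any vectors $\mathbf{u}$, $\mathbf{v}$, $\mathbf{w}$.

If $\mathbf{u} = \mathbf{0}$ then the result holds trivially since $\mathbf{n}(\mathbf{u}, \mathbf{v} + \mathbf{w})$, $\mathbf{n}(\mathbf{u}, \mathbf{v})$ and $\mathbf{n}(\mathbf{u}, \mathbf{w})$ are all the zero vector. If $\mathbf{v} = \mathbf{0}$, then the result follows easily since $\mathbf{n}(\mathbf{u}, \mathbf{v} + \mathbf{w}) = \mathbf{n}(\mathbf{u}, \mathbf{0} + \mathbf{w}) = \mathbf{n}(\mathbf{u}, \mathbf{w}) = \mathbf{0} + \mathbf{n}(\mathbf{u}, \mathbf{w}) = \mathbf{n}(\mathbf{u}, \mathbf{0}) + \mathbf{n}(\mathbf{u}, \mathbf{w}) = \mathbf{n}(\mathbf{u}, \mathbf{v}) + \mathbf{n}(\mathbf{u}, \mathbf{w})$. A similar argument shows that the result holds if $\mathbf{w} = \mathbf{0}$.

So now assume that $\mathbf{u}$, $\mathbf{v}$ and $\mathbf{w}$ are all nonzero vectors. We will describe a geometric construction of $\mathbf{n}(\mathbf{u}, \mathbf{v})$, which is shown in the figure below. Let $P$ be a plane perpendicular to $\mathbf{u}$. Multiply the vector $\mathbf{v}$ by the positive scalar $\lVert \mathbf{u} \rVert$, then project the vector $\lVert \mathbf{u} \rVert \mathbf{v}$ straight down onto the plane $P$. You can think of this projection vector (denoted by $\operatorname{proj}_P \lVert \mathbf{u} \rVert \mathbf{v}$) as the shadow of the vector $\lVert \mathbf{u} \rVert \mathbf{v}$ on the plane $P$, with the light source directly overhead the terminal point of $\lVert \mathbf{u} \rVert \mathbf{v}$. If $\theta$ is the angle between $\mathbf{u}$ and $\mathbf{v}$, then we see that $\operatorname{proj}_P \lVert \mathbf{u} \rVert \mathbf{v}$ has magnitude $\lVert \mathbf{u} \rVert \lVert \mathbf{v} \rVert \sin\theta$, which is the magnitude of $\mathbf{n}(\mathbf{u}, \mathbf{v})$. So rotating $\operatorname{proj}_P \lVert \mathbf{u} \rVert \mathbf{v}$ by $90^\circ$ in a counter-clockwise direction in the plane $P$ gives a vector whose magnitude is the same as that of $\mathbf{n}(\mathbf{u}, \mathbf{v})$ and which is perpendicular to $\operatorname{proj}_P \lVert \mathbf{u} \rVert \mathbf{v}$ (and hence perpendicular to $\mathbf{v}$). Since this vector is in $P$ then it is also perpendicular to $\mathbf{u}$. And we can see that $\mathbf{u}$, $\mathbf{v}$ and this vector form a right-handed system. Hence this vector must be $\mathbf{n}(\mathbf{u}, \mathbf{v})$. Note that this holds even if $\mathbf{u} \parallel \mathbf{v}$, since in that case $\theta = 0^\circ$ and so $\sin\theta = 0$ which means that $\mathbf{n}(\mathbf{u}, \mathbf{v})$ has magnitude 0, which is what we would expect.

[Figure: the vectors $\mathbf{u}$, $\mathbf{v}$, $\lVert \mathbf{u} \rVert \mathbf{v}$, the angle $\theta$, the plane $P$, the projection $\operatorname{proj}_P \lVert \mathbf{u} \rVert \mathbf{v}$ and $\mathbf{n}(\mathbf{u}, \mathbf{v})$]

Now apply this same geometric construction to get $\mathbf{n}(\mathbf{u}, \mathbf{w})$ and $\mathbf{n}(\mathbf{u}, \mathbf{v} + \mathbf{w})$. Since $\lVert \mathbf{u} \rVert (\mathbf{v} + \mathbf{w})$ is the sum of the vectors $\lVert \mathbf{u} \rVert \mathbf{v}$ and $\lVert \mathbf{u} \rVert \mathbf{w}$, then the projection vector $\operatorname{proj}_P \lVert \mathbf{u} \rVert (\mathbf{v} + \mathbf{w})$ is the sum of the projection vectors $\operatorname{proj}_P \lVert \mathbf{u} \rVert \mathbf{v}$ and $\operatorname{proj}_P \lVert \mathbf{u} \rVert \mathbf{w}$ (to see this, using the shadow analogy again and the parallelogram rule for vector addition, think of how projecting a parallelogram onto a plane gives you a parallelogram in that plane). So then rotating all three projection vectors by $90^\circ$ in a counter-clockwise direction in the plane $P$ preserves that sum (see the figure below), which means that $\mathbf{n}(\mathbf{u}, \mathbf{v} + \mathbf{w}) = \mathbf{n}(\mathbf{u}, \mathbf{v}) + \mathbf{n}(\mathbf{u}, \mathbf{w})$.

[Figure: the projections of $\lVert \mathbf{u} \rVert \mathbf{v}$, $\lVert \mathbf{u} \rVert \mathbf{w}$ and $\lVert \mathbf{u} \rVert (\mathbf{v} + \mathbf{w})$ onto the plane $P$, together with $\mathbf{n}(\mathbf{u}, \mathbf{v})$, $\mathbf{n}(\mathbf{u}, \mathbf{w})$ and $\mathbf{n}(\mathbf{u}, \mathbf{v} + \mathbf{w})$]

Step 4: Show that $\mathbf{n}(\mathbf{w}, \mathbf{v}) = -\mathbf{n}(\mathbf{v}, \mathbf{w})$ for any vectors $\mathbf{v}$, $\mathbf{w}$.

If $\mathbf{v}$ and $\mathbf{w}$ are nonzero and parallel, or if either is $\mathbf{0}$, then $\mathbf{n}(\mathbf{w}, \mathbf{v}) = \mathbf{0} = -\mathbf{n}(\mathbf{v}, \mathbf{w})$, so the result holds. So assume that $\mathbf{v}$ and $\mathbf{w}$ are nonzero and not parallel. Then $\mathbf{n}(\mathbf{w}, \mathbf{v})$ has magnitude $\lVert \mathbf{w} \rVert \lVert \mathbf{v} \rVert \sin\theta$, which is the same as the magnitude of $\mathbf{n}(\mathbf{v}, \mathbf{w})$, and hence is the same as the magnitude of $-\mathbf{n}(\mathbf{v}, \mathbf{w})$. By definition, $\mathbf{n}(\mathbf{v}, \mathbf{w})$ is perpendicular to the plane containing $\mathbf{w}$ and $\mathbf{v}$, and hence so is $-\mathbf{n}(\mathbf{v}, \mathbf{w})$. Also, $\mathbf{v}$, $\mathbf{w}$, $\mathbf{n}(\mathbf{v}, \mathbf{w})$ form a right-handed system, and so $\mathbf{w}$, $\mathbf{v}$, $\mathbf{n}(\mathbf{v}, \mathbf{w})$ form a left-handed system, and hence $\mathbf{w}$, $\mathbf{v}$, $-\mathbf{n}(\mathbf{v}, \mathbf{w})$ form a right-handed system. Thus, we have shown that $-\mathbf{n}(\mathbf{v}, \mathbf{w})$ is a vector with the same magnitude as $\mathbf{n}(\mathbf{w}, \mathbf{v})$ and is perpendicular to the plane containing $\mathbf{w}$ and $\mathbf{v}$, and that $\mathbf{w}$, $\mathbf{v}$, $-\mathbf{n}(\mathbf{v}, \mathbf{w})$ form a right-handed system. So by definition this means that $-\mathbf{n}(\mathbf{v}, \mathbf{w})$ must be $\mathbf{n}(\mathbf{w}, \mathbf{v})$.

Step 5: Show that $\mathbf{n}(\mathbf{v}, \mathbf{w}) = \mathbf{v} \times \mathbf{w}$ for all vectors $\mathbf{v}$, $\mathbf{w}$.

Write $\mathbf{v} = v_1\,\mathbf{i} + v_2\,\mathbf{j} + v_3\,\mathbf{k}$ and $\mathbf{w} = w_1\,\mathbf{i} + w_2\,\mathbf{j} + w_3\,\mathbf{k}$. Then by Steps 3 and 4, we have

$$\begin{aligned}
\mathbf{n}(\mathbf{v}, \mathbf{w}) &= \mathbf{n}(v_1\mathbf{i} + v_2\mathbf{j} + v_3\mathbf{k},\ w_1\mathbf{i} + w_2\mathbf{j} + w_3\mathbf{k}) \\
&= \mathbf{n}(v_1\mathbf{i} + v_2\mathbf{j} + v_3\mathbf{k},\ w_1\mathbf{i}) + \mathbf{n}(v_1\mathbf{i} + v_2\mathbf{j} + v_3\mathbf{k},\ w_2\mathbf{j} + w_3\mathbf{k}) \\
&= \mathbf{n}(v_1\mathbf{i} + v_2\mathbf{j} + v_3\mathbf{k},\ w_1\mathbf{i}) + \mathbf{n}(v_1\mathbf{i} + v_2\mathbf{j} + v_3\mathbf{k},\ w_2\mathbf{j}) + \mathbf{n}(v_1\mathbf{i} + v_2\mathbf{j} + v_3\mathbf{k},\ w_3\mathbf{k}) \\
&= -\mathbf{n}(w_1\mathbf{i},\ v_1\mathbf{i} + v_2\mathbf{j} + v_3\mathbf{k}) + -\mathbf{n}(w_2\mathbf{j},\ v_1\mathbf{i} + v_2\mathbf{j} + v_3\mathbf{k}) + -\mathbf{n}(w_3\mathbf{k},\ v_1\mathbf{i} + v_2\mathbf{j} + v_3\mathbf{k}).
\end{aligned}$$

We can use Steps 1 and 2 to evaluate the three terms on the right side of the last equation above:

$$\begin{aligned}
-\mathbf{n}(w_1\mathbf{i},\ v_1\mathbf{i} + v_2\mathbf{j} + v_3\mathbf{k}) &= -\mathbf{n}(w_1\mathbf{i}, v_1\mathbf{i}) + -\mathbf{n}(w_1\mathbf{i}, v_2\mathbf{j}) + -\mathbf{n}(w_1\mathbf{i}, v_3\mathbf{k}) \\
&= -v_1 w_1\,\mathbf{n}(\mathbf{i}, \mathbf{i}) + -v_2 w_1\,\mathbf{n}(\mathbf{i}, \mathbf{j}) + -v_3 w_1\,\mathbf{n}(\mathbf{i}, \mathbf{k}) \\
&= -v_1 w_1 (\mathbf{i} \times \mathbf{i}) + -v_2 w_1 (\mathbf{i} \times \mathbf{j}) + -v_3 w_1 (\mathbf{i} \times \mathbf{k}) \\
&= -v_1 w_1\,\mathbf{0} + -v_2 w_1\,\mathbf{k} + -v_3 w_1 (-\mathbf{j}) \\
-\mathbf{n}(w_1\mathbf{i},\ v_1\mathbf{i} + v_2\mathbf{j} + v_3\mathbf{k}) &= -v_2 w_1\,\mathbf{k} + v_3 w_1\,\mathbf{j}
\end{aligned}$$

Similarly, we can calculate

$$-\mathbf{n}(w_2\mathbf{j},\ v_1\mathbf{i} + v_2\mathbf{j} + v_3\mathbf{k}) = v_1 w_2\,\mathbf{k} - v_3 w_2\,\mathbf{i}$$

and

$$-\mathbf{n}(w_3\mathbf{k},\ v_1\mathbf{i} + v_2\mathbf{j} + v_3\mathbf{k}) = -v_1 w_3\,\mathbf{j} + v_2 w_3\,\mathbf{i}.$$

Thus, putting it all together, we have

$$\begin{aligned}
\mathbf{n}(\mathbf{v}, \mathbf{w}) &= -v_2 w_1\,\mathbf{k} + v_3 w_1\,\mathbf{j} + v_1 w_2\,\mathbf{k} - v_3 w_2\,\mathbf{i} - v_1 w_3\,\mathbf{j} + v_2 w_3\,\mathbf{i} \\
&= (v_2 w_3 - v_3 w_2)\,\mathbf{i} + (v_3 w_1 - v_1 w_3)\,\mathbf{j} + (v_1 w_2 - v_2 w_1)\,\mathbf{k} \\
&= \mathbf{v} \times \mathbf{w} \quad \text{by definition of the cross product.}
\end{aligned}$$

$\therefore\ \mathbf{n}(\mathbf{v}, \mathbf{w}) = \mathbf{v} \times \mathbf{w}$ for all vectors $\mathbf{v}$, $\mathbf{w}$.

So since $\mathbf{v}$, $\mathbf{w}$, $\mathbf{n}(\mathbf{v}, \mathbf{w})$ form a right-handed system, then $\mathbf{v}$, $\mathbf{w}$, $\mathbf{v} \times \mathbf{w}$ form a right-handed system, which completes the proof.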
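The three defining properties of $\mathbf{n}(\mathbf{v}, \mathbf{w})$ can be spot-checked numerically for the component formula of $\mathbf{v} \times \mathbf{w}$. The sketch below is illustrative only: the sample vectors are arbitrary, the helper names are assumptions, and a positive determinant of the matrix with rows $\mathbf{v}$, $\mathbf{w}$, $\mathbf{n}$ is used as the algebraic test for a right-handed triple, which is a standard criterion rather than something established in this appendix.

```python
import math


def cross(v, w):
    """Component formula for v x w from the last step of the proof."""
    return (v[1] * w[2] - v[2] * w[1],
            v[2] * w[0] - v[0] * w[2],
            v[0] * w[1] - v[1] * w[0])


def dot(v, w):
    return sum(a * b for a, b in zip(v, w))


def norm(v):
    return math.sqrt(dot(v, v))


def det3(a, b, c):
    """Determinant of the 3x3 matrix with rows a, b, c, i.e. a . (b x c)."""
    return dot(a, cross(b, c))


v = (1.0, 2.0, -1.0)    # arbitrary nonzero, nonparallel sample vectors
w = (0.5, -1.0, 3.0)
n = cross(v, w)

theta = math.acos(dot(v, w) / (norm(v) * norm(w)))
print(norm(n), norm(v) * norm(w) * math.sin(theta))  # (a) the two magnitudes agree
print(dot(n, v), dot(n, w))                          # (b) both close to 0
print(det3(v, w, n) > 0)                             # (c) True for a right-handed triple
```

For these sample vectors the cross product is $(5, -3.5, -2)$, and the determinant in part (c) equals $\lVert \mathbf{n} \rVert^2$, which is positive whenever $\mathbf{v}$ and $\mathbf{w}$ are not parallel.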

Appendix C

3D Graphing with Gnuplot

Gnuplot is a free, open-source software package for producing a variety of graphs. Versions are available for many operating systems. Below is a very brief tutorial on how to use Gnuplot to graph functions of several variables.

INSTALLATION

1. Go to http://www.gnuplot.info/download.html and follow the links to download the latest version for your operating system. For Windows, you should get the Zip file with a name such as gp420win32.zip, which is version 4.2.0. All the examples we will discuss require at least version 4.2.0.
2. Install the downloaded file. For example, in Windows you would unzip the Zip file you downloaded in Step 1 into some folder (use the “Use folder names” option if extracting with WinZip).

RUNNING GNUPLOT

1. In Windows, run wgnuplot.exe from the folder (or bin folder) where you installed Gnuplot. In Linux, just type gnuplot in a terminal window.
2. You should now get a Gnuplot terminal with a gnuplot> command prompt. In Windows this will appear in a new window, while in Linux it will appear in the terminal window where the gnuplot command was run. For Windows, if the font is unreadable you can change it by right-clicking on the text part of the Gnuplot window and selecting the “Choose Font..” option. For example, the font “Courier”, style “Regular”, size “12” is usually a good choice (that choice can be saved for future sessions by right-clicking in the Gnuplot window again and selecting the option to update wgnuplot.ini).
3. At the gnuplot> command prompt you can now run graphing commands, which we will now describe.

GRAPHING FUNCTIONS

The usual way to create 3D graphs in Gnuplot is with the splot command:

    splot <range> <comma-separated list of functions>

For a function z = f (x, y), <range> is the range of x and y values (and optionally the range of z values) over which to plot. To specify an x range and a y range, use an expression of the form [a : b][c : d], for some numbers a < b and c < d. This will cause the graph to be plotted for a ≤ x ≤ b and c ≤ y ≤ d.

Function definitions use the x and y variables in combination with mathematical operators, listed below:

Symbol    Operation        Example      Result
+         Addition         2+3          5
-         Subtraction      3-2          1
*         Multiplication   2*3          6
/         Division         4/2          2
**        Power            2**3         2^3 = 8
exp(x)    e^x              exp(2)       e^2
log(x)    ln x             log(2)       ln 2
sin(x)    sin x            sin(pi/2)    1
cos(x)    cos x            cos(pi)      −1
tan(x)    tan x            tan(pi/4)    1

Example C.1. To graph the function z = 2x^2 + y^2 from x = −1 to x = 1 and from y = −2 to y = 2, type this at the gnuplot> prompt:

splot [-1 : 1][-2 : 2] 2*x**2 + y**2

The result is shown below:

[Figure: surface plot of 2*(x**2) + y**2 over −1 ≤ x ≤ 1, −2 ≤ y ≤ 2]
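To see what a range such as [-1 : 1][-2 : 2] means for the function in Example C.1, it can help to evaluate the function on a grid of points covering that rectangle. The following is only a rough sketch in Python (which happens to use the same * and ** operators); the helper name sample_surface and the grid size n = 11 are assumptions made for illustration and do not describe how Gnuplot samples internally.

```python
def sample_surface(f, x_range, y_range, n=11):
    """Evaluate f on an n-by-n grid covering [a, b] x [c, d]."""
    (a, b), (c, d) = x_range, y_range
    grid = []
    for i in range(n):
        x = a + (b - a) * i / (n - 1)
        row = []
        for j in range(n):
            y = c + (d - c) * j / (n - 1)
            row.append(f(x, y))
        grid.append(row)
    return grid


def f(x, y):
    # Same expression as typed at the gnuplot> prompt: 2*x**2 + y**2
    return 2 * x**2 + y**2


grid = sample_surface(f, (-1, 1), (-2, 2))
z_values = [z for row in grid for z in row]
# Smallest and largest sampled heights over the plotted rectangle.
print(min(z_values), max(z_values))
```

The smallest sampled value occurs at the origin and the largest at the four corners of the rectangle, which indicates the range of heights the plotted surface has to cover.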

Note that we had to type 2*x**2 to multiply 2 times x^2. For clarity, parentheses can be used to make sure the operations are being performed in the correct order:

splot [-1 : 1][-2 : 2] 2*(x**2) + y**2

In the above example, to also plot the function z = e^(x+y) on the same graph, put a comma after the first function then append the new function:

splot [-1 : 1][-2 : 2] 2*(x**2) + y**2, exp(x+y)

By default, the x-axis and y-axis are not shown in the graph. To display the axes, use this command before the splot command:

set zeroaxis

Also, by default the x- and y-axes are switched from their usual position. To show the axes with the orientation which we have used throughout the text, use this command:

set view 60, 120, 1, 1

Also, to label the axes, use these commands:

set xlabel "x"
set ylabel "y"
set zlabel "z"

To show the level curves of the surface z = f (x, y) on both the surface and projected onto the xy-plane, use this command:

set contour both

The default mesh size for the grid on the surface is 10 units. To get more of a colored/shaded surface, increase the mesh size (to, say, 25) like this:

set isosamples 25

Putting all this together, we get the following graph with these commands:

set zeroaxis
set view 60, 120, 1, 1
set xlabel "x"
set ylabel "y"
set zlabel "z"
set contour both
set isosamples 25
splot [-1 : 1][-2 : 2] 2*(x**2) + y**2, exp(x+y)

[Figure: the surfaces 2*(x**2) + y**2 and exp(x+y) with their level curves; axes labeled x, y, z; function key in the upper right corner]

The numbers listed below the functions in the key in the upper right corner of the graph are the “levels” of the level curves of the corresponding surface. That is, they are the numbers c such that f (x, y) = c. If you do not want the function key displayed, it can be turned off with this command:

unset key

PARAMETRIC FUNCTIONS

Gnuplot has the ability to graph surfaces given in various parametric forms. For example, for a surface parametrized in cylindrical coordinates

x = r cos θ , y = r sin θ , z = z

you would do the following:

set mapping cylindrical
set parametric
splot [a : b][c : d] v*cos(u),v*sin(u),f(u,v)

where the variable u represents θ, with a ≤ u ≤ b, the variable v represents r, with c ≤ v ≤ d, and z = f (u, v) is some function of u and v.

Example C.2. The graph of the helicoid z = θ in Example 1.34 from Section 1.7 (p. 49) was created using the following commands:

set mapping cylindrical
set parametric
set view 60, 120, 1, 1
set xyplane 0
set xlabel "x"
set ylabel "y"
set zlabel "z"
unset key
set isosamples 15
splot [0 : 4*pi][0 : 2] v*cos(u),v*sin(u),u

The command set xyplane 0 moves the z-axis so that z = 0 aligns with the xy-plane (which is not the default in Gnuplot). Looking at the graph, you will see that r varies from 0 to 2, and θ varies from 0 to 4π.

PRINTING AND SAVING

In Windows, to print a graph from Gnuplot right-click on the titlebar of the graph’s window, select “Options” and then the “Print...” option. To save a graph, say, as a PNG file, go to the File menu on the main Gnuplot menubar, select “Output Device ...”, and enter png in the Terminal type? textfield, hit OK. Then, in the File menu again, select the “Output ...” option and enter a filename (say, graph.png) in the Output filename? textfield, hit OK. Now run your splot command again and you should see a file called graph.png in the current directory (usually the directory where wgnuplot.exe is located, though you can change that setting using the “Change Directory ...” option in the File menu).

In Linux, to save the graph as a file called graph.png, you would issue the following commands:

set terminal png
set output 'graph.png'

and then run your splot command. There are many terminal types (which determine the output format). Run the command set terminal to see all the possible types. In Linux, the postscript terminal type is popular, since the print quality is high and there are many PostScript viewers available.

To quit Gnuplot, type quit at the gnuplot> command prompt.
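As a cross-check on the helicoid parametrization in Example C.2, the points (v cos u, v sin u, u) can be computed directly. The following is only a sketch in Python; the function name helicoid_points and the 15-by-15 grid are assumptions chosen for illustration (the grid is loosely modeled on the isosamples setting and is not a statement of how Gnuplot samples the surface).

```python
import math


def helicoid_points(u_range=(0.0, 4 * math.pi), v_range=(0.0, 2.0), samples=15):
    """Sample (x, y, z) = (v cos u, v sin u, u) on a samples-by-samples grid."""
    (a, b), (c, d) = u_range, v_range
    points = []
    for i in range(samples):
        u = a + (b - a) * i / (samples - 1)  # u plays the role of theta
        for j in range(samples):
            v = c + (d - c) * j / (samples - 1)  # v plays the role of r
            points.append((v * math.cos(u), v * math.sin(u), u))
    return points


points = helicoid_points()
largest_radius = max(math.hypot(x, y) for x, y, _ in points)
largest_height = max(z for _, _, z in points)
print(len(points), largest_radius, largest_height)
```

The largest distance from the z-axis should come out close to 2 and the largest height close to 4π, matching the ranges of r and θ read off the graph.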

GNU Free Documentation License

Version 1.2, November 2002

Copyright © 2000,2001,2002 Free Software Foundation, Inc.
51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA

Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed.

Preamble

The purpose of this License is to make a manual, textbook, or other functional and useful document "free" in the sense of freedom: to assure everyone the effective freedom to copy and redistribute it, with or without modifying it, either commercially or noncommercially. Secondarily, this License preserves for the author and publisher a way to get credit for their work, while not being considered responsible for modifications made by others.

This License is a kind of "copyleft", which means that derivative works of the document must themselves be free in the same sense. It complements the GNU General Public License, which is a copyleft license designed for free software.

We have designed this License in order to use it for manuals for free software, because free software needs free documentation: a free program should come with manuals providing the same freedoms that the software does. But this License is not limited to software manuals; it can be used for any textual work, regardless of subject matter or whether it is published as a printed book. We recommend this License principally for works whose purpose is instruction or reference.

1. APPLICABILITY AND DEFINITIONS

This License applies to any manual or other work, in any medium, that contains a notice placed by the copyright holder saying it can be distributed under the terms of this License. Such a notice grants a world-wide, royalty-free license, unlimited in duration, to use that work under the conditions stated herein. The "Document", below, refers to any such manual or work. Any member of the public is a licensee, and is addressed as "you". You accept the license if you copy, modify or distribute the work in a way requiring permission under copyright law.

A "Modified Version" of the Document means any work containing the Document or a portion of it, either copied verbatim, or with modifications and/or translated into another language.

A "Secondary Section" is a named appendix or a front-matter section of the Document that deals exclusively with the relationship of the publishers or authors of the Document to the Document’s overall subject (or to related matters) and contains nothing that could fall directly within that overall subject. (Thus, if the Document is in part a textbook of mathematics, a Secondary Section may not explain any mathematics.) The relationship could be a matter of historical connection with the subject or with related matters, or of legal, commercial, philosophical, ethical or political position regarding them.

The "Invariant Sections" are certain Secondary Sections whose titles are designated, as being those of Invariant Sections, in the notice that says that the Document is released under this License. If a section does not fit the above definition of Secondary then it is not allowed to be designated as Invariant. The Document may contain zero Invariant Sections. If the Document does not identify any Invariant Sections then there are none.

The "Cover Texts" are certain short passages of text that are listed, as Front-Cover Texts or Back-Cover Texts, in the notice that says that the Document is released under this License. A Front-Cover Text may be at most 5 words, and a Back-Cover Text may be at most 25 words.

A "Transparent" copy of the Document means a machine-readable copy, represented in a format whose specification is available to the general public, that is suitable for revising the document straightforwardly with generic text editors or (for images composed of pixels) generic paint programs or (for drawings) some widely available drawing editor, and that is suitable for input to text formatters or for automatic translation to a variety of formats suitable for input to text formatters. A copy made in an otherwise Transparent file format whose markup, or absence of markup, has been arranged to thwart or discourage subsequent modification by readers is not Transparent. An image format is not Transparent if used for any substantial amount of text. A copy that is not "Transparent" is called "Opaque".

Examples of suitable formats for Transparent copies include plain ASCII without markup, Texinfo input format, LaTeX input format, SGML or XML using a publicly available DTD, and standard-conforming simple HTML, PostScript or PDF designed for human modification. Examples of transparent image formats include PNG, XCF and JPG. Opaque formats include proprietary formats that can be read and edited only by proprietary word processors, SGML or XML for which the DTD and/or processing tools are not generally available, and the machine-generated HTML, PostScript or PDF produced by some word processors for output purposes only.

The "Title Page" means, for a printed book, the title page itself, plus such following pages as are needed to hold, legibly, the material this License requires to appear in the title page. For works in formats which do not have any title page as such, "Title Page"

means the text near the most prominent appearance of the work’s title, preceding the beginning of the body of the text.

A section "Entitled XYZ" means a named subunit of the Document whose title either is precisely XYZ or contains XYZ in parentheses following text that translates XYZ in another language. (Here XYZ stands for a specific section name mentioned below, such as "Acknowledgments", "Dedications", "Endorsements", or "History".) To "Preserve the Title" of such a section when you modify the Document means that it remains a section "Entitled XYZ" according to this definition.

The Document may include Warranty Disclaimers next to the notice which states that this License applies to the Document. These Warranty Disclaimers are considered to be included by reference in this License, but only as regards disclaiming warranties: any other implication that these Warranty Disclaimers may have is void and has no effect on the meaning of this License.

2. VERBATIM COPYING

You may copy and distribute the Document in any medium, either commercially or noncommercially, provided that this License, the copyright notices, and the license notice saying this License applies to the Document are reproduced in all copies, and that you add no other conditions whatsoever to those of this License. You may not use technical measures to obstruct or control the reading or further copying of the copies you make or distribute. However, you may accept compensation in exchange for copies. If you distribute a large enough number of copies you must also follow the conditions in section 3.

You may also lend copies, under the same conditions stated above, and you may publicly display copies.

3. COPYING IN QUANTITY

If you publish printed copies (or copies in media that commonly have printed covers) of the Document, numbering more than 100, and the Document’s license notice requires Cover Texts, you must enclose the copies in covers that carry, clearly and legibly, all these Cover Texts: Front-Cover Texts on the front cover, and Back-Cover Texts on the back cover. Both covers must also clearly and legibly identify you as the publisher of these copies. The front cover must present the full title with all words of the title equally prominent and visible. You may add other material on the covers in addition. Copying with changes limited to the covers, as long as they preserve the title of the Document and satisfy these conditions, can be treated as verbatim copying in other respects.

If the required texts for either cover are too voluminous to fit legibly, you should put the first ones listed (as many as fit reasonably) on the actual cover, and continue the rest onto adjacent pages.

If you publish or distribute Opaque copies of the Document numbering more than 100, you must either include a machine-readable Transparent copy along with each Opaque copy, or state in or with each Opaque copy a computer-network location from which the general network-using public has access to download using public-standard network protocols a complete Transparent copy of the Document, free of added material. If you use the latter option, you must take reasonably prudent steps, when you begin distribution of Opaque copies in quantity, to ensure that this Transparent copy will remain thus accessible at the stated location until at least one year after the last time you distribute an Opaque copy (directly or through your agents or retailers) of that edition to the public.

It is requested, but not required, that you contact the authors of the Document well before redistributing any large number of copies, to give them a chance to provide you with an updated version of the Document.

4. MODIFICATIONS

You may copy and distribute a Modified Version of the Document under the conditions of sections 2 and 3 above, provided that you release the Modified Version under precisely this License, with the Modified Version filling the role of the Document, thus licensing distribution and modification of the Modified Version to whoever possesses a copy of it. In addition, you must do these things in the Modified Version:

A. Use in the Title Page (and on the covers, if any) a title distinct from that of the Document, and from those of previous versions (which should, if there were any, be listed in the History section of the Document). You may use the same title as a previous version if the original publisher of that version gives permission.

B. List on the Title Page, as authors, one or more persons or entities responsible for authorship of the modifications in the Modified Version, together with at least five of the principal authors of the Document (all of its principal authors, if it has fewer than five), unless they release you from this requirement.

C. State on the Title page the name of the publisher of the Modified Version, as the publisher.

D. Preserve all the copyright notices of the Document.

E. Add an appropriate copyright notice for your modifications adjacent to the other copyright notices.

F. Include, immediately after the copyright notices, a license notice giving the public permission to use the Modified Version under the terms of this License, in the form shown in the Addendum below.

G. Preserve in that license notice the full lists of Invariant Sections and required Cover Texts given in the Document’s license notice.

H. Include an unaltered copy of this License.

I. Preserve the section Entitled "History", Preserve its Title, and add to it an item stating at least the title, year, new authors, and publisher of the Modified Version as given on the Title Page. If there is no section Entitled "History" in the Document, create one stating the title, year, authors, and publisher of the Document as given on its Title Page, then add an item describing the Modified Version as stated in the previous sentence.

J. Preserve the network location, if any, given in the Document for public access to a Transparent copy of the Document, and likewise the network locations given in the Document for previous versions it was based on. These may be placed in the "History" section. You may omit a network location for a work that was published at least four years before the Document itself, or if the original publisher of the version it refers to gives permission.

K. For any section Entitled "Acknowledgments" or "Dedications", Preserve the Title of the section, and preserve in the section all the substance and tone of each of the contributor acknowledgments and/or dedications given therein.

L. Preserve all the Invariant Sections of the Document, unaltered in their text and in their titles. Section numbers or the equivalent are not considered part of the section titles.

M. Delete any section Entitled "Endorsements". Such a section may not be included in the Modified Version.

N. Do not retitle any existing section to be Entitled "Endorsements" or to conflict in title with any Invariant Section.

O. Preserve any Warranty Disclaimers.

If the Modified Version includes new front-matter sections or appendices that qualify as Secondary Sections and contain no material copied from the Document, you may at your option designate some or all of these sections as invariant. To do this, add their titles to the list of Invariant Sections in the Modified Version’s license notice. These titles must be distinct from any other section titles.

You may add a section Entitled "Endorsements", provided it contains nothing but endorsements of your Modified Version by various parties–for example, statements of peer review or that the text has been approved by an organization as the authoritative definition of a standard.

You may add a passage of up to five words as a Front-Cover Text, and a passage of up to 25 words as a Back-Cover Text, to the end of the list of Cover Texts in the Modified Version. Only one passage of Front-Cover Text and one of Back-Cover Text may be added by (or through arrangements made by) any one entity. If the Document already includes a cover text for the same cover, previously added by you or by arrangement made by the same entity you are acting on behalf of, you may not add another; but you may replace the old one, on explicit permission from the previous publisher that added the old one.

The author(s) and publisher(s) of the Document do not by this License give permission to use their names for publicity for or to assert or imply endorsement of any Modified Version.

5. COMBINING DOCUMENTS

You may combine the Document with other documents released under this License, under the terms defined in section 4 above for modified versions, provided that you include in the combination all of the Invariant Sections of all of the original documents, unmodified, and list them all as Invariant Sections of your combined work in its license notice, and that you preserve all their Warranty Disclaimers.

The combined work need only contain one copy of this License, and multiple identical Invariant Sections may be replaced with a single copy. If there are multiple Invariant Sections with the same name but different contents, make the title of each such section unique by adding at the end of it, in parentheses, the name of the original author or publisher of that section if known, or else a unique number. Make the same adjustment to the section titles in the list of Invariant Sections in the license notice of the combined work.

In the combination, you must combine any sections Entitled "History" in the various original documents, forming one section Entitled "History"; likewise combine any sections Entitled "Acknowledgments", and any sections Entitled "Dedications". You must delete all sections Entitled "Endorsements".

6. COLLECTIONS OF DOCUMENTS

You may make a collection consisting of the Document and other documents released under this License, and replace the individual copies of this License in the various documents with a single copy that is included in the collection, provided that you follow the rules of this License for verbatim copying of each of the documents in all other respects.

You may extract a single document from such a collection, and distribute it individually under this License, provided you insert a copy of this License into the extracted document, and follow this License in all other respects regarding verbatim copying of that document.

7. AGGREGATION WITH INDEPENDENT WORKS

A compilation of the Document or its derivatives with other separate and independent documents or works, in or on a volume of a storage or distribution medium, is called an "aggregate" if the copyright resulting from the compilation is not used to limit the legal rights of the compilation’s users beyond what the individual works permit. When the Document is included in an aggregate, this License does not apply to the other works in the aggregate which are not themselves derivative works of the Document.

If the Cover Text requirement of section 3 is applicable to these copies of the Document, then if the Document is less than one half of the entire aggregate, the Document’s Cover Texts may be placed on covers that bracket the Document within the aggregate, or the electronic equivalent of covers if the Document is in electronic form. Otherwise they must appear on printed covers that bracket the whole aggregate.

8. TRANSLATION

Translation is considered a kind of modification, so you may distribute translations of the Document under the terms of section 4. Replacing Invariant Sections with translations requires special permission from their copyright holders, but you may include translations of some or all Invariant Sections in addition to the original versions of these Invariant Sections. You may include a translation of this License, and all the license notices in the Document, and any Warranty Disclaimers, provided that you also include the original English version of this License and the original versions of those notices and disclaimers. In case of a disagreement between the translation and the original version of this License or a notice or disclaimer, the original version will prevail.

If a section in the Document is Entitled "Acknowledgments", "Dedications", or "History", the requirement (section 4) to Preserve its Title (section 1) will typically require changing the actual title.

9. TERMINATION

You may not copy, modify, sublicense, or distribute the Document except as expressly provided for under this License. Any other attempt to copy, modify, sublicense or distribute the Document is void, and will automatically terminate your rights under this License. However, parties who have received copies, or rights, from you under this License will not have their licenses terminated so long as such parties remain in full compliance.

10. FUTURE REVISIONS OF THIS LICENSE

The Free Software Foundation may publish new, revised versions of the GNU Free Documentation License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns. See http://www.gnu.org/copyleft/.

Each version of the License is given a distinguishing version number. If the Document specifies that a particular numbered version of this License "or any later version" applies to it, you have the option of following the terms and conditions either of that specified version or of any later version that has been published (not as a draft) by the Free Software Foundation. If the Document does not specify a version number of this License, you may choose any version ever published (not as a draft) by the Free Software Foundation.

ADDENDUM: How to use this License for your documents

To use this License in a document you have written, include a copy of the License in the document and put the following copyright and license notices just after the title page:

Copyright © YEAR YOUR NAME. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the section entitled "GNU Free Documentation License".

If you have Invariant Sections, Front-Cover Texts and Back-Cover Texts, replace the "with...Texts." line with this:

with the Invariant Sections being LIST THEIR TITLES, with the Front-Cover Texts being LIST, and with the Back-Cover Texts being LIST.

If you have Invariant Sections without Cover Texts, or some other combination of the three, merge those two alternatives to suit the situation.

If your document contains nontrivial examples of program code, we recommend releasing these examples in parallel under your choice of free software license, such as the GNU General Public License, to permit their use in free software.

History

This section contains the revision history of the book. For persons making modifications to the book, please record the pertinent information here, following the format in the first item below.

1. VERSION: 1.0
   Date: 2008-01-04
   Author(s): Michael Corral
   Title: Vector Calculus
   Modification(s): Initial version

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64 outward normal . 30 Jacobian . . 34 parametric representation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31 intersection of planes . . . . . . . . 54. . 139 local maximum . . 124. . . . 175 piecewise smooth curve . . . . . . . . . . . . . . . . . . 169 Newton’s algorithm . . . . 55 Monte Carlo method . . . . . . 52 line . . . . . . . . . . . . . .129 projection. . . . . . . . . . 33 vector representation . . . . . . . . . . . . . . . . . . . . . . . 168 orthonormal vectors . . . . . . . . . . . . . . . . . . . . . . 60 parametrization . . . . . . . . . . . . . 44 hyperbolic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34 skew. . . . . . . . . . . . . . . . 31 perpendicular . . . . 128 probability density function. . . . . . . . . . . . . . 38 normal form . . . . 136. . . . . . . . . . . . . . . . . . . . . . . . . . . . 124 Laplacian . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83 P paraboloid . . . 146. . . . . . . . . . . . . . . . . . . . . . . . . . . 101 multiply connected . . 35 normal vector . . . . . . 141 plane . . . . . . . . . . . 148 probability . . . . . . . 75 through three points . . . . . . . . . . . . . . . . . . . . . . 38 parallel . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .44 parallelepiped . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154. . . . . . . . . . . . . . . . . . . . . 44. . . . . . 178. . . . . . . . . . . . . . . 31. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .212 iterated integral . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55. . . . . . . . . . . . . . . . . 
43 N n-positive direction . . . . . . . . . . . . . . . . . . . . . . . . 44 elliptic . . . . . . . . . . . . . . 25 parameter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2. . . . . . . . . . . . . . . . . . . . . 3. . . . . . . . . . . 129 . . . . . . . . . . 85 sample space . . . 64 scalar multiplication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15 basis . . . . . . . . . . . . 9 unit . 18 triple integral . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134 second-degree equation . . . . . . . . . 10 tangent . . . . . . . . . . . . . . . . . . . . .65 S saddle point . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175 smooth function . . . . . 35. . . . . . . . . . . . . . . . . . . 110 cylindrical coordinates . . . . . . . . . .Index right-hand rule . 168 smooth . . . . . 128 scalar . . . . . . . . . . 25 Second Derivative Test . . . . . . . . . . . . . . . . . . . . .25 velocity . . . . . . . . . . . . . . . . . . . . . . . 18 sphere . . . . . . . . . . . 12 components . . . . 158 trace . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9 perpendicular . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128 unit disk. . . . . . . . . . . . . . . . . . . . . . . 166 Z zenith angle . . . . . . . . . . . . . . 17 positive unit normal . . . . . . . . . . . . 110 T tangent plane . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5. . . . . . . . . . . . . . . . . . . . . . . 46 Stokes’ Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . 168 ruled . . . 122 spherical coordinates . . . . . . . . . . . . . . . . . . . . . . . . . . 12 scalar function . . . . . . . . . . . . . . . . . . . . . . . 43 simple closed curve . . . . . . . . 9 angle between . . 154. . . . 
. . . . . . 47 U uniform density . . . . . . 55 volume element . . 12 unit binormal B . . . . . . . . . 169 surface . . . . . . . . . . . . . . . . . . 95 stereographic projection . . . . . . . . . . . . . . . . . 74 work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45 213 uniformly distributed . . . . . . 138 normal . . . . . 13 direction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9 combination . . . . 3. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122 W wave equation . . . . . . . . . . . . . . . 3 magnitude . . . . . . . . . . . . . . 124 uniform distribution . 40 spherical spiral . . . . . . . . . . . . 42 triangle inequality . . . . . . 145 simply connected . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64 zero . . . . . 75 torus . . . . . . . 156. . . . . . . . . . . . . . . . . . . . . . . . . . . 7 normal . . . . . . . . . . . 150 vector triple product. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9 subtraction . . . . . . . . . . . . . . 192 ruled surface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16. . . . . . . . . . . . . . . . . . . 40 doubly ruled . . . . 160 normalized . . . . 169 principal normal N . 54 standard normal distribution . . . 4 vector field . . . . . 64 unit tangent T . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52 translation . . . 21. . . . . . . . . . . . . . . . . . . . . . . 84 solenoidal . . . . . . . . . . . . . . . . . . . . . . . . . . 168. . . 134 vector . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45 two-sided . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 168 surface integral . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158 V variance . 135. . . . . . . . . . . . . . . . 3 addition . . . . . . . . . . . . . . . 53 scalar triple product . . . . 
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163 span . . . . . 12 parallel . . . . . . . . . . . . . . . . . . . 130 steepest descent . . . . . 45 orientable . . . . . . . . . 84 second moment .