
Vector Calculus

Michael Corral

Schoolcraft College

About the author:

Michael Corral is an Adjunct Faculty member of the Department of Mathematics at

Schoolcraft College. He received a B.A. in Mathematics from the University of California

at Berkeley, and received an M.A. in Mathematics and an M.S. in Industrial & Operations

Engineering from the University of Michigan.

This text was typeset in LaTeX 2ε with the KOMA-Script bundle, using the GNU Emacs text

editor on a Fedora Linux system. The graphics were created using MetaPost, PGF, and

Gnuplot.

Copyright © 2008 Michael Corral.

Permission is granted to copy, distribute and/or modify this document under the terms of the

GNU Free Documentation License, Version 1.2 or any later version published by the Free

Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover

Texts. A copy of the license is included in the section entitled “GNU Free Documentation

License”.

Preface

This book covers calculus in two and three variables. It is suitable for a one-semester course,

normally known as “Vector Calculus”, “Multivariable Calculus”, or simply “Calculus III”.

The prerequisites are the standard courses in single-variable calculus (a.k.a. Calculus I and

II).

I have tried to be somewhat rigorous about proving results. But while it is important for

students to see full-blown proofs - since that is how mathematics works - too much rigor and

emphasis on proofs can impede the ﬂow of learning for the vast majority of the audience at

this level. If I were to rate the level of rigor in the book on a scale of 1 to 10, with 1 being

completely informal and 10 being completely rigorous, I would rate it as a 5.

There are 420 exercises throughout the text, which in my experience are more than

enough for a semester course in this subject. There are exercises at the end of each section,

divided into three categories: A, B and C. The A exercises are mostly of a routine

computational nature, the B exercises are slightly more involved, and the C exercises usually

require some effort or insight to solve. A crude way of describing A, B and C would be

“Easy”, “Moderate” and “Challenging”, respectively. However, many of the B exercises are

easy and not all the C exercises are difﬁcult.

There are a few exercises that require the student to write his or her own computer program

to solve some numerical approximation problems (e.g. the Monte Carlo method for

approximating multiple integrals, in Section 3.4). The code samples in the text are in the

Java programming language, hopefully with enough comments so that the reader can ﬁgure

out what is being done even without knowing Java. Those exercises do not mandate the use

of Java, so students are free to implement the solutions using the language of their choice.

While it would have been simple to use a scripting language like Python, and perhaps even

easier with a functional programming language (such as Haskell or Scheme), Java was chosen

due to its ubiquity, relatively clear syntax, and easy availability for multiple platforms.

Answers and hints to most odd-numbered and some even-numbered exercises are provided

in Appendix A. Appendix B contains a proof of the right-hand rule for the cross product,

which seems to have virtually disappeared from calculus texts over the last few decades.

Appendix C contains a brief tutorial on Gnuplot for graphing functions of two variables.

This book is released under the GNU Free Documentation License (GFDL), which allows

others to not only copy and distribute the book but also to modify it. For more details, see

the included copy of the GFDL. So that there is no ambiguity on this matter, anyone can

make as many copies of this book as desired and distribute it as desired, without needing

my permission. The PDF version will always be freely available to the public at no cost

(go to http://www.mecmath.net). Feel free to contact me at mcorral@schoolcraft.edu for


any questions on this or any other matter involving the book (e.g. comments, suggestions,

corrections, etc). I welcome your input.

Finally, I would like to thank my students in Math 240 for being the guinea pigs for the

initial draft of this book, and for ﬁnding the numerous errors and typos it contained.

January 2008 MICHAEL CORRAL

Contents

Preface

1 Vectors in Euclidean Space
1.1 Introduction
1.2 Vector Algebra
1.3 Dot Product
1.4 Cross Product
1.5 Lines and Planes
1.6 Surfaces
1.7 Curvilinear Coordinates
1.8 Vector-Valued Functions
1.9 Arc Length

2 Functions of Several Variables
2.1 Functions of Two or Three Variables
2.2 Partial Derivatives
2.3 Tangent Plane to a Surface
2.4 Directional Derivatives and the Gradient
2.5 Maxima and Minima
2.6 Unconstrained Optimization: Numerical Methods
2.7 Constrained Optimization: Lagrange Multipliers

3 Multiple Integrals
3.1 Double Integrals
3.2 Double Integrals Over a General Region
3.3 Triple Integrals
3.4 Numerical Approximation of Multiple Integrals
3.5 Change of Variables in Multiple Integrals
3.6 Application: Center of Mass
3.7 Application: Probability and Expected Value

4 Line and Surface Integrals
4.1 Line Integrals
4.2 Properties of Line Integrals
4.3 Green’s Theorem
4.4 Surface Integrals and the Divergence Theorem
4.5 Stokes’ Theorem
4.6 Gradient, Divergence, Curl and Laplacian

Bibliography

Appendix A: Answers and Hints to Selected Exercises

Appendix B: Proof of the Right-Hand Rule for the Cross Product

Appendix C: 3D Graphing with Gnuplot

GNU Free Documentation License

History

Index

1 Vectors in Euclidean Space

1.1 Introduction

In single-variable calculus, the functions that one encounters are functions of a variable (usually x or t) that varies over some subset of the real number line (which we denote by $\mathbb{R}$). For such a function, say, $y = f(x)$, the graph of the function f consists of the points $(x, y) = (x, f(x))$. These points lie in the Euclidean plane, which, in the Cartesian or rectangular coordinate system, consists of all ordered pairs of real numbers (a, b). We use the word “Euclidean” to denote a system in which all the usual rules of Euclidean geometry hold. We denote the Euclidean plane by $\mathbb{R}^2$; the “2” represents the number of dimensions of the plane. The Euclidean plane has two perpendicular coordinate axes: the x-axis and the y-axis.

In vector (or multivariable) calculus, we will deal with functions of two or three variables (usually x, y or x, y, z, respectively). The graph of a function of two variables, say, $z = f(x, y)$, lies in Euclidean space, which in the Cartesian coordinate system consists of all ordered triples of real numbers (a, b, c). Since Euclidean space is 3-dimensional, we denote it by $\mathbb{R}^3$. The graph of f consists of the points $(x, y, z) = (x, y, f(x, y))$. The 3-dimensional coordinate system of Euclidean space can be represented on a flat surface, such as this page or a blackboard, only by giving the illusion of three dimensions, in the manner shown in Figure 1.1.1. Euclidean space has three mutually perpendicular coordinate axes (x, y and z), and three mutually perpendicular coordinate planes: the xy-plane, yz-plane and xz-plane (see Figure 1.1.2).

[Figure 1.1.1: The point P(a, b, c) plotted against the x-, y- and z-axes]

[Figure 1.1.2: The coordinate planes: xy-plane, yz-plane and xz-plane]

The coordinate system shown in Figure 1.1.1 is known as a right-handed coordinate

system, because it is possible, using the right hand, to point the index ﬁnger in the positive

direction of the x-axis, the middle ﬁnger in the positive direction of the y-axis, and the thumb

in the positive direction of the z-axis, as in Figure 1.1.3.

[Figure 1.1.3: Right-handed coordinate system]

An equivalent way of defining a right-handed system is if you can point your thumb upwards

in the positive z-axis direction while using the remaining four fingers to rotate the

x-axis towards the y-axis. Doing the same thing with the left hand is what defines a left-handed

coordinate system. Notice that switching the x- and y-axes in a right-handed

system results in a left-handed system, and that rotating either type of system does not

change its “handedness”. Throughout the book we will use a right-handed system.

For functions of three variables, the graphs exist in 4-dimensional space (i.e. $\mathbb{R}^4$), which we cannot see in our 3-dimensional space, let alone simulate in 2-dimensional space. So we can only think of 4-dimensional space abstractly. For an entertaining discussion of this subject, see the book by ABBOTT.¹

So far, we have discussed the position of an object in 2-dimensional or 3-dimensional space.

But what about something such as the velocity of the object, or its acceleration? Or the

gravitational force acting on the object? These phenomena all seem to involve motion and

direction in some way. This is where the idea of a vector comes in.

¹ One thing you will learn is why a 4-dimensional creature would be able to reach inside an egg and remove the yolk without cracking the shell!

You have already dealt with velocity and acceleration in single-variable calculus. For example, for motion along a straight line, if $y = f(t)$ gives the displacement of an object after time t, then $dy/dt = f'(t)$ is the velocity of the object at time t. The derivative $f'(t)$ is just a number, which is positive if the object is moving in an agreed-upon “positive” direction, and negative if it moves in the opposite of that direction. So you can think of that number, which was called the velocity of the object, as having two components: a magnitude, indicated by a nonnegative number, preceded by a direction, indicated by a plus or minus symbol (representing motion in the positive direction or the negative direction, respectively), i.e. $f'(t) = \pm a$ for some number $a \ge 0$. Then a is the magnitude of the velocity (normally called the speed of the object), and the ± represents the direction of the velocity (though the + is usually omitted for the positive direction).

For motion along a straight line, i.e. in a 1-dimensional space, the velocities are also contained

in that 1-dimensional space, since they are just numbers. For general motion along a

curve in 2- or 3-dimensional space, however, velocity will need to be represented by a multidimensional

object which should have both a magnitude and a direction. A geometric object

which has those features is an arrow, which in elementary geometry is called a “directed line

segment”. This is the motivation for how we will deﬁne a vector.

Definition 1.1. A (nonzero) vector is a directed line segment drawn from a point P (called its initial point) to a point Q (called its terminal point), with P and Q being distinct points. The vector is denoted by $\overrightarrow{PQ}$. Its magnitude is the length of the line segment, denoted by $\lVert \overrightarrow{PQ} \rVert$, and its direction is the same as that of the directed line segment. The zero vector is just a point, and it is denoted by $\mathbf{0}$.

To indicate the direction of a vector, we draw an arrow from its initial point to its terminal

point. We will often denote a vector by a single bold-faced letter (e.g. v) and use the terms

“magnitude” and “length” interchangeably. Note that our deﬁnition could apply to systems

with any number of dimensions (see Figure 1.1.4 (a)-(c)).

[Figure 1.1.4: Vectors in different dimensions. (a) One dimension; (b) Two dimensions; (c) Three dimensions]

A few things need to be noted about the zero vector. Our motivation for what a vector is

included the notions of magnitude and direction. What is the magnitude of the zero vector?

We define it to be zero, i.e. $\lVert \mathbf{0} \rVert = 0$. This agrees with the definition of the zero vector as just

a point, which has zero length. What about the direction of the zero vector? A single point

really has no well-deﬁned direction. Notice that we were careful to only deﬁne the direction

of a nonzero vector, which is well-deﬁned since the initial and terminal points are distinct.

Not everyone agrees on the direction of the zero vector. Some contend that the zero vector

has arbitrary direction (i.e. can take any direction), some say that it has indeterminate direction (i.e. the direction cannot be determined), while others say that it has no direction. Our definition of the zero vector, however, does not require it to have a direction, and we will leave it at that.²

Now that we know what a vector is, we need a way of determining when two vectors are

equal. This leads us to the following deﬁnition.

Deﬁnition 1.2. Two nonzero vectors are equal if they have the same magnitude and the

same direction. Any vector with zero magnitude is equal to the zero vector.

By this deﬁnition, vectors with the same magnitude and direction but with different initial

points would be equal. For example, in Figure 1.1.5 the vectors u, v and w all have the same magnitude $\sqrt{5}$ (by the Pythagorean Theorem). And we see that u and w are parallel, since they lie on lines having the same slope $\frac{1}{2}$, and they point in the same direction. So $\mathbf{u} = \mathbf{w}$, even though they have different initial points. We also see that v is parallel to u but points in the opposite direction. So $\mathbf{u} \neq \mathbf{v}$.

[Figure 1.1.5: The vectors u, v and w in the xy-plane]

So we can see that there are an inﬁnite number of vectors for a given magnitude and

direction, those vectors all being equal and differing only by their initial and terminal points.

Is there a single vector which we can choose to represent all those equal vectors? The answer

is yes, and is suggested by the vector w in Figure 1.1.5.

² In the subject of linear algebra there is a more abstract way of defining a vector where the concept of “direction” is not really used. See ANTON and RORRES.

Unless otherwise indicated, when speaking of “the vector” with a given magnitude and

direction, we will mean the one whose initial point is at the origin of the coordinate

system.

Thinking of vectors as starting from the origin provides a way of dealing with vectors in

a standard way, since every coordinate system has an origin. But there will be times when

it is convenient to consider a different initial point for a vector (for example, when adding

vectors, which we will do in the next section).

Another advantage of using the origin as the initial point is that it provides an easy correspondence

between a vector and its terminal point.

Example 1.1. Let v be the vector in $\mathbb{R}^3$ whose initial point is at the origin and whose terminal point is (3, 4, 5). Though the point (3, 4, 5) and the vector v are different objects, it is convenient to write $\mathbf{v} = (3, 4, 5)$. When doing this, it is understood that the initial point of v is at the origin (0, 0, 0) and the terminal point is (3, 4, 5).

[Figure 1.1.6: Correspondence between points and vectors. (a) The point (3, 4, 5); (b) The vector (3, 4, 5)]

Unless otherwise stated, when we refer to vectors as $\mathbf{v} = (a, b)$ in $\mathbb{R}^2$ or $\mathbf{v} = (a, b, c)$ in $\mathbb{R}^3$, we mean vectors in Cartesian coordinates starting at the origin. Also, we will write the zero vector $\mathbf{0}$ in $\mathbb{R}^2$ and $\mathbb{R}^3$ as (0, 0) and (0, 0, 0), respectively.

The point-vector correspondence provides an easy way to check if two vectors are equal,

without having to determine their magnitude and direction. Similar to seeing if two points

are the same, you are now seeing if the terminal points of vectors starting at the origin

are the same. For each vector, ﬁnd the (unique!) vector it equals whose initial point is

the origin. Then compare the coordinates of the terminal points of these “new” vectors: if

those coordinates are the same, then the original vectors are equal. To get the “new” vectors

starting at the origin, you translate each vector to start at the origin by subtracting the

coordinates of the original initial point from the original terminal point. The resulting point

will be the terminal point of the “new” vector whose initial point is the origin. Do this for

each original vector then compare.

Example 1.2. Consider the vectors $\overrightarrow{PQ}$ and $\overrightarrow{RS}$ in $\mathbb{R}^3$, where $P = (2, 1, 5)$, $Q = (3, 5, 7)$, $R = (1, -3, -2)$ and $S = (2, 1, 0)$. Does $\overrightarrow{PQ} = \overrightarrow{RS}$?

Solution: The vector $\overrightarrow{PQ}$ is equal to the vector v with initial point (0, 0, 0) and terminal point

$$Q - P = (3, 5, 7) - (2, 1, 5) = (3 - 2, 5 - 1, 7 - 5) = (1, 4, 2).$$

Similarly, $\overrightarrow{RS}$ is equal to the vector w with initial point (0, 0, 0) and terminal point

$$S - R = (2, 1, 0) - (1, -3, -2) = (2 - 1, 1 - (-3), 0 - (-2)) = (1, 4, 2).$$

So $\overrightarrow{PQ} = \mathbf{v} = (1, 4, 2)$ and $\overrightarrow{RS} = \mathbf{w} = (1, 4, 2)$.

$\therefore \overrightarrow{PQ} = \overrightarrow{RS}$

[Figure 1.1.7: Translating $\overrightarrow{PQ}$ to v and $\overrightarrow{RS}$ to w, with P(2, 1, 5), Q(3, 5, 7), R(1, −3, −2), S(2, 1, 0) and v = w = (1, 4, 2)]
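The translate-and-compare procedure used in Example 1.2 can be sketched in Java. This is only an illustration: the class name `VectorEquality` and its method names are hypothetical and do not come from the text, and the exact comparison of coordinates is reasonable here only because the coordinates are whole numbers (with general floating-point data a small tolerance would be safer).

```java
import java.util.Arrays;

// Illustrative sketch; the class and method names are assumptions, not from the text.
class VectorEquality {
    // Translate the vector from `initial` to `terminal` so that it starts at the
    // origin. The returned array holds the terminal point of the translated vector.
    static double[] translateToOrigin(double[] initial, double[] terminal) {
        double[] result = new double[initial.length];
        for (int i = 0; i < initial.length; i++) {
            result[i] = terminal[i] - initial[i];
        }
        return result;
    }

    // The vectors PQ and RS are equal when their translated copies
    // have the same terminal point.
    static boolean sameVector(double[] p, double[] q, double[] r, double[] s) {
        return Arrays.equals(translateToOrigin(p, q), translateToOrigin(r, s));
    }

    public static void main(String[] args) {
        // The points from Example 1.2.
        double[] p = {2, 1, 5}, q = {3, 5, 7}, r = {1, -3, -2}, s = {2, 1, 0};
        System.out.println(Arrays.toString(translateToOrigin(p, q))); // [1.0, 4.0, 2.0]
        System.out.println(Arrays.toString(translateToOrigin(r, s))); // [1.0, 4.0, 2.0]
        System.out.println(sameVector(p, q, r, s));                   // true
    }
}
```

Both translated vectors have terminal point (1, 4, 2), in agreement with the solution of Example 1.2.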

Recall the distance formula for points in the Euclidean plane:

For points $P = (x_1, y_1)$, $Q = (x_2, y_2)$ in $\mathbb{R}^2$, the distance d between P and Q is:

$$d = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2} \tag{1.1}$$

By this formula, we have the following result:

For a vector $\overrightarrow{PQ}$ in $\mathbb{R}^2$ with initial point $P = (x_1, y_1)$ and terminal point $Q = (x_2, y_2)$, the magnitude of $\overrightarrow{PQ}$ is:

$$\lVert \overrightarrow{PQ} \rVert = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2} \tag{1.2}$$

Finding the magnitude of a vector $\mathbf{v} = (a, b)$ in $\mathbb{R}^2$ is a special case of formula (1.2) with $P = (0, 0)$ and $Q = (a, b)$:

For a vector $\mathbf{v} = (a, b)$ in $\mathbb{R}^2$, the magnitude of v is:

$$\lVert \mathbf{v} \rVert = \sqrt{a^2 + b^2} \tag{1.3}$$

To calculate the magnitude of vectors in $\mathbb{R}^3$, we need a distance formula for points in

Euclidean space (we will postpone the proof until the next section):

Theorem 1.1. The distance d between points $P = (x_1, y_1, z_1)$ and $Q = (x_2, y_2, z_2)$ in $\mathbb{R}^3$ is:

$$d = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2} \tag{1.4}$$

The proof will use the following result:

Theorem 1.2. For a vector $\mathbf{v} = (a, b, c)$ in $\mathbb{R}^3$, the magnitude of v is:

$$\lVert \mathbf{v} \rVert = \sqrt{a^2 + b^2 + c^2} \tag{1.5}$$

Proof: There are four cases to consider:

Case 1: $a = b = c = 0$. Then $\mathbf{v} = \mathbf{0}$, so $\lVert \mathbf{v} \rVert = 0 = \sqrt{0^2 + 0^2 + 0^2} = \sqrt{a^2 + b^2 + c^2}$.

Case 2: exactly two of a, b, c are 0. Without loss of generality, we assume that $a = b = 0$ and $c \neq 0$ (the other two possibilities are handled in a similar manner). Then $\mathbf{v} = (0, 0, c)$, which is a vector of length $|c|$ along the z-axis. So $\lVert \mathbf{v} \rVert = |c| = \sqrt{c^2} = \sqrt{0^2 + 0^2 + c^2} = \sqrt{a^2 + b^2 + c^2}$.

Case 3: exactly one of a, b, c is 0. Without loss of generality, we assume that $a = 0$, $b \neq 0$ and $c \neq 0$ (the other two possibilities are handled in a similar manner). Then $\mathbf{v} = (0, b, c)$, which is a vector in the yz-plane, so by the Pythagorean Theorem we have $\lVert \mathbf{v} \rVert = \sqrt{b^2 + c^2} = \sqrt{0^2 + b^2 + c^2} = \sqrt{a^2 + b^2 + c^2}$.

Case 4: none of a, b, c are 0. Without loss of generality, we can assume that a, b, c are all positive (the other seven possibilities are handled in a similar manner). Consider the points $P = (0, 0, 0)$, $Q = (a, b, c)$, $R = (a, b, 0)$, and $S = (a, 0, 0)$, as shown in Figure 1.1.8. Applying the Pythagorean Theorem to the right triangle $\triangle PSR$ gives $|PR|^2 = a^2 + b^2$. A second application of the Pythagorean Theorem, this time to the right triangle $\triangle PQR$, gives $\lVert \mathbf{v} \rVert = |PQ| = \sqrt{|PR|^2 + |QR|^2} = \sqrt{a^2 + b^2 + c^2}$.

This proves the theorem. QED

[Figure 1.1.8: The vector v from P to Q(a, b, c), with the points R and S and the side lengths a, b, c]

Example 1.3. Calculate the following:

(a) The magnitude of the vector $\overrightarrow{PQ}$ in $\mathbb{R}^2$ with $P = (-1, 2)$ and $Q = (5, 5)$.

Solution: By formula (1.2), $\lVert \overrightarrow{PQ} \rVert = \sqrt{(5 - (-1))^2 + (5 - 2)^2} = \sqrt{36 + 9} = \sqrt{45} = 3\sqrt{5}$.

(b) The magnitude of the vector $\mathbf{v} = (8, 3)$ in $\mathbb{R}^2$.

Solution: By formula (1.3), $\lVert \mathbf{v} \rVert = \sqrt{8^2 + 3^2} = \sqrt{73}$.

(c) The distance between the points $P = (2, -1, 4)$ and $Q = (4, 2, -3)$ in $\mathbb{R}^3$.

Solution: By formula (1.4), the distance $d = \sqrt{(4 - 2)^2 + (2 - (-1))^2 + (-3 - 4)^2} = \sqrt{4 + 9 + 49} = \sqrt{62}$.

(d) The magnitude of the vector $\mathbf{v} = (5, 8, -2)$ in $\mathbb{R}^3$.

Solution: By formula (1.5), $\lVert \mathbf{v} \rVert = \sqrt{5^2 + 8^2 + (-2)^2} = \sqrt{25 + 64 + 4} = \sqrt{93}$.
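Formulas (1.1) through (1.5) all reduce to one computation, the square root of a sum of squares, so they can be checked numerically with a few lines of Java. The sketch below is only an illustration; the class name `VectorMagnitude` and the method names `magnitude` and `distance` are hypothetical choices, not code from the text.

```java
// Illustrative sketch; the class and method names are assumptions, not from the text.
class VectorMagnitude {
    // Formulas (1.3) and (1.5): the magnitude is the square root of the
    // sum of the squared components.
    static double magnitude(double... components) {
        double sumOfSquares = 0.0;
        for (double c : components) {
            sumOfSquares += c * c;
        }
        return Math.sqrt(sumOfSquares);
    }

    // Formulas (1.1), (1.2) and (1.4): the distance between P and Q is the
    // magnitude of the vector of coordinate differences Q - P.
    static double distance(double[] p, double[] q) {
        double[] difference = new double[p.length];
        for (int i = 0; i < p.length; i++) {
            difference[i] = q[i] - p[i];
        }
        return magnitude(difference);
    }

    public static void main(String[] args) {
        // The four parts of Example 1.3, printed as decimal approximations.
        System.out.println(distance(new double[]{-1, 2}, new double[]{5, 5}));        // 3*sqrt(5)
        System.out.println(magnitude(8, 3));                                          // sqrt(73)
        System.out.println(distance(new double[]{2, -1, 4}, new double[]{4, 2, -3})); // sqrt(62)
        System.out.println(magnitude(5, 8, -2));                                      // sqrt(93)
    }
}
```

Because `distance` is written in terms of `magnitude`, the same code handles points in two or three dimensions, mirroring the way formula (1.3) is a special case of formula (1.2).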

Exercises

A

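1. Calculate the magnitudes of the following vectors:

(a) $\mathbf{v} = (2, -1)$ (b) $\mathbf{v} = (2, -1, 0)$ (c) $\mathbf{v} = (3, 2, -2)$ (d) $\mathbf{v} = (0, 0, 1)$ (e) $\mathbf{v} = (6, 4, -4)$

2. For the points $P = (1, -1, 1)$, $Q = (2, -2, 2)$, $R = (2, 0, 1)$, $S = (3, -1, 2)$, does $\overrightarrow{PQ} = \overrightarrow{RS}$?

3. For the points $P = (0, 0, 0)$, $Q = (1, 3, 2)$, $R = (1, 0, 1)$, $S = (2, 3, 4)$, does $\overrightarrow{PQ} = \overrightarrow{RS}$?

B

4. Let $\mathbf{v} = (1, 0, 0)$ and $\mathbf{w} = (a, 0, 0)$ be vectors in $\mathbb{R}^3$. Show that $\lVert \mathbf{w} \rVert = |a| \, \lVert \mathbf{v} \rVert$.

5. Let $\mathbf{v} = (a, b, c)$ and $\mathbf{w} = (3a, 3b, 3c)$ be vectors in $\mathbb{R}^3$. Show that $\lVert \mathbf{w} \rVert = 3 \lVert \mathbf{v} \rVert$.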

C

[Figure 1.1.9: The points $P(x_1, y_1, z_1)$, $Q(x_2, y_2, z_2)$, $R(x_2, y_2, z_1)$, $S(x_1, y_1, 0)$, $T(x_2, y_2, 0)$ and $U(x_2, y_1, 0)$]

6. Though we will see a simple proof of Theorem 1.1 in the next section, it is possible to prove it using methods similar to those in the proof of Theorem 1.2. Prove the special case of Theorem 1.1 where the points $P = (x_1, y_1, z_1)$ and $Q = (x_2, y_2, z_2)$ satisfy the following conditions:

$x_2 > x_1 > 0$, $y_2 > y_1 > 0$, and $z_2 > z_1 > 0$.

(Hint: Think of Case 4 in the proof of Theorem 1.2, and consider Figure 1.1.9.)

1.2 Vector Algebra

Now that we know what vectors are, we can start to perform some of the usual algebraic

operations on them (e.g. addition, subtraction). Before doing that, we will introduce the

notion of a scalar.

Deﬁnition 1.3. A scalar is a quantity that can be represented by a single number.

For our purposes, scalars will always be real numbers.³ Examples of scalar quantities are mass, electric charge, and speed (not velocity).⁴ We can now define scalar multiplication of a vector.

Definition 1.4. For a scalar k and a nonzero vector v, the scalar multiple of v by k, denoted by $k\mathbf{v}$, is the vector whose magnitude is $|k| \, \lVert \mathbf{v} \rVert$, points in the same direction as v if $k > 0$, points in the opposite direction as v if $k < 0$, and is the zero vector $\mathbf{0}$ if $k = 0$. For the zero vector $\mathbf{0}$, we define $k\mathbf{0} = \mathbf{0}$ for any scalar k.

Two vectors v and w are parallel (denoted by $\mathbf{v} \parallel \mathbf{w}$) if one is a scalar multiple of the other.

You can think of scalar multiplication of a vector as stretching or shrinking the vector, and

as ﬂipping the vector in the opposite direction if the scalar is a negative number (see Figure

1.2.1).

[Figure 1.2.1: The vector v and its scalar multiples 2v, 3v, 0.5v, −v and −2v]

Recall that translating a nonzero vector means that the initial point of the vector is

changed but the magnitude and direction are preserved. We are now ready to deﬁne the

sum of two vectors.

Deﬁnition 1.5. The sum of vectors v and w, denoted by v+w, is obtained by translating

w so that its initial point is at the terminal point of v; the initial point of v+w is the initial

point of v, and its terminal point is the new terminal point of w.

³ The term scalar was invented by 19th century Irish mathematician, physicist and astronomer William Rowan Hamilton, to convey the sense of something that could be represented by a point on a scale or graduated ruler. The word vector comes from Latin, where it means “carrier”.

⁴ An alternate definition of scalars and vectors, used in physics, is that under certain types of coordinate transformations (e.g. rotations), a quantity that is not affected is a scalar, while a quantity that is affected (in a certain way) is a vector. See MARION for details.

Intuitively, adding w to v means tacking on w to the end of v (see Figure 1.2.2).

[Figure 1.2.2: Adding vectors v and w. (a) Vectors v and w; (b) Translate w to the end of v; (c) The sum v + w]

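Notice that our definition is valid for the zero vector (which is just a point, and hence can be translated), and so we see that $\mathbf{v} + \mathbf{0} = \mathbf{v} = \mathbf{0} + \mathbf{v}$ for any vector v. In particular, $\mathbf{0} + \mathbf{0} = \mathbf{0}$. Also, it is easy to see that $\mathbf{v} + (-\mathbf{v}) = \mathbf{0}$, as we would expect. In general, since the scalar multiple $-\mathbf{v} = -1\mathbf{v}$ is a well-defined vector, we can define vector subtraction as follows: $\mathbf{v} - \mathbf{w} = \mathbf{v} + (-\mathbf{w})$. See Figure 1.2.3.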

[Figure 1.2.3: Subtracting vectors v and w. (a) Vectors v and w; (b) Translate −w to the end of v; (c) The difference v − w]

Figure 1.2.4 shows the use of “geometric proofs” of various laws of vector algebra, that is,

it uses laws from elementary geometry to prove statements about vectors. For example, (a)

shows that $\mathbf{v} + \mathbf{w} = \mathbf{w} + \mathbf{v}$ for any vectors v, w. And (c) shows how you can think of $\mathbf{v} - \mathbf{w}$ as

the vector that is tacked on to the end of w to add up to v.

[Figure 1.2.4: “Geometric” vector algebra. (a) Add vectors; (b) Subtract vectors; (c) Combined add/subtract]

Notice that we have temporarily abandoned the practice of starting vectors at the origin.

In fact, we have not even mentioned coordinates in this section so far. Since we will deal

mostly with Cartesian coordinates in this book, the following two theorems are useful for

performing vector algebra on vectors in $\mathbb{R}^2$ and $\mathbb{R}^3$ starting at the origin.

Theorem 1.3. Let $\mathbf{v} = (v_1, v_2)$, $\mathbf{w} = (w_1, w_2)$ be vectors in $\mathbb{R}^2$, and let k be a scalar. Then

(a) $k\mathbf{v} = (kv_1, kv_2)$

(b) $\mathbf{v} + \mathbf{w} = (v_1 + w_1, v_2 + w_2)$

Proof: (a) Without loss of generality, we assume that $v_1, v_2 > 0$ (the other possibilities are handled in a similar manner). If $k = 0$ then $k\mathbf{v} = 0\mathbf{v} = \mathbf{0} = (0, 0) = (0v_1, 0v_2) = (kv_1, kv_2)$, which is what we needed to show. If $k \neq 0$, then $(kv_1, kv_2)$ lies on a line with slope $\frac{kv_2}{kv_1} = \frac{v_2}{v_1}$, which is the same as the slope of the line on which v (and hence $k\mathbf{v}$) lies, and $(kv_1, kv_2)$ points in the same direction on that line as $k\mathbf{v}$. Also, by formula (1.3) the magnitude of $(kv_1, kv_2)$ is

$$\sqrt{(kv_1)^2 + (kv_2)^2} = \sqrt{k^2 v_1^2 + k^2 v_2^2} = \sqrt{k^2 (v_1^2 + v_2^2)} = |k| \sqrt{v_1^2 + v_2^2} = |k| \, \lVert \mathbf{v} \rVert.$$

So $k\mathbf{v}$ and $(kv_1, kv_2)$ have the same magnitude and direction. This proves (a).

[Figure 1.2.5: The vectors v, w and v + w in the xy-plane, with components $v_1$, $v_2$, $w_1$, $w_2$ and sums $v_1 + w_1$, $v_2 + w_2$ marked on the axes]

(b) Without loss of generality, we assume that $v_1, v_2, w_1, w_2 > 0$ (the other possibilities are handled in a similar manner). From Figure 1.2.5, we see that when translating w to start at the end of v, the new terminal point of w is $(v_1 + w_1, v_2 + w_2)$, so by the definition of $\mathbf{v} + \mathbf{w}$ this must be the terminal point of $\mathbf{v} + \mathbf{w}$. This proves (b). QED

Theorem 1.4. Let $\mathbf{v} = (v_1, v_2, v_3)$, $\mathbf{w} = (w_1, w_2, w_3)$ be vectors in $\mathbb{R}^3$, let k be a scalar. Then

(a) $k\mathbf{v} = (kv_1, kv_2, kv_3)$

(b) $\mathbf{v} + \mathbf{w} = (v_1 + w_1, v_2 + w_2, v_3 + w_3)$

The following theorem summarizes the basic laws of vector algebra.

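Theorem 1.5. For any vectors u, v, w, and scalars k, l, we have

(a) $\mathbf{v} + \mathbf{w} = \mathbf{w} + \mathbf{v}$ (Commutative Law)

(b) $\mathbf{u} + (\mathbf{v} + \mathbf{w}) = (\mathbf{u} + \mathbf{v}) + \mathbf{w}$ (Associative Law)

(c) $\mathbf{v} + \mathbf{0} = \mathbf{v} = \mathbf{0} + \mathbf{v}$ (Additive Identity)

(d) $\mathbf{v} + (-\mathbf{v}) = \mathbf{0}$ (Additive Inverse)

(e) $k(l\mathbf{v}) = (kl)\mathbf{v}$ (Associative Law)

(f) $k(\mathbf{v} + \mathbf{w}) = k\mathbf{v} + k\mathbf{w}$ (Distributive Law)

(g) $(k + l)\mathbf{v} = k\mathbf{v} + l\mathbf{v}$ (Distributive Law)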

Proof: (a) We already presented a geometric proof of this in Figure 1.2.4(a).

(b) To illustrate the difference between analytic proofs and geometric proofs in vector algebra, we will present both types here. For the analytic proof, we will use vectors in $\mathbb{R}^3$ (the proof for $\mathbb{R}^2$ is similar).

Let $\mathbf{u} = (u_1, u_2, u_3)$, $\mathbf{v} = (v_1, v_2, v_3)$, $\mathbf{w} = (w_1, w_2, w_3)$ be vectors in $\mathbb{R}^3$. Then

$$\begin{aligned}
\mathbf{u} + (\mathbf{v} + \mathbf{w}) &= (u_1, u_2, u_3) + ((v_1, v_2, v_3) + (w_1, w_2, w_3)) \\
&= (u_1, u_2, u_3) + (v_1 + w_1, v_2 + w_2, v_3 + w_3) && \text{by Theorem 1.4(b)} \\
&= (u_1 + (v_1 + w_1), u_2 + (v_2 + w_2), u_3 + (v_3 + w_3)) && \text{by Theorem 1.4(b)} \\
&= ((u_1 + v_1) + w_1, (u_2 + v_2) + w_2, (u_3 + v_3) + w_3) && \text{by properties of real numbers} \\
&= (u_1 + v_1, u_2 + v_2, u_3 + v_3) + (w_1, w_2, w_3) && \text{by Theorem 1.4(b)} \\
&= (\mathbf{u} + \mathbf{v}) + \mathbf{w}
\end{aligned}$$

This completes the analytic proof of (b). Figure 1.2.6 provides the geometric proof.

[Figure 1.2.6: Associative Law for vector addition, showing u, v, w, u + v, v + w and u + (v + w) = (u + v) + w]

(c) We already discussed this in the remarks following Figure 1.2.2.

(d) We already discussed this in the remarks following Figure 1.2.2.

(e) We will prove this for a vector $\mathbf{v} = (v_1, v_2, v_3)$ in $\mathbb{R}^3$ (the proof for $\mathbb{R}^2$ is similar):

$$\begin{aligned}
k(l\mathbf{v}) &= k(lv_1, lv_2, lv_3) && \text{by Theorem 1.4(a)} \\
&= (klv_1, klv_2, klv_3) && \text{by Theorem 1.4(a)} \\
&= (kl)(v_1, v_2, v_3) && \text{by Theorem 1.4(a)} \\
&= (kl)\mathbf{v}
\end{aligned}$$

(f) and (g): Left as exercises for the reader. QED

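A unit vector is a vector with magnitude 1. Notice that for any nonzero vector v, the vector $\frac{\mathbf{v}}{\lVert \mathbf{v} \rVert}$ is a unit vector which points in the same direction as v, since $\frac{1}{\lVert \mathbf{v} \rVert} > 0$ and $\left\lVert \frac{\mathbf{v}}{\lVert \mathbf{v} \rVert} \right\rVert = \frac{\lVert \mathbf{v} \rVert}{\lVert \mathbf{v} \rVert} = 1$. Dividing a nonzero vector v by $\lVert \mathbf{v} \rVert$ is often called normalizing v.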

There are specific unit vectors which we will often use, called the basis vectors: $\mathbf{i} = (1, 0, 0)$, $\mathbf{j} = (0, 1, 0)$, and $\mathbf{k} = (0, 0, 1)$ in $\mathbb{R}^3$; $\mathbf{i} = (1, 0)$ and $\mathbf{j} = (0, 1)$ in $\mathbb{R}^2$.

These are useful for several reasons: they are mutually perpendicular, since they lie on distinct coordinate axes; they are all unit vectors: $\|\mathbf{i}\| = \|\mathbf{j}\| = \|\mathbf{k}\| = 1$; every vector can be written as a unique scalar combination of the basis vectors: $\mathbf{v} = (a, b) = a\mathbf{i} + b\mathbf{j}$ in $\mathbb{R}^2$, $\mathbf{v} = (a, b, c) = a\mathbf{i} + b\mathbf{j} + c\mathbf{k}$ in $\mathbb{R}^3$. See Figure 1.2.7.

Figure 1.2.7 Basis vectors in different dimensions: (a) $\mathbb{R}^2$; (b) $\mathbf{v} = a\mathbf{i} + b\mathbf{j}$; (c) $\mathbb{R}^3$; (d) $\mathbf{v} = a\mathbf{i} + b\mathbf{j} + c\mathbf{k}$

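When a vector $\mathbf{v} = (a, b, c)$ is written as $\mathbf{v} = a\mathbf{i} + b\mathbf{j} + c\mathbf{k}$, we say that $\mathbf{v}$ is in component form, and that $a$, $b$, and $c$ are the $\mathbf{i}$, $\mathbf{j}$, and $\mathbf{k}$ components, respectively, of $\mathbf{v}$. We have:

$$
\begin{aligned}
\mathbf{v} = v_1\mathbf{i} + v_2\mathbf{j} + v_3\mathbf{k},\ k \text{ a scalar} &\implies k\mathbf{v} = kv_1\mathbf{i} + kv_2\mathbf{j} + kv_3\mathbf{k} \\
\mathbf{v} = v_1\mathbf{i} + v_2\mathbf{j} + v_3\mathbf{k},\ \mathbf{w} = w_1\mathbf{i} + w_2\mathbf{j} + w_3\mathbf{k} &\implies \mathbf{v} + \mathbf{w} = (v_1 + w_1)\mathbf{i} + (v_2 + w_2)\mathbf{j} + (v_3 + w_3)\mathbf{k} \\
\mathbf{v} = v_1\mathbf{i} + v_2\mathbf{j} + v_3\mathbf{k} &\implies \|\mathbf{v}\| = \sqrt{v_1^2 + v_2^2 + v_3^2}
\end{aligned}
$$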

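Example 1.4. Let $\mathbf{v} = (2, 1, -1)$ and $\mathbf{w} = (3, -4, 2)$ in $\mathbb{R}^3$.

(a) Find $\mathbf{v} - \mathbf{w}$.

Solution: $\mathbf{v} - \mathbf{w} = (2 - 3, 1 - (-4), -1 - 2) = (-1, 5, -3)$

(b) Find $3\mathbf{v} + 2\mathbf{w}$.

Solution: $3\mathbf{v} + 2\mathbf{w} = (6, 3, -3) + (6, -8, 4) = (12, -5, 1)$

(c) Write $\mathbf{v}$ and $\mathbf{w}$ in component form.

Solution: $\mathbf{v} = 2\mathbf{i} + \mathbf{j} - \mathbf{k}$, $\mathbf{w} = 3\mathbf{i} - 4\mathbf{j} + 2\mathbf{k}$

(d) Find the vector $\mathbf{u}$ such that $\mathbf{u} + \mathbf{v} = \mathbf{w}$.

Solution: By Theorem 1.5, $\mathbf{u} = \mathbf{w} - \mathbf{v} = -(\mathbf{v} - \mathbf{w}) = -(-1, 5, -3) = (1, -5, 3)$, by part (a).

(e) Find the vector $\mathbf{u}$ such that $\mathbf{u} + \mathbf{v} + \mathbf{w} = \mathbf{0}$.

Solution: By Theorem 1.5, $\mathbf{u} = -\mathbf{w} - \mathbf{v} = -(3, -4, 2) - (2, 1, -1) = (-5, 3, -1)$.

(f) Find the vector $\mathbf{u}$ such that $2\mathbf{u} + \mathbf{i} - 2\mathbf{j} = \mathbf{k}$.

Solution: $2\mathbf{u} = -\mathbf{i} + 2\mathbf{j} + \mathbf{k} \implies \mathbf{u} = -\tfrac{1}{2}\mathbf{i} + \mathbf{j} + \tfrac{1}{2}\mathbf{k}$

(g) Find the unit vector $\dfrac{\mathbf{v}}{\|\mathbf{v}\|}$.

Solution: $\dfrac{\mathbf{v}}{\|\mathbf{v}\|} = \dfrac{1}{\sqrt{2^2 + 1^2 + (-1)^2}}\,(2, 1, -1) = \left( \dfrac{2}{\sqrt{6}}, \dfrac{1}{\sqrt{6}}, \dfrac{-1}{\sqrt{6}} \right)$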
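The arithmetic in Example 1.4 can be checked mechanically. The following is a minimal Python sketch, assuming vectors are stored as plain tuples; the helper names `add`, `scale`, `sub`, `norm`, and `normalize` are illustrative choices made here and do not come from the text.

```python
import math

def add(v, w):
    """Componentwise sum of two vectors of equal length."""
    return tuple(a + b for a, b in zip(v, w))

def scale(k, v):
    """Scalar multiple k*v."""
    return tuple(k * a for a in v)

def sub(v, w):
    """Difference v - w, written as v + (-1)w."""
    return add(v, scale(-1, w))

def norm(v):
    """Magnitude of v: square root of the sum of squared components."""
    return math.sqrt(sum(a * a for a in v))

def normalize(v):
    """Unit vector in the direction of a nonzero vector v."""
    n = norm(v)
    if n == 0:
        raise ValueError("the zero vector cannot be normalized")
    return scale(1 / n, v)

v = (2, 1, -1)
w = (3, -4, 2)

diff = sub(v, w)                        # part (a): v - w
combo = add(scale(3, v), scale(2, w))   # part (b): 3v + 2w
u_d = sub(w, v)                         # part (d): solves u + v = w
u_e = sub(scale(-1, w), v)              # part (e): solves u + v + w = 0
unit = normalize(v)                     # part (g): v / |v|
```

Tuples keep the sketch free of third-party dependencies; an array library would work equally well for larger computations.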

We can now easily prove Theorem 1.1 from the previous section. The distance $d$ between two points $P = (x_1, y_1, z_1)$ and $Q = (x_2, y_2, z_2)$ in $\mathbb{R}^3$ is the same as the length of the vector $\mathbf{w} - \mathbf{v}$, where the vectors $\mathbf{v}$ and $\mathbf{w}$ are defined as $\mathbf{v} = (x_1, y_1, z_1)$ and $\mathbf{w} = (x_2, y_2, z_2)$ (see Figure 1.2.8). So since $\mathbf{w} - \mathbf{v} = (x_2 - x_1, y_2 - y_1, z_2 - z_1)$, then $d = \|\mathbf{w} - \mathbf{v}\| = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2}$ by Theorem 1.2.

Figure 1.2.8 Proof of Theorem 1.2: $d = \|\mathbf{w} - \mathbf{v}\|$

Exercises

A

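1. Let $\mathbf{v} = (-1, 5, -2)$ and $\mathbf{w} = (3, 1, 1)$.

(a) Find $\mathbf{v} - \mathbf{w}$. (b) Find $\mathbf{v} + \mathbf{w}$. (c) Find $\dfrac{\mathbf{v}}{\|\mathbf{v}\|}$. (d) Find $\left\| \tfrac{1}{2}(\mathbf{v} - \mathbf{w}) \right\|$.

(e) Find $\left\| \tfrac{1}{2}(\mathbf{v} + \mathbf{w}) \right\|$. (f) Find $-2\mathbf{v} + 4\mathbf{w}$. (g) Find $\mathbf{v} - 2\mathbf{w}$.

(h) Find the vector $\mathbf{u}$ such that $\mathbf{u} + \mathbf{v} + \mathbf{w} = \mathbf{i}$.

(i) Find the vector $\mathbf{u}$ such that $\mathbf{u} + \mathbf{v} + \mathbf{w} = 2\mathbf{j} + \mathbf{k}$.

(j) Is there a scalar $m$ such that $m(\mathbf{v} + 2\mathbf{w}) = \mathbf{k}$? If so, find it.

2. For the vectors $\mathbf{v}$ and $\mathbf{w}$ from Exercise 1, is $\|\mathbf{v} - \mathbf{w}\| = \|\mathbf{v}\| - \|\mathbf{w}\|$? If not, which quantity is larger?

3. For the vectors $\mathbf{v}$ and $\mathbf{w}$ from Exercise 1, is $\|\mathbf{v} + \mathbf{w}\| = \|\mathbf{v}\| + \|\mathbf{w}\|$? If not, which quantity is larger?

B

4. Prove Theorem 1.5(f) for $\mathbb{R}^3$.

5. Prove Theorem 1.5(g) for $\mathbb{R}^3$.

C

6. We know that every vector in $\mathbb{R}^3$ can be written as a scalar combination of the vectors $\mathbf{i}$, $\mathbf{j}$, and $\mathbf{k}$. Can every vector in $\mathbb{R}^3$ be written as a scalar combination of just $\mathbf{i}$ and $\mathbf{j}$, i.e. for any vector $\mathbf{v}$ in $\mathbb{R}^3$, are there scalars $m$, $n$ such that $\mathbf{v} = m\mathbf{i} + n\mathbf{j}$? Justify your answer.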


1.3 Dot Product

You may have noticed that while we did define multiplication of a vector by a scalar in the previous section on vector algebra, we did not define multiplication of a vector by a vector. We will now see one type of multiplication of vectors, called the dot product.

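Definition 1.6. Let $\mathbf{v} = (v_1, v_2, v_3)$ and $\mathbf{w} = (w_1, w_2, w_3)$ be vectors in $\mathbb{R}^3$. The dot product of $\mathbf{v}$ and $\mathbf{w}$, denoted by $\mathbf{v} \cdot \mathbf{w}$, is given by:

$$\mathbf{v} \cdot \mathbf{w} = v_1 w_1 + v_2 w_2 + v_3 w_3 \tag{1.6}$$

Similarly, for vectors $\mathbf{v} = (v_1, v_2)$ and $\mathbf{w} = (w_1, w_2)$ in $\mathbb{R}^2$, the dot product is:

$$\mathbf{v} \cdot \mathbf{w} = v_1 w_1 + v_2 w_2 \tag{1.7}$$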

Notice that the dot product of two vectors is a scalar, not a vector. So the associative law that holds for multiplication of numbers and for addition of vectors (see Theorem 1.5(b),(e)), does not hold for the dot product of vectors. Why? Because for vectors $\mathbf{u}$, $\mathbf{v}$, $\mathbf{w}$, the dot product $\mathbf{u} \cdot \mathbf{v}$ is a scalar, and so $(\mathbf{u} \cdot \mathbf{v}) \cdot \mathbf{w}$ is not defined since the left side of that dot product (the part in parentheses) is a scalar and not a vector.

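For vectors $\mathbf{v} = v_1\mathbf{i} + v_2\mathbf{j} + v_3\mathbf{k}$ and $\mathbf{w} = w_1\mathbf{i} + w_2\mathbf{j} + w_3\mathbf{k}$ in component form, the dot product is still $\mathbf{v} \cdot \mathbf{w} = v_1 w_1 + v_2 w_2 + v_3 w_3$.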

Also notice that we defined the dot product in an analytic way, i.e. by referencing vector coordinates. There is a geometric way of defining the dot product, which we will now develop as a consequence of the analytic definition.

Definition 1.7. The angle between two nonzero vectors with the same initial point is the smallest angle between them.

We do not define the angle between the zero vector and any other vector. Any two nonzero vectors with the same initial point have two angles between them: $\theta$ and $360^\circ - \theta$. We will always choose the smallest nonnegative angle $\theta$ between them, so that $0^\circ \le \theta \le 180^\circ$. See Figure 1.3.1.

Figure 1.3.1 Angle between vectors: (a) $0^\circ < \theta < 180^\circ$; (b) $\theta = 180^\circ$; (c) $\theta = 0^\circ$

We can now take a more geometric view of the dot product by establishing a relationship

between the dot product of two vectors and the angle between them.

Theorem 1.6. Let $\mathbf{v}$, $\mathbf{w}$ be nonzero vectors, and let $\theta$ be the angle between them. Then

$$\cos\theta = \frac{\mathbf{v} \cdot \mathbf{w}}{\|\mathbf{v}\|\,\|\mathbf{w}\|} \tag{1.8}$$

Proof: We will prove the theorem for vectors in $\mathbb{R}^3$ (the proof for $\mathbb{R}^2$ is similar). Let $\mathbf{v} = (v_1, v_2, v_3)$ and $\mathbf{w} = (w_1, w_2, w_3)$. By the Law of Cosines (see Figure 1.3.2), we have

$$\|\mathbf{v} - \mathbf{w}\|^2 = \|\mathbf{v}\|^2 + \|\mathbf{w}\|^2 - 2\,\|\mathbf{v}\|\,\|\mathbf{w}\| \cos\theta \tag{1.9}$$

(note that equation (1.9) holds even for the "degenerate" cases $\theta = 0^\circ$ and $180^\circ$).

Figure 1.3.2

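Since $\mathbf{v} - \mathbf{w} = (v_1 - w_1, v_2 - w_2, v_3 - w_3)$, expanding $\|\mathbf{v} - \mathbf{w}\|^2$ in equation (1.9) gives

$$
\begin{aligned}
\|\mathbf{v}\|^2 + \|\mathbf{w}\|^2 - 2\,\|\mathbf{v}\|\,\|\mathbf{w}\| \cos\theta &= (v_1 - w_1)^2 + (v_2 - w_2)^2 + (v_3 - w_3)^2 \\
&= (v_1^2 - 2v_1 w_1 + w_1^2) + (v_2^2 - 2v_2 w_2 + w_2^2) + (v_3^2 - 2v_3 w_3 + w_3^2) \\
&= (v_1^2 + v_2^2 + v_3^2) + (w_1^2 + w_2^2 + w_3^2) - 2(v_1 w_1 + v_2 w_2 + v_3 w_3) \\
&= \|\mathbf{v}\|^2 + \|\mathbf{w}\|^2 - 2(\mathbf{v} \cdot \mathbf{w}) \text{ , so}
\end{aligned}
$$

$-2\,\|\mathbf{v}\|\,\|\mathbf{w}\| \cos\theta = -2(\mathbf{v} \cdot \mathbf{w})$, so since $\mathbf{v} \ne \mathbf{0}$ and $\mathbf{w} \ne \mathbf{0}$ then

$$\cos\theta = \frac{\mathbf{v} \cdot \mathbf{w}}{\|\mathbf{v}\|\,\|\mathbf{w}\|} \text{ , since } \|\mathbf{v}\| > 0 \text{ and } \|\mathbf{w}\| > 0. \quad \text{QED}$$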

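Example 1.5. Find the angle $\theta$ between the vectors $\mathbf{v} = (2, 1, -1)$ and $\mathbf{w} = (3, -4, 1)$.

Solution: Since $\mathbf{v} \cdot \mathbf{w} = (2)(3) + (1)(-4) + (-1)(1) = 1$, $\|\mathbf{v}\| = \sqrt{6}$, and $\|\mathbf{w}\| = \sqrt{26}$, then

$$\cos\theta = \frac{\mathbf{v} \cdot \mathbf{w}}{\|\mathbf{v}\|\,\|\mathbf{w}\|} = \frac{1}{\sqrt{6}\,\sqrt{26}} = \frac{1}{2\sqrt{39}} \approx 0.08 \implies \theta = 85.41^\circ$$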
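Formula (1.8) translates directly into a short computation. The sketch below is one possible Python rendering, assuming tuple vectors; the function names `dot`, `norm`, and `angle_degrees` are hypothetical helpers introduced only for this illustration. The cosine is clamped to $[-1, 1]$ to guard against floating-point rounding, which is an implementation detail and not part of the theorem.

```python
import math

def dot(v, w):
    """Dot product: sum of products of corresponding components."""
    return sum(a * b for a, b in zip(v, w))

def norm(v):
    """Magnitude of v."""
    return math.sqrt(dot(v, v))

def angle_degrees(v, w):
    """Angle between nonzero vectors v and w, in degrees, via formula (1.8)."""
    nv, nw = norm(v), norm(w)
    if nv == 0 or nw == 0:
        raise ValueError("the angle is not defined for the zero vector")
    c = dot(v, w) / (nv * nw)
    c = max(-1.0, min(1.0, c))  # clamp against rounding error
    return math.degrees(math.acos(c))

# Vectors from Example 1.5
theta = angle_degrees((2, 1, -1), (3, -4, 1))
```

The result lies between 0 and 180 degrees, matching the convention of Definition 1.7.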

Two nonzero vectors are perpendicular if the angle between them is $90^\circ$. Since $\cos 90^\circ = 0$, we have the following important corollary to Theorem 1.6:

Corollary 1.7. Two nonzero vectors $\mathbf{v}$ and $\mathbf{w}$ are perpendicular if and only if $\mathbf{v} \cdot \mathbf{w} = 0$.

We will write $\mathbf{v} \perp \mathbf{w}$ to indicate that $\mathbf{v}$ and $\mathbf{w}$ are perpendicular.

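Since $\cos\theta > 0$ for $0^\circ \le \theta < 90^\circ$ and $\cos\theta < 0$ for $90^\circ < \theta \le 180^\circ$, we also have:

Corollary 1.8. If $\theta$ is the angle between nonzero vectors $\mathbf{v}$ and $\mathbf{w}$, then

$$
\mathbf{v} \cdot \mathbf{w} \text{ is }
\begin{cases}
> 0 & \text{for } 0^\circ \le \theta < 90^\circ \\
0 & \text{for } \theta = 90^\circ \\
< 0 & \text{for } 90^\circ < \theta \le 180^\circ
\end{cases}
$$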

By Corollary 1.8, the dot product can be thought of as a way of telling if the angle between two vectors is acute, obtuse, or a right angle, depending on whether the dot product is positive, negative, or zero, respectively. See Figure 1.3.3.

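Figure 1.3.3 Sign of the dot product & angle between vectors: (a) $0^\circ \le \theta < 90^\circ$, $\mathbf{v} \cdot \mathbf{w} > 0$; (b) $90^\circ < \theta \le 180^\circ$, $\mathbf{v} \cdot \mathbf{w} < 0$; (c) $\theta = 90^\circ$, $\mathbf{v} \cdot \mathbf{w} = 0$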

Example 1.6. Are the vectors $\mathbf{v} = (-1, 5, -2)$ and $\mathbf{w} = (3, 1, 1)$ perpendicular?

Solution: Yes, $\mathbf{v} \perp \mathbf{w}$ since $\mathbf{v} \cdot \mathbf{w} = (-1)(3) + (5)(1) + (-2)(1) = 0$.

The following theorem summarizes the basic properties of the dot product.

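Theorem 1.9. For any vectors $\mathbf{u}$, $\mathbf{v}$, $\mathbf{w}$, and scalar $k$, we have

(a) $\mathbf{v} \cdot \mathbf{w} = \mathbf{w} \cdot \mathbf{v}$   Commutative Law

(b) $(k\mathbf{v}) \cdot \mathbf{w} = \mathbf{v} \cdot (k\mathbf{w}) = k(\mathbf{v} \cdot \mathbf{w})$   Associative Law

(c) $\mathbf{v} \cdot \mathbf{0} = 0 = \mathbf{0} \cdot \mathbf{v}$

(d) $\mathbf{u} \cdot (\mathbf{v} + \mathbf{w}) = \mathbf{u} \cdot \mathbf{v} + \mathbf{u} \cdot \mathbf{w}$   Distributive Law

(e) $(\mathbf{u} + \mathbf{v}) \cdot \mathbf{w} = \mathbf{u} \cdot \mathbf{w} + \mathbf{v} \cdot \mathbf{w}$   Distributive Law

(f) $|\mathbf{v} \cdot \mathbf{w}| \le \|\mathbf{v}\|\,\|\mathbf{w}\|$   Cauchy-Schwarz Inequality$^5$

Proof: The proofs of parts (a)-(e) are straightforward applications of the definition of the dot product, and are left to the reader as exercises. We will prove part (f).

(f) If either $\mathbf{v} = \mathbf{0}$ or $\mathbf{w} = \mathbf{0}$, then $\mathbf{v} \cdot \mathbf{w} = 0$ by part (c), and so the inequality holds trivially. So assume that $\mathbf{v}$ and $\mathbf{w}$ are nonzero vectors. Then by Theorem 1.6,

$$
\begin{aligned}
\mathbf{v} \cdot \mathbf{w} &= \cos\theta\,\|\mathbf{v}\|\,\|\mathbf{w}\| \text{ , so} \\
|\mathbf{v} \cdot \mathbf{w}| &= |\cos\theta|\,\|\mathbf{v}\|\,\|\mathbf{w}\| \text{ , so} \\
|\mathbf{v} \cdot \mathbf{w}| &\le \|\mathbf{v}\|\,\|\mathbf{w}\| \text{ since } |\cos\theta| \le 1. \quad \text{QED}
\end{aligned}
$$

$^5$ Also known as the Cauchy-Schwarz-Buniakovski Inequality.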

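Using Theorem 1.9, we see that if $\mathbf{u} \cdot \mathbf{v} = 0$ and $\mathbf{u} \cdot \mathbf{w} = 0$, then $\mathbf{u} \cdot (k\mathbf{v} + l\mathbf{w}) = k(\mathbf{u} \cdot \mathbf{v}) + l(\mathbf{u} \cdot \mathbf{w}) = k(0) + l(0) = 0$ for all scalars $k$, $l$. Thus, we have the following fact:

If $\mathbf{u} \perp \mathbf{v}$ and $\mathbf{u} \perp \mathbf{w}$, then $\mathbf{u} \perp (k\mathbf{v} + l\mathbf{w})$ for all scalars $k$, $l$.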

For vectors v and w, the collection of all scalar combinations kv+lw is called the span

of v and w. If nonzero vectors v and w are parallel, then their span is a line; if they are

not parallel, then their span is a plane. So what we showed above is that a vector which is

perpendicular to two other vectors is also perpendicular to their span.

The dot product can be used to derive properties of the magnitudes of vectors, the most

important of which is the Triangle Inequality, as given in the following theorem:

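Theorem 1.10. For any vectors $\mathbf{v}$, $\mathbf{w}$, we have

(a) $\|\mathbf{v}\|^2 = \mathbf{v} \cdot \mathbf{v}$

(b) $\|\mathbf{v} + \mathbf{w}\| \le \|\mathbf{v}\| + \|\mathbf{w}\|$   Triangle Inequality

(c) $\|\mathbf{v} - \mathbf{w}\| \ge \|\mathbf{v}\| - \|\mathbf{w}\|$

Proof: (a) Left as an exercise for the reader.

(b) By part (a) and Theorem 1.9, we have

$$
\begin{aligned}
\|\mathbf{v} + \mathbf{w}\|^2 &= (\mathbf{v} + \mathbf{w}) \cdot (\mathbf{v} + \mathbf{w}) = \mathbf{v} \cdot \mathbf{v} + \mathbf{v} \cdot \mathbf{w} + \mathbf{w} \cdot \mathbf{v} + \mathbf{w} \cdot \mathbf{w} \\
&= \|\mathbf{v}\|^2 + 2(\mathbf{v} \cdot \mathbf{w}) + \|\mathbf{w}\|^2 \text{ , so since } a \le |a| \text{ for any real number } a \text{, we have} \\
&\le \|\mathbf{v}\|^2 + 2\,|\mathbf{v} \cdot \mathbf{w}| + \|\mathbf{w}\|^2 \text{ , so by Theorem 1.9(f) we have} \\
&\le \|\mathbf{v}\|^2 + 2\,\|\mathbf{v}\|\,\|\mathbf{w}\| + \|\mathbf{w}\|^2 = (\|\mathbf{v}\| + \|\mathbf{w}\|)^2 \text{ and so}
\end{aligned}
$$

$\|\mathbf{v} + \mathbf{w}\| \le \|\mathbf{v}\| + \|\mathbf{w}\|$ after taking square roots of both sides, which proves (b).

(c) Since $\mathbf{v} = \mathbf{w} + (\mathbf{v} - \mathbf{w})$, then $\|\mathbf{v}\| = \|\mathbf{w} + (\mathbf{v} - \mathbf{w})\| \le \|\mathbf{w}\| + \|\mathbf{v} - \mathbf{w}\|$ by the Triangle Inequality, so subtracting $\|\mathbf{w}\|$ from both sides gives $\|\mathbf{v}\| - \|\mathbf{w}\| \le \|\mathbf{v} - \mathbf{w}\|$. QED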

Figure 1.3.4 (triangle with sides $\mathbf{v}$, $\mathbf{w}$, and $\mathbf{v} + \mathbf{w}$)

The Triangle Inequality gets its name from the fact that in any triangle, no one side is longer than the sum of the lengths of the other two sides (see Figure 1.3.4). Another way of saying this is with the familiar statement "the shortest distance between two points is a straight line."
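The Cauchy-Schwarz Inequality and both parts of Theorem 1.10 can be spot-checked numerically. The sketch below does this in Python for the vectors of Example 1.5; a check on one pair of vectors is only an illustration, not a proof, and the helper names `dot` and `norm` are assumptions of this sketch.

```python
import math

def dot(v, w):
    """Dot product of two vectors of equal length."""
    return sum(a * b for a, b in zip(v, w))

def norm(v):
    """Magnitude, using Theorem 1.10(a): |v|^2 = v . v."""
    return math.sqrt(dot(v, v))

v = (2, 1, -1)
w = (3, -4, 1)
s = tuple(a + b for a, b in zip(v, w))  # v + w
d = tuple(a - b for a, b in zip(v, w))  # v - w

cauchy_schwarz = abs(dot(v, w)) <= norm(v) * norm(w)   # Theorem 1.9(f)
triangle = norm(s) <= norm(v) + norm(w)                # Theorem 1.10(b)
reverse_triangle = norm(d) >= norm(v) - norm(w)        # Theorem 1.10(c)
```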

Exercises

A

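1. Let $\mathbf{v} = (5, 1, -2)$ and $\mathbf{w} = (4, -4, 3)$. Calculate $\mathbf{v} \cdot \mathbf{w}$.

2. Let $\mathbf{v} = -3\mathbf{i} - 2\mathbf{j} - \mathbf{k}$ and $\mathbf{w} = 6\mathbf{i} + 4\mathbf{j} + 2\mathbf{k}$. Calculate $\mathbf{v} \cdot \mathbf{w}$.

For Exercises 3-8, find the angle $\theta$ between the vectors $\mathbf{v}$ and $\mathbf{w}$.

3. $\mathbf{v} = (5, 1, -2)$, $\mathbf{w} = (4, -4, 3)$

4. $\mathbf{v} = (7, 2, -10)$, $\mathbf{w} = (2, 6, 4)$

5. $\mathbf{v} = (2, 1, 4)$, $\mathbf{w} = (1, -2, 0)$

6. $\mathbf{v} = (4, 2, -1)$, $\mathbf{w} = (8, 4, -2)$

7. $\mathbf{v} = -\mathbf{i} + 2\mathbf{j} + \mathbf{k}$, $\mathbf{w} = -3\mathbf{i} + 6\mathbf{j} + 3\mathbf{k}$

8. $\mathbf{v} = \mathbf{i}$, $\mathbf{w} = 3\mathbf{i} + 2\mathbf{j} + 4\mathbf{k}$

9. Let $\mathbf{v} = (8, 4, 3)$ and $\mathbf{w} = (-2, 1, 4)$. Is $\mathbf{v} \perp \mathbf{w}$? Justify your answer.

10. Let $\mathbf{v} = (6, 0, 4)$ and $\mathbf{w} = (0, 2, -1)$. Is $\mathbf{v} \perp \mathbf{w}$? Justify your answer.

11. For $\mathbf{v}$, $\mathbf{w}$ from Exercise 5, verify the Cauchy-Schwarz Inequality $|\mathbf{v} \cdot \mathbf{w}| \le \|\mathbf{v}\|\,\|\mathbf{w}\|$.

12. For $\mathbf{v}$, $\mathbf{w}$ from Exercise 6, verify the Cauchy-Schwarz Inequality $|\mathbf{v} \cdot \mathbf{w}| \le \|\mathbf{v}\|\,\|\mathbf{w}\|$.

13. For $\mathbf{v}$, $\mathbf{w}$ from Exercise 5, verify the Triangle Inequality $\|\mathbf{v} + \mathbf{w}\| \le \|\mathbf{v}\| + \|\mathbf{w}\|$.

14. For $\mathbf{v}$, $\mathbf{w}$ from Exercise 6, verify the Triangle Inequality $\|\mathbf{v} + \mathbf{w}\| \le \|\mathbf{v}\| + \|\mathbf{w}\|$.

B

Note: Consider only vectors in $\mathbb{R}^3$ for Exercises 15-25.

15. Prove Theorem 1.9(a).

16. Prove Theorem 1.9(b).

17. Prove Theorem 1.9(c).

18. Prove Theorem 1.9(d).

19. Prove Theorem 1.9(e).

20. Prove Theorem 1.10(a).

21. Prove or give a counterexample: If $\mathbf{u} \cdot \mathbf{v} = \mathbf{u} \cdot \mathbf{w}$, then $\mathbf{v} = \mathbf{w}$.

C

22. Prove or give a counterexample: If $\mathbf{v} \cdot \mathbf{w} = 0$ for all $\mathbf{v}$, then $\mathbf{w} = \mathbf{0}$.

23. Prove or give a counterexample: If $\mathbf{u} \cdot \mathbf{v} = \mathbf{u} \cdot \mathbf{w}$ for all $\mathbf{u}$, then $\mathbf{v} = \mathbf{w}$.

24. Prove that $\big|\, \|\mathbf{v}\| - \|\mathbf{w}\| \,\big| \le \|\mathbf{v} - \mathbf{w}\|$ for all $\mathbf{v}$, $\mathbf{w}$.

25. For nonzero vectors $\mathbf{v}$ and $\mathbf{w}$, the projection of $\mathbf{v}$ onto $\mathbf{w}$ (sometimes written as $\mathrm{proj}_{\mathbf{w}} \mathbf{v}$) is the vector $\mathbf{u}$ along the same line $L$ as $\mathbf{w}$ whose terminal point is obtained by dropping a perpendicular line from the terminal point of $\mathbf{v}$ to $L$ (see Figure 1.3.5). Show that

$$\|\mathbf{u}\| = \frac{|\mathbf{v} \cdot \mathbf{w}|}{\|\mathbf{w}\|}.$$

(Hint: Consider the angle between $\mathbf{v}$ and $\mathbf{w}$.)

Figure 1.3.5

26. Let $\alpha$, $\beta$, and $\gamma$ be the angles between a nonzero vector $\mathbf{v}$ in $\mathbb{R}^3$ and the vectors $\mathbf{i}$, $\mathbf{j}$, and $\mathbf{k}$, respectively. Show that $\cos^2\alpha + \cos^2\beta + \cos^2\gamma = 1$.

(Note: $\alpha$, $\beta$, $\gamma$ are often called the direction angles of $\mathbf{v}$, and $\cos\alpha$, $\cos\beta$, $\cos\gamma$ are called the direction cosines.)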

1.4 Cross Product

In Section 1.3 we defined the dot product, which gave a way of multiplying two vectors. The resulting product, however, was a scalar, not a vector. In this section we will define a product of two vectors that does result in another vector. This product, called the cross product, is only defined for vectors in $\mathbb{R}^3$. The definition may appear strange and lacking motivation, but we will see the geometric basis for it shortly.

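Definition 1.8. Let $\mathbf{v} = (v_1, v_2, v_3)$ and $\mathbf{w} = (w_1, w_2, w_3)$ be vectors in $\mathbb{R}^3$. The cross product of $\mathbf{v}$ and $\mathbf{w}$, denoted by $\mathbf{v} \times \mathbf{w}$, is the vector in $\mathbb{R}^3$ given by:

$$\mathbf{v} \times \mathbf{w} = (v_2 w_3 - v_3 w_2,\ v_3 w_1 - v_1 w_3,\ v_1 w_2 - v_2 w_1) \tag{1.10}$$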

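Figure 1.4.1 ($\mathbf{k} = \mathbf{i} \times \mathbf{j}$)

Example 1.7. Find $\mathbf{i} \times \mathbf{j}$.

Solution: Since $\mathbf{i} = (1, 0, 0)$ and $\mathbf{j} = (0, 1, 0)$, then

$$
\begin{aligned}
\mathbf{i} \times \mathbf{j} &= ((0)(0) - (0)(1),\ (0)(0) - (1)(0),\ (1)(1) - (0)(0)) \\
&= (0, 0, 1) \\
&= \mathbf{k}
\end{aligned}
$$

Similarly it can be shown that $\mathbf{j} \times \mathbf{k} = \mathbf{i}$ and $\mathbf{k} \times \mathbf{i} = \mathbf{j}$.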
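Formula (1.10) is easy to transcribe into code, which makes the three basis-vector products above quick to confirm. This is a minimal Python sketch assuming tuple vectors; the function name `cross` is a hypothetical helper, not notation from the text.

```python
def cross(v, w):
    """Cross product of two vectors in R^3, following formula (1.10)."""
    v1, v2, v3 = v
    w1, w2, w3 = w
    return (v2 * w3 - v3 * w2,
            v3 * w1 - v1 * w3,
            v1 * w2 - v2 * w1)

# Basis vectors of R^3
i = (1, 0, 0)
j = (0, 1, 0)
k = (0, 0, 1)

i_cross_j = cross(i, j)
j_cross_k = cross(j, k)
k_cross_i = cross(k, i)
```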

In the above example, the cross product of the given vectors was perpendicular to both

those vectors. It turns out that this will always be the case.

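Theorem 1.11. If the cross product $\mathbf{v} \times \mathbf{w}$ of two nonzero vectors $\mathbf{v}$ and $\mathbf{w}$ is also a nonzero vector, then it is perpendicular to both $\mathbf{v}$ and $\mathbf{w}$.

Proof: We will show that $(\mathbf{v} \times \mathbf{w}) \cdot \mathbf{v} = 0$:

$$
\begin{aligned}
(\mathbf{v} \times \mathbf{w}) \cdot \mathbf{v} &= (v_2 w_3 - v_3 w_2,\ v_3 w_1 - v_1 w_3,\ v_1 w_2 - v_2 w_1) \cdot (v_1, v_2, v_3) \\
&= v_2 w_3 v_1 - v_3 w_2 v_1 + v_3 w_1 v_2 - v_1 w_3 v_2 + v_1 w_2 v_3 - v_2 w_1 v_3 \\
&= v_1 v_2 w_3 - v_1 v_2 w_3 + w_1 v_2 v_3 - w_1 v_2 v_3 + v_1 w_2 v_3 - v_1 w_2 v_3 \\
&= 0 \text{ , after rearranging the terms.}
\end{aligned}
$$

$\therefore\ \mathbf{v} \times \mathbf{w} \perp \mathbf{v}$ by Corollary 1.7.

The proof that $\mathbf{v} \times \mathbf{w} \perp \mathbf{w}$ is similar. QED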

As a consequence of the above theorem and Theorem 1.9, we have the following:

Corollary 1.12. If the cross product $\mathbf{v} \times \mathbf{w}$ of two nonzero vectors $\mathbf{v}$ and $\mathbf{w}$ is also a nonzero vector, then it is perpendicular to the span of $\mathbf{v}$ and $\mathbf{w}$.

The span of any two nonzero, nonparallel vectors $\mathbf{v}$, $\mathbf{w}$ in $\mathbb{R}^3$ is a plane $P$, so the above corollary shows that $\mathbf{v} \times \mathbf{w}$ is perpendicular to that plane. As shown in Figure 1.4.2, there are two possible directions for $\mathbf{v} \times \mathbf{w}$, one the opposite of the other. It turns out (see Appendix B) that the direction of $\mathbf{v} \times \mathbf{w}$ is given by the right-hand rule, that is, the vectors $\mathbf{v}$, $\mathbf{w}$, $\mathbf{v} \times \mathbf{w}$ form a right-handed system. Recall from Section 1.1 that this means that you can point your thumb upwards in the direction of $\mathbf{v} \times \mathbf{w}$ while rotating $\mathbf{v}$ towards $\mathbf{w}$ with the remaining four fingers.

Figure 1.4.2 Direction of $\mathbf{v} \times \mathbf{w}$

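We will now derive a formula for the magnitude of $\mathbf{v} \times \mathbf{w}$, for nonzero vectors $\mathbf{v}$, $\mathbf{w}$:

$$
\begin{aligned}
\|\mathbf{v} \times \mathbf{w}\|^2 &= (v_2 w_3 - v_3 w_2)^2 + (v_3 w_1 - v_1 w_3)^2 + (v_1 w_2 - v_2 w_1)^2 \\
&= v_2^2 w_3^2 - 2 v_2 w_2 v_3 w_3 + v_3^2 w_2^2 + v_3^2 w_1^2 - 2 v_1 w_1 v_3 w_3 + v_1^2 w_3^2 + v_1^2 w_2^2 - 2 v_1 w_1 v_2 w_2 + v_2^2 w_1^2 \\
&= v_1^2 (w_2^2 + w_3^2) + v_2^2 (w_1^2 + w_3^2) + v_3^2 (w_1^2 + w_2^2) - 2(v_1 w_1 v_2 w_2 + v_1 w_1 v_3 w_3 + v_2 w_2 v_3 w_3)
\end{aligned}
$$

and now adding and subtracting $v_1^2 w_1^2$, $v_2^2 w_2^2$, and $v_3^2 w_3^2$ on the right side gives

$$
\begin{aligned}
&= v_1^2 (w_1^2 + w_2^2 + w_3^2) + v_2^2 (w_1^2 + w_2^2 + w_3^2) + v_3^2 (w_1^2 + w_2^2 + w_3^2) \\
&\quad - \left( v_1^2 w_1^2 + v_2^2 w_2^2 + v_3^2 w_3^2 + 2(v_1 w_1 v_2 w_2 + v_1 w_1 v_3 w_3 + v_2 w_2 v_3 w_3) \right) \\
&= (v_1^2 + v_2^2 + v_3^2)(w_1^2 + w_2^2 + w_3^2) \\
&\quad - \left( (v_1 w_1)^2 + (v_2 w_2)^2 + (v_3 w_3)^2 + 2(v_1 w_1)(v_2 w_2) + 2(v_1 w_1)(v_3 w_3) + 2(v_2 w_2)(v_3 w_3) \right)
\end{aligned}
$$

so using $(a + b + c)^2 = a^2 + b^2 + c^2 + 2ab + 2ac + 2bc$ for the subtracted term gives

$$
\begin{aligned}
&= (v_1^2 + v_2^2 + v_3^2)(w_1^2 + w_2^2 + w_3^2) - (v_1 w_1 + v_2 w_2 + v_3 w_3)^2 \\
&= \|\mathbf{v}\|^2 \|\mathbf{w}\|^2 - (\mathbf{v} \cdot \mathbf{w})^2 \\
&= \|\mathbf{v}\|^2 \|\mathbf{w}\|^2 \left( 1 - \frac{(\mathbf{v} \cdot \mathbf{w})^2}{\|\mathbf{v}\|^2 \|\mathbf{w}\|^2} \right) \text{ , since } \|\mathbf{v}\| > 0 \text{ and } \|\mathbf{w}\| > 0 \text{, so by Theorem 1.6} \\
&= \|\mathbf{v}\|^2 \|\mathbf{w}\|^2 (1 - \cos^2\theta) \text{ , where } \theta \text{ is the angle between } \mathbf{v} \text{ and } \mathbf{w} \text{, so}
\end{aligned}
$$

$\|\mathbf{v} \times \mathbf{w}\|^2 = \|\mathbf{v}\|^2 \|\mathbf{w}\|^2 \sin^2\theta$, and since $0^\circ \le \theta \le 180^\circ$, then $\sin\theta \ge 0$, so we have: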

If $\theta$ is the angle between nonzero vectors $\mathbf{v}$ and $\mathbf{w}$ in $\mathbb{R}^3$, then

$$\|\mathbf{v} \times \mathbf{w}\| = \|\mathbf{v}\|\,\|\mathbf{w}\| \sin\theta \tag{1.11}$$

It may seem strange to bother with the above formula, when the magnitude of the cross

product can be calculated directly, like for any other vector. The formula is more useful for

its applications in geometry, as in the following example.
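Formula (1.11) can be sanity-checked numerically by computing both sides independently. The Python sketch below does so for the vectors of Example 1.5; the helper names are assumptions of this sketch, and agreement for one pair of vectors illustrates the formula without proving it.

```python
import math

def dot(v, w):
    """Dot product of two vectors of equal length."""
    return sum(a * b for a, b in zip(v, w))

def norm(v):
    """Magnitude of v."""
    return math.sqrt(dot(v, v))

def cross(v, w):
    """Cross product in R^3, following formula (1.10)."""
    v1, v2, v3 = v
    w1, w2, w3 = w
    return (v2 * w3 - v3 * w2, v3 * w1 - v1 * w3, v1 * w2 - v2 * w1)

v = (2, 1, -1)
w = (3, -4, 1)

# Left side of (1.11): magnitude computed directly from the components.
lhs = norm(cross(v, w))

# Right side of (1.11): the angle comes from Theorem 1.6.
theta = math.acos(dot(v, w) / (norm(v) * norm(w)))
rhs = norm(v) * norm(w) * math.sin(theta)
```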

Example 1.8. Let $\triangle PQR$ and $PQRS$ be a triangle and parallelogram, respectively, as shown in Figure 1.4.3.

Figure 1.4.3

Think of the triangle as existing in $\mathbb{R}^3$, and identify the sides $QR$ and $QP$ with vectors $\mathbf{v}$ and $\mathbf{w}$, respectively, in $\mathbb{R}^3$. Let $\theta$ be the angle between $\mathbf{v}$ and $\mathbf{w}$. The area $A_{PQR}$ of $\triangle PQR$ is $\tfrac{1}{2}bh$, where $b$ is the base of the triangle and $h$ is the height. So we see that

$$b = \|\mathbf{v}\| \quad \text{and} \quad h = \|\mathbf{w}\| \sin\theta$$

$$A_{PQR} = \tfrac{1}{2}\,\|\mathbf{v}\|\,\|\mathbf{w}\| \sin\theta = \tfrac{1}{2}\,\|\mathbf{v} \times \mathbf{w}\|$$

So since the area $A_{PQRS}$ of the parallelogram $PQRS$ is twice the area of the triangle $\triangle PQR$, then

$$A_{PQRS} = \|\mathbf{v}\|\,\|\mathbf{w}\| \sin\theta$$

By the discussion in Example 1.8, we have proved the following theorem:

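Theorem 1.13. Area of triangles and parallelograms

(a) The area $A$ of a triangle with adjacent sides $\mathbf{v}$, $\mathbf{w}$ (as vectors in $\mathbb{R}^3$) is:

$$A = \tfrac{1}{2}\,\|\mathbf{v} \times \mathbf{w}\|$$

(b) The area $A$ of a parallelogram with adjacent sides $\mathbf{v}$, $\mathbf{w}$ (as vectors in $\mathbb{R}^3$) is:

$$A = \|\mathbf{v} \times \mathbf{w}\|$$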

It may seem at first glance that since the formulas derived in Example 1.8 were for the adjacent sides $QP$ and $QR$ only, then the more general statements in Theorem 1.13 that the formulas hold for any adjacent sides are not justified. We would get a different formula for the area if we had picked $PQ$ and $PR$ as the adjacent sides, but it can be shown (see Exercise 26) that the different formulas would yield the same value, so the choice of adjacent sides indeed does not matter, and Theorem 1.13 is valid.

Theorem 1.13 makes it simpler to calculate the area of a triangle in 3-dimensional space

than by using traditional geometric methods.

Example 1.9. Calculate the area of the triangle $\triangle PQR$, where $P = (2, 4, -7)$, $Q = (3, 7, 18)$, and $R = (-5, 12, 8)$.

Figure 1.4.4

Solution: Let $\mathbf{v} = \overrightarrow{PQ}$ and $\mathbf{w} = \overrightarrow{PR}$, as in Figure 1.4.4. Then $\mathbf{v} = (3, 7, 18) - (2, 4, -7) = (1, 3, 25)$ and $\mathbf{w} = (-5, 12, 8) - (2, 4, -7) = (-7, 8, 15)$, so the area $A$ of the triangle $\triangle PQR$ is

$$
\begin{aligned}
A &= \tfrac{1}{2}\,\|\mathbf{v} \times \mathbf{w}\| = \tfrac{1}{2}\,\|(1, 3, 25) \times (-7, 8, 15)\| \\
&= \tfrac{1}{2}\,\big\| ((3)(15) - (25)(8),\ (25)(-7) - (1)(15),\ (1)(8) - (3)(-7)) \big\| \\
&= \tfrac{1}{2}\,\big\| (-155, -190, 29) \big\| \\
&= \tfrac{1}{2} \sqrt{(-155)^2 + (-190)^2 + 29^2} = \tfrac{1}{2} \sqrt{60966} \\
A &\approx 123.46
\end{aligned}
$$
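Theorem 1.13(a) reduces the area of a triangle in space to a few arithmetic steps, which the following Python sketch reproduces for Example 1.9. The function `triangle_area` and the other helper names are hypothetical, introduced only for this illustration, and points are assumed to be given as 3-tuples.

```python
import math

def sub(a, b):
    """Vector from point b to point a, i.e. a - b."""
    return tuple(x - y for x, y in zip(a, b))

def cross(v, w):
    """Cross product in R^3, following formula (1.10)."""
    v1, v2, v3 = v
    w1, w2, w3 = w
    return (v2 * w3 - v3 * w2, v3 * w1 - v1 * w3, v1 * w2 - v2 * w1)

def norm(v):
    """Magnitude of v."""
    return math.sqrt(sum(a * a for a in v))

def triangle_area(p, q, r):
    """Area of triangle PQR as half the magnitude of PQ x PR (Theorem 1.13(a))."""
    return 0.5 * norm(cross(sub(q, p), sub(r, p)))

P = (2, 4, -7)
Q = (3, 7, 18)
R = (-5, 12, 8)

normal = cross(sub(Q, P), sub(R, P))  # PQ x PR
area = triangle_area(P, Q, R)
```

Starting from a different vertex, for example `triangle_area(Q, R, P)`, should give the same area, in line with the remark preceding the example.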

Example 1.10. Calculate the area of the parallelogram $PQRS$, where $P = (1, 1)$, $Q = (2, 3)$, $R = (5, 4)$, and $S = (4, 2)$.

Figure 1.4.5

Solution: Let $\mathbf{v} = \overrightarrow{SP}$ and $\mathbf{w} = \overrightarrow{SR}$, as in Figure 1.4.5. Then $\mathbf{v} = (1, 1) - (4, 2) = (-3, -1)$ and $\mathbf{w} = (5, 4) - (4, 2) = (1, 2)$. But these are vectors in $\mathbb{R}^2$, and the cross product is only defined for vectors in $\mathbb{R}^3$. However, $\mathbb{R}^2$ can be thought of as the subset of $\mathbb{R}^3$ such that the $z$-coordinate is always 0. So we can write $\mathbf{v} = (-3, -1, 0)$ and $\mathbf{w} = (1, 2, 0)$. Then the area $A$ of $PQRS$ is

$$
\begin{aligned}
A &= \|\mathbf{v} \times \mathbf{w}\| = \big\| (-3, -1, 0) \times (1, 2, 0) \big\| \\
&= \big\| ((-1)(0) - (0)(2),\ (0)(1) - (-3)(0),\ (-3)(2) - (-1)(1)) \big\| \\
&= \big\| (0, 0, -5) \big\| \\
A &= 5
\end{aligned}
$$
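The embedding of $\mathbb{R}^2$ into $\mathbb{R}^3$ used in Example 1.10 can be made explicit in code. In this Python sketch the helper `embed`, which appends a zero $z$-coordinate, is a hypothetical name chosen for illustration; the other helpers follow formula (1.10) and the usual magnitude formula.

```python
import math

def sub(a, b):
    """Vector from point b to point a, i.e. a - b."""
    return tuple(x - y for x, y in zip(a, b))

def embed(p):
    """Treat a vector (x, y) in R^2 as the vector (x, y, 0) in R^3."""
    x, y = p
    return (x, y, 0)

def cross(v, w):
    """Cross product in R^3, following formula (1.10)."""
    v1, v2, v3 = v
    w1, w2, w3 = w
    return (v2 * w3 - v3 * w2, v3 * w1 - v1 * w3, v1 * w2 - v2 * w1)

def norm(v):
    """Magnitude of v."""
    return math.sqrt(sum(a * a for a in v))

P, Q, R, S = (1, 1), (2, 3), (5, 4), (4, 2)

v = embed(sub(P, S))  # side SP, lifted to R^3
w = embed(sub(R, S))  # side SR, lifted to R^3
n = cross(v, w)       # only the z-component can be nonzero
area = norm(n)        # Theorem 1.13(b)
```

Because both lifted vectors lie in the $xy$-plane, the area is just the absolute value of the third component of the cross product.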

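The following theorem summarizes the basic properties of the cross product.

Theorem 1.14. For any vectors $\mathbf{u}$, $\mathbf{v}$, $\mathbf{w}$ in $\mathbb{R}^3$, and scalar $k$, we have

(a) $\mathbf{v} \times \mathbf{w} = -\mathbf{w} \times \mathbf{v}$   Anticommutative Law

(b) $\mathbf{u} \times (\mathbf{v} + \mathbf{w}) = \mathbf{u} \times \mathbf{v} + \mathbf{u} \times \mathbf{w}$   Distributive Law

(c) $(\mathbf{u} + \mathbf{v}) \times \mathbf{w} = \mathbf{u} \times \mathbf{w} + \mathbf{v} \times \mathbf{w}$   Distributive Law

(d) $(k\mathbf{v}) \times \mathbf{w} = \mathbf{v} \times (k\mathbf{w}) = k(\mathbf{v} \times \mathbf{w})$   Associative Law

(e) $\mathbf{v} \times \mathbf{0} = \mathbf{0} = \mathbf{0} \times \mathbf{v}$

(f) $\mathbf{v} \times \mathbf{v} = \mathbf{0}$

(g) $\mathbf{v} \times \mathbf{w} = \mathbf{0}$ if and only if $\mathbf{v} \parallel \mathbf{w}$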

Proof: The proofs of properties (b)-(f) are straightforward. We will prove parts (a) and (g)

and leave the rest to the reader as exercises.

(a) By the definition of the cross product and scalar multiplication, we have:

$$
\begin{aligned}
\mathbf{v} \times \mathbf{w} &= (v_2 w_3 - v_3 w_2,\ v_3 w_1 - v_1 w_3,\ v_1 w_2 - v_2 w_1) \\
&= -(v_3 w_2 - v_2 w_3,\ v_1 w_3 - v_3 w_1,\ v_2 w_1 - v_1 w_2) \\
&= -(w_2 v_3 - w_3 v_2,\ w_3 v_1 - w_1 v_3,\ w_1 v_2 - w_2 v_1) \\
&= -\mathbf{w} \times \mathbf{v}
\end{aligned}
$$

Note that this says that $\mathbf{v} \times \mathbf{w}$ and $\mathbf{w} \times \mathbf{v}$ have the same magnitude but opposite direction (see Figure 1.4.6).

Figure 1.4.6

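(g) If either $\mathbf{v}$ or $\mathbf{w}$ is $\mathbf{0}$ then $\mathbf{v} \times \mathbf{w} = \mathbf{0}$ by part (e), and either $\mathbf{v} = \mathbf{0} = 0\mathbf{w}$ or $\mathbf{w} = \mathbf{0} = 0\mathbf{v}$, so $\mathbf{v}$ and $\mathbf{w}$ are scalar multiples, i.e. they are parallel.

If both $\mathbf{v}$ and $\mathbf{w}$ are nonzero, and $\theta$ is the angle between them, then by formula (1.11), $\mathbf{v} \times \mathbf{w} = \mathbf{0}$ if and only if $\|\mathbf{v}\|\,\|\mathbf{w}\| \sin\theta = 0$, which is true if and only if $\sin\theta = 0$ (since $\|\mathbf{v}\| > 0$ and $\|\mathbf{w}\| > 0$). So since $0^\circ \le \theta \le 180^\circ$, then $\sin\theta = 0$ if and only if $\theta = 0^\circ$ or $180^\circ$. But the angle between $\mathbf{v}$ and $\mathbf{w}$ is $0^\circ$ or $180^\circ$ if and only if $\mathbf{v} \parallel \mathbf{w}$. QED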

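Example 1.11. Adding to Example 1.7, we have

$$
\begin{aligned}
\mathbf{i} \times \mathbf{j} &= \mathbf{k} & \mathbf{j} \times \mathbf{k} &= \mathbf{i} & \mathbf{k} \times \mathbf{i} &= \mathbf{j} \\
\mathbf{j} \times \mathbf{i} &= -\mathbf{k} & \mathbf{k} \times \mathbf{j} &= -\mathbf{i} & \mathbf{i} \times \mathbf{k} &= -\mathbf{j} \\
\mathbf{i} \times \mathbf{i} &= \mathbf{j} \times \mathbf{j} = \mathbf{k} \times \mathbf{k} = \mathbf{0}
\end{aligned}
$$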

Recall from geometry that a parallelepiped is a 3-dimensional solid with 6 faces, all of which are parallelograms.$^6$

$^6$ An equivalent definition of a parallelepiped is: the collection of all scalar combinations $k_1\mathbf{v}_1 + k_2\mathbf{v}_2 + k_3\mathbf{v}_3$ of some vectors $\mathbf{v}_1$, $\mathbf{v}_2$, $\mathbf{v}_3$ in $\mathbb{R}^3$, where $0 \le k_1, k_2, k_3 \le 1$.

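Example 1.12. Volume of a parallelepiped: Let the vectors $\mathbf{u}$, $\mathbf{v}$, $\mathbf{w}$ in $\mathbb{R}^3$ represent adjacent sides of a parallelepiped $P$, with $\mathbf{u}$, $\mathbf{v}$, $\mathbf{w}$ forming a right-handed system, as in Figure 1.4.7. Show that the volume of $P$ is the scalar triple product $\mathbf{u} \cdot (\mathbf{v} \times \mathbf{w})$.

Figure 1.4.7 Parallelepiped $P$

Solution: Recall that the volume $\mathrm{vol}(P)$ of a parallelepiped $P$ is the area $A$ of the base parallelogram times the height $h$. By Theorem 1.13(b), the area $A$ of the base parallelogram is $\|\mathbf{v} \times \mathbf{w}\|$. And we can see that since $\mathbf{v} \times \mathbf{w}$ is perpendicular to the base parallelogram determined by $\mathbf{v}$ and $\mathbf{w}$, then the height $h$ is $\|\mathbf{u}\| \cos\theta$, where $\theta$ is the angle between $\mathbf{u}$ and $\mathbf{v} \times \mathbf{w}$. By Theorem 1.6 we know that

$$\cos\theta = \frac{\mathbf{u} \cdot (\mathbf{v} \times \mathbf{w})}{\|\mathbf{u}\|\,\|\mathbf{v} \times \mathbf{w}\|} \text{ . Hence,}$$

$$
\begin{aligned}
\mathrm{vol}(P) &= A\,h \\
&= \|\mathbf{v} \times \mathbf{w}\| \, \frac{\|\mathbf{u}\|\; \mathbf{u} \cdot (\mathbf{v} \times \mathbf{w})}{\|\mathbf{u}\|\,\|\mathbf{v} \times \mathbf{w}\|} \\
&= \mathbf{u} \cdot (\mathbf{v} \times \mathbf{w})
\end{aligned}
$$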

In Example 1.12 the height $h$ of the parallelepiped is $\|\mathbf{u}\| \cos\theta$, and not $-\|\mathbf{u}\| \cos\theta$, because the vector $\mathbf{u}$ is on the same side of the base parallelogram's plane as the vector $\mathbf{v} \times \mathbf{w}$ (so that $\cos\theta > 0$). Since the volume is the same no matter which base and height we use, then repeating the same steps using the base determined by $\mathbf{u}$ and $\mathbf{v}$ (since $\mathbf{w}$ is on the same side of that base's plane as $\mathbf{u} \times \mathbf{v}$), the volume is $\mathbf{w} \cdot (\mathbf{u} \times \mathbf{v})$. Repeating this with the base determined by $\mathbf{w}$ and $\mathbf{u}$, we have the following result:

For any vectors $\mathbf{u}$, $\mathbf{v}$, $\mathbf{w}$ in $\mathbb{R}^3$,

$$\mathbf{u} \cdot (\mathbf{v} \times \mathbf{w}) = \mathbf{w} \cdot (\mathbf{u} \times \mathbf{v}) = \mathbf{v} \cdot (\mathbf{w} \times \mathbf{u}) \tag{1.12}$$

(Note that the equalities hold trivially if any of the vectors are $\mathbf{0}$.)

Since $\mathbf{v} \times \mathbf{w} = -\mathbf{w} \times \mathbf{v}$ for any vectors $\mathbf{v}$, $\mathbf{w}$ in $\mathbb{R}^3$, then picking the wrong order for the three adjacent sides in the scalar triple product in formula (1.12) will give you the negative of the volume of the parallelepiped. So taking the absolute value of the scalar triple product for any order of the three adjacent sides will always give the volume:

Theorem 1.15. If vectors $\mathbf{u}$, $\mathbf{v}$, $\mathbf{w}$ in $\mathbb{R}^3$ represent any three adjacent sides of a parallelepiped, then the volume of the parallelepiped is $|\mathbf{u} \cdot (\mathbf{v} \times \mathbf{w})|$.

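Another type of triple product is the vector triple product $\mathbf{u} \times (\mathbf{v} \times \mathbf{w})$. The proof of the following theorem is left as an exercise for the reader:

Theorem 1.16. For any vectors $\mathbf{u}$, $\mathbf{v}$, $\mathbf{w}$ in $\mathbb{R}^3$,

$$\mathbf{u} \times (\mathbf{v} \times \mathbf{w}) = (\mathbf{u} \cdot \mathbf{w})\mathbf{v} - (\mathbf{u} \cdot \mathbf{v})\mathbf{w} \tag{1.13}$$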

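An examination of the formula in Theorem 1.16 gives some idea of the geometry of the vector triple product. By the right side of formula (1.13), we see that $\mathbf{u} \times (\mathbf{v} \times \mathbf{w})$ is a scalar combination of $\mathbf{v}$ and $\mathbf{w}$, and hence lies in the plane containing $\mathbf{v}$ and $\mathbf{w}$ (i.e. $\mathbf{u} \times (\mathbf{v} \times \mathbf{w})$, $\mathbf{v}$ and $\mathbf{w}$ are coplanar). This makes sense since, by Theorem 1.11, $\mathbf{u} \times (\mathbf{v} \times \mathbf{w})$ is perpendicular to both $\mathbf{u}$ and $\mathbf{v} \times \mathbf{w}$. In particular, being perpendicular to $\mathbf{v} \times \mathbf{w}$ means that $\mathbf{u} \times (\mathbf{v} \times \mathbf{w})$ lies in the plane containing $\mathbf{v}$ and $\mathbf{w}$, since that plane is itself perpendicular to $\mathbf{v} \times \mathbf{w}$. But then how is $\mathbf{u} \times (\mathbf{v} \times \mathbf{w})$ also perpendicular to $\mathbf{u}$, which could be any vector? The following example may help to see how this works.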

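Example 1.13. Find $\mathbf{u} \times (\mathbf{v} \times \mathbf{w})$ for $\mathbf{u} = (1, 2, 4)$, $\mathbf{v} = (2, 2, 0)$, $\mathbf{w} = (1, 3, 0)$.

Solution: Since $\mathbf{u} \cdot \mathbf{v} = 6$ and $\mathbf{u} \cdot \mathbf{w} = 7$, then

$$
\begin{aligned}
\mathbf{u} \times (\mathbf{v} \times \mathbf{w}) &= (\mathbf{u} \cdot \mathbf{w})\mathbf{v} - (\mathbf{u} \cdot \mathbf{v})\mathbf{w} \\
&= 7\,(2, 2, 0) - 6\,(1, 3, 0) = (14, 14, 0) - (6, 18, 0) \\
&= (8, -4, 0)
\end{aligned}
$$

Note that $\mathbf{v}$ and $\mathbf{w}$ lie in the $xy$-plane, and that $\mathbf{u} \times (\mathbf{v} \times \mathbf{w})$ also lies in that plane. Also, $\mathbf{u} \times (\mathbf{v} \times \mathbf{w})$ is perpendicular to both $\mathbf{u}$ and $\mathbf{v} \times \mathbf{w} = (0, 0, 4)$ (see Figure 1.4.8).

Figure 1.4.8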
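Example 1.13 can be cross-checked by computing the vector triple product two ways: directly from nested cross products, and through formula (1.13). The Python sketch below does both; the helper names are assumptions made for this illustration.

```python
def dot(v, w):
    """Dot product of two vectors of equal length."""
    return sum(a * b for a, b in zip(v, w))

def cross(v, w):
    """Cross product in R^3, following formula (1.10)."""
    v1, v2, v3 = v
    w1, w2, w3 = w
    return (v2 * w3 - v3 * w2, v3 * w1 - v1 * w3, v1 * w2 - v2 * w1)

def scale(k, v):
    """Scalar multiple k*v."""
    return tuple(k * a for a in v)

def sub(v, w):
    """Difference v - w."""
    return tuple(a - b for a, b in zip(v, w))

u = (1, 2, 4)
v = (2, 2, 0)
w = (1, 3, 0)

direct = cross(u, cross(v, w))                            # nested cross products
by_formula = sub(scale(dot(u, w), v), scale(dot(u, v), w))  # right side of (1.13)
```

Both routes use only integer arithmetic here, so the two results can be compared exactly.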

For vectors $\mathbf{v} = v_1\mathbf{i} + v_2\mathbf{j} + v_3\mathbf{k}$ and $\mathbf{w} = w_1\mathbf{i} + w_2\mathbf{j} + w_3\mathbf{k}$ in component form, the cross product is written as: $\mathbf{v} \times \mathbf{w} = (v_2 w_3 - v_3 w_2)\mathbf{i} + (v_3 w_1 - v_1 w_3)\mathbf{j} + (v_1 w_2 - v_2 w_1)\mathbf{k}$. It is often easier to use the component form for the cross product, because it can be represented as a determinant. We will not go too deeply into the theory of determinants$^7$; we will just cover what is essential for our purposes.

$^7$ See ANTON and RORRES for a fuller development.

A $2 \times 2$ matrix is an array of two rows and two columns of scalars, written as

$$\begin{bmatrix} a & b \\ c & d \end{bmatrix} \quad \text{or} \quad \begin{pmatrix} a & b \\ c & d \end{pmatrix}$$

where $a$, $b$, $c$, $d$ are scalars. The determinant of such a matrix, written as

$$\begin{vmatrix} a & b \\ c & d \end{vmatrix} \quad \text{or} \quad \det \begin{bmatrix} a & b \\ c & d \end{bmatrix},$$

is the scalar defined by the following formula:

$$\begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc$$

It may help to remember this formula as being the product of the scalars on the downward diagonal minus the product of the scalars on the upward diagonal.

Example 1.14.

$$\begin{vmatrix} 1 & 2 \\ 3 & 4 \end{vmatrix} = (1)(4) - (2)(3) = 4 - 6 = -2$$

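A $3 \times 3$ matrix is an array of three rows and three columns of scalars, written as

$$\begin{bmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{bmatrix} \quad \text{or} \quad \begin{pmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{pmatrix},$$

and its determinant is given by the formula:

$$\begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix} = a_1 \begin{vmatrix} b_2 & b_3 \\ c_2 & c_3 \end{vmatrix} - a_2 \begin{vmatrix} b_1 & b_3 \\ c_1 & c_3 \end{vmatrix} + a_3 \begin{vmatrix} b_1 & b_2 \\ c_1 & c_2 \end{vmatrix} \tag{1.14}$$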

One way to remember the above formula is the following: multiply each scalar in the ﬁrst

row by the determinant of the 2×2 matrix that remains after removing the row and column

that contain that scalar, then sum those products up, putting alternating plus and minus

signs in front of each (starting with a plus).

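Example 1.15.

$$\begin{vmatrix} 1 & 0 & 2 \\ 4 & -1 & 3 \\ 1 & 0 & 2 \end{vmatrix} = 1 \begin{vmatrix} -1 & 3 \\ 0 & 2 \end{vmatrix} - 0 \begin{vmatrix} 4 & 3 \\ 1 & 2 \end{vmatrix} + 2 \begin{vmatrix} 4 & -1 \\ 1 & 0 \end{vmatrix} = 1(-2 - 0) - 0(8 - 3) + 2(0 + 1) = 0$$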
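The $2 \times 2$ formula and the first-row expansion (1.14) can be written as two small functions. The following is a minimal Python sketch, assuming a matrix is given as a tuple of row tuples; the names `det2` and `det3` are illustrative and not from the text.

```python
def det2(m):
    """Determinant of a 2x2 matrix ((a, b), (c, d)): ad - bc."""
    (a, b), (c, d) = m
    return a * d - b * c

def det3(m):
    """Determinant of a 3x3 matrix by expansion along the first row, as in (1.14)."""
    (a1, a2, a3), (b1, b2, b3), (c1, c2, c3) = m
    return (a1 * det2(((b2, b3), (c2, c3)))
            - a2 * det2(((b1, b3), (c1, c3)))
            + a3 * det2(((b1, b2), (c1, c2))))

example_1_14 = det2(((1, 2), (3, 4)))
example_1_15 = det3(((1, 0, 2), (4, -1, 3), (1, 0, 2)))
```

In Example 1.15 the first and third rows are equal, which is consistent with the determinant coming out to zero.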

We defined the determinant as a scalar, derived from algebraic operations on scalar entries in a matrix. However, if we put three vectors in the first row of a $3 \times 3$ matrix, then the definition still makes sense, since we would be performing scalar multiplication on those three vectors (they would be multiplied by the $2 \times 2$ scalar determinants as before). This gives us a determinant that is now a vector, and lets us write the cross product of $\mathbf{v} = v_1\mathbf{i} + v_2\mathbf{j} + v_3\mathbf{k}$ and $\mathbf{w} = w_1\mathbf{i} + w_2\mathbf{j} + w_3\mathbf{k}$ as a determinant:

$$
\begin{aligned}
\mathbf{v} \times \mathbf{w} &= \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ v_1 & v_2 & v_3 \\ w_1 & w_2 & w_3 \end{vmatrix} = \begin{vmatrix} v_2 & v_3 \\ w_2 & w_3 \end{vmatrix} \mathbf{i} - \begin{vmatrix} v_1 & v_3 \\ w_1 & w_3 \end{vmatrix} \mathbf{j} + \begin{vmatrix} v_1 & v_2 \\ w_1 & w_2 \end{vmatrix} \mathbf{k} \\
&= (v_2 w_3 - v_3 w_2)\mathbf{i} + (v_3 w_1 - v_1 w_3)\mathbf{j} + (v_1 w_2 - v_2 w_1)\mathbf{k}
\end{aligned}
$$

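Example 1.16. Let $\mathbf{v} = 4\mathbf{i} - \mathbf{j} + 3\mathbf{k}$ and $\mathbf{w} = \mathbf{i} + 2\mathbf{k}$. Then

$$\mathbf{v} \times \mathbf{w} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ 4 & -1 & 3 \\ 1 & 0 & 2 \end{vmatrix} = \begin{vmatrix} -1 & 3 \\ 0 & 2 \end{vmatrix} \mathbf{i} - \begin{vmatrix} 4 & 3 \\ 1 & 2 \end{vmatrix} \mathbf{j} + \begin{vmatrix} 4 & -1 \\ 1 & 0 \end{vmatrix} \mathbf{k} = -2\mathbf{i} - 5\mathbf{j} + \mathbf{k}$$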

The scalar triple product can also be written as a determinant. In fact, by Example 1.12, the following theorem provides an alternate definition of the determinant of a $3 \times 3$ matrix as the volume of a parallelepiped whose adjacent sides are the rows of the matrix and form a right-handed system (a left-handed system would give the negative volume).

Theorem 1.17. For any vectors $\mathbf{u} = (u_1, u_2, u_3)$, $\mathbf{v} = (v_1, v_2, v_3)$, $\mathbf{w} = (w_1, w_2, w_3)$ in $\mathbb{R}^3$:

$$\mathbf{u} \cdot (\mathbf{v} \times \mathbf{w}) = \begin{vmatrix} u_1 & u_2 & u_3 \\ v_1 & v_2 & v_3 \\ w_1 & w_2 & w_3 \end{vmatrix} \tag{1.15}$$

Example 1.17. Find the volume of the parallelepiped with adjacent sides $\mathbf{u} = (2, 1, 3)$, $\mathbf{v} = (-1, 3, 2)$, $\mathbf{w} = (1, 1, -2)$ (see Figure 1.4.9).

Figure 1.4.9 $P$

Solution: By Theorem 1.15, the volume $\mathrm{vol}(P)$ of the parallelepiped $P$ is the absolute value of the scalar triple product of the three adjacent sides (in any order). By Theorem 1.17,

$$
\begin{aligned}
\mathbf{u} \cdot (\mathbf{v} \times \mathbf{w}) &= \begin{vmatrix} 2 & 1 & 3 \\ -1 & 3 & 2 \\ 1 & 1 & -2 \end{vmatrix} = 2 \begin{vmatrix} 3 & 2 \\ 1 & -2 \end{vmatrix} - 1 \begin{vmatrix} -1 & 2 \\ 1 & -2 \end{vmatrix} + 3 \begin{vmatrix} -1 & 3 \\ 1 & 1 \end{vmatrix} \\
&= 2(-8) - 1(0) + 3(-4) = -28 \text{ , so}
\end{aligned}
$$

$$\mathrm{vol}(P) = |-28| = 28.$$
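The volume computation in Example 1.17, together with the cyclic identity (1.12), can be reproduced with the scalar triple product written out directly. The Python sketch below assumes tuple vectors; the helper name `triple` is hypothetical and introduced only for this illustration.

```python
def dot(v, w):
    """Dot product of two vectors of equal length."""
    return sum(a * b for a, b in zip(v, w))

def cross(v, w):
    """Cross product in R^3, following formula (1.10)."""
    v1, v2, v3 = v
    w1, w2, w3 = w
    return (v2 * w3 - v3 * w2, v3 * w1 - v1 * w3, v1 * w2 - v2 * w1)

def triple(u, v, w):
    """Scalar triple product u . (v x w)."""
    return dot(u, cross(v, w))

u = (2, 1, 3)
v = (-1, 3, 2)
w = (1, 1, -2)

signed = triple(u, v, w)   # sign depends on the handedness of (u, v, w)
volume = abs(signed)       # Theorem 1.15
```

Swapping two of the arguments, for example `triple(u, w, v)`, flips the sign but leaves the absolute value, and hence the volume, unchanged.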

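Interchanging the dot and cross products can be useful in proving vector identities:

Example 1.18. Prove: $(\mathbf{u} \times \mathbf{v}) \cdot (\mathbf{w} \times \mathbf{z}) = \begin{vmatrix} \mathbf{u} \cdot \mathbf{w} & \mathbf{u} \cdot \mathbf{z} \\ \mathbf{v} \cdot \mathbf{w} & \mathbf{v} \cdot \mathbf{z} \end{vmatrix}$ for all vectors $\mathbf{u}$, $\mathbf{v}$, $\mathbf{w}$, $\mathbf{z}$ in $\mathbb{R}^3$.

Solution: Let $\mathbf{x} = \mathbf{u} \times \mathbf{v}$. Then

$$
\begin{aligned}
(\mathbf{u} \times \mathbf{v}) \cdot (\mathbf{w} \times \mathbf{z}) &= \mathbf{x} \cdot (\mathbf{w} \times \mathbf{z}) \\
&= \mathbf{w} \cdot (\mathbf{z} \times \mathbf{x}) && \text{(by formula (1.12))} \\
&= \mathbf{w} \cdot (\mathbf{z} \times (\mathbf{u} \times \mathbf{v})) \\
&= \mathbf{w} \cdot ((\mathbf{z} \cdot \mathbf{v})\mathbf{u} - (\mathbf{z} \cdot \mathbf{u})\mathbf{v}) && \text{(by Theorem 1.16)} \\
&= (\mathbf{z} \cdot \mathbf{v})(\mathbf{w} \cdot \mathbf{u}) - (\mathbf{z} \cdot \mathbf{u})(\mathbf{w} \cdot \mathbf{v}) \\
&= (\mathbf{u} \cdot \mathbf{w})(\mathbf{v} \cdot \mathbf{z}) - (\mathbf{u} \cdot \mathbf{z})(\mathbf{v} \cdot \mathbf{w}) && \text{(by commutativity of the dot product)} \\
&= \begin{vmatrix} \mathbf{u} \cdot \mathbf{w} & \mathbf{u} \cdot \mathbf{z} \\ \mathbf{v} \cdot \mathbf{w} & \mathbf{v} \cdot \mathbf{z} \end{vmatrix}
\end{aligned}
$$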
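The identity of Example 1.18 can be spot-checked on concrete vectors. In the Python sketch below the four integer vectors are arbitrary choices made for illustration (they are not taken from the exercises), and agreement on one set of vectors is a sanity check, not a substitute for the proof above.

```python
def dot(v, w):
    """Dot product of two vectors of equal length."""
    return sum(a * b for a, b in zip(v, w))

def cross(v, w):
    """Cross product in R^3, following formula (1.10)."""
    v1, v2, v3 = v
    w1, w2, w3 = w
    return (v2 * w3 - v3 * w2, v3 * w1 - v1 * w3, v1 * w2 - v2 * w1)

# Arbitrary test vectors (an assumption of this sketch)
u = (1, 2, 4)
v = (2, 2, 0)
w = (1, 3, 0)
z = (0, 1, 1)

# Left side: (u x v) . (w x z)
lhs = dot(cross(u, v), cross(w, z))

# Right side: the 2x2 determinant of dot products, expanded as ad - bc
rhs = dot(u, w) * dot(v, z) - dot(u, z) * dot(v, w)
```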

Exercises

A

For Exercises 1-6, calculate v×××w.

1. v (5, 1, −2), w(4, −4, 3) 2. v (7, 2, −10), w(2, 6, 4)

3. v (2, 1, 4), w(1, −2, 0) 4. v (1, 3, 2), w(7, 2, −10)

5. v −i +2j +k, w−3i +6j +3k 6. v i, w3i +2j +4k

For Exercises 7-8, calculate the area of the triangle △PQR.

7. P (5, 1, −2), Q (4, −4, 3), R (2, 4, 0) 8. P (4, 0, 2), Q (2, 1, 5), R (−1, 0, −1)

For Exercises 9-10, calculate the area of the parallelogram PQRS.

9. P (2, 1, 3), Q (1, 4, 5), R (2, 5, 3), S (3, 2, 1)

10. P (−2, −2), Q (1, 4), R (6, 6), S (3, 0)

For Exercises 11-12, ﬁnd the volume of the parallelepiped with adjacent sides u, v, w.

11. u(1, 1, 3), v (2, 1, 4), w(5, 1, −2) 12. u(1, 3, 2), v (7, 2, −10), w(1, 0, 1)

For Exercises 13-14, calculate u··· (v×××w) and u×××(v×××w).

13. u(1, 1, 1), v (3, 0, 2), w(2, 2, 2) 14. u(1, 0, 2), v (−1, 0, 3), w(2, 0, −2)

15. Calculate (u×××v) ··· (w×××z) for u(1, 1, 1), v (3, 0, 2), w(2, 2, 2), z (2, 1, 4).

30 CHAPTER 1. VECTORS IN EUCLIDEAN SPACE

B

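16. If $\mathbf{v}$ and $\mathbf{w}$ are unit vectors in $\mathbb{R}^3$, under what condition(s) would $\mathbf{v} \times \mathbf{w}$ also be a unit vector in $\mathbb{R}^3$? Justify your answer.

17. Show that if $\mathbf{v} \times \mathbf{w} = \mathbf{0}$ for all $\mathbf{w}$ in $\mathbb{R}^3$, then $\mathbf{v} = \mathbf{0}$.

18. Prove Theorem 1.14(b).

19. Prove Theorem 1.14(c).

20. Prove Theorem 1.14(d).

21. Prove Theorem 1.14(e).

22. Prove Theorem 1.14(f).

23. Prove Theorem 1.16.

24. Prove Theorem 1.17. (Hint: Expand both sides of the equation.)

25. Prove the following for all vectors $\mathbf{v}$, $\mathbf{w}$ in $\mathbb{R}^3$:

(a) $\|\mathbf{v} \times \mathbf{w}\|^2 + |\mathbf{v} \cdot \mathbf{w}|^2 = \|\mathbf{v}\|^2 \|\mathbf{w}\|^2$

(b) If $\mathbf{v} \cdot \mathbf{w} = 0$ and $\mathbf{v} \times \mathbf{w} = \mathbf{0}$, then $\mathbf{v} = \mathbf{0}$ or $\mathbf{w} = \mathbf{0}$.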

C

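26. Prove that in Example 1.8 the formula for the area of the triangle $\triangle PQR$ yields the same value no matter which two adjacent sides are chosen. To do this, show that $\frac{1}{2}\|\mathbf{u} \times (-\mathbf{w})\| = \frac{1}{2}\|\mathbf{v} \times \mathbf{w}\|$, where $\mathbf{u} = \overrightarrow{PR}$, $-\mathbf{w} = \overrightarrow{PQ}$, and $\mathbf{v} = \overrightarrow{QR}$, $\mathbf{w} = \overrightarrow{QP}$ as before. Similarly, show that $\frac{1}{2}\|(-\mathbf{u}) \times (-\mathbf{v})\| = \frac{1}{2}\|\mathbf{v} \times \mathbf{w}\|$, where $-\mathbf{u} = \overrightarrow{RP}$ and $-\mathbf{v} = \overrightarrow{RQ}$.

27. Consider the vector equation $\mathbf{a} \times \mathbf{x} = \mathbf{b}$ in $\mathbb{R}^3$, where $\mathbf{a} \ne \mathbf{0}$. Show that:

(a) $\mathbf{a} \cdot \mathbf{b} = 0$

(b) $\mathbf{x} = \dfrac{\mathbf{b} \times \mathbf{a}}{\|\mathbf{a}\|^2} + k\mathbf{a}$ is a solution to the equation, for any scalar $k$

28. Prove the Jacobi identity: $\mathbf{u} \times (\mathbf{v} \times \mathbf{w}) + \mathbf{v} \times (\mathbf{w} \times \mathbf{u}) + \mathbf{w} \times (\mathbf{u} \times \mathbf{v}) = \mathbf{0}$

29. Show that $\mathbf{u}$, $\mathbf{v}$, $\mathbf{w}$ lie in the same plane in $\mathbb{R}^3$ if and only if $\mathbf{u} \cdot (\mathbf{v} \times \mathbf{w}) = 0$.

30. For all vectors $\mathbf{u}$, $\mathbf{v}$, $\mathbf{w}$, $\mathbf{z}$ in $\mathbb{R}^3$, show that

$$
(\mathbf{u} \times \mathbf{v}) \times (\mathbf{w} \times \mathbf{z}) = (\mathbf{z} \cdot (\mathbf{u} \times \mathbf{v}))\mathbf{w} - (\mathbf{w} \cdot (\mathbf{u} \times \mathbf{v}))\mathbf{z}
$$

and that

$$
(\mathbf{u} \times \mathbf{v}) \times (\mathbf{w} \times \mathbf{z}) = (\mathbf{u} \cdot (\mathbf{w} \times \mathbf{z}))\mathbf{v} - (\mathbf{v} \cdot (\mathbf{w} \times \mathbf{z}))\mathbf{u}
$$

Why do both equations make sense geometrically?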


1.5 Lines and Planes

Now that we know how to perform some operations on vectors, we can start to deal with some familiar geometric objects, like lines and planes, in the language of vectors. The reason for doing this is simple: using vectors makes it easier to study objects in 3-dimensional Euclidean space. We will first consider lines.

Line through a point, parallel to a vector

Let $P = (x_0, y_0, z_0)$ be a point in $\mathbb{R}^3$, let $\mathbf{v} = (a, b, c)$ be a nonzero vector, and let $L$ be the line through $P$ which is parallel to $\mathbf{v}$ (see Figure 1.5.1).

[Figure 1.5.1: the line $L$ through $P(x_0, y_0, z_0)$, with the vectors $\mathbf{r}$, $\mathbf{v}$, $t\mathbf{v}$ and $\mathbf{r} + t\mathbf{v}$ shown for $t > 0$ and $t < 0$]

Let $\mathbf{r} = (x_0, y_0, z_0)$ be the vector pointing from the origin to $P$. Since multiplying the vector $\mathbf{v}$ by a scalar $t$ lengthens or shrinks $\mathbf{v}$ while preserving its direction if $t > 0$, and reversing its direction if $t < 0$, then we see from Figure 1.5.1 that every point on the line $L$ can be obtained by adding the vector $t\mathbf{v}$ to the vector $\mathbf{r}$ for some scalar $t$. That is, as $t$ varies over all real numbers, the vector $\mathbf{r} + t\mathbf{v}$ will point to every point on $L$. We can summarize the vector representation of $L$ as follows:

For a point $P = (x_0, y_0, z_0)$ and nonzero vector $\mathbf{v}$ in $\mathbb{R}^3$, the line $L$ through $P$ parallel to $\mathbf{v}$ is given by

$$
\mathbf{r} + t\mathbf{v}, \quad \text{for } -\infty < t < \infty \tag{1.16}
$$

where $\mathbf{r} = (x_0, y_0, z_0)$ is the vector pointing to $P$.

Note that we used the correspondence between a vector and its terminal point. Since $\mathbf{v} = (a, b, c)$, then the terminal point of the vector $\mathbf{r} + t\mathbf{v}$ is $(x_0 + at, y_0 + bt, z_0 + ct)$. We then get the parametric representation of $L$ with the parameter $t$:

For a point $P = (x_0, y_0, z_0)$ and nonzero vector $\mathbf{v} = (a, b, c)$ in $\mathbb{R}^3$, the line $L$ through $P$ parallel to $\mathbf{v}$ consists of all points $(x, y, z)$ given by

$$
x = x_0 + at, \quad y = y_0 + bt, \quad z = z_0 + ct, \quad \text{for } -\infty < t < \infty \tag{1.17}
$$

Note that in both representations we get the point $P$ on $L$ by letting $t = 0$.


In formula (1.17), if $a \ne 0$, then we can solve for the parameter $t$: $t = (x - x_0)/a$. We can also solve for $t$ in terms of $y$ and in terms of $z$ if neither $b$ nor $c$, respectively, is zero: $t = (y - y_0)/b$ and $t = (z - z_0)/c$. These three values all equal the same value $t$, so we can write the following system of equalities, called the symmetric representation of $L$:

For a point $P = (x_0, y_0, z_0)$ and vector $\mathbf{v} = (a, b, c)$ in $\mathbb{R}^3$ with $a$, $b$ and $c$ all nonzero, the line $L$ through $P$ parallel to $\mathbf{v}$ consists of all points $(x, y, z)$ given by the equations

$$
\frac{x - x_0}{a} = \frac{y - y_0}{b} = \frac{z - z_0}{c} \tag{1.18}
$$

[Figure 1.5.2: a line $L$ lying in the plane $x = x_0$]

What if, say, $a = 0$ in the above scenario? We cannot divide by zero, but we do know that $x = x_0 + at$, and so $x = x_0 + 0t = x_0$. Then the symmetric representation of $L$ would be:

$$
x = x_0, \quad \frac{y - y_0}{b} = \frac{z - z_0}{c} \tag{1.19}
$$

Note that this says that the line $L$ lies in the plane $x = x_0$, which is parallel to the $yz$-plane (see Figure 1.5.2). Similar equations can be derived for the cases when $b = 0$ or $c = 0$.

You may have noticed that the vector representation of L in formula (1.16) is more compact

than the parametric and symmetric formulas. That is an advantage of using vector notation.

Technically, though, the vector representation gives us the vectors whose terminal points

make up the line L, not just L itself. So you have to remember to identify the vectors r+tv

with their terminal points. On the other hand, the parametric representation always gives

just the points on L and nothing else.

Example 1.19. Write the line $L$ through the point $P = (2, 3, 5)$ and parallel to the vector $\mathbf{v} = (4, -1, 6)$, in the following forms: (a) vector, (b) parametric, (c) symmetric. Lastly: (d) find two points on $L$ distinct from $P$.

Solution: (a) Let $\mathbf{r} = (2, 3, 5)$. Then by formula (1.16), $L$ is given by:

$$
\mathbf{r} + t\mathbf{v} = (2, 3, 5) + t(4, -1, 6), \quad \text{for } -\infty < t < \infty
$$

(b) $L$ consists of the points $(x, y, z)$ such that

$$
x = 2 + 4t, \quad y = 3 - t, \quad z = 5 + 6t, \quad \text{for } -\infty < t < \infty
$$

(c) $L$ consists of the points $(x, y, z)$ such that

$$
\frac{x - 2}{4} = \frac{y - 3}{-1} = \frac{z - 5}{6}
$$

(d) Letting $t = 1$ and $t = 2$ in part (b) yields the points $(6, 2, 11)$ and $(10, 1, 17)$ on $L$.
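The parametric and symmetric forms of Example 1.19 translate directly into code. The sketch below is illustrative only: the names `point_on_line` and `symmetric_ratios` are assumptions, and exact rational arithmetic (`fractions.Fraction`) is used so that the equality test in the symmetric form is not affected by rounding.

```python
from fractions import Fraction

def point_on_line(r, v, t):
    """Terminal point of r + t*v: the point of L at parameter value t."""
    return tuple(ri + t * vi for ri, vi in zip(r, v))

def symmetric_ratios(p, r, v):
    """The three quotients in formula (1.18); every component of v must be nonzero."""
    return tuple(Fraction(pi - ri, vi) for pi, ri, vi in zip(p, r, v))

r, v = (2, 3, 5), (4, -1, 6)
points = [point_on_line(r, v, t) for t in (1, 2)]
print(points)
# A point lies on L exactly when its three quotients agree (they all equal t).
print([symmetric_ratios(p, r, v) for p in points])
```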


Line through two points

[Figure 1.5.3: the line $L$ through $P_1(x_1, y_1, z_1)$ and $P_2(x_2, y_2, z_2)$, with the vectors $\mathbf{r}_1$, $\mathbf{r}_2$, $\mathbf{r}_2 - \mathbf{r}_1$ and $\mathbf{r}_1 + t(\mathbf{r}_2 - \mathbf{r}_1)$]

Let $P_1 = (x_1, y_1, z_1)$ and $P_2 = (x_2, y_2, z_2)$ be distinct points in $\mathbb{R}^3$, and let $L$ be the line through $P_1$ and $P_2$. Let $\mathbf{r}_1 = (x_1, y_1, z_1)$ and $\mathbf{r}_2 = (x_2, y_2, z_2)$ be the vectors pointing to $P_1$ and $P_2$, respectively. Then as we can see from Figure 1.5.3, $\mathbf{r}_2 - \mathbf{r}_1$ is the vector from $P_1$ to $P_2$. So if we multiply the vector $\mathbf{r}_2 - \mathbf{r}_1$ by a scalar $t$ and add it to the vector $\mathbf{r}_1$, we will get the entire line $L$ as $t$ varies over all real numbers. The following is a summary of the vector, parametric, and symmetric forms for the line $L$:

Let $P_1 = (x_1, y_1, z_1)$, $P_2 = (x_2, y_2, z_2)$ be distinct points in $\mathbb{R}^3$, and let $\mathbf{r}_1 = (x_1, y_1, z_1)$, $\mathbf{r}_2 = (x_2, y_2, z_2)$. Then the line $L$ through $P_1$ and $P_2$ has the following representations:

Vector:

$$
\mathbf{r}_1 + t(\mathbf{r}_2 - \mathbf{r}_1), \quad \text{for } -\infty < t < \infty \tag{1.20}
$$

Parametric:

$$
x = x_1 + (x_2 - x_1)t, \quad y = y_1 + (y_2 - y_1)t, \quad z = z_1 + (z_2 - z_1)t, \quad \text{for } -\infty < t < \infty \tag{1.21}
$$

Symmetric:

$$
\frac{x - x_1}{x_2 - x_1} = \frac{y - y_1}{y_2 - y_1} = \frac{z - z_1}{z_2 - z_1} \quad (\text{if } x_1 \ne x_2,\ y_1 \ne y_2, \text{ and } z_1 \ne z_2) \tag{1.22}
$$

Example 1.20. Write the line $L$ through the points $P_1 = (-3, 1, -4)$ and $P_2 = (4, 4, -6)$ in parametric form.

Solution: By formula (1.21), $L$ consists of the points $(x, y, z)$ such that

$$
x = -3 + 7t, \quad y = 1 + 3t, \quad z = -4 - 2t, \quad \text{for } -\infty < t < \infty
$$
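Formula (1.21) amounts to taking $\mathbf{r}_1$ as the base point and $\mathbf{r}_2 - \mathbf{r}_1$ as the direction. A minimal sketch, with hypothetical function names `line_through` and `point_at`:

```python
def line_through(p1, p2):
    """Return (r, v) so that r + t*v is the line through distinct points p1 and p2."""
    if p1 == p2:
        raise ValueError("the two points must be distinct")
    v = tuple(b - a for a, b in zip(p1, p2))  # direction vector r2 - r1
    return p1, v

def point_at(r, v, t):
    """Point of the line r + t*v at parameter value t."""
    return tuple(ri + t * vi for ri, vi in zip(r, v))

r, v = line_through((-3, 1, -4), (4, 4, -6))
print(r, v)
print(point_at(r, v, 0), point_at(r, v, 1))
```

With this parametrization, $t = 0$ gives $P_1$ and $t = 1$ gives $P_2$.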

Distance between a point and a line

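[Figure 1.5.4: the line $L$ with direction $\mathbf{v}$, a point $Q$ on $L$, the point $P$, the vector $\mathbf{w}$ from $Q$ to $P$, the angle $\theta$ and the distance $d$]

Let $L$ be a line in $\mathbb{R}^3$ in vector form as $\mathbf{r} + t\mathbf{v}$ (for $-\infty < t < \infty$), and let $P$ be a point not on $L$. The distance $d$ from $P$ to $L$ is the length of the line segment from $P$ to $L$ which is perpendicular to $L$ (see Figure 1.5.4). Pick a point $Q$ on $L$, and let $\mathbf{w}$ be the vector from $Q$ to $P$. If $\theta$ is the angle between $\mathbf{w}$ and $\mathbf{v}$, then $d = \|\mathbf{w}\| \sin\theta$. So since $\|\mathbf{v} \times \mathbf{w}\| = \|\mathbf{v}\| \|\mathbf{w}\| \sin\theta$ and $\mathbf{v} \ne \mathbf{0}$, then:

$$
d = \frac{\|\mathbf{v} \times \mathbf{w}\|}{\|\mathbf{v}\|} \tag{1.23}
$$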


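Example 1.21. Find the distance $d$ from the point $P = (1, 1, 1)$ to the line $L$ in Example 1.20.

Solution: From Example 1.20, we see that we can represent $L$ in vector form as $\mathbf{r} + t\mathbf{v}$, for $\mathbf{r} = (-3, 1, -4)$ and $\mathbf{v} = (7, 3, -2)$. Since the point $Q = (-3, 1, -4)$ is on $L$, then for $\mathbf{w} = \overrightarrow{QP} = (1, 1, 1) - (-3, 1, -4) = (4, 0, 5)$, we have:

$$
\mathbf{v} \times \mathbf{w} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ 7 & 3 & -2 \\ 4 & 0 & 5 \end{vmatrix} = \begin{vmatrix} 3 & -2 \\ 0 & 5 \end{vmatrix} \mathbf{i} - \begin{vmatrix} 7 & -2 \\ 4 & 5 \end{vmatrix} \mathbf{j} + \begin{vmatrix} 7 & 3 \\ 4 & 0 \end{vmatrix} \mathbf{k} = 15\mathbf{i} - 43\mathbf{j} - 12\mathbf{k},
$$

so

$$
d = \frac{\|\mathbf{v} \times \mathbf{w}\|}{\|\mathbf{v}\|} = \frac{\|15\mathbf{i} - 43\mathbf{j} - 12\mathbf{k}\|}{\|(7, 3, -2)\|} = \frac{\sqrt{15^2 + (-43)^2 + (-12)^2}}{\sqrt{7^2 + 3^2 + (-2)^2}} = \frac{\sqrt{2218}}{\sqrt{62}} \approx 5.98
$$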
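Formula (1.23) is short enough to code directly. The following sketch (with the assumed helper names `cross`, `norm` and `distance_point_line`) reproduces the value found in Example 1.21.

```python
import math

def cross(v, w):
    """Cross product v x w of two 3-vectors."""
    return (v[1] * w[2] - v[2] * w[1],
            v[2] * w[0] - v[0] * w[2],
            v[0] * w[1] - v[1] * w[0])

def norm(v):
    """Euclidean length of a vector."""
    return math.sqrt(sum(a * a for a in v))

def distance_point_line(p, q, v):
    """Distance from the point p to the line through q with direction v (formula (1.23))."""
    w = tuple(pi - qi for pi, qi in zip(p, q))  # vector from Q to P
    return norm(cross(v, w)) / norm(v)

d = distance_point_line((1, 1, 1), (-3, 1, -4), (7, 3, -2))
print(round(d, 2))
```

Any point $Q$ on the line may be used; choosing a different $Q$ changes $\mathbf{w}$ by a multiple of $\mathbf{v}$, which does not change $\mathbf{v} \times \mathbf{w}$.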

It is clear that two lines $L_1$ and $L_2$, represented in vector form as $\mathbf{r}_1 + s\mathbf{v}_1$ and $\mathbf{r}_2 + t\mathbf{v}_2$, respectively, are parallel (denoted as $L_1 \parallel L_2$) if $\mathbf{v}_1$ and $\mathbf{v}_2$ are parallel. Also, $L_1$ and $L_2$ are perpendicular (denoted as $L_1 \perp L_2$) if $\mathbf{v}_1$ and $\mathbf{v}_2$ are perpendicular.

[Figure 1.5.5: skew lines $L_1$ and $L_2$]

In 2-dimensional space, two lines are either identical, parallel, or they intersect. In 3-dimensional space, there is an additional possibility: two lines can be skew, that is, they do not intersect but they are not parallel. However, even though they are not parallel, skew lines are on parallel planes (see Figure 1.5.5).

To determine whether two lines in $\mathbb{R}^3$ intersect, it is often easier to use the parametric representation of the lines. In this case, you should use different parameter variables (usually $s$ and $t$) for the lines, since the values of the parameters may not be the same at the point of intersection. Setting the two $(x, y, z)$ triples equal will result in a system of 3 equations in 2 unknowns ($s$ and $t$).

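Example 1.22. Find the point of intersection (if any) of the following lines:

$$
\frac{x + 1}{3} = \frac{y - 2}{2} = \frac{z - 1}{-1} \quad \text{and} \quad x + 3 = \frac{y - 8}{-3} = \frac{z + 3}{2}
$$

Solution: First we write the lines in parametric form, with parameters $s$ and $t$:

$$
x = -1 + 3s, \quad y = 2 + 2s, \quad z = 1 - s \quad \text{and} \quad x = -3 + t, \quad y = 8 - 3t, \quad z = -3 + 2t
$$

The lines intersect when $(-1 + 3s, 2 + 2s, 1 - s) = (-3 + t, 8 - 3t, -3 + 2t)$ for some $s$, $t$:

$$
\begin{aligned}
-1 + 3s = -3 + t &: \ \Rightarrow t = 2 + 3s \\
2 + 2s = 8 - 3t &: \ \Rightarrow 2 + 2s = 8 - 3(2 + 3s) = 2 - 9s \Rightarrow 2s = -9s \Rightarrow s = 0 \Rightarrow t = 2 + 3(0) = 2 \\
1 - s = -3 + 2t &: \ 1 - 0 = -3 + 2(2) \Rightarrow 1 = 1 \quad \text{(Note that we had to check this.)}
\end{aligned}
$$

Letting $s = 0$ in the equations for the first line, or letting $t = 2$ in the equations for the second line, gives the point of intersection $(-1, 2, 1)$.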
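The procedure in Example 1.22 (solve two of the coordinate equations for $s$ and $t$, then check the third) can be sketched as follows. This is an illustration under stated assumptions, not the text's own method in every detail: the function name `intersect_lines` is hypothetical, the 2-by-2 system is solved by Cramer's rule rather than by substitution, and the coordinates are assumed to be integers (or `Fraction` values) so that the final comparison is exact.

```python
from fractions import Fraction
from itertools import combinations

def intersect_lines(r1, v1, r2, v2):
    """Intersection point of the lines r1 + s*v1 and r2 + t*v2, or None.

    Returns None for skew lines and also for parallel (or identical) lines.
    """
    for i, j in combinations(range(3), 2):
        # Coordinate equations i and j:  s*v1[k] - t*v2[k] = r2[k] - r1[k]
        det = -v1[i] * v2[j] + v2[i] * v1[j]
        if det == 0:
            continue  # this pair of equations cannot be solved uniquely
        bi, bj = r2[i] - r1[i], r2[j] - r1[j]
        s = Fraction(-bi * v2[j] + v2[i] * bj, det)
        t = Fraction(v1[i] * bj - v1[j] * bi, det)
        p1 = tuple(r + s * v for r, v in zip(r1, v1))
        p2 = tuple(r + t * v for r, v in zip(r2, v2))
        return p1 if p1 == p2 else None  # the remaining equation must also hold
    return None  # direction vectors are parallel

point = intersect_lines((-1, 2, 1), (3, 2, -1), (-3, 8, -3), (1, -3, 2))
print(point)
```

The final comparison `p1 == p2` plays the role of the "we had to check this" step: a solution of two equations is only an intersection point if the third equation agrees.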


We will now consider planes in 3-dimensional Euclidean space.

Plane through a point, perpendicular to a vector

Let $P$ be a plane in $\mathbb{R}^3$, and suppose it contains a point $P_0 = (x_0, y_0, z_0)$. Let $\mathbf{n} = (a, b, c)$ be a nonzero vector which is perpendicular to the plane $P$. Such a vector is called a normal vector (or just a normal) to the plane. Now let $(x, y, z)$ be any point in the plane $P$. Then the vector $\mathbf{r} = (x - x_0, y - y_0, z - z_0)$ lies in the plane $P$ (see Figure 1.5.6). So if $\mathbf{r} \ne \mathbf{0}$, then $\mathbf{r} \perp \mathbf{n}$ and hence $\mathbf{n} \cdot \mathbf{r} = 0$. And if $\mathbf{r} = \mathbf{0}$ then we still have $\mathbf{n} \cdot \mathbf{r} = 0$.

[Figure 1.5.6: The plane $P$, containing the points $(x_0, y_0, z_0)$ and $(x, y, z)$, with the normal $\mathbf{n}$ and the vector $\mathbf{r}$]

Conversely, if $(x, y, z)$ is any point in $\mathbb{R}^3$ such that $\mathbf{r} = (x - x_0, y - y_0, z - z_0) \ne \mathbf{0}$ and $\mathbf{n} \cdot \mathbf{r} = 0$, then $\mathbf{r} \perp \mathbf{n}$ and so $(x, y, z)$ lies in $P$. This proves the following theorem:

Theorem 1.18. Let $P$ be a plane in $\mathbb{R}^3$, let $(x_0, y_0, z_0)$ be a point in $P$, and let $\mathbf{n} = (a, b, c)$ be a nonzero vector which is perpendicular to $P$. Then $P$ consists of the points $(x, y, z)$ satisfying the vector equation:

$$
\mathbf{n} \cdot \mathbf{r} = 0 \tag{1.24}
$$

where $\mathbf{r} = (x - x_0, y - y_0, z - z_0)$, or equivalently:

$$
a(x - x_0) + b(y - y_0) + c(z - z_0) = 0 \tag{1.25}
$$

The above equation is called the point-normal form of the plane $P$.

Example 1.23. Find the equation of the plane $P$ containing the point $(-3, 1, 3)$ and perpendicular to the vector $\mathbf{n} = (2, 4, 8)$.

Solution: By formula (1.25), the plane $P$ consists of all points $(x, y, z)$ such that:

$$
2(x + 3) + 4(y - 1) + 8(z - 3) = 0
$$

If we multiply out the terms in formula (1.25) and combine the constant terms, we get an equation of the plane in normal form:

$$
ax + by + cz + d = 0 \tag{1.26}
$$

For example, the normal form of the plane in Example 1.23 is $2x + 4y + 8z - 22 = 0$.


Plane containing three noncollinear points

In 2-dimensional and 3-dimensional space, two points determine a line. Two points do not determine a plane in $\mathbb{R}^3$. In fact, three collinear points (i.e. all on the same line) do not determine a plane; an infinite number of planes would contain the line on which those three points lie. However, three noncollinear points do determine a plane. For if $Q$, $R$ and $S$ are noncollinear points in $\mathbb{R}^3$, then $\overrightarrow{QR}$ and $\overrightarrow{QS}$ are nonzero vectors which are not parallel (by noncollinearity), and so their cross product $\overrightarrow{QR} \times \overrightarrow{QS}$ is perpendicular to both $\overrightarrow{QR}$ and $\overrightarrow{QS}$. So $\overrightarrow{QR}$ and $\overrightarrow{QS}$ (and hence $Q$, $R$ and $S$) lie in the plane through the point $Q$ with normal vector $\mathbf{n} = \overrightarrow{QR} \times \overrightarrow{QS}$ (see Figure 1.5.7).

[Figure 1.5.7: Noncollinear points $Q$, $R$, $S$, with the vectors $\overrightarrow{QR}$, $\overrightarrow{QS}$ and the normal $\mathbf{n} = \overrightarrow{QR} \times \overrightarrow{QS}$]

Example 1.24. Find the equation of the plane $P$ containing the points $(2, 1, 3)$, $(1, -1, 2)$ and $(3, 2, 1)$.

Solution: Let $Q = (2, 1, 3)$, $R = (1, -1, 2)$ and $S = (3, 2, 1)$. Then for the vectors $\overrightarrow{QR} = (-1, -2, -1)$ and $\overrightarrow{QS} = (1, 1, -2)$, the plane $P$ has a normal vector

$$
\mathbf{n} = \overrightarrow{QR} \times \overrightarrow{QS} = (-1, -2, -1) \times (1, 1, -2) = (5, -3, 1)
$$

So using formula (1.25) with the point $Q$ (we could also use $R$ or $S$), the plane $P$ consists of all points $(x, y, z)$ such that:

$$
5(x - 2) - 3(y - 1) + (z - 3) = 0
$$

or in normal form,

$$
5x - 3y + z - 10 = 0
$$
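Both constructions of a plane seen so far (a point with a normal vector, and three noncollinear points) reduce to computing the coefficients $(a, b, c, d)$ of the normal form (1.26). A minimal sketch, with hypothetical function names:

```python
def sub(p, q):
    """Vector from point q to point p."""
    return tuple(a - b for a, b in zip(p, q))

def dot(v, w):
    """Dot product of two 3-vectors."""
    return sum(a * b for a, b in zip(v, w))

def cross(v, w):
    """Cross product v x w of two 3-vectors."""
    return (v[1] * w[2] - v[2] * w[1],
            v[2] * w[0] - v[0] * w[2],
            v[0] * w[1] - v[1] * w[0])

def plane_from_point_normal(p, n):
    """Coefficients (a, b, c, d) of ax + by + cz + d = 0, expanded from formula (1.25)."""
    return (*n, -dot(n, p))

def plane_through(q, r, s):
    """Normal form of the plane through three noncollinear points q, r, s."""
    n = cross(sub(r, q), sub(s, q))  # normal vector QR x QS
    if n == (0, 0, 0):
        raise ValueError("the three points are collinear")
    return plane_from_point_normal(q, n)

print(plane_from_point_normal((-3, 1, 3), (2, 4, 8)))   # Example 1.23
print(plane_through((2, 1, 3), (1, -1, 2), (3, 2, 1)))  # Example 1.24
```

The constant term is $d = -\mathbf{n} \cdot \overrightarrow{OQ}$, which is what multiplying out formula (1.25) produces.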

We mentioned earlier that skew lines in $\mathbb{R}^3$ lie on separate, parallel planes. So two skew lines do not determine a plane. But two (nonidentical) lines which either intersect or are parallel do determine a plane. In both cases, to find the equation of the plane that contains those two lines, simply pick from the two lines a total of three noncollinear points (i.e. one point from one line and two points from the other), then use the technique above, as in Example 1.24, to write the equation. We will leave examples of this as exercises for the reader.


Distance between a point and a plane

The distance between a point in $\mathbb{R}^3$ and a plane is the length of the line segment from that point to the plane which is perpendicular to the plane. The following theorem gives a formula for that distance.

Theorem 1.19. Let $Q = (x_0, y_0, z_0)$ be a point in $\mathbb{R}^3$, and let $P$ be a plane with normal form $ax + by + cz + d = 0$ that does not contain $Q$. Then the distance $D$ from $Q$ to $P$ is:

$$
D = \frac{|ax_0 + by_0 + cz_0 + d|}{\sqrt{a^2 + b^2 + c^2}} \tag{1.27}
$$

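Proof: Let $R = (x, y, z)$ be any point in the plane $P$ (so that $ax + by + cz + d = 0$) and let $\mathbf{r} = \overrightarrow{RQ} = (x_0 - x, y_0 - y, z_0 - z)$. Then $\mathbf{r} \ne \mathbf{0}$ since $Q$ does not lie in $P$. From the normal form equation for $P$, we know that $\mathbf{n} = (a, b, c)$ is a normal vector for $P$. Now, any plane divides $\mathbb{R}^3$ into two disjoint parts. Assume that $\mathbf{n}$ points toward the side of $P$ where the point $Q$ is located. Place $\mathbf{n}$ so that its initial point is at $R$, and let $\theta$ be the angle between $\mathbf{r}$ and $\mathbf{n}$. Then $0^\circ < \theta < 90^\circ$, so $\cos\theta > 0$. Thus, the distance $D$ is $\cos\theta \, \|\mathbf{r}\| = |\cos\theta| \, \|\mathbf{r}\|$ (see Figure 1.5.8).

[Figure 1.5.8: the plane $P$ with the point $R$ in it, the point $Q$, the normal $\mathbf{n}$, the vector $\mathbf{r}$, the angle $\theta$ and the distance $D$]

By Theorem 1.6 in Section 1.3, we know that $\cos\theta = \dfrac{\mathbf{n} \cdot \mathbf{r}}{\|\mathbf{n}\| \|\mathbf{r}\|}$, so

$$
\begin{aligned}
D = |\cos\theta| \, \|\mathbf{r}\| &= \frac{|\mathbf{n} \cdot \mathbf{r}|}{\|\mathbf{n}\| \|\mathbf{r}\|} \, \|\mathbf{r}\| = \frac{|\mathbf{n} \cdot \mathbf{r}|}{\|\mathbf{n}\|} = \frac{|a(x_0 - x) + b(y_0 - y) + c(z_0 - z)|}{\sqrt{a^2 + b^2 + c^2}} \\
&= \frac{|ax_0 + by_0 + cz_0 - (ax + by + cz)|}{\sqrt{a^2 + b^2 + c^2}} = \frac{|ax_0 + by_0 + cz_0 - (-d)|}{\sqrt{a^2 + b^2 + c^2}} = \frac{|ax_0 + by_0 + cz_0 + d|}{\sqrt{a^2 + b^2 + c^2}}
\end{aligned}
$$

If $\mathbf{n}$ points away from the side of $P$ where the point $Q$ is located, then $90^\circ < \theta < 180^\circ$ and so $\cos\theta < 0$. The distance $D$ is then $|\cos\theta| \, \|\mathbf{r}\|$, and thus repeating the same argument as above still gives the same result. QED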

Example 1.25. Find the distance $D$ from $(2, 4, -5)$ to the plane from Example 1.24.

Solution: Recall that the plane is given by $5x - 3y + z - 10 = 0$. So

$$
D = \frac{|5(2) - 3(4) + 1(-5) - 10|}{\sqrt{5^2 + (-3)^2 + 1^2}} = \frac{|-17|}{\sqrt{35}} = \frac{17}{\sqrt{35}} \approx 2.87
$$
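Formula (1.27) is a one-line computation. The sketch below uses an assumed function name `distance_point_plane` and represents a plane by its normal-form coefficients $(a, b, c, d)$; it reproduces Example 1.25.

```python
import math

def distance_point_plane(q, plane):
    """Distance from the point q to the plane ax + by + cz + d = 0 (formula (1.27))."""
    a, b, c, d = plane
    x0, y0, z0 = q
    return abs(a * x0 + b * y0 + c * z0 + d) / math.sqrt(a * a + b * b + c * c)

D = distance_point_plane((2, 4, -5), (5, -3, 1, -10))
print(round(D, 2))
```

Although Theorem 1.19 assumes the point is not in the plane, the formula also behaves sensibly for a point in the plane: the numerator is then zero.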


Line of intersection of two planes

[Figure 1.5.9: two planes intersecting in a line $L$]

Note that two planes are parallel if they have normal vectors that are parallel, and the planes are perpendicular if their normal vectors are perpendicular. If two planes do intersect, they do so in a line (see Figure 1.5.9). Suppose that two planes $P_1$ and $P_2$ with normal vectors $\mathbf{n}_1$ and $\mathbf{n}_2$, respectively, intersect in a line $L$. Since $\mathbf{n}_1 \times \mathbf{n}_2 \perp \mathbf{n}_1$, then $\mathbf{n}_1 \times \mathbf{n}_2$ is parallel to the plane $P_1$. Likewise, $\mathbf{n}_1 \times \mathbf{n}_2 \perp \mathbf{n}_2$ means that $\mathbf{n}_1 \times \mathbf{n}_2$ is also parallel to $P_2$. Thus, $\mathbf{n}_1 \times \mathbf{n}_2$ is parallel to the intersection of $P_1$ and $P_2$, i.e. $\mathbf{n}_1 \times \mathbf{n}_2$ is parallel to $L$. Thus, we can write $L$ in the following vector form:

$$
L : \ \mathbf{r} + t(\mathbf{n}_1 \times \mathbf{n}_2), \quad \text{for } -\infty < t < \infty \tag{1.28}
$$

where $\mathbf{r}$ is any vector pointing to a point belonging to both planes. To find a point in both planes, find a common solution $(x, y, z)$ to the two normal form equations of the planes. This can often be made easier by setting one of the coordinate variables to zero, which leaves you to solve two equations in just two unknowns.

Example 1.26. Find the line of intersection $L$ of the planes $5x - 3y + z - 10 = 0$ and $2x + 4y - z + 3 = 0$.

Solution: The plane $5x - 3y + z - 10 = 0$ has normal vector $\mathbf{n}_1 = (5, -3, 1)$ and the plane $2x + 4y - z + 3 = 0$ has normal vector $\mathbf{n}_2 = (2, 4, -1)$. Since $\mathbf{n}_1$ and $\mathbf{n}_2$ are not scalar multiples, then the two planes are not parallel and hence will intersect. A point $(x, y, z)$ on both planes will satisfy the following system of two equations in three unknowns:

$$
\begin{aligned}
5x - 3y + z - 10 &= 0 \\
2x + 4y - z + 3 &= 0
\end{aligned}
$$

Set $x = 0$ (why is that a good choice?). Then the above equations are reduced to:

$$
\begin{aligned}
-3y + z - 10 &= 0 \\
4y - z + 3 &= 0
\end{aligned}
$$

The second equation gives $z = 4y + 3$, substituting that into the first equation gives $y = 7$. Then $z = 31$, and so the point $(0, 7, 31)$ is on $L$. Since $\mathbf{n}_1 \times \mathbf{n}_2 = (-1, 7, 26)$, then $L$ is given by:

$$
\mathbf{r} + t(\mathbf{n}_1 \times \mathbf{n}_2) = (0, 7, 31) + t(-1, 7, 26), \quad \text{for } -\infty < t < \infty
$$

or in parametric form:

$$
x = -t, \quad y = 7 + 7t, \quad z = 31 + 26t, \quad \text{for } -\infty < t < \infty
$$
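The recipe of formula (1.28) can be sketched in code as follows. This is an illustration under assumptions: the name `plane_intersection` is hypothetical, planes are given by integer normal-form coefficients $(a, b, c, d)$, and the two remaining unknowns are found by Cramer's rule instead of by substitution. The code sets to zero the first coordinate for which the remaining 2-by-2 system is solvable.

```python
from fractions import Fraction

def cross(v, w):
    """Cross product v x w of two 3-vectors."""
    return (v[1] * w[2] - v[2] * w[1],
            v[2] * w[0] - v[0] * w[2],
            v[0] * w[1] - v[1] * w[0])

def plane_intersection(p1, p2):
    """Return (point, direction) of the line where two planes meet, or None.

    Each plane is a tuple (a, b, c, d) meaning ax + by + cz + d = 0.
    """
    n1, n2 = p1[:3], p2[:3]
    v = cross(n1, n2)  # direction of the line, as in formula (1.28)
    if v == (0, 0, 0):
        return None  # parallel (or identical) planes
    for k in range(3):
        # Set coordinate k to zero and solve for the other two coordinates i, j.
        i, j = [m for m in range(3) if m != k]
        det = n1[i] * n2[j] - n1[j] * n2[i]
        if det != 0:
            point = [Fraction(0)] * 3
            point[i] = Fraction(-p1[3] * n2[j] + n1[j] * p2[3], det)
            point[j] = Fraction(-n1[i] * p2[3] + p1[3] * n2[i], det)
            return tuple(point), v

point, direction = plane_intersection((5, -3, 1, -10), (2, 4, -1, 3))
print(point, direction)
```

For the planes of Example 1.26 this returns the same point $(0, 7, 31)$ and direction $(-1, 7, 26)$ as the worked solution.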


Exercises

A

For Exercises 1-4, write the line $L$ through the point $P$ and parallel to the vector $\mathbf{v}$ in the following forms: (a) vector, (b) parametric, and (c) symmetric.

1. $P = (2, 3, -2)$, $\mathbf{v} = (5, 4, -3)$

2. $P = (3, -1, 2)$, $\mathbf{v} = (2, 8, 1)$

3. $P = (2, 1, 3)$, $\mathbf{v} = (1, 0, 1)$

4. $P = (0, 0, 0)$, $\mathbf{v} = (7, 2, -10)$

For Exercises 5-6, write the line $L$ through the points $P_1$ and $P_2$ in parametric form.

5. $P_1 = (1, -2, -3)$, $P_2 = (3, 5, 5)$

6. $P_1 = (4, 1, 5)$, $P_2 = (-2, 1, 3)$

For Exercises 7-8, find the distance $d$ from the point $P$ to the line $L$.

7. $P = (1, -1, -1)$, $L : x = -2 - 2t$, $y = 4t$, $z = 7 + t$

8. $P = (0, 0, 0)$, $L : x = 3 + 2t$, $y = 4 + 3t$, $z = 5 + 4t$

For Exercises 9-10, find the point of intersection (if any) of the given lines.

9. $x = 7 + 3s$, $y = -4 - 3s$, $z = -7 - 5s$ and $x = 1 + 6t$, $y = 2 + t$, $z = 3 - 2t$

10. $\dfrac{x - 6}{4} = y + 3 = z$ and $\dfrac{x - 11}{3} = \dfrac{y - 14}{-6} = \dfrac{z + 9}{2}$

For Exercises 11-12, write the normal form of the plane $P$ containing the point $Q$ and perpendicular to the vector $\mathbf{n}$.

11. $Q = (5, 1, -2)$, $\mathbf{n} = (4, -4, 3)$

12. $Q = (6, -2, 0)$, $\mathbf{n} = (2, 6, 4)$

For Exercises 13-14, write the normal form of the plane containing the given points.

13. $(1, 0, 3)$, $(1, 2, -1)$, $(6, 1, 6)$

14. $(-3, 1, -3)$, $(4, -4, 3)$, $(0, 0, 1)$

15. Write the normal form of the plane containing the lines from Exercise 9.

16. Write the normal form of the plane containing the lines from Exercise 10.

For Exercises 17-18, find the distance $D$ from the point $Q$ to the plane $P$.

17. $Q = (4, 1, 2)$, $P : 3x - y - 5z + 8 = 0$

18. $Q = (0, 2, 0)$, $P : -5x + 2y - 7z + 1 = 0$

For Exercises 19-20, find the line of intersection (if any) of the given planes.

19. $x + 3y + 2z - 6 = 0$, $2x - y + z + 2 = 0$

20. $3x + y - 5z = 0$, $x + 2y + z + 4 = 0$

B

21. Find the point(s) of intersection (if any) of the line $\dfrac{x - 6}{4} = y + 3 = z$ with the plane $x + 3y + 2z - 6 = 0$. (Hint: Put the equations of the line into the equation of the plane.)


1.6 Surfaces

In the previous section we discussed planes in Euclidean space. A plane is an example of a surface, which we will define informally⁸ as the solution set of the equation $F(x, y, z) = 0$ in $\mathbb{R}^3$, for some real-valued function $F$. For example, a plane given by $ax + by + cz + d = 0$ is the solution set of $F(x, y, z) = 0$ for the function $F(x, y, z) = ax + by + cz + d$. Surfaces are 2-dimensional. The plane is the simplest surface, since it is “flat”. In this section we will look at some surfaces that are more complex, the most important of which are the sphere and the cylinder.

Definition 1.9. A sphere $S$ is the set of all points $(x, y, z)$ in $\mathbb{R}^3$ which are a fixed distance $r$ (called the radius) from a fixed point $P_0 = (x_0, y_0, z_0)$ (called the center of the sphere):

$$
S = \{ (x, y, z) : (x - x_0)^2 + (y - y_0)^2 + (z - z_0)^2 = r^2 \} \tag{1.29}
$$

Using vector notation, this can be written in the equivalent form:

$$
S = \{ \mathbf{x} : \|\mathbf{x} - \mathbf{x}_0\| = r \} \tag{1.30}
$$

where $\mathbf{x} = (x, y, z)$ and $\mathbf{x}_0 = (x_0, y_0, z_0)$ are vectors.

Figure 1.6.1 illustrates the vectorial approach to spheres.

[Figure 1.6.1: Spheres in $\mathbb{R}^3$. (a) radius $r$, center $(0, 0, 0)$: $\|\mathbf{x}\| = r$. (b) radius $r$, center $(x_0, y_0, z_0)$: $\|\mathbf{x} - \mathbf{x}_0\| = r$]

Note in Figure 1.6.1(a) that the intersection of the sphere with the $xy$-plane is a circle of radius $r$ (i.e. a great circle, given by $x^2 + y^2 = r^2$ as a subset of $\mathbb{R}^2$). Similarly for the intersections with the $xz$-plane and the $yz$-plane. In general, a plane intersects a sphere either at a single point or in a circle.

⁸ See O’NEILL for a deeper and more rigorous discussion of surfaces.


Example 1.27. Find the intersection of the sphere $x^2 + y^2 + z^2 = 169$ with the plane $z = 12$.

[Figure 1.6.2: the sphere and the plane $z = 12$]

Solution: The sphere is centered at the origin and has radius $13 = \sqrt{169}$, so it does intersect the plane $z = 12$. Putting $z = 12$ into the equation of the sphere gives

$$
\begin{aligned}
x^2 + y^2 + 12^2 &= 169 \\
x^2 + y^2 &= 169 - 144 = 25 = 5^2
\end{aligned}
$$

which is a circle of radius 5 centered at $(0, 0, 12)$, parallel to the $xy$-plane (see Figure 1.6.2).

If the equation in formula (1.29) is multiplied out, we get an equation of the form:

$$
x^2 + y^2 + z^2 + ax + by + cz + d = 0 \tag{1.31}
$$

for some constants $a$, $b$, $c$ and $d$. Conversely, an equation of this form may describe a sphere, which can be determined by completing the square for the $x$, $y$ and $z$ variables.

Example 1.28. Is $2x^2 + 2y^2 + 2z^2 - 8x + 4y - 16z + 10 = 0$ the equation of a sphere?

Solution: Dividing both sides of the equation by 2 gives

$$
\begin{aligned}
x^2 + y^2 + z^2 - 4x + 2y - 8z + 5 &= 0 \\
(x^2 - 4x + 4) + (y^2 + 2y + 1) + (z^2 - 8z + 16) + 5 - 4 - 1 - 16 &= 0 \\
(x - 2)^2 + (y + 1)^2 + (z - 4)^2 &= 16
\end{aligned}
$$

which is a sphere of radius 4 centered at $(2, -1, 4)$.
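Completing the square as in Example 1.28 always leads to the same closed form: after dividing by the common coefficient of the squared terms, the center is $(-a/2, -b/2, -c/2)$ and the squared radius is $a^2/4 + b^2/4 + c^2/4 - d$. The sketch below (with the hypothetical name `sphere_from_general`, assuming integer coefficients and a nonzero leading coefficient `k`) applies that and reports when the equation is not a sphere.

```python
from fractions import Fraction

def sphere_from_general(k, a, b, c, d):
    """Center and squared radius of k(x^2 + y^2 + z^2) + ax + by + cz + d = 0.

    Returns None when the solution set is a single point or empty.
    """
    a, b, c, d = (Fraction(t, k) for t in (a, b, c, d))  # reduce to the form (1.31)
    center = (-a / 2, -b / 2, -c / 2)
    r_squared = sum(t * t for t in center) - d
    if r_squared <= 0:
        return None  # r^2 = 0 is a single point, r^2 < 0 has no solutions
    return center, r_squared

print(sphere_from_general(2, -8, 4, -16, 10))  # Example 1.28
```

Returning the squared radius keeps the arithmetic exact; take its square root to get the radius itself.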

Example 1.29. Find the point(s) of intersection (if any) of the sphere from Example 1.28 and the line $x = 3 + t$, $y = 1 + 2t$, $z = 3 - t$.

Solution: Put the equations of the line into the equation of the sphere, which was $(x - 2)^2 + (y + 1)^2 + (z - 4)^2 = 16$, and solve for $t$:

$$
\begin{aligned}
(3 + t - 2)^2 + (1 + 2t + 1)^2 + (3 - t - 4)^2 &= 16 \\
(t + 1)^2 + (2t + 2)^2 + (-t - 1)^2 &= 16 \\
6t^2 + 12t - 10 &= 0
\end{aligned}
$$

The quadratic formula gives the solutions $t = -1 \pm \frac{4}{\sqrt{6}}$. Putting those two values into the equations of the line gives the following two points of intersection:

$$
\left( 2 + \frac{4}{\sqrt{6}},\ -1 + \frac{8}{\sqrt{6}},\ 4 - \frac{4}{\sqrt{6}} \right) \quad \text{and} \quad \left( 2 - \frac{4}{\sqrt{6}},\ -1 - \frac{8}{\sqrt{6}},\ 4 + \frac{4}{\sqrt{6}} \right)
$$
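Substituting a line $\mathbf{r} + t\mathbf{v}$ into the vector form (1.30) of a sphere always gives a quadratic in $t$, as it did in Example 1.29. The sketch below is an illustration with assumed names (`line_sphere_intersection`, and the coefficient names `A`, `B`, `C`); it uses floating-point arithmetic, so results are approximate.

```python
import math

def dot(v, w):
    """Dot product of two 3-vectors."""
    return sum(a * b for a, b in zip(v, w))

def line_sphere_intersection(r, v, center, radius):
    """Points where the line r + t*v meets the sphere ||x - center|| = radius."""
    w = tuple(ri - ci for ri, ci in zip(r, center))
    # ||w + t v||^2 = radius^2 expands to A t^2 + B t + C = 0
    A = dot(v, v)
    B = 2 * dot(v, w)
    C = dot(w, w) - radius ** 2
    disc = B * B - 4 * A * C
    if disc < 0:
        return []  # the line misses the sphere
    roots = {(-B + sign * math.sqrt(disc)) / (2 * A) for sign in (1, -1)}
    return [tuple(ri + t * vi for ri, vi in zip(r, v)) for t in sorted(roots)]

for p in line_sphere_intersection((3, 1, 3), (1, 2, -1), (2, -1, 4), 4):
    print(p)
```

For the data of Example 1.29 the coefficients are $A = 6$, $B = 12$, $C = -10$, matching the quadratic $6t^2 + 12t - 10 = 0$ in the worked solution. A zero discriminant yields a single (tangent) point.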


If two spheres intersect, they do so either at a single point or in a circle.

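Example 1.30. Find the intersection (if any) of the spheres $x^2 + y^2 + z^2 = 25$ and $x^2 + y^2 + (z - 2)^2 = 16$.

Solution: For any point $(x, y, z)$ on both spheres, we see that

$$
\begin{aligned}
x^2 + y^2 + z^2 = 25 \quad &\Rightarrow \quad x^2 + y^2 = 25 - z^2, \text{ and} \\
x^2 + y^2 + (z - 2)^2 = 16 \quad &\Rightarrow \quad x^2 + y^2 = 16 - (z - 2)^2, \text{ so} \\
16 - (z - 2)^2 = 25 - z^2 \quad &\Rightarrow \quad 4z - 4 = 9 \quad \Rightarrow \quad z = 13/4 \\
&\Rightarrow \quad x^2 + y^2 = 25 - (13/4)^2 = 231/16
\end{aligned}
$$

$\therefore$ The intersection is the circle $x^2 + y^2 = \frac{231}{16}$ of radius $\frac{\sqrt{231}}{4} \approx 3.8$ centered at $(0, 0, \frac{13}{4})$.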
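The subtraction trick of Example 1.30 generalizes to any two spheres with distinct centers. The sketch below is not taken from the text: it uses the standard closed form that results from subtracting the two sphere equations, namely that the circle's plane lies at signed distance $h = (D^2 + r_1^2 - r_2^2)/(2D)$ from the first center along the line of centers, where $D$ is the distance between the centers. The names `sphere_sphere_circle`, `h` and `rho_sq` are assumptions.

```python
import math

def sphere_sphere_circle(c1, r1, c2, r2):
    """Center and radius of the circle where two spheres meet, or None.

    Assumes the centers c1 and c2 are distinct. A returned radius of 0
    means the spheres touch at a single point.
    """
    axis = tuple(b - a for a, b in zip(c1, c2))  # vector from c1 to c2
    dist = math.sqrt(sum(t * t for t in axis))
    h = (dist ** 2 + r1 ** 2 - r2 ** 2) / (2 * dist)  # offset of the circle's plane from c1
    rho_sq = r1 ** 2 - h ** 2  # squared radius of the circle
    if rho_sq < 0:
        return None  # the spheres do not intersect
    center = tuple(a + h * t / dist for a, t in zip(c1, axis))
    return center, math.sqrt(rho_sq)

center, rho = sphere_sphere_circle((0, 0, 0), 5, (0, 0, 2), 4)
print(center, round(rho, 1))
```

For the spheres of Example 1.30, $D = 2$ and $h = (4 + 25 - 16)/4 = 13/4$, which agrees with the value $z = 13/4$ found above.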

The cylinders that we will consider are right circular cylinders. These are cylinders obtained by moving a line $L$ along a circle $C$ in $\mathbb{R}^3$ in a way so that $L$ is always perpendicular to the plane containing $C$. We will only consider the cases where the plane containing $C$ is parallel to one of the three coordinate planes (see Figure 1.6.3).

[Figure 1.6.3: Cylinders in $\mathbb{R}^3$. (a) $x^2 + y^2 = r^2$, any $z$. (b) $x^2 + z^2 = r^2$, any $y$. (c) $y^2 + z^2 = r^2$, any $x$]

For example, the equation of a cylinder whose base circle $C$ lies in the $xy$-plane and is centered at $(a, b, 0)$ and has radius $r$ is

$$
(x - a)^2 + (y - b)^2 = r^2, \tag{1.32}
$$

where the value of the $z$ coordinate is unrestricted. Similar equations can be written when the base circle lies in one of the other coordinate planes. A plane intersects a right circular cylinder in a circle, ellipse, or one or two lines, depending on whether that plane is parallel, oblique⁹, or perpendicular, respectively, to the plane containing $C$. The intersection of a surface with a plane is called the trace of the surface.

⁹ i.e. at an angle strictly between $0^\circ$ and $90^\circ$.


The equations of spheres and cylinders are examples of second-degree equations in $\mathbb{R}^3$, i.e. equations of the form

$$
Ax^2 + By^2 + Cz^2 + Dxy + Exz + Fyz + Gx + Hy + Iz + J = 0 \tag{1.33}
$$

for some constants $A, B, \ldots, J$. If the above equation is not that of a sphere, cylinder, plane, line or point, then the resulting surface is called a quadric surface.

[Figure 1.6.4: Ellipsoid, with intercepts $a$, $b$, $c$ on the coordinate axes]

One type of quadric surface is the ellipsoid, given by an equation of the form:

$$
\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1 \tag{1.34}
$$

In the case where $a = b = c$, this is just a sphere. In general, an ellipsoid is egg-shaped (think of an ellipse rotated around its major axis). Its traces in the coordinate planes are ellipses.

Two other types of quadric surfaces are the hyperboloid of one sheet, given by an equation of the form:

$$
\frac{x^2}{a^2} + \frac{y^2}{b^2} - \frac{z^2}{c^2} = 1 \tag{1.35}
$$

and the hyperboloid of two sheets, whose equation has the form:

$$
\frac{x^2}{a^2} - \frac{y^2}{b^2} - \frac{z^2}{c^2} = 1 \tag{1.36}
$$

[Figure 1.6.5: Hyperboloid of one sheet]

[Figure 1.6.6: Hyperboloid of two sheets]


For the hyperboloid of one sheet, the trace in any plane parallel to the $xy$-plane is an ellipse. The traces in the planes parallel to the $xz$- or $yz$-planes are hyperbolas (see Figure 1.6.5), except for the special cases $x = \pm a$ and $y = \pm b$; in those planes the traces are pairs of intersecting lines (see Exercise 8).

For the hyperboloid of two sheets, the trace in any plane parallel to the $xy$- or $xz$-plane is a hyperbola (see Figure 1.6.6). There is no trace in the $yz$-plane. In any plane parallel to the $yz$-plane for which $|x| > |a|$, the trace is an ellipse.

[Figure 1.6.7: Paraboloid]

The elliptic paraboloid is another type of quadric surface, whose equation has the form:

$$
\frac{x^2}{a^2} + \frac{y^2}{b^2} = \frac{z}{c} \tag{1.37}
$$

The traces in planes parallel to the $xy$-plane are ellipses, though in the $xy$-plane itself the trace is a single point. The traces in planes parallel to the $xz$- or $yz$-planes are parabolas. Figure 1.6.7 shows the case where $c > 0$. When $c < 0$ the surface is turned downward. In the case where $a = b$, the surface is called a paraboloid of revolution, which is often used as a reflecting surface, e.g. in vehicle headlights.¹⁰

A more complicated quadric surface is the hyperbolic paraboloid, given by:

$$
\frac{x^2}{a^2} - \frac{y^2}{b^2} = \frac{z}{c} \tag{1.38}
$$

[Figure 1.6.8: Hyperbolic paraboloid (surface plot with axes $x$, $y$, $z$)]

¹⁰ For a discussion of this see pp. 157-158 in HECHT.


The hyperbolic paraboloid can be tricky to draw; using graphing software on a computer can make it easier. For example, Figure 1.6.8 was created using the free Gnuplot package (see Appendix C). It shows the graph of the hyperbolic paraboloid $z = y^2 - x^2$, which is the special case where $a = b = 1$ and $c = -1$ in equation (1.38). The mesh lines on the surface are the traces in planes parallel to the coordinate planes. So we see that the traces in planes parallel to the $xz$-plane (where $y$ is constant, so $z = y^2 - x^2$ is a parabola in $x$) open downward, while the traces in planes parallel to the $yz$-plane (where $x$ is constant) are parabolas opening upward. Also, notice that the traces in planes parallel to the $xy$-plane are hyperbolas, though in the $xy$-plane itself the trace is a pair of intersecting lines through the origin. This is true in general when $c < 0$ in equation (1.38). When $c > 0$, the surface would be similar to that in Figure 1.6.8, only rotated $90^\circ$ around the $z$-axis and the nature of the traces in planes parallel to the $xz$- or $yz$-planes would be reversed.

[Figure 1.6.9: Elliptic cone]

The last type of quadric surface that we will consider is the elliptic cone, which has an equation of the form:

$$
\frac{x^2}{a^2} + \frac{y^2}{b^2} - \frac{z^2}{c^2} = 0 \tag{1.39}
$$

The traces in planes parallel to the $xy$-plane are ellipses, except in the $xy$-plane itself where the trace is a single point. The traces in planes parallel to the $xz$- or $yz$-planes are hyperbolas, except in the $xz$- and $yz$-planes themselves where the traces are pairs of intersecting lines.

Notice that every point on the elliptic cone is on a line which

lies entirely on the surface; in Figure 1.6.9 these lines all go

through the origin. This makes the elliptic cone an example of

a ruled surface. The cylinder is also a ruled surface.

What may not be as obvious is that both the hyperboloid of one sheet and the hyperbolic

paraboloid are ruled surfaces. In fact, on both surfaces there are two lines through each

point on the surface (see Exercises 11-12). Such surfaces are called doubly ruled surfaces,

and the pairs of lines are called a regulus.

It is clear that for each of the six types of quadric surfaces that we discussed, the surface can be translated away from the origin (e.g. by replacing $x^2$ by $(x - x_0)^2$ in its equation). It can be proved¹¹ that every quadric surface can be translated and/or rotated so that its equation matches one of the six types that we described. For example, $z = 2xy$ is a case of equation (1.33) with “mixed” variables, e.g. with $D \ne 0$ so that we get an $xy$ term. This equation does not match any of the types we considered. However, by rotating the $x$- and $y$-axes by $45^\circ$ in the $xy$-plane by means of the coordinate transformation $x = (x' - y')/\sqrt{2}$, $y = (x' + y')/\sqrt{2}$, $z = z'$, then $z = 2xy$ becomes the hyperbolic paraboloid $z' = (x')^2 - (y')^2$ in the $(x', y', z')$ coordinate system. That is, $z = 2xy$ is a hyperbolic paraboloid as in equation (1.38), but rotated $45^\circ$ in the $xy$-plane.
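The substitution is a one-line calculation, $2xy = 2 \cdot \frac{(x' - y')(x' + y')}{2} = (x')^2 - (y')^2$, and it can also be spot-checked numerically. In the sketch below the function name `rotate_45` is an assumption, and floating-point values are compared with a tolerance.

```python
import math
import random

def rotate_45(xp, yp):
    """Map (x', y') to (x, y) under x = (x' - y')/sqrt(2), y = (x' + y')/sqrt(2)."""
    s = math.sqrt(2)
    return (xp - yp) / s, (xp + yp) / s

random.seed(1)
for _ in range(1000):
    xp, yp = random.uniform(-10, 10), random.uniform(-10, 10)
    x, y = rotate_45(xp, yp)
    # z = 2xy in the old coordinates equals (x')^2 - (y')^2 in the new ones
    assert math.isclose(2 * x * y, xp ** 2 - yp ** 2, abs_tol=1e-9)
```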

¹¹ See Ch. 7 in POGORELOV.


Exercises

A

For Exercises 1-4, determine if the given equation describes a sphere. If so, find its radius and center.

1. $x^2 + y^2 + z^2 - 4x - 6y - 10z + 37 = 0$

2. $x^2 + y^2 + z^2 + 2x - 2y - 8z + 19 = 0$

3. $2x^2 + 2y^2 + 2z^2 + 4x + 4y + 4z - 44 = 0$

4. $x^2 + y^2 - z^2 + 12x + 2y - 4z + 32 = 0$

5. Find the point(s) of intersection of the sphere $(x - 3)^2 + (y + 1)^2 + (z - 3)^2 = 9$ and the line $x = -1 + 2t$, $y = -2 - 3t$, $z = 3 + t$.

B

6. Find the intersection of the spheres $x^2 + y^2 + z^2 = 9$ and $(x - 4)^2 + (y + 2)^2 + (z - 4)^2 = 9$.

7. Find the intersection of the sphere $x^2 + y^2 + z^2 = 9$ and the cylinder $x^2 + y^2 = 4$.

8. Find the trace of the hyperboloid of one sheet $\dfrac{x^2}{a^2} + \dfrac{y^2}{b^2} - \dfrac{z^2}{c^2} = 1$ in the plane $x = a$, and the trace in the plane $y = b$.

9. Find the trace of the hyperbolic paraboloid $\dfrac{x^2}{a^2} - \dfrac{y^2}{b^2} = \dfrac{z}{c}$ in the $xy$-plane.

C

10. It can be shown that any four noncoplanar points (i.e. points that do not lie in the same plane) determine a sphere.¹² Find the equation of the sphere that passes through the points $(0, 0, 0)$, $(0, 0, 2)$, $(1, -4, 3)$ and $(0, -1, 3)$. (Hint: Equation (1.31))

11. Show that the hyperboloid of one sheet is a doubly ruled surface, i.e. each point on the surface is on two lines lying entirely on the surface. (Hint: Write equation (1.35) as $\dfrac{x^2}{a^2} - \dfrac{z^2}{c^2} = 1 - \dfrac{y^2}{b^2}$, factor each side. Recall that two planes intersect in a line.)

12. Show that the hyperbolic paraboloid is a doubly ruled surface. (Hint: Exercise 11)

Figure 1.6.10 (the sphere S, with the labeled points (0, 0, 2), (a, b, c) and (x, y, 0))

13. Let S be the sphere with radius 1 centered at (0, 0, 1), and let $S^*$ be S without the “north pole” point (0, 0, 2). Let (a, b, c) be an arbitrary point on $S^*$. Then the line passing through (0, 0, 2) and (a, b, c) intersects the xy-plane at some point (x, y, 0), as in Figure 1.6.10. Find this point (x, y, 0) in terms of a, b and c.

(Note: Every point in the xy-plane can be matched with a point on $S^*$, and vice versa, in this manner. This method is called stereographic projection, which essentially identifies all of $\mathbb{R}^2$ with a “punctured” sphere.)

¹² See WELCHONS and KRICKENBERGER, p. 160, for a proof.


1.7 Curvilinear Coordinates

Figure 1.7.1 (the point (x, y, z) reached along the x-, y- and z-directions)

The Cartesian coordinates of a point (x, y, z) are determined by following straight paths starting from the origin: first along the x-axis, then parallel to the y-axis, then parallel to the z-axis, as in Figure 1.7.1. In curvilinear coordinate systems, these paths can be curved. The two types of curvilinear coordinates which we will consider are cylindrical and spherical coordinates. Instead of referencing a point in terms of sides of a rectangular parallelepiped, as with Cartesian coordinates, we will think of the point as lying on a cylinder or sphere. Cylindrical coordinates are often used when there is symmetry around the z-axis; spherical coordinates are useful when there is symmetry about the origin.

Let $P = (x, y, z)$ be a point in Cartesian coordinates in $\mathbb{R}^3$, and let $P_0 = (x, y, 0)$ be the projection of P upon the xy-plane. Treating (x, y) as a point in $\mathbb{R}^2$, let (r, θ) be its polar coordinates (see Figure 1.7.2). Let ρ be the length of the line segment from the origin to P, and let φ be the angle between that line segment and the positive z-axis (see Figure 1.7.3). φ is called the zenith angle. Then the cylindrical coordinates (r, θ, z) and the spherical coordinates (ρ, θ, φ) of P(x, y, z) are defined as follows:¹³

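Figure 1.7.2 Cylindrical coordinates

Cylindrical coordinates $(r, \theta, z)$:

\[
\begin{aligned}
x &= r\cos\theta & \qquad r &= \sqrt{x^2 + y^2} \\
y &= r\sin\theta & \qquad \theta &= \tan^{-1}\left(\frac{y}{x}\right) \\
z &= z & \qquad z &= z
\end{aligned}
\]

where $0 \le \theta \le \pi$ if $y \ge 0$ and $\pi < \theta < 2\pi$ if $y < 0$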

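Figure 1.7.3 Spherical coordinates

Spherical coordinates $(\rho, \theta, \phi)$:

\[
\begin{aligned}
x &= \rho\sin\phi\cos\theta & \qquad \rho &= \sqrt{x^2 + y^2 + z^2} \\
y &= \rho\sin\phi\sin\theta & \qquad \theta &= \tan^{-1}\left(\frac{y}{x}\right) \\
z &= \rho\cos\phi & \qquad \phi &= \cos^{-1}\left(\frac{z}{\sqrt{x^2 + y^2 + z^2}}\right)
\end{aligned}
\]

where $0 \le \theta \le \pi$ if $y \ge 0$ and $\pi < \theta < 2\pi$ if $y < 0$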

Both θ and φ are measured in radians. Note that r ≥ 0, 0 ≤ θ < 2π, ρ ≥ 0 and 0 ≤ φ ≤ π. Also, θ is undefined when (x, y) = (0, 0), and φ is undefined when (x, y, z) = (0, 0, 0).

¹³ This “standard” definition of spherical coordinates used by mathematicians results in a left-handed system. For this reason, physicists usually switch the definitions of θ and φ to make (ρ, θ, φ) a right-handed system.


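Example 1.31. Convert the point (−2, −2, 1) from Cartesian coordinates to (a) cylindrical and (b) spherical coordinates.

Solution: (a) $r = \sqrt{(-2)^2 + (-2)^2} = 2\sqrt{2}$, $\theta = \tan^{-1}\left(\dfrac{-2}{-2}\right) = \tan^{-1}(1) = \dfrac{5\pi}{4}$, since $y = -2 < 0$.

$\therefore\ (r, \theta, z) = \left(2\sqrt{2}, \dfrac{5\pi}{4}, 1\right)$

(b) $\rho = \sqrt{(-2)^2 + (-2)^2 + 1^2} = \sqrt{9} = 3$, $\phi = \cos^{-1}\left(\dfrac{1}{3}\right) \approx 1.23$ radians.

$\therefore\ (\rho, \theta, \phi) = \left(3, \dfrac{5\pi}{4}, 1.23\right)$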
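The conversion formulas can be sketched in Python as follows. This is a minimal illustration, not part of the original text: the function names `to_cylindrical` and `to_spherical` are made up for this example, and `math.atan2` reduced modulo 2π is used as a stand-in for the rule "0 ≤ θ ≤ π if y ≥ 0 and π < θ < 2π if y < 0", which the bare inverse tangent does not encode by itself.

```python
import math

def to_cylindrical(x, y, z):
    """Return (r, theta, z) with 0 <= theta < 2*pi."""
    r = math.hypot(x, y)
    if r == 0.0:
        # theta is undefined when (x, y) = (0, 0)
        raise ValueError("theta is undefined on the z-axis")
    # atan2 picks the correct quadrant; the modulo maps (-pi, pi] into [0, 2*pi)
    theta = math.atan2(y, x) % (2 * math.pi)
    return r, theta, z

def to_spherical(x, y, z):
    """Return (rho, theta, phi) with 0 <= theta < 2*pi and 0 <= phi <= pi."""
    if math.hypot(x, y) == 0.0:
        # covers the origin (phi undefined) and the rest of the z-axis (theta undefined)
        raise ValueError("theta is undefined on the z-axis")
    rho = math.sqrt(x * x + y * y + z * z)
    theta = math.atan2(y, x) % (2 * math.pi)
    phi = math.acos(z / rho)  # zenith angle, measured from the positive z-axis
    return rho, theta, phi

# The point from Example 1.31
print(to_cylindrical(-2.0, -2.0, 1.0))
print(to_spherical(-2.0, -2.0, 1.0))
```

For the point (−2, −2, 1) this should reproduce the values found in Example 1.31, up to floating-point rounding.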

For cylindrical coordinates (r, θ, z), and constants $r_0$, $\theta_0$ and $z_0$, we see from Figure 1.7.4 that the surface $r = r_0$ is a cylinder of radius $r_0$ centered along the z-axis, the surface $\theta = \theta_0$ is a half-plane emanating from the z-axis, and the surface $z = z_0$ is a plane parallel to the xy-plane.

Figure 1.7.4 Cylindrical coordinate surfaces: (a) $r = r_0$, (b) $\theta = \theta_0$, (c) $z = z_0$

For spherical coordinates (ρ, θ, φ), and constants $\rho_0$, $\theta_0$ and $\phi_0$, we see from Figure 1.7.5 that the surface $\rho = \rho_0$ is a sphere of radius $\rho_0$ centered at the origin, the surface $\theta = \theta_0$ is a half-plane emanating from the z-axis, and the surface $\phi = \phi_0$ is a circular cone whose vertex is at the origin.

Figure 1.7.5 Spherical coordinate surfaces: (a) $\rho = \rho_0$, (b) $\theta = \theta_0$, (c) $\phi = \phi_0$

Figures 1.7.4(a) and 1.7.5(a) show how these coordinate systems got their names.


Sometimes the equation of a surface in Cartesian coordinates can be transformed into a

simpler equation in some other coordinate system, as in the following example.

Example 1.32. Write the equation of the cylinder $x^2 + y^2 = 4$ in cylindrical coordinates.

Solution: Since $r = \sqrt{x^2 + y^2}$, then the equation in cylindrical coordinates is $r = 2$.

Using spherical coordinates to write the equation of a sphere does not necessarily make

the equation simpler, if the sphere is not centered at the origin.

Example 1.33. Write the equation $(x-2)^2 + (y-1)^2 + z^2 = 9$ in spherical coordinates.

Solution: Multiplying the equation out gives

\[ x^2 + y^2 + z^2 - 4x - 2y + 5 = 9 , \text{ so we get} \]

\[ \rho^2 - 4\rho\sin\phi\cos\theta - 2\rho\sin\phi\sin\theta - 4 = 0 , \text{ or} \]

\[ \rho^2 - 2\sin\phi\,(2\cos\theta + \sin\theta)\,\rho - 4 = 0 \]

after combining terms. Note that this actually makes it more difficult to figure out what the surface is, as opposed to the Cartesian equation where you could immediately identify the surface as a sphere of radius 3 centered at (2, 1, 0).
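As a sanity check on Example 1.33, the following Python sketch samples points on the sphere of radius 3 centered at (2, 1, 0), converts each to spherical coordinates, and evaluates the left side of the equation ρ² − 4ρ sin φ cos θ − 2ρ sin φ sin θ − 4 = 0. The helper names `sphere_point` and `residual` and the sampling grid are illustrative assumptions; this is a numerical spot check, not a derivation.

```python
import math

def sphere_point(u, v):
    """A point on the sphere (x-2)^2 + (y-1)^2 + z^2 = 9, from two angles u and v."""
    return (2 + 3 * math.sin(u) * math.cos(v),
            1 + 3 * math.sin(u) * math.sin(v),
            3 * math.cos(u))

def residual(x, y, z):
    """Left side of the spherical-coordinate equation, evaluated at a Cartesian point."""
    rho = math.sqrt(x * x + y * y + z * z)   # never 0 here: the origin is inside the sphere
    theta = math.atan2(y, x)
    phi = math.acos(z / rho)
    return (rho ** 2
            - 4 * rho * math.sin(phi) * math.cos(theta)
            - 2 * rho * math.sin(phi) * math.sin(theta)
            - 4)

# Largest residual over a small grid of sample points; it should be near zero
max_residual = max(
    abs(residual(*sphere_point(0.1 + 0.3 * i, 0.2 + 0.5 * j)))
    for i in range(10)
    for j in range(12)
)
print(max_residual)
```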

Example 1.34. Describe the surface given by θ = z in cylindrical coordinates.

Solution: This surface is called a helicoid. As the (vertical) z coordinate increases, so does the angle θ, while the radius r is unrestricted. So this sweeps out a (ruled!) surface shaped like a spiral staircase, where the spiral has an infinite radius. Figure 1.7.6 shows a section of this surface restricted to 0 ≤ z ≤ 4π and 0 ≤ r ≤ 2.

Figure 1.7.6 Helicoid θ = z


Exercises

A

For Exercises 1-4, find the (a) cylindrical and (b) spherical coordinates of the point whose Cartesian coordinates are given.

1. $(2, 2\sqrt{3}, -1)$

2. $(-5, 5, 6)$

3. $(\sqrt{21}, -\sqrt{7}, 0)$

4. $(0, \sqrt{2}, 2)$

For Exercises 5-7, write the given equation in (a) cylindrical and (b) spherical coordinates.

5. $x^2 + y^2 + z^2 = 25$

6. $x^2 + y^2 = 2y$

7. $x^2 + y^2 + 9z^2 = 36$

B

8. Describe the intersection of the surfaces whose equations in spherical coordinates are $\theta = \dfrac{\pi}{2}$ and $\phi = \dfrac{\pi}{4}$.

9. Show that for $a \ne 0$, the equation $\rho = 2a\sin\phi\cos\theta$ in spherical coordinates describes a sphere centered at (a, 0, 0) with radius $|a|$.

C

10. Let P = (a, θ, φ) be a point in spherical coordinates, with a > 0 and 0 < φ < π. Then P lies on the sphere ρ = a. Since 0 < φ < π, the line segment from the origin to P can be extended to intersect the cylinder given by r = a (in cylindrical coordinates). Find the cylindrical coordinates of that point of intersection.

11. Let $P_1$ and $P_2$ be points whose spherical coordinates are $(\rho_1, \theta_1, \phi_1)$ and $(\rho_2, \theta_2, \phi_2)$, respectively. Let $\mathbf{v}_1$ be the vector from the origin to $P_1$, and let $\mathbf{v}_2$ be the vector from the origin to $P_2$. For the angle γ between $\mathbf{v}_1$ and $\mathbf{v}_2$, show that

\[ \cos\gamma = \cos\phi_1\cos\phi_2 + \sin\phi_1\sin\phi_2\cos(\theta_2 - \theta_1). \]

This formula is used in electrodynamics to prove the addition theorem for spherical harmonics, which provides a general expression for the electrostatic potential at a point due to a unit charge. See pp. 100-102 in JACKSON.

12. Show that the distance d between the points $P_1$ and $P_2$ with cylindrical coordinates $(r_1, \theta_1, z_1)$ and $(r_2, \theta_2, z_2)$, respectively, is

\[ d = \sqrt{r_1^2 + r_2^2 - 2r_1r_2\cos(\theta_2 - \theta_1) + (z_2 - z_1)^2} . \]

13. Show that the distance d between the points $P_1$ and $P_2$ with spherical coordinates $(\rho_1, \theta_1, \phi_1)$ and $(\rho_2, \theta_2, \phi_2)$, respectively, is

\[ d = \sqrt{\rho_1^2 + \rho_2^2 - 2\rho_1\rho_2\left[\sin\phi_1\sin\phi_2\cos(\theta_2 - \theta_1) + \cos\phi_1\cos\phi_2\right]} . \]


1.8 Vector-Valued Functions

Now that we are familiar with vectors and their operations, we can begin discussing functions whose values are vectors.

Definition 1.10. A vector-valued function of a real variable is a rule that associates a vector $\mathbf{f}(t)$ with a real number t, where t is in some subset D of $\mathbb{R}^1$ (called the domain of $\mathbf{f}$). We write $\mathbf{f} : D \to \mathbb{R}^3$ to denote that $\mathbf{f}$ is a mapping of D into $\mathbb{R}^3$.

For example, $\mathbf{f}(t) = t\mathbf{i} + t^2\mathbf{j} + t^3\mathbf{k}$ is a vector-valued function in $\mathbb{R}^3$, defined for all real numbers t. We would write $\mathbf{f} : \mathbb{R} \to \mathbb{R}^3$. At t = 1 the value of the function is the vector $\mathbf{i} + \mathbf{j} + \mathbf{k}$, which in Cartesian coordinates has the terminal point (1, 1, 1).

A vector-valued function of a real variable can be written in component form as

\[ \mathbf{f}(t) = f_1(t)\mathbf{i} + f_2(t)\mathbf{j} + f_3(t)\mathbf{k} \]

or in the form

\[ \mathbf{f}(t) = (f_1(t), f_2(t), f_3(t)) \]

for some real-valued functions $f_1(t)$, $f_2(t)$, $f_3(t)$, called the component functions of $\mathbf{f}$. The first form is often used when emphasizing that $\mathbf{f}(t)$ is a vector, and the second form is useful when considering just the terminal points of the vectors. By identifying vectors with their terminal points, a curve in space can be written as a vector-valued function.

Figure 1.8.1 (helix, with the points f(0) and f(2π) labeled)

Example 1.35. Define $\mathbf{f} : \mathbb{R} \to \mathbb{R}^3$ by $\mathbf{f}(t) = (\cos t, \sin t, t)$. This is the equation of a helix (see Figure 1.8.1). As the value of t increases, the terminal points of $\mathbf{f}(t)$ trace out a curve spiraling upward. For each t, the x- and y-coordinates of $\mathbf{f}(t)$ are $x = \cos t$ and $y = \sin t$, so

\[ x^2 + y^2 = \cos^2 t + \sin^2 t = 1. \]

Thus, the curve lies on the surface of the right circular cylinder $x^2 + y^2 = 1$.

It may help to think of vector-valued functions of a real variable in $\mathbb{R}^3$ as a generalization of the parametric functions in $\mathbb{R}^2$ which you learned about in single-variable calculus. Much of the theory of real-valued functions of a single real variable can be applied to vector-valued functions of a real variable. Since each of the three component functions is real-valued, it will sometimes be the case that results from single-variable calculus can simply be applied to each of the component functions to yield a similar result for the vector-valued function. However, there are times when such generalizations do not hold (see Exercise 13). The concept of a limit, though, can be extended naturally to vector-valued functions, as in the following definition.


Definition 1.11. Let $\mathbf{f}(t)$ be a vector-valued function, let a be a real number and let $\mathbf{c}$ be a vector. Then we say that the limit of $\mathbf{f}(t)$ as t approaches a equals $\mathbf{c}$, written as $\lim_{t \to a} \mathbf{f}(t) = \mathbf{c}$, if $\lim_{t \to a} |\mathbf{f}(t) - \mathbf{c}| = 0$. If $\mathbf{f}(t) = (f_1(t), f_2(t), f_3(t))$, then

\[ \lim_{t \to a} \mathbf{f}(t) = \left( \lim_{t \to a} f_1(t),\ \lim_{t \to a} f_2(t),\ \lim_{t \to a} f_3(t) \right) \]

provided that all three limits on the right side exist.

The above definition shows that continuity and the derivative of a vector-valued function can also be defined in terms of its component functions.

Definition 1.12. Let $\mathbf{f}(t) = (f_1(t), f_2(t), f_3(t))$ be a vector-valued function, and let a be a real number in its domain. Then $\mathbf{f}(t)$ is continuous at a if $\lim_{t \to a} \mathbf{f}(t) = \mathbf{f}(a)$. Equivalently, $\mathbf{f}(t)$ is continuous at a if and only if $f_1(t)$, $f_2(t)$, and $f_3(t)$ are continuous at a.

The derivative of $\mathbf{f}(t)$ at a, denoted by $\mathbf{f}'(a)$ or $\dfrac{d\mathbf{f}}{dt}(a)$, is the limit

\[ \mathbf{f}'(a) = \lim_{h \to 0} \frac{\mathbf{f}(a+h) - \mathbf{f}(a)}{h} \]

if that limit exists. Equivalently, $\mathbf{f}'(a) = (f_1'(a), f_2'(a), f_3'(a))$, if the component derivatives exist. We say that $\mathbf{f}(t)$ is differentiable at a if $\mathbf{f}'(a)$ exists.

Recall that the derivative of a real-valued function of a single variable is a real number,

representing the slope of the tangent line to the graph of the function at a point. Similarly,

the derivative of a vector-valued function is a tangent vector to the curve in space which

the function represents, and it lies on the tangent line to the curve (see Figure 1.8.2).

Figure 1.8.2 Tangent vector $\mathbf{f}'(a)$ and tangent line $L = \mathbf{f}(a) + s\mathbf{f}'(a)$

Example 1.36. Let $\mathbf{f}(t) = (\cos t, \sin t, t)$. Then $\mathbf{f}'(t) = (-\sin t, \cos t, 1)$ for all t. The tangent line L to the curve at $\mathbf{f}(2\pi) = (1, 0, 2\pi)$ is $L = \mathbf{f}(2\pi) + s\mathbf{f}'(2\pi) = (1, 0, 2\pi) + s(0, 1, 1)$, or in parametric form: $x = 1$, $y = s$, $z = 2\pi + s$ for $-\infty < s < \infty$.
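The componentwise derivative and the tangent line of Example 1.36 can be approximated numerically. The sketch below is only an illustration: the helper names `derivative` and `tangent_line` are assumptions, and a central difference quotient with a small step h replaces the exact limit from Definition 1.12.

```python
import math

def f(t):
    """The helix f(t) = (cos t, sin t, t)."""
    return (math.cos(t), math.sin(t), t)

def derivative(func, a, h=1e-6):
    """Central-difference approximation of func'(a), taken component by component."""
    forward, backward = func(a + h), func(a - h)
    return tuple((p - q) / (2 * h) for p, q in zip(forward, backward))

def tangent_line(func, a):
    """Return the map s -> func(a) + s * func'(a), using the approximate derivative."""
    point = func(a)
    direction = derivative(func, a)
    return lambda s: tuple(p + s * d for p, d in zip(point, direction))

L = tangent_line(f, 2 * math.pi)
print(derivative(f, 2 * math.pi))  # close to (0, 1, 1)
print(L(1.0))                      # close to (1, 1, 2*pi + 1)
```

The printed values agree with the exact answer only up to the error of the difference quotient, which is why the comments say "close to".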


A scalar function is a real-valued function. Note that if u(t) is a scalar function and $\mathbf{f}(t)$ is a vector-valued function, then their product, defined by $(u\mathbf{f})(t) = u(t)\,\mathbf{f}(t)$ for all t, is a vector-valued function (since the product of a scalar with a vector is a vector).

The basic properties of derivatives of vector-valued functions are summarized in the following theorem.

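Theorem 1.20. Let $\mathbf{f}(t)$ and $\mathbf{g}(t)$ be differentiable vector-valued functions, let u(t) be a differentiable scalar function, let k be a scalar, and let $\mathbf{c}$ be a constant vector. Then

(a) $\dfrac{d}{dt}(\mathbf{c}) = \mathbf{0}$

(b) $\dfrac{d}{dt}(k\mathbf{f}) = k\,\dfrac{d\mathbf{f}}{dt}$

(c) $\dfrac{d}{dt}(\mathbf{f} + \mathbf{g}) = \dfrac{d\mathbf{f}}{dt} + \dfrac{d\mathbf{g}}{dt}$

(d) $\dfrac{d}{dt}(\mathbf{f} - \mathbf{g}) = \dfrac{d\mathbf{f}}{dt} - \dfrac{d\mathbf{g}}{dt}$

(e) $\dfrac{d}{dt}(u\mathbf{f}) = \dfrac{du}{dt}\,\mathbf{f} + u\,\dfrac{d\mathbf{f}}{dt}$

(f) $\dfrac{d}{dt}(\mathbf{f} \cdot \mathbf{g}) = \dfrac{d\mathbf{f}}{dt} \cdot \mathbf{g} + \mathbf{f} \cdot \dfrac{d\mathbf{g}}{dt}$

(g) $\dfrac{d}{dt}(\mathbf{f} \times \mathbf{g}) = \dfrac{d\mathbf{f}}{dt} \times \mathbf{g} + \mathbf{f} \times \dfrac{d\mathbf{g}}{dt}$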

Proof: The proofs of parts (a)-(e) follow easily by differentiating the component functions and using the rules for derivatives from single-variable calculus. We will prove part (f), and leave the proof of part (g) as an exercise for the reader.

(f) Write $\mathbf{f}(t) = (f_1(t), f_2(t), f_3(t))$ and $\mathbf{g}(t) = (g_1(t), g_2(t), g_3(t))$, where the component functions $f_1(t)$, $f_2(t)$, $f_3(t)$, $g_1(t)$, $g_2(t)$, $g_3(t)$ are all differentiable real-valued functions. Then

\[
\begin{aligned}
\frac{d}{dt}(\mathbf{f}(t) \cdot \mathbf{g}(t)) &= \frac{d}{dt}\big(f_1(t)\,g_1(t) + f_2(t)\,g_2(t) + f_3(t)\,g_3(t)\big) \\
&= \frac{d}{dt}\big(f_1(t)\,g_1(t)\big) + \frac{d}{dt}\big(f_2(t)\,g_2(t)\big) + \frac{d}{dt}\big(f_3(t)\,g_3(t)\big) \\
&= \frac{df_1}{dt}(t)\,g_1(t) + f_1(t)\,\frac{dg_1}{dt}(t) + \frac{df_2}{dt}(t)\,g_2(t) + f_2(t)\,\frac{dg_2}{dt}(t) + \frac{df_3}{dt}(t)\,g_3(t) + f_3(t)\,\frac{dg_3}{dt}(t) \\
&= \left(\frac{df_1}{dt}(t), \frac{df_2}{dt}(t), \frac{df_3}{dt}(t)\right) \cdot (g_1(t), g_2(t), g_3(t)) \\
&\qquad + (f_1(t), f_2(t), f_3(t)) \cdot \left(\frac{dg_1}{dt}(t), \frac{dg_2}{dt}(t), \frac{dg_3}{dt}(t)\right) \\
&= \frac{d\mathbf{f}}{dt}(t) \cdot \mathbf{g}(t) + \mathbf{f}(t) \cdot \frac{d\mathbf{g}}{dt}(t) \quad \text{for all } t.
\end{aligned}
\]

QED


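Example 1.37. Suppose $\mathbf{f}(t)$ is differentiable. Find the derivative of $|\mathbf{f}(t)|$.

Solution: Since $|\mathbf{f}(t)|$ is a real-valued function of t, then by the Chain Rule for real-valued functions, we know that $\dfrac{d}{dt}|\mathbf{f}(t)|^2 = 2\,|\mathbf{f}(t)|\,\dfrac{d}{dt}|\mathbf{f}(t)|$.

But $|\mathbf{f}(t)|^2 = \mathbf{f}(t) \cdot \mathbf{f}(t)$, so $\dfrac{d}{dt}|\mathbf{f}(t)|^2 = \dfrac{d}{dt}(\mathbf{f}(t) \cdot \mathbf{f}(t))$. Hence, we have

\[
\begin{aligned}
2\,|\mathbf{f}(t)|\,\frac{d}{dt}|\mathbf{f}(t)| &= \frac{d}{dt}(\mathbf{f}(t) \cdot \mathbf{f}(t)) = \mathbf{f}'(t) \cdot \mathbf{f}(t) + \mathbf{f}(t) \cdot \mathbf{f}'(t) \quad \text{by Theorem 1.20(f), so} \\
&= 2\,\mathbf{f}'(t) \cdot \mathbf{f}(t) , \text{ so if } |\mathbf{f}(t)| \ne 0 \text{ then}
\end{aligned}
\]

\[ \frac{d}{dt}|\mathbf{f}(t)| = \frac{\mathbf{f}'(t) \cdot \mathbf{f}(t)}{|\mathbf{f}(t)|} . \]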

We know that $|\mathbf{f}(t)|$ is constant if and only if $\dfrac{d}{dt}|\mathbf{f}(t)| = 0$ for all t. Also, $\mathbf{f}(t) \perp \mathbf{f}'(t)$ if and only if $\mathbf{f}'(t) \cdot \mathbf{f}(t) = 0$. Thus, the above example shows this important fact:

If $|\mathbf{f}(t)| \ne 0$, then $|\mathbf{f}(t)|$ is constant if and only if $\mathbf{f}(t) \perp \mathbf{f}'(t)$ for all t.

This means that if a curve lies completely on a sphere (or circle) centered at the origin, then the tangent vector $\mathbf{f}'(t)$ is always perpendicular to the position vector $\mathbf{f}(t)$.

Example 1.38. The spherical spiral $\mathbf{f}(t) = \left( \dfrac{\cos t}{\sqrt{1 + a^2t^2}},\ \dfrac{\sin t}{\sqrt{1 + a^2t^2}},\ \dfrac{-at}{\sqrt{1 + a^2t^2}} \right)$, for $a \ne 0$.

Figure 1.8.3 shows the graph of the curve when a = 0.2. In the exercises, the reader will be asked to show that this curve lies on the sphere $x^2 + y^2 + z^2 = 1$ and to verify directly that $\mathbf{f}'(t) \cdot \mathbf{f}(t) = 0$ for all t.

Figure 1.8.3 Spherical spiral with a = 0.2
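Before attempting a proof, the two properties of the spherical spiral can be spot-checked numerically. The following Python sketch is an illustration under stated assumptions: the names `spiral`, `dot` and `derivative` are made up here, a = 0.2 is taken from Figure 1.8.3, and the derivative is only approximated by a central difference, so a small nonzero dot product is expected. It is not a substitute for the exercise.

```python
import math

def spiral(t, a=0.2):
    """Spherical spiral from Example 1.38 with parameter a."""
    s = math.sqrt(1 + a * a * t * t)
    return (math.cos(t) / s, math.sin(t) / s, -a * t / s)

def dot(u, v):
    """Dot product of two vectors given as tuples."""
    return sum(p * q for p, q in zip(u, v))

def derivative(func, t, h=1e-6):
    """Central-difference approximation of func'(t), component by component."""
    forward, backward = func(t + h), func(t - h)
    return tuple((p - q) / (2 * h) for p, q in zip(forward, backward))

samples = [-10 + 0.5 * k for k in range(41)]

# How far |f(t)| strays from 1, and how far f'(t) . f(t) strays from 0
max_norm_error = max(abs(math.sqrt(dot(spiral(t), spiral(t))) - 1) for t in samples)
max_dot = max(abs(dot(derivative(spiral, t), spiral(t))) for t in samples)
print(max_norm_error, max_dot)
```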


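Just as in single-variable calculus, higher-order derivatives of vector-valued functions are obtained by repeatedly differentiating the (first) derivative of the function:

\[ \mathbf{f}''(t) = \frac{d}{dt}\mathbf{f}'(t) , \quad \mathbf{f}'''(t) = \frac{d}{dt}\mathbf{f}''(t) , \quad \ldots , \quad \frac{d^n\mathbf{f}}{dt^n} = \frac{d}{dt}\left( \frac{d^{n-1}\mathbf{f}}{dt^{n-1}} \right) \quad (\text{for } n = 2, 3, 4, \ldots) \]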

We can use vector-valued functions to represent physical quantities, such as velocity, acceleration, force, momentum, etc. For example, let the real variable t represent time elapsed from some initial time (t = 0), and suppose that an object of constant mass m is subjected to some force so that it moves in space, with its position (x, y, z) at time t a function of t. That is, x = x(t), y = y(t), z = z(t) for some real-valued functions x(t), y(t), z(t). Call $\mathbf{r}(t) = (x(t), y(t), z(t))$ the position vector of the object. We can define various physical quantities associated with the object as follows:¹⁴

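position: $\mathbf{r}(t) = (x(t), y(t), z(t))$

velocity: $\mathbf{v}(t) = \dot{\mathbf{r}}(t) = \mathbf{r}'(t) = \dfrac{d\mathbf{r}}{dt} = (x'(t), y'(t), z'(t))$

acceleration: $\mathbf{a}(t) = \dot{\mathbf{v}}(t) = \mathbf{v}'(t) = \dfrac{d\mathbf{v}}{dt} = \ddot{\mathbf{r}}(t) = \mathbf{r}''(t) = \dfrac{d^2\mathbf{r}}{dt^2} = (x''(t), y''(t), z''(t))$

momentum: $\mathbf{p}(t) = m\mathbf{v}(t)$

force: $\mathbf{F}(t) = \dot{\mathbf{p}}(t) = \mathbf{p}'(t) = \dfrac{d\mathbf{p}}{dt}$ (Newton’s Second Law of Motion)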

The magnitude $|\mathbf{v}(t)|$ of the velocity vector is called the speed of the object. Note that since the mass m is a constant, the force equation becomes the familiar $\mathbf{F}(t) = m\mathbf{a}(t)$.

Example 1.39. Let $\mathbf{r}(t) = (5\cos t, 3\sin t, 4\sin t)$ be the position vector of an object at time t ≥ 0. Find its (a) velocity and (b) acceleration vectors.

Solution: (a) $\mathbf{v}(t) = \dot{\mathbf{r}}(t) = (-5\sin t, 3\cos t, 4\cos t)$

(b) $\mathbf{a}(t) = \dot{\mathbf{v}}(t) = (-5\cos t, -3\sin t, -4\sin t)$

Note that $|\mathbf{r}(t)| = \sqrt{25\cos^2 t + 25\sin^2 t} = 5$ for all t, so by Example 1.37 we know that $\mathbf{r}(t) \cdot \dot{\mathbf{r}}(t) = 0$ for all t (which we can verify from part (a)). In fact, $|\mathbf{v}(t)| = 5$ for all t also. And not only does $\mathbf{r}(t)$ lie on the sphere of radius 5 centered at the origin, but perhaps not so obvious is that it lies completely within a circle of radius 5 centered at the origin. Also, note that $\mathbf{a}(t) = -\mathbf{r}(t)$. It turns out (see Exercise 16) that whenever an object moves in a circle with constant speed, the acceleration vector will point in the opposite direction of the position vector (i.e. towards the center of the circle).
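The claims in Example 1.39 can be checked at sample times with a short Python sketch. The function names below are illustrative, and the sample times are arbitrary. One observation added here, which is not stated in the example: every point of the curve satisfies 4y − 3z = 0, a plane through the origin, which is one way to see why the curve is a circle of radius 5 and not merely a curve on the sphere.

```python
import math

def r(t):
    """Position vector from Example 1.39."""
    return (5 * math.cos(t), 3 * math.sin(t), 4 * math.sin(t))

def v(t):
    """Velocity: componentwise derivative of r(t)."""
    return (-5 * math.sin(t), 3 * math.cos(t), 4 * math.cos(t))

def a(t):
    """Acceleration: componentwise derivative of v(t)."""
    return (-5 * math.cos(t), -3 * math.sin(t), -4 * math.sin(t))

def dot(u, w):
    return sum(p * q for p, q in zip(u, w))

def norm(u):
    return math.sqrt(dot(u, u))

samples = [0.25 * k for k in range(40)]

# Each quantity below should be zero up to rounding error
worst = max(
    max(abs(norm(r(t)) - 5),             # |r(t)| = 5
        abs(norm(v(t)) - 5),             # |v(t)| = 5
        abs(dot(r(t), v(t))),            # r(t) is perpendicular to v(t)
        abs(4 * r(t)[1] - 3 * r(t)[2]))  # the curve stays in the plane 4y = 3z
    for t in samples
)
print(worst)
```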

¹⁴ We will often use the older dot notation for derivatives when physics is involved.


Recall from Section 1.5 that if $\mathbf{r}_1$, $\mathbf{r}_2$ are position vectors to distinct points then $\mathbf{r}_1 + t(\mathbf{r}_2 - \mathbf{r}_1)$ represents a line through those two points as t varies over all real numbers. That vector sum can be written as $(1-t)\mathbf{r}_1 + t\mathbf{r}_2$. So the function $\mathbf{l}(t) = (1-t)\mathbf{r}_1 + t\mathbf{r}_2$ is a line through the terminal points of $\mathbf{r}_1$ and $\mathbf{r}_2$, and when t is restricted to the interval [0, 1] it is the line segment between the points, with $\mathbf{l}(0) = \mathbf{r}_1$ and $\mathbf{l}(1) = \mathbf{r}_2$.

In general, a function of the form $\mathbf{f}(t) = (a_1t + b_1,\ a_2t + b_2,\ a_3t + b_3)$ represents a line in $\mathbb{R}^3$. A function of the form $\mathbf{f}(t) = (a_1t^2 + b_1t + c_1,\ a_2t^2 + b_2t + c_2,\ a_3t^2 + b_3t + c_3)$ represents a (possibly degenerate) parabola in $\mathbb{R}^3$.

Example 1.40. Bézier curves are used in Computer Aided Design (CAD) to approximate the shape of a polygonal path in space (called the Bézier polygon or control polygon). For instance, given three points (or position vectors) $\mathbf{b}_0$, $\mathbf{b}_1$, $\mathbf{b}_2$ in $\mathbb{R}^3$, define

\[
\begin{aligned}
\mathbf{b}_0^1(t) &= (1-t)\mathbf{b}_0 + t\mathbf{b}_1 \\
\mathbf{b}_1^1(t) &= (1-t)\mathbf{b}_1 + t\mathbf{b}_2 \\
\mathbf{b}_0^2(t) &= (1-t)\mathbf{b}_0^1(t) + t\mathbf{b}_1^1(t) \\
&= (1-t)^2\mathbf{b}_0 + 2t(1-t)\mathbf{b}_1 + t^2\mathbf{b}_2
\end{aligned}
\]

for all real t. For t in the interval [0, 1], we see that $\mathbf{b}_0^1(t)$ is the line segment between $\mathbf{b}_0$ and $\mathbf{b}_1$, and $\mathbf{b}_1^1(t)$ is the line segment between $\mathbf{b}_1$ and $\mathbf{b}_2$. The function $\mathbf{b}_0^2(t)$ is the Bézier curve for the points $\mathbf{b}_0$, $\mathbf{b}_1$, $\mathbf{b}_2$. Note from the last formula that the curve is a parabola that goes through $\mathbf{b}_0$ (when t = 0) and $\mathbf{b}_2$ (when t = 1).

As an example, let $\mathbf{b}_0 = (0, 0, 0)$, $\mathbf{b}_1 = (1, 2, 3)$, and $\mathbf{b}_2 = (4, 5, 2)$. Then the explicit formula for the Bézier curve is $\mathbf{b}_0^2(t) = (2t + 2t^2,\ 4t + t^2,\ 6t - 4t^2)$, as shown in Figure 1.8.4, where the line segments are $\mathbf{b}_0^1(t)$ and $\mathbf{b}_1^1(t)$, and the curve is $\mathbf{b}_0^2(t)$.

Figure 1.8.4 Bézier curve approximation for three points (control points (0, 0, 0), (1, 2, 3), (4, 5, 2))
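The repeated linear interpolation in Example 1.40 can be written as a short loop. The Python sketch below is a minimal illustration, assuming points are stored as tuples; the names `lerp`, `de_casteljau` and `explicit_curve` are made up for this example and do not come from the text. It compares the interpolation result with the explicit formula found above for the three control points.

```python
def lerp(p, q, t):
    """Linear interpolation (1 - t) * p + t * q between two points."""
    return tuple((1 - t) * a + t * b for a, b in zip(p, q))

def de_casteljau(points, t):
    """Evaluate the Bezier curve of the control points at t by repeated interpolation."""
    pts = [tuple(p) for p in points]
    while len(pts) > 1:
        # Each pass replaces n points by the n - 1 interpolants of neighboring pairs
        pts = [lerp(p, q, t) for p, q in zip(pts, pts[1:])]
    return pts[0]

def explicit_curve(t):
    """Explicit formula from Example 1.40 for b0 = (0,0,0), b1 = (1,2,3), b2 = (4,5,2)."""
    return (2 * t + 2 * t * t, 4 * t + t * t, 6 * t - 4 * t * t)

control = [(0, 0, 0), (1, 2, 3), (4, 5, 2)]

for t in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(t, de_casteljau(control, t), explicit_curve(t))
```

The same `de_casteljau` function accepts any number of control points, so it also covers the four-point case described in the exercises without giving its explicit formula.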


In general, the polygonal path determined by n ≥ 3 noncollinear points in $\mathbb{R}^3$ can be used to define the Bézier curve recursively by a process called repeated linear interpolation. This curve will be a vector-valued function whose components are polynomials of degree n − 1, and its formula is given by de Casteljau’s algorithm.¹⁵ In the exercises, the reader will be given the algorithm for the case of n = 4 points and asked to write the explicit formula for the Bézier curve for the four points shown in Figure 1.8.5.

Figure 1.8.5 Bézier curve approximation for four points (control points (0, 0, 0), (0, 1, 1), (2, 3, 0), (4, 5, 2))

Exercises

A

For Exercises 1-4, calculate $\mathbf{f}'(t)$ and find the tangent line at $\mathbf{f}(0)$.

1. $\mathbf{f}(t) = (t + 1,\ t^2 + 1,\ t^3 + 1)$

2. $\mathbf{f}(t) = (e^t + 1,\ e^{2t} + 1,\ e^{t^2} + 1)$

3. $\mathbf{f}(t) = (\cos 2t,\ \sin 2t,\ t)$

4. $\mathbf{f}(t) = (\sin 2t,\ 2\sin^2 t,\ 2\cos t)$

For Exercises 5-6, find the velocity $\mathbf{v}(t)$ and acceleration $\mathbf{a}(t)$ of an object with the given position vector $\mathbf{r}(t)$.

5. $\mathbf{r}(t) = (t,\ t - \sin t,\ 1 - \cos t)$

6. $\mathbf{r}(t) = (3\cos t,\ 2\sin t,\ 1)$

B

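7. Let $\mathbf{f}(t) = \left( \dfrac{\cos t}{\sqrt{1 + a^2t^2}},\ \dfrac{\sin t}{\sqrt{1 + a^2t^2}},\ \dfrac{-at}{\sqrt{1 + a^2t^2}} \right)$, with $a \ne 0$.

(a) Show that $|\mathbf{f}(t)| = 1$ for all t.

(b) Show directly that $\mathbf{f}'(t) \cdot \mathbf{f}(t) = 0$ for all t.

8. If $\mathbf{f}'(t) = \mathbf{0}$ for all t in some interval (a, b), show that $\mathbf{f}(t)$ is a constant vector in (a, b).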

¹⁵ See pp. 27-30 in FARIN.


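9. For a constant vector $\mathbf{c} \ne \mathbf{0}$, the function $\mathbf{f}(t) = t\mathbf{c}$ represents a line parallel to $\mathbf{c}$.

(a) What kind of curve does $\mathbf{g}(t) = t^3\mathbf{c}$ represent? Explain.

(b) What kind of curve does $\mathbf{h}(t) = e^t\mathbf{c}$ represent? Explain.

(c) Compare $\mathbf{f}'(0)$ and $\mathbf{g}'(0)$. Given your answer to part (a), how do you explain the difference in the two derivatives?

10. Show that $\dfrac{d}{dt}\left( \mathbf{f} \times \dfrac{d\mathbf{f}}{dt} \right) = \mathbf{f} \times \dfrac{d^2\mathbf{f}}{dt^2}$.

11. Let a particle of (constant) mass m have position vector $\mathbf{r}(t)$, velocity $\mathbf{v}(t)$, acceleration $\mathbf{a}(t)$ and momentum $\mathbf{p}(t)$ at time t. The angular momentum $\mathbf{L}(t)$ of the particle with respect to the origin at time t is defined as $\mathbf{L}(t) = \mathbf{r}(t) \times \mathbf{p}(t)$. If $\mathbf{F}(t)$ is the force acting on the particle at time t, then define the torque $\mathbf{N}(t)$ acting on the particle with respect to the origin as $\mathbf{N}(t) = \mathbf{r}(t) \times \mathbf{F}(t)$. Show that $\mathbf{L}'(t) = \mathbf{N}(t)$.

12. Show that $\dfrac{d}{dt}(\mathbf{f} \cdot (\mathbf{g} \times \mathbf{h})) = \dfrac{d\mathbf{f}}{dt} \cdot (\mathbf{g} \times \mathbf{h}) + \mathbf{f} \cdot \left( \dfrac{d\mathbf{g}}{dt} \times \mathbf{h} \right) + \mathbf{f} \cdot \left( \mathbf{g} \times \dfrac{d\mathbf{h}}{dt} \right)$.

13. The Mean Value Theorem does not hold for vector-valued functions: Show that for $\mathbf{f}(t) = (\cos t, \sin t, t)$, there is no t in the interval (0, 2π) such that

\[ \mathbf{f}'(t) = \frac{\mathbf{f}(2\pi) - \mathbf{f}(0)}{2\pi - 0} . \]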

C

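14. The Bézier curve $\mathbf{b}_0^3(t)$ for four noncollinear points $\mathbf{b}_0$, $\mathbf{b}_1$, $\mathbf{b}_2$, $\mathbf{b}_3$ in $\mathbb{R}^3$ is defined by the following algorithm (going from the left column to the right):

\[
\begin{array}{lll}
\mathbf{b}_0^1(t) = (1-t)\mathbf{b}_0 + t\mathbf{b}_1 & \mathbf{b}_0^2(t) = (1-t)\mathbf{b}_0^1(t) + t\mathbf{b}_1^1(t) & \mathbf{b}_0^3(t) = (1-t)\mathbf{b}_0^2(t) + t\mathbf{b}_1^2(t) \\
\mathbf{b}_1^1(t) = (1-t)\mathbf{b}_1 + t\mathbf{b}_2 & \mathbf{b}_1^2(t) = (1-t)\mathbf{b}_1^1(t) + t\mathbf{b}_2^1(t) & \\
\mathbf{b}_2^1(t) = (1-t)\mathbf{b}_2 + t\mathbf{b}_3 & &
\end{array}
\]

(a) Show that $\mathbf{b}_0^3(t) = (1-t)^3\mathbf{b}_0 + 3t(1-t)^2\mathbf{b}_1 + 3t^2(1-t)\mathbf{b}_2 + t^3\mathbf{b}_3$.

(b) Write the explicit formula (as in Example 1.40) for the Bézier curve for the points $\mathbf{b}_0 = (0, 0, 0)$, $\mathbf{b}_1 = (0, 1, 1)$, $\mathbf{b}_2 = (2, 3, 0)$, $\mathbf{b}_3 = (4, 5, 2)$.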

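15. Let $\mathbf{r}(t)$ be the position vector for a particle moving in $\mathbb{R}^3$. Show that

\[ \frac{d}{dt}(\mathbf{r} \times (\mathbf{v} \times \mathbf{r})) = |\mathbf{r}|^2\mathbf{a} + (\mathbf{r} \cdot \mathbf{v})\mathbf{v} - (|\mathbf{v}|^2 + \mathbf{r} \cdot \mathbf{a})\mathbf{r}. \]

16. Let $\mathbf{r}(t)$ be the position vector in $\mathbb{R}^3$ for a particle that moves with constant speed c > 0 in a circle of radius a > 0 in the xy-plane. Show that $\mathbf{a}(t)$ points in the opposite direction as $\mathbf{r}(t)$ for all t. (Hint: Use Example 1.37 to show that $\mathbf{r}(t) \perp \mathbf{v}(t)$ and $\mathbf{a}(t) \perp \mathbf{v}(t)$, and hence $\mathbf{a}(t) \parallel \mathbf{r}(t)$.)

17. Prove Theorem 1.20(g).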


1.9 Arc Length

Let $\mathbf{r}(t) = (x(t), y(t), z(t))$ be the position vector of an object moving in $\mathbb{R}^3$. Since $|\mathbf{v}(t)|$ is the speed of the object at time t, it seems natural to define the distance s traveled by the object from time t = a to t = b as the definite integral

\[ s = \int_a^b |\mathbf{v}(t)|\,dt = \int_a^b \sqrt{x'(t)^2 + y'(t)^2 + z'(t)^2}\,dt , \tag{1.40} \]

which is analogous to the case from single-variable calculus for parametric functions in $\mathbb{R}^2$. This is indeed how we will define the distance traveled and, in general, the arc length of a curve in $\mathbb{R}^3$.

Definition 1.13. Let $\mathbf{f}(t) = (x(t), y(t), z(t))$ be a curve in $\mathbb{R}^3$ whose domain includes the interval [a, b]. Suppose that in the interval (a, b) the first derivative of each component function x(t), y(t) and z(t) exists and is continuous, and that no section of the curve is repeated. Then the arc length L of the curve from t = a to t = b is

\[ L = \int_a^b |\mathbf{f}'(t)|\,dt = \int_a^b \sqrt{x'(t)^2 + y'(t)^2 + z'(t)^2}\,dt \tag{1.41} \]

A real-valued function whose first derivative is continuous is called continuously differentiable (or a $C^1$ function), and a function whose derivatives of all orders are continuous is called smooth (or a $C^\infty$ function). All the functions we will consider will be smooth. A smooth curve $\mathbf{f}(t)$ is one whose derivative $\mathbf{f}'(t)$ is never the zero vector and whose component functions are all smooth.

Note that we did not prove that the formula in the above definition actually gives the length of a section of a curve. A rigorous proof requires dealing with some subtleties, normally glossed over in calculus texts, which are beyond the scope of this book.¹⁶

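Example 1.41. Find the length L of the helix $\mathbf{f}(t) = (\cos t, \sin t, t)$ from t = 0 to t = 2π.

Solution: By formula (1.41), we have

\[
\begin{aligned}
L &= \int_0^{2\pi} \sqrt{(-\sin t)^2 + (\cos t)^2 + 1^2}\,dt = \int_0^{2\pi} \sqrt{\sin^2 t + \cos^2 t + 1}\,dt = \int_0^{2\pi} \sqrt{2}\,dt \\
&= \sqrt{2}\,(2\pi - 0) = 2\sqrt{2}\,\pi
\end{aligned}
\]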
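When the integral in formula (1.41) has no convenient closed form, it can be approximated numerically. The Python sketch below applies composite Simpson's rule to the speed of the helix from Example 1.41. The names `speed` and `simpson` and the choice of 200 subintervals are assumptions made for illustration; for this particular curve the integrand is constant, so the approximation should match 2√2 π up to rounding.

```python
import math

def speed(t):
    """|f'(t)| for the helix f(t) = (cos t, sin t, t), where f'(t) = (-sin t, cos t, 1)."""
    return math.sqrt(math.sin(t) ** 2 + math.cos(t) ** 2 + 1.0)

def simpson(g, a, b, n=200):
    """Composite Simpson's rule for the integral of g over [a, b] with n subintervals."""
    if n % 2:
        n += 1  # Simpson's rule needs an even number of subintervals
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3

length = simpson(speed, 0.0, 2 * math.pi)
print(length, 2 * math.sqrt(2) * math.pi)
```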

Similar to the case in $\mathbb{R}^2$, if there are values of t in the interval [a, b] where the derivative of a component function is not continuous then it is often possible to partition [a, b] into subintervals where all the component functions are continuously differentiable (except at the endpoints, which can be ignored). The sum of the arc lengths over the subintervals will be the arc length over [a, b].

¹⁶ In particular, Duhamel’s principle is needed. See the proof in TAYLOR and MANN, § 14.2 and § 18.2.


Notice that the curve traced out by the function $\mathbf{f}(t) = (\cos t, \sin t, t)$ from Example 1.41 is also traced out by the function $\mathbf{g}(t) = (\cos 2t, \sin 2t, 2t)$. For example, over the interval [0, π], $\mathbf{g}(t)$ traces out the same section of the curve as $\mathbf{f}(t)$ does over the interval [0, 2π]. Intuitively, this says that $\mathbf{g}(t)$ traces the curve twice as fast as $\mathbf{f}(t)$. This makes sense since, viewing the functions as position vectors and their derivatives as velocity vectors, the speeds of $\mathbf{f}(t)$ and $\mathbf{g}(t)$ are $|\mathbf{f}'(t)| = \sqrt{2}$ and $|\mathbf{g}'(t)| = 2\sqrt{2}$, respectively. We say that $\mathbf{g}(t)$ and $\mathbf{f}(t)$ are different parametrizations of the same curve.

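Definition 1.14. Let C be a smooth curve in $\mathbb{R}^3$ represented by a function $\mathbf{f}(t)$ defined on an interval [a, b], and let $\alpha : [c, d] \to [a, b]$ be a smooth one-to-one mapping of an interval [c, d] onto [a, b]. Then the function $\mathbf{g} : [c, d] \to \mathbb{R}^3$ defined by $\mathbf{g}(s) = \mathbf{f}(\alpha(s))$ is a parametrization of C with parameter s. If α is strictly increasing on [c, d] then we say that $\mathbf{g}(s)$ is equivalent to $\mathbf{f}(t)$.

\[ [c, d] \xrightarrow{\ \alpha\ } [a, b] \xrightarrow{\ \mathbf{f}\ } \mathbb{R}^3 , \qquad s \mapsto t \mapsto \mathbf{f}(t) , \qquad \mathbf{g}(s) = \mathbf{f}(\alpha(s)) = \mathbf{f}(t) \]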

Note that the differentiability of $\mathbf{g}(s)$ follows from a version of the Chain Rule for vector-valued functions (the proof is left as an exercise):

Theorem 1.21. Chain Rule: If $\mathbf{f}(t)$ is a differentiable vector-valued function of t, and $t = \alpha(s)$ is a differentiable scalar function of s, then $\mathbf{f}(s) = \mathbf{f}(\alpha(s))$ is a differentiable vector-valued function of s, and

\[ \frac{d\mathbf{f}}{ds} = \frac{d\mathbf{f}}{dt}\,\frac{dt}{ds} \tag{1.42} \]

for any s where the composite function $\mathbf{f}(\alpha(s))$ is defined.

Example 1.42. The following are all equivalent parametrizations of the same curve:

$\mathbf{f}(t) = (\cos t, \sin t, t)$ for t in [0, 2π]

$\mathbf{g}(s) = (\cos 2s, \sin 2s, 2s)$ for s in [0, π]

$\mathbf{h}(s) = (\cos 2\pi s, \sin 2\pi s, 2\pi s)$ for s in [0, 1]

To see that $\mathbf{g}(s)$ is equivalent to $\mathbf{f}(t)$, define $\alpha : [0, \pi] \to [0, 2\pi]$ by $\alpha(s) = 2s$. Then α is smooth, one-to-one, maps [0, π] onto [0, 2π], and is strictly increasing (since $\alpha'(s) = 2 > 0$ for all s). Likewise, defining $\alpha : [0, 1] \to [0, 2\pi]$ by $\alpha(s) = 2\pi s$ shows that $\mathbf{h}(s)$ is equivalent to $\mathbf{f}(t)$.
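Equivalent parametrizations trace the same points, so they should produce the same arc length. The Python sketch below illustrates this for the three parametrizations of Example 1.42. Note the technique is a swap: instead of formula (1.41) it sums the lengths of the chords of an inscribed polygon, which only approximates the arc length from below. The name `arc_length` and the number of chords are assumptions for this example, and `math.dist` requires Python 3.8 or later.

```python
import math

def arc_length(curve, a, b, n=2000):
    """Approximate arc length over [a, b] by summing n chord lengths."""
    total = 0.0
    previous = curve(a)
    for k in range(1, n + 1):
        current = curve(a + (b - a) * k / n)
        total += math.dist(previous, current)
        previous = current
    return total

def f(t):
    return (math.cos(t), math.sin(t), t)

def g(s):
    return (math.cos(2 * s), math.sin(2 * s), 2 * s)

def h(s):
    return (math.cos(2 * math.pi * s), math.sin(2 * math.pi * s), 2 * math.pi * s)

lengths = [
    arc_length(f, 0.0, 2 * math.pi),
    arc_length(g, 0.0, math.pi),
    arc_length(h, 0.0, 1.0),
]
print(lengths, 2 * math.sqrt(2) * math.pi)
```

All three approximations should agree with each other closely and lie slightly below 2√2 π, the exact length of this section of the helix.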


A curve can have many parametrizations, with different speeds, so which one is the best to use? In some situations the arc length parametrization can be useful. The idea behind this is to replace the parameter t, for any given smooth parametrization $\mathbf{f}(t)$ defined on [a, b], by the parameter s given by

\[ s = s(t) = \int_a^t |\mathbf{f}'(u)|\,du. \tag{1.43} \]

In terms of motion along a curve, s is the distance traveled along the curve after time t has elapsed. So the new parameter will be distance instead of time. There is a natural correspondence between s and t: from a starting point on the curve, the distance traveled along the curve (in one direction) is uniquely determined by the amount of time elapsed, and vice versa.

Since s is the arc length of the curve over the interval [a, t] for each t in [a, b], then it is a function of t. By the Fundamental Theorem of Calculus, its derivative is

\[ s'(t) = \frac{ds}{dt} = \frac{d}{dt}\int_a^t |\mathbf{f}'(u)|\,du = |\mathbf{f}'(t)| \quad \text{for all } t \text{ in } [a, b]. \]

Since $\mathbf{f}(t)$ is smooth, then $|\mathbf{f}'(t)| > 0$ for all t in [a, b]. Thus $s'(t) > 0$ and hence s(t) is strictly increasing on the interval [a, b]. Recall that this means that s is a one-to-one mapping of the interval [a, b] onto the interval [s(a), s(b)]. But we see that

\[ s(a) = \int_a^a |\mathbf{f}'(u)|\,du = 0 \quad \text{and} \quad s(b) = \int_a^b |\mathbf{f}'(u)|\,du = L = \text{arc length from } t = a \text{ to } t = b \]

Figure 1.9.1 $t = \alpha(s)$ (the map s(t) from [a, b] to [0, L] and its inverse α(s) from [0, L] to [a, b])

So the function $s : [a, b] \to [0, L]$ is a one-to-one, differentiable mapping onto the interval [0, L]. From single-variable calculus, we know that this means that there exists an inverse function $\alpha : [0, L] \to [a, b]$ that is differentiable and the inverse of $s : [a, b] \to [0, L]$. That is, for each t in [a, b] there is a unique s in [0, L] such that $s = s(t)$ and $t = \alpha(s)$. And we know that the derivative of α is

\[ \alpha'(s) = \frac{1}{s'(\alpha(s))} = \frac{1}{|\mathbf{f}'(\alpha(s))|} \]

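So define the arc length parametrization $\mathbf{f} : [0, L] \to \mathbb{R}^3$ by

\[ \mathbf{f}(s) = \mathbf{f}(\alpha(s)) \quad \text{for all } s \text{ in } [0, L]. \]

Then $\mathbf{f}(s)$ is smooth, by the Chain Rule. In fact, $\mathbf{f}(s)$ has unit speed:

\[
\begin{aligned}
\mathbf{f}'(s) &= \mathbf{f}'(\alpha(s))\,\alpha'(s) \quad \text{by the Chain Rule, so} \\
&= \mathbf{f}'(\alpha(s))\,\frac{1}{|\mathbf{f}'(\alpha(s))|} , \text{ so} \\
|\mathbf{f}'(s)| &= 1 \quad \text{for all } s \text{ in } [0, L].
\end{aligned}
\]

So the arc length parametrization traverses the curve at a “normal” rate.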


In practice, parametrizing a curve $\mathbf{f}(t)$ by arc length requires you to evaluate the integral $s = \int_a^t |\mathbf{f}'(u)|\,du$ in some closed form (as a function of t) so that you could then solve for t in terms of s. If that can be done, you would then substitute the expression for t in terms of s (which we called α(s)) into the formula for $\mathbf{f}(t)$ to get $\mathbf{f}(s)$.

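Example 1.43. Parametrize the helix $\mathbf{f}(t) = (\cos t, \sin t, t)$, for t in [0, 2π], by arc length.

Solution: By Example 1.41 and formula (1.43), we have

\[ s = \int_0^t |\mathbf{f}'(u)|\,du = \int_0^t \sqrt{2}\,du = \sqrt{2}\,t \quad\text{for all } t \text{ in } [0, 2\pi]. \]

So we can solve for t in terms of s: $t = \alpha(s) = \frac{s}{\sqrt{2}}$.

\[ \therefore\ \mathbf{f}(s) = \left( \cos\frac{s}{\sqrt{2}},\ \sin\frac{s}{\sqrt{2}},\ \frac{s}{\sqrt{2}} \right) \quad\text{for all } s \text{ in } [0, 2\sqrt{2}\,\pi]. \]

Note that $|\mathbf{f}'(s)| = 1$.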
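The unit-speed claim in Example 1.43 can be checked numerically. The following is a minimal sketch, not part of the original text: the function names and the central-difference step h are illustrative assumptions, and the speed is only approximated.

```python
import math

def helix(t):
    """The helix f(t) = (cos t, sin t, t) from Example 1.43."""
    return (math.cos(t), math.sin(t), t)

def helix_by_arc_length(s):
    """Arc length parametrization f(alpha(s)), with alpha(s) = s / sqrt(2)."""
    return helix(s / math.sqrt(2))

def speed(curve, u, h=1e-6):
    """Approximate |curve'(u)| by a central difference (h is an arbitrary small step)."""
    p, q = curve(u - h), curve(u + h)
    return math.dist(p, q) / (2 * h)

# Compare the speed of the original and the arc length parametrization
# at a few sample parameter values inside the respective intervals.
for u in (0.5, 2.0, 5.0):
    print(f"u = {u}: original speed ~ {speed(helix, u):.6f}, "
          f"arc length speed ~ {speed(helix_by_arc_length, u):.6f}")
```

Under these assumptions the first column should stay near $\sqrt{2}$ and the second near 1, which is what the example asserts.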

Arc length plays an important role when discussing curvature and moving frame fields, in the field of mathematics known as differential geometry.[17] The methods involve using an arc length parametrization, which often leads to an integral that is either difficult or impossible to evaluate in a simple closed form. The simple integral in Example 1.43 is the exception, not the norm. In general, arc length parametrizations are more useful for theoretical purposes than for practical computations.[18] Curvature and moving frame fields can be defined without using arc length, which makes their computation much easier, and these definitions can be shown to be equivalent to those using arc length. We will leave this to the exercises.

The arc length for curves given in other coordinate systems can also be calculated:

Theorem 1.22. Suppose that $r = r(t)$, $\theta = \theta(t)$ and $z = z(t)$ are the cylindrical coordinates of a curve f(t), for t in [a, b]. Then the arc length L of the curve over [a, b] is

\[ L = \int_a^b \sqrt{r'(t)^2 + r(t)^2\,\theta'(t)^2 + z'(t)^2}\;dt \tag{1.44} \]

Proof: The Cartesian coordinates (x(t), y(t), z(t)) of a point on the curve are given by

\[ x(t) = r(t)\cos\theta(t), \quad y(t) = r(t)\sin\theta(t), \quad z(t) = z(t) \]

so differentiating the above expressions for x(t) and y(t) with respect to t gives

\[ x'(t) = r'(t)\cos\theta(t) - r(t)\theta'(t)\sin\theta(t), \quad y'(t) = r'(t)\sin\theta(t) + r(t)\theta'(t)\cos\theta(t) \]

[17] See O’NEILL for an introduction to elementary differential geometry.

[18] For example, the usual parametrizations of Bézier curves, which we discussed in Section 1.8, are polynomial functions in $\mathbb{R}^3$. This makes their computation relatively simple, which, in CAD, is desirable. But their arc length parametrizations are not only not polynomials, they are in fact usually impossible to calculate at all.


and so

\begin{align*}
x'(t)^2 + y'(t)^2 &= (r'(t)\cos\theta(t) - r(t)\theta'(t)\sin\theta(t))^2 + (r'(t)\sin\theta(t) + r(t)\theta'(t)\cos\theta(t))^2 \\
&= r'(t)^2(\cos^2\theta + \sin^2\theta) + r(t)^2\theta'(t)^2(\cos^2\theta + \sin^2\theta) \\
&\quad - 2r'(t)r(t)\theta'(t)\cos\theta\sin\theta + 2r'(t)r(t)\theta'(t)\cos\theta\sin\theta \\
&= r'(t)^2 + r(t)^2\theta'(t)^2, \quad\text{and so}
\end{align*}

\[ L = \int_a^b \sqrt{x'(t)^2 + y'(t)^2 + z'(t)^2}\;dt = \int_a^b \sqrt{r'(t)^2 + r(t)^2\theta'(t)^2 + z'(t)^2}\;dt \qquad\text{QED} \]

Example 1.44. Find the arc length L of the curve whose cylindrical coordinates are $r = e^t$, $\theta = t$ and $z = e^t$, for t over the interval [0, 1].

Solution: Since $r'(t) = e^t$, $\theta'(t) = 1$ and $z'(t) = e^t$, then

\begin{align*}
L &= \int_0^1 \sqrt{r'(t)^2 + r(t)^2\theta'(t)^2 + z'(t)^2}\;dt \\
&= \int_0^1 \sqrt{e^{2t} + e^{2t}(1) + e^{2t}}\;dt \\
&= \int_0^1 e^t\sqrt{3}\;dt = \sqrt{3}\,(e - 1)
\end{align*}
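When the integral in formula (1.44) has no convenient closed form, it can still be approximated. The sketch below is an illustration under stated assumptions: the helper names are hypothetical, and composite Simpson's rule with 1000 subintervals is just one reasonable choice of quadrature. It reproduces the value from Example 1.44.

```python
import math

def simpson(g, a, b, n=1000):
    """Composite Simpson's rule for the integral of g over [a, b]; n must be even."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def cylindrical_arc_length(r, dr, dtheta, dz, a, b):
    """Formula (1.44): integrate sqrt(r'^2 + r^2 theta'^2 + z'^2) over [a, b]."""
    def integrand(t):
        return math.sqrt(dr(t) ** 2 + r(t) ** 2 * dtheta(t) ** 2 + dz(t) ** 2)
    return simpson(integrand, a, b)

# Example 1.44: r = e^t, theta = t, z = e^t on [0, 1].
L = cylindrical_arc_length(math.exp, math.exp, lambda t: 1.0, math.exp, 0.0, 1.0)
exact = math.sqrt(3) * (math.e - 1)
print(L, exact)
```

The two printed numbers should agree to many decimal places, since the integrand here is smooth.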

Exercises

A

For Exercises 1-3, calculate the arc length of f(t) over the given interval.

1. $\mathbf{f}(t) = (3\cos 2t,\ 3\sin 2t,\ 3t)$ on [0, π/2]

2. $\mathbf{f}(t) = ((t^2+1)\cos t,\ (t^2+1)\sin t,\ 2\sqrt{2}\,t)$ on [0, 1]

3. $\mathbf{f}(t) = (2\cos 3t,\ 2\sin 3t,\ 2t^{3/2})$ on [0, 1]

4. Parametrize the curve from Exercise 1 by arc length.

5. Parametrize the curve from Exercise 3 by arc length.

B

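6. Let f(t) be a differentiable curve such that $\mathbf{f}(t) \ne \mathbf{0}$ for all t. Show that

\[ \frac{d}{dt}\left( \frac{\mathbf{f}(t)}{|\mathbf{f}(t)|} \right) = \frac{\mathbf{f}(t) \times (\mathbf{f}'(t) \times \mathbf{f}(t))}{|\mathbf{f}(t)|^3}. \]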


Exercises 7-9 develop the moving frame ﬁeld T, N, B at a point on a curve.

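7. Let f(t) be a smooth curve such that $\mathbf{f}'(t) \ne \mathbf{0}$ for all t. Then we can define the unit tangent vector T by

\[ \mathbf{T}(t) = \frac{\mathbf{f}'(t)}{|\mathbf{f}'(t)|}. \]

Show that

\[ \mathbf{T}'(t) = \frac{\mathbf{f}'(t) \times (\mathbf{f}''(t) \times \mathbf{f}'(t))}{|\mathbf{f}'(t)|^3}. \]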

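8. Continuing Exercise 7, assume that $\mathbf{f}'(t)$ and $\mathbf{f}''(t)$ are not parallel. Then $\mathbf{T}'(t) \ne \mathbf{0}$ so we can define the unit principal normal vector N by

\[ \mathbf{N}(t) = \frac{\mathbf{T}'(t)}{|\mathbf{T}'(t)|}. \]

Show that

\[ \mathbf{N}(t) = \frac{\mathbf{f}'(t) \times (\mathbf{f}''(t) \times \mathbf{f}'(t))}{|\mathbf{f}'(t)|\,|\mathbf{f}''(t) \times \mathbf{f}'(t)|}. \]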

9. Continuing Exercise 8, the unit binormal vector B is defined by

\[ \mathbf{B}(t) = \mathbf{T}(t) \times \mathbf{N}(t). \]

Show that

\[ \mathbf{B}(t) = \frac{\mathbf{f}'(t) \times \mathbf{f}''(t)}{|\mathbf{f}'(t) \times \mathbf{f}''(t)|}. \]

Note: The vectors T(t), N(t) and B(t) form a right-handed system of mutually perpendicular unit vectors (called orthonormal vectors) at each point on the curve f(t).

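10. Continuing Exercise 9, the curvature κ is defined by

\[ \kappa(t) = \frac{|\mathbf{T}'(t)|}{|\mathbf{f}'(t)|} = \frac{|\mathbf{f}'(t) \times (\mathbf{f}''(t) \times \mathbf{f}'(t))|}{|\mathbf{f}'(t)|^4}. \]

Show that

\[ \kappa(t) = \frac{|\mathbf{f}'(t) \times \mathbf{f}''(t)|}{|\mathbf{f}'(t)|^3} \quad\text{and that}\quad \mathbf{T}'(t) = |\mathbf{f}'(t)|\,\kappa(t)\,\mathbf{N}(t). \]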

Note: κ(t) gives a sense of how “curved” the curve f(t) is at each point.

11. Find T, N, B and κ at each point of the helix $\mathbf{f}(t) = (\cos t, \sin t, t)$.

12. Show that the arc length L of a curve whose spherical coordinates are $\rho = \rho(t)$, $\theta = \theta(t)$ and $\phi = \phi(t)$ for t in an interval [a, b] is

\[ L = \int_a^b \sqrt{\rho'(t)^2 + (\rho(t)^2 \sin^2\phi(t))\,\theta'(t)^2 + \rho(t)^2\,\phi'(t)^2}\;dt. \]

2 Functions of Several Variables

2.1 Functions of Two or Three Variables

In Section 1.8 we discussed vector-valued functions of a single real variable. We will now examine real-valued functions of a point (or vector) in $\mathbb{R}^2$ or $\mathbb{R}^3$. For the most part these functions will be defined on sets of points in $\mathbb{R}^2$, but there will be times when we will use points in $\mathbb{R}^3$, and there will also be times when it will be convenient to think of the points as vectors (or terminal points of vectors).

A real-valued function f defined on a subset D of $\mathbb{R}^2$ is a rule that assigns to each point (x, y) in D a real number f(x, y). The largest possible set D in $\mathbb{R}^2$ on which f is defined is called the domain of f, and the range of f is the set of all real numbers f(x, y) as (x, y) varies over the domain D. A similar definition holds for functions f(x, y, z) defined on points (x, y, z) in $\mathbb{R}^3$.

Example 2.1. The domain of the function

\[ f(x, y) = xy \]

is all of $\mathbb{R}^2$, and the range of f is all of $\mathbb{R}$.

Example 2.2. The domain of the function

\[ f(x, y) = \frac{1}{x - y} \]

is all of $\mathbb{R}^2$ except the points (x, y) for which x = y. That is, the domain is the set $D = \{(x, y) : x \ne y\}$. The range of f is all real numbers except 0.

Example 2.3. The domain of the function

\[ f(x, y) = \sqrt{1 - x^2 - y^2} \]

is the set $D = \{(x, y) : x^2 + y^2 \le 1\}$, since the quantity inside the square root is nonnegative if and only if $1 - (x^2 + y^2) \ge 0$. We see that D consists of all points on and inside the unit circle in $\mathbb{R}^2$ (D is sometimes called the closed unit disk). The range of f is the interval [0, 1] in $\mathbb{R}$.


Example 2.4. The domain of the function

\[ f(x, y, z) = e^{x+y-z} \]

is all of $\mathbb{R}^3$, and the range of f is all positive real numbers.

A function f(x, y) defined in $\mathbb{R}^2$ is often written as z = f(x, y), as was mentioned in Section 1.1, so that the graph of f(x, y) is the set $\{(x, y, z) : z = f(x, y)\}$ in $\mathbb{R}^3$. So we see that this graph is a surface in $\mathbb{R}^3$, since it satisfies an equation of the form F(x, y, z) = 0 (namely, F(x, y, z) = f(x, y) − z). The traces of this surface in the planes z = c, where c varies over $\mathbb{R}$, are called the level curves of the function. Equivalently, the level curves are the solution sets of the equations f(x, y) = c, for c in $\mathbb{R}$. Level curves are often projected onto the xy-plane to give an idea of the various “elevation” levels of the surface (as is done in topography).

Example 2.5. The graph of the function

\[ f(x, y) = \frac{\sin\sqrt{x^2 + y^2}}{\sqrt{x^2 + y^2}} \]

is shown below. Note that the level curves (shown both on the surface and projected onto the xy-plane) are groups of concentric circles.

[Figure 2.1.1: The function $f(x, y) = \frac{\sin\sqrt{x^2+y^2}}{\sqrt{x^2+y^2}}$, plotted for x and y between −10 and 10, with z between −0.4 and 1]

You may be wondering what happens to the function in Example 2.5 at the point (x, y) = (0, 0), since both the numerator and denominator are 0 at that point. The function is not defined at (0, 0), but the limit of the function exists (and equals 1) as (x, y) approaches (0, 0). We will now state explicitly what is meant by the limit of a function of two variables.


Definition 2.1. Let (a, b) be a point in $\mathbb{R}^2$, and let f(x, y) be a real-valued function defined on some set containing (a, b) (but not necessarily defined at (a, b) itself). Then we say that the limit of f(x, y) equals L as (x, y) approaches (a, b), written as

\[ \lim_{(x,y)\to(a,b)} f(x, y) = L, \tag{2.1} \]

if given any ε > 0, there exists a δ > 0 such that

\[ |f(x, y) - L| < \epsilon \quad\text{whenever}\quad 0 < \sqrt{(x-a)^2 + (y-b)^2} < \delta. \]

A similar definition can be made for functions of three variables. The idea behind the above definition is that the values of f(x, y) can get arbitrarily close to L (i.e. within ε of L) if we pick (x, y) sufficiently close to (a, b) (i.e. inside a circle centered at (a, b) with some sufficiently small radius δ).

If you recall the “epsilon-delta” proofs of limits of real-valued functions of a single variable,

you may remember how awkward they can be, and how they can usually only be done easily

for simple functions. In general, the multivariable cases are at least equally awkward to go

through, so we will not bother with such proofs. Instead, we will simply state that when the

function f (x, y) is given by a single formula and is deﬁned at the point (a, b) (e.g. is not some

indeterminate form like 0/0) then you can just substitute (x, y) = (a, b) into the formula for

f (x, y) to ﬁnd the limit.

Example 2.6.

\[ \lim_{(x,y)\to(1,2)} \frac{xy}{x^2 + y^2} = \frac{(1)(2)}{1^2 + 2^2} = \frac{2}{5} \]

since $f(x, y) = \frac{xy}{x^2+y^2}$ is properly defined at the point (1, 2).

The major difference between limits in one variable and limits in two or more variables

has to do with how a point is approached. In the single-variable case, the statement “x →a”

means that x gets closer to the value a from two possible directions along the real number

line (see Figure 2.1.2(a)). In two dimensions, however, (x, y) can approach a point (a, b) along

an inﬁnite number of paths (see Figure 2.1.2(b)).

[Figure 2.1.2: “Approaching” a point in different dimensions. (a) $x \to a$ in $\mathbb{R}$; (b) $(x, y) \to (a, b)$ in $\mathbb{R}^2$]


Example 2.7.

\[ \lim_{(x,y)\to(0,0)} \frac{xy}{x^2 + y^2} \quad\text{does not exist} \]

Note that we can not simply substitute (x, y) = (0, 0) into the function, since doing so gives an indeterminate form 0/0. To show that the limit does not exist, we will show that the function approaches different values as (x, y) approaches (0, 0) along different paths in $\mathbb{R}^2$. To see this, suppose that (x, y) → (0, 0) along the positive x-axis, so that y = 0 along that path. Then

\[ f(x, y) = \frac{xy}{x^2 + y^2} = \frac{x \cdot 0}{x^2 + 0^2} = 0 \]

along that path (since x > 0 in the denominator). But if (x, y) → (0, 0) along the straight line y = x through the origin, for x > 0, then we see that

\[ f(x, y) = \frac{xy}{x^2 + y^2} = \frac{x^2}{x^2 + x^2} = \frac{1}{2} \]

which means that f(x, y) approaches different values as (x, y) → (0, 0) along different paths. Hence the limit does not exist.
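The two-path argument in Example 2.7 is easy to reproduce numerically. In this small sketch the helper `along_line` is a hypothetical name introduced for illustration; it evaluates the function at points of the line y = mx approaching the origin.

```python
def f(x, y):
    """The function xy / (x^2 + y^2) from Example 2.7, for (x, y) != (0, 0)."""
    return x * y / (x ** 2 + y ** 2)

def along_line(m, x):
    """Value of f at the point (x, m*x) on the line y = m*x through the origin."""
    return f(x, m * x)

# m = 0 is the x-axis and m = 1 is the line y = x used in the example.
for x in (0.1, 0.01, 0.001):
    print(x, along_line(0.0, x), along_line(1.0, x))
```

Along each line the value does not depend on x at all: it equals $m/(1+m^2)$, so different slopes give different limiting values.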

Limits of real-valued multivariable functions obey the same algebraic rules as in the

single-variable case, as shown in the following theorem, which we state without proof.

Theorem 2.1. Suppose that $\lim_{(x,y)\to(a,b)} f(x, y)$ and $\lim_{(x,y)\to(a,b)} g(x, y)$ both exist, and that k is some scalar. Then:

(a) $\lim_{(x,y)\to(a,b)} [f(x, y) \pm g(x, y)] = \left[ \lim_{(x,y)\to(a,b)} f(x, y) \right] \pm \left[ \lim_{(x,y)\to(a,b)} g(x, y) \right]$

(b) $\lim_{(x,y)\to(a,b)} k\,f(x, y) = k \left[ \lim_{(x,y)\to(a,b)} f(x, y) \right]$

(c) $\lim_{(x,y)\to(a,b)} [f(x, y)\,g(x, y)] = \left[ \lim_{(x,y)\to(a,b)} f(x, y) \right] \left[ \lim_{(x,y)\to(a,b)} g(x, y) \right]$

(d) $\lim_{(x,y)\to(a,b)} \frac{f(x, y)}{g(x, y)} = \frac{\lim_{(x,y)\to(a,b)} f(x, y)}{\lim_{(x,y)\to(a,b)} g(x, y)}$ if $\lim_{(x,y)\to(a,b)} g(x, y) \ne 0$

(e) If $|f(x, y) - L| \le g(x, y)$ for all (x, y) and if $\lim_{(x,y)\to(a,b)} g(x, y) = 0$, then $\lim_{(x,y)\to(a,b)} f(x, y) = L$.

Note that in part (e), it suffices to have $|f(x, y) - L| \le g(x, y)$ for all (x, y) “sufficiently close” to (a, b) (but excluding (a, b) itself).


Example 2.8. Show that

\[ \lim_{(x,y)\to(0,0)} \frac{y^4}{x^2 + y^2} = 0. \]

Since substituting (x, y) = (0, 0) into the function gives the indeterminate form 0/0, we need an alternate method for evaluating this limit. We will use Theorem 2.1(e). First, notice that $y^4 = \left(\sqrt{y^2}\right)^4$ and so $0 \le y^4 \le \left(\sqrt{x^2 + y^2}\right)^4$ for all (x, y). But $\left(\sqrt{x^2 + y^2}\right)^4 = (x^2 + y^2)^2$. Thus, for all (x, y) ≠ (0, 0) we have

\[ \left| \frac{y^4}{x^2 + y^2} \right| \le \frac{(x^2 + y^2)^2}{x^2 + y^2} = x^2 + y^2 \to 0 \quad\text{as } (x, y) \to (0, 0). \]

Therefore $\lim_{(x,y)\to(0,0)} \frac{y^4}{x^2 + y^2} = 0$.
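The bound used in Example 2.8 can be spot-checked by sampling the function on shrinking circles around the origin. This is only an illustrative sketch: the helper name `max_on_circle` and the choice of 360 sample points are assumptions, and sampling does not replace the inequality proved above.

```python
import math

def f(x, y):
    """The function y^4 / (x^2 + y^2) from Example 2.8, for (x, y) != (0, 0)."""
    return y ** 4 / (x ** 2 + y ** 2)

def max_on_circle(radius, samples=360):
    """Largest sampled value of |f| on the circle of the given radius about the origin."""
    return max(
        abs(f(radius * math.cos(2 * math.pi * k / samples),
              radius * math.sin(2 * math.pi * k / samples)))
        for k in range(samples)
    )

# The sampled maximum should never exceed x^2 + y^2 = radius^2.
for radius in (1.0, 0.1, 0.01):
    print(radius, max_on_circle(radius), radius ** 2)
```

On a circle of radius r the function equals $r^2 \sin^4\theta$, so the sampled maximum tracks $r^2$ and shrinks to 0 with the radius.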

Continuity can be deﬁned similarly as in the single-variable case.

Definition 2.2. A real-valued function f(x, y) with domain D in $\mathbb{R}^2$ is continuous at the point (a, b) in D if $\lim_{(x,y)\to(a,b)} f(x, y) = f(a, b)$. We say that f(x, y) is a continuous function if it is continuous at every point in its domain D.

Unless indicated otherwise, you can assume that all the functions we deal with are continuous. In fact, we can modify the function from Example 2.8 so that it is continuous on all of $\mathbb{R}^2$.

Example 2.9. Define a function f(x, y) on all of $\mathbb{R}^2$ as follows:

\[ f(x, y) = \begin{cases} 0 & \text{if } (x, y) = (0, 0) \\[4pt] \dfrac{y^4}{x^2 + y^2} & \text{if } (x, y) \ne (0, 0) \end{cases} \]

Then f(x, y) is well-defined for all (x, y) in $\mathbb{R}^2$ (i.e. there are no indeterminate forms for any (x, y)), and we see that

\[ \lim_{(x,y)\to(a,b)} f(x, y) = \frac{b^4}{a^2 + b^2} = f(a, b) \quad\text{for } (a, b) \ne (0, 0). \]

So since

\[ \lim_{(x,y)\to(0,0)} f(x, y) = 0 = f(0, 0) \quad\text{by Example 2.8,} \]

then f(x, y) is continuous on all of $\mathbb{R}^2$.


Exercises

A

For Exercises 1-6, state the domain and range of the given function.

1. $f(x, y) = x^2 + y^2 - 1$

2. $f(x, y) = \frac{1}{x^2 + y^2}$

3. $f(x, y) = \sqrt{x^2 + y^2 - 4}$

4. $f(x, y) = \frac{x^2 + 1}{y}$

5. $f(x, y, z) = \sin(xyz)$

6. $f(x, y, z) = \sqrt{(x-1)(yz-1)}$

For Exercises 7-18, evaluate the given limit.

7. $\lim_{(x,y)\to(0,0)} \cos(xy)$

8. $\lim_{(x,y)\to(0,0)} e^{xy}$

9. $\lim_{(x,y)\to(0,0)} \frac{x^2 - y^2}{x^2 + y^2}$

10. $\lim_{(x,y)\to(0,0)} \frac{xy^2}{x^2 + y^4}$

11. $\lim_{(x,y)\to(1,-1)} \frac{x^2 - 2xy + y^2}{x - y}$

12. $\lim_{(x,y)\to(0,0)} \frac{xy^2}{x^2 + y^2}$

13. $\lim_{(x,y)\to(1,1)} \frac{x^2 - y^2}{x - y}$

14. $\lim_{(x,y)\to(0,0)} \frac{x^2 - 2xy + y^2}{x - y}$

15. $\lim_{(x,y)\to(0,0)} \frac{y^4 \sin(xy)}{x^2 + y^2}$

16. $\lim_{(x,y)\to(0,0)} (x^2 + y^2)\cos\left(\frac{1}{xy}\right)$

17. $\lim_{(x,y)\to(0,0)} \frac{x}{y}$

18. $\lim_{(x,y)\to(0,0)} \cos\left(\frac{1}{xy}\right)$

B

19. Show that $f(x, y) = \frac{1}{2\pi\sigma^2}\, e^{-(x^2+y^2)/2\sigma^2}$, for σ > 0, is constant on the circle of radius r > 0 centered at the origin. This function is called a Gaussian blur, and is used as a filter in image processing software to produce a “blurred” effect.

20. Suppose that f(x, y) ≤ f(y, x) for all (x, y) in $\mathbb{R}^2$. Show that f(x, y) = f(y, x) for all (x, y) in $\mathbb{R}^2$.

21. Use the substitution $r = \sqrt{x^2 + y^2}$ to show that

\[ \lim_{(x,y)\to(0,0)} \frac{\sin\sqrt{x^2 + y^2}}{\sqrt{x^2 + y^2}} = 1. \]

(Hint: You will need to use L’Hôpital’s Rule for single-variable limits.)

C

22. Prove Theorem 2.1(a) in the case of addition. (Hint: Use Deﬁnition 2.1.)

23. Prove Theorem 2.1(b).


2.2 Partial Derivatives

Now that we have an idea of what functions of several variables are, and what a limit of

such a function is, we can start to develop an idea of a derivative of a function of two or more

variables. We will start with the notion of a partial derivative.

Definition 2.3. Let f(x, y) be a real-valued function with domain D in $\mathbb{R}^2$, and let (a, b) be a point in D. Then the partial derivative of f at (a, b) with respect to x, denoted by $\frac{\partial f}{\partial x}(a, b)$, is defined as

\[ \frac{\partial f}{\partial x}(a, b) = \lim_{h \to 0} \frac{f(a+h, b) - f(a, b)}{h} \tag{2.2} \]

and the partial derivative of f at (a, b) with respect to y, denoted by $\frac{\partial f}{\partial y}(a, b)$, is defined as

\[ \frac{\partial f}{\partial y}(a, b) = \lim_{h \to 0} \frac{f(a, b+h) - f(a, b)}{h}. \tag{2.3} \]

Note: The symbol ∂ is pronounced “del”.[1]

Recall that the derivative of a function f (x) can be interpreted as the rate of change of

that function in the (positive) x direction. From the deﬁnitions above, we can see that the

partial derivative of a function f (x, y) with respect to x or y is the rate of change of f (x, y) in

the (positive) x or y direction, respectively. What this means is that the partial derivative of

a function f (x, y) with respect to x can be calculated by treating the y variable as a constant,

and then simply differentiating f (x, y) as if it were a function of x alone, using the usual

rules from single-variable calculus. Likewise, the partial derivative of f (x, y) with respect to

y is obtained by treating the x variable as a constant and then differentiating f (x, y) as if it

were a function of y alone.

Example 2.10. Find $\frac{\partial f}{\partial x}(x, y)$ and $\frac{\partial f}{\partial y}(x, y)$ for the function $f(x, y) = x^2 y + y^3$.

Solution: Treating y as a constant and differentiating f(x, y) with respect to x gives

\[ \frac{\partial f}{\partial x}(x, y) = 2xy \]

and treating x as a constant and differentiating f(x, y) with respect to y gives

\[ \frac{\partial f}{\partial y}(x, y) = x^2 + 3y^2. \]
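The limit definitions (2.2) and (2.3) suggest a quick numerical check of the formulas in Example 2.10. The sketch below is an assumption-laden illustration: it uses a symmetric (central) difference quotient with a fixed small h rather than the one-sided quotient in the definition, and the helper names are hypothetical.

```python
def partial_x(f, a, b, h=1e-6):
    """Central-difference estimate of the partial derivative of f in x at (a, b)."""
    return (f(a + h, b) - f(a - h, b)) / (2 * h)

def partial_y(f, a, b, h=1e-6):
    """Central-difference estimate of the partial derivative of f in y at (a, b)."""
    return (f(a, b + h) - f(a, b - h)) / (2 * h)

def f(x, y):
    """The function x^2 y + y^3 from Example 2.10."""
    return x ** 2 * y + y ** 3

a, b = 1.5, -2.0  # an arbitrary test point
print(partial_x(f, a, b), 2 * a * b)                # estimate vs. 2xy
print(partial_y(f, a, b), a ** 2 + 3 * b ** 2)      # estimate vs. x^2 + 3y^2
```

Each printed pair should agree to several decimal places; the small discrepancy comes from using a finite h instead of a limit.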

[1] It is not a Greek letter. The symbol was first used by the mathematicians A. Clairaut and L. Euler around 1740, to distinguish it from the letter d used for the “usual” derivative.


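We will often simply write $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$ instead of $\frac{\partial f}{\partial x}(x, y)$ and $\frac{\partial f}{\partial y}(x, y)$.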

Example 2.11. Find $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$ for the function $f(x, y) = \frac{\sin(xy^2)}{x^2 + 1}$.

Solution: Treating y as a constant and differentiating f(x, y) with respect to x gives

\[ \frac{\partial f}{\partial x} = \frac{(x^2 + 1)(y^2 \cos(xy^2)) - (2x)\sin(xy^2)}{(x^2 + 1)^2} \]

and treating x as a constant and differentiating f(x, y) with respect to y gives

\[ \frac{\partial f}{\partial y} = \frac{2xy \cos(xy^2)}{x^2 + 1}. \]

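Since both $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$ are themselves functions of x and y, we can take their partial derivatives with respect to x and y. This yields the higher-order partial derivatives:

\begin{align*}
\frac{\partial^2 f}{\partial x^2} &= \frac{\partial}{\partial x}\left( \frac{\partial f}{\partial x} \right) & \frac{\partial^2 f}{\partial y^2} &= \frac{\partial}{\partial y}\left( \frac{\partial f}{\partial y} \right) \\
\frac{\partial^2 f}{\partial y\,\partial x} &= \frac{\partial}{\partial y}\left( \frac{\partial f}{\partial x} \right) & \frac{\partial^2 f}{\partial x\,\partial y} &= \frac{\partial}{\partial x}\left( \frac{\partial f}{\partial y} \right) \\
\frac{\partial^3 f}{\partial x^3} &= \frac{\partial}{\partial x}\left( \frac{\partial^2 f}{\partial x^2} \right) & \frac{\partial^3 f}{\partial y^3} &= \frac{\partial}{\partial y}\left( \frac{\partial^2 f}{\partial y^2} \right) \\
\frac{\partial^3 f}{\partial y\,\partial x^2} &= \frac{\partial}{\partial y}\left( \frac{\partial^2 f}{\partial x^2} \right) & \frac{\partial^3 f}{\partial x\,\partial y^2} &= \frac{\partial}{\partial x}\left( \frac{\partial^2 f}{\partial y^2} \right) \\
\frac{\partial^3 f}{\partial y^2\,\partial x} &= \frac{\partial}{\partial y}\left( \frac{\partial^2 f}{\partial y\,\partial x} \right) & \frac{\partial^3 f}{\partial x^2\,\partial y} &= \frac{\partial}{\partial x}\left( \frac{\partial^2 f}{\partial x\,\partial y} \right) \\
\frac{\partial^3 f}{\partial x\,\partial y\,\partial x} &= \frac{\partial}{\partial x}\left( \frac{\partial^2 f}{\partial y\,\partial x} \right) & \frac{\partial^3 f}{\partial y\,\partial x\,\partial y} &= \frac{\partial}{\partial y}\left( \frac{\partial^2 f}{\partial x\,\partial y} \right) \\
&\ \,\vdots
\end{align*}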

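Example 2.12. Find the partial derivatives $\frac{\partial f}{\partial x}$, $\frac{\partial f}{\partial y}$, $\frac{\partial^2 f}{\partial x^2}$, $\frac{\partial^2 f}{\partial y^2}$, $\frac{\partial^2 f}{\partial y\,\partial x}$ and $\frac{\partial^2 f}{\partial x\,\partial y}$ for the function $f(x, y) = e^{x^2 y} + xy^3$.

Solution: Proceeding as before, we have

\begin{align*}
\frac{\partial f}{\partial x} &= 2xy\,e^{x^2 y} + y^3 & \frac{\partial f}{\partial y} &= x^2 e^{x^2 y} + 3xy^2 \\[6pt]
\frac{\partial^2 f}{\partial x^2} &= \frac{\partial}{\partial x}\left( 2xy\,e^{x^2 y} + y^3 \right) & \frac{\partial^2 f}{\partial y^2} &= \frac{\partial}{\partial y}\left( x^2 e^{x^2 y} + 3xy^2 \right) \\
&= 2y\,e^{x^2 y} + 4x^2 y^2 e^{x^2 y} & &= x^4 e^{x^2 y} + 6xy \\[6pt]
\frac{\partial^2 f}{\partial y\,\partial x} &= \frac{\partial}{\partial y}\left( 2xy\,e^{x^2 y} + y^3 \right) & \frac{\partial^2 f}{\partial x\,\partial y} &= \frac{\partial}{\partial x}\left( x^2 e^{x^2 y} + 3xy^2 \right) \\
&= 2x\,e^{x^2 y} + 2x^3 y\,e^{x^2 y} + 3y^2 & &= 2x\,e^{x^2 y} + 2x^3 y\,e^{x^2 y} + 3y^2
\end{align*}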
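The agreement of the two mixed partial derivatives in Example 2.12 can also be seen numerically. The following sketch differentiates the first-order formulas from the example with a central difference; the helper names, the test point and the step h are arbitrary illustrative choices, not part of the original text.

```python
import math

def fx(x, y):
    """df/dx for f(x, y) = e^(x^2 y) + x y^3, as found in Example 2.12."""
    return 2 * x * y * math.exp(x ** 2 * y) + y ** 3

def fy(x, y):
    """df/dy for the same function."""
    return x ** 2 * math.exp(x ** 2 * y) + 3 * x * y ** 2

def mixed_formula(x, y):
    """The common closed form of both mixed partials from Example 2.12."""
    e = math.exp(x ** 2 * y)
    return 2 * x * e + 2 * x ** 3 * y * e + 3 * y ** 2

def d_dx(g, x, y, h=1e-5):
    """Central-difference estimate of the x-derivative of g at (x, y)."""
    return (g(x + h, y) - g(x - h, y)) / (2 * h)

def d_dy(g, x, y, h=1e-5):
    """Central-difference estimate of the y-derivative of g at (x, y)."""
    return (g(x, y + h) - g(x, y - h)) / (2 * h)

x0, y0 = 0.7, 0.3
# d/dy of df/dx, d/dx of df/dy, and the closed form should all agree.
print(d_dy(fx, x0, y0), d_dx(fy, x0, y0), mixed_formula(x0, y0))
```

All three printed values should match to several decimal places at any point, consistent with the equality of mixed partials noted after the example.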

Higher-order partial derivatives that are taken with respect to different variables, such as $\frac{\partial^2 f}{\partial y\,\partial x}$ and $\frac{\partial^2 f}{\partial x\,\partial y}$, are called mixed partial derivatives. Notice in the above example that $\frac{\partial^2 f}{\partial y\,\partial x} = \frac{\partial^2 f}{\partial x\,\partial y}$. It turns out that this will usually be the case. Specifically, whenever both $\frac{\partial^2 f}{\partial y\,\partial x}$ and $\frac{\partial^2 f}{\partial x\,\partial y}$ are continuous at a point (a, b), then they are equal at that point.[2] All the functions we will deal with will have continuous partial derivatives of all orders, so you can assume in the remainder of the text that

\[ \frac{\partial^2 f}{\partial y\,\partial x} = \frac{\partial^2 f}{\partial x\,\partial y} \quad\text{for all } (x, y) \text{ in the domain of } f. \]

In other words, it doesn’t matter in which order you take partial derivatives. This applies even to mixed partial derivatives of order 3 or higher.

The notation for partial derivatives varies. All of the following are equivalent:

$\frac{\partial f}{\partial x}$ : $f_x(x, y)$, $f_1(x, y)$, $D_x(x, y)$, $D_1(x, y)$

$\frac{\partial f}{\partial y}$ : $f_y(x, y)$, $f_2(x, y)$, $D_y(x, y)$, $D_2(x, y)$

$\frac{\partial^2 f}{\partial x^2}$ : $f_{xx}(x, y)$, $f_{11}(x, y)$, $D_{xx}(x, y)$, $D_{11}(x, y)$

$\frac{\partial^2 f}{\partial y^2}$ : $f_{yy}(x, y)$, $f_{22}(x, y)$, $D_{yy}(x, y)$, $D_{22}(x, y)$

$\frac{\partial^2 f}{\partial y\,\partial x}$ : $f_{xy}(x, y)$, $f_{12}(x, y)$, $D_{xy}(x, y)$, $D_{12}(x, y)$

$\frac{\partial^2 f}{\partial x\,\partial y}$ : $f_{yx}(x, y)$, $f_{21}(x, y)$, $D_{yx}(x, y)$, $D_{21}(x, y)$

[2] See pp. 214-216 in TAYLOR and MANN for a proof.


Exercises

A

For Exercises 1-16, find $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$.

1. $f(x, y) = x^2 + y^2$

2. $f(x, y) = \cos(x + y)$

3. $f(x, y) = \sqrt{x^2 + y + 4}$

4. $f(x, y) = \frac{x + 1}{y + 1}$

5. $f(x, y) = e^{xy} + xy$

6. $f(x, y) = x^2 - y^2 + 6xy + 4x - 8y + 2$

7. $f(x, y) = x^4$

8. $f(x, y) = x + 2y$

9. $f(x, y) = \sqrt{x^2 + y^2}$

10. $f(x, y) = \sin(x + y)$

11. $f(x, y) = \sqrt[3]{x^2 + y + 4}$

12. $f(x, y) = \frac{xy + 1}{x + y}$

13. $f(x, y) = e^{-(x^2 + y^2)}$

14. $f(x, y) = \ln(xy)$

15. $f(x, y) = \sin(xy)$

16. $f(x, y) = \tan(x + y)$

For Exercises 17-26, find $\frac{\partial^2 f}{\partial x^2}$, $\frac{\partial^2 f}{\partial y^2}$ and $\frac{\partial^2 f}{\partial y\,\partial x}$ (use Exercises 1-8, 14, 15).

17. $f(x, y) = x^2 + y^2$

18. $f(x, y) = \cos(x + y)$

19. $f(x, y) = \sqrt{x^2 + y + 4}$

20. $f(x, y) = \frac{x + 1}{y + 1}$

21. $f(x, y) = e^{xy} + xy$

22. $f(x, y) = x^2 - y^2 + 6xy + 4x - 8y + 2$

23. $f(x, y) = x^4$

24. $f(x, y) = x + 2y$

25. $f(x, y) = \ln(xy)$

26. $f(x, y) = \sin(xy)$

B

27. Show that the function $f(x, y) = \sin(x + y) + \cos(x - y)$ satisfies the wave equation

\[ \frac{\partial^2 f}{\partial x^2} - \frac{\partial^2 f}{\partial y^2} = 0. \]

The wave equation is an example of a partial differential equation.

28. Let u and v be twice-differentiable functions of a single variable, and let c ≠ 0 be a constant. Show that $f(x, y) = u(x + cy) + v(x - cy)$ is a solution of the general one-dimensional wave equation[3]

\[ \frac{\partial^2 f}{\partial x^2} - \frac{1}{c^2}\,\frac{\partial^2 f}{\partial y^2} = 0. \]

[3] Conversely, it turns out that any solution must be of this form. See Ch. 1 in WEINBERGER.


2.3 Tangent Plane to a Surface

In the previous section we mentioned that the partial derivatives $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$ can be thought of as the rate of change of a function z = f(x, y) in the positive x and y directions, respectively. Recall that the derivative $\frac{dy}{dx}$ of a function y = f(x) has a geometric meaning, namely as the slope of the tangent line to the graph of f at the point (x, f(x)) in $\mathbb{R}^2$. There is a similar geometric meaning to the partial derivatives $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$ of a function z = f(x, y): given a point (a, b) in the domain D of f(x, y), the trace of the surface described by z = f(x, y) in the plane y = b is a curve in $\mathbb{R}^3$ through the point (a, b, f(a, b)), and the slope of the tangent line $L_x$ to that curve at that point is $\frac{\partial f}{\partial x}(a, b)$. Similarly, $\frac{\partial f}{\partial y}(a, b)$ is the slope of the tangent line $L_y$ to the trace of the surface z = f(x, y) in the plane x = a (see Figure 2.3.1).

[Figure 2.3.1: Partial derivatives as slopes. (a) Tangent line $L_x$ in the plane y = b, with slope $\frac{\partial f}{\partial x}(a, b)$ at the point (a, b, f(a, b)) on z = f(x, y); (b) Tangent line $L_y$ in the plane x = a, with slope $\frac{\partial f}{\partial y}(a, b)$ at the same point]

Since the derivative $\frac{dy}{dx}$ of a function y = f(x) is used to find the tangent line to the graph of f (which is a curve in $\mathbb{R}^2$), you might expect that partial derivatives can be used to define a tangent plane to the graph of a surface z = f(x, y). This indeed turns out to be the case. First, we need a definition of a tangent plane. The intuitive idea is that a tangent plane “just touches” a surface at a point. The formal definition mimics the intuitive notion of a tangent line to a curve.

Definition 2.4. Let z = f(x, y) be the equation of a surface S in $\mathbb{R}^3$, and let P = (a, b, c) be a point on S. Let T be a plane which contains the point P, and let Q = (x, y, z) represent a generic point on the surface S. If the (acute) angle between the vector $\overrightarrow{PQ}$ and the plane T approaches zero as the point Q approaches P along the surface S, then we call T the tangent plane to S at P.

Note that since two intersecting lines in $\mathbb{R}^3$ determine a plane, the two tangent lines to the surface z = f(x, y) in the x and y directions described in Figure 2.3.1 are contained in the tangent plane at that point, if the tangent plane exists at that point. The existence of those two tangent lines does not by itself guarantee the existence of the tangent plane. It is possible that if we take the trace of the surface in the plane x − y = 0 (which makes a 45° angle with the positive x-axis), the resulting curve in that plane may have a tangent line which is not in the plane determined by the other two tangent lines, or it may not have a tangent line at all at that point. Luckily, it turns out[4] that if $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$ exist in a region around a point (a, b) and are continuous at (a, b) then the tangent plane to the surface z = f(x, y) will exist at the point (a, b, f(a, b)). In this text, those conditions will always hold.

[Figure 2.3.2: Tangent plane T to the surface z = f(x, y) at the point (a, b, f(a, b)), containing the tangent lines $L_x$ and $L_y$]

Suppose that we want an equation of the tangent plane T to the surface z = f(x, y) at a point (a, b, f(a, b)). Let $L_x$ and $L_y$ be the tangent lines to the traces of the surface in the planes y = b and x = a, respectively (as in Figure 2.3.2), and suppose that the conditions for T to exist do hold. Then the equation for T is

\[ A(x - a) + B(y - b) + C(z - f(a, b)) = 0 \tag{2.4} \]

where $\mathbf{n} = (A, B, C)$ is a normal vector to the plane T. Since T contains the lines $L_x$ and $L_y$, then all we need are vectors $\mathbf{v}_x$ and $\mathbf{v}_y$ that are parallel to $L_x$ and $L_y$, respectively, and then let $\mathbf{n} = \mathbf{v}_x \times \mathbf{v}_y$.

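[Figure 2.3.3: The vector $\mathbf{v}_x = (1, 0, \frac{\partial f}{\partial x}(a, b))$ in the xz-plane, with horizontal run 1 and vertical rise $\frac{\partial f}{\partial x}(a, b)$]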

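Since the slope of $L_x$ is $\frac{\partial f}{\partial x}(a, b)$, then the vector $\mathbf{v}_x = (1, 0, \frac{\partial f}{\partial x}(a, b))$ is parallel to $L_x$ (since $\mathbf{v}_x$ lies in the xz-plane and lies in a line with slope $\frac{\frac{\partial f}{\partial x}(a, b)}{1} = \frac{\partial f}{\partial x}(a, b)$. See Figure 2.3.3). Similarly, the vector $\mathbf{v}_y = (0, 1, \frac{\partial f}{\partial y}(a, b))$ is parallel to $L_y$. Hence, the vector

\[ \mathbf{n} = \mathbf{v}_x \times \mathbf{v}_y = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ 1 & 0 & \frac{\partial f}{\partial x}(a, b) \\ 0 & 1 & \frac{\partial f}{\partial y}(a, b) \end{vmatrix} = -\frac{\partial f}{\partial x}(a, b)\,\mathbf{i} - \frac{\partial f}{\partial y}(a, b)\,\mathbf{j} + \mathbf{k} \]

is normal to the plane T. Thus the equation of T is

\[ -\frac{\partial f}{\partial x}(a, b)\,(x - a) - \frac{\partial f}{\partial y}(a, b)\,(y - b) + z - f(a, b) = 0. \tag{2.5} \]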

Multiplying both sides by −1, we have the following result:

The equation of the tangent plane to the surface z = f(x, y) at the point (a, b, f(a, b)) is

\[ \frac{\partial f}{\partial x}(a, b)\,(x - a) + \frac{\partial f}{\partial y}(a, b)\,(y - b) - z + f(a, b) = 0 \tag{2.6} \]

[4] See TAYLOR and MANN, § 6.4.


Example 2.13. Find the equation of the tangent plane to the surface $z = x^2 + y^2$ at the point (1, 2, 5).

Solution: For the function $f(x, y) = x^2 + y^2$, we have $\frac{\partial f}{\partial x} = 2x$ and $\frac{\partial f}{\partial y} = 2y$, so the equation of the tangent plane at the point (1, 2, 5) is

\[ 2(1)(x - 1) + 2(2)(y - 2) - z + 5 = 0, \quad\text{or} \]

\[ 2x + 4y - z - 5 = 0. \]
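Formula (2.6) translates directly into a small routine. The sketch below is illustrative only: the function name `tangent_plane` and the convention of returning coefficients (A, B, C, D) of Ax + By + Cz + D = 0 are assumptions made here, and the partial derivatives are supplied by hand rather than computed.

```python
def tangent_plane(f, fx, fy, a, b):
    """Coefficients (A, B, C, D) of A*x + B*y + C*z + D = 0 from formula (2.6)."""
    A, B = fx(a, b), fy(a, b)
    # Expand fx*(x - a) + fy*(y - b) - z + f(a, b) = 0 and collect the constant term.
    D = -A * a - B * b + f(a, b)
    return (A, B, -1.0, D)

def f(x, y):
    """The surface z = x^2 + y^2 from Example 2.13."""
    return x ** 2 + y ** 2

def fx(x, y):
    return 2 * x

def fy(x, y):
    return 2 * y

print(tangent_plane(f, fx, fy, 1.0, 2.0))  # coefficients of 2x + 4y - z - 5 = 0
```

For the point (1, 2, 5) this returns the coefficients 2, 4, −1 and −5, matching the plane found in the example.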

In a similar fashion, it can be shown that if a surface is defined implicitly by an equation of the form F(x, y, z) = 0, then the tangent plane to the surface at a point (a, b, c) is given by the equation

\[ \frac{\partial F}{\partial x}(a, b, c)\,(x - a) + \frac{\partial F}{\partial y}(a, b, c)\,(y - b) + \frac{\partial F}{\partial z}(a, b, c)\,(z - c) = 0. \tag{2.7} \]

Note that formula (2.6) is the special case of formula (2.7) where F(x, y, z) = f(x, y) − z.

Example 2.14. Find the equation of the tangent plane to the surface $x^2 + y^2 + z^2 = 9$ at the point (2, 2, −1).

Solution: For the function $F(x, y, z) = x^2 + y^2 + z^2 - 9$, we have $\frac{\partial F}{\partial x} = 2x$, $\frac{\partial F}{\partial y} = 2y$, and $\frac{\partial F}{\partial z} = 2z$, so the equation of the tangent plane at (2, 2, −1) is

\[ 2(2)(x - 2) + 2(2)(y - 2) + 2(-1)(z + 1) = 0, \quad\text{or} \]

\[ 2x + 2y - z - 9 = 0. \]
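The implicit formula (2.7) can be sketched the same way. Here `implicit_tangent_plane` and `grad_sphere` are hypothetical names chosen for this illustration, and the three partial derivatives of F are entered by hand.

```python
def implicit_tangent_plane(grad_F, point):
    """Coefficients (A, B, C, D) of A*x + B*y + C*z + D = 0 from formula (2.7)."""
    a, b, c = point
    A, B, C = grad_F(a, b, c)
    # Expanding A(x - a) + B(y - b) + C(z - c) = 0 gives the constant term below.
    return (A, B, C, -(A * a + B * b + C * c))

def grad_sphere(x, y, z):
    """Partial derivatives of F(x, y, z) = x^2 + y^2 + z^2 - 9 from Example 2.14."""
    return (2 * x, 2 * y, 2 * z)

print(implicit_tangent_plane(grad_sphere, (2, 2, -1)))
```

The result is 4x + 4y − 2z − 18 = 0, which is the plane 2x + 2y − z − 9 = 0 from the example after dividing by 2.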

Exercises

A

For Exercises 1-6, find the equation of the tangent plane to the surface z = f(x, y) at the point P.

1. $f(x, y) = x^2 + y^3$, P = (1, 1, 2)

2. $f(x, y) = xy$, P = (1, −1, −1)

3. $f(x, y) = x^2 y$, P = (−1, 1, 1)

4. $f(x, y) = x e^y$, P = (1, 0, 1)

5. $f(x, y) = x + 2y$, P = (2, 1, 4)

6. $f(x, y) = \sqrt{x^2 + y^2}$, P = (3, 4, 5)

For Exercises 7-10, find the equation of the tangent plane to the given surface at the point P.

7. $\frac{x^2}{4} + \frac{y^2}{9} + \frac{z^2}{16} = 1$, $P = \left(1, 2, \frac{2\sqrt{11}}{3}\right)$

8. $x^2 + y^2 + z^2 = 9$, P = (0, 0, 3)

9. $x^2 + y^2 - z^2 = 0$, P = (3, 4, 5)

10. $x^2 + y^2 = 4$, $P = (\sqrt{3}, 1, 0)$


2.4 Directional Derivatives and the Gradient

For a function z = f(x, y), we learned that the partial derivatives $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$ represent the (instantaneous) rate of change of f in the positive x and y directions, respectively. What about other directions? It turns out that we can find the rate of change in any direction using a more general type of derivative called a directional derivative.

Definition 2.5. Let f(x, y) be a real-valued function with domain D in $\mathbb{R}^2$, and let (a, b) be a point in D. Let v be a unit vector in $\mathbb{R}^2$. Then the directional derivative of f at (a, b) in the direction of v, denoted by $D_{\mathbf{v}} f(a, b)$, is defined as

\[ D_{\mathbf{v}} f(a, b) = \lim_{h \to 0} \frac{f((a, b) + h\mathbf{v}) - f(a, b)}{h} \tag{2.8} \]

Notice in the definition that we seem to be treating the point (a, b) as a vector, since we are adding the vector hv to it. But this is just the usual idea of identifying vectors with their terminal points, which the reader should be used to by now. If we were to write the vector v as $\mathbf{v} = (v_1, v_2)$, then

\[ D_{\mathbf{v}} f(a, b) = \lim_{h \to 0} \frac{f(a + hv_1, b + hv_2) - f(a, b)}{h}. \tag{2.9} \]

From this we can immediately recognize that the partial derivatives $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$ are special cases of the directional derivative with $\mathbf{v} = \mathbf{i} = (1, 0)$ and $\mathbf{v} = \mathbf{j} = (0, 1)$, respectively. That is, $\frac{\partial f}{\partial x} = D_{\mathbf{i}} f$ and $\frac{\partial f}{\partial y} = D_{\mathbf{j}} f$. Since there are many vectors with the same direction, we use a unit vector in the definition, as that represents a “standard” vector for a given direction.

If f(x, y) has continuous partial derivatives $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$ (which will always be the case in this text), then there is a simple formula for the directional derivative:

Theorem 2.2. Let f(x, y) be a real-valued function with domain D in $\mathbb{R}^2$ such that the partial derivatives $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$ exist and are continuous in D. Let (a, b) be a point in D, and let $\mathbf{v} = (v_1, v_2)$ be a unit vector in $\mathbb{R}^2$. Then

\[ D_{\mathbf{v}} f(a, b) = v_1\,\frac{\partial f}{\partial x}(a, b) + v_2\,\frac{\partial f}{\partial y}(a, b). \tag{2.10} \]

Proof: Note that if v i (1, 0) then the above formula reduces to D

v

f (a, b)

∂f

∂x

(a, b),

which we know is true since D

i

f

∂f

∂x

, as we noted earlier. Similarly, for v j (0, 1) the

formula reduces to D

v

f (a, b)

∂f

∂y

(a, b), which is true since D

j

f

∂f

∂y

. So since i (1, 0) and

j (0, 1) are the only unit vectors in R

2

with a zero component, then we need only show the

formula holds for unit vectors v (v

1

, v

2

) with v

1

/0 and v

2

/0. So ﬁx such a vector v and

ﬁx a number h /0.

2.4 Directional Derivatives and the Gradient 79

Then

f (a+hv

1

, b+hv

2

) − f (a, b) f (a+hv

1

, b+hv

2

) − f (a+hv

1

, b) + f (a+hv

1

, b) − f (a, b) . (2.11)

Since h / 0 and v

2

/ 0, then hv

2

/ 0 and thus any number c between b and b+hv

2

can be

written as c b+αhv

2

for some number 0 <α<1. So since the function f (a+hv

1

, y) is a real-

valued function of y (since a+hv

1

is a ﬁxed number), then the Mean Value Theorem from

single-variable calculus can be applied to the function g(y) f (a+hv

1

, y) on the interval

[b, b+hv

2

] (or [b+hv

2

, b] if one of h or v

2

is negative) to ﬁnd a number 0 <α<1 such that

∂f

∂y

(a+hv

1

, b+αhv

2

) g

′

(b+αhv

2

)

g(b+hv

2

) − g(b)

b+hv

2

−b

f (a+hv

1

, b+hv

2

) − f (a+hv

1

, b)

hv

2

and so

f (a+hv

1

, b+hv

2

) − f (a+hv

1

, b) hv

2

∂f

∂y

(a+hv

1

, b+αhv

2

) .

By a similar argument, there exists a number $0 < \beta < 1$ such that

$$f(a+hv_1, b) - f(a, b) = hv_1 \frac{\partial f}{\partial x}(a+\beta h v_1, b) .$$

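Thus, by equation (2.11), we have

$$\frac{f(a+hv_1, b+hv_2) - f(a, b)}{h} = \frac{hv_2 \frac{\partial f}{\partial y}(a+hv_1, b+\alpha h v_2) + hv_1 \frac{\partial f}{\partial x}(a+\beta h v_1, b)}{h} = v_2 \frac{\partial f}{\partial y}(a+hv_1, b+\alpha h v_2) + v_1 \frac{\partial f}{\partial x}(a+\beta h v_1, b)$$

so by formula (2.9) we have

$$\begin{aligned}
D_{\mathbf{v}} f(a, b) &= \lim_{h \to 0} \frac{f(a+hv_1, b+hv_2) - f(a, b)}{h} \\
&= \lim_{h \to 0} \left[ v_2 \frac{\partial f}{\partial y}(a+hv_1, b+\alpha h v_2) + v_1 \frac{\partial f}{\partial x}(a+\beta h v_1, b) \right] \\
&= v_2 \frac{\partial f}{\partial y}(a, b) + v_1 \frac{\partial f}{\partial x}(a, b) \quad \text{by the continuity of } \tfrac{\partial f}{\partial x} \text{ and } \tfrac{\partial f}{\partial y}, \text{ so} \\
D_{\mathbf{v}} f(a, b) &= v_1 \frac{\partial f}{\partial x}(a, b) + v_2 \frac{\partial f}{\partial y}(a, b)
\end{aligned}$$

after reversing the order of summation. QED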

Note that $D_{\mathbf{v}} f(a, b) = \mathbf{v} \cdot \left( \frac{\partial f}{\partial x}(a, b), \frac{\partial f}{\partial y}(a, b) \right)$. The second vector has a special name:


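Definition 2.6. For a real-valued function $f(x, y)$, the gradient of $f$, denoted by $\nabla f$, is the vector

$$\nabla f = \left( \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y} \right) \tag{2.12}$$

in $\mathbb{R}^2$. For a real-valued function $f(x, y, z)$, the gradient is the vector

$$\nabla f = \left( \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, \frac{\partial f}{\partial z} \right) \tag{2.13}$$

in $\mathbb{R}^3$. The symbol $\nabla$ is pronounced “del”.⁵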

Corollary 2.3. $D_{\mathbf{v}} f = \mathbf{v} \cdot \nabla f$

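Example 2.15. Find the directional derivative of $f(x, y) = xy^2 + x^3 y$ at the point $(1, 2)$ in the direction of $\mathbf{v} = \left( \frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}} \right)$.

Solution: We see that $\nabla f = (y^2 + 3x^2 y, \; 2xy + x^3)$, so

$$D_{\mathbf{v}} f(1, 2) = \mathbf{v} \cdot \nabla f(1, 2) = \left( \frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}} \right) \cdot \left( 2^2 + 3(1)^2(2), \; 2(1)(2) + 1^3 \right) = \frac{15}{\sqrt{2}}$$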
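As a sanity check on Example 2.15, the following minimal Java sketch compares formula (2.10) with a difference quotient taken from the limit definition. The class name, the method names, and the step size h = 1e-6 are illustrative assumptions, not part of the text; the gradient is the one computed by hand above.

```java
// Sketch: numerical check of Example 2.15 (names and step size are assumptions).
public class DirectionalDerivativeCheck {
    // f(x,y) = x*y^2 + x^3*y, the function from Example 2.15
    static double f(double x, double y) {
        return x * y * y + x * x * x * y;
    }

    // Gradient of f computed by hand: (y^2 + 3x^2*y, 2xy + x^3)
    static double[] gradient(double x, double y) {
        return new double[] { y * y + 3 * x * x * y, 2 * x * y + x * x * x };
    }

    // Formula (2.10): D_v f = v1 * df/dx + v2 * df/dy
    static double viaGradient(double x, double y, double v1, double v2) {
        double[] g = gradient(x, y);
        return v1 * g[0] + v2 * g[1];
    }

    // Difference quotient from the limit definition, using a small fixed h
    static double viaDifferenceQuotient(double x, double y,
                                        double v1, double v2, double h) {
        return (f(x + h * v1, y + h * v2) - f(x, y)) / h;
    }

    public static void main(String[] args) {
        double s = 1.0 / Math.sqrt(2.0); // both components of the unit vector v
        System.out.println("gradient formula:    " + viaGradient(1, 2, s, s));
        System.out.println("difference quotient: "
                + viaDifferenceQuotient(1, 2, s, s, 1e-6));
        System.out.println("15/sqrt(2):          " + 15.0 / Math.sqrt(2.0));
    }
}
```

The two computed values should agree to several decimal places; the difference quotient is only an approximation, so a small discrepancy is expected.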

A real-valued function $z = f(x, y)$ whose partial derivatives $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$ exist and are continuous is called continuously differentiable. Assume that $f(x, y)$ is such a function and that $\nabla f \ne \mathbf{0}$. Let $c$ be a real number in the range of $f$ and let $\mathbf{v}$ be a unit vector in $\mathbb{R}^2$ which is tangent to the level curve $f(x, y) = c$ (see Figure 2.4.1).

Figure 2.4.1 (the level curve $f(x, y) = c$ in the $xy$-plane, with a tangent vector $\mathbf{v}$ and the gradient $\nabla f$)

⁵ Sometimes the notation $\operatorname{grad}(f)$ is used instead of $\nabla f$.


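The value of $f(x, y)$ is constant along a level curve, so since $\mathbf{v}$ is a tangent vector to this curve, then the rate of change of $f$ in the direction of $\mathbf{v}$ is 0, i.e. $D_{\mathbf{v}} f = 0$. But we know that $D_{\mathbf{v}} f = \mathbf{v} \cdot \nabla f = \|\mathbf{v}\| \, \|\nabla f\| \cos\theta$, where $\theta$ is the angle between $\mathbf{v}$ and $\nabla f$. So since $\|\mathbf{v}\| = 1$ then $D_{\mathbf{v}} f = \|\nabla f\| \cos\theta$. So since $\nabla f \ne \mathbf{0}$ then $D_{\mathbf{v}} f = 0 \Rightarrow \cos\theta = 0 \Rightarrow \theta = 90^\circ$. In other words, $\nabla f \perp \mathbf{v}$, which means that $\nabla f$ is normal to the level curve.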

In general, for any unit vector $\mathbf{v}$ in $\mathbb{R}^2$, we still have $D_{\mathbf{v}} f = \|\nabla f\| \cos\theta$, where $\theta$ is the angle between $\mathbf{v}$ and $\nabla f$. At a fixed point $(x, y)$ the length $\|\nabla f\|$ is fixed, and the value of $D_{\mathbf{v}} f$ then varies as $\theta$ varies. The largest value that $D_{\mathbf{v}} f$ can take is when $\cos\theta = 1$ ($\theta = 0^\circ$), while the smallest value occurs when $\cos\theta = -1$ ($\theta = 180^\circ$). In other words, the value of the function $f$ increases the fastest in the direction of $\nabla f$ (since $\theta = 0^\circ$ in that case), and the value of $f$ decreases the fastest in the direction of $-\nabla f$ (since $\theta = 180^\circ$ in that case). We have thus proved the following theorem:

Theorem 2.4. Let $f(x, y)$ be a continuously differentiable real-valued function, with $\nabla f \ne \mathbf{0}$. Then:

(a) The gradient $\nabla f$ is normal to any level curve $f(x, y) = c$.

(b) The value of $f(x, y)$ increases the fastest in the direction of $\nabla f$.

(c) The value of $f(x, y)$ decreases the fastest in the direction of $-\nabla f$.

Example 2.16. In which direction does the function $f(x, y) = xy^2 + x^3 y$ increase the fastest from the point $(1, 2)$? In which direction does it decrease the fastest?

Solution: Since $\nabla f = (y^2 + 3x^2 y, \; 2xy + x^3)$, then $\nabla f(1, 2) = (10, 5) \ne \mathbf{0}$. A unit vector in that direction is $\mathbf{v} = \frac{\nabla f}{\|\nabla f\|} = \left( \frac{2}{\sqrt{5}}, \frac{1}{\sqrt{5}} \right)$. Thus, $f$ increases the fastest in the direction of $\left( \frac{2}{\sqrt{5}}, \frac{1}{\sqrt{5}} \right)$ and decreases the fastest in the direction of $\left( \frac{-2}{\sqrt{5}}, \frac{-1}{\sqrt{5}} \right)$.
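The normalization step in Example 2.16 can be written out in full. The short derivation below is a worked sketch using only the gradient value $(10, 5)$ from the example and the relation $D_{\mathbf{v}} f = \|\nabla f\| \cos\theta$; the final line, giving the maximal rate of increase itself, is an addition that the example does not state.

$$\begin{aligned}
\|\nabla f(1, 2)\| &= \sqrt{10^2 + 5^2} = \sqrt{125} = 5\sqrt{5} \\
\mathbf{v} &= \frac{(10, 5)}{5\sqrt{5}} = \left( \frac{2}{\sqrt{5}}, \frac{1}{\sqrt{5}} \right) \\
D_{\mathbf{v}} f(1, 2) &= \|\nabla f(1, 2)\| \cos 0^\circ = 5\sqrt{5} \approx 11.18
\end{aligned}$$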

Though we proved Theorem 2.4 for functions of two variables, a similar argument can be used to show that it also applies to functions of three or more variables. Likewise, the directional derivative in the three-dimensional case can also be defined by the formula $D_{\mathbf{v}} f = \mathbf{v} \cdot \nabla f$.

Example 2.17. The temperature $T$ of a solid is given by the function $T(x, y, z) = e^{-x} + e^{-2y} + e^{4z}$, where $x$, $y$, $z$ are space coordinates relative to the center of the solid. In which direction from the point $(1, 1, 1)$ will the temperature decrease the fastest?

Solution: Since $\nabla T = (-e^{-x}, \; -2e^{-2y}, \; 4e^{4z})$, then the temperature will decrease the fastest in the direction of $-\nabla T(1, 1, 1) = (e^{-1}, \; 2e^{-2}, \; -4e^{4})$.


Exercises

A

For Exercises 1-10, compute the gradient $\nabla f$.

1. $f(x, y) = x^2 + y^2 - 1$

2. $f(x, y) = \dfrac{1}{x^2 + y^2}$

3. $f(x, y) = \sqrt{x^2 + y^2 + 4}$

4. $f(x, y) = x^2 e^y$

5. $f(x, y) = \ln(xy)$

6. $f(x, y) = 2x + 5y$

7. $f(x, y, z) = \sin(xyz)$

8. $f(x, y, z) = x^2 e^{yz}$

9. $f(x, y, z) = x^2 + y^2 + z^2$

10. $f(x, y, z) = \sqrt{x^2 + y^2 + z^2}$

For Exercises 11-14, find the directional derivative of $f$ at the point $P$ in the direction of $\mathbf{v} = \left( \frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}} \right)$.

11. $f(x, y) = x^2 + y^2 - 1$, $P = (1, 1)$

12. $f(x, y) = \dfrac{1}{x^2 + y^2}$, $P = (1, 1)$

13. $f(x, y) = \sqrt{x^2 + y^2 + 4}$, $P = (1, 1)$

14. $f(x, y) = x^2 e^y$, $P = (1, 1)$

For Exercises 15-16, find the directional derivative of $f$ at the point $P$ in the direction of $\mathbf{v} = \left( \frac{1}{\sqrt{3}}, \frac{1}{\sqrt{3}}, \frac{1}{\sqrt{3}} \right)$.

15. $f(x, y, z) = \sin(xyz)$, $P = (1, 1, 1)$

16. $f(x, y, z) = x^2 e^{yz}$, $P = (1, 1, 1)$

17. Repeat Example 2.16 at the point $(2, 3)$.

18. Repeat Example 2.17 at the point $(3, 1, 2)$.

B

For Exercises 19-26, let $f(x, y)$ and $g(x, y)$ be continuously differentiable real-valued functions, let $c$ be a constant, and let $\mathbf{v}$ be a unit vector in $\mathbb{R}^2$. Show that:

19. $\nabla(cf) = c \, \nabla f$

20. $\nabla(f + g) = \nabla f + \nabla g$

21. $\nabla(fg) = f \, \nabla g + g \, \nabla f$

22. $\nabla(f/g) = \dfrac{g \, \nabla f - f \, \nabla g}{g^2}$ if $g(x, y) \ne 0$

23. $D_{-\mathbf{v}} f = -D_{\mathbf{v}} f$

24. $D_{\mathbf{v}}(cf) = c \, D_{\mathbf{v}} f$

25. $D_{\mathbf{v}}(f + g) = D_{\mathbf{v}} f + D_{\mathbf{v}} g$

26. $D_{\mathbf{v}}(fg) = f \, D_{\mathbf{v}} g + g \, D_{\mathbf{v}} f$

27. The function $r(x, y) = \sqrt{x^2 + y^2}$ is the length of the position vector $\mathbf{r} = x\mathbf{i} + y\mathbf{j}$ for each point $(x, y)$ in $\mathbb{R}^2$. Show that $\nabla r = \frac{1}{r} \mathbf{r}$ when $(x, y) \ne (0, 0)$, and that $\nabla(r^2) = 2\mathbf{r}$.


2.5 Maxima and Minima

The gradient can be used to ﬁnd extreme points of real-valued functions of several variables,

that is, points where the function has a local maximum or local minimum. We will consider

only functions of two variables; functions of three or more variables require methods using

linear algebra.

Deﬁnition 2.7. Let f (x, y) be a real-valued function, and let (a, b) be a point in the domain

of f . We say that f has a local maximum at (a, b) if f (x, y) ≤ f (a, b) for all (x, y) inside some

disk of positive radius centered at (a, b), i.e. there is some sufﬁciently small r >0 such that

$f(x, y) \le f(a, b)$ for all $(x, y)$ for which $(x-a)^2 + (y-b)^2 < r^2$.

Likewise, we say that f has a local minimum at (a, b) if f (x, y) ≥ f (a, b) for all (x, y)

inside some disk of positive radius centered at (a, b).

If f (x, y) ≤ f (a, b) for all (x, y) in the domain of f , then f has a global maximum at

(a, b). If f (x, y) ≥ f (a, b) for all (x, y) in the domain of f , then f has a global minimum at

(a, b).

Suppose that $(a, b)$ is a local maximum point for $f(x, y)$, and that the first-order partial derivatives of $f$ exist at $(a, b)$. We know that $f(a, b)$ is the largest value of $f(x, y)$ as $(x, y)$ goes in all directions from the point $(a, b)$, in some sufficiently small disk centered at $(a, b)$. In particular, $f(a, b)$ is the largest value of $f$ in the $x$ direction (around the point $(a, b)$), that is, the single-variable function $g(x) = f(x, b)$ has a local maximum at $x = a$. So we know that $g'(a) = 0$. Since $g'(x) = \frac{\partial f}{\partial x}(x, b)$, then $\frac{\partial f}{\partial x}(a, b) = 0$. Similarly, $f(a, b)$ is the largest value of $f$ near $(a, b)$ in the $y$ direction and so $\frac{\partial f}{\partial y}(a, b) = 0$. We thus have the following theorem:

Theorem 2.5. Let $f(x, y)$ be a real-valued function such that both $\frac{\partial f}{\partial x}(a, b)$ and $\frac{\partial f}{\partial y}(a, b)$ exist. Then a necessary condition for $f(x, y)$ to have a local maximum or minimum at $(a, b)$ is that $\nabla f(a, b) = \mathbf{0}$.

Note: Theorem 2.5 can be extended to apply to functions of three or more variables.

A point $(a, b)$ where $\nabla f(a, b) = \mathbf{0}$ is called a critical point for the function $f(x, y)$. So given a function $f(x, y)$, to find the critical points of $f$ you have to solve the equations $\frac{\partial f}{\partial x}(x, y) = 0$ and $\frac{\partial f}{\partial y}(x, y) = 0$ simultaneously for $(x, y)$. Similar to the single-variable case, the necessary condition that $\nabla f(a, b) = \mathbf{0}$ is not always sufficient to guarantee that a critical point is a local maximum or minimum.

Example 2.18. The function $f(x, y) = xy$ has a critical point at $(0, 0)$: $\frac{\partial f}{\partial x} = y = 0 \Rightarrow y = 0$, and $\frac{\partial f}{\partial y} = x = 0 \Rightarrow x = 0$, so $(0, 0)$ is the only critical point. But clearly $f$ does not have a local maximum or minimum at $(0, 0)$ since any disk around $(0, 0)$ contains points $(x, y)$ where the values of $x$ and $y$ have the same sign (so that $f(x, y) = xy > 0 = f(0, 0)$) and different signs (so that $f(x, y) = xy < 0 = f(0, 0)$). In fact, along the path $y = x$ in $\mathbb{R}^2$, $f(x, y) = x^2$, which has a local minimum at $(0, 0)$, while along the path $y = -x$ we have $f(x, y) = -x^2$, which has a local maximum at $(0, 0)$. So $(0, 0)$ is an example of a saddle point, i.e. it is a local maximum in one direction and a local minimum in another direction. The graph of $f(x, y)$ is shown in Figure 2.5.1, which is a hyperbolic paraboloid.

Figure 2.5.1 $f(x, y) = xy$, saddle point at $(0, 0)$

The following theorem gives sufficient conditions for a critical point to be a local maximum or minimum of a smooth function (i.e. a function whose partial derivatives of all orders exist and are continuous), which we will not prove here.⁶

Theorem 2.6. Let $f(x, y)$ be a smooth real-valued function, with a critical point at $(a, b)$ (i.e. $\nabla f(a, b) = \mathbf{0}$). Define

$$D = \frac{\partial^2 f}{\partial x^2}(a, b) \, \frac{\partial^2 f}{\partial y^2}(a, b) - \left( \frac{\partial^2 f}{\partial y \, \partial x}(a, b) \right)^2$$

Then

(a) if $D > 0$ and $\frac{\partial^2 f}{\partial x^2}(a, b) > 0$, then $f$ has a local minimum at $(a, b)$

(b) if $D > 0$ and $\frac{\partial^2 f}{\partial x^2}(a, b) < 0$, then $f$ has a local maximum at $(a, b)$

(c) if $D < 0$, then $f$ has neither a local minimum nor a local maximum at $(a, b)$

(d) if $D = 0$, then the test fails.

⁶ See TAYLOR and MANN, § 7.6.


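If condition (c) holds, then $(a, b)$ is a saddle point. Note that the assumption that $f(x, y)$ is smooth means that

$$D = \begin{vmatrix} \dfrac{\partial^2 f}{\partial x^2}(a, b) & \dfrac{\partial^2 f}{\partial y \, \partial x}(a, b) \\[2ex] \dfrac{\partial^2 f}{\partial x \, \partial y}(a, b) & \dfrac{\partial^2 f}{\partial y^2}(a, b) \end{vmatrix}$$

since $\frac{\partial^2 f}{\partial y \, \partial x} = \frac{\partial^2 f}{\partial x \, \partial y}$. Also, if $D > 0$ then $\frac{\partial^2 f}{\partial x^2}(a, b) \, \frac{\partial^2 f}{\partial y^2}(a, b) = D + \left( \frac{\partial^2 f}{\partial y \, \partial x}(a, b) \right)^2 > 0$, and so $\frac{\partial^2 f}{\partial x^2}(a, b)$ and $\frac{\partial^2 f}{\partial y^2}(a, b)$ have the same sign. This means that in parts (a) and (b) of the theorem one can replace $\frac{\partial^2 f}{\partial x^2}(a, b)$ by $\frac{\partial^2 f}{\partial y^2}(a, b)$ if desired.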

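Example 2.19. Find all local maxima and minima of $f(x, y) = x^2 + xy + y^2 - 3x$.

Solution: First find the critical points, i.e. where $\nabla f = \mathbf{0}$. Since

$$\frac{\partial f}{\partial x} = 2x + y - 3 \quad \text{and} \quad \frac{\partial f}{\partial y} = x + 2y$$

then the critical points $(x, y)$ are the common solutions of the equations

$$\begin{aligned} 2x + y - 3 &= 0 \\ x + 2y &= 0 \end{aligned}$$

which has the unique solution $(x, y) = (2, -1)$. So $(2, -1)$ is the only critical point.

To use Theorem 2.6, we need the second-order partial derivatives:

$$\frac{\partial^2 f}{\partial x^2} = 2 , \quad \frac{\partial^2 f}{\partial y^2} = 2 , \quad \frac{\partial^2 f}{\partial y \, \partial x} = 1$$

and so

$$D = \frac{\partial^2 f}{\partial x^2}(2, -1) \, \frac{\partial^2 f}{\partial y^2}(2, -1) - \left( \frac{\partial^2 f}{\partial y \, \partial x}(2, -1) \right)^2 = (2)(2) - 1^2 = 3 > 0$$

and $\frac{\partial^2 f}{\partial x^2}(2, -1) = 2 > 0$. Thus, $(2, -1)$ is a local minimum.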

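Example 2.20. Find all local maxima and minima of $f(x, y) = xy - x^3 - y^2$.

Solution: First find the critical points, i.e. where $\nabla f = \mathbf{0}$. Since

$$\frac{\partial f}{\partial x} = y - 3x^2 \quad \text{and} \quad \frac{\partial f}{\partial y} = x - 2y$$

then the critical points $(x, y)$ are the common solutions of the equations

$$\begin{aligned} y - 3x^2 &= 0 \\ x - 2y &= 0 \end{aligned}$$

The first equation yields $y = 3x^2$; substituting that into the second equation yields $x - 6x^2 = 0$, which has the solutions $x = 0$ and $x = \frac{1}{6}$. So $x = 0 \Rightarrow y = 3(0) = 0$ and $x = \frac{1}{6} \Rightarrow y = 3\left( \frac{1}{6} \right)^2 = \frac{1}{12}$. So the critical points are $(x, y) = (0, 0)$ and $(x, y) = \left( \frac{1}{6}, \frac{1}{12} \right)$.

To use Theorem 2.6, we need the second-order partial derivatives:

$$\frac{\partial^2 f}{\partial x^2} = -6x , \quad \frac{\partial^2 f}{\partial y^2} = -2 , \quad \frac{\partial^2 f}{\partial y \, \partial x} = 1$$

So

$$D = \frac{\partial^2 f}{\partial x^2}(0, 0) \, \frac{\partial^2 f}{\partial y^2}(0, 0) - \left( \frac{\partial^2 f}{\partial y \, \partial x}(0, 0) \right)^2 = (-6(0))(-2) - 1^2 = -1 < 0$$

and thus $(0, 0)$ is a saddle point. Also,

$$D = \frac{\partial^2 f}{\partial x^2}\left( \tfrac{1}{6}, \tfrac{1}{12} \right) \frac{\partial^2 f}{\partial y^2}\left( \tfrac{1}{6}, \tfrac{1}{12} \right) - \left( \frac{\partial^2 f}{\partial y \, \partial x}\left( \tfrac{1}{6}, \tfrac{1}{12} \right) \right)^2 = \left( -6\left( \tfrac{1}{6} \right) \right)(-2) - 1^2 = 1 > 0$$

and $\frac{\partial^2 f}{\partial x^2}\left( \frac{1}{6}, \frac{1}{12} \right) = -1 < 0$. Thus, $\left( \frac{1}{6}, \frac{1}{12} \right)$ is a local maximum.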
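The case analysis of Theorem 2.6 is mechanical once the second-order partial derivatives at a critical point are known, so it can be sketched as a small Java helper. The class name `SecondDerivativeTest` and the method `classify` are hypothetical names chosen for illustration, and the inputs are the values already computed by hand in Examples 2.19 and 2.20. Note that an exact comparison of D with zero is only reliable for hand-computed values like these; with values produced by floating-point arithmetic, case (d) would need a tolerance.

```java
// Sketch of the classification in Theorem 2.6 (names are illustrative assumptions).
public class SecondDerivativeTest {
    // fxx, fyy, fxy are the second-order partials evaluated at a critical point.
    static String classify(double fxx, double fyy, double fxy) {
        double d = fxx * fyy - fxy * fxy;           // the quantity D of Theorem 2.6
        if (d > 0 && fxx > 0) return "local minimum";  // part (a)
        if (d > 0 && fxx < 0) return "local maximum";  // part (b)
        if (d < 0) return "saddle point";              // part (c)
        return "test fails";                           // part (d): D = 0
    }

    public static void main(String[] args) {
        // Example 2.19 at (2,-1): fxx = 2, fyy = 2, fxy = 1
        System.out.println("(2,-1):     " + classify(2, 2, 1));
        // Example 2.20 at (0,0): fxx = 0, fyy = -2, fxy = 1
        System.out.println("(0,0):      " + classify(0, -2, 1));
        // Example 2.20 at (1/6,1/12): fxx = -1, fyy = -2, fxy = 1
        System.out.println("(1/6,1/12): " + classify(-1, -2, 1));
    }
}
```

Only the sign of D and of the second partial with respect to x matter here, which is why the helper needs just three numbers per critical point.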

Example 2.21. Find all local maxima and minima of $f(x, y) = (x-2)^4 + (x-2y)^2$.

Solution: First find the critical points, i.e. where $\nabla f = \mathbf{0}$. Since

$$\frac{\partial f}{\partial x} = 4(x-2)^3 + 2(x-2y) \quad \text{and} \quad \frac{\partial f}{\partial y} = -4(x-2y)$$

then the critical points $(x, y)$ are the common solutions of the equations

$$\begin{aligned} 4(x-2)^3 + 2(x-2y) &= 0 \\ -4(x-2y) &= 0 \end{aligned}$$

The second equation yields $x = 2y$; substituting that into the first equation yields $4(2y-2)^3 = 0$, which has the solution $y = 1$, and so $x = 2(1) = 2$. Thus, $(2, 1)$ is the only critical point.

To use Theorem 2.6, we need the second-order partial derivatives:

$$\frac{\partial^2 f}{\partial x^2} = 12(x-2)^2 + 2 , \quad \frac{\partial^2 f}{\partial y^2} = 8 , \quad \frac{\partial^2 f}{\partial y \, \partial x} = -4$$

So

$$D = \frac{\partial^2 f}{\partial x^2}(2, 1) \, \frac{\partial^2 f}{\partial y^2}(2, 1) - \left( \frac{\partial^2 f}{\partial y \, \partial x}(2, 1) \right)^2 = (2)(8) - (-4)^2 = 0$$

and so the test fails. What can be done in this situation? Sometimes it is possible to examine the function to see directly the nature of a critical point. In our case, we see that $f(x, y) \ge 0$ for all $(x, y)$, since $f(x, y)$ is the sum of fourth and second powers of numbers and hence must be nonnegative. But we also see that $f(2, 1) = 0$. Thus $f(x, y) \ge 0 = f(2, 1)$ for all $(x, y)$, and hence $(2, 1)$ is in fact a global minimum for $f$.

Example 2.22. Find all local maxima and minima of $f(x, y) = (x^2 + y^2) e^{-(x^2+y^2)}$.

Solution: First find the critical points, i.e. where $\nabla f = \mathbf{0}$. Since

$$\begin{aligned} \frac{\partial f}{\partial x} &= 2x \left( 1 - (x^2 + y^2) \right) e^{-(x^2+y^2)} \\ \frac{\partial f}{\partial y} &= 2y \left( 1 - (x^2 + y^2) \right) e^{-(x^2+y^2)} \end{aligned}$$

then the critical points are $(0, 0)$ and all points $(x, y)$ on the unit circle $x^2 + y^2 = 1$.

To use Theorem 2.6, we need the second-order partial derivatives:

$$\begin{aligned} \frac{\partial^2 f}{\partial x^2} &= 2 \left[ 1 - (x^2 + y^2) - 2x^2 - 2x^2 \left( 1 - (x^2 + y^2) \right) \right] e^{-(x^2+y^2)} \\ \frac{\partial^2 f}{\partial y^2} &= 2 \left[ 1 - (x^2 + y^2) - 2y^2 - 2y^2 \left( 1 - (x^2 + y^2) \right) \right] e^{-(x^2+y^2)} \\ \frac{\partial^2 f}{\partial y \, \partial x} &= -4xy \left[ 2 - (x^2 + y^2) \right] e^{-(x^2+y^2)} \end{aligned}$$

At $(0, 0)$, we have $D = 4 > 0$ and $\frac{\partial^2 f}{\partial x^2}(0, 0) = 2 > 0$, so $(0, 0)$ is a local minimum. However, for points $(x, y)$ on the unit circle $x^2 + y^2 = 1$, we have

$$D = (-4x^2 e^{-1})(-4y^2 e^{-1}) - (-4xy e^{-1})^2 = 0$$

and so the test fails. If we look at the graph of $f(x, y)$, as shown in Figure 2.5.2, it looks like we might have a local maximum for $(x, y)$ on the unit circle $x^2 + y^2 = 1$. If we switch to using polar coordinates $(r, \theta)$ instead of $(x, y)$ in $\mathbb{R}^2$, where $r^2 = x^2 + y^2$, then we see that we can write $f(x, y)$ as a function $g(r)$ of the variable $r$ alone: $g(r) = r^2 e^{-r^2}$. Then $g'(r) = 2r(1 - r^2) e^{-r^2}$, so it has a critical point at $r = 1$, and we can check that $g''(1) = -4e^{-1} < 0$, so the Second Derivative Test from single-variable calculus says that $r = 1$ is a local maximum. But $r = 1$ corresponds to the unit circle $x^2 + y^2 = 1$. Thus, the points $(x, y)$ on the unit circle $x^2 + y^2 = 1$ are local maximum points for $f$.

Figure 2.5.2 $f(x, y) = (x^2 + y^2) e^{-(x^2+y^2)}$

Exercises

A

For Exercises 1-10, find all local maxima and minima of the function $f(x, y)$.

1. $f(x, y) = x^3 - 3x + y^2$

2. $f(x, y) = x^3 - 12x + y^2 + 8y$

3. $f(x, y) = x^3 - 3x + y^3 - 3y$

4. $f(x, y) = x^3 + 3x^2 + y^3 - 3y^2$

5. $f(x, y) = 2x^3 + 6xy + 3y^2$

6. $f(x, y) = 2x^3 - 6xy + y^2$

7. $f(x, y) = \sqrt{x^2 + y^2}$

8. $f(x, y) = x + 2y$

9. $f(x, y) = 4x^2 - 4xy + 2y^2 + 10x - 6y$

10. $f(x, y) = -4x^2 + 4xy - 2y^2 + 16x - 12y$

B

11. For a rectangular solid of volume 1000 cubic meters, find the dimensions that will minimize the surface area. (Hint: Use the volume condition to write the surface area as a function of just two variables.)

12. Prove that if $(a, b)$ is a local maximum or local minimum point for a smooth function $f(x, y)$, then the tangent plane to the surface $z = f(x, y)$ at the point $(a, b, f(a, b))$ is parallel to the $xy$-plane. (Hint: Use Theorem 2.5.)

C

13. Find three positive numbers $x$, $y$, $z$ whose sum is 10 such that $x^2 y^2 z$ is a maximum.


2.6 Unconstrained Optimization: Numerical Methods

The types of problems that we solved in the previous section were examples of unconstrained optimization problems. That is, we tried to find local (and perhaps even global) maximum and minimum points of real-valued functions $f(x, y)$, where the points $(x, y)$ could be any points in the domain of $f$. The method we used required us to find the critical points of $f$, which meant having to solve the equation $\nabla f = \mathbf{0}$, which in general is a system of two equations in two unknowns ($x$ and $y$). While this was relatively simple for the examples we did, in general this will not be the case. If the equations involve polynomials in $x$ and $y$ of degree three or higher, or complicated expressions involving trigonometric, exponential, or logarithmic functions, then solving even one such equation, let alone two, could be impossible by elementary means.⁷ For example, if one of the equations that had to be solved was

$$x^3 + 9x - 2 = 0 ,$$

you may have a hard time getting the exact solutions. Trial and error would not help much, especially since the only real solution⁸ turns out to be $\sqrt[3]{\sqrt{28} + 1} - \sqrt[3]{\sqrt{28} - 1}$. In a situation such as this, the only choice may be to find a solution using some numerical method which gives a sequence of numbers which converge to the actual solution. One example is Newton’s method for solving equations $f(x) = 0$, which you probably learned in single-variable calculus. In this section we will describe another method of Newton for finding critical points of real-valued functions of two variables.
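As a reminder of how the single-variable Newton's method behaves, here is a minimal Java sketch applied to the cubic equation above. The class name `CubicNewton`, the starting value 0, and the fixed count of 20 iterations are assumptions made for illustration; the iteration itself is the standard update, new x = x - p(x)/p'(x).

```java
// Sketch: single-variable Newton's method for x^3 + 9x - 2 = 0 (names are assumptions).
public class CubicNewton {
    // The polynomial p(x) = x^3 + 9x - 2
    static double p(double x) {
        return x * x * x + 9 * x - 2;
    }

    // Its derivative p'(x) = 3x^2 + 9, which is never zero
    static double dp(double x) {
        return 3 * x * x + 9;
    }

    // Perform a fixed number of Newton iterations starting from x0
    static double newton(double x0, int iterations) {
        double x = x0;
        for (int n = 0; n < iterations; n++) {
            x = x - p(x) / dp(x);
        }
        return x;
    }

    // The closed-form real root quoted in the text
    static double closedForm() {
        return Math.cbrt(Math.sqrt(28) + 1) - Math.cbrt(Math.sqrt(28) - 1);
    }

    public static void main(String[] args) {
        System.out.println("Newton:      " + newton(0, 20));
        System.out.println("Closed form: " + closedForm());
    }
}
```

Both printed values should agree to roughly machine precision, which illustrates why a numerical method is often the practical choice even when an exact formula exists.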

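Let $f(x, y)$ be a smooth real-valued function, and define

$$D(x, y) = \frac{\partial^2 f}{\partial x^2}(x, y) \, \frac{\partial^2 f}{\partial y^2}(x, y) - \left( \frac{\partial^2 f}{\partial y \, \partial x}(x, y) \right)^2 .$$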

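Newton’s algorithm: Pick an initial point $(x_0, y_0)$. For $n = 0, 1, 2, 3, \ldots$, define:

$$x_{n+1} = x_n - \frac{\begin{vmatrix} \dfrac{\partial^2 f}{\partial y^2}(x_n, y_n) & \dfrac{\partial^2 f}{\partial x \, \partial y}(x_n, y_n) \\[2ex] \dfrac{\partial f}{\partial y}(x_n, y_n) & \dfrac{\partial f}{\partial x}(x_n, y_n) \end{vmatrix}}{D(x_n, y_n)} , \qquad y_{n+1} = y_n - \frac{\begin{vmatrix} \dfrac{\partial^2 f}{\partial x^2}(x_n, y_n) & \dfrac{\partial^2 f}{\partial x \, \partial y}(x_n, y_n) \\[2ex] \dfrac{\partial f}{\partial x}(x_n, y_n) & \dfrac{\partial f}{\partial y}(x_n, y_n) \end{vmatrix}}{D(x_n, y_n)} \tag{2.14}$$

Then the sequence of points $(x_n, y_n)_{n=1}^{\infty}$ converges to a critical point. If there are several critical points, then you will have to try different initial points to find them.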
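Expanding the two $2 \times 2$ determinants in formulas (2.14) makes the update easier to compute by hand or in a program. In the worked form below, the subscript shorthand $f_x$, $f_y$, $f_{xx}$, $f_{yy}$, $f_{xy}$ for the partial derivatives is a notational assumption introduced only for brevity, with every term evaluated at $(x_n, y_n)$:

$$\begin{aligned}
x_{n+1} &= x_n - \frac{f_{yy} \, f_x - f_{xy} \, f_y}{D(x_n, y_n)} \\
y_{n+1} &= y_n - \frac{f_{xx} \, f_y - f_{xy} \, f_x}{D(x_n, y_n)}
\end{aligned}$$

When $\nabla f(x_n, y_n) = \mathbf{0}$ both numerators vanish, so a critical point is a fixed point of the iteration.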

⁷ This is also a problem for the equivalent method (the Second Derivative Test) in single-variable calculus, though one that is not usually emphasized.

⁸ There are also two nonreal, complex number solutions. Cubic polynomial equations in one variable can be solved using Cardan’s formulas, which are not quite as simple as the familiar quadratic formula. See USPENSKY for more details. There are formulas for solving polynomial equations of degree 4, but it can be proved that there is no general formula for solving equations for polynomials of degree five or higher.


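Example 2.23. Find all local maxima and minima of $f(x, y) = x^3 - xy - x + xy^3 - y^4$.

Solution: First calculate the necessary partial derivatives:

$$\frac{\partial f}{\partial x} = 3x^2 - y - 1 + y^3 , \quad \frac{\partial f}{\partial y} = -x + 3xy^2 - 4y^3$$

$$\frac{\partial^2 f}{\partial x^2} = 6x , \quad \frac{\partial^2 f}{\partial y^2} = 6xy - 12y^2 , \quad \frac{\partial^2 f}{\partial y \, \partial x} = -1 + 3y^2$$

Notice that solving $\nabla f = \mathbf{0}$ would involve solving two third-degree polynomial equations in $x$ and $y$, which in this case cannot be done easily.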

We need to pick an initial point $(x_0, y_0)$ for our algorithm. Looking at the graph of $z = f(x, y)$ over a large region may help (see Figure 2.6.1 below), though it may be hard to tell where the critical points are.

Figure 2.6.1 $f(x, y) = x^3 - xy - x + xy^3 - y^4$ for $-20 \le x \le 20$ and $-20 \le y \le 20$

Notice in the formulas (2.14) that we divide by $D$, so we should pick an initial point where $D$ is not zero. And we can see that $D(0, 0) = (0)(0) - (-1)^2 = -1 \ne 0$, so take $(0, 0)$ as our initial point. Since it may take a large number of iterations of Newton’s algorithm to be sure that we are close enough to the actual critical point, and since the computations are quite tedious, we will let a computer do the computing. For this, we will write a simple program, using the Java programming language, which will take a given initial point as a parameter and then perform 100 iterations of Newton’s algorithm. In each iteration the new point will be printed, so that we can see if there is convergence. The full code is shown in Listing 2.1.


    //Program to find the critical points of f(x,y)=x^3-xy-x+xy^3-y^4
    public class newton {
        public static void main(String[] args) {
            //Get the initial point (x,y) as command-line parameters
            double x = Double.parseDouble(args[0]); //Initial x value
            double y = Double.parseDouble(args[1]); //Initial y value
            System.out.println("Initial point: (" + x + "," + y + ")");
            //Go through 100 iterations of Newton's algorithm
            for (int n=1; n<=100; n++) {
                double D = fxx(x,y)*fyy(x,y) - Math.pow(fxy(x,y),2);
                double xn = x; double yn = y; //The current x and y values
                if (D == 0) { //We can not divide by 0
                    System.out.println("Error: D = 0 at iteration n = " + n);
                    System.exit(0); //End the program
                } else { //Calculate the new values for x and y
                    x = xn - (fyy(xn,yn)*fx(xn,yn) - fxy(xn,yn)*fy(xn,yn))/D;
                    y = yn - (fxx(xn,yn)*fy(xn,yn) - fxy(xn,yn)*fx(xn,yn))/D;
                    System.out.println("n = " + n + ": (" + x + "," + y + ")");
                }
            }
        }
        //Below are the parts specific to the function f
        //The first partial derivative of f wrt x: 3x^2-y-1+y^3
        public static double fx(double x, double y) {
            return 3*Math.pow(x,2) - y - 1 + Math.pow(y,3);
        }
        //The first partial derivative of f wrt y: -x+3xy^2-4y^3
        public static double fy(double x, double y) {
            return -x + 3*x*Math.pow(y,2) - 4*Math.pow(y,3);
        }
        //The second partial derivative of f wrt x: 6x
        public static double fxx(double x, double y) {
            return 6*x;
        }
        //The second partial derivative of f wrt y: 6xy-12y^2
        public static double fyy(double x, double y) {
            return 6*x*y - 12*Math.pow(y,2);
        }
        //The mixed second partial derivative of f wrt x and y: -1+3y^2
        public static double fxy(double x, double y) {
            return -1 + 3*Math.pow(y,2);
        }
    }

Listing 2.1 Program listing for newton.java


To use this program, you should first save the code in Listing 2.1 in a plain text file called newton.java. You will need the Java Development Kit⁹ to compile the code. In the directory where newton.java is saved, run this command at a command prompt to compile the code:

javac newton.java

Then run the program with the initial point (0, 0) with this command:

java newton 0 0

Below is the output of the program using (0, 0) as the initial point, truncated to show the

ﬁrst 10 lines and the last 5 lines:

java newton 0 0

Initial point: (0.0,0.0)

n = 1: (0.0,-1.0)

n = 2: (1.0,-0.5)

n = 3: (0.6065857885615251,-0.44194107452339687)

n = 4: (0.484506572966545,-0.405341511995805)

n = 5: (0.47123972682634485,-0.3966334583092305)

n = 6: (0.47113558510349535,-0.39636450001936047)

n = 7: (0.4711356343449705,-0.3963643379632247)

n = 8: (0.4711356343449874,-0.39636433796318005)

n = 9: (0.4711356343449874,-0.39636433796318005)

n = 10: (0.4711356343449874,-0.39636433796318005)

...

n = 96: (0.4711356343449874,-0.39636433796318005)

n = 97: (0.4711356343449874,-0.39636433796318005)

n = 98: (0.4711356343449874,-0.39636433796318005)

n = 99: (0.4711356343449874,-0.39636433796318005)

n = 100: (0.4711356343449874,-0.39636433796318005)

As you can see, we appear to have converged fairly quickly (after only 8 iterations) to what appears to be an actual critical point (up to Java’s level of precision), namely the point $(0.4711356343449874, -0.39636433796318005)$. It is easy to confirm that $\nabla f = \mathbf{0}$ at this point, either by evaluating $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$ at the point ourselves or by modifying our program to also print the values of the partial derivatives at the point. It turns out that both partial derivatives are indeed close enough to zero to be considered zero:

$$\begin{aligned} \frac{\partial f}{\partial x}(0.4711356343449874, -0.39636433796318005) &= 4.85722573273506 \times 10^{-17} \\ \frac{\partial f}{\partial y}(0.4711356343449874, -0.39636433796318005) &= -8.326672684688674 \times 10^{-17} \end{aligned}$$

We also have $D(0.4711356343449874, -0.39636433796318005) = -8.776075636032301 < 0$, so by Theorem 2.6 we know that $(0.4711356343449874, -0.39636433796318005)$ is a saddle point.

⁹ Available for free at http://www.oracle.com/technetwork/java/javase/downloads/


Since ∇f consists of cubic polynomials, it seems likely that there may be three critical

points. The computer program makes experimenting with other initial points easy, and

trying different values does indeed lead to different sequences which converge:

java newton -1 -1

Initial point: (-1.0,-1.0)

n = 1: (-0.5,-0.5)

n = 2: (-0.49295774647887325,-0.08450704225352113)

n = 3: (-0.1855674752461383,-1.2047647348546167)

n = 4: (-0.4540060574531383,-0.8643989895639324)

n = 5: (-0.3672160534444,-0.5426077421319053)

n = 6: (-0.4794622222856417,-0.24529117721011612)

n = 7: (0.11570743992954591,-2.4319791238981274)

n = 8: (-0.05837851765533317,-1.6536079835854451)

n = 9: (-0.129841298650007,-1.121516233310142)

n = 10: (-1.004453014967208,-0.9206128022529645)

n = 11: (-0.5161209914612475,-0.4176293491131443)

n = 12: (-0.5788664043863884,0.2918236503332734)

n = 13: (-0.6985177124230715,0.49848120123515316)

n = 14: (-0.6733618916578702,0.4345777963475479)

n = 15: (-0.6704392913413444,0.4252025996474051)

n = 16: (-0.6703832679150286,0.4250147307973365)

n = 17: (-0.6703832459238701,0.42501465652421205)

n = 18: (-0.6703832459238667,0.4250146565242004)

n = 19: (-0.6703832459238667,0.42501465652420045)

n = 20: (-0.6703832459238667,0.42501465652420045)

...

n = 98: (-0.6703832459238667,0.42501465652420045)

n = 99: (-0.6703832459238667,0.42501465652420045)

n = 100: (-0.6703832459238667,0.42501465652420045)

Again, it is easy to confirm that both $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$ vanish at the point $(-0.6703832459238667, 0.42501465652420045)$, which means it is a critical point. And

$$\begin{aligned} D(-0.6703832459238667, 0.42501465652420045) &= 15.3853578526055 > 0 \\ \frac{\partial^2 f}{\partial x^2}(-0.6703832459238667, 0.42501465652420045) &= -4.0222994755432 < 0 \end{aligned}$$

so we know that $(-0.6703832459238667, 0.42501465652420045)$ is a local maximum. An idea of what the graph of $f$ looks like near that point is shown in Figure 2.6.2, which does suggest a local maximum around that point.

Finally, running the computer program with the initial point (−5, −5) yields the critical

point (−7.540962756992551, −5.595509445899435), with D < 0 at that point, which makes

it a saddle point.

Figure 2.6.2 $f(x, y) = x^3 - xy - x + xy^3 - y^4$ for $-1 \le x \le 0$ and $0 \le y \le 1$ (the point $(-0.67, 0.42, 0.57)$ is marked on the surface)

We can summarize our findings for the function $f(x, y) = x^3 - xy - x + xy^3 - y^4$:

(0.4711356343449874, −0.39636433796318005) : saddle point

(−0.6703832459238667, 0.42501465652420045) : local maximum

(−7.540962756992551, −5.595509445899435) : saddle point

The derivation of Newton’s algorithm, and the proof that it converges (given a “reasonable” choice for the initial point), require techniques beyond the scope of this text. See RALSTON and RABINOWITZ for more detail and for discussion of other numerical methods. Our description of Newton’s algorithm is the special two-variable case of a more general algorithm that can be applied to functions of $n \ge 2$ variables.

In the case of functions which have a global maximum or minimum, Newton’s algorithm

can be used to ﬁnd those points. In general, global maxima and minima tend to be more

interesting than local versions, at least in practical applications. A maximization problem

can always be turned into a minimization problem (why?), so a large number of methods

have been developed to ﬁnd the global minimum of functions of any number of variables.

This ﬁeld of study is called nonlinear programming. Many of these methods are based on the

steepest descent technique, which is based on an idea that we discussed in Section 2.4. Recall

that the negative gradient −∇f gives the direction of the fastest rate of decrease of a function

f . The crux of the steepest descent idea, then, is that starting from some initial point, you

move a certain amount in the direction of −∇f at that point. Wherever that takes you


becomes your new point, and you then just keep repeating that procedure until eventually

(hopefully) you reach the point where f has its smallest value. There is a “pure” steepest

descent method, and a multitude of variations on it that improve the rate of convergence,

ease of calculation, etc. In fact, Newton’s algorithm can be interpreted as a modiﬁed steepest

descent method. For more discussion of this, and of nonlinear programming in general, see

BAZARAA, SHERALI and SHETTY.
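The steepest descent idea described above can be sketched in a few lines of Java. This is only an illustration under stated assumptions: it uses the function $f(x, y) = x^2 + xy + y^2 - 3x$ from Example 2.19, a fixed step size of 0.1 and a fixed count of 500 iterations (both chosen arbitrarily for this particular function), and hypothetical names `SteepestDescentSketch` and `descend`. Practical methods choose the step size far more carefully.

```java
// Sketch of "pure" steepest descent with a fixed step size (all names are assumptions).
public class SteepestDescentSketch {
    // Partial derivatives of f(x,y) = x^2 + xy + y^2 - 3x from Example 2.19
    static double fx(double x, double y) {
        return 2 * x + y - 3;
    }

    static double fy(double x, double y) {
        return x + 2 * y;
    }

    // Repeatedly move a fixed fraction of the way along the negative gradient
    static double[] descend(double x0, double y0, double step, int iterations) {
        double x = x0;
        double y = y0;
        for (int n = 0; n < iterations; n++) {
            double gx = fx(x, y); // gradient at the current point
            double gy = fy(x, y);
            x -= step * gx;       // step in the direction of -grad f
            y -= step * gy;
        }
        return new double[] { x, y };
    }

    public static void main(String[] args) {
        double[] p = descend(0, 0, 0.1, 500);
        System.out.println("Approximate minimum point: (" + p[0] + "," + p[1] + ")");
    }
}
```

Starting from $(0, 0)$, the iterates should approach the point $(2, -1)$ found in Example 2.19. Unlike Newton's algorithm, this sketch uses no second-order partial derivatives, at the cost of needing many more iterations.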

Exercises

C

1. Recall Example 2.21 from the previous section, where we showed that the point $(2, 1)$ was a global minimum for the function $f(x, y) = (x-2)^4 + (x-2y)^2$. Notice that our computer program can be modified fairly easily to use this function (just change the return values in the fx, fy, fxx, fyy and fxy function definitions to use the appropriate partial derivative). Either modify that program or write one of your own in a programming language of your choice to show that Newton’s algorithm does lead to the point $(2, 1)$. First use the initial point $(0, 3)$, then use the initial point $(3, 2)$, and compare the results. Make sure that your program attempts to do 100 iterations of the algorithm. Did anything strange happen when your program ran? If so, how do you explain it? (Hint: Something strange should happen.)

2. There is a version of Newton’s algorithm for solving a system of two equations

$$f_1(x, y) = 0 \quad \text{and} \quad f_2(x, y) = 0,$$

where $f_1(x, y)$ and $f_2(x, y)$ are smooth real-valued functions:

Pick an initial point $(x_0, y_0)$. For $n = 0, 1, 2, 3, \ldots$, define:

$$x_{n+1} = x_n - \frac{\begin{vmatrix} f_1(x_n, y_n) & f_2(x_n, y_n) \\ \dfrac{\partial f_1}{\partial y}(x_n, y_n) & \dfrac{\partial f_2}{\partial y}(x_n, y_n) \end{vmatrix}}{D(x_n, y_n)}, \qquad y_{n+1} = y_n + \frac{\begin{vmatrix} f_1(x_n, y_n) & f_2(x_n, y_n) \\ \dfrac{\partial f_1}{\partial x}(x_n, y_n) & \dfrac{\partial f_2}{\partial x}(x_n, y_n) \end{vmatrix}}{D(x_n, y_n)},$$

where

$$D(x_n, y_n) = \frac{\partial f_1}{\partial x}(x_n, y_n)\,\frac{\partial f_2}{\partial y}(x_n, y_n) - \frac{\partial f_1}{\partial y}(x_n, y_n)\,\frac{\partial f_2}{\partial x}(x_n, y_n).$$

Then the sequence of points $(x_n, y_n)_{n=1}^{\infty}$ converges to a solution. Write a computer program that uses this algorithm to find approximate solutions to the system of equations

$$f_1(x, y) = \sin(xy) - x - y = 0 \quad \text{and} \quad f_2(x, y) = e^{2x} - 2x + 3y = 0.$$

Show that you get two different solutions when using $(0, 0)$ and $(1, 1)$ for the initial point $(x_0, y_0)$.
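One way to organize such a program is sketched below in Python. This is only an illustration of the update formulas in Exercise 2: the function name `newton_system`, the fixed iteration count, and the demonstration system ($x^2 + y^2 = 2$, $x = y$) are assumptions chosen for this sketch, and the system from the exercise is deliberately left for the reader.

```python
def newton_system(f1, f2, f1x, f1y, f2x, f2y, x, y, iterations=20):
    """Iterate the two-equation Newton scheme from the exercise statement."""
    for _ in range(iterations):
        a, b = f1(x, y), f2(x, y)
        # D is the determinant of the matrix of first partial derivatives.
        d = f1x(x, y) * f2y(x, y) - f1y(x, y) * f2x(x, y)
        if d == 0.0:
            raise ZeroDivisionError("D(x_n, y_n) = 0; the update is undefined")
        # The two 2x2 determinants in the numerators of the update formulas.
        num_x = a * f2y(x, y) - b * f1y(x, y)
        num_y = a * f2x(x, y) - b * f1x(x, y)
        x, y = x - num_x / d, y + num_y / d
    return x, y


# Illustrative system (an assumption, not taken from the text):
#   f1 = x^2 + y^2 - 2 = 0,  f2 = x - y = 0,  with solutions (1, 1) and (-1, -1).
root = newton_system(
    lambda x, y: x * x + y * y - 2.0,
    lambda x, y: x - y,
    lambda x, y: 2.0 * x,   # df1/dx
    lambda x, y: 2.0 * y,   # df1/dy
    lambda x, y: 1.0,       # df2/dx
    lambda x, y: -1.0,      # df2/dy
    2.0, 1.0,               # initial point (x0, y0)
)
print(root)
```

Starting from $(2, 1)$ the iterates of this demonstration system approach $(1, 1)$. Swapping in the partial derivatives of another pair $f_1$, $f_2$ only requires changing the six functions passed in.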


2.7 Constrained Optimization: Lagrange Multipliers

In Sections 2.5 and 2.6 we were concerned with ﬁnding maxima and minima of functions

without any constraints on the variables (other than being in the domain of the function).

What would we do if there were constraints on the variables? The following example illus-

trates a simple case of this type of problem.

Example 2.24. For a rectangle whose perimeter is 20 m, ﬁnd the dimensions that will max-

imize the area.

Solution: The area $A$ of a rectangle with width $x$ and height $y$ is $A = xy$. The perimeter $P$ of the rectangle is then given by the formula $P = 2x + 2y$. Since we are given that the perimeter $P = 20$, this problem can be stated as:

Maximize: $f(x, y) = xy$
given: $2x + 2y = 20$

The reader is probably familiar with a simple method, using single-variable calculus, for solving this problem. Since we must have $2x + 2y = 20$, we can solve for, say, $y$ in terms of $x$ using that equation. This gives $y = 10 - x$, which we then substitute into $f$ to get $f(x, y) = xy = x(10 - x) = 10x - x^2$. This is now a function of $x$ alone, so we now just have to maximize the function $f(x) = 10x - x^2$ on the interval $[0, 10]$. Since $f'(x) = 10 - 2x = 0 \Rightarrow x = 5$ and $f''(5) = -2 < 0$, the Second Derivative Test tells us that $x = 5$ is a local maximum for $f$, and hence $x = 5$ must be the global maximum on the interval $[0, 10]$ (since $f = 0$ at the endpoints of the interval). So since $y = 10 - x = 5$, the maximum area occurs for a rectangle whose width and height both are 5 m.

Notice in the above example that the ease of the solution depended on being able to solve for one variable in terms of the other in the equation $2x + 2y = 20$. But what if that were not possible (which is often the case)? In this section we will use a general method, called the Lagrange multiplier method$^{10}$, for solving constrained optimization problems:

Maximize (or minimize): $f(x, y)$ (or $f(x, y, z)$)
given: $g(x, y) = c$ (or $g(x, y, z) = c$) for some constant $c$

The equation $g(x, y) = c$ is called the constraint equation, and we say that $x$ and $y$ are constrained by $g(x, y) = c$. Points $(x, y)$ which are maxima or minima of $f(x, y)$ with the condition that they satisfy the constraint equation $g(x, y) = c$ are called constrained maximum or constrained minimum points, respectively. Similar definitions hold for functions of three variables.

The Lagrange multiplier method for solving such problems can now be stated:

$^{10}$ Named after the French mathematician Joseph Louis Lagrange (1736-1813).

Theorem 2.7. Let $f(x, y)$ and $g(x, y)$ be smooth functions, and suppose that $c$ is a scalar constant such that $\nabla g(x, y) \ne \mathbf{0}$ for all $(x, y)$ that satisfy the equation $g(x, y) = c$. Then to solve the constrained optimization problem

Maximize (or minimize): $f(x, y)$
given: $g(x, y) = c$,

find the points $(x, y)$ that solve the equation $\nabla f(x, y) = \lambda \nabla g(x, y)$ for some constant $\lambda$ (the number $\lambda$ is called the Lagrange multiplier). If there is a constrained maximum or minimum, then it must be such a point.

A rigorous proof of the above theorem requires use of the Implicit Function Theorem, which is beyond the scope of this text.$^{11}$ Note that the theorem only gives a necessary condition for a point to be a constrained maximum or minimum. Whether a point $(x, y)$ that satisfies $\nabla f(x, y) = \lambda \nabla g(x, y)$ for some $\lambda$ actually is a constrained maximum or minimum can sometimes be determined by the nature of the problem itself. For instance, in Example 2.24 it was clear that there had to be a global maximum.

So how can you tell when a point that satisfies the condition in Theorem 2.7 really is a constrained maximum or minimum? The answer is that it depends on the constraint function $g(x, y)$, together with any implicit constraints. It can be shown$^{12}$ that if the constraint equation $g(x, y) = c$ (plus any hidden constraints) describes a bounded set $B$ in $\mathbb{R}^2$, then the constrained maximum or minimum of $f(x, y)$ will occur either at a point $(x, y)$ satisfying $\nabla f(x, y) = \lambda \nabla g(x, y)$ or at a “boundary” point of the set $B$.

In Example 2.24 the constraint equation $2x + 2y = 20$ describes a line in $\mathbb{R}^2$, which by itself is not bounded. However, there are “hidden” constraints, due to the nature of the problem, namely $0 \le x, y \le 10$, which cause that line to be restricted to a line segment in $\mathbb{R}^2$ (including the endpoints of that line segment), which is bounded.

$^{11}$ See TAYLOR and MANN, § 6.8 for more detail.
$^{12}$ Again, see TAYLOR and MANN.

Example 2.25. For a rectangle whose perimeter is 20 m, use the Lagrange multiplier method to find the dimensions that will maximize the area.

Solution: As we saw in Example 2.24, with $x$ and $y$ representing the width and height, respectively, of the rectangle, this problem can be stated as:

Maximize: $f(x, y) = xy$
given: $g(x, y) = 2x + 2y = 20$

Then solving the equation $\nabla f(x, y) = \lambda \nabla g(x, y)$ for some $\lambda$ means solving the equations $\dfrac{\partial f}{\partial x} = \lambda \dfrac{\partial g}{\partial x}$ and $\dfrac{\partial f}{\partial y} = \lambda \dfrac{\partial g}{\partial y}$, namely:

$$y = 2\lambda, \qquad x = 2\lambda$$

The general idea is to solve for $\lambda$ in both equations, then set those expressions equal (since they both equal $\lambda$) to solve for $x$ and $y$. Doing this we get

$$\frac{y}{2} = \lambda = \frac{x}{2} \quad \Rightarrow \quad x = y,$$

so now substitute either of the expressions for $x$ or $y$ into the constraint equation to solve for $x$ and $y$:

$$20 = g(x, y) = 2x + 2y = 2x + 2x = 4x \quad \Rightarrow \quad x = 5 \quad \Rightarrow \quad y = 5$$

There must be a maximum area, since the minimum area is 0 and $f(5, 5) = 25 > 0$, so the point $(5, 5)$ that we found (called a constrained critical point) must be the constrained maximum.

∴ The maximum area occurs for a rectangle whose width and height both are 5 m.

Example 2.26. Find the points on the circle $x^2 + y^2 = 80$ which are closest to and farthest from the point $(1, 2)$.

Solution: The distance $d$ from any point $(x, y)$ to the point $(1, 2)$ is

$$d = \sqrt{(x-1)^2 + (y-2)^2},$$

and minimizing the distance is equivalent to minimizing the square of the distance. Thus the problem can be stated as:

Maximize (and minimize): $f(x, y) = (x-1)^2 + (y-2)^2$
given: $g(x, y) = x^2 + y^2 = 80$

Solving $\nabla f(x, y) = \lambda \nabla g(x, y)$ means solving the following equations:

$$2(x-1) = 2\lambda x, \qquad 2(y-2) = 2\lambda y$$

Note that $x \ne 0$ since otherwise we would get $-2 = 0$ in the first equation. Similarly, $y \ne 0$. So we can solve both equations for $\lambda$ as follows:

$$\frac{x-1}{x} = \lambda = \frac{y-2}{y} \quad \Rightarrow \quad xy - y = xy - 2x \quad \Rightarrow \quad y = 2x$$

[Figure 2.7.1: the circle $x^2 + y^2 = 80$, with the points $(4, 8)$, $(1, 2)$ and $(-4, -8)$ marked]

Substituting this into $g(x, y) = x^2 + y^2 = 80$ yields $5x^2 = 80$, so $x = \pm 4$. So the two constrained critical points are $(4, 8)$ and $(-4, -8)$. Since $f(4, 8) = 45$ and $f(-4, -8) = 125$, and since there must be points on the circle closest to and farthest from $(1, 2)$, then it must be the case that $(4, 8)$ is the point on the circle closest to $(1, 2)$ and $(-4, -8)$ is the farthest from $(1, 2)$ (see Figure 2.7.1).

Notice that since the constraint equation $x^2 + y^2 = 80$ describes a circle, which is a bounded set in $\mathbb{R}^2$, then we were guaranteed that the constrained critical points we found were indeed the constrained maximum and minimum.
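The result of Example 2.26 can be sanity-checked numerically without Lagrange multipliers. The short Python sketch below is only a brute-force check, not the method of this section: it samples the circle at equally spaced angles (the sample count `n` is an arbitrary assumption) and picks the sample points with the smallest and largest squared distance to $(1, 2)$.

```python
import math

def f(x, y):
    """Squared distance from (x, y) to the point (1, 2)."""
    return (x - 1.0) ** 2 + (y - 2.0) ** 2

r = math.sqrt(80.0)   # radius of the circle x^2 + y^2 = 80
n = 200_000           # number of sample angles (an arbitrary choice)
points = [(r * math.cos(2 * math.pi * k / n), r * math.sin(2 * math.pi * k / n))
          for k in range(n)]

closest = min(points, key=lambda p: f(*p))
farthest = max(points, key=lambda p: f(*p))
print(closest, f(*closest))     # near (4, 8), squared distance near 45
print(farthest, f(*farthest))   # near (-4, -8), squared distance near 125
```

A check like this only confirms the answer approximately; the Lagrange conditions are what identify the exact points.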

The Lagrange multiplier method can be extended to functions of three variables.

Example 2.27.

Maximize (and minimize): $f(x, y, z) = x + z$
given: $g(x, y, z) = x^2 + y^2 + z^2 = 1$

Solution: Solve the equation $\nabla f(x, y, z) = \lambda \nabla g(x, y, z)$:

$$1 = 2\lambda x, \qquad 0 = 2\lambda y, \qquad 1 = 2\lambda z$$

The first equation implies $\lambda \ne 0$ (otherwise we would have $1 = 0$), so we can divide by $\lambda$ in the second equation to get $y = 0$ and we can divide by $\lambda$ in the first and third equations to get $x = \frac{1}{2\lambda} = z$. Substituting these expressions into the constraint equation $g(x, y, z) = x^2 + y^2 + z^2 = 1$ yields the constrained critical points $\left(\frac{1}{\sqrt{2}}, 0, \frac{1}{\sqrt{2}}\right)$ and $\left(\frac{-1}{\sqrt{2}}, 0, \frac{-1}{\sqrt{2}}\right)$. Since $f\left(\frac{1}{\sqrt{2}}, 0, \frac{1}{\sqrt{2}}\right) > f\left(\frac{-1}{\sqrt{2}}, 0, \frac{-1}{\sqrt{2}}\right)$, and since the constraint equation $x^2 + y^2 + z^2 = 1$ describes a sphere (which is bounded) in $\mathbb{R}^3$, then $\left(\frac{1}{\sqrt{2}}, 0, \frac{1}{\sqrt{2}}\right)$ is the constrained maximum point and $\left(\frac{-1}{\sqrt{2}}, 0, \frac{-1}{\sqrt{2}}\right)$ is the constrained minimum point.

So far we have not attached any significance to the value of the Lagrange multiplier $\lambda$. We needed $\lambda$ only to find the constrained critical points, but made no use of its value. It turns out that $\lambda$ gives an approximation of the change in the value of the function $f(x, y)$ that we wish to maximize or minimize, when the constant $c$ in the constraint equation $g(x, y) = c$ is changed by 1.

For example, in Example 2.25 we showed that the constrained optimization problem

Maximize: $f(x, y) = xy$
given: $g(x, y) = 2x + 2y = 20$

had the solution $(x, y) = (5, 5)$, and that $\lambda = x/2 = y/2$. Thus, $\lambda = 2.5$. In a similar fashion we could show that the constrained optimization problem

Maximize: $f(x, y) = xy$
given: $g(x, y) = 2x + 2y = 21$

has the solution $(x, y) = (5.25, 5.25)$. So we see that the value of $f(x, y)$ at the constrained maximum increased from $f(5, 5) = 25$ to $f(5.25, 5.25) = 27.5625$, i.e. it increased by 2.5625 when we increased the value of $c$ in the constraint equation $g(x, y) = c$ from $c = 20$ to $c = 21$. Notice that $\lambda = 2.5$ is close to 2.5625, that is,

$$\lambda \approx \Delta f = f(\text{new max. pt}) - f(\text{old max. pt}).$$
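The comparison between $\lambda$ and $\Delta f$ above can be reproduced in a few lines of Python. The helper `constrained_max` is a hypothetical name introduced only for this sketch; it hard-codes the solution of the Lagrange conditions for this particular problem ($x = y = c/4$, hence $\lambda = c/8$) rather than solving them in general.

```python
def constrained_max(c):
    """Maximize f(x, y) = x*y subject to 2x + 2y = c via the Lagrange conditions.

    The conditions y = 2*lam and x = 2*lam give x = y, and the constraint
    then gives x = y = c/4, so lam = c/8.
    """
    x = y = c / 4.0
    lam = x / 2.0
    return x, y, lam, x * y

x0, y0, lam0, f0 = constrained_max(20.0)
x1, y1, lam1, f1 = constrained_max(21.0)
delta_f = f1 - f0
print(lam0, delta_f)  # 2.5 2.5625
```

For this problem the constrained maximum value is $c^2/16$, whose derivative with respect to $c$ is $c/8 = \lambda$, which is why $\lambda$ tracks the change in $f$ so closely for a unit change in $c$.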

Finally, note that solving the equation $\nabla f(x, y) = \lambda \nabla g(x, y)$ means having to solve a system of two (possibly nonlinear) equations in three unknowns, which as we have seen before, may not be possible to do. And the 3-variable case can get even more complicated. All of this somewhat restricts the usefulness of Lagrange’s method to relatively simple functions. Luckily there are many numerical methods for solving constrained optimization problems, though we will not discuss them here.$^{13}$

Exercises

A

1. Find the constrained maxima and minima of $f(x, y) = 2x + y$ given that $x^2 + y^2 = 4$.

2. Find the constrained maxima and minima of $f(x, y) = xy$ given that $x^2 + 3y^2 = 6$.

3. Find the points on the circle $x^2 + y^2 = 100$ which are closest to and farthest from the point $(2, 3)$.

B

4. Find the constrained maxima and minima of $f(x, y, z) = x + y^2 + 2z$ given that $4x^2 + 9y^2 - 36z^2 = 36$.

5. Find the volume of the largest rectangular parallelepiped that can be inscribed in the ellipsoid

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1.$$

$^{13}$ See BAZARAA, SHERALI and SHETTY.

3 Multiple Integrals

3.1 Double Integrals

In single-variable calculus, differentiation and integration are thought of as inverse operations. For instance, to integrate a function $f(x)$ it is necessary to find the antiderivative of $f$, that is, another function $F(x)$ whose derivative is $f(x)$. Is there a similar way of defining integration of real-valued functions of two or more variables? The answer is yes, as we will see shortly. Recall also that the definite integral of a nonnegative function $f(x) \ge 0$ represented the area “under” the curve $y = f(x)$. As we will now see, the double integral of a nonnegative real-valued function $f(x, y) \ge 0$ represents the volume “under” the surface $z = f(x, y)$.

Let $f(x, y)$ be a continuous function such that $f(x, y) \ge 0$ for all $(x, y)$ on the rectangle $R = \{(x, y) : a \le x \le b,\ c \le y \le d\}$ in $\mathbb{R}^2$. We will often write this as $R = [a, b] \times [c, d]$. For any number $x*$ in the interval $[a, b]$, slice the surface $z = f(x, y)$ with the plane $x = x*$ parallel to the $yz$-plane. Then the trace of the surface in that plane is the curve $f(x*, y)$, where $x*$ is fixed and only $y$ varies. The area $A$ under that curve (i.e. the area of the region between the curve and the $xy$-plane) as $y$ varies over the interval $[c, d]$ then depends only on the value of $x*$. So using the variable $x$ instead of $x*$, let $A(x)$ be that area (see Figure 3.1.1).

[Figure 3.1.1: The area $A(x)$ varies with $x$]

Then $A(x) = \int_c^d f(x, y)\,dy$ since we are treating $x$ as fixed, and only $y$ varies. This makes sense since for a fixed $x$ the function $f(x, y)$ is a continuous function of $y$ over the interval $[c, d]$, so we know that the area under the curve is the definite integral. The area $A(x)$ is a function of $x$, so by the “slice” or cross-section method from single-variable calculus we know that the volume $V$ of the solid under the surface $z = f(x, y)$ but above the $xy$-plane over the rectangle $R$ is the integral over $[a, b]$ of that cross-sectional area $A(x)$:

$$V = \int_a^b A(x)\,dx = \int_a^b \left[ \int_c^d f(x, y)\,dy \right] dx \tag{3.1}$$

We will always refer to this volume as “the volume under the surface”. The above expression uses what are called iterated integrals. First the function $f(x, y)$ is integrated as a function of $y$, treating the variable $x$ as a constant (this is called integrating with respect to $y$). That is what occurs in the “inner” integral between the square brackets in equation (3.1). This is the first iterated integral. Once that integration is performed, the result is then an expression involving only $x$, which can then be integrated with respect to $x$. That is what occurs in the “outer” integral above (the second iterated integral). The final result is then a number (the volume). This process of going through two iterations of integrals is called double integration, and the last expression in equation (3.1) is called a double integral.

Notice that integrating $f(x, y)$ with respect to $y$ is the inverse operation of taking the partial derivative of $f(x, y)$ with respect to $y$. Also, we could just as easily have taken the area of cross-sections under the surface which were parallel to the $xz$-plane, which would then depend only on the variable $y$, so that the volume $V$ would be

$$V = \int_c^d \left[ \int_a^b f(x, y)\,dx \right] dy. \tag{3.2}$$

It turns out that in general$^{1}$ the order of the iterated integrals does not matter. Also, we will usually discard the brackets and simply write

$$V = \int_c^d \int_a^b f(x, y)\,dx\,dy, \tag{3.3}$$

where it is understood that the fact that $dx$ is written before $dy$ means that the function $f(x, y)$ is first integrated with respect to $x$ using the “inner” limits of integration $a$ and $b$, and then the resulting function is integrated with respect to $y$ using the “outer” limits of integration $c$ and $d$. This order of integration can be changed if it is more convenient.

$^{1}$ Due to Fubini’s Theorem. See Ch. 18 in TAYLOR and MANN.

Example 3.1. Find the volume $V$ under the plane $z = 8x + 6y$ over the rectangle $R = [0, 1] \times [0, 2]$.

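Solution: We see that $f(x, y) = 8x + 6y \ge 0$ for $0 \le x \le 1$ and $0 \le y \le 2$, so:

$$\begin{aligned}
V &= \int_0^2 \int_0^1 (8x + 6y)\,dx\,dy \\
&= \int_0^2 \left( 4x^2 + 6xy \,\Big|_{x=0}^{x=1} \right) dy \\
&= \int_0^2 (4 + 6y)\,dy \\
&= 4y + 3y^2 \,\Big|_0^2 \\
&= 20
\end{aligned}$$

Suppose we had switched the order of integration. We can verify that we still get the same answer:

$$\begin{aligned}
V &= \int_0^1 \int_0^2 (8x + 6y)\,dy\,dx \\
&= \int_0^1 \left( 8xy + 3y^2 \,\Big|_{y=0}^{y=2} \right) dx \\
&= \int_0^1 (16x + 12)\,dx \\
&= 8x^2 + 12x \,\Big|_0^1 \\
&= 20
\end{aligned}$$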

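Example 3.2. Find the volume $V$ under the surface $z = e^{x+y}$ over the rectangle $R = [2, 3] \times [1, 2]$.

Solution: We know that $f(x, y) = e^{x+y} > 0$ for all $(x, y)$, so

$$\begin{aligned}
V &= \int_1^2 \int_2^3 e^{x+y}\,dx\,dy \\
&= \int_1^2 \left( e^{x+y} \,\Big|_{x=2}^{x=3} \right) dy \\
&= \int_1^2 \left( e^{y+3} - e^{y+2} \right) dy \\
&= e^{y+3} - e^{y+2} \,\Big|_1^2 \\
&= e^5 - e^4 - (e^4 - e^3) = e^5 - 2e^4 + e^3
\end{aligned}$$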

Recall that for a general function $f(x)$, the integral $\int_a^b f(x)\,dx$ represents the difference of the area below the curve $y = f(x)$ but above the $x$-axis when $f(x) \ge 0$, and the area above the curve but below the $x$-axis when $f(x) \le 0$. Similarly, the double integral of any continuous function $f(x, y)$ represents the difference of the volume below the surface $z = f(x, y)$ but above the $xy$-plane when $f(x, y) \ge 0$, and the volume above the surface but below the $xy$-plane when $f(x, y) \le 0$. Thus, our method of double integration by means of iterated integrals can be used to evaluate the double integral of any continuous function over a rectangle, regardless of whether $f(x, y) \ge 0$ or not.

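Example 3.3. Evaluate $\displaystyle\int_0^{2\pi} \int_0^{\pi} \sin(x + y)\,dx\,dy$.

Solution: Note that $f(x, y) = \sin(x + y)$ is both positive and negative over the rectangle $[0, \pi] \times [0, 2\pi]$. We can still evaluate the double integral:

$$\begin{aligned}
\int_0^{2\pi} \int_0^{\pi} \sin(x + y)\,dx\,dy &= \int_0^{2\pi} \left( -\cos(x + y) \,\Big|_{x=0}^{x=\pi} \right) dy \\
&= \int_0^{2\pi} \left( -\cos(y + \pi) + \cos y \right) dy \\
&= -\sin(y + \pi) + \sin y \,\Big|_0^{2\pi} \\
&= -\sin 3\pi + \sin 2\pi - (-\sin \pi + \sin 0) \\
&= 0
\end{aligned}$$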

Exercises

A

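For Exercises 1-4, find the volume under the surface $z = f(x, y)$ over the rectangle $R$.

1. $f(x, y) = 4xy$, $R = [0, 1] \times [0, 1]$

2. $f(x, y) = e^{x+y}$, $R = [0, 1] \times [-1, 1]$

3. $f(x, y) = x^3 + y^2$, $R = [0, 1] \times [0, 1]$

4. $f(x, y) = x^4 + xy + y^3$, $R = [1, 2] \times [0, 2]$

For Exercises 5-12, evaluate the given double integral.

5. $\displaystyle\int_0^1 \int_1^2 (1 - y)x^2\,dx\,dy$

6. $\displaystyle\int_0^1 \int_0^2 x(x + y)\,dx\,dy$

7. $\displaystyle\int_0^2 \int_0^1 (x + 2)\,dx\,dy$

8. $\displaystyle\int_{-1}^2 \int_{-1}^1 x(xy + \sin x)\,dx\,dy$

9. $\displaystyle\int_0^{\pi/2} \int_0^1 xy\cos(x^2 y)\,dx\,dy$

10. $\displaystyle\int_0^{\pi} \int_0^{\pi/2} \sin x \cos(y - \pi)\,dx\,dy$

11. $\displaystyle\int_0^2 \int_1^4 xy\,dx\,dy$

12. $\displaystyle\int_{-1}^1 \int_{-1}^2 1\,dx\,dy$

13. Let $M$ be a constant. Show that $\displaystyle\int_c^d \int_a^b M\,dx\,dy = M(d - c)(b - a)$.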


3.2 Double Integrals Over a General Region

In the previous section we got an idea of what a double integral over a rectangle represents. We can now define the double integral of a real-valued function $f(x, y)$ over more general regions in $\mathbb{R}^2$.

Suppose that we have a region $R$ in the $xy$-plane that is bounded on the left by the vertical line $x = a$, bounded on the right by the vertical line $x = b$ (where $a < b$), bounded below by a curve $y = g_1(x)$, and bounded above by a curve $y = g_2(x)$, as in Figure 3.2.1(a). We will assume that $g_1(x)$ and $g_2(x)$ do not intersect on the open interval $(a, b)$ (they could intersect at the endpoints $x = a$ and $x = b$, though).

[Figure 3.2.1: Double integral over a nonrectangular region $R$. (a) Vertical slice: $\int_a^b \int_{g_1(x)}^{g_2(x)} f(x, y)\,dy\,dx$. (b) Horizontal slice: $\int_c^d \int_{h_1(y)}^{h_2(y)} f(x, y)\,dx\,dy$]

Then using the slice method from the previous section, the double integral of a real-valued function $f(x, y)$ over the region $R$, denoted by $\iint_R f(x, y)\,dA$, is given by

$$\iint_R f(x, y)\,dA = \int_a^b \left[ \int_{g_1(x)}^{g_2(x)} f(x, y)\,dy \right] dx \tag{3.4}$$

This means that we take vertical slices in the region $R$ between the curves $y = g_1(x)$ and $y = g_2(x)$. The symbol $dA$ is sometimes called an area element or infinitesimal, with the $A$ signifying area. Note that $f(x, y)$ is first integrated with respect to $y$, with functions of $x$ as the limits of integration. This makes sense since the result of the first iterated integral will have to be a function of $x$ alone, which then allows us to take the second iterated integral with respect to $x$.

Similarly, if we have a region $R$ in the $xy$-plane that is bounded on the left by a curve $x = h_1(y)$, bounded on the right by a curve $x = h_2(y)$, bounded below by the horizontal line $y = c$, and bounded above by the horizontal line $y = d$ (where $c < d$), as in Figure 3.2.1(b) (assuming that $h_1(y)$ and $h_2(y)$ do not intersect on the open interval $(c, d)$), then taking horizontal slices gives

$$\iint_R f(x, y)\,dA = \int_c^d \left[ \int_{h_1(y)}^{h_2(y)} f(x, y)\,dx \right] dy \tag{3.5}$$

Notice that these definitions include the case when the region $R$ is a rectangle. Also, if $f(x, y) \ge 0$ for all $(x, y)$ in the region $R$, then $\iint_R f(x, y)\,dA$ is the volume under the surface $z = f(x, y)$ over the region $R$.

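Example 3.4. Find the volume $V$ under the plane $z = 8x + 6y$ over the region $R = \{(x, y) : 0 \le x \le 1,\ 0 \le y \le 2x^2\}$.

[Figure 3.2.2: the region $R$, bounded above by the curve $y = 2x^2$]

Solution: The region $R$ is shown in Figure 3.2.2. Using vertical slices we get:

$$\begin{aligned}
V &= \iint_R (8x + 6y)\,dA \\
&= \int_0^1 \left[ \int_0^{2x^2} (8x + 6y)\,dy \right] dx \\
&= \int_0^1 \left( 8xy + 3y^2 \,\Big|_{y=0}^{y=2x^2} \right) dx \\
&= \int_0^1 (16x^3 + 12x^4)\,dx \\
&= 4x^4 + \tfrac{12}{5}x^5 \,\Big|_0^1 = 4 + \tfrac{12}{5} = \tfrac{32}{5} = 6.4
\end{aligned}$$

[Figure 3.2.3: the same region $R$, bounded on the left by the curve $x = \sqrt{y/2}$]

We get the same answer using horizontal slices (see Figure 3.2.3):

$$\begin{aligned}
V &= \iint_R (8x + 6y)\,dA \\
&= \int_0^2 \left[ \int_{\sqrt{y/2}}^1 (8x + 6y)\,dx \right] dy \\
&= \int_0^2 \left( 4x^2 + 6xy \,\Big|_{x=\sqrt{y/2}}^{x=1} \right) dy \\
&= \int_0^2 \left( 4 + 6y - \left( 2y + \tfrac{6}{\sqrt{2}}\,y\sqrt{y} \right) \right) dy = \int_0^2 \left( 4 + 4y - 3\sqrt{2}\,y^{3/2} \right) dy \\
&= 4y + 2y^2 - \tfrac{6\sqrt{2}}{5}\,y^{5/2} \,\Big|_0^2 = 8 + 8 - \tfrac{6\sqrt{2}\sqrt{32}}{5} = 16 - \tfrac{48}{5} = \tfrac{32}{5} = 6.4
\end{aligned}$$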
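The vertical-slice formula (3.4) also translates directly into a numerical check. The Python sketch below is an illustration only: it replaces both the inner and the outer integral by a midpoint-rule sum, and the function name `iterated_integral` and the subdivision count `n` are assumptions made for this example.

```python
def iterated_integral(f, a, b, g1, g2, n=400):
    """Approximate the integral of f over a <= x <= b, g1(x) <= y <= g2(x)
    by applying the midpoint rule to the inner and then the outer integral."""
    dx = (b - a) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * dx            # midpoint of the i-th vertical slice
        lo, hi = g1(x), g2(x)             # limits of the inner integral depend on x
        dy = (hi - lo) / n
        inner = sum(f(x, lo + (j + 0.5) * dy) for j in range(n)) * dy
        total += inner * dx
    return total

# Example 3.4: f(x, y) = 8x + 6y over 0 <= x <= 1, 0 <= y <= 2x^2.
volume = iterated_integral(lambda x, y: 8 * x + 6 * y, 0.0, 1.0,
                           lambda x: 0.0, lambda x: 2 * x * x)
print(volume)   # close to 32/5 = 6.4
```

The same function handles a rectangle by passing constant functions for `g1` and `g2`.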


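Example 3.5. Find the volume $V$ of the solid bounded by the three coordinate planes and the plane $2x + y + 4z = 4$.

[Figure 3.2.4: (a) the solid, cut off by the plane $2x + y + 4z = 4$ with intercepts $(2, 0, 0)$, $(0, 4, 0)$ and $(0, 0, 1)$; (b) the region $R$ in the $xy$-plane, bounded by the line $y = -2x + 4$]

Solution: The solid is shown in Figure 3.2.4(a) with a typical vertical slice. The volume $V$ is given by $\iint_R f(x, y)\,dA$, where $f(x, y) = z = \frac{1}{4}(4 - 2x - y)$ and the region $R$, shown in Figure 3.2.4(b), is $R = \{(x, y) : 0 \le x \le 2,\ 0 \le y \le -2x + 4\}$. Using vertical slices in $R$ gives

$$\begin{aligned}
V &= \iint_R \tfrac{1}{4}(4 - 2x - y)\,dA \\
&= \int_0^2 \left[ \int_0^{-2x+4} \tfrac{1}{4}(4 - 2x - y)\,dy \right] dx \\
&= \int_0^2 \left( -\tfrac{1}{8}(4 - 2x - y)^2 \,\Big|_{y=0}^{y=-2x+4} \right) dx \\
&= \int_0^2 \tfrac{1}{8}(4 - 2x)^2\,dx \\
&= -\tfrac{1}{48}(4 - 2x)^3 \,\Big|_0^2 = \tfrac{64}{48} = \tfrac{4}{3}
\end{aligned}$$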

For a general region $R$, which may not be one of the types of regions we have considered so far, the double integral $\iint_R f(x, y)\,dA$ is defined as follows. Assume that $f(x, y)$ is a nonnegative real-valued function and that $R$ is a bounded region in $\mathbb{R}^2$, so it can be enclosed in some rectangle $[a, b] \times [c, d]$. Then divide that rectangle into a grid of subrectangles. Only consider the subrectangles that are enclosed completely within the region $R$, as shown by the shaded subrectangles in Figure 3.2.5(a). In any such subrectangle $[x_i, x_{i+1}] \times [y_j, y_{j+1}]$, pick a point $(x_{i*}, y_{j*})$. Then the volume under the surface $z = f(x, y)$ over that subrectangle is approximately $f(x_{i*}, y_{j*})\,\Delta x_i\,\Delta y_j$, where $\Delta x_i = x_{i+1} - x_i$, $\Delta y_j = y_{j+1} - y_j$, and $f(x_{i*}, y_{j*})$ is the height and $\Delta x_i\,\Delta y_j$ is the base area of a parallelepiped, as shown in Figure 3.2.5(b). Then the total volume under the surface is approximately the sum of the volumes of all such parallelepipeds, namely

$$\sum_j \sum_i f(x_{i*}, y_{j*})\,\Delta x_i\,\Delta y_j, \tag{3.6}$$

where the summation occurs over the indices of the subrectangles inside $R$. If we take smaller and smaller subrectangles, so that the length of the largest diagonal of the subrectangles goes to 0, then the subrectangles begin to fill more and more of the region $R$, and so the above sum approaches the actual volume under the surface $z = f(x, y)$ over the region $R$. We then define $\iint_R f(x, y)\,dA$ as the limit of that double summation (the limit is taken over all subdivisions of the rectangle $[a, b] \times [c, d]$ as the largest diagonal of the subrectangles goes to 0).
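The construction behind the sum (3.6) can be tried out directly. The following Python sketch is an illustration under stated assumptions: the names `inner_riemann_sum` and `cell_in_region` are hypothetical, the sample point in each subrectangle is taken to be its centre, and the region is the one from Example 3.4, whose exact volume is 6.4.

```python
def inner_riemann_sum(f, inside, a, b, c, d, n):
    """Sum f(x*, y*) * dx * dy over the grid cells that lie entirely in R.

    `inside(x0, x1, y0, y1)` must return True when the whole cell
    [x0, x1] x [y0, y1] is contained in the region R.
    """
    dx, dy = (b - a) / n, (d - c) / n
    total = 0.0
    for i in range(n):
        x0, x1 = a + i * dx, a + (i + 1) * dx
        for j in range(n):
            y0, y1 = c + j * dy, c + (j + 1) * dy
            if inside(x0, x1, y0, y1):
                # Sample point (x*, y*): the centre of the cell.
                total += f(0.5 * (x0 + x1), 0.5 * (y0 + y1)) * dx * dy
    return total

# Region of Example 3.4: 0 <= x <= 1, 0 <= y <= 2x^2, enclosed in [0, 1] x [0, 2].
# Since 2x^2 is increasing on [0, 1], a cell is inside R exactly when y1 <= 2*x0^2.
def cell_in_region(x0, x1, y0, y1):
    return y1 <= 2.0 * x0 * x0

f = lambda x, y: 8 * x + 6 * y
for n in (50, 100, 200, 400):
    print(n, inner_riemann_sum(f, cell_in_region, 0.0, 1.0, 0.0, 2.0, n))
```

Because only subrectangles completely inside $R$ are counted and $f \ge 0$ here, each printed value stays below 6.4 and moves toward it as the grid is refined.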

[Figure 3.2.5: Double integral over a general region $R$. (a) Subrectangles inside the region $R$. (b) Parallelepiped over a subrectangle, with volume $f(x_{i*}, y_{j*})\,\Delta x_i\,\Delta y_j$]

A similar definition can be made for a function $f(x, y)$ that is not necessarily always nonnegative: just replace each mention of volume by the negative volume in the description above when $f(x, y) < 0$. In the case of a region of the type shown in Figure 3.2.1, using the definition of the Riemann integral from single-variable calculus, our definition of $\iint_R f(x, y)\,dA$ reduces to a sequence of two iterated integrals.

Finally, the region $R$ does not have to be bounded. We can evaluate improper double integrals (i.e. over an unbounded region, or over a region which contains points where the function $f(x, y)$ is not defined) as a sequence of iterated improper single-variable integrals.


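Example 3.6. Evaluate $\displaystyle\int_1^{\infty} \int_0^{1/x^2} 2y\,dy\,dx$.

Solution:

$$\begin{aligned}
\int_1^{\infty} \int_0^{1/x^2} 2y\,dy\,dx &= \int_1^{\infty} \left( y^2 \,\Big|_{y=0}^{y=1/x^2} \right) dx \\
&= \int_1^{\infty} x^{-4}\,dx = -\tfrac{1}{3}x^{-3} \,\Big|_1^{\infty} = 0 - \left(-\tfrac{1}{3}\right) = \tfrac{1}{3}
\end{aligned}$$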

Exercises

A

For Exercises 1-8, evaluate the given double integral.

1. $\displaystyle\int_0^1 \int_{\sqrt{x}}^1 24x^2 y\,dy\,dx$

2. $\displaystyle\int_0^{\pi} \int_0^y \sin x\,dx\,dy$

3. $\displaystyle\int_1^2 \int_0^{\ln x} 4x\,dy\,dx$

4. $\displaystyle\int_0^2 \int_0^{2y} e^{y^2}\,dx\,dy$

5. $\displaystyle\int_0^{\pi/2} \int_0^y \cos x \sin y\,dx\,dy$

6. $\displaystyle\int_0^{\infty} \int_0^{\infty} xy\,e^{-(x^2 + y^2)}\,dx\,dy$

7. $\displaystyle\int_0^2 \int_0^y 1\,dx\,dy$

8. $\displaystyle\int_0^1 \int_0^{x^2} 2\,dy\,dx$

9. Find the volume $V$ of the solid bounded by the three coordinate planes and the plane $x + y + z = 1$.

10. Find the volume $V$ of the solid bounded by the three coordinate planes and the plane $3x + 2y + 5z = 6$.

B

11. Explain why the double integral $\iint_R 1\,dA$ gives the area of the region $R$. For simplicity, you can assume that $R$ is a region of the type shown in Figure 3.2.1(a).

C

12. Prove that the volume of a tetrahedron with mutually perpendicular adjacent sides of lengths $a$, $b$, and $c$, as in Figure 3.2.6, is $\frac{abc}{6}$. (Hint: Mimic Example 3.5, and recall from Section 1.5 how three noncollinear points determine a plane.)

[Figure 3.2.6: a tetrahedron with mutually perpendicular adjacent sides of lengths $a$, $b$ and $c$]

13. Show how Exercise 12 can be used to solve Exercise 10.


3.3 Triple Integrals

Our definition of a double integral of a real-valued function $f(x, y)$ over a region $R$ in $\mathbb{R}^2$ can be extended to define a triple integral of a real-valued function $f(x, y, z)$ over a solid $S$ in $\mathbb{R}^3$. We simply proceed as before: the solid $S$ can be enclosed in some rectangular parallelepiped, which is then divided into subparallelepipeds. In each subparallelepiped inside $S$, with sides of lengths $\Delta x$, $\Delta y$ and $\Delta z$, pick a point $(x_*, y_*, z_*)$. Then define the triple integral of $f(x, y, z)$ over $S$, denoted by $\iiint_S f(x, y, z)\,dV$, by

$$\iiint_S f(x, y, z)\,dV = \lim \sum \sum \sum f(x_*, y_*, z_*)\,\Delta x\,\Delta y\,\Delta z, \tag{3.7}$$

where the limit is over all divisions of the rectangular parallelepiped enclosing $S$ into subparallelepipeds whose largest diagonal is going to 0, and the triple summation is over all the subparallelepipeds inside $S$. It can be shown that this limit does not depend on the choice of the rectangular parallelepiped enclosing $S$. The symbol $dV$ is often called the volume element.

Physically, what does the triple integral represent? We saw that a double integral could be thought of as the volume under a two-dimensional surface. It turns out that the triple integral simply generalizes this idea: it can be thought of as representing the hypervolume under a three-dimensional hypersurface $w = f(x, y, z)$ whose graph lies in $\mathbb{R}^4$. In general, the word “volume” is often used as a general term to signify the same concept for any $n$-dimensional object (e.g. length in $\mathbb{R}^1$, area in $\mathbb{R}^2$). It may be hard to get a grasp on the concept of the “volume” of a four-dimensional object, but at least we now know how to calculate that volume!

In the case where $S$ is a rectangular parallelepiped $[x_1, x_2] \times [y_1, y_2] \times [z_1, z_2]$, that is, $S = \{(x, y, z) : x_1 \le x \le x_2,\ y_1 \le y \le y_2,\ z_1 \le z \le z_2\}$, the triple integral is a sequence of three iterated integrals, namely

$$\iiint_S f(x, y, z)\,dV = \int_{z_1}^{z_2} \int_{y_1}^{y_2} \int_{x_1}^{x_2} f(x, y, z)\,dx\,dy\,dz, \tag{3.8}$$

where the order of integration does not matter. This is the simplest case.

A more complicated case is where $S$ is a solid which is bounded below by a surface $z = g_1(x, y)$, bounded above by a surface $z = g_2(x, y)$, $y$ is bounded between two curves $h_1(x)$ and $h_2(x)$, and $x$ varies between $a$ and $b$. Then

$$\iiint_S f(x, y, z)\,dV = \int_a^b \int_{h_1(x)}^{h_2(x)} \int_{g_1(x, y)}^{g_2(x, y)} f(x, y, z)\,dz\,dy\,dx. \tag{3.9}$$

Notice in this case that the first iterated integral will result in a function of $x$ and $y$ (since its limits of integration are functions of $x$ and $y$), which then leaves you with a double integral of a type that we learned how to evaluate in Section 3.2. There are, of course, many variations on this case (for example, changing the roles of the variables $x$, $y$, $z$), so as you can probably tell, triple integrals can be quite tricky. At this point, just learning how to evaluate a triple integral, regardless of what it represents, is the most important thing. We will see some other ways in which triple integrals are used later in the text.

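Example 3.7. Evaluate $\displaystyle\int_0^3 \int_0^2 \int_0^1 (xy + z)\,dx\,dy\,dz$.

Solution:

$$\begin{aligned}
\int_0^3 \int_0^2 \int_0^1 (xy + z)\,dx\,dy\,dz &= \int_0^3 \int_0^2 \left( \tfrac{1}{2}x^2 y + xz \,\Big|_{x=0}^{x=1} \right) dy\,dz \\
&= \int_0^3 \int_0^2 \left( \tfrac{1}{2}y + z \right) dy\,dz \\
&= \int_0^3 \left( \tfrac{1}{4}y^2 + yz \,\Big|_{y=0}^{y=2} \right) dz \\
&= \int_0^3 (1 + 2z)\,dz \\
&= z + z^2 \,\Big|_0^3 = 12
\end{aligned}$$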

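Example 3.8. Evaluate $\displaystyle\int_0^1 \int_0^{1-x} \int_0^{2-x-y} (x + y + z)\,dz\,dy\,dx$.

Solution:

$$\begin{aligned}
\int_0^1 \int_0^{1-x} \int_0^{2-x-y} (x + y + z)\,dz\,dy\,dx &= \int_0^1 \int_0^{1-x} \left( (x + y)z + \tfrac{1}{2}z^2 \,\Big|_{z=0}^{z=2-x-y} \right) dy\,dx \\
&= \int_0^1 \int_0^{1-x} \left( (x + y)(2 - x - y) + \tfrac{1}{2}(2 - x - y)^2 \right) dy\,dx \\
&= \int_0^1 \int_0^{1-x} \left( 2 - \tfrac{1}{2}x^2 - xy - \tfrac{1}{2}y^2 \right) dy\,dx \\
&= \int_0^1 \left( 2y - \tfrac{1}{2}x^2 y - \tfrac{1}{2}xy^2 - \tfrac{1}{6}y^3 \,\Big|_{y=0}^{y=1-x} \right) dx \\
&= \int_0^1 \left( \tfrac{11}{6} - 2x + \tfrac{1}{6}x^3 \right) dx \\
&= \tfrac{11}{6}x - x^2 + \tfrac{1}{24}x^4 \,\Big|_0^1 = \tfrac{7}{8}
\end{aligned}$$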
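Both examples can be checked numerically with a small routine modelled on formula (3.9). The Python sketch below is an illustration, not a method from the text: it applies the midpoint rule in each variable, and the name `triple_integral` and the subdivision count `n` are assumptions of this example.

```python
def triple_integral(f, a, b, h1, h2, g1, g2, n=30):
    """Midpoint-rule approximation of the iterated integral
    int_a^b int_{h1(x)}^{h2(x)} int_{g1(x,y)}^{g2(x,y)} f(x, y, z) dz dy dx."""
    dx = (b - a) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * dx
        y_lo, y_hi = h1(x), h2(x)             # y-limits depend on x
        dy = (y_hi - y_lo) / n
        for j in range(n):
            y = y_lo + (j + 0.5) * dy
            z_lo, z_hi = g1(x, y), g2(x, y)   # z-limits depend on x and y
            dz = (z_hi - z_lo) / n
            for k in range(n):
                z = z_lo + (k + 0.5) * dz
                total += f(x, y, z) * dx * dy * dz
    return total

# Example 3.7: the box 0 <= x <= 1, 0 <= y <= 2, 0 <= z <= 3; exact value 12.
ex37 = triple_integral(lambda x, y, z: x * y + z, 0.0, 1.0,
                       lambda x: 0.0, lambda x: 2.0,
                       lambda x, y: 0.0, lambda x, y: 3.0)
# Example 3.8: variable limits; exact value 7/8.
ex38 = triple_integral(lambda x, y, z: x + y + z, 0.0, 1.0,
                       lambda x: 0.0, lambda x: 1.0 - x,
                       lambda x, y: 0.0, lambda x, y: 2.0 - x - y)
print(ex37, ex38)
```

Example 3.7 is integrated here in the order $dz\,dy\,dx$ rather than $dx\,dy\,dz$; since the solid is a box, the order does not affect the value.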


Note that the volume $V$ of a solid in $\mathbb{R}^3$ is given by

$$V = \iiint_S 1\,dV. \tag{3.10}$$

Since the function being integrated is the constant 1, then the above triple integral reduces to a double integral of the types that we considered in the previous section if the solid is bounded above by some surface $z = f(x, y)$ and bounded below by the $xy$-plane $z = 0$. There are many other possibilities. For example, the solid could be bounded below and above by surfaces $z = g_1(x, y)$ and $z = g_2(x, y)$, respectively, with $y$ bounded between two curves $h_1(x)$ and $h_2(x)$, and $x$ varies between $a$ and $b$. Then

$$V = \iiint_S 1\,dV = \int_a^b \int_{h_1(x)}^{h_2(x)} \int_{g_1(x, y)}^{g_2(x, y)} 1\,dz\,dy\,dx = \int_a^b \int_{h_1(x)}^{h_2(x)} \left( g_2(x, y) - g_1(x, y) \right) dy\,dx$$

just like in equation (3.9). See Exercise 10 for an example.

Exercises

A

For Exercises 1-8, evaluate the given triple integral.

1. $\displaystyle\int_0^3 \int_0^2 \int_0^1 xyz\,dx\,dy\,dz$

2. $\displaystyle\int_0^1 \int_0^x \int_0^y xyz\,dz\,dy\,dx$

3. $\displaystyle\int_0^{\pi} \int_0^x \int_0^{xy} x^2 \sin z\,dz\,dy\,dx$

4. $\displaystyle\int_0^1 \int_0^z \int_0^y z e^{y^2}\,dx\,dy\,dz$

5. $\displaystyle\int_1^e \int_0^y \int_0^{1/y} x^2 z\,dx\,dz\,dy$

6. $\displaystyle\int_1^2 \int_0^{y^2} \int_0^{z^2} yz\,dx\,dz\,dy$

7. $\displaystyle\int_1^2 \int_2^4 \int_0^3 1\,dx\,dy\,dz$

8. $\displaystyle\int_0^1 \int_0^{1-x} \int_0^{1-x-y} 1\,dz\,dy\,dx$

9. Let $M$ be a constant. Show that $\displaystyle\int_{z_1}^{z_2} \int_{y_1}^{y_2} \int_{x_1}^{x_2} M\,dx\,dy\,dz = M(z_2 - z_1)(y_2 - y_1)(x_2 - x_1)$.

B

10. Find the volume $V$ of the solid $S$ bounded by the three coordinate planes, bounded above by the plane $x + y + z = 2$, and bounded below by the plane $z = x + y$.

C

11. Show that $\displaystyle\int_a^b \int_a^z \int_a^y f(x)\,dx\,dy\,dz = \int_a^b \frac{(b - x)^2}{2} f(x)\,dx$. (Hint: Think of how changing the order of integration in the triple integral changes the limits of integration.)


3.4 Numerical Approximation of Multiple Integrals

As you have seen, calculating multiple integrals is tricky even for simple functions and regions. For complicated functions, it may not be possible to evaluate one of the iterated integrals in a simple closed form. Luckily there are numerical methods for approximating the value of a multiple integral. The method we will discuss is called the Monte Carlo method. The idea behind it is based on the concept of the average value of a function, which you learned in single-variable calculus. Recall that for a continuous function $f(x)$, the average value $\bar{f}$ of $f$ over an interval $[a, b]$ is defined as

$$\bar{f} = \frac{1}{b-a} \int_a^b f(x)\,dx . \tag{3.11}$$

The quantity $b - a$ is the length of the interval $[a, b]$, which can be thought of as the “volume” of the interval. Applying the same reasoning to functions of two or three variables, we define the average value of $f(x, y)$ over a region $R$ to be

$$\bar{f} = \frac{1}{A(R)} \iint_R f(x, y)\,dA , \tag{3.12}$$

where $A(R)$ is the area of the region $R$, and we define the average value of $f(x, y, z)$ over a solid $S$ to be

$$\bar{f} = \frac{1}{V(S)} \iiint_S f(x, y, z)\,dV , \tag{3.13}$$

where $V(S)$ is the volume of the solid $S$. Thus, for example, we have

$$\iint_R f(x, y)\,dA = A(R)\,\bar{f} . \tag{3.14}$$

The average value of $f(x, y)$ over $R$ can be thought of as representing the sum of all the values of $f$ divided by the number of points in $R$. Unfortunately there are an infinite number (in fact, uncountably many) points in any region, i.e. they cannot be listed in a discrete sequence. But what if we took a very large number $N$ of random points in the region $R$ (which can be generated by a computer) and then took the average of the values of $f$ for those points, and used that average as the value of $\bar{f}$? This is exactly what the Monte Carlo method does. So in formula (3.14) the approximation we get is

$$\iint_R f(x, y)\,dA \approx A(R)\,\bar{f} \pm A(R) \sqrt{\frac{\overline{f^2} - (\bar{f})^2}{N}} , \tag{3.15}$$

where

$$\bar{f} = \frac{\sum_{i=1}^{N} f(x_i, y_i)}{N} \quad\text{and}\quad \overline{f^2} = \frac{\sum_{i=1}^{N} (f(x_i, y_i))^2}{N} , \tag{3.16}$$

with the sums taken over the $N$ random points $(x_1, y_1), \ldots, (x_N, y_N)$. The $\pm$ “error term” in formula (3.15) does not really provide hard bounds on the approximation. It represents a single standard deviation from the expected value of the integral. That is, it provides a likely bound on the error. Due to its use of random points, the Monte Carlo method is an example of a probabilistic method (as opposed to deterministic methods such as Newton’s method, which use a specific formula for generating points).

For example, we can use formula (3.15) to approximate the volume $V$ under the plane $z = 8x + 6y$ over the rectangle $R = [0, 1] \times [0, 2]$. In Example 3.1 in Section 3.1, we showed that the actual volume is 20. Below is a code listing (montecarlo.java) for a Java program that calculates the volume, using a number of points $N$ that is passed on the command line as a parameter.

    //Program to approximate the double integral of f(x,y)=8x+6y
    //over the rectangle [0,1]x[0,2].
    public class montecarlo {
        public static void main(String[] args) {
            //Get the number N of random points as a command-line parameter
            int N = Integer.parseInt(args[0]);
            double x = 0;     //x-coordinate of a random point
            double y = 0;     //y-coordinate of a random point
            double f = 0.0;   //Value of f at a random point
            double mf = 0.0;  //Mean of the values of f
            double mf2 = 0.0; //Mean of the values of f^2
            for (int i=0;i<N;i++) {   //Get the random coordinates
                x = Math.random();    //x is between 0 and 1
                y = 2*Math.random();  //y is between 0 and 2
                f = 8*x + 6*y;        //Value of the function
                mf = mf + f;          //Add to the sum of the f values
                mf2 = mf2 + f*f;      //Add to the sum of the f^2 values
            }
            mf = mf/N;    //Compute the mean of the f values
            mf2 = mf2/N;  //Compute the mean of the f^2 values
            System.out.println("N = " + N + ": integral = " + vol()*mf + " +/- "
                + vol()*Math.sqrt((mf2 - Math.pow(mf,2))/N)); //Print the result
        }
        //The volume of the rectangle [0,1]x[0,2]
        public static double vol() {
            return 1*2;
        }
    }

Listing 3.1 Program listing for montecarlo.java

The results of running this program with various numbers of random points (e.g. java

montecarlo 100) are shown below:


N = 10: 19.36543087722646 +/- 2.7346060413546147

N = 100: 21.334419561385353 +/- 0.7547037194998519

N = 1000: 19.807662237526227 +/- 0.26701709691370235

N = 10000: 20.080975812043256 +/- 0.08378816229769506

N = 100000: 20.009403854556716 +/- 0.026346782289498317

N = 1000000: 20.000866994982314 +/- 0.008321168748642816

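As you can see, the approximation is fairly good. As $N \to \infty$, it can be shown that the Monte Carlo approximation converges to the actual volume (with an error on the order of $O(1/\sqrt{N})$, in computational complexity terminology). This matches the output above, where the error term shrinks by a factor of roughly $\sqrt{10}$ each time $N$ grows by a factor of 10.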

In the above example the region $R$ was a rectangle. To use the Monte Carlo method for a nonrectangular (bounded) region $R$, only a slight modification is needed. Pick a rectangle $\tilde{R}$ that encloses $R$, and generate random points in that rectangle as before. Then use those points in the calculation of $\bar{f}$ only if they are inside $R$. There is no need to calculate the area of $R$ for formula (3.15) in this case, since the exclusion of points not inside $R$ allows you to use the area of the rectangle $\tilde{R}$ instead, similar to before (the excluded points contribute 0 to the sums, while $N$ still counts every generated point).

For instance, in Example 3.4 we showed that the volume under the surface $z = 8x + 6y$ over the nonrectangular region $R = \{(x, y) : 0 \le x \le 1,\ 0 \le y \le 2x^2\}$ is 6.4. Since the rectangle $\tilde{R} = [0, 1] \times [0, 2]$ contains $R$, we can use the same program as before, with the only change being a check to see if $y < 2x^2$ for a random point $(x, y)$ in $[0, 1] \times [0, 2]$. Listing 3.2 below contains the code (montecarlo2.java):

    //Program to approximate the double integral of f(x,y)=8x+6y over the
    //region bounded by x=0, x=1, y=0, and y=2x^2
    public class montecarlo2 {
        public static void main(String[] args) {
            //Get the number N of random points as a command-line parameter
            int N = Integer.parseInt(args[0]);
            double x = 0;     //x-coordinate of a random point
            double y = 0;     //y-coordinate of a random point
            double f = 0.0;   //Value of f at a random point
            double mf = 0.0;  //Mean of the values of f
            double mf2 = 0.0; //Mean of the values of f^2
            for (int i=0;i<N;i++) {   //Get the random coordinates
                x = Math.random();    //x is between 0 and 1
                y = 2*Math.random();  //y is between 0 and 2
                if (y < 2*Math.pow(x,2)) {  //The point is in the region
                    f = 8*x + 6*y;    //Value of the function
                    mf = mf + f;      //Add to the sum of the f values
                    mf2 = mf2 + f*f;  //Add to the sum of the f^2 values
                }
            }
            mf = mf/N;    //Compute the mean of the f values
            mf2 = mf2/N;  //Compute the mean of the f^2 values
            System.out.println("N = " + N + ": integral = " + vol()*mf +
                " +/- " + vol()*Math.sqrt((mf2 - Math.pow(mf,2))/N));
        }
        //The volume of the rectangle [0,1]x[0,2]
        public static double vol() {
            return 1*2;
        }
    }

Listing 3.2 Program listing for montecarlo2.java

The results of running the program with various numbers of random points (e.g. java

montecarlo2 1000) are shown below:

N = 10: integral = 6.95747529014894 +/- 2.9185131565120592

N = 100: integral = 6.3149056229650355 +/- 0.9549009662159909

N = 1000: integral = 6.477032813858756 +/- 0.31916837260973624

N = 10000: integral = 6.349975080015089 +/- 0.10040086346895105

N = 100000: integral = 6.440184132811864 +/- 0.03200476870881392

N = 1000000: integral = 6.417050897922222 +/- 0.01009454409789472

To use the Monte Carlo method to evaluate triple integrals, you will need to generate

random triples (x, y, z) in a parallelepiped, instead of random pairs (x, y) in a rectangle, and

use the volume of the parallelepiped instead of the area of a rectangle in formula (3.15) (see

Exercise 2). For a more detailed discussion of numerical integration methods, see PRESS et

al.
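The step from pairs to triples can be sketched as follows. This is only an illustration under stated assumptions: the integrand $f(x, y, z) = x + y + z$, the box $[0, 1] \times [0, 1] \times [0, 1]$, the class name MonteCarloTriple, and the fixed seed are all choices made here, not taken from the text. The exact value of this particular integral is $\frac{3}{2}$, which gives something to compare against.

```java
import java.util.Random;

// Sketch: Monte Carlo estimate of the triple integral of f(x,y,z) = x + y + z
// over the box [0,1] x [0,1] x [0,1]. The integrand, class name and seed are
// illustrative assumptions, not part of the original listings.
public class MonteCarloTriple {

    // Returns {estimate, one-standard-deviation error term}, following
    // formula (3.15) with the volume of the box in place of the area A(R).
    public static double[] estimate(int n, long seed) {
        Random rng = new Random(seed);      // seeded so runs are repeatable
        double volume = 1.0 * 1.0 * 1.0;    // volume of the enclosing box
        double sum = 0.0;                   // running sum of f values
        double sumSq = 0.0;                 // running sum of f^2 values
        for (int i = 0; i < n; i++) {
            double x = rng.nextDouble();    // x is between 0 and 1
            double y = rng.nextDouble();    // y is between 0 and 1
            double z = rng.nextDouble();    // z is between 0 and 1
            double f = x + y + z;
            sum += f;
            sumSq += f * f;
        }
        double mean = sum / n;              // mean of the f values
        double meanSq = sumSq / n;          // mean of the f^2 values
        double err = volume * Math.sqrt(Math.max(0.0, meanSq - mean * mean) / n);
        return new double[] { volume * mean, err };
    }

    public static void main(String[] args) {
        int n = (args.length > 0) ? Integer.parseInt(args[0]) : 100000;
        double[] r = estimate(n, 42L);
        System.out.println("N = " + n + ": integral = " + r[0] + " +/- " + r[1]);
    }
}
```

For a solid that is not a box, the same rejection check used in montecarlo2.java would apply: generate the triple in an enclosing box and add to the sums only when the point lies inside the solid.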

Exercises

C

1. Write a program that uses the Monte Carlo method to approximate the double integral $\iint_R e^{xy}\,dA$, where $R = [0, 1] \times [0, 1]$. Show the program output for $N = 10$, 100, 1000, 10000, 100000 and 1000000 random points.

2. Write a program that uses the Monte Carlo method to approximate the triple integral $\iiint_S e^{xyz}\,dV$, where $S = [0, 1] \times [0, 1] \times [0, 1]$. Show the program output for $N = 10$, 100, 1000, 10000, 100000 and 1000000 random points.

3. Repeat Exercise 1 with the region $R = \{(x, y) : -1 \le x \le 1,\ 0 \le y \le x^2\}$.

4. Repeat Exercise 2 with the solid $S = \{(x, y, z) : 0 \le x \le 1,\ 0 \le y \le 1,\ 0 \le z \le 1 - x - y\}$.

5. Use the Monte Carlo method to approximate the volume of a sphere of radius 1.

6. Use the Monte Carlo method to approximate the volume of the ellipsoid $\frac{x^2}{9} + \frac{y^2}{4} + \frac{z^2}{1} = 1$.


3.5 Change of Variables in Multiple Integrals

Given the difficulty of evaluating multiple integrals, the reader may be wondering if it is possible to simplify those integrals using a suitable substitution for the variables. The answer is yes, though it is a bit more complicated than the substitution method which you learned in single-variable calculus.

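Recall that if you are given, for example, the definite integral

$$\int_1^2 x^3 \sqrt{x^2 - 1}\,dx ,$$

then you would make the substitution

$$u = x^2 - 1 \ \Rightarrow\ x^2 = u + 1 , \qquad du = 2x\,dx$$

which changes the limits of integration

$$x = 1 \ \Rightarrow\ u = 0 , \qquad x = 2 \ \Rightarrow\ u = 3$$

so that we get

$$\int_1^2 x^3 \sqrt{x^2 - 1}\,dx = \int_1^2 \tfrac{1}{2} x^2 \cdot 2x \sqrt{x^2 - 1}\,dx = \int_0^3 \tfrac{1}{2}(u + 1)\sqrt{u}\,du = \frac{1}{2} \int_0^3 \left( u^{3/2} + u^{1/2} \right) du ,$$

which can be easily integrated to give $\frac{14\sqrt{3}}{5}$.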
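As a quick numerical sanity check of this substitution, the sketch below approximates both the original integral in $x$ and the substituted integral in $u$ with a midpoint Riemann sum and compares them with $\frac{14\sqrt{3}}{5}$. The class name SubstitutionCheck, the helper midpoint, and the number of subintervals are assumptions made for this illustration.

```java
import java.util.function.DoubleUnaryOperator;

// Sketch: compare the integral of x^3*sqrt(x^2-1) over [1,2] with the
// substituted integral of (1/2)(u+1)*sqrt(u) over [0,3].
// Names and step counts are illustrative assumptions.
public class SubstitutionCheck {

    // Midpoint Riemann sum of f over [a,b] with n equal subintervals
    public static double midpoint(DoubleUnaryOperator f, double a, double b, int n) {
        double h = (b - a) / n;
        double sum = 0.0;
        for (int i = 0; i < n; i++) {
            sum += f.applyAsDouble(a + (i + 0.5) * h);
        }
        return sum * h;
    }

    // Integral in the original variable x
    public static double inX(int n) {
        return midpoint(x -> x * x * x * Math.sqrt(x * x - 1.0), 1.0, 2.0, n);
    }

    // Integral after the substitution u = x^2 - 1
    public static double inU(int n) {
        return midpoint(u -> 0.5 * (u + 1.0) * Math.sqrt(u), 0.0, 3.0, n);
    }

    public static void main(String[] args) {
        int n = 200000;
        System.out.println("integral in x   = " + inX(n));
        System.out.println("integral in u   = " + inU(n));
        System.out.println("14*sqrt(3)/5    = " + 14.0 * Math.sqrt(3.0) / 5.0);
    }
}
```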

Let us take a different look at what happened when we did that substitution, which will give some motivation for how substitution works in multiple integrals. First, we let $u = x^2 - 1$. On the interval of integration $[1, 2]$, the function $x \mapsto x^2 - 1$ is strictly increasing (and maps $[1, 2]$ onto $[0, 3]$) and hence has an inverse function (defined on the interval $[0, 3]$). That is, on $[0, 3]$ we can define $x$ as a function of $u$, namely

$$x = g(u) = \sqrt{u + 1} .$$

Then substituting that expression for $x$ into the function $f(x) = x^3 \sqrt{x^2 - 1}$ gives

$$f(x) = f(g(u)) = (u + 1)^{3/2} \sqrt{u} ,$$

and we see that

$$\frac{dx}{du} = g'(u) \ \Rightarrow\ dx = g'(u)\,du , \qquad dx = \tfrac{1}{2}(u + 1)^{-1/2}\,du ,$$

so since

$$g(0) = 1 \ \Rightarrow\ 0 = g^{-1}(1) , \qquad g(3) = 2 \ \Rightarrow\ 3 = g^{-1}(2)$$

then performing the substitution as we did earlier gives

$$\int_1^2 f(x)\,dx = \int_1^2 x^3 \sqrt{x^2 - 1}\,dx = \int_0^3 \tfrac{1}{2}(u + 1)\sqrt{u}\,du ,$$

which can be written as

$$\int_0^3 (u + 1)^{3/2} \sqrt{u} \cdot \tfrac{1}{2}(u + 1)^{-1/2}\,du ,$$

which means

$$\int_1^2 f(x)\,dx = \int_{g^{-1}(1)}^{g^{-1}(2)} f(g(u))\,g'(u)\,du .$$

In general, if $x = g(u)$ is a one-to-one, differentiable function from an interval $[c, d]$ (which you can think of as being on the “$u$-axis”) onto an interval $[a, b]$ (on the $x$-axis), which means that $g'(u) \ne 0$ on the interval $(c, d)$, so that $a = g(c)$ and $b = g(d)$, then $c = g^{-1}(a)$ and $d = g^{-1}(b)$, and

$$\int_a^b f(x)\,dx = \int_{g^{-1}(a)}^{g^{-1}(b)} f(g(u))\,g'(u)\,du . \tag{3.17}$$

This is called the change of variable formula for integrals of single-variable functions, and it

is what you were implicitly using when doing integration by substitution. This formula turns

out to be a special case of a more general formula which can be used to evaluate multiple

integrals. We will state the formulas for double and triple integrals involving real-valued

functions of two and three variables, respectively. We will assume that all the functions

involved are continuously differentiable and that the regions and solids involved all have “reasonable” boundaries. The proof of the following theorem is beyond the scope of the text.[2]

[2] See TAYLOR and MANN, § 15.32 and § 15.62 for all the details.


Theorem 3.1. Change of Variables Formula for Multiple Integrals

Let $x = x(u, v)$ and $y = y(u, v)$ define a one-to-one mapping of a region $R'$ in the $uv$-plane onto a region $R$ in the $xy$-plane such that the determinant

$$J(u, v) = \begin{vmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial x}{\partial v} \\[2ex] \dfrac{\partial y}{\partial u} & \dfrac{\partial y}{\partial v} \end{vmatrix} \tag{3.18}$$

is never 0 in $R'$. Then

$$\iint_R f(x, y)\,dA(x, y) = \iint_{R'} f(x(u, v), y(u, v))\,|J(u, v)|\,dA(u, v) . \tag{3.19}$$

We use the notation dA(x, y) and dA(u, v) to denote the area element in the (x, y) and (u, v)

coordinates, respectively.

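Similarly, if $x = x(u, v, w)$, $y = y(u, v, w)$ and $z = z(u, v, w)$ define a one-to-one mapping of a solid $S'$ in $uvw$-space onto a solid $S$ in $xyz$-space such that the determinant

$$J(u, v, w) = \begin{vmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial x}{\partial v} & \dfrac{\partial x}{\partial w} \\[2ex] \dfrac{\partial y}{\partial u} & \dfrac{\partial y}{\partial v} & \dfrac{\partial y}{\partial w} \\[2ex] \dfrac{\partial z}{\partial u} & \dfrac{\partial z}{\partial v} & \dfrac{\partial z}{\partial w} \end{vmatrix} \tag{3.20}$$

is never 0 in $S'$, then

$$\iiint_S f(x, y, z)\,dV(x, y, z) = \iiint_{S'} f(x(u, v, w), y(u, v, w), z(u, v, w))\,|J(u, v, w)|\,dV(u, v, w) . \tag{3.21}$$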

The determinant $J(u, v)$ in formula (3.18) is called the Jacobian of $x$ and $y$ with respect to $u$ and $v$, and is sometimes written as

$$J(u, v) = \frac{\partial(x, y)}{\partial(u, v)} . \tag{3.22}$$

Similarly, the Jacobian $J(u, v, w)$ of three variables is sometimes written as

$$J(u, v, w) = \frac{\partial(x, y, z)}{\partial(u, v, w)} . \tag{3.23}$$

Notice that formula (3.19) is saying that $dA(x, y) = |J(u, v)|\,dA(u, v)$, which you can think of as a two-variable version of the relation $dx = g'(u)\,du$ in the single-variable case.

The following example shows how the change of variables formula is used.


Example 3.9. Evaluate $\displaystyle\iint_R e^{\frac{x-y}{x+y}}\,dA$, where $R = \{(x, y) : x \ge 0,\ y \ge 0,\ x + y \le 1\}$.

Solution: First, note that evaluating this double integral without using substitution is probably impossible, at least in a closed form. By looking at the numerator and denominator of the exponent of $e$, we will try the substitution $u = x - y$ and $v = x + y$. To use the change of variables formula (3.19), we need to write both $x$ and $y$ in terms of $u$ and $v$. So solving for $x$ and $y$ gives $x = \frac{1}{2}(u + v)$ and $y = \frac{1}{2}(v - u)$. In Figure 3.5.1 below, we see how the mapping $x = x(u, v) = \frac{1}{2}(u + v)$, $y = y(u, v) = \frac{1}{2}(v - u)$ maps the region $R'$ onto $R$ in a one-to-one manner.

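[Figure: the triangle $R$ in the $xy$-plane bounded by the coordinate axes and the line $x + y = 1$, and the triangle $R'$ in the $uv$-plane bounded by the lines $u = v$, $u = -v$ and $v = 1$, related by the mapping $x = \frac{1}{2}(u + v)$, $y = \frac{1}{2}(v - u)$]

Figure 3.5.1 The regions $R$ and $R'$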

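Now we see that

$$J(u, v) = \begin{vmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial x}{\partial v} \\[2ex] \dfrac{\partial y}{\partial u} & \dfrac{\partial y}{\partial v} \end{vmatrix} = \begin{vmatrix} \frac{1}{2} & \frac{1}{2} \\[1ex] -\frac{1}{2} & \frac{1}{2} \end{vmatrix} = \frac{1}{2} \ \Rightarrow\ |J(u, v)| = \left| \frac{1}{2} \right| = \frac{1}{2} ,$$

so using horizontal slices in $R'$, we have

$$\begin{aligned}
\iint_R e^{\frac{x-y}{x+y}}\,dA &= \iint_{R'} f(x(u, v), y(u, v))\,|J(u, v)|\,dA \\
&= \int_0^1 \int_{-v}^{v} e^{\frac{u}{v}}\,\tfrac{1}{2}\,du\,dv \\
&= \int_0^1 \left( \frac{v}{2}\,e^{\frac{u}{v}} \,\Big|_{u=-v}^{u=v} \right) dv \\
&= \int_0^1 \frac{v}{2}\,(e - e^{-1})\,dv \\
&= \frac{v^2}{4}\,(e - e^{-1}) \,\Big|_0^1 = \frac{1}{4}\left( e - \frac{1}{e} \right) = \frac{e^2 - 1}{4e}
\end{aligned}$$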
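The closed form $\frac{e^2 - 1}{4e} \approx 0.5876$ can be cross-checked with the Monte Carlo method of Section 3.4. The sketch below is one way to do that, not part of the original example: it samples the enclosing square $[0, 1] \times [0, 1]$ and keeps only the points with $x + y \le 1$. The class name, seed and sample size are assumptions.

```java
import java.util.Random;

// Sketch: Monte Carlo check of Example 3.9. Points are drawn from the
// enclosing square [0,1]x[0,1] and used only if they lie in the triangle
// x >= 0, y >= 0, x + y <= 1. Class name and seed are assumptions.
public class Example39Check {

    public static double estimate(int n, long seed) {
        Random rng = new Random(seed);
        double area = 1.0;   // area of the enclosing square
        double sum = 0.0;    // sum of f over the accepted points
        for (int i = 0; i < n; i++) {
            double x = rng.nextDouble();
            double y = rng.nextDouble();
            double s = x + y;
            // s > 0 guards the division; that case has probability zero anyway
            if (s <= 1.0 && s > 0.0) {
                sum += Math.exp((x - y) / s);
            }
        }
        return area * sum / n;   // N counts every generated point
    }

    public static double exact() {
        return (Math.E * Math.E - 1.0) / (4.0 * Math.E);
    }

    public static void main(String[] args) {
        System.out.println("Monte Carlo estimate = " + estimate(1000000, 7L));
        System.out.println("(e^2 - 1)/(4e)       = " + exact());
    }
}
```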


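The change of variables formula can be used to evaluate double integrals in polar coordinates. Letting

$$x = x(r, \theta) = r\cos\theta \quad\text{and}\quad y = y(r, \theta) = r\sin\theta ,$$

we have

$$J(r, \theta) = \begin{vmatrix} \dfrac{\partial x}{\partial r} & \dfrac{\partial x}{\partial \theta} \\[2ex] \dfrac{\partial y}{\partial r} & \dfrac{\partial y}{\partial \theta} \end{vmatrix} = \begin{vmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{vmatrix} = r\cos^2\theta + r\sin^2\theta = r \ \Rightarrow\ |J(r, \theta)| = |r| = r ,$$

so we have the following formula:

Double Integral in Polar Coordinates

$$\iint_R f(x, y)\,dx\,dy = \iint_{R'} f(r\cos\theta, r\sin\theta)\,r\,dr\,d\theta , \tag{3.24}$$

where the mapping $x = r\cos\theta$, $y = r\sin\theta$ maps the region $R'$ in the $r\theta$-plane onto the region $R$ in the $xy$-plane in a one-to-one manner.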

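Example 3.10. Find the volume $V$ inside the paraboloid $z = x^2 + y^2$ for $0 \le z \le 1$.

[Figure: the paraboloid in $xyz$-space, capped at height $z = 1$ by the circle $x^2 + y^2 = 1$]

Figure 3.5.2 $z = x^2 + y^2$

Solution: Using vertical slices, we see that

$$V = \iint_R (1 - z)\,dA = \iint_R \bigl(1 - (x^2 + y^2)\bigr)\,dA ,$$

where $R = \{(x, y) : x^2 + y^2 \le 1\}$ is the unit disk in $\mathbb{R}^2$ (see Figure 3.5.2). In polar coordinates $(r, \theta)$ we know that $x^2 + y^2 = r^2$ and that the unit disk $R$ is the set $R' = \{(r, \theta) : 0 \le r \le 1,\ 0 \le \theta \le 2\pi\}$. Thus,

$$\begin{aligned}
V &= \int_0^{2\pi} \int_0^1 (1 - r^2)\,r\,dr\,d\theta \\
&= \int_0^{2\pi} \int_0^1 (r - r^3)\,dr\,d\theta \\
&= \int_0^{2\pi} \left( \frac{r^2}{2} - \frac{r^4}{4} \,\Big|_{r=0}^{r=1} \right) d\theta \\
&= \int_0^{2\pi} \frac{1}{4}\,d\theta = \frac{\pi}{2}
\end{aligned}$$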
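To see the factor $r$ from the Jacobian at work numerically, the following sketch approximates the polar integral with a midpoint Riemann sum over the rectangle $0 \le r \le 1$, $0 \le \theta \le 2\pi$ in the $r\theta$-plane. The class and method names and the grid sizes are assumptions made for this illustration; the result should be close to $\pi/2$.

```java
// Sketch: midpoint Riemann sum for the volume inside the paraboloid
// z = x^2 + y^2, 0 <= z <= 1, computed in polar coordinates.
// Names and grid sizes are illustrative assumptions.
public class PolarVolume {

    public static double paraboloidVolume(int nr, int nt) {
        double dr = 1.0 / nr;               // step in r over [0,1]
        double dt = 2.0 * Math.PI / nt;     // step in theta over [0,2*pi]
        double sum = 0.0;
        for (int i = 0; i < nr; i++) {
            double r = (i + 0.5) * dr;
            for (int j = 0; j < nt; j++) {
                double theta = (j + 0.5) * dt;
                double x = r * Math.cos(theta);
                double y = r * Math.sin(theta);
                double height = 1.0 - (x * x + y * y);  // integrand 1 - z
                sum += height * r * dr * dt;            // r is |J(r,theta)|
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println("Riemann sum = " + paraboloidVolume(500, 500));
        System.out.println("pi/2        = " + Math.PI / 2.0);
    }
}
```

Dropping the factor `r` in the sum would give roughly $4\pi/3$ instead, which is a simple way to see that the Jacobian cannot be omitted.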


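Example 3.11. Find the volume $V$ inside the cone $z = \sqrt{x^2 + y^2}$ for $0 \le z \le 1$.

[Figure: the cone in $xyz$-space, capped at height $z = 1$ by the circle $x^2 + y^2 = 1$]

Figure 3.5.3 $z = \sqrt{x^2 + y^2}$

Solution: Using vertical slices, we see that

$$V = \iint_R (1 - z)\,dA = \iint_R \left( 1 - \sqrt{x^2 + y^2} \right) dA ,$$

where $R = \{(x, y) : x^2 + y^2 \le 1\}$ is the unit disk in $\mathbb{R}^2$ (see Figure 3.5.3). In polar coordinates $(r, \theta)$ we know that $\sqrt{x^2 + y^2} = r$ and that the unit disk $R$ is the set $R' = \{(r, \theta) : 0 \le r \le 1,\ 0 \le \theta \le 2\pi\}$. Thus,

$$\begin{aligned}
V &= \int_0^{2\pi} \int_0^1 (1 - r)\,r\,dr\,d\theta \\
&= \int_0^{2\pi} \int_0^1 (r - r^2)\,dr\,d\theta \\
&= \int_0^{2\pi} \left( \frac{r^2}{2} - \frac{r^3}{3} \,\Big|_{r=0}^{r=1} \right) d\theta \\
&= \int_0^{2\pi} \frac{1}{6}\,d\theta = \frac{\pi}{3}
\end{aligned}$$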

In a similar fashion, it can be shown (see Exercises 5-6) that triple integrals in cylindrical and spherical coordinates take the following forms:

Triple Integral in Cylindrical Coordinates

$$\iiint_S f(x, y, z)\,dx\,dy\,dz = \iiint_{S'} f(r\cos\theta, r\sin\theta, z)\,r\,dr\,d\theta\,dz , \tag{3.25}$$

where the mapping $x = r\cos\theta$, $y = r\sin\theta$, $z = z$ maps the solid $S'$ in $r\theta z$-space onto the solid $S$ in $xyz$-space in a one-to-one manner.

Triple Integral in Spherical Coordinates

$$\iiint_S f(x, y, z)\,dx\,dy\,dz = \iiint_{S'} f(\rho\sin\phi\cos\theta, \rho\sin\phi\sin\theta, \rho\cos\phi)\,\rho^2 \sin\phi\,d\rho\,d\phi\,d\theta , \tag{3.26}$$

where the mapping $x = \rho\sin\phi\cos\theta$, $y = \rho\sin\phi\sin\theta$, $z = \rho\cos\phi$ maps the solid $S'$ in $\rho\phi\theta$-space onto the solid $S$ in $xyz$-space in a one-to-one manner.


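Example 3.12. For $a > 0$, find the volume $V$ inside the sphere $S$ given by $x^2 + y^2 + z^2 = a^2$.

Solution: We see that $S$ is the set $\rho = a$ in spherical coordinates, so

$$\begin{aligned}
V = \iiint_S 1\,dV &= \int_0^{2\pi} \int_0^{\pi} \int_0^a 1\,\rho^2 \sin\phi\,d\rho\,d\phi\,d\theta \\
&= \int_0^{2\pi} \int_0^{\pi} \left( \frac{\rho^3}{3} \,\Big|_{\rho=0}^{\rho=a} \right) \sin\phi\,d\phi\,d\theta = \int_0^{2\pi} \int_0^{\pi} \frac{a^3}{3} \sin\phi\,d\phi\,d\theta \\
&= \int_0^{2\pi} \left( -\frac{a^3}{3} \cos\phi \,\Big|_{\phi=0}^{\phi=\pi} \right) d\theta = \int_0^{2\pi} \frac{2a^3}{3}\,d\theta = \frac{4\pi a^3}{3} .
\end{aligned}$$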

Exercises

A

1. Find the volume $V$ inside the paraboloid $z = x^2 + y^2$ for $0 \le z \le 4$.

2. Find the volume $V$ inside the cone $z = \sqrt{x^2 + y^2}$ for $0 \le z \le 3$.

B

3. Find the volume $V$ of the solid inside both $x^2 + y^2 + z^2 = 4$ and $x^2 + y^2 = 1$.

4. Find the volume $V$ inside both the sphere $x^2 + y^2 + z^2 = 1$ and the cone $z = \sqrt{x^2 + y^2}$.

5. Prove formula (3.25).

6. Prove formula (3.26).

7. Evaluate $\displaystyle\iint_R \sin\left( \frac{x+y}{2} \right) \cos\left( \frac{x-y}{2} \right) dA$, where $R$ is the triangle with vertices $(0, 0)$, $(2, 0)$ and $(1, 1)$. (Hint: Use the change of variables $u = (x + y)/2$, $v = (x - y)/2$.)

8. Find the volume of the solid bounded by $z = x^2 + y^2$ and $z^2 = 4(x^2 + y^2)$.

9. Find the volume inside the elliptic cylinder $\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1$ for $0 \le z \le 2$.

C

10. Show that the volume inside the ellipsoid $\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1$ is $\frac{4\pi abc}{3}$. (Hint: Use the change of variables $x = au$, $y = bv$, $z = cw$, then consider Example 3.12.)

11. Show that the Beta function, defined by

$$B(x, y) = \int_0^1 t^{x-1} (1 - t)^{y-1}\,dt , \quad\text{for } x > 0,\ y > 0,$$

satisfies the relation $B(y, x) = B(x, y)$ for $x > 0$, $y > 0$.

12. Using the substitution $t = u/(u + 1)$, show that the Beta function can be written as

$$B(x, y) = \int_0^{\infty} \frac{u^{x-1}}{(u + 1)^{x+y}}\,du , \quad\text{for } x > 0,\ y > 0.$$


3.6 Application: Center of Mass

[Figure: the region $R$ under the curve $y = f(x)$ between $x = a$ and $x = b$, with its center of mass $(\bar{x}, \bar{y})$ marked]

Figure 3.6.1 Center of mass of $R$

Recall from single-variable calculus that for a region $R = \{(x, y) : a \le x \le b,\ 0 \le y \le f(x)\}$ in $\mathbb{R}^2$ that represents a thin, flat plate (see Figure 3.6.1), where $f(x)$ is a continuous function on $[a, b]$, the center of mass of $R$ has coordinates $(\bar{x}, \bar{y})$ given by

$$\bar{x} = \frac{M_y}{M} \quad\text{and}\quad \bar{y} = \frac{M_x}{M} ,$$

where

$$M_x = \int_a^b \frac{(f(x))^2}{2}\,dx , \quad M_y = \int_a^b x f(x)\,dx , \quad M = \int_a^b f(x)\,dx , \tag{3.27}$$

assuming that $R$ has uniform density, i.e. the mass of $R$ is uniformly distributed over the region. In this case the area $M$ of the region is considered the mass of $R$ (the density is constant, and taken as 1 for simplicity).

In the general case where the density of a region (or lamina) $R$ is a continuous function $\delta = \delta(x, y)$ of the coordinates $(x, y)$ of points inside $R$ (where $R$ can be any region in $\mathbb{R}^2$) the coordinates $(\bar{x}, \bar{y})$ of the center of mass of $R$ are given by

$$\bar{x} = \frac{M_y}{M} \quad\text{and}\quad \bar{y} = \frac{M_x}{M} , \tag{3.28}$$

where

$$M_y = \iint_R x\,\delta(x, y)\,dA , \quad M_x = \iint_R y\,\delta(x, y)\,dA , \quad M = \iint_R \delta(x, y)\,dA . \tag{3.29}$$

The quantities $M_x$ and $M_y$ are called the moments (or first moments) of the region $R$ about the $x$-axis and $y$-axis, respectively. The quantity $M$ is the mass of the region $R$. To see this, think of taking a small rectangle inside $R$ with dimensions $\Delta x$ and $\Delta y$ close to 0. The mass of that rectangle is approximately $\delta(x_*, y_*)\,\Delta x\,\Delta y$, for some point $(x_*, y_*)$ in that rectangle. Then the mass of $R$ is the limit of the sums of the masses of all such rectangles inside $R$ as the diagonals of the rectangles approach 0, which is the double integral $\iint_R \delta(x, y)\,dA$.

Note that the formulas in (3.27) represent a special case when $\delta(x, y) = 1$ throughout $R$ in the formulas in (3.29).

Example 3.13. Find the center of mass of the region $R = \{(x, y) : 0 \le x \le 1,\ 0 \le y \le 2x^2\}$, if the density function at $(x, y)$ is $\delta(x, y) = x + y$.


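[Figure: the region $R$ in the $xy$-plane under the curve $y = 2x^2$ for $0 \le x \le 1$]

Figure 3.6.2

Solution: The region $R$ is shown in Figure 3.6.2. We have

$$\begin{aligned}
M &= \iint_R \delta(x, y)\,dA = \int_0^1 \int_0^{2x^2} (x + y)\,dy\,dx \\
&= \int_0^1 \left( xy + \frac{y^2}{2} \,\Big|_{y=0}^{y=2x^2} \right) dx = \int_0^1 (2x^3 + 2x^4)\,dx \\
&= \frac{x^4}{2} + \frac{2x^5}{5} \,\Big|_0^1 = \frac{9}{10}
\end{aligned}$$

and

$$\begin{aligned}
M_x &= \iint_R y\,\delta(x, y)\,dA = \int_0^1 \int_0^{2x^2} y(x + y)\,dy\,dx \\
&= \int_0^1 \left( \frac{xy^2}{2} + \frac{y^3}{3} \,\Big|_{y=0}^{y=2x^2} \right) dx = \int_0^1 \left( 2x^5 + \frac{8x^6}{3} \right) dx \\
&= \frac{x^6}{3} + \frac{8x^7}{21} \,\Big|_0^1 = \frac{5}{7} ,
\end{aligned}$$

$$\begin{aligned}
M_y &= \iint_R x\,\delta(x, y)\,dA = \int_0^1 \int_0^{2x^2} x(x + y)\,dy\,dx \\
&= \int_0^1 \left( x^2 y + \frac{xy^2}{2} \,\Big|_{y=0}^{y=2x^2} \right) dx = \int_0^1 (2x^4 + 2x^5)\,dx \\
&= \frac{2x^5}{5} + \frac{x^6}{3} \,\Big|_0^1 = \frac{11}{15} ,
\end{aligned}$$

so the center of mass $(\bar{x}, \bar{y})$ is given by

$$\bar{x} = \frac{M_y}{M} = \frac{11/15}{9/10} = \frac{22}{27} , \quad \bar{y} = \frac{M_x}{M} = \frac{5/7}{9/10} = \frac{50}{63} .$$

Note how this center of mass is a little further towards the upper corner of the region $R$ than when the density is uniform (it is easy to use the formulas in (3.27) to show that $(\bar{x}, \bar{y}) = \left( \frac{3}{4}, \frac{3}{5} \right)$ in that case). This makes sense since the density function $\delta(x, y) = x + y$ increases as $(x, y)$ approaches that upper corner, where there is quite a bit of area.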
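The moment integrals in formula (3.29) can also be approximated directly, which gives an independent check on the values $M = \frac{9}{10}$, $\bar{x} = \frac{22}{27}$ and $\bar{y} = \frac{50}{63}$. The sketch below uses a midpoint Riemann sum with vertical slices over the region of Example 3.13; the class name, method name and grid size are assumptions made here.

```java
// Sketch: midpoint Riemann sums for M, M_x and M_y over the region
// 0 <= x <= 1, 0 <= y <= 2x^2 with density delta(x,y) = x + y.
// Names and grid size are illustrative assumptions.
public class CenterOfMassCheck {

    // Returns {M, xbar, ybar}
    public static double[] centerOfMass(int n) {
        double dx = 1.0 / n;
        double m = 0.0;    // mass M
        double mx = 0.0;   // moment about the x-axis, integral of y*delta
        double my = 0.0;   // moment about the y-axis, integral of x*delta
        for (int i = 0; i < n; i++) {
            double x = (i + 0.5) * dx;
            double yTop = 2.0 * x * x;     // upper boundary y = 2x^2
            double dy = yTop / n;          // vertical slice split into n cells
            for (int j = 0; j < n; j++) {
                double y = (j + 0.5) * dy;
                double delta = x + y;      // density at the cell midpoint
                double dA = dx * dy;
                m += delta * dA;
                mx += y * delta * dA;
                my += x * delta * dA;
            }
        }
        return new double[] { m, my / m, mx / m };
    }

    public static void main(String[] args) {
        double[] c = centerOfMass(500);
        System.out.println("M    = " + c[0] + "  (exact 9/10)");
        System.out.println("xbar = " + c[1] + "  (exact 22/27 = " + 22.0 / 27.0 + ")");
        System.out.println("ybar = " + c[2] + "  (exact 50/63 = " + 50.0 / 63.0 + ")");
    }
}
```

Replacing the density line with `double delta = 1.0;` would give the uniform-density case, which should land near $\left( \frac{3}{4}, \frac{3}{5} \right)$.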

In the special case where the density function δ(x, y) is a constant function on the region

R, the center of mass ( ¯ x, ¯ y) is called the centroid of R.


The formulas for the center of mass of a region in $\mathbb{R}^2$ can be generalized to a solid $S$ in $\mathbb{R}^3$. Let $S$ be a solid with a continuous mass density function $\delta(x, y, z)$ at any point $(x, y, z)$ in $S$. Then the center of mass of $S$ has coordinates $(\bar{x}, \bar{y}, \bar{z})$, where

$$\bar{x} = \frac{M_{yz}}{M} , \quad \bar{y} = \frac{M_{xz}}{M} , \quad \bar{z} = \frac{M_{xy}}{M} , \tag{3.30}$$

where

$$M_{yz} = \iiint_S x\,\delta(x, y, z)\,dV , \quad M_{xz} = \iiint_S y\,\delta(x, y, z)\,dV , \quad M_{xy} = \iiint_S z\,\delta(x, y, z)\,dV , \tag{3.31}$$

$$M = \iiint_S \delta(x, y, z)\,dV . \tag{3.32}$$

In this case, $M_{yz}$, $M_{xz}$ and $M_{xy}$ are called the moments (or first moments) of $S$ around the $yz$-plane, $xz$-plane and $xy$-plane, respectively. Also, $M$ is the mass of $S$.

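Example 3.14. Find the center of mass of the solid $S = \{(x, y, z) : z \ge 0,\ x^2 + y^2 + z^2 \le a^2\}$, if the density function at $(x, y, z)$ is $\delta(x, y, z) = 1$.

[Figure: the upper hemisphere of radius $a$ in $xyz$-space, with its center of mass $(\bar{x}, \bar{y}, \bar{z})$ marked on the $z$-axis]

Figure 3.6.3

Solution: The solid $S$ is just the upper hemisphere inside the sphere of radius $a$ centered at the origin (see Figure 3.6.3). So since the density function is a constant and $S$ is symmetric about the $z$-axis, then it is clear that $\bar{x} = 0$ and $\bar{y} = 0$, so we need only find $\bar{z}$. We have

$$M = \iiint_S \delta(x, y, z)\,dV = \iiint_S 1\,dV = \text{Volume}(S).$$

But since the volume of $S$ is half the volume of the sphere of radius $a$, which we know by Example 3.12 is $\frac{4\pi a^3}{3}$, then $M = \frac{2\pi a^3}{3}$. And

$$\begin{aligned}
M_{xy} &= \iiint_S z\,\delta(x, y, z)\,dV = \iiint_S z\,dV , \text{ which in spherical coordinates is} \\
&= \int_0^{2\pi} \int_0^{\pi/2} \int_0^a (\rho\cos\phi)\,\rho^2 \sin\phi\,d\rho\,d\phi\,d\theta \\
&= \int_0^{2\pi} \int_0^{\pi/2} \sin\phi\cos\phi \left( \int_0^a \rho^3\,d\rho \right) d\phi\,d\theta \\
&= \int_0^{2\pi} \int_0^{\pi/2} \frac{a^4}{4} \sin\phi\cos\phi\,d\phi\,d\theta \\
&= \int_0^{2\pi} \int_0^{\pi/2} \frac{a^4}{8} \sin 2\phi\,d\phi\,d\theta \quad (\text{since } \sin 2\phi = 2\sin\phi\cos\phi) \\
&= \int_0^{2\pi} \left( -\frac{a^4}{16} \cos 2\phi \,\Big|_{\phi=0}^{\phi=\pi/2} \right) d\theta \\
&= \int_0^{2\pi} \frac{a^4}{8}\,d\theta = \frac{\pi a^4}{4} ,
\end{aligned}$$

so

$$\bar{z} = \frac{M_{xy}}{M} = \frac{\pi a^4 / 4}{2\pi a^3 / 3} = \frac{3a}{8} .$$

Thus, the center of mass of $S$ is $(\bar{x}, \bar{y}, \bar{z}) = \left( 0, 0, \frac{3a}{8} \right)$.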

Exercises

A

For Exercises 1-5, find the center of mass of the region $R$ with the given density function $\delta(x, y)$.

1. $R = \{(x, y) : 0 \le x \le 2,\ 0 \le y \le 4\}$, $\delta(x, y) = 2y$

2. $R = \{(x, y) : 0 \le x \le 1,\ 0 \le y \le x^2\}$, $\delta(x, y) = x + y$

3. $R = \{(x, y) : y \ge 0,\ x^2 + y^2 \le a^2\}$, $\delta(x, y) = 1$

4. $R = \{(x, y) : y \ge 0,\ x \ge 0,\ 1 \le x^2 + y^2 \le 4\}$, $\delta(x, y) = \sqrt{x^2 + y^2}$

5. $R = \{(x, y) : y \ge 0,\ x^2 + y^2 \le 1\}$, $\delta(x, y) = y$

B

For Exercises 6-10, find the center of mass of the solid $S$ with the given density function $\delta(x, y, z)$.

6. $S = \{(x, y, z) : 0 \le x \le 1,\ 0 \le y \le 1,\ 0 \le z \le 1\}$, $\delta(x, y, z) = xyz$

7. $S = \{(x, y, z) : z \ge 0,\ x^2 + y^2 + z^2 \le a^2\}$, $\delta(x, y, z) = x^2 + y^2 + z^2$

8. $S = \{(x, y, z) : x \ge 0,\ y \ge 0,\ z \ge 0,\ x^2 + y^2 + z^2 \le a^2\}$, $\delta(x, y, z) = 1$

9. $S = \{(x, y, z) : 0 \le x \le 1,\ 0 \le y \le 1,\ 0 \le z \le 1\}$, $\delta(x, y, z) = x^2 + y^2 + z^2$

10. $S = \{(x, y, z) : 0 \le x \le 1,\ 0 \le y \le 1,\ 0 \le z \le 1 - x - y\}$, $\delta(x, y, z) = 1$


3.7 Application: Probability and Expected Value

In this section we will brieﬂy discuss some applications of multiple integrals in the ﬁeld of

probability theory. In particular we will see ways in which multiple integrals can be used to

calculate probabilities and expected values.

Probability

Suppose that you have a standard six-sided (fair) die, and you let a variable $X$ represent the value rolled. Then the probability of rolling a 3, written as $P(X = 3)$, is $\frac{1}{6}$, since there are six sides on the die and each one is equally likely to be rolled, and hence in particular the 3 has a one out of six chance of being rolled. Likewise the probability of rolling at most a 3, written as $P(X \le 3)$, is $\frac{3}{6} = \frac{1}{2}$, since of the six numbers on the die, there are three equally likely numbers (1, 2, and 3) that are less than or equal to 3. Note that $P(X \le 3) = P(X = 1) + P(X = 2) + P(X = 3)$. We call $X$ a discrete random variable on the sample space (or probability space) $\Omega$ consisting of all possible outcomes. In our case, $\Omega = \{1, 2, 3, 4, 5, 6\}$. An event $A$ is a subset of the sample space. For example, in the case of the die, the event $X \le 3$ is the set $\{1, 2, 3\}$.

Now let $X$ be a variable representing a random real number in the interval $(0, 1)$. Note that the set of all real numbers between 0 and 1 is not a discrete (or countable) set of values, i.e. it cannot be put into a one-to-one correspondence with the set of positive integers.[3] In this case, for any real number $x$ in $(0, 1)$, it makes no sense to consider $P(X = x)$ since it must be 0 (why?). Instead, we consider the probability $P(X \le x)$, which is given by $P(X \le x) = x$. The reasoning is this: the interval $(0, 1)$ has length 1, and for $x$ in $(0, 1)$ the interval $(0, x)$ has length $x$. So since $X$ represents a random number in $(0, 1)$, and hence is uniformly distributed over $(0, 1)$, then

$$P(X \le x) = \frac{\text{length of } (0, x)}{\text{length of } (0, 1)} = \frac{x}{1} = x .$$

We call $X$ a continuous random variable on the sample space $\Omega = (0, 1)$. An event $A$ is a subset of the sample space. For example, in our case the event $X \le x$ is the set $(0, x)$.

In the case of a discrete random variable, we saw how the probability of an event was the sum of the probabilities of the individual outcomes comprising that event (e.g. $P(X \le 3) = P(X = 1) + P(X = 2) + P(X = 3)$ in the die example). For a continuous random variable, the probability of an event will instead be the integral of a function, which we will now describe.

Let $X$ be a continuous real-valued random variable on a sample space $\Omega$ in $\mathbb{R}$. For simplicity, let $\Omega = (a, b)$. Define the distribution function $F$ of $X$ as

$$F(x) = P(X \le x) , \quad\text{for } -\infty < x < \infty \tag{3.33}$$

$$\phantom{F(x)} = \begin{cases} 1, & \text{for } x \ge b \\ P(X \le x), & \text{for } a < x < b \\ 0, & \text{for } x \le a . \end{cases} \tag{3.34}$$

[3] For a proof see p. 9-10 in KAMKE, E., Theory of Sets, New York: Dover, 1950.

Suppose that there is a nonnegative, continuous real-valued function $f$ on $\mathbb{R}$ such that

$$F(x) = \int_{-\infty}^{x} f(y)\,dy , \quad\text{for } -\infty < x < \infty, \tag{3.35}$$

and

$$\int_{-\infty}^{\infty} f(x)\,dx = 1 . \tag{3.36}$$

Then we call $f$ the probability density function (or p.d.f. for short) for $X$. We thus have

$$P(X \le x) = \int_a^x f(y)\,dy , \quad\text{for } a < x < b . \tag{3.37}$$

Also, by the Fundamental Theorem of Calculus, we have

$$F'(x) = f(x) , \quad\text{for } -\infty < x < \infty. \tag{3.38}$$

Example 3.15. Let $X$ represent a randomly selected real number in the interval $(0, 1)$. We say that $X$ has the uniform distribution on $(0, 1)$, with distribution function

$$F(x) = P(X \le x) = \begin{cases} 1, & \text{for } x \ge 1 \\ x, & \text{for } 0 < x < 1 \\ 0, & \text{for } x \le 0 , \end{cases} \tag{3.39}$$

and probability density function

$$f(x) = F'(x) = \begin{cases} 1, & \text{for } 0 < x < 1 \\ 0, & \text{elsewhere.} \end{cases} \tag{3.40}$$

In general, if $X$ represents a randomly selected real number in an interval $(a, b)$, then $X$ has the uniform distribution function

$$F(x) = P(X \le x) = \begin{cases} 1, & \text{for } x \ge b \\ \dfrac{x - a}{b - a}, & \text{for } a < x < b \\ 0, & \text{for } x \le a , \end{cases} \tag{3.41}$$

and probability density function

$$f(x) = F'(x) = \begin{cases} \dfrac{1}{b - a}, & \text{for } a < x < b \\ 0, & \text{elsewhere.} \end{cases} \tag{3.42}$$


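Example 3.16. A famous distribution function is given by the standard normal distribution, whose probability density function $f$ is

$$f(x) = \frac{1}{\sqrt{2\pi}}\,e^{-x^2/2} , \quad\text{for } -\infty < x < \infty. \tag{3.43}$$

This is often called a “bell curve”, and is used widely in statistics. Since we are claiming that $f$ is a p.d.f., we should have

$$\int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}}\,e^{-x^2/2}\,dx = 1 \tag{3.44}$$

by formula (3.36), which is equivalent to

$$\int_{-\infty}^{\infty} e^{-x^2/2}\,dx = \sqrt{2\pi} . \tag{3.45}$$

We can use a double integral in polar coordinates to verify this integral. First,

$$\begin{aligned}
\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} e^{-(x^2 + y^2)/2}\,dx\,dy &= \int_{-\infty}^{\infty} e^{-y^2/2} \left( \int_{-\infty}^{\infty} e^{-x^2/2}\,dx \right) dy \\
&= \left( \int_{-\infty}^{\infty} e^{-x^2/2}\,dx \right) \left( \int_{-\infty}^{\infty} e^{-y^2/2}\,dy \right) \\
&= \left( \int_{-\infty}^{\infty} e^{-x^2/2}\,dx \right)^2
\end{aligned}$$

since the same function is being integrated twice in the middle equation, just with different variables. But using polar coordinates, we see that

$$\begin{aligned}
\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} e^{-(x^2 + y^2)/2}\,dx\,dy &= \int_0^{2\pi} \int_0^{\infty} e^{-r^2/2}\,r\,dr\,d\theta \\
&= \int_0^{2\pi} \left( -e^{-r^2/2} \,\Big|_{r=0}^{r=\infty} \right) d\theta \\
&= \int_0^{2\pi} \bigl(0 - (-e^0)\bigr)\,d\theta = \int_0^{2\pi} 1\,d\theta = 2\pi ,
\end{aligned}$$

and so

$$\left( \int_{-\infty}^{\infty} e^{-x^2/2}\,dx \right)^2 = 2\pi , \quad\text{and hence}\quad \int_{-\infty}^{\infty} e^{-x^2/2}\,dx = \sqrt{2\pi} .$$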
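A direct numerical check of formula (3.45) is also easy to run. The sketch below approximates the integral of $e^{-x^2/2}$ over the truncated interval $[-10, 10]$ with a midpoint Riemann sum; the truncation point, step count and class name are assumptions made for the illustration (the tails beyond $|x| = 10$ are far below double precision in size).

```java
// Sketch: midpoint Riemann sum for the integral of exp(-x^2/2) over [-L, L],
// compared with sqrt(2*pi). The cutoff L and step count are assumptions.
public class GaussianIntegral {

    public static double integrate(double L, int n) {
        double h = 2.0 * L / n;             // width of each subinterval
        double sum = 0.0;
        for (int i = 0; i < n; i++) {
            double x = -L + (i + 0.5) * h;  // midpoint of the i-th subinterval
            sum += Math.exp(-x * x / 2.0);
        }
        return sum * h;
    }

    public static void main(String[] args) {
        System.out.println("Riemann sum = " + integrate(10.0, 100000));
        System.out.println("sqrt(2*pi)  = " + Math.sqrt(2.0 * Math.PI));
    }
}
```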


In addition to individual random variables, we can consider jointly distributed random variables. For this, we will let $X$, $Y$ and $Z$ be three real-valued continuous random variables defined on the same sample space $\Omega$ in $\mathbb{R}$ (the discussion for two random variables is similar). Then the joint distribution function $F$ of $X$, $Y$ and $Z$ is given by

$$F(x, y, z) = P(X \le x,\ Y \le y,\ Z \le z) , \quad\text{for } -\infty < x, y, z < \infty. \tag{3.46}$$

If there is a nonnegative, continuous real-valued function $f$ on $\mathbb{R}^3$ such that

$$F(x, y, z) = \int_{-\infty}^{z} \int_{-\infty}^{y} \int_{-\infty}^{x} f(u, v, w)\,du\,dv\,dw , \quad\text{for } -\infty < x, y, z < \infty \tag{3.47}$$

and

$$\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x, y, z)\,dx\,dy\,dz = 1 , \tag{3.48}$$

then we call $f$ the joint probability density function (or joint p.d.f. for short) for $X$, $Y$ and $Z$. In general, for $a_1 < b_1$, $a_2 < b_2$, $a_3 < b_3$, we have

$$P(a_1 < X \le b_1,\ a_2 < Y \le b_2,\ a_3 < Z \le b_3) = \int_{a_3}^{b_3} \int_{a_2}^{b_2} \int_{a_1}^{b_1} f(x, y, z)\,dx\,dy\,dz , \tag{3.49}$$

with the $\le$ and $<$ symbols interchangeable in any combination. A triple integral, then, can be thought of as representing a probability (for a function $f$ which is a p.d.f.).

Example 3.17. Let $a$, $b$, and $c$ be real numbers selected randomly from the interval $(0, 1)$. What is the probability that the equation $ax^2 + bx + c = 0$ has at least one real solution $x$?

[Figure: the unit square in the $ac$-plane, split by the curve $c = \frac{1}{4a}$ into the rectangle $R_1$ (for $0 < a < \frac{1}{4}$) and the region $R_2$ under the curve (for $\frac{1}{4} < a < 1$)]

Figure 3.7.1 Region $R = R_1 \cup R_2$

Solution: We know by the quadratic formula that there is at least one real solution if $b^2 - 4ac \ge 0$. So we need to calculate $P(b^2 - 4ac \ge 0)$. We will use three jointly distributed random variables to do this. First, since $0 < a, b, c < 1$, we have

$$b^2 - 4ac \ge 0 \ \Leftrightarrow\ 0 < 4ac \le b^2 < 1 \ \Leftrightarrow\ 0 < 2\sqrt{a}\sqrt{c} \le b < 1 ,$$

where the last relation holds for all $0 < a, c < 1$ such that

$$0 < 4ac < 1 \ \Leftrightarrow\ 0 < c < \frac{1}{4a} .$$

Considering $a$, $b$ and $c$ as real variables, the region $R$ in the $ac$-plane where the above relation holds is given by $R = \{(a, c) : 0 < a < 1,\ 0 < c < 1,\ 0 < c < \frac{1}{4a}\}$, which we can see is a union of two regions $R_1$ and $R_2$, as in Figure 3.7.1 above.

Now let X, Y and Z be continuous random variables, each representing a randomly selected real number from the interval (0, 1) (think of X, Y and Z representing a, b and c, respectively). Then, similar to how we showed that $f(x) = 1$ is the p.d.f. of the uniform distribution on (0, 1), it can be shown that $f(x, y, z) = 1$ for x, y, z in (0, 1) (0 elsewhere) is the joint p.d.f. of X, Y and Z. Now,

$$P(b^2 - 4ac \ge 0) = P\left((a, c) \in R,\ 2\sqrt{a}\sqrt{c} \le b < 1\right),$$

so this probability is the triple integral of $f(a, b, c) = 1$ as b varies from $2\sqrt{a}\sqrt{c}$ to 1 and as (a, c) varies over the region R. Since R can be divided into two regions $R_1$ and $R_2$, the required triple integral can be split into a sum of two triple integrals, using vertical slices in R:

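$$\begin{aligned}
P(b^2 - 4ac \ge 0) &= \underbrace{\int_0^{1/4}\int_0^1}_{R_1}\int_{2\sqrt{a}\sqrt{c}}^1 1\,db\,dc\,da + \underbrace{\int_{1/4}^1\int_0^{1/4a}}_{R_2}\int_{2\sqrt{a}\sqrt{c}}^1 1\,db\,dc\,da \\
&= \int_0^{1/4}\int_0^1 \left(1 - 2\sqrt{a}\sqrt{c}\right) dc\,da + \int_{1/4}^1\int_0^{1/4a} \left(1 - 2\sqrt{a}\sqrt{c}\right) dc\,da \\
&= \int_0^{1/4} \left( c - \tfrac{4}{3}\sqrt{a}\,c^{3/2} \,\Big|_{c=0}^{c=1} \right) da + \int_{1/4}^1 \left( c - \tfrac{4}{3}\sqrt{a}\,c^{3/2} \,\Big|_{c=0}^{c=1/4a} \right) da \\
&= \int_0^{1/4} \left(1 - \tfrac{4}{3}\sqrt{a}\right) da + \int_{1/4}^1 \frac{1}{12a}\,da \\
&= a - \tfrac{8}{9}a^{3/2}\,\Big|_0^{1/4} + \tfrac{1}{12}\ln a\,\Big|_{1/4}^1 \\
&= \left(\tfrac{1}{4} - \tfrac{1}{9}\right) + \left(0 - \tfrac{1}{12}\ln\tfrac{1}{4}\right) = \tfrac{5}{36} + \tfrac{1}{12}\ln 4 \\
P(b^2 - 4ac \ge 0) &= \frac{5 + 3\ln 4}{36} \approx 0.2544
\end{aligned}$$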

In other words, the equation $ax^2 + bx + c = 0$ has about a 25% chance of being solved!
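The value found in Example 3.17 can be checked empirically. The following is a minimal Monte Carlo sketch, not part of the original solution: the function name `estimate_real_root_probability`, the sample count and the fixed seed are illustrative choices. It draws a, b and c uniformly and counts how often the discriminant is nonnegative.

```python
import math
import random


def estimate_real_root_probability(num_samples, seed=0):
    """Estimate P(b^2 - 4ac >= 0) for a, b, c drawn uniformly from (0, 1)."""
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    hits = 0
    for _ in range(num_samples):
        a, b, c = rng.random(), rng.random(), rng.random()
        # The quadratic has a real solution when its discriminant is >= 0.
        if b * b - 4.0 * a * c >= 0.0:
            hits += 1
    return hits / num_samples


# Closed-form answer from the triple integral in Example 3.17.
exact = (5 + 3 * math.log(4)) / 36
estimate = estimate_real_root_probability(200_000)
print(f"exact = {exact:.4f}, simulated = {estimate:.4f}")
```

With a few hundred thousand samples the simulated frequency should agree with the exact value to about two decimal places; the agreement improves slowly, roughly like one over the square root of the number of samples.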

Expected Value

The expected value EX of a random variable X can be thought of as the “average” value

of X as it varies over its sample space. If X is a discrete random variable, then

$$EX = \sum_{x} x\,P(X = x), \tag{3.50}$$

with the sum being taken over all elements x of the sample space. For example, if X represents the number rolled on a six-sided die, then

$$EX = \sum_{x=1}^{6} x\,P(X = x) = \sum_{x=1}^{6} x \cdot \frac{1}{6} = 3.5 \tag{3.51}$$

is the expected value of X, which is the average of the integers 1 through 6.


If X is a real-valued continuous random variable with p.d.f. f , then

$$EX = \int_{-\infty}^{\infty} x\,f(x)\,dx. \tag{3.52}$$

For example, if X has the uniform distribution on the interval (0, 1), then its p.d.f. is

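$$f(x) = \begin{cases} 1, & \text{for } 0 < x < 1 \\ 0, & \text{elsewhere,} \end{cases} \tag{3.53}$$

and so

$$EX = \int_{-\infty}^{\infty} x\,f(x)\,dx = \int_0^1 x\,dx = \frac{1}{2}. \tag{3.54}$$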

For a pair of jointly distributed, real-valued continuous random variables X and Y with

joint p.d.f. f (x, y), the expected values of X and Y are given by

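$$EX = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} x\,f(x, y)\,dx\,dy \quad \text{and} \quad EY = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} y\,f(x, y)\,dx\,dy, \tag{3.55}$$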

respectively.

Example 3.18. If you were to pick n > 2 random real numbers from the interval (0, 1), what are the expected values for the smallest and largest of those numbers?

Solution: Let $U_1, \ldots, U_n$ be n continuous random variables, each representing a randomly selected real number from (0, 1), i.e. each has the uniform distribution on (0, 1). Define random variables X and Y by

$$X = \min(U_1, \ldots, U_n) \quad \text{and} \quad Y = \max(U_1, \ldots, U_n).$$

Then it can be shown$^4$ that the joint p.d.f. of X and Y is

$$f(x, y) = \begin{cases} n(n-1)(y-x)^{n-2}, & \text{for } 0 \le x \le y \le 1 \\ 0, & \text{elsewhere.} \end{cases} \tag{3.56}$$

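Thus, the expected value of X is

$$\begin{aligned}
EX &= \int_0^1\int_x^1 n(n-1)\,x\,(y-x)^{n-2}\,dy\,dx \\
&= \int_0^1 \left( n\,x\,(y-x)^{n-1}\,\Big|_{y=x}^{y=1} \right) dx \\
&= \int_0^1 n\,x\,(1-x)^{n-1}\,dx, \quad \text{so integration by parts yields} \\
&= -x(1-x)^n - \frac{1}{n+1}(1-x)^{n+1}\,\Big|_0^1 \\
EX &= \frac{1}{n+1},
\end{aligned}$$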

$^4$ See Ch. 6 in HOEL, PORT and STONE.


and similarly (see Exercise 3) it can be shown that

$$EY = \int_0^1\int_0^y n(n-1)\,y\,(y-x)^{n-2}\,dx\,dy = \frac{n}{n+1}.$$

So, for example, if you were to repeatedly take samples of n = 3 random real numbers from (0, 1), and each time store the minimum and maximum values in the sample, then the average of the minimums would approach $\frac{1}{4}$ and the average of the maximums would approach $\frac{3}{4}$ as the number of samples grows. It would be relatively simple (see Exercise 4) to write a computer program to test this.
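One possible shape for such a program is sketched below. It is only an illustration: the function name `average_min_max`, the number of samples and the seed are assumptions, and any language would do equally well.

```python
import random


def average_min_max(n, num_samples, seed=0):
    """Average the minimum and maximum of n uniform(0, 1) draws over many samples."""
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    total_min = 0.0
    total_max = 0.0
    for _ in range(num_samples):
        sample = [rng.random() for _ in range(n)]
        total_min += min(sample)
        total_max += max(sample)
    return total_min / num_samples, total_max / num_samples


n = 3
avg_min, avg_max = average_min_max(n, num_samples=100_000)
# Example 3.18 predicts EX = 1/(n+1) and EY = n/(n+1).
print(f"average min = {avg_min:.4f} (predicted {1 / (n + 1):.4f})")
print(f"average max = {avg_max:.4f} (predicted {n / (n + 1):.4f})")
```

Changing `n` in the call gives the corresponding check for other sample sizes.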

Exercises

B

1. Evaluate the integral $\int_{-\infty}^{\infty} e^{-x^2}\,dx$ using anything you have learned so far.

2. For σ > 0 and µ > 0, evaluate $\int_{-\infty}^{\infty} \frac{1}{\sigma\sqrt{2\pi}}\,e^{-(x-\mu)^2/2\sigma^2}\,dx$.

3. Show that $EY = \frac{n}{n+1}$ in Example 3.18.

C

4. Write a computer program (in the language of your choice) that verifies the results in Example 3.18 for the case n = 3 by taking large numbers of samples.

5. Repeat Exercise 4 for the case when n = 4.

6. For continuous random variables X, Y with joint p.d.f. f(x, y), define the second moments $E(X^2)$ and $E(Y^2)$ by

$$E(X^2) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} x^2 f(x, y)\,dx\,dy \quad \text{and} \quad E(Y^2) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} y^2 f(x, y)\,dx\,dy,$$

and the variances Var(X) and Var(Y) by

$$\mathrm{Var}(X) = E(X^2) - (EX)^2 \quad \text{and} \quad \mathrm{Var}(Y) = E(Y^2) - (EY)^2.$$

Find Var(X) and Var(Y) for X and Y as in Example 3.18.

7. Continuing Exercise 6, the correlation ρ between X and Y is defined as

$$\rho = \frac{E(XY) - (EX)(EY)}{\sqrt{\mathrm{Var}(X)\,\mathrm{Var}(Y)}},$$

where $E(XY) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} xy\,f(x, y)\,dx\,dy$. Find ρ for X and Y as in Example 3.18.

(Note: The quantity E(XY) − (EX)(EY) is called the covariance of X and Y.)

8. In Example 3.17 would the answer change if the interval (0, 100) is used instead of (0, 1)?

Explain.

4 Line and Surface Integrals

4.1 Line Integrals

In single-variable calculus you learned how to integrate a real-valued function f(x) over an interval [a, b] in $\mathbb{R}^1$. This integral (usually called a Riemann integral) can be thought of as an integral over a path in $\mathbb{R}^1$, since an interval (or collection of intervals) is really the only kind of “path” in $\mathbb{R}^1$. You may also recall that if f(x) represented the force applied along the x-axis to an object at position x in [a, b], then the work W done in moving that object from position x = a to x = b was defined as the integral:

$$W = \int_a^b f(x)\,dx$$

In this section, we will see how to define the integral of a function (either real-valued or vector-valued) of two variables over a general path (i.e. a curve) in $\mathbb{R}^2$. This definition will be motivated by the physical notion of work. We will begin with real-valued functions of two variables.

In physics, the intuitive idea of work is that

Work = Force × Distance .

Suppose that we want to find the total amount W of work done in moving an object along a curve C in $\mathbb{R}^2$ with a smooth parametrization x = x(t), y = y(t), a ≤ t ≤ b, with a force f(x, y) which varies with the position (x, y) of the object and is applied in the direction of motion along C (see Figure 4.1.1 below).

[Figure 4.1.1: Curve C : x = x(t), y = y(t) for t in [a, b], with the piece between $t = t_i$ and $t = t_{i+1}$ of length $\Delta s_i \approx \sqrt{\Delta x_i^2 + \Delta y_i^2}$]

We will assume for now that the function f(x, y) is continuous and real-valued, so we only consider the magnitude of the force. Partition the interval [a, b] as follows:

$$a = t_0 < t_1 < t_2 < \cdots < t_{n-1} < t_n = b, \quad \text{for some integer } n \ge 2$$


As we can see from Figure 4.1.1, over a typical subinterval $[t_i, t_{i+1}]$ the distance $\Delta s_i$ traveled along the curve is approximately $\sqrt{\Delta x_i^2 + \Delta y_i^2}$, by the Pythagorean Theorem. Thus, if the subinterval is small enough then the work done in moving the object along that piece of the curve is approximately

$$\text{Force} \times \text{Distance} \approx f(x_{i*}, y_{i*})\sqrt{\Delta x_i^2 + \Delta y_i^2}, \tag{4.1}$$

where $(x_{i*}, y_{i*}) = (x(t_{i*}), y(t_{i*}))$ for some $t_{i*}$ in $[t_i, t_{i+1}]$, and so

$$W \approx \sum_{i=0}^{n-1} f(x_{i*}, y_{i*})\sqrt{\Delta x_i^2 + \Delta y_i^2} \tag{4.2}$$

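is approximately the total amount of work done over the entire curve. But since

$$\sqrt{\Delta x_i^2 + \Delta y_i^2} = \sqrt{\left(\frac{\Delta x_i}{\Delta t_i}\right)^2 + \left(\frac{\Delta y_i}{\Delta t_i}\right)^2}\;\Delta t_i,$$

where $\Delta t_i = t_{i+1} - t_i$, then

$$W \approx \sum_{i=0}^{n-1} f(x_{i*}, y_{i*})\sqrt{\left(\frac{\Delta x_i}{\Delta t_i}\right)^2 + \left(\frac{\Delta y_i}{\Delta t_i}\right)^2}\;\Delta t_i. \tag{4.3}$$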

Taking the limit of that sum as the length of the largest subinterval goes to 0, the sum over all subintervals becomes the integral from t = a to t = b, $\frac{\Delta x_i}{\Delta t_i}$ and $\frac{\Delta y_i}{\Delta t_i}$ become $x'(t)$ and $y'(t)$, respectively, and $f(x_{i*}, y_{i*})$ becomes f(x(t), y(t)), so that

$$W = \int_a^b f(x(t), y(t))\sqrt{x'(t)^2 + y'(t)^2}\,dt. \tag{4.4}$$

The integral on the right side of the above equation gives us our idea of how to define, for any real-valued function f(x, y), the integral of f(x, y) along the curve C, called a line integral:

Definition 4.1. For a real-valued function f(x, y) and a curve C in $\mathbb{R}^2$, parametrized by x = x(t), y = y(t), a ≤ t ≤ b, the line integral of f(x, y) along C with respect to arc length s is

$$\int_C f(x, y)\,ds = \int_a^b f(x(t), y(t))\sqrt{x'(t)^2 + y'(t)^2}\,dt. \tag{4.5}$$

The symbol ds is the differential of the arc length function

$$s = s(t) = \int_a^t \sqrt{x'(u)^2 + y'(u)^2}\,du, \tag{4.6}$$

which you may recognize from Section 1.9 as the length of the curve C over the interval [a, t], for all t in [a, b]. That is,

$$ds = s'(t)\,dt = \sqrt{x'(t)^2 + y'(t)^2}\,dt, \tag{4.7}$$

by the Fundamental Theorem of Calculus.

For a general real-valued function f(x, y), what does the line integral $\int_C f(x, y)\,ds$ represent? The preceding discussion of ds gives us a clue. You can think of differentials as infinitesimal lengths. So if you think of f(x, y) as the height of a picket fence along C, then f(x, y) ds can be thought of as approximately the area of a section of that fence over some infinitesimally small section of the curve, and thus the line integral $\int_C f(x, y)\,ds$ is the total area of that picket fence (see Figure 4.1.2).

[Figure 4.1.2: Area of shaded rectangle = height × width ≈ f(x, y) ds]

Example 4.1. Use a line integral to show that the lateral surface area A of a right circular

cylinder of radius r and height h is 2πrh.

[Figure 4.1.3: Right circular cylinder of radius r and height h = f(x, y) over the base circle $C : x^2 + y^2 = r^2$]

Solution: We will use the right circular cylinder with base circle C given by $x^2 + y^2 = r^2$ and with height h in the positive z direction (see Figure 4.1.3). Parametrize C as follows:

$$x = x(t) = r\cos t, \quad y = y(t) = r\sin t, \quad 0 \le t \le 2\pi$$

Let f(x, y) = h for all (x, y). Then

$$\begin{aligned}
A = \int_C f(x, y)\,ds &= \int_a^b f(x(t), y(t))\sqrt{x'(t)^2 + y'(t)^2}\,dt \\
&= \int_0^{2\pi} h\sqrt{(-r\sin t)^2 + (r\cos t)^2}\,dt \\
&= h\int_0^{2\pi} r\sqrt{\sin^2 t + \cos^2 t}\,dt \\
&= rh\int_0^{2\pi} 1\,dt = 2\pi r h
\end{aligned}$$


Note in Example 4.1 that if we had traversed the circle C twice, i.e. let t vary from 0 to 4π, then we would have gotten an area of 4πrh, i.e. twice the desired area, even though the curve itself is still the same (namely, a circle of radius r). Also, notice that we traversed the circle in the counter-clockwise direction. If we had gone in the clockwise direction, using the parametrization

$$x = x(t) = r\cos(2\pi - t), \quad y = y(t) = r\sin(2\pi - t), \quad 0 \le t \le 2\pi, \tag{4.8}$$

then it is easy to verify (see Exercise 12) that the value of the line integral is unchanged.
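These observations about Example 4.1 can be checked numerically. The sketch below approximates formula (4.5) with a midpoint rule; the helper name `line_integral_ds`, the step count, and the sample values r = 2 and h = 5 are illustrative assumptions, and the derivatives of the parametrization are supplied by hand.

```python
import math


def line_integral_ds(f, x, y, dx, dy, a, b, steps=10_000):
    """Midpoint-rule approximation of the line integral of f along C w.r.t. arc length.

    x, y are the parametrization; dx, dy are their derivatives with respect to t.
    """
    h = (b - a) / steps
    total = 0.0
    for i in range(steps):
        t = a + (i + 0.5) * h
        # f(x(t), y(t)) * sqrt(x'(t)^2 + y'(t)^2) * dt, as in formula (4.5)
        total += f(x(t), y(t)) * math.hypot(dx(t), dy(t)) * h
    return total


r, height = 2.0, 5.0
fence = lambda x, y: height  # constant "picket fence" height h

# Counter-clockwise circle, traversed once and then twice.
ccw = dict(x=lambda t: r * math.cos(t), y=lambda t: r * math.sin(t),
           dx=lambda t: -r * math.sin(t), dy=lambda t: r * math.cos(t))
once = line_integral_ds(fence, a=0.0, b=2 * math.pi, **ccw)
twice = line_integral_ds(fence, a=0.0, b=4 * math.pi, **ccw)

# Clockwise circle from formula (4.8).
cw = dict(x=lambda t: r * math.cos(2 * math.pi - t),
          y=lambda t: r * math.sin(2 * math.pi - t),
          dx=lambda t: r * math.sin(2 * math.pi - t),
          dy=lambda t: -r * math.cos(2 * math.pi - t))
clockwise = line_integral_ds(fence, a=0.0, b=2 * math.pi, **cw)

print(once, twice, clockwise, 2 * math.pi * r * height)
```

The first and third values should agree with 2πrh, and the second should be twice as large, which matches the remarks above.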

In general, it can be shown (see Exercise 15) that reversing the direction in which a curve C is traversed leaves $\int_C f(x, y)\,ds$ unchanged, for any f(x, y). If a curve C has a parametrization x = x(t), y = y(t), a ≤ t ≤ b, then denote by −C the same curve as C but traversed in the opposite direction. Then −C is parametrized by

$$x = x(a + b - t), \quad y = y(a + b - t), \quad a \le t \le b, \tag{4.9}$$

and we have

$$\int_C f(x, y)\,ds = \int_{-C} f(x, y)\,ds. \tag{4.10}$$

Notice that our definition of the line integral was with respect to the arc length parameter s. We can also define

$$\int_C f(x, y)\,dx = \int_a^b f(x(t), y(t))\,x'(t)\,dt \tag{4.11}$$

as the line integral of f(x, y) along C with respect to x, and

$$\int_C f(x, y)\,dy = \int_a^b f(x(t), y(t))\,y'(t)\,dt \tag{4.12}$$

as the line integral of f(x, y) along C with respect to y.

In the derivation of the formula for a line integral, we used the idea of work as force multiplied by distance. However, we know that force is actually a vector. So it would be helpful to develop a vector form for a line integral. For this, suppose that we have a function $\mathbf{f}(x, y)$ defined on $\mathbb{R}^2$ by

$$\mathbf{f}(x, y) = P(x, y)\,\mathbf{i} + Q(x, y)\,\mathbf{j}$$

for some continuous real-valued functions P(x, y) and Q(x, y) on $\mathbb{R}^2$. Such a function $\mathbf{f}$ is called a vector field on $\mathbb{R}^2$. It is defined at points in $\mathbb{R}^2$, and its values are vectors in $\mathbb{R}^2$. For a curve C with a smooth parametrization x = x(t), y = y(t), a ≤ t ≤ b, let

$$\mathbf{r}(t) = x(t)\,\mathbf{i} + y(t)\,\mathbf{j}$$


be the position vector for a point (x(t), y(t)) on C. Then $\mathbf{r}'(t) = x'(t)\,\mathbf{i} + y'(t)\,\mathbf{j}$ and so

$$\begin{aligned}
\int_C P(x, y)\,dx + \int_C Q(x, y)\,dy &= \int_a^b P(x(t), y(t))\,x'(t)\,dt + \int_a^b Q(x(t), y(t))\,y'(t)\,dt \\
&= \int_a^b \bigl(P(x(t), y(t))\,x'(t) + Q(x(t), y(t))\,y'(t)\bigr)\,dt \\
&= \int_a^b \mathbf{f}(x(t), y(t)) \cdot \mathbf{r}'(t)\,dt
\end{aligned}$$

by definition of $\mathbf{f}(x, y)$. Notice that the function $\mathbf{f}(x(t), y(t)) \cdot \mathbf{r}'(t)$ is a real-valued function on [a, b], so the last integral on the right looks somewhat similar to our earlier definition of a line integral. This leads us to the following definition:

Definition 4.2. For a vector field $\mathbf{f}(x, y) = P(x, y)\,\mathbf{i} + Q(x, y)\,\mathbf{j}$ and a curve C with a smooth parametrization x = x(t), y = y(t), a ≤ t ≤ b, the line integral of $\mathbf{f}$ along C is

$$\int_C \mathbf{f} \cdot d\mathbf{r} = \int_C P(x, y)\,dx + \int_C Q(x, y)\,dy \tag{4.13}$$

$$= \int_a^b \mathbf{f}(x(t), y(t)) \cdot \mathbf{r}'(t)\,dt, \tag{4.14}$$

where $\mathbf{r}(t) = x(t)\,\mathbf{i} + y(t)\,\mathbf{j}$ is the position vector for points on C.

We use the notation $d\mathbf{r} = \mathbf{r}'(t)\,dt = dx\,\mathbf{i} + dy\,\mathbf{j}$ to denote the differential of the vector-valued function $\mathbf{r}$. The line integral in Definition 4.2 is often called a line integral of a vector field to distinguish it from the line integral in Definition 4.1 which is called a line integral of a scalar field. For convenience we will often write

$$\int_C P(x, y)\,dx + \int_C Q(x, y)\,dy = \int_C P(x, y)\,dx + Q(x, y)\,dy,$$

where it is understood that the line integral along C is being applied to both P and Q. The quantity P(x, y) dx + Q(x, y) dy is known as a differential form. For a real-valued function F(x, y), the differential of F is $dF = \frac{\partial F}{\partial x}\,dx + \frac{\partial F}{\partial y}\,dy$. A differential form P(x, y) dx + Q(x, y) dy is called exact if it equals dF for some function F(x, y).

Recall that if the points on a curve C have position vector $\mathbf{r}(t) = x(t)\,\mathbf{i} + y(t)\,\mathbf{j}$, then $\mathbf{r}'(t)$ is a tangent vector to C at the point (x(t), y(t)) in the direction of increasing t (which we call the direction of C). Since C is a smooth curve, then $\mathbf{r}'(t) \ne \mathbf{0}$ on [a, b] and hence

$$\mathbf{T}(t) = \frac{\mathbf{r}'(t)}{\|\mathbf{r}'(t)\|}$$

is the unit tangent vector to C at (x(t), y(t)). Putting Definitions 4.1 and 4.2 together we get the following theorem:


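Theorem 4.1. For a vector field $\mathbf{f}(x, y) = P(x, y)\,\mathbf{i} + Q(x, y)\,\mathbf{j}$ and a curve C with a smooth parametrization x = x(t), y = y(t), a ≤ t ≤ b and position vector $\mathbf{r}(t) = x(t)\,\mathbf{i} + y(t)\,\mathbf{j}$,

$$\int_C \mathbf{f} \cdot d\mathbf{r} = \int_C \mathbf{f} \cdot \mathbf{T}\,ds, \tag{4.15}$$

where $\mathbf{T}(t) = \frac{\mathbf{r}'(t)}{\|\mathbf{r}'(t)\|}$ is the unit tangent vector to C at (x(t), y(t)).

If the vector field $\mathbf{f}(x, y)$ represents the force moving an object along a curve C, then the work W done by this force is

$$W = \int_C \mathbf{f} \cdot \mathbf{T}\,ds = \int_C \mathbf{f} \cdot d\mathbf{r}. \tag{4.16}$$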
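The step that combines Definitions 4.1 and 4.2 into formula (4.15) can be written out explicitly. The following is a short sketch of that computation, assuming (as in the theorem) that the parametrization is smooth, so that the length of r′(t) is never zero and ds is given by formula (4.7):

```latex
\begin{align*}
\int_C \mathbf{f}\cdot d\mathbf{r}
  &= \int_a^b \mathbf{f}(x(t),y(t))\cdot \mathbf{r}'(t)\,dt
     && \text{(Definition 4.2)} \\
  &= \int_a^b \mathbf{f}(x(t),y(t))\cdot
     \frac{\mathbf{r}'(t)}{\|\mathbf{r}'(t)\|}\,\|\mathbf{r}'(t)\|\,dt \\
  &= \int_a^b \bigl(\mathbf{f}\cdot\mathbf{T}\bigr)(t)\,
     \sqrt{x'(t)^2+y'(t)^2}\,dt
     && \text{(definition of } \mathbf{T}\text{)} \\
  &= \int_C \mathbf{f}\cdot\mathbf{T}\,ds
     && \text{(Definition 4.1)}
\end{align*}
```

In the last step, Definition 4.1 is applied to the real-valued function obtained by taking the dot product of the field with the unit tangent vector at each point of C.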

Example 4.2. Evaluate $\int_C (x^2 + y^2)\,dx + 2xy\,dy$, where:

(a) $C : x = t,\ y = 2t,\ 0 \le t \le 1$

(b) $C : x = t,\ y = 2t^2,\ 0 \le t \le 1$

[Figure 4.1.4: The two curves from (0, 0) to (1, 2)]

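Solution: Figure 4.1.4 shows both curves.

(a) Since $x'(t) = 1$ and $y'(t) = 2$, then

$$\begin{aligned}
\int_C (x^2 + y^2)\,dx + 2xy\,dy &= \int_0^1 \bigl[(x(t)^2 + y(t)^2)\,x'(t) + 2x(t)\,y(t)\,y'(t)\bigr]\,dt \\
&= \int_0^1 \bigl[(t^2 + 4t^2)(1) + 2t(2t)(2)\bigr]\,dt \\
&= \int_0^1 13t^2\,dt \\
&= \frac{13t^3}{3}\,\Big|_0^1 = \frac{13}{3}
\end{aligned}$$

(b) Since $x'(t) = 1$ and $y'(t) = 4t$, then

$$\begin{aligned}
\int_C (x^2 + y^2)\,dx + 2xy\,dy &= \int_0^1 \bigl[(x(t)^2 + y(t)^2)\,x'(t) + 2x(t)\,y(t)\,y'(t)\bigr]\,dt \\
&= \int_0^1 \bigl[(t^2 + 4t^4)(1) + 2t(2t^2)(4t)\bigr]\,dt \\
&= \int_0^1 (t^2 + 20t^4)\,dt \\
&= \frac{t^3}{3} + 4t^5\,\Big|_0^1 = \frac{1}{3} + 4 = \frac{13}{3}
\end{aligned}$$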


So in both cases, if the vector field $\mathbf{f}(x, y) = (x^2 + y^2)\,\mathbf{i} + 2xy\,\mathbf{j}$ represents the force moving an object from (0, 0) to (1, 2) along the given curve C, then the work done is $\frac{13}{3}$. This may lead you to think that work (and more generally, the line integral of a vector field) is independent of the path taken. However, as we will see in the next section, this is not always the case.
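The two computations in Example 4.2 follow the same recipe, which can be sketched in code. This is a numerical illustration only: the helper name `line_integral_dr`, the midpoint rule and the step count are assumptions, and the derivatives x′(t), y′(t) are entered by hand.

```python
def line_integral_dr(P, Q, x, y, dx, dy, a, b, steps=10_000):
    """Midpoint-rule approximation of the line integral of P dx + Q dy along C.

    x, y are the parametrization; dx, dy are their derivatives with respect to t.
    """
    h = (b - a) / steps
    total = 0.0
    for i in range(steps):
        t = a + (i + 0.5) * h
        xt, yt = x(t), y(t)
        # P(x(t), y(t)) x'(t) + Q(x(t), y(t)) y'(t), as in formula (4.14)
        total += (P(xt, yt) * dx(t) + Q(xt, yt) * dy(t)) * h
    return total


# Components of the vector field from Example 4.2.
P = lambda x, y: x**2 + y**2
Q = lambda x, y: 2 * x * y

# (a) the line x = t, y = 2t
line_value = line_integral_dr(P, Q, lambda t: t, lambda t: 2 * t,
                              lambda t: 1.0, lambda t: 2.0, 0.0, 1.0)
# (b) the parabola x = t, y = 2t^2
parabola_value = line_integral_dr(P, Q, lambda t: t, lambda t: 2 * t * t,
                                  lambda t: 1.0, lambda t: 4 * t, 0.0, 1.0)

print(line_value, parabola_value, 13 / 3)
```

Both approximations should agree with 13/3 to several decimal places.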

Although we defined line integrals over a single smooth curve, if C is a piecewise smooth curve, that is

$$C = C_1 \cup C_2 \cup \ldots \cup C_n$$

is the union of smooth curves $C_1, \ldots, C_n$, then we can define

$$\int_C \mathbf{f} \cdot d\mathbf{r} = \int_{C_1} \mathbf{f} \cdot d\mathbf{r}_1 + \int_{C_2} \mathbf{f} \cdot d\mathbf{r}_2 + \ldots + \int_{C_n} \mathbf{f} \cdot d\mathbf{r}_n$$

where each $\mathbf{r}_i$ is the position vector of the curve $C_i$.

Example 4.3. Evaluate $\int_C (x^2 + y^2)\,dx + 2xy\,dy$, where C is the polygonal path from (0, 0) to (0, 2) to (1, 2).

[Figure 4.1.5: Polygonal path $C = C_1 \cup C_2$ from (0, 0) to (0, 2) to (1, 2)]

Solution: Write $C = C_1 \cup C_2$, where $C_1$ is the curve given by x = 0, y = t, 0 ≤ t ≤ 2 and $C_2$ is the curve given by x = t, y = 2, 0 ≤ t ≤ 1 (see Figure 4.1.5). Then

$$\begin{aligned}
\int_C (x^2 + y^2)\,dx + 2xy\,dy &= \int_{C_1} (x^2 + y^2)\,dx + 2xy\,dy + \int_{C_2} (x^2 + y^2)\,dx + 2xy\,dy \\
&= \int_0^2 \bigl[(0^2 + t^2)(0) + 2(0)t(1)\bigr]\,dt + \int_0^1 \bigl[(t^2 + 4)(1) + 2t(2)(0)\bigr]\,dt \\
&= \int_0^2 0\,dt + \int_0^1 (t^2 + 4)\,dt \\
&= \frac{t^3}{3} + 4t\,\Big|_0^1 = \frac{1}{3} + 4 = \frac{13}{3}
\end{aligned}$$

Line integral notation varies quite a bit. For example, in physics it is common to see the notation $\int_a^b \mathbf{f} \cdot d\mathbf{l}$, where it is understood that the limits of integration a and b are for the underlying parameter t of the curve, and the letter l signifies length. Also, the formulation $\int_C \mathbf{f} \cdot \mathbf{T}\,ds$ from Theorem 4.1 is often preferred in physics since it emphasizes the idea of integrating the tangential component $\mathbf{f} \cdot \mathbf{T}$ of $\mathbf{f}$ in the direction of $\mathbf{T}$ (i.e. in the direction of C), which is a useful physical interpretation of line integrals.


Exercises

A

For Exercises 1-4, calculate $\int_C f(x, y)\,ds$ for the given function f(x, y) and curve C.

1. $f(x, y) = xy$; $C : x = \cos t,\ y = \sin t,\ 0 \le t \le \pi/2$

2. $f(x, y) = \frac{x}{x^2 + 1}$; $C : x = t,\ y = 0,\ 0 \le t \le 1$

3. $f(x, y) = 2x + y$; C: polygonal path from (0, 0) to (3, 0) to (3, 2)

4. $f(x, y) = x + y^2$; C: path from (2, 0) counterclockwise along the circle $x^2 + y^2 = 4$ to the point (−2, 0) and then back to (2, 0) along the x-axis

5. Use a line integral to find the lateral surface area of the part of the cylinder $x^2 + y^2 = 4$ below the plane x + 2y + z = 6 and above the xy-plane.

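For Exercises 6-11, calculate $\int_C \mathbf{f} \cdot d\mathbf{r}$ for the given vector field $\mathbf{f}(x, y)$ and curve C.

6. $\mathbf{f}(x, y) = \mathbf{i} - \mathbf{j}$; $C : x = 3t,\ y = 2t,\ 0 \le t \le 1$

7. $\mathbf{f}(x, y) = y\,\mathbf{i} - x\,\mathbf{j}$; $C : x = \cos t,\ y = \sin t,\ 0 \le t \le 2\pi$

8. $\mathbf{f}(x, y) = x\,\mathbf{i} + y\,\mathbf{j}$; $C : x = \cos t,\ y = \sin t,\ 0 \le t \le 2\pi$

9. $\mathbf{f}(x, y) = (x^2 - y)\,\mathbf{i} + (x - y^2)\,\mathbf{j}$; $C : x = \cos t,\ y = \sin t,\ 0 \le t \le 2\pi$

10. $\mathbf{f}(x, y) = xy^2\,\mathbf{i} + xy^3\,\mathbf{j}$; C : the polygonal path from (0, 0) to (1, 0) to (0, 1) to (0, 0)

11. $\mathbf{f}(x, y) = (x^2 + y^2)\,\mathbf{i}$; $C : x = 2 + \cos t,\ y = \sin t,\ 0 \le t \le 2\pi$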

B

12. Verify that the value of the line integral in Example 4.1 is unchanged when using the

parametrization of the circle C given in formulas (4.8).

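13. Show that if $\mathbf{f} \perp \mathbf{r}'(t)$ at each point $\mathbf{r}(t)$ along a smooth curve C, then $\int_C \mathbf{f} \cdot d\mathbf{r} = 0$.

14. Show that if $\mathbf{f}$ points in the same direction as $\mathbf{r}'(t)$ at each point $\mathbf{r}(t)$ along a smooth curve C, then $\int_C \mathbf{f} \cdot d\mathbf{r} = \int_C \|\mathbf{f}\|\,ds$.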

C

15. Prove that $\int_C f(x, y)\,ds = \int_{-C} f(x, y)\,ds$. (Hint: Use formulas (4.9).)

16. Let C be a smooth curve with arc length L, and suppose that $\mathbf{f}(x, y) = P(x, y)\,\mathbf{i} + Q(x, y)\,\mathbf{j}$ is a vector field such that $\|\mathbf{f}(x, y)\| \le M$ for all (x, y) on C. Show that $\left|\int_C \mathbf{f} \cdot d\mathbf{r}\right| \le ML$. (Hint: Recall that $\left|\int_a^b g(x)\,dx\right| \le \int_a^b |g(x)|\,dx$ for Riemann integrals.)

17. Prove that the Riemann integral $\int_a^b f(x)\,dx$ is a special case of a line integral.


4.2 Properties of Line Integrals

We know from the previous section that for line integrals of real-valued functions (scalar fields), reversing the direction in which the integral is taken along a curve does not change the value of the line integral:

$$\int_C f(x, y)\,ds = \int_{-C} f(x, y)\,ds \tag{4.17}$$

For line integrals of vector fields, however, the value does change. To see this, let $\mathbf{f}(x, y) = P(x, y)\,\mathbf{i} + Q(x, y)\,\mathbf{j}$ be a vector field, with P and Q continuously differentiable functions. Let C be a smooth curve parametrized by x = x(t), y = y(t), a ≤ t ≤ b, with position vector $\mathbf{r}(t) = x(t)\,\mathbf{i} + y(t)\,\mathbf{j}$ (we will usually abbreviate this by saying that $C : \mathbf{r}(t) = x(t)\,\mathbf{i} + y(t)\,\mathbf{j}$ is a smooth curve). We know that the curve −C traversed in the opposite direction is parametrized by x = x(a + b − t), y = y(a + b − t), a ≤ t ≤ b. Then

$$\begin{aligned}
\int_{-C} P(x, y)\,dx &= \int_a^b P(x(a+b-t), y(a+b-t))\,\frac{d}{dt}\bigl(x(a+b-t)\bigr)\,dt \\
&= \int_a^b P(x(a+b-t), y(a+b-t))\,\bigl(-x'(a+b-t)\bigr)\,dt \quad \text{(by the Chain Rule)} \\
&= \int_b^a P(x(u), y(u))\,\bigl(-x'(u)\bigr)\,(-du) \quad \text{(by letting } u = a + b - t\text{)} \\
&= \int_b^a P(x(u), y(u))\,x'(u)\,du \\
&= -\int_a^b P(x(u), y(u))\,x'(u)\,du, \quad \text{since } \int_b^a = -\int_a^b, \text{ so} \\
\int_{-C} P(x, y)\,dx &= -\int_C P(x, y)\,dx
\end{aligned}$$

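since we are just using a different letter (u) for the line integral along C. A similar argument shows that

$$\int_{-C} Q(x, y)\,dy = -\int_C Q(x, y)\,dy,$$

and hence

$$\begin{aligned}
\int_{-C} \mathbf{f} \cdot d\mathbf{r} &= \int_{-C} P(x, y)\,dx + \int_{-C} Q(x, y)\,dy \\
&= -\int_C P(x, y)\,dx + -\int_C Q(x, y)\,dy \\
&= -\left(\int_C P(x, y)\,dx + \int_C Q(x, y)\,dy\right) \\
\int_{-C} \mathbf{f} \cdot d\mathbf{r} &= -\int_C \mathbf{f} \cdot d\mathbf{r}.
\end{aligned} \tag{4.18}$$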


The above formula can be interpreted in terms of the work done by a force f(x, y) (treated

as a vector) moving an object along a curve C: the total work performed moving the object

along C from its initial point to its terminal point, and then back to the initial point moving

backwards along the same path, is zero. This is because when force is considered as a vector,

direction is accounted for.
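The contrast between formulas (4.17) and (4.18) can be seen numerically. The sketch below is an illustration under assumptions: it reuses the field and the parabola from Example 4.2, builds −C from the reverse parametrization x(a + b − t), y(a + b − t), and approximates both kinds of line integral with a midpoint rule. The helper names and the step count are illustrative.

```python
import math


def line_integral_dr(P, Q, x, y, dx, dy, a, b, steps=10_000):
    """Midpoint-rule approximation of the line integral of P dx + Q dy along C."""
    h = (b - a) / steps
    total = 0.0
    for i in range(steps):
        t = a + (i + 0.5) * h
        total += (P(x(t), y(t)) * dx(t) + Q(x(t), y(t)) * dy(t)) * h
    return total


def line_integral_ds(f, x, y, dx, dy, a, b, steps=10_000):
    """Midpoint-rule approximation of the line integral of f with respect to arc length."""
    h = (b - a) / steps
    total = 0.0
    for i in range(steps):
        t = a + (i + 0.5) * h
        total += f(x(t), y(t)) * math.hypot(dx(t), dy(t)) * h
    return total


P = lambda x, y: x**2 + y**2
Q = lambda x, y: 2 * x * y

# C : x = t, y = 2t^2, 0 <= t <= 1
forward = dict(x=lambda t: t, y=lambda t: 2 * t * t,
               dx=lambda t: 1.0, dy=lambda t: 4 * t, a=0.0, b=1.0)
# -C : replace t by a + b - t = 1 - t; the Chain Rule flips the sign of the derivatives
backward = dict(x=lambda t: 1 - t, y=lambda t: 2 * (1 - t) ** 2,
                dx=lambda t: -1.0, dy=lambda t: -4 * (1 - t), a=0.0, b=1.0)

work_forward = line_integral_dr(P, Q, **forward)
work_backward = line_integral_dr(P, Q, **backward)
scalar_forward = line_integral_ds(P, **forward)
scalar_backward = line_integral_ds(P, **backward)

print(work_forward, work_backward)      # opposite signs
print(scalar_forward, scalar_backward)  # equal
```

The vector line integrals should be negatives of each other, so the round trip along C and back along −C does zero total work, while the scalar line integrals with respect to arc length should coincide.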

The preceding discussion shows the importance of always taking the direction of the curve

into account when using line integrals of vector ﬁelds. For this reason, the curves in line

integrals are sometimes referred to as directed curves or oriented curves.

Recall that our definition of a line integral required that we have a parametrization x = x(t), y = y(t), a ≤ t ≤ b for the curve C. But as we know, any curve has infinitely many parametrizations. So could we get a different value for a line integral using some other parametrization of C, say, $x = \tilde{x}(u)$, $y = \tilde{y}(u)$, c ≤ u ≤ d? If so, this would mean that our definition is not well-defined. Luckily, it turns out that the value of a line integral of a vector field is unchanged as long as the direction of the curve C is preserved by whatever parametrization is chosen:

Theorem 4.2. Let $\mathbf{f}(x, y) = P(x, y)\,\mathbf{i} + Q(x, y)\,\mathbf{j}$ be a vector field, and let C be a smooth curve parametrized by x = x(t), y = y(t), a ≤ t ≤ b. Suppose that t = α(u) for c ≤ u ≤ d, such that a = α(c), b = α(d), and α′(u) > 0 on the open interval (c, d) (i.e. α(u) is strictly increasing on [c, d]). Then $\int_C \mathbf{f} \cdot d\mathbf{r}$ has the same value for the parametrizations x = x(t), y = y(t), a ≤ t ≤ b and $x = \tilde{x}(u) = x(\alpha(u))$, $y = \tilde{y}(u) = y(\alpha(u))$, c ≤ u ≤ d.

Proof: Since α(u) is strictly increasing and maps [c, d] onto [a, b], then we know that t = α(u) has an inverse function $u = \alpha^{-1}(t)$ defined on [a, b] such that $c = \alpha^{-1}(a)$, $d = \alpha^{-1}(b)$, and $\frac{du}{dt} = \frac{1}{\alpha'(u)}$. Also, $dt = \alpha'(u)\,du$, and by the Chain Rule

$$\tilde{x}'(u) = \frac{d\tilde{x}}{du} = \frac{d}{du}\bigl(x(\alpha(u))\bigr) = \frac{dx}{dt}\,\frac{dt}{du} = x'(t)\,\alpha'(u) \quad \Rightarrow \quad x'(t) = \frac{\tilde{x}'(u)}{\alpha'(u)}$$

so making the substitution t = α(u) gives

$$\begin{aligned}
\int_a^b P(x(t), y(t))\,x'(t)\,dt &= \int_{\alpha^{-1}(a)}^{\alpha^{-1}(b)} P(x(\alpha(u)), y(\alpha(u)))\,\frac{\tilde{x}'(u)}{\alpha'(u)}\,(\alpha'(u)\,du) \\
&= \int_c^d P(\tilde{x}(u), \tilde{y}(u))\,\tilde{x}'(u)\,du,
\end{aligned}$$

which shows that $\int_C P(x, y)\,dx$ has the same value for both parametrizations. A similar argument shows that $\int_C Q(x, y)\,dy$ has the same value for both parametrizations, and hence $\int_C \mathbf{f} \cdot d\mathbf{r}$ has the same value. QED

Notice that the condition α′(u) > 0 in Theorem 4.2 means that the two parametrizations move along C in the same direction. That was not the case with the “reverse” parametrization for −C: for u = a + b − t we have t = α(u) = a + b − u ⇒ α′(u) = −1 < 0.


Example 4.4. Evaluate the line integral $\int_C (x^2 + y^2)\,dx + 2xy\,dy$ from Example 4.2, Section 4.1, along the curve $C : x = t,\ y = 2t^2,\ 0 \le t \le 1$, where t = sin u for 0 ≤ u ≤ π/2.

Solution: First, we notice that 0 = sin 0, 1 = sin(π/2), and $\frac{dt}{du} = \cos u > 0$ on (0, π/2). So by Theorem 4.2 we know that if C is parametrized by

$$x = \sin u, \quad y = 2\sin^2 u, \quad 0 \le u \le \pi/2$$

then $\int_C (x^2 + y^2)\,dx + 2xy\,dy$ should have the same value as we found in Example 4.2, namely $\frac{13}{3}$. And we can indeed verify this:

$$\begin{aligned}
\int_C (x^2 + y^2)\,dx + 2xy\,dy &= \int_0^{\pi/2} \bigl[(\sin^2 u + (2\sin^2 u)^2)\cos u + 2(\sin u)(2\sin^2 u)\,4\sin u\cos u\bigr]\,du \\
&= \int_0^{\pi/2} \bigl(\sin^2 u + 20\sin^4 u\bigr)\cos u\,du \\
&= \frac{\sin^3 u}{3} + 4\sin^5 u\,\Big|_0^{\pi/2} \\
&= \frac{1}{3} + 4 = \frac{13}{3}
\end{aligned}$$

In other words, the line integral is unchanged whether t or u is the parameter for C.

By a closed curve, we mean a curve C whose initial point and terminal point are the same, i.e. for C : x = x(t), y = y(t), a ≤ t ≤ b, we have (x(a), y(a)) = (x(b), y(b)).

[Figure 4.2.1: Closed vs nonclosed curves. (a) Closed: the points at t = a and t = b coincide. (b) Not closed: the points at t = a and t = b are different.]

A simple closed curve is a closed curve which does not intersect itself. Note that any closed curve can be regarded as a union of simple closed curves (think of the loops in a figure eight). We use the special notation

$$\oint_C f(x, y)\,ds \quad \text{and} \quad \oint_C \mathbf{f} \cdot d\mathbf{r}$$

to denote line integrals of scalar and vector fields, respectively, along closed curves. In some older texts you may see the integral sign $\oint$ drawn with a small counterclockwise or clockwise arrow on its circle to indicate a line integral traversing a closed curve in a counterclockwise or clockwise direction, respectively.


So far, the examples we have seen of line integrals (e.g. Example 4.2) have had the same

value for different curves joining the initial point to the terminal point. That is, the line

integral has been independent of the path joining the two points. As we mentioned before,

this is not always the case. The following theorem gives a necessary and sufﬁcient condition

for this path independence:

Theorem 4.3. In a region R, the line integral $\int_C \mathbf{f} \cdot d\mathbf{r}$ is independent of the path between any two points in R if and only if $\oint_C \mathbf{f} \cdot d\mathbf{r} = 0$ for every closed curve C which is contained in R.

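Proof: Suppose that $\oint_C \mathbf{f} \cdot d\mathbf{r} = 0$ for every closed curve C which is contained in R. Let $P_1$ and $P_2$ be two distinct points in R. Let $C_1$ be a curve in R going from $P_1$ to $P_2$, and let $C_2$ be another curve in R going from $P_1$ to $P_2$, as in Figure 4.2.2.

[Figure 4.2.2: Two curves $C_1$ and $C_2$ from $P_1$ to $P_2$]

Then $C = C_1 \cup -C_2$ is a closed curve in R (from $P_1$ to $P_1$), and so $\oint_C \mathbf{f} \cdot d\mathbf{r} = 0$. Thus,

$$\begin{aligned}
0 &= \oint_C \mathbf{f} \cdot d\mathbf{r} \\
&= \int_{C_1} \mathbf{f} \cdot d\mathbf{r} + \int_{-C_2} \mathbf{f} \cdot d\mathbf{r} \\
&= \int_{C_1} \mathbf{f} \cdot d\mathbf{r} - \int_{C_2} \mathbf{f} \cdot d\mathbf{r}, \quad \text{and so}
\end{aligned}$$

$\int_{C_1} \mathbf{f} \cdot d\mathbf{r} = \int_{C_2} \mathbf{f} \cdot d\mathbf{r}$. This proves path independence.

Conversely, suppose that the line integral $\int_C \mathbf{f} \cdot d\mathbf{r}$ is independent of the path between any two points in R. Let C be a closed curve contained in R. Let $P_1$ and $P_2$ be two distinct points on C. Let $C_1$ be a part of the curve C that goes from $P_1$ to $P_2$, and let $C_2$ be the remaining part of C that goes from $P_1$ to $P_2$, again as in Figure 4.2.2. Then by path independence we have

$$\begin{aligned}
\int_{C_1} \mathbf{f} \cdot d\mathbf{r} &= \int_{C_2} \mathbf{f} \cdot d\mathbf{r} \\
\int_{C_1} \mathbf{f} \cdot d\mathbf{r} - \int_{C_2} \mathbf{f} \cdot d\mathbf{r} &= 0 \\
\int_{C_1} \mathbf{f} \cdot d\mathbf{r} + \int_{-C_2} \mathbf{f} \cdot d\mathbf{r} &= 0, \quad \text{so} \\
\oint_C \mathbf{f} \cdot d\mathbf{r} &= 0
\end{aligned}$$

since $C = C_1 \cup -C_2$. QED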


Clearly, the above theorem does not give a practical way to determine path independence, since it is impossible to check the line integrals around all possible closed curves in a region. What it mostly does is give an idea of the way in which line integrals behave, and how seemingly unrelated line integrals can be related (in this case, a specific line integral between two points and all line integrals around closed curves).

For a more practical method for determining path independence, we ﬁrst need a version

of the Chain Rule for multivariable functions:

Theorem 4.4. (Chain Rule) If z = f(x, y) is a continuously differentiable function of x and y, and both x = x(t) and y = y(t) are differentiable functions of t, then z is a differentiable function of t, and

$$\frac{dz}{dt} = \frac{\partial z}{\partial x}\,\frac{dx}{dt} + \frac{\partial z}{\partial y}\,\frac{dy}{dt} \tag{4.19}$$

at all points where the derivatives on the right are defined.

The proof is virtually identical to the proof of Theorem 2.2 from Section 2.4 (which uses the Mean Value Theorem), so we omit it.$^1$ We will now use this Chain Rule to prove the following sufficient condition for path independence of line integrals:

Theorem 4.5. Let $\mathbf{f}(x, y) = P(x, y)\,\mathbf{i} + Q(x, y)\,\mathbf{j}$ be a vector field in some region R, with P and Q continuously differentiable functions on R. Let C be a smooth curve in R parametrized by x = x(t), y = y(t), a ≤ t ≤ b. Suppose that there is a real-valued function F(x, y) such that $\nabla F = \mathbf{f}$ on R. Then

$$\int_C \mathbf{f} \cdot d\mathbf{r} = F(B) - F(A), \tag{4.20}$$

where A = (x(a), y(a)) and B = (x(b), y(b)) are the endpoints of C. Thus, the line integral is independent of the path between its endpoints, since it depends only on the values of F at those endpoints.

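Proof: By definition of $\int_C \mathbf{f} \cdot d\mathbf{r}$, we have

$$\begin{aligned}
\int_C \mathbf{f} \cdot d\mathbf{r} &= \int_a^b \bigl(P(x(t), y(t))\,x'(t) + Q(x(t), y(t))\,y'(t)\bigr)\,dt \\
&= \int_a^b \left(\frac{\partial F}{\partial x}\,\frac{dx}{dt} + \frac{\partial F}{\partial y}\,\frac{dy}{dt}\right) dt \quad \left(\text{since } \nabla F = \mathbf{f} \Rightarrow \frac{\partial F}{\partial x} = P \text{ and } \frac{\partial F}{\partial y} = Q\right) \\
&= \int_a^b F'(x(t), y(t))\,dt \quad \text{(by the Chain Rule in Theorem 4.4)} \\
&= F(x(t), y(t))\,\Big|_a^b = F(B) - F(A)
\end{aligned}$$

by the Fundamental Theorem of Calculus. QED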

$^1$ See TAYLOR and MANN, § 6.5.


Theorem 4.5 can be thought of as the line integral version of the Fundamental Theorem of Calculus. A real-valued function F(x, y) such that $\nabla F(x, y) = \mathbf{f}(x, y)$ is called a potential for $\mathbf{f}$. A conservative vector field is one which has a potential.

Example 4.5. Recall from Examples 4.2 and 4.3 in Section 4.1 that the line integral $\int_C (x^2 + y^2)\,dx + 2xy\,dy$ was found to have the value $\frac{13}{3}$ for three different curves C going from the point (0, 0) to the point (1, 2). Use Theorem 4.5 to show that this line integral is indeed path independent.

Solution: We need to find a real-valued function F(x, y) such that

$$\frac{\partial F}{\partial x} = x^2 + y^2 \quad \text{and} \quad \frac{\partial F}{\partial y} = 2xy.$$

Suppose that $\frac{\partial F}{\partial x} = x^2 + y^2$. Then we must have $F(x, y) = \frac{1}{3}x^3 + xy^2 + g(y)$ for some function g(y). So $\frac{\partial F}{\partial y} = 2xy + g'(y)$ satisfies the condition $\frac{\partial F}{\partial y} = 2xy$ if $g'(y) = 0$, i.e. g(y) = K, where K is a constant. Since any choice for K will do (why?), we pick K = 0. Thus, a potential F(x, y) for $\mathbf{f}(x, y) = (x^2 + y^2)\,\mathbf{i} + 2xy\,\mathbf{j}$ exists, namely

$$F(x, y) = \frac{1}{3}x^3 + xy^2.$$

Hence the line integral $\int_C (x^2 + y^2)\,dx + 2xy\,dy$ is path independent.

Note that we can also verify that the value of the line integral of $\mathbf{f}$ along any curve C going from (0, 0) to (1, 2) will always be $\frac{13}{3}$, since by Theorem 4.5

$$\int_C \mathbf{f} \cdot d\mathbf{r} = F(1, 2) - F(0, 0) = \frac{1}{3}(1)^3 + (1)(2)^2 - (0 + 0) = \frac{1}{3} + 4 = \frac{13}{3}.$$
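As a sanity check on Example 4.5, the potential difference F(1, 2) − F(0, 0) can be compared with a numerically computed line integral along yet another curve. The curve chosen here (x = t³, y = 2 sin(πt/2)) is a hypothetical one that does not appear in the text, and the helper name `line_integral_dr` and the midpoint rule are likewise assumptions made for the sketch.

```python
import math


def F(x, y):
    """Potential found in Example 4.5."""
    return x**3 / 3 + x * y**2


def line_integral_dr(P, Q, x, y, dx, dy, a, b, steps=20_000):
    """Midpoint-rule approximation of the line integral of P dx + Q dy along C."""
    h = (b - a) / steps
    total = 0.0
    for i in range(steps):
        t = a + (i + 0.5) * h
        total += (P(x(t), y(t)) * dx(t) + Q(x(t), y(t)) * dy(t)) * h
    return total


P = lambda x, y: x**2 + y**2
Q = lambda x, y: 2 * x * y

# A hypothetical smooth path from (0, 0) at t = 0 to (1, 2) at t = 1.
path_value = line_integral_dr(
    P, Q,
    x=lambda t: t**3,
    y=lambda t: 2 * math.sin(math.pi * t / 2),
    dx=lambda t: 3 * t * t,
    dy=lambda t: math.pi * math.cos(math.pi * t / 2),
    a=0.0, b=1.0,
)
potential_difference = F(1, 2) - F(0, 0)

print(path_value, potential_difference)
```

Both numbers should be close to 13/3, as Theorem 4.5 predicts for any smooth curve joining the two points.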

A consequence of Theorem 4.5 in the special case where C is a closed curve, so that the endpoints A and B are the same point, is the following important corollary:

Corollary 4.6. If a vector field $\mathbf{f}$ has a potential in a region R, then $\oint_C \mathbf{f} \cdot d\mathbf{r} = 0$ for any closed curve C in R (i.e. $\oint_C \nabla F \cdot d\mathbf{r} = 0$ for any real-valued function F(x, y)).

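Example 4.6. Evaluate $\oint_C x\,dx + y\,dy$ for $C : x = 2\cos t,\ y = 3\sin t,\ 0 \le t \le 2\pi$.

Solution: The vector field $\mathbf{f}(x, y) = x\,\mathbf{i} + y\,\mathbf{j}$ has a potential F(x, y):

$$\frac{\partial F}{\partial x} = x \Rightarrow F(x, y) = \frac{1}{2}x^2 + g(y), \quad \text{so} \quad \frac{\partial F}{\partial y} = y \Rightarrow g'(y) = y \Rightarrow g(y) = \frac{1}{2}y^2 + K$$

for any constant K, so $F(x, y) = \frac{1}{2}x^2 + \frac{1}{2}y^2$ is a potential for $\mathbf{f}(x, y)$. Thus,

$$\oint_C x\,dx + y\,dy = \oint_C \mathbf{f} \cdot d\mathbf{r} = 0$$

by Corollary 4.6, since the curve C is closed (it is the ellipse $\frac{x^2}{4} + \frac{y^2}{9} = 1$).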

Exercises

A

1. Evaluate $\oint_C (x^2 + y^2)\,dx + 2xy\,dy$ for $C : x = \cos t,\ y = \sin t,\ 0 \le t \le 2\pi$.

2. Evaluate $\int_C (x^2 + y^2)\,dx + 2xy\,dy$ for $C : x = \cos t,\ y = \sin t,\ 0 \le t \le \pi$.

3. Is there a potential $F(x, y)$ for $\mathbf{f}(x, y) = y\,\mathbf{i} - x\,\mathbf{j}$? If so, find one.

4. Is there a potential $F(x, y)$ for $\mathbf{f}(x, y) = x\,\mathbf{i} - y\,\mathbf{j}$? If so, find one.

5. Is there a potential $F(x, y)$ for $\mathbf{f}(x, y) = xy^2\,\mathbf{i} + x^3 y\,\mathbf{j}$? If so, find one.

B

6. Let $\mathbf{f}(x, y)$ and $\mathbf{g}(x, y)$ be vector fields, let $a$ and $b$ be constants, and let $C$ be a curve in $\mathbb{R}^2$. Show that
$$\int_C (a\,\mathbf{f} \pm b\,\mathbf{g})\cdot d\mathbf{r} = a\int_C \mathbf{f}\cdot d\mathbf{r} \pm b\int_C \mathbf{g}\cdot d\mathbf{r} .$$

7. Let $C$ be a curve whose arc length is $L$. Show that $\int_C 1\,ds = L$.

8. Let $f(x, y)$ and $g(x, y)$ be continuously differentiable real-valued functions in a region $R$. Show that
$$\oint_C f\,\nabla g\cdot d\mathbf{r} = -\oint_C g\,\nabla f\cdot d\mathbf{r}$$
for any closed curve $C$ in $R$. (Hint: Use Exercise 21 in Section 2.4.)

9. Let $\mathbf{f}(x, y) = \dfrac{-y}{x^2 + y^2}\,\mathbf{i} + \dfrac{x}{x^2 + y^2}\,\mathbf{j}$ for all $(x, y) \ne (0, 0)$, and $C : x = \cos t,\ y = \sin t,\ 0 \le t \le 2\pi$.

(a) Show that $\mathbf{f} = \nabla F$, for $F(x, y) = \tan^{-1}(y/x)$.

(b) Show that $\oint_C \mathbf{f}\cdot d\mathbf{r} = 2\pi$. Does this contradict Corollary 4.6? Explain.

C

10. Let $g(x)$ and $h(y)$ be differentiable functions, and let $\mathbf{f}(x, y) = h(y)\,\mathbf{i} + g(x)\,\mathbf{j}$. Can $\mathbf{f}$ have a

potential F(x, y)? If so, ﬁnd it. You may assume that F would be smooth. (Hint: Consider

the mixed partial derivatives of F.)


4.3 Green’s Theorem

We will now see a way of evaluating the line integral of a smooth vector ﬁeld around a

simple closed curve. A vector field $\mathbf{f}(x, y) = P(x, y)\,\mathbf{i} + Q(x, y)\,\mathbf{j}$ is smooth if its component

functions P(x, y) and Q(x, y) are smooth. We will use Green’s Theorem (sometimes called

Green’s Theorem in the plane) to relate the line integral around a closed curve with a double

integral over the region inside the curve:

Theorem 4.7. (Green’s Theorem) Let $R$ be a region in $\mathbb{R}^2$ whose boundary is a simple closed curve $C$ which is piecewise smooth. Let $\mathbf{f}(x, y) = P(x, y)\,\mathbf{i} + Q(x, y)\,\mathbf{j}$ be a smooth vector field defined on both $R$ and $C$. Then
$$\oint_C \mathbf{f}\cdot d\mathbf{r} = \iint_R \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA , \tag{4.21}$$
where $C$ is traversed so that $R$ is always on the left side of $C$.

Proof: We will prove the theorem in the case for a simple region $R$, that is, where the boundary curve $C$ can be written as $C = C_1 \cup C_2$ in two distinct ways:
$$C_1 = \text{the curve } y = y_1(x) \text{ from the point } X_1 \text{ to the point } X_2 \tag{4.22}$$
$$C_2 = \text{the curve } y = y_2(x) \text{ from the point } X_2 \text{ to the point } X_1 , \tag{4.23}$$
where $X_1$ and $X_2$ are the points on $C$ farthest to the left and right, respectively; and
$$C_1 = \text{the curve } x = x_1(y) \text{ from the point } Y_2 \text{ to the point } Y_1 \tag{4.24}$$
$$C_2 = \text{the curve } x = x_2(y) \text{ from the point } Y_1 \text{ to the point } Y_2 , \tag{4.25}$$
where $Y_1$ and $Y_2$ are the lowest and highest points, respectively, on $C$. See Figure 4.3.1.

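[Figure 4.3.1: the simple region $R$ with boundary $C$, showing the curves $y = y_1(x)$, $y = y_2(x)$, $x = x_1(y)$, $x = x_2(y)$, the points $X_1$, $X_2$, $Y_1$, $Y_2$, and the bounds $a \le x \le b$, $c \le y \le d$]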

Integrate $P(x, y)$ around $C$ using the representation $C = C_1 \cup C_2$ given by (4.22) and (4.23). Since $y = y_1(x)$ along $C_1$ (as $x$ goes from $a$ to $b$) and $y = y_2(x)$ along $C_2$ (as $x$ goes from $b$ to $a$), as we see from Figure 4.3.1, we have
$$\begin{aligned}
\oint_C P(x, y)\,dx &= \int_{C_1} P(x, y)\,dx + \int_{C_2} P(x, y)\,dx \\
&= \int_a^b P(x, y_1(x))\,dx + \int_b^a P(x, y_2(x))\,dx \\
&= \int_a^b P(x, y_1(x))\,dx - \int_a^b P(x, y_2(x))\,dx \\
&= -\int_a^b \bigl( P(x, y_2(x)) - P(x, y_1(x)) \bigr)\,dx \\
&= -\int_a^b \Bigl( P(x, y)\,\Big|_{y = y_1(x)}^{y = y_2(x)} \Bigr)\,dx \\
&= -\int_a^b \int_{y_1(x)}^{y_2(x)} \frac{\partial P(x, y)}{\partial y}\,dy\,dx \quad\text{(by the Fundamental Theorem of Calculus)} \\
&= -\iint_R \frac{\partial P}{\partial y}\,dA .
\end{aligned}$$

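Likewise, integrate $Q(x, y)$ around $C$ using the representation $C = C_1 \cup C_2$ given by (4.24) and (4.25). Since $x = x_1(y)$ along $C_1$ (as $y$ goes from $d$ to $c$) and $x = x_2(y)$ along $C_2$ (as $y$ goes from $c$ to $d$), as we see from Figure 4.3.1, we have
$$\begin{aligned}
\oint_C Q(x, y)\,dy &= \int_{C_1} Q(x, y)\,dy + \int_{C_2} Q(x, y)\,dy \\
&= \int_d^c Q(x_1(y), y)\,dy + \int_c^d Q(x_2(y), y)\,dy \\
&= -\int_c^d Q(x_1(y), y)\,dy + \int_c^d Q(x_2(y), y)\,dy \\
&= \int_c^d \bigl( Q(x_2(y), y) - Q(x_1(y), y) \bigr)\,dy \\
&= \int_c^d \Bigl( Q(x, y)\,\Big|_{x = x_1(y)}^{x = x_2(y)} \Bigr)\,dy \\
&= \int_c^d \int_{x_1(y)}^{x_2(y)} \frac{\partial Q(x, y)}{\partial x}\,dx\,dy \quad\text{(by the Fundamental Theorem of Calculus)} \\
&= \iint_R \frac{\partial Q}{\partial x}\,dA , \text{ and so}
\end{aligned}$$
$$\begin{aligned}
\oint_C \mathbf{f}\cdot d\mathbf{r} &= \oint_C P(x, y)\,dx + \oint_C Q(x, y)\,dy \\
&= -\iint_R \frac{\partial P}{\partial y}\,dA + \iint_R \frac{\partial Q}{\partial x}\,dA \\
&= \iint_R \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA .
\end{aligned}$$
QED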

Though we proved Green’s Theorem only for a simple region R, the theorem can also be

proved for more general regions (say, a union of simple regions).[2]

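Example 4.7. Evaluate $\oint_C (x^2 + y^2)\,dx + 2xy\,dy$, where $C$ is the boundary (traversed counterclockwise) of the region $R = \{ (x, y) : 0 \le x \le 1,\ 2x^2 \le y \le 2x \}$.

[Figure 4.3.2: the region $R$ between $y = 2x^2$ and $y = 2x$, with boundary $C$ from $(0, 0)$ to $(1, 2)$]

Solution: $R$ is the shaded region in Figure 4.3.2. By Green’s Theorem, for $P(x, y) = x^2 + y^2$ and $Q(x, y) = 2xy$, we have
$$\begin{aligned}
\oint_C (x^2 + y^2)\,dx + 2xy\,dy &= \iint_R \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA \\
&= \iint_R (2y - 2y)\,dA = \iint_R 0\,dA = 0 .
\end{aligned}$$
We actually already knew that the answer was zero. Recall from Example 4.5 in Section 4.2 that the vector field $\mathbf{f}(x, y) = (x^2 + y^2)\,\mathbf{i} + 2xy\,\mathbf{j}$ has a potential function $F(x, y) = \frac{1}{3}x^3 + xy^2$, and so $\oint_C \mathbf{f}\cdot d\mathbf{r} = 0$ by Corollary 4.6.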
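Both sides of Green’s Theorem can be approximated numerically on the region of Example 4.7. The following is a minimal sketch under stated assumptions: the function names and the midpoint rule are choices made here, and the second field $-y\,\mathbf{i} + x\,\mathbf{j}$ is a hypothetical extra test case (not from the text) picked because its integrand $\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} = 2$ is nonzero.

```python
def boundary_integral(P, Q, n=4000):
    """Integrate P dx + Q dy counterclockwise around the boundary of
    R = {0 <= x <= 1, 2x^2 <= y <= 2x} with the midpoint rule."""
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        # Lower curve y = 2x^2, left to right: x = t, so x' = 1, y' = 4t.
        x, y = t, 2 * t**2
        total += (P(x, y) * 1.0 + Q(x, y) * 4 * t) * h
        # Upper curve y = 2x, right to left: x = 1 - t, so x' = -1, y' = -2.
        x, y = 1 - t, 2 * (1 - t)
        total += (P(x, y) * (-1.0) + Q(x, y) * (-2.0)) * h
    return total

def area_integral(integrand, n=400):
    """Midpoint-rule approximation of the double integral over R."""
    hx = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * hx
        lo, hi = 2 * x**2, 2 * x
        hy = (hi - lo) / n
        for j in range(n):
            y = lo + (j + 0.5) * hy
            total += integrand(x, y) * hx * hy
    return total

# Field of Example 4.7: dQ/dx - dP/dy = 2y - 2y = 0.
lhs_example = boundary_integral(lambda x, y: x**2 + y**2, lambda x, y: 2 * x * y)
rhs_example = area_integral(lambda x, y: 0.0)

# Hypothetical field -y i + x j: dQ/dx - dP/dy = 1 - (-1) = 2.
lhs_extra = boundary_integral(lambda x, y: -y, lambda x, y: x)
rhs_extra = area_integral(lambda x, y: 2.0)

print(lhs_example, rhs_example, lhs_extra, rhs_extra)
```

In each pair the line integral and the double integral should agree up to discretization error.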

Example 4.8. Let $\mathbf{f}(x, y) = P(x, y)\,\mathbf{i} + Q(x, y)\,\mathbf{j}$, where
$$P(x, y) = \frac{-y}{x^2 + y^2} \quad\text{and}\quad Q(x, y) = \frac{x}{x^2 + y^2} ,$$
and let $R = \{ (x, y) : 0 < x^2 + y^2 \le 1 \}$. For the boundary curve $C : x^2 + y^2 = 1$, traversed counterclockwise, it was shown in Exercise 9(b) in Section 4.2 that $\oint_C \mathbf{f}\cdot d\mathbf{r} = 2\pi$. But
$$\frac{\partial Q}{\partial x} = \frac{y^2 - x^2}{(x^2 + y^2)^2} = \frac{\partial P}{\partial y} \;\Rightarrow\; \iint_R \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA = \iint_R 0\,dA = 0 .$$
This would seem to contradict Green’s Theorem. However, note that $R$ is not the entire region enclosed by $C$, since the point $(0, 0)$ is not contained in $R$. That is, $R$ has a “hole” at the origin, so Green’s Theorem does not apply.

[2] See TAYLOR and MANN, § 15.31 for a discussion of some of the difficulties involved when the boundary curve is “complicated”.

[Figure 4.3.3: The annulus $R$, bounded by the outer circle $C_1$ of radius 1 and the inner circle $C_2$ of radius 1/2]

If we modify the region $R$ to be the annulus $R = \{ (x, y) : 1/4 \le x^2 + y^2 \le 1 \}$ (see Figure 4.3.3), and take the “boundary” $C$ of $R$ to be $C = C_1 \cup C_2$, where $C_1$ is the unit circle $x^2 + y^2 = 1$ traversed counterclockwise and $C_2$ is the circle $x^2 + y^2 = 1/4$ traversed clockwise, then it can be shown (see Exercise 9) that
$$\oint_C \mathbf{f}\cdot d\mathbf{r} = 0 .$$
We would still have $\iint_R \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA = 0$, so for this $R$ we would have
$$\oint_C \mathbf{f}\cdot d\mathbf{r} = \iint_R \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA ,$$
which shows that Green’s Theorem holds for the annular region $R$.
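The two boundary integrals in Example 4.8 can be approximated numerically. This is only a sketch: the helper name `circulation`, the parametrization $x = \rho\cos t$, $y = \rho\sin t$ of a circle of radius $\rho$, and the midpoint rule are assumptions made here for illustration.

```python
import math

def circulation(radius, n=1000):
    """Approximate the counterclockwise line integral of
    f = (-y/(x^2+y^2)) i + (x/(x^2+y^2)) j around the circle
    x^2 + y^2 = radius^2, using the midpoint rule in t."""
    h = 2 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        x, y = radius * math.cos(t), radius * math.sin(t)
        dx, dy = -radius * math.sin(t), radius * math.cos(t)  # x'(t), y'(t)
        r2 = x * x + y * y
        total += (-y / r2 * dx + x / r2 * dy) * h
    return total

outer = circulation(1.0)    # C1: unit circle, counterclockwise
inner = -circulation(0.5)   # C2: radius 1/2, clockwise, so the sign flips
print(outer, inner, outer + inner)
```

The outer integral should be close to $2\pi$, while the sum over $C_1 \cup C_2$ should be close to 0, in line with the discussion of the annulus.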

It turns out that Green’s Theorem can be extended to multiply connected regions, that is,

regions like the annulus in Example 4.8, which have one or more regions cut out from the

interior, as opposed to discrete points being cut out. For such regions, the “outer” boundary

and the “inner” boundaries are traversed so that R is always on the left side.

[Figure 4.3.4: Multiply connected regions. (a) Region $R$ with one hole, bounded by $C_1$ and $C_2$ and cut into $R_1$ and $R_2$; (b) Region $R$ with two holes, bounded by $C_1$, $C_2$ and $C_3$ and cut into $R_1$ and $R_2$]

The intuitive idea for why Green’s Theorem holds for multiply connected regions is shown

in Figure 4.3.4 above. The idea is to cut “slits” between the boundaries of a multiply con-

nected region R so that R is divided into subregions which do not have any “holes”. For

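example, in Figure 4.3.4(a) the region $R$ is the union of the regions $R_1$ and $R_2$, which are divided by the slits indicated by the dashed lines. Those slits are part of the boundary of both $R_1$ and $R_2$, and we traverse them in the manner indicated by the arrows. Notice that along each slit the boundary of $R_1$ is traversed in the opposite direction as that of $R_2$, which means that the line integrals of $\mathbf{f}$ along those slits cancel each other out. Since $R_1$ and $R_2$ do not have holes in them, Green’s Theorem holds in each subregion, so that
$$\oint_{\text{bdy of } R_1} \mathbf{f}\cdot d\mathbf{r} = \iint_{R_1} \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA \quad\text{and}\quad \oint_{\text{bdy of } R_2} \mathbf{f}\cdot d\mathbf{r} = \iint_{R_2} \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA .$$
But since the line integrals along the slits cancel out, we have
$$\oint_{C_1 \cup C_2} \mathbf{f}\cdot d\mathbf{r} = \oint_{\text{bdy of } R_1} \mathbf{f}\cdot d\mathbf{r} + \oint_{\text{bdy of } R_2} \mathbf{f}\cdot d\mathbf{r} ,$$
and so
$$\oint_{C_1 \cup C_2} \mathbf{f}\cdot d\mathbf{r} = \iint_{R_1} \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA + \iint_{R_2} \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA = \iint_R \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA ,$$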

which shows that Green’s Theorem holds in the region R. A similar argument shows that

the theorem holds in the region with two holes shown in Figure 4.3.4(b).

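We know from Corollary 4.6 that when a smooth vector field $\mathbf{f}(x, y) = P(x, y)\,\mathbf{i} + Q(x, y)\,\mathbf{j}$ on a region $R$ (whose boundary is a piecewise smooth, simple closed curve $C$) has a potential in $R$, then $\oint_C \mathbf{f}\cdot d\mathbf{r} = 0$. And if the potential $F(x, y)$ is smooth in $R$, then $\frac{\partial F}{\partial x} = P$ and $\frac{\partial F}{\partial y} = Q$, and so we know that
$$\frac{\partial^2 F}{\partial y\,\partial x} = \frac{\partial^2 F}{\partial x\,\partial y} \;\Rightarrow\; \frac{\partial P}{\partial y} = \frac{\partial Q}{\partial x} \text{ in } R .$$
Conversely, if $\frac{\partial P}{\partial y} = \frac{\partial Q}{\partial x}$ in $R$ then
$$\oint_C \mathbf{f}\cdot d\mathbf{r} = \iint_R \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA = \iint_R 0\,dA = 0 .$$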

For a simply connected region R (i.e. a region with no holes), the following can be shown:

The following statements are equivalent for a simply connected region $R$ in $\mathbb{R}^2$:

(a) $\mathbf{f}(x, y) = P(x, y)\,\mathbf{i} + Q(x, y)\,\mathbf{j}$ has a smooth potential $F(x, y)$ in $R$

(b) $\int_C \mathbf{f}\cdot d\mathbf{r}$ is independent of the path for any curve $C$ in $R$

(c) $\oint_C \mathbf{f}\cdot d\mathbf{r} = 0$ for every simple closed curve $C$ in $R$

(d) $\frac{\partial P}{\partial y} = \frac{\partial Q}{\partial x}$ in $R$ (in this case, the differential form $P\,dx + Q\,dy$ is exact)


Exercises

A

For Exercises 1-4, use Green’s Theorem to evaluate the given line integral around the curve

C, traversed counterclockwise.

1. $\oint_C (x^2 - y^2)\,dx + 2xy\,dy$; $C$ is the boundary of $R = \{ (x, y) : 0 \le x \le 1,\ 2x^2 \le y \le 2x \}$

2. $\oint_C x^2 y\,dx + 2xy\,dy$; $C$ is the boundary of $R = \{ (x, y) : 0 \le x \le 1,\ x^2 \le y \le x \}$

3. $\oint_C 2y\,dx - 3x\,dy$; $C$ is the circle $x^2 + y^2 = 1$

4. $\oint_C (e^{x^2} + y^2)\,dx + (e^{y^2} + x^2)\,dy$; $C$ is the boundary of the triangle with vertices $(0, 0)$, $(4, 0)$ and $(0, 4)$

5. Is there a potential $F(x, y)$ for $\mathbf{f}(x, y) = (y^2 + 3x^2)\,\mathbf{i} + 2xy\,\mathbf{j}$? If so, find one.

6. Is there a potential $F(x, y)$ for $\mathbf{f}(x, y) = (x^3 \cos(xy) + 2x \sin(xy))\,\mathbf{i} + x^2 y \cos(xy)\,\mathbf{j}$? If so, find one.

7. Is there a potential $F(x, y)$ for $\mathbf{f}(x, y) = (8xy + 3)\,\mathbf{i} + 4(x^2 + y)\,\mathbf{j}$? If so, find one.

8. Show that for any constants $a$, $b$ and any closed simple curve $C$, $\oint_C a\,dx + b\,dy = 0$.

B

9. For the vector field $\mathbf{f}$ as in Example 4.8, show directly that $\oint_C \mathbf{f}\cdot d\mathbf{r} = 0$, where $C$ is the boundary of the annulus $R = \{ (x, y) : 1/4 \le x^2 + y^2 \le 1 \}$ traversed so that $R$ is always on the left.

10. Evaluate $\oint_C e^x \sin y\,dx + (y^3 + e^x \cos y)\,dy$, where $C$ is the boundary of the rectangle with vertices $(1, -1)$, $(1, 1)$, $(-1, 1)$ and $(-1, -1)$, traversed counterclockwise.

C

11. For a region $R$ bounded by a simple closed curve $C$, show that the area $A$ of $R$ is
$$A = -\oint_C y\,dx = \oint_C x\,dy = \frac{1}{2}\oint_C x\,dy - y\,dx ,$$
where $C$ is traversed so that $R$ is always on the left. (Hint: Use Green’s Theorem and the fact that $A = \iint_R 1\,dA$.)


4.4 Surface Integrals and the Divergence Theorem

In Section 4.1 we learned how to integrate along a curve. We will now learn how to perform integration over a surface in $\mathbb{R}^3$, such as a sphere or a paraboloid. Recall from Section 1.8 how we identified points $(x, y, z)$ on a curve $C$ in $\mathbb{R}^3$, parametrized by $x = x(t)$, $y = y(t)$, $z = z(t)$, $a \le t \le b$, with the terminal points of the position vector
$$\mathbf{r}(t) = x(t)\,\mathbf{i} + y(t)\,\mathbf{j} + z(t)\,\mathbf{k} \quad\text{for } t \text{ in } [a, b] .$$
The idea behind a parametrization of a curve is that it “transforms” a subset of $\mathbb{R}^1$ (normally an interval $[a, b]$) into a curve in $\mathbb{R}^2$ or $\mathbb{R}^3$ (see Figure 4.4.1).

[Figure 4.4.1: Parametrization of a curve $C$ in $\mathbb{R}^3$, mapping the interval $[a, b]$ in $\mathbb{R}^1$ to the points $(x(t), y(t), z(t))$ with position vector $\mathbf{r}(t)$]

Similar to how we used a parametrization of a curve to define the line integral along the curve, we will use a parametrization of a surface to define a surface integral. We will use two variables, $u$ and $v$, to parametrize a surface $\Sigma$ in $\mathbb{R}^3$: $x = x(u, v)$, $y = y(u, v)$, $z = z(u, v)$, for $(u, v)$ in some region $R$ in $\mathbb{R}^2$ (see Figure 4.4.2).

[Figure 4.4.2: Parametrization of a surface $\Sigma$ in $\mathbb{R}^3$, mapping a region $R$ in the $uv$-plane to the points with position vector $\mathbf{r}(u, v)$]

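In this case, the position vector of a point on the surface $\Sigma$ is given by the vector-valued function
$$\mathbf{r}(u, v) = x(u, v)\,\mathbf{i} + y(u, v)\,\mathbf{j} + z(u, v)\,\mathbf{k} \quad\text{for } (u, v) \text{ in } R .$$
Since $\mathbf{r}(u, v)$ is a function of two variables, define the partial derivatives $\frac{\partial \mathbf{r}}{\partial u}$ and $\frac{\partial \mathbf{r}}{\partial v}$ for $(u, v)$ in $R$ by
$$\frac{\partial \mathbf{r}}{\partial u}(u, v) = \frac{\partial x}{\partial u}(u, v)\,\mathbf{i} + \frac{\partial y}{\partial u}(u, v)\,\mathbf{j} + \frac{\partial z}{\partial u}(u, v)\,\mathbf{k} , \text{ and}$$
$$\frac{\partial \mathbf{r}}{\partial v}(u, v) = \frac{\partial x}{\partial v}(u, v)\,\mathbf{i} + \frac{\partial y}{\partial v}(u, v)\,\mathbf{j} + \frac{\partial z}{\partial v}(u, v)\,\mathbf{k} .$$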

The parametrization of $\Sigma$ can be thought of as “transforming” a region in $\mathbb{R}^2$ (in the $uv$-plane) into a 2-dimensional surface in $\mathbb{R}^3$. This parametrization of the surface is sometimes called a patch, based on the idea of “patching” the region $R$ onto $\Sigma$ in the grid-like manner shown in Figure 4.4.2.

In fact, those gridlines in $R$ lead us to how we will define a surface integral over $\Sigma$. Along the vertical gridlines in $R$, the variable $u$ is constant. So those lines get mapped to curves on $\Sigma$, and the variable $u$ is constant along the position vector $\mathbf{r}(u, v)$. Thus, the tangent vector to those curves at a point $(u, v)$ is $\frac{\partial \mathbf{r}}{\partial v}$. Similarly, the horizontal gridlines in $R$ get mapped to curves on $\Sigma$ whose tangent vectors are $\frac{\partial \mathbf{r}}{\partial u}$.

Now take a point (u, v) in R as, say, the lower left corner of one of the rectangular grid

sections in R, as shown in Figure 4.4.2. Suppose that this rectangle has a small width

and height of ∆u and ∆v, respectively. The corner points of that rectangle are (u, v), (u+

∆u, v), (u+∆u, v+∆v) and (u, v+∆v). So the area of that rectangle is A ∆u∆v. Then that

rectangle gets mapped by the parametrization onto some section of the surface Σ which,

for ∆u and ∆v small enough, will have a surface area (call it dσ) that is very close to the

area of the parallelogram which has adjacent sides r(u+∆u, v)−r(u, v) (corresponding to the

line segment from (u, v) to (u +∆u, v) in R) and r(u, v +∆v) −r(u, v) (corresponding to the

line segment from (u, v) to (u, v+∆v) in R). But by combining our usual notion of a partial

derivative (see Deﬁnition 2.3 in Section 2.2) with that of the derivative of a vector-valued

function (see Deﬁnition 1.12 in Section 1.8) applied to a function of two variables, we have

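$$\frac{\partial \mathbf{r}}{\partial u} \approx \frac{\mathbf{r}(u + \Delta u, v) - \mathbf{r}(u, v)}{\Delta u} , \quad\text{and}\quad \frac{\partial \mathbf{r}}{\partial v} \approx \frac{\mathbf{r}(u, v + \Delta v) - \mathbf{r}(u, v)}{\Delta v} ,$$
and so the surface area element $d\sigma$ is approximately
$$\bigl\| (\mathbf{r}(u + \Delta u, v) - \mathbf{r}(u, v)) \times (\mathbf{r}(u, v + \Delta v) - \mathbf{r}(u, v)) \bigr\| \approx \left\| \Bigl( \Delta u\,\frac{\partial \mathbf{r}}{\partial u} \Bigr) \times \Bigl( \Delta v\,\frac{\partial \mathbf{r}}{\partial v} \Bigr) \right\| = \left\| \frac{\partial \mathbf{r}}{\partial u} \times \frac{\partial \mathbf{r}}{\partial v} \right\| \Delta u\,\Delta v$$
by Theorem 1.13 in Section 1.4. Thus, the total surface area $S$ of $\Sigma$ is approximately the sum of all the quantities $\left\| \frac{\partial \mathbf{r}}{\partial u} \times \frac{\partial \mathbf{r}}{\partial v} \right\| \Delta u\,\Delta v$, summed over the rectangles in $R$. Taking the limit of that sum as the diagonal of the largest rectangle goes to 0 gives
$$S = \iint_R \left\| \frac{\partial \mathbf{r}}{\partial u} \times \frac{\partial \mathbf{r}}{\partial v} \right\| du\,dv . \tag{4.26}$$
We will write the double integral on the right using the special notation
$$\iint_\Sigma d\sigma = \iint_R \left\| \frac{\partial \mathbf{r}}{\partial u} \times \frac{\partial \mathbf{r}}{\partial v} \right\| du\,dv . \tag{4.27}$$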

This is a special case of a surface integral over the surface $\Sigma$, where the surface area element $d\sigma$ can be thought of as $1\,d\sigma$. Replacing 1 by a general real-valued function $f(x, y, z)$ defined in $\mathbb{R}^3$, we have the following:

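Definition 4.3. Let $\Sigma$ be a surface in $\mathbb{R}^3$ parametrized by $x = x(u, v)$, $y = y(u, v)$, $z = z(u, v)$, for $(u, v)$ in some region $R$ in $\mathbb{R}^2$. Let $\mathbf{r}(u, v) = x(u, v)\,\mathbf{i} + y(u, v)\,\mathbf{j} + z(u, v)\,\mathbf{k}$ be the position vector for any point on $\Sigma$, and let $f(x, y, z)$ be a real-valued function defined on some subset of $\mathbb{R}^3$ that contains $\Sigma$. The surface integral of $f(x, y, z)$ over $\Sigma$ is
$$\iint_\Sigma f(x, y, z)\,d\sigma = \iint_R f(x(u, v), y(u, v), z(u, v)) \left\| \frac{\partial \mathbf{r}}{\partial u} \times \frac{\partial \mathbf{r}}{\partial v} \right\| du\,dv . \tag{4.28}$$
In particular, the surface area $S$ of $\Sigma$ is
$$S = \iint_\Sigma 1\,d\sigma . \tag{4.29}$$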

Example 4.9. A torus T is a surface obtained by revolving a circle of radius a in the yz-plane

around the z-axis, where the circle’s center is at a distance b from the z-axis (0 < a < b), as

in Figure 4.4.3. Find the surface area of T.

[Figure 4.4.3: (a) Circle $(y - b)^2 + z^2 = a^2$ in the $yz$-plane, with angle $u$; (b) Torus $T$, with angle $v$ and a point $(x, y, z)$]

Solution: For any point on the circle, the line segment from the center of the circle to that

point makes an angle u with the y-axis in the positive y direction (see Figure 4.4.3(a)). And

as the circle revolves around the z-axis, the line segment from the origin to the center of that

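circle sweeps out an angle $v$ with the positive $x$-axis (see Figure 4.4.3(b)). Thus, the torus can be parametrized as:
$$x = (b + a\cos u)\cos v , \quad y = (b + a\cos u)\sin v , \quad z = a\sin u , \quad 0 \le u \le 2\pi , \quad 0 \le v \le 2\pi$$
So for the position vector
$$\begin{aligned}
\mathbf{r}(u, v) &= x(u, v)\,\mathbf{i} + y(u, v)\,\mathbf{j} + z(u, v)\,\mathbf{k} \\
&= (b + a\cos u)\cos v\,\mathbf{i} + (b + a\cos u)\sin v\,\mathbf{j} + a\sin u\,\mathbf{k}
\end{aligned}$$
we see that
$$\begin{aligned}
\frac{\partial \mathbf{r}}{\partial u} &= -a\sin u\cos v\,\mathbf{i} - a\sin u\sin v\,\mathbf{j} + a\cos u\,\mathbf{k} \\
\frac{\partial \mathbf{r}}{\partial v} &= -(b + a\cos u)\sin v\,\mathbf{i} + (b + a\cos u)\cos v\,\mathbf{j} + 0\,\mathbf{k} ,
\end{aligned}$$
and so computing the cross product gives
$$\frac{\partial \mathbf{r}}{\partial u} \times \frac{\partial \mathbf{r}}{\partial v} = -a(b + a\cos u)\cos v\cos u\,\mathbf{i} - a(b + a\cos u)\sin v\cos u\,\mathbf{j} - a(b + a\cos u)\sin u\,\mathbf{k} ,$$
which has magnitude
$$\left\| \frac{\partial \mathbf{r}}{\partial u} \times \frac{\partial \mathbf{r}}{\partial v} \right\| = a(b + a\cos u) .$$
Thus, the surface area of $T$ is
$$\begin{aligned}
S = \iint_\Sigma 1\,d\sigma &= \int_0^{2\pi}\!\!\int_0^{2\pi} \left\| \frac{\partial \mathbf{r}}{\partial u} \times \frac{\partial \mathbf{r}}{\partial v} \right\| du\,dv \\
&= \int_0^{2\pi}\!\!\int_0^{2\pi} a(b + a\cos u)\,du\,dv \\
&= \int_0^{2\pi} \Bigl( abu + a^2 \sin u\,\Big|_{u = 0}^{u = 2\pi} \Bigr)\,dv \\
&= \int_0^{2\pi} 2\pi ab\,dv \\
&= 4\pi^2 ab
\end{aligned}$$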
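As a numerical sanity check of Example 4.9, the surface area formula (4.28) with $f = 1$ can be approximated by a midpoint sum over the $uv$-square. This is a sketch under stated assumptions: the function name `torus_area` and the sample radii $a = 1$, $b = 2$ are chosen here for illustration.

```python
import math

def torus_area(a, b, n=200):
    """Approximate the torus surface area by summing |r_u x r_v| du dv
    over an n-by-n midpoint grid on 0 <= u, v <= 2*pi."""
    hu = hv = 2 * math.pi / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * hu
        for j in range(n):
            v = (j + 0.5) * hv
            # Partial derivatives of r(u, v) from Example 4.9.
            ru = (-a * math.sin(u) * math.cos(v),
                  -a * math.sin(u) * math.sin(v),
                  a * math.cos(u))
            rv = (-(b + a * math.cos(u)) * math.sin(v),
                  (b + a * math.cos(u)) * math.cos(v),
                  0.0)
            # Cross product r_u x r_v, computed component by component.
            cx = ru[1] * rv[2] - ru[2] * rv[1]
            cy = ru[2] * rv[0] - ru[0] * rv[2]
            cz = ru[0] * rv[1] - ru[1] * rv[0]
            total += math.sqrt(cx * cx + cy * cy + cz * cz) * hu * hv
    return total

a, b = 1.0, 2.0  # hypothetical sample radii with 0 < a < b
print(torus_area(a, b), 4 * math.pi**2 * a * b)
```

The two printed values should agree closely, which supports both the cross product computation and the closed form $4\pi^2 ab$.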

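Since $\frac{\partial \mathbf{r}}{\partial u}$ and $\frac{\partial \mathbf{r}}{\partial v}$ are tangent to the surface $\Sigma$ (i.e. lie in the tangent plane to $\Sigma$ at each point on $\Sigma$), their cross product $\frac{\partial \mathbf{r}}{\partial u} \times \frac{\partial \mathbf{r}}{\partial v}$ is perpendicular to the tangent plane to the surface at each point of $\Sigma$. Thus,
$$\iint_\Sigma f(x, y, z)\,d\sigma = \iint_R f(x(u, v), y(u, v), z(u, v))\,\|\mathbf{n}\|\,du\,dv ,$$
where $\mathbf{n} = \frac{\partial \mathbf{r}}{\partial u} \times \frac{\partial \mathbf{r}}{\partial v}$. We say that $\mathbf{n}$ is a normal vector to $\Sigma$.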

[Figure 4.4.4: outward normal vectors on a sphere]

Recall that normal vectors to a plane can point in two opposite

directions. By an outward unit normal vector to a surface Σ,

we will mean the unit vector that is normal to Σ and points away

from the “top” (or “outer” part) of the surface. This is a hazy

deﬁnition, but the picture in Figure 4.4.4 gives a better idea of

what outward normal vectors look like, in the case of a sphere.

With this idea in mind, we make the following deﬁnition of a

surface integral of a 3-dimensional vector ﬁeld over a surface:

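Definition 4.4. Let $\Sigma$ be a surface in $\mathbb{R}^3$ and let $\mathbf{f}(x, y, z) = f_1(x, y, z)\,\mathbf{i} + f_2(x, y, z)\,\mathbf{j} + f_3(x, y, z)\,\mathbf{k}$ be a vector field defined on some subset of $\mathbb{R}^3$ that contains $\Sigma$. The surface integral of $\mathbf{f}$ over $\Sigma$ is
$$\iint_\Sigma \mathbf{f}\cdot d\boldsymbol{\sigma} = \iint_\Sigma \mathbf{f}\cdot\mathbf{n}\,d\sigma , \tag{4.30}$$
where, at any point on $\Sigma$, $\mathbf{n}$ is the outward unit normal vector to $\Sigma$.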

Note in the above deﬁnition that the dot product inside the integral on the right is a

real-valued function, and hence we can use Deﬁnition 4.3 to evaluate the integral.

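Example 4.10. Evaluate the surface integral $\iint_\Sigma \mathbf{f}\cdot d\boldsymbol{\sigma}$, where $\mathbf{f}(x, y, z) = yz\,\mathbf{i} + xz\,\mathbf{j} + xy\,\mathbf{k}$ and $\Sigma$ is the part of the plane $x + y + z = 1$ with $x \ge 0$, $y \ge 0$, and $z \ge 0$, with the outward unit normal $\mathbf{n}$ pointing in the positive $z$ direction (see Figure 4.4.5).

[Figure 4.4.5: the triangular surface $\Sigma$ on the plane $x + y + z = 1$ in the first octant, with normal vector $\mathbf{n}$]

Solution: Since the vector $\mathbf{v} = (1, 1, 1)$ is normal to the plane $x + y + z = 1$ (why?), dividing $\mathbf{v}$ by its length yields the outward unit normal vector $\mathbf{n} = \left( \frac{1}{\sqrt{3}}, \frac{1}{\sqrt{3}}, \frac{1}{\sqrt{3}} \right)$. We now need to parametrize $\Sigma$. As we can see from Figure 4.4.5, projecting $\Sigma$ onto the $xy$-plane yields a triangular region $R = \{ (x, y) : 0 \le x \le 1,\ 0 \le y \le 1 - x \}$. Thus, using $(u, v)$ instead of $(x, y)$, we see that
$$x = u , \quad y = v , \quad z = 1 - (u + v) , \quad\text{for } 0 \le u \le 1 ,\ 0 \le v \le 1 - u$$
is a parametrization of $\Sigma$ over $R$ (since $z = 1 - (x + y)$ on $\Sigma$). So on $\Sigma$,
$$\begin{aligned}
\mathbf{f}\cdot\mathbf{n} &= (yz, xz, xy)\cdot\left( \tfrac{1}{\sqrt{3}}, \tfrac{1}{\sqrt{3}}, \tfrac{1}{\sqrt{3}} \right) = \tfrac{1}{\sqrt{3}}(yz + xz + xy) \\
&= \tfrac{1}{\sqrt{3}}((x + y)z + xy) = \tfrac{1}{\sqrt{3}}((u + v)(1 - (u + v)) + uv) \\
&= \tfrac{1}{\sqrt{3}}((u + v) - (u + v)^2 + uv)
\end{aligned}$$
for $(u, v)$ in $R$, and for $\mathbf{r}(u, v) = x(u, v)\,\mathbf{i} + y(u, v)\,\mathbf{j} + z(u, v)\,\mathbf{k} = u\,\mathbf{i} + v\,\mathbf{j} + (1 - (u + v))\,\mathbf{k}$ we have
$$\frac{\partial \mathbf{r}}{\partial u} \times \frac{\partial \mathbf{r}}{\partial v} = (1, 0, -1) \times (0, 1, -1) = (1, 1, 1) \;\Rightarrow\; \left\| \frac{\partial \mathbf{r}}{\partial u} \times \frac{\partial \mathbf{r}}{\partial v} \right\| = \sqrt{3} .$$
Thus, integrating over $R$ using vertical slices (e.g. as indicated by the dashed line in Figure 4.4.5) gives
$$\begin{aligned}
\iint_\Sigma \mathbf{f}\cdot d\boldsymbol{\sigma} &= \iint_\Sigma \mathbf{f}\cdot\mathbf{n}\,d\sigma \\
&= \iint_R (\mathbf{f}(x(u, v), y(u, v), z(u, v))\cdot\mathbf{n}) \left\| \frac{\partial \mathbf{r}}{\partial u} \times \frac{\partial \mathbf{r}}{\partial v} \right\| dv\,du \\
&= \int_0^1\!\!\int_0^{1 - u} \tfrac{1}{\sqrt{3}}((u + v) - (u + v)^2 + uv)\,\sqrt{3}\,dv\,du \\
&= \int_0^1 \left( \frac{(u + v)^2}{2} - \frac{(u + v)^3}{3} + \frac{uv^2}{2}\,\Bigg|_{v = 0}^{v = 1 - u} \right) du \\
&= \int_0^1 \left( \frac{1}{6} + \frac{u}{2} - \frac{3u^2}{2} + \frac{5u^3}{6} \right) du \\
&= \frac{u}{6} + \frac{u^2}{4} - \frac{u^3}{2} + \frac{5u^4}{24}\,\Bigg|_0^1 = \frac{1}{8} .
\end{aligned}$$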
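The value $\frac{1}{8}$ from Example 4.10 can be checked with a direct midpoint sum over the triangular region $R$. This is a minimal sketch: the function name `flux_through_plane` and the grid size are assumptions made here. It uses the fact from the example that $(\mathbf{f}\cdot\mathbf{n})\,\|\mathbf{r}_u \times \mathbf{r}_v\|$ equals $\mathbf{f}\cdot(1, 1, 1)$.

```python
def flux_through_plane(n=400):
    """Midpoint-rule approximation of the flux of f = (yz, xz, xy) through
    the part of x + y + z = 1 in the first octant, using x = u, y = v,
    z = 1 - (u + v) and the normal vector r_u x r_v = (1, 1, 1)."""
    hu = 1.0 / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * hu
        hv = (1.0 - u) / n  # v runs from 0 to 1 - u on this vertical slice
        for j in range(n):
            v = (j + 0.5) * hv
            x, y, z = u, v, 1.0 - (u + v)
            f = (y * z, x * z, x * y)
            # f . (r_u x r_v), which equals (f . n) * |r_u x r_v|
            total += (f[0] + f[1] + f[2]) * hu * hv
    return total

flux = flux_through_plane()
print(flux)
```

The printed value should be close to 0.125.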

Computing surface integrals can often be tedious, especially when the formula for the

outward unit normal vector at each point of Σ changes. The following theorem provides an

easier way in the case when Σ is a closed surface, that is, when Σ encloses a bounded

solid in $\mathbb{R}^3$. For example, spheres, cubes, and ellipsoids are closed surfaces, but planes and paraboloids are not.

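Theorem 4.8. (Divergence Theorem) Let $\Sigma$ be a closed surface in $\mathbb{R}^3$ which bounds a solid $S$, and let $\mathbf{f}(x, y, z) = f_1(x, y, z)\,\mathbf{i} + f_2(x, y, z)\,\mathbf{j} + f_3(x, y, z)\,\mathbf{k}$ be a vector field defined on some subset of $\mathbb{R}^3$ that contains $\Sigma$. Then
$$\iint_\Sigma \mathbf{f}\cdot d\boldsymbol{\sigma} = \iiint_S \operatorname{div}\mathbf{f}\,dV , \tag{4.31}$$
where
$$\operatorname{div}\mathbf{f} = \frac{\partial f_1}{\partial x} + \frac{\partial f_2}{\partial y} + \frac{\partial f_3}{\partial z} \tag{4.32}$$
is called the divergence of $\mathbf{f}$.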

The proof of the Divergence Theorem is very similar to the proof of Green’s Theorem, i.e. it

is ﬁrst proved for the simple case when the solid S is bounded above by one surface, bounded

below by another surface, and bounded laterally by one or more surfaces. The proof can then be extended to more general solids.[3]

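Example 4.11. Evaluate $\iint_\Sigma \mathbf{f}\cdot d\boldsymbol{\sigma}$, where $\mathbf{f}(x, y, z) = x\,\mathbf{i} + y\,\mathbf{j} + z\,\mathbf{k}$ and $\Sigma$ is the unit sphere $x^2 + y^2 + z^2 = 1$.

Solution: We see that $\operatorname{div}\mathbf{f} = 1 + 1 + 1 = 3$, so
$$\iint_\Sigma \mathbf{f}\cdot d\boldsymbol{\sigma} = \iiint_S \operatorname{div}\mathbf{f}\,dV = \iiint_S 3\,dV = 3\iiint_S 1\,dV = 3\operatorname{vol}(S) = 3\cdot\frac{4\pi(1)^3}{3} = 4\pi .$$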
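The result of Example 4.11 can be compared against a direct evaluation of the surface integral. The sketch below is an illustration under stated assumptions: it uses the spherical-coordinate parametrization $x = \sin\phi\cos\theta$, $y = \sin\phi\sin\theta$, $z = \cos\phi$ of the unit sphere (not used in the example itself), for which the outward unit normal is $\mathbf{n} = (x, y, z)$ and the area element is $\sin\phi\,d\phi\,d\theta$; the function name is hypothetical.

```python
import math

def flux_through_unit_sphere(n=400):
    """Approximate the flux of f = (x, y, z) through the unit sphere
    with an n-by-n midpoint rule in (phi, theta)."""
    hp = math.pi / n        # step in phi, 0 <= phi <= pi
    ht = 2 * math.pi / n    # step in theta, 0 <= theta <= 2*pi
    total = 0.0
    for i in range(n):
        phi = (i + 0.5) * hp
        for j in range(n):
            theta = (j + 0.5) * ht
            x = math.sin(phi) * math.cos(theta)
            y = math.sin(phi) * math.sin(theta)
            z = math.cos(phi)
            # f . n with f = (x, y, z) and n = (x, y, z) on the unit sphere
            f_dot_n = x * x + y * y + z * z
            total += f_dot_n * math.sin(phi) * hp * ht
    return total

direct = flux_through_unit_sphere()
via_divergence = 3 * (4 * math.pi / 3)  # 3 * vol(S), as in Example 4.11
print(direct, via_divergence)
```

Both values should be close to $4\pi$; the Divergence Theorem route needs no parametrization at all, which is the point of the example.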

In physical applications, the surface integral $\iint_\Sigma \mathbf{f}\cdot d\boldsymbol{\sigma}$ is often referred to as the flux of $\mathbf{f}$ through the surface $\Sigma$. For example, if $\mathbf{f}$ represents the velocity field of a fluid, then the flux is the net quantity of fluid that flows through the surface $\Sigma$ per unit time. A positive flux means

there is a net ﬂow out of the surface (i.e. in the direction of the outward unit normal vector

n), while a negative ﬂux indicates a net ﬂow inward (in the direction of −n).

The term divergence comes from interpreting $\operatorname{div}\mathbf{f}$ as a measure of how much a vector field “diverges” from a point. This is best seen by using another definition of $\operatorname{div}\mathbf{f}$ which is equivalent[4] to the definition given by formula (4.32). Namely, for a point $(x, y, z)$ in $\mathbb{R}^3$,
$$\operatorname{div}\mathbf{f}(x, y, z) = \lim_{V \to 0} \frac{1}{V} \iint_\Sigma \mathbf{f}\cdot d\boldsymbol{\sigma} , \tag{4.33}$$

[3] See TAYLOR and MANN, § 15.6 for the details.

[4] See SCHEY, p. 36-39, for an intuitive discussion of this.

where V is the volume enclosed by a closed surface Σ around the point (x, y, z). In the

limit, V →0 means that we take smaller and smaller closed surfaces around (x, y, z), which

means that the volumes they enclose are going to zero. It can be shown that this limit is

independent of the shapes of those surfaces. Notice that the limit being taken is of the

ratio of the ﬂux through a surface to the volume enclosed by that surface, which gives a

rough measure of the ﬂow “leaving” a point, as we mentioned. Vector ﬁelds which have zero

divergence are often called solenoidal ﬁelds.
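The limit in formula (4.33) can be illustrated numerically by computing the flux out of smaller and smaller cubes centered at a point and dividing by their volumes. This is a sketch only: the choice of cubes as the closed surfaces, the sample field $\mathbf{f} = x^3\,\mathbf{i} + y\,\mathbf{j} + xz\,\mathbf{k}$, the point $(1, 2, 3)$, and all function names are assumptions made here for illustration.

```python
def flux_over_volume(f, p, s, n=20):
    """Flux of f out of the cube of side s centered at p, divided by s^3.
    Each pair of opposite faces is integrated with an n-by-n midpoint rule."""
    px, py, pz = p
    h = s / n
    flux = 0.0
    for i in range(n):
        a = -s / 2 + (i + 0.5) * h
        for j in range(n):
            b = -s / 2 + (j + 0.5) * h
            # Faces x = px +/- s/2, outward normals +i and -i.
            flux += (f(px + s / 2, py + a, pz + b)[0]
                     - f(px - s / 2, py + a, pz + b)[0]) * h * h
            # Faces y = py +/- s/2, outward normals +j and -j.
            flux += (f(px + a, py + s / 2, pz + b)[1]
                     - f(px + a, py - s / 2, pz + b)[1]) * h * h
            # Faces z = pz +/- s/2, outward normals +k and -k.
            flux += (f(px + a, py + b, pz + s / 2)[2]
                     - f(px + a, py + b, pz - s / 2)[2]) * h * h
    return flux / s**3

def field(x, y, z):
    """Hypothetical sample field f = x^3 i + y j + xz k."""
    return (x**3, y, x * z)

def div_field(x, y, z):
    """Divergence of the sample field by formula (4.32): 3x^2 + 1 + x."""
    return 3 * x**2 + 1 + x

p = (1.0, 2.0, 3.0)
ratios = [flux_over_volume(field, p, s) for s in (1.0, 0.1, 0.01)]
print(ratios, div_field(*p))
```

As the side length shrinks, the ratios should approach the value of $\operatorname{div}\mathbf{f}$ at the point computed from formula (4.32).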

The following theorem is a simple consequence of formula (4.33).

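Theorem 4.9. If the flux of a vector field $\mathbf{f}$ is zero through every closed surface containing a given point, then $\operatorname{div}\mathbf{f} = 0$ at that point.

Proof: By formula (4.33), at the given point $(x, y, z)$ we have
$$\begin{aligned}
\operatorname{div}\mathbf{f}(x, y, z) &= \lim_{V \to 0} \frac{1}{V} \iint_\Sigma \mathbf{f}\cdot d\boldsymbol{\sigma} \quad\text{for closed surfaces } \Sigma \text{ containing } (x, y, z)\text{, so} \\
&= \lim_{V \to 0} \frac{1}{V}(0) \quad\text{by our assumption that the flux through each } \Sigma \text{ is zero, so} \\
&= \lim_{V \to 0} 0 \\
&= 0 .
\end{aligned}$$
QED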

Lastly, we note that sometimes the notation
$$\oiint_\Sigma f(x, y, z)\,d\sigma \quad\text{and}\quad \oiint_\Sigma \mathbf{f}\cdot d\boldsymbol{\sigma}$$
is used to denote surface integrals of scalar and vector fields, respectively, over closed surfaces. Especially in physics texts, it is common to see simply $\oint_\Sigma$ instead of $\oiint_\Sigma$.

Exercises

A

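For Exercises 1-4, use the Divergence Theorem to evaluate the surface integral $\iint_\Sigma \mathbf{f}\cdot d\boldsymbol{\sigma}$ of the given vector field $\mathbf{f}(x, y, z)$ over the surface $\Sigma$.

1. $\mathbf{f}(x, y, z) = x\,\mathbf{i} + 2y\,\mathbf{j} + 3z\,\mathbf{k}$, $\Sigma : x^2 + y^2 + z^2 = 9$

2. $\mathbf{f}(x, y, z) = x\,\mathbf{i} + y\,\mathbf{j} + z\,\mathbf{k}$, $\Sigma$: boundary of the solid cube $S = \{ (x, y, z) : 0 \le x, y, z \le 1 \}$

3. $\mathbf{f}(x, y, z) = x^3\,\mathbf{i} + y^3\,\mathbf{j} + z^3\,\mathbf{k}$, $\Sigma : x^2 + y^2 + z^2 = 1$

4. $\mathbf{f}(x, y, z) = 2\,\mathbf{i} + 3\,\mathbf{j} + 5\,\mathbf{k}$, $\Sigma : x^2 + y^2 + z^2 = 1$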

B

5. Show that the ﬂux of any constant vector ﬁeld through any closed surface is zero.

6. Evaluate the surface integral from Exercise 2 without using the Divergence Theorem, i.e.

using only Deﬁnition 4.3, as in Example 4.10. Note that there will be a different outward

unit normal vector to each of the six faces of the cube.

7. Evaluate the surface integral $\iint_\Sigma \mathbf{f}\cdot d\boldsymbol{\sigma}$, where $\mathbf{f}(x, y, z) = x^2\,\mathbf{i} + xy\,\mathbf{j} + z\,\mathbf{k}$ and $\Sigma$ is the part of the plane $6x + 3y + 2z = 6$ with $x \ge 0$, $y \ge 0$, and $z \ge 0$, with the outward unit normal $\mathbf{n}$ pointing in the positive $z$ direction.

8. Use a surface integral to show that the surface area of a sphere of radius $r$ is $4\pi r^2$. (Hint: Use spherical coordinates to parametrize the sphere.)

9. Use a surface integral to show that the surface area of a right circular cone of radius $R$ and height $h$ is $\pi R\sqrt{h^2 + R^2}$. (Hint: Use the parametrization $x = r\cos\theta$, $y = r\sin\theta$, $z = \frac{h}{R}r$, for $0 \le r \le R$ and $0 \le \theta \le 2\pi$.)

10. The ellipsoid $\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1$ can be parametrized using ellipsoidal coordinates
$$x = a\sin\phi\cos\theta , \quad y = b\sin\phi\sin\theta , \quad z = c\cos\phi , \quad\text{for } 0 \le \theta \le 2\pi \text{ and } 0 \le \phi \le \pi .$$
Show that the surface area $S$ of the ellipsoid is
$$S = \int_0^\pi\!\!\int_0^{2\pi} \sin\phi\,\sqrt{a^2 b^2 \cos^2\phi + c^2 (a^2 \sin^2\theta + b^2 \cos^2\theta)\sin^2\phi}\;d\theta\,d\phi .$$
(Note: The above double integral cannot be evaluated by elementary means. For specific values of $a$, $b$ and $c$ it can be evaluated using numerical methods. An alternative is to express the surface area in terms of elliptic integrals.[5])

C

11. Use Definition 4.3 to prove that the surface area $S$ over a region $R$ in $\mathbb{R}^2$ of a surface $z = f(x, y)$ is given by the formula
$$S = \iint_R \sqrt{1 + \left( \frac{\partial f}{\partial x} \right)^2 + \left( \frac{\partial f}{\partial y} \right)^2}\;dA .$$
(Hint: Think of the parametrization of the surface.)

[5] BOWMAN, F., Introduction to Elliptic Functions, with Applications, New York: Dover, 1961, § III.7.

4.5 Stokes’ Theorem

So far the only types of line integrals which we have discussed are those along curves in $\mathbb{R}^2$. But the definitions and properties which were covered in Sections 4.1 and 4.2 can easily be extended to include functions of three variables, so that we can now discuss line integrals along curves in $\mathbb{R}^3$.

Definition 4.5. For a real-valued function $f(x, y, z)$ and a curve $C$ in $\mathbb{R}^3$, parametrized by $x = x(t)$, $y = y(t)$, $z = z(t)$, $a \le t \le b$, the line integral of $f(x, y, z)$ along $C$ with respect to arc length $s$ is
$$\int_C f(x, y, z)\,ds = \int_a^b f(x(t), y(t), z(t))\,\sqrt{x'(t)^2 + y'(t)^2 + z'(t)^2}\;dt . \tag{4.34}$$
The line integral of $f(x, y, z)$ along $C$ with respect to $x$ is
$$\int_C f(x, y, z)\,dx = \int_a^b f(x(t), y(t), z(t))\,x'(t)\,dt . \tag{4.35}$$
The line integral of $f(x, y, z)$ along $C$ with respect to $y$ is
$$\int_C f(x, y, z)\,dy = \int_a^b f(x(t), y(t), z(t))\,y'(t)\,dt . \tag{4.36}$$
The line integral of $f(x, y, z)$ along $C$ with respect to $z$ is
$$\int_C f(x, y, z)\,dz = \int_a^b f(x(t), y(t), z(t))\,z'(t)\,dt . \tag{4.37}$$

Similar to the two-variable case, if $f(x, y, z) \ge 0$ then the line integral $\int_C f(x, y, z)\,ds$ can be thought of as the total area of the “picket fence” of height $f(x, y, z)$ at each point along the curve $C$ in $\mathbb{R}^3$.

Vector fields in $\mathbb{R}^3$ are defined in a similar fashion to those in $\mathbb{R}^2$, which allows us to define the line integral of a vector field along a curve in $\mathbb{R}^3$.

Definition 4.6. For a vector field $\mathbf{f}(x, y, z) = P(x, y, z)\,\mathbf{i} + Q(x, y, z)\,\mathbf{j} + R(x, y, z)\,\mathbf{k}$ and a curve $C$ in $\mathbb{R}^3$ with a smooth parametrization $x = x(t)$, $y = y(t)$, $z = z(t)$, $a \le t \le b$, the line integral of $\mathbf{f}$ along $C$ is
$$\int_C \mathbf{f}\cdot d\mathbf{r} = \int_C P(x, y, z)\,dx + \int_C Q(x, y, z)\,dy + \int_C R(x, y, z)\,dz \tag{4.38}$$
$$= \int_a^b \mathbf{f}(x(t), y(t), z(t))\cdot\mathbf{r}'(t)\,dt , \tag{4.39}$$
where $\mathbf{r}(t) = x(t)\,\mathbf{i} + y(t)\,\mathbf{j} + z(t)\,\mathbf{k}$ is the position vector for points on $C$.

Similar to the two-variable case, if $\mathbf{f}(x, y, z)$ represents the force applied to an object at a point $(x, y, z)$ then the line integral $\int_C \mathbf{f}\cdot d\mathbf{r}$ represents the work done by that force in moving the object along the curve $C$ in $\mathbb{R}^3$.

Some of the most important results we will need for line integrals in $\mathbb{R}^3$ are stated below without proof (the proofs are similar to their two-variable equivalents).

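Theorem 4.10. For a vector field $\mathbf{f}(x, y, z) = P(x, y, z)\,\mathbf{i} + Q(x, y, z)\,\mathbf{j} + R(x, y, z)\,\mathbf{k}$ and a curve $C$ with a smooth parametrization $x = x(t)$, $y = y(t)$, $z = z(t)$, $a \le t \le b$ and position vector $\mathbf{r}(t) = x(t)\,\mathbf{i} + y(t)\,\mathbf{j} + z(t)\,\mathbf{k}$,
$$\int_C \mathbf{f}\cdot d\mathbf{r} = \int_C \mathbf{f}\cdot\mathbf{T}\,ds , \tag{4.40}$$
where $\mathbf{T}(t) = \dfrac{\mathbf{r}'(t)}{\|\mathbf{r}'(t)\|}$ is the unit tangent vector to $C$ at $(x(t), y(t), z(t))$.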

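Theorem 4.11. (Chain Rule) If $w = f(x, y, z)$ is a continuously differentiable function of $x$, $y$, and $z$, and $x = x(t)$, $y = y(t)$ and $z = z(t)$ are differentiable functions of $t$, then $w$ is a differentiable function of $t$, and
$$\frac{dw}{dt} = \frac{\partial w}{\partial x}\frac{dx}{dt} + \frac{\partial w}{\partial y}\frac{dy}{dt} + \frac{\partial w}{\partial z}\frac{dz}{dt} . \tag{4.41}$$
Also, if $x = x(t_1, t_2)$, $y = y(t_1, t_2)$ and $z = z(t_1, t_2)$ are continuously differentiable functions of $(t_1, t_2)$, then[6]
$$\frac{\partial w}{\partial t_1} = \frac{\partial w}{\partial x}\frac{\partial x}{\partial t_1} + \frac{\partial w}{\partial y}\frac{\partial y}{\partial t_1} + \frac{\partial w}{\partial z}\frac{\partial z}{\partial t_1} \tag{4.42}$$
and
$$\frac{\partial w}{\partial t_2} = \frac{\partial w}{\partial x}\frac{\partial x}{\partial t_2} + \frac{\partial w}{\partial y}\frac{\partial y}{\partial t_2} + \frac{\partial w}{\partial z}\frac{\partial z}{\partial t_2} . \tag{4.43}$$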

Theorem 4.12. Let $\mathbf{f}(x, y, z) = P(x, y, z)\,\mathbf{i} + Q(x, y, z)\,\mathbf{j} + R(x, y, z)\,\mathbf{k}$ be a vector field in some solid $S$, with $P$, $Q$ and $R$ continuously differentiable functions on $S$. Let $C$ be a smooth curve in $S$ parametrized by $x = x(t)$, $y = y(t)$, $z = z(t)$, $a \le t \le b$. Suppose that there is a real-valued function $F(x, y, z)$ such that $\nabla F = \mathbf{f}$ on $S$. Then
$$\int_C \mathbf{f}\cdot d\mathbf{r} = F(B) - F(A) , \tag{4.44}$$
where $A = (x(a), y(a), z(a))$ and $B = (x(b), y(b), z(b))$ are the endpoints of $C$.

Corollary 4.13. If a vector field $\mathbf{f}$ has a potential in a solid $S$, then $\oint_C \mathbf{f}\cdot d\mathbf{r} = 0$ for any closed curve $C$ in $S$ (i.e. $\oint_C \nabla F\cdot d\mathbf{r} = 0$ for any real-valued function $F(x, y, z)$).

[6] See TAYLOR and MANN, § 6.5 for a proof.

Example 4.12. Let $f(x, y, z) = z$ and let $C$ be the curve in $\mathbb{R}^3$ parametrized by
$$x = t\sin t , \quad y = t\cos t , \quad z = t , \quad 0 \le t \le 8\pi .$$
Evaluate $\int_C f(x, y, z)\,ds$. (Note: $C$ is called a conical helix. See Figure 4.5.1.)

Solution: Since $x'(t) = \sin t + t\cos t$, $y'(t) = \cos t - t\sin t$, and $z'(t) = 1$, we have
$$\begin{aligned}
x'(t)^2 + y'(t)^2 + z'(t)^2 &= (\sin^2 t + 2t\sin t\cos t + t^2\cos^2 t) + (\cos^2 t - 2t\sin t\cos t + t^2\sin^2 t) + 1 \\
&= t^2(\sin^2 t + \cos^2 t) + \sin^2 t + \cos^2 t + 1 \\
&= t^2 + 2 ,
\end{aligned}$$
so since $f(x(t), y(t), z(t)) = z(t) = t$ along the curve $C$,
$$\begin{aligned}
\int_C f(x, y, z)\,ds &= \int_0^{8\pi} f(x(t), y(t), z(t))\,\sqrt{x'(t)^2 + y'(t)^2 + z'(t)^2}\;dt \\
&= \int_0^{8\pi} t\sqrt{t^2 + 2}\;dt \\
&= \left( \tfrac{1}{3}(t^2 + 2)^{3/2} \right)\Bigg|_0^{8\pi} = \tfrac{1}{3}\left( (64\pi^2 + 2)^{3/2} - 2\sqrt{2} \right) .
\end{aligned}$$

Figure 4.5.1 Conical helix C

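Example 4.13. Let $\mathbf{f}(x, y, z) = x\,\mathbf{i} + y\,\mathbf{j} + 2z\,\mathbf{k}$ be a vector field in $\mathbb{R}^3$. Using the same curve $C$ from Example 4.12, evaluate $\int_C \mathbf{f} \cdot d\mathbf{r}$.

Solution: It is easy to see that $F(x, y, z) = \frac{x^2}{2} + \frac{y^2}{2} + z^2$ is a potential for $\mathbf{f}(x, y, z)$ (i.e. $\nabla F = \mathbf{f}$).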


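So by Theorem 4.12 we know that

$$\begin{aligned}
\int_C \mathbf{f} \cdot d\mathbf{r} &= F(B) - F(A) , \text{ where } A = (x(0), y(0), z(0)) \text{ and } B = (x(8\pi), y(8\pi), z(8\pi)) , \text{ so} \\
&= F(8\pi \sin 8\pi,\, 8\pi \cos 8\pi,\, 8\pi) - F(0 \sin 0,\, 0 \cos 0,\, 0) \\
&= F(0, 8\pi, 8\pi) - F(0, 0, 0) \\
&= 0 + \frac{(8\pi)^2}{2} + (8\pi)^2 - (0 + 0 + 0) = 96\pi^2 .
\end{aligned}$$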
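As a hedged cross-check of Example 4.13, the sketch below evaluates the line integral directly from the parametrization and compares it with the potential difference $F(B) - F(A)$ from Theorem 4.12. The function names are illustrative and not from the text.

```python
import math

def simpson(g, a, b, n=2000):
    """Composite Simpson's rule for the integral of g over [a, b]; n must be even."""
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3

def potential(x, y, z):
    # F(x, y, z) = x^2/2 + y^2/2 + z^2, the potential found in the solution.
    return x * x / 2 + y * y / 2 + z * z

def integrand(t):
    # f . r'(t) for f = x i + y j + 2z k on the conical helix.
    x, y, z = t * math.sin(t), t * math.cos(t), t
    dx = math.sin(t) + t * math.cos(t)
    dy = math.cos(t) - t * math.sin(t)
    dz = 1.0
    return x * dx + y * dy + 2 * z * dz

T = 8 * math.pi
direct = simpson(integrand, 0.0, T)
via_potential = potential(T * math.sin(T), T * math.cos(T), T) - potential(0.0, 0.0, 0.0)
print(direct, via_potential, 96 * math.pi ** 2)
```

Both routes should land on $96\pi^2$; the potential route needs only the two endpoints, which is the practical payoff of Theorem 4.12.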

We will now discuss a generalization of Green’s Theorem in $\mathbb{R}^2$ to orientable surfaces in $\mathbb{R}^3$, called Stokes’ Theorem. A surface $\Sigma$ in $\mathbb{R}^3$ is orientable if there is a continuous vector field $\mathbf{N}$ in $\mathbb{R}^3$ such that $\mathbf{N}$ is nonzero and normal to $\Sigma$ (i.e. perpendicular to the tangent plane) at each point of $\Sigma$. We say that such an $\mathbf{N}$ is a normal vector field.

Figure 4.5.2 (the unit sphere with normal vectors N and −N)

For example, the unit sphere $x^2 + y^2 + z^2 = 1$ is orientable, since the continuous vector field $\mathbf{N}(x, y, z) = x\,\mathbf{i} + y\,\mathbf{j} + z\,\mathbf{k}$ is nonzero and normal to the sphere at each point. In fact, $-\mathbf{N}(x, y, z)$ is another normal vector field (see Figure 4.5.2). We see in this case that $\mathbf{N}(x, y, z)$ is what we have called an outward normal vector, and $-\mathbf{N}(x, y, z)$ is an inward normal vector. These “outward” and “inward” normal vector fields on the sphere correspond to an “outer” and “inner” side, respectively, of the sphere. That is, we say that the sphere is a two-sided surface. Roughly, “two-sided” means “orientable”. Other examples of two-sided, and hence orientable, surfaces are cylinders, paraboloids, ellipsoids, and planes.

You may be wondering what kind of surface would not have two sides. An example is the

Möbius strip, which is constructed by taking a thin rectangle and connecting its ends at

the opposite corners, resulting in a “twisted” strip (see Figure 4.5.3).

Figure 4.5.3 Möbius strip: (a) Connect A to A and B to B along the ends; (b) Not orientable

If you imagine walking along a line down the center of the Möbius strip, as in Figure

4.5.3(b), then you arrive back at the same place from which you started but upside down!

That is, your orientation changed even though your motion was continuous along that center


line. Informally, thinking of your vertical direction as a normal vector ﬁeld along the strip,

there is a discontinuity at your starting point (and, in fact, at every point) since your vertical

direction takes two different values there. The Möbius strip has only one side, and hence is nonorientable.⁷

For an orientable surface Σ which has a boundary curve C, pick a unit normal vector n

such that if you walked along C with your head pointing in the direction of n, then the

surface would be on your left. We say in this situation that n is a positive unit normal vector

and that C is traversed n-positively. We can now state Stokes’ Theorem:

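Theorem 4.14. (Stokes’ Theorem) Let $\Sigma$ be an orientable surface in $\mathbb{R}^3$ whose boundary is a simple closed curve $C$, and let $\mathbf{f}(x, y, z) = P(x, y, z)\,\mathbf{i} + Q(x, y, z)\,\mathbf{j} + R(x, y, z)\,\mathbf{k}$ be a smooth vector field defined on some subset of $\mathbb{R}^3$ that contains $\Sigma$. Then

$$\oint_C \mathbf{f} \cdot d\mathbf{r} = \iint_\Sigma (\operatorname{curl} \mathbf{f}) \cdot \mathbf{n}\, d\sigma , \tag{4.45}$$

where

$$\operatorname{curl} \mathbf{f} = \left( \frac{\partial R}{\partial y} - \frac{\partial Q}{\partial z} \right) \mathbf{i} + \left( \frac{\partial P}{\partial z} - \frac{\partial R}{\partial x} \right) \mathbf{j} + \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) \mathbf{k} , \tag{4.46}$$

$\mathbf{n}$ is a positive unit normal vector over $\Sigma$, and $C$ is traversed $\mathbf{n}$-positively.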

Proof: As the general case is beyond the scope of this text, we will prove the theorem only for the special case where $\Sigma$ is the graph of $z = z(x, y)$ for some smooth real-valued function $z(x, y)$, with $(x, y)$ varying over a region $D$ in $\mathbb{R}^2$.

Figure 4.5.4 (the surface Σ: z = z(x, y) with normal n and boundary C, above the region D with boundary C_D)

Projecting $\Sigma$ onto the $xy$-plane, we see that the closed curve $C$ (the boundary curve of $\Sigma$) projects onto a closed curve $C_D$ which is the boundary curve of $D$ (see Figure 4.5.4). Assuming that $C$ has a smooth parametrization, its projection $C_D$ in the $xy$-plane also has a smooth parametrization, say

$$C_D : x = x(t) , \quad y = y(t) , \quad a \le t \le b ,$$

and so $C$ can be parametrized (in $\mathbb{R}^3$) as

$$C : x = x(t) , \quad y = y(t) , \quad z = z(x(t), y(t)) , \quad a \le t \le b ,$$

since the curve $C$ is part of the surface $z = z(x, y)$. Now, by the Chain Rule (Theorem 4.4 in Section 4.2), for $z = z(x(t), y(t))$ as a function of $t$, we know that

$$z'(t) = \frac{\partial z}{\partial x}\, x'(t) + \frac{\partial z}{\partial y}\, y'(t) ,$$

⁷ For further discussion of orientability, see O’NEILL, § IV.7.


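and so

$$\begin{aligned}
\oint_C \mathbf{f} \cdot d\mathbf{r} &= \oint_C P(x, y, z)\, dx + Q(x, y, z)\, dy + R(x, y, z)\, dz \\
&= \int_a^b \left( P\, x'(t) + Q\, y'(t) + R \left( \frac{\partial z}{\partial x}\, x'(t) + \frac{\partial z}{\partial y}\, y'(t) \right) \right) dt \\
&= \int_a^b \left( \left( P + R\, \frac{\partial z}{\partial x} \right) x'(t) + \left( Q + R\, \frac{\partial z}{\partial y} \right) y'(t) \right) dt \\
&= \oint_{C_D} \tilde{P}(x, y)\, dx + \tilde{Q}(x, y)\, dy ,
\end{aligned}$$

where

$$\begin{aligned}
\tilde{P}(x, y) &= P(x, y, z(x, y)) + R(x, y, z(x, y))\, \frac{\partial z}{\partial x}(x, y) , \text{ and} \\
\tilde{Q}(x, y) &= Q(x, y, z(x, y)) + R(x, y, z(x, y))\, \frac{\partial z}{\partial y}(x, y)
\end{aligned}$$

for $(x, y)$ in $D$. Thus, by Green’s Theorem applied to the region $D$, we have

$$\oint_C \mathbf{f} \cdot d\mathbf{r} = \iint_D \left( \frac{\partial \tilde{Q}}{\partial x} - \frac{\partial \tilde{P}}{\partial y} \right) dA . \tag{4.47}$$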

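Thus,

$$\begin{aligned}
\frac{\partial \tilde{Q}}{\partial x} &= \frac{\partial}{\partial x} \left( Q(x, y, z(x, y)) + R(x, y, z(x, y))\, \frac{\partial z}{\partial y}(x, y) \right) , \text{ so by the Product Rule we get} \\
&= \frac{\partial}{\partial x} \bigl( Q(x, y, z(x, y)) \bigr) + \left( \frac{\partial}{\partial x} R(x, y, z(x, y)) \right) \frac{\partial z}{\partial y}(x, y) + R(x, y, z(x, y))\, \frac{\partial}{\partial x} \left( \frac{\partial z}{\partial y}(x, y) \right) .
\end{aligned}$$

Now, by formula (4.42) in Theorem 4.11, we have

$$\begin{aligned}
\frac{\partial}{\partial x} \bigl( Q(x, y, z(x, y)) \bigr) &= \frac{\partial Q}{\partial x} \frac{\partial x}{\partial x} + \frac{\partial Q}{\partial y} \frac{\partial y}{\partial x} + \frac{\partial Q}{\partial z} \frac{\partial z}{\partial x} \\
&= \frac{\partial Q}{\partial x} \cdot 1 + \frac{\partial Q}{\partial y} \cdot 0 + \frac{\partial Q}{\partial z} \frac{\partial z}{\partial x} \\
&= \frac{\partial Q}{\partial x} + \frac{\partial Q}{\partial z} \frac{\partial z}{\partial x} .
\end{aligned}$$

Similarly,

$$\frac{\partial}{\partial x} \bigl( R(x, y, z(x, y)) \bigr) = \frac{\partial R}{\partial x} + \frac{\partial R}{\partial z} \frac{\partial z}{\partial x} .$$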


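Thus,

$$\begin{aligned}
\frac{\partial \tilde{Q}}{\partial x} &= \frac{\partial Q}{\partial x} + \frac{\partial Q}{\partial z} \frac{\partial z}{\partial x} + \left( \frac{\partial R}{\partial x} + \frac{\partial R}{\partial z} \frac{\partial z}{\partial x} \right) \frac{\partial z}{\partial y} + R(x, y, z(x, y))\, \frac{\partial^2 z}{\partial x\, \partial y} \\
&= \frac{\partial Q}{\partial x} + \frac{\partial Q}{\partial z} \frac{\partial z}{\partial x} + \frac{\partial R}{\partial x} \frac{\partial z}{\partial y} + \frac{\partial R}{\partial z} \frac{\partial z}{\partial x} \frac{\partial z}{\partial y} + R\, \frac{\partial^2 z}{\partial x\, \partial y} .
\end{aligned}$$

In a similar fashion, we can calculate

$$\frac{\partial \tilde{P}}{\partial y} = \frac{\partial P}{\partial y} + \frac{\partial P}{\partial z} \frac{\partial z}{\partial y} + \frac{\partial R}{\partial y} \frac{\partial z}{\partial x} + \frac{\partial R}{\partial z} \frac{\partial z}{\partial y} \frac{\partial z}{\partial x} + R\, \frac{\partial^2 z}{\partial y\, \partial x} .$$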

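So subtracting gives

$$\frac{\partial \tilde{Q}}{\partial x} - \frac{\partial \tilde{P}}{\partial y} = \left( \frac{\partial Q}{\partial z} - \frac{\partial R}{\partial y} \right) \frac{\partial z}{\partial x} + \left( \frac{\partial R}{\partial x} - \frac{\partial P}{\partial z} \right) \frac{\partial z}{\partial y} + \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) \tag{4.48}$$

since $\frac{\partial^2 z}{\partial x\, \partial y} = \frac{\partial^2 z}{\partial y\, \partial x}$ by the smoothness of $z = z(x, y)$. Hence, by equation (4.47),

$$\oint_C \mathbf{f} \cdot d\mathbf{r} = \iint_D \left( - \left( \frac{\partial R}{\partial y} - \frac{\partial Q}{\partial z} \right) \frac{\partial z}{\partial x} - \left( \frac{\partial P}{\partial z} - \frac{\partial R}{\partial x} \right) \frac{\partial z}{\partial y} + \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) \right) dA \tag{4.49}$$

after factoring out a $-1$ from the terms in the first two products in equation (4.48).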

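Now, recall from Section 2.3 (see p.76) that the vector $\mathbf{N} = -\frac{\partial z}{\partial x}\,\mathbf{i} - \frac{\partial z}{\partial y}\,\mathbf{j} + \mathbf{k}$ is normal to the tangent plane to the surface $z = z(x, y)$ at each point of $\Sigma$. Thus,

$$\mathbf{n} = \frac{\mathbf{N}}{\lVert \mathbf{N} \rVert} = \frac{-\frac{\partial z}{\partial x}\,\mathbf{i} - \frac{\partial z}{\partial y}\,\mathbf{j} + \mathbf{k}}{\sqrt{1 + \left( \frac{\partial z}{\partial x} \right)^2 + \left( \frac{\partial z}{\partial y} \right)^2}}$$

is in fact a positive unit normal vector to $\Sigma$ (see Figure 4.5.4). Hence, using the parametrization $\mathbf{r}(x, y) = x\,\mathbf{i} + y\,\mathbf{j} + z(x, y)\,\mathbf{k}$, for $(x, y)$ in $D$, of the surface $\Sigma$, we have $\frac{\partial \mathbf{r}}{\partial x} = \mathbf{i} + \frac{\partial z}{\partial x}\,\mathbf{k}$ and $\frac{\partial \mathbf{r}}{\partial y} = \mathbf{j} + \frac{\partial z}{\partial y}\,\mathbf{k}$, and so $\left\lVert \frac{\partial \mathbf{r}}{\partial x} \times \frac{\partial \mathbf{r}}{\partial y} \right\rVert = \sqrt{1 + \left( \frac{\partial z}{\partial x} \right)^2 + \left( \frac{\partial z}{\partial y} \right)^2}$. So we see that using formula (4.46) for $\operatorname{curl} \mathbf{f}$, we have

$$\begin{aligned}
\iint_\Sigma (\operatorname{curl} \mathbf{f}) \cdot \mathbf{n}\, d\sigma &= \iint_D (\operatorname{curl} \mathbf{f}) \cdot \mathbf{n}\, \left\lVert \frac{\partial \mathbf{r}}{\partial x} \times \frac{\partial \mathbf{r}}{\partial y} \right\rVert dA \\
&= \iint_D \left( \left( \frac{\partial R}{\partial y} - \frac{\partial Q}{\partial z} \right) \mathbf{i} + \left( \frac{\partial P}{\partial z} - \frac{\partial R}{\partial x} \right) \mathbf{j} + \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) \mathbf{k} \right) \cdot \left( -\frac{\partial z}{\partial x}\,\mathbf{i} - \frac{\partial z}{\partial y}\,\mathbf{j} + \mathbf{k} \right) dA \\
&= \iint_D \left( - \left( \frac{\partial R}{\partial y} - \frac{\partial Q}{\partial z} \right) \frac{\partial z}{\partial x} - \left( \frac{\partial P}{\partial z} - \frac{\partial R}{\partial x} \right) \frac{\partial z}{\partial y} + \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) \right) dA ,
\end{aligned}$$

which, upon comparing to equation (4.49), proves the Theorem. QED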


Note: The condition in Stokes’ Theorem that the surface $\Sigma$ have a (continuously varying) positive unit normal vector $\mathbf{n}$ and a boundary curve $C$ traversed $\mathbf{n}$-positively can be expressed more precisely as follows: if $\mathbf{r}(t)$ is the position vector for $C$ and $\mathbf{T}(t) = \mathbf{r}'(t) / \lVert \mathbf{r}'(t) \rVert$ is the unit tangent vector to $C$, then the vectors $\mathbf{T}$, $\mathbf{n}$, $\mathbf{T} \times \mathbf{n}$ form a right-handed system.

Also, it should be noted that Stokes’ Theorem holds even when the boundary curve C is

piecewise smooth.

Example 4.14. Verify Stokes’ Theorem for $\mathbf{f}(x, y, z) = z\,\mathbf{i} + x\,\mathbf{j} + y\,\mathbf{k}$ when $\Sigma$ is the paraboloid $z = x^2 + y^2$ such that $z \le 1$ (see Figure 4.5.5).

Figure 4.5.5 $z = x^2 + y^2$

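Solution: The positive unit normal vector to the surface $z = z(x, y) = x^2 + y^2$ is

$$\mathbf{n} = \frac{-\frac{\partial z}{\partial x}\,\mathbf{i} - \frac{\partial z}{\partial y}\,\mathbf{j} + \mathbf{k}}{\sqrt{1 + \left( \frac{\partial z}{\partial x} \right)^2 + \left( \frac{\partial z}{\partial y} \right)^2}} = \frac{-2x\,\mathbf{i} - 2y\,\mathbf{j} + \mathbf{k}}{\sqrt{1 + 4x^2 + 4y^2}} ,$$

and $\operatorname{curl} \mathbf{f} = (1 - 0)\,\mathbf{i} + (1 - 0)\,\mathbf{j} + (1 - 0)\,\mathbf{k} = \mathbf{i} + \mathbf{j} + \mathbf{k}$, so

$$(\operatorname{curl} \mathbf{f}) \cdot \mathbf{n} = (-2x - 2y + 1) / \sqrt{1 + 4x^2 + 4y^2} .$$

Since $\Sigma$ can be parametrized as $\mathbf{r}(x, y) = x\,\mathbf{i} + y\,\mathbf{j} + (x^2 + y^2)\,\mathbf{k}$ for $(x, y)$ in the region $D = \{ (x, y) : x^2 + y^2 \le 1 \}$, then

$$\begin{aligned}
\iint_\Sigma (\operatorname{curl} \mathbf{f}) \cdot \mathbf{n}\, d\sigma &= \iint_D (\operatorname{curl} \mathbf{f}) \cdot \mathbf{n}\, \left\lVert \frac{\partial \mathbf{r}}{\partial x} \times \frac{\partial \mathbf{r}}{\partial y} \right\rVert dA \\
&= \iint_D \frac{-2x - 2y + 1}{\sqrt{1 + 4x^2 + 4y^2}}\, \sqrt{1 + 4x^2 + 4y^2}\; dA \\
&= \iint_D (-2x - 2y + 1)\, dA , \text{ so switching to polar coordinates gives} \\
&= \int_0^{2\pi} \int_0^1 (-2r \cos\theta - 2r \sin\theta + 1)\, r\, dr\, d\theta \\
&= \int_0^{2\pi} \int_0^1 (-2r^2 \cos\theta - 2r^2 \sin\theta + r)\, dr\, d\theta \\
&= \int_0^{2\pi} \left( -\frac{2r^3}{3} \cos\theta - \frac{2r^3}{3} \sin\theta + \frac{r^2}{2}\, \Bigg|_{r=0}^{r=1} \right) d\theta \\
&= \int_0^{2\pi} \left( -\frac{2}{3} \cos\theta - \frac{2}{3} \sin\theta + \frac{1}{2} \right) d\theta \\
&= -\frac{2}{3} \sin\theta + \frac{2}{3} \cos\theta + \frac{1}{2}\theta\, \Bigg|_0^{2\pi} = \pi .
\end{aligned}$$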


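The boundary curve $C$ is the unit circle $x^2 + y^2 = 1$ lying in the plane $z = 1$ (see Figure 4.5.5), which can be parametrized as $x = \cos t$, $y = \sin t$, $z = 1$ for $0 \le t \le 2\pi$. So

$$\begin{aligned}
\oint_C \mathbf{f} \cdot d\mathbf{r} &= \int_0^{2\pi} \bigl( (1)(-\sin t) + (\cos t)(\cos t) + (\sin t)(0) \bigr)\, dt \\
&= \int_0^{2\pi} \left( -\sin t + \frac{1 + \cos 2t}{2} \right) dt \quad \left( \text{here we used } \cos^2 t = \frac{1 + \cos 2t}{2} \right) \\
&= \cos t + \frac{t}{2} + \frac{\sin 2t}{4}\, \Bigg|_0^{2\pi} = \pi .
\end{aligned}$$

So we see that $\oint_C \mathbf{f} \cdot d\mathbf{r} = \iint_\Sigma (\operatorname{curl} \mathbf{f}) \cdot \mathbf{n}\, d\sigma$, as predicted by Stokes’ Theorem.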
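Both sides of Stokes’ Theorem in Example 4.14 can also be approximated numerically. The sketch below is an illustration only: it uses a midpoint rule in polar coordinates for the surface side (after the reduction to $\iint_D (-2x - 2y + 1)\, dA$ made in the solution) and a midpoint rule in $t$ for the line integral. The function names and grid sizes are arbitrary choices.

```python
import math

def surface_side(nr=200, nt=200):
    """Midpoint rule for the double integral of (-2x - 2y + 1) over the unit disk."""
    dr, dth = 1.0 / nr, 2 * math.pi / nt
    total = 0.0
    for i in range(nr):
        r = (i + 0.5) * dr
        for j in range(nt):
            th = (j + 0.5) * dth
            x, y = r * math.cos(th), r * math.sin(th)
            total += (-2 * x - 2 * y + 1) * r * dr * dth  # r dr dtheta area element
    return total

def line_side(n=2000):
    """Midpoint rule for the integral of f . dr around the circle x^2 + y^2 = 1, z = 1."""
    dt = 2 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * dt
        x, y, z = math.cos(t), math.sin(t), 1.0
        dx, dy, dz = -math.sin(t), math.cos(t), 0.0
        total += (z * dx + x * dy + y * dz) * dt  # f = z i + x j + y k
    return total

print(surface_side(), line_side(), math.pi)
```

All three printed numbers should agree closely with $\pi$.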

The line integral in the preceding example was far simpler to calculate than the surface

integral, but this will not always be the case.

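Example 4.15. Let $\Sigma$ be the elliptic paraboloid $z = \frac{x^2}{4} + \frac{y^2}{9}$ for $z \le 1$, and let $C$ be its boundary curve. Calculate $\oint_C \mathbf{f} \cdot d\mathbf{r}$ for $\mathbf{f}(x, y, z) = (9xz + 2y)\,\mathbf{i} + (2x + y^2)\,\mathbf{j} + (-2y^2 + 2z)\,\mathbf{k}$, where $C$ is traversed counterclockwise.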

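Solution: The surface is similar to the one in Example 4.14, except now the boundary curve $C$ is the ellipse $\frac{x^2}{4} + \frac{y^2}{9} = 1$ lying in the plane $z = 1$. In this case, using Stokes’ Theorem is easier than computing the line integral directly. As in Example 4.14, at each point $(x, y, z(x, y))$ on the surface $z = z(x, y) = \frac{x^2}{4} + \frac{y^2}{9}$ the vector

$$\mathbf{n} = \frac{-\frac{\partial z}{\partial x}\,\mathbf{i} - \frac{\partial z}{\partial y}\,\mathbf{j} + \mathbf{k}}{\sqrt{1 + \left( \frac{\partial z}{\partial x} \right)^2 + \left( \frac{\partial z}{\partial y} \right)^2}} = \frac{-\frac{x}{2}\,\mathbf{i} - \frac{2y}{9}\,\mathbf{j} + \mathbf{k}}{\sqrt{1 + \frac{x^2}{4} + \frac{4y^2}{81}}} ,$$

is a positive unit normal vector to $\Sigma$. And calculating the curl of $\mathbf{f}$ gives

$$\operatorname{curl} \mathbf{f} = (-4y - 0)\,\mathbf{i} + (9x - 0)\,\mathbf{j} + (2 - 2)\,\mathbf{k} = -4y\,\mathbf{i} + 9x\,\mathbf{j} + 0\,\mathbf{k} ,$$

so

$$(\operatorname{curl} \mathbf{f}) \cdot \mathbf{n} = \frac{(-4y)\left(-\frac{x}{2}\right) + (9x)\left(-\frac{2y}{9}\right) + (0)(1)}{\sqrt{1 + \frac{x^2}{4} + \frac{4y^2}{81}}} = \frac{2xy - 2xy + 0}{\sqrt{1 + \frac{x^2}{4} + \frac{4y^2}{81}}} = 0 ,$$

and so by Stokes’ Theorem

$$\oint_C \mathbf{f} \cdot d\mathbf{r} = \iint_\Sigma (\operatorname{curl} \mathbf{f}) \cdot \mathbf{n}\, d\sigma = \iint_\Sigma 0\, d\sigma = 0 .$$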
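For comparison, the line integral in Example 4.15 can be approximated directly. This sketch assumes the counterclockwise parametrization $x = 2\cos t$, $y = 3\sin t$, $z = 1$ of the ellipse, which is a natural choice but is not written out in the text.

```python
import math

def field(x, y, z):
    # f = (9xz + 2y) i + (2x + y^2) j + (-2y^2 + 2z) k
    return (9 * x * z + 2 * y, 2 * x + y * y, -2 * y * y + 2 * z)

def circulation(n=4000):
    """Midpoint rule for the integral of f . dr around the ellipse x^2/4 + y^2/9 = 1, z = 1."""
    dt = 2 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * dt
        x, y, z = 2 * math.cos(t), 3 * math.sin(t), 1.0
        dx, dy, dz = -2 * math.sin(t), 3 * math.cos(t), 0.0
        P, Q, R = field(x, y, z)
        total += (P * dx + Q * dy + R * dz) * dt
    return total

result = circulation()
print(result)
```

The printed value should be zero up to rounding error, matching the Stokes’ Theorem computation.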


In physical applications, for a simple closed curve $C$ the line integral $\oint_C \mathbf{f} \cdot d\mathbf{r}$ is often called the circulation of $\mathbf{f}$ around $C$. For example, if $\mathbf{E}$ represents the electrostatic field due to a point charge, then it turns out⁸ that $\operatorname{curl} \mathbf{E} = \mathbf{0}$, which means that the circulation $\oint_C \mathbf{E} \cdot d\mathbf{r} = 0$ by Stokes’ Theorem. Vector fields which have zero curl are often called irrotational fields.

In fact, the term curl was created by the 19th century Scottish physicist James Clerk Maxwell in his study of electromagnetism, where it is used extensively. In physics, the curl is interpreted as a measure of circulation density. This is best seen by using another definition of $\operatorname{curl} \mathbf{f}$ which is equivalent⁹ to the definition given by formula (4.46). Namely, for a point $(x, y, z)$ in $\mathbb{R}^3$,

$$\mathbf{n} \cdot (\operatorname{curl} \mathbf{f})(x, y, z) = \lim_{S \to 0} \frac{1}{S} \oint_C \mathbf{f} \cdot d\mathbf{r} , \tag{4.50}$$

where S is the surface area of a surface Σ containing the point (x, y, z) and with a simple

closed boundary curve C and positive unit normal vector n at (x, y, z). In the limit, think of

the curve C shrinking to the point (x, y, z), which causes Σ, the surface it bounds, to have

smaller and smaller surface area. That ratio of circulation to surface area in the limit is

what makes the curl a rough measure of circulation density (i.e. circulation per unit area).
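Formula (4.50) can be tried out numerically. The sketch below is an illustration under stated assumptions: it takes the field $\mathbf{f} = (1 + x^2)\,\mathbf{j}$, uses small circles in the plane $z = 0$ as the curves $C$ (so $\mathbf{n} = \mathbf{k}$ and $S = \pi \varepsilon^2$), and compares circulation per unit area with the $\mathbf{k}$-component $2x$ of the curl. The function name `circulation_density` is hypothetical.

```python
import math

def field(x, y, z):
    # f = (1 + x^2) j, a flow parallel to the y-axis that strengthens away from it.
    return (0.0, 1.0 + x * x, 0.0)

def circulation_density(x0, y0, eps, n=2000):
    """Circulation around a counterclockwise circle of radius eps centered at
    (x0, y0, 0) in the plane z = 0, divided by the enclosed area pi * eps^2."""
    dt = 2 * math.pi / n
    circ = 0.0
    for k in range(n):
        t = (k + 0.5) * dt
        x, y = x0 + eps * math.cos(t), y0 + eps * math.sin(t)
        dx, dy = -eps * math.sin(t), eps * math.cos(t)
        P, Q, _ = field(x, y, 0.0)
        circ += (P * dx + Q * dy) * dt
    return circ / (math.pi * eps * eps)

for eps in (1e-1, 1e-2, 1e-3):
    print(eps, circulation_density(1.0, 0.0, eps), circulation_density(-1.5, 0.3, eps))
```

For this particular field the ratio is close to $2x_0$ at every radius tried, positive to the right of the $y$-axis and negative to the left.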

Figure 4.5.6 Curl and rotation

An idea of how the curl of a vector field is related to rotation is shown in Figure 4.5.6. Suppose we have a vector field $\mathbf{f}(x, y, z)$ which is always parallel to the $xy$-plane at each point $(x, y, z)$ and that the vectors grow larger the further the point $(x, y, z)$ is from the $y$-axis. For example, $\mathbf{f}(x, y, z) = (1 + x^2)\,\mathbf{j}$. Think of the vector field as representing the flow of water, and imagine dropping two wheels with paddles into that water flow, as in Figure 4.5.6. Since the flow is stronger (i.e. the magnitude of $\mathbf{f}$ is larger) as you move away from the $y$-axis, then such a wheel would rotate counterclockwise if it were dropped to the right of the $y$-axis, and it would rotate clockwise if it were dropped to the left of the $y$-axis. In both cases the curl would be nonzero ($\operatorname{curl} \mathbf{f}(x, y, z) = 2x\,\mathbf{k}$ in our example) and would obey the right-hand rule, that is, $\operatorname{curl} \mathbf{f}(x, y, z)$ points in the direction of your thumb as you cup your right hand in the direction of the rotation of the wheel. So the curl points outward (in the positive $z$-direction) if $x > 0$ and points inward (in the negative $z$-direction) if $x < 0$. Notice that if all the vectors had the same direction and the same magnitude, then the wheels would not rotate and hence there would be no curl (which is why such fields are called irrotational, meaning no rotation).

⁸ See Ch. 2 in REITZ, MILFORD and CHRISTY.

⁹ See SCHEY, p. 78-81, for the derivation.


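Finally, by Stokes’ Theorem, we know that if $C$ is a simple closed curve in some solid region $S$ in $\mathbb{R}^3$ and if $\mathbf{f}(x, y, z)$ is a smooth vector field such that $\operatorname{curl} \mathbf{f} = \mathbf{0}$ in $S$, then

$$\oint_C \mathbf{f} \cdot d\mathbf{r} = \iint_\Sigma (\operatorname{curl} \mathbf{f}) \cdot \mathbf{n}\, d\sigma = \iint_\Sigma \mathbf{0} \cdot \mathbf{n}\, d\sigma = \iint_\Sigma 0\, d\sigma = 0 ,$$

where $\Sigma$ is any orientable surface inside $S$ whose boundary is $C$ (such a surface is sometimes called a capping surface for $C$). So similar to the two-variable case, we have a three-dimensional version of a result from Section 4.3, for solid regions in $\mathbb{R}^3$ which are simply connected (i.e. regions having no holes):

The following statements are equivalent for a simply connected solid region $S$ in $\mathbb{R}^3$:

(a) $\mathbf{f}(x, y, z) = P(x, y, z)\,\mathbf{i} + Q(x, y, z)\,\mathbf{j} + R(x, y, z)\,\mathbf{k}$ has a smooth potential $F(x, y, z)$ in $S$

(b) $\int_C \mathbf{f} \cdot d\mathbf{r}$ is independent of the path for any curve $C$ in $S$

(c) $\oint_C \mathbf{f} \cdot d\mathbf{r} = 0$ for every simple closed curve $C$ in $S$

(d) $\frac{\partial R}{\partial y} = \frac{\partial Q}{\partial z}$, $\frac{\partial P}{\partial z} = \frac{\partial R}{\partial x}$, and $\frac{\partial Q}{\partial x} = \frac{\partial P}{\partial y}$ in $S$ (i.e. $\operatorname{curl} \mathbf{f} = \mathbf{0}$ in $S$)

Part (d) is also a way of saying that the differential form $P\, dx + Q\, dy + R\, dz$ is exact.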

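Example 4.16. Determine if the vector field $\mathbf{f}(x, y, z) = xyz\,\mathbf{i} + xz\,\mathbf{j} + xy\,\mathbf{k}$ has a potential in $\mathbb{R}^3$.

Solution: Since $\mathbb{R}^3$ is simply connected, we just need to check whether $\operatorname{curl} \mathbf{f} = \mathbf{0}$ throughout $\mathbb{R}^3$, that is,

$$\frac{\partial R}{\partial y} = \frac{\partial Q}{\partial z} , \quad \frac{\partial P}{\partial z} = \frac{\partial R}{\partial x} , \quad \text{and} \quad \frac{\partial Q}{\partial x} = \frac{\partial P}{\partial y}$$

throughout $\mathbb{R}^3$, where $P(x, y, z) = xyz$, $Q(x, y, z) = xz$, and $R(x, y, z) = xy$. But we see that

$$\frac{\partial P}{\partial z} = xy , \quad \frac{\partial R}{\partial x} = y \quad \Rightarrow \quad \frac{\partial P}{\partial z} \ne \frac{\partial R}{\partial x} \text{ for some } (x, y, z) \text{ in } \mathbb{R}^3 .$$

Thus, $\mathbf{f}(x, y, z)$ does not have a potential in $\mathbb{R}^3$.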
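The conclusion of Example 4.16 can be illustrated through statement (b) of the equivalence list: a field without a potential should give path-dependent line integrals. The sketch below integrates $\mathbf{f}$ from $(0, 0, 0)$ to $(1, 1, 1)$ along two paths chosen purely for illustration (they do not appear in the text), using Simpson's rule on each straight segment.

```python
def field(x, y, z):
    # f = xyz i + xz j + xy k from Example 4.16
    return (x * y * z, x * z, x * y)

def segment_integral(p, q, n=200):
    """Simpson's rule for the line integral of f along the straight segment p -> q."""
    d = tuple(qi - pi for pi, qi in zip(p, q))  # constant velocity for t in [0, 1]

    def g(t):
        point = tuple(pi + t * di for pi, di in zip(p, d))
        return sum(fi * di for fi, di in zip(field(*point), d))

    h = 1.0 / n
    total = g(0.0) + g(1.0)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(k * h)
    return total * h / 3

def path_integral(points):
    """Sum of segment integrals along a polygonal path through the given points."""
    return sum(segment_integral(p, q) for p, q in zip(points, points[1:]))

straight = path_integral([(0, 0, 0), (1, 1, 1)])
polygonal = path_integral([(0, 0, 0), (1, 0, 0), (1, 1, 0), (1, 1, 1)])
print(straight, polygonal)
```

The straight path gives $11/12$ while the polygonal path gives $1$, so the integral depends on the path, consistent with the nonzero curl.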

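Exercises

A

For Exercises 1-3, calculate $\int_C f(x, y, z)\, ds$ for the given function $f(x, y, z)$ and curve $C$.

1. $f(x, y, z) = z$; $C$ : $x = \cos t$, $y = \sin t$, $z = t$, $0 \le t \le 2\pi$

2. $f(x, y, z) = \frac{x}{y} + y + 2yz$; $C$ : $x = t^2$, $y = t$, $z = 1$, $1 \le t \le 2$

3. $f(x, y, z) = z^2$; $C$ : $x = t \sin t$, $y = t \cos t$, $z = \frac{2\sqrt{2}}{3}\, t^{3/2}$, $0 \le t \le 1$

For Exercises 4-9, calculate $\int_C \mathbf{f} \cdot d\mathbf{r}$ for the given vector field $\mathbf{f}(x, y, z)$ and curve $C$.

4. $\mathbf{f}(x, y, z) = \mathbf{i} - \mathbf{j} + \mathbf{k}$; $C$ : $x = 3t$, $y = 2t$, $z = t$, $0 \le t \le 1$

5. $\mathbf{f}(x, y, z) = y\,\mathbf{i} - x\,\mathbf{j} + z\,\mathbf{k}$; $C$ : $x = \cos t$, $y = \sin t$, $z = t$, $0 \le t \le 2\pi$

6. $\mathbf{f}(x, y, z) = x\,\mathbf{i} + y\,\mathbf{j} + z\,\mathbf{k}$; $C$ : $x = \cos t$, $y = \sin t$, $z = 2$, $0 \le t \le 2\pi$

7. $\mathbf{f}(x, y, z) = (y - 2z)\,\mathbf{i} + xy\,\mathbf{j} + (2xz + y)\,\mathbf{k}$; $C$ : $x = t$, $y = 2t$, $z = t^2 - 1$, $0 \le t \le 1$

8. $\mathbf{f}(x, y, z) = yz\,\mathbf{i} + xz\,\mathbf{j} + xy\,\mathbf{k}$; $C$ : the polygonal path from $(0, 0, 0)$ to $(1, 0, 0)$ to $(1, 2, 0)$

9. $\mathbf{f}(x, y, z) = xy\,\mathbf{i} + (z - x)\,\mathbf{j} + 2yz\,\mathbf{k}$; $C$ : the polygonal path from $(0, 0, 0)$ to $(1, 0, 0)$ to $(1, 2, 0)$ to $(1, 2, -2)$

For Exercises 10-13, state whether or not the vector field $\mathbf{f}(x, y, z)$ has a potential in $\mathbb{R}^3$ (you do not need to find the potential itself).

10. $\mathbf{f}(x, y, z) = y\,\mathbf{i} - x\,\mathbf{j} + z\,\mathbf{k}$

11. $\mathbf{f}(x, y, z) = a\,\mathbf{i} + b\,\mathbf{j} + c\,\mathbf{k}$ ($a$, $b$, $c$ constant)

12. $\mathbf{f}(x, y, z) = (x + y)\,\mathbf{i} + x\,\mathbf{j} + z^2\,\mathbf{k}$

13. $\mathbf{f}(x, y, z) = xy\,\mathbf{i} - (x - yz^2)\,\mathbf{j} + y^2 z\,\mathbf{k}$

B

For Exercises 14-15, verify Stokes’ Theorem for the given vector field $\mathbf{f}(x, y, z)$ and surface $\Sigma$.

14. $\mathbf{f}(x, y, z) = 2y\,\mathbf{i} - x\,\mathbf{j} + z\,\mathbf{k}$; $\Sigma$ : $x^2 + y^2 + z^2 = 1$, $z \ge 0$

15. $\mathbf{f}(x, y, z) = xy\,\mathbf{i} + xz\,\mathbf{j} + yz\,\mathbf{k}$; $\Sigma$ : $z = x^2 + y^2$, $z \le 1$

16. Construct a Möbius strip from a piece of paper, then draw a line down its center (like the dotted line in Figure 4.5.3(b)). Cut the Möbius strip along that center line completely around the strip. How many surfaces does this result in? How would you describe them? Are they orientable?

17. Use Gnuplot (see Appendix C) to plot the Möbius strip parametrized as:

$$\mathbf{r}(u, v) = \cos u \left( 1 + v \cos \tfrac{u}{2} \right) \mathbf{i} + \sin u \left( 1 + v \cos \tfrac{u}{2} \right) \mathbf{j} + v \sin \tfrac{u}{2}\, \mathbf{k} , \quad 0 \le u \le 2\pi , \quad -\tfrac{1}{2} \le v \le \tfrac{1}{2}$$

C

18. Let $\Sigma$ be a closed surface and $\mathbf{f}(x, y, z)$ a smooth vector field. Show that $\iint_\Sigma (\operatorname{curl} \mathbf{f}) \cdot \mathbf{n}\, d\sigma = 0$. (Hint: Split $\Sigma$ in half.)

19. Show that Green’s Theorem is a special case of Stokes’ Theorem.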


4.6 Gradient, Divergence, Curl and Laplacian

In this ﬁnal section we will establish some relationships between the gradient, divergence

and curl, and we will also introduce a new quantity called the Laplacian. We will then show

how to write these quantities in cylindrical and spherical coordinates.

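For a real-valued function $f(x, y, z)$ on $\mathbb{R}^3$, the gradient $\nabla f(x, y, z)$ is a vector-valued function on $\mathbb{R}^3$, that is, its value at a point $(x, y, z)$ is the vector

$$\nabla f(x, y, z) = \left( \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, \frac{\partial f}{\partial z} \right) = \frac{\partial f}{\partial x}\,\mathbf{i} + \frac{\partial f}{\partial y}\,\mathbf{j} + \frac{\partial f}{\partial z}\,\mathbf{k}$$

in $\mathbb{R}^3$, where each of the partial derivatives is evaluated at the point $(x, y, z)$. So in this way, you can think of the symbol $\nabla$ as being “applied” to a real-valued function $f$ to produce a vector $\nabla f$.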

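It turns out that the divergence and curl can also be expressed in terms of the symbol $\nabla$. This is done by thinking of $\nabla$ as a vector in $\mathbb{R}^3$, namely

$$\nabla = \frac{\partial}{\partial x}\,\mathbf{i} + \frac{\partial}{\partial y}\,\mathbf{j} + \frac{\partial}{\partial z}\,\mathbf{k} . \tag{4.51}$$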

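Here, the symbols $\frac{\partial}{\partial x}$, $\frac{\partial}{\partial y}$ and $\frac{\partial}{\partial z}$ are to be thought of as “partial derivative operators” that will get “applied” to a real-valued function, say $f(x, y, z)$, to produce the partial derivatives $\frac{\partial f}{\partial x}$, $\frac{\partial f}{\partial y}$ and $\frac{\partial f}{\partial z}$. For instance, $\frac{\partial}{\partial x}$ “applied” to $f(x, y, z)$ produces $\frac{\partial f}{\partial x}$.

Is $\nabla$ really a vector? Strictly speaking, no, since $\frac{\partial}{\partial x}$, $\frac{\partial}{\partial y}$ and $\frac{\partial}{\partial z}$ are not actual numbers. But it helps to think of $\nabla$ as a vector, especially with the divergence and curl, as we will soon see. The process of “applying” $\frac{\partial}{\partial x}$, $\frac{\partial}{\partial y}$, $\frac{\partial}{\partial z}$ to a real-valued function $f(x, y, z)$ is normally thought of as multiplying the quantities:

$$\left( \frac{\partial}{\partial x} \right) (f) = \frac{\partial f}{\partial x} , \quad \left( \frac{\partial}{\partial y} \right) (f) = \frac{\partial f}{\partial y} , \quad \left( \frac{\partial}{\partial z} \right) (f) = \frac{\partial f}{\partial z}$$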

For this reason, ∇ is often referred to as the “del operator”, since it “operates” on functions.

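For example, it is often convenient to write the divergence $\operatorname{div} \mathbf{f}$ as $\nabla \cdot \mathbf{f}$, since for a vector field $\mathbf{f}(x, y, z) = f_1(x, y, z)\,\mathbf{i} + f_2(x, y, z)\,\mathbf{j} + f_3(x, y, z)\,\mathbf{k}$, the dot product of $\mathbf{f}$ with $\nabla$ (thought of as a vector) makes sense:

$$\begin{aligned}
\nabla \cdot \mathbf{f} &= \left( \frac{\partial}{\partial x}\,\mathbf{i} + \frac{\partial}{\partial y}\,\mathbf{j} + \frac{\partial}{\partial z}\,\mathbf{k} \right) \cdot \bigl( f_1(x, y, z)\,\mathbf{i} + f_2(x, y, z)\,\mathbf{j} + f_3(x, y, z)\,\mathbf{k} \bigr) \\
&= \left( \frac{\partial}{\partial x} \right) (f_1) + \left( \frac{\partial}{\partial y} \right) (f_2) + \left( \frac{\partial}{\partial z} \right) (f_3) \\
&= \frac{\partial f_1}{\partial x} + \frac{\partial f_2}{\partial y} + \frac{\partial f_3}{\partial z} \\
&= \operatorname{div} \mathbf{f}
\end{aligned}$$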


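We can also write $\operatorname{curl} \mathbf{f}$ in terms of $\nabla$, namely as $\nabla \times \mathbf{f}$, since for a vector field $\mathbf{f}(x, y, z) = P(x, y, z)\,\mathbf{i} + Q(x, y, z)\,\mathbf{j} + R(x, y, z)\,\mathbf{k}$, we have:

$$\begin{aligned}
\nabla \times \mathbf{f} &= \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\ P(x, y, z) & Q(x, y, z) & R(x, y, z) \end{vmatrix} \\
&= \left( \frac{\partial R}{\partial y} - \frac{\partial Q}{\partial z} \right) \mathbf{i} - \left( \frac{\partial R}{\partial x} - \frac{\partial P}{\partial z} \right) \mathbf{j} + \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) \mathbf{k} \\
&= \left( \frac{\partial R}{\partial y} - \frac{\partial Q}{\partial z} \right) \mathbf{i} + \left( \frac{\partial P}{\partial z} - \frac{\partial R}{\partial x} \right) \mathbf{j} + \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) \mathbf{k} \\
&= \operatorname{curl} \mathbf{f}
\end{aligned}$$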

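For a real-valued function $f(x, y, z)$, the gradient $\nabla f(x, y, z) = \frac{\partial f}{\partial x}\,\mathbf{i} + \frac{\partial f}{\partial y}\,\mathbf{j} + \frac{\partial f}{\partial z}\,\mathbf{k}$ is a vector field, so we can take its divergence:

$$\begin{aligned}
\operatorname{div} \nabla f &= \nabla \cdot \nabla f \\
&= \left( \frac{\partial}{\partial x}\,\mathbf{i} + \frac{\partial}{\partial y}\,\mathbf{j} + \frac{\partial}{\partial z}\,\mathbf{k} \right) \cdot \left( \frac{\partial f}{\partial x}\,\mathbf{i} + \frac{\partial f}{\partial y}\,\mathbf{j} + \frac{\partial f}{\partial z}\,\mathbf{k} \right) \\
&= \frac{\partial}{\partial x} \left( \frac{\partial f}{\partial x} \right) + \frac{\partial}{\partial y} \left( \frac{\partial f}{\partial y} \right) + \frac{\partial}{\partial z} \left( \frac{\partial f}{\partial z} \right) \\
&= \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} + \frac{\partial^2 f}{\partial z^2}
\end{aligned}$$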

Note that this is a real-valued function, to which we will give a special name:

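Definition 4.7. For a real-valued function $f(x, y, z)$, the Laplacian of $f$, denoted by $\Delta f$, is given by

$$\Delta f(x, y, z) = \nabla \cdot \nabla f = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} + \frac{\partial^2 f}{\partial z^2} . \tag{4.52}$$

Often the notation $\nabla^2 f$ is used for the Laplacian instead of $\Delta f$, using the convention $\nabla^2 = \nabla \cdot \nabla$.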

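Example 4.17. Let $\mathbf{r}(x, y, z) = x\,\mathbf{i} + y\,\mathbf{j} + z\,\mathbf{k}$ be the position vector field on $\mathbb{R}^3$. Then $\lVert \mathbf{r}(x, y, z) \rVert^2 = \mathbf{r} \cdot \mathbf{r} = x^2 + y^2 + z^2$ is a real-valued function. Find

(a) the gradient of $\lVert \mathbf{r} \rVert^2$

(b) the divergence of $\mathbf{r}$

(c) the curl of $\mathbf{r}$

(d) the Laplacian of $\lVert \mathbf{r} \rVert^2$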


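Solution: (a) $\nabla \lVert \mathbf{r} \rVert^2 = 2x\,\mathbf{i} + 2y\,\mathbf{j} + 2z\,\mathbf{k} = 2\mathbf{r}$

(b) $\nabla \cdot \mathbf{r} = \frac{\partial}{\partial x}(x) + \frac{\partial}{\partial y}(y) + \frac{\partial}{\partial z}(z) = 1 + 1 + 1 = 3$

(c)

$$\nabla \times \mathbf{r} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\ x & y & z \end{vmatrix} = (0 - 0)\,\mathbf{i} - (0 - 0)\,\mathbf{j} + (0 - 0)\,\mathbf{k} = \mathbf{0}$$

(d) $\Delta \lVert \mathbf{r} \rVert^2 = \frac{\partial^2}{\partial x^2}(x^2 + y^2 + z^2) + \frac{\partial^2}{\partial y^2}(x^2 + y^2 + z^2) + \frac{\partial^2}{\partial z^2}(x^2 + y^2 + z^2) = 2 + 2 + 2 = 6$

Note that we could have calculated $\Delta \lVert \mathbf{r} \rVert^2$ another way, using the $\nabla$ notation along with parts (a) and (b):

$$\Delta \lVert \mathbf{r} \rVert^2 = \nabla \cdot \nabla \lVert \mathbf{r} \rVert^2 = \nabla \cdot 2\mathbf{r} = 2\, \nabla \cdot \mathbf{r} = 2(3) = 6$$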
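Parts (b) and (d) of Example 4.17 can be checked with finite differences. The sketch below is illustrative only: `divergence` and `laplacian` are hypothetical helpers built from central differences, and the sample point is arbitrary since both answers are constant.

```python
def shifted(p, axis, delta):
    """Copy of the point p with coordinate `axis` moved by delta."""
    q = list(p)
    q[axis] += delta
    return tuple(q)

def divergence(field, p, h=1e-4):
    """Central-difference approximation of div f at p, for f returning a 3-tuple."""
    return sum(
        (field(*shifted(p, i, h))[i] - field(*shifted(p, i, -h))[i]) / (2 * h)
        for i in range(3)
    )

def laplacian(F, p, h=1e-3):
    """Second-difference approximation of the Laplacian of a scalar function F at p."""
    return sum(
        (F(*shifted(p, i, h)) - 2 * F(*p) + F(*shifted(p, i, -h))) / (h * h)
        for i in range(3)
    )

def r_field(x, y, z):
    return (x, y, z)  # position vector field r

def r_squared(x, y, z):
    return x * x + y * y + z * z  # ||r||^2

point = (0.7, -1.2, 2.5)
div_r = divergence(r_field, point)
lap_r2 = laplacian(r_squared, point)
print(div_r, lap_r2)
```

The printed values should be close to 3 and 6 respectively.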

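Notice that in Example 4.17 if we take the curl of the gradient of $\lVert \mathbf{r} \rVert^2$ we get

$$\nabla \times (\nabla \lVert \mathbf{r} \rVert^2) = \nabla \times 2\mathbf{r} = 2\, \nabla \times \mathbf{r} = 2\, \mathbf{0} = \mathbf{0} .$$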

The following theorem shows that this will be the case in general:

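Theorem 4.15. For any smooth real-valued function $f(x, y, z)$, $\nabla \times (\nabla f) = \mathbf{0}$.

Proof: We see by the smoothness of $f$ that

$$\begin{aligned}
\nabla \times (\nabla f) &= \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\ \dfrac{\partial f}{\partial x} & \dfrac{\partial f}{\partial y} & \dfrac{\partial f}{\partial z} \end{vmatrix} \\
&= \left( \frac{\partial^2 f}{\partial y\, \partial z} - \frac{\partial^2 f}{\partial z\, \partial y} \right) \mathbf{i} - \left( \frac{\partial^2 f}{\partial x\, \partial z} - \frac{\partial^2 f}{\partial z\, \partial x} \right) \mathbf{j} + \left( \frac{\partial^2 f}{\partial x\, \partial y} - \frac{\partial^2 f}{\partial y\, \partial x} \right) \mathbf{k} = \mathbf{0} ,
\end{aligned}$$

since the mixed partial derivatives in each component are equal. QED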

Corollary 4.16. If a vector field $\mathbf{f}(x, y, z)$ has a potential, then $\operatorname{curl} \mathbf{f} = \mathbf{0}$.

Another way of stating Theorem 4.15 is that gradients are irrotational. Also, notice that in Example 4.17 if we take the divergence of the curl of $\mathbf{r}$ we trivially get

$$\nabla \cdot (\nabla \times \mathbf{r}) = \nabla \cdot \mathbf{0} = 0 .$$

The following theorem shows that this will be the case in general:

Theorem 4.17. For any smooth vector field $\mathbf{f}(x, y, z)$, $\nabla \cdot (\nabla \times \mathbf{f}) = 0$.

The proof is straightforward and left as an exercise for the reader.
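Before attempting the proof, Theorem 4.17 can be spot-checked numerically. The following sketch is not a proof: it approximates the curl with central differences, then takes the divergence of that approximation, for one sample field at one sample point. Both the field and the point are arbitrary assumptions made for illustration.

```python
import math

def partial(g, p, axis, h=1e-3):
    """Central-difference partial derivative of scalar g(x, y, z) along `axis` at p."""
    plus, minus = list(p), list(p)
    plus[axis] += h
    minus[axis] -= h
    return (g(*plus) - g(*minus)) / (2 * h)

def curl(f, p, h=1e-3):
    """Approximate curl of f = (P, Q, R) at p, following formula (4.46)."""
    P = lambda x, y, z: f(x, y, z)[0]
    Q = lambda x, y, z: f(x, y, z)[1]
    R = lambda x, y, z: f(x, y, z)[2]
    return (
        partial(R, p, 1, h) - partial(Q, p, 2, h),
        partial(P, p, 2, h) - partial(R, p, 0, h),
        partial(Q, p, 0, h) - partial(P, p, 1, h),
    )

def div_of_curl(f, p, h=1e-3):
    """Approximate divergence of the (approximate) curl of f at p."""
    return sum(
        partial(lambda x, y, z, i=i: curl(f, (x, y, z), h)[i], p, i, h)
        for i in range(3)
    )

def sample_field(x, y, z):
    # Hypothetical smooth test field, chosen only for illustration.
    return (x * y * z, math.sin(y) * z * z, x * math.exp(-z))

point = (0.5, 1.0, -0.3)
curl_value = curl(sample_field, point)
div_curl_value = div_of_curl(sample_field, point)
print(curl_value, div_curl_value)
```

The curl itself is visibly nonzero at this point, while the divergence of the curl comes out at rounding-error size, as the theorem predicts.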


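Corollary 4.18. The flux of the curl of a smooth vector field $\mathbf{f}(x, y, z)$ through any closed surface is zero.

Proof: Let $\Sigma$ be a closed surface which bounds a solid $S$. The flux of $\nabla \times \mathbf{f}$ through $\Sigma$ is

$$\begin{aligned}
\iint_\Sigma (\nabla \times \mathbf{f}) \cdot d\boldsymbol{\sigma} &= \iiint_S \nabla \cdot (\nabla \times \mathbf{f})\, dV \quad \text{(by the Divergence Theorem)} \\
&= \iiint_S 0\, dV \quad \text{(by Theorem 4.17)} \\
&= 0 . \quad \text{QED}
\end{aligned}$$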

There is another method for proving Theorem 4.15 which can be useful, and is often used in physics. Namely, if the surface integral $\iint_\Sigma f(x, y, z)\, d\sigma = 0$ for all surfaces $\Sigma$ in some solid region (usually all of $\mathbb{R}^3$), then we must have $f(x, y, z) = 0$ throughout that region. The proof is not trivial, and physicists do not usually bother to prove it. But the result is true, and can also be applied to double and triple integrals.

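For instance, to prove Theorem 4.15, assume that $f(x, y, z)$ is a smooth real-valued function on $\mathbb{R}^3$. Let $C$ be a simple closed curve in $\mathbb{R}^3$ and let $\Sigma$ be any capping surface for $C$ (i.e. $\Sigma$ is orientable and its boundary is $C$). Since $\nabla f$ is a vector field, then

$$\begin{aligned}
\iint_\Sigma (\nabla \times (\nabla f)) \cdot \mathbf{n}\, d\sigma &= \oint_C \nabla f \cdot d\mathbf{r} \quad \text{by Stokes’ Theorem, so} \\
&= 0 \quad \text{by Corollary 4.13.}
\end{aligned}$$

Since the choice of $\Sigma$ was arbitrary, then we must have $(\nabla \times (\nabla f)) \cdot \mathbf{n} = 0$ throughout $\mathbb{R}^3$, where $\mathbf{n}$ is any unit vector. Using $\mathbf{i}$, $\mathbf{j}$ and $\mathbf{k}$ in place of $\mathbf{n}$, we see that we must have $\nabla \times (\nabla f) = \mathbf{0}$ in $\mathbb{R}^3$, which completes the proof.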

Example 4.18. A system of electric charges has a charge density $\rho(x, y, z)$ and produces an electrostatic field $\mathbf{E}(x, y, z)$ at points $(x, y, z)$ in space. Gauss’ Law states that

$$\iint_\Sigma \mathbf{E} \cdot d\boldsymbol{\sigma} = 4\pi \iiint_S \rho\, dV$$

for any closed surface $\Sigma$ which encloses the charges, with $S$ being the solid region enclosed by $\Sigma$. Show that $\nabla \cdot \mathbf{E} = 4\pi\rho$. This is one of Maxwell’s Equations.¹⁰

¹⁰ In Gaussian (or CGS) units.


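Solution: By the Divergence Theorem, we have

$$\begin{aligned}
\iiint_S \nabla \cdot \mathbf{E}\, dV &= \iint_\Sigma \mathbf{E} \cdot d\boldsymbol{\sigma} \\
&= 4\pi \iiint_S \rho\, dV \quad \text{by Gauss’ Law, so combining the integrals gives}
\end{aligned}$$

$$\iiint_S (\nabla \cdot \mathbf{E} - 4\pi\rho)\, dV = 0 , \text{ so}$$

$$\nabla \cdot \mathbf{E} - 4\pi\rho = 0 \quad \text{since } \Sigma \text{ and hence } S \text{ was arbitrary, so}$$

$$\nabla \cdot \mathbf{E} = 4\pi\rho .$$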

Often (especially in physics) it is convenient to use other coordinate systems when dealing

with quantities such as the gradient, divergence, curl and Laplacian. We will present the

formulas for these in cylindrical and spherical coordinates.

Recall from Section 1.7 that a point $(x, y, z)$ can be represented in cylindrical coordinates $(r, \theta, z)$, where $x = r \cos\theta$, $y = r \sin\theta$, $z = z$. At each point $(r, \theta, z)$, let $\mathbf{e}_r$, $\mathbf{e}_\theta$, $\mathbf{e}_z$ be unit vectors in the direction of increasing $r$, $\theta$, $z$, respectively (see Figure 4.6.1). Then $\mathbf{e}_r$, $\mathbf{e}_\theta$, $\mathbf{e}_z$ form an orthonormal set of vectors. Note, by the right-hand rule, that $\mathbf{e}_z \times \mathbf{e}_r = \mathbf{e}_\theta$.

Figure 4.6.1 Orthonormal vectors $\mathbf{e}_r$, $\mathbf{e}_\theta$, $\mathbf{e}_z$ in cylindrical coordinates

Figure 4.6.2 Orthonormal vectors $\mathbf{e}_\rho$, $\mathbf{e}_\theta$, $\mathbf{e}_\phi$ in spherical coordinates

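Similarly, a point $(x, y, z)$ can be represented in spherical coordinates $(\rho, \theta, \phi)$, where $x = \rho \sin\phi \cos\theta$, $y = \rho \sin\phi \sin\theta$, $z = \rho \cos\phi$. At each point $(\rho, \theta, \phi)$, let $\mathbf{e}_\rho$, $\mathbf{e}_\theta$, $\mathbf{e}_\phi$ be unit vectors in the direction of increasing $\rho$, $\theta$, $\phi$, respectively (see Figure 4.6.2). Then the vectors $\mathbf{e}_\rho$, $\mathbf{e}_\theta$, $\mathbf{e}_\phi$ are orthonormal. By the right-hand rule, we see that $\mathbf{e}_\theta \times \mathbf{e}_\rho = \mathbf{e}_\phi$.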

We can now summarize the expressions for the gradient, divergence, curl and Laplacian

in Cartesian, cylindrical and spherical coordinates in the following tables:


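Cartesian $(x, y, z)$: Scalar function $F$; Vector field $\mathbf{f} = f_1\,\mathbf{i} + f_2\,\mathbf{j} + f_3\,\mathbf{k}$

gradient : $\nabla F = \dfrac{\partial F}{\partial x}\,\mathbf{i} + \dfrac{\partial F}{\partial y}\,\mathbf{j} + \dfrac{\partial F}{\partial z}\,\mathbf{k}$

divergence : $\nabla \cdot \mathbf{f} = \dfrac{\partial f_1}{\partial x} + \dfrac{\partial f_2}{\partial y} + \dfrac{\partial f_3}{\partial z}$

curl : $\nabla \times \mathbf{f} = \left( \dfrac{\partial f_3}{\partial y} - \dfrac{\partial f_2}{\partial z} \right) \mathbf{i} + \left( \dfrac{\partial f_1}{\partial z} - \dfrac{\partial f_3}{\partial x} \right) \mathbf{j} + \left( \dfrac{\partial f_2}{\partial x} - \dfrac{\partial f_1}{\partial y} \right) \mathbf{k}$

Laplacian : $\Delta F = \dfrac{\partial^2 F}{\partial x^2} + \dfrac{\partial^2 F}{\partial y^2} + \dfrac{\partial^2 F}{\partial z^2}$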

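Cylindrical $(r, \theta, z)$: Scalar function $F$; Vector field $\mathbf{f} = f_r\,\mathbf{e}_r + f_\theta\,\mathbf{e}_\theta + f_z\,\mathbf{e}_z$

gradient : $\nabla F = \dfrac{\partial F}{\partial r}\,\mathbf{e}_r + \dfrac{1}{r} \dfrac{\partial F}{\partial \theta}\,\mathbf{e}_\theta + \dfrac{\partial F}{\partial z}\,\mathbf{e}_z$

divergence : $\nabla \cdot \mathbf{f} = \dfrac{1}{r} \dfrac{\partial}{\partial r}(r f_r) + \dfrac{1}{r} \dfrac{\partial f_\theta}{\partial \theta} + \dfrac{\partial f_z}{\partial z}$

curl : $\nabla \times \mathbf{f} = \left( \dfrac{1}{r} \dfrac{\partial f_z}{\partial \theta} - \dfrac{\partial f_\theta}{\partial z} \right) \mathbf{e}_r + \left( \dfrac{\partial f_r}{\partial z} - \dfrac{\partial f_z}{\partial r} \right) \mathbf{e}_\theta + \dfrac{1}{r} \left( \dfrac{\partial}{\partial r}(r f_\theta) - \dfrac{\partial f_r}{\partial \theta} \right) \mathbf{e}_z$

Laplacian : $\Delta F = \dfrac{1}{r} \dfrac{\partial}{\partial r} \left( r \dfrac{\partial F}{\partial r} \right) + \dfrac{1}{r^2} \dfrac{\partial^2 F}{\partial \theta^2} + \dfrac{\partial^2 F}{\partial z^2}$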

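Spherical $(\rho, \theta, \phi)$: Scalar function $F$; Vector field $\mathbf{f} = f_\rho\,\mathbf{e}_\rho + f_\theta\,\mathbf{e}_\theta + f_\phi\,\mathbf{e}_\phi$

gradient : $\nabla F = \dfrac{\partial F}{\partial \rho}\,\mathbf{e}_\rho + \dfrac{1}{\rho \sin\phi} \dfrac{\partial F}{\partial \theta}\,\mathbf{e}_\theta + \dfrac{1}{\rho} \dfrac{\partial F}{\partial \phi}\,\mathbf{e}_\phi$

divergence : $\nabla \cdot \mathbf{f} = \dfrac{1}{\rho^2} \dfrac{\partial}{\partial \rho}(\rho^2 f_\rho) + \dfrac{1}{\rho \sin\phi} \dfrac{\partial f_\theta}{\partial \theta} + \dfrac{1}{\rho \sin\phi} \dfrac{\partial}{\partial \phi}(\sin\phi\, f_\phi)$

curl : $\nabla \times \mathbf{f} = \dfrac{1}{\rho \sin\phi} \left( \dfrac{\partial}{\partial \phi}(\sin\phi\, f_\theta) - \dfrac{\partial f_\phi}{\partial \theta} \right) \mathbf{e}_\rho + \dfrac{1}{\rho} \left( \dfrac{\partial}{\partial \rho}(\rho f_\phi) - \dfrac{\partial f_\rho}{\partial \phi} \right) \mathbf{e}_\theta + \left( \dfrac{1}{\rho \sin\phi} \dfrac{\partial f_\rho}{\partial \theta} - \dfrac{1}{\rho} \dfrac{\partial}{\partial \rho}(\rho f_\theta) \right) \mathbf{e}_\phi$

Laplacian : $\Delta F = \dfrac{1}{\rho^2} \dfrac{\partial}{\partial \rho} \left( \rho^2 \dfrac{\partial F}{\partial \rho} \right) + \dfrac{1}{\rho^2 \sin^2\phi} \dfrac{\partial^2 F}{\partial \theta^2} + \dfrac{1}{\rho^2 \sin\phi} \dfrac{\partial}{\partial \phi} \left( \sin\phi\, \dfrac{\partial F}{\partial \phi} \right)$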

The derivation of the above formulas for cylindrical and spherical coordinates is straightforward but extremely tedious. The basic idea is to take the Cartesian equivalent of the quantity in question and to substitute into that formula using the appropriate coordinate transformation. As an example, we will derive the formula for the gradient in spherical coordinates.


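Goal: Show that the gradient of a real-valued function $F(\rho, \theta, \phi)$ in spherical coordinates is:

$$\nabla F = \frac{\partial F}{\partial \rho}\,\mathbf{e}_\rho + \frac{1}{\rho \sin\phi} \frac{\partial F}{\partial \theta}\,\mathbf{e}_\theta + \frac{1}{\rho} \frac{\partial F}{\partial \phi}\,\mathbf{e}_\phi$$

Idea: In the Cartesian gradient formula $\nabla F(x, y, z) = \frac{\partial F}{\partial x}\,\mathbf{i} + \frac{\partial F}{\partial y}\,\mathbf{j} + \frac{\partial F}{\partial z}\,\mathbf{k}$, put the Cartesian basis vectors $\mathbf{i}$, $\mathbf{j}$, $\mathbf{k}$ in terms of the spherical coordinate basis vectors $\mathbf{e}_\rho$, $\mathbf{e}_\theta$, $\mathbf{e}_\phi$ and functions of $\rho$, $\theta$ and $\phi$. Then put the partial derivatives $\frac{\partial F}{\partial x}$, $\frac{\partial F}{\partial y}$, $\frac{\partial F}{\partial z}$ in terms of $\frac{\partial F}{\partial \rho}$, $\frac{\partial F}{\partial \theta}$, $\frac{\partial F}{\partial \phi}$ and functions of $\rho$, $\theta$ and $\phi$.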

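Step 1: Get formulas for $\mathbf{e}_\rho$, $\mathbf{e}_\theta$, $\mathbf{e}_\phi$ in terms of $\mathbf{i}$, $\mathbf{j}$, $\mathbf{k}$.

We can see from Figure 4.6.2 that the unit vector $\mathbf{e}_\rho$ in the $\rho$ direction at a general point $(\rho, \theta, \phi)$ is $\mathbf{e}_\rho = \frac{\mathbf{r}}{\lVert \mathbf{r} \rVert}$, where $\mathbf{r} = x\,\mathbf{i} + y\,\mathbf{j} + z\,\mathbf{k}$ is the position vector of the point in Cartesian coordinates. Thus,

$$\mathbf{e}_\rho = \frac{\mathbf{r}}{\lVert \mathbf{r} \rVert} = \frac{x\,\mathbf{i} + y\,\mathbf{j} + z\,\mathbf{k}}{\sqrt{x^2 + y^2 + z^2}} ,$$

so using $x = \rho \sin\phi \cos\theta$, $y = \rho \sin\phi \sin\theta$, $z = \rho \cos\phi$, and $\rho = \sqrt{x^2 + y^2 + z^2}$, we get:

$$\mathbf{e}_\rho = \sin\phi \cos\theta\,\mathbf{i} + \sin\phi \sin\theta\,\mathbf{j} + \cos\phi\,\mathbf{k}$$

Now, since the angle $\theta$ is measured in the $xy$-plane, then the unit vector $\mathbf{e}_\theta$ in the $\theta$ direction must be parallel to the $xy$-plane. That is, $\mathbf{e}_\theta$ is of the form $a\,\mathbf{i} + b\,\mathbf{j} + 0\,\mathbf{k}$. To figure out what $a$ and $b$ are, note that since $\mathbf{e}_\theta \perp \mathbf{e}_\rho$, then in particular $\mathbf{e}_\theta \perp \mathbf{e}_\rho$ when $\mathbf{e}_\rho$ is in the $xy$-plane. That occurs when the angle $\phi$ is $\pi/2$. Putting $\phi = \pi/2$ into the formula for $\mathbf{e}_\rho$ gives $\mathbf{e}_\rho = \cos\theta\,\mathbf{i} + \sin\theta\,\mathbf{j} + 0\,\mathbf{k}$, and we see that a vector perpendicular to that is $-\sin\theta\,\mathbf{i} + \cos\theta\,\mathbf{j} + 0\,\mathbf{k}$. Since this vector is also a unit vector and points in the (positive) $\theta$ direction, it must be $\mathbf{e}_\theta$:

$$\mathbf{e}_\theta = -\sin\theta\,\mathbf{i} + \cos\theta\,\mathbf{j} + 0\,\mathbf{k}$$

Lastly, since $\mathbf{e}_\phi = \mathbf{e}_\theta \times \mathbf{e}_\rho$, we get:

$$\mathbf{e}_\phi = \cos\phi \cos\theta\,\mathbf{i} + \cos\phi \sin\theta\,\mathbf{j} - \sin\phi\,\mathbf{k}$$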
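The three formulas from Step 1 can be checked at a sample pair of angles. This is a small sketch with illustrative helper names (`spherical_basis`, `dot`, `cross`): it verifies that the vectors are orthonormal and that $\mathbf{e}_\theta \times \mathbf{e}_\rho$ reproduces $\mathbf{e}_\phi$.

```python
import math

def spherical_basis(theta, phi):
    """Cartesian components of e_rho, e_theta, e_phi at angles (theta, phi)."""
    e_rho = (math.sin(phi) * math.cos(theta), math.sin(phi) * math.sin(theta), math.cos(phi))
    e_theta = (-math.sin(theta), math.cos(theta), 0.0)
    e_phi = (math.cos(phi) * math.cos(theta), math.cos(phi) * math.sin(theta), -math.sin(phi))
    return e_rho, e_theta, e_phi

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    return (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )

e_rho, e_theta, e_phi = spherical_basis(0.7, 1.1)  # arbitrary sample angles
print(dot(e_rho, e_theta), dot(e_rho, e_phi), dot(e_theta, e_phi))
print(cross(e_theta, e_rho), e_phi)
```

The dot products between different basis vectors should print as (numerically) zero, and the two vectors on the last line should match.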

Step 2: Use the three formulas from Step 1 to solve for $\mathbf{i}$, $\mathbf{j}$, $\mathbf{k}$ in terms of $\mathbf{e}_\rho$, $\mathbf{e}_\theta$, $\mathbf{e}_\phi$.

This comes down to solving a system of three equations in three unknowns. There are many ways of doing this, but we will do it by combining the formulas for $\mathbf{e}_\rho$ and $\mathbf{e}_\phi$ to eliminate $\mathbf{k}$, which will give us an equation involving just $\mathbf{i}$ and $\mathbf{j}$. This, with the formula for $\mathbf{e}_\theta$, will then leave us with a system of two equations in two unknowns ($\mathbf{i}$ and $\mathbf{j}$), which we will use to solve first for $\mathbf{j}$ then for $\mathbf{i}$. Lastly, we will solve for $\mathbf{k}$.

First, note that

$$\sin\phi\,\mathbf{e}_\rho + \cos\phi\,\mathbf{e}_\phi = \cos\theta\,\mathbf{i} + \sin\theta\,\mathbf{j}$$


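so that

$$\sin\theta\, (\sin\phi\,\mathbf{e}_\rho + \cos\phi\,\mathbf{e}_\phi) + \cos\theta\,\mathbf{e}_\theta = (\sin^2\theta + \cos^2\theta)\,\mathbf{j} = \mathbf{j} ,$$

and so:

$$\mathbf{j} = \sin\phi \sin\theta\,\mathbf{e}_\rho + \cos\theta\,\mathbf{e}_\theta + \cos\phi \sin\theta\,\mathbf{e}_\phi$$

Likewise, we see that

$$\cos\theta\, (\sin\phi\,\mathbf{e}_\rho + \cos\phi\,\mathbf{e}_\phi) - \sin\theta\,\mathbf{e}_\theta = (\cos^2\theta + \sin^2\theta)\,\mathbf{i} = \mathbf{i} ,$$

and so:

$$\mathbf{i} = \sin\phi \cos\theta\,\mathbf{e}_\rho - \sin\theta\,\mathbf{e}_\theta + \cos\phi \cos\theta\,\mathbf{e}_\phi$$

Lastly, we see that:

$$\mathbf{k} = \cos\phi\,\mathbf{e}_\rho - \sin\phi\,\mathbf{e}_\phi$$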

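Step 3: Get formulas for $\frac{\partial F}{\partial \rho}$, $\frac{\partial F}{\partial \theta}$, $\frac{\partial F}{\partial \phi}$ in terms of $\frac{\partial F}{\partial x}$, $\frac{\partial F}{\partial y}$, $\frac{\partial F}{\partial z}$.

By the Chain Rule, we have

$$\begin{aligned}
\frac{\partial F}{\partial \rho} &= \frac{\partial F}{\partial x} \frac{\partial x}{\partial \rho} + \frac{\partial F}{\partial y} \frac{\partial y}{\partial \rho} + \frac{\partial F}{\partial z} \frac{\partial z}{\partial \rho} , \\
\frac{\partial F}{\partial \theta} &= \frac{\partial F}{\partial x} \frac{\partial x}{\partial \theta} + \frac{\partial F}{\partial y} \frac{\partial y}{\partial \theta} + \frac{\partial F}{\partial z} \frac{\partial z}{\partial \theta} , \\
\frac{\partial F}{\partial \phi} &= \frac{\partial F}{\partial x} \frac{\partial x}{\partial \phi} + \frac{\partial F}{\partial y} \frac{\partial y}{\partial \phi} + \frac{\partial F}{\partial z} \frac{\partial z}{\partial \phi} ,
\end{aligned}$$

which yields:

$$\begin{aligned}
\frac{\partial F}{\partial \rho} &= \sin\phi \cos\theta\, \frac{\partial F}{\partial x} + \sin\phi \sin\theta\, \frac{\partial F}{\partial y} + \cos\phi\, \frac{\partial F}{\partial z} \\
\frac{\partial F}{\partial \theta} &= -\rho \sin\phi \sin\theta\, \frac{\partial F}{\partial x} + \rho \sin\phi \cos\theta\, \frac{\partial F}{\partial y} \\
\frac{\partial F}{\partial \phi} &= \rho \cos\phi \cos\theta\, \frac{\partial F}{\partial x} + \rho \cos\phi \sin\theta\, \frac{\partial F}{\partial y} - \rho \sin\phi\, \frac{\partial F}{\partial z}
\end{aligned}$$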

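Step 4: Use the three formulas from Step 3 to solve for $\frac{\partial F}{\partial x}$, $\frac{\partial F}{\partial y}$, $\frac{\partial F}{\partial z}$ in terms of $\frac{\partial F}{\partial\rho}$, $\frac{\partial F}{\partial\theta}$, $\frac{\partial F}{\partial\phi}$.

Again, this involves solving a system of three equations in three unknowns. Using a similar process of elimination as in Step 2, we get:

$$
\begin{aligned}
\frac{\partial F}{\partial x} &= \frac{1}{\rho\sin\phi}\left(\rho\sin^2\phi\cos\theta\,\frac{\partial F}{\partial\rho} - \sin\theta\,\frac{\partial F}{\partial\theta} + \sin\phi\cos\phi\cos\theta\,\frac{\partial F}{\partial\phi}\right) \\
\frac{\partial F}{\partial y} &= \frac{1}{\rho\sin\phi}\left(\rho\sin^2\phi\sin\theta\,\frac{\partial F}{\partial\rho} + \cos\theta\,\frac{\partial F}{\partial\theta} + \sin\phi\cos\phi\sin\theta\,\frac{\partial F}{\partial\phi}\right) \\
\frac{\partial F}{\partial z} &= \frac{1}{\rho}\left(\rho\cos\phi\,\frac{\partial F}{\partial\rho} - \sin\phi\,\frac{\partial F}{\partial\phi}\right)
\end{aligned}
$$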
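One way to gain confidence in the Step 4 formulas is to check that they undo the Step 3 formulas. The sketch below (an illustration, not the author's method; the function names are hypothetical) writes each set of formulas as a 3×3 coefficient matrix and multiplies them at a sample point with $\rho > 0$ and $\sin\phi \neq 0$. If Step 4 really solves the Step 3 system, the product should be the identity matrix up to rounding.

```python
import math

def step3_matrix(rho, theta, phi):
    """Rows give dF/drho, dF/dtheta, dF/dphi as combinations of (dF/dx, dF/dy, dF/dz)."""
    st, ct = math.sin(theta), math.cos(theta)
    sp, cp = math.sin(phi), math.cos(phi)
    return [
        [sp * ct, sp * st, cp],
        [-rho * sp * st, rho * sp * ct, 0.0],
        [rho * cp * ct, rho * cp * st, -rho * sp],
    ]

def step4_matrix(rho, theta, phi):
    """Rows give dF/dx, dF/dy, dF/dz as combinations of (dF/drho, dF/dtheta, dF/dphi)."""
    st, ct = math.sin(theta), math.cos(theta)
    sp, cp = math.sin(phi), math.cos(phi)
    a = 1.0 / (rho * sp)  # common factor in the dF/dx and dF/dy formulas
    b = 1.0 / rho         # common factor in the dF/dz formula
    return [
        [a * rho * sp ** 2 * ct, -a * st, a * sp * cp * ct],
        [a * rho * sp ** 2 * st, a * ct, a * sp * cp * st],
        [b * rho * cp, 0.0, -b * sp],
    ]

def matmul(A, B):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(A[r][m] * B[m][c] for m in range(3)) for c in range(3)]
            for r in range(3)]

if __name__ == "__main__":
    point = (2.0, 0.7, 1.1)  # sample (rho, theta, phi)
    for row in matmul(step4_matrix(*point), step3_matrix(*point)):
        print(row)
```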


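Step 5: Substitute the formulas for $\mathbf{i}, \mathbf{j}, \mathbf{k}$ from Step 2 and the formulas for $\frac{\partial F}{\partial x}$, $\frac{\partial F}{\partial y}$, $\frac{\partial F}{\partial z}$ from Step 4 into the Cartesian gradient formula $\nabla F(x, y, z) = \frac{\partial F}{\partial x}\,\mathbf{i} + \frac{\partial F}{\partial y}\,\mathbf{j} + \frac{\partial F}{\partial z}\,\mathbf{k}$.

Doing this last step is perhaps the most tedious, since it involves simplifying $3 \times 3 + 3 \times 3 + 2 \times 2 = 22$ terms! Namely,

$$
\begin{aligned}
\nabla F ={}& \frac{1}{\rho\sin\phi}\left(\rho\sin^2\phi\cos\theta\,\frac{\partial F}{\partial\rho} - \sin\theta\,\frac{\partial F}{\partial\theta} + \sin\phi\cos\phi\cos\theta\,\frac{\partial F}{\partial\phi}\right)(\sin\phi\cos\theta\,\mathbf{e}_\rho - \sin\theta\,\mathbf{e}_\theta + \cos\phi\cos\theta\,\mathbf{e}_\phi) \\
&+ \frac{1}{\rho\sin\phi}\left(\rho\sin^2\phi\sin\theta\,\frac{\partial F}{\partial\rho} + \cos\theta\,\frac{\partial F}{\partial\theta} + \sin\phi\cos\phi\sin\theta\,\frac{\partial F}{\partial\phi}\right)(\sin\phi\sin\theta\,\mathbf{e}_\rho + \cos\theta\,\mathbf{e}_\theta + \cos\phi\sin\theta\,\mathbf{e}_\phi) \\
&+ \frac{1}{\rho}\left(\rho\cos\phi\,\frac{\partial F}{\partial\rho} - \sin\phi\,\frac{\partial F}{\partial\phi}\right)(\cos\phi\,\mathbf{e}_\rho - \sin\phi\,\mathbf{e}_\phi) ,
\end{aligned}
$$

which we see has 8 terms involving $\mathbf{e}_\rho$, 6 terms involving $\mathbf{e}_\theta$, and 8 terms involving $\mathbf{e}_\phi$. But the algebra is straightforward and yields the desired result:

$$\nabla F = \frac{\partial F}{\partial\rho}\,\mathbf{e}_\rho + \frac{1}{\rho\sin\phi}\frac{\partial F}{\partial\theta}\,\mathbf{e}_\theta + \frac{1}{\rho}\frac{\partial F}{\partial\phi}\,\mathbf{e}_\phi$$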
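The spherical gradient formula can also be compared against the Cartesian gradient numerically. The following is a rough sketch under stated assumptions: the test function `F` is hypothetical and chosen only for illustration, the conversion $x = \rho\sin\phi\cos\theta$, $y = \rho\sin\phi\sin\theta$, $z = \rho\cos\phi$ is the one used throughout this section, and derivatives are estimated with central differences rather than computed exactly.

```python
import math

def to_cartesian(rho, theta, phi):
    """Spherical (rho, theta, phi) to Cartesian (x, y, z)."""
    return (rho * math.sin(phi) * math.cos(theta),
            rho * math.sin(phi) * math.sin(theta),
            rho * math.cos(phi))

def F(x, y, z):
    """Hypothetical test function, chosen only for illustration."""
    return x * y + z * z * x

def partial(func, point, index, h=1e-5):
    """Central-difference estimate of the partial derivative in slot `index`."""
    up, down = list(point), list(point)
    up[index] += h
    down[index] -= h
    return (func(*up) - func(*down)) / (2 * h)

def gradient_cartesian(point_xyz):
    """Estimate (dF/dx, dF/dy, dF/dz) directly."""
    return tuple(partial(F, point_xyz, n) for n in range(3))

def gradient_spherical(rho, theta, phi):
    """Evaluate the spherical gradient formula; returns Cartesian components."""
    G = lambda r, t, p: F(*to_cartesian(r, t, p))
    F_rho, F_theta, F_phi = (partial(G, (rho, theta, phi), n) for n in range(3))
    st, ct = math.sin(theta), math.cos(theta)
    sp, cp = math.sin(phi), math.cos(phi)
    e_rho = (sp * ct, sp * st, cp)
    e_theta = (-st, ct, 0.0)
    e_phi = (cp * ct, cp * st, -sp)
    coeffs = (F_rho, F_theta / (rho * sp), F_phi / rho)
    return tuple(coeffs[0] * e_rho[n] + coeffs[1] * e_theta[n] + coeffs[2] * e_phi[n]
                 for n in range(3))

if __name__ == "__main__":
    rho, theta, phi = 2.0, 0.7, 1.1
    print(gradient_cartesian(to_cartesian(rho, theta, phi)))
    print(gradient_spherical(rho, theta, phi))
```

The two printed vectors should agree to within the accuracy of the finite-difference estimates.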

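Example 4.19. In Example 4.17 we showed that $\nabla\|\mathbf{r}\|^2 = 2\,\mathbf{r}$ and $\Delta\|\mathbf{r}\|^2 = 6$, where $\mathbf{r}(x, y, z) = x\,\mathbf{i} + y\,\mathbf{j} + z\,\mathbf{k}$ in Cartesian coordinates. Verify that we get the same answers if we switch to spherical coordinates.

Solution: Since $\|\mathbf{r}\|^2 = x^2 + y^2 + z^2 = \rho^2$ in spherical coordinates, let $F(\rho, \theta, \phi) = \rho^2$ (so that $F(\rho, \theta, \phi) = \|\mathbf{r}\|^2$). The gradient of $F$ in spherical coordinates is

$$
\begin{aligned}
\nabla F &= \frac{\partial F}{\partial\rho}\,\mathbf{e}_\rho + \frac{1}{\rho\sin\phi}\frac{\partial F}{\partial\theta}\,\mathbf{e}_\theta + \frac{1}{\rho}\frac{\partial F}{\partial\phi}\,\mathbf{e}_\phi \\
&= 2\rho\,\mathbf{e}_\rho + \frac{1}{\rho\sin\phi}(0)\,\mathbf{e}_\theta + \frac{1}{\rho}(0)\,\mathbf{e}_\phi \\
&= 2\rho\,\mathbf{e}_\rho = 2\rho\,\frac{\mathbf{r}}{\|\mathbf{r}\|} , \text{ as we showed earlier, so} \\
&= 2\rho\,\frac{\mathbf{r}}{\rho} = 2\,\mathbf{r} , \text{ as expected.}
\end{aligned}
$$

And the Laplacian is

$$
\begin{aligned}
\Delta F &= \frac{1}{\rho^2}\frac{\partial}{\partial\rho}\left(\rho^2\,\frac{\partial F}{\partial\rho}\right) + \frac{1}{\rho^2\sin^2\phi}\frac{\partial^2 F}{\partial\theta^2} + \frac{1}{\rho^2\sin\phi}\frac{\partial}{\partial\phi}\left(\sin\phi\,\frac{\partial F}{\partial\phi}\right) \\
&= \frac{1}{\rho^2}\frac{\partial}{\partial\rho}\left(\rho^2 \cdot 2\rho\right) + \frac{1}{\rho^2\sin^2\phi}(0) + \frac{1}{\rho^2\sin\phi}\frac{\partial}{\partial\phi}\bigl(\sin\phi\,(0)\bigr) \\
&= \frac{1}{\rho^2}\frac{\partial}{\partial\rho}\left(2\rho^3\right) + 0 + 0 \\
&= \frac{1}{\rho^2}\left(6\rho^2\right) = 6 , \text{ as expected.}
\end{aligned}
$$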
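The value $\Delta F = 6$ can be confirmed numerically in both coordinate systems. The sketch below is illustrative only (the helper names are hypothetical): it approximates the Cartesian Laplacian with second central differences, and the radial part $\frac{1}{\rho^2}\frac{\partial}{\partial\rho}\bigl(\rho^2\frac{\partial F}{\partial\rho}\bigr)$ with nested central differences, assuming $F$ depends on $\rho$ alone so that the $\theta$ and $\phi$ terms vanish.

```python
def laplacian_cartesian(f, point, h=1e-3):
    """Sum of second central differences of f in x, y and z at `point`."""
    center = f(*point)
    total = 0.0
    for n in range(3):
        up, down = list(point), list(point)
        up[n] += h
        down[n] -= h
        total += (f(*up) - 2.0 * center + f(*down)) / (h * h)
    return total

def laplacian_radial(F, rho, h=1e-4):
    """(1/rho^2) d/drho (rho^2 dF/drho) for a function F of rho only."""
    dF = lambda r: (F(r + h) - F(r - h)) / (2 * h)   # estimate of dF/drho
    flux = lambda r: r * r * dF(r)                   # rho^2 * dF/drho
    return (flux(rho + h) - flux(rho - h)) / (2 * h) / (rho * rho)

if __name__ == "__main__":
    print(laplacian_cartesian(lambda x, y, z: x * x + y * y + z * z, (1.0, 2.0, -0.5)))
    print(laplacian_radial(lambda rho: rho * rho, 2.3))
```

Both printed values should be close to 6, independent of the sample point.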


Exercises

A

For Exercises 1-6, find the Laplacian of the function $f(x, y, z)$ in Cartesian coordinates.

1. $f(x, y, z) = x + y + z$

2. $f(x, y, z) = x^5$

3. $f(x, y, z) = (x^2 + y^2 + z^2)^{3/2}$

4. $f(x, y, z) = e^{x+y+z}$

5. $f(x, y, z) = x^3 + y^3 + z^3$

6. $f(x, y, z) = e^{-x^2-y^2-z^2}$

7. Find the Laplacian of the function in Exercise 3 in spherical coordinates.

8. Find the Laplacian of the function in Exercise 6 in spherical coordinates.

9. Let $f(x, y, z) = \dfrac{z}{x^2 + y^2}$ in Cartesian coordinates. Find $\nabla f$ in cylindrical coordinates.

10. For $\mathbf{f}(r, \theta, z) = r\,\mathbf{e}_r + z\sin\theta\,\mathbf{e}_\theta + rz\,\mathbf{e}_z$ in cylindrical coordinates, find $\operatorname{div}\mathbf{f}$ and $\operatorname{curl}\mathbf{f}$.

11. For $\mathbf{f}(\rho, \theta, \phi) = \mathbf{e}_\rho + \rho\cos\theta\,\mathbf{e}_\theta + \rho\,\mathbf{e}_\phi$ in spherical coordinates, find $\operatorname{div}\mathbf{f}$ and $\operatorname{curl}\mathbf{f}$.

B

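For Exercises 12-23, prove the given formula ($r = \|\mathbf{r}\|$ is the length of the position vector field $\mathbf{r}(x, y, z) = x\,\mathbf{i} + y\,\mathbf{j} + z\,\mathbf{k}$).

12. $\nabla(1/r) = -\mathbf{r}/r^3$

13. $\Delta(1/r) = 0$

14. $\nabla \cdot (\mathbf{r}/r^3) = 0$

15. $\nabla(\ln r) = \mathbf{r}/r^2$

16. $\operatorname{div}(\mathbf{F} + \mathbf{G}) = \operatorname{div}\mathbf{F} + \operatorname{div}\mathbf{G}$

17. $\operatorname{curl}(\mathbf{F} + \mathbf{G}) = \operatorname{curl}\mathbf{F} + \operatorname{curl}\mathbf{G}$

18. $\operatorname{div}(f\,\mathbf{F}) = f\operatorname{div}\mathbf{F} + \mathbf{F} \cdot \nabla f$

19. $\operatorname{div}(\mathbf{F} \times \mathbf{G}) = \mathbf{G} \cdot \operatorname{curl}\mathbf{F} - \mathbf{F} \cdot \operatorname{curl}\mathbf{G}$

20. $\operatorname{div}(\nabla f \times \nabla g) = 0$

21. $\operatorname{curl}(f\,\mathbf{F}) = f\operatorname{curl}\mathbf{F} + (\nabla f) \times \mathbf{F}$

22. $\operatorname{curl}(\operatorname{curl}\mathbf{F}) = \nabla(\operatorname{div}\mathbf{F}) - \Delta\mathbf{F}$

23. $\Delta(fg) = f\,\Delta g + g\,\Delta f + 2(\nabla f \cdot \nabla g)$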

C

24. Prove Theorem 4.17.

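25. Derive the gradient formula in cylindrical coordinates: $\nabla F = \dfrac{\partial F}{\partial r}\,\mathbf{e}_r + \dfrac{1}{r}\dfrac{\partial F}{\partial\theta}\,\mathbf{e}_\theta + \dfrac{\partial F}{\partial z}\,\mathbf{e}_z$

26. Use $\mathbf{f} = u\,\nabla v$ in the Divergence Theorem to prove:

(a) Green's first identity: $\displaystyle\iiint_S \bigl(u\,\Delta v + (\nabla u) \cdot (\nabla v)\bigr)\,dV = \iint_\Sigma (u\,\nabla v) \cdot d\boldsymbol{\sigma}$

(b) Green's second identity: $\displaystyle\iiint_S (u\,\Delta v - v\,\Delta u)\,dV = \iint_\Sigma (u\,\nabla v - v\,\nabla u) \cdot d\boldsymbol{\sigma}$

27. Suppose that $\Delta u = 0$ (i.e. $u$ is harmonic) over $\mathbb{R}^3$. Define the normal derivative $\frac{\partial u}{\partial n}$ of $u$ over a closed surface $\Sigma$ with outward unit normal vector $\mathbf{n}$ by $\frac{\partial u}{\partial n} = D_{\mathbf{n}} u = \mathbf{n} \cdot \nabla u$. Show that $\displaystyle\iint_\Sigma \frac{\partial u}{\partial n}\,d\sigma = 0$. (Hint: Use Green's second identity.)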

Bibliography

Abbott, E.A., Flatland, 7th edition. New York: Dover Publications, Inc., 1952

Classic tale about a creature living in a 2-dimensional world who encounters a higher-dimensional creature, with lots of humor thrown in.

Anton, H. and C. Rorres, Elementary Linear Algebra: Applications Version, 8th edition. New

York: John Wiley & Sons, 2000

Standard treatment of elementary linear algebra.

Bazaraa, M.S., H.D. Sherali and C.M. Shetty, Nonlinear Programming: Theory and Algorithms, 2nd edition. New York: John Wiley & Sons, 1993

Thorough treatment of nonlinear optimization.

Farin, G., Curves and Surfaces for Computer Aided Geometric Design: A Practical Guide,

2nd edition. San Diego, CA: Academic Press, 1990

An intermediate-level book on curve and surface design.

Hecht, E., Optics, 2nd edition. Reading, MA: Addison-Wesley Publishing Co., 1987

An intermediate-level book on optics, covering a wide range of topics.

Hoel, P.G., S.C. Port and C.J. Stone, Introduction to Probability Theory, Boston, MA:

Houghton Mifﬂin Co., 1971

An excellent introduction to elementary, calculus-based probability theory. Lots of good exercises.

Jackson, J.D., Classical Electrodynamics, 2nd edition. New York: John Wiley & Sons, 1975

An advanced book on electromagnetism, famous for being intimidating. Most of the mathematics will be understandable after reading the present book.

Marion, J.B., Classical Dynamics of Particles and Systems, 2nd edition. New York: Academic

Press, 1970

Standard intermediate-level treatment of classical mechanics. Very thorough.

O’Neill, B., Elementary Differential Geometry, New York: Academic Press, 1966

Intermediate-level book on differential geometry, with a modern approach based on differential

forms.


Pogorelov, A.V., Analytical Geometry, Moscow: Mir Publishers, 1980

An intermediate/advanced book on analytic geometry.

Press, W.H., S.A. Teukolsky, W.T. Vetterling and B.P. Flannery, Numerical Recipes in FORTRAN: The Art of Scientific Computing, 2nd edition. Cambridge, UK: Cambridge University Press, 1992

An excellent source of information on numerical methods for solving a wide variety of problems.

Though all the examples are in the FORTRAN programming language, the code is clear enough

to implement in the language of your choice.

Protter, M.H. and C.B. Morrey, Analytic Geometry, 2nd edition. Reading, MA: Addison-Wesley Publishing Co., 1975

Thorough treatment of elementary analytic geometry, with a rigor not found in most recent

books.

Ralston, A. and P. Rabinowitz, A First Course in Numerical Analysis, 2nd edition. New York:

McGraw-Hill, 1978

Standard treatment of elementary numerical analysis.

Reitz, J.R., F.J. Milford and R.W. Christy, Foundations of Electromagnetic Theory, 3rd edition. Reading, MA: Addison-Wesley Publishing Co., 1979

Intermediate text on electromagnetism.

Schey, H.M., Div, Grad, Curl, and All That: An Informal Text on Vector Calculus, New York:

W.W. Norton & Co., 1973

Very intuitive approach to the subject, from a physicist’s viewpoint. Highly recommended.

Taylor, A.E. and W.R. Mann, Advanced Calculus, 2nd edition. New York: John Wiley & Sons,

1972

Excellent treatment of n-dimensional calculus. A good book to study after the present book.

Many intriguing exercises.

Uspensky, J.V., Theory of Equations, New York: McGraw-Hill, 1948

A classic on the subject, discussing many interesting topics.

Weinberger, H.F., A First Course in Partial Differential Equations, New York: John Wiley &

Sons, 1965

A good introduction to the vast subject of partial differential equations.

Welchons, A.M. and W.R. Krickenberger, Solid Geometry, Boston, MA: Ginn & Co., 1936

A very thorough treatment of 3-dimensional geometry from an elementary perspective, includes many topics which (sadly) do not seem to be taught anymore.

Appendix A

Answers and Hints to Selected Exercises

Chapter 1

Section 1.1 (p. 8)

1. (a) $\sqrt{5}$ (b) $\sqrt{5}$ (c) $\sqrt{17}$ (d) $1$ (e) $2\sqrt{17}$
2. Yes
3. No

Section 1.2 (p. 14)

1. (a) $(-4, 4, -3)$ (b) $(2, 6, -1)$ (c) $\left(\frac{-1}{\sqrt{30}}, \frac{5}{\sqrt{30}}, \frac{-2}{\sqrt{30}}\right)$ (d) $\frac{\sqrt{41}}{2}$ (e) $\frac{\sqrt{41}}{2}$ (f) $(14, -6, 8)$ (g) $(-7, 3, -4)$ (h) $(-1, -6, 1)$ (i) $(-2, -4, 2)$ (j) No.
3. No. $\|\mathbf{v}\| + \|\mathbf{w}\|$ is larger.

Section 1.3 (p. 18)

1. $10$
3. $73.4^\circ$
5. $90^\circ$
7. $0^\circ$
9. Yes, since $\mathbf{v} \cdot \mathbf{w} = 0$.
11. $|\mathbf{v} \cdot \mathbf{w}| = 0 < \sqrt{21}\,\sqrt{5} = \|\mathbf{v}\|\,\|\mathbf{w}\|$
13. $\|\mathbf{v} + \mathbf{w}\| = \sqrt{26} < \sqrt{21} + \sqrt{5} = \|\mathbf{v}\| + \|\mathbf{w}\|$
15. Hint: use Definition 1.6.
24. Hint: See Theorem 1.10(c).

Section 1.4 (p. 29)

1. $(-5, -23, -24)$
3. $(8, 4, -5)$
5. $\mathbf{0}$
7. $16.72$
9. $4\sqrt{5}$
11. $9$
13. $\mathbf{0}$ and $(8, -10, 2)$
15. $14$

Section 1.5 (p. 39)

1. (a) $(2, 3, -2) + t(5, 4, -3)$ (b) $x = 2 + 5t$, $y = 3 + 4t$, $z = -2 - 3t$ (c) $\frac{x-2}{5} = \frac{y-3}{4} = \frac{z+2}{-3}$
3. (a) $(2, 1, 3) + t(1, 0, 1)$ (b) $x = 2 + t$, $y = 1$, $z = 3 + t$ (c) $x - 2 = z - 3$, $y = 1$
5. $x = 1 + 2t$, $y = -2 + 7t$, $z = -3 + 8t$
7. $7.65$
9. $(1, 2, 3)$
11. $4x - 4y + 3z - 10 = 0$
13. $x - 2y - z + 2 = 0$
15. $11x - 24y + 21z - 26 = 0$
17. $9/\sqrt{35}$
19. $x = 5t$, $y = 2 + 3t$, $z = -7t$
21. $(10, -2, 1)$

Section 1.6 (p. 46)

1. radius: $1$, center: $(2, 3, 5)$
3. radius: $5$, center: $(-1, -1, -1)$
5. No intersection.
7. circle $x^2 + y^2 = 4$ in the planes $z = \pm\sqrt{5}$
9. lines $\frac{x}{a} = \frac{y}{b}$, $z = 0$ and $\frac{x}{a} = -\frac{y}{b}$, $z = 0$
13. $\left(\frac{2a}{2-c}, \frac{2b}{2-c}, 0\right)$

Section 1.7 (p. 50)

1. (a) $(4, \frac{\pi}{3}, -1)$ (b) $(\sqrt{17}, \frac{\pi}{3}, 1.816)$
3. (a) $(2\sqrt{7}, \frac{11\pi}{6}, 0)$ (b) $(2\sqrt{7}, \frac{11\pi}{6}, \frac{\pi}{2})$
5. (a) $r^2 + z^2 = 25$ (b) $\rho = 5$
7. (a) $r^2 + 9z^2 = 36$ (b) $\rho^2(1 + 8\cos^2\phi) = 36$
10. $(a, \theta, a\cot\phi)$
12. Hint: Use the distance formula for Cartesian coordinates.

Section 1.8 (p. 57)

1. $\mathbf{f}'(t) = (1, 2t, 3t^2)$; $x = 1 + t$, $y = z = 1$
3. $\mathbf{f}'(t) = (-2\sin 2t, 2\cos 2t, 1)$; $x = 1$, $y = 2t$, $z = t$
5. $\mathbf{v}(t) = (1, 1 - \cos t, \sin t)$, $\mathbf{a}(t) = (0, \sin t, \cos t)$
9. (a) Line parallel to $\mathbf{c}$ (b) Half-line parallel to $\mathbf{c}$ (c) Hint: Think of the functions as position vectors.
15. Hint: Theorem 1.16

Section 1.9 (p. 63)

1. $\frac{3\pi\sqrt{5}}{2}$
3. $2(5^{3/2} - 8)$
5. Replace $t$ by $\left(\left(\frac{27s + 16}{2}\right)^{2/3} - 4\right)\Big/\,9$
6. Hint: Use Theorem 1.20(e), Example 1.37, and Theorem 1.16
7. Hint: Use Exercise 6.
9. Hint: Use $\mathbf{f}'(t) = \|\mathbf{f}'(t)\|\,\mathbf{T}$, differentiate that to get $\mathbf{f}''(t)$, put those expressions into $\mathbf{f}'(t) \times \mathbf{f}''(t)$, then write $\mathbf{T}'(t)$ in terms of $\mathbf{N}(t)$.
11. $\mathbf{T}(t) = \frac{1}{\sqrt{2}}(-\sin t, \cos t, 1)$, $\mathbf{N}(t) = (-\cos t, -\sin t, 0)$, $\mathbf{B}(t) = \frac{1}{\sqrt{2}}(\sin t, -\cos t, 1)$, $\kappa(t) = 1/2$

Chapter 2

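Section 2.1 (p. 70)

1. domain: $\mathbb{R}^2$, range: $[-1, \infty)$
3. domain: $\{(x, y) : x^2 + y^2 \ge 4\}$, range: $[0, \infty)$
5. domain: $\mathbb{R}^3$, range: $[-1, 1]$
7. $1$
9. does not exist
11. $2$
13. $2$
15. $0$
17. does not exist

Section 2.2 (p. 74)

1. $\frac{\partial f}{\partial x} = 2x$, $\frac{\partial f}{\partial y} = 2y$
3. $\frac{\partial f}{\partial x} = x(x^2 + y + 4)^{-1/2}$, $\frac{\partial f}{\partial y} = \frac{1}{2}(x^2 + y + 4)^{-1/2}$
5. $\frac{\partial f}{\partial x} = ye^{xy} + y$, $\frac{\partial f}{\partial y} = xe^{xy} + x$
7. $\frac{\partial f}{\partial x} = 4x^3$, $\frac{\partial f}{\partial y} = 0$
9. $\frac{\partial f}{\partial x} = x(x^2 + y^2)^{-1/2}$, $\frac{\partial f}{\partial y} = y(x^2 + y^2)^{-1/2}$
11. $\frac{\partial f}{\partial x} = \frac{2x}{3}(x^2 + y + 4)^{-2/3}$, $\frac{\partial f}{\partial y} = \frac{1}{3}(x^2 + y + 4)^{-2/3}$
13. $\frac{\partial f}{\partial x} = -2xe^{-(x^2+y^2)}$, $\frac{\partial f}{\partial y} = -2ye^{-(x^2+y^2)}$
15. $\frac{\partial f}{\partial x} = y\cos(xy)$, $\frac{\partial f}{\partial y} = x\cos(xy)$
17. $\frac{\partial^2 f}{\partial x^2} = 2$, $\frac{\partial^2 f}{\partial y^2} = 2$, $\frac{\partial^2 f}{\partial x\,\partial y} = 0$
19. $\frac{\partial^2 f}{\partial x^2} = (y + 4)(x^2 + y + 4)^{-3/2}$, $\frac{\partial^2 f}{\partial y^2} = -\frac{1}{4}(x^2 + y + 4)^{-3/2}$, $\frac{\partial^2 f}{\partial x\,\partial y} = -\frac{1}{2}x(x^2 + y + 4)^{-3/2}$
21. $\frac{\partial^2 f}{\partial x^2} = y^2 e^{xy}$, $\frac{\partial^2 f}{\partial y^2} = x^2 e^{xy}$, $\frac{\partial^2 f}{\partial x\,\partial y} = (1 + xy)e^{xy} + 1$
23. $\frac{\partial^2 f}{\partial x^2} = 12x^2$, $\frac{\partial^2 f}{\partial y^2} = 0$, $\frac{\partial^2 f}{\partial x\,\partial y} = 0$
25. $\frac{\partial^2 f}{\partial x^2} = -x^{-2}$, $\frac{\partial^2 f}{\partial y^2} = -y^{-2}$, $\frac{\partial^2 f}{\partial x\,\partial y} = 0$

Section 2.3 (p. 77)

1. $2x + 3y - z - 3 = 0$
3. $-2x + y - z - 2 = 0$
5. $x + 2y = z$
7. $\frac{1}{2}(x - 1) + \frac{4}{9}(y - 2) + \frac{\sqrt{11}}{12}\left(z - \frac{2\sqrt{11}}{3}\right) = 0$
9. $3x + 4y - 5z = 0$

Section 2.4 (p. 82)

1. $(2x, 2y)$
3. $\left(\frac{x}{\sqrt{x^2+y^2+4}}, \frac{y}{\sqrt{x^2+y^2+4}}\right)$
5. $(1/x, 1/y)$
7. $(yz\cos(xyz), xz\cos(xyz), xy\cos(xyz))$
9. $(2x, 2y, 2z)$
11. $2\sqrt{2}$
13. $\frac{1}{\sqrt{3}}$
15. $\sqrt{3}\cos(1)$
17. increase: $(45, 20)$, decrease: $(-45, -20)$

Section 2.5 (p. 88)

1. local min. $(1, 0)$; saddle pt. $(-1, 0)$
3. local min. $(1, 1)$; local max. $(-1, -1)$; saddle pts. $(1, -1)$, $(-1, 1)$
5. local min. $(1, -1)$; saddle pt. $(0, 0)$
7. local min. $(0, 0)$
9. local min. $(-1, 1/2)$
11. width = height = depth = 10
13. $x = y = 4$, $z = 2$

Section 2.6 (p. 95)

2. $(x_0, y_0) = (0, 0)$: $\to (0.2858, -0.3998)$; $(x_0, y_0) = (1, 1)$: $\to (1.03256, -1.94037)$

Section 2.7 (p. 100)

1. min. $\left(\frac{-4}{\sqrt{5}}, \frac{-2}{\sqrt{5}}\right)$; max. $\left(\frac{4}{\sqrt{5}}, \frac{2}{\sqrt{5}}\right)$
3. min. $\left(\frac{20}{\sqrt{13}}, \frac{30}{\sqrt{13}}\right)$; max. $\left(-\frac{20}{\sqrt{13}}, -\frac{30}{\sqrt{13}}\right)$
4. min. $\left(\frac{-9}{\sqrt{5}}, 0, \frac{2}{\sqrt{5}}\right)$; max. $\left(\frac{9}{8}, \frac{\sqrt{59}}{4}, \frac{-1}{4}\right)$
5. $\frac{8abc}{3\sqrt{3}}$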

Chapter 3

Section 3.1 (p. 104)

1. $1$
3. $\frac{7}{12}$
5. $\frac{7}{6}$
7. $5$
9. $\frac{1}{2}$
11. $15$

Section 3.2 (p. 109)

1. $1$
3. $8\ln 2 - 3$
5. $\frac{\pi}{4}$
6. $\frac{1}{4}$
7. $2$
9. $\frac{1}{6}$
10. $\frac{6}{5}$

Section 3.3 (p. 112)

1. $\frac{9}{2}$
3. $(2\cos(\pi^2) + \pi^4 - 2)/4$
5. $\frac{1}{6}$
7. $6$
10. $\frac{1}{3}$

Section 3.4 (p. 116)

1. The values should converge to $\approx 1.318$. (Hint: In Java the exponential function $e^x$ can be obtained with Math.exp(x). Other languages have similar functions, otherwise use e = 2.7182818284590455 in your program.)
2. $\approx 1.146$
3. $\approx 0.705$
4. $\approx 0.168$

Section 3.5 (p. 123)

1. $8\pi$
3. $\frac{4\pi}{3}(8 - 3^{3/2})$
7. $1 - \frac{\sin 2}{2}$
9. $2\pi ab$

Section 3.6 (p. 127)

1. $(1, 8/3)$
3. $(0, \frac{4a}{3\pi})$
5. $(0, 3\pi/16)$
7. $(0, 0, 5a/12)$
9. $(7/12, 7/12, 7/12)$

Section 3.7 (p. 134)

1. $\sqrt{\pi}$
2. $1$
6. Both are $\frac{n}{(n+1)^2(n+2)}$
7. $\frac{1}{n}$

Chapter 4

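Section 4.1 (p. 142)

1. $1/2$
3. $23$
5. $24\pi$
7. $-2\pi$
9. $2\pi$
11. $0$

Section 4.2 (p. 149)

1. $0$
3. No
4. Yes. $F(x, y) = \frac{x^2}{2} - \frac{y^2}{2}$
5. No
9. (b) No. Hint: Think of how $F$ is defined.
10. Yes. $F(x, y) = axy + bx + cy + d$

Section 4.3 (p. 155)

1. $16/15$
3. $-5\pi$
5. Yes. $F(x, y) = xy^2 + x^3$
7. Yes. $F(x, y) = 4x^2 y + 2y^2 + 3x$

Section 4.4 (p. 163)

1. $216\pi$
2. $3$
3. $12\pi/5$
7. $15/4$

Section 4.5 (p. 175)

1. $2\sqrt{2}\,\pi^2$
2. $(17\sqrt{17} - 5\sqrt{5})/3$
3. $2/5$
4. $2$
5. $2\pi(\pi - 1)$
7. $67/15$
9. $6$
11. Yes
13. No
19. Hint: Think of how a vector field $\mathbf{f}(x, y) = P(x, y)\,\mathbf{i} + Q(x, y)\,\mathbf{j}$ in $\mathbb{R}^2$ can be extended in a natural way to be a vector field in $\mathbb{R}^3$.

Section 4.6 (p. 186)

1. $0$
3. $12\sqrt{x^2 + y^2 + z^2}$
5. $6(x + y + z)$
7. $12\rho$
8. $(4\rho^2 - 6)e^{-\rho^2}$
9. $-\frac{2z}{r^3}\,\mathbf{e}_r + \frac{1}{r^2}\,\mathbf{e}_z$
11. $\operatorname{div}\mathbf{f} = \frac{2}{\rho} - \frac{\sin\theta}{\sin\phi} + \cot\phi$; $\operatorname{curl}\mathbf{f} = \cot\phi\cos\theta\,\mathbf{e}_\rho + 2\,\mathbf{e}_\theta - 2\cos\theta\,\mathbf{e}_\phi$
25. Hint: Start by showing that $\mathbf{e}_r = \cos\theta\,\mathbf{i} + \sin\theta\,\mathbf{j}$, $\mathbf{e}_\theta = -\sin\theta\,\mathbf{i} + \cos\theta\,\mathbf{j}$, $\mathbf{e}_z = \mathbf{k}$.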

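Appendix B

Proof of the Right-Hand Rule for the Cross Product

We will prove the right-hand rule for the cross product of two vectors in $\mathbb{R}^3$.

For any vectors $\mathbf{v}$ and $\mathbf{w}$ in $\mathbb{R}^3$, define a new vector, $\mathbf{n}(\mathbf{v}, \mathbf{w})$, as follows:

1. If $\mathbf{v}$ and $\mathbf{w}$ are nonzero and not parallel, and $\theta$ is the angle between them, then $\mathbf{n}(\mathbf{v}, \mathbf{w})$ is the vector in $\mathbb{R}^3$ such that:

(a) the magnitude of $\mathbf{n}(\mathbf{v}, \mathbf{w})$ is $\|\mathbf{v}\|\,\|\mathbf{w}\|\sin\theta$,

(b) $\mathbf{n}(\mathbf{v}, \mathbf{w})$ is perpendicular to the plane containing $\mathbf{v}$ and $\mathbf{w}$, and

(c) $\mathbf{v}$, $\mathbf{w}$, $\mathbf{n}(\mathbf{v}, \mathbf{w})$ form a right-handed system.

2. If $\mathbf{v}$ and $\mathbf{w}$ are nonzero and parallel, then $\mathbf{n}(\mathbf{v}, \mathbf{w}) = \mathbf{0}$.

3. If either $\mathbf{v}$ or $\mathbf{w}$ is $\mathbf{0}$, then $\mathbf{n}(\mathbf{v}, \mathbf{w}) = \mathbf{0}$.

The goal is to show that $\mathbf{n}(\mathbf{v}, \mathbf{w}) = \mathbf{v} \times \mathbf{w}$ for all $\mathbf{v}$, $\mathbf{w}$ in $\mathbb{R}^3$, which would prove the right-hand rule for the cross product (by part 1(c) of our definition). To do this, we will perform the following steps: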

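Step 1: Show that $\mathbf{n}(\mathbf{v}, \mathbf{w}) = \mathbf{v} \times \mathbf{w}$ if $\mathbf{v}$ and $\mathbf{w}$ are any two of the basis vectors $\mathbf{i}$, $\mathbf{j}$, $\mathbf{k}$.

This was already shown in Example 1.11 in Section 1.4.

Step 2: Show that $\mathbf{n}(a\mathbf{v}, b\mathbf{w}) = ab(\mathbf{v} \times \mathbf{w})$ for any scalars $a$, $b$ if $\mathbf{v}$ and $\mathbf{w}$ are any two of the basis vectors $\mathbf{i}$, $\mathbf{j}$, $\mathbf{k}$.

If either $a = 0$ or $b = 0$ then $\mathbf{n}(a\mathbf{v}, b\mathbf{w}) = \mathbf{0} = ab(\mathbf{v} \times \mathbf{w})$, so the result holds. So assume that $a \ne 0$ and $b \ne 0$. Let $\mathbf{v}$ and $\mathbf{w}$ be any two of the basis vectors $\mathbf{i}$, $\mathbf{j}$, $\mathbf{k}$. For example, we will show that the result holds for $\mathbf{v} = \mathbf{i}$ and $\mathbf{w} = \mathbf{k}$ (the other possibilities follow in a similar fashion).

For $a\mathbf{v} = a\mathbf{i}$ and $b\mathbf{w} = b\mathbf{k}$, the angle $\theta$ between $a\mathbf{v}$ and $b\mathbf{w}$ is $90^\circ$. Hence the magnitude of $\mathbf{n}(a\mathbf{v}, b\mathbf{w})$, by definition, is $\|a\mathbf{i}\|\,\|b\mathbf{k}\|\sin 90^\circ = |ab|$. Also, by definition, $\mathbf{n}(a\mathbf{v}, b\mathbf{w})$ is perpendicular to the plane containing $a\mathbf{i}$ and $b\mathbf{k}$, namely, the $xz$-plane. Thus, $\mathbf{n}(a\mathbf{v}, b\mathbf{w})$ must be a scalar multiple of $\mathbf{j}$. Since its magnitude is $|ab|$, then $\mathbf{n}(a\mathbf{v}, b\mathbf{w})$ must be either $|ab|\,\mathbf{j}$ or $-|ab|\,\mathbf{j}$.

There are four possibilities for the combinations of signs for $a$ and $b$. We will consider the case when $a > 0$ and $b > 0$ (the other three possibilities are handled similarly).

In this case, $\mathbf{n}(a\mathbf{v}, b\mathbf{w})$ must be either $ab\,\mathbf{j}$ or $-ab\,\mathbf{j}$. Now, since $\mathbf{i}$, $\mathbf{j}$, $\mathbf{k}$ form a right-handed system, then $\mathbf{i}$, $\mathbf{k}$, $\mathbf{j}$ form a left-handed system, and so $\mathbf{i}$, $\mathbf{k}$, $-\mathbf{j}$ form a right-handed system. Thus, $a\mathbf{i}$, $b\mathbf{k}$, $-ab\,\mathbf{j}$ form a right-handed system (since $a > 0$, $b > 0$, and $ab > 0$). So since, by definition, $a\mathbf{i}$, $b\mathbf{k}$, $\mathbf{n}(a\mathbf{i}, b\mathbf{k})$ form a right-handed system, and since $\mathbf{n}(a\mathbf{i}, b\mathbf{k})$ has to be either $ab\,\mathbf{j}$ or $-ab\,\mathbf{j}$, this means that we must have $\mathbf{n}(a\mathbf{i}, b\mathbf{k}) = -ab\,\mathbf{j}$.

But we know that $a\mathbf{i} \times b\mathbf{k} = ab(\mathbf{i} \times \mathbf{k}) = ab(-\mathbf{j}) = -ab\,\mathbf{j}$. Therefore, $\mathbf{n}(a\mathbf{i}, b\mathbf{k}) = ab(\mathbf{i} \times \mathbf{k})$, which is what we needed to show.

$\therefore\ \mathbf{n}(a\mathbf{v}, b\mathbf{w}) = ab(\mathbf{v} \times \mathbf{w})$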

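Step 3: Show that $\mathbf{n}(\mathbf{u}, \mathbf{v} + \mathbf{w}) = \mathbf{n}(\mathbf{u}, \mathbf{v}) + \mathbf{n}(\mathbf{u}, \mathbf{w})$ for any vectors $\mathbf{u}$, $\mathbf{v}$, $\mathbf{w}$.

If $\mathbf{u} = \mathbf{0}$ then the result holds trivially since $\mathbf{n}(\mathbf{u}, \mathbf{v} + \mathbf{w})$, $\mathbf{n}(\mathbf{u}, \mathbf{v})$ and $\mathbf{n}(\mathbf{u}, \mathbf{w})$ are all the zero vector. If $\mathbf{v} = \mathbf{0}$, then the result follows easily since $\mathbf{n}(\mathbf{u}, \mathbf{v} + \mathbf{w}) = \mathbf{n}(\mathbf{u}, \mathbf{0} + \mathbf{w}) = \mathbf{n}(\mathbf{u}, \mathbf{w}) = \mathbf{0} + \mathbf{n}(\mathbf{u}, \mathbf{w}) = \mathbf{n}(\mathbf{u}, \mathbf{0}) + \mathbf{n}(\mathbf{u}, \mathbf{w}) = \mathbf{n}(\mathbf{u}, \mathbf{v}) + \mathbf{n}(\mathbf{u}, \mathbf{w})$. A similar argument shows that the result holds if $\mathbf{w} = \mathbf{0}$.

So now assume that $\mathbf{u}$, $\mathbf{v}$ and $\mathbf{w}$ are all nonzero vectors. We will describe a geometric construction of $\mathbf{n}(\mathbf{u}, \mathbf{v})$, which is shown in the figure below. Let $P$ be a plane perpendicular to $\mathbf{u}$. Multiply the vector $\mathbf{v}$ by the positive scalar $\|\mathbf{u}\|$, then project the vector $\|\mathbf{u}\|\mathbf{v}$ straight down onto the plane $P$. You can think of this projection vector (denoted by $\operatorname{proj}_P \|\mathbf{u}\|\mathbf{v}$) as the shadow of the vector $\|\mathbf{u}\|\mathbf{v}$ on the plane $P$, with the light source directly overhead the terminal point of $\|\mathbf{u}\|\mathbf{v}$. If $\theta$ is the angle between $\mathbf{u}$ and $\mathbf{v}$, then we see that $\operatorname{proj}_P \|\mathbf{u}\|\mathbf{v}$ has magnitude $\|\mathbf{u}\|\,\|\mathbf{v}\|\sin\theta$, which is the magnitude of $\mathbf{n}(\mathbf{u}, \mathbf{v})$. So rotating $\operatorname{proj}_P \|\mathbf{u}\|\mathbf{v}$ by $90^\circ$ in a counter-clockwise direction in the plane $P$ gives a vector whose magnitude is the same as that of $\mathbf{n}(\mathbf{u}, \mathbf{v})$ and which is perpendicular to $\operatorname{proj}_P \|\mathbf{u}\|\mathbf{v}$ (and hence perpendicular to $\mathbf{v}$). Since this vector is in $P$ then it is also perpendicular to $\mathbf{u}$. And we can see that $\mathbf{u}$, $\mathbf{v}$ and this vector form a right-handed system. Hence this vector must be $\mathbf{n}(\mathbf{u}, \mathbf{v})$. Note that this holds even if $\mathbf{u} \parallel \mathbf{v}$, since in that case $\theta = 0^\circ$ and so $\sin\theta = 0$ which means that $\mathbf{n}(\mathbf{u}, \mathbf{v})$ has magnitude $0$, which is what we would expect.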

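[Figure: the vectors $\mathbf{u}$, $\mathbf{v}$ and $\|\mathbf{u}\|\mathbf{v}$, the angle $\theta$, the plane $P$ perpendicular to $\mathbf{u}$, the projection $\operatorname{proj}_P \|\mathbf{u}\|\mathbf{v}$, and $\mathbf{n}(\mathbf{u}, \mathbf{v})$.]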

Now apply this same geometric construction to get $\mathbf{n}(\mathbf{u}, \mathbf{w})$ and $\mathbf{n}(\mathbf{u}, \mathbf{v} + \mathbf{w})$. Since $\|\mathbf{u}\|(\mathbf{v} + \mathbf{w})$ is the sum of the vectors $\|\mathbf{u}\|\mathbf{v}$ and $\|\mathbf{u}\|\mathbf{w}$, then the projection vector $\operatorname{proj}_P \|\mathbf{u}\|(\mathbf{v} + \mathbf{w})$ is the sum of the projection vectors $\operatorname{proj}_P \|\mathbf{u}\|\mathbf{v}$ and $\operatorname{proj}_P \|\mathbf{u}\|\mathbf{w}$ (to see this, using the shadow analogy again and the parallelogram rule for vector addition, think of how projecting a parallelogram onto a plane gives you a parallelogram in that plane). So then rotating all three projection vectors by $90^\circ$ in a counter-clockwise direction in the plane $P$ preserves that sum (see the figure below), which means that $\mathbf{n}(\mathbf{u}, \mathbf{v} + \mathbf{w}) = \mathbf{n}(\mathbf{u}, \mathbf{v}) + \mathbf{n}(\mathbf{u}, \mathbf{w})$.

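[Figure: the vectors $\mathbf{u}$, $\mathbf{v}$, $\mathbf{w}$, $\mathbf{v} + \mathbf{w}$ and their scalings by $\|\mathbf{u}\|$, the three projections onto the plane $P$, and the rotated vectors $\mathbf{n}(\mathbf{u}, \mathbf{v})$, $\mathbf{n}(\mathbf{u}, \mathbf{w})$, $\mathbf{n}(\mathbf{u}, \mathbf{v} + \mathbf{w})$.]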

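Step 4: Show that $\mathbf{n}(\mathbf{w}, \mathbf{v}) = -\mathbf{n}(\mathbf{v}, \mathbf{w})$ for any vectors $\mathbf{v}$, $\mathbf{w}$.

If $\mathbf{v}$ and $\mathbf{w}$ are nonzero and parallel, or if either is $\mathbf{0}$, then $\mathbf{n}(\mathbf{w}, \mathbf{v}) = \mathbf{0} = -\mathbf{n}(\mathbf{v}, \mathbf{w})$, so the result holds. So assume that $\mathbf{v}$ and $\mathbf{w}$ are nonzero and not parallel. Then $\mathbf{n}(\mathbf{w}, \mathbf{v})$ has magnitude $\|\mathbf{w}\|\,\|\mathbf{v}\|\sin\theta$, which is the same as the magnitude of $\mathbf{n}(\mathbf{v}, \mathbf{w})$, and hence is the same as the magnitude of $-\mathbf{n}(\mathbf{v}, \mathbf{w})$. By definition, $\mathbf{n}(\mathbf{v}, \mathbf{w})$ is perpendicular to the plane containing $\mathbf{w}$ and $\mathbf{v}$, and hence so is $-\mathbf{n}(\mathbf{v}, \mathbf{w})$. Also, $\mathbf{v}$, $\mathbf{w}$, $\mathbf{n}(\mathbf{v}, \mathbf{w})$ form a right-handed system, and so $\mathbf{w}$, $\mathbf{v}$, $\mathbf{n}(\mathbf{v}, \mathbf{w})$ form a left-handed system, and hence $\mathbf{w}$, $\mathbf{v}$, $-\mathbf{n}(\mathbf{v}, \mathbf{w})$ form a right-handed system. Thus, we have shown that $-\mathbf{n}(\mathbf{v}, \mathbf{w})$ is a vector with the same magnitude as $\mathbf{n}(\mathbf{w}, \mathbf{v})$ and is perpendicular to the plane containing $\mathbf{w}$ and $\mathbf{v}$, and that $\mathbf{w}$, $\mathbf{v}$, $-\mathbf{n}(\mathbf{v}, \mathbf{w})$ form a right-handed system. So by definition this means that $-\mathbf{n}(\mathbf{v}, \mathbf{w})$ must be $\mathbf{n}(\mathbf{w}, \mathbf{v})$.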

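Step 5: Show that $\mathbf{n}(\mathbf{v}, \mathbf{w}) = \mathbf{v} \times \mathbf{w}$ for all vectors $\mathbf{v}$, $\mathbf{w}$.

Write $\mathbf{v} = v_1\mathbf{i} + v_2\mathbf{j} + v_3\mathbf{k}$ and $\mathbf{w} = w_1\mathbf{i} + w_2\mathbf{j} + w_3\mathbf{k}$. Then by Steps 3 and 4, we have

$$
\begin{aligned}
\mathbf{n}(\mathbf{v}, \mathbf{w}) &= \mathbf{n}(v_1\mathbf{i} + v_2\mathbf{j} + v_3\mathbf{k},\ w_1\mathbf{i} + w_2\mathbf{j} + w_3\mathbf{k}) \\
&= \mathbf{n}(v_1\mathbf{i} + v_2\mathbf{j} + v_3\mathbf{k},\ w_1\mathbf{i}) + \mathbf{n}(v_1\mathbf{i} + v_2\mathbf{j} + v_3\mathbf{k},\ w_2\mathbf{j} + w_3\mathbf{k}) \\
&= \mathbf{n}(v_1\mathbf{i} + v_2\mathbf{j} + v_3\mathbf{k},\ w_1\mathbf{i}) + \mathbf{n}(v_1\mathbf{i} + v_2\mathbf{j} + v_3\mathbf{k},\ w_2\mathbf{j}) + \mathbf{n}(v_1\mathbf{i} + v_2\mathbf{j} + v_3\mathbf{k},\ w_3\mathbf{k}) \\
&= -\mathbf{n}(w_1\mathbf{i},\ v_1\mathbf{i} + v_2\mathbf{j} + v_3\mathbf{k}) - \mathbf{n}(w_2\mathbf{j},\ v_1\mathbf{i} + v_2\mathbf{j} + v_3\mathbf{k}) - \mathbf{n}(w_3\mathbf{k},\ v_1\mathbf{i} + v_2\mathbf{j} + v_3\mathbf{k}) .
\end{aligned}
$$

We can use Steps 1 and 2 to evaluate the three terms on the right side of the last equation above:

$$
\begin{aligned}
-\mathbf{n}(w_1\mathbf{i},\ v_1\mathbf{i} + v_2\mathbf{j} + v_3\mathbf{k}) &= -\mathbf{n}(w_1\mathbf{i}, v_1\mathbf{i}) - \mathbf{n}(w_1\mathbf{i}, v_2\mathbf{j}) - \mathbf{n}(w_1\mathbf{i}, v_3\mathbf{k}) \\
&= -v_1 w_1\,\mathbf{n}(\mathbf{i}, \mathbf{i}) - v_2 w_1\,\mathbf{n}(\mathbf{i}, \mathbf{j}) - v_3 w_1\,\mathbf{n}(\mathbf{i}, \mathbf{k}) \\
&= -v_1 w_1\,(\mathbf{i} \times \mathbf{i}) - v_2 w_1\,(\mathbf{i} \times \mathbf{j}) - v_3 w_1\,(\mathbf{i} \times \mathbf{k}) \\
&= -v_1 w_1\,\mathbf{0} - v_2 w_1\,\mathbf{k} - v_3 w_1\,(-\mathbf{j}) \\
-\mathbf{n}(w_1\mathbf{i},\ v_1\mathbf{i} + v_2\mathbf{j} + v_3\mathbf{k}) &= -v_2 w_1\,\mathbf{k} + v_3 w_1\,\mathbf{j}
\end{aligned}
$$

Similarly, we can calculate

$$-\mathbf{n}(w_2\mathbf{j},\ v_1\mathbf{i} + v_2\mathbf{j} + v_3\mathbf{k}) = v_1 w_2\,\mathbf{k} - v_3 w_2\,\mathbf{i}$$

and

$$-\mathbf{n}(w_3\mathbf{k},\ v_1\mathbf{i} + v_2\mathbf{j} + v_3\mathbf{k}) = -v_1 w_3\,\mathbf{j} + v_2 w_3\,\mathbf{i} .$$

Thus, putting it all together, we have

$$
\begin{aligned}
\mathbf{n}(\mathbf{v}, \mathbf{w}) &= -v_2 w_1\,\mathbf{k} + v_3 w_1\,\mathbf{j} + v_1 w_2\,\mathbf{k} - v_3 w_2\,\mathbf{i} - v_1 w_3\,\mathbf{j} + v_2 w_3\,\mathbf{i} \\
&= (v_2 w_3 - v_3 w_2)\,\mathbf{i} + (v_3 w_1 - v_1 w_3)\,\mathbf{j} + (v_1 w_2 - v_2 w_1)\,\mathbf{k} \\
&= \mathbf{v} \times \mathbf{w} \quad \text{by definition of the cross product.}
\end{aligned}
$$

$\therefore\ \mathbf{n}(\mathbf{v}, \mathbf{w}) = \mathbf{v} \times \mathbf{w}$ for all vectors $\mathbf{v}$, $\mathbf{w}$.

So since $\mathbf{v}$, $\mathbf{w}$, $\mathbf{n}(\mathbf{v}, \mathbf{w})$ form a right-handed system, then $\mathbf{v}$, $\mathbf{w}$, $\mathbf{v} \times \mathbf{w}$ form a right-handed system, which completes the proof.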
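The three defining properties of $\mathbf{n}(\mathbf{v}, \mathbf{w})$ can be spot-checked for the component formula of the cross product. The sketch below is only an illustration with hypothetical helper names. It tests right-handedness with the sign of the determinant whose rows are $\mathbf{v}$, $\mathbf{w}$, $\mathbf{v} \times \mathbf{w}$, which is a standard orientation test and an assumption here, since the proof above argues geometrically instead.

```python
import math

def cross(v, w):
    """Component formula for the cross product v x w."""
    return (v[1] * w[2] - v[2] * w[1],
            v[2] * w[0] - v[0] * w[2],
            v[0] * w[1] - v[1] * w[0])

def dot(v, w):
    return sum(a * b for a, b in zip(v, w))

def norm(v):
    return math.sqrt(dot(v, v))

def det3(a, b, c):
    """Determinant of the matrix with rows a, b, c (scalar triple product)."""
    return dot(a, cross(b, c))

def check_n_properties(v, w, tol=1e-9):
    """Check properties 1(a)-(c) for n = v x w, with v and w nonzero and not parallel."""
    n = cross(v, w)
    cos_theta = dot(v, w) / (norm(v) * norm(w))
    sin_theta = math.sqrt(max(0.0, 1.0 - cos_theta ** 2))
    return {
        "magnitude": abs(norm(n) - norm(v) * norm(w) * sin_theta) < tol,
        "perpendicular": abs(dot(n, v)) < tol and abs(dot(n, w)) < tol,
        "right_handed": det3(v, w, n) > 0,  # assumed orientation test
    }

if __name__ == "__main__":
    print(check_n_properties((1.0, 2.0, 3.0), (-2.0, 0.5, 4.0)))
    print(cross((1, 0, 0), (0, 0, 1)))  # i x k, the case treated in Step 2
```

A numerical check of this kind does not replace the proof; it only guards against sign or index slips when the component formula is typed in.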

Appendix C

3D Graphing with Gnuplot

Gnuplot is a free, open-source software package for producing a variety of graphs. Versions

are available for many operating systems. Below is a very brief tutorial on how to use

Gnuplot to graph functions of several variables.

INSTALLATION

1. Go to http://www.gnuplot.info/download.html and follow the links to download the latest version for your operating system. For Windows, you should get the Zip file with a name such as gp420win32.zip, which is version 4.2.0. All the examples we will discuss require at least version 4.2.0.

2. Install the downloaded ﬁle. For example, in Windows you would unzip the Zip ﬁle you

downloaded in Step 1 into some folder (use the “Use folder names” option if extracting

with WinZip).

RUNNING GNUPLOT

1. In Windows, run wgnuplot.exe from the folder (or bin folder) where you installed Gnuplot. In Linux, just type gnuplot in a terminal window.

2. You should now get a Gnuplot terminal with a gnuplot> command prompt. In Windows

this will appear in a new window, while in Linux it will appear in the terminal window

where the gnuplot command was run. For Windows, if the font is unreadable you can

change it by right-clicking on the text part of the Gnuplot window and selecting the

“Choose Font..” option. For example, the font “Courier”, style “Regular”, size “12” is

usually a good choice (that choice can be saved for future sessions by right-clicking in the

Gnuplot window again and selecting the option to update wgnuplot.ini).

3. At the gnuplot> command prompt you can now run graphing commands, which we will

now describe.

GRAPHING FUNCTIONS

The usual way to create 3D graphs in Gnuplot is with the splot command:

splot <range> <comma-separated list of functions>

For a function $z = f(x, y)$, <range> is the range of $x$ and $y$ values (and optionally the range of $z$ values) over which to plot. To specify an $x$ range and a $y$ range, use an expression of the form [a : b][c : d], for some numbers $a < b$ and $c < d$. This will cause the graph to be plotted for $a \le x \le b$ and $c \le y \le d$.

Function deﬁnitions use the x and y variables in combination with mathematical operators,

listed below:

Symbol    Operation        Example      Result
+         Addition         2+3          5
-         Subtraction      3-2          1
*         Multiplication   2*3          6
/         Division         4/2          2
**        Power            2**3         $2^3 = 8$
exp(x)    $e^x$            exp(2)       $e^2$
log(x)    $\ln x$          log(2)       $\ln 2$
sin(x)    $\sin x$         sin(pi/2)    1
cos(x)    $\cos x$         cos(pi)      -1
tan(x)    $\tan x$         tan(pi/4)    1

Example C.1. To graph the function $z = 2x^2 + y^2$ from $x = -1$ to $x = 1$ and from $y = -2$ to $y = 2$, type this at the gnuplot> prompt:

splot [-1:1][-2:2] 2*x**2 + y**2

The result is shown below:

[Figure: Gnuplot surface plot of 2*x**2 + y**2 for $-1 \le x \le 1$ and $-2 \le y \le 2$.]

Note that we had to type 2*x**2 to multiply 2 times $x^2$. For clarity, parentheses can be used to make sure the operations are being performed in the correct order:

splot [-1:1][-2:2] 2*(x**2) + y**2

In the above example, to also plot the function $z = e^{x+y}$ on the same graph, put a comma after the first function then append the new function:

splot [-1:1][-2:2] 2*(x**2) + y**2, exp(x+y)

By default, the x-axis and y-axis are not shown in the graph. To display the axes, use this

command before the splot command:

set zeroaxis

Also, by default the x- and y-axes are switched from their usual position. To show the axes

with the orientation which we have used throughout the text, use this command:

set view 60,120,1,1

Also, to label the axes, use these commands:

set xlabel "x"

set ylabel "y"

set zlabel "z"

To show the level curves of the surface $z = f(x, y)$ on both the surface and projected onto the $xy$-plane, use this command:

set contour both

The default mesh size for the grid on the surface is 10 units. To get more of a colored/shaded

surface, increase the mesh size (to, say, 25) like this:

set isosamples 25

Putting all this together, we get the following graph with these commands:

set zeroaxis

set view 60,120,1,1

set xlabel "x"

set ylabel "y"

set zlabel "z"

set contour both

set isosamples 25

splot [-1:1][-2:2] 2*(x**2) + y**2, exp(x+y)

[Figure: Gnuplot plot of the surfaces 2*x**2 + y**2 and exp(x+y) with labeled x, y and z axes and level curves; the key lists the levels 1 through 6 for 2*x**2 + y**2 and 5, 10, 15, 20 for exp(x+y).]

The numbers listed below the functions in the key in the upper right corner of the graph are the "levels" of the level curves of the corresponding surface. That is, they are the numbers $c$ such that $f(x, y) = c$. Because of the large number of level curves, the key was put outside the graph with the set key outside command. If you do not want the function key displayed, it can be turned off with this command: unset key

PARAMETRIC FUNCTIONS

Gnuplot has the ability to graph surfaces given in various parametric forms. For example,

for a surface parametrized in cylindrical coordinates

$$x = r\cos\theta , \quad y = r\sin\theta , \quad z = z$$

you would do the following:

set mapping cylindrical

set parametric

splot [a:b][c:d] v*cos(u),v*sin(u),f(u,v)

where the variable u represents $\theta$, with $a \le u \le b$, the variable v represents $r$, with $c \le v \le d$, and $z = f(u, v)$ is some function of u and v.

Example C.2. The graph of the helicoid $z = \theta$ in Example 1.34 from Section 1.7 (p. 49) was created using the following commands:

set mapping cylindrical

set parametric

set view 60,120,1,1

set xyplane 0

set xlabel "x"

set ylabel "y"

set zlabel "z"

unset key

set isosamples 15

splot [0:4*pi][0:2] v*cos(u),v*sin(u),u

The command set xyplane 0 moves the $z$-axis so that $z = 0$ aligns with the $xy$-plane (which is not the default in Gnuplot). Looking at the graph, you will see that $r$ varies from 0 to 2, and $\theta$ varies from 0 to $4\pi$.

PRINTING AND SAVING

In Windows, to print a graph from Gnuplot right-click on the titlebar of the graph’s window,

select “Options” and then the “Print..” option. If that does not work on your version of

Gnuplot, then go to the File menu on the main Gnuplot menubar, select “Output Device ...”,

and enter pdf in the Terminal type? textﬁeld, hit OK. That will allow you to print the graph

as a PDF ﬁle.

To save a graph, say, as a PNG ﬁle, go to the File menu on the main Gnuplot menubar,

select “Output Device ...”, and enter png in the Terminal type? textﬁeld, hit OK. Then, in the

File menu again, select the “Output ...” option and enter a ﬁlename (say, graph.png) in the

Output ﬁlename? textﬁeld, hit OK. Now run your splot command again and you should see

a ﬁle called graph.png in the current directory (usually the directory where wgnuplot.exe is

located, though you can change that setting using the “Change Directory ...” option in the

File menu).

In Linux, to save the graph as a file called graph.png, you would issue the following commands:

set terminal png

set output 'graph.png'

and then run your splot command. There are many terminal types (which determine the

output format). Run the command set terminal to see all the possible types. In Linux,

the postscript terminal type is popular, since the print quality is high and there are many

PostScript viewers available.
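For repeated use it can be convenient to keep the commands in a file instead of typing them at the prompt. The sketch below is one possible approach, not something described in this appendix: a small Python helper (the function names and the file name surface.gp are hypothetical) that collects the commands from Example C.1 and the PNG output settings above into a script. It assumes a gnuplot executable that accepts a script file name as its argument.

```python
def build_script(png_name="graph.png"):
    """Return the text of a Gnuplot script that saves the Example C.1 surfaces to a PNG."""
    lines = [
        "set terminal png",
        f"set output '{png_name}'",
        "set zeroaxis",
        "set view 60,120,1,1",
        'set xlabel "x"',
        'set ylabel "y"',
        'set zlabel "z"',
        "set contour both",
        "set isosamples 25",
        "splot [-1:1][-2:2] 2*(x**2) + y**2, exp(x+y)",
    ]
    return "\n".join(lines) + "\n"

def write_script(path="surface.gp", png_name="graph.png"):
    """Write the script to `path` and return the path."""
    with open(path, "w") as handle:
        handle.write(build_script(png_name))
    return path

if __name__ == "__main__":
    print("wrote", write_script())
```

Running gnuplot surface.gp in a terminal should then produce graph.png in the current directory.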

To quit Gnuplot, type quit at the gnuplot> command prompt.

GNU Free Documentation License

Version 1.2, November 2002

Copyright ©2000,2001,2002 Free Software Foundation, Inc.

51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA

Everyone is permitted to copy and distribute verbatim copies of this license document, but

changing it is not allowed.

Preamble

The purpose of this License is to make a manual, textbook, or other functional and useful

document "free" in the sense of freedom: to assure everyone the effective freedom to copy

and redistribute it, with or without modifying it, either commercially or noncommercially.

Secondarily, this License preserves for the author and publisher a way to get credit for their

work, while not being considered responsible for modiﬁcations made by others.

This License is a kind of "copyleft", which means that derivative works of the document

must themselves be free in the same sense. It complements the GNU General Public License,

which is a copyleft license designed for free software.

We have designed this License in order to use it for manuals for free software, because free

software needs free documentation: a free program should come with manuals providing the

same freedoms that the software does. But this License is not limited to software manuals; it

can be used for any textual work, regardless of subject matter or whether it is published as a

printed book. We recommend this License principally for works whose purpose is instruction

or reference.

1. APPLICABILITY AND DEFINITIONS

This License applies to any manual or other work, in any medium, that contains a notice

placed by the copyright holder saying it can be distributed under the terms of this License.

Such a notice grants a world-wide, royalty-free license, unlimited in duration, to use that

work under the conditions stated herein. The "Document", below, refers to any such manual

or work. Any member of the public is a licensee, and is addressed as "you". You accept

the license if you copy, modify or distribute the work in a way requiring permission under

copyright law.

A "Modiﬁed Version" of the Document means any work containing the Document or a

portion of it, either copied verbatim, or with modiﬁcations and/or translated into another

language.


A "Secondary Section" is a named appendix or a front-matter section of the Document

that deals exclusively with the relationship of the publishers or authors of the Document

to the Document’s overall subject (or to related matters) and contains nothing that could

fall directly within that overall subject. (Thus, if the Document is in part a textbook of

mathematics, a Secondary Section may not explain any mathematics.) The relationship

could be a matter of historical connection with the subject or with related matters, or of

legal, commercial, philosophical, ethical or political position regarding them.

The "Invariant Sections" are certain Secondary Sections whose titles are designated,

as being those of Invariant Sections, in the notice that says that the Document is released

under this License. If a section does not ﬁt the above deﬁnition of Secondary then it is not

allowed to be designated as Invariant. The Document may contain zero Invariant Sections.

If the Document does not identify any Invariant Sections then there are none.

The "Cover Texts" are certain short passages of text that are listed, as Front-Cover Texts

or Back-Cover Texts, in the notice that says that the Document is released under this License.

A Front-Cover Text may be at most 5 words, and a Back-Cover Text may be at most

25 words.

A "Transparent" copy of the Document means a machine-readable copy, represented in

a format whose speciﬁcation is available to the general public, that is suitable for revising

the document straightforwardly with generic text editors or (for images composed of pixels)

generic paint programs or (for drawings) some widely available drawing editor, and that

is suitable for input to text formatters or for automatic translation to a variety of formats

suitable for input to text formatters. A copy made in an otherwise Transparent ﬁle format

whose markup, or absence of markup, has been arranged to thwart or discourage subsequent

modiﬁcation by readers is not Transparent. An image format is not Transparent if used for

any substantial amount of text. A copy that is not "Transparent" is called "Opaque".

Examples of suitable formats for Transparent copies include plain ASCII without markup,

Texinfo input format, LaTeX input format, SGML or XML using a publicly available DTD,

and standard-conforming simple HTML, PostScript or PDF designed for human modiﬁcation.

Examples of transparent image formats include PNG, XCF and JPG. Opaque formats

include proprietary formats that can be read and edited only by proprietary word processors,

SGML or XML for which the DTD and/or processing tools are not generally available,

and the machine-generated HTML, PostScript or PDF produced by some word processors for

output purposes only.

The "Title Page" means, for a printed book, the title page itself, plus such following pages

as are needed to hold, legibly, the material this License requires to appear in the title page.

For works in formats which do not have any title page as such, "Title Page" means the text

near the most prominent appearance of the work’s title, preceding the beginning of the body

of the text.

A section "Entitled XYZ" means a named subunit of the Document whose title either

is precisely XYZ or contains XYZ in parentheses following text that translates XYZ in another

language. (Here XYZ stands for a speciﬁc section name mentioned below, such as

"Acknowledgments", "Dedications", "Endorsements", or "History".) To "Preserve the

Title" of such a section when you modify the Document means that it remains a section

"Entitled XYZ" according to this deﬁnition.

The Document may include Warranty Disclaimers next to the notice which states that

this License applies to the Document. These Warranty Disclaimers are considered to be

included by reference in this License, but only as regards disclaiming warranties: any other

implication that these Warranty Disclaimers may have is void and has no effect on the

meaning of this License.

2. VERBATIM COPYING

You may copy and distribute the Document in any medium, either commercially or noncommercially,

provided that this License, the copyright notices, and the license notice saying

this License applies to the Document are reproduced in all copies, and that you add no

other conditions whatsoever to those of this License. You may not use technical measures to

obstruct or control the reading or further copying of the copies you make or distribute. However,

you may accept compensation in exchange for copies. If you distribute a large enough

number of copies you must also follow the conditions in section 3.

You may also lend copies, under the same conditions stated above, and you may publicly

display copies.

3. COPYING IN QUANTITY

If you publish printed copies (or copies in media that commonly have printed covers) of

the Document, numbering more than 100, and the Document’s license notice requires Cover

Texts, you must enclose the copies in covers that carry, clearly and legibly, all these Cover

Texts: Front-Cover Texts on the front cover, and Back-Cover Texts on the back cover. Both

covers must also clearly and legibly identify you as the publisher of these copies. The front

cover must present the full title with all words of the title equally prominent and visible.

You may add other material on the covers in addition. Copying with changes limited to the

covers, as long as they preserve the title of the Document and satisfy these conditions, can

be treated as verbatim copying in other respects.

If the required texts for either cover are too voluminous to ﬁt legibly, you should put the

ﬁrst ones listed (as many as ﬁt reasonably) on the actual cover, and continue the rest onto

adjacent pages.

If you publish or distribute Opaque copies of the Document numbering more than 100,

you must either include a machine-readable Transparent copy along with each Opaque copy,

or state in or with each Opaque copy a computer-network location from which the general

network-using public has access to download using public-standard network protocols a complete

Transparent copy of the Document, free of added material. If you use the latter option,

you must take reasonably prudent steps, when you begin distribution of Opaque copies in

quantity, to ensure that this Transparent copy will remain thus accessible at the stated location

until at least one year after the last time you distribute an Opaque copy (directly or

through your agents or retailers) of that edition to the public.

It is requested, but not required, that you contact the authors of the Document well before

redistributing any large number of copies, to give them a chance to provide you with an

updated version of the Document.

4. MODIFICATIONS

You may copy and distribute a Modiﬁed Version of the Document under the conditions of

sections 2 and 3 above, provided that you release the Modiﬁed Version under precisely this

License, with the Modiﬁed Version ﬁlling the role of the Document, thus licensing distribution

and modiﬁcation of the Modiﬁed Version to whoever possesses a copy of it. In addition,

you must do these things in the Modiﬁed Version:

A. Use in the Title Page (and on the covers, if any) a title distinct from that of the Document,

and from those of previous versions (which should, if there were any, be listed in the

History section of the Document). You may use the same title as a previous version if the

original publisher of that version gives permission.

B. List on the Title Page, as authors, one or more persons or entities responsible for authorship

of the modiﬁcations in the Modiﬁed Version, together with at least ﬁve of the

principal authors of the Document (all of its principal authors, if it has fewer than ﬁve),

unless they release you from this requirement.

C. State on the Title page the name of the publisher of the Modiﬁed Version, as the publisher.

D. Preserve all the copyright notices of the Document.

E. Add an appropriate copyright notice for your modiﬁcations adjacent to the other

copyright notices.

F. Include, immediately after the copyright notices, a license notice giving the public permission

to use the Modiﬁed Version under the terms of this License, in the form shown

in the Addendum below.

G. Preserve in that license notice the full lists of Invariant Sections and required Cover

Texts given in the Document’s license notice.

H. Include an unaltered copy of this License.

I. Preserve the section Entitled "History", Preserve its Title, and add to it an item stating

at least the title, year, new authors, and publisher of the Modiﬁed Version as given on the

Title Page. If there is no section Entitled "History" in the Document, create one stating

the title, year, authors, and publisher of the Document as given on its Title Page, then

add an item describing the Modiﬁed Version as stated in the previous sentence.

J. Preserve the network location, if any, given in the Document for public access to a Transparent

copy of the Document, and likewise the network locations given in the Document

for previous versions it was based on. These may be placed in the "History" section. You

may omit a network location for a work that was published at least four years before the

Document itself, or if the original publisher of the version it refers to gives permission.

K. For any section Entitled "Acknowledgments" or "Dedications", Preserve the Title of the

section, and preserve in the section all the substance and tone of each of the contributor

acknowledgments and/or dedications given therein.

L. Preserve all the Invariant Sections of the Document, unaltered in their text and in their

titles. Section numbers or the equivalent are not considered part of the section titles.

M. Delete any section Entitled "Endorsements". Such a section may not be included in the

Modiﬁed Version.

N. Do not retitle any existing section to be Entitled "Endorsements" or to conﬂict in title

with any Invariant Section.

O. Preserve any Warranty Disclaimers.

If the Modiﬁed Version includes new front-matter sections or appendices that qualify as

Secondary Sections and contain no material copied from the Document, you may at your

option designate some or all of these sections as invariant. To do this, add their titles to

the list of Invariant Sections in the Modiﬁed Version’s license notice. These titles must be

distinct from any other section titles.

You may add a section Entitled "Endorsements", provided it contains nothing but endorsements

of your Modiﬁed Version by various parties–for example, statements of peer review

or that the text has been approved by an organization as the authoritative deﬁnition of a

standard.

You may add a passage of up to ﬁve words as a Front-Cover Text, and a passage of up to

25 words as a Back-Cover Text, to the end of the list of Cover Texts in the Modiﬁed Version.

Only one passage of Front-Cover Text and one of Back-Cover Text may be added by (or

through arrangements made by) any one entity. If the Document already includes a cover

text for the same cover, previously added by you or by arrangement made by the same entity

you are acting on behalf of, you may not add another; but you may replace the old one, on

explicit permission from the previous publisher that added the old one.

The author(s) and publisher(s) of the Document do not by this License give permission to

use their names for publicity for or to assert or imply endorsement of any Modiﬁed Version.

5. COMBINING DOCUMENTS

You may combine the Document with other documents released under this License, under

the terms deﬁned in section 4 above for modiﬁed versions, provided that you include in the

combination all of the Invariant Sections of all of the original documents, unmodiﬁed, and

list them all as Invariant Sections of your combined work in its license notice, and that you

preserve all their Warranty Disclaimers.

The combined work need only contain one copy of this License, and multiple identical Invariant

Sections may be replaced with a single copy. If there are multiple Invariant Sections

with the same name but different contents, make the title of each such section unique by

adding at the end of it, in parentheses, the name of the original author or publisher of that

section if known, or else a unique number. Make the same adjustment to the section titles

in the list of Invariant Sections in the license notice of the combined work.

In the combination, you must combine any sections Entitled "History" in the various original

documents, forming one section Entitled "History"; likewise combine any sections Entitled

"Acknowledgments", and any sections Entitled "Dedications". You must delete all

sections Entitled "Endorsements".

6. COLLECTIONS OF DOCUMENTS

You may make a collection consisting of the Document and other documents released under

this License, and replace the individual copies of this License in the various documents

with a single copy that is included in the collection, provided that you follow the rules of this

License for verbatim copying of each of the documents in all other respects.

You may extract a single document from such a collection, and distribute it individually

under this License, provided you insert a copy of this License into the extracted document,

and follow this License in all other respects regarding verbatim copying of that document.

7. AGGREGATION WITH INDEPENDENT WORKS

A compilation of the Document or its derivatives with other separate and independent

documents or works, in or on a volume of a storage or distribution medium, is called an "aggregate"

if the copyright resulting from the compilation is not used to limit the legal rights

of the compilation’s users beyond what the individual works permit. When the Document

is included in an aggregate, this License does not apply to the other works in the aggregate

which are not themselves derivative works of the Document.

If the Cover Text requirement of section 3 is applicable to these copies of the Document,

then if the Document is less than one half of the entire aggregate, the Document’s Cover

Texts may be placed on covers that bracket the Document within the aggregate, or the electronic

equivalent of covers if the Document is in electronic form. Otherwise they must appear

on printed covers that bracket the whole aggregate.

8. TRANSLATION

Translation is considered a kind of modiﬁcation, so you may distribute translations of the

Document under the terms of section 4. Replacing Invariant Sections with translations requires

special permission from their copyright holders, but you may include translations of

some or all Invariant Sections in addition to the original versions of these Invariant Sections.

You may include a translation of this License, and all the license notices in the Document,

and any Warranty Disclaimers, provided that you also include the original English version

of this License and the original versions of those notices and disclaimers. In case of a disagreement

between the translation and the original version of this License or a notice or

disclaimer, the original version will prevail.

If a section in the Document is Entitled "Acknowledgments", "Dedications", or "History",

the requirement (section 4) to Preserve its Title (section 1) will typically require changing

the actual title.

9. TERMINATION

You may not copy, modify, sublicense, or distribute the Document except as expressly provided

for under this License. Any other attempt to copy, modify, sublicense or distribute the

Document is void, and will automatically terminate your rights under this License. However,

parties who have received copies, or rights, from you under this License will not have

their licenses terminated so long as such parties remain in full compliance.

10. FUTURE REVISIONS OF THIS LICENSE

The Free Software Foundation may publish new, revised versions of the GNU Free Documentation

License from time to time. Such new versions will be similar in spirit to the

present version, but may differ in detail to address new problems or concerns. See

http://www.gnu.org/copyleft/.

Each version of the License is given a distinguishing version number. If the Document

speciﬁes that a particular numbered version of this License "or any later version" applies to

it, you have the option of following the terms and conditions either of that speciﬁed version or

of any later version that has been published (not as a draft) by the Free Software Foundation.

If the Document does not specify a version number of this License, you may choose any

version ever published (not as a draft) by the Free Software Foundation.

ADDENDUM: How to use this License for your documents

To use this License in a document you have written, include a copy of the License in the

document and put the following copyright and license notices just after the title page:

Copyright © YEAR YOUR NAME. Permission is granted to copy, distribute and/or

modify this document under the terms of the GNU Free Documentation License, Version

1.2 or any later version published by the Free Software Foundation; with no

Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the

license is included in the section entitled "GNU Free Documentation License".

If you have Invariant Sections, Front-Cover Texts and Back-Cover Texts, replace the

"with...Texts." line with this:

with the Invariant Sections being LIST THEIR TITLES, with the Front-Cover Texts

being LIST, and with the Back-Cover Texts being LIST.

If you have Invariant Sections without Cover Texts, or some other combination of the

three, merge those two alternatives to suit the situation.

If your document contains nontrivial examples of program code, we recommend releasing

these examples in parallel under your choice of free software license, such as the GNU

General Public License, to permit their use in free software.

History

This section contains the revision history of the book. For persons making modiﬁcations to

the book, please record the pertinent information here, following the format in the ﬁrst item

below.

1. VERSION: 1.0

Date: 2008-01-04

Author(s): Michael Corral

Title: Vector Calculus

Modiﬁcation(s): Initial version

Index

Symbols

$D$ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84

$M_x$, $M_y$ . . . . . . . . . . . . . . . . . . . . . . . . 124

$M_{xy}$, $M_{xz}$, $M_{yz}$ . . . . . . . . . . . . . . . . . 126

$\Delta$ . . . . . . . . . . . . . . . . . . . . . . . . . . . 178

$\mathbb{R}^2$ . . . . . . . . . . . . . . . . . . . . . . . . . 1

$\mathbb{R}^3$ . . . . . . . . . . . . . . . . . . . . . . . . . 1

$\bar{x}$ . . . . . . . . . . . . . . . . . . . . . . . . . . 124

$\bar{y}$ . . . . . . . . . . . . . . . . . . . . . . . . . . 124

$\bar{z}$ . . . . . . . . . . . . . . . . . . . . . . . . . . 126

$\delta(x, y)$ . . . . . . . . . . . . . . . . . . . . . . . . 124

$\dfrac{\partial(x, y, z)}{\partial(u, v, w)}$ . . . . . . . . . . . . 119

$\dfrac{\partial f}{\partial x}$ . . . . . . . . . . . . . . . . . . 71

$\iiint_S$ . . . . . . . . . . . . . . . . . . . . . . . . . . 110

$\iint$ . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102

$\iint_R$ . . . . . . . . . . . . . . . . . . . . . . . . . . . 105

$\int_C$ . . . . . . . . . . . . . . . . . . . . . . . . 136, 139

$C^1$, $C^\infty$ . . . . . . . . . . . . . . . . . . . . . . . . 59

$\nabla$ . . . . . . . . . . . . . . . . . . . . . . . . . 80, 177

$\nabla^2$ . . . . . . . . . . . . . . . . . . . . . . . . . . 178

$\oiint_\Sigma$ . . . . . . . . . . . . . . . . . . . . . . . . 163

$\oint_C$ . . . . . . . . . . . . . . . . . . . . . . . . . . . 145

$\partial$ . . . . . . . . . . . . . . . . . . . . . . . . . . . 71

$D_{\mathbf{v}} f$ . . . . . . . . . . . . . . . . . . . . . . . . 78

$\mathbf{e}_r$, $\mathbf{e}_\theta$, $\mathbf{e}_z$, $\mathbf{e}_\rho$, $\mathbf{e}_\phi$ . . . . . . . . 181

$d\mathbf{r}$ . . . . . . . . . . . . . . . . . . . . . . . . . 139

$\mathbf{i}$, $\mathbf{j}$, $\mathbf{k}$ . . . . . . . . . . . . . . . . . . . 12

A

acceleration. . . . . . . . . . . . . . . . . . . . . . . . . . 2, 55

angle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15

annulus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153

arc length . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59

area element . . . . . . . . . . . . . . . . . . . . . . . . . . 105

average value . . . . . . . . . . . . . . . . . . . . . . . . . 113

B

Bézier curve . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56

Beta function. . . . . . . . . . . . . . . . . . . . . . . . . . 123

C

capping surface . . . . . . . . . . . . . . . . . . . . . . . 175

Cauchy-Schwarz Inequality . . . . . . . . . . . . 17

center of mass. . . . . . . . . . . . . . . . . . . . . . . . . 124

centroid . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125

Chain Rule . . . . . . . . . . . . . . . . . . . . . . . . 60, 147

change of variable. . . . . . . . . . . . . . . . 117, 119

circulation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174

closed curve . . . . . . . . . . . . . . . . . . . . . . . . . . . 145

closed surface . . . . . . . . . . . . . . . . . . . . . . . . . 161

collinear . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36

conical helix. . . . . . . . . . . . . . . . . . . . . . . . . . . 167

conservative ﬁeld . . . . . . . . . . . . . . . . . . . . . 148

constrained critical point . . . . . . . . . . . . . . . 96

continuity. . . . . . . . . . . . . . . . . . . . . . . . . . . 52, 69

continuously differentiable . . . . . . . . . 59, 80

coordinates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1

    Cartesian . . . . . . . . . . . . . . . . . . . . . . . . . . . 1

    curvilinear . . . . . . . . . . . . . . . . . . . . . . . . . 47

    cylindrical . . . . . . . . . . . . . . . . . . . . 47, 182

    ellipsoidal . . . . . . . . . . . . . . . . . . . . . . . . 164

    left-handed . . . . . . . . . . . . . . . . . . . . . . . . . 2

    polar . . . . . . . . . . . . . . . . . . . . . . . . . . 47, 121

    rectangular . . . . . . . . . . . . . . . . . . . . . . . . . 1

    right-handed . . . . . . . . . . . . . . . . . . . . . . . . 2

    spherical . . . . . . . . . . . . . . . . . . . . . . 47, 182

coplanar . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26

correlation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134

covariance. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134

critical point . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83

cross product . . . . . . . . . . . . . . . . . . . . . . . . . . . 20

curl. . . . . . . . . . . . . . . . . . . . . . . . . . 169, 178, 182

curvature. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62

cylinder. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42

D

density . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124

derivative. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3

    directional . . . . . . . . . . . . . . . . . . . . . . . . . 78

    mixed partial . . . . . . . . . . . . . . . . . . . . . . 73

    partial . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71

    vector-valued function . . . . . . . . . . . . . 52

determinant . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26

differential . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139

differential form . . . . . . . . . . . . . . . . . . . . . . 139

directed curve . . . . . . . . . . . . . . . . . . . . . . . . . 144

direction angles . . . . . . . . . . . . . . . . . . . . . . . . 19

direction cosines. . . . . . . . . . . . . . . . . . . . . . . . 19

directional derivative. . . . . . . . . . . . . . . . . . . 78

distance. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6

    between points . . . . . . . . . . . . . . . . . . . 6, 7

    from point to line . . . . . . . . . . . . . . . . . . 33

    point to plane . . . . . . . . . . . . . . 37, 41, 42

distribution function . . . . . . . . . . . . . . . . . . 129

    joint . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131

    normal . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130

divergence . . . . . . . . . . . . . . . . . . 162, 177, 182

Divergence Theorem. . . . . . . . . . . . . . . . . . 162

dot product . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15

double integral . . . . . . . . . . . . . . . . . . . 102, 105

    polar coordinates . . . . . . . . . . . . . . . . . 121

doubly ruled surface. . . . . . . . . . . . . . . . . . . . 45

E

ellipsoid . . . . . . . . . . . . . . . . . . . . . . 43, 123, 164

elliptic cone. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45

elliptic paraboloid . . . . . . . . . . . . . . . . . . . . . . 44

Euclidean space . . . . . . . . . . . . . . . . . . . . . . . . . 1

exact differential form . . . . . . 139, 154, 175

expected value . . . . . . . . . . . . . . . . . . . . . . . . 132

extreme point . . . . . . . . . . . . . . . . . . . . . . . . . . 83

F

ﬂux. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 162

force . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55

function. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1

    continuous . . . . . . . . . . . . . . . . . . . . . . . . . 69

    scalar . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53

    vector-valued . . . . . . . . . . . . . . . . . . . . . . 51

G

Gaussian blur . . . . . . . . . . . . . . . . . . . . . . . . . . 70

global maximum . . . . . . . . . . . . . . . . . . . . . . . 83

global minimum. . . . . . . . . . . . . . . . . . . . . . . . 83

gradient . . . . . . . . . . . . . . . . . . . . . . . . . . . 80, 182

Green’s identities . . . . . . . . . . . . . . . . . . . . . 186

Green’s Theorem. . . . . . . . . . . . . . . . . . . . . . 150

H

harmonic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 186

helicoid . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49

helix . . . . . . . . . . . . . . . . . . . . . . . . . . . 51, 59, 167

hyperbolic paraboloid . . . . . . . . . . . . . . . . . . 44

hyperboloid. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43

    one sheet . . . . . . . . . . . . . . . . . . . . . . . . . . . 43

    two sheets . . . . . . . . . . . . . . . . . . . . . . . . . . 43

hypersurface . . . . . . . . . . . . . . . . . . . . . . . . . . 110

hypervolume . . . . . . . . . . . . . . . . . . . . . . . . . . 110

I

improper integral . . . . . . . . . . . . . . . . . . . . . 108

integral . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59

    double . . . . . . . . . . . . . . . . . . . . . . . 102, 105

    improper . . . . . . . . . . . . . . . . . . . . . . . . . . 108

    iterated . . . . . . . . . . . . . . . . . . . . . . . . . . . 102

    multiple . . . . . . . . . . . . . . . . . . . . . . . . . . 101

    surface . . . . . . . . . . . . . . . . . . . . . . . 156, 158

    triple . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110

irrotational . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174

iterated integral . . . . . . . . . . . . . . . . . . . . . . 102

J

Jacobi identity. . . . . . . . . . . . . . . . . . . . . . . . . . 30

Jacobian. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119

joint distribution. . . . . . . . . . . . . . . . . . . . . . 131

L

Lagrange multiplier . . . . . . . . . . . . . . . . . . . . 96

lamina . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124

Laplacian . . . . . . . . . . . . . . . . . . . . . . . . 178, 182

level curve. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66

limit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67

    vector-valued function . . . . . . . . . . . . . 52

line . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31

    intersection of planes . . . . . . . . . . . . . . 38

    parallel . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34

    parametric representation . . . . . . . . . 31

    perpendicular . . . . . . . . . . . . . . . . . . . . . . 34

    skew . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34

    symmetric representation . . . . . . . . . 32

    through two points . . . . . . . . . . . . . . . . . 33

    vector representation . . . . . . . . . . . . . . 31

line integral . . . . . . . . . . . . . . . . . . . . . . 136, 139

local maximum. . . . . . . . . . . . . . . . . . . . . . . . . 83

local minimum . . . . . . . . . . . . . . . . . . . . . . . . . 83

M

mass . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124

matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27

mixed partial derivative. . . . . . . . . . . . . . . . 73

Möbius strip . . . . . . . . . . . . . . . . . . . . . . . . . . 168

moment . . . . . . . . . . . . . . . . . . . . . . . . . . 124, 126

momentum. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55

Monte Carlo method . . . . . . . . . . . . . . . . . . 113

moving frame ﬁelds . . . . . . . . . . . . . . . . . . . . 62

multiple integral . . . . . . . . . . . . . . . . . . . . . . 101

multiply connected. . . . . . . . . . . . . . . . . . . . 153

N

n-positive direction . . . . . . . . . . . . . . . . . . . 169

Newton’s algorithm . . . . . . . . . . . . . . . . . . . . 89

normal derivative . . . . . . . . . . . . . . . . . . . . . 186

normal to a curve. . . . . . . . . . . . . . . . . . . . . . . 81

normal vector ﬁeld . . . . . . . . . . . . . . . . . . . . 168

O

orientable . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 168

orthonormal vectors . . . . . . . . . . . . . . . . . . . . 64

outward normal . . . . . . . . . . . . . . . . . . . . . . . 160

P

paraboloid . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44

    elliptic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44

    hyperbolic . . . . . . . . . . . . . . . . . . . . . . 44, 84

    of revolution . . . . . . . . . . . . . . . . . . . . . . . 44

parallelepiped . . . . . . . . . . . . . . . . . . . . . . . . . . 24

    volume . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25

parameter . . . . . . . . . . . . . . . . . . . . . . . . . . 31, 60

parametrization. . . . . . . . . . . . . . . . . . . . . . . . 60

partial derivative. . . . . . . . . . . . . . . . . . . . . . . 71

partial differential equation. . . . . . . . . . . . 74

path independence . . . . . . . . . . 146, 154, 175

piecewise smooth curve . . . . . . . . . . . . . . . 141

plane . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35

    coordinate . . . . . . . . . . . . . . . . . . . . . . . . . . . 1

    Euclidean . . . . . . . . . . . . . . . . . . . . . . . . . . . 1

    in space . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35

    line of intersection . . . . . . . . . . . . . . . . . 38

    normal form . . . . . . . . . . . . . . . . . . . . . . . 35

    normal vector . . . . . . . . . . . . . . . . . . . . . . 35

    point-normal form . . . . . . . . . . . . . . . . . 35

    tangent . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75

    through three points . . . . . . . . . . . . . . . 36

position vector . . . . . . . . . . . . . . . . . 54, 55, 139

potential . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148

probability . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128

probability density function. . . . . . . . . . . 129

projection. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19

Q

quadric surface . . . . . . . . . . . . . . . . . . . . . . . . . 43

R

random variable . . . . . . . . . . . . . . . . . . . . . . 128

Riemann integral . . . . . . . . . . . . . . . . . . . . . 135

right-hand rule. . . . . . . . . . . . . . . . . . . . 21, 192

ruled surface . . . . . . . . . . . . . . . . . . . . . . . . . . . 45

S

saddle point . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85

sample space . . . . . . . . . . . . . . . . . . . . . . . . . . 128

scalar . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9

combination. . . . . . . . . . . . . . . . . . . . . . . . 12

scalar function . . . . . . . . . . . . . . . . . . . . . . . . . 53

scalar triple product . . . . . . . . . . . . . . . . . . . . 25

Second Derivative Test . . . . . . . . . . . . . . . . . 84

second moment. . . . . . . . . . . . . . . . . . . . . . . . 134

second-degree equation. . . . . . . . . . . . . . . . . 43

simple closed curve . . . . . . . . . . . . . . . . . . . 145

simply connected. . . . . . . . . . . . . . . . . 154, 175

smooth function . . . . . . . . . . . . . . . . . . . . 59, 84

solenoidal . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163

span . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18

sphere . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40

spherical spiral . . . . . . . . . . . . . . . . . . . . . . . . . 54

standard normal distribution . . . . . . . . . 130

steepest descent . . . . . . . . . . . . . . . . . . . . . . . . 95

stereographic projection. . . . . . . . . . . . . . . . 46

Stokes’ Theorem . . . . . . . . . . . . . . . . . 168, 169

surface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40

doubly ruled . . . . . . . . . . . . . . . . . . . . . . . 45

orientable. . . . . . . . . . . . . . . . . . . . . . . . . 168

ruled . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45

two-sided . . . . . . . . . . . . . . . . . . . . . . . . . 168

surface integral . . . . . . . . . . . . . . . . . . 156, 158

T

tangent plane . . . . . . . . . . . . . . . . . . . . . . . . . . 75

torus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158

trace. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42

triangle inequality . . . . . . . . . . . . . . . . . . . . . 18

triple integral . . . . . . . . . . . . . . . . . . . . . . . . . 110

cylindrical coordinates. . . . . . . . . . . . 122

spherical coordinates . . . . . . . . . . . . . 122

U

uniform density . . . . . . . . . . . . . . . . . . . . . . . 124

uniform distribution . . . . . . . . . . . . . . . . . . 129

uniformly distributed . . . . . . . . . . . . . . . . . 128

unit disk. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65

V

variance. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134

vector . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3

addition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9

angle between. . . . . . . . . . . . . . . . . . . . . . 15

basis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12

components . . . . . . . . . . . . . . . . . . . . . . . . 13

direction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3

magnitude . . . . . . . . . . . . . . . . . . . . . . . . 3, 7

normal . . . . . . . . . . . . . . . . . . . . . . . . 35, 160

normalized. . . . . . . . . . . . . . . . . . . . . . . . . 12

parallel . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9

perpendicular . . . . . . . . . . . . . . . . . . 16, 17

positive unit normal . . . . . . . . . . . . . . 169

principal normal N . . . . . . . . . . . . . . . . 64

scalar multiplication . . . . . . . . . . . . . . . . 9

subtraction. . . . . . . . . . . . . . . . . . . . . . . . . 10

tangent . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52

translation. . . . . . . . . . . . . . . . . . . . . . . . 5, 9

unit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12

unit binormal B. . . . . . . . . . . . . . . . . . . . 64

unit tangent T . . . . . . . . . . . . . . . . . . . . . 64

zero . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3, 4

vector ﬁeld . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138

normal . . . . . . . . . . . . . . . . . . . . . . . . . . . . 168

smooth. . . . . . . . . . . . . . . . . . . . . . . . . . . . 150

vector triple product. . . . . . . . . . . . . . . . . . . . 25

velocity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2, 55

volume element . . . . . . . . . . . . . . . . . . . . . . . 110

W

wave equation. . . . . . . . . . . . . . . . . . . . . . . . . . 74

work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135, 166

Z

zenith angle . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47

Vector Calculus

Michael Corral

Schoolcraft College

About the author: Michael Corral is an Adjunct Faculty member of the Department of Mathematics at Schoolcraft College. He received a B.A. in Mathematics from the University of California at Berkeley, and received an M.A. in Mathematics and an M.S. in Industrial & Operations Engineering from the University of Michigan.

This text was typeset in LaTeX 2ε with the KOMA-Script bundle, using the GNU Emacs text editor on a Fedora Linux system. The graphics were created using MetaPost, PGF, and Gnuplot.

Copyright © 2008 Michael Corral. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the section entitled “GNU Free Documentation License”.

Preface

This book covers calculus in two and three variables. It is suitable for a one-semester course, normally known as “Vector Calculus”, “Multivariable Calculus”, or simply “Calculus III”. The prerequisites are the standard courses in single-variable calculus (a.k.a. Calculus I and II).

I have tried to be somewhat rigorous about proving results. But while it is important for students to see full-blown proofs, since that is how mathematics works, too much rigor and emphasis on proofs can impede the flow of learning for the vast majority of the audience at this level. If I were to rate the level of rigor in the book on a scale of 1 to 10, with 1 being completely informal and 10 being completely rigorous, I would rate it as a 5.

There are 420 exercises throughout the text, which in my experience are more than enough for a semester course in this subject. There are exercises at the end of each section, divided into three categories: A, B and C. The A exercises are mostly of a routine computational nature, the B exercises are slightly more involved, and the C exercises usually require some effort or insight to solve. A crude way of describing A, B and C would be “Easy”, “Moderate” and “Challenging”, respectively. However, many of the B exercises are easy and not all the C exercises are difficult.

There are a few exercises that require the student to write his or her own computer program to solve some numerical approximation problems (e.g. the Monte Carlo method for approximating multiple integrals, in Section 3.4). The code samples in the text are in the Java programming language, hopefully with enough comments so that the reader can figure out what is being done even without knowing Java. Those exercises do not mandate the use of Java, so students are free to implement the solutions using the language of their choice. While it would have been simple to use a scripting language like Python, and perhaps even easier with a functional programming language (such as Haskell or Scheme), Java was chosen due to its ubiquity, relatively clear syntax, and easy availability for multiple platforms.

Answers and hints to most odd-numbered and some even-numbered exercises are provided in Appendix A. Appendix B contains a proof of the right-hand rule for the cross product, which seems to have virtually disappeared from calculus texts over the last few decades. Appendix C contains a brief tutorial on Gnuplot for graphing functions of two variables.

This book is released under the GNU Free Documentation License (GFDL), which allows others to not only copy and distribute the book but also to modify it. For more details, see the included copy of the GFDL. So that there is no ambiguity on this matter, anyone can make as many copies of this book as desired and distribute it as desired, without needing my permission. The PDF version will always be freely available to the public at no cost (go to http://www.mecmath.net). Feel free to contact me at mcorral@schoolcraft.edu for any questions on this or any other matter involving the book (e.g. comments, suggestions, corrections, etc). I welcome your input.

Finally, I would like to thank my students in Math 240 for being the guinea pigs for the initial draft of this book, and for finding the numerous errors and typos it contained.

January 2008
MICHAEL CORRAL

Contents

Preface

1 Vectors in Euclidean Space
1.1 Introduction
1.2 Vector Algebra
1.3 Dot Product
1.4 Cross Product
1.5 Lines and Planes
1.6 Surfaces
1.7 Curvilinear Coordinates
1.8 Vector-Valued Functions
1.9 Arc Length

2 Functions of Several Variables
2.1 Functions of Two or Three Variables
2.2 Partial Derivatives
2.3 Tangent Plane to a Surface
2.4 Directional Derivatives and the Gradient
2.5 Maxima and Minima
2.6 Unconstrained Optimization: Numerical Methods
2.7 Constrained Optimization: Lagrange Multipliers

3 Multiple Integrals
3.1 Double Integrals
3.2 Double Integrals Over a General Region
3.3 Triple Integrals
3.4 Numerical Approximation of Multiple Integrals
3.5 Change of Variables in Multiple Integrals
3.6 Application: Center of Mass
3.7 Application: Probability and Expected Value

4 Line and Surface Integrals
4.1 Line Integrals
4.2 Properties of Line Integrals
4.3 Green’s Theorem
4.4 Surface Integrals and the Divergence Theorem
4.5 Stokes’ Theorem
4.6 Gradient, Divergence, Curl and Laplacian

Bibliography
Appendix A: Answers and Hints to Selected Exercises
Appendix B: Proof of the Right-Hand Rule for the Cross Product
Appendix C: 3D Graphing with Gnuplot
GNU Free Documentation License
History
Index

1 Vectors in Euclidean Space

1.1 Introduction

In single-variable calculus, the functions that one encounters are functions of a variable (usually $x$ or $t$) that varies over some subset of the real number line (which we denote by $\mathbb{R}$). For such a function, say, $y = f(x)$, the graph of the function $f$ consists of the points $(x, y) = (x, f(x))$. These points lie in the Euclidean plane, which, in the Cartesian or rectangular coordinate system, consists of all ordered pairs of real numbers $(a, b)$. We use the word “Euclidean” to denote a system in which all the usual rules of Euclidean geometry hold. We denote the Euclidean plane by $\mathbb{R}^2$; the “2” represents the number of dimensions of the plane. The Euclidean plane has two perpendicular coordinate axes: the $x$-axis and the $y$-axis.

In vector (or multivariable) calculus, we will deal with functions of two or three variables (usually $x, y$ or $x, y, z$, respectively). The graph of a function of two variables, say, $z = f(x, y)$, lies in Euclidean space, which in the Cartesian coordinate system consists of all ordered triples of real numbers $(a, b, c)$. Since Euclidean space is 3-dimensional, we denote it by $\mathbb{R}^3$. The graph of $f$ consists of the points $(x, y, z) = (x, y, f(x, y))$.

The 3-dimensional coordinate system of Euclidean space can be represented on a flat surface, such as this page or a blackboard, only by giving the illusion of three dimensions, in the manner shown in Figure 1.1.1. Euclidean space has three mutually perpendicular coordinate axes ($x$, $y$ and $z$), and three mutually perpendicular coordinate planes: the $xy$-plane, $yz$-plane and $xz$-plane (see Figure 1.1.2).

Figure 1.1.1 (the point $P(a, b, c)$ in $\mathbb{R}^3$)

Figure 1.1.2 (the $xy$-plane, $yz$-plane and $xz$-plane)

The coordinate system shown in Figure 1.1.1 is known as a right-handed coordinate system, because it is possible, using the right hand, to point the index finger in the positive direction of the $x$-axis, the middle finger in the positive direction of the $y$-axis, and the thumb in the positive direction of the $z$-axis, as in Figure 1.1.3.

Figure 1.1.3 Right-handed coordinate system

An equivalent way of defining a right-handed system is if you can point your thumb upwards in the positive $z$-axis direction while using the remaining four fingers to rotate the $x$-axis towards the $y$-axis. Doing the same thing with the left hand is what defines a left-handed coordinate system. Notice that switching the $x$- and $y$-axes in a right-handed system results in a left-handed system, and that rotating either type of system does not change its “handedness”. Throughout the book we will use a right-handed system.

For functions of three variables, the graphs exist in 4-dimensional space (i.e. $\mathbb{R}^4$), which we can not see in our 3-dimensional space, let alone simulate in 2-dimensional space. So we can only think of 4-dimensional space abstractly. For an entertaining discussion of this subject, see the book by ABBOTT.[1]

So far, we have discussed the position of an object in 2-dimensional or 3-dimensional space. But what about something such as the velocity of the object, or its acceleration? Or the gravitational force acting on the object? These phenomena all seem to involve motion and direction in some way. This is where the idea of a vector comes in.

[1] One thing you will learn is why a 4-dimensional creature would be able to reach inside an egg and remove the yolk without cracking the shell!

You have already dealt with velocity and acceleration in single-variable calculus. For example, for motion along a straight line, if $y = f(t)$ gives the displacement of an object after time $t$, then $dy/dt = f'(t)$ is the velocity of the object at time $t$. The derivative $f'(t)$ is just a number, which is positive if the object is moving in an agreed-upon “positive” direction, and negative if it moves in the opposite of that direction. So you can think of that number, which was called the velocity of the object, as having two components: a magnitude, indicated by a nonnegative number, preceded by a direction, indicated by a plus or minus symbol (representing motion in the positive direction or the negative direction, respectively), i.e. $f'(t) = \pm a$ for some number $a \ge 0$. Then $a$ is the magnitude of the velocity (normally called the speed of the object), and the $\pm$ represents the direction of the velocity (though the $+$ is usually omitted for the positive direction).

For motion along a straight line, i.e. in a 1-dimensional space, the velocities are also contained in that 1-dimensional space, since they are just numbers. For general motion along a curve in 2- or 3-dimensional space, however, velocity will need to be represented by a multidimensional object which should have both a magnitude and a direction. A geometric object which has those features is an arrow, which in elementary geometry is called a “directed line segment”. This is the motivation for how we will define a vector.

Definition 1.1. A (nonzero) vector is a directed line segment drawn from a point $P$ (called its initial point) to a point $Q$ (called its terminal point), with $P$ and $Q$ being distinct points. The vector is denoted by $\overrightarrow{PQ}$. Its magnitude is the length of the line segment, denoted by $\lVert \overrightarrow{PQ} \rVert$, and its direction is the same as that of the directed line segment. The zero vector is just a point, and it is denoted by $\mathbf{0}$.

To indicate the direction of a vector, we draw an arrow from its initial point to its terminal point. We will often denote a vector by a single bold-faced letter (e.g. $\mathbf{v}$) and use the terms “magnitude” and “length” interchangeably. Note that our definition could apply to systems with any number of dimensions (see Figure 1.1.4 (a)-(c)).

Figure 1.1.4 Vectors in different dimensions: (a) One dimension, (b) Two dimensions, (c) Three dimensions

A few things need to be noted about the zero vector. Our motivation for what a vector is included the notions of magnitude and direction. What is the magnitude of the zero vector? We define it to be zero, i.e. $\lVert \mathbf{0} \rVert = 0$. This agrees with the definition of the zero vector as just a point, which has zero length. What about the direction of the zero vector? A single point really has no well-defined direction. Notice that we were careful to only define the direction of a nonzero vector, which is well-defined since the initial and terminal points are distinct. Not everyone agrees on the direction of the zero vector. Some contend that the zero vector has arbitrary direction (i.e. can take any direction), some say that it has indeterminate direction (i.e. the direction can not be determined), while others say that it has no direction. Our definition of the zero vector, however, does not require it to have a direction, and we will leave it at that.[2]

Now that we know what a vector is, we need a way of determining when two vectors are equal. This leads us to the following definition.

Definition 1.2. Two nonzero vectors are equal if they have the same magnitude and the same direction. Any vector with zero magnitude is equal to the zero vector.

By this definition, vectors with the same magnitude and direction but with different initial points would be equal. For example, in Figure 1.1.5 the vectors $\mathbf{u}$, $\mathbf{v}$ and $\mathbf{w}$ all have the same magnitude $\sqrt{5}$ (by the Pythagorean Theorem). And we see that $\mathbf{u}$ and $\mathbf{w}$ are parallel, since they lie on lines having the same slope $\frac{1}{2}$, and they point in the same direction. So $\mathbf{u} = \mathbf{w}$, even though they have different initial points. We also see that $\mathbf{v}$ is parallel to $\mathbf{u}$ but points in the opposite direction. So $\mathbf{u} \ne \mathbf{v}$.

Figure 1.1.5 (the vectors $\mathbf{u}$, $\mathbf{v}$ and $\mathbf{w}$ in the $xy$-plane)

So we can see that there are an infinite number of vectors for a given magnitude and direction, those vectors all being equal and differing only by their initial and terminal points. Is there a single vector which we can choose to represent all those equal vectors? The answer is yes, and is suggested by the vector $\mathbf{w}$ in Figure 1.1.5.

[2] In the subject of linear algebra there is a more abstract way of defining a vector where the concept of “direction” is not really used. See ANTON and RORRES.

Unless otherwise indicated, when speaking of “the vector” with a given magnitude and direction, we will mean the one whose initial point is at the origin of the coordinate system.

Thinking of vectors as starting from the origin provides a way of dealing with vectors in a standard way, since every coordinate system has an origin. But there will be times when it is convenient to consider a different initial point for a vector (for example, when adding vectors, which we will do in the next section).

Another advantage of using the origin as the initial point is that it provides an easy correspondence between a vector and its terminal point.

Example 1.1. Let $\mathbf{v}$ be the vector in $\mathbb{R}^3$ whose initial point is at the origin and whose terminal point is $(3, 4, 5)$. Though the point $(3, 4, 5)$ and the vector $\mathbf{v}$ are different objects, it is convenient to write $\mathbf{v} = (3, 4, 5)$. When doing this, it is understood that the initial point of $\mathbf{v}$ is at the origin $(0, 0, 0)$ and the terminal point is $(3, 4, 5)$.

Figure 1.1.6 Correspondence between points and vectors: (a) The point (3,4,5), (b) The vector (3,4,5)

Unless otherwise stated, when we refer to vectors as $\mathbf{v} = (a, b)$ in $\mathbb{R}^2$ or $\mathbf{v} = (a, b, c)$ in $\mathbb{R}^3$, we mean vectors in Cartesian coordinates starting at the origin. Also, we will write the zero vector $\mathbf{0}$ in $\mathbb{R}^2$ and $\mathbb{R}^3$ as $(0, 0)$ and $(0, 0, 0)$, respectively.

The point-vector correspondence provides an easy way to check if two vectors are equal, without having to determine their magnitude and direction. Similar to seeing if two points are the same, you are now seeing if the terminal points of vectors starting at the origin are the same. For each vector, find the (unique!) vector it equals whose initial point is the origin. Then compare the coordinates of the terminal points of these “new” vectors: if those coordinates are the same, then the original vectors are equal. To get the “new” vectors starting at the origin, you translate each vector to start at the origin by subtracting the coordinates of the original initial point from the original terminal point. The resulting point will be the terminal point of the “new” vector whose initial point is the origin. Do this for each original vector then compare.
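The translate-then-compare procedure just described can be sketched in Java, the language used for the code samples in this text. This is only an illustrative sketch: the class name `VectorEquality` and its methods are hypothetical helpers, not code from the book, and the exact comparison assumes coordinates (such as integers) that are represented exactly as `double` values. The sample points are the ones used in Example 1.2.

```java
import java.util.Arrays;

// Hypothetical helper class (not from the text): decides whether two vectors,
// each given by an initial and a terminal point, are equal.
final class VectorEquality {

    // Translate a vector to start at the origin: subtract the coordinates of
    // the initial point from those of the terminal point.
    static double[] translateToOrigin(double[] initial, double[] terminal) {
        if (initial.length != terminal.length) {
            throw new IllegalArgumentException("points must have the same dimension");
        }
        double[] result = new double[initial.length];
        for (int i = 0; i < initial.length; i++) {
            result[i] = terminal[i] - initial[i];
        }
        return result;
    }

    // The vectors PQ and RS are equal when their translated terminal points coincide.
    // Exact comparison: assumes the coordinates are exactly representable.
    static boolean areEqual(double[] p, double[] q, double[] r, double[] s) {
        return Arrays.equals(translateToOrigin(p, q), translateToOrigin(r, s));
    }

    public static void main(String[] args) {
        double[] p = {2, 1, 5}, q = {3, 5, 7};
        double[] r = {1, -3, -2}, s = {2, 1, 0};
        System.out.println(Arrays.toString(translateToOrigin(p, q))); // Q - P
        System.out.println(Arrays.toString(translateToOrigin(r, s))); // S - R
        System.out.println(areEqual(p, q, r, s));
    }
}
```

For coordinates that come from measurements or earlier floating-point computations, a tolerance-based comparison would be a more appropriate choice than `Arrays.equals`.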

Example 1.2. Consider the vectors $\overrightarrow{PQ}$ and $\overrightarrow{RS}$ in $\mathbb{R}^3$, where $P = (2, 1, 5)$, $Q = (3, 5, 7)$, $R = (1, -3, -2)$ and $S = (2, 1, 0)$. Does $\overrightarrow{PQ} = \overrightarrow{RS}$?

Solution: The vector $\overrightarrow{PQ}$ is equal to the vector $\mathbf{v}$ with initial point $(0, 0, 0)$ and terminal point $Q - P = (3, 5, 7) - (2, 1, 5) = (3 - 2, 5 - 1, 7 - 5) = (1, 4, 2)$.
Similarly, $\overrightarrow{RS}$ is equal to the vector $\mathbf{w}$ with initial point $(0, 0, 0)$ and terminal point $S - R = (2, 1, 0) - (1, -3, -2) = (2 - 1, 1 - (-3), 0 - (-2)) = (1, 4, 2)$.
So $\overrightarrow{PQ} = \mathbf{v} = (1, 4, 2)$ and $\overrightarrow{RS} = \mathbf{w} = (1, 4, 2)$.
$\therefore \overrightarrow{PQ} = \overrightarrow{RS}$

Figure 1.1.7 ($\overrightarrow{PQ}$ translated to $\mathbf{v}$ and $\overrightarrow{RS}$ translated to $\mathbf{w}$, with $\mathbf{v} = \mathbf{w} = (1, 4, 2)$)

Recall the distance formula for points in the Euclidean plane:

For points $P = (x_1, y_1)$, $Q = (x_2, y_2)$ in $\mathbb{R}^2$, the distance $d$ between $P$ and $Q$ is:
\[ d = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2} \tag{1.1} \]

By this formula, we have the following result:

For a vector $\overrightarrow{PQ}$ in $\mathbb{R}^2$ with initial point $P = (x_1, y_1)$ and terminal point $Q = (x_2, y_2)$, the magnitude of $\overrightarrow{PQ}$ is:
\[ \lVert \overrightarrow{PQ} \rVert = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2} \tag{1.2} \]

Finding the magnitude of a vector $\mathbf{v} = (a, b)$ in $\mathbb{R}^2$ is a special case of formula (1.2) with $P = (0, 0)$ and $Q = (a, b)$:

For a vector $\mathbf{v} = (a, b)$ in $\mathbb{R}^2$, the magnitude of $\mathbf{v}$ is:
\[ \lVert \mathbf{v} \rVert = \sqrt{a^2 + b^2} \tag{1.3} \]

To calculate the magnitude of vectors in $\mathbb{R}^3$, we need a distance formula for points in Euclidean space (we will postpone the proof until the next section):

Theorem 1.1. The distance $d$ between points $P = (x_1, y_1, z_1)$ and $Q = (x_2, y_2, z_2)$ in $\mathbb{R}^3$ is:
\[ d = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2} \tag{1.4} \]

The proof will use the following result:

Theorem 1.2. For a vector $\mathbf{v} = (a, b, c)$ in $\mathbb{R}^3$, the magnitude of $\mathbf{v}$ is:
\[ \lVert \mathbf{v} \rVert = \sqrt{a^2 + b^2 + c^2} \tag{1.5} \]

Proof: There are four cases to consider:

Case 1: $a = b = c = 0$. Then $\mathbf{v} = \mathbf{0}$, so $\lVert \mathbf{v} \rVert = 0 = \sqrt{0^2 + 0^2 + 0^2} = \sqrt{a^2 + b^2 + c^2}$.

Case 2: exactly two of $a, b, c$ are 0. Without loss of generality, we assume that $a = b = 0$ and $c \ne 0$ (the other two possibilities are handled in a similar manner). Then $\mathbf{v} = (0, 0, c)$, which is a vector of length $|c|$ along the $z$-axis. So $\lVert \mathbf{v} \rVert = |c| = \sqrt{c^2} = \sqrt{0^2 + 0^2 + c^2} = \sqrt{a^2 + b^2 + c^2}$.

Case 3: exactly one of $a, b, c$ is 0. Without loss of generality, we assume that $a = 0$, $b \ne 0$ and $c \ne 0$ (the other two possibilities are handled in a similar manner). Then $\mathbf{v} = (0, b, c)$, which is a vector in the $yz$-plane, so by the Pythagorean Theorem we have $\lVert \mathbf{v} \rVert = \sqrt{b^2 + c^2} = \sqrt{0^2 + b^2 + c^2} = \sqrt{a^2 + b^2 + c^2}$.

Case 4: none of $a, b, c$ are 0. Without loss of generality, we can assume that $a, b, c$ are all positive (the other seven possibilities are handled in a similar manner). Consider the points $P = (0, 0, 0)$, $Q = (a, b, c)$, $R = (a, b, 0)$, and $S = (a, 0, 0)$, as shown in Figure 1.1.8. Applying the Pythagorean Theorem to the right triangle $\triangle PSR$ gives $|PR|^2 = a^2 + b^2$. A second application of the Pythagorean Theorem, this time to the right triangle $\triangle PQR$, gives $\lVert \mathbf{v} \rVert = |PQ| = \sqrt{|PR|^2 + |QR|^2} = \sqrt{a^2 + b^2 + c^2}$.
This proves the theorem. QED

Figure 1.1.8 (the vector $\mathbf{v}$ from $P$ to $Q(a, b, c)$, with the points $S$ and $R$ in the $xy$-plane)
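Formulas (1.3) and (1.5) for the magnitude, and the distance formula (1.4), translate directly into code. The following Java sketch is one possible way to do it; the class `VectorLength` and its method names are hypothetical and chosen only for illustration. It computes the distance between two points as the magnitude of their coordinate-wise difference, which matches formula (1.4) term by term.

```java
// Hypothetical helper class (not from the text) for magnitudes and distances.
final class VectorLength {

    // Magnitude of a vector starting at the origin, as in formulas (1.3) and (1.5):
    // the square root of the sum of the squares of its components.
    static double magnitude(double... v) {
        double sumOfSquares = 0.0;
        for (double component : v) {
            sumOfSquares += component * component;
        }
        return Math.sqrt(sumOfSquares);
    }

    // Distance between points p and q, as in formulas (1.1) and (1.4):
    // the magnitude of the coordinate-wise difference q - p.
    static double distance(double[] p, double[] q) {
        if (p.length != q.length) {
            throw new IllegalArgumentException("points must have the same dimension");
        }
        double[] difference = new double[p.length];
        for (int i = 0; i < p.length; i++) {
            difference[i] = q[i] - p[i];
        }
        return magnitude(difference);
    }

    public static void main(String[] args) {
        System.out.println(magnitude(3, 4));       // sqrt(9 + 16)
        System.out.println(magnitude(2, 3, 6));    // sqrt(4 + 9 + 36)
        System.out.println(distance(new double[] {1, 2, 3}, new double[] {3, 5, 9}));
    }
}
```

The same two methods work unchanged in $\mathbb{R}^2$ and $\mathbb{R}^3$, since the formulas differ only in the number of squared terms under the square root.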

Example 1.3. Calculate the following:

(a) The magnitude of the vector $\overrightarrow{PQ}$ in $\mathbb{R}^2$ with $P = (-1, 2)$ and $Q = (5, 5)$.
Solution: By formula (1.2), $\lVert \overrightarrow{PQ} \rVert = \sqrt{(5 - (-1))^2 + (5 - 2)^2} = \sqrt{36 + 9} = \sqrt{45} = 3\sqrt{5}$.

(b) The magnitude of the vector $\mathbf{v} = (8, 3)$ in $\mathbb{R}^2$.
Solution: By formula (1.3), $\lVert \mathbf{v} \rVert = \sqrt{8^2 + 3^2} = \sqrt{73}$.

(c) The distance between the points $P = (2, -1, 4)$ and $Q = (4, 2, -3)$ in $\mathbb{R}^3$.
Solution: By formula (1.4), the distance $d = \sqrt{(4 - 2)^2 + (2 - (-1))^2 + (-3 - 4)^2} = \sqrt{4 + 9 + 49} = \sqrt{62}$.

(d) The magnitude of the vector $\mathbf{v} = (5, 8, -2)$ in $\mathbb{R}^3$.
Solution: By formula (1.5), $\lVert \mathbf{v} \rVert = \sqrt{5^2 + 8^2 + (-2)^2} = \sqrt{25 + 64 + 4} = \sqrt{93}$.

Exercises

A

1. Calculate the magnitudes of the following vectors:
(a) $\mathbf{v} = (2, -1)$ (b) $\mathbf{v} = (2, -1, 0)$ (c) $\mathbf{v} = (3, 2, -2)$ (d) $\mathbf{v} = (0, 0, 1)$ (e) $\mathbf{v} = (6, 4, -4)$

2. For the points $P = (1, -1, 1)$, $Q = (2, -2, 2)$, $R = (2, 0, 1)$, $S = (3, -1, 2)$, does $\overrightarrow{PQ} = \overrightarrow{RS}$?

3. For the points $P = (0, 0, 0)$, $Q = (1, 3, 2)$, $R = (1, 0, 1)$, $S = (2, 3, 4)$, does $\overrightarrow{PQ} = \overrightarrow{RS}$?

B

4. Let $\mathbf{v} = (1, 0, 0)$ and $\mathbf{w} = (a, 0, 0)$ be vectors in $\mathbb{R}^3$. Show that $\lVert \mathbf{w} \rVert = |a| \, \lVert \mathbf{v} \rVert$.

5. Let $\mathbf{v} = (a, b, c)$ and $\mathbf{w} = (3a, 3b, 3c)$ be vectors in $\mathbb{R}^3$. Show that $\lVert \mathbf{w} \rVert = 3 \lVert \mathbf{v} \rVert$.

C

6. Though we will see a simple proof of Theorem 1.1 in the next section, it is possible to prove it using methods similar to those in the proof of Theorem 1.2. Prove the special case of Theorem 1.1 where the points $P = (x_1, y_1, z_1)$ and $Q = (x_2, y_2, z_2)$ satisfy the following conditions: $x_2 > x_1 > 0$, $y_2 > y_1 > 0$, and $z_2 > z_1 > 0$. (Hint: Think of Case 4 in the proof of Theorem 1.2, and consider Figure 1.1.9.)

Figure 1.1.9 (the points $P(x_1, y_1, z_1)$, $Q(x_2, y_2, z_2)$, $R(x_2, y_2, z_1)$, $S(x_1, y_1, 0)$, $T(x_2, y_2, 0)$ and $U(x_2, y_1, 0)$)

g.2. Before doing that.2 Vector Algebra 9 1. Deﬁnition 1.1 −v −2v Recall that translating a nonzero vector means that the initial point of the vector is changed but the magnitude and direction are preserved. Deﬁnition 1. is obtained by translating w so that its initial point is at the terminal point of v.1). and as ﬂipping the vector in the opposite direction if the scalar is a negative number (see Figure 1. physicist and astronomer William Rowan Hamilton. We are now ready to deﬁne the sum of two vectors. points in the same direction as v if k > 0. we will introduce the notion of a scalar. See M ARION for details. points in the opposite direction as v if k < 0. . and its terminal point is the new terminal point of w. is the vector whose magnitude is | k | v .g. electric charge.2 Vector Algebra Now that we know what vectors are. You can think of scalar multiplication of a vector as stretching or shrinking the vector. The word vector comes from Latin.1. The sum of vectors v and w. we can start to perform some of the usual algebraic operations on them (e. scalars will always be real numbers.5v Figure 1. A scalar is a quantity that can be represented by a single number. to convey the sense of something that could be represented by a point on a scale or graduated ruler. is that under certain types of coordinate transformations (e. and speed (not velocity). rotations). and is the zero vector 0 if k = 0.4. subtraction). where it means “carrier”. the initial point of v + w is the initial point of v. v 2v 3v 0. used in physics. For a scalar k and a nonzero vector v. addition. the scalar multiple of v by k. we deﬁne k0 = 0 for any scalar k. Two vectors v and w are parallel (denoted by v ∥ w) if one is a scalar multiple of the other. 3 The term scalar was invented by 19th century Irish mathematician. Deﬁnition 1.4 We can now deﬁne scalar multiplication of a vector. a quantity that is not affected is a scalar.2. 
while a quantity that is affected (in a certain way) is a vector.3. For the zero vector 0. For our purposes. denoted by v + w. 4 An alternate deﬁnition of scalars and vectors.5. denoted by kv.3 Examples of scalar quantities are mass.

Intuitively, adding $\mathbf{w}$ to $\mathbf{v}$ means tacking on $\mathbf{w}$ to the end of $\mathbf{v}$ (see Figure 1.2.2).

Figure 1.2.2 Adding vectors v and w: (a) Vectors v and w, (b) Translate w to the end of v, (c) The sum v + w

Notice that our definition is valid for the zero vector (which is just a point, and hence can be translated), and so we see that $\mathbf{v} + \mathbf{0} = \mathbf{v} = \mathbf{0} + \mathbf{v}$ for any vector $\mathbf{v}$. In particular, $\mathbf{0} + \mathbf{0} = \mathbf{0}$. Also, it is easy to see that $\mathbf{v} + (-\mathbf{v}) = \mathbf{0}$, as we would expect. In general, since the scalar multiple $-\mathbf{v} = -1\,\mathbf{v}$ is a well-defined vector, we can define vector subtraction as follows: $\mathbf{v} - \mathbf{w} = \mathbf{v} + (-\mathbf{w})$. See Figure 1.2.3.

Figure 1.2.3 Subtracting vectors v and w: (a) Vectors v and w, (b) Translate −w to the end of v, (c) The difference v − w

Figure 1.2.4 shows the use of “geometric proofs” of various laws of vector algebra, that is, it uses laws from elementary geometry to prove statements about vectors. For example, (a) shows that $\mathbf{v} + \mathbf{w} = \mathbf{w} + \mathbf{v}$ for any vectors $\mathbf{v}$, $\mathbf{w}$. And (c) shows how you can think of $\mathbf{v} - \mathbf{w}$ as the vector that is tacked on to the end of $\mathbf{w}$ to add up to $\mathbf{v}$.

Figure 1.2.4 “Geometric” vector algebra: (a) Add vectors, (b) Subtract vectors, (c) Combined add/subtract

Notice that we have temporarily abandoned the practice of starting vectors at the origin. In fact, we have not even mentioned coordinates in this section so far. Since we will deal mostly with Cartesian coordinates in this book, the following two theorems are useful for performing vector algebra on vectors in $\mathbb{R}^2$ and $\mathbb{R}^3$ starting at the origin.

Theorem 1.3. Let $\mathbf{v} = (v_1, v_2)$, $\mathbf{w} = (w_1, w_2)$ be vectors in $\mathbb{R}^2$, and let $k$ be a scalar. Then
(a) $k\mathbf{v} = (kv_1, kv_2)$
(b) $\mathbf{v} + \mathbf{w} = (v_1 + w_1, v_2 + w_2)$

Proof: (a) Without loss of generality, we assume that $v_1, v_2 > 0$ (the other possibilities are handled in a similar manner). If $k = 0$ then $k\mathbf{v} = 0\mathbf{v} = \mathbf{0} = (0, 0) = (0v_1, 0v_2) = (kv_1, kv_2)$, which is what we needed to show. If $k \ne 0$, then $(kv_1, kv_2)$ lies on a line with slope $\frac{kv_2}{kv_1} = \frac{v_2}{v_1}$, which is the same as the slope of the line on which $\mathbf{v}$ (and hence $k\mathbf{v}$) lies, and $(kv_1, kv_2)$ points in the same direction on that line as $k\mathbf{v}$. Also, by formula (1.3) the magnitude of $(kv_1, kv_2)$ is
\[ \sqrt{(kv_1)^2 + (kv_2)^2} = \sqrt{k^2 v_1^2 + k^2 v_2^2} = \sqrt{k^2 (v_1^2 + v_2^2)} = |k| \sqrt{v_1^2 + v_2^2} = |k| \, \lVert \mathbf{v} \rVert . \]
So $k\mathbf{v}$ and $(kv_1, kv_2)$ have the same magnitude and direction. This proves (a).

(b) Without loss of generality, we assume that $v_1, v_2, w_1, w_2 > 0$ (the other possibilities are handled in a similar manner). From Figure 1.2.5, we see that when translating $\mathbf{w}$ to start at the end of $\mathbf{v}$, the new terminal point of $\mathbf{w}$ is $(v_1 + w_1, v_2 + w_2)$, so by the definition of $\mathbf{v} + \mathbf{w}$ this must be the terminal point of $\mathbf{v} + \mathbf{w}$. This proves (b). QED

[Figure 1.2.5: v, w translated to start at the end of v, and the sum v + w in the xy-plane, with terminal point (v1 + w1, v2 + w2)]

Theorem 1.4. Let v = (v1 , v2 , v3 ), w = (w1 , w2 , w3 ) be vectors in R3 , let k be a scalar. Then (a) kv = ( kv1 , kv2 , kv3 ) (b) v + w = (v1 + w1 , v2 + w2 , v3 + w3 )

The following theorem summarizes the basic laws of vector algebra.

Theorem 1.5. For any vectors u, v, w, and scalars k, l, we have
(a) v + w = w + v    Commutative Law
(b) u + (v + w) = (u + v) + w    Associative Law
(c) v + 0 = v = 0 + v    Additive Identity
(d) v + (−v) = 0    Additive Inverse
(e) k(lv) = (kl)v    Associative Law
(f) k(v + w) = kv + kw    Distributive Law
(g) (k + l)v = kv + lv    Distributive Law

Proof: (a) We already presented a geometric proof of this in Figure 1.2.4(a). (b) To illustrate the difference between analytic proofs and geometric proofs in vector algebra, we will present both types here. For the analytic proof, we will use vectors in R3 (the proof for R2 is similar).


Let u = (u1, u2, u3), v = (v1, v2, v3), w = (w1, w2, w3) be vectors in R3. Then

$$\begin{aligned}
\mathbf{u} + (\mathbf{v} + \mathbf{w}) &= (u_1, u_2, u_3) + ((v_1, v_2, v_3) + (w_1, w_2, w_3)) \\
&= (u_1, u_2, u_3) + (v_1 + w_1, v_2 + w_2, v_3 + w_3) && \text{by Theorem 1.4(b)} \\
&= (u_1 + (v_1 + w_1), u_2 + (v_2 + w_2), u_3 + (v_3 + w_3)) && \text{by Theorem 1.4(b)} \\
&= ((u_1 + v_1) + w_1, (u_2 + v_2) + w_2, (u_3 + v_3) + w_3) && \text{by properties of real numbers} \\
&= (u_1 + v_1, u_2 + v_2, u_3 + v_3) + (w_1, w_2, w_3) && \text{by Theorem 1.4(b)} \\
&= (\mathbf{u} + \mathbf{v}) + \mathbf{w}
\end{aligned}$$

This completes the analytic proof of (b). Figure 1.2.6 provides the geometric proof.

[Figure 1.2.6: Associative Law for vector addition, u + (v + w) = (u + v) + w]

(c) We already discussed this on p.10. (d) We already discussed this on p.10. (e) We will prove this for a vector v = (v1 , v2 , v3 ) in R3 (the proof for R2 is similar):

$$\begin{aligned}
k(l\mathbf{v}) &= k(lv_1, lv_2, lv_3) && \text{by Theorem 1.4(a)} \\
&= (klv_1, klv_2, klv_3) && \text{by Theorem 1.4(a)} \\
&= (kl)(v_1, v_2, v_3) && \text{by Theorem 1.4(a)} \\
&= (kl)\mathbf{v}
\end{aligned}$$

(f) and (g): Left as exercises for the reader.

QED
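Parts (f) and (g) are left as exercises, but the laws in Theorem 1.5 can at least be spot-checked numerically before attempting a proof. The following Python sketch is only an illustration and not a proof: the helper names `add` and `scale` are made up for this example, they apply Theorem 1.4 componentwise to tuples, and integer components are used so that the comparisons are exact.

```python
def add(v, w):
    """Componentwise sum of two vectors, as in Theorem 1.4(b)."""
    return tuple(a + b for a, b in zip(v, w))


def scale(k, v):
    """Scalar multiple of a vector, as in Theorem 1.4(a)."""
    return tuple(k * a for a in v)


# Arbitrary sample vectors and scalars (integers, so equality is exact).
u, v, w = (1, -2, 3), (2, 1, -1), (3, -4, 2)
k, l = 3, -5

assert add(v, w) == add(w, v)                                # (a)
assert add(u, add(v, w)) == add(add(u, v), w)                # (b)
assert scale(k, scale(l, v)) == scale(k * l, v)              # (e)
assert scale(k, add(v, w)) == add(scale(k, v), scale(k, w))  # (f)
assert scale(k + l, v) == add(scale(k, v), scale(l, v))      # (g)
```

A passing check for one choice of vectors says nothing about all vectors, which is why the analytic proofs above are still needed.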

A unit vector is a vector with magnitude 1. Notice that for any nonzero vector v, the vector $\frac{\mathbf{v}}{\|\mathbf{v}\|}$ is a unit vector which points in the same direction as v, since $\frac{1}{\|\mathbf{v}\|} > 0$ and $\left\| \frac{\mathbf{v}}{\|\mathbf{v}\|} \right\| = \frac{\|\mathbf{v}\|}{\|\mathbf{v}\|} = 1$. Dividing a nonzero vector v by $\|\mathbf{v}\|$ is often called normalizing v.

There are specific unit vectors which we will often use, called the basis vectors: i = (1, 0, 0), j = (0, 1, 0), and k = (0, 0, 1) in R3; i = (1, 0) and j = (0, 1) in R2. These are useful for several reasons: they are mutually perpendicular, since they lie on distinct coordinate axes; they are all unit vectors: $\|\mathbf{i}\| = \|\mathbf{j}\| = \|\mathbf{k}\| = 1$; every vector can be written as a unique scalar combination of the basis vectors: v = (a, b) = a i + b j in R2, v = (a, b, c) = a i + b j + c k in R3. See Figure 1.2.7.

[Figure 1.2.7: Basis vectors in different dimensions. (a) R2; (b) v = a i + b j; (c) R3; (d) v = a i + b j + c k]

When a vector v = (a, b, c) is written as v = a i + b j + c k, we say that v is in component form, and that a, b, and c are the i, j, and k components, respectively, of v. We have:

$$\mathbf{v} = v_1 \mathbf{i} + v_2 \mathbf{j} + v_3 \mathbf{k},\ \mathbf{w} = w_1 \mathbf{i} + w_2 \mathbf{j} + w_3 \mathbf{k} \implies \mathbf{v} + \mathbf{w} = (v_1 + w_1)\mathbf{i} + (v_2 + w_2)\mathbf{j} + (v_3 + w_3)\mathbf{k}$$

$$\mathbf{v} = v_1 \mathbf{i} + v_2 \mathbf{j} + v_3 \mathbf{k},\ k \text{ a scalar} \implies k\mathbf{v} = kv_1 \mathbf{i} + kv_2 \mathbf{j} + kv_3 \mathbf{k}$$

$$\mathbf{v} = v_1 \mathbf{i} + v_2 \mathbf{j} + v_3 \mathbf{k} \implies \|\mathbf{v}\| = \sqrt{v_1^2 + v_2^2 + v_3^2}$$

Example 1.4. Let v = (2, 1, −1) and w = (3, −4, 2) in R3.

(a) Find v − w.
Solution: v − w = (2 − 3, 1 − (−4), −1 − 2) = (−1, 5, −3)

(b) Find 3v + 2w.
Solution: 3v + 2w = (6, 3, −3) + (6, −8, 4) = (12, −5, 1)

(c) Write v and w in component form.
Solution: v = 2 i + j − k, w = 3 i − 4 j + 2 k

(d) Find the vector u such that u + v = w.
Solution: By Theorem 1.5, u = w − v = −(v − w) = −(−1, 5, −3) = (1, −5, 3), by part (a).

(e) Find the vector u such that u + v + w = 0.
Solution: By Theorem 1.5, u = −w − v = −(3, −4, 2) − (2, 1, −1) = (−5, 3, −1).

(f) Find the vector u such that 2u + i − 2 j = k.
Solution: $2\mathbf{u} = -\mathbf{i} + 2\mathbf{j} + \mathbf{k} \implies \mathbf{u} = -\tfrac{1}{2}\mathbf{i} + \mathbf{j} + \tfrac{1}{2}\mathbf{k}$

(g) Find the unit vector $\frac{\mathbf{v}}{\|\mathbf{v}\|}$.
Solution: $\frac{\mathbf{v}}{\|\mathbf{v}\|} = \frac{1}{\sqrt{2^2 + 1^2 + (-1)^2}}\,(2, 1, -1) = \left( \frac{2}{\sqrt{6}}, \frac{1}{\sqrt{6}}, \frac{-1}{\sqrt{6}} \right)$
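Normalizing, as in part (g) of Example 1.4, is a mechanical computation, so a short Python sketch may help. The function names `magnitude` and `normalize` are assumptions made for this illustration and do not come from the text; the code just divides each component by the magnitude and refuses the zero vector, which has no direction.

```python
import math


def magnitude(v):
    """Length of a vector given as a tuple of components."""
    return math.sqrt(sum(a * a for a in v))


def normalize(v):
    """Return v / ||v||; the zero vector cannot be normalized."""
    n = magnitude(v)
    if n == 0:
        raise ValueError("the zero vector cannot be normalized")
    return tuple(a / n for a in v)


unit = normalize((2, 1, -1))   # the vector v from Example 1.4
print(unit)                    # components 2/sqrt(6), 1/sqrt(6), -1/sqrt(6)
print(magnitude(unit))         # equal to 1 up to floating-point rounding
```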


We can now easily prove Theorem 1.1 from the previous section. The distance d between two points P = (x1, y1, z1) and Q = (x2, y2, z2) in R3 is the same as the length of the vector w − v, where the vectors v and w are defined as v = (x1, y1, z1) and w = (x2, y2, z2) (see Figure 1.2.8). So since w − v = (x2 − x1, y2 − y1, z2 − z1), then

$$d = \|\mathbf{w} - \mathbf{v}\| = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2}$$

by Theorem 1.2.

[Figure 1.2.8: Proof of Theorem 1.2: $d = \|\mathbf{w} - \mathbf{v}\|$, showing the points P(x1, y1, z1) and Q(x2, y2, z2) and the vectors v, w, and w − v]
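The distance formula just derived translates directly into code. The sketch below is a minimal illustration under the assumption that points are given as tuples of coordinates; the name `distance` is chosen here for the example and is not from the text.

```python
import math


def distance(p, q):
    """Distance between points p and q, computed as the length of w - v."""
    return math.sqrt(sum((b - a) ** 2 for a, b in zip(p, q)))


# The origin and (1, 2, 2): the squared differences are 1 + 4 + 4 = 9.
print(distance((0, 0, 0), (1, 2, 2)))
```

Because the function only pairs up coordinates, the same code covers points in R2 as well as R3.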

Exercises A

1. Let v = (−1, 5, −2) and w = (3, 1, 1).
(a) Find v − w.
(b) Find v + w.
(c) Find $\frac{\mathbf{v}}{\|\mathbf{v}\|}$.
(d) Find $\left\| \tfrac{1}{2}(\mathbf{v} - \mathbf{w}) \right\|$.
(e) Find $\left\| \tfrac{1}{2}(\mathbf{v} + \mathbf{w}) \right\|$.
(f) Find −2 v + 4 w.
(g) Find v − 2 w.
(h) Find the vector u such that u + v + w = i.
(i) Find the vector u such that u + v + w = 2 j + k.
(j) Is there a scalar m such that m(v + 2 w) = k? If so, find it.

2. For the vectors v and w from Exercise 1, is $\|\mathbf{v} - \mathbf{w}\| = \|\mathbf{v}\| - \|\mathbf{w}\|$? If not, which quantity is larger?

3. For the vectors v and w from Exercise 1, is $\|\mathbf{v} + \mathbf{w}\| = \|\mathbf{v}\| + \|\mathbf{w}\|$? If not, which quantity is larger?

B

4. Prove Theorem 1.5(f) for R3 . 5. Prove Theorem 1.5(g) for R3 .

C

6. We know that every vector in R3 can be written as a scalar combination of the vectors i, j, and k. Can every vector in R3 be written as a scalar combination of just i and j, i.e. for any vector v in R3 , are there scalars m, n such that v = m i + n j? Justify your answer.

1.3 Dot Product

You may have noticed that while we did define multiplication of a vector by a scalar in the previous section on vector algebra, we did not define multiplication of a vector by a vector. We will now see one type of multiplication of vectors, called the dot product.

Definition 1.6. Let v = (v1, v2, v3) and w = (w1, w2, w3) be vectors in R3. The dot product of v and w, denoted by v · w, is given by:

$$\mathbf{v} \cdot \mathbf{w} = v_1 w_1 + v_2 w_2 + v_3 w_3 \qquad (1.6)$$

Similarly, for vectors v = (v1, v2) and w = (w1, w2) in R2, the dot product is:

$$\mathbf{v} \cdot \mathbf{w} = v_1 w_1 + v_2 w_2 \qquad (1.7)$$

Notice that the dot product of two vectors is a scalar, not a vector. So the associative law that holds for multiplication of numbers and for addition of vectors (see Theorem 1.5(b),(e)), does not hold for the dot product of vectors. Why? Because for vectors u, v, w, the dot product u · v is a scalar, and so (u · v) · w is not defined since the left side of that dot product (the part in parentheses) is a scalar and not a vector.

For vectors v = v1 i + v2 j + v3 k and w = w1 i + w2 j + w3 k in component form, the dot product is still v · w = v1 w1 + v2 w2 + v3 w3.

Also notice that we defined the dot product in an analytic way, i.e. by referencing vector coordinates. There is a geometric way of defining the dot product, which we will now develop as a consequence of the analytic definition.

Definition 1.7. The angle between two nonzero vectors with the same initial point is the smallest angle between them.

We do not define the angle between the zero vector and any other vector. Any two nonzero vectors with the same initial point have two angles between them: θ and 360° − θ. We will always choose the smallest nonnegative angle θ between them, so that 0° ≤ θ ≤ 180°. See Figure 1.3.1.

[Figure 1.3.1: Angle between vectors. (a) 0° < θ < 180°; (b) θ = 180°; (c) θ = 0°]

We can now take a more geometric view of the dot product by establishing a relationship between the dot product of two vectors and the angle between them.

Theorem 1.6. Let v, w be nonzero vectors, and let θ be the angle between them. Then

$$\cos\theta = \frac{\mathbf{v} \cdot \mathbf{w}}{\|\mathbf{v}\| \, \|\mathbf{w}\|} \qquad (1.8)$$

Proof: We will prove the theorem for vectors in R3 (the proof for R2 is similar). Let v = (v1, v2, v3) and w = (w1, w2, w3). By the Law of Cosines (see Figure 1.3.2), we have

$$\|\mathbf{v} - \mathbf{w}\|^2 = \|\mathbf{v}\|^2 + \|\mathbf{w}\|^2 - 2\|\mathbf{v}\| \, \|\mathbf{w}\| \cos\theta \qquad (1.9)$$

(note that equation (1.9) holds even for the "degenerate" cases θ = 0° and 180°).

[Figure 1.3.2: the vectors v, w, and v − w, with the angle θ between v and w]

Since v − w = (v1 − w1, v2 − w2, v3 − w3), expanding $\|\mathbf{v} - \mathbf{w}\|^2$ in equation (1.9) gives

$$\begin{aligned}
\|\mathbf{v}\|^2 + \|\mathbf{w}\|^2 - 2\|\mathbf{v}\| \, \|\mathbf{w}\| \cos\theta &= (v_1 - w_1)^2 + (v_2 - w_2)^2 + (v_3 - w_3)^2 \\
&= (v_1^2 - 2v_1 w_1 + w_1^2) + (v_2^2 - 2v_2 w_2 + w_2^2) + (v_3^2 - 2v_3 w_3 + w_3^2) \\
&= (v_1^2 + v_2^2 + v_3^2) + (w_1^2 + w_2^2 + w_3^2) - 2(v_1 w_1 + v_2 w_2 + v_3 w_3) \\
&= \|\mathbf{v}\|^2 + \|\mathbf{w}\|^2 - 2(\mathbf{v} \cdot \mathbf{w}) ,
\end{aligned}$$

so

$$-2\|\mathbf{v}\| \, \|\mathbf{w}\| \cos\theta = -2(\mathbf{v} \cdot \mathbf{w}) ,$$

so since v ≠ 0 and w ≠ 0 then

$$\cos\theta = \frac{\mathbf{v} \cdot \mathbf{w}}{\|\mathbf{v}\| \, \|\mathbf{w}\|} ,$$

since $\|\mathbf{v}\| > 0$ and $\|\mathbf{w}\| > 0$. QED

Example 1.5. Find the angle θ between the vectors v = (2, 1, −1) and w = (3, −4, 1).
Solution: Since v · w = (2)(3) + (1)(−4) + (−1)(1) = 1, $\|\mathbf{v}\| = \sqrt{6}$, and $\|\mathbf{w}\| = \sqrt{26}$, then

$$\cos\theta = \frac{\mathbf{v} \cdot \mathbf{w}}{\|\mathbf{v}\| \, \|\mathbf{w}\|} = \frac{1}{\sqrt{6}\sqrt{26}} = \frac{1}{2\sqrt{39}} \approx 0.08 \implies \theta = 85.41^\circ$$

Two nonzero vectors are perpendicular if the angle between them is 90°. Since cos 90° = 0, we have the following important corollary to Theorem 1.6:

Corollary 1.7. Two nonzero vectors v and w are perpendicular if and only if v · w = 0.

We will write v ⊥ w to indicate that v and w are perpendicular.
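The computation in Example 1.5 follows formula (1.8) step by step, and it can be sketched in a few lines of Python. The names `dot`, `magnitude`, and `angle_degrees` are assumptions for this illustration; the clamp to the interval [−1, 1] is an added safeguard against floating-point rounding and is not part of the theorem.

```python
import math


def dot(v, w):
    """Dot product: sum of the products of corresponding components."""
    return sum(a * b for a, b in zip(v, w))


def magnitude(v):
    return math.sqrt(dot(v, v))


def angle_degrees(v, w):
    """Angle between nonzero vectors v and w, using formula (1.8)."""
    nv, nw = magnitude(v), magnitude(w)
    if nv == 0 or nw == 0:
        raise ValueError("the angle with the zero vector is not defined")
    c = dot(v, w) / (nv * nw)
    c = max(-1.0, min(1.0, c))  # guard against rounding slightly outside [-1, 1]
    return math.degrees(math.acos(c))


# The vectors from Example 1.5; the result rounds to 85.41 degrees.
print(round(angle_degrees((2, 1, -1), (3, -4, 1)), 2))
```

Since `math.acos` returns a value between 0 and π, the result always lies in the range 0° to 180° required by Definition 1.7.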

Since cos θ > 0 for 0° ≤ θ < 90° and cos θ < 0 for 90° < θ ≤ 180°, we also have:

Corollary 1.8. If θ is the angle between nonzero vectors v and w, then

$$\mathbf{v} \cdot \mathbf{w} \text{ is } \begin{cases} > 0 & \text{for } 0^\circ \le \theta < 90^\circ \\ 0 & \text{for } \theta = 90^\circ \\ < 0 & \text{for } 90^\circ < \theta \le 180^\circ \end{cases}$$

By Corollary 1.8, the dot product can be thought of as a way of telling if the angle between two vectors is acute, obtuse, or a right angle, depending on whether the dot product is positive, negative, or zero, respectively. See Figure 1.3.3.

[Figure 1.3.3: Sign of the dot product & angle between vectors. (a) 0° ≤ θ < 90°, v · w > 0; (b) 90° < θ ≤ 180°, v · w < 0; (c) θ = 90°, v · w = 0]

Example 1.6. Are the vectors v = (−1, 5, −2) and w = (3, 1, 1) perpendicular?
Solution: Yes, v ⊥ w since v · w = (−1)(3) + (5)(1) + (−2)(1) = 0.

The following theorem summarizes the basic properties of the dot product.

Theorem 1.9. For any vectors u, v, w, and scalar k, we have
(a) v · w = w · v    Commutative Law
(b) (kv) · w = v · (kw) = k(v · w)    Associative Law
(c) v · 0 = 0 = 0 · v
(d) u · (v + w) = u · v + u · w    Distributive Law
(e) (u + v) · w = u · w + v · w    Distributive Law
(f) $|\mathbf{v} \cdot \mathbf{w}| \le \|\mathbf{v}\| \, \|\mathbf{w}\|$    Cauchy-Schwarz Inequality (see footnote 5)

Proof: The proofs of parts (a)-(e) are straightforward applications of the definition of the dot product, and are left to the reader as exercises. We will prove part (f).
(f) If either v = 0 or w = 0, then v · w = 0 by part (c), and so the inequality holds trivially. So assume that v and w are nonzero vectors. Then by Theorem 1.6,

$$\mathbf{v} \cdot \mathbf{w} = \cos\theta \, \|\mathbf{v}\| \, \|\mathbf{w}\| , \text{ so } |\mathbf{v} \cdot \mathbf{w}| = |\cos\theta| \, \|\mathbf{v}\| \, \|\mathbf{w}\| , \text{ so } |\mathbf{v} \cdot \mathbf{w}| \le \|\mathbf{v}\| \, \|\mathbf{w}\|$$

since |cos θ| ≤ 1. QED

Footnote 5: Also known as the Cauchy-Schwarz-Buniakovski Inequality.
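Corollary 1.8 and the Cauchy-Schwarz Inequality both lend themselves to a quick numerical check. The Python sketch below is a hedged illustration: `classify_angle` and `cauchy_schwarz_holds` are names invented here, the classification assumes nonzero vectors with integer components (an exact test for zero is unreliable with floating-point input), and the small tolerance in the inequality check is an assumption to absorb rounding.

```python
import math


def dot(v, w):
    return sum(a * b for a, b in zip(v, w))


def magnitude(v):
    return math.sqrt(dot(v, v))


def classify_angle(v, w):
    """Classify the angle between nonzero vectors by the sign of v . w."""
    d = dot(v, w)
    if d > 0:
        return "acute"    # 0 <= theta < 90 degrees
    if d < 0:
        return "obtuse"   # 90 < theta <= 180 degrees
    return "right"        # theta = 90 degrees


def cauchy_schwarz_holds(v, w, tol=1e-12):
    """Check |v . w| <= ||v|| ||w||, allowing a tiny rounding tolerance."""
    return abs(dot(v, w)) <= magnitude(v) * magnitude(w) + tol


print(classify_angle((-1, 5, -2), (3, 1, 1)))        # the vectors of Example 1.6
print(cauchy_schwarz_holds((2, 1, -1), (3, -4, 1)))
```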

Using Theorem 1.9, we see that if u · v = 0 and u · w = 0, then u · (kv + lw) = k(u · v) + l(u · w) = k(0) + l(0) = 0 for all scalars k, l. Thus, we have the following fact:

If u ⊥ v and u ⊥ w, then u ⊥ (kv + lw) for all scalars k, l.

For vectors v and w, the collection of all scalar combinations kv + lw is called the span of v and w. If nonzero vectors v and w are parallel, then their span is a line; if they are not parallel, then their span is a plane. So what we showed above is that a vector which is perpendicular to two other vectors is also perpendicular to their span.

The dot product can be used to derive properties of the magnitudes of vectors, the most important of which is the Triangle Inequality, as given in the following theorem:

Theorem 1.10. For any vectors v, w, we have
(a) $\|\mathbf{v}\|^2 = \mathbf{v} \cdot \mathbf{v}$
(b) $\|\mathbf{v} + \mathbf{w}\| \le \|\mathbf{v}\| + \|\mathbf{w}\|$    Triangle Inequality
(c) $\|\mathbf{v} - \mathbf{w}\| \ge \|\mathbf{v}\| - \|\mathbf{w}\|$

Proof: (a) Left as an exercise for the reader.
(b) By part (a) and Theorem 1.9, we have

$$\begin{aligned}
\|\mathbf{v} + \mathbf{w}\|^2 &= (\mathbf{v} + \mathbf{w}) \cdot (\mathbf{v} + \mathbf{w}) = \mathbf{v} \cdot \mathbf{v} + \mathbf{v} \cdot \mathbf{w} + \mathbf{w} \cdot \mathbf{v} + \mathbf{w} \cdot \mathbf{w} \\
&= \|\mathbf{v}\|^2 + 2(\mathbf{v} \cdot \mathbf{w}) + \|\mathbf{w}\|^2 , && \text{so since } a \le |a| \text{ for any real number } a \text{, we have} \\
&\le \|\mathbf{v}\|^2 + 2\,|\mathbf{v} \cdot \mathbf{w}| + \|\mathbf{w}\|^2 , && \text{so by Theorem 1.9(f) we have} \\
&\le \|\mathbf{v}\|^2 + 2\,\|\mathbf{v}\| \, \|\mathbf{w}\| + \|\mathbf{w}\|^2 = (\|\mathbf{v}\| + \|\mathbf{w}\|)^2
\end{aligned}$$

and so $\|\mathbf{v} + \mathbf{w}\| \le \|\mathbf{v}\| + \|\mathbf{w}\|$ after taking square roots of both sides, which proves (b).
(c) Since v = w + (v − w), then $\|\mathbf{v}\| = \|\mathbf{w} + (\mathbf{v} - \mathbf{w})\| \le \|\mathbf{w}\| + \|\mathbf{v} - \mathbf{w}\|$ by the Triangle Inequality, so subtracting $\|\mathbf{w}\|$ from both sides gives $\|\mathbf{v}\| - \|\mathbf{w}\| \le \|\mathbf{v} - \mathbf{w}\|$. QED

The Triangle Inequality gets its name from the fact that in any triangle, no one side is longer than the sum of the lengths of the other two sides (see Figure 1.3.4). Another way of saying this is with the familiar statement "the shortest distance between two points is a straight line."

[Figure 1.3.4: a triangle with sides v, w, and v + w]

Exercises

A

1. Let v = (5, 1, −2) and w = (4, −4, 3). Calculate v · w.

2. Let v = −3 i − 2 j − k and w = 6 i + 4 j + 2 k. Calculate v · w.

For Exercises 3-8, find the angle θ between the vectors v and w.

−2. −2). respectively. w from Exercise 6. 1.) v L u w Figure 1. Let v = (8. For v. −2) 8. v = (5. 10. 1. For v. 24. and γ be the angles between a nonzero vector v in R3 and the vectors i.9(a). cos γ are called the direction cosines. 23. −4. w = −3 i + 6 j + 3 k 4. the projection of v onto w (sometimes written as pro j w v) is the vector u along the same line L as w whose terminal point is obtained by dropping a perpendicular line from the terminal point of v to L (see Figure 1. −1).9(c). w = (1. Prove Theorem 1. v = i. v = − i + 2 j + k. Let α. For nonzero vectors v and w. Prove Theorem 1. Let v = (6. γ are often called the direction angles of v. Is v ⊥ w? Justify your answer. w = (8. j.5). cos β. and cos α. C 22. then v = w. w from Exercise 5. 4. w = (2. Prove or give a counterexample: If v · w = 0 for all v. Is v ⊥ w? Justify your answer. 4).) . 4) and w = (0. 3) and w = (−2. w from Exercise 5. and k. 2.3 Dot Product 3. Prove that v − w ≤ v − w for all v. 0) 7. 15. w (Hint: Consider the angle between v and w.3. verify the Triangle Inequality v + w ≤ v + w . 17. Show that cos2 α + cos2 β + cos2 γ = 1. 6. 11. v = (4.3. Prove or give a counterexample: If u · v = u · w. β. 14. Prove Theorem 1.5 26. −10). 4) 6. 3) 5. 0. verify the Cauchy-Schwarz Inequality | v · w | ≤ v 13. Prove Theorem 1. 20. w = 3 i + 2 j + 4k 19 9. β. For v. v = (7. Prove or give a counterexample: If u · v = u · w for all u. 21. −1). For v. then w = 0.9(b). 4. B Note: Consider only vectors in R3 for Exercises 15-25. 16. v = (2. 2. then v = w. verify the Triangle Inequality v + w ≤ v + w . Prove Theorem 1. w from Exercise 6. (Note: α. w . Show that |v · w| u = . w . 4). w. verify the Cauchy-Schwarz Inequality | v · w | ≤ v 12.9(e). Prove Theorem 1. w = (4.9(d). 25.10(a). 19. 18. 1. 2.1.

1.4 Cross Product

In Section 1.3 we defined the dot product, which gave a way of multiplying two vectors. The resulting product, however, was a scalar, not a vector. In this section we will define a product of two vectors that does result in another vector. This product, called the cross product, is only defined for vectors in R3. The definition may appear strange and lacking motivation, but we will see the geometric basis for it shortly.

Definition 1.8. Let v = (v1, v2, v3) and w = (w1, w2, w3) be vectors in R3. The cross product of v and w, denoted by v × w, is the vector in R3 given by:

$$\mathbf{v} \times \mathbf{w} = (v_2 w_3 - v_3 w_2,\ v_3 w_1 - v_1 w_3,\ v_1 w_2 - v_2 w_1) \qquad (1.10)$$

Example 1.7. Find i × j.
Solution: Since i = (1, 0, 0) and j = (0, 1, 0), then

i × j = ((0)(0) − (0)(1), (0)(0) − (1)(0), (1)(1) − (0)(0)) = (0, 0, 1) = k

Similarly it can be shown that j × k = i and k × i = j.

[Figure 1.4.1: the basis vectors i, j, and k = i × j]

In the above example, the cross product of the given vectors was perpendicular to both those vectors. It turns out that this will always be the case.

Theorem 1.11. If the cross product v × w of two nonzero vectors v and w is also a nonzero vector, then it is perpendicular to both v and w.

Proof: We will show that (v × w) · v = 0:

$$\begin{aligned}
(\mathbf{v} \times \mathbf{w}) \cdot \mathbf{v} &= (v_2 w_3 - v_3 w_2,\ v_3 w_1 - v_1 w_3,\ v_1 w_2 - v_2 w_1) \cdot (v_1, v_2, v_3) \\
&= v_2 w_3 v_1 - v_3 w_2 v_1 + v_3 w_1 v_2 - v_1 w_3 v_2 + v_1 w_2 v_3 - v_2 w_1 v_3 \\
&= v_1 v_2 w_3 - v_1 v_2 w_3 + w_1 v_2 v_3 - w_1 v_2 v_3 + v_1 w_2 v_3 - v_1 w_2 v_3 \\
&= 0 , \text{ after rearranging the terms.}
\end{aligned}$$

∴ v × w ⊥ v by Corollary 1.7.
The proof that v × w ⊥ w is similar. QED

As a consequence of the above theorem and Theorem 1.9, we have the following:

Corollary 1.12. If the cross product v × w of two nonzero vectors v and w is also a nonzero vector, then it is perpendicular to the span of v and w.
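Formula (1.10) is easy to mistype, so it can help to write it once as code and check it against Example 1.7 and Theorem 1.11. The sketch below is an illustration only; the function names `cross` and `dot` are assumptions, and integer components are used so that the perpendicularity checks are exact.

```python
def cross(v, w):
    """Cross product of two vectors in R^3, following formula (1.10)."""
    v1, v2, v3 = v
    w1, w2, w3 = w
    return (v2 * w3 - v3 * w2, v3 * w1 - v1 * w3, v1 * w2 - v2 * w1)


def dot(v, w):
    return sum(a * b for a, b in zip(v, w))


i, j, k = (1, 0, 0), (0, 1, 0), (0, 0, 1)
assert cross(i, j) == k   # Example 1.7
assert cross(j, k) == i
assert cross(k, i) == j

# Theorem 1.11: v x w is perpendicular to both v and w.
v, w = (4, -1, 3), (1, 0, 2)
n = cross(v, w)
assert dot(n, v) == 0 and dot(n, w) == 0
```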

The span of any two nonzero, nonparallel vectors v, w in R3 is a plane P, so the above corollary shows that v × w is perpendicular to that plane. As shown in Figure 1.4.2, there are two possible directions for v × w, one the opposite of the other. It turns out (see Appendix B) that the direction of v × w is given by the right-hand rule, that is, the vectors v, w, v × w form a right-handed system. Recall from Section 1.1 that this means that you can point your thumb upwards in the direction of v × w while rotating v towards w with the remaining four fingers.

[Figure 1.4.2: Direction of v × w, showing v × w and −v × w on opposite sides of the plane P containing v and w]

We will now derive a formula for the magnitude of v × w, for nonzero vectors v, w:

$$\begin{aligned}
\|\mathbf{v} \times \mathbf{w}\|^2 &= (v_2 w_3 - v_3 w_2)^2 + (v_3 w_1 - v_1 w_3)^2 + (v_1 w_2 - v_2 w_1)^2 \\
&= v_2^2 w_3^2 - 2 v_2 w_2 v_3 w_3 + v_3^2 w_2^2 + v_3^2 w_1^2 - 2 v_1 w_1 v_3 w_3 + v_1^2 w_3^2 + v_1^2 w_2^2 - 2 v_1 w_1 v_2 w_2 + v_2^2 w_1^2 \\
&= v_1^2 (w_2^2 + w_3^2) + v_2^2 (w_1^2 + w_3^2) + v_3^2 (w_1^2 + w_2^2) - 2 (v_1 w_1 v_2 w_2 + v_1 w_1 v_3 w_3 + v_2 w_2 v_3 w_3)
\end{aligned}$$

and now adding and subtracting $v_1^2 w_1^2$, $v_2^2 w_2^2$, and $v_3^2 w_3^2$ on the right side gives

$$\begin{aligned}
&= v_1^2 (w_1^2 + w_2^2 + w_3^2) + v_2^2 (w_1^2 + w_2^2 + w_3^2) + v_3^2 (w_1^2 + w_2^2 + w_3^2) \\
&\qquad - (v_1^2 w_1^2 + v_2^2 w_2^2 + v_3^2 w_3^2 + 2 (v_1 w_1 v_2 w_2 + v_1 w_1 v_3 w_3 + v_2 w_2 v_3 w_3)) \\
&= (v_1^2 + v_2^2 + v_3^2)(w_1^2 + w_2^2 + w_3^2) \\
&\qquad - ((v_1 w_1)^2 + (v_2 w_2)^2 + (v_3 w_3)^2 + 2 (v_1 w_1)(v_2 w_2) + 2 (v_1 w_1)(v_3 w_3) + 2 (v_2 w_2)(v_3 w_3))
\end{aligned}$$

so using $(a + b + c)^2 = a^2 + b^2 + c^2 + 2ab + 2ac + 2bc$ for the subtracted term gives

$$\begin{aligned}
&= (v_1^2 + v_2^2 + v_3^2)(w_1^2 + w_2^2 + w_3^2) - (v_1 w_1 + v_2 w_2 + v_3 w_3)^2 \\
&= \|\mathbf{v}\|^2 \|\mathbf{w}\|^2 - (\mathbf{v} \cdot \mathbf{w})^2 \\
&= \|\mathbf{v}\|^2 \|\mathbf{w}\|^2 \left( 1 - \frac{(\mathbf{v} \cdot \mathbf{w})^2}{\|\mathbf{v}\|^2 \|\mathbf{w}\|^2} \right) , \text{ since } \|\mathbf{v}\| > 0 \text{ and } \|\mathbf{w}\| > 0 \text{, so by Theorem 1.6} \\
&= \|\mathbf{v}\|^2 \|\mathbf{w}\|^2 (1 - \cos^2\theta) , \text{ where } \theta \text{ is the angle between } \mathbf{v} \text{ and } \mathbf{w} \text{, so}
\end{aligned}$$

$$\|\mathbf{v} \times \mathbf{w}\|^2 = \|\mathbf{v}\|^2 \|\mathbf{w}\|^2 \sin^2\theta ,$$

and since 0° ≤ θ ≤ 180°, then sin θ ≥ 0, so we have:

If θ is the angle between nonzero vectors v and w in R3, then

$$\|\mathbf{v} \times \mathbf{w}\| = \|\mathbf{v}\| \, \|\mathbf{w}\| \sin\theta \qquad (1.11)$$

It may seem strange to bother with the above formula, when the magnitude of the cross product can be calculated directly, like for any other vector. The formula is more useful for its applications in geometry, as in the following example.

Example 1.8. Let △PQR and PQRS be a triangle and parallelogram, respectively, as shown in Figure 1.4.3.

[Figure 1.4.3: triangle △PQR and parallelogram PQRS, with base b, height h, and angle θ at Q between the sides v = QR and w = QP]

Think of the triangle as existing in R3, and identify the sides QR and QP with vectors v and w, respectively, in R3. Let θ be the angle between v and w. The area $A_{PQR}$ of △PQR is $\frac{1}{2}bh$, where b is the base of the triangle and h is the height. So we see that

$$b = \|\mathbf{v}\| \quad \text{and} \quad h = \|\mathbf{w}\| \sin\theta$$

$$A_{PQR} = \tfrac{1}{2} \|\mathbf{v}\| \, \|\mathbf{w}\| \sin\theta = \tfrac{1}{2} \|\mathbf{v} \times \mathbf{w}\|$$

So since the area $A_{PQRS}$ of the parallelogram PQRS is twice the area of the triangle △PQR, then

$$A_{PQRS} = \|\mathbf{v}\| \, \|\mathbf{w}\| \sin\theta$$

By the discussion in Example 1.8, we have proved the following theorem:

Theorem 1.13. Area of triangles and parallelograms
(a) The area A of a triangle with adjacent sides v, w (as vectors in R3) is: $A = \tfrac{1}{2} \|\mathbf{v} \times \mathbf{w}\|$
(b) The area A of a parallelogram with adjacent sides v, w (as vectors in R3) is: $A = \|\mathbf{v} \times \mathbf{w}\|$

It may seem at first glance that since the formulas derived in Example 1.8 were for the adjacent sides QP and QR only, then the more general statements in Theorem 1.13 that the formulas hold for any adjacent sides are not justified. We would get a different formula for the area if we had picked PQ and PR as the adjacent sides, but it can be shown (see Exercise 26) that the different formulas would yield the same value, so the choice of adjacent sides indeed does not matter, and Theorem 1.13 is valid.

Theorem 1.13 makes it simpler to calculate the area of a triangle in 3-dimensional space than by using traditional geometric methods.

Example 1.9. Calculate the area of the triangle △PQR, where P = (2, 4, −7), Q = (3, 7, 18), and R = (−5, 12, 8).
Solution: Let $\mathbf{v} = \overrightarrow{PQ}$ and $\mathbf{w} = \overrightarrow{PR}$, as in Figure 1.4.4. Then v = (3, 7, 18) − (2, 4, −7) = (1, 3, 25) and w = (−5, 12, 8) − (2, 4, −7) = (−7, 8, 15), so the area A of the triangle △PQR is

$$\begin{aligned}
A = \tfrac{1}{2} \|\mathbf{v} \times \mathbf{w}\| &= \tfrac{1}{2} \|(1, 3, 25) \times (-7, 8, 15)\| \\
&= \tfrac{1}{2} \|((3)(15) - (25)(8),\ (25)(-7) - (1)(15),\ (1)(8) - (3)(-7))\| \\
&= \tfrac{1}{2} \|(-155, -190, 29)\| \\
&= \tfrac{1}{2} \sqrt{(-155)^2 + (-190)^2 + 29^2} = \tfrac{1}{2} \sqrt{60966}
\end{aligned}$$

$$A \approx 123.46$$

[Figure 1.4.4: the triangle with vertices P(2, 4, −7), Q(3, 7, 18), R(−5, 12, 8) and sides v, w]

Example 1.10. Calculate the area of the parallelogram PQRS, where P = (1, 1), Q = (2, 3), R = (5, 4), and S = (4, 2).
Solution: Let $\mathbf{v} = \overrightarrow{SP}$ and $\mathbf{w} = \overrightarrow{SR}$, as in Figure 1.4.5. Then v = (1, 1) − (4, 2) = (−3, −1) and w = (5, 4) − (4, 2) = (1, 2). But these are vectors in R2, and the cross product is only defined for vectors in R3. However, R2 can be thought of as the subset of R3 such that the z-coordinate is always 0. So we can write v = (−3, −1, 0) and w = (1, 2, 0). Then the area A of PQRS is

$$\begin{aligned}
A = \|\mathbf{v} \times \mathbf{w}\| &= \|(-3, -1, 0) \times (1, 2, 0)\| \\
&= \|((-1)(0) - (0)(2),\ (0)(1) - (-3)(0),\ (-3)(2) - (-1)(1))\| \\
&= \|(0, 0, -5)\|
\end{aligned}$$

$$A = 5$$

[Figure 1.4.5: the parallelogram PQRS in the xy-plane with sides v and w at S]
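Examples 1.9 and 1.10 both reduce to one cross product and one magnitude, so Theorem 1.13 can be sketched as two small Python functions. The names below (`sub`, `cross`, `magnitude`, `triangle_area`, `parallelogram_area`) are assumptions for this illustration; vectors in R2 are embedded in R3 by appending a zero z-coordinate by hand, exactly as in Example 1.10.

```python
import math


def sub(p, q):
    """Vector from point q to point p."""
    return tuple(a - b for a, b in zip(p, q))


def cross(v, w):
    v1, v2, v3 = v
    w1, w2, w3 = w
    return (v2 * w3 - v3 * w2, v3 * w1 - v1 * w3, v1 * w2 - v2 * w1)


def magnitude(v):
    return math.sqrt(sum(a * a for a in v))


def triangle_area(p, q, r):
    """Theorem 1.13(a), using adjacent sides PQ and PR."""
    return 0.5 * magnitude(cross(sub(q, p), sub(r, p)))


def parallelogram_area(v, w):
    """Theorem 1.13(b), for adjacent sides v and w in R^3."""
    return magnitude(cross(v, w))


# Example 1.9: the area rounds to 123.46.
print(round(triangle_area((2, 4, -7), (3, 7, 18), (-5, 12, 8)), 2))
# Example 1.10, with the R^2 vectors embedded in R^3.
print(parallelogram_area((-3, -1, 0), (1, 2, 0)))
```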

The following theorem summarizes the basic properties of the cross product.

Theorem 1.14. For any vectors u, v, w in R3, and scalar k, we have
(a) v × w = −w × v    Anticommutative Law
(b) u × (v + w) = u × v + u × w    Distributive Law
(c) (u + v) × w = u × w + v × w    Distributive Law
(d) (kv) × w = v × (kw) = k(v × w)    Associative Law
(e) v × 0 = 0 = 0 × v
(f) v × v = 0
(g) v × w = 0 if and only if v ∥ w

Proof: The proofs of properties (b)-(f) are straightforward. We will prove parts (a) and (g) and leave the rest to the reader as exercises.
(a) By the definition of the cross product and scalar multiplication, we have:

$$\begin{aligned}
\mathbf{v} \times \mathbf{w} &= (v_2 w_3 - v_3 w_2,\ v_3 w_1 - v_1 w_3,\ v_1 w_2 - v_2 w_1) \\
&= -(v_3 w_2 - v_2 w_3,\ v_1 w_3 - v_3 w_1,\ v_2 w_1 - v_1 w_2) \\
&= -(w_2 v_3 - w_3 v_2,\ w_3 v_1 - w_1 v_3,\ w_1 v_2 - w_2 v_1) \\
&= -\mathbf{w} \times \mathbf{v}
\end{aligned}$$

Note that this says that v × w and w × v have the same magnitude but opposite direction (see Figure 1.4.6).

[Figure 1.4.6: v × w and w × v pointing in opposite directions]

(g) If either v or w is 0 then v × w = 0 by part (e), and either v = 0 = 0w or w = 0 = 0v, so v and w are scalar multiples, i.e. they are parallel.
If both v and w are nonzero, and θ is the angle between them, then by formula (1.11), v × w = 0 if and only if $\|\mathbf{v}\| \, \|\mathbf{w}\| \sin\theta = 0$, which is true if and only if sin θ = 0 (since $\|\mathbf{v}\| > 0$ and $\|\mathbf{w}\| > 0$). So since 0° ≤ θ ≤ 180°, then sin θ = 0 if and only if θ = 0° or 180°. But the angle between v and w is 0° or 180° if and only if v ∥ w. QED

Example 1.11. Adding to Example 1.7, we have

i × j = k    j × k = i    k × i = j
j × i = −k    k × j = −i    i × k = −j
i × i = j × j = k × k = 0

Recall from geometry that a parallelepiped is a 3-dimensional solid with 6 faces, all of which are parallelograms (see footnote 6).

Footnote 6: An equivalent definition of a parallelepiped is: the collection of all scalar combinations k1 v1 + k2 v2 + k3 v3 of some vectors v1, v2, v3 in R3, where 0 ≤ k1, k2, k3 ≤ 1.

Example 1.12. Volume of a parallelepiped: Let the vectors u, v, w in R3 represent adjacent sides of a parallelepiped P, with u, v, w forming a right-handed system, as in Figure 1.4.7. Show that the volume of P is the scalar triple product u · (v × w).
Solution: Recall that the volume vol(P) of a parallelepiped P is the area A of the base parallelogram times the height h. By Theorem 1.13(b), the area A of the base parallelogram is $\|\mathbf{v} \times \mathbf{w}\|$. And we can see that since v × w is perpendicular to the base parallelogram determined by v and w, then the height h is $\|\mathbf{u}\| \cos\theta$, where θ is the angle between u and v × w. By Theorem 1.6 we know that

$$\cos\theta = \frac{\mathbf{u} \cdot (\mathbf{v} \times \mathbf{w})}{\|\mathbf{u}\| \, \|\mathbf{v} \times \mathbf{w}\|} .$$

Hence,

$$\text{vol}(P) = A\,h = \|\mathbf{v} \times \mathbf{w}\| \, \frac{\|\mathbf{u}\| \; \mathbf{u} \cdot (\mathbf{v} \times \mathbf{w})}{\|\mathbf{u}\| \, \|\mathbf{v} \times \mathbf{w}\|} = \mathbf{u} \cdot (\mathbf{v} \times \mathbf{w})$$

[Figure 1.4.7: Parallelepiped P, with sides u, v, w, the vector v × w, the height h, and the angle θ between u and v × w]

In Example 1.12 the height h of the parallelepiped is $\|\mathbf{u}\| \cos\theta$, and not $-\|\mathbf{u}\| \cos\theta$, because the vector u is on the same side of the base parallelogram's plane as the vector v × w (so that cos θ > 0). Since the volume is the same no matter which base and height we use, then repeating the same steps using the base determined by u and v (since w is on the same side of that base's plane as u × v), the volume is w · (u × v). Repeating this with the base determined by w and u, we have the following result:

For any vectors u, v, w in R3,

$$\mathbf{u} \cdot (\mathbf{v} \times \mathbf{w}) = \mathbf{w} \cdot (\mathbf{u} \times \mathbf{v}) = \mathbf{v} \cdot (\mathbf{w} \times \mathbf{u}) \qquad (1.12)$$

(Note that the equalities hold trivially if any of the vectors are 0.)

Since v × w = −w × v for any vectors v, w in R3, then picking the wrong order for the three adjacent sides in the scalar triple product in formula (1.12) will give you the negative of the volume of the parallelepiped. So taking the absolute value of the scalar triple product for any order of the three adjacent sides will always give the volume:

Theorem 1.15. If vectors u, v, w in R3 represent any three adjacent sides of a parallelepiped, then the volume of the parallelepiped is | u · (v × w) |.

Another type of triple product is the vector triple product u × (v × w). The proof of the following theorem is left as an exercise for the reader:

Theorem 1.16. For any vectors u, v, w in R3,

$$\mathbf{u} \times (\mathbf{v} \times \mathbf{w}) = (\mathbf{u} \cdot \mathbf{w})\mathbf{v} - (\mathbf{u} \cdot \mathbf{v})\mathbf{w} \qquad (1.13)$$

An examination of the formula in Theorem 1.16 gives some idea of the geometry of the vector triple product. By the right side of formula (1.13), we see that u × (v × w) is a scalar combination of v and w, and hence lies in the plane containing v and w (i.e. u × (v × w), v and w are coplanar). This makes sense since, by Theorem 1.11, u × (v × w) is perpendicular to both u and v × w. In particular, being perpendicular to v × w means that u × (v × w) lies in the plane containing v and w, since that plane is itself perpendicular to v × w. But then how is u × (v × w) also perpendicular to u, which could be any vector? The following example may help to see how this works.

Example 1.13. Find u × (v × w) for u = (1, 2, 4), v = (2, 2, 0), w = (1, 3, 0).
Solution: Since u · v = 6 and u · w = 7, then

u × (v × w) = (u · w)v − (u · v)w
= 7 (2, 2, 0) − 6 (1, 3, 0) = (14, 14, 0) − (6, 18, 0)
= (8, −4, 0)

Note that v and w lie in the xy-plane, and that u × (v × w) also lies in that plane. Also, u × (v × w) is perpendicular to both u and v × w = (0, 0, 4) (see Figure 1.4.8).

[Figure 1.4.8: the vectors u, v, w, v × w, and u × (v × w)]

For vectors v = v1 i + v2 j + v3 k and w = w1 i + w2 j + w3 k in component form, the cross product is written as: v × w = (v2 w3 − v3 w2)i + (v3 w1 − v1 w3)j + (v1 w2 − v2 w1)k. It is often easier to use the component form for the cross product, because it can be represented as a determinant. We will not go too deeply into the theory of determinants (see footnote 7); we will just cover what is essential for our purposes.

Footnote 7: See ANTON and RORRES for a fuller development.
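The identity in Theorem 1.16 can be checked on the numbers of Example 1.13 by computing both sides independently. The Python sketch below is an illustration rather than a proof, and the helper names (`cross`, `dot`, `scale`, `sub`) are assumptions introduced for the example.

```python
def cross(v, w):
    v1, v2, v3 = v
    w1, w2, w3 = w
    return (v2 * w3 - v3 * w2, v3 * w1 - v1 * w3, v1 * w2 - v2 * w1)


def dot(v, w):
    return sum(a * b for a, b in zip(v, w))


def scale(k, v):
    return tuple(k * a for a in v)


def sub(v, w):
    return tuple(a - b for a, b in zip(v, w))


u, v, w = (1, 2, 4), (2, 2, 0), (1, 3, 0)     # Example 1.13

left = cross(u, cross(v, w))                        # u x (v x w), computed directly
right = sub(scale(dot(u, w), v), scale(dot(u, v), w))  # (u . w)v - (u . v)w

assert left == right == (8, -4, 0)
# The result is perpendicular to both u and v x w, as Theorem 1.11 requires.
assert dot(left, u) == 0 and dot(left, cross(v, w)) == 0
```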

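A 2 × 2 matrix is an array of two rows and two columns of scalars, written as

$$\begin{bmatrix} a & b \\ c & d \end{bmatrix} \quad \text{or} \quad \begin{pmatrix} a & b \\ c & d \end{pmatrix}$$

where a, b, c, d are scalars. The determinant of such a matrix, written as

$$\begin{vmatrix} a & b \\ c & d \end{vmatrix} \quad \text{or} \quad \det \begin{bmatrix} a & b \\ c & d \end{bmatrix} ,$$

is the scalar defined by the following formula:

$$\begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc$$

It may help to remember this formula as being the product of the scalars on the downward diagonal minus the product of the scalars on the upward diagonal.

Example 1.14.

$$\begin{vmatrix} 1 & 2 \\ 3 & 4 \end{vmatrix} = (1)(4) - (2)(3) = 4 - 6 = -2$$

A 3 × 3 matrix is an array of three rows and three columns of scalars, written as

$$\begin{bmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{bmatrix} \quad \text{or} \quad \begin{pmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{pmatrix} ,$$

and its determinant is given by the formula:

$$\begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix} = a_1 \begin{vmatrix} b_2 & b_3 \\ c_2 & c_3 \end{vmatrix} - a_2 \begin{vmatrix} b_1 & b_3 \\ c_1 & c_3 \end{vmatrix} + a_3 \begin{vmatrix} b_1 & b_2 \\ c_1 & c_2 \end{vmatrix} \qquad (1.14)$$

One way to remember the above formula is the following: multiply each scalar in the first row by the determinant of the 2 × 2 matrix that remains after removing the row and column that contain that scalar, then sum those products up, putting alternating plus and minus signs in front of each (starting with a plus).

Example 1.15.

$$\begin{vmatrix} 1 & 0 & 2 \\ 4 & -1 & 3 \\ 1 & 0 & 2 \end{vmatrix} = 1 \begin{vmatrix} -1 & 3 \\ 0 & 2 \end{vmatrix} - 0 \begin{vmatrix} 4 & 3 \\ 1 & 2 \end{vmatrix} + 2 \begin{vmatrix} 4 & -1 \\ 1 & 0 \end{vmatrix} = 1(-2 - 0) - 0(8 - 3) + 2(0 + 1) = 0$$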
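The first-row expansion in formula (1.14) can be written out as code almost word for word. The sketch below is a minimal illustration, assuming a matrix is represented as a tuple of row tuples; the names `det2` and `det3` are made up for this example.

```python
def det2(m):
    """Determinant of a 2 x 2 matrix: downward diagonal minus upward diagonal."""
    (a, b), (c, d) = m
    return a * d - b * c


def det3(m):
    """Determinant of a 3 x 3 matrix by expansion along the first row (1.14)."""
    (a1, a2, a3), (b1, b2, b3), (c1, c2, c3) = m
    return (a1 * det2(((b2, b3), (c2, c3)))
            - a2 * det2(((b1, b3), (c1, c3)))
            + a3 * det2(((b1, b2), (c1, c2))))


assert det2(((1, 2), (3, 4))) == -2                      # Example 1.14
assert det3(((1, 0, 2), (4, -1, 3), (1, 0, 2))) == 0     # Example 1.15
```

In Example 1.15 the first and third rows are identical, which is one way to see in advance that the determinant must be zero.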

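We defined the determinant as a scalar, derived from algebraic operations on scalar entries in a matrix. However, if we put three vectors in the first row of a 3 × 3 matrix, then the definition still makes sense, since we would be performing scalar multiplication on those three vectors (they would be multiplied by the 2 × 2 scalar determinants as before). This gives us a determinant that is now a vector, and lets us write the cross product of v = v1 i + v2 j + v3 k and w = w1 i + w2 j + w3 k as a determinant:

$$\begin{aligned}
\mathbf{v} \times \mathbf{w} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ v_1 & v_2 & v_3 \\ w_1 & w_2 & w_3 \end{vmatrix} &= \begin{vmatrix} v_2 & v_3 \\ w_2 & w_3 \end{vmatrix} \mathbf{i} - \begin{vmatrix} v_1 & v_3 \\ w_1 & w_3 \end{vmatrix} \mathbf{j} + \begin{vmatrix} v_1 & v_2 \\ w_1 & w_2 \end{vmatrix} \mathbf{k} \\
&= (v_2 w_3 - v_3 w_2)\mathbf{i} + (v_3 w_1 - v_1 w_3)\mathbf{j} + (v_1 w_2 - v_2 w_1)\mathbf{k}
\end{aligned}$$

Example 1.16. Let v = 4 i − j + 3 k and w = i + 2 k. Then

$$\mathbf{v} \times \mathbf{w} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ 4 & -1 & 3 \\ 1 & 0 & 2 \end{vmatrix} = \begin{vmatrix} -1 & 3 \\ 0 & 2 \end{vmatrix} \mathbf{i} - \begin{vmatrix} 4 & 3 \\ 1 & 2 \end{vmatrix} \mathbf{j} + \begin{vmatrix} 4 & -1 \\ 1 & 0 \end{vmatrix} \mathbf{k} = -2\,\mathbf{i} - 5\,\mathbf{j} + \mathbf{k}$$

The scalar triple product can also be written as a determinant. In fact, by Example 1.12, the following theorem provides an alternate definition of the determinant of a 3 × 3 matrix as the volume of a parallelepiped whose adjacent sides are the rows of the matrix and form a right-handed system (a left-handed system would give the negative volume).

Theorem 1.17. For any vectors u = (u1, u2, u3), v = (v1, v2, v3), w = (w1, w2, w3) in R3:

$$\mathbf{u} \cdot (\mathbf{v} \times \mathbf{w}) = \begin{vmatrix} u_1 & u_2 & u_3 \\ v_1 & v_2 & v_3 \\ w_1 & w_2 & w_3 \end{vmatrix} \qquad (1.15)$$

Example 1.17. Find the volume of the parallelepiped with adjacent sides u = (2, 1, 3), v = (−1, 3, 2), w = (1, 1, −2) (see Figure 1.4.9).
Solution: By Theorem 1.15, the volume vol(P) of the parallelepiped P is the absolute value of the scalar triple product of the three adjacent sides (in any order). By Theorem 1.17,

$$\begin{aligned}
\mathbf{u} \cdot (\mathbf{v} \times \mathbf{w}) = \begin{vmatrix} 2 & 1 & 3 \\ -1 & 3 & 2 \\ 1 & 1 & -2 \end{vmatrix} &= 2 \begin{vmatrix} 3 & 2 \\ 1 & -2 \end{vmatrix} - 1 \begin{vmatrix} -1 & 2 \\ 1 & -2 \end{vmatrix} + 3 \begin{vmatrix} -1 & 3 \\ 1 & 1 \end{vmatrix} \\
&= 2(-8) - 1(0) + 3(-4) = -28 ,
\end{aligned}$$

so vol(P) = | −28 | = 28.

[Figure 1.4.9: the parallelepiped P with adjacent sides u, v, w]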
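Example 1.17 and formula (1.12) can be reproduced with the cross and dot products alone. The Python sketch below is a hedged illustration; `scalar_triple` and `parallelepiped_volume` are names chosen here, not from the text, and the final checks show how the order of the sides affects the sign but not the absolute value.

```python
def cross(v, w):
    v1, v2, v3 = v
    w1, w2, w3 = w
    return (v2 * w3 - v3 * w2, v3 * w1 - v1 * w3, v1 * w2 - v2 * w1)


def dot(v, w):
    return sum(a * b for a, b in zip(v, w))


def scalar_triple(u, v, w):
    """The scalar triple product u . (v x w)."""
    return dot(u, cross(v, w))


def parallelepiped_volume(u, v, w):
    """Theorem 1.15: the absolute value of the scalar triple product."""
    return abs(scalar_triple(u, v, w))


u, v, w = (2, 1, 3), (-1, 3, 2), (1, 1, -2)   # Example 1.17

assert scalar_triple(u, v, w) == -28
assert parallelepiped_volume(u, v, w) == 28
# Formula (1.12): cyclic reorderings leave the triple product unchanged ...
assert scalar_triple(w, u, v) == scalar_triple(v, w, u) == -28
# ... while swapping two of the vectors flips the sign.
assert scalar_triple(u, w, v) == 28
```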

1. 0. 4. 13. −1) For Exercises 9-10. v = (2.1. 4) 4. v = − i + 2 j + k. 0. 4). −10). Q = (4. R = (2. P = (4. −2) 15. 2. −4. Q = (2. 2). R = (2. 1. 3). 2. v = (3. 6). 1). P = (−2. v = (7. w = (5. 1. 2) 14. R = (−1. 5). 0. 1.18. v. 3). v = (−1. P = (2. v = (3. u = (1. 4. w. calculate v × w. −2). 2. 4). 1. ﬁnd the volume of the parallelepiped with adjacent sides u. −2). Q = (1. 1. w = 3 i + 2 j + 4k For Exercises 7-8. S = (3. calculate u · (v × w) and u × (v × w). v = (1. 9. Prove: (u × v) · (w × z) = Solution: Let x = u × v. P = (5. z = (2. 1. Calculate (u × v) · (w × z) for u = (1. −2).4 Cross Product Interchanging the dot and cross products can be useful in proving vector identities: u· w u· z for all vectors u. 1. 1) 10. 1. w = (1. 3. . 3). 0. 2. v. S = (3. 5). w. calculate the area of the triangle △PQR . v = i. v = (7. 2. 6.16) = (u · w)(v · z) − (u · z)(v · w) (by commutativity of the dot product). 1. w = −3 i + 6 j + 3 k 2. Then (u × v) · (w × z) = x · (w × z) = (z · v)(w · u) − (z · u)(w · v) = = w · ((z · v)u − (z · u)v) (by Theorem 1. w = (1. −10) 6. −2. 3) 3. u = (1. 3. 0) 8. R = (6. 3). v = (2. 1. = w · (z × (u × v)) = w · (z × x) (by formula (1. u = (1. 11. 4). 2). 7. v = (5. −10). w = (2. w = (2. 0. w = (2. 4). 2). 1) For Exercises 13-14. 1. 0. 0. 2). 3). 2). calculate the area of the parallelogram PQRS . −2) 12. w = (2. u = (1. Q = (1. 2. w = (4. −4.12)) u· w u· z v· w v· z Exercises A For Exercises 1-6. 0) For Exercises 11-12. v· w v· z 29 Example 1. 1). 5. z in R3 . 0. 0) 5. w = (7. 2). 2).

14(c). 20. w lie in the same plane in R3 if and only if u · (v × w) = 0.14(f). 22. show that (u × v) × (w × z) = (z · (u × v))w − (w · (u × v))z and that (u × v) × (w × z) = (u · (w × z))v − (v · (w × z))u Why do both equations make sense geometrically? . where −u = RP and −v = RQ . 2 1 show that 1 (−u) × (−v) = 2 v × w . then v = 0. where u = PR . w in R3 : (a) v× w 2 + | v · w |2 = v 2 w 2 (b) If v · w = 0 and v × w = 0. VECTORS IN EUCLIDEAN SPACE B 16. 17. Consider the vector equation a × x = b in R3 . then v = 0 or w = 0. If v and w are unit vectors in R3 . 30. w. 18. Prove Theorem 1. Prove Theorem 1. and v = QR . Prove Theorem 1. show that 2 u × (−w) = 1 v × w . 2 27. where a = 0.16. Prove the Jacobi identity: u × (v × w) + v × (w × u) + w × (u × v) = 0 29.14(b). C 26. Prove the following for all vectors v. To do this. (Hint: Expand both sides of the equation.14(e). z in R3 . 23. for any scalar k (b) x = a 2 28.) 25. For all vectors u. 19. Show that u. Prove Theorem 1. Prove that in Example 1. Prove Theorem 1. w = QP as before.14(d). Prove Theorem 1. Show that if v × w = 0 for all w in R3 .30 CHAPTER 1. under what condition(s) would v × w also be a unit vector in R3 ? Justify your answer. Show that: (a) a · b = 0 b× a + ka is a solution to the equation. v. 21. 24.8 the formula for the area of the triangle △PQR yields the 1 same value no matter which two adjacent sides are chosen. −w = PQ .17. Prove Theorem 1. Similarly. v.

1.5 Lines and Planes

Now that we know how to perform some operations on vectors, we can start to deal with some familiar geometric objects, like lines and planes, in the language of vectors. The reason for doing this is simple: using vectors makes it easier to study objects in 3-dimensional Euclidean space. We will first consider lines.

Line through a point, parallel to a vector

Let $P = (x_0, y_0, z_0)$ be a point in $\mathbb{R}^3$, let $\mathbf{v} = (a, b, c)$ be a nonzero vector, and let $L$ be the line through $P$ which is parallel to $\mathbf{v}$ (see Figure 1.5.1).

Figure 1.5.1

Let $\mathbf{r} = (x_0, y_0, z_0)$ be the vector pointing from the origin to $P$. Since multiplying the vector $\mathbf{v}$ by a scalar $t$ lengthens or shrinks $\mathbf{v}$ while preserving its direction if $t > 0$, and reversing its direction if $t < 0$, then we see from Figure 1.5.1 that every point on the line $L$ can be obtained by adding the vector $t\mathbf{v}$ to the vector $\mathbf{r}$ for some scalar $t$. That is, as $t$ varies over all real numbers, the vector $\mathbf{r} + t\mathbf{v}$ will point to every point on $L$. We can summarize the vector representation of $L$ as follows:

For a point $P = (x_0, y_0, z_0)$ and nonzero vector $\mathbf{v}$ in $\mathbb{R}^3$, the line $L$ through $P$ parallel to $\mathbf{v}$ is given by

$$\mathbf{r} + t\mathbf{v}, \quad \text{for } -\infty < t < \infty \qquad (1.16)$$

where $\mathbf{r} = (x_0, y_0, z_0)$ is the vector pointing to $P$.

Note that we used the correspondence between a vector and its terminal point. Since $\mathbf{v} = (a, b, c)$, then the terminal point of the vector $\mathbf{r} + t\mathbf{v}$ is $(x_0 + at, y_0 + bt, z_0 + ct)$. We then get the parametric representation of $L$ with the parameter $t$:

For a point $P = (x_0, y_0, z_0)$ and nonzero vector $\mathbf{v} = (a, b, c)$ in $\mathbb{R}^3$, the line $L$ through $P$ parallel to $\mathbf{v}$ consists of all points $(x, y, z)$ given by

$$x = x_0 + at, \quad y = y_0 + bt, \quad z = z_0 + ct, \quad \text{for } -\infty < t < \infty \qquad (1.17)$$

Note that in both representations we get the point $P$ on $L$ by letting $t = 0$.

In formula (1.17), if $a \neq 0$, then we can solve for the parameter $t$: $t = (x - x_0)/a$. We can also solve for $t$ in terms of $y$ and in terms of $z$ if neither $b$ nor $c$, respectively, is zero: $t = (y - y_0)/b$ and $t = (z - z_0)/c$. These three values all equal the same value $t$, so we can write the following system of equalities, called the symmetric representation of $L$:

For a point $P = (x_0, y_0, z_0)$ and vector $\mathbf{v} = (a, b, c)$ in $\mathbb{R}^3$ with $a$, $b$ and $c$ all nonzero, the line $L$ through $P$ parallel to $\mathbf{v}$ consists of all points $(x, y, z)$ given by the equations

$$\frac{x - x_0}{a} = \frac{y - y_0}{b} = \frac{z - z_0}{c} \qquad (1.18)$$

What if, say, $a = 0$ in the above scenario? We cannot divide by zero, but we do know that $x = x_0 + at$, and so $x = x_0 + 0t = x_0$. Then the symmetric representation of $L$ would be:

$$x = x_0, \quad \frac{y - y_0}{b} = \frac{z - z_0}{c} \qquad (1.19)$$

Note that this says that the line $L$ lies in the plane $x = x_0$, which is parallel to the $yz$-plane (see Figure 1.5.2). Similar equations can be derived for the cases when $b = 0$ or $c = 0$.

Figure 1.5.2

You may have noticed that the vector representation of $L$ in formula (1.16) is more compact than the parametric and symmetric formulas. That is an advantage of using vector notation. Technically, though, the vector representation gives us the vectors whose terminal points make up the line $L$, not just $L$ itself. So you have to remember to identify the vectors $\mathbf{r} + t\mathbf{v}$ with their terminal points. On the other hand, the parametric representation always gives just the points on $L$ and nothing else.

Example 1.19. Write the line $L$ through the point $P = (2, 3, 5)$ and parallel to the vector $\mathbf{v} = (4, -1, 6)$, in the following forms: (a) vector, (b) parametric, (c) symmetric. Lastly: (d) find two points on $L$ distinct from $P$.

Solution: (a) Let $\mathbf{r} = (2, 3, 5)$. Then by formula (1.16), $L$ is given by:

$$\mathbf{r} + t\mathbf{v} = (2, 3, 5) + t(4, -1, 6), \quad \text{for } -\infty < t < \infty$$

(b) $L$ consists of the points $(x, y, z)$ such that

$$x = 2 + 4t, \quad y = 3 - t, \quad z = 5 + 6t, \quad \text{for } -\infty < t < \infty$$

(c) $L$ consists of the points $(x, y, z)$ such that

$$\frac{x - 2}{4} = \frac{y - 3}{-1} = \frac{z - 5}{6}$$

(d) Letting $t = 1$ and $t = 2$ in part (b) yields the points $(6, 2, 11)$ and $(10, 1, 17)$ on $L$.
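The parametric and symmetric representations of a line can be checked against each other numerically. The following is only a minimal sketch: the helper names `point_on_line` and `satisfies_symmetric` are made up for illustration, and exact rational arithmetic is assumed so that integer inputs compare exactly.

```python
from fractions import Fraction

def point_on_line(p, v, t):
    """Point with parameter t on the line through p parallel to v (formula 1.17)."""
    return tuple(pi + t * vi for pi, vi in zip(p, v))

def satisfies_symmetric(p, v, q):
    """Check q against the symmetric equations (1.18).

    Assumes integer coordinates and that every component of v is nonzero;
    a zero component would raise ZeroDivisionError here.
    """
    ratios = {Fraction(qi - pi, vi) for pi, vi, qi in zip(p, v, q)}
    return len(ratios) == 1  # all three quotients equal the same t

P, v = (2, 3, 5), (4, -1, 6)                    # data of Example 1.19
print(point_on_line(P, v, 1))                   # (6, 2, 11)
print(point_on_line(P, v, 2))                   # (10, 1, 17)
print(satisfies_symmetric(P, v, (6, 2, 11)))    # True
```

A point that is off the line, such as (6, 2, 12), makes the three quotients disagree, so the check returns False.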

Line through two points

Let $P_1 = (x_1, y_1, z_1)$ and $P_2 = (x_2, y_2, z_2)$ be distinct points in $\mathbb{R}^3$, and let $L$ be the line through $P_1$ and $P_2$. Let $\mathbf{r}_1 = (x_1, y_1, z_1)$ and $\mathbf{r}_2 = (x_2, y_2, z_2)$ be the vectors pointing to $P_1$ and $P_2$, respectively. Then as we can see from Figure 1.5.3, $\mathbf{r}_2 - \mathbf{r}_1$ is the vector from $P_1$ to $P_2$. So if we multiply the vector $\mathbf{r}_2 - \mathbf{r}_1$ by a scalar $t$ and add it to the vector $\mathbf{r}_1$, we will get the entire line $L$ as $t$ varies over all real numbers. The following is a summary of the vector, parametric, and symmetric forms for the line $L$:

Figure 1.5.3

Let $P_1 = (x_1, y_1, z_1)$, $P_2 = (x_2, y_2, z_2)$ be distinct points in $\mathbb{R}^3$, and let $\mathbf{r}_1 = (x_1, y_1, z_1)$, $\mathbf{r}_2 = (x_2, y_2, z_2)$. Then the line $L$ through $P_1$ and $P_2$ has the following representations:

Vector:
$$\mathbf{r}_1 + t(\mathbf{r}_2 - \mathbf{r}_1), \quad \text{for } -\infty < t < \infty \qquad (1.20)$$

Parametric:
$$x = x_1 + (x_2 - x_1)t, \quad y = y_1 + (y_2 - y_1)t, \quad z = z_1 + (z_2 - z_1)t, \quad \text{for } -\infty < t < \infty \qquad (1.21)$$

Symmetric:
$$\frac{x - x_1}{x_2 - x_1} = \frac{y - y_1}{y_2 - y_1} = \frac{z - z_1}{z_2 - z_1} \quad (\text{if } x_1 \neq x_2,\ y_1 \neq y_2, \text{ and } z_1 \neq z_2) \qquad (1.22)$$

Example 1.20. Write the line $L$ through the points $P_1 = (-3, 1, -4)$ and $P_2 = (4, 4, -6)$ in parametric form.

Solution: By formula (1.21), $L$ consists of the points $(x, y, z)$ such that

$$x = -3 + 7t, \quad y = 1 + 3t, \quad z = -4 - 2t, \quad \text{for } -\infty < t < \infty$$

Distance between a point and a line

Let $L$ be a line in $\mathbb{R}^3$ in vector form as $\mathbf{r} + t\mathbf{v}$ (for $-\infty < t < \infty$), and let $P$ be a point not on $L$. The distance $d$ from $P$ to $L$ is the length of the line segment from $P$ to $L$ which is perpendicular to $L$ (see Figure 1.5.4). Pick a point $Q$ on $L$, and let $\mathbf{w}$ be the vector from $Q$ to $P$. If $\theta$ is the angle between $\mathbf{w}$ and $\mathbf{v}$, then $d = \lVert \mathbf{w} \rVert \sin\theta$. So since $\lVert \mathbf{v} \times \mathbf{w} \rVert = \lVert \mathbf{v} \rVert \lVert \mathbf{w} \rVert \sin\theta$ and $\mathbf{v} \neq \mathbf{0}$, then:

$$d = \frac{\lVert \mathbf{v} \times \mathbf{w} \rVert}{\lVert \mathbf{v} \rVert} \qquad (1.23)$$

Figure 1.5.4
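Formula (1.23) translates directly into a few lines of code. The sketch below is illustrative only: the function names are assumptions rather than anything defined in the text, and the test case (a point at distance 5 from the x-axis) was chosen so the answer can be verified by inspection.

```python
import math

def cross(v, w):
    """Cross product of two 3-vectors given as tuples."""
    return (v[1] * w[2] - v[2] * w[1],
            v[2] * w[0] - v[0] * w[2],
            v[0] * w[1] - v[1] * w[0])

def norm(v):
    """Euclidean length of a vector."""
    return math.sqrt(sum(c * c for c in v))

def point_line_distance(p, q, v):
    """Distance from point p to the line through q with direction v (formula 1.23)."""
    w = tuple(pi - qi for pi, qi in zip(p, q))  # vector from Q to P
    return norm(cross(v, w)) / norm(v)

# The x-axis passes through the origin with direction (2, 0, 0);
# the point (5, 3, 4) lies sqrt(3^2 + 4^2) = 5 away from it.
print(point_line_distance((5, 3, 4), (0, 0, 0), (2, 0, 0)))  # 5.0
```

Note that the length of the direction vector does not matter, since it cancels between the numerator and the denominator.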

Example 1.21. Find the distance $d$ from the point $P = (1, 1, 1)$ to the line $L$ in Example 1.20.

Solution: From Example 1.20, we see that we can represent $L$ in vector form as: $\mathbf{r} + t\mathbf{v}$, for $\mathbf{r} = (-3, 1, -4)$ and $\mathbf{v} = (7, 3, -2)$. Since the point $Q = (-3, 1, -4)$ is on $L$, then for $\mathbf{w} = \overrightarrow{QP} = (1, 1, 1) - (-3, 1, -4) = (4, 0, 5)$, we have:

$$\mathbf{v} \times \mathbf{w} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ 7 & 3 & -2 \\ 4 & 0 & 5 \end{vmatrix} = \begin{vmatrix} 3 & -2 \\ 0 & 5 \end{vmatrix}\mathbf{i} - \begin{vmatrix} 7 & -2 \\ 4 & 5 \end{vmatrix}\mathbf{j} + \begin{vmatrix} 7 & 3 \\ 4 & 0 \end{vmatrix}\mathbf{k} = 15\,\mathbf{i} - 43\,\mathbf{j} - 12\,\mathbf{k}, \text{ so}$$

$$d = \frac{\lVert \mathbf{v} \times \mathbf{w} \rVert}{\lVert \mathbf{v} \rVert} = \frac{\lVert 15\,\mathbf{i} - 43\,\mathbf{j} - 12\,\mathbf{k} \rVert}{\lVert (7, 3, -2) \rVert} = \frac{\sqrt{15^2 + (-43)^2 + (-12)^2}}{\sqrt{7^2 + 3^2 + (-2)^2}} = \frac{\sqrt{2218}}{\sqrt{62}} \approx 5.98$$

It is clear that two lines $L_1$ and $L_2$, represented in vector form as $\mathbf{r}_1 + s\mathbf{v}_1$ and $\mathbf{r}_2 + t\mathbf{v}_2$, respectively, are parallel (denoted as $L_1 \parallel L_2$) if $\mathbf{v}_1$ and $\mathbf{v}_2$ are parallel. Also, $L_1$ and $L_2$ are perpendicular (denoted as $L_1 \perp L_2$) if $\mathbf{v}_1$ and $\mathbf{v}_2$ are perpendicular.

In 2-dimensional space, two lines are either identical, parallel, or they intersect. In 3-dimensional space, there is an additional possibility: two lines can be skew, that is, they do not intersect but they are not parallel. However, even though they are not parallel, skew lines are on parallel planes (see Figure 1.5.5).

Figure 1.5.5

To determine whether two lines in $\mathbb{R}^3$ intersect, it is often easier to use the parametric representation of the lines. In this case, you should use different parameter variables (usually $s$ and $t$) for the lines, since the values of the parameters may not be the same at the point of intersection. Setting the two $(x, y, z)$ triples equal will result in a system of 3 equations in 2 unknowns ($s$ and $t$).

Example 1.22. Find the point of intersection (if any) of the following lines:

$$\frac{x + 1}{3} = \frac{y - 2}{2} = \frac{z - 1}{-1} \quad \text{and} \quad x + 3 = \frac{y - 8}{-3} = \frac{z + 3}{2}$$

Solution: First we write the lines in parametric form, with parameters $s$ and $t$:

$$x = -1 + 3s, \quad y = 2 + 2s, \quad z = 1 - s \quad \text{and} \quad x = -3 + t, \quad y = 8 - 3t, \quad z = -3 + 2t$$

The lines intersect when $(-1 + 3s, 2 + 2s, 1 - s) = (-3 + t, 8 - 3t, -3 + 2t)$ for some $s$, $t$:

$$-1 + 3s = -3 + t \ \Rightarrow\ t = 2 + 3s$$
$$2 + 2s = 8 - 3t \ \Rightarrow\ 2 + 2s = 8 - 3(2 + 3s) = 2 - 9s \ \Rightarrow\ 2s = -9s \ \Rightarrow\ s = 0 \ \Rightarrow\ t = 2 + 3(0) = 2$$
$$1 - s = -3 + 2t: \quad 1 - 0 = -3 + 2(2) \ \Rightarrow\ 1 = 1 \quad \text{(Note that we had to check this.)}$$

Letting $s = 0$ in the equations for the first line, or letting $t = 2$ in the equations for the second line, gives the point of intersection $(-1, 2, 1)$.
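The procedure of Example 1.22 (solve two of the three coordinate equations for s and t, then check the remaining one) can be sketched in code. This is only one possible implementation, not the text's own method in every detail: it assumes integer or rational inputs, uses Cramer's rule for the 2-by-2 system, and returns `None` both for skew lines and for lines with parallel direction vectors (it does not try to detect identical lines). The function name is hypothetical.

```python
from fractions import Fraction

def intersect_lines(r1, v1, r2, v2):
    """Intersection point of the lines r1 + s*v1 and r2 + t*v2, or None."""
    for i, j in ((0, 1), (0, 2), (1, 2)):
        # Coordinate equations i and j:  s*v1[k] - t*v2[k] = r2[k] - r1[k]
        det = v1[i] * (-v2[j]) - (-v2[i]) * v1[j]
        if det == 0:
            continue  # this pair of equations cannot determine s and t
        bi, bj = r2[i] - r1[i], r2[j] - r1[j]
        s = Fraction(bi * (-v2[j]) - (-v2[i]) * bj, det)
        t = Fraction(v1[i] * bj - bi * v1[j], det)
        p1 = tuple(r + s * v for r, v in zip(r1, v1))
        p2 = tuple(r + t * v for r, v in zip(r2, v2))
        # Comparing the full triples checks the third coordinate equation.
        return p1 if p1 == p2 else None
    return None  # direction vectors are parallel

p = intersect_lines((-1, 2, 1), (3, 2, -1), (-3, 8, -3), (1, -3, 2))
print([str(c) for c in p])  # ['-1', '2', '1']
```

For the skew pair consisting of the x-axis and the line through (0, 0, 1) parallel to the y-axis, the first two equations give s = t = 0 but the z-coordinates disagree, so the function returns `None`.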

We will now consider planes in 3-dimensional Euclidean space.

Plane through a point, perpendicular to a vector

Let $P$ be a plane in $\mathbb{R}^3$, and suppose it contains a point $P_0 = (x_0, y_0, z_0)$. Let $\mathbf{n} = (a, b, c)$ be a nonzero vector which is perpendicular to the plane $P$. Such a vector is called a normal vector (or just a normal) to the plane. Now let $(x, y, z)$ be any point in the plane $P$. Then the vector $\mathbf{r} = (x - x_0, y - y_0, z - z_0)$ lies in the plane $P$ (see Figure 1.5.6). So if $\mathbf{r} \neq \mathbf{0}$, then $\mathbf{r} \perp \mathbf{n}$ and hence $\mathbf{n} \cdot \mathbf{r} = 0$. And if $\mathbf{r} = \mathbf{0}$ then we still have $\mathbf{n} \cdot \mathbf{r} = 0$.

Figure 1.5.6 The plane P

Conversely, if $(x, y, z)$ is any point in $\mathbb{R}^3$ such that $\mathbf{r} = (x - x_0, y - y_0, z - z_0) \neq \mathbf{0}$ and $\mathbf{n} \cdot \mathbf{r} = 0$, then $\mathbf{r} \perp \mathbf{n}$ and so $(x, y, z)$ lies in $P$. This proves the following theorem:

Theorem 1.18. Let $P$ be a plane in $\mathbb{R}^3$, let $(x_0, y_0, z_0)$ be a point in $P$, and let $\mathbf{n} = (a, b, c)$ be a nonzero vector which is perpendicular to $P$. Then $P$ consists of the points $(x, y, z)$ satisfying the vector equation:

$$\mathbf{n} \cdot \mathbf{r} = 0 \qquad (1.24)$$

where $\mathbf{r} = (x - x_0, y - y_0, z - z_0)$, or equivalently:

$$a(x - x_0) + b(y - y_0) + c(z - z_0) = 0 \qquad (1.25)$$

The above equation is called the point-normal form of the plane $P$.

Example 1.23. Find the equation of the plane $P$ containing the point $(-3, 1, 3)$ and perpendicular to the vector $\mathbf{n} = (2, 4, 8)$.

Solution: By formula (1.25), the plane $P$ consists of all points $(x, y, z)$ such that:

$$2(x + 3) + 4(y - 1) + 8(z - 3) = 0$$

If we multiply out the terms in formula (1.25) and combine the constant terms, we get an equation of the plane in normal form:

$$ax + by + cz + d = 0 \qquad (1.26)$$

For example, the normal form of the plane in Example 1.23 is $2x + 4y + 8z - 22 = 0$.

Plane containing three noncollinear points

In 2-dimensional and 3-dimensional space, two points determine a line. Two points do not determine a plane in $\mathbb{R}^3$. In fact, three collinear points (i.e. all on the same line) do not determine a plane; an infinite number of planes would contain the line on which those three points lie. However, three noncollinear points do determine a plane. For if $Q$, $R$ and $S$ are noncollinear points in $\mathbb{R}^3$, then $\overrightarrow{QR}$ and $\overrightarrow{QS}$ are nonzero vectors which are not parallel (by noncollinearity), and so their cross product $\overrightarrow{QR} \times \overrightarrow{QS}$ is perpendicular to both $\overrightarrow{QR}$ and $\overrightarrow{QS}$. So $\overrightarrow{QR}$ and $\overrightarrow{QS}$ (and hence $Q$, $R$ and $S$) lie in the plane through the point $Q$ with normal vector $\mathbf{n} = \overrightarrow{QR} \times \overrightarrow{QS}$ (see Figure 1.5.7).

Figure 1.5.7 Noncollinear points Q, R, S

Example 1.24. Find the equation of the plane $P$ containing the points $(2, 1, 3)$, $(1, -1, 2)$ and $(3, 2, 1)$.

Solution: Let $Q = (2, 1, 3)$, $R = (1, -1, 2)$ and $S = (3, 2, 1)$. Then for the vectors $\overrightarrow{QR} = (-1, -2, -1)$ and $\overrightarrow{QS} = (1, 1, -2)$, the plane $P$ has a normal vector

$$\mathbf{n} = \overrightarrow{QR} \times \overrightarrow{QS} = (-1, -2, -1) \times (1, 1, -2) = (5, -3, 1)$$

So using formula (1.25) with the point $Q$ (we could also use $R$ or $S$), the plane $P$ consists of all points $(x, y, z)$ such that:

$$5(x - 2) - 3(y - 1) + (z - 3) = 0$$

or in normal form,

$$5x - 3y + z - 10 = 0$$

We mentioned earlier that skew lines in $\mathbb{R}^3$ lie on separate, parallel planes. So two skew lines do not determine a plane. But two (nonidentical) lines which either intersect or are parallel do determine a plane. In both cases, to find the equation of the plane that contains those two lines, simply pick from the two lines a total of three noncollinear points (i.e. one point from one line and two points from the other), then use the technique above, as in Example 1.24, to write the equation. We will leave examples of this as exercises for the reader.
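The three-point construction can be sketched as a short function that returns the coefficients of the normal form (1.26). The name `plane_through` and the tuple layout `(a, b, c, d)` are assumptions made for this illustration.

```python
def cross(v, w):
    """Cross product of two 3-vectors given as tuples."""
    return (v[1] * w[2] - v[2] * w[1],
            v[2] * w[0] - v[0] * w[2],
            v[0] * w[1] - v[1] * w[0])

def plane_through(q, r, s):
    """Coefficients (a, b, c, d) of ax + by + cz + d = 0 through points q, r, s."""
    qr = tuple(ri - qi for ri, qi in zip(r, q))
    qs = tuple(si - qi for si, qi in zip(s, q))
    n = cross(qr, qs)                          # normal vector n = QR x QS
    if n == (0, 0, 0):
        raise ValueError("the three points are collinear")
    d = -sum(ni * qi for ni, qi in zip(n, q))  # expand the point-normal form at Q
    return (*n, d)

print(plane_through((2, 1, 3), (1, -1, 2), (3, 2, 1)))  # (5, -3, 1, -10)
```

The output matches the normal form 5x − 3y + z − 10 = 0 found in Example 1.24. Choosing a different one of the three points as the base point can only rescale the coefficients by a nonzero factor.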

Distance between a point and a plane

The distance between a point in $\mathbb{R}^3$ and a plane is the length of the line segment from that point to the plane which is perpendicular to the plane. The following theorem gives a formula for that distance.

Theorem 1.19. Let $Q = (x_0, y_0, z_0)$ be a point in $\mathbb{R}^3$, and let $P$ be a plane with normal form $ax + by + cz + d = 0$ that does not contain $Q$. Then the distance $D$ from $Q$ to $P$ is:

$$D = \frac{|ax_0 + by_0 + cz_0 + d|}{\sqrt{a^2 + b^2 + c^2}} \qquad (1.27)$$

Proof: Let $R = (x, y, z)$ be any point in the plane $P$ (so that $ax + by + cz + d = 0$) and let $\mathbf{r} = \overrightarrow{RQ} = (x_0 - x, y_0 - y, z_0 - z)$. Then $\mathbf{r} \neq \mathbf{0}$ since $Q$ does not lie in $P$. From the normal form equation for $P$, we know that $\mathbf{n} = (a, b, c)$ is a normal vector for $P$. Now, any plane divides $\mathbb{R}^3$ into two disjoint parts. Assume that $\mathbf{n}$ points toward the side of $P$ where the point $Q$ is located. Place $\mathbf{n}$ so that its initial point is at $R$, and let $\theta$ be the angle between $\mathbf{r}$ and $\mathbf{n}$. Then $0^\circ < \theta < 90^\circ$, so $\cos\theta > 0$. Thus, the distance $D$ is $\cos\theta\,\lVert \mathbf{r} \rVert = |\cos\theta|\,\lVert \mathbf{r} \rVert$ (see Figure 1.5.8).

Figure 1.5.8

By Theorem 1.6 in Section 1.3, we know that $\cos\theta = \dfrac{\mathbf{n} \cdot \mathbf{r}}{\lVert \mathbf{n} \rVert \lVert \mathbf{r} \rVert}$, so

$$D = |\cos\theta|\,\lVert \mathbf{r} \rVert = \frac{|\mathbf{n} \cdot \mathbf{r}|}{\lVert \mathbf{n} \rVert \lVert \mathbf{r} \rVert}\,\lVert \mathbf{r} \rVert = \frac{|\mathbf{n} \cdot \mathbf{r}|}{\lVert \mathbf{n} \rVert} = \frac{|a(x_0 - x) + b(y_0 - y) + c(z_0 - z)|}{\sqrt{a^2 + b^2 + c^2}}$$

$$= \frac{|ax_0 + by_0 + cz_0 - (ax + by + cz)|}{\sqrt{a^2 + b^2 + c^2}} = \frac{|ax_0 + by_0 + cz_0 - (-d)|}{\sqrt{a^2 + b^2 + c^2}} = \frac{|ax_0 + by_0 + cz_0 + d|}{\sqrt{a^2 + b^2 + c^2}}$$

If $\mathbf{n}$ points away from the side of $P$ where the point $Q$ is located, then $90^\circ < \theta < 180^\circ$ and so $\cos\theta < 0$. The distance $D$ is then $|\cos\theta|\,\lVert \mathbf{r} \rVert$, and thus repeating the same argument as above still gives the same result. QED

Example 1.25. Find the distance $D$ from $(2, 4, -5)$ to the plane from Example 1.24.

Solution: Recall that the plane is given by $5x - 3y + z - 10 = 0$. So

$$D = \frac{|5(2) - 3(4) + 1(-5) - 10|}{\sqrt{5^2 + (-3)^2 + 1^2}} = \frac{|-17|}{\sqrt{35}} = \frac{17}{\sqrt{35}} \approx 2.87$$
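Formula (1.27) is easy to evaluate in code. The following minimal sketch assumes the plane is passed as a tuple `(a, b, c, d)` of normal-form coefficients, which is a convention chosen here for illustration.

```python
import math

def point_plane_distance(q, plane):
    """Distance from point q to the plane ax + by + cz + d = 0 (formula 1.27)."""
    a, b, c, d = plane
    return abs(a * q[0] + b * q[1] + c * q[2] + d) / math.sqrt(a * a + b * b + c * c)

# Example 1.25: the point (2, 4, -5) and the plane 5x - 3y + z - 10 = 0
D = point_plane_distance((2, 4, -5), (5, -3, 1, -10))
print(round(D, 2))  # 2.87
```

Because of the absolute value, the function returns 0 for a point in the plane and the same distance whichever way the normal vector points.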

Line of intersection of two planes

Note that two planes are parallel if they have normal vectors that are parallel, and the planes are perpendicular if their normal vectors are perpendicular. If two planes do intersect, they do so in a line (see Figure 1.5.9). Suppose that two planes $P_1$ and $P_2$ with normal vectors $\mathbf{n}_1$ and $\mathbf{n}_2$, respectively, intersect in a line $L$. Since $\mathbf{n}_1 \times \mathbf{n}_2 \perp \mathbf{n}_1$, then $\mathbf{n}_1 \times \mathbf{n}_2$ is parallel to the plane $P_1$. Likewise, $\mathbf{n}_1 \times \mathbf{n}_2 \perp \mathbf{n}_2$ means that $\mathbf{n}_1 \times \mathbf{n}_2$ is also parallel to $P_2$. Thus, $\mathbf{n}_1 \times \mathbf{n}_2$ is parallel to the intersection of $P_1$ and $P_2$, i.e. $\mathbf{n}_1 \times \mathbf{n}_2$ is parallel to $L$. Thus, we can write $L$ in the following vector form:

$$L: \ \mathbf{r} + t(\mathbf{n}_1 \times \mathbf{n}_2), \quad \text{for } -\infty < t < \infty \qquad (1.28)$$

where $\mathbf{r}$ is any vector pointing to a point belonging to both planes. To find a point in both planes, find a common solution $(x, y, z)$ to the two normal form equations of the planes. This can often be made easier by setting one of the coordinate variables to zero, which leaves you to solve two equations in just two unknowns.

Figure 1.5.9

Example 1.26. Find the line of intersection $L$ of the planes $5x - 3y + z - 10 = 0$ and $2x + 4y - z + 3 = 0$.

Solution: The plane $5x - 3y + z - 10 = 0$ has normal vector $\mathbf{n}_1 = (5, -3, 1)$ and the plane $2x + 4y - z + 3 = 0$ has normal vector $\mathbf{n}_2 = (2, 4, -1)$. Since $\mathbf{n}_1$ and $\mathbf{n}_2$ are not scalar multiples, then the two planes are not parallel and hence will intersect. A point $(x, y, z)$ on both planes will satisfy the following system of two equations in three unknowns:

$$5x - 3y + z - 10 = 0$$
$$2x + 4y - z + 3 = 0$$

Set $x = 0$ (why is that a good choice?). Then the above equations are reduced to:

$$-3y + z - 10 = 0$$
$$4y - z + 3 = 0$$

The second equation gives $z = 4y + 3$; substituting that into the first equation gives $y = 7$. Then $z = 31$, and so the point $(0, 7, 31)$ is on $L$. Since $\mathbf{n}_1 \times \mathbf{n}_2 = (-1, 7, 26)$, then $L$ is given by:

$$\mathbf{r} + t(\mathbf{n}_1 \times \mathbf{n}_2) = (0, 7, 31) + t(-1, 7, 26), \quad \text{for } -\infty < t < \infty$$

or in parametric form:

$$x = -t, \quad y = 7 + 7t, \quad z = 31 + 26t, \quad \text{for } -\infty < t < \infty$$
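The steps of this procedure (take the cross product of the normals, zero out one coordinate, solve the remaining two equations) can be sketched as follows. This is an illustrative implementation under stated assumptions: planes are given as integer tuples `(a, b, c, d)`, and the coordinate set to zero is the first one in which the direction vector has a nonzero component, which guarantees that the remaining 2-by-2 system has a unique solution. That selection rule is a choice made for this sketch, not something the text prescribes.

```python
from fractions import Fraction

def cross(v, w):
    """Cross product of two 3-vectors given as tuples."""
    return (v[1] * w[2] - v[2] * w[1],
            v[2] * w[0] - v[0] * w[2],
            v[0] * w[1] - v[1] * w[0])

def plane_intersection(p1, p2):
    """Return (point, direction) for the line where two planes meet, or None if parallel."""
    n1, n2 = p1[:3], p2[:3]
    v = cross(n1, n2)                    # direction of the line, formula (1.28)
    if v == (0, 0, 0):
        return None                      # normals are parallel
    k = next(m for m in range(3) if v[m] != 0)   # coordinate to set to zero
    i, j = [m for m in range(3) if m != k]
    # Solve n1[i]*xi + n1[j]*xj = -d1 and n2[i]*xi + n2[j]*xj = -d2 by Cramer's rule.
    det = n1[i] * n2[j] - n1[j] * n2[i]  # equals +v[k] or -v[k], hence nonzero
    point = [Fraction(0)] * 3
    point[i] = Fraction(-p1[3] * n2[j] + n1[j] * p2[3], det)
    point[j] = Fraction(-n1[i] * p2[3] + p1[3] * n2[i], det)
    return tuple(point), v

point, direction = plane_intersection((5, -3, 1, -10), (2, 4, -1, 3))
print([str(c) for c in point], direction)  # ['0', '7', '31'] (-1, 7, 26)
```

For the planes of Example 1.26 the direction vector has a nonzero x-component, so the sketch sets x = 0 and recovers the same point (0, 7, 31).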

write the line L through the points P1 and P2 in parametric form. x + 3 y + 2 z − 6 = 0. 2).1. write the normal form of the plane containing the given points. 8. 17. z = 7 + t 8. 3). 1. P = (0. −2. 2 x − y + z + 2 = 0 20. 9. 7. 2. 0. (Hint: Put the equations of the line into the equation of the plane. P : 3 x − y − 5 z + 8 = 0 18. 4) For Exercises 13-14. −2). n = (4. −3). Q = (6. ﬁnd the point of intersection (if any) of the given lines. and x = 1 + 6 t. (1. L : x = 3 + 2 t. 1. ﬁnd the distance d from the point P to the line L. −3). (−3. y = 2 + t. P2 = (3. Write the normal form of the plane containing the lines from Exercise 9. 1. −10) For Exercises 5-6. z = −7 − 5 s 10. y = 4 + 3 t. (0. −2). 0). write the line L through the point P and parallel to the vector v in the following forms: (a) vector. 13. 3 x + y − 5 z = 0. 0). v = (5. z = 3 − 2 t x−6 x − 11 y − 14 z + 9 = y + 3 = z and = = 4 3 −6 2 For Exercises 11-12. 1. 0.5 Lines and Planes 39 Exercises A For Exercises 1-4. n = (2. (b) parametric. 0. x + 2 y + z + 4 = 0 B 21. P1 = (1. (4. P2 = (−2. 1. Q = (5. z = 5 + 4 t For Exercises 9-10. (1. ﬁnd the line of intersection (if any) of the given planes. 0. 5. −1. 1) 15. 2. 6. 3) For Exercises 7-8. v = (2. Write the normal form of the plane containing the lines from Exercise 10. 3). 0. For Exercises 17-18. 5. −1). 2).) . Q = (0. Q = (4. −1). −2. P = (2. 5). −4. 11. −3) 3. 2. ﬁnd the distance D from the point Q to the plane P . (6. 6) 14. L : x = −2 − 2 t. and (c) symmetric. 1) 2. 0). Find the point(s) of intersection (if any) of the line x−6 = y + 3 = z with the plane 4 x + 3 y + 2 z − 6 = 0. 1. 0). P : −5 x + 2 y − 7 z + 1 = 0 For Exercises 19-20. P = (1. write the normal form of the plane P containing the point Q and perpendicular to the vector n. −4. 3) 12. P = (0. y = −4 − 3 s. 16. 19. 1. v = (1. 3. x = 7 + 3 s. 3). P = (3. −1. 4. 1) 4. v = (7. y = 4 t. P1 = (4. 1. 5) 6. P = (2.

1.6 Surfaces

In the previous section we discussed planes in Euclidean space. A plane is an example of a surface, which we will define informally8 as the solution set of the equation $F(x, y, z) = 0$ in $\mathbb{R}^3$, for some real-valued function $F$. For example, a plane given by $ax + by + cz + d = 0$ is the solution set of $F(x, y, z) = 0$ for the function $F(x, y, z) = ax + by + cz + d$. Surfaces are 2-dimensional. The plane is the simplest surface, since it is "flat". In this section we will look at some surfaces that are more complex, the most important of which are the sphere and the cylinder.

Definition 1.9. A sphere $S$ is the set of all points $(x, y, z)$ in $\mathbb{R}^3$ which are a fixed distance $r$ (called the radius) from a fixed point $P_0 = (x_0, y_0, z_0)$ (called the center of the sphere):

$$S = \{\, (x, y, z) : (x - x_0)^2 + (y - y_0)^2 + (z - z_0)^2 = r^2 \,\} \qquad (1.29)$$

Using vector notation, this can be written in the equivalent form:

$$S = \{\, \mathbf{x} : \lVert \mathbf{x} - \mathbf{x}_0 \rVert = r \,\} \qquad (1.30)$$

where $\mathbf{x} = (x, y, z)$ and $\mathbf{x}_0 = (x_0, y_0, z_0)$ are vectors. Figure 1.6.1 illustrates the vectorial approach to spheres.

Figure 1.6.1 Spheres in $\mathbb{R}^3$: (a) radius r, center (0, 0, 0); (b) radius r, center $(x_0, y_0, z_0)$

Note in Figure 1.6.1(a) that the intersection of the sphere with the $xy$-plane is a circle of radius $r$ (i.e. a great circle, given by $x^2 + y^2 = r^2$ as a subset of $\mathbb{R}^2$). Similarly for the intersections with the $xz$-plane and the $yz$-plane. In general, a plane intersects a sphere either at a single point or in a circle.

8 See O'NEILL for a deeper and more rigorous discussion of surfaces.

Example 1.27. Find the intersection of the sphere $x^2 + y^2 + z^2 = 169$ with the plane $z = 12$.

Solution: The sphere is centered at the origin and has radius $13 = \sqrt{169}$, so it does intersect the plane $z = 12$. Putting $z = 12$ into the equation of the sphere gives

$$x^2 + y^2 + 12^2 = 169$$
$$x^2 + y^2 = 169 - 144 = 25 = 5^2$$

which is a circle of radius 5 centered at $(0, 0, 12)$, parallel to the $xy$-plane (see Figure 1.6.2).

Figure 1.6.2

If the equation in formula (1.29) is multiplied out, we get an equation of the form:

$$x^2 + y^2 + z^2 + ax + by + cz + d = 0 \qquad (1.31)$$

for some constants $a$, $b$, $c$ and $d$. Conversely, an equation of this form may describe a sphere, which can be determined by completing the square for the $x$, $y$ and $z$ variables.

Example 1.28. Is $2x^2 + 2y^2 + 2z^2 - 8x + 4y - 16z + 10 = 0$ the equation of a sphere?

Solution: Dividing both sides of the equation by 2 gives

$$x^2 + y^2 + z^2 - 4x + 2y - 8z + 5 = 0$$
$$(x^2 - 4x + 4) + (y^2 + 2y + 1) + (z^2 - 8z + 16) + 5 - 4 - 1 - 16 = 0$$
$$(x - 2)^2 + (y + 1)^2 + (z - 4)^2 = 16$$

which is a sphere of radius 4 centered at $(2, -1, 4)$.

Example 1.29. Find the point(s) of intersection (if any) of the sphere from Example 1.28 and the line $x = 3 + t$, $y = 1 + 2t$, $z = 3 - t$.

Solution: Put the equations of the line into the equation of the sphere, which was $(x - 2)^2 + (y + 1)^2 + (z - 4)^2 = 16$, and solve for $t$:

$$(3 + t - 2)^2 + (1 + 2t + 1)^2 + (3 - t - 4)^2 = 16$$
$$(t + 1)^2 + (2t + 2)^2 + (-t - 1)^2 = 16$$
$$6t^2 + 12t - 10 = 0$$

The quadratic formula gives the solutions $t = -1 \pm \frac{4}{\sqrt{6}}$. Putting those two values into the equations of the line gives the following two points of intersection:

$$\left( 2 + \frac{4}{\sqrt{6}},\ -1 + \frac{8}{\sqrt{6}},\ 4 - \frac{4}{\sqrt{6}} \right) \quad \text{and} \quad \left( 2 - \frac{4}{\sqrt{6}},\ -1 - \frac{8}{\sqrt{6}},\ 4 + \frac{4}{\sqrt{6}} \right)$$
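Both computations above, completing the square in an equation of the form (1.31) and substituting a parametric line into a sphere, reduce to short formulas. The sketch below is illustrative: the function names are assumptions, and `sphere_from_general` returns `None` whenever the right-hand side after completing the square is not positive (a single point or the empty set).

```python
import math

def sphere_from_general(a, b, c, d):
    """Center and radius of x^2 + y^2 + z^2 + ax + by + cz + d = 0, or None."""
    center = (-a / 2, -b / 2, -c / 2)
    r_squared = (a * a + b * b + c * c) / 4 - d   # result of completing the square
    if r_squared <= 0:
        return None
    return center, math.sqrt(r_squared)

def line_sphere_parameters(p, v, center, radius):
    """Values of t at which the line p + t*v meets the sphere, in increasing order."""
    w = tuple(pi - ci for pi, ci in zip(p, center))
    A = sum(vi * vi for vi in v)
    B = 2 * sum(vi * wi for vi, wi in zip(v, w))
    C = sum(wi * wi for wi in w) - radius ** 2
    disc = B * B - 4 * A * C
    if disc < 0:
        return []                                  # the line misses the sphere
    root = math.sqrt(disc)
    return sorted({(-B - root) / (2 * A), (-B + root) / (2 * A)})

print(sphere_from_general(-4, 2, -8, 5))           # ((2.0, -1.0, 4.0), 4.0)
ts = line_sphere_parameters((3, 1, 3), (1, 2, -1), (2, -1, 4), 4)
print([round(t, 3) for t in ts])                   # [-2.633, 0.633]
```

For the line of Example 1.29 the quadratic coefficients come out as A = 6, B = 12, C = −10, which is the equation 6t² + 12t − 10 = 0 from the solution, and the two roots are −1 ± 4/√6.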


If two spheres intersect, they do so either at a single point or in a circle.

Example 1.30. Find the intersection (if any) of the spheres $x^2 + y^2 + z^2 = 25$ and $x^2 + y^2 + (z - 2)^2 = 16$.

Solution: For any point $(x, y, z)$ on both spheres, we see that

$$x^2 + y^2 + z^2 = 25 \ \Rightarrow\ x^2 + y^2 = 25 - z^2, \text{ and}$$
$$x^2 + y^2 + (z - 2)^2 = 16 \ \Rightarrow\ x^2 + y^2 = 16 - (z - 2)^2, \text{ so}$$
$$16 - (z - 2)^2 = 25 - z^2 \ \Rightarrow\ 4z - 4 = 9 \ \Rightarrow\ z = 13/4$$
$$\Rightarrow\ x^2 + y^2 = 25 - (13/4)^2 = 231/16$$

∴ The intersection is the circle $x^2 + y^2 = \frac{231}{16}$ of radius $\frac{\sqrt{231}}{4} \approx 3.8$ centered at $(0, 0, \frac{13}{4})$.
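The elimination used in Example 1.30 works for any pair of spheres whose centers lie on the z-axis with one at the origin. The sketch below handles only that special configuration, which is an assumption made to keep the code short; `sphere_circle` is a hypothetical helper name, and it takes the squared radii and the integer height `h` of the second center.

```python
from fractions import Fraction
import math

def sphere_circle(R1_sq, R2_sq, h):
    """Intersect x^2+y^2+z^2 = R1_sq with x^2+y^2+(z-h)^2 = R2_sq, where h != 0.

    Returns (z, rho_sq): the height of the circle and its squared radius,
    or None if the spheres do not meet.
    """
    # Subtracting the two equations leaves a linear equation in z.
    z = Fraction(R1_sq - R2_sq + h * h, 2 * h)
    rho_sq = R1_sq - z * z
    if rho_sq < 0:
        return None
    return z, rho_sq

z, rho_sq = sphere_circle(25, 16, 2)
print(z, rho_sq, round(math.sqrt(rho_sq), 1))  # 13/4 231/16 3.8
```

A squared radius of exactly zero would correspond to the spheres touching at a single point.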

The cylinders that we will consider are right circular cylinders. These are cylinders obtained by moving a line $L$ along a circle $C$ in $\mathbb{R}^3$ in a way so that $L$ is always perpendicular to the plane containing $C$. We will only consider the cases where the plane containing $C$ is parallel to one of the three coordinate planes (see Figure 1.6.3).

Figure 1.6.3 Cylinders in $\mathbb{R}^3$: (a) $x^2 + y^2 = r^2$, any $z$; (b) $x^2 + z^2 = r^2$, any $y$; (c) $y^2 + z^2 = r^2$, any $x$

For example, the equation of a cylinder whose base circle $C$ lies in the $xy$-plane and is centered at $(a, b, 0)$ and has radius $r$ is

$$(x - a)^2 + (y - b)^2 = r^2, \qquad (1.32)$$

where the value of the $z$ coordinate is unrestricted. Similar equations can be written when the base circle lies in one of the other coordinate planes. A plane intersects a right circular cylinder in a circle, ellipse, or one or two lines, depending on whether that plane is parallel, oblique9, or perpendicular, respectively, to the plane containing $C$. The intersection of a surface with a plane is called the trace of the surface.

9 i.e. at an angle strictly between 0° and 90°.


The equations of spheres and cylinders are examples of second-degree equations in $\mathbb{R}^3$, i.e. equations of the form

$$Ax^2 + By^2 + Cz^2 + Dxy + Exz + Fyz + Gx + Hy + Iz + J = 0 \qquad (1.33)$$

for some constants $A, B, \ldots, J$. If the above equation is not that of a sphere, cylinder, plane, line or point, then the resulting surface is called a quadric surface.

One type of quadric surface is the ellipsoid, given by an equation of the form:

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1 \qquad (1.34)$$

In the case where $a = b = c$, this is just a sphere. In general, an ellipsoid is egg-shaped (think of an ellipse rotated around its major axis). Its traces in the coordinate planes are ellipses.

Figure 1.6.4 Ellipsoid

Two other types of quadric surfaces are the hyperboloid of one sheet, given by an equation of the form:

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} - \frac{z^2}{c^2} = 1 \qquad (1.35)$$

and the hyperboloid of two sheets, whose equation has the form:

$$\frac{x^2}{a^2} - \frac{y^2}{b^2} - \frac{z^2}{c^2} = 1 \qquad (1.36)$$

Figure 1.6.5 Hyperboloid of one sheet

Figure 1.6.6 Hyperboloid of two sheets


For the hyperboloid of one sheet, the trace in any plane parallel to the $xy$-plane is an ellipse. The traces in the planes parallel to the $xz$- or $yz$-planes are hyperbolas (see Figure 1.6.5), except for the special cases $x = \pm a$ and $y = \pm b$; in those planes the traces are pairs of intersecting lines (see Exercise 8).

For the hyperboloid of two sheets, the trace in any plane parallel to the $xy$- or $xz$-plane is a hyperbola (see Figure 1.6.6). There is no trace in the $yz$-plane. In any plane parallel to the $yz$-plane for which $|x| > |a|$, the trace is an ellipse.

The elliptic paraboloid is another type of quadric surface, whose equation has the form:

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} = \frac{z}{c} \qquad (1.37)$$

The traces in planes parallel to the $xy$-plane are ellipses, though in the $xy$-plane itself the trace is a single point. The traces in planes parallel to the $xz$- or $yz$-planes are parabolas. Figure 1.6.7 shows the case where $c > 0$. When $c < 0$ the surface is turned downward. In the case where $a = b$, the surface is called a paraboloid of revolution, which is often used as a reflecting surface, e.g. in vehicle headlights.10

Figure 1.6.7 Paraboloid

A more complicated quadric surface is the hyperbolic paraboloid, given by:

$$\frac{x^2}{a^2} - \frac{y^2}{b^2} = \frac{z}{c} \qquad (1.38)$$

Figure 1.6.8 Hyperbolic paraboloid (axes $x$, $y$, $z$)

10 For a discussion of this see pp. 157-158 in HECHT.


The hyperbolic paraboloid can be tricky to draw; using graphing software on a computer can make it easier. For example, Figure 1.6.8 was created using the free Gnuplot package (see Appendix C). It shows the graph of the hyperbolic paraboloid $z = y^2 - x^2$, which is the special case where $a = b = 1$ and $c = -1$ in equation (1.38). The mesh lines on the surface are the traces in planes parallel to the coordinate planes. In a plane $y = k$ parallel to the $xz$-plane the trace is $z = k^2 - x^2$, and in a plane $x = k$ parallel to the $yz$-plane the trace is $z = y^2 - k^2$. So we see that the traces in planes parallel to the $xz$-plane are parabolas pointing downward, while the traces in planes parallel to the $yz$-plane are parabolas pointing upward. Also, notice that the traces in planes parallel to the $xy$-plane are hyperbolas, though in the $xy$-plane itself the trace is a pair of intersecting lines through the origin. This is true in general when $c < 0$ in equation (1.38). When $c > 0$, the surface would be similar to that in Figure 1.6.8, only rotated 90° around the $z$-axis, and the nature of the traces in planes parallel to the $xz$- or $yz$-planes would be reversed.

The last type of quadric surface that we will consider is the elliptic cone, which has an equation of the form:

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} - \frac{z^2}{c^2} = 0 \qquad (1.39)$$

The traces in planes parallel to the $xy$-plane are ellipses, except in the $xy$-plane itself where the trace is a single point. The traces in planes parallel to the $xz$- or $yz$-planes are hyperbolas, except in the $xz$- and $yz$-planes themselves where the traces are pairs of intersecting lines.

Figure 1.6.9 Elliptic cone

Notice that every point on the elliptic cone is on a line which lies entirely on the surface; in Figure 1.6.9 these lines all go through the origin. This makes the elliptic cone an example of a ruled surface. The cylinder is also a ruled surface.

What may not be as obvious is that both the hyperboloid of one sheet and the hyperbolic paraboloid are ruled surfaces. In fact, on both surfaces there are two lines through each point on the surface (see Exercises 11-12). Such surfaces are called doubly ruled surfaces, and the pairs of lines are called a regulus.

It is clear that for each of the six types of quadric surfaces that we discussed, the surface can be translated away from the origin (e.g. by replacing $x^2$ by $(x - x_0)^2$ in its equation). It can be proved11 that every quadric surface can be translated and/or rotated so that its equation matches one of the six types that we described. For example, $z = 2xy$ is a case of equation (1.33) with "mixed" variables, e.g. with $D \neq 0$ so that we get an $xy$ term. This equation does not match any of the types we considered. However, by rotating the $x$- and $y$-axes by 45° in the $xy$-plane by means of the coordinate transformation $x = (x' - y')/\sqrt{2}$, $y = (x' + y')/\sqrt{2}$, $z = z'$, then $z = 2xy$ becomes the hyperbolic paraboloid $z' = (x')^2 - (y')^2$ in the $(x', y', z')$ coordinate system. That is, $z = 2xy$ is a hyperbolic paraboloid as in equation (1.38), but rotated 45° in the $xy$-plane.
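The claim that the 45° rotation turns z = 2xy into a hyperbolic paraboloid can be spot-checked numerically. The sketch below is not a proof; it simply applies the coordinate transformation to a few randomly chosen points (the seed and the sampling range are arbitrary choices made for illustration) and compares both sides.

```python
import math
import random

def from_rotated(xp, yp):
    """Map rotated coordinates (x', y') back to (x, y) via the 45-degree rotation."""
    s = math.sqrt(2)
    return (xp - yp) / s, (xp + yp) / s

random.seed(0)  # fixed seed so the run is repeatable
for _ in range(5):
    xp, yp = random.uniform(-3, 3), random.uniform(-3, 3)
    x, y = from_rotated(xp, yp)
    # 2xy = 2 (x' - y')(x' + y') / 2 = (x')^2 - (y')^2
    assert math.isclose(2 * x * y, xp ** 2 - yp ** 2, abs_tol=1e-12)
print("z = 2xy agrees with z' = (x')^2 - (y')^2 at all sampled points")
```

The algebra behind the check is the difference of squares noted in the comment, so the agreement holds for every point and not only for the sampled ones.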

11 See Ch. 7 in POGORELOV.

Exercises

A

For Exercises 1-4, determine if the given equation describes a sphere. If so, find its radius and center.

1. $x^2 + y^2 + z^2 - 4x - 6y - 10z + 37 = 0$
2. $x^2 + y^2 + z^2 + 2x - 2y - 8z + 19 = 0$
3. $2x^2 + 2y^2 + 2z^2 + 4x + 4y + 4z - 44 = 0$
4. $x^2 + y^2 - z^2 + 12x + 2y - 4z + 32 = 0$
5. Find the point(s) of intersection of the sphere $(x - 3)^2 + (y + 1)^2 + (z - 3)^2 = 9$ and the line $x = -1 + 2t$, $y = -2 - 3t$, $z = 3 + t$.

B

6. Find the intersection of the spheres $x^2 + y^2 + z^2 = 9$ and $(x - 4)^2 + (y + 2)^2 + (z - 4)^2 = 9$.
7. Find the intersection of the sphere $x^2 + y^2 + z^2 = 9$ and the cylinder $x^2 + y^2 = 4$.
8. Find the trace of the hyperboloid of one sheet $\frac{x^2}{a^2} + \frac{y^2}{b^2} - \frac{z^2}{c^2} = 1$ in the plane $x = a$, and the trace in the plane $y = b$.
9. Find the trace of the hyperbolic paraboloid $\frac{x^2}{a^2} - \frac{y^2}{b^2} = \frac{z}{c}$ in the $xy$-plane.

C

10. It can be shown that any four noncoplanar points (i.e. points that do not lie in the same plane) determine a sphere.12 Find the equation of the sphere that passes through the points $(0, 0, 0)$, $(0, 0, 2)$, $(1, -4, 3)$ and $(0, -1, 3)$. (Hint: Equation (1.31))
11. Show that the hyperboloid of one sheet is a doubly ruled surface, i.e. each point on the surface is on two lines lying entirely on the surface. (Hint: Write equation (1.35) as $\frac{x^2}{a^2} - \frac{z^2}{c^2} = 1 - \frac{y^2}{b^2}$, factor each side. Recall that two planes intersect in a line.)
12. Show that the hyperbolic paraboloid is a doubly ruled surface. (Hint: Exercise 11)
13. Let $S$ be the sphere with radius 1 centered at $(0, 0, 1)$, and let $S^*$ be $S$ without the "north pole" point $(0, 0, 2)$. Let $(a, b, c)$ be an arbitrary point on $S^*$. Then the line passing through $(0, 0, 2)$ and $(a, b, c)$ intersects the $xy$-plane at some point $(x, y, 0)$, as in Figure 1.6.10. Find this point $(x, y, 0)$ in terms of $a$, $b$ and $c$. (Note: Every point in the $xy$-plane can be matched with a point on $S^*$, and vice versa, in this manner. This method is called stereographic projection, which essentially identifies all of $\mathbb{R}^2$ with a "punctured" sphere.)

Figure 1.6.10

12 See WELCHONS and KRICKENBERGER, p. 160, for a proof.

1.7 Curvilinear Coordinates

The Cartesian coordinates of a point $(x, y, z)$ are determined by following straight paths starting from the origin: first along the $x$-axis, then parallel to the $y$-axis, then parallel to the $z$-axis, as in Figure 1.7.1. In curvilinear coordinate systems, these paths can be curved. The two types of curvilinear coordinates which we will consider are cylindrical and spherical coordinates. Instead of referencing a point in terms of sides of a rectangular parallelepiped, as with Cartesian coordinates, we will think of the point as lying on a cylinder or sphere. Cylindrical coordinates are often used when there is symmetry around the $z$-axis; spherical coordinates are useful when there is symmetry about the origin.

Figure 1.7.1

Let $P = (x, y, z)$ be a point in Cartesian coordinates in $\mathbb{R}^3$, and let $P_0 = (x, y, 0)$ be the projection of $P$ upon the $xy$-plane. Treating $(x, y)$ as a point in $\mathbb{R}^2$, let $(r, \theta)$ be its polar coordinates (see Figure 1.7.2). Let $\rho$ be the length of the line segment from the origin to $P$, and let $\phi$ be the angle between that line segment and the positive $z$-axis (see Figure 1.7.3). $\phi$ is called the zenith angle. Then the cylindrical coordinates $(r, \theta, z)$ and the spherical coordinates $(\rho, \theta, \phi)$ of $P(x, y, z)$ are defined as follows:13

Cylindrical coordinates $(r, \theta, z)$:

$$x = r\cos\theta, \quad y = r\sin\theta, \quad z = z$$
$$r = \sqrt{x^2 + y^2}, \quad \theta = \tan^{-1}\left(\frac{y}{x}\right), \quad z = z$$

where $0 \le \theta \le \pi$ if $y \ge 0$ and $\pi < \theta < 2\pi$ if $y < 0$

Figure 1.7.2 Cylindrical coordinates

Spherical coordinates $(\rho, \theta, \phi)$:

$$x = \rho\sin\phi\cos\theta, \quad y = \rho\sin\phi\sin\theta, \quad z = \rho\cos\phi$$
$$\rho = \sqrt{x^2 + y^2 + z^2}, \quad \theta = \tan^{-1}\left(\frac{y}{x}\right), \quad \phi = \cos^{-1}\left(\frac{z}{\sqrt{x^2 + y^2 + z^2}}\right)$$

where $0 \le \theta \le \pi$ if $y \ge 0$ and $\pi < \theta < 2\pi$ if $y < 0$

Figure 1.7.3 Spherical coordinates

Both $\theta$ and $\phi$ are measured in radians. Note that $r \ge 0$, $0 \le \theta < 2\pi$, $\rho \ge 0$ and $0 \le \phi \le \pi$. Also, $\theta$ is undefined when $(x, y) = (0, 0)$, and $\phi$ is undefined when $(x, y, z) = (0, 0, 0)$.

13 This "standard" definition of spherical coordinates used by mathematicians results in a left-handed system. For this reason, physicists usually switch the definitions of $\theta$ and $\phi$ to make $(\rho, \theta, \phi)$ a right-handed system.

Example 1.31. Convert the point $(-2, -2, 1)$ from Cartesian coordinates to (a) cylindrical and (b) spherical coordinates.

Solution: (a) $r = \sqrt{(-2)^2 + (-2)^2} = 2\sqrt{2}$, $\theta = \tan^{-1}\left(\frac{-2}{-2}\right) = \tan^{-1}(1) = \frac{5\pi}{4}$, since $y = -2 < 0$.

$$\therefore (r, \theta, z) = \left(2\sqrt{2}, \frac{5\pi}{4}, 1\right)$$

(b) $\rho = \sqrt{(-2)^2 + (-2)^2 + 1^2} = \sqrt{9} = 3$, $\phi = \cos^{-1}\left(\frac{1}{3}\right) \approx 1.23$ radians.

$$\therefore (\rho, \theta, \phi) = \left(3, \frac{5\pi}{4}, 1.23\right)$$

For cylindrical coordinates $(r, \theta, z)$, and constants $r_0$, $\theta_0$ and $z_0$, we see from Figure 1.7.4 that the surface $r = r_0$ is a cylinder of radius $r_0$ centered along the $z$-axis, the surface $\theta = \theta_0$ is a half-plane emanating from the $z$-axis, and the surface $z = z_0$ is a plane parallel to the $xy$-plane.

[Figure 1.7.4 Cylindrical coordinate surfaces: (a) $r = r_0$, (b) $\theta = \theta_0$, (c) $z = z_0$]

For spherical coordinates $(\rho, \theta, \phi)$, and constants $\rho_0$, $\theta_0$ and $\phi_0$, we see from Figure 1.7.5 that the surface $\rho = \rho_0$ is a sphere of radius $\rho_0$ centered at the origin, the surface $\theta = \theta_0$ is a half-plane emanating from the $z$-axis, and the surface $\phi = \phi_0$ is a circular cone whose vertex is at the origin.

[Figure 1.7.5 Spherical coordinate surfaces: (a) $\rho = \rho_0$, (b) $\theta = \theta_0$, (c) $\phi = \phi_0$]

Figures 1.7.4(a) and 1.7.5(a) show how these coordinate systems got their names.
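The conversion carried out by hand in Example 1.31 can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `to_cylindrical` and `to_spherical` are hypothetical, and `math.atan2` is used in place of $\tan^{-1}(y/x)$ so that the quadrant rule ($0 \le \theta \le \pi$ if $y \ge 0$, $\pi < \theta < 2\pi$ if $y < 0$) is handled automatically.

```python
import math

def to_cylindrical(x, y, z):
    """Return (r, theta, z) with 0 <= theta < 2*pi (hypothetical helper)."""
    r = math.hypot(x, y)
    # atan2 returns an angle in (-pi, pi]; reduce it modulo 2*pi.
    # Note: theta is undefined when (x, y) = (0, 0); atan2 returns 0 there.
    theta = math.atan2(y, x) % (2 * math.pi)
    return r, theta, z

def to_spherical(x, y, z):
    """Return (rho, theta, phi) with phi the zenith angle (hypothetical helper)."""
    rho = math.sqrt(x * x + y * y + z * z)
    if rho == 0:
        raise ValueError("phi is undefined at the origin")
    theta = math.atan2(y, x) % (2 * math.pi)
    phi = math.acos(z / rho)
    return rho, theta, phi

# The point from Example 1.31
r, theta, z = to_cylindrical(-2, -2, 1)
rho, theta_s, phi = to_spherical(-2, -2, 1)
```

For the point $(-2, -2, 1)$ this gives $r = 2\sqrt{2}$, $\theta = 5\pi/4$, $\rho = 3$ and $\phi \approx 1.23$, in agreement with the example.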

Sometimes the equation of a surface in Cartesian coordinates can be transformed into a simpler equation in some other coordinate system, as in the following example.

Example 1.32. Write the equation of the cylinder $x^2 + y^2 = 4$ in cylindrical coordinates.

Solution: Since $r = \sqrt{x^2 + y^2}$, then the equation in cylindrical coordinates is $r = 2$.

Using spherical coordinates to write the equation of a sphere does not necessarily make the equation simpler, if the sphere is not centered at the origin.

Example 1.33. Write the equation $(x - 2)^2 + (y - 1)^2 + z^2 = 9$ in spherical coordinates.

Solution: Multiplying the equation out gives

$$x^2 + y^2 + z^2 - 4x - 2y + 5 = 9 ,$$

so we get

$$\rho^2 - 4\rho\sin\phi\cos\theta - 2\rho\sin\phi\sin\theta - 4 = 0 ,$$

or

$$\rho^2 - 2\sin\phi\,(2\cos\theta + \sin\theta)\,\rho - 4 = 0$$

after combining terms. Note that this actually makes it more difficult to figure out what the surface is, as opposed to the Cartesian equation where you could immediately identify the surface as a sphere of radius 3 centered at $(2, 1, 0)$.

Example 1.34. Describe the surface given by $\theta = z$ in cylindrical coordinates.

Solution: This surface is called a helicoid. As the (vertical) $z$ coordinate increases, so does the angle $\theta$, while the radius $r$ is unrestricted. So this sweeps out a (ruled!) surface shaped like a spiral staircase, where the spiral has an infinite radius. Figure 1.7.6 shows a section of this surface restricted to $0 \le z \le 4\pi$ and $0 \le r \le 2$.

[Figure 1.7.6 Helicoid $\theta = z$]

Exercises

A

For Exercises 1-4, find the (a) cylindrical and (b) spherical coordinates of the point whose Cartesian coordinates are given.

1. $(2, 2\sqrt{3}, -1)$

2. $(-5, 5, 6)$

3. $(\sqrt{21}, -\sqrt{7}, 0)$

4. $(0, \sqrt{2}, 2)$

For Exercises 5-7, write the given equation in (a) cylindrical and (b) spherical coordinates.

5. $x^2 + y^2 + z^2 = 25$

6. $x^2 + y^2 = 2y$

7. $x^2 + y^2 + 9z^2 = 36$

B

8. Describe the intersection of the surfaces whose equations in spherical coordinates are $\theta = \frac{\pi}{2}$ and $\phi = \frac{\pi}{4}$.

9. Show that for $a \ne 0$, the equation $\rho = 2a\sin\phi\cos\theta$ in spherical coordinates describes a sphere centered at $(a, 0, 0)$ with radius $|a|$.

C

10. Let $P = (a, \theta, \phi)$ be a point in spherical coordinates, with $a > 0$ and $0 < \phi < \pi$. Then $P$ lies on the sphere $\rho = a$. Since $0 < \phi < \pi$, the line segment from the origin to $P$ can be extended to intersect the cylinder given by $r = a$ (in cylindrical coordinates). Find the cylindrical coordinates of that point of intersection.

11. Let $P_1$ and $P_2$ be points whose spherical coordinates are $(\rho_1, \theta_1, \phi_1)$ and $(\rho_2, \theta_2, \phi_2)$, respectively. Let $\mathbf{v}_1$ be the vector from the origin to $P_1$, and let $\mathbf{v}_2$ be the vector from the origin to $P_2$. For the angle $\gamma$ between $\mathbf{v}_1$ and $\mathbf{v}_2$, show that

$$\cos\gamma = \cos\phi_1\cos\phi_2 + \sin\phi_1\sin\phi_2\cos(\theta_2 - \theta_1).$$

This formula is used in electrodynamics to prove the addition theorem for spherical harmonics, which provides a general expression for the electrostatic potential at a point due to a unit charge. See pp. 100-102 in JACKSON.

12. Show that the distance $d$ between the points $P_1$ and $P_2$ with cylindrical coordinates $(r_1, \theta_1, z_1)$ and $(r_2, \theta_2, z_2)$, respectively, is

$$d = \sqrt{r_1^2 + r_2^2 - 2 r_1 r_2 \cos(\theta_2 - \theta_1) + (z_2 - z_1)^2} .$$

13. Show that the distance $d$ between the points $P_1$ and $P_2$ with spherical coordinates $(\rho_1, \theta_1, \phi_1)$ and $(\rho_2, \theta_2, \phi_2)$, respectively, is

$$d = \sqrt{\rho_1^2 + \rho_2^2 - 2\rho_1\rho_2\left[\sin\phi_1\sin\phi_2\cos(\theta_2 - \theta_1) + \cos\phi_1\cos\phi_2\right]} .$$

1.8 Vector-Valued Functions

Now that we are familiar with vectors and their operations, we can begin discussing functions whose values are vectors.

Definition 1.10. A vector-valued function of a real variable is a rule that associates a vector $\mathbf{f}(t)$ with a real number $t$, where $t$ is in some subset $D$ of $\mathbb{R}^1$ (called the domain of $\mathbf{f}$). We write $\mathbf{f} : D \to \mathbb{R}^3$ to denote that $\mathbf{f}$ is a mapping of $D$ into $\mathbb{R}^3$.

For example, $\mathbf{f}(t) = t\,\mathbf{i} + t^2\,\mathbf{j} + t^3\,\mathbf{k}$ is a vector-valued function in $\mathbb{R}^3$, defined for all real numbers $t$. We would write $\mathbf{f} : \mathbb{R} \to \mathbb{R}^3$. At $t = 1$ the value of the function is the vector $\mathbf{i} + \mathbf{j} + \mathbf{k}$, which in Cartesian coordinates has the terminal point $(1, 1, 1)$.

A vector-valued function of a real variable can be written in component form as

$$\mathbf{f}(t) = f_1(t)\,\mathbf{i} + f_2(t)\,\mathbf{j} + f_3(t)\,\mathbf{k}$$

or in the form

$$\mathbf{f}(t) = (f_1(t), f_2(t), f_3(t))$$

for some real-valued functions $f_1(t)$, $f_2(t)$, $f_3(t)$, called the component functions of $\mathbf{f}$. The first form is often used when emphasizing that $\mathbf{f}(t)$ is a vector, and the second form is useful when considering just the terminal points of the vectors. By identifying vectors with their terminal points, a curve in space can be written as a vector-valued function.

Example 1.35. Define $\mathbf{f} : \mathbb{R} \to \mathbb{R}^3$ by $\mathbf{f}(t) = (\cos t, \sin t, t)$. This is the equation of a helix (see Figure 1.8.1). As the value of $t$ increases, the terminal points of $\mathbf{f}(t)$ trace out a curve spiraling upward. For each $t$, the $x$- and $y$-coordinates of $\mathbf{f}(t)$ are $x = \cos t$ and $y = \sin t$, so

$$x^2 + y^2 = \cos^2 t + \sin^2 t = 1 .$$

Thus, the curve lies on the surface of the right circular cylinder $x^2 + y^2 = 1$.

[Figure 1.8.1]

It may help to think of vector-valued functions of a real variable in $\mathbb{R}^3$ as a generalization of the parametric functions in $\mathbb{R}^2$ which you learned about in single-variable calculus. Much of the theory of real-valued functions of a single real variable can be applied to vector-valued functions of a real variable. Since each of the three component functions is real-valued, it will sometimes be the case that results from single-variable calculus can simply be applied to each of the component functions to yield a similar result for the vector-valued function. However, there are times when such generalizations do not hold (see Exercise 13). The concept of a limit, though, can be extended naturally to vector-valued functions, as in the following definition.

Definition 1.11. Let $\mathbf{f}(t)$ be a vector-valued function, let $a$ be a real number and let $\mathbf{c}$ be a vector. Then we say that the limit of $\mathbf{f}(t)$ as $t$ approaches $a$ equals $\mathbf{c}$, written as $\lim_{t \to a} \mathbf{f}(t) = \mathbf{c}$, if $\lim_{t \to a} \|\mathbf{f}(t) - \mathbf{c}\| = 0$. If $\mathbf{f}(t) = (f_1(t), f_2(t), f_3(t))$, then

$$\lim_{t \to a} \mathbf{f}(t) = \left(\lim_{t \to a} f_1(t),\ \lim_{t \to a} f_2(t),\ \lim_{t \to a} f_3(t)\right)$$

provided that all three limits on the right side exist.

The above definition shows that continuity and the derivative of vector-valued functions can also be defined in terms of its component functions.

Definition 1.12. Let $\mathbf{f}(t) = (f_1(t), f_2(t), f_3(t))$ be a vector-valued function, and let $a$ be a real number in its domain. Then $\mathbf{f}(t)$ is continuous at $a$ if $\lim_{t \to a} \mathbf{f}(t) = \mathbf{f}(a)$. Equivalently, $\mathbf{f}(t)$ is continuous at $a$ if and only if $f_1(t)$, $f_2(t)$, and $f_3(t)$ are continuous at $a$.

The derivative of $\mathbf{f}(t)$ at $a$, denoted by $\mathbf{f}'(a)$ or $\frac{d\mathbf{f}}{dt}(a)$, is the limit

$$\mathbf{f}'(a) = \lim_{h \to 0} \frac{\mathbf{f}(a + h) - \mathbf{f}(a)}{h}$$

if that limit exists. Equivalently, $\mathbf{f}'(a) = (f_1'(a), f_2'(a), f_3'(a))$, if the component derivatives exist. We say that $\mathbf{f}(t)$ is differentiable at $a$ if $\mathbf{f}'(a)$ exists.

Recall that the derivative of a real-valued function of a single variable is a real number, representing the slope of the tangent line to the graph of the function at a point. Similarly, the derivative of a vector-valued function is a tangent vector to the curve in space which the function represents, and it lies on the tangent line to the curve (see Figure 1.8.2).

[Figure 1.8.2 Tangent vector $\mathbf{f}'(a)$ and tangent line $L = \mathbf{f}(a) + s\,\mathbf{f}'(a)$]

Example 1.36. Let $\mathbf{f}(t) = (\cos t, \sin t, t)$. Then $\mathbf{f}'(t) = (-\sin t, \cos t, 1)$ for all $t$. The tangent line $L$ to the curve at $\mathbf{f}(2\pi) = (1, 0, 2\pi)$ is $L = \mathbf{f}(2\pi) + s\,\mathbf{f}'(2\pi) = (1, 0, 2\pi) + s(0, 1, 1)$, or in parametric form: $x = 1$, $y = s$, $z = 2\pi + s$ for $-\infty < s < \infty$.
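As a rough numerical check of Example 1.36, the limit in Definition 1.12 can be approximated componentwise with a small step $h$. The sketch below is an illustration under stated assumptions, not part of the text's method: the helper names `derivative` and `tangent_line` are hypothetical, and a symmetric (central) difference quotient is swapped in for the one-sided quotient of the definition because it is more accurate for small $h$.

```python
import math

def f(t):
    # The helix from Example 1.36
    return (math.cos(t), math.sin(t), t)

def derivative(func, a, h=1e-6):
    """Approximate func'(a) componentwise with a central difference."""
    fwd, back = func(a + h), func(a - h)
    return tuple((p - q) / (2 * h) for p, q in zip(fwd, back))

def tangent_line(func, a):
    """Return the map s -> func(a) + s * func'(a), using the approximation above."""
    point, direction = func(a), derivative(func, a)
    return lambda s: tuple(p + s * d for p, d in zip(point, direction))

approx = derivative(f, 2 * math.pi)   # close to (0, 1, 1)
L = tangent_line(f, 2 * math.pi)
on_line = L(1.0)                      # close to (1, 1, 2*pi + 1)
```

The approximate derivative agrees with $\mathbf{f}'(2\pi) = (0, 1, 1)$ to within rounding and truncation error, and `L(s)` reproduces the parametric form $x = 1$, $y = s$, $z = 2\pi + s$.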

A scalar function is a real-valued function. Note that if $u(t)$ is a scalar function and $\mathbf{f}(t)$ is a vector-valued function, then their product, defined by $(u\,\mathbf{f})(t) = u(t)\,\mathbf{f}(t)$ for all $t$, is a vector-valued function (since the product of a scalar with a vector is a vector).

The basic properties of derivatives of vector-valued functions are summarized in the following theorem.

Theorem 1.20. Let $\mathbf{f}(t)$ and $\mathbf{g}(t)$ be differentiable vector-valued functions, let $u(t)$ be a differentiable scalar function, let $k$ be a scalar, and let $\mathbf{c}$ be a constant vector. Then

(a) $\dfrac{d}{dt}(\mathbf{c}) = \mathbf{0}$

(b) $\dfrac{d}{dt}(k\,\mathbf{f}) = k\,\dfrac{d\mathbf{f}}{dt}$

(c) $\dfrac{d}{dt}(\mathbf{f} + \mathbf{g}) = \dfrac{d\mathbf{f}}{dt} + \dfrac{d\mathbf{g}}{dt}$

(d) $\dfrac{d}{dt}(\mathbf{f} - \mathbf{g}) = \dfrac{d\mathbf{f}}{dt} - \dfrac{d\mathbf{g}}{dt}$

(e) $\dfrac{d}{dt}(u\,\mathbf{f}) = \dfrac{du}{dt}\,\mathbf{f} + u\,\dfrac{d\mathbf{f}}{dt}$

(f) $\dfrac{d}{dt}(\mathbf{f} \cdot \mathbf{g}) = \dfrac{d\mathbf{f}}{dt} \cdot \mathbf{g} + \mathbf{f} \cdot \dfrac{d\mathbf{g}}{dt}$

(g) $\dfrac{d}{dt}(\mathbf{f} \times \mathbf{g}) = \dfrac{d\mathbf{f}}{dt} \times \mathbf{g} + \mathbf{f} \times \dfrac{d\mathbf{g}}{dt}$

Proof: The proofs of parts (a)-(e) follow easily by differentiating the component functions and using the rules for derivatives from single-variable calculus. We will prove part (f), and leave the proof of part (g) as an exercise for the reader.

(f) Write $\mathbf{f}(t) = (f_1(t), f_2(t), f_3(t))$ and $\mathbf{g}(t) = (g_1(t), g_2(t), g_3(t))$, where the component functions $f_1(t)$, $f_2(t)$, $f_3(t)$, $g_1(t)$, $g_2(t)$, $g_3(t)$ are all differentiable real-valued functions. Then

$$\begin{aligned} \frac{d}{dt}(\mathbf{f}(t) \cdot \mathbf{g}(t)) &= \frac{d}{dt}\big(f_1(t)\,g_1(t) + f_2(t)\,g_2(t) + f_3(t)\,g_3(t)\big) \\ &= \frac{d}{dt}\big(f_1(t)\,g_1(t)\big) + \frac{d}{dt}\big(f_2(t)\,g_2(t)\big) + \frac{d}{dt}\big(f_3(t)\,g_3(t)\big) \\ &= \frac{df_1}{dt}(t)\,g_1(t) + f_1(t)\,\frac{dg_1}{dt}(t) + \frac{df_2}{dt}(t)\,g_2(t) + f_2(t)\,\frac{dg_2}{dt}(t) + \frac{df_3}{dt}(t)\,g_3(t) + f_3(t)\,\frac{dg_3}{dt}(t) \\ &= \left(\frac{df_1}{dt}(t), \frac{df_2}{dt}(t), \frac{df_3}{dt}(t)\right) \cdot (g_1(t), g_2(t), g_3(t)) + (f_1(t), f_2(t), f_3(t)) \cdot \left(\frac{dg_1}{dt}(t), \frac{dg_2}{dt}(t), \frac{dg_3}{dt}(t)\right) \\ &= \frac{d\mathbf{f}}{dt}(t) \cdot \mathbf{g}(t) + \mathbf{f}(t) \cdot \frac{d\mathbf{g}}{dt}(t) \quad \text{for all } t. \end{aligned}$$

QED

Example 1.37. Suppose $\mathbf{f}(t)$ is differentiable. Find the derivative of $\|\mathbf{f}(t)\|$.

Solution: Since $\|\mathbf{f}(t)\|$ is a real-valued function of $t$, then by the Chain Rule for real-valued functions, we know that

$$\frac{d}{dt}\|\mathbf{f}(t)\|^2 = 2\,\|\mathbf{f}(t)\|\,\frac{d}{dt}\|\mathbf{f}(t)\| .$$

But $\|\mathbf{f}(t)\|^2 = \mathbf{f}(t) \cdot \mathbf{f}(t)$, so $\frac{d}{dt}\|\mathbf{f}(t)\|^2 = \frac{d}{dt}(\mathbf{f}(t) \cdot \mathbf{f}(t))$. Hence, we have

$$\begin{aligned} 2\,\|\mathbf{f}(t)\|\,\frac{d}{dt}\|\mathbf{f}(t)\| &= \frac{d}{dt}(\mathbf{f}(t) \cdot \mathbf{f}(t)) = \mathbf{f}'(t) \cdot \mathbf{f}(t) + \mathbf{f}(t) \cdot \mathbf{f}'(t) \quad \text{by Theorem 1.20(f), so} \\ &= 2\,\mathbf{f}'(t) \cdot \mathbf{f}(t) , \end{aligned}$$

so if $\|\mathbf{f}(t)\| \ne 0$ then

$$\frac{d}{dt}\|\mathbf{f}(t)\| = \frac{\mathbf{f}'(t) \cdot \mathbf{f}(t)}{\|\mathbf{f}(t)\|} .$$

We know that $\|\mathbf{f}(t)\|$ is constant if and only if $\frac{d}{dt}\|\mathbf{f}(t)\| = 0$ for all $t$. Also, $\mathbf{f}(t) \perp \mathbf{f}'(t)$ if and only if $\mathbf{f}'(t) \cdot \mathbf{f}(t) = 0$. Thus, the above example shows this important fact:

If $\|\mathbf{f}(t)\| \ne 0$, then $\|\mathbf{f}(t)\|$ is constant if and only if $\mathbf{f}(t) \perp \mathbf{f}'(t)$ for all $t$.

This means that if a curve lies completely on a sphere (or circle) centered at the origin, then the tangent vector $\mathbf{f}'(t)$ is always perpendicular to the position vector $\mathbf{f}(t)$.

Example 1.38. The spherical spiral

$$\mathbf{f}(t) = \left(\frac{\cos t}{\sqrt{1 + a^2 t^2}},\ \frac{\sin t}{\sqrt{1 + a^2 t^2}},\ \frac{-at}{\sqrt{1 + a^2 t^2}}\right), \quad \text{for } a \ne 0.$$

Figure 1.8.3 shows the graph of the curve when $a = 0.2$. In the exercises, the reader will be asked to show that this curve lies on the sphere $x^2 + y^2 + z^2 = 1$ and to verify directly that $\mathbf{f}'(t) \cdot \mathbf{f}(t) = 0$ for all $t$.

[Figure 1.8.3 Spherical spiral with $a = 0.2$]

Just as in single-variable calculus, higher-order derivatives of vector-valued functions are obtained by repeatedly differentiating the (first) derivative of the function:

$$\mathbf{f}''(t) = \frac{d}{dt}\mathbf{f}'(t), \quad \mathbf{f}'''(t) = \frac{d}{dt}\mathbf{f}''(t), \quad \ldots, \quad \frac{d^n \mathbf{f}}{dt^n} = \frac{d}{dt}\left(\frac{d^{n-1}\mathbf{f}}{dt^{n-1}}\right) \quad (\text{for } n = 2, 3, 4, \ldots)$$

We can use vector-valued functions to represent physical quantities, such as velocity, acceleration, force, momentum, etc. For example, let the real variable $t$ represent time elapsed from some initial time ($t = 0$), and suppose that an object of constant mass $m$ is subjected to some force so that it moves in space, with its position $(x, y, z)$ at time $t$ a function of $t$. That is, $x = x(t)$, $y = y(t)$, $z = z(t)$ for some real-valued functions $x(t)$, $y(t)$, $z(t)$. Call $\mathbf{r}(t) = (x(t), y(t), z(t))$ the position vector of the object. We can define various physical quantities associated with the object as follows:$^{14}$

position: $\mathbf{r}(t) = (x(t), y(t), z(t))$

velocity: $\mathbf{v}(t) = \dot{\mathbf{r}}(t) = \mathbf{r}'(t) = \dfrac{d\mathbf{r}}{dt} = (x'(t), y'(t), z'(t))$

acceleration: $\mathbf{a}(t) = \dot{\mathbf{v}}(t) = \mathbf{v}'(t) = \dfrac{d\mathbf{v}}{dt} = \ddot{\mathbf{r}}(t) = \mathbf{r}''(t) = \dfrac{d^2\mathbf{r}}{dt^2} = (x''(t), y''(t), z''(t))$

momentum: $\mathbf{p}(t) = m\,\mathbf{v}(t)$

force: $\mathbf{F}(t) = \dot{\mathbf{p}}(t) = \mathbf{p}'(t) = \dfrac{d\mathbf{p}}{dt}$ (Newton’s Second Law of Motion)

The magnitude $\|\mathbf{v}(t)\|$ of the velocity vector is called the speed of the object. Note that since the mass $m$ is a constant, the force equation becomes the familiar $\mathbf{F}(t) = m\,\mathbf{a}(t)$.

Example 1.39. Let $\mathbf{r}(t) = (5\cos t, 3\sin t, 4\sin t)$ be the position vector of an object at time $t \ge 0$. Find its (a) velocity and (b) acceleration vectors.

Solution: (a) $\mathbf{v}(t) = \dot{\mathbf{r}}(t) = (-5\sin t, 3\cos t, 4\cos t)$

(b) $\mathbf{a}(t) = \dot{\mathbf{v}}(t) = (-5\cos t, -3\sin t, -4\sin t)$

Note that $\|\mathbf{r}(t)\| = \sqrt{25\cos^2 t + 25\sin^2 t} = 5$ for all $t$, so by Example 1.37 we know that $\mathbf{r}(t) \cdot \dot{\mathbf{r}}(t) = 0$ for all $t$ (which we can verify from part (a)). In fact, $\|\mathbf{v}(t)\| = 5$ for all $t$ also. And not only does $\mathbf{r}(t)$ lie on the sphere of radius 5 centered at the origin, but perhaps not so obvious is that it lies completely within a circle of radius 5 centered at the origin. Also, note that $\mathbf{a}(t) = -\mathbf{r}(t)$. It turns out (see Exercise 16) that whenever an object moves in a circle with constant speed, the acceleration vector will point in the opposite direction of the position vector (i.e. towards the center of the circle).

$^{14}$ We will often use the older dot notation for derivatives when physics is involved.

Recall from Section 1.5 that if $\mathbf{r}_1$, $\mathbf{r}_2$ are position vectors to distinct points then $\mathbf{r}_1 + t(\mathbf{r}_2 - \mathbf{r}_1)$ represents a line through those two points as $t$ varies over all real numbers. That vector sum can be written as $(1 - t)\mathbf{r}_1 + t\,\mathbf{r}_2$. So the function $\mathbf{l}(t) = (1 - t)\mathbf{r}_1 + t\,\mathbf{r}_2$ is a line through the terminal points of $\mathbf{r}_1$ and $\mathbf{r}_2$, and when $t$ is restricted to the interval $[0, 1]$ it is the line segment between the points, with $\mathbf{l}(0) = \mathbf{r}_1$ and $\mathbf{l}(1) = \mathbf{r}_2$.

In general, a function of the form $\mathbf{f}(t) = (a_1 t + b_1, a_2 t + b_2, a_3 t + b_3)$ represents a line in $\mathbb{R}^3$. A function of the form $\mathbf{f}(t) = (a_1 t^2 + b_1 t + c_1, a_2 t^2 + b_2 t + c_2, a_3 t^2 + b_3 t + c_3)$ represents a (possibly degenerate) parabola in $\mathbb{R}^3$.

Example 1.40. Bézier curves are used in Computer Aided Design (CAD) to approximate the shape of a polygonal path in space (called the Bézier polygon or control polygon). For instance, given three points (or position vectors) $\mathbf{b}_0$, $\mathbf{b}_1$, $\mathbf{b}_2$ in $\mathbb{R}^3$, define

$$\begin{aligned} \mathbf{b}_0^1(t) &= (1 - t)\,\mathbf{b}_0 + t\,\mathbf{b}_1 \\ \mathbf{b}_1^1(t) &= (1 - t)\,\mathbf{b}_1 + t\,\mathbf{b}_2 \\ \mathbf{b}_0^2(t) &= (1 - t)\,\mathbf{b}_0^1(t) + t\,\mathbf{b}_1^1(t) \\ &= (1 - t)^2\,\mathbf{b}_0 + 2t(1 - t)\,\mathbf{b}_1 + t^2\,\mathbf{b}_2 \end{aligned}$$

for all real $t$. For $t$ in the interval $[0, 1]$, we see that $\mathbf{b}_0^1(t)$ is the line segment between $\mathbf{b}_0$ and $\mathbf{b}_1$, and $\mathbf{b}_1^1(t)$ is the line segment between $\mathbf{b}_1$ and $\mathbf{b}_2$. The function $\mathbf{b}_0^2(t)$ is the Bézier curve for the points $\mathbf{b}_0$, $\mathbf{b}_1$, $\mathbf{b}_2$. Note from the last formula that the curve is a parabola that goes through $\mathbf{b}_0$ (when $t = 0$) and $\mathbf{b}_2$ (when $t = 1$).

As an example, let $\mathbf{b}_0 = (0, 0, 0)$, $\mathbf{b}_1 = (1, 2, 3)$, and $\mathbf{b}_2 = (4, 5, 2)$. Then the explicit formula for the Bézier curve is $\mathbf{b}_0^2(t) = (2t + 2t^2, 4t + t^2, 6t - 4t^2)$, as shown in Figure 1.8.4, where the line segments are $\mathbf{b}_0^1(t)$ and $\mathbf{b}_1^1(t)$, and the curve is $\mathbf{b}_0^2(t)$.

[Figure 1.8.4 Bézier curve approximation for three points]
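The repeated linear interpolation in Example 1.40 translates almost directly into code. The following Python sketch is illustrative only: the names `lerp`, `de_casteljau` and `explicit` are hypothetical, and the loop is written for any number of control points even though the example uses three.

```python
def lerp(p, q, t):
    """Linear interpolation (1 - t) * p + t * q between two points."""
    return tuple((1 - t) * a + t * b for a, b in zip(p, q))

def de_casteljau(points, t):
    """Evaluate the Bezier curve of the control points at parameter t
    by repeatedly interpolating between neighboring points."""
    pts = [tuple(p) for p in points]
    while len(pts) > 1:
        pts = [lerp(pts[i], pts[i + 1], t) for i in range(len(pts) - 1)]
    return pts[0]

def explicit(t):
    """Explicit formula from Example 1.40 for the three points below."""
    return (2 * t + 2 * t**2, 4 * t + t**2, 6 * t - 4 * t**2)

control = [(0, 0, 0), (1, 2, 3), (4, 5, 2)]   # b0, b1, b2
midpoint = de_casteljau(control, 0.5)
```

Evaluating `de_casteljau(control, t)` and `explicit(t)` at the same `t` should give the same point up to rounding, and `t = 0` and `t = 1` return the first and last control points, as noted in the example.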

In general, the polygonal path determined by $n \ge 3$ noncollinear points in $\mathbb{R}^3$ can be used to define the Bézier curve recursively by a process called repeated linear interpolation. This curve will be a vector-valued function whose components are polynomials of degree $n - 1$, and its formula is given by de Casteljau’s algorithm.$^{15}$ In the exercises, the reader will be given the algorithm for the case of $n = 4$ points and asked to write the explicit formula for the Bézier curve for the four points shown in Figure 1.8.5.

[Figure 1.8.5 Bézier curve approximation for four points]

Exercises

A

For Exercises 1-4, calculate $\mathbf{f}'(t)$ and find the tangent line at $\mathbf{f}(0)$.

1. $\mathbf{f}(t) = (t + 1, t^2 + 1, t^3 + 1)$

2. $\mathbf{f}(t) = (e^t + 1, e^{2t} + 1, e^{t^2} + 1)$

3. $\mathbf{f}(t) = (\cos 2t, \sin 2t, t)$

4. $\mathbf{f}(t) = (\sin 2t, 2\sin^2 t, 2\cos t)$

For Exercises 5-6, find the velocity $\mathbf{v}(t)$ and acceleration $\mathbf{a}(t)$ of an object with the given position vector $\mathbf{r}(t)$.

5. $\mathbf{r}(t) = (t, t - \sin t, 1 - \cos t)$

6. $\mathbf{r}(t) = (3\cos t, 2\sin t, 1)$

B

7. Let $\mathbf{f}(t) = \left(\dfrac{\cos t}{\sqrt{1 + a^2 t^2}}, \dfrac{\sin t}{\sqrt{1 + a^2 t^2}}, \dfrac{-at}{\sqrt{1 + a^2 t^2}}\right)$, with $a \ne 0$.

(a) Show that $\|\mathbf{f}(t)\| = 1$ for all $t$.

(b) Show directly that $\mathbf{f}'(t) \cdot \mathbf{f}(t) = 0$ for all $t$.

8. If $\mathbf{f}'(t) = \mathbf{0}$ for all $t$ in some interval $(a, b)$, show that $\mathbf{f}(t)$ is a constant vector in $(a, b)$.

$^{15}$ See pp. 27-30 in FARIN.

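9. For a constant vector $\mathbf{c} \ne \mathbf{0}$, the function $\mathbf{f}(t) = t\,\mathbf{c}$ represents a line parallel to $\mathbf{c}$.

(a) What kind of curve does $\mathbf{g}(t) = t^3\,\mathbf{c}$ represent? Explain.

(b) What kind of curve does $\mathbf{h}(t) = e^t\,\mathbf{c}$ represent? Explain.

(c) Compare $\mathbf{f}'(0)$ and $\mathbf{g}'(0)$. Given your answer to part (a), how do you explain the difference in the two derivatives?

10. Show that $\dfrac{d}{dt}\left(\mathbf{f} \times \dfrac{d\mathbf{f}}{dt}\right) = \mathbf{f} \times \dfrac{d^2\mathbf{f}}{dt^2}$.

11. Let a particle of (constant) mass $m$ have position vector $\mathbf{r}(t)$, velocity $\mathbf{v}(t)$, acceleration $\mathbf{a}(t)$ and momentum $\mathbf{p}(t)$ at time $t$. The angular momentum $\mathbf{L}(t)$ of the particle with respect to the origin at time $t$ is defined as $\mathbf{L}(t) = \mathbf{r}(t) \times \mathbf{p}(t)$. If $\mathbf{F}(t)$ is the force acting on the particle at time $t$, then define the torque $\mathbf{N}(t)$ acting on the particle with respect to the origin as $\mathbf{N}(t) = \mathbf{r}(t) \times \mathbf{F}(t)$. Show that $\mathbf{L}'(t) = \mathbf{N}(t)$.

12. Show that $\dfrac{d}{dt}(\mathbf{f} \cdot (\mathbf{g} \times \mathbf{h})) = \dfrac{d\mathbf{f}}{dt} \cdot (\mathbf{g} \times \mathbf{h}) + \mathbf{f} \cdot \left(\dfrac{d\mathbf{g}}{dt} \times \mathbf{h}\right) + \mathbf{f} \cdot \left(\mathbf{g} \times \dfrac{d\mathbf{h}}{dt}\right)$.

13. The Mean Value Theorem does not hold for vector-valued functions: Show that for $\mathbf{f}(t) = (\cos t, \sin t, t)$, there is no $t$ in the interval $(0, 2\pi)$ such that

$$\mathbf{f}'(t) = \frac{\mathbf{f}(2\pi) - \mathbf{f}(0)}{2\pi - 0} .$$

C

14. The Bézier curve $\mathbf{b}_0^3(t)$ for four noncollinear points $\mathbf{b}_0$, $\mathbf{b}_1$, $\mathbf{b}_2$, $\mathbf{b}_3$ in $\mathbb{R}^3$ is defined by the following algorithm (going from the left column to the right):

$$\begin{aligned} \mathbf{b}_0^1(t) &= (1 - t)\,\mathbf{b}_0 + t\,\mathbf{b}_1 & \qquad \mathbf{b}_0^2(t) &= (1 - t)\,\mathbf{b}_0^1(t) + t\,\mathbf{b}_1^1(t) & \qquad \mathbf{b}_0^3(t) &= (1 - t)\,\mathbf{b}_0^2(t) + t\,\mathbf{b}_1^2(t) \\ \mathbf{b}_1^1(t) &= (1 - t)\,\mathbf{b}_1 + t\,\mathbf{b}_2 & \qquad \mathbf{b}_1^2(t) &= (1 - t)\,\mathbf{b}_1^1(t) + t\,\mathbf{b}_2^1(t) & & \\ \mathbf{b}_2^1(t) &= (1 - t)\,\mathbf{b}_2 + t\,\mathbf{b}_3 & & & & \end{aligned}$$

(a) Show that $\mathbf{b}_0^3(t) = (1 - t)^3\,\mathbf{b}_0 + 3t(1 - t)^2\,\mathbf{b}_1 + 3t^2(1 - t)\,\mathbf{b}_2 + t^3\,\mathbf{b}_3$.

(b) Write the explicit formula (as in Example 1.40) for the Bézier curve for the points $\mathbf{b}_0 = (0, 0, 0)$, $\mathbf{b}_1 = (0, 1, 1)$, $\mathbf{b}_2 = (2, 3, 0)$, $\mathbf{b}_3 = (4, 5, 2)$.

15. Let $\mathbf{r}(t)$ be the position vector for a particle moving in $\mathbb{R}^3$. Show that

$$\frac{d}{dt}(\mathbf{r} \times (\mathbf{v} \times \mathbf{r})) = \|\mathbf{r}\|^2\,\mathbf{a} + (\mathbf{r} \cdot \mathbf{v})\,\mathbf{v} - (\|\mathbf{v}\|^2 + \mathbf{r} \cdot \mathbf{a})\,\mathbf{r} .$$

16. Let $\mathbf{r}(t)$ be the position vector in $\mathbb{R}^3$ for a particle that moves with constant speed $c > 0$ in a circle of radius $a > 0$ in the $xy$-plane. Show that $\mathbf{a}(t)$ points in the opposite direction as $\mathbf{r}(t)$ for all $t$. (Hint: Use Example 1.37 to show that $\mathbf{r}(t) \perp \mathbf{v}(t)$ and $\mathbf{a}(t) \perp \mathbf{v}(t)$, and hence $\mathbf{a}(t) \parallel \mathbf{r}(t)$.)

17. Prove Theorem 1.20(g).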

1.9 Arc Length

Let $\mathbf{r}(t) = (x(t), y(t), z(t))$ be the position vector of an object moving in $\mathbb{R}^3$. Since $\|\mathbf{v}(t)\|$ is the speed of the object at time $t$, it seems natural to define the distance $s$ traveled by the object from time $t = a$ to $t = b$ as the definite integral

$$s = \int_a^b \|\mathbf{v}(t)\|\,dt = \int_a^b \sqrt{x'(t)^2 + y'(t)^2 + z'(t)^2}\,dt , \qquad (1.40)$$

which is analogous to the case from single-variable calculus for parametric functions in $\mathbb{R}^2$. This is indeed how we will define the distance traveled and, in general, the arc length of a curve in $\mathbb{R}^3$.

Definition 1.13. Let $\mathbf{f}(t) = (x(t), y(t), z(t))$ be a curve in $\mathbb{R}^3$ whose domain includes the interval $[a, b]$. Suppose that in the interval $(a, b)$ the first derivative of each component function $x(t)$, $y(t)$ and $z(t)$ exists and is continuous, and that no section of the curve is repeated. Then the arc length $L$ of the curve from $t = a$ to $t = b$ is

$$L = \int_a^b \|\mathbf{f}'(t)\|\,dt = \int_a^b \sqrt{x'(t)^2 + y'(t)^2 + z'(t)^2}\,dt \qquad (1.41)$$

A real-valued function whose first derivative is continuous is called continuously differentiable (or a $C^1$ function), and a function whose derivatives of all orders are continuous is called smooth (or a $C^\infty$ function). All the functions we will consider will be smooth. A smooth curve $\mathbf{f}(t)$ is one whose derivative $\mathbf{f}'(t)$ is never the zero vector and whose component functions are all smooth.

Note that we did not prove that the formula in the above definition actually gives the length of a section of a curve. A rigorous proof requires dealing with some subtleties, normally glossed over in calculus texts, which are beyond the scope of this book.$^{16}$

Example 1.41. Find the length $L$ of the helix $\mathbf{f}(t) = (\cos t, \sin t, t)$ from $t = 0$ to $t = 2\pi$.

Solution: By formula (1.41), we have

$$\begin{aligned} L &= \int_0^{2\pi} \sqrt{(-\sin t)^2 + (\cos t)^2 + 1^2}\,dt = \int_0^{2\pi} \sqrt{\sin^2 t + \cos^2 t + 1}\,dt = \int_0^{2\pi} \sqrt{2}\,dt \\ &= \sqrt{2}\,(2\pi - 0) = 2\sqrt{2}\,\pi \end{aligned}$$

Similar to the case in $\mathbb{R}^2$, if there are values of $t$ in the interval $[a, b]$ where the derivative of a component function is not continuous then it is often possible to partition $[a, b]$ into subintervals where all the component functions are continuously differentiable (except at the endpoints, which can be ignored). The sum of the arc lengths over the subintervals will be the arc length over $[a, b]$.

$^{16}$ In particular, Duhamel’s principle is needed. See the proof in TAYLOR and MANN, § 14.2 and § 18.2.
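When the integral in formula (1.41) cannot be evaluated by hand, it can be approximated numerically. The Python sketch below is one possible approach under stated assumptions, not something prescribed by the text: the function name `arc_length` is hypothetical, the derivative $\mathbf{f}'(t)$ is supplied by the caller, and composite Simpson's rule is chosen as the quadrature method.

```python
import math

def arc_length(df, a, b, n=1000):
    """Approximate the integral of ||f'(t)|| over [a, b] with composite
    Simpson's rule. df(t) must return the derivative vector f'(t)."""
    if n % 2:
        n += 1                      # Simpson's rule needs an even n
    h = (b - a) / n

    def speed(t):
        return math.sqrt(sum(c * c for c in df(t)))

    total = speed(a) + speed(b)
    for i in range(1, n):
        weight = 4 if i % 2 else 2  # interior weights alternate 4, 2, 4, ...
        total += weight * speed(a + i * h)
    return total * h / 3

def helix_derivative(t):
    # f'(t) for the helix f(t) = (cos t, sin t, t) of Example 1.41
    return (-math.sin(t), math.cos(t), 1.0)

L = arc_length(helix_derivative, 0.0, 2 * math.pi)
```

For the helix the speed is the constant $\sqrt{2}$, so the approximation matches the exact value $2\sqrt{2}\,\pi$ up to rounding; for curves with varying speed the accuracy depends on `n`.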

Notice that the curve traced out by the function $\mathbf{f}(t) = (\cos t, \sin t, t)$ from Example 1.41 is also traced out by the function $\mathbf{g}(t) = (\cos 2t, \sin 2t, 2t)$. For example, over the interval $[0, \pi]$, $\mathbf{g}(t)$ traces out the same section of the curve as $\mathbf{f}(t)$ does over the interval $[0, 2\pi]$. Intuitively, this says that $\mathbf{g}(t)$ traces the curve twice as fast as $\mathbf{f}(t)$. This makes sense since, viewing the functions as position vectors and their derivatives as velocity vectors, the speeds of $\mathbf{f}(t)$ and $\mathbf{g}(t)$ are $\|\mathbf{f}'(t)\| = \sqrt{2}$ and $\|\mathbf{g}'(t)\| = 2\sqrt{2}$, respectively. We say that $\mathbf{g}(t)$ and $\mathbf{f}(t)$ are different parametrizations of the same curve.

Definition 1.14. Let $C$ be a smooth curve in $\mathbb{R}^3$ represented by a function $\mathbf{f}(t)$ defined on an interval $[a, b]$, and let $\alpha : [c, d] \to [a, b]$ be a smooth one-to-one mapping of an interval $[c, d]$ onto $[a, b]$. Then the function $\mathbf{g} : [c, d] \to \mathbb{R}^3$ defined by $\mathbf{g}(s) = \mathbf{f}(\alpha(s))$ is a parametrization of $C$ with parameter $s$. If $\alpha$ is strictly increasing on $[c, d]$ then we say that $\mathbf{g}(s)$ is equivalent to $\mathbf{f}(t)$.

$$[c, d] \xrightarrow{\ \alpha\ } [a, b] \xrightarrow{\ \mathbf{f}\ } \mathbb{R}^3, \qquad s \mapsto t = \alpha(s) \mapsto \mathbf{g}(s) = \mathbf{f}(\alpha(s)) = \mathbf{f}(t)$$

Note that the differentiability of $\mathbf{g}(s)$ follows from a version of the Chain Rule for vector-valued functions (the proof is left as an exercise):

Theorem 1.21. Chain Rule: If $\mathbf{f}(t)$ is a differentiable vector-valued function of $t$, and $t = \alpha(s)$ is a differentiable scalar function of $s$, then $\mathbf{f}(s) = \mathbf{f}(\alpha(s))$ is a differentiable vector-valued function of $s$, and

$$\frac{d\mathbf{f}}{ds} = \frac{d\mathbf{f}}{dt}\,\frac{dt}{ds} \qquad (1.42)$$

for any $s$ where the composite function $\mathbf{f}(\alpha(s))$ is defined.

Example 1.42. The following are all equivalent parametrizations of the same curve:

$\mathbf{f}(t) = (\cos t, \sin t, t)$ for $t$ in $[0, 2\pi]$

$\mathbf{g}(s) = (\cos 2s, \sin 2s, 2s)$ for $s$ in $[0, \pi]$

$\mathbf{h}(s) = (\cos 2\pi s, \sin 2\pi s, 2\pi s)$ for $s$ in $[0, 1]$

To see that $\mathbf{g}(s)$ is equivalent to $\mathbf{f}(t)$, define $\alpha : [0, \pi] \to [0, 2\pi]$ by $\alpha(s) = 2s$. Then $\alpha$ is smooth, one-to-one, maps $[0, \pi]$ onto $[0, 2\pi]$, and is strictly increasing (since $\alpha'(s) = 2 > 0$ for all $s$). Likewise, defining $\alpha : [0, 1] \to [0, 2\pi]$ by $\alpha(s) = 2\pi s$ shows that $\mathbf{h}(s)$ is equivalent to $\mathbf{f}(t)$.

A curve can have many parametrizations, with different speeds, so which one is the best to use? In some situations the arc length parametrization can be useful. The idea behind this is to replace the parameter $t$, for any given smooth parametrization $\mathbf{f}(t)$ defined on $[a, b]$, by the parameter $s$ given by

$$s = s(t) = \int_a^t \|\mathbf{f}'(u)\|\,du . \qquad (1.43)$$

In terms of motion along a curve, $s$ is the distance traveled along the curve after time $t$ has elapsed. So the new parameter will be distance instead of time. There is a natural correspondence between $s$ and $t$: from a starting point on the curve, the distance traveled along the curve (in one direction) is uniquely determined by the amount of time elapsed, and vice versa.

Since $s$ is the arc length of the curve over the interval $[a, t]$ for each $t$ in $[a, b]$, then it is a function of $t$. By the Fundamental Theorem of Calculus, its derivative is

$$s'(t) = \frac{ds}{dt} = \frac{d}{dt}\int_a^t \|\mathbf{f}'(u)\|\,du = \|\mathbf{f}'(t)\| \quad \text{for all } t \text{ in } [a, b].$$

Since $\mathbf{f}(t)$ is smooth, then $\|\mathbf{f}'(t)\| > 0$ for all $t$ in $[a, b]$. Thus $s'(t) > 0$ and hence $s(t)$ is strictly increasing on the interval $[a, b]$. Recall that this means that $s$ is a one-to-one mapping of the interval $[a, b]$ onto the interval $[s(a), s(b)]$. But we see that

$$s(a) = \int_a^a \|\mathbf{f}'(u)\|\,du = 0 \quad \text{and} \quad s(b) = \int_a^b \|\mathbf{f}'(u)\|\,du = L = \text{arc length from } t = a \text{ to } t = b$$

So the function $s : [a, b] \to [0, L]$ is a one-to-one, differentiable mapping onto the interval $[0, L]$. From single-variable calculus, we know that this means that there exists an inverse function $\alpha : [0, L] \to [a, b]$ that is differentiable and the inverse of $s : [a, b] \to [0, L]$. That is, for each $t$ in $[a, b]$ there is a unique $s$ in $[0, L]$ such that $s = s(t)$ and $t = \alpha(s)$. And we know that the derivative of $\alpha$ is

$$\alpha'(s) = \frac{1}{s'(\alpha(s))} = \frac{1}{\|\mathbf{f}'(\alpha(s))\|}$$

[Figure 1.9.1 $t = \alpha(s)$]

So define the arc length parametrization $\mathbf{f} : [0, L] \to \mathbb{R}^3$ by $\mathbf{f}(s) = \mathbf{f}(\alpha(s))$ for all $s$ in $[0, L]$. Then $\mathbf{f}(s)$ is smooth, by the Chain Rule. In fact, $\mathbf{f}(s)$ has unit speed:

$$\mathbf{f}'(s) = \mathbf{f}'(\alpha(s))\,\alpha'(s) \ \text{ by the Chain Rule, so } \ \mathbf{f}'(s) = \mathbf{f}'(\alpha(s))\,\frac{1}{\|\mathbf{f}'(\alpha(s))\|}, \ \text{ so } \ \|\mathbf{f}'(s)\| = 1 \ \text{ for all } s \text{ in } [0, L].$$

So the arc length parametrization traverses the curve at a “normal” rate.
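The construction above (integrate the speed to get $s(t)$, then invert to get $t = \alpha(s)$) can be imitated numerically. The sketch below is only an illustration under several assumptions: it uses the helix $\mathbf{f}(t) = (\cos t, \sin t, t)$ on $[0, 2\pi]$ as the test curve, the names `speed`, `s_of_t`, `alpha` and `f_bar` are hypothetical, the integral is approximated by the trapezoid rule, and the inverse is found by bisection, which works because $s(t)$ is strictly increasing.

```python
import math

def f(t):
    # Test curve: the helix f(t) = (cos t, sin t, t)
    return (math.cos(t), math.sin(t), t)

def speed(t):
    # ||f'(t)|| for the helix, with f'(t) = (-sin t, cos t, 1)
    return math.sqrt(math.sin(t) ** 2 + math.cos(t) ** 2 + 1.0)

def s_of_t(t, a=0.0, n=200):
    """Trapezoid-rule approximation of s(t), the integral of speed over [a, t]."""
    h = (t - a) / n
    total = 0.5 * (speed(a) + speed(t))
    for i in range(1, n):
        total += speed(a + i * h)
    return total * h

def alpha(s, a=0.0, b=2 * math.pi, tol=1e-12):
    """Invert s(t) by bisection on [a, b]; valid since s is strictly increasing."""
    lo, hi = a, b
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if s_of_t(mid) < s:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def f_bar(s):
    """Arc length parametrization: the curve evaluated at t = alpha(s)."""
    return f(alpha(s))
```

For the helix, $s(t) = \sqrt{2}\,t$, so `alpha(s)` should return approximately $s/\sqrt{2}$, and a difference quotient of `f_bar` should have norm close to 1, reflecting the unit speed derived above.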

In practice, parametrizing a curve $\mathbf{f}(t)$ by arc length requires you to evaluate the integral $s = \int_a^t \|\mathbf{f}'(u)\|\,du$ in some closed form (as a function of $t$) so that you could then solve for $t$ in terms of $s$. If that can be done, you would then substitute the expression for $t$ in terms of $s$ (which we called $\alpha(s)$) into the formula for $\mathbf{f}(t)$ to get $\mathbf{f}(s)$.

Example 1.43. Parametrize the helix $\mathbf{f}(t) = (\cos t, \sin t, t)$, for $t$ in $[0, 2\pi]$, by arc length.

Solution: By Example 1.41 and formula (1.43), we have

$$s = \int_0^t \|\mathbf{f}'(u)\|\,du = \int_0^t \sqrt{2}\,du = \sqrt{2}\,t \quad \text{for all } t \text{ in } [0, 2\pi].$$

So we can solve for $t$ in terms of $s$: $t = \alpha(s) = \dfrac{s}{\sqrt{2}}$.

$$\therefore \mathbf{f}(s) = \left(\cos\frac{s}{\sqrt{2}},\ \sin\frac{s}{\sqrt{2}},\ \frac{s}{\sqrt{2}}\right) \quad \text{for all } s \text{ in } [0, 2\sqrt{2}\,\pi].$$

Note that $\|\mathbf{f}'(s)\| = 1$.

Arc length plays an important role when discussing curvature and moving frame fields, in the field of mathematics known as differential geometry.$^{17}$ The methods involve using an arc length parametrization, which often leads to an integral that is either difficult or impossible to evaluate in a simple closed form. The simple integral in Example 1.43 is the exception, not the norm. In general, arc length parametrizations are more useful for theoretical purposes than for practical computations.$^{18}$ Curvature and moving frame fields can be defined without using arc length, which makes their computation much easier, and these definitions can be shown to be equivalent to those using arc length. We will leave this to the exercises.

$^{17}$ See O’NEILL for an introduction to elementary differential geometry.

$^{18}$ For example, the usual parametrizations of Bézier curves, which we discussed in Section 1.8, are polynomial functions in $\mathbb{R}^3$. This makes their computation relatively simple, which, in CAD, is desirable. But their arc length parametrizations are not only not polynomials, they are in fact usually impossible to calculate at all.

The arc length for curves given in other coordinate systems can also be calculated:

Theorem 1.22. Suppose that $r = r(t)$, $\theta = \theta(t)$ and $z = z(t)$ are the cylindrical coordinates of a curve $\mathbf{f}(t)$, for $t$ in $[a, b]$. Then the arc length $L$ of the curve over $[a, b]$ is

$$L = \int_a^b \sqrt{r'(t)^2 + r(t)^2\,\theta'(t)^2 + z'(t)^2}\,dt \qquad (1.44)$$

Proof: The Cartesian coordinates $(x(t), y(t), z(t))$ of a point on the curve are given by

$$x(t) = r(t)\cos\theta(t), \quad y(t) = r(t)\sin\theta(t), \quad z(t) = z(t)$$

so differentiating the above expressions for $x(t)$ and $y(t)$ with respect to $t$ gives

$$x'(t) = r'(t)\cos\theta(t) - r(t)\,\theta'(t)\sin\theta(t), \quad y'(t) = r'(t)\sin\theta(t) + r(t)\,\theta'(t)\cos\theta(t)$$

and so

$$\begin{aligned} x'(t)^2 + y'(t)^2 &= \big(r'(t)\cos\theta(t) - r(t)\,\theta'(t)\sin\theta(t)\big)^2 + \big(r'(t)\sin\theta(t) + r(t)\,\theta'(t)\cos\theta(t)\big)^2 \\ &= r'(t)^2(\cos^2\theta + \sin^2\theta) + r(t)^2\,\theta'(t)^2(\cos^2\theta + \sin^2\theta) \\ &\qquad - 2\,r'(t)\,r(t)\,\theta'(t)\cos\theta\sin\theta + 2\,r'(t)\,r(t)\,\theta'(t)\cos\theta\sin\theta \\ &= r'(t)^2 + r(t)^2\,\theta'(t)^2 , \end{aligned}$$

and so

$$L = \int_a^b \sqrt{x'(t)^2 + y'(t)^2 + z'(t)^2}\,dt = \int_a^b \sqrt{r'(t)^2 + r(t)^2\,\theta'(t)^2 + z'(t)^2}\,dt$$

QED

Example 1.44. Find the arc length $L$ of the curve whose cylindrical coordinates are $r = e^t$, $\theta = t$ and $z = e^t$, for $t$ over the interval $[0, 1]$.

Solution: Since $r'(t) = e^t$, $\theta'(t) = 1$ and $z'(t) = e^t$, then

$$\begin{aligned} L &= \int_0^1 \sqrt{r'(t)^2 + r(t)^2\,\theta'(t)^2 + z'(t)^2}\,dt \\ &= \int_0^1 \sqrt{e^{2t} + e^{2t}(1) + e^{2t}}\,dt \\ &= \int_0^1 e^t\sqrt{3}\,dt = \sqrt{3}\,(e - 1) \end{aligned}$$

Exercises

A

For Exercises 1-3, calculate the arc length of $\mathbf{f}(t)$ over the given interval.

1. $\mathbf{f}(t) = (3\cos 2t, 3\sin 2t, 3t)$ on $[0, \pi/2]$

2. $\mathbf{f}(t) = ((t^2 + 1)\cos t, (t^2 + 1)\sin t, 2\sqrt{2}\,t)$ on $[0, 1]$

3. $\mathbf{f}(t) = (2\cos 3t, 2\sin 3t, 2t^{3/2})$ on $[0, 1]$

4. Parametrize the curve from Exercise 1 by arc length.

5. Parametrize the curve from Exercise 3 by arc length.

B

6. Let $\mathbf{f}(t)$ be a differentiable curve such that $\mathbf{f}(t) \ne \mathbf{0}$ for all $t$. Show that

$$\frac{d}{dt}\left(\frac{\mathbf{f}(t)}{\|\mathbf{f}(t)\|}\right) = \frac{\mathbf{f}(t) \times (\mathbf{f}'(t) \times \mathbf{f}(t))}{\|\mathbf{f}(t)\|^3} .$$

Exercises 7-9 develop the moving frame field $\mathbf{T}$, $\mathbf{N}$, $\mathbf{B}$ at a point on a curve.

7. Let $\mathbf{f}(t)$ be a smooth curve such that $\mathbf{f}'(t) \ne \mathbf{0}$ for all $t$. Then we can define the unit tangent vector $\mathbf{T}$ by

$$\mathbf{T}(t) = \frac{\mathbf{f}'(t)}{\|\mathbf{f}'(t)\|} .$$

Show that

$$\mathbf{T}'(t) = \frac{\mathbf{f}'(t) \times (\mathbf{f}''(t) \times \mathbf{f}'(t))}{\|\mathbf{f}'(t)\|^3} .$$

8. Continuing Exercise 7, assume that $\mathbf{f}'(t)$ and $\mathbf{f}''(t)$ are not parallel. Then $\mathbf{T}'(t) \ne \mathbf{0}$ so we can define the unit principal normal vector $\mathbf{N}$ by

$$\mathbf{N}(t) = \frac{\mathbf{T}'(t)}{\|\mathbf{T}'(t)\|} .$$

Show that

$$\mathbf{N}(t) = \frac{\mathbf{f}'(t) \times (\mathbf{f}''(t) \times \mathbf{f}'(t))}{\|\mathbf{f}'(t)\|\,\|\mathbf{f}''(t) \times \mathbf{f}'(t)\|} .$$

9. Continuing Exercise 8, the unit binormal vector $\mathbf{B}$ is defined by

$$\mathbf{B}(t) = \mathbf{T}(t) \times \mathbf{N}(t) .$$

Show that

$$\mathbf{B}(t) = \frac{\mathbf{f}'(t) \times \mathbf{f}''(t)}{\|\mathbf{f}'(t) \times \mathbf{f}''(t)\|} .$$

Note: The vectors $\mathbf{T}(t)$, $\mathbf{N}(t)$ and $\mathbf{B}(t)$ form a right-handed system of mutually perpendicular unit vectors (called orthonormal vectors) at each point on the curve $\mathbf{f}(t)$.

10. Continuing Exercise 9, the curvature $\kappa$ is defined by

$$\kappa(t) = \frac{\|\mathbf{T}'(t)\|}{\|\mathbf{f}'(t)\|} = \frac{\|\mathbf{f}'(t) \times (\mathbf{f}''(t) \times \mathbf{f}'(t))\|}{\|\mathbf{f}'(t)\|^4} .$$

Show that

$$\kappa(t) = \frac{\|\mathbf{f}'(t) \times \mathbf{f}''(t)\|}{\|\mathbf{f}'(t)\|^3}$$

and that $\mathbf{T}'(t) = \|\mathbf{f}'(t)\|\,\kappa(t)\,\mathbf{N}(t)$.

Note: $\kappa(t)$ gives a sense of how “curved” the curve $\mathbf{f}(t)$ is at each point.

11. Find $\mathbf{T}$, $\mathbf{N}$, $\mathbf{B}$ and $\kappa$ at each point of the helix $\mathbf{f}(t) = (\cos t, \sin t, t)$.

12. Show that the arc length $L$ of a curve whose spherical coordinates are $\rho = \rho(t)$, $\theta = \theta(t)$ and $\phi = \phi(t)$ for $t$ in an interval $[a, b]$ is

$$L = \int_a^b \sqrt{\rho'(t)^2 + (\rho(t)^2\sin^2\phi(t))\,\theta'(t)^2 + \rho(t)^2\,\phi'(t)^2}\,dt .$$

2 Functions of Several Variables

2.1 Functions of Two or Three Variables

In Section 1.8 we discussed vector-valued functions of a single real variable. We will now examine real-valued functions of a point (or vector) in $\mathbb{R}^2$ or $\mathbb{R}^3$. For the most part these functions will be defined on sets of points in $\mathbb{R}^2$, but there will be times when we will use points in $\mathbb{R}^3$, and there will also be times when it will be convenient to think of the points as vectors (or terminal points of vectors).

A real-valued function $f$ defined on a subset $D$ of $\mathbb{R}^2$ is a rule that assigns to each point $(x, y)$ in $D$ a real number $f(x, y)$. The largest possible set $D$ in $\mathbb{R}^2$ on which $f$ is defined is called the domain of $f$, and the range of $f$ is the set of all real numbers $f(x, y)$ as $(x, y)$ varies over the domain $D$. A similar definition holds for functions $f(x, y, z)$ defined on points $(x, y, z)$ in $\mathbb{R}^3$.

Example 2.1. The domain of the function $f(x, y) = xy$ is all of $\mathbb{R}^2$, and the range of $f$ is all of $\mathbb{R}$.

Example 2.2. The domain of the function $f(x, y) = \dfrac{1}{x - y}$ is all of $\mathbb{R}^2$ except the points $(x, y)$ for which $x = y$. That is, the domain is the set $D = \{(x, y) : x \ne y\}$. The range of $f$ is all real numbers except 0.

Example 2.3. The domain of the function $f(x, y) = \sqrt{1 - x^2 - y^2}$ is the set $D = \{(x, y) : x^2 + y^2 \le 1\}$, since the quantity inside the square root is nonnegative if and only if $1 - (x^2 + y^2) \ge 0$. We see that $D$ consists of all points on and inside the unit circle in $\mathbb{R}^2$ ($D$ is sometimes called the closed unit disk). The range of $f$ is the interval $[0, 1]$ in $\mathbb{R}$.

Example 2.4. The domain of the function $f(x, y, z) = e^{x+y-z}$ is all of $\mathbb{R}^3$, and the range of $f$ is all positive real numbers.

A function $f(x, y)$ defined in $\mathbb{R}^2$ is often written as $z = f(x, y)$, as was mentioned in Section 1.1, so that the graph of $f(x, y)$ is the set $\{(x, y, z) : z = f(x, y)\}$ in $\mathbb{R}^3$. So we see that this graph is a surface in $\mathbb{R}^3$, since it satisfies an equation of the form $F(x, y, z) = 0$ (namely, $F(x, y, z) = f(x, y) - z$). The traces of this surface in the planes $z = c$, where $c$ varies over $\mathbb{R}$, are called the level curves of the function. Equivalently, the level curves are the solution sets of the equations $f(x, y) = c$, for $c$ in $\mathbb{R}$. Level curves are often projected onto the $xy$-plane to give an idea of the various "elevation" levels of the surface (as is done in topography).

Example 2.5. The graph of the function
\[ f(x, y) = \frac{\sin\sqrt{x^2 + y^2}}{\sqrt{x^2 + y^2}} \]
is shown below. Note that the level curves (shown both on the surface and projected onto the $xy$-plane) are groups of concentric circles.

[Figure 2.1.1: The function $f(x, y) = \dfrac{\sin\sqrt{x^2 + y^2}}{\sqrt{x^2 + y^2}}$]

You may be wondering what happens to the function in Example 2.5 at the point $(x, y) = (0, 0)$, since both the numerator and denominator are 0 at that point. The function is not defined at $(0, 0)$, but the limit of the function exists (and equals 1) as $(x, y)$ approaches $(0, 0)$. We will now state explicitly what is meant by the limit of a function of two variables.
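The two claims about the function of Example 2.5 (its level curves are circles centered at the origin, and its values approach 1 near the origin) can be spot-checked numerically. The following is only a rough sketch, not part of the original text: the sample radius, angles and step sizes are arbitrary choices, and a numerical check of this kind illustrates the limit but does not prove it.

```python
import math

def f(x, y):
    """f(x, y) = sin(r)/r with r = sqrt(x^2 + y^2); undefined at the origin."""
    r = math.hypot(x, y)
    return math.sin(r) / r

# Points on the circle of radius 2 all give the same value, so that
# circle lies inside a single level curve of f.
circle_values = [f(2 * math.cos(t), 2 * math.sin(t)) for t in (0.0, 1.0, 2.5, 4.0)]

# Values along the path y = x as (x, y) moves toward (0, 0).
near_origin = [f(h, h) for h in (1e-1, 1e-2, 1e-3)]

print(circle_values)
print(near_origin)
```

The first list should show one repeated value (up to rounding), and the second should move toward 1.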

Definition 2.1. Let $(a, b)$ be a point in $\mathbb{R}^2$, and let $f(x, y)$ be a real-valued function defined on some set containing $(a, b)$ (but not necessarily defined at $(a, b)$ itself). Then we say that the limit of $f(x, y)$ equals $L$ as $(x, y)$ approaches $(a, b)$, written as

\[ \lim_{(x,y)\to(a,b)} f(x, y) = L , \tag{2.1} \]

if given any $\epsilon > 0$, there exists a $\delta > 0$ such that

\[ |f(x, y) - L| < \epsilon \quad\text{whenever}\quad 0 < \sqrt{(x - a)^2 + (y - b)^2} < \delta . \]

A similar definition can be made for functions of three variables. The idea behind the above definition is that the values of $f(x, y)$ can get arbitrarily close to $L$ (i.e. within $\epsilon$ of $L$) if we pick $(x, y)$ sufficiently close to $(a, b)$ (i.e. inside a circle centered at $(a, b)$ with some sufficiently small radius $\delta$).

If you recall the "epsilon-delta" proofs of limits of real-valued functions of a single variable, you may remember how awkward they can be, and how they can usually only be done easily for simple functions. In general, the multivariable cases are at least equally awkward to go through, so we will not bother with such proofs. Instead, we will simply state that when the function $f(x, y)$ is given by a single formula and is defined at the point $(a, b)$ (e.g. is not some indeterminate form like 0/0) then you can just substitute $(x, y) = (a, b)$ into the formula for $f(x, y)$ to find the limit.

Example 2.6.
\[ \lim_{(x,y)\to(1,2)} \frac{xy}{x^2 + y^2} = \frac{(1)(2)}{1^2 + 2^2} = \frac{2}{5} \]
since $f(x, y) = \dfrac{xy}{x^2 + y^2}$ is properly defined at the point $(1, 2)$.

The major difference between limits in one variable and limits in two or more variables has to do with how a point is approached. In the single-variable case, the statement "$x \to a$" means that $x$ gets closer to the value $a$ from two possible directions along the real number line (see Figure 2.1.2(a)). In two dimensions, however, $(x, y)$ can approach a point $(a, b)$ along an infinite number of paths (see Figure 2.1.2(b)).

[Figure 2.1.2: "Approaching" a point in different dimensions. (a) $x \to a$ in $\mathbb{R}$; (b) $(x, y) \to (a, b)$ in $\mathbb{R}^2$]

Example 2.7.
\[ \lim_{(x,y)\to(0,0)} \frac{xy}{x^2 + y^2} \quad\text{does not exist.} \]

Note that we can not simply substitute $(x, y) = (0, 0)$ into the function, since doing so gives an indeterminate form 0/0. To show that the limit does not exist, we will show that the function approaches different values as $(x, y)$ approaches $(0, 0)$ along different paths in $\mathbb{R}^2$. To see this, suppose that $(x, y) \to (0, 0)$ along the positive $x$-axis, so that $y = 0$ along that path. Then

\[ f(x, y) = \frac{xy}{x^2 + y^2} = \frac{x \cdot 0}{x^2 + 0^2} = 0 \]

along that path (since $x > 0$ in the denominator). But if $(x, y) \to (0, 0)$ along the straight line $y = x$ through the origin, for $x > 0$, then we see that

\[ f(x, y) = \frac{xy}{x^2 + y^2} = \frac{x^2}{x^2 + x^2} = \frac{1}{2} , \]

which means that $f(x, y)$ approaches different values as $(x, y) \to (0, 0)$ along different paths. Hence the limit does not exist.

Limits of real-valued multivariable functions obey the same algebraic rules as in the single-variable case, as shown in the following theorem, which we state without proof.

Theorem 2.1. Suppose that $\lim_{(x,y)\to(a,b)} f(x, y)$ and $\lim_{(x,y)\to(a,b)} g(x, y)$ both exist, and that $k$ is some scalar. Then:

(a) $\lim_{(x,y)\to(a,b)} [f(x, y) \pm g(x, y)] = \lim_{(x,y)\to(a,b)} f(x, y) \pm \lim_{(x,y)\to(a,b)} g(x, y)$

(b) $\lim_{(x,y)\to(a,b)} k\,f(x, y) = k \lim_{(x,y)\to(a,b)} f(x, y)$

(c) $\lim_{(x,y)\to(a,b)} [f(x, y)\,g(x, y)] = \left(\lim_{(x,y)\to(a,b)} f(x, y)\right)\left(\lim_{(x,y)\to(a,b)} g(x, y)\right)$

(d) $\lim_{(x,y)\to(a,b)} \dfrac{f(x, y)}{g(x, y)} = \dfrac{\lim_{(x,y)\to(a,b)} f(x, y)}{\lim_{(x,y)\to(a,b)} g(x, y)}$ if $\lim_{(x,y)\to(a,b)} g(x, y) \ne 0$

(e) If $|f(x, y) - L| \le g(x, y)$ for all $(x, y)$ and if $\lim_{(x,y)\to(a,b)} g(x, y) = 0$, then $\lim_{(x,y)\to(a,b)} f(x, y) = L$.

Note that in part (e), it suffices to have $|f(x, y) - L| \le g(x, y)$ for all $(x, y)$ "sufficiently close" to $(a, b)$ (but excluding $(a, b)$ itself).
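The path argument of Example 2.7 can be illustrated numerically. The sketch below is an illustration only (the step sizes are an arbitrary choice, and the variable names are not from the text): it evaluates $f(x, y) = xy/(x^2 + y^2)$ at points approaching the origin along the positive $x$-axis and along the line $y = x$.

```python
def f(x, y):
    """f(x, y) = xy / (x^2 + y^2), undefined at (0, 0)."""
    return x * y / (x * x + y * y)

# Step sizes shrinking toward 0 (an arbitrary illustrative choice).
steps = [10.0 ** (-k) for k in range(1, 6)]

along_x_axis = [f(h, 0.0) for h in steps]    # path y = 0, x > 0
along_diagonal = [f(h, h) for h in steps]    # path y = x, x > 0

print(along_x_axis)    # values stay at 0 on this path
print(along_diagonal)  # values stay at 1/2 on this path
```

Because the two paths give different constant values however close the points are to $(0, 0)$, no single number $L$ can serve as the limit.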

Example 2.8. Show that
\[ \lim_{(x,y)\to(0,0)} \frac{y^4}{x^2 + y^2} = 0 . \]

Since substituting $(x, y) = (0, 0)$ into the function gives the indeterminate form 0/0, we need an alternate method for evaluating this limit. We will use Theorem 2.1(e). First, notice that $y^4 = \left(\sqrt{y^2}\right)^4$ and so $0 \le y^4 \le \left(\sqrt{x^2 + y^2}\right)^4$ for all $(x, y)$. But $\left(\sqrt{x^2 + y^2}\right)^4 = (x^2 + y^2)^2$. Thus, for all $(x, y) \ne (0, 0)$ we have

\[ \left|\frac{y^4}{x^2 + y^2}\right| \le \frac{(x^2 + y^2)^2}{x^2 + y^2} = x^2 + y^2 \to 0 \quad\text{as } (x, y) \to (0, 0) . \]

Therefore $\displaystyle\lim_{(x,y)\to(0,0)} \frac{y^4}{x^2 + y^2} = 0$.

Continuity can be defined similarly as in the single-variable case.

Definition 2.2. A real-valued function $f(x, y)$ with domain $D$ in $\mathbb{R}^2$ is continuous at the point $(a, b)$ in $D$ if $\lim_{(x,y)\to(a,b)} f(x, y) = f(a, b)$. We say that $f(x, y)$ is a continuous function if it is continuous at every point in its domain $D$.

Unless indicated otherwise, you can assume that all the functions we deal with are continuous. In fact, we can modify the function from Example 2.8 so that it is continuous on all of $\mathbb{R}^2$.

Example 2.9. Define a function $f(x, y)$ on all of $\mathbb{R}^2$ as follows:

\[ f(x, y) = \begin{cases} 0 & \text{if } (x, y) = (0, 0) \\[4pt] \dfrac{y^4}{x^2 + y^2} & \text{if } (x, y) \ne (0, 0) \end{cases} \]

Then $f(x, y)$ is well-defined for all $(x, y)$ in $\mathbb{R}^2$ (i.e. there are no indeterminate forms for any $(x, y)$), and we see that

\[ \lim_{(x,y)\to(a,b)} f(x, y) = \frac{b^4}{a^2 + b^2} = f(a, b) \quad\text{for } (a, b) \ne (0, 0) . \]

So since
\[ \lim_{(x,y)\to(0,0)} f(x, y) = 0 = f(0, 0) \]
by Example 2.8, then $f(x, y)$ is continuous on all of $\mathbb{R}^2$.
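The bounding argument used in Example 2.8, namely $|y^4/(x^2 + y^2)| \le x^2 + y^2$, can be spot-checked on sample points. This is a rough sketch under stated assumptions: the radii and angles are arbitrary, the name `g` for the bounding function is introduced here for illustration, and a small tolerance is allowed for floating-point rounding. It supports the inequality on the sampled points only and is no substitute for the proof.

```python
import math

def f(x, y):
    """f(x, y) = y^4 / (x^2 + y^2), defined for (x, y) != (0, 0)."""
    return y**4 / (x**2 + y**2)

def g(x, y):
    """Bounding function g(x, y) = x^2 + y^2 from the solution of Example 2.8."""
    return x**2 + y**2

# Largest |f| found on sampled circles of shrinking radius r around the origin.
worst = {}
for r in (1e-1, 1e-2, 1e-3):
    pts = [(r * math.cos(0.7 * k), r * math.sin(0.7 * k)) for k in range(9)]
    # The inequality |f| <= g should hold at every sampled point.
    assert all(abs(f(x, y)) <= g(x, y) * (1 + 1e-12) for x, y in pts)
    worst[r] = max(abs(f(x, y)) for x, y in pts)

print(worst)  # the largest sampled |f| shrinks roughly like r^2
```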

Exercises

A

For Exercises 1-6, state the domain and range of the given function.

1. $f(x, y) = x^2 + y^2 - 1$
2. $f(x, y) = \dfrac{1}{x^2 + y^2}$
3. $f(x, y) = \sqrt{x^2 + y^2 - 4}$
4. $f(x, y) = \dfrac{x^2 + 1}{y}$
5. $f(x, y, z) = \sin(xyz)$
6. $f(x, y, z) = \sqrt{(x - 1)(yz - 1)}$

For Exercises 7-18, evaluate the given limit.

7. $\displaystyle\lim_{(x,y)\to(0,0)} \cos(xy)$
8. $\displaystyle\lim_{(x,y)\to(0,0)} e^{xy}$
9. $\displaystyle\lim_{(x,y)\to(0,0)} \frac{x^2 - y^2}{x^2 + y^2}$
10. $\displaystyle\lim_{(x,y)\to(0,0)} \frac{xy^2}{x^2 + y^4}$
11. $\displaystyle\lim_{(x,y)\to(1,-1)} \frac{x^2 - 2xy + y^2}{x - y}$
12. $\displaystyle\lim_{(x,y)\to(0,0)} \frac{xy^2}{x^2 + y^2}$
13. $\displaystyle\lim_{(x,y)\to(1,1)} \frac{x^2 - y^2}{x - y}$
14. $\displaystyle\lim_{(x,y)\to(0,0)} \frac{x^2 - 2xy + y^2}{x - y}$
15. $\displaystyle\lim_{(x,y)\to(0,0)} \frac{y^4 \sin(xy)}{x^2 + y^2}$
16. $\displaystyle\lim_{(x,y)\to(0,0)} (x^2 + y^2)\cos\left(\frac{1}{xy}\right)$
17. $\displaystyle\lim_{(x,y)\to(0,0)} \frac{x}{y}$
18. $\displaystyle\lim_{(x,y)\to(0,0)} \cos\left(\frac{1}{xy}\right)$

B

19. Show that $f(x, y) = \dfrac{1}{2\pi\sigma^2}\,e^{-(x^2 + y^2)/2\sigma^2}$, for $\sigma > 0$, is constant on the circle of radius $r > 0$ centered at the origin. This function is called a Gaussian blur, and is used as a filter in image processing software to produce a "blurred" effect.

20. Suppose that $f(x, y) \le f(y, x)$ for all $(x, y)$ in $\mathbb{R}^2$. Show that $f(x, y) = f(y, x)$ for all $(x, y)$ in $\mathbb{R}^2$.

21. Use the substitution $r = \sqrt{x^2 + y^2}$ to show that
\[ \lim_{(x,y)\to(0,0)} \frac{\sin\sqrt{x^2 + y^2}}{\sqrt{x^2 + y^2}} = 1 . \]
(Hint: You will need to use L'Hôpital's Rule for single-variable limits.)

C

22. Prove Theorem 2.1(a) in the case of addition. (Hint: Use Definition 2.1.)

23. Prove Theorem 2.1(b).

2.2 Partial Derivatives

Now that we have an idea of what functions of several variables are, and what a limit of such a function is, we can start to develop an idea of a derivative of a function of two or more variables. We will start with the notion of a partial derivative.

Definition 2.3. Let $f(x, y)$ be a real-valued function with domain $D$ in $\mathbb{R}^2$, and let $(a, b)$ be a point in $D$. Then the partial derivative of $f$ at $(a, b)$ with respect to $x$, denoted by $\dfrac{\partial f}{\partial x}(a, b)$, is defined as

\[ \frac{\partial f}{\partial x}(a, b) = \lim_{h\to 0} \frac{f(a + h, b) - f(a, b)}{h} \tag{2.2} \]

and the partial derivative of $f$ at $(a, b)$ with respect to $y$, denoted by $\dfrac{\partial f}{\partial y}(a, b)$, is defined as

\[ \frac{\partial f}{\partial y}(a, b) = \lim_{h\to 0} \frac{f(a, b + h) - f(a, b)}{h} . \tag{2.3} \]

Note: The symbol $\partial$ is pronounced "del". [Footnote 1: It is not a Greek letter. The symbol was first used by the mathematicians A. Clairaut and L. Euler around 1740, to distinguish it from the letter $d$ used for the "usual" derivative.]

Recall that the derivative of a function $f(x)$ can be interpreted as the rate of change of that function in the (positive) $x$ direction. From the definitions above, we can see that the partial derivative of a function $f(x, y)$ with respect to $x$ or $y$ is the rate of change of $f(x, y)$ in the (positive) $x$ or $y$ direction, respectively. What this means is that the partial derivative of a function $f(x, y)$ with respect to $x$ can be calculated by treating the $y$ variable as a constant, and then simply differentiating $f(x, y)$ as if it were a function of $x$ alone, using the usual rules from single-variable calculus. Likewise, the partial derivative of $f(x, y)$ with respect to $y$ is obtained by treating the $x$ variable as a constant and then differentiating $f(x, y)$ as if it were a function of $y$ alone.

Example 2.10. Find $\dfrac{\partial f}{\partial x}(x, y)$ and $\dfrac{\partial f}{\partial y}(x, y)$ for the function $f(x, y) = x^2 y + y^3$.

Solution: Treating $y$ as a constant and differentiating $f(x, y)$ with respect to $x$ gives

\[ \frac{\partial f}{\partial x}(x, y) = 2xy \]

and treating $x$ as a constant and differentiating $f(x, y)$ with respect to $y$ gives

\[ \frac{\partial f}{\partial y}(x, y) = x^2 + 3y^2 . \]

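We will often simply write $\dfrac{\partial f}{\partial x}$ and $\dfrac{\partial f}{\partial y}$ instead of $\dfrac{\partial f}{\partial x}(x, y)$ and $\dfrac{\partial f}{\partial y}(x, y)$.

Example 2.11. Find $\dfrac{\partial f}{\partial x}$ and $\dfrac{\partial f}{\partial y}$ for the function $f(x, y) = \dfrac{\sin(xy^2)}{x^2 + 1}$.

Solution: Treating $y$ as a constant and differentiating $f(x, y)$ with respect to $x$ gives

\[ \frac{\partial f}{\partial x} = \frac{(x^2 + 1)(y^2\cos(xy^2)) - (2x)\sin(xy^2)}{(x^2 + 1)^2} \]

and treating $x$ as a constant and differentiating $f(x, y)$ with respect to $y$ gives

\[ \frac{\partial f}{\partial y} = \frac{2xy\cos(xy^2)}{x^2 + 1} . \]

Since both $\dfrac{\partial f}{\partial x}$ and $\dfrac{\partial f}{\partial y}$ are themselves functions of $x$ and $y$, we can take their partial derivatives with respect to $x$ and $y$. This yields the higher-order partial derivatives:

\[ \frac{\partial^2 f}{\partial x^2} = \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial x}\right) \qquad \frac{\partial^2 f}{\partial y^2} = \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial y}\right) \]
\[ \frac{\partial^2 f}{\partial y\,\partial x} = \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial x}\right) \qquad \frac{\partial^2 f}{\partial x\,\partial y} = \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial y}\right) \]
\[ \frac{\partial^3 f}{\partial x^3} = \frac{\partial}{\partial x}\left(\frac{\partial^2 f}{\partial x^2}\right) \qquad \frac{\partial^3 f}{\partial y^3} = \frac{\partial}{\partial y}\left(\frac{\partial^2 f}{\partial y^2}\right) \]
\[ \frac{\partial^3 f}{\partial y\,\partial x^2} = \frac{\partial}{\partial y}\left(\frac{\partial^2 f}{\partial x^2}\right) \qquad \frac{\partial^3 f}{\partial x\,\partial y^2} = \frac{\partial}{\partial x}\left(\frac{\partial^2 f}{\partial y^2}\right) \]
\[ \frac{\partial^3 f}{\partial y^2\,\partial x} = \frac{\partial}{\partial y}\left(\frac{\partial^2 f}{\partial y\,\partial x}\right) \qquad \frac{\partial^3 f}{\partial x^2\,\partial y} = \frac{\partial}{\partial x}\left(\frac{\partial^2 f}{\partial x\,\partial y}\right) \]
\[ \frac{\partial^3 f}{\partial x\,\partial y\,\partial x} = \frac{\partial}{\partial x}\left(\frac{\partial^2 f}{\partial y\,\partial x}\right) \qquad \frac{\partial^3 f}{\partial y\,\partial x\,\partial y} = \frac{\partial}{\partial y}\left(\frac{\partial^2 f}{\partial x\,\partial y}\right) \]
\[ \vdots \]

Example 2.12. Find the partial derivatives $\dfrac{\partial f}{\partial x}$, $\dfrac{\partial f}{\partial y}$, $\dfrac{\partial^2 f}{\partial x^2}$, $\dfrac{\partial^2 f}{\partial y^2}$, $\dfrac{\partial^2 f}{\partial y\,\partial x}$ and $\dfrac{\partial^2 f}{\partial x\,\partial y}$ for the function $f(x, y) = e^{x^2 y} + xy^3$.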

D yx ( x.2. such ∂2 f ∂2 f as ∂ y ∂ x and ∂ x ∂ y . f 12 ( x. y) D 11 ( x. This applies even to mixed partial derivatives of order 3 or higher. y) . y) . y) D 22 ( x. D yy ( x. y) . y) . Notice in the above example that are continuous at a point (a. The notation for partial derivatives varies. 214-216 in T AYLOR and M ANN for a proof. y) D 12 ( x. y) . All of the following are equivalent: ∂f ∂x ∂f ∂y ∂2 f ∂ x2 ∂2 f ∂ y2 ∂2 f ∂ y ∂x ∂2 f ∂x ∂ y : f x ( x. : f x y ( x. y) D 21 ( x. b). : f yx ( x. y) . y) .2 All the functions we will deal with will have continuous partial derivatives of all orders. D xx ( x. D x ( x. y) D 2 ( x. : f y ( x. y) . are called mixed partial derivatives. we have ∂f ∂x ∂2 f ∂ x2 73 = 2 x ye x = ∂ ∂x 2 y + y3 2 ∂f ∂y ∂2 f ∂ y2 y = x2 e x = ∂ ∂y 2 y + 3 x y2 2 (2 x ye x y + y3 ) 2 ( x 2 e x y + 3 x y2 ) 2 = 2 ye x ∂2 f ∂ y ∂x y + 4 x 2 y2 e x 2 2 = x4 e x ∂2 f ∂x ∂ y y + 6x y 2 = ∂ ∂y (2 x ye x y + y3 ) 2 = ∂ ∂x ( x 2 e x y + 3 x y2 ) 2 = 2 xe x y + 2 x3 ye x 2 y + 3 y2 = 2 xe x y + 2 x3 ye x 2 y + 3 y2 Higher-order partial derivatives that are taken with respect to different variables. . Speciﬁcally. y) . D y ( x. y) . 2 See pp. y) . : f yy ( x. so you can assume in the remainder of the text that ∂2 f ∂ y ∂x ∂2 f ∂ y ∂x ∂2 f ∂x ∂ y = ∂2 f . y) . it doesn’t matter in which order you take partial derivatives. In other words. then they are equal at that point. f 2 ( x. y) . y) . f 1 ( x. ∂x ∂ y It turns that this will usually be the case.2 Partial Derivatives Solution: Proceeding as before. y) . D 1 ( x. f 22 ( x. whenever both ∂2 f ∂ y ∂x and = ∂2 f ∂x ∂ y for all ( x. y) in the domain of f . y) : f xx ( x. y) . f 11 ( x. D x y ( x. f 21 ( x. y) . y) .

Exercises

A

For Exercises 1-16, find $\dfrac{\partial f}{\partial x}$ and $\dfrac{\partial f}{\partial y}$.

1. $f(x, y) = x^2 + y^2$
2. $f(x, y) = \cos(x + y)$
3. $f(x, y) = \sqrt{x^2 + y + 4}$
4. $f(x, y) = \dfrac{x + 1}{y + 1}$
5. $f(x, y) = e^{xy} + xy$
6. $f(x, y) = x^2 - y^2 + 6xy + 4x - 8y + 2$
7. $f(x, y) = x^4$
8. $f(x, y) = x + 2y$
9. $f(x, y) = \sqrt{x^2 + y^2}$
10. $f(x, y) = \sin(x + y)$
11. $f(x, y) = \sqrt[3]{x^2 + y + 4}$
12. $f(x, y) = \dfrac{xy + 1}{x + y}$
13. $f(x, y) = e^{-(x^2 + y^2)}$
14. $f(x, y) = \ln(xy)$
15. $f(x, y) = \sin(xy)$
16. $f(x, y) = \tan(x + y)$

For Exercises 17-26, find $\dfrac{\partial^2 f}{\partial x^2}$, $\dfrac{\partial^2 f}{\partial y^2}$ and $\dfrac{\partial^2 f}{\partial y\,\partial x}$ (use Exercises 1-8, 14, 15).

17. $f(x, y) = x^2 + y^2$
18. $f(x, y) = \cos(x + y)$
19. $f(x, y) = \sqrt{x^2 + y + 4}$
20. $f(x, y) = \dfrac{x + 1}{y + 1}$
21. $f(x, y) = e^{xy} + xy$
22. $f(x, y) = x^2 - y^2 + 6xy + 4x - 8y + 2$
23. $f(x, y) = x^4$
24. $f(x, y) = x + 2y$
25. $f(x, y) = \ln(xy)$
26. $f(x, y) = \sin(xy)$

B

27. Show that the function $f(x, y) = \sin(x + y) + \cos(x - y)$ satisfies the wave equation
\[ \frac{\partial^2 f}{\partial x^2} - \frac{\partial^2 f}{\partial y^2} = 0 . \]
The wave equation is an example of a partial differential equation.

28. Let $u$ and $v$ be twice-differentiable functions of a single variable, and let $c \ne 0$ be a constant. Show that $f(x, y) = u(x + cy) + v(x - cy)$ is a solution of the general one-dimensional wave equation
\[ \frac{\partial^2 f}{\partial x^2} - \frac{1}{c^2}\frac{\partial^2 f}{\partial y^2} = 0 . \]
[Footnote 3: Conversely, it turns out that any solution must be of this form. See Ch. 1 in Weinberger.]

2.3 Tangent Plane to a Surface

In the previous section we mentioned that the partial derivatives $\dfrac{\partial f}{\partial x}$ and $\dfrac{\partial f}{\partial y}$ can be thought of as the rate of change of a function $z = f(x, y)$ in the positive $x$ and $y$ directions, respectively. Recall that the derivative $\dfrac{dy}{dx}$ of a function $y = f(x)$ has a geometric meaning, namely as the slope of the tangent line to the graph of $f$ at the point $(x, f(x))$ in $\mathbb{R}^2$. There is a similar geometric meaning to the partial derivatives $\dfrac{\partial f}{\partial x}$ and $\dfrac{\partial f}{\partial y}$ of a function $z = f(x, y)$: given a point $(a, b)$ in the domain $D$ of $f(x, y)$, the trace of the surface described by $z = f(x, y)$ in the plane $y = b$ is a curve in $\mathbb{R}^3$ through the point $(a, b, f(a, b))$, and the slope of the tangent line $L_x$ to that curve at that point is $\dfrac{\partial f}{\partial x}(a, b)$. Similarly, $\dfrac{\partial f}{\partial y}(a, b)$ is the slope of the tangent line $L_y$ to the trace of the surface $z = f(x, y)$ in the plane $x = a$ (see Figure 2.3.1).

[Figure 2.3.1: Partial derivatives as slopes. (a) Tangent line $L_x$ in the plane $y = b$; (b) Tangent line $L_y$ in the plane $x = a$]

Since the derivative $\dfrac{dy}{dx}$ of a function $y = f(x)$ is used to find the tangent line to the graph of $f$ (which is a curve in $\mathbb{R}^2$), you might expect that partial derivatives can be used to define a tangent plane to the graph of a surface $z = f(x, y)$. This indeed turns out to be the case. First, we need a definition of a tangent plane. The intuitive idea is that a tangent plane "just touches" a surface at a point. The formal definition mimics the intuitive notion of a tangent line to a curve.

Definition 2.4. Let $z = f(x, y)$ be the equation of a surface $S$ in $\mathbb{R}^3$, and let $P = (a, b, c)$ be a point on $S$. Let $T$ be a plane which contains the point $P$, and let $Q = (x, y, z)$ represent a generic point on the surface $S$. If the (acute) angle between the vector $\overrightarrow{PQ}$ and the plane $T$ approaches zero as the point $Q$ approaches $P$ along the surface $S$, then we call $T$ the tangent plane to $S$ at $P$.

Note that since two lines in $\mathbb{R}^3$ determine a plane, then the two tangent lines to the surface $z = f(x, y)$ in the $x$ and $y$ directions described in Figure 2.3.1 are contained in the tangent plane at that point, if the tangent plane exists at that point. The existence of those two tangent lines does not by itself guarantee the existence of the tangent plane. It is possible that if we take the trace of the surface in the plane $x - y = 0$ (which makes a 45° angle with the positive $x$-axis), the resulting curve in that plane may have a tangent line which is not in the plane determined by the other two tangent lines, or it may not have a tangent line at all at that point. Luckily, it turns out that if $\dfrac{\partial f}{\partial x}$ and $\dfrac{\partial f}{\partial y}$ exist in a region around a point $(a, b)$ and are continuous at $(a, b)$ then the tangent plane to the surface $z = f(x, y)$ will exist at the point $(a, b, f(a, b))$. [Footnote 4: See Taylor and Mann, § 6.4.] In this text, those conditions will always hold.

Suppose that we want an equation of the tangent plane $T$ to the surface $z = f(x, y)$ at a point $(a, b, f(a, b))$. Let $L_x$ and $L_y$ be the tangent lines to the traces of the surface in the planes $y = b$ and $x = a$, respectively (as in Figure 2.3.2), and suppose that the conditions for $T$ to exist do hold. Then the equation for $T$ is

\[ A(x - a) + B(y - b) + C(z - f(a, b)) = 0 \tag{2.4} \]

where $\mathbf{n} = (A, B, C)$ is a normal vector to the plane $T$. Since $T$ contains the lines $L_x$ and $L_y$, then all we need are vectors $\mathbf{v}_x$ and $\mathbf{v}_y$ that are parallel to $L_x$ and $L_y$, respectively, and then let $\mathbf{n} = \mathbf{v}_x \times \mathbf{v}_y$.

[Figure 2.3.2: Tangent plane]

Since the slope of $L_x$ is $\dfrac{\partial f}{\partial x}(a, b)$, then the vector $\mathbf{v}_x = \left(1, 0, \dfrac{\partial f}{\partial x}(a, b)\right)$ is parallel to $L_x$ (since $\mathbf{v}_x$ lies in the $xz$-plane and lies in a line with slope $\dfrac{\partial f}{\partial x}(a, b)\big/1 = \dfrac{\partial f}{\partial x}(a, b)$; see Figure 2.3.3). Similarly, the vector $\mathbf{v}_y = \left(0, 1, \dfrac{\partial f}{\partial y}(a, b)\right)$ is parallel to $L_y$. Hence, the vector

\[ \mathbf{n} = \mathbf{v}_x \times \mathbf{v}_y = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ 1 & 0 & \frac{\partial f}{\partial x}(a, b) \\ 0 & 1 & \frac{\partial f}{\partial y}(a, b) \end{vmatrix} = -\frac{\partial f}{\partial x}(a, b)\,\mathbf{i} - \frac{\partial f}{\partial y}(a, b)\,\mathbf{j} + \mathbf{k} \]

is normal to the plane $T$.

[Figure 2.3.3]

Thus the equation of $T$ is

\[ -\frac{\partial f}{\partial x}(a, b)\,(x - a) - \frac{\partial f}{\partial y}(a, b)\,(y - b) + z - f(a, b) = 0 . \tag{2.5} \]

Multiplying both sides by $-1$, we have the following result:

The equation of the tangent plane to the surface $z = f(x, y)$ at the point $(a, b, f(a, b))$ is

\[ \frac{\partial f}{\partial x}(a, b)\,(x - a) + \frac{\partial f}{\partial y}(a, b)\,(y - b) - z + f(a, b) = 0 \tag{2.6} \]
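Formula (2.6) can be turned into a short routine that returns the coefficients of the tangent plane in the expanded form $Ax + By + Cz + D = 0$. The following is a minimal sketch, not a definitive implementation: the function name `tangent_plane` and the convention of passing the partial derivatives as separate callables are choices made here for illustration. It is tried on the paraboloid $z = x^2 + y^2$ at $(1, 2)$.

```python
def tangent_plane(f, fx, fy, a, b):
    """Coefficients (A, B, C, D) of the plane A*x + B*y + C*z + D = 0
    obtained by expanding formula (2.6) at the point (a, b, f(a, b)).

    f, fx, fy are callables for the function and its partial derivatives;
    this calling convention is an illustrative assumption.
    """
    A = fx(a, b)
    B = fy(a, b)
    C = -1.0
    # Collect the constant terms of fx*(x - a) + fy*(y - b) - z + f(a, b) = 0.
    D = f(a, b) - A * a - B * b
    return A, B, C, D

# The paraboloid z = x^2 + y^2 with its partial derivatives 2x and 2y.
f = lambda x, y: x**2 + y**2
fx = lambda x, y: 2 * x
fy = lambda x, y: 2 * y

plane = tangent_plane(f, fx, fy, 1.0, 2.0)
print(plane)  # coefficients of 2x + 4y - z - 5 = 0
```

The point of tangency $(1, 2, 5)$ should satisfy the resulting equation, which gives a quick sanity check on the coefficients.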

Example 2.13. Find the equation of the tangent plane to the surface $z = x^2 + y^2$ at the point $(1, 2, 5)$.

Solution: For the function $f(x, y) = x^2 + y^2$, we have $\dfrac{\partial f}{\partial x} = 2x$ and $\dfrac{\partial f}{\partial y} = 2y$, so the equation of the tangent plane at the point $(1, 2, 5)$ is

\[ 2(1)(x - 1) + 2(2)(y - 2) - z + 5 = 0 , \text{ or} \]
\[ 2x + 4y - z - 5 = 0 . \]

In a similar fashion, it can be shown that if a surface is defined implicitly by an equation of the form $F(x, y, z) = 0$, then the tangent plane to the surface at a point $(a, b, c)$ is given by the equation

\[ \frac{\partial F}{\partial x}(a, b, c)\,(x - a) + \frac{\partial F}{\partial y}(a, b, c)\,(y - b) + \frac{\partial F}{\partial z}(a, b, c)\,(z - c) = 0 . \tag{2.7} \]

Note that formula (2.6) is the special case of formula (2.7) where $F(x, y, z) = f(x, y) - z$.

Example 2.14. Find the equation of the tangent plane to the surface $x^2 + y^2 + z^2 = 9$ at the point $(2, 2, -1)$.

Solution: For the function $F(x, y, z) = x^2 + y^2 + z^2 - 9$, we have $\dfrac{\partial F}{\partial x} = 2x$, $\dfrac{\partial F}{\partial y} = 2y$, and $\dfrac{\partial F}{\partial z} = 2z$, so the equation of the tangent plane at $(2, 2, -1)$ is

\[ 2(2)(x - 2) + 2(2)(y - 2) + 2(-1)(z + 1) = 0 , \text{ or} \]
\[ 2x + 2y - z - 9 = 0 . \]

Exercises

A

For Exercises 1-6, find the equation of the tangent plane to the surface $z = f(x, y)$ at the point $P$.

1. $f(x, y) = x^2 + y^3$, $P = (1, 1, 2)$
2. $f(x, y) = xy$, $P = (1, -1, -1)$
3. $f(x, y) = x^2 y$, $P = (-1, 1, 1)$
4. $f(x, y) = xe^y$, $P = (1, 0, 1)$
5. $f(x, y) = x + 2y$, $P = (2, 1, 4)$
6. $f(x, y) = \sqrt{x^2 + y^2}$, $P = (3, 4, 5)$

For Exercises 7-10, find the equation of the tangent plane to the given surface at the point $P$.

7. $\dfrac{x^2}{4} + \dfrac{y^2}{9} + \dfrac{z^2}{16} = 1$, $P = \left(1, 2, \dfrac{2\sqrt{11}}{3}\right)$
8. $x^2 + y^2 + z^2 = 9$, $P = (0, 0, 3)$
9. $x^2 + y^2 - z^2 = 0$, $P = (3, 4, 5)$
10. $x^2 + y^2 = 4$, $P = (\sqrt{3}, 1, 0)$

2.4 Directional Derivatives and the Gradient

For a function $z = f(x, y)$, we learned that the partial derivatives $\dfrac{\partial f}{\partial x}$ and $\dfrac{\partial f}{\partial y}$ represent the (instantaneous) rate of change of $f$ in the positive $x$ and $y$ directions, respectively. What about other directions? It turns out that we can find the rate of change in any direction using a more general type of derivative called a directional derivative.

Definition 2.5. Let $f(x, y)$ be a real-valued function with domain $D$ in $\mathbb{R}^2$, and let $(a, b)$ be a point in $D$. Let $\mathbf{v}$ be a unit vector in $\mathbb{R}^2$. Then the directional derivative of $f$ at $(a, b)$ in the direction of $\mathbf{v}$, denoted by $D_{\mathbf{v}} f(a, b)$, is defined as

\[ D_{\mathbf{v}} f(a, b) = \lim_{h\to 0} \frac{f((a, b) + h\mathbf{v}) - f(a, b)}{h} \tag{2.8} \]

Notice in the definition that we seem to be treating the point $(a, b)$ as a vector, since we are adding the vector $h\mathbf{v}$ to it. But this is just the usual idea of identifying vectors with their terminal points, which the reader should be used to by now. If we were to write the vector $\mathbf{v}$ as $\mathbf{v} = (v_1, v_2)$, then

\[ D_{\mathbf{v}} f(a, b) = \lim_{h\to 0} \frac{f(a + hv_1, b + hv_2) - f(a, b)}{h} . \tag{2.9} \]

From this we can immediately recognize that the partial derivatives $\dfrac{\partial f}{\partial x}$ and $\dfrac{\partial f}{\partial y}$ are special cases of the directional derivative with $\mathbf{v} = \mathbf{i} = (1, 0)$ and $\mathbf{v} = \mathbf{j} = (0, 1)$, respectively. That is, $\dfrac{\partial f}{\partial x} = D_{\mathbf{i}} f$ and $\dfrac{\partial f}{\partial y} = D_{\mathbf{j}} f$. Since there are many vectors with the same direction, we use a unit vector in the definition, as that represents a "standard" vector for a given direction.

If $f(x, y)$ has continuous partial derivatives $\dfrac{\partial f}{\partial x}$ and $\dfrac{\partial f}{\partial y}$ (which will always be the case in this text), then there is a simple formula for the directional derivative:

Theorem 2.2. Let $f(x, y)$ be a real-valued function with domain $D$ in $\mathbb{R}^2$ such that the partial derivatives $\dfrac{\partial f}{\partial x}$ and $\dfrac{\partial f}{\partial y}$ exist and are continuous in $D$. Let $(a, b)$ be a point in $D$, and let $\mathbf{v} = (v_1, v_2)$ be a unit vector in $\mathbb{R}^2$. Then

\[ D_{\mathbf{v}} f(a, b) = v_1 \frac{\partial f}{\partial x}(a, b) + v_2 \frac{\partial f}{\partial y}(a, b) . \tag{2.10} \]

Proof: Note that if $\mathbf{v} = \mathbf{i} = (1, 0)$ then the above formula reduces to $D_{\mathbf{v}} f(a, b) = \dfrac{\partial f}{\partial x}(a, b)$, which we know is true since $D_{\mathbf{i}} f = \dfrac{\partial f}{\partial x}$, as we noted earlier. Similarly, for $\mathbf{v} = \mathbf{j} = (0, 1)$ the formula reduces to $D_{\mathbf{v}} f(a, b) = \dfrac{\partial f}{\partial y}(a, b)$, which is true since $D_{\mathbf{j}} f = \dfrac{\partial f}{\partial y}$. So since $\mathbf{i} = (1, 0)$ and $\mathbf{j} = (0, 1)$ are the only unit vectors in $\mathbb{R}^2$ with a zero component, then we need only show the formula holds for unit vectors $\mathbf{v} = (v_1, v_2)$ with $v_1 \ne 0$ and $v_2 \ne 0$. So fix such a vector $\mathbf{v}$ and fix a number $h \ne 0$.

Then

\[ f(a + hv_1, b + hv_2) - f(a, b) = f(a + hv_1, b + hv_2) - f(a + hv_1, b) + f(a + hv_1, b) - f(a, b) . \tag{2.11} \]

Since $h \ne 0$ and $v_2 \ne 0$, then $hv_2 \ne 0$ and thus any number $c$ between $b$ and $b + hv_2$ can be written as $c = b + \alpha hv_2$ for some number $0 < \alpha < 1$. So since the function $f(a + hv_1, y)$ is a real-valued function of $y$ (since $a + hv_1$ is a fixed number), then the Mean Value Theorem from single-variable calculus can be applied to the function $g(y) = f(a + hv_1, y)$ on the interval $[b, b + hv_2]$ (or $[b + hv_2, b]$ if $hv_2$ is negative) to find a number $0 < \alpha < 1$ such that

\[ \frac{\partial f}{\partial y}(a + hv_1, b + \alpha hv_2) = g'(b + \alpha hv_2) = \frac{g(b + hv_2) - g(b)}{b + hv_2 - b} = \frac{f(a + hv_1, b + hv_2) - f(a + hv_1, b)}{hv_2} \]

and so

\[ f(a + hv_1, b + hv_2) - f(a + hv_1, b) = hv_2 \frac{\partial f}{\partial y}(a + hv_1, b + \alpha hv_2) . \]

By a similar argument, there exists a number $0 < \beta < 1$ such that

\[ f(a + hv_1, b) - f(a, b) = hv_1 \frac{\partial f}{\partial x}(a + \beta hv_1, b) . \]

Thus, by equation (2.11), we have

\[ \frac{f(a + hv_1, b + hv_2) - f(a, b)}{h} = \frac{hv_2 \frac{\partial f}{\partial y}(a + hv_1, b + \alpha hv_2) + hv_1 \frac{\partial f}{\partial x}(a + \beta hv_1, b)}{h} = v_2 \frac{\partial f}{\partial y}(a + hv_1, b + \alpha hv_2) + v_1 \frac{\partial f}{\partial x}(a + \beta hv_1, b) \]

so by formula (2.9) we have

\[ D_{\mathbf{v}} f(a, b) = \lim_{h\to 0} \frac{f(a + hv_1, b + hv_2) - f(a, b)}{h} = \lim_{h\to 0} \left[ v_2 \frac{\partial f}{\partial y}(a + hv_1, b + \alpha hv_2) + v_1 \frac{\partial f}{\partial x}(a + \beta hv_1, b) \right] = v_2 \frac{\partial f}{\partial y}(a, b) + v_1 \frac{\partial f}{\partial x}(a, b) \]

by the continuity of $\dfrac{\partial f}{\partial x}$ and $\dfrac{\partial f}{\partial y}$, so

\[ D_{\mathbf{v}} f(a, b) = v_1 \frac{\partial f}{\partial x}(a, b) + v_2 \frac{\partial f}{\partial y}(a, b) \]

after reversing the order of summation. QED

Note that $D_{\mathbf{v}} f(a, b) = \mathbf{v} \cdot \left( \dfrac{\partial f}{\partial x}(a, b), \dfrac{\partial f}{\partial y}(a, b) \right)$. The second vector has a special name:

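Definition 2.6. For a real-valued function $f(x, y)$, the gradient of $f$, denoted by $\nabla f$, is the vector

\[ \nabla f = \left( \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y} \right) \tag{2.12} \]

in $\mathbb{R}^2$. For a real-valued function $f(x, y, z)$, the gradient is the vector

\[ \nabla f = \left( \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, \frac{\partial f}{\partial z} \right) \tag{2.13} \]

in $\mathbb{R}^3$. The symbol $\nabla$ is pronounced "del". [Footnote 5: Sometimes the notation grad($f$) is used instead of $\nabla f$.]

Corollary 2.3. $D_{\mathbf{v}} f = \mathbf{v} \cdot \nabla f$

Example 2.15. Find the directional derivative of $f(x, y) = xy^2 + x^3 y$ at the point $(1, 2)$ in the direction of $\mathbf{v} = \left( \dfrac{1}{\sqrt{2}}, \dfrac{1}{\sqrt{2}} \right)$.

Solution: We see that $\nabla f = (y^2 + 3x^2 y,\ 2xy + x^3)$, so

\[ D_{\mathbf{v}} f(1, 2) = \mathbf{v} \cdot \nabla f(1, 2) = \left( \frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}} \right) \cdot \left( 2^2 + 3(1)^2(2),\ 2(1)(2) + 1^3 \right) = \frac{15}{\sqrt{2}} . \]

A real-valued function $z = f(x, y)$ whose partial derivatives $\dfrac{\partial f}{\partial x}$ and $\dfrac{\partial f}{\partial y}$ exist and are continuous is called continuously differentiable. Assume that $f(x, y)$ is such a function and that $\nabla f \ne \mathbf{0}$. Let $c$ be a real number in the range of $f$ and let $\mathbf{v}$ be a unit vector in $\mathbb{R}^2$ which is tangent to the level curve $f(x, y) = c$ (see Figure 2.4.1).

[Figure 2.4.1: The level curve $f(x, y) = c$ with tangent vector $\mathbf{v}$ and gradient $\nabla f$]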

The value of $f(x, y)$ is constant along a level curve, so since $\mathbf{v}$ is a tangent vector to this curve, then the rate of change of $f$ in the direction of $\mathbf{v}$ is 0, i.e. $D_{\mathbf{v}} f = 0$. But we know that $D_{\mathbf{v}} f = \mathbf{v} \cdot \nabla f = \|\mathbf{v}\|\,\|\nabla f\| \cos\theta$, where $\theta$ is the angle between $\mathbf{v}$ and $\nabla f$. So since $\|\mathbf{v}\| = 1$ then $D_{\mathbf{v}} f = \|\nabla f\| \cos\theta$. So since $\nabla f \ne \mathbf{0}$ then $D_{\mathbf{v}} f = 0 \Rightarrow \cos\theta = 0 \Rightarrow \theta = 90^\circ$. In other words, $\nabla f \perp \mathbf{v}$, which means that $\nabla f$ is normal to the level curve.

In general, for any unit vector $\mathbf{v}$ in $\mathbb{R}^2$, we still have $D_{\mathbf{v}} f = \|\nabla f\| \cos\theta$, where $\theta$ is the angle between $\mathbf{v}$ and $\nabla f$. At a fixed point $(x, y)$ the length $\|\nabla f\|$ is fixed, and the value of $D_{\mathbf{v}} f$ then varies as $\theta$ varies. The largest value that $D_{\mathbf{v}} f$ can take is when $\cos\theta = 1$ ($\theta = 0^\circ$), while the smallest value occurs when $\cos\theta = -1$ ($\theta = 180^\circ$). In other words, the value of the function $f$ increases the fastest in the direction of $\nabla f$ (since $\theta = 0^\circ$ in that case), and the value of $f$ decreases the fastest in the direction of $-\nabla f$ (since $\theta = 180^\circ$ in that case). We have thus proved the following theorem:

Theorem 2.4. Let $f(x, y)$ be a continuously differentiable real-valued function, with $\nabla f \ne \mathbf{0}$. Then:

(a) The gradient $\nabla f$ is normal to any level curve $f(x, y) = c$.
(b) The value of $f(x, y)$ increases the fastest in the direction of $\nabla f$.
(c) The value of $f(x, y)$ decreases the fastest in the direction of $-\nabla f$.

Example 2.16. In which direction does the function $f(x, y) = xy^2 + x^3 y$ increase the fastest from the point $(1, 2)$? In which direction does it decrease the fastest?

Solution: Since $\nabla f = (y^2 + 3x^2 y,\ 2xy + x^3)$, then $\nabla f(1, 2) = (10, 5) \ne \mathbf{0}$. A unit vector in that direction is $\mathbf{v} = \dfrac{\nabla f}{\|\nabla f\|} = \left( \dfrac{2}{\sqrt{5}}, \dfrac{1}{\sqrt{5}} \right)$. Thus, $f$ increases the fastest in the direction of $\left( \dfrac{2}{\sqrt{5}}, \dfrac{1}{\sqrt{5}} \right)$ and decreases the fastest in the direction of $\left( \dfrac{-2}{\sqrt{5}}, \dfrac{-1}{\sqrt{5}} \right)$.

Though we proved Theorem 2.4 for functions of two variables, a similar argument can be used to show that it also applies to functions of three or more variables. Likewise, the directional derivative in the three-dimensional case can also be defined by the formula $D_{\mathbf{v}} f = \mathbf{v} \cdot \nabla f$.

Example 2.17. The temperature $T$ of a solid is given by the function $T(x, y, z) = e^{-x} + e^{-2y} + e^{4z}$, where $x$, $y$, $z$ are space coordinates relative to the center of the solid. In which direction from the point $(1, 1, 1)$ will the temperature decrease the fastest?

Solution: Since $\nabla T = (-e^{-x},\ -2e^{-2y},\ 4e^{4z})$, then the temperature will decrease the fastest in the direction of $-\nabla T(1, 1, 1) = (e^{-1},\ 2e^{-2},\ -4e^4)$.
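The computations in Examples 2.15 and 2.16 follow one pattern: evaluate the gradient at the point, take a dot product with the unit direction, and normalize the gradient to get the direction of fastest increase. A small sketch of that pattern follows; the helper names `directional_derivative` and `unit` are illustrative choices made here, and the gradient is entered from the closed form given in the examples rather than computed symbolically.

```python
import math

def grad_f(x, y):
    """Gradient of f(x, y) = x y^2 + x^3 y, as computed in Example 2.15."""
    return (y**2 + 3 * x**2 * y, 2 * x * y + x**3)

def directional_derivative(grad, v):
    """D_v f = v . grad f for a unit vector v (the dot-product formula)."""
    return sum(vi * gi for vi, gi in zip(v, grad))

def unit(w):
    """Scale a nonzero 2-vector w to unit length."""
    n = math.hypot(w[0], w[1])
    return (w[0] / n, w[1] / n)

g = grad_f(1.0, 2.0)                          # gradient at the point (1, 2)
v = (1 / math.sqrt(2), 1 / math.sqrt(2))      # direction from Example 2.15
dv = directional_derivative(g, v)             # should equal 15 / sqrt(2)

steepest = unit(g)                            # direction of fastest increase
max_rate = directional_derivative(g, steepest)

print(g, dv, steepest, max_rate)
```

Note that `max_rate` equals the length of the gradient, which matches the observation that $D_{\mathbf{v}} f = \|\nabla f\| \cos\theta$ is largest when $\theta = 0^\circ$.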

Exercises

A

For Exercises 1-10, compute the gradient $\nabla f$.

1. $f(x, y) = x^2 + y^2 - 1$
2. $f(x, y) = \dfrac{1}{x^2 + y^2}$
3. $f(x, y) = \sqrt{x^2 + y^2 + 4}$
4. $f(x, y) = x^2 e^y$
5. $f(x, y) = \ln(xy)$
6. $f(x, y) = 2x + 5y$
7. $f(x, y, z) = \sin(xyz)$
8. $f(x, y, z) = x^2 e^{yz}$
9. $f(x, y, z) = x^2 + y^2 + z^2$
10. $f(x, y, z) = \sqrt{x^2 + y^2 + z^2}$

For Exercises 11-14, find the directional derivative of $f$ at the point $P$ in the direction of $\mathbf{v} = \left( \dfrac{1}{\sqrt{2}}, \dfrac{1}{\sqrt{2}} \right)$.

11. $f(x, y) = x^2 + y^2 - 1$, $P = (1, 1)$
12. $f(x, y) = \dfrac{1}{x^2 + y^2}$, $P = (1, 1)$
13. $f(x, y) = \sqrt{x^2 + y^2 + 4}$, $P = (1, 1)$
14. $f(x, y) = x^2 e^y$, $P = (1, 1)$

For Exercises 15-16, find the directional derivative of $f$ at the point $P$ in the direction of $\mathbf{v} = \left( \dfrac{1}{\sqrt{3}}, \dfrac{1}{\sqrt{3}}, \dfrac{1}{\sqrt{3}} \right)$.

15. $f(x, y, z) = \sin(xyz)$, $P = (1, 1, 1)$
16. $f(x, y, z) = x^2 e^{yz}$, $P = (1, 1, 1)$

17. Repeat Example 2.16 at the point $(2, 3)$.
18. Repeat Example 2.17 at the point $(3, 1, 2)$.

B

For Exercises 19-26, let $f(x, y)$ and $g(x, y)$ be continuously differentiable real-valued functions, let $c$ be a constant, and let $\mathbf{v}$ be a unit vector in $\mathbb{R}^2$. Show that:

19. $\nabla(cf) = c\,\nabla f$
20. $\nabla(f + g) = \nabla f + \nabla g$
21. $\nabla(fg) = f\,\nabla g + g\,\nabla f$
22. $\nabla(f/g) = \dfrac{g\,\nabla f - f\,\nabla g}{g^2}$ if $g(x, y) \ne 0$
23. $D_{-\mathbf{v}} f = -D_{\mathbf{v}} f$
24. $D_{\mathbf{v}}(cf) = c\,D_{\mathbf{v}} f$
25. $D_{\mathbf{v}}(f + g) = D_{\mathbf{v}} f + D_{\mathbf{v}} g$
26. $D_{\mathbf{v}}(fg) = f\,D_{\mathbf{v}} g + g\,D_{\mathbf{v}} f$

27. The function $r(x, y) = \sqrt{x^2 + y^2}$ is the length of the position vector $\mathbf{r} = x\,\mathbf{i} + y\,\mathbf{j}$ for each point $(x, y)$ in $\mathbb{R}^2$. Show that $\nabla r = \dfrac{1}{r}\,\mathbf{r}$ when $(x, y) \ne (0, 0)$, and that $\nabla(r^2) = 2\,\mathbf{r}$.

Then a necessary condition for f ( x. ∂y We thus have the following theorem: ∂f ∂f Theorem 2. Suppose that (a. 0) since any disk around (0. y) in the domain of f . y) = 0 simultaneously for ( x. If f ( x. b) where ∇ f (a. b).5 can be extended to apply to functions of three or more variables. y) for which ( x − a)2 + ( y − b)2 < r 2 . We will consider only functions of two variables. y) ≥ f (a.e. that is.2. i. y). y) ≤ f (a. 0)). f (a. But clearly f does not have a local maximum or minimum at (0. b) = 0. so (0. Deﬁnition 2. and that the ﬁrst-order partial derivatives of f exist at (a. and let (a. y). b) is the largest value of f in the x direction (around the point (a. there is some sufﬁciently small r > 0 such that f ( x. We say that f has a local maximum at (a. which has a = y = 0 ⇒ y = 0. along the path y = x in R2 . b). b) if f ( x. b) is the largest value of f ( x. functions of three or more variables require methods using linear algebra. y) ≥ f (a. y) as ( x. y) ≤ f (a.5. b) is a local maximum point for f ( x. and . to ﬁnd the critical points of f you have to solve the equations ∂ x ( x. y) in the domain of f . b) be a point in the domain of f . 0)) and different signs (so that f ( x. y). the necessary condition that ∇ f (a. Similarly. y) = x y has a critical point at (0. b). b) for all ( x. b) and ∂ y (a. then f has a global maximum at (a. b). b) = 0. y) = x2 . y) = 0 ∂f and ∂ y ( x. b) for all ( x. b)). b). Similar to the single-variable case. 0) is the only critical point. b) for all ( x. y) inside some disk of positive radius centered at (a. Likewise. A point (a. f ( x. y) goes in all directions from the point (a. Let f ( x. y) be a real-valued function. b) if f ( x.5 Maxima and Minima 83 2. in some sufﬁciently small disk centered at (a. b) = 0 is not always sufﬁcient to guarantee that a critical point is a local maximum or minimum. b) for all ( x. b) is the largest value of f near (a. Since g ′ ( x) = ∂ x ( x.18. y). b). b) for all ( x. Let f ( x. 
y) = x y < 0 = f (0. Note: Theorem 2. y) ≤ f (a. points where the function has a local maximum or local minimum. y) = x y > 0 = f (0. y) to have a local maximum or minimum at (a. y) inside some disk of positive radius centered at (a. y) where the values of x and y have the same sign (so that f ( x. y) be a real-valued function such that both ∂ x (a. So we know that ∂f ∂f g ′ (a) = 0. b) = 0. the single-variable function g( x) = f ( x. b) exist. b) has a local maximum at x = a. b). b) = 0 is called a critical point for the function f ( x. We know that f (a. In fact. So given ∂f a function f ( x. b) in the y direction and so ∂f (a. 0): ∂f ∂y ∂f ∂x = x = 0 ⇒ x = 0. then f has a global minimum at (a. f (a. b) is that ∇ f (a.7. we say that f has a local minimum at (a. If f ( x. 0) contains points ( x.5 Maxima and Minima The gradient can be used to ﬁnd extreme points of real-valued functions of several variables. b). Example 2. that is. then ∂ x (a. In particular. The function f ( x.

local minimum at $(0,0)$, while along the path $y = -x$ we have $f(x,y) = -x^2$, which has a local maximum at $(0,0)$. So $(0,0)$ is an example of a saddle point, i.e. it is a local maximum in one direction and a local minimum in another direction. The graph of $f(x,y)$ is shown in Figure 2.5.1, which is a hyperbolic paraboloid.

Figure 2.5.1  $f(x,y) = xy$, saddle point at $(0,0)$

The following theorem gives sufficient conditions for a critical point to be a local maximum or minimum of a smooth function (i.e. a function whose partial derivatives of all orders exist and are continuous), which we will not prove here.$^6$

6 See TAYLOR and MANN, § 7.6.

Theorem 2.6. Let $f(x,y)$ be a smooth real-valued function, with a critical point at $(a,b)$ (i.e. $\nabla f(a,b) = \mathbf{0}$). Define
\[
D = \frac{\partial^2 f}{\partial x^2}(a,b)\,\frac{\partial^2 f}{\partial y^2}(a,b) - \left(\frac{\partial^2 f}{\partial y\,\partial x}(a,b)\right)^2 .
\]
Then
(a) if $D > 0$ and $\frac{\partial^2 f}{\partial x^2}(a,b) > 0$, then $f$ has a local minimum at $(a,b)$
(b) if $D > 0$ and $\frac{\partial^2 f}{\partial x^2}(a,b) < 0$, then $f$ has a local maximum at $(a,b)$
(c) if $D < 0$, then $f$ has neither a local minimum nor a local maximum at $(a,b)$
(d) if $D = 0$, then the test fails.
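The case analysis in Theorem 2.6 is mechanical once the three second-order partial derivatives at a critical point are known. The following Java sketch classifies a critical point from those three numbers. The class and method names are illustrative and do not come from the text, and the sketch assumes the caller has already checked that the gradient vanishes at the point. Because it compares floating-point values with zero exactly, it is only meant for values computed by hand.

```java
// Illustrative sketch of the case analysis in Theorem 2.6.
// The names SecondDerivativeTest and classify are hypothetical.
class SecondDerivativeTest {
    // fxx, fyy, fxy: second-order partial derivatives of f evaluated
    // at a point (a,b) that is already known to be a critical point.
    static String classify(double fxx, double fyy, double fxy) {
        double D = fxx * fyy - fxy * fxy;
        if (D > 0) {
            // D > 0 forces fxx to be nonzero, so exactly one case applies
            return (fxx > 0) ? "local minimum" : "local maximum";
        } else if (D < 0) {
            return "neither a local minimum nor a local maximum";
        } else {
            return "test fails (D = 0)";
        }
    }

    public static void main(String[] args) {
        // f(x,y) = xy at (0,0): fxx = 0, fyy = 0, fxy = 1
        System.out.println(classify(0, 0, 1));
        // f(x,y) = x^2 + y^2 at (0,0): fxx = 2, fyy = 2, fxy = 0
        System.out.println(classify(2, 2, 0));
        // f(x,y) = x^4 + y^4 at (0,0): all second partials are 0
        System.out.println(classify(0, 0, 0));
    }
}
```

The three calls in main correspond to cases (c), (a) and (d) of the theorem, in that order.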

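If condition (c) holds, then $(a,b)$ is a saddle point. Note that the assumption that $f(x,y)$ is smooth means that
\[
D = \begin{vmatrix} \dfrac{\partial^2 f}{\partial x^2}(a,b) & \dfrac{\partial^2 f}{\partial y\,\partial x}(a,b) \\[2ex] \dfrac{\partial^2 f}{\partial x\,\partial y}(a,b) & \dfrac{\partial^2 f}{\partial y^2}(a,b) \end{vmatrix}
\]
since $\frac{\partial^2 f}{\partial y\,\partial x} = \frac{\partial^2 f}{\partial x\,\partial y}$. Also, if $D > 0$ then
\[
\frac{\partial^2 f}{\partial x^2}(a,b)\,\frac{\partial^2 f}{\partial y^2}(a,b) = D + \left(\frac{\partial^2 f}{\partial y\,\partial x}(a,b)\right)^2 > 0,
\]
and so $\frac{\partial^2 f}{\partial x^2}(a,b)$ and $\frac{\partial^2 f}{\partial y^2}(a,b)$ have the same sign. This means that in parts (a) and (b) of the theorem one can replace $\frac{\partial^2 f}{\partial x^2}(a,b)$ by $\frac{\partial^2 f}{\partial y^2}(a,b)$ if desired.

Example 2.19. Find all local maxima and minima of $f(x,y) = x^2 + xy + y^2 - 3x$.

Solution: First find the critical points, i.e. where $\nabla f = \mathbf{0}$. Since
\[
\frac{\partial f}{\partial x} = 2x + y - 3 \quad\text{and}\quad \frac{\partial f}{\partial y} = x + 2y
\]
then the critical points $(x,y)$ are the common solutions of the equations
\[
2x + y - 3 = 0, \qquad x + 2y = 0,
\]
which has the unique solution $(x,y) = (2,-1)$. So $(2,-1)$ is the only critical point.

To use Theorem 2.6, we need the second-order partial derivatives:
\[
\frac{\partial^2 f}{\partial x^2} = 2, \qquad \frac{\partial^2 f}{\partial y^2} = 2, \qquad \frac{\partial^2 f}{\partial y\,\partial x} = 1
\]
and so
\[
D = \frac{\partial^2 f}{\partial x^2}(2,-1)\,\frac{\partial^2 f}{\partial y^2}(2,-1) - \left(\frac{\partial^2 f}{\partial y\,\partial x}(2,-1)\right)^2 = (2)(2) - 1^2 = 3 > 0
\]
and $\frac{\partial^2 f}{\partial x^2}(2,-1) = 2 > 0$. Thus, $(2,-1)$ is a local minimum.

Example 2.20. Find all local maxima and minima of $f(x,y) = xy - x^3 - y^2$.

Solution: First find the critical points, i.e. where $\nabla f = \mathbf{0}$. Since
\[
\frac{\partial f}{\partial x} = y - 3x^2 \quad\text{and}\quad \frac{\partial f}{\partial y} = x - 2y
\]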

then the critical points $(x,y)$ are the common solutions of the equations
\[
y - 3x^2 = 0, \qquad x - 2y = 0 .
\]
The first equation yields $y = 3x^2$; substituting that into the second equation yields $x - 6x^2 = 0$, which has the solutions $x = 0$ and $x = \frac{1}{6}$. So $x = 0 \Rightarrow y = 3(0) = 0$ and $x = \frac{1}{6} \Rightarrow y = 3\left(\frac{1}{6}\right)^2 = \frac{1}{12}$. So the critical points are $(x,y) = (0,0)$ and $(x,y) = \left(\frac{1}{6}, \frac{1}{12}\right)$.

To use Theorem 2.6, we need the second-order partial derivatives:
\[
\frac{\partial^2 f}{\partial x^2} = -6x, \qquad \frac{\partial^2 f}{\partial y^2} = -2, \qquad \frac{\partial^2 f}{\partial y\,\partial x} = 1
\]
So
\[
D = \frac{\partial^2 f}{\partial x^2}(0,0)\,\frac{\partial^2 f}{\partial y^2}(0,0) - \left(\frac{\partial^2 f}{\partial y\,\partial x}(0,0)\right)^2 = (-6(0))(-2) - 1^2 = -1 < 0
\]
and thus $(0,0)$ is a saddle point. Also,
\[
D = \frac{\partial^2 f}{\partial x^2}\left(\tfrac{1}{6},\tfrac{1}{12}\right)\frac{\partial^2 f}{\partial y^2}\left(\tfrac{1}{6},\tfrac{1}{12}\right) - \left(\frac{\partial^2 f}{\partial y\,\partial x}\left(\tfrac{1}{6},\tfrac{1}{12}\right)\right)^2 = \left(-6\left(\tfrac{1}{6}\right)\right)(-2) - 1^2 = 1 > 0
\]
and $\frac{\partial^2 f}{\partial x^2}\left(\frac{1}{6},\frac{1}{12}\right) = -1 < 0$. Thus, $\left(\frac{1}{6},\frac{1}{12}\right)$ is a local maximum.

Example 2.21. Find all local maxima and minima of $f(x,y) = (x-2)^4 + (x-2y)^2$.

Solution: First find the critical points, i.e. where $\nabla f = \mathbf{0}$. Since
\[
\frac{\partial f}{\partial x} = 4(x-2)^3 + 2(x-2y) \quad\text{and}\quad \frac{\partial f}{\partial y} = -4(x-2y)
\]
then the critical points $(x,y)$ are the common solutions of the equations
\[
4(x-2)^3 + 2(x-2y) = 0, \qquad -4(x-2y) = 0 .
\]
The second equation yields $x = 2y$; substituting that into the first equation yields $4(2y-2)^3 = 0$, which has the solution $y = 1$, and so $x = 2(1) = 2$. Thus, $(2,1)$ is the only critical point.

To use Theorem 2.6, we need the second-order partial derivatives:
\[
\frac{\partial^2 f}{\partial x^2} = 12(x-2)^2 + 2, \qquad \frac{\partial^2 f}{\partial y^2} = 8, \qquad \frac{\partial^2 f}{\partial y\,\partial x} = -4
\]

So
\[
D = \frac{\partial^2 f}{\partial x^2}(2,1)\,\frac{\partial^2 f}{\partial y^2}(2,1) - \left(\frac{\partial^2 f}{\partial y\,\partial x}(2,1)\right)^2 = (2)(8) - (-4)^2 = 0
\]
and so the test fails. What can be done in this situation? Sometimes it is possible to examine the function to see directly the nature of a critical point. In our case, we see that $f(x,y) \ge 0$ for all $(x,y)$, since $f(x,y)$ is the sum of fourth and second powers of numbers and hence must be nonnegative. But we also see that $f(2,1) = 0$. Thus $f(x,y) \ge 0 = f(2,1)$ for all $(x,y)$, and hence $(2,1)$ is in fact a global minimum for $f$.

Example 2.22. Find all local maxima and minima of $f(x,y) = (x^2 + y^2)\,e^{-(x^2+y^2)}$.

Solution: First find the critical points, i.e. where $\nabla f = \mathbf{0}$. Since
\[
\frac{\partial f}{\partial x} = 2x\,(1 - (x^2+y^2))\,e^{-(x^2+y^2)}, \qquad
\frac{\partial f}{\partial y} = 2y\,(1 - (x^2+y^2))\,e^{-(x^2+y^2)}
\]
then the critical points are $(0,0)$ and all points $(x,y)$ on the unit circle $x^2 + y^2 = 1$.

To use Theorem 2.6, we need the second-order partial derivatives:
\[
\begin{aligned}
\frac{\partial^2 f}{\partial x^2} &= 2\,[\,1 - (x^2+y^2) - 2x^2 - 2x^2(1 - (x^2+y^2))\,]\,e^{-(x^2+y^2)} \\
\frac{\partial^2 f}{\partial y^2} &= 2\,[\,1 - (x^2+y^2) - 2y^2 - 2y^2(1 - (x^2+y^2))\,]\,e^{-(x^2+y^2)} \\
\frac{\partial^2 f}{\partial y\,\partial x} &= -4xy\,[\,2 - (x^2+y^2)\,]\,e^{-(x^2+y^2)}
\end{aligned}
\]
At $(0,0)$, we have $D = 4 > 0$ and $\frac{\partial^2 f}{\partial x^2}(0,0) = 2 > 0$, so $(0,0)$ is a local minimum. However, for points $(x,y)$ on the unit circle $x^2 + y^2 = 1$, we have
\[
D = (-4x^2 e^{-1})(-4y^2 e^{-1}) - (-4xy\,e^{-1})^2 = 0
\]
and so the test fails. If we look at the graph of $f(x,y)$, as shown in Figure 2.5.2, it looks like we might have a local maximum for $(x,y)$ on the unit circle $x^2 + y^2 = 1$. If we switch to using polar coordinates $(r,\theta)$ instead of $(x,y)$ in $\mathbb{R}^2$, where $r^2 = x^2 + y^2$, then we see that we can write $f(x,y)$ as a function $g(r)$ of the variable $r$ alone: $g(r) = r^2 e^{-r^2}$. Then $g'(r) = 2r(1 - r^2)\,e^{-r^2}$, so it has a critical point at $r = 1$, and we can check that $g''(1) = -4e^{-1} < 0$, so the Second Derivative Test from single-variable calculus says that $r = 1$ is a local maximum. But $r = 1$ corresponds to the unit circle $x^2 + y^2 = 1$. Thus, the points $(x,y)$ on the unit circle $x^2 + y^2 = 1$ are local maximum points for $f$.
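When the test fails, as it does on the unit circle in Example 2.22, looking at the function directly can also be done numerically. The short Java sketch below tabulates the radial function $g(r) = r^2 e^{-r^2}$ at a few radii around $r = 1$. The class name and the sample radii are arbitrary choices made for illustration, and a table of values only suggests a local maximum; it does not prove one.

```java
// Numerical look at g(r) = r^2 e^{-r^2} from Example 2.22 near r = 1.
// The class name RadialCheck and the sample radii are hypothetical choices.
class RadialCheck {
    // The function f written in terms of r alone, where r^2 = x^2 + y^2
    static double g(double r) {
        return r * r * Math.exp(-r * r);
    }

    public static void main(String[] args) {
        double[] radii = {0.8, 0.9, 1.0, 1.1, 1.2};
        for (double r : radii) {
            System.out.println("g(" + r + ") = " + g(r));
        }
    }
}
```

The printed values rise up to $r = 1$ and fall afterwards, which is consistent with the conclusion reached with the single-variable Second Derivative Test.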

Figure 2.5.2  $f(x,y) = (x^2 + y^2)\,e^{-(x^2+y^2)}$

Exercises

A

For Exercises 1-10, find all local maxima and minima of the function $f(x,y)$.

1. $f(x,y) = x^3 - 3x + y^2$
2. $f(x,y) = x^3 - 12x + y^2 + 8y$
3. $f(x,y) = x^3 - 3x + y^3 - 3y$
4. $f(x,y) = x^3 + 3x^2 + y^3 - 3y^2$
5. $f(x,y) = 2x^3 + 6xy + 3y^2$
6. $f(x,y) = 2x^3 - 6xy + y^2$
7. $f(x,y) = \sqrt{x^2 + y^2}$
8. $f(x,y) = x + 2y$
9. $f(x,y) = 4x^2 - 4xy + 2y^2 + 10x - 6y$
10. $f(x,y) = -4x^2 + 4xy - 2y^2 + 16x - 12y$

B

11. For a rectangular solid of volume 1000 cubic meters, find the dimensions that will minimize the surface area. (Hint: Use the volume condition to write the surface area as a function of just two variables.)
12. Prove that if $(a,b)$ is a local maximum or local minimum point for a smooth function $f(x,y)$, then the tangent plane to the surface $z = f(x,y)$ at the point $(a, b, f(a,b))$ is parallel to the $xy$-plane. (Hint: Use Theorem 2.5.)

C

13. Find three positive numbers $x$, $y$, $z$ whose sum is 10 such that $x^2 y^2 z$ is a maximum.

2.6 Unconstrained Optimization: Numerical Methods

The types of problems that we solved in the previous section were examples of unconstrained optimization problems. That is, we tried to find local (and perhaps even global) maximum and minimum points of real-valued functions $f(x,y)$, where the points $(x,y)$ could be any points in the domain of $f$. The method we used required us to find the critical points of $f$, which meant having to solve the equation $\nabla f = \mathbf{0}$, which in general is a system of two equations in two unknowns ($x$ and $y$). While this was relatively simple for the examples we did, in general this will not be the case. If the equations involve polynomials in $x$ and $y$ of degree three or higher, or complicated expressions involving trigonometric, exponential, or logarithmic functions, then solving even one such equation, let alone two, could be impossible by elementary means.$^7$

7 This is also a problem for the equivalent method (the Second Derivative Test) in single-variable calculus, though one that is not usually emphasized.

For example, if one of the equations that had to be solved was
\[
x^3 + 9x - 2 = 0,
\]
you may have a hard time getting the exact solutions. Trial and error would not help much, especially since the only real solution$^8$ turns out to be $\sqrt[3]{\sqrt{28} + 1} - \sqrt[3]{\sqrt{28} - 1}$. In a situation such as this, the only choice may be to find a solution using some numerical method which gives a sequence of numbers which converge to the actual solution. An example is Newton's method for solving equations $f(x) = 0$, which you probably learned in single-variable calculus. In this section we will describe another method of Newton for finding critical points of real-valued functions of two variables.

8 There are also two nonreal, complex number solutions. Cubic polynomial equations in one variable can be solved using Cardan's formulas, which are not quite as simple as the familiar quadratic formula. See USPENSKY for more details. There are formulas for solving polynomial equations of degree 4, but it can be proved that there is no general formula for solving equations for polynomials of degree five or higher.

Let $f(x,y)$ be a smooth real-valued function, and define
\[
D(x,y) = \frac{\partial^2 f}{\partial x^2}(x,y)\,\frac{\partial^2 f}{\partial y^2}(x,y) - \left(\frac{\partial^2 f}{\partial y\,\partial x}(x,y)\right)^2 .
\]
Newton's algorithm: Pick an initial point $(x_0, y_0)$. For $n = 0, 1, 2, 3, \ldots$, define:
\[
x_{n+1} = x_n - \frac{\begin{vmatrix} \dfrac{\partial^2 f}{\partial y^2}(x_n,y_n) & \dfrac{\partial^2 f}{\partial x\,\partial y}(x_n,y_n) \\[2ex] \dfrac{\partial f}{\partial y}(x_n,y_n) & \dfrac{\partial f}{\partial x}(x_n,y_n) \end{vmatrix}}{D(x_n,y_n)}, \qquad
y_{n+1} = y_n - \frac{\begin{vmatrix} \dfrac{\partial^2 f}{\partial x^2}(x_n,y_n) & \dfrac{\partial^2 f}{\partial x\,\partial y}(x_n,y_n) \\[2ex] \dfrac{\partial f}{\partial x}(x_n,y_n) & \dfrac{\partial f}{\partial y}(x_n,y_n) \end{vmatrix}}{D(x_n,y_n)} \tag{2.14}
\]
Then the sequence of points $(x_n, y_n)_{n=1}^{\infty}$ converges to a critical point. If there are several critical points, then you will have to try different initial points to find them.
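A single pass through formulas (2.14) can be sketched in Java as follows. The example uses $f(x,y) = x^2 + xy + y^2 - 3x$ from Example 2.19, whose partial derivatives are simple enough to check by hand; the class and method names are made up for illustration. Since this particular $f$ is quadratic, one step from any starting point where $D \ne 0$ should land on its only critical point, which makes the arithmetic easy to verify.

```java
// One iteration of Newton's algorithm (formulas (2.14)) for
// f(x,y) = x^2 + xy + y^2 - 3x. The names NewtonStep and step are hypothetical.
class NewtonStep {
    static double fx(double x, double y)  { return 2*x + y - 3; } // df/dx
    static double fy(double x, double y)  { return x + 2*y; }     // df/dy
    static double fxx(double x, double y) { return 2; }           // d2f/dx2
    static double fyy(double x, double y) { return 2; }           // d2f/dy2
    static double fxy(double x, double y) { return 1; }           // mixed partial

    // Returns {x_{n+1}, y_{n+1}} computed from the current point (xn, yn)
    static double[] step(double xn, double yn) {
        double D = fxx(xn, yn)*fyy(xn, yn) - Math.pow(fxy(xn, yn), 2);
        if (D == 0) {
            throw new ArithmeticException("D = 0, cannot divide");
        }
        // Each 2x2 determinant in (2.14), expanded
        double xNext = xn - (fyy(xn, yn)*fx(xn, yn) - fxy(xn, yn)*fy(xn, yn)) / D;
        double yNext = yn - (fxx(xn, yn)*fy(xn, yn) - fxy(xn, yn)*fx(xn, yn)) / D;
        return new double[] { xNext, yNext };
    }

    public static void main(String[] args) {
        double[] p = step(0.0, 0.0);
        System.out.println("(" + p[0] + "," + p[1] + ")");
    }
}
```

Starting from $(0,0)$ we have $D = 3$, $\frac{\partial f}{\partial x} = -3$ and $\frac{\partial f}{\partial y} = 0$, so the step gives $x_1 = 0 - (-6)/3 = 2$ and $y_1 = 0 - 3/3 = -1$, the critical point $(2,-1)$ of Example 2.19.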

Example 2.23. Find all local maxima and minima of $f(x,y) = x^3 - xy - x + xy^3 - y^4$.

Solution: First calculate the necessary partial derivatives:
\[
\frac{\partial f}{\partial x} = 3x^2 - y - 1 + y^3, \qquad \frac{\partial f}{\partial y} = -x + 3xy^2 - 4y^3
\]
\[
\frac{\partial^2 f}{\partial x^2} = 6x, \qquad \frac{\partial^2 f}{\partial y^2} = 6xy - 12y^2, \qquad \frac{\partial^2 f}{\partial y\,\partial x} = -1 + 3y^2
\]
Notice that solving $\nabla f = \mathbf{0}$ would involve solving two third-degree polynomial equations in $x$ and $y$, which in this case can not be done easily.

We need to pick an initial point $(x_0, y_0)$ for our algorithm. Looking at the graph of $z = f(x,y)$ over a large region may help (see Figure 2.6.1 below), though it may be hard to tell where the critical points are.

Figure 2.6.1  $f(x,y) = x^3 - xy - x + xy^3 - y^4$ for $-20 \le x \le 20$ and $-20 \le y \le 20$

Notice in the formulas (2.14) that we divide by $D$, so we should pick an initial point where $D$ is not zero. And we can see that $D(0,0) = (0)(0) - (-1)^2 = -1 \ne 0$, so take $(0,0)$ as our initial point. Since it may take a large number of iterations of Newton's algorithm to be sure that we are close enough to the actual critical point, and since the computations are quite tedious, we will let a computer do the computing. For this, we will write a simple program, using the Java programming language, which will take a given initial point as a parameter and then perform 100 iterations of Newton's algorithm. In each iteration the new point will be printed, so that we can see if there is convergence. The full code is shown in Listing 2.1.

   //Program to find the critical points of f(x,y)=x^3-xy-x+xy^3-y^4
   public class newton {
      public static void main(String[] args) {
         //Get the initial point (x,y) as command-line parameters
         double x = Double.parseDouble(args[0]);  //Initial x value
         double y = Double.parseDouble(args[1]);  //Initial y value
         System.out.println("Initial point: (" + x + "," + y + ")");
         //Go through 100 iterations of Newton's algorithm
         for (int n=1; n<=100; n++) {
            double D = fxx(x,y)*fyy(x,y) - Math.pow(fxy(x,y),2);
            double xn = x; double yn = y;  //The current x and y values
            if (D == 0) {  //We can not divide by 0
               System.out.println("Error: D = 0 at iteration n = " + n);
               System.exit(0);  //End the program
            } else {  //Calculate the new values for x and y
               x = xn - (fyy(xn,yn)*fx(xn,yn) - fxy(xn,yn)*fy(xn,yn))/D;
               y = yn - (fxx(xn,yn)*fy(xn,yn) - fxy(xn,yn)*fx(xn,yn))/D;
               System.out.println("n = " + n + ": (" + x + "," + y + ")");
            }
         }
      }
      //Below are the parts specific to the function f
      //The first partial derivative of f wrt x: 3x^2-y-1+y^3
      public static double fx(double x, double y) {
         return 3*Math.pow(x,2) - y - 1 + Math.pow(y,3);
      }
      //The first partial derivative of f wrt y: -x+3xy^2-4y^3
      public static double fy(double x, double y) {
         return -x + 3*x*Math.pow(y,2) - 4*Math.pow(y,3);
      }
      //The second partial derivative of f wrt x: 6x
      public static double fxx(double x, double y) {
         return 6*x;
      }
      //The second partial derivative of f wrt y: 6xy-12y^2
      public static double fyy(double x, double y) {
         return 6*x*y - 12*Math.pow(y,2);
      }
      //The mixed second partial derivative of f wrt x and y: -1+3y^2
      public static double fxy(double x, double y) {
         return -1 + 3*Math.pow(y,2);
      }
   }

Listing 2.1  Program listing for newton.java

To use this program, you should first save the code in Listing 2.1 in a plain text file called newton.java. You will need the Java Development Kit$^9$ to compile the code. In the directory where newton.java is saved, run this command at a command prompt to compile the code:

   javac newton.java

9 Available for free at http://www.oracle.com/technetwork/java/javase/downloads/

Then run the program with the initial point $(0,0)$ with this command:

   java newton 0 0

Below is the output of the program using $(0,0)$ as the initial point, truncated to show the first 10 lines and the last 5 lines:

   java newton 0 0
   Initial point: (0.0,0.0)
   n = 1: (0.0,-1.0)
   n = 2: (1.0,-0.5)
   n = 3: (0.6065857885615251,-0.44194107452339687)
   n = 4: (0.484506572966545,-0.405341511995805)
   n = 5: (0.47123972682634485,-0.3966334583092305)
   n = 6: (0.47113558510349535,-0.39636450001936047)
   n = 7: (0.4711356343449705,-0.3963643379632247)
   n = 8: (0.4711356343449874,-0.39636433796318005)
   n = 9: (0.4711356343449874,-0.39636433796318005)
   n = 10: (0.4711356343449874,-0.39636433796318005)
   ...
   n = 96: (0.4711356343449874,-0.39636433796318005)
   n = 97: (0.4711356343449874,-0.39636433796318005)
   n = 98: (0.4711356343449874,-0.39636433796318005)
   n = 99: (0.4711356343449874,-0.39636433796318005)
   n = 100: (0.4711356343449874,-0.39636433796318005)

As you can see, we appear to have converged fairly quickly (after only 8 iterations) to what appears to be an actual critical point (up to Java's level of precision), namely the point $(0.4711356343449874, -0.39636433796318005)$. It is easy to confirm that $\nabla f = \mathbf{0}$ at this point, either by evaluating $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$ at the point ourselves or by modifying our program to also print the values of the partial derivatives at the point. It turns out that both partial derivatives are indeed close enough to zero to be considered zero:
\[
\begin{aligned}
\frac{\partial f}{\partial x}(0.4711356343449874, -0.39636433796318005) &= 4.85722573273506 \times 10^{-17} \\
\frac{\partial f}{\partial y}(0.4711356343449874, -0.39636433796318005) &= -8.326672684688674 \times 10^{-17}
\end{aligned}
\]
We also have $D(0.4711356343449874, -0.39636433796318005) = -8.776075636032301 < 0$, so by Theorem 2.6 we know that $(0.4711356343449874, -0.39636433796318005)$ is a saddle point.

Since $\nabla f$ consists of cubic polynomials, it seems likely that there may be three critical points. The computer program makes experimenting with other initial points easy, and trying different values does indeed lead to different sequences which converge:

   java newton -1 -1
   Initial point: (-1.0,-1.0)
   n = 1: (-0.5,-0.5)
   n = 2: (-0.49295774647887325,-0.08450704225352113)
   n = 3: (-0.1855674752461383,-1.2047647348546167)
   n = 4: (-0.4540060574531383,-0.8643989895639324)
   n = 5: (-0.3672160534444,-0.5426077421319053)
   n = 6: (-0.4794622222856417,-0.24529117721011612)
   n = 7: (0.11570743992954591,-2.4319791238981274)
   n = 8: (-0.05837851765533317,-1.6536079835854451)
   n = 9: (-0.129841298650007,-1.121516233310142)
   n = 10: (-1.004453014967208,-0.9206128022529645)
   n = 11: (-0.5161209914612475,-0.4176293491131443)
   n = 12: (-0.5788664043863884,0.2918236503332734)
   n = 13: (-0.6985177124230715,0.49848120123515316)
   n = 14: (-0.6733618916578702,0.4345777963475479)
   n = 15: (-0.6704392913413444,0.4252025996474051)
   n = 16: (-0.6703832679150286,0.4250147307973365)
   n = 17: (-0.6703832459238701,0.42501465652421205)
   n = 18: (-0.6703832459238667,0.4250146565242004)
   n = 19: (-0.6703832459238667,0.42501465652420045)
   n = 20: (-0.6703832459238667,0.42501465652420045)
   ...
   n = 98: (-0.6703832459238667,0.42501465652420045)
   n = 99: (-0.6703832459238667,0.42501465652420045)
   n = 100: (-0.6703832459238667,0.42501465652420045)

Again, it is easy to confirm that both $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$ vanish at the point $(-0.6703832459238667, 0.42501465652420045)$, which means it is a critical point. And
\[
\begin{aligned}
D(-0.6703832459238667, 0.42501465652420045) &= 15.3853578526055 > 0 \\
\frac{\partial^2 f}{\partial x^2}(-0.6703832459238667, 0.42501465652420045) &= -4.0222994755432 < 0
\end{aligned}
\]
so we know that $(-0.6703832459238667, 0.42501465652420045)$ is a local maximum. An idea of what the graph of $f$ looks like near that point is shown in Figure 2.6.2, which does suggest a local maximum around that point.

Finally, running the computer program with the initial point $(-5,-5)$ yields the critical point $(-7.540962756992551, -5.595509445899435)$, with $D < 0$ at that point, which makes it a saddle point.
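The text mentions confirming that $\nabla f = \mathbf{0}$ at a limit point by having the program print the partial derivatives. One way to do that without touching Listing 2.1 is a small separate class, sketched below. The class name is hypothetical, the partial derivative formulas are those computed in Example 2.23, and the default point is the limit of the first run; because the formulas here use plain multiplication instead of Math.pow, the printed values may differ in the last digits from those quoted in the text.

```java
// Sketch: evaluate the gradient and D of
// f(x,y) = x^3 - xy - x + xy^3 - y^4 at a given point.
// The class name CheckCriticalPoint is hypothetical.
class CheckCriticalPoint {
    static double fx(double x, double y)  { return 3*x*x - y - 1 + y*y*y; }
    static double fy(double x, double y)  { return -x + 3*x*y*y - 4*y*y*y; }
    static double fxx(double x, double y) { return 6*x; }
    static double fyy(double x, double y) { return 6*x*y - 12*y*y; }
    static double fxy(double x, double y) { return -1 + 3*y*y; }

    // D(x,y) as defined for Newton's algorithm and Theorem 2.6
    static double D(double x, double y) {
        return fxx(x, y)*fyy(x, y) - fxy(x, y)*fxy(x, y);
    }

    public static void main(String[] args) {
        // Use the command-line point if given, else the limit of "java newton 0 0"
        double x = (args.length > 0) ? Double.parseDouble(args[0]) : 0.4711356343449874;
        double y = (args.length > 1) ? Double.parseDouble(args[1]) : -0.39636433796318005;
        System.out.println("df/dx   = " + fx(x, y));
        System.out.println("df/dy   = " + fy(x, y));
        System.out.println("D       = " + D(x, y));
        System.out.println("d2f/dx2 = " + fxx(x, y));
    }
}
```

Running it with the coordinates of another limit point as arguments gives the sign information that Theorem 2.6 needs for that point.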

Figure 2.6.2  $f(x,y) = x^3 - xy - x + xy^3 - y^4$ for $-1 \le x \le 0$ and $0 \le y \le 1$, with the point $(-0.67, 0.42, 0.57)$ marked

We can summarize our findings for the function $f(x,y) = x^3 - xy - x + xy^3 - y^4$:

   $(0.4711356343449874, -0.39636433796318005)$ : saddle point
   $(-0.6703832459238667, 0.42501465652420045)$ : local maximum
   $(-7.540962756992551, -5.595509445899435)$ : saddle point

The derivation of Newton's algorithm, and the proof that it converges (given a "reasonable" choice for the initial point), requires techniques beyond the scope of this text. See RALSTON and RABINOWITZ for more detail and for discussion of other numerical methods. Our description of Newton's algorithm is the special two-variable case of a more general algorithm that can be applied to functions of $n \ge 2$ variables.

In the case of functions which have a global maximum or minimum, Newton's algorithm can be used to find those points. In general, global maxima and minima tend to be more interesting than local versions, at least in practical applications. A maximization problem can always be turned into a minimization problem (why?), so a large number of methods have been developed to find the global minimum of functions of any number of variables. This field of study is called nonlinear programming. Many of these methods are based on the steepest descent technique, which is based on an idea that we discussed in Section 2.4. Recall that the negative gradient $-\nabla f$ gives the direction of the fastest rate of decrease of a function $f$. The crux of the steepest descent idea, then, is that starting from some initial point, you move a certain amount in the direction of $-\nabla f$ at that point. Wherever that takes you

becomes your new point, and you then just keep repeating that procedure until eventually (hopefully) you reach the point where $f$ has its smallest value. There is a "pure" steepest descent method, and a multitude of variations on it that improve the rate of convergence, ease of calculation, etc. In fact, Newton's algorithm can be interpreted as a modified steepest descent method. For more discussion of this, and of nonlinear programming in general, see BAZARAA, SHERALI and SHETTY.

Exercises

C

1. Recall Example 2.21 from the previous section, where we showed that the point $(2,1)$ was a global minimum for the function $f(x,y) = (x-2)^4 + (x-2y)^2$. Notice that our computer program can be modified fairly easily to use this function (just change the return values in the fx, fy, fxx, fyy and fxy function definitions to use the appropriate partial derivative). Either modify that program or write one of your own in a programming language of your choice to show that Newton's algorithm does lead to the point $(2,1)$. First use the initial point $(0,3)$, then use the initial point $(3,2)$, and compare the results. Make sure that your program attempts to do 100 iterations of the algorithm. Did anything strange happen when your program ran? If so, how do you explain it? (Hint: Something strange should happen.)

2. There is a version of Newton's algorithm for solving a system of two equations
\[
f_1(x,y) = 0 \quad\text{and}\quad f_2(x,y) = 0,
\]
where $f_1(x,y)$ and $f_2(x,y)$ are smooth real-valued functions:

Pick an initial point $(x_0, y_0)$. For $n = 0, 1, 2, 3, \ldots$, define:
\[
x_{n+1} = x_n - \frac{\begin{vmatrix} f_1(x_n,y_n) & f_2(x_n,y_n) \\[1ex] \dfrac{\partial f_1}{\partial y}(x_n,y_n) & \dfrac{\partial f_2}{\partial y}(x_n,y_n) \end{vmatrix}}{D(x_n,y_n)}, \qquad
y_{n+1} = y_n + \frac{\begin{vmatrix} f_1(x_n,y_n) & f_2(x_n,y_n) \\[1ex] \dfrac{\partial f_1}{\partial x}(x_n,y_n) & \dfrac{\partial f_2}{\partial x}(x_n,y_n) \end{vmatrix}}{D(x_n,y_n)},
\]
where
\[
D(x_n,y_n) = \frac{\partial f_1}{\partial x}(x_n,y_n)\,\frac{\partial f_2}{\partial y}(x_n,y_n) - \frac{\partial f_1}{\partial y}(x_n,y_n)\,\frac{\partial f_2}{\partial x}(x_n,y_n) .
\]
Then the sequence of points $(x_n, y_n)_{n=1}^{\infty}$ converges to a solution. Write a computer program that uses this algorithm to find approximate solutions to the system of equations
\[
f_1(x,y) = \sin(xy) - x - y = 0 \quad\text{and}\quad f_2(x,y) = e^{2x} - 2x + 3y = 0 .
\]
Show that you get two different solutions when using $(0,0)$ and $(1,1)$ for the initial point $(x_0, y_0)$.

2.7 Constrained Optimization: Lagrange Multipliers

In Sections 2.5 and 2.6 we were concerned with finding maxima and minima of functions without any constraints on the variables (other than being in the domain of the function). What would we do if there were constraints on the variables? The following example illustrates a simple case of this type of problem.

Example 2.24. For a rectangle whose perimeter is 20 m, find the dimensions that will maximize the area.

Solution: The area $A$ of a rectangle with width $x$ and height $y$ is $A = xy$. The perimeter $P$ of the rectangle is then given by the formula $P = 2x + 2y$. Since we are given that the perimeter $P = 20$, this problem can be stated as:

   Maximize : $f(x,y) = xy$
   given : $2x + 2y = 20$

The reader is probably familiar with a simple method, using single-variable calculus, for solving this problem. Since we must have $2x + 2y = 20$, then we can solve for, say, $y$ in terms of $x$ using that equation. This gives $y = 10 - x$, which we then substitute into $f$ to get $f(x,y) = xy = x(10 - x) = 10x - x^2$. This is now a function of $x$ alone, so we now just have to maximize the function $f(x) = 10x - x^2$ on the interval $[0,10]$. Since $f'(x) = 10 - 2x = 0 \Rightarrow x = 5$ and $f''(5) = -2 < 0$, then the Second Derivative Test tells us that $x = 5$ is a local maximum for $f$, and hence $x = 5$ must be the global maximum on the interval $[0,10]$ (since $f = 0$ at the endpoints of the interval). So since $y = 10 - x = 5$, then the maximum area occurs for a rectangle whose width and height both are 5 m.

Notice in the above example that the ease of the solution depended on being able to solve for one variable in terms of the other in the equation $2x + 2y = 20$. But what if that were not possible (which is often the case)? In this section we will use a general method, called the Lagrange multiplier method$^{10}$, for solving constrained optimization problems:

   Maximize (or minimize) : $f(x,y)$ (or $f(x,y,z)$)
   given : $g(x,y) = c$ (or $g(x,y,z) = c$) for some constant $c$

10 Named after the French mathematician Joseph Louis Lagrange (1736-1813).

The equation $g(x,y) = c$ is called the constraint equation, and we say that $x$ and $y$ are constrained by $g(x,y) = c$. Points $(x,y)$ which are maxima or minima of $f(x,y)$ with the condition that they satisfy the constraint equation $g(x,y) = c$ are called constrained maximum or constrained minimum points, respectively. Similar definitions hold for functions of three variables.

The Lagrange multiplier method for solving such problems can now be stated:

Theorem 2.7. Let $f(x,y)$ and $g(x,y)$ be smooth functions, and suppose that $c$ is a scalar constant such that $\nabla g(x,y) \ne \mathbf{0}$ for all $(x,y)$ that satisfy the equation $g(x,y) = c$. Then to solve the constrained optimization problem

   Maximize (or minimize) : $f(x,y)$
   given : $g(x,y) = c$,

find the points $(x,y)$ that solve the equation $\nabla f(x,y) = \lambda \nabla g(x,y)$ for some constant $\lambda$ (the number $\lambda$ is called the Lagrange multiplier). If there is a constrained maximum or minimum, then it must be such a point.

A rigorous proof of the above theorem requires use of the Implicit Function Theorem, which is beyond the scope of this text.$^{11}$ Note that the theorem only gives a necessary condition for a point to be a constrained maximum or minimum. Whether a point $(x,y)$ that satisfies $\nabla f(x,y) = \lambda \nabla g(x,y)$ for some $\lambda$ actually is a constrained maximum or minimum can sometimes be determined by the nature of the problem itself. For instance, in Example 2.24 it was clear that there had to be a global maximum.

11 See TAYLOR and MANN, § 6.8 for more detail.

So how can you tell when a point that satisfies the condition in Theorem 2.7 really is a constrained maximum or minimum? The answer is that it depends on the constraint function $g(x,y)$, together with any implicit constraints. It can be shown$^{12}$ that if the constraint equation $g(x,y) = c$ (plus any hidden constraints) describes a bounded set $B$ in $\mathbb{R}^2$, then the constrained maximum or minimum of $f(x,y)$ will occur either at a point $(x,y)$ satisfying $\nabla f(x,y) = \lambda \nabla g(x,y)$ or at a "boundary" point of the set $B$.

12 Again, see TAYLOR and MANN.

In Example 2.24 the constraint equation $2x + 2y = 20$ describes a line in $\mathbb{R}^2$, which by itself is not bounded. However, there are "hidden" constraints, due to the nature of the problem, namely $0 \le x, y \le 10$, which cause that line to be restricted to a line segment in $\mathbb{R}^2$ (including the endpoints of that line segment), which is bounded.

Example 2.25. For a rectangle whose perimeter is 20 m, use the Lagrange multiplier method to find the dimensions that will maximize the area.

Solution: As we saw in Example 2.24, with $x$ and $y$ representing the width and height, respectively, of the rectangle, this problem can be stated as:

   Maximize : $f(x,y) = xy$
   given : $g(x,y) = 2x + 2y = 20$

Then solving the equation $\nabla f(x,y) = \lambda \nabla g(x,y)$ for some $\lambda$ means solving the equations

\[
\frac{\partial f}{\partial x} = \lambda\,\frac{\partial g}{\partial x} \quad\text{and}\quad \frac{\partial f}{\partial y} = \lambda\,\frac{\partial g}{\partial y},
\]
namely:
\[
y = 2\lambda, \qquad x = 2\lambda .
\]
The general idea is to solve for $\lambda$ in both equations, then set those expressions equal (since they both equal $\lambda$) to solve for $x$ and $y$. Doing this we get
\[
\frac{y}{2} = \lambda = \frac{x}{2} \;\Rightarrow\; x = y,
\]
so now substitute either of the expressions for $x$ or $y$ into the constraint equation to solve for $x$ and $y$:
\[
20 = g(x,y) = 2x + 2y = 2x + 2x = 4x \;\Rightarrow\; x = 5 \;\Rightarrow\; y = 5
\]
There must be a maximum area, since the minimum area is 0 and $f(5,5) = 25 > 0$, so the point $(5,5)$ that we found (called a constrained critical point) must be the constrained maximum.

$\therefore$ The maximum area occurs for a rectangle whose width and height both are 5 m.

Example 2.26. Find the points on the circle $x^2 + y^2 = 80$ which are closest to and farthest from the point $(1,2)$.

Solution: The distance $d$ from any point $(x,y)$ to the point $(1,2)$ is
\[
d = \sqrt{(x-1)^2 + (y-2)^2},
\]
and minimizing the distance is equivalent to minimizing the square of the distance. Thus the problem can be stated as:

   Maximize (and minimize) : $f(x,y) = (x-1)^2 + (y-2)^2$
   given : $g(x,y) = x^2 + y^2 = 80$

Solving $\nabla f(x,y) = \lambda \nabla g(x,y)$ means solving the following equations:
\[
2(x-1) = 2\lambda x, \qquad 2(y-2) = 2\lambda y .
\]
Note that $x \ne 0$ since otherwise we would get $-2 = 0$ in the first equation. Similarly, $y \ne 0$. So we can solve both equations for $\lambda$ as follows:
\[
\frac{x-1}{x} = \lambda = \frac{y-2}{y} \;\Rightarrow\; xy - y = xy - 2x \;\Rightarrow\; y = 2x .
\]

Substituting this into $g(x,y) = x^2 + y^2 = 80$ yields $5x^2 = 80$, so $x = \pm 4$. So the two constrained critical points are $(4,8)$ and $(-4,-8)$. Since $f(4,8) = 45$ and $f(-4,-8) = 125$, and since there must be points on the circle closest to and farthest from $(1,2)$, then it must be the case that $(4,8)$ is the point on the circle closest to $(1,2)$ and $(-4,-8)$ is the farthest from $(1,2)$ (see Figure 2.7.1).

Figure 2.7.1  The circle $x^2 + y^2 = 80$ with the points $(4,8)$, $(1,2)$ and $(-4,-8)$

Notice that since the constraint equation $x^2 + y^2 = 80$ describes a circle, which is a bounded set in $\mathbb{R}^2$, then we were guaranteed that the constrained critical points we found were indeed the constrained maximum and minimum.

The Lagrange multiplier method can be extended to functions of three variables.

Example 2.27.

   Maximize (and minimize) : $f(x,y,z) = x + z$
   given : $g(x,y,z) = x^2 + y^2 + z^2 = 1$

Solution: Solve the equation $\nabla f(x,y,z) = \lambda \nabla g(x,y,z)$:
\[
1 = 2\lambda x, \qquad 0 = 2\lambda y, \qquad 1 = 2\lambda z
\]
The first equation implies $\lambda \ne 0$ (otherwise we would have $1 = 0$), so we can divide by $\lambda$ in the second equation to get $y = 0$ and we can divide by $\lambda$ in the first and third equations to get $x = \frac{1}{2\lambda} = z$. Substituting these expressions into the constraint equation $g(x,y,z) = x^2 + y^2 + z^2 = 1$ yields the constrained critical points $\left(\frac{1}{\sqrt{2}}, 0, \frac{1}{\sqrt{2}}\right)$ and $\left(\frac{-1}{\sqrt{2}}, 0, \frac{-1}{\sqrt{2}}\right)$. Since $f\left(\frac{1}{\sqrt{2}}, 0, \frac{1}{\sqrt{2}}\right) > f\left(\frac{-1}{\sqrt{2}}, 0, \frac{-1}{\sqrt{2}}\right)$, and since the constraint equation $x^2 + y^2 + z^2 = 1$ describes a sphere (which is bounded) in $\mathbb{R}^3$, then $\left(\frac{1}{\sqrt{2}}, 0, \frac{1}{\sqrt{2}}\right)$ is the constrained maximum point and $\left(\frac{-1}{\sqrt{2}}, 0, \frac{-1}{\sqrt{2}}\right)$ is the constrained minimum point.

So far we have not attached any significance to the value of the Lagrange multiplier $\lambda$. We needed $\lambda$ only to find the constrained critical points, but made no use of its value. It turns out that $\lambda$ gives an approximation of the change in the value of the function $f(x,y)$ that we wish to maximize or minimize, when the constant $c$ in the constraint equation $g(x,y) = c$ is changed by 1.
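A brief heuristic for why $\lambda$ plays this role can be sketched as follows. It rests on an assumption that the text does not state: that the constrained maximum (or minimum) point $(x(c), y(c))$ varies differentiably with the constant $c$. Under that assumption, the chain rule together with $\nabla f = \lambda \nabla g$ at the constrained critical point and $g(x(c), y(c)) = c$ gives:

```latex
\begin{align*}
\frac{d}{dc}\, f\bigl(x(c),y(c)\bigr)
  &= \nabla f \cdot \bigl(x'(c),\,y'(c)\bigr)
   = \lambda\, \nabla g \cdot \bigl(x'(c),\,y'(c)\bigr) \\
  &= \lambda\, \frac{d}{dc}\, g\bigl(x(c),y(c)\bigr)
   = \lambda\, \frac{d}{dc}\,(c)
   = \lambda .
\end{align*}
```

So $\lambda$ is the rate of change of the optimal value of $f$ with respect to $c$, and a change of 1 in $c$ changes that optimal value by approximately $\lambda$.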

For example, in Example 2.25 we showed that the constrained optimization problem

   Maximize : $f(x,y) = xy$
   given : $g(x,y) = 2x + 2y = 20$

had the solution $(x,y) = (5,5)$, and that $\lambda = x/2 = y/2$. Thus, $\lambda = 2.5$. In a similar fashion we could show that the constrained optimization problem

   Maximize : $f(x,y) = xy$
   given : $g(x,y) = 2x + 2y = 21$

has the solution $(x,y) = (5.25, 5.25)$. So we see that the value of $f(x,y)$ at the constrained maximum increased from $f(5,5) = 25$ to $f(5.25, 5.25) = 27.5625$, i.e. it increased by 2.5625 when we increased the value of $c$ in the constraint equation $g(x,y) = c$ from $c = 20$ to $c = 21$. Notice that $\lambda = 2.5$ is close to 2.5625, that is,
\[
\lambda \approx \Delta f = f(\text{new max. pt}) - f(\text{old max. pt}) .
\]
Finally, note that solving the equation $\nabla f(x,y) = \lambda \nabla g(x,y)$ means having to solve a system of two (possibly nonlinear) equations in three unknowns, which as we have seen before, may not be possible to do. And the 3-variable case can get even more complicated. All of this somewhat restricts the usefulness of Lagrange's method to relatively simple functions. Luckily there are many numerical methods for solving constrained optimization problems, though we will not discuss them here.$^{13}$

13 See BAZARAA, SHERALI and SHETTY.

Exercises

A

1. Find the constrained maxima and minima of $f(x,y) = 2x + y$ given that $x^2 + y^2 = 4$.
2. Find the constrained maxima and minima of $f(x,y) = xy$ given that $x^2 + 3y^2 = 6$.
3. Find the points on the circle $x^2 + y^2 = 100$ which are closest to and farthest from the point $(2,3)$.

B

4. Find the constrained maxima and minima of $f(x,y,z) = x + y^2 + 2z$ given that $4x^2 + 9y^2 - 36z^2 = 36$.
5. Find the volume of the largest rectangular parallelepiped that can be inscribed in the ellipsoid
\[
\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1 .
\]

3 Multiple Integrals

3.1 Double Integrals

In single-variable calculus, differentiation and integration are thought of as inverse operations. For instance, to integrate a function $f(x)$ it is necessary to find the antiderivative of $f$, that is, another function $F(x)$ whose derivative is $f(x)$. Is there a similar way of defining integration of real-valued functions of two or more variables? The answer is yes, as we will see shortly. Recall also that the definite integral of a nonnegative function $f(x) \ge 0$ represented the area "under" the curve $y = f(x)$. As we will now see, the double integral of a nonnegative real-valued function $f(x, y) \ge 0$ represents the volume "under" the surface $z = f(x, y)$.

Let $f(x, y)$ be a continuous function such that $f(x, y) \ge 0$ for all $(x, y)$ on the rectangle $R = \{(x, y) : a \le x \le b,\ c \le y \le d\}$ in $\mathbb{R}^2$. We will often write this as $R = [a, b] \times [c, d]$. For any number $x^*$ in the interval $[a, b]$, slice the surface $z = f(x, y)$ with the plane $x = x^*$ parallel to the $yz$-plane. Then the trace of the surface in that plane is the curve $f(x^*, y)$, where $x^*$ is fixed and only $y$ varies. The area $A$ under that curve (i.e. the area of the region between the curve and the $xy$-plane) as $y$ varies over the interval $[c, d]$ then depends only on the value of $x^*$. So using the variable $x$ instead of $x^*$, let $A(x)$ be that area (see Figure 3.1.1).

[Figure 3.1.1: The area $A(x)$ varies with $x$]

Then $A(x) = \int_c^d f(x, y)\,dy$ since we are treating $x$ as fixed, and only $y$ varies. This makes sense since for a fixed $x$ the function $f(x, y)$ is a continuous function of $y$ over the interval $[c, d]$, so we know that the area under the curve is the definite integral. The area $A(x)$ is a function of $x$, so by the "slice" or cross-section method from single-variable calculus we know that the volume $V$ of the solid under the surface $z = f(x, y)$ but above the $xy$-plane over the rectangle $R$ is the integral over $[a, b]$ of that cross-sectional area $A(x)$:

$$V = \int_a^b A(x)\,dx = \int_a^b \left[ \int_c^d f(x, y)\,dy \right] dx \tag{3.1}$$

We will always refer to this volume as "the volume under the surface". The above expression uses what are called iterated integrals. First the function $f(x, y)$ is integrated as a function of $y$, treating the variable $x$ as a constant (this is called integrating with respect to $y$). That is what occurs in the "inner" integral between the square brackets in equation (3.1). This is the first iterated integral. Once that integration is performed, the result is then an expression involving only $x$, which can then be integrated with respect to $x$. That is what occurs in the "outer" integral above (the second iterated integral). The final result is then a number (the volume). This process of going through two iterations of integrals is called double integration, and the last expression in equation (3.1) is called a double integral.

Notice that integrating $f(x, y)$ with respect to $y$ is the inverse operation of taking the partial derivative of $f(x, y)$ with respect to $y$. Also, we could just as easily have taken the area of cross-sections under the surface which were parallel to the $xz$-plane, which would then depend only on the variable $y$, so that the volume $V$ would be

$$V = \int_c^d \left[ \int_a^b f(x, y)\,dx \right] dy. \tag{3.2}$$

It turns out that in general¹ the order of the iterated integrals does not matter. Also, we will usually discard the brackets and simply write

$$V = \int_c^d \int_a^b f(x, y)\,dx\,dy, \tag{3.3}$$

where it is understood that the fact that $dx$ is written before $dy$ means that the function $f(x, y)$ is first integrated with respect to $x$ using the "inner" limits of integration $a$ and $b$, and then the resulting function is integrated with respect to $y$ using the "outer" limits of integration $c$ and $d$. This order of integration can be changed if it is more convenient.

Example 3.1. Find the volume $V$ under the plane $z = 8x + 6y$ over the rectangle $R = [0, 1] \times [0, 2]$.

¹ Due to Fubini's Theorem. See Ch. 18 in TAYLOR and MANN.

Solution: We see that $f(x, y) = 8x + 6y \ge 0$ for $0 \le x \le 1$ and $0 \le y \le 2$, so:

$$\begin{aligned} V &= \int_0^2 \int_0^1 (8x + 6y)\,dx\,dy \\ &= \int_0^2 \left( 4x^2 + 6xy \,\Big|_{x=0}^{x=1} \right) dy \\ &= \int_0^2 (4 + 6y)\,dy \\ &= 4y + 3y^2 \,\Big|_0^2 = 20 \end{aligned}$$

Suppose we had switched the order of integration. We can verify that we still get the same answer:

$$\begin{aligned} V &= \int_0^1 \int_0^2 (8x + 6y)\,dy\,dx \\ &= \int_0^1 \left( 8xy + 3y^2 \,\Big|_{y=0}^{y=2} \right) dx \\ &= \int_0^1 (16x + 12)\,dx \\ &= 8x^2 + 12x \,\Big|_0^1 = 20 \end{aligned}$$

Example 3.2. Find the volume $V$ under the surface $z = e^{x+y}$ over the rectangle $R = [2, 3] \times [1, 2]$.

Solution: We know that $f(x, y) = e^{x+y} > 0$ for all $(x, y)$, so

$$\begin{aligned} V &= \int_1^2 \int_2^3 e^{x+y}\,dx\,dy \\ &= \int_1^2 \left( e^{x+y} \,\Big|_{x=2}^{x=3} \right) dy \\ &= \int_1^2 \left( e^{y+3} - e^{y+2} \right) dy \\ &= e^{y+3} - e^{y+2} \,\Big|_1^2 \\ &= e^5 - e^4 - (e^4 - e^3) = e^5 - 2e^4 + e^3 \end{aligned}$$

Recall that for a general function $f(x)$, the integral $\int_a^b f(x)\,dx$ represents the difference of the area below the curve $y = f(x)$ but above the $x$-axis when $f(x) \ge 0$, and the area above the curve but below the $x$-axis when $f(x) \le 0$. Similarly, the double integral of any continuous function $f(x, y)$ represents the difference of the volume below the surface $z = f(x, y)$ but above the $xy$-plane when $f(x, y) \ge 0$, and the volume above the surface but below the $xy$-plane when $f(x, y) \le 0$. Thus, our method of double integration by means of iterated integrals can be used to evaluate the double integral of any continuous function over a rectangle, regardless of whether $f(x, y) \ge 0$ or not.

Example 3.3. Evaluate $\int_0^{2\pi} \int_0^{\pi} \sin(x + y)\,dx\,dy$.

Solution: Note that $f(x, y) = \sin(x + y)$ is both positive and negative over the rectangle $[0, \pi] \times [0, 2\pi]$. We can still evaluate the double integral:

$$\begin{aligned} \int_0^{2\pi} \int_0^{\pi} \sin(x + y)\,dx\,dy &= \int_0^{2\pi} \left( -\cos(x + y) \,\Big|_{x=0}^{x=\pi} \right) dy \\ &= \int_0^{2\pi} \left( -\cos(y + \pi) + \cos y \right) dy \\ &= -\sin(y + \pi) + \sin y \,\Big|_0^{2\pi} \\ &= -\sin 3\pi + \sin 2\pi - (-\sin \pi + \sin 0) = 0 \end{aligned}$$

Exercises

A

For Exercises 1-4, find the volume under the surface $z = f(x, y)$ over the rectangle $R$.

1. $f(x, y) = 4xy$, $R = [0, 1] \times [0, 1]$
2. $f(x, y) = e^{x+y}$, $R = [0, 1] \times [-1, 1]$
3. $f(x, y) = x^3 + y^2$, $R = [0, 1] \times [0, 1]$
4. $f(x, y) = x^4 + xy + y^3$, $R = [1, 2] \times [0, 2]$

For Exercises 5-12, evaluate the given double integral.

5. $\int_0^1 \int_1^2 (1 - y)x^2\,dx\,dy$
6. $\int_0^1 \int_0^2 x(x + y)\,dx\,dy$
7. $\int_0^2 \int_0^1 (x + 2)\,dx\,dy$
8. $\int_{-1}^2 \int_{-1}^1 x(xy + \sin x)\,dx\,dy$
9. $\int_0^{\pi/2} \int_0^1 xy \cos(x^2 y)\,dx\,dy$
10. $\int_0^{\pi} \int_0^{\pi/2} \sin x \cos(y - \pi)\,dx\,dy$
11. $\int_0^2 \int_1^4 xy\,dx\,dy$
12. $\int_{-1}^1 \int_{-1}^2 1\,dx\,dy$
13. Let $M$ be a constant. Show that $\int_c^d \int_a^b M\,dx\,dy = M(d - c)(b - a)$.

3.2 Double Integrals Over a General Region

In the previous section we got an idea of what a double integral over a rectangle represents. We can now define the double integral of a real-valued function $f(x, y)$ over more general regions in $\mathbb{R}^2$.

Suppose that we have a region $R$ in the $xy$-plane that is bounded on the left by the vertical line $x = a$, bounded on the right by the vertical line $x = b$ (where $a < b$), bounded below by a curve $y = g_1(x)$, and bounded above by a curve $y = g_2(x)$, as in Figure 3.2.1(a). We will assume that $g_1(x)$ and $g_2(x)$ do not intersect on the open interval $(a, b)$ (they could intersect at the endpoints $x = a$ and $x = b$, though).

[Figure 3.2.1: Double integral over a nonrectangular region $R$. (a) Vertical slice: $\int_a^b \int_{g_1(x)}^{g_2(x)} f(x, y)\,dy\,dx$. (b) Horizontal slice: $\int_c^d \int_{h_1(y)}^{h_2(y)} f(x, y)\,dx\,dy$]

Then using the slice method from the previous section, the double integral of a real-valued function $f(x, y)$ over the region $R$, denoted by $\iint_R f(x, y)\,dA$, is given by

$$\iint_R f(x, y)\,dA = \int_a^b \left[ \int_{g_1(x)}^{g_2(x)} f(x, y)\,dy \right] dx \tag{3.4}$$

This means that we take vertical slices in the region $R$ between the curves $y = g_1(x)$ and $y = g_2(x)$. The symbol $dA$ is sometimes called an area element or infinitesimal, with the $A$ signifying area. Note that $f(x, y)$ is first integrated with respect to $y$, with functions of $x$ as the limits of integration. This makes sense since the result of the first iterated integral will have to be a function of $x$ alone, which then allows us to take the second iterated integral with respect to $x$.

Similarly, if we have a region $R$ in the $xy$-plane that is bounded on the left by a curve $x = h_1(y)$, bounded on the right by a curve $x = h_2(y)$, bounded below by the horizontal line $y = c$, and bounded above by the horizontal line $y = d$ (where $c < d$), as in Figure 3.2.1(b) (assuming that $h_1(y)$ and $h_2(y)$ do not intersect on the open interval $(c, d)$), then taking horizontal slices gives

$$\iint_R f(x, y)\,dA = \int_c^d \left[ \int_{h_1(y)}^{h_2(y)} f(x, y)\,dx \right] dy \tag{3.5}$$

Notice that these definitions include the case when the region $R$ is a rectangle. Also, if $f(x, y) \ge 0$ for all $(x, y)$ in the region $R$, then $\iint_R f(x, y)\,dA$ is the volume under the surface $z = f(x, y)$ over the region $R$.

Example 3.4. Find the volume $V$ under the plane $z = 8x + 6y$ over the region $R = \{(x, y) : 0 \le x \le 1,\ 0 \le y \le 2x^2\}$.

Solution: The region $R$ is shown in Figure 3.2.2. Using vertical slices we get:

[Figure 3.2.2: the region $R$ under the curve $y = 2x^2$ for $0 \le x \le 1$]

$$\begin{aligned} V &= \iint_R (8x + 6y)\,dA \\ &= \int_0^1 \left[ \int_0^{2x^2} (8x + 6y)\,dy \right] dx \\ &= \int_0^1 \left( 8xy + 3y^2 \,\Big|_{y=0}^{y=2x^2} \right) dx \\ &= \int_0^1 (16x^3 + 12x^4)\,dx \\ &= 4x^4 + \tfrac{12}{5} x^5 \,\Big|_0^1 = 4 + \tfrac{12}{5} = \tfrac{32}{5} = 6.4 \end{aligned}$$

We get the same answer using horizontal slices (see Figure 3.2.3):

[Figure 3.2.3: the region $R$ to the right of the curve $x = \sqrt{y/2}$ for $0 \le y \le 2$]

$$\begin{aligned} V &= \iint_R (8x + 6y)\,dA \\ &= \int_0^2 \left[ \int_{\sqrt{y/2}}^1 (8x + 6y)\,dx \right] dy \\ &= \int_0^2 \left( 4x^2 + 6xy \,\Big|_{x=\sqrt{y/2}}^{x=1} \right) dy \\ &= \int_0^2 \left( 4 + 6y - \left( 2y + \tfrac{6}{\sqrt{2}}\, y\sqrt{y} \right) \right) dy = \int_0^2 \left( 4 + 4y - 3\sqrt{2}\, y^{3/2} \right) dy \\ &= 4y + 2y^2 - \tfrac{6\sqrt{2}}{5}\, y^{5/2} \,\Big|_0^2 = 8 + 8 - \tfrac{6\sqrt{2}\sqrt{32}}{5} = 16 - \tfrac{48}{5} = \tfrac{32}{5} = 6.4 \end{aligned}$$
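As a numerical sanity check on Example 3.4, the two slicing orders can be approximated with a simple midpoint rule. This is only a sketch under stated assumptions: the class name `IteratedMidpoint`, its method names, and the choice of 400 subdivisions are hypothetical and not part of the text. Both orders should come out close to 6.4.

```java
import java.util.function.DoubleBinaryOperator;
import java.util.function.DoubleUnaryOperator;

// Illustrative sketch (not from the text): midpoint-rule approximation of an
// iterated integral whose inner limits depend on the outer variable.
final class IteratedMidpoint {

    // Approximates the integral over a <= s <= b of the integral over
    // lo(s) <= t <= hi(s) of g(s, t) dt ds, using n midpoints in each direction.
    static double integrate(DoubleBinaryOperator g, double a, double b,
                            DoubleUnaryOperator lo, DoubleUnaryOperator hi, int n) {
        double ds = (b - a) / n;
        double total = 0.0;
        for (int i = 0; i < n; i++) {
            double s = a + (i + 0.5) * ds;            // midpoint of the outer strip
            double t0 = lo.applyAsDouble(s);
            double dt = (hi.applyAsDouble(s) - t0) / n;
            double inner = 0.0;
            for (int j = 0; j < n; j++) {
                double t = t0 + (j + 0.5) * dt;       // midpoint of the inner cell
                inner += g.applyAsDouble(s, t) * dt;
            }
            total += inner * ds;
        }
        return total;
    }

    // Vertical slices: x is the outer variable and 0 <= y <= 2x^2.
    static double verticalSlices(int n) {
        return integrate((x, y) -> 8 * x + 6 * y, 0.0, 1.0,
                         x -> 0.0, x -> 2 * x * x, n);
    }

    // Horizontal slices: y is the outer variable and sqrt(y/2) <= x <= 1.
    static double horizontalSlices(int n) {
        return integrate((y, x) -> 8 * x + 6 * y, 0.0, 2.0,
                         y -> Math.sqrt(y / 2), y -> 1.0, n);
    }

    public static void main(String[] args) {
        System.out.println("vertical slices:   " + verticalSlices(400));
        System.out.println("horizontal slices: " + horizontalSlices(400));
    }
}
```

The only difference between the two calls is which variable plays the outer role and which curves serve as the inner limits, which mirrors the switch from formula (3.4) to formula (3.5).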

Example 3.5. Find the volume $V$ of the solid bounded by the three coordinate planes and the plane $2x + y + 4z = 4$.

[Figure 3.2.4: (a) the solid bounded by the coordinate planes and the plane $2x + y + 4z = 4$, with intercepts $(2, 0, 0)$, $(0, 4, 0)$ and $(0, 0, 1)$; (b) the region $R$ below the line $y = -2x + 4$]

Solution: The solid is shown in Figure 3.2.4(a) with a typical vertical slice. The volume $V$ is given by $\iint_R f(x, y)\,dA$, where $f(x, y) = z = \frac{1}{4}(4 - 2x - y)$ and the region $R$, shown in Figure 3.2.4(b), is $R = \{(x, y) : 0 \le x \le 2,\ 0 \le y \le -2x + 4\}$. Using vertical slices in $R$ gives

$$\begin{aligned} V &= \iint_R \tfrac{1}{4}(4 - 2x - y)\,dA \\ &= \int_0^2 \left[ \int_0^{-2x+4} \tfrac{1}{4}(4 - 2x - y)\,dy \right] dx \\ &= \int_0^2 \left( -\tfrac{1}{8}(4 - 2x - y)^2 \,\Big|_{y=0}^{y=-2x+4} \right) dx \\ &= \int_0^2 \tfrac{1}{8}(4 - 2x)^2\,dx \\ &= -\tfrac{1}{48}(4 - 2x)^3 \,\Big|_0^2 = \tfrac{64}{48} = \tfrac{4}{3} \end{aligned}$$

For a general region $R$, which may not be one of the types of regions we have considered so far, the double integral $\iint_R f(x, y)\,dA$ is defined as follows. Assume that $f(x, y)$ is a nonnegative real-valued function and that $R$ is a bounded region in $\mathbb{R}^2$, so it can be enclosed in some rectangle $[a, b] \times [c, d]$. Then divide that rectangle into a grid of subrectangles. Only consider the subrectangles that are enclosed completely within the region $R$, as shown by the shaded subrectangles in Figure 3.2.5(a). In any such subrectangle $[x_i, x_{i+1}] \times [y_j, y_{j+1}]$, pick a point $(x_{i*}, y_{j*})$. Then the volume under the surface $z = f(x, y)$ over that subrectangle is approximately $f(x_{i*}, y_{j*})\,\Delta x_i\,\Delta y_j$, where $\Delta x_i = x_{i+1} - x_i$, $\Delta y_j = y_{j+1} - y_j$, and $f(x_{i*}, y_{j*})$ is the height and $\Delta x_i\,\Delta y_j$ is the base area of a parallelepiped, as shown in Figure 3.2.5(b). Then the total volume under the surface is approximately the sum of the volumes of all such parallelepipeds, namely

$$\sum_j \sum_i f(x_{i*}, y_{j*})\,\Delta x_i\,\Delta y_j, \tag{3.6}$$

where the summation occurs over the indices of the subrectangles inside $R$. If we take smaller and smaller subrectangles, so that the length of the largest diagonal of the subrectangles goes to 0, then the subrectangles begin to fill more and more of the region $R$, and so the above sum approaches the actual volume under the surface $z = f(x, y)$ over the region $R$. We then define $\iint_R f(x, y)\,dA$ as the limit of that double summation (the limit is taken over all subdivisions of the rectangle $[a, b] \times [c, d]$ as the largest diagonal of the subrectangles goes to 0).

[Figure 3.2.5: Double integral over a general region $R$. (a) Subrectangles inside the region $R$. (b) Parallelepiped over a subrectangle, with volume $f(x_{i*}, y_{j*})\,\Delta x_i\,\Delta y_j$]

A similar definition can be made for a function $f(x, y)$ that is not necessarily always nonnegative: just replace each mention of volume by the negative volume in the description above when $f(x, y) < 0$. In the case of a region of the type shown in Figure 3.2.1, using the definition of the Riemann integral from single-variable calculus, our definition of $\iint_R f(x, y)\,dA$ reduces to a sequence of two iterated integrals.

Finally, the region $R$ does not have to be bounded. We can evaluate improper double integrals (i.e. over an unbounded region, or over a region which contains points where the function $f(x, y)$ is not defined) as a sequence of iterated improper single-variable integrals.

Example 3.6. Evaluate $\int_1^{\infty} \int_0^{1/x^2} 2y\,dy\,dx$.

Solution:

$$\begin{aligned} \int_1^{\infty} \int_0^{1/x^2} 2y\,dy\,dx &= \int_1^{\infty} \left( y^2 \,\Big|_{y=0}^{y=1/x^2} \right) dx \\ &= \int_1^{\infty} x^{-4}\,dx = -\tfrac{1}{3} x^{-3} \,\Big|_1^{\infty} = 0 - \left( -\tfrac{1}{3} \right) = \tfrac{1}{3} \end{aligned}$$

Exercises

A

For Exercises 1-6, evaluate the given double integral.

1. $\int_0^1 \int_{\sqrt{x}}^1 24x^2 y\,dy\,dx$
2. $\int_0^{\pi} \int_0^y \sin x\,dx\,dy$
3. $\int_1^2 \int_0^{\ln x} 4x\,dy\,dx$
4. $\int_0^2 \int_0^{2y} e^{y^2}\,dx\,dy$
5. $\int_0^{\pi/2} \int_0^y \cos x \sin y\,dx\,dy$
6. $\int_0^{\infty} \int_0^{\infty} xy\,e^{-(x^2 + y^2)}\,dx\,dy$
7. $\int_0^2 \int_0^y 1\,dx\,dy$
8. $\int_0^1 \int_0^{x^2} 2\,dy\,dx$
9. Find the volume $V$ of the solid bounded by the three coordinate planes and the plane $x + y + z = 1$.
10. Find the volume $V$ of the solid bounded by the three coordinate planes and the plane $3x + 2y + 5z = 6$.

B

11. Explain why the double integral $\iint_R 1\,dA$ gives the area of the region $R$. For simplicity, you can assume that $R$ is a region of the type shown in Figure 3.2.1(a).

C

12. Prove that the volume of a tetrahedron with mutually perpendicular adjacent sides of lengths $a$, $b$, and $c$, as in Figure 3.2.6, is $\frac{abc}{6}$. (Hint: Mimic Example 3.5, and recall from Section 1.5 how three noncollinear points determine a plane.)
13. Show how Exercise 12 can be used to solve Exercise 10.

[Figure 3.2.6: a tetrahedron with mutually perpendicular adjacent sides of lengths $a$, $b$ and $c$]

3.3 Triple Integrals

Our definition of a double integral of a real-valued function $f(x, y)$ over a region $R$ in $\mathbb{R}^2$ can be extended to define a triple integral of a real-valued function $f(x, y, z)$ over a solid $S$ in $\mathbb{R}^3$. We simply proceed as before: the solid $S$ can be enclosed in some rectangular parallelepiped, which is then divided into subparallelepipeds. In each subparallelepiped inside $S$, with sides of lengths $\Delta x$, $\Delta y$ and $\Delta z$, pick a point $(x_*, y_*, z_*)$. Then define the triple integral of $f(x, y, z)$ over $S$, denoted by $\iiint_S f(x, y, z)\,dV$, by

$$\iiint_S f(x, y, z)\,dV = \lim \sum \sum \sum f(x_*, y_*, z_*)\,\Delta x\,\Delta y\,\Delta z, \tag{3.7}$$

where the limit is over all divisions of the rectangular parallelepiped enclosing $S$ into subparallelepipeds whose largest diagonal is going to 0, and the triple summation is over all the subparallelepipeds inside $S$. It can be shown that this limit does not depend on the choice of the rectangular parallelepiped enclosing $S$. The symbol $dV$ is often called the volume element.

Physically, what does the triple integral represent? We saw that a double integral could be thought of as the volume under a two-dimensional surface. It turns out that the triple integral simply generalizes this idea: it can be thought of as representing the hypervolume under a three-dimensional hypersurface $w = f(x, y, z)$ whose graph lies in $\mathbb{R}^4$. In general, the word "volume" is often used as a general term to signify the same concept for any $n$-dimensional object (e.g. length in $\mathbb{R}^1$, area in $\mathbb{R}^2$). It may be hard to get a grasp on the concept of the "volume" of a four-dimensional object, but at least we now know how to calculate that volume!

In the case where $S$ is a rectangular parallelepiped $[x_1, x_2] \times [y_1, y_2] \times [z_1, z_2]$, that is, $S = \{(x, y, z) : x_1 \le x \le x_2,\ y_1 \le y \le y_2,\ z_1 \le z \le z_2\}$, the triple integral is a sequence of three iterated integrals, namely

$$\iiint_S f(x, y, z)\,dV = \int_{z_1}^{z_2} \int_{y_1}^{y_2} \int_{x_1}^{x_2} f(x, y, z)\,dx\,dy\,dz, \tag{3.8}$$

where the order of integration does not matter. This is the simplest case.

A more complicated case is where $S$ is a solid which is bounded below by a surface $z = g_1(x, y)$, bounded above by a surface $z = g_2(x, y)$, $y$ is bounded between two curves $h_1(x)$ and $h_2(x)$, and $x$ varies between $a$ and $b$. Then

$$\iiint_S f(x, y, z)\,dV = \int_a^b \int_{h_1(x)}^{h_2(x)} \int_{g_1(x, y)}^{g_2(x, y)} f(x, y, z)\,dz\,dy\,dx. \tag{3.9}$$

Notice in this case that the first iterated integral will result in a function of $x$ and $y$ (since its limits of integration are functions of $x$ and $y$), which then leaves you with a double integral of a type that we learned how to evaluate in Section 3.2. There are, of course, many variations on this case (for example, changing the roles of the variables $x$, $y$, $z$), so as you can probably tell, triple integrals can be quite tricky. At this point, just learning how to evaluate a triple integral, regardless of what it represents, is the most important thing. We will see some other ways in which triple integrals are used later in the text.

Example 3.7. Evaluate $\int_0^3 \int_0^2 \int_0^1 (xy + z)\,dx\,dy\,dz$.

Solution:

$$\begin{aligned} \int_0^3 \int_0^2 \int_0^1 (xy + z)\,dx\,dy\,dz &= \int_0^3 \int_0^2 \left( \tfrac{1}{2} x^2 y + xz \,\Big|_{x=0}^{x=1} \right) dy\,dz \\ &= \int_0^3 \int_0^2 \left( \tfrac{1}{2} y + z \right) dy\,dz \\ &= \int_0^3 \left( \tfrac{1}{4} y^2 + yz \,\Big|_{y=0}^{y=2} \right) dz \\ &= \int_0^3 (1 + 2z)\,dz \\ &= z + z^2 \,\Big|_0^3 = 12 \end{aligned}$$

Example 3.8. Evaluate $\int_0^1 \int_0^{1-x} \int_0^{2-x-y} (x + y + z)\,dz\,dy\,dx$.

Solution:

$$\begin{aligned} \int_0^1 \int_0^{1-x} \int_0^{2-x-y} (x + y + z)\,dz\,dy\,dx &= \int_0^1 \int_0^{1-x} \left( (x + y)z + \tfrac{1}{2} z^2 \,\Big|_{z=0}^{z=2-x-y} \right) dy\,dx \\ &= \int_0^1 \int_0^{1-x} \left( (x + y)(2 - x - y) + \tfrac{1}{2}(2 - x - y)^2 \right) dy\,dx \\ &= \int_0^1 \int_0^{1-x} \left( 2 - \tfrac{1}{2} x^2 - xy - \tfrac{1}{2} y^2 \right) dy\,dx \\ &= \int_0^1 \left( 2y - \tfrac{1}{2} x^2 y - \tfrac{1}{2} x y^2 - \tfrac{1}{6} y^3 \,\Big|_{y=0}^{y=1-x} \right) dx \\ &= \int_0^1 \left( \tfrac{11}{6} - 2x + \tfrac{1}{6} x^3 \right) dx \\ &= \tfrac{11}{6} x - x^2 + \tfrac{1}{24} x^4 \,\Big|_0^1 = \tfrac{7}{8} \end{aligned}$$
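The nested structure of formula (3.9) translates directly into three nested loops. The following is a minimal sketch, assuming a midpoint rule with the same number of subdivisions in each direction; the class name `TripleMidpoint` and the method name `example38` are hypothetical. It approximates the integral of Example 3.8, so the printed value should be close to 7/8 = 0.875.

```java
// Illustrative sketch (not from the text): midpoint rule applied to the three
// nested integrals of Example 3.8, where each inner limit depends on the
// variables of the loops enclosing it.
final class TripleMidpoint {

    static double example38(int n) {
        double total = 0.0;
        double dx = 1.0 / n;                    // x runs from 0 to 1
        for (int i = 0; i < n; i++) {
            double x = (i + 0.5) * dx;
            double dy = (1.0 - x) / n;          // y runs from 0 to 1 - x
            for (int j = 0; j < n; j++) {
                double y = (j + 0.5) * dy;
                double dz = (2.0 - x - y) / n;  // z runs from 0 to 2 - x - y
                for (int k = 0; k < n; k++) {
                    double z = (k + 0.5) * dz;
                    total += (x + y + z) * dx * dy * dz;
                }
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println("approximation: " + example38(100));
        System.out.println("exact value:   " + 7.0 / 8.0);
    }
}
```

Note that the loop order (x outermost, z innermost) matches the order $dz\,dy\,dx$ read from the inside out, which is why the step sizes `dy` and `dz` have to be recomputed inside the loops.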


Note that the volume $V$ of a solid in $\mathbb{R}^3$ is given by

$$V = \iiint_S 1\,dV. \tag{3.10}$$

Since the function being integrated is the constant 1, then the above triple integral reduces to a double integral of the types that we considered in the previous section if the solid is bounded above by some surface $z = f(x, y)$ and bounded below by the $xy$-plane $z = 0$. There are many other possibilities. For example, the solid could be bounded below and above by surfaces $z = g_1(x, y)$ and $z = g_2(x, y)$, respectively, with $y$ bounded between two curves $h_1(x)$ and $h_2(x)$, and $x$ varies between $a$ and $b$. Then

$$V = \iiint_S 1\,dV = \int_a^b \int_{h_1(x)}^{h_2(x)} \int_{g_1(x, y)}^{g_2(x, y)} 1\,dz\,dy\,dx = \int_a^b \int_{h_1(x)}^{h_2(x)} \left( g_2(x, y) - g_1(x, y) \right) dy\,dx$$

just like in equation (3.9). See Exercise 10 for an example.

Exercises A

For Exercises 1-8, evaluate the given triple integral.

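1. $\int_0^3 \int_0^2 \int_0^1 xyz\,dx\,dy\,dz$

2. $\int_0^1 \int_0^x \int_0^y xyz\,dz\,dy\,dx$

3. $\int_0^{\pi} \int_0^x \int_0^{xy} x^2 \sin z\,dz\,dy\,dx$

4. $\int_0^1 \int_0^z \int_0^y z e^{y^2}\,dx\,dy\,dz$

5. $\int_1^e \int_0^y \int_0^{1/y} x^2 z\,dx\,dz\,dy$

6. $\int_1^2 \int_0^{y^2} \int_0^{z^2} yz\,dx\,dz\,dy$

7. $\int_1^2 \int_2^4 \int_0^3 1\,dx\,dy\,dz$

8. $\int_0^1 \int_0^{1-x} \int_0^{1-x-y} 1\,dz\,dy\,dx$

9. Let $M$ be a constant. Show that $\int_{z_1}^{z_2} \int_{y_1}^{y_2} \int_{x_1}^{x_2} M\,dx\,dy\,dz = M(z_2 - z_1)(y_2 - y_1)(x_2 - x_1)$.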

B

10. Find the volume V of the solid S bounded by the three coordinate planes, bounded above by the plane x + y + z = 2, and bounded below by the plane z = x + y.

C

11. Show that $\int_a^b \int_a^z \int_a^y f(x)\,dx\,dy\,dz = \int_a^b \frac{(b - x)^2}{2} f(x)\,dx$. (Hint: Think of how changing the order of integration in the triple integral changes the limits of integration.)

3.4 Numerical Approximation of Multiple Integrals

As you have seen, calculating multiple integrals is tricky even for simple functions and regions. For complicated functions, it may not be possible to evaluate one of the iterated integrals in a simple closed form. Luckily there are numerical methods for approximating the value of a multiple integral. The method we will discuss is called the Monte Carlo method. The idea behind it is based on the concept of the average value of a function, which you learned in single-variable calculus. Recall that for a continuous function f ( x), the average value f¯ of f over an interval [a, b] is deﬁned as

$$\bar{f} = \frac{1}{b - a} \int_a^b f(x)\,dx. \tag{3.11}$$

The quantity b − a is the length of the interval [a, b], which can be thought of as the “volume” of the interval. Applying the same reasoning to functions of two or three variables, we deﬁne the average value of f ( x, y) over a region R to be

$$\bar{f} = \frac{1}{A(R)} \iint_R f(x, y)\,dA, \tag{3.12}$$

where $A(R)$ is the area of the region $R$, and we define the average value of $f(x, y, z)$ over a solid $S$ to be

$$\bar{f} = \frac{1}{V(S)} \iiint_S f(x, y, z)\,dV, \tag{3.13}$$

where $V(S)$ is the volume of the solid $S$. Thus, for example, we have

$$\iint_R f(x, y)\,dA = A(R)\,\bar{f}. \tag{3.14}$$

The average value of f ( x, y) over R can be thought of as representing the sum of all the values of f divided by the number of points in R . Unfortunately there are an inﬁnite number (in fact, uncountably many) points in any region, i.e. they can not be listed in a discrete sequence. But what if we took a very large number N of random points in the region R (which can be generated by a computer) and then took the average of the values of f for those points, and used that average as the value of f¯? This is exactly what the Monte Carlo method does. So in formula (3.14) the approximation we get is

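$$\iint_R f(x, y)\,dA \approx A(R)\,\bar{f} \pm A(R) \sqrt{\frac{\overline{f^2} - (\bar{f})^2}{N}}, \tag{3.15}$$

where

$$\bar{f} = \frac{\sum_{i=1}^{N} f(x_i, y_i)}{N} \quad \text{and} \quad \overline{f^2} = \frac{\sum_{i=1}^{N} \left( f(x_i, y_i) \right)^2}{N}, \tag{3.16}$$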


with the sums taken over the N random points ( x1 , y1 ), . . ., ( xN , yN ). The ± “error term” in formula (3.15) does not really provide hard bounds on the approximation. It represents a single standard deviation from the expected value of the integral. That is, it provides a likely bound on the error. Due to its use of random points, the Monte Carlo method is an example of a probabilistic method (as opposed to deterministic methods such as Newton’s method, which use a speciﬁc formula for generating points). For example, we can use formula (3.15) to approximate the volume V under the plane z = 8 x + 6 y over the rectangle R = [0, 1] × [0, 2]. In Example 3.1 in Section 3.1, we showed that the actual volume is 20. Below is a code listing (montecarlo.java) for a Java program that calculates the volume, using a number of points N that is passed on the command line as a parameter.

//Program to approximate the double integral of f(x,y)=8x+6y
//over the rectangle [0,1]x[0,2].
public class montecarlo {
   public static void main(String[] args) {
      //Get the number N of random points as a command-line parameter
      int N = Integer.parseInt(args[0]);
      double x = 0;     //x-coordinate of a random point
      double y = 0;     //y-coordinate of a random point
      double f = 0.0;   //Value of f at a random point
      double mf = 0.0;  //Mean of the values of f
      double mf2 = 0.0; //Mean of the values of f^2
      for (int i=0;i<N;i++) { //Get the random coordinates
         x = Math.random();     //x is between 0 and 1
         y = 2 * Math.random(); //y is between 0 and 2
         f = 8*x + 6*y;         //Value of the function
         mf = mf + f;           //Add to the sum of the f values
         mf2 = mf2 + f*f;       //Add to the sum of the f^2 values
      }
      mf = mf/N;   //Compute the mean of the f values
      mf2 = mf2/N; //Compute the mean of the f^2 values
      System.out.println("N = " + N + ": integral = " + vol()*mf + " +/- "
         + vol()*Math.sqrt((mf2 - Math.pow(mf,2))/N)); //Print the result
   }
   //The volume of the rectangle [0,1]x[0,2]
   public static double vol() {
      return 1*2;
   }
}

Listing 3.1 Program listing for montecarlo.java

The results of running this program with various numbers of random points (e.g. java montecarlo 100) are shown below:

N = 10: 19.36543087722646 +/- 2.7346060413546147
N = 100: 21.334419561385353 +/- 0.7547037194998519
N = 1000: 19.807662237526227 +/- 0.26701709691370235
N = 10000: 20.080975812043256 +/- 0.08378816229769506
N = 100000: 20.009403854556716 +/- 0.026346782289498317
N = 1000000: 20.000866994982314 +/- 0.008321168748642816

As you can see, the approximation is fairly good. As $N \to \infty$, it can be shown that the Monte Carlo approximation converges to the actual volume (with an error on the order of $O(1/\sqrt{N})$, in computational complexity terminology, which matches the $\sqrt{N}$ in the denominator of the error term in formula (3.15)).

In the above example the region $R$ was a rectangle. To use the Monte Carlo method for a nonrectangular (bounded) region $R$, only a slight modification is needed. Pick a rectangle $\tilde{R}$ that encloses $R$, and generate random points in that rectangle as before. Then use those points in the calculation of $\bar{f}$ only if they are inside $R$. There is no need to calculate the area of $R$ for formula (3.15) in this case, since the exclusion of points not inside $R$ allows you to use the area of the rectangle $\tilde{R}$ instead, similar to before.

For instance, in Example 3.4 we showed that the volume under the surface $z = 8x + 6y$ over the nonrectangular region $R = \{(x, y) : 0 \le x \le 1,\ 0 \le y \le 2x^2\}$ is 6.4. Since the rectangle $\tilde{R} = [0, 1] \times [0, 2]$ contains $R$, we can use the same program as before, with the only change being a check to see if $y < 2x^2$ for a random point $(x, y)$ in $[0, 1] \times [0, 2]$. Listing 3.2 below contains the code (montecarlo2.java):

//Program to approximate the double integral of f(x,y)=8x+6y over the
//region bounded by x=0, x=1, y=0, and y=2x^2
public class montecarlo2 {
   public static void main(String[] args) {
      //Get the number N of random points as a command-line parameter
      int N = Integer.parseInt(args[0]);
      double x = 0;     //x-coordinate of a random point
      double y = 0;     //y-coordinate of a random point
      double f = 0.0;   //Value of f at a random point
      double mf = 0.0;  //Mean of the values of f
      double mf2 = 0.0; //Mean of the values of f^2
      for (int i=0;i<N;i++) { //Get the random coordinates
         x = Math.random();     //x is between 0 and 1
         y = 2 * Math.random(); //y is between 0 and 2
         if (y < 2*Math.pow(x,2)) { //The point is in the region
            f = 8*x + 6*y;      //Value of the function
            mf = mf + f;        //Add to the sum of the f values
            mf2 = mf2 + f*f;    //Add to the sum of the f^2 values
         }
      }
      mf = mf/N;   //Compute the mean of the f values
      mf2 = mf2/N; //Compute the mean of the f^2 values
      System.out.println("N = " + N + ": integral = " + vol()*mf +
         " +/- " + vol()*Math.sqrt((mf2 - Math.pow(mf,2))/N));
   }
   //The volume of the rectangle [0,1]x[0,2]
   public static double vol() {
      return 1*2;
   }
}

Listing 3.2 Program listing for montecarlo2.java

The results of running the program with various numbers of random points (e.g. java montecarlo2 1000) are shown below:

N = 10: integral = 6.95747529014894 +/- 2.9185131565120592
N = 100: integral = 6.3149056229650355 +/- 0.9549009662159909
N = 1000: integral = 6.477032813858756 +/- 0.31916837260973624
N = 10000: integral = 6.349975080015089 +/- 0.10040086346895105
N = 100000: integral = 6.440184132811864 +/- 0.03200476870881392
N = 1000000: integral = 6.417050897922222 +/- 0.01009454409789472

To use the Monte Carlo method to evaluate triple integrals, you will need to generate random triples $(x, y, z)$ in a parallelepiped, instead of random pairs $(x, y)$ in a rectangle, and use the volume of the parallelepiped instead of the area of a rectangle in formula (3.15) (see Exercise 2). For a more detailed discussion of numerical integration methods, see PRESS et al.

Exercises

C

1. Write a program that uses the Monte Carlo method to approximate the double integral $\iint_R e^{xy}\,dA$, where $R = [0, 1] \times [0, 1]$. Show the program output for $N = 10$, 100, 1000, 10000, 100000 and 1000000 random points.
2. Write a program that uses the Monte Carlo method to approximate the triple integral $\iiint_S e^{xyz}\,dV$, where $S = [0, 1] \times [0, 1] \times [0, 1]$. Show the program output for $N = 10$, 100, 1000, 10000, 100000 and 1000000 random points.
3. Repeat Exercise 1 with the region $R = \{(x, y) : -1 \le x \le 1,\ 0 \le y \le x^2\}$.
4. Repeat Exercise 2 with the solid $S = \{(x, y, z) : 0 \le x \le 1,\ 0 \le y \le 1,\ 0 \le z \le 1 - x - y\}$.
5. Use the Monte Carlo method to approximate the volume of a sphere of radius 1.
6. Use the Monte Carlo method to approximate the volume of the ellipsoid $\frac{x^2}{9} + \frac{y^2}{4} + z^2 = 1$.
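The remark above about random triples can be sketched as follows. This is a hedged illustration, not one of the book's listings: the class name `MonteCarloTriple`, the use of a seeded `java.util.Random` (chosen so that runs are repeatable), and the choice of test integral are all assumptions. It approximates the triple integral of Example 3.8, whose exact value is 7/8, by sampling the enclosing box $[0, 1] \times [0, 1] \times [0, 2]$ and keeping only the points that land inside the solid.

```java
import java.util.Random;

// Illustrative sketch (not from the text): Monte Carlo estimate of the triple
// integral of f(x,y,z) = x + y + z over the solid
// S = {0 <= x <= 1, 0 <= y <= 1 - x, 0 <= z <= 2 - x - y},
// using random triples in the enclosing box [0,1] x [0,1] x [0,2].
final class MonteCarloTriple {

    // Returns {estimate, one-standard-deviation error term}, following the
    // three-variable analogue of formula (3.15) with the box volume in place of A(R).
    static double[] estimate(int n, long seed) {
        Random rng = new Random(seed);
        double boxVolume = 1.0 * 1.0 * 2.0;
        double sum = 0.0;    // running sum of the f values
        double sumSq = 0.0;  // running sum of the f^2 values
        for (int i = 0; i < n; i++) {
            double x = rng.nextDouble();        // x is between 0 and 1
            double y = rng.nextDouble();        // y is between 0 and 1
            double z = 2.0 * rng.nextDouble();  // z is between 0 and 2
            if (y <= 1.0 - x && z <= 2.0 - x - y) { // the point is inside S
                double f = x + y + z;
                sum += f;
                sumSq += f * f;
            }
        }
        double mean = sum / n;      // points outside S count as f = 0
        double meanSq = sumSq / n;
        double error = boxVolume * Math.sqrt((meanSq - mean * mean) / n);
        return new double[] { boxVolume * mean, error };
    }

    public static void main(String[] args) {
        for (int n = 10; n <= 1000000; n *= 10) {
            double[] result = estimate(n, 42L);
            System.out.println("N = " + n + ": integral = " + result[0]
                + " +/- " + result[1]);
        }
    }
}
```

As in montecarlo2.java, the means are divided by the total number of points `n`, not by the number of points inside the solid, which is what allows the box volume to be used in place of the volume of $S$.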

3.5 Change of Variables in Multiple Integrals

Given the difficulty of evaluating multiple integrals, the reader may be wondering if it is possible to simplify those integrals using a suitable substitution for the variables. The answer is yes, though it is a bit more complicated than the substitution method which you learned in single-variable calculus.

Recall that if you are given, for example, the definite integral

$$\int_1^2 x^3 \sqrt{x^2 - 1}\,dx,$$

then you would make the substitution

$$u = x^2 - 1 \ \Rightarrow\ x^2 = u + 1, \qquad du = 2x\,dx$$

which changes the limits of integration

$$x = 1 \ \Rightarrow\ u = 0, \qquad x = 2 \ \Rightarrow\ u = 3$$

so that we get

$$\begin{aligned} \int_1^2 x^3 \sqrt{x^2 - 1}\,dx &= \int_1^2 \tfrac{1}{2} x^2 \cdot 2x \sqrt{x^2 - 1}\,dx \\ &= \int_0^3 \tfrac{1}{2}(u + 1)\sqrt{u}\,du \\ &= \tfrac{1}{2} \int_0^3 \left( u^{3/2} + u^{1/2} \right) du, \end{aligned}$$

which can be easily integrated to give $\frac{14\sqrt{3}}{5}$.

Let us take a different look at what happened when we did that substitution, which will give some motivation for how substitution works in multiple integrals. First, we let $u = x^2 - 1$. On the interval of integration $[1, 2]$, the function $x \mapsto x^2 - 1$ is strictly increasing (and maps $[1, 2]$ onto $[0, 3]$) and hence has an inverse function (defined on the interval $[0, 3]$). That is, on $[0, 3]$ we can define $x$ as a function of $u$, namely

$$x = g(u) = \sqrt{u + 1}.$$

Then substituting that expression for $x$ into the function $f(x) = x^3 \sqrt{x^2 - 1}$ gives

$$f(x) = f(g(u)) = (u + 1)^{3/2} \sqrt{u},$$

and we see that

$$\frac{dx}{du} = g'(u) \ \Rightarrow\ dx = g'(u)\,du, \qquad dx = \tfrac{1}{2}(u + 1)^{-1/2}\,du,$$

so since

$$g(0) = 1 \ \Rightarrow\ 0 = g^{-1}(1), \qquad g(3) = 2 \ \Rightarrow\ 3 = g^{-1}(2)$$

then performing the substitution as we did earlier gives

$$\int_1^2 f(x)\,dx = \int_1^2 x^3 \sqrt{x^2 - 1}\,dx = \int_0^3 \tfrac{1}{2}(u + 1)\sqrt{u}\,du,$$

which can be written as

$$\int_1^2 f(x)\,dx = \int_0^3 (u + 1)^{3/2} \sqrt{u} \cdot \tfrac{1}{2}(u + 1)^{-1/2}\,du,$$

which means

$$\int_1^2 f(x)\,dx = \int_{g^{-1}(1)}^{g^{-1}(2)} f(g(u))\,g'(u)\,du.$$

In general, if $x = g(u)$ is a one-to-one, differentiable function from an interval $[c, d]$ (which you can think of as being on the "$u$-axis") onto an interval $[a, b]$ (on the $x$-axis), which means that $g'(u) \ne 0$ on the interval $(c, d)$, so that $a = g(c)$ and $b = g(d)$, then $c = g^{-1}(a)$ and $d = g^{-1}(b)$, and

$$\int_a^b f(x)\,dx = \int_{g^{-1}(a)}^{g^{-1}(b)} f(g(u))\,g'(u)\,du. \tag{3.17}$$

This is called the change of variable formula for integrals of single-variable functions, and it is what you were implicitly using when doing integration by substitution. This formula turns out to be a special case of a more general formula which can be used to evaluate multiple integrals. We will state the formulas for double and triple integrals involving real-valued functions of two and three variables, respectively. We will assume that all the functions involved are continuously differentiable and that the regions and solids involved all have "reasonable" boundaries. The proof of the following theorem is beyond the scope of the text.²

² See TAYLOR and MANN, § 15.32 and § 15.62 for all the details.

y( u. v. y) ∂( u. w). y( u. v) = is never 0 in R ′ . w). respectively. y.5 Change of Variables in Multiple Integrals 119 Theorem 3.20) is never 0 in S ′ . w). v) | d A ( u. v). y) = | J ( u. y) d A ( x. w) | dV ( u. v. v. w) = ∂( x. (3. w) deﬁne a one-to-one mapping of a solid S ′ in uvw-space onto a solid S in x yz-space such that the determinant ∂x ∂u ∂y ∂u ∂z ∂u ∂x ∂v ∂y ∂v ∂z ∂v ∂x ∂w ∂y ∂w ∂z ∂w J ( u. v) coordinates. (3. z( u. v. .19) We use the notation d A ( x. (3. w) and z = z( u. z) = S S′ f ( x( u. w)) | J ( u. Similarly. (3. v. w) . w) = (3. v. if x = x( u. y) and ( u. v.18) is called the Jacobian of x and y with respect to u and v. which you can think of as a two-variable version of the relation dx = g ′ ( u) du in the single-variable case. v)) | J ( u. y) = R R′ f ( x( u.3. v) and y = y( u. z) ∂( u. v. Then ∂u ∂y ∂u (3. v) .18) f ( x. z) dV ( x. y. and is sometimes written as J ( u. w) . y. v) = ∂( x. v) deﬁne a one-to-one mapping of a region R ′ in the uv-plane onto a region R in the x y-plane such that the determinant ∂x ∂x ∂v ∂y ∂v J ( u. then f ( x. v).22) Similarly. v) . v.21) The determinant J ( u. y) and d A ( u.19) is saying that d A ( x. the Jacobian J ( u. Change of Variables Formula for Multiple Integrals Let x = x( u. v) to denote the area element in the ( x. v) | d A ( u. w) of three variables is sometimes written as J ( u. v) in formula (3.23) Notice that formula (3. v. v.1. v. y = y( u. The following example shows how the change of variables formula is used.

In Figure 3. v) = 2 (v − u) maps the region R ′ onto R in a one-to-one manner. y ≥ 0. So solving for 1 x and y gives x = 2 ( u + v) and y = 1 (v − u). By looking at the numerator and denominator of the exponent of e. note that evaluating this double integral without using substitution is probably impossible.1 The regions R and R ′ Now we see that ∂x ∂u J ( u.120 CHAPTER 3. at least in a closed form. To use the change of variables formula (3.1 below. y( u. we have e x+ y d A = R x− y R′ 1 f ( x( u.19). v). v) = ∂y ∂u ∂x ∂v = ∂y ∂v 1 2 1 −2 1 2 1 2 = 1 1 1 = ⇒ | J ( u. y) : x ≥ 0. Solution: First.9. v) | d A v −v = = = = ev 0 1 0 1 0 2 u 1 2 du dv dv u=v v u v 2 e u=−v −1 v 2 ( e − e ) dv v ( e − e −1 ) 4 1 0 = 1 1 e− 4 e = e2 − 1 4e .5. we see how the mapping 2 1 1 x = x( u. x + y ≤ 1}. we will try the substitution u = x − y and v = x + y. MULTIPLE INTEGRALS x− y Example 3.5. Evaluate R e x+ y d A . y = y( u. v) | = . y 1 x+ y =1 R 0 1 x −1 1 x = 2 (u + v) v 1 R′ u = −v 0 u=v u 1 y= 1 2 (v − u) Figure 3. 2 2 2 so using horizontal slices in R ′ . v) = 2 ( u + v). we need to write both x and y in terms of u and v. where R = {( x. v)) | J ( u.

The change of variables formula can be used to evaluate double integrals in polar coordinates. Letting

$$x = x(r,\theta) = r\cos\theta \quad\text{and}\quad y = y(r,\theta) = r\sin\theta ,$$

we have

$$J(r,\theta) = \begin{vmatrix} \dfrac{\partial x}{\partial r} & \dfrac{\partial x}{\partial \theta} \\[2mm] \dfrac{\partial y}{\partial r} & \dfrac{\partial y}{\partial \theta} \end{vmatrix} = \begin{vmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{vmatrix} = r\cos^2\theta + r\sin^2\theta = r \;\Rightarrow\; |J(r,\theta)| = |r| = r ,$$

so we have the following formula:

Double Integral in Polar Coordinates

$$\iint_R f(x,y)\,dx\,dy = \iint_{R'} f(r\cos\theta,\,r\sin\theta)\,r\,dr\,d\theta , \tag{3.24}$$

where the mapping $x = r\cos\theta$, $y = r\sin\theta$ maps the region $R'$ in the rθ-plane onto the region $R$ in the xy-plane in a one-to-one manner.

Example 3.10. Find the volume V inside the paraboloid $z = x^2 + y^2$ for $0 \le z \le 1$.

Solution: Using vertical slices, we see that

$$V = \iint_R (1 - z)\,dA = \iint_R \left(1 - (x^2 + y^2)\right) dA ,$$

where $R = \{(x,y) : x^2 + y^2 \le 1\}$ is the unit disk in $\mathbb{R}^2$ (see Figure 3.5.2). In polar coordinates $(r,\theta)$ we know that $x^2 + y^2 = r^2$ and that the unit disk $R$ is the set $R' = \{(r,\theta) : 0 \le r \le 1,\ 0 \le \theta \le 2\pi\}$. Thus,

$$\begin{aligned}
V &= \int_0^{2\pi}\!\int_0^1 (1 - r^2)\,r\,dr\,d\theta \\
&= \int_0^{2\pi}\!\int_0^1 (r - r^3)\,dr\,d\theta \\
&= \int_0^{2\pi} \left( \frac{r^2}{2} - \frac{r^4}{4}\,\Big|_{r=0}^{r=1} \right) d\theta \\
&= \int_0^{2\pi} \frac{1}{4}\,d\theta = \frac{\pi}{2} .
\end{aligned}$$

[Figure 3.5.2 The paraboloid z = x² + y² over the unit disk x² + y² = 1]
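As a quick numerical check of Example 3.10, the polar-coordinate integral can be approximated with a midpoint rule. This is a hedged sketch rather than anything from the text; the name `volume_polar` and the number of subintervals are assumptions made for illustration. Note the extra factor r, which is the Jacobian from formula (3.24).

```python
import math

def volume_polar(n=200):
    """Midpoint rule for the integral of (1 - r^2) * r over 0 <= r <= 1, 0 <= theta <= 2*pi."""
    dr = 1.0 / n
    radial = 0.0
    for i in range(n):
        r = (i + 0.5) * dr
        radial += (1 - r * r) * r * dr  # integrand times the Jacobian factor r
    # The integrand does not depend on theta, so the theta integral contributes 2*pi.
    return 2 * math.pi * radial

volume = volume_polar()
print(volume, math.pi / 2)
```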

Example 3.11. Find the volume V inside the cone $z = \sqrt{x^2 + y^2}$ for $0 \le z \le 1$.

Solution: Using vertical slices, we see that

$$V = \iint_R (1 - z)\,dA = \iint_R \left(1 - \sqrt{x^2 + y^2}\right) dA ,$$

where $R = \{(x,y) : x^2 + y^2 \le 1\}$ is the unit disk in $\mathbb{R}^2$ (see Figure 3.5.3). In polar coordinates $(r,\theta)$ we know that $\sqrt{x^2 + y^2} = r$ and that the unit disk $R$ is the set $R' = \{(r,\theta) : 0 \le r \le 1,\ 0 \le \theta \le 2\pi\}$. Thus,

$$\begin{aligned}
V &= \int_0^{2\pi}\!\int_0^1 (1 - r)\,r\,dr\,d\theta \\
&= \int_0^{2\pi}\!\int_0^1 (r - r^2)\,dr\,d\theta \\
&= \int_0^{2\pi} \left( \frac{r^2}{2} - \frac{r^3}{3}\,\Big|_{r=0}^{r=1} \right) d\theta \\
&= \int_0^{2\pi} \frac{1}{6}\,d\theta = \frac{\pi}{3} .
\end{aligned}$$

[Figure 3.5.3 The cone z = √(x² + y²) over the unit disk x² + y² = 1]

In a similar fashion, it can be shown (see Exercises 5-6) that triple integrals in cylindrical and spherical coordinates take the following forms:

Triple Integral in Cylindrical Coordinates

$$\iiint_S f(x,y,z)\,dx\,dy\,dz = \iiint_{S'} f(r\cos\theta,\,r\sin\theta,\,z)\,r\,dr\,d\theta\,dz , \tag{3.25}$$

where the mapping $x = r\cos\theta$, $y = r\sin\theta$, $z = z$ maps the solid $S'$ in rθz-space onto the solid $S$ in xyz-space in a one-to-one manner.

Triple Integral in Spherical Coordinates

$$\iiint_S f(x,y,z)\,dx\,dy\,dz = \iiint_{S'} f(\rho\sin\phi\cos\theta,\,\rho\sin\phi\sin\theta,\,\rho\cos\phi)\,\rho^2\sin\phi\,d\rho\,d\phi\,d\theta , \tag{3.26}$$

where the mapping $x = \rho\sin\phi\cos\theta$, $y = \rho\sin\phi\sin\theta$, $z = \rho\cos\phi$ maps the solid $S'$ in ρφθ-space onto the solid $S$ in xyz-space in a one-to-one manner.

Example 3.12. For $a > 0$, find the volume V inside the sphere $S = x^2 + y^2 + z^2 = a^2$.

Solution: We see that S is the set $\rho = a$ in spherical coordinates, so

$$\begin{aligned}
V = \iiint_S 1\,dV &= \int_0^{2\pi}\!\int_0^{\pi}\!\int_0^a 1\cdot\rho^2\sin\phi\,d\rho\,d\phi\,d\theta \\
&= \int_0^{2\pi}\!\int_0^{\pi} \left( \frac{\rho^3}{3}\,\Big|_{\rho=0}^{\rho=a} \right) \sin\phi\,d\phi\,d\theta = \int_0^{2\pi}\!\int_0^{\pi} \frac{a^3}{3}\sin\phi\,d\phi\,d\theta \\
&= \int_0^{2\pi} \left( -\frac{a^3}{3}\cos\phi\,\Big|_{\phi=0}^{\phi=\pi} \right) d\theta = \int_0^{2\pi} \frac{2a^3}{3}\,d\theta = \frac{4\pi a^3}{3} .
\end{aligned}$$

Exercises

A

1. Find the volume V inside the paraboloid $z = x^2 + y^2$ for $0 \le z \le 4$.

2. Find the volume V inside the cone $z = \sqrt{x^2 + y^2}$ for $0 \le z \le 3$.

B

3. Find the volume V of the solid inside both $x^2 + y^2 + z^2 = 4$ and $x^2 + y^2 = 1$.

4. Find the volume V inside both the sphere $x^2 + y^2 + z^2 = 1$ and the cone $z = \sqrt{x^2 + y^2}$.

5. Prove formula (3.25).

6. Prove formula (3.26).

7. Evaluate $\displaystyle\iint_R \sin\!\left(\frac{x+y}{2}\right)\cos\!\left(\frac{x-y}{2}\right) dA$, where R is the triangle with vertices (0, 0), (2, 0) and (1, 1). (Hint: Use the change of variables $u = (x+y)/2$, $v = (x-y)/2$.)

8. Find the volume of the solid bounded by $z = x^2 + y^2$ and $z^2 = 4(x^2 + y^2)$.

9. Find the volume inside the elliptic cylinder $\dfrac{x^2}{a^2} + \dfrac{y^2}{b^2} = 1$ for $0 \le z \le 2$.

C

10. Show that the volume inside the ellipsoid $\dfrac{x^2}{a^2} + \dfrac{y^2}{b^2} + \dfrac{z^2}{c^2} = 1$ is $\dfrac{4\pi abc}{3}$. (Hint: Use the change of variables $x = au$, $y = bv$, $z = cw$, then consider Example 3.12.)

11. Show that the Beta function, defined by

$$B(x,y) = \int_0^1 t^{x-1}(1-t)^{y-1}\,dt , \quad\text{for } x > 0,\ y > 0,$$

satisfies the relation $B(y,x) = B(x,y)$ for $x > 0$, $y > 0$.

12. Using the substitution $t = u/(u+1)$, show that the Beta function can be written as

$$B(x,y) = \int_0^{\infty} \frac{u^{x-1}}{(u+1)^{x+y}}\,du , \quad\text{for } x > 0,\ y > 0.$$

3.6 Application: Center of Mass

Recall from single-variable calculus that for a region $R = \{(x,y) : a \le x \le b,\ 0 \le y \le f(x)\}$ in $\mathbb{R}^2$ that represents a thin, flat plate (see Figure 3.6.1), where $f(x)$ is a continuous function on $[a,b]$, the center of mass of R has coordinates $(\bar{x},\bar{y})$ given by

$$\bar{x} = \frac{M_y}{M} \quad\text{and}\quad \bar{y} = \frac{M_x}{M} ,$$

where

$$M_x = \int_a^b \frac{(f(x))^2}{2}\,dx , \qquad M_y = \int_a^b x f(x)\,dx , \qquad M = \int_a^b f(x)\,dx , \tag{3.27}$$

assuming that R has uniform density, i.e. the mass of R is uniformly distributed over the region. In this case the area M of the region is considered the mass of R (the density is constant, and taken as 1 for simplicity).

[Figure 3.6.1 Center of mass of R: the region under y = f(x) for a ≤ x ≤ b, with center of mass (x̄, ȳ)]

In the general case where the density of a region (or lamina) R is a continuous function $\delta = \delta(x,y)$ of the coordinates $(x,y)$ of points inside R (where R can be any region in $\mathbb{R}^2$) the coordinates $(\bar{x},\bar{y})$ of the center of mass of R are given by

$$\bar{x} = \frac{M_y}{M} \quad\text{and}\quad \bar{y} = \frac{M_x}{M} , \tag{3.28}$$

where

$$M_y = \iint_R x\,\delta(x,y)\,dA , \qquad M_x = \iint_R y\,\delta(x,y)\,dA , \qquad M = \iint_R \delta(x,y)\,dA . \tag{3.29}$$

The quantities $M_x$ and $M_y$ are called the moments (or first moments) of the region R about the x-axis and y-axis, respectively. The quantity M is the mass of the region R. To see this, think of taking a small rectangle inside R with dimensions $\Delta x$ and $\Delta y$ close to 0. The mass of that rectangle is approximately $\delta(x_*,y_*)\,\Delta x\,\Delta y$, for some point $(x_*,y_*)$ in that rectangle. Then the mass of R is the limit of the sums of the masses of all such rectangles inside R as the diagonals of the rectangles approach 0, which is the double integral $\iint_R \delta(x,y)\,dA$.

Note that the formulas in (3.27) represent a special case when $\delta(x,y) = 1$ throughout R in the formulas in (3.29).

Example 3.13. Find the center of mass of the region $R = \{(x,y) : 0 \le x \le 1,\ 0 \le y \le 2x^2\}$, if the density function at $(x,y)$ is $\delta(x,y) = x + y$.

Solution: The region R is shown in Figure 3.6.2. We have

$$\begin{aligned}
M = \iint_R \delta(x,y)\,dA &= \int_0^1\!\int_0^{2x^2} (x + y)\,dy\,dx \\
&= \int_0^1 \left( xy + \frac{y^2}{2}\,\Big|_{y=0}^{y=2x^2} \right) dx \\
&= \int_0^1 (2x^3 + 2x^4)\,dx \\
&= \frac{x^4}{2} + \frac{2x^5}{5}\,\Big|_0^1 = \frac{9}{10}
\end{aligned}$$

[Figure 3.6.2 The region R under the parabola y = 2x² for 0 ≤ x ≤ 1]

and

$$\begin{aligned}
M_x = \iint_R y\,\delta(x,y)\,dA &= \int_0^1\!\int_0^{2x^2} y(x + y)\,dy\,dx = \int_0^1 \left( \frac{xy^2}{2} + \frac{y^3}{3}\,\Big|_{y=0}^{y=2x^2} \right) dx \\
&= \int_0^1 \left( 2x^5 + \frac{8x^6}{3} \right) dx = \frac{x^6}{3} + \frac{8x^7}{21}\,\Big|_0^1 = \frac{5}{7} , \\[2mm]
M_y = \iint_R x\,\delta(x,y)\,dA &= \int_0^1\!\int_0^{2x^2} x(x + y)\,dy\,dx = \int_0^1 \left( x^2 y + \frac{xy^2}{2}\,\Big|_{y=0}^{y=2x^2} \right) dx \\
&= \int_0^1 (2x^4 + 2x^5)\,dx = \frac{2x^5}{5} + \frac{x^6}{3}\,\Big|_0^1 = \frac{11}{15} ,
\end{aligned}$$

so the center of mass $(\bar{x},\bar{y})$ is given by

$$\bar{x} = \frac{M_y}{M} = \frac{11/15}{9/10} = \frac{22}{27} , \qquad \bar{y} = \frac{M_x}{M} = \frac{5/7}{9/10} = \frac{50}{63} .$$

Note how this center of mass is a little further towards the upper corner of the region R than when the density is uniform (it is easy to use the formulas in (3.27) to show that $(\bar{x},\bar{y}) = \left(\tfrac{3}{4},\tfrac{3}{5}\right)$ in that case). This makes sense since the density function $\delta(x,y) = x + y$ increases as $(x,y)$ approaches that upper corner, where there is quite a bit of area.

In the special case where the density function $\delta(x,y)$ is a constant function on the region R, the center of mass $(\bar{x},\bar{y})$ is called the centroid of R.
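The moments in Example 3.13 can be checked by approximating the double integrals in (3.29) directly. The sketch below is illustrative only: the helper name `moments`, the midpoint rule, and the grid size are assumptions, and the region is hard-coded as 0 ≤ x ≤ 1, 0 ≤ y ≤ 2x².

```python
def moments(density, n=400):
    """Midpoint-rule estimates of (M, M_x, M_y) over R = {0 <= x <= 1, 0 <= y <= 2x^2}."""
    M = Mx = My = 0.0
    dx = 1.0 / n
    for i in range(n):
        x = (i + 0.5) * dx
        dy = (2 * x * x) / n          # vertical slice 0 <= y <= 2x^2, cut into n pieces
        for j in range(n):
            y = (j + 0.5) * dy
            mass = density(x, y) * dx * dy   # mass of one small rectangle
            M += mass
            Mx += y * mass            # moment about the x-axis
            My += x * mass            # moment about the y-axis
    return M, Mx, My

M, Mx, My = moments(lambda x, y: x + y)
x_bar, y_bar = My / M, Mx / M
print(x_bar, 22 / 27)
print(y_bar, 50 / 63)

# Uniform density, for comparison with the centroid (3/4, 3/5) mentioned above.
M1, Mx1, My1 = moments(lambda x, y: 1.0)
print(My1 / M1, Mx1 / M1)
```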

The formulas for the center of mass of a region in $\mathbb{R}^2$ can be generalized to a solid S in $\mathbb{R}^3$. Let S be a solid with a continuous mass density function $\delta(x,y,z)$ at any point $(x,y,z)$ in S. Then the center of mass of S has coordinates $(\bar{x},\bar{y},\bar{z})$, where

$$\bar{x} = \frac{M_{yz}}{M} , \qquad \bar{y} = \frac{M_{xz}}{M} , \qquad \bar{z} = \frac{M_{xy}}{M} , \tag{3.30}$$

where

$$M_{yz} = \iiint_S x\,\delta(x,y,z)\,dV , \qquad M_{xz} = \iiint_S y\,\delta(x,y,z)\,dV , \qquad M_{xy} = \iiint_S z\,\delta(x,y,z)\,dV , \tag{3.31}$$

$$M = \iiint_S \delta(x,y,z)\,dV . \tag{3.32}$$

In this case, $M_{yz}$, $M_{xz}$ and $M_{xy}$ are called the moments (or first moments) of S around the yz-plane, xz-plane and xy-plane, respectively. Also, M is the mass of S.

Example 3.14. Find the center of mass of the solid $S = \{(x,y,z) : z \ge 0,\ x^2 + y^2 + z^2 \le a^2\}$, if the density function at $(x,y,z)$ is $\delta(x,y,z) = 1$.

Solution: The solid S is just the upper hemisphere inside the sphere of radius a centered at the origin (see Figure 3.6.3). So since the density function is a constant and S is symmetric about the z-axis, then it is clear that $\bar{x} = 0$ and $\bar{y} = 0$, so we need only find $\bar{z}$. We have

$$M = \iiint_S \delta(x,y,z)\,dV = \iiint_S 1\,dV = \text{Volume}(S) .$$

[Figure 3.6.3 The upper hemisphere S of radius a, with center of mass (x̄, ȳ, z̄)]

But since the volume of S is half the volume of the sphere of radius a, which we know by Example 3.12 is $\frac{4\pi a^3}{3}$, then $M = \frac{2\pi a^3}{3}$. And

$$\begin{aligned}
M_{xy} = \iiint_S z\,\delta(x,y,z)\,dV &= \iiint_S z\,dV , \text{ which in spherical coordinates is} \\
&= \int_0^{2\pi}\!\int_0^{\pi/2}\!\int_0^a (\rho\cos\phi)\,\rho^2\sin\phi\,d\rho\,d\phi\,d\theta \\
&= \int_0^{2\pi}\!\int_0^{\pi/2} \sin\phi\cos\phi \left( \int_0^a \rho^3\,d\rho \right) d\phi\,d\theta \\
&= \int_0^{2\pi}\!\int_0^{\pi/2} \frac{a^4}{4}\sin\phi\cos\phi\,d\phi\,d\theta
\end{aligned}$$

$$\begin{aligned}
M_{xy} &= \int_0^{2\pi}\!\int_0^{\pi/2} \frac{a^4}{8}\sin 2\phi\,d\phi\,d\theta \quad (\text{since } \sin 2\phi = 2\sin\phi\cos\phi) \\
&= \int_0^{2\pi} \left( -\frac{a^4}{16}\cos 2\phi\,\Big|_{\phi=0}^{\phi=\pi/2} \right) d\theta \\
&= \int_0^{2\pi} \frac{a^4}{8}\,d\theta \\
&= \frac{\pi a^4}{4} ,
\end{aligned}$$

so

$$\bar{z} = \frac{M_{xy}}{M} = \frac{\pi a^4/4}{2\pi a^3/3} = \frac{3a}{8} .$$

Thus, the center of mass of S is $(\bar{x},\bar{y},\bar{z}) = \left(0, 0, \frac{3a}{8}\right)$.

Exercises

A

For Exercises 1-5, find the center of mass of the region R with the given density function $\delta(x,y)$.

1. $R = \{(x,y) : 0 \le x \le 2,\ 0 \le y \le 4\}$, $\delta(x,y) = 2y$

2. $R = \{(x,y) : 0 \le x \le 1,\ 0 \le y \le x^2\}$, $\delta(x,y) = x + y$

3. $R = \{(x,y) : y \ge 0,\ x^2 + y^2 \le a^2\}$, $\delta(x,y) = 1$

4. $R = \{(x,y) : y \ge 0,\ x \ge 0,\ 1 \le x^2 + y^2 \le 4\}$, $\delta(x,y) = \sqrt{x^2 + y^2}$

5. $R = \{(x,y) : y \ge 0,\ x^2 + y^2 \le 1\}$, $\delta(x,y) = y$

B

For Exercises 6-10, find the center of mass of the solid S with the given density function $\delta(x,y,z)$.

6. $S = \{(x,y,z) : 0 \le x \le 1,\ 0 \le y \le 1,\ 0 \le z \le 1\}$, $\delta(x,y,z) = xyz$

7. $S = \{(x,y,z) : z \ge 0,\ x^2 + y^2 + z^2 \le a^2\}$, $\delta(x,y,z) = x^2 + y^2 + z^2$

8. $S = \{(x,y,z) : x \ge 0,\ y \ge 0,\ z \ge 0,\ x^2 + y^2 + z^2 \le a^2\}$, $\delta(x,y,z) = 1$

9. $S = \{(x,y,z) : 0 \le x \le 1,\ 0 \le y \le 1,\ 0 \le z \le 1\}$, $\delta(x,y,z) = x^2 + y^2 + z^2$

10. $S = \{(x,y,z) : 0 \le x \le 1,\ 0 \le y \le 1,\ 0 \le z \le 1 - x - y\}$, $\delta(x,y,z) = 1$

3.7 Application: Probability and Expected Value

In this section we will briefly discuss some applications of multiple integrals in the field of probability theory. In particular we will see ways in which multiple integrals can be used to calculate probabilities and expected values.

Probability

Suppose that you have a standard six-sided (fair) die, and you let a variable X represent the value rolled. Then the probability of rolling a 3, written as $P(X = 3)$, is $\frac{1}{6}$, since there are six sides on the die and each one is equally likely to be rolled, and hence in particular the 3 has a one out of six chance of being rolled. Likewise the probability of rolling at most a 3, written as $P(X \le 3)$, is $\frac{3}{6} = \frac{1}{2}$, since of the six numbers on the die, there are three equally likely numbers (1, 2, and 3) that are less than or equal to 3. Note that $P(X \le 3) = P(X = 1) + P(X = 2) + P(X = 3)$. We call X a discrete random variable on the sample space (or probability space) Ω consisting of all possible outcomes. In our case, Ω = {1, 2, 3, 4, 5, 6}. An event A is a subset of the sample space. For example, in the case of the die, the event $X \le 3$ is the set {1, 2, 3}.

Now let X be a variable representing a random real number in the interval (0, 1). Note that the set of all real numbers between 0 and 1 is not a discrete (or countable) set of values, i.e. it can not be put into a one-to-one correspondence with the set of positive integers.³ In this case, for any real number x in (0, 1), it makes no sense to consider $P(X = x)$ since it must be 0 (why?). Instead, we consider the probability $P(X \le x)$, which is given by $P(X \le x) = x$. The reasoning is this: the interval (0, 1) has length 1, and for x in (0, 1) the interval (0, x) has length x. So since X represents a random number in (0, 1), and hence is uniformly distributed over (0, 1), then

$$P(X \le x) = \frac{\text{length of } (0,x)}{\text{length of } (0,1)} = \frac{x}{1} = x .$$

³ For a proof see p. 9-10 in Kamke, E., Theory of Sets, New York: Dover, 1950.

We call X a continuous random variable on the sample space Ω = (0, 1). An event A is a subset of the sample space. For example, in our case the event $X \le x$ is the set (0, x).

In the case of a discrete random variable, we saw how the probability of an event was the sum of the probabilities of the individual outcomes comprising that event (e.g. $P(X \le 3) = P(X = 1) + P(X = 2) + P(X = 3)$ in the die example). For a continuous random variable, the probability of an event will instead be the integral of a function, which we will now describe.

Let X be a continuous real-valued random variable on a sample space Ω in $\mathbb{R}$. For simplicity, let Ω = (a, b). Define the distribution function F of X as

$$F(x) = P(X \le x) , \quad\text{for } -\infty < x < \infty \tag{3.33}$$

$$\phantom{F(x)} = \begin{cases} 1, & \text{for } x \ge b \\ P(X \le x), & \text{for } a < x < b \\ 0, & \text{for } x \le a . \end{cases} \tag{3.34}$$

Suppose that there is a nonnegative, continuous real-valued function f on $\mathbb{R}$ such that

$$F(x) = \int_{-\infty}^{x} f(y)\,dy , \quad\text{for } -\infty < x < \infty , \tag{3.35}$$

and

$$\int_{-\infty}^{\infty} f(x)\,dx = 1 . \tag{3.36}$$

Then we call f the probability density function (or p.d.f. for short) for X. We thus have

$$P(X \le x) = \int_a^x f(y)\,dy , \quad\text{for } a < x < b . \tag{3.37}$$

Also, by the Fundamental Theorem of Calculus, we have

$$F'(x) = f(x) , \quad\text{for } -\infty < x < \infty . \tag{3.38}$$

Example 3.15. Let X represent a randomly selected real number in the interval (0, 1). We say that X has the uniform distribution on (0, 1), with distribution function

$$F(x) = P(X \le x) = \begin{cases} 1, & \text{for } x \ge 1 \\ x, & \text{for } 0 < x < 1 \\ 0, & \text{for } x \le 0 , \end{cases} \tag{3.39}$$

and probability density function

$$f(x) = F'(x) = \begin{cases} 1, & \text{for } 0 < x < 1 \\ 0, & \text{elsewhere.} \end{cases} \tag{3.40}$$

In general, if X represents a randomly selected real number in an interval (a, b), then X has the uniform distribution function

$$F(x) = P(X \le x) = \begin{cases} 1, & \text{for } x \ge b \\ \dfrac{x-a}{b-a}, & \text{for } a < x < b \\ 0, & \text{for } x \le a \end{cases} \tag{3.41}$$

and probability density function

$$f(x) = F'(x) = \begin{cases} \dfrac{1}{b-a}, & \text{for } a < x < b \\ 0, & \text{elsewhere.} \end{cases} \tag{3.42}$$

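Example 3.16. A famous distribution function is given by the standard normal distribution, whose probability density function f is

$$f(x) = \frac{1}{\sqrt{2\pi}}\,e^{-x^2/2} , \quad\text{for } -\infty < x < \infty . \tag{3.43}$$

This is often called a "bell curve", and is used widely in statistics. Since we are claiming that f is a p.d.f., we should have

$$\int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}}\,e^{-x^2/2}\,dx = 1 \tag{3.44}$$

by formula (3.36), which is equivalent to

$$\int_{-\infty}^{\infty} e^{-x^2/2}\,dx = \sqrt{2\pi} . \tag{3.45}$$

We can use a double integral in polar coordinates to verify this integral. First,

$$\begin{aligned}
\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} e^{-(x^2+y^2)/2}\,dx\,dy &= \int_{-\infty}^{\infty} e^{-y^2/2} \left( \int_{-\infty}^{\infty} e^{-x^2/2}\,dx \right) dy \\
&= \left( \int_{-\infty}^{\infty} e^{-x^2/2}\,dx \right) \left( \int_{-\infty}^{\infty} e^{-y^2/2}\,dy \right) \\
&= \left( \int_{-\infty}^{\infty} e^{-x^2/2}\,dx \right)^2
\end{aligned}$$

since the same function is being integrated twice in the middle equation, just with different variables. But using polar coordinates, we see that

$$\begin{aligned}
\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} e^{-(x^2+y^2)/2}\,dx\,dy &= \int_0^{2\pi}\!\int_0^{\infty} e^{-r^2/2}\,r\,dr\,d\theta \\
&= \int_0^{2\pi} \left( -e^{-r^2/2}\,\Big|_{r=0}^{r=\infty} \right) d\theta \\
&= \int_0^{2\pi} \left(0 - (-e^0)\right) d\theta = \int_0^{2\pi} 1\,d\theta = 2\pi ,
\end{aligned}$$

and so

$$\left( \int_{-\infty}^{\infty} e^{-x^2/2}\,dx \right)^2 = 2\pi , \quad\text{and hence}\quad \int_{-\infty}^{\infty} e^{-x^2/2}\,dx = \sqrt{2\pi} .$$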
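Formula (3.45) is easy to test numerically. The sketch below is not from the text: it truncates the infinite interval to [−10, 10] (an assumption that is harmless here, since the integrand is below e^(−50) outside that range) and applies a midpoint rule; the function name and step count are illustrative.

```python
import math

def gaussian_integral(limit=10.0, n=20000):
    """Midpoint rule for the integral of exp(-x^2 / 2) over [-limit, limit]."""
    dx = 2 * limit / n
    total = 0.0
    for i in range(n):
        x = -limit + (i + 0.5) * dx
        total += math.exp(-x * x / 2) * dx
    return total

value = gaussian_integral()
print(value, math.sqrt(2 * math.pi))
```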


In addition to individual random variables, we can consider jointly distributed random variables. For this, we will let X , Y and Z be three real-valued continuous random variables deﬁned on the same sample space Ω in R (the discussion for two random variables is similar). Then the joint distribution function F of X , Y and Z is given by

$$F(x,y,z) = P(X \le x,\ Y \le y,\ Z \le z) , \quad\text{for } -\infty < x, y, z < \infty . \tag{3.46}$$

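If there is a nonnegative, continuous real-valued function f on $\mathbb{R}^3$ such that

$$F(x,y,z) = \int_{-\infty}^{z}\!\int_{-\infty}^{y}\!\int_{-\infty}^{x} f(u,v,w)\,du\,dv\,dw , \quad\text{for } -\infty < x, y, z < \infty \tag{3.47}$$

and

$$\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} f(x,y,z)\,dx\,dy\,dz = 1 , \tag{3.48}$$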

then we call f the joint probability density function (or joint p.d.f. for short) for X, Y and Z. In general, for $a_1 < b_1$, $a_2 < b_2$, $a_3 < b_3$, we have

$$P(a_1 < X \le b_1,\ a_2 < Y \le b_2,\ a_3 < Z \le b_3) = \int_{a_3}^{b_3}\!\int_{a_2}^{b_2}\!\int_{a_1}^{b_1} f(x,y,z)\,dx\,dy\,dz , \tag{3.49}$$

with the ≤ and < symbols interchangeable in any combination. A triple integral, then, can be thought of as representing a probability (for a function f which is a p.d.f.).

Example 3.17. Let a, b, and c be real numbers selected randomly from the interval (0, 1). What is the probability that the equation $ax^2 + bx + c = 0$ has at least one real solution x?

Solution: We know by the quadratic formula that there is at least one real solution if $b^2 - 4ac \ge 0$. So we need to calculate $P(b^2 - 4ac \ge 0)$. We will use three jointly distributed random variables to do this. First, since $0 < a, b, c < 1$, we have

$$b^2 - 4ac \ge 0 \;\Leftrightarrow\; 0 < 4ac \le b^2 < 1 \;\Leftrightarrow\; 0 < 2\sqrt{a}\sqrt{c} \le b < 1 ,$$

where the last relation holds for all $0 < a, c < 1$ such that

$$0 < 4ac < 1 \;\Leftrightarrow\; 0 < c < \frac{1}{4a} .$$

[Figure 3.7.1 Region R = R₁ ∪ R₂ in the ac-plane: R₁ is the part with 0 < a < 1/4 and 0 < c < 1; R₂ is the part with 1/4 < a < 1 lying below the curve c = 1/(4a)]

Considering a, b and c as real variables, the region R in the ac-plane where the above relation holds is given by $R = \{(a,c) : 0 < a < 1,\ 0 < c < 1,\ 0 < c < \frac{1}{4a}\}$, which we can see is a union of two regions $R_1$ and $R_2$, as in Figure 3.7.1 above.

Now let X, Y and Z be continuous random variables, each representing a randomly selected real number from the interval (0, 1) (think of X, Y and Z representing a, b and c, respectively). Then, similar to how we showed that $f(x) = 1$ is the p.d.f. of the uniform distribution on (0, 1), it can be shown that $f(x,y,z) = 1$ for x, y, z in (0, 1) (0 elsewhere) is the joint p.d.f. of X, Y and Z. Now,

$$P(b^2 - 4ac \ge 0) = P\left((a,c) \in R,\ 2\sqrt{a}\sqrt{c} \le b < 1\right) ,$$

so this probability is the triple integral of $f(a,b,c) = 1$ as b varies from $2\sqrt{a}\sqrt{c}$ to 1 and as $(a,c)$ varies over the region R. Since R can be divided into two regions $R_1$ and $R_2$, then the required triple integral can be split into a sum of two triple integrals, using vertical slices in R:

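$$\begin{aligned}
P(b^2 - 4ac \ge 0) &= \underbrace{\int_0^{1/4}\!\int_0^1\!\int_{2\sqrt{a}\sqrt{c}}^{1} 1\,db\,dc\,da}_{R_1} + \underbrace{\int_{1/4}^{1}\!\int_0^{1/4a}\!\int_{2\sqrt{a}\sqrt{c}}^{1} 1\,db\,dc\,da}_{R_2} \\
&= \int_0^{1/4}\!\int_0^1 \left(1 - 2\sqrt{a}\sqrt{c}\right) dc\,da + \int_{1/4}^{1}\!\int_0^{1/4a} \left(1 - 2\sqrt{a}\sqrt{c}\right) dc\,da \\
&= \int_0^{1/4} \left( c - \tfrac{4}{3}\sqrt{a}\,c^{3/2}\,\Big|_{c=0}^{c=1} \right) da + \int_{1/4}^{1} \left( c - \tfrac{4}{3}\sqrt{a}\,c^{3/2}\,\Big|_{c=0}^{c=1/4a} \right) da \\
&= \int_0^{1/4} \left( 1 - \tfrac{4}{3}\sqrt{a} \right) da + \int_{1/4}^{1} \frac{1}{12a}\,da \\
&= a - \tfrac{8}{9}a^{3/2}\,\Big|_0^{1/4} + \tfrac{1}{12}\ln a\,\Big|_{1/4}^{1} \\
&= \left( \frac{1}{4} - \frac{1}{9} \right) + \left( 0 - \frac{1}{12}\ln\frac{1}{4} \right) = \frac{5}{36} + \frac{1}{12}\ln 4 \\
P(b^2 - 4ac \ge 0) &= \frac{5 + 3\ln 4}{36} \approx 0.2544
\end{aligned}$$

In other words, the equation $ax^2 + bx + c = 0$ has about a 25% chance of being solved!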
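The probability found in Example 3.17 lends itself to a Monte Carlo check. The following sketch is an illustration under stated assumptions, not part of the text: it draws a, b and c independently with Python's `random` module (which samples from [0, 1), indistinguishable from (0, 1) for this purpose), and the function name, trial count and seed are arbitrary choices.

```python
import math
import random

def estimate_real_root_probability(trials=200_000, seed=0):
    """Fraction of random triples (a, b, c) in (0, 1)^3 with b^2 - 4ac >= 0."""
    rng = random.Random(seed)   # fixed seed so repeated runs give the same estimate
    hits = 0
    for _ in range(trials):
        a, b, c = rng.random(), rng.random(), rng.random()
        if b * b - 4 * a * c >= 0:   # discriminant test for a real solution
            hits += 1
    return hits / trials

exact = (5 + 3 * math.log(4)) / 36   # closed form from Example 3.17
estimate = estimate_real_root_probability()
print(estimate, exact)
```

With a couple of hundred thousand trials the estimate typically lands within a few thousandths of the exact value; more trials tighten the agreement only slowly, as is usual for Monte Carlo estimates.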

Expected Value

The expected value E X of a random variable X can be thought of as the “average” value of X as it varies over its sample space. If X is a discrete random variable, then

$$EX = \sum_x x\,P(X = x) , \tag{3.50}$$

with the sum being taken over all elements x of the sample space. For example, if X represents the number rolled on a six-sided die, then

$$EX = \sum_{x=1}^{6} x\,P(X = x) = \sum_{x=1}^{6} x \cdot \frac{1}{6} = 3.5 \tag{3.51}$$

is the expected value of X, which is the average of the integers 1 through 6.

If X is a real-valued continuous random variable with p.d.f. f, then

$$EX = \int_{-\infty}^{\infty} x\,f(x)\,dx . \tag{3.52}$$

For example, if X has the uniform distribution on the interval (0, 1), then its p.d.f. is

$$f(x) = \begin{cases} 1, & \text{for } 0 < x < 1 \\ 0, & \text{elsewhere,} \end{cases} \tag{3.53}$$

and so

$$EX = \int_{-\infty}^{\infty} x\,f(x)\,dx = \int_0^1 x\,dx = \frac{1}{2} . \tag{3.54}$$

For a pair of jointly distributed, real-valued continuous random variables X and Y with joint p.d.f. $f(x,y)$, the expected values of X and Y are given by

$$EX = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} x\,f(x,y)\,dx\,dy \quad\text{and}\quad EY = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} y\,f(x,y)\,dx\,dy , \tag{3.55}$$

respectively.

Example 3.18. If you were to pick n > 2 random real numbers from the interval (0, 1), what are the expected values for the smallest and largest of those numbers?

Solution: Let $U_1, \ldots, U_n$ be n continuous random variables, each representing a randomly selected real number from (0, 1), i.e. each has the uniform distribution on (0, 1). Define random variables X and Y by

$$X = \min(U_1, \ldots, U_n) \quad\text{and}\quad Y = \max(U_1, \ldots, U_n) .$$

Then it can be shown⁴ that the joint p.d.f. of X and Y is

$$f(x,y) = \begin{cases} n(n-1)(y-x)^{n-2}, & \text{for } 0 \le x \le y \le 1 \\ 0, & \text{elsewhere.} \end{cases} \tag{3.56}$$

⁴ See Ch. 6 in Hoel, Port and Stone.

Thus, the expected value of X is

$$\begin{aligned}
EX &= \int_0^1\!\int_x^1 n(n-1)\,x\,(y-x)^{n-2}\,dy\,dx \\
&= \int_0^1 \left( n\,x\,(y-x)^{n-1}\,\Big|_{y=x}^{y=1} \right) dx \\
&= \int_0^1 n\,x\,(1-x)^{n-1}\,dx , \text{ so integration by parts yields} \\
&= -x(1-x)^n - \frac{1}{n+1}(1-x)^{n+1}\,\Big|_0^1 \\
EX &= \frac{1}{n+1} ,
\end{aligned}$$

and similarly (see Exercise 3) it can be shown that

$$EY = \int_0^1\!\int_0^y n(n-1)\,y\,(y-x)^{n-2}\,dx\,dy = \frac{n}{n+1} .$$

So, for example, if you were to repeatedly take samples of n = 3 random real numbers from (0, 1), and each time store the minimum and maximum values in the sample, then the average of the minimums would approach $\frac{1}{4}$ and the average of the maximums would approach $\frac{3}{4}$ as the number of samples grows. It would be relatively simple (see Exercise 4) to write a computer program to test this.
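One possible shape for such a program is sketched below. It is only an illustration, not the author's solution: the function name `average_min_max`, the sample count and the fixed seed are assumptions, and the uniform samples come from Python's standard `random` module.

```python
import random

def average_min_max(n=3, samples=100_000, seed=0):
    """Average the minimum and maximum of n uniform draws over many samples."""
    rng = random.Random(seed)   # fixed seed for reproducibility
    total_min = total_max = 0.0
    for _ in range(samples):
        draws = [rng.random() for _ in range(n)]
        total_min += min(draws)
        total_max += max(draws)
    return total_min / samples, total_max / samples

avg_min, avg_max = average_min_max(n=3)
print(avg_min, 1 / 4)   # compare with EX = 1/(n+1)
print(avg_max, 3 / 4)   # compare with EY = n/(n+1)
```

Calling `average_min_max(n=4)` gives the corresponding comparison with 1/5 and 4/5.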

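Exercises

B

1. Evaluate the integral $\displaystyle\int_{-\infty}^{\infty} e^{-x^2}\,dx$ using anything you have learned so far.

2. For σ > 0 and µ > 0, evaluate $\displaystyle\int_{-\infty}^{\infty} \frac{1}{\sigma\sqrt{2\pi}}\,e^{-(x-\mu)^2/2\sigma^2}\,dx$.

3. Show that $EY = \dfrac{n}{n+1}$ in Example 3.18.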

C

4. Write a computer program (in the language of your choice) that verifies the results in Example 3.18 for the case n = 3 by taking large numbers of samples.

5. Repeat Exercise 4 for the case when n = 4.

6. For continuous random variables X, Y with joint p.d.f. $f(x,y)$, define the second moments $E(X^2)$ and $E(Y^2)$ by

$$E(X^2) = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} x^2 f(x,y)\,dx\,dy \quad\text{and}\quad E(Y^2) = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} y^2 f(x,y)\,dx\,dy ,$$

and the variances Var(X) and Var(Y) by

$$\text{Var}(X) = E(X^2) - (EX)^2 \quad\text{and}\quad \text{Var}(Y) = E(Y^2) - (EY)^2 .$$

Find Var(X) and Var(Y) for X and Y as in Example 3.18.

7. Continuing Exercise 6, the correlation ρ between X and Y is defined as

$$\rho = \frac{E(XY) - (EX)(EY)}{\sqrt{\text{Var}(X)\,\text{Var}(Y)}} ,$$

where $E(XY) = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} xy\,f(x,y)\,dx\,dy$. Find ρ for X and Y as in Example 3.18. (Note: The quantity $E(XY) - (EX)(EY)$ is called the covariance of X and Y.)

8. In Example 3.17 would the answer change if the interval (0, 100) is used instead of (0, 1)? Explain.

4 Line and Surface Integrals

4.1 Line Integrals

In single-variable calculus you learned how to integrate a real-valued function $f(x)$ over an interval $[a,b]$ in $\mathbb{R}^1$. This integral (usually called a Riemann integral) can be thought of as an integral over a path in $\mathbb{R}^1$, since an interval (or collection of intervals) is really the only kind of "path" in $\mathbb{R}^1$. You may also recall that if $f(x)$ represented the force applied along the x-axis to an object at position x in $[a,b]$, then the work W done in moving that object from position x = a to x = b was defined as the integral:

$$W = \int_a^b f(x)\,dx$$

In this section, we will see how to define the integral of a function (either real-valued or vector-valued) of two variables over a general path (i.e. a curve) in $\mathbb{R}^2$. This definition will be motivated by the physical notion of work. We will begin with real-valued functions of two variables.

In physics, the intuitive idea of work is that

Work = Force × Distance .

Suppose that we want to find the total amount W of work done in moving an object along a curve C in $\mathbb{R}^2$ with a smooth parametrization $x = x(t)$, $y = y(t)$, $a \le t \le b$, with a force $f(x,y)$ which varies with the position $(x,y)$ of the object and is applied in the direction of motion along C (see Figure 4.1.1 below). We will assume for now that the function $f(x,y)$ is continuous and real-valued, so we only consider the magnitude of the force. Partition the interval $[a,b]$ as follows:

$$a = t_0 < t_1 < t_2 < \cdots < t_{n-1} < t_n = b , \quad\text{for some integer } n \ge 2$$

[Figure 4.1.1 Curve C : x = x(t), y = y(t) for t in [a, b], showing a piece of the curve between t = tᵢ and t = tᵢ₊₁ with Δsᵢ ≈ √(Δxᵢ² + Δyᵢ²)]

As we can see from Figure 4.1.1, over a typical subinterval $[t_i, t_{i+1}]$ the distance $\Delta s_i$ traveled along the curve is approximately $\sqrt{\Delta x_i^2 + \Delta y_i^2}$, by the Pythagorean Theorem. Thus, if the subinterval is small enough then the work done in moving the object along that piece of the curve is approximately

$$\text{Force} \times \text{Distance} \approx f(x_i^*, y_i^*)\,\sqrt{\Delta x_i^2 + \Delta y_i^2} , \tag{4.1}$$

where $(x_i^*, y_i^*) = (x(t_i^*), y(t_i^*))$ for some $t_i^*$ in $[t_i, t_{i+1}]$, and so

$$W \approx \sum_{i=0}^{n-1} f(x_i^*, y_i^*)\,\sqrt{\Delta x_i^2 + \Delta y_i^2} \tag{4.2}$$

is approximately the total amount of work done over the entire curve. But since

$$\sqrt{\Delta x_i^2 + \Delta y_i^2} = \sqrt{\left(\frac{\Delta x_i}{\Delta t_i}\right)^2 + \left(\frac{\Delta y_i}{\Delta t_i}\right)^2}\;\Delta t_i ,$$

where $\Delta t_i = t_{i+1} - t_i$, then

$$W \approx \sum_{i=0}^{n-1} f(x_i^*, y_i^*)\,\sqrt{\left(\frac{\Delta x_i}{\Delta t_i}\right)^2 + \left(\frac{\Delta y_i}{\Delta t_i}\right)^2}\;\Delta t_i . \tag{4.3}$$

Taking the limit of that sum as the length of the largest subinterval goes to 0, the sum over all subintervals becomes the integral from t = a to t = b, $\frac{\Delta x_i}{\Delta t_i}$ and $\frac{\Delta y_i}{\Delta t_i}$ become $x'(t)$ and $y'(t)$, respectively, and $f(x_i^*, y_i^*)$ becomes $f(x(t), y(t))$, so that

$$W = \int_a^b f(x(t), y(t))\,\sqrt{x'(t)^2 + y'(t)^2}\;dt . \tag{4.4}$$

The integral on the right side of the above equation gives us our idea of how to define, for any real-valued function $f(x,y)$, the integral of $f(x,y)$ along the curve C, called a line integral:

Definition 4.1. For a real-valued function $f(x,y)$ and a curve C in $\mathbb{R}^2$, parametrized by $x = x(t)$, $y = y(t)$, $a \le t \le b$, the line integral of $f(x,y)$ along C with respect to arc length s is

$$\int_C f(x,y)\,ds = \int_a^b f(x(t), y(t))\,\sqrt{x'(t)^2 + y'(t)^2}\;dt . \tag{4.5}$$

The symbol ds is the differential of the arc length function

$$s = s(t) = \int_a^t \sqrt{x'(u)^2 + y'(u)^2}\;du , \tag{4.6}$$

which you may recognize from Section 1.9 as the length of the curve C over the interval $[a,t]$, for all t in $[a,b]$. That is,

$$ds = s'(t)\,dt = \sqrt{x'(t)^2 + y'(t)^2}\;dt , \tag{4.7}$$

by the Fundamental Theorem of Calculus.

For a general real-valued function $f(x,y)$, what does the line integral $\int_C f(x,y)\,ds$ represent? The preceding discussion of ds gives us a clue. You can think of differentials as infinitesimal lengths. So if you think of $f(x,y)$ as the height of a picket fence along C, then $f(x,y)\,ds$ can be thought of as approximately the area of a section of that fence over some infinitesimally small section of the curve, and thus the line integral $\int_C f(x,y)\,ds$ is the total area of that picket fence (see Figure 4.1.2).

[Figure 4.1.2 Area of shaded rectangle = height × width ≈ f(x, y) ds]

Example 4.1. Use a line integral to show that the lateral surface area A of a right circular cylinder of radius r and height h is $2\pi rh$.

Solution: We will use the right circular cylinder with base circle C given by $x^2 + y^2 = r^2$ and with height h in the positive z direction (see Figure 4.1.3). Parametrize C as follows:

$$x = x(t) = r\cos t , \quad y = y(t) = r\sin t , \quad 0 \le t \le 2\pi$$

Let $f(x,y) = h$ for all $(x,y)$. Then

$$\begin{aligned}
A = \int_C f(x,y)\,ds &= \int_a^b f(x(t), y(t))\,\sqrt{x'(t)^2 + y'(t)^2}\;dt \\
&= \int_0^{2\pi} h\,\sqrt{(-r\sin t)^2 + (r\cos t)^2}\;dt \\
&= h \int_0^{2\pi} r\,\sqrt{\sin^2 t + \cos^2 t}\;dt \\
&= rh \int_0^{2\pi} 1\,dt = 2\pi rh .
\end{aligned}$$

[Figure 4.1.3 The cylinder of height h = f(x, y) over the circle C : x² + y² = r²]
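Formula (4.5) translates directly into a numerical procedure: sample the parameter t, evaluate the integrand f(x(t), y(t)) times the speed, and sum. The sketch below is a hedged illustration, not part of the text; the helper name `line_integral_ds`, the midpoint rule, and the sample values r = 2 and h = 3 are all assumptions made for the example.

```python
import math

def line_integral_ds(f, x, y, dx, dy, a, b, n=1000):
    """Midpoint rule for the integral over [a, b] of f(x(t), y(t)) * sqrt(x'(t)^2 + y'(t)^2) dt.

    x, y are the parametrization; dx, dy are their derivatives with respect to t.
    """
    step = (b - a) / n
    total = 0.0
    for i in range(n):
        t = a + (i + 0.5) * step
        speed = math.hypot(dx(t), dy(t))   # sqrt(x'(t)^2 + y'(t)^2)
        total += f(x(t), y(t)) * speed * step
    return total

# Example 4.1 with illustrative values r = 2, h = 3.
r, height = 2.0, 3.0
area = line_integral_ds(
    lambda x, y: height,                 # constant "fence height" f(x, y) = h
    lambda t: r * math.cos(t),
    lambda t: r * math.sin(t),
    lambda t: -r * math.sin(t),
    lambda t: r * math.cos(t),
    0.0, 2 * math.pi,
)
print(area, 2 * math.pi * r * height)
```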

Note in Example 4.1 that if we had traversed the circle C twice, i.e. let t vary from 0 to 4π, then we would have gotten an area of $4\pi rh$, i.e. twice the desired area, even though the curve itself is still the same (namely, a circle of radius r). Also, notice that we traversed the circle in the counter-clockwise direction. If we had gone in the clockwise direction, using the parametrization

$$x = x(t) = r\cos(2\pi - t) , \quad y = y(t) = r\sin(2\pi - t) , \quad 0 \le t \le 2\pi , \tag{4.8}$$

then it is easy to verify (see Exercise 12) that the value of the line integral is unchanged.

In general, it can be shown (see Exercise 15) that reversing the direction in which a curve C is traversed leaves $\int_C f(x,y)\,ds$ unchanged, for any $f(x,y)$. If a curve C has a parametrization $x = x(t)$, $y = y(t)$, $a \le t \le b$, then denote by −C the same curve as C but traversed in the opposite direction. Then −C is parametrized by

$$x = x(a + b - t) , \quad y = y(a + b - t) , \quad a \le t \le b , \tag{4.9}$$

and we have

$$\int_C f(x,y)\,ds = \int_{-C} f(x,y)\,ds . \tag{4.10}$$

Notice that our definition of the line integral was with respect to the arc length parameter s. We can also define

$$\int_C f(x,y)\,dx = \int_a^b f(x(t), y(t))\,x'(t)\,dt \tag{4.11}$$

as the line integral of $f(x,y)$ along C with respect to x, and

$$\int_C f(x,y)\,dy = \int_a^b f(x(t), y(t))\,y'(t)\,dt \tag{4.12}$$

as the line integral of $f(x,y)$ along C with respect to y.

In the derivation of the formula for a line integral, we used the idea of work as force multiplied by distance. However, we know that force is actually a vector. So it would be helpful to develop a vector form for a line integral. For this, suppose that we have a function $\mathbf{f}(x,y)$ defined on $\mathbb{R}^2$ by

$$\mathbf{f}(x,y) = P(x,y)\,\mathbf{i} + Q(x,y)\,\mathbf{j}$$

for some continuous real-valued functions $P(x,y)$ and $Q(x,y)$ on $\mathbb{R}^2$. Such a function $\mathbf{f}$ is called a vector field on $\mathbb{R}^2$. It is defined at points in $\mathbb{R}^2$, and its values are vectors in $\mathbb{R}^2$. For a curve C with a smooth parametrization $x = x(t)$, $y = y(t)$, $a \le t \le b$, let

$$\mathbf{r}(t) = x(t)\,\mathbf{i} + y(t)\,\mathbf{j}$$

be the position vector for a point $(x(t), y(t))$ on C. Then $\mathbf{r}'(t) = x'(t)\,\mathbf{i} + y'(t)\,\mathbf{j}$ and so

$$\begin{aligned}
\int_C P(x,y)\,dx + \int_C Q(x,y)\,dy &= \int_a^b P(x(t), y(t))\,x'(t)\,dt + \int_a^b Q(x(t), y(t))\,y'(t)\,dt \\
&= \int_a^b \left( P(x(t), y(t))\,x'(t) + Q(x(t), y(t))\,y'(t) \right) dt \\
&= \int_a^b \mathbf{f}(x(t), y(t)) \cdot \mathbf{r}'(t)\,dt
\end{aligned}$$

by definition of $\mathbf{f}(x,y)$. Notice that the function $\mathbf{f}(x(t), y(t)) \cdot \mathbf{r}'(t)$ is a real-valued function on $[a,b]$, so the last integral on the right looks somewhat similar to our earlier definition of a line integral. This leads us to the following definition:

Definition 4.2. For a vector field $\mathbf{f}(x,y) = P(x,y)\,\mathbf{i} + Q(x,y)\,\mathbf{j}$ and a curve C with a smooth parametrization $x = x(t)$, $y = y(t)$, $a \le t \le b$, the line integral of $\mathbf{f}$ along C is

$$\int_C \mathbf{f} \cdot d\mathbf{r} = \int_C P(x,y)\,dx + \int_C Q(x,y)\,dy \tag{4.13}$$

$$\phantom{\int_C \mathbf{f} \cdot d\mathbf{r}} = \int_a^b \mathbf{f}(x(t), y(t)) \cdot \mathbf{r}'(t)\,dt , \tag{4.14}$$

where $\mathbf{r}(t) = x(t)\,\mathbf{i} + y(t)\,\mathbf{j}$ is the position vector for points on C.

We use the notation $d\mathbf{r} = \mathbf{r}'(t)\,dt = dx\,\mathbf{i} + dy\,\mathbf{j}$ to denote the differential of the vector-valued function $\mathbf{r}$. The line integral in Definition 4.2 is often called a line integral of a vector field to distinguish it from the line integral in Definition 4.1 which is called a line integral of a scalar field. For convenience we will often write

$$\int_C P(x,y)\,dx + \int_C Q(x,y)\,dy = \int_C P(x,y)\,dx + Q(x,y)\,dy ,$$

where it is understood that the line integral along C is being applied to both P and Q. The quantity $P(x,y)\,dx + Q(x,y)\,dy$ is known as a differential form. For a real-valued function $F(x,y)$, the differential of F is $dF = \frac{\partial F}{\partial x}\,dx + \frac{\partial F}{\partial y}\,dy$. A differential form $P(x,y)\,dx + Q(x,y)\,dy$ is called exact if it equals dF for some function $F(x,y)$.

Recall that if the points on a curve C have position vector $\mathbf{r}(t) = x(t)\,\mathbf{i} + y(t)\,\mathbf{j}$, then $\mathbf{r}'(t)$ is a tangent vector to C at the point $(x(t), y(t))$ in the direction of increasing t (which we call the direction of C). Since C is a smooth curve, then $\mathbf{r}'(t) \ne \mathbf{0}$ on $[a,b]$ and hence

$$\mathbf{T}(t) = \frac{\mathbf{r}'(t)}{\|\mathbf{r}'(t)\|}$$

is the unit tangent vector to C at $(x(t), y(t))$. Putting Definitions 4.1 and 4.2 together we get the following theorem:

Theorem 4.1. For a vector field $\mathbf{f}(x,y) = P(x,y)\,\mathbf{i} + Q(x,y)\,\mathbf{j}$ and a curve $C$ with a smooth parametrization $x = x(t)$, $y = y(t)$, $a \le t \le b$ and position vector $\mathbf{r}(t) = x(t)\,\mathbf{i} + y(t)\,\mathbf{j}$,

\[ \int_C \mathbf{f} \cdot d\mathbf{r} = \int_C \mathbf{f} \cdot \mathbf{T}\,ds , \tag{4.15} \]

where $\mathbf{T}(t) = \mathbf{r}'(t) / \|\mathbf{r}'(t)\|$ is the unit tangent vector to $C$ at $(x(t), y(t))$.

If the vector field $\mathbf{f}(x,y)$ represents the force moving an object along a curve $C$, then the work $W$ done by this force is

\[ W = \int_C \mathbf{f} \cdot \mathbf{T}\,ds = \int_C \mathbf{f} \cdot d\mathbf{r} . \tag{4.16} \]

Example 4.2. Evaluate $\int_C (x^2 + y^2)\,dx + 2xy\,dy$, where:

(a) $C : x = t$, $y = 2t$, $0 \le t \le 1$
(b) $C : x = t$, $y = 2t^2$, $0 \le t \le 1$

Solution: Figure 4.1.4 shows both curves.

[Figure 4.1.4: the two curves from (0, 0) to (1, 2)]

(a) Since $x'(t) = 1$ and $y'(t) = 2$, then

\[
\begin{aligned}
\int_C (x^2 + y^2)\,dx + 2xy\,dy
&= \int_0^1 \bigl( (x(t)^2 + y(t)^2)\,x'(t) + 2x(t)\,y(t)\,y'(t) \bigr)\,dt \\
&= \int_0^1 \bigl( (t^2 + 4t^2)(1) + 2t(2t)(2) \bigr)\,dt \\
&= \int_0^1 13t^2\,dt \\
&= \frac{13t^3}{3}\,\Big|_0^1 = \frac{13}{3}
\end{aligned}
\]

(b) Since $x'(t) = 1$ and $y'(t) = 4t$, then

\[
\begin{aligned}
\int_C (x^2 + y^2)\,dx + 2xy\,dy
&= \int_0^1 \bigl( (x(t)^2 + y(t)^2)\,x'(t) + 2x(t)\,y(t)\,y'(t) \bigr)\,dt \\
&= \int_0^1 \bigl( (t^2 + 4t^4)(1) + 2t(2t^2)(4t) \bigr)\,dt \\
&= \int_0^1 (t^2 + 20t^4)\,dt \\
&= \frac{t^3}{3} + 4t^5\,\Big|_0^1 = \frac{1}{3} + 4 = \frac{13}{3}
\end{aligned}
\]
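The two computations in Example 4.2 can be checked numerically. The following is a minimal sketch, not part of the original text: it approximates formula (4.14) with composite Simpson's rule, and the helper names `simpson` and `line_integral` are hypothetical names chosen for this illustration.

```python
def simpson(g, a, b, n=200):
    """Composite Simpson's rule for the integral of g over [a, b]; n must be even."""
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3


def line_integral(P, Q, x, y, dx, dy, a, b):
    """Approximate the integral over C of P dx + Q dy using formula (4.14)."""
    def integrand(t):
        return P(x(t), y(t)) * dx(t) + Q(x(t), y(t)) * dy(t)
    return simpson(integrand, a, b)


# The field of Example 4.2: P = x^2 + y^2, Q = 2xy.
P = lambda x, y: x**2 + y**2
Q = lambda x, y: 2 * x * y

# (a) C: x = t, y = 2t, 0 <= t <= 1
value_a = line_integral(P, Q, lambda t: t, lambda t: 2 * t,
                        lambda t: 1.0, lambda t: 2.0, 0.0, 1.0)
# (b) C: x = t, y = 2t^2, 0 <= t <= 1
value_b = line_integral(P, Q, lambda t: t, lambda t: 2 * t**2,
                        lambda t: 1.0, lambda t: 4 * t, 0.0, 1.0)

print(value_a, value_b)  # both should be close to 13/3
```

The derivatives $x'(t)$ and $y'(t)$ are passed in explicitly, mirroring the hand computation above rather than differentiating numerically.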

So in both cases, if the vector field $\mathbf{f}(x,y) = (x^2 + y^2)\,\mathbf{i} + 2xy\,\mathbf{j}$ represents the force moving an object from $(0,0)$ to $(1,2)$ along the given curve $C$, then the work done is $\frac{13}{3}$. This may lead you to think that work (and more generally, the line integral of a vector field) is independent of the path taken. However, as we will see in the next section, this is not always the case.

Although we defined line integrals over a single smooth curve, if $C$ is a piecewise smooth curve, that is, $C = C_1 \cup C_2 \cup \dots \cup C_n$ is the union of smooth curves $C_1, \dots, C_n$, then we can define

\[ \int_C \mathbf{f} \cdot d\mathbf{r} = \int_{C_1} \mathbf{f} \cdot d\mathbf{r}_1 + \int_{C_2} \mathbf{f} \cdot d\mathbf{r}_2 + \dots + \int_{C_n} \mathbf{f} \cdot d\mathbf{r}_n , \]

where each $\mathbf{r}_i$ is the position vector of the curve $C_i$.

Example 4.3. Evaluate $\int_C (x^2 + y^2)\,dx + 2xy\,dy$, where $C$ is the polygonal path from $(0,0)$ to $(0,2)$ to $(1,2)$.

Solution: Write $C = C_1 \cup C_2$, where $C_1$ is the curve given by $x = 0$, $y = t$, $0 \le t \le 2$ and $C_2$ is the curve given by $x = t$, $y = 2$, $0 \le t \le 1$ (see Figure 4.1.5). Then

[Figure 4.1.5: the polygonal path $C_1 \cup C_2$ from (0, 0) to (0, 2) to (1, 2)]

\[
\begin{aligned}
\int_C (x^2 + y^2)\,dx + 2xy\,dy
&= \int_{C_1} (x^2 + y^2)\,dx + 2xy\,dy + \int_{C_2} (x^2 + y^2)\,dx + 2xy\,dy \\
&= \int_0^2 \bigl( (0^2 + t^2)(0) + 2(0)t(1) \bigr)\,dt + \int_0^1 \bigl( (t^2 + 4)(1) + 2t(2)(0) \bigr)\,dt \\
&= \int_0^2 0\,dt + \int_0^1 (t^2 + 4)\,dt \\
&= \frac{t^3}{3} + 4t\,\Big|_0^1 = \frac{1}{3} + 4 = \frac{13}{3}
\end{aligned}
\]

Line integral notation varies quite a bit. For example, in physics it is common to see the notation $\int_a^b \mathbf{f} \cdot d\mathbf{l}$, where it is understood that the limits of integration $a$ and $b$ are for the underlying parameter $t$ of the curve, and the letter $\mathbf{l}$ signifies length. Also, the formulation $\int_C \mathbf{f} \cdot \mathbf{T}\,ds$ from Theorem 4.1 is often preferred in physics since it emphasizes the idea of integrating the tangential component $\mathbf{f} \cdot \mathbf{T}$ of $\mathbf{f}$ in the direction of $\mathbf{T}$ (i.e. in the direction of $C$), which is a useful physical interpretation of line integrals.

Exercises

A

For Exercises 1-4, calculate $\int_C f(x,y)\,ds$ for the given function $f(x,y)$ and curve $C$.

1. $f(x,y) = xy$; $C : x = \cos t$, $y = \sin t$, $0 \le t \le \pi/2$
2. $f(x,y) = \dfrac{x}{x^2 + 1}$; $C : x = t$, $y = 0$, $0 \le t \le 1$
3. $f(x,y) = 2x + y$; $C$ : polygonal path from $(0,0)$ to $(3,0)$ to $(3,2)$
4. $f(x,y) = x + y^2$; $C$ : path from $(2,0)$ counterclockwise along the circle $x^2 + y^2 = 4$ to the point $(-2,0)$ and then back to $(2,0)$ along the $x$-axis
5. Use a line integral to find the lateral surface area of the part of the cylinder $x^2 + y^2 = 4$ below the plane $x + 2y + z = 6$ and above the $xy$-plane.

For Exercises 6-11, calculate $\int_C \mathbf{f} \cdot d\mathbf{r}$ for the given vector field $\mathbf{f}(x,y)$ and curve $C$.

6. $\mathbf{f}(x,y) = \mathbf{i} - \mathbf{j}$; $C : x = 3t$, $y = 2t$, $0 \le t \le 1$
7. $\mathbf{f}(x,y) = y\,\mathbf{i} - x\,\mathbf{j}$; $C : x = \cos t$, $y = \sin t$, $0 \le t \le 2\pi$
8. $\mathbf{f}(x,y) = x\,\mathbf{i} + y\,\mathbf{j}$; $C : x = \cos t$, $y = \sin t$, $0 \le t \le 2\pi$
9. $\mathbf{f}(x,y) = (x^2 - y)\,\mathbf{i} + (x - y^2)\,\mathbf{j}$; $C : x = \cos t$, $y = \sin t$, $0 \le t \le 2\pi$
10. $\mathbf{f}(x,y) = xy^2\,\mathbf{i} + xy^3\,\mathbf{j}$; $C$ : the polygonal path from $(0,0)$ to $(1,0)$ to $(0,1)$ to $(0,0)$
11. $\mathbf{f}(x,y) = (x^2 + y^2)\,\mathbf{i}$; $C : x = 2 + \cos t$, $y = \sin t$, $0 \le t \le 2\pi$

B

12. Verify that the value of the line integral in Example 4.1 is unchanged when using the parametrization of the circle $C$ given in formulas (4.8).
13. Show that if $\mathbf{f} \perp \mathbf{r}'(t)$ at each point $\mathbf{r}(t)$ along a smooth curve $C$, then $\int_C \mathbf{f} \cdot d\mathbf{r} = 0$.
14. Show that if $\mathbf{f}$ points in the same direction as $\mathbf{r}'(t)$ at each point $\mathbf{r}(t)$ along a smooth curve $C$, then $\int_C \mathbf{f} \cdot d\mathbf{r} = \int_C \|\mathbf{f}\|\,ds$.

C

15. Prove that $\int_C f(x,y)\,ds = \int_{-C} f(x,y)\,ds$. (Hint: Use formulas (4.9).)
16. Let $C$ be a smooth curve with arc length $L$, and suppose that $\mathbf{f}(x,y) = P(x,y)\,\mathbf{i} + Q(x,y)\,\mathbf{j}$ is a vector field such that $\|\mathbf{f}(x,y)\| \le M$ for all $(x,y)$ on $C$. Show that $\left| \int_C \mathbf{f} \cdot d\mathbf{r} \right| \le ML$. (Hint: Recall that $\left| \int_a^b g(x)\,dx \right| \le \int_a^b |g(x)|\,dx$ for Riemann integrals.)
17. Prove that the Riemann integral $\int_a^b f(x)\,dx$ is a special case of a line integral.

4.2 Properties of Line Integrals

We know from the previous section that for line integrals of real-valued functions (scalar fields), reversing the direction in which the integral is taken along a curve does not change the value of the line integral:

\[ \int_C f(x,y)\,ds = \int_{-C} f(x,y)\,ds \tag{4.17} \]

For line integrals of vector fields, however, the value does change. To see this, let $\mathbf{f}(x,y) = P(x,y)\,\mathbf{i} + Q(x,y)\,\mathbf{j}$ be a vector field, with $P$ and $Q$ continuously differentiable functions. Let $C$ be a smooth curve parametrized by $x = x(t)$, $y = y(t)$, $a \le t \le b$, with position vector $\mathbf{r}(t) = x(t)\,\mathbf{i} + y(t)\,\mathbf{j}$ (we will usually abbreviate this by saying that $C : \mathbf{r}(t) = x(t)\,\mathbf{i} + y(t)\,\mathbf{j}$ is a smooth curve). We know that the curve $-C$ traversed in the opposite direction is parametrized by $x = x(a + b - t)$, $y = y(a + b - t)$, $a \le t \le b$. Then

\[
\begin{aligned}
\int_{-C} P(x,y)\,dx
&= \int_a^b P(x(a+b-t), y(a+b-t))\,\frac{d}{dt}\bigl(x(a+b-t)\bigr)\,dt \\
&= \int_a^b P(x(a+b-t), y(a+b-t))\,(-x'(a+b-t))\,dt \quad \text{(by the Chain Rule)} \\
&= \int_b^a P(x(u), y(u))\,(-x'(u))\,(-du) \quad \text{(by letting } u = a + b - t\text{)} \\
&= \int_b^a P(x(u), y(u))\,x'(u)\,du \\
&= -\int_a^b P(x(u), y(u))\,x'(u)\,du , \quad \text{since } \int_b^a = -\int_a^b ,
\end{aligned}
\]

so

\[ \int_{-C} P(x,y)\,dx = -\int_C P(x,y)\,dx , \]

since we are just using a different letter ($u$) for the line integral along $C$. A similar argument shows that

\[ \int_{-C} Q(x,y)\,dy = -\int_C Q(x,y)\,dy , \]

and hence

\[
\begin{aligned}
\int_{-C} \mathbf{f} \cdot d\mathbf{r}
&= \int_{-C} P(x,y)\,dx + \int_{-C} Q(x,y)\,dy \\
&= -\int_C P(x,y)\,dx - \int_C Q(x,y)\,dy \\
&= -\left( \int_C P(x,y)\,dx + \int_C Q(x,y)\,dy \right) ,
\end{aligned}
\]

\[ \int_{-C} \mathbf{f} \cdot d\mathbf{r} = -\int_C \mathbf{f} \cdot d\mathbf{r} . \tag{4.18} \]
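Formulas (4.17) and (4.18) can be illustrated numerically. The sketch below is an addition for illustration only. It takes the curve $x = t$, $y = 2t^2$ from Example 4.2(b), builds the reversed parametrization $x(a+b-t)$, $y(a+b-t)$, and compares both kinds of line integral. The scalar field $x + y^2$ and the helper name `simpson` are assumptions made for this example.

```python
import math


def simpson(g, a, b, n=400):
    """Composite Simpson's rule for the integral of g over [a, b]; n must be even."""
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3


a, b = 0.0, 1.0

# C: x = t, y = 2t^2, with derivatives x'(t) = 1, y'(t) = 4t.
x, y = (lambda t: t), (lambda t: 2 * t**2)
dx, dy = (lambda t: 1.0), (lambda t: 4 * t)

# -C: x(a + b - t), y(a + b - t); the Chain Rule flips the sign of the derivatives.
xr, yr = (lambda t: x(a + b - t)), (lambda t: y(a + b - t))
dxr, dyr = (lambda t: -dx(a + b - t)), (lambda t: -dy(a + b - t))


def vector_integral(x, y, dx, dy):
    """Integral of f . dr for f = (x^2 + y^2) i + 2xy j."""
    return simpson(lambda t: (x(t)**2 + y(t)**2) * dx(t)
                   + 2 * x(t) * y(t) * dy(t), a, b)


def scalar_integral(x, y, dx, dy):
    """Integral of the (illustrative) scalar field x + y^2 with respect to arc length."""
    return simpson(lambda t: (x(t) + y(t)**2) * math.hypot(dx(t), dy(t)), a, b)


forward_vec = vector_integral(x, y, dx, dy)
backward_vec = vector_integral(xr, yr, dxr, dyr)
forward_sc = scalar_integral(x, y, dx, dy)
backward_sc = scalar_integral(xr, yr, dxr, dyr)

print(forward_vec, backward_vec)  # opposite signs, as in (4.18)
print(forward_sc, backward_sc)    # equal, as in (4.17)
```

The arc length factor `math.hypot(dx, dy)` is unaffected by the sign flip, which is why the scalar integral does not change.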

Formula (4.18) can be interpreted in terms of the work done by a force $\mathbf{f}(x,y)$ (treated as a vector) moving an object along a curve $C$: the total work performed moving the object along $C$ from its initial point to its terminal point, and then back to the initial point moving backwards along the same path, is zero. This is because when force is considered as a vector, direction is accounted for.

The preceding discussion shows the importance of always taking the direction of the curve into account when using line integrals of vector fields. For this reason, the curves in line integrals are sometimes referred to as directed curves or oriented curves.

Recall that our definition of a line integral required that we have a parametrization $x = x(t)$, $y = y(t)$, $a \le t \le b$ for the curve $C$. But as we know, any curve has infinitely many parametrizations. So could we get a different value for a line integral using some other parametrization of $C$, say, $x = \tilde{x}(u)$, $y = \tilde{y}(u)$, $c \le u \le d$? If so, this would mean that our definition is not well-defined. Luckily, it turns out that the value of a line integral of a vector field is unchanged as long as the direction of the curve $C$ is preserved by whatever parametrization is chosen:

Theorem 4.2. Let $\mathbf{f}(x,y) = P(x,y)\,\mathbf{i} + Q(x,y)\,\mathbf{j}$ be a vector field, and let $C$ be a smooth curve parametrized by $x = x(t)$, $y = y(t)$, $a \le t \le b$. Suppose that $t = \alpha(u)$ for $c \le u \le d$, such that $a = \alpha(c)$, $b = \alpha(d)$, and $\alpha'(u) > 0$ on the open interval $(c, d)$ (i.e. $\alpha(u)$ is strictly increasing on $[c, d]$). Then $\int_C \mathbf{f} \cdot d\mathbf{r}$ has the same value for the parametrizations $x = x(t)$, $y = y(t)$, $a \le t \le b$ and $x = \tilde{x}(u) = x(\alpha(u))$, $y = \tilde{y}(u) = y(\alpha(u))$, $c \le u \le d$.

Proof: Since $\alpha(u)$ is strictly increasing and maps $[c, d]$ onto $[a, b]$, we know that $t = \alpha(u)$ has an inverse function $u = \alpha^{-1}(t)$ defined on $[a, b]$ such that $c = \alpha^{-1}(a)$, $d = \alpha^{-1}(b)$, and $\frac{du}{dt} = \frac{1}{\alpha'(u)}$. Also, $dt = \alpha'(u)\,du$, and by the Chain Rule

\[ \tilde{x}'(u) = \frac{d\tilde{x}}{du} = \frac{d}{du}\bigl(x(\alpha(u))\bigr) = \frac{dx}{dt}\,\frac{dt}{du} = x'(t)\,\alpha'(u) \quad\Rightarrow\quad x'(t) = \frac{\tilde{x}'(u)}{\alpha'(u)} , \]

so making the substitution $t = \alpha(u)$ gives

\[
\begin{aligned}
\int_a^b P(x(t), y(t))\,x'(t)\,dt
&= \int_{\alpha^{-1}(a)}^{\alpha^{-1}(b)} P(x(\alpha(u)), y(\alpha(u)))\,\frac{\tilde{x}'(u)}{\alpha'(u)}\,(\alpha'(u)\,du) \\
&= \int_c^d P(\tilde{x}(u), \tilde{y}(u))\,\tilde{x}'(u)\,du ,
\end{aligned}
\]

which shows that $\int_C P(x,y)\,dx$ has the same value for both parametrizations. A similar argument shows that $\int_C Q(x,y)\,dy$ has the same value for both parametrizations, and hence $\int_C \mathbf{f} \cdot d\mathbf{r}$ has the same value. QED

Notice that the condition $\alpha'(u) > 0$ in Theorem 4.2 means that the two parametrizations move along $C$ in the same direction. That was not the case with the "reverse" parametrization for $-C$: for $u = a + b - t$ we have $t = \alpha(u) = a + b - u \Rightarrow \alpha'(u) = -1 < 0$.

Example 4.4. Evaluate the line integral $\int_C (x^2 + y^2)\,dx + 2xy\,dy$ from Example 4.2, Section 4.1, along the curve $C : x = t$, $y = 2t^2$, $0 \le t \le 1$, where $t = \sin u$ for $0 \le u \le \pi/2$.

Solution: First, we notice that $0 = \sin 0$, $1 = \sin(\pi/2)$, and $\frac{dt}{du} = \cos u > 0$ on $(0, \pi/2)$. So by Theorem 4.2 we know that if $C$ is parametrized by

\[ x = \sin u , \quad y = 2\sin^2 u , \quad 0 \le u \le \pi/2 , \]

then $\int_C (x^2 + y^2)\,dx + 2xy\,dy$ should have the same value as we found in Example 4.2, namely $\frac{13}{3}$. And we can indeed verify this:

\[
\begin{aligned}
\int_C (x^2 + y^2)\,dx + 2xy\,dy
&= \int_0^{\pi/2} \bigl( (\sin^2 u + (2\sin^2 u)^2)\cos u + 2(\sin u)(2\sin^2 u)\,4\sin u \cos u \bigr)\,du \\
&= \int_0^{\pi/2} \bigl( \sin^2 u + 20\sin^4 u \bigr)\cos u\,du \\
&= \frac{\sin^3 u}{3} + 4\sin^5 u\,\Big|_0^{\pi/2} = \frac{1}{3} + 4 = \frac{13}{3}
\end{aligned}
\]

In other words, the line integral is unchanged whether $t$ or $u$ is the parameter for $C$.

By a closed curve, we mean a curve $C$ whose initial point and terminal point are the same, i.e. for $C : x = x(t)$, $y = y(t)$, $a \le t \le b$, we have $(x(a), y(a)) = (x(b), y(b))$.

[Figure 4.2.1: Closed vs nonclosed curves. (a) Closed; (b) Not closed]

A simple closed curve is a closed curve which does not intersect itself. Note that any closed curve can be regarded as a union of simple closed curves (think of the loops in a figure eight). We use the special notation

\[ \oint_C f(x,y)\,ds \quad \text{and} \quad \oint_C \mathbf{f} \cdot d\mathbf{r} \]

to denote line integrals of scalar and vector fields, respectively, along closed curves. In some older texts you may see the notation $\ointctrclockwise$ or $\ointclockwise$ to indicate a line integral traversing a closed curve in a counterclockwise or clockwise direction, respectively.

So far, the examples we have seen of line integrals (e.g. Example 4.2) have had the same value for different curves joining the initial point to the terminal point. That is, the line integral has been independent of the path joining the two points. As we mentioned before, this is not always the case. The following theorem gives a necessary and sufficient condition for this path independence:

Theorem 4.3. In a region $R$, the line integral $\int_C \mathbf{f} \cdot d\mathbf{r}$ is independent of the path between any two points in $R$ if and only if $\oint_C \mathbf{f} \cdot d\mathbf{r} = 0$ for every closed curve $C$ which is contained in $R$.

Proof: Suppose that $\oint_C \mathbf{f} \cdot d\mathbf{r} = 0$ for every closed curve $C$ which is contained in $R$. Let $P_1$ and $P_2$ be two distinct points in $R$. Let $C_1$ be a curve in $R$ going from $P_1$ to $P_2$, and let $C_2$ be another curve in $R$ going from $P_1$ to $P_2$, as in Figure 4.2.2.

[Figure 4.2.2: two curves $C_1$ and $C_2$ from $P_1$ to $P_2$]

Then $C = C_1 \cup -C_2$ is a closed curve in $R$ (from $P_1$ to $P_1$), and so $\oint_C \mathbf{f} \cdot d\mathbf{r} = 0$. Thus,

\[
\begin{aligned}
0 &= \oint_C \mathbf{f} \cdot d\mathbf{r} \\
&= \int_{C_1} \mathbf{f} \cdot d\mathbf{r} + \int_{-C_2} \mathbf{f} \cdot d\mathbf{r} \\
&= \int_{C_1} \mathbf{f} \cdot d\mathbf{r} - \int_{C_2} \mathbf{f} \cdot d\mathbf{r} ,
\end{aligned}
\]

and so $\int_{C_1} \mathbf{f} \cdot d\mathbf{r} = \int_{C_2} \mathbf{f} \cdot d\mathbf{r}$. This proves path independence.

Conversely, suppose that the line integral $\int_C \mathbf{f} \cdot d\mathbf{r}$ is independent of the path between any two points in $R$. Let $C$ be a closed curve contained in $R$. Let $P_1$ and $P_2$ be two distinct points on $C$. Let $C_1$ be a part of the curve $C$ that goes from $P_1$ to $P_2$, and let $C_2$ be the remaining part of $C$ that goes from $P_1$ to $P_2$, again as in Figure 4.2.2. Then by path independence we have

\[
\begin{aligned}
\int_{C_1} \mathbf{f} \cdot d\mathbf{r} &= \int_{C_2} \mathbf{f} \cdot d\mathbf{r} \\
\int_{C_1} \mathbf{f} \cdot d\mathbf{r} - \int_{C_2} \mathbf{f} \cdot d\mathbf{r} &= 0 \\
\int_{C_1} \mathbf{f} \cdot d\mathbf{r} + \int_{-C_2} \mathbf{f} \cdot d\mathbf{r} &= 0 , \quad \text{so} \\
\oint_C \mathbf{f} \cdot d\mathbf{r} &= 0
\end{aligned}
\]

since $C = C_1 \cup -C_2$. QED

Clearly, the above theorem does not give a practical way to determine path independence, since it is impossible to check the line integrals around all possible closed curves in a region. What it mostly does is give an idea of the way in which line integrals behave, and how seemingly unrelated line integrals can be related (in this case, a specific line integral between two points and all line integrals around closed curves).

For a more practical method for determining path independence, we first need a version of the Chain Rule for multivariable functions:

Theorem 4.4. (Chain Rule) If $z = f(x,y)$ is a continuously differentiable function of $x$ and $y$, and both $x = x(t)$ and $y = y(t)$ are differentiable functions of $t$, then $z$ is a differentiable function of $t$, and

\[ \frac{dz}{dt} = \frac{\partial z}{\partial x}\,\frac{dx}{dt} + \frac{\partial z}{\partial y}\,\frac{dy}{dt} \tag{4.19} \]

at all points where the derivatives on the right are defined.

The proof is virtually identical to the proof of Theorem 2.2 from Section 2.4 (which uses the Mean Value Theorem), so we omit it.[1] We will now use this Chain Rule to prove the following sufficient condition for path independence of line integrals:

Theorem 4.5. Let $\mathbf{f}(x,y) = P(x,y)\,\mathbf{i} + Q(x,y)\,\mathbf{j}$ be a vector field in some region $R$, with $P$ and $Q$ continuously differentiable functions on $R$. Let $C$ be a smooth curve in $R$ parametrized by $x = x(t)$, $y = y(t)$, $a \le t \le b$. Suppose that there is a real-valued function $F(x,y)$ such that $\nabla F = \mathbf{f}$ on $R$. Then

\[ \int_C \mathbf{f} \cdot d\mathbf{r} = F(B) - F(A) , \tag{4.20} \]

where $A = (x(a), y(a))$ and $B = (x(b), y(b))$ are the endpoints of $C$. Thus, the line integral is independent of the path between its endpoints, since it depends only on the values of $F$ at those endpoints.

Proof: By definition of $\int_C \mathbf{f} \cdot d\mathbf{r}$, we have

\[
\begin{aligned}
\int_C \mathbf{f} \cdot d\mathbf{r}
&= \int_a^b \bigl( P(x(t), y(t))\,x'(t) + Q(x(t), y(t))\,y'(t) \bigr)\,dt \\
&= \int_a^b \left( \frac{\partial F}{\partial x}\,\frac{dx}{dt} + \frac{\partial F}{\partial y}\,\frac{dy}{dt} \right) dt \quad \text{(since } \nabla F = \mathbf{f} \Rightarrow \tfrac{\partial F}{\partial x} = P \text{ and } \tfrac{\partial F}{\partial y} = Q\text{)} \\
&= \int_a^b F'(x(t), y(t))\,dt \quad \text{(by the Chain Rule in Theorem 4.4)} \\
&= F(x(t), y(t))\,\Big|_a^b = F(B) - F(A)
\end{aligned}
\]

by the Fundamental Theorem of Calculus. QED

[1] See TAYLOR and MANN, § 6.5.
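As a numerical sanity check of Theorem 4.5, the sketch below (an illustration added here, not part of the original text) takes $F(x,y) = \frac{1}{3}x^3 + xy^2$, whose gradient is the field $(x^2 + y^2)\,\mathbf{i} + 2xy\,\mathbf{j}$ from Example 4.2, and integrates that field along two curves from $A = (0,0)$ to $B = (1,2)$. Both curves are arbitrary choices made for this example, and `simpson` and `line_integral` are hypothetical helper names.

```python
import math


def simpson(g, a, b, n=200):
    """Composite Simpson's rule for the integral of g over [a, b]; n must be even."""
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3


F = lambda x, y: x**3 / 3 + x * y**2   # candidate potential
P = lambda x, y: x**2 + y**2           # dF/dx
Q = lambda x, y: 2 * x * y             # dF/dy


def line_integral(x, y, dx, dy, a=0.0, b=1.0):
    """Integral of P dx + Q dy along x = x(t), y = y(t), a <= t <= b."""
    return simpson(lambda t: P(x(t), y(t)) * dx(t) + Q(x(t), y(t)) * dy(t), a, b)


# Curve 1 (illustrative): x = t, y = 2t^3.
cubic = line_integral(lambda t: t, lambda t: 2 * t**3,
                      lambda t: 1.0, lambda t: 6 * t**2)

# Curve 2 (illustrative): x = sin(pi t / 2), y = 2t^2.
wavy = line_integral(lambda t: math.sin(math.pi * t / 2), lambda t: 2 * t**2,
                     lambda t: (math.pi / 2) * math.cos(math.pi * t / 2),
                     lambda t: 4 * t)

endpoint_difference = F(1.0, 2.0) - F(0.0, 0.0)   # F(B) - F(A)

print(cubic, wavy, endpoint_difference)  # all three should agree closely
```

Agreement between the three numbers is what formula (4.20) predicts; it does not by itself prove path independence, which is the content of the theorem.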

Theorem 4.5 can be thought of as the line integral version of the Fundamental Theorem of Calculus. A real-valued function $F(x,y)$ such that $\nabla F(x,y) = \mathbf{f}(x,y)$ is called a potential for $\mathbf{f}$. A conservative vector field is one which has a potential.

Example 4.5. Recall from Examples 4.2 and 4.3 in Section 4.1 that the line integral $\int_C (x^2 + y^2)\,dx + 2xy\,dy$ was found to have the value $\frac{13}{3}$ for three different curves $C$ going from the point $(0,0)$ to the point $(1,2)$. Use Theorem 4.5 to show that this line integral is indeed path independent.

Solution: We need to find a real-valued function $F(x,y)$ such that

\[ \frac{\partial F}{\partial x} = x^2 + y^2 \quad \text{and} \quad \frac{\partial F}{\partial y} = 2xy . \]

Suppose that $\frac{\partial F}{\partial x} = x^2 + y^2$. Then we must have $F(x,y) = \frac{1}{3}x^3 + xy^2 + g(y)$ for some function $g(y)$. So $\frac{\partial F}{\partial y} = 2xy + g'(y)$ satisfies the condition $\frac{\partial F}{\partial y} = 2xy$ if $g'(y) = 0$, i.e. $g(y) = K$, where $K$ is a constant. Since any choice for $K$ will do (why?), we pick $K = 0$. Thus, a potential $F(x,y)$ for $\mathbf{f}(x,y) = (x^2 + y^2)\,\mathbf{i} + 2xy\,\mathbf{j}$ exists, namely

\[ F(x,y) = \frac{1}{3}x^3 + xy^2 . \]

Hence the line integral $\int_C (x^2 + y^2)\,dx + 2xy\,dy$ is path independent.

Note that we can also verify that the value of the line integral of $\mathbf{f}$ along any curve $C$ going from $(0,0)$ to $(1,2)$ will always be $\frac{13}{3}$, since by Theorem 4.5

\[ \int_C \mathbf{f} \cdot d\mathbf{r} = F(1,2) - F(0,0) = \frac{1}{3}(1)^3 + (1)(2)^2 - (0 + 0) = \frac{1}{3} + 4 = \frac{13}{3} . \]

A consequence of Theorem 4.5 in the special case where $C$ is a closed curve, so that the endpoints $A$ and $B$ are the same point, is the following important corollary:

Corollary 4.6. If a vector field $\mathbf{f}$ has a potential in a region $R$, then $\oint_C \mathbf{f} \cdot d\mathbf{r} = 0$ for any closed curve $C$ in $R$ (i.e. $\oint_C \nabla F \cdot d\mathbf{r} = 0$ for any real-valued function $F(x,y)$).

Example 4.6. Evaluate $\oint_C x\,dx + y\,dy$ for $C : x = 2\cos t$, $y = 3\sin t$, $0 \le t \le 2\pi$.

Solution: The vector field $\mathbf{f}(x,y) = x\,\mathbf{i} + y\,\mathbf{j}$ has a potential $F(x,y)$:

\[
\begin{aligned}
\frac{\partial F}{\partial x} = x \;&\Rightarrow\; F(x,y) = \frac{1}{2}x^2 + g(y) , \text{ so} \\
\frac{\partial F}{\partial y} = y \;&\Rightarrow\; g'(y) = y \;\Rightarrow\; g(y) = \frac{1}{2}y^2 + K
\end{aligned}
\]

so F ( x. y)? If so. y) = x dx + y d y = C f · dr = 0 x2 4 by Corollary 4.6. You may assume that F would be smooth. y).) . y) be vector ﬁelds. ( x2 + y2 ) dx + 2 x y d y for C : x = cos t. 2 2 C 149 for any constant K . Is there a potential F ( x. y) and g( x. y = sin t. ﬁnd one. y = sin t. y) = h( y) i + g( x) j. y) and g( x. y) = −y x2 + y2 (a) Show that f = ∇F . since the curve C is closed (it is the ellipse + y2 9 = 1). Evaluate C ( x2 + y2 ) dx + 2 x y d y for C : x = cos t. for F ( x. Is there a potential F ( x. Does this contradict Corollary 4. and let f( x. Show that f ∇ g · dr = − g ∇ f · dr C C for any closed curve C in R . 2. y) be continuously differentiable real-valued functions in a region R . Let f( x. y) = x y2 i + x3 y j? If so.) 9. (b) Show that C x i + x2 + y2 j for all ( x. Evaluate C 3. 0 ≤ t ≤ 2π.6? Explain. 0 ≤ t ≤ π. and let C be a curve in R2 . Thus. Show that C 1 ds = L. y) for f( x. ﬁnd one. Let C be a curve whose arc length is L.4. y) = x i − y j? If so. y = sin t. Let f ( x. y) = (0. y) for f( x. B 6. Let g( x) and h( y) be differentiable functions. y) for f( x. Exercises A 1. f · d r = 2π. y) = y i − x j? If so. (Hint: Consider the mixed partial derivatives of F .2 Properties of Line Integrals 1 2 1 2 x + y is a potential for f( x. (Hint: Use Exercise 21 in Section 2. and C : x = cos t. let a and b be constants. 4. 5. C C C 7. ﬁnd it. Let f( x. Can f have a potential F ( x. ﬁnd one. C 10. y) = tan−1 ( y/ x). 0 ≤ t ≤ 2π. 0).4. Show that (a f ± b g) · d r = a f · d r ± b g · d r . 8. Is there a potential F ( x.

4.3 Green's Theorem

We will now see a way of evaluating the line integral of a smooth vector field around a simple closed curve. A vector field $\mathbf{f}(x,y) = P(x,y)\,\mathbf{i} + Q(x,y)\,\mathbf{j}$ is smooth if its component functions $P(x,y)$ and $Q(x,y)$ are smooth. We will use Green's Theorem (sometimes called Green's Theorem in the plane) to relate the line integral around a closed curve with a double integral over the region inside the curve:

Theorem 4.7. (Green's Theorem) Let $R$ be a region in $\mathbb{R}^2$ whose boundary is a simple closed curve $C$ which is piecewise smooth. Let $\mathbf{f}(x,y) = P(x,y)\,\mathbf{i} + Q(x,y)\,\mathbf{j}$ be a smooth vector field defined on both $R$ and $C$. Then

\[ \oint_C \mathbf{f} \cdot d\mathbf{r} = \iint_R \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA , \tag{4.21} \]

where $C$ is traversed so that $R$ is always on the left side of $C$.

Proof: We will prove the theorem in the case of a simple region $R$, that is, where the boundary curve $C$ can be written as $C = C_1 \cup C_2$ in two distinct ways:

\[ C_1 = \text{the curve } y = y_1(x) \text{ from the point } X_1 \text{ to the point } X_2 \tag{4.22} \]
\[ C_2 = \text{the curve } y = y_2(x) \text{ from the point } X_2 \text{ to the point } X_1 , \tag{4.23} \]

where $X_1$ and $X_2$ are the points on $C$ farthest to the left and right, respectively; and

\[ C_1 = \text{the curve } x = x_1(y) \text{ from the point } Y_2 \text{ to the point } Y_1 \tag{4.24} \]
\[ C_2 = \text{the curve } x = x_2(y) \text{ from the point } Y_1 \text{ to the point } Y_2 , \tag{4.25} \]

where $Y_1$ and $Y_2$ are the lowest and highest points, respectively, on $C$. See Figure 4.3.1.

[Figure 4.3.1: a simple region $R$ with boundary $C$, bounded below by $y = y_1(x)$ and above by $y = y_2(x)$ for $a \le x \le b$, and on the left by $x = x_1(y)$ and on the right by $x = x_2(y)$ for $c \le y \le d$]

Integrate $P(x,y)$ around $C$ using the representation $C = C_1 \cup C_2$ given by (4.22) and (4.23).

Since $y = y_1(x)$ along $C_1$ (as $x$ goes from $a$ to $b$) and $y = y_2(x)$ along $C_2$ (as $x$ goes from $b$ to $a$), as we see from Figure 4.3.1, we have

\[
\begin{aligned}
\oint_C P(x,y)\,dx
&= \int_{C_1} P(x,y)\,dx + \int_{C_2} P(x,y)\,dx \\
&= \int_a^b P(x, y_1(x))\,dx + \int_b^a P(x, y_2(x))\,dx \\
&= \int_a^b P(x, y_1(x))\,dx - \int_a^b P(x, y_2(x))\,dx \\
&= -\int_a^b \bigl( P(x, y_2(x)) - P(x, y_1(x)) \bigr)\,dx \\
&= -\int_a^b \Bigl( P(x,y)\,\Big|_{y = y_1(x)}^{y = y_2(x)} \Bigr)\,dx \\
&= -\int_a^b \int_{y_1(x)}^{y_2(x)} \frac{\partial P(x,y)}{\partial y}\,dy\,dx \quad \text{(by the Fundamental Theorem of Calculus)} \\
&= -\iint_R \frac{\partial P}{\partial y}\,dA .
\end{aligned}
\]

Likewise, integrate $Q(x,y)$ around $C$ using the representation $C = C_1 \cup C_2$ given by (4.24) and (4.25). Since $x = x_1(y)$ along $C_1$ (as $y$ goes from $d$ to $c$) and $x = x_2(y)$ along $C_2$ (as $y$ goes from $c$ to $d$), as we see from Figure 4.3.1, we have

\[
\begin{aligned}
\oint_C Q(x,y)\,dy
&= \int_{C_1} Q(x,y)\,dy + \int_{C_2} Q(x,y)\,dy \\
&= \int_d^c Q(x_1(y), y)\,dy + \int_c^d Q(x_2(y), y)\,dy \\
&= -\int_c^d Q(x_1(y), y)\,dy + \int_c^d Q(x_2(y), y)\,dy \\
&= \int_c^d \bigl( Q(x_2(y), y) - Q(x_1(y), y) \bigr)\,dy \\
&= \int_c^d \Bigl( Q(x,y)\,\Big|_{x = x_1(y)}^{x = x_2(y)} \Bigr)\,dy \\
&= \int_c^d \int_{x_1(y)}^{x_2(y)} \frac{\partial Q(x,y)}{\partial x}\,dx\,dy \quad \text{(by the Fundamental Theorem of Calculus)} \\
&= \iint_R \frac{\partial Q}{\partial x}\,dA ,
\end{aligned}
\]

and so

\[
\begin{aligned}
\oint_C \mathbf{f} \cdot d\mathbf{r}
&= \oint_C P(x,y)\,dx + \oint_C Q(x,y)\,dy \\
&= -\iint_R \frac{\partial P}{\partial y}\,dA + \iint_R \frac{\partial Q}{\partial x}\,dA \\
&= \iint_R \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA .
\end{aligned}
\]

QED

Though we proved Green's Theorem only for a simple region $R$, the theorem can also be proved for more general regions (say, a union of simple regions).[2]

Example 4.7. Evaluate $\oint_C (x^2 + y^2)\,dx + 2xy\,dy$, where $C$ is the boundary (traversed counterclockwise) of the region $R = \{ (x,y) : 0 \le x \le 1,\ 2x^2 \le y \le 2x \}$.

Solution: $R$ is the shaded region in Figure 4.3.2. By Green's Theorem, for $P(x,y) = x^2 + y^2$ and $Q(x,y) = 2xy$, we have

[Figure 4.3.2: the region $R$ between $y = 2x^2$ and $y = 2x$, from (0, 0) to (1, 2)]

\[
\begin{aligned}
\oint_C (x^2 + y^2)\,dx + 2xy\,dy
&= \iint_R \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA \\
&= \iint_R (2y - 2y)\,dA = \iint_R 0\,dA = 0 .
\end{aligned}
\]

We actually already knew that the answer was zero. Recall from Example 4.5 in Section 4.2 that the vector field $\mathbf{f}(x,y) = (x^2 + y^2)\,\mathbf{i} + 2xy\,\mathbf{j}$ has a potential function $F(x,y) = \frac{1}{3}x^3 + xy^2$, and so $\oint_C \mathbf{f} \cdot d\mathbf{r} = 0$ by Corollary 4.6.

Example 4.8. Let $\mathbf{f}(x,y) = P(x,y)\,\mathbf{i} + Q(x,y)\,\mathbf{j}$, where

\[ P(x,y) = \frac{-y}{x^2 + y^2} \quad \text{and} \quad Q(x,y) = \frac{x}{x^2 + y^2} , \]

and let $R = \{ (x,y) : 0 < x^2 + y^2 \le 1 \}$. For the boundary curve $C : x^2 + y^2 = 1$, traversed counterclockwise, it was shown in Exercise 9(b) in Section 4.2 that $\oint_C \mathbf{f} \cdot d\mathbf{r} = 2\pi$. But

\[ \frac{\partial Q}{\partial x} = \frac{y^2 - x^2}{(x^2 + y^2)^2} = \frac{\partial P}{\partial y} \quad\Rightarrow\quad \iint_R \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA = \iint_R 0\,dA = 0 . \]

This would seem to contradict Green's Theorem. However, note that $R$ is not the entire region enclosed by $C$, since the point $(0,0)$ is not contained in $R$. That is, $R$ has a "hole" at the origin, so Green's Theorem does not apply.

[2] See TAYLOR and MANN, § 15.31 for a discussion of some of the difficulties involved when the boundary curve is "complicated".
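Both sides of Green's Theorem can be compared numerically on the region of Example 4.7. The sketch below is an added illustration: besides the field of Example 4.7, where both sides are zero, it uses the field $P = -y$, $Q = x$, which is an assumption chosen for this example so that the double integral is nonzero. The helper names are hypothetical.

```python
def simpson(g, a, b, n=200):
    """Composite Simpson's rule for the integral of g over [a, b]; n must be even."""
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3


def boundary_integral(P, Q):
    """Integral of P dx + Q dy counterclockwise around the region of Example 4.7."""
    # C1: along y = 2x^2 from (0, 0) to (1, 2): x = t, y = 2t^2, dx = dt, dy = 4t dt.
    lower = simpson(lambda t: P(t, 2 * t**2) + Q(t, 2 * t**2) * 4 * t, 0.0, 1.0)
    # C2: along y = 2x from (1, 2) back to (0, 0): x = 1 - t, y = 2(1 - t),
    # so dx = -dt and dy = -2 dt.
    upper = simpson(lambda t: -P(1 - t, 2 * (1 - t)) - 2 * Q(1 - t, 2 * (1 - t)),
                    0.0, 1.0)
    return lower + upper


def region_integral(integrand):
    """Iterated integral of integrand(x, y) over 0 <= x <= 1, 2x^2 <= y <= 2x."""
    inner = lambda x: simpson(lambda y: integrand(x, y), 2 * x**2, 2 * x)
    return simpson(inner, 0.0, 1.0)


# Field of Example 4.7: dQ/dx - dP/dy = 2y - 2y = 0.
line_47 = boundary_integral(lambda x, y: x**2 + y**2, lambda x, y: 2 * x * y)
area_47 = region_integral(lambda x, y: 2 * y - 2 * y)

# Illustrative field (not from the text): P = -y, Q = x, so dQ/dx - dP/dy = 2.
line_rot = boundary_integral(lambda x, y: -y, lambda x, y: x)
area_rot = region_integral(lambda x, y: 2.0)

print(line_47, area_47)    # both close to 0
print(line_rot, area_rot)  # both close to 2/3, twice the area of R
```

For the second field the common value is twice the area of $R$, since the integrand of the double integral is the constant 2.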

If we modify the region $R$ to be the annulus $R = \{ (x,y) : 1/4 \le x^2 + y^2 \le 1 \}$ (see Figure 4.3.3), and take the "boundary" $C$ of $R$ to be $C = C_1 \cup C_2$, where $C_1$ is the unit circle $x^2 + y^2 = 1$ traversed counterclockwise and $C_2$ is the circle $x^2 + y^2 = 1/4$ traversed clockwise, then it can be shown (see the exercise on this annulus at the end of the section) that

\[ \oint_C \mathbf{f} \cdot d\mathbf{r} = 0 . \]

[Figure 4.3.3: The annulus $R$, with outer boundary $C_1$ of radius 1 and inner boundary $C_2$ of radius 1/2]

We would still have $\iint_R \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA = 0$, so for this $R$ we would have

\[ \oint_C \mathbf{f} \cdot d\mathbf{r} = \iint_R \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA , \]

which shows that Green's Theorem holds for the annular region $R$.

It turns out that Green's Theorem can be extended to multiply connected regions, that is, regions like the annulus in Example 4.8, which have one or more regions cut out from the interior, as opposed to discrete points being cut out. For such regions, the "outer" boundary and the "inner" boundaries are traversed so that $R$ is always on the left side.

[Figure 4.3.4: Multiply connected regions. (a) Region $R$ with one hole; (b) Region $R$ with two holes]

The intuitive idea for why Green's Theorem holds for multiply connected regions is shown in Figure 4.3.4 above. The idea is to cut "slits" between the boundaries of a multiply connected region $R$ so that $R$ is divided into subregions which do not have any "holes". For example, in Figure 4.3.4(a) the region $R$ is the union of the regions $R_1$ and $R_2$, which are divided by the slits indicated by the dashed lines. Those slits are part of the boundary of both $R_1$ and $R_2$, and we traverse them in the manner indicated by the arrows. Notice that along each slit the boundary of $R_1$ is traversed in the opposite direction as that of $R_2$, which

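means that the line integrals of $\mathbf{f}$ along those slits cancel each other out. Since $R_1$ and $R_2$ do not have holes in them, Green's Theorem holds in each subregion, so that

\[ \oint_{\text{bdy of } R_1} \mathbf{f} \cdot d\mathbf{r} = \iint_{R_1} \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA \quad \text{and} \quad \oint_{\text{bdy of } R_2} \mathbf{f} \cdot d\mathbf{r} = \iint_{R_2} \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA . \]

But since the line integrals along the slits cancel out, we have

\[ \oint_{C_1 \cup C_2} \mathbf{f} \cdot d\mathbf{r} = \oint_{\text{bdy of } R_1} \mathbf{f} \cdot d\mathbf{r} + \oint_{\text{bdy of } R_2} \mathbf{f} \cdot d\mathbf{r} , \]

and so

\[ \oint_{C_1 \cup C_2} \mathbf{f} \cdot d\mathbf{r} = \iint_{R_1} \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA + \iint_{R_2} \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA = \iint_R \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA , \]

which shows that Green's Theorem holds in the region $R$. A similar argument shows that the theorem holds in the region with two holes shown in Figure 4.3.4(b).

We know from Corollary 4.6 that when a smooth vector field $\mathbf{f}(x,y) = P(x,y)\,\mathbf{i} + Q(x,y)\,\mathbf{j}$ on a region $R$ (whose boundary is a piecewise smooth, simple closed curve $C$) has a potential in $R$, then $\oint_C \mathbf{f} \cdot d\mathbf{r} = 0$. And if the potential $F(x,y)$ is smooth in $R$, then $\frac{\partial F}{\partial x} = P$ and $\frac{\partial F}{\partial y} = Q$, and so we know that

\[ \frac{\partial^2 F}{\partial y\,\partial x} = \frac{\partial^2 F}{\partial x\,\partial y} \quad\Rightarrow\quad \frac{\partial P}{\partial y} = \frac{\partial Q}{\partial x} \text{ in } R . \]

Conversely, if $\frac{\partial P}{\partial y} = \frac{\partial Q}{\partial x}$ in $R$ then

\[ \oint_C \mathbf{f} \cdot d\mathbf{r} = \iint_R \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA = \iint_R 0\,dA = 0 . \]

For a simply connected region $R$ (i.e. a region with no holes), the following can be shown:

The following statements are equivalent for a simply connected region $R$ in $\mathbb{R}^2$:

(a) $\mathbf{f}(x,y) = P(x,y)\,\mathbf{i} + Q(x,y)\,\mathbf{j}$ has a smooth potential $F(x,y)$ in $R$
(b) $\int_C \mathbf{f} \cdot d\mathbf{r}$ is independent of the path for any curve $C$ in $R$
(c) $\oint_C \mathbf{f} \cdot d\mathbf{r} = 0$ for every simple closed curve $C$ in $R$
(d) $\frac{\partial P}{\partial y} = \frac{\partial Q}{\partial x}$ in $R$ (in this case, the differential form $P\,dx + Q\,dy$ is exact)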
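The role of the "simply connected" hypothesis can be seen numerically. The following sketch is an added illustration with hypothetical helper names. It estimates $\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}$ by central differences at a few arbitrarily chosen sample points, and the circulation around the unit circle by the midpoint rule, for the field of Example 4.5 and for the field of Example 4.8.

```python
import math


def curl_2d(P, Q, x, y, h=1e-5):
    """Central-difference estimate of dQ/dx - dP/dy at the point (x, y)."""
    dQdx = (Q(x + h, y) - Q(x - h, y)) / (2 * h)
    dPdy = (P(x, y + h) - P(x, y - h)) / (2 * h)
    return dQdx - dPdy


def circulation_unit_circle(P, Q, n=2000):
    """Midpoint-rule estimate of the integral of P dx + Q dy around
    x = cos t, y = sin t, 0 <= t <= 2 pi (counterclockwise)."""
    dt = 2 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * dt
        x, y = math.cos(t), math.sin(t)
        total += (P(x, y) * (-math.sin(t)) + Q(x, y) * math.cos(t)) * dt
    return total


# Field of Example 4.5, which has a potential on all of the plane.
P1 = lambda x, y: x**2 + y**2
Q1 = lambda x, y: 2 * x * y

# Field of Example 4.8, defined everywhere except the origin.
P2 = lambda x, y: -y / (x**2 + y**2)
Q2 = lambda x, y: x / (x**2 + y**2)

# Sample points away from the origin, chosen arbitrarily for this sketch.
samples = [(0.3, 0.4), (-0.5, 0.2), (0.7, -0.6)]
max_curl_1 = max(abs(curl_2d(P1, Q1, x, y)) for x, y in samples)
max_curl_2 = max(abs(curl_2d(P2, Q2, x, y)) for x, y in samples)

circ_1 = circulation_unit_circle(P1, Q1)
circ_2 = circulation_unit_circle(P2, Q2)

print(max_curl_1, circ_1)  # condition (d) holds and the circulation is ~0
print(max_curl_2, circ_2)  # condition (d) holds, yet the circulation is ~2*pi
```

For the second field, condition (d) holds at every sampled point but condition (c) fails, which is consistent with the equivalence above because the punctured disk is not simply connected.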

Exercises

A

For Exercises 1-4, use Green's Theorem to evaluate the given line integral around the curve $C$, traversed counterclockwise.

1. $\oint_C (x^2 - y^2)\,dx + 2xy\,dy$; $C$ is the boundary of $R = \{ (x,y) : 0 \le x \le 1,\ 2x^2 \le y \le 2x \}$
2. $\oint_C x^2 y\,dx + 2xy\,dy$; $C$ is the boundary of $R = \{ (x,y) : 0 \le x \le 1,\ x^2 \le y \le x \}$
3. $\oint_C 2y\,dx - 3x\,dy$; $C$ is the circle $x^2 + y^2 = 1$
4. $\oint_C (e^{x^2} + y^2)\,dx + (e^{y^2} + x^2)\,dy$; $C$ is the boundary of the triangle with vertices $(0,0)$, $(4,0)$ and $(0,4)$
5. Is there a potential $F(x,y)$ for $\mathbf{f}(x,y) = (y^2 + 3x^2)\,\mathbf{i} + 2xy\,\mathbf{j}$? If so, find one.
6. Is there a potential $F(x,y)$ for $\mathbf{f}(x,y) = (x^3 \cos(xy) + 2x \sin(xy))\,\mathbf{i} + x^2 y \cos(xy)\,\mathbf{j}$? If so, find one.
7. Is there a potential $F(x,y)$ for $\mathbf{f}(x,y) = (8xy + 3)\,\mathbf{i} + 4(x^2 + y)\,\mathbf{j}$? If so, find one.
8. Show that for any constants $a$, $b$ and any closed simple curve $C$, $\oint_C a\,dx + b\,dy = 0$.

B

9. For the vector field $\mathbf{f}$ as in Example 4.8, show directly that $\oint_C \mathbf{f} \cdot d\mathbf{r} = 0$, where $C$ is the boundary of the annulus $R = \{ (x,y) : 1/4 \le x^2 + y^2 \le 1 \}$ traversed so that $R$ is always on the left.
10. Evaluate $\oint_C e^x \sin y\,dx + (y^3 + e^x \cos y)\,dy$, where $C$ is the boundary of the rectangle with vertices $(1,-1)$, $(1,1)$, $(-1,1)$ and $(-1,-1)$, traversed counterclockwise.

C

11. For a region $R$ bounded by a simple closed curve $C$, show that the area $A$ of $R$ is
\[ A = -\oint_C y\,dx = \oint_C x\,dy = \frac{1}{2} \oint_C x\,dy - y\,dx , \]
where $C$ is traversed so that $R$ is always on the left. (Hint: Use Green's Theorem and the fact that $A = \iint_R 1\,dA$.)

y = y( u.8 how we identiﬁed points ( x. we will use a parametrization of a surface to deﬁne a surface integral.4.156 CHAPTER 4. y. LINE AND SURFACE INTEGRALS 4. The idea behind a parametrization of a curve is that it “transforms” a subset of R1 (normally an interval [a. z(t)) (x(a). z(b)) y R1 a t b Similar to how we used a parametrization of a curve to deﬁne the line integral along the curve. v). such as a sphere or a paraboloid. to parametrize a surface Σ in R3 : x = x( u. parametrized by x = x( t). y = y( t). b]) into a curve in R2 or R3 (see Figure 4. We will use two variables.4. z = z( t). z (x(t).2 Parametrization of a surface Σ in R3 In this case. z(a)) x = x(t) y = y(t) z = z(t) 0 x Figure 4. v). v)j + z( u. . u and v. We will now learn how to perform integration over a surface in R3 .4 Surface Integrals and the Divergence Theorem In Section 4. y(a). y(b). v) in R .4.2). z) on a curve C in R3 . for ( u.1 we learned how to integrate along a curve. with the terminal points of the position vector r( t) = x( t)i + y( t)j + z( t)k for t in [a. v) = x( u. v) in some region R in R2 (see Figure 4. v) y Figure 4. v) 0 u x r(u.4. v)k for ( u.1). v) z = z(u. y(t). v) x = x(u. the position vector of a point on the surface Σ is given by the vector-valued function r( u. a ≤ t ≤ b.1 Parametrization of a curve C in R3 C r(t) (x(b). v)i + y( u. Recall from Section 1. v). v) y = y(u. b]. v R2 z Σ R (u. z = z( u.

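Since $\mathbf{r}(u,v)$ is a function of two variables, define the partial derivatives $\frac{\partial \mathbf{r}}{\partial u}$ and $\frac{\partial \mathbf{r}}{\partial v}$ for $(u,v)$ in $R$ by

\[
\begin{aligned}
\frac{\partial \mathbf{r}}{\partial u}(u,v) &= \frac{\partial x}{\partial u}(u,v)\,\mathbf{i} + \frac{\partial y}{\partial u}(u,v)\,\mathbf{j} + \frac{\partial z}{\partial u}(u,v)\,\mathbf{k} , \text{ and} \\
\frac{\partial \mathbf{r}}{\partial v}(u,v) &= \frac{\partial x}{\partial v}(u,v)\,\mathbf{i} + \frac{\partial y}{\partial v}(u,v)\,\mathbf{j} + \frac{\partial z}{\partial v}(u,v)\,\mathbf{k} .
\end{aligned}
\]

The parametrization of $\Sigma$ can be thought of as "transforming" a region in $\mathbb{R}^2$ (in the $uv$-plane) into a 2-dimensional surface in $\mathbb{R}^3$. This parametrization of the surface is sometimes called a patch, based on the idea of "patching" the region $R$ onto $\Sigma$ in the grid-like manner shown in Figure 4.4.2.

In fact, those gridlines in $R$ lead us to how we will define a surface integral over $\Sigma$. Along the vertical gridlines in $R$, the variable $u$ is constant. So those lines get mapped to curves on $\Sigma$, and the variable $u$ is constant along the position vector $\mathbf{r}(u,v)$. Thus, the tangent vector to those curves at a point $(u,v)$ is $\frac{\partial \mathbf{r}}{\partial v}$. Similarly, the horizontal gridlines in $R$ get mapped to curves on $\Sigma$ whose tangent vectors are $\frac{\partial \mathbf{r}}{\partial u}$.

Now take a point $(u,v)$ in $R$ as, say, the lower left corner of one of the rectangular grid sections in $R$, as shown in Figure 4.4.2. Suppose that this rectangle has a small width and height of $\Delta u$ and $\Delta v$, respectively. The corner points of that rectangle are $(u,v)$, $(u + \Delta u, v)$, $(u + \Delta u, v + \Delta v)$ and $(u, v + \Delta v)$. So the area of that rectangle is $A = \Delta u\,\Delta v$. Then that rectangle gets mapped by the parametrization onto some section of the surface $\Sigma$ which, for $\Delta u$ and $\Delta v$ small enough, will have a surface area (call it $d\sigma$) that is very close to the area of the parallelogram which has adjacent sides $\mathbf{r}(u + \Delta u, v) - \mathbf{r}(u,v)$ (corresponding to the line segment from $(u,v)$ to $(u + \Delta u, v)$ in $R$) and $\mathbf{r}(u, v + \Delta v) - \mathbf{r}(u,v)$ (corresponding to the line segment from $(u,v)$ to $(u, v + \Delta v)$ in $R$). But by combining our usual notion of a partial derivative (see Definition 2.3 in Section 2.2) with that of the derivative of a vector-valued function (see Definition 1.12 in Section 1.8) applied to a function of two variables, we have

\[ \frac{\partial \mathbf{r}}{\partial u} \approx \frac{\mathbf{r}(u + \Delta u, v) - \mathbf{r}(u,v)}{\Delta u} , \quad \text{and} \quad \frac{\partial \mathbf{r}}{\partial v} \approx \frac{\mathbf{r}(u, v + \Delta v) - \mathbf{r}(u,v)}{\Delta v} , \]

and so the surface area element $d\sigma$ is approximately

\[ \bigl\| (\mathbf{r}(u + \Delta u, v) - \mathbf{r}(u,v)) \times (\mathbf{r}(u, v + \Delta v) - \mathbf{r}(u,v)) \bigr\| \approx \left\| \left( \Delta u\,\frac{\partial \mathbf{r}}{\partial u} \right) \times \left( \Delta v\,\frac{\partial \mathbf{r}}{\partial v} \right) \right\| = \left\| \frac{\partial \mathbf{r}}{\partial u} \times \frac{\partial \mathbf{r}}{\partial v} \right\| \Delta u\,\Delta v \]

by Theorem 1.13 in Section 1.4. Thus, the total surface area $S$ of $\Sigma$ is approximately the sum of all the quantities $\left\| \frac{\partial \mathbf{r}}{\partial u} \times \frac{\partial \mathbf{r}}{\partial v} \right\| \Delta u\,\Delta v$, summed over the rectangles in $R$. Taking the limit of that sum as the diagonal of the largest rectangle goes to 0 gives

\[ S = \iint_R \left\| \frac{\partial \mathbf{r}}{\partial u} \times \frac{\partial \mathbf{r}}{\partial v} \right\| du\,dv . \tag{4.26} \]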

We will write the double integral on the right using the special notation

\[ \iint_\Sigma d\sigma = \iint_R \left\| \frac{\partial \mathbf{r}}{\partial u} \times \frac{\partial \mathbf{r}}{\partial v} \right\| du\,dv . \tag{4.27} \]

This is a special case of a surface integral over the surface $\Sigma$, where the surface area element $d\sigma$ can be thought of as $1\,d\sigma$. Replacing 1 by a general real-valued function $f(x,y,z)$ defined in $\mathbb{R}^3$, we have the following:

Definition 4.3. Let $\Sigma$ be a surface in $\mathbb{R}^3$ parametrized by $x = x(u,v)$, $y = y(u,v)$, $z = z(u,v)$, for $(u,v)$ in some region $R$ in $\mathbb{R}^2$. Let $\mathbf{r}(u,v) = x(u,v)\,\mathbf{i} + y(u,v)\,\mathbf{j} + z(u,v)\,\mathbf{k}$ be the position vector for any point on $\Sigma$, and let $f(x,y,z)$ be a real-valued function defined on some subset of $\mathbb{R}^3$ that contains $\Sigma$. The surface integral of $f(x,y,z)$ over $\Sigma$ is

\[ \iint_\Sigma f(x,y,z)\,d\sigma = \iint_R f(x(u,v), y(u,v), z(u,v)) \left\| \frac{\partial \mathbf{r}}{\partial u} \times \frac{\partial \mathbf{r}}{\partial v} \right\| du\,dv . \tag{4.28} \]

In particular, the surface area $S$ of $\Sigma$ is

\[ S = \iint_\Sigma 1\,d\sigma . \tag{4.29} \]

Example 4.9. A torus $T$ is a surface obtained by revolving a circle of radius $a$ in the $yz$-plane around the $z$-axis, where the circle's center is at a distance $b$ from the $z$-axis ($0 < a < b$), as in Figure 4.4.3. Find the surface area of $T$.

[Figure 4.4.3: (a) Circle $(y - b)^2 + z^2 = a^2$ in the $yz$-plane; (b) Torus $T$]

Solution: For any point on the circle, the line segment from the center of the circle to that point makes an angle $u$ with the $y$-axis in the positive $y$ direction (see Figure 4.4.3(a)). And as the circle revolves around the $z$-axis, the line segment from the origin to the center of that

0 ≤ u ≤ 2π . Thus. Thus. v) = x( u. the surface area of T is S = Σ 1 dσ 2π 2π 0 2π 2π 0 2π = = = = ∂r ∂u 0 × ∂r ∂v du dv 0 a( b + a cos u) du dv u=2π u =0 0 2π abu + a2 sin u 2πab dv dv 0 = 4π2 ab ∂ ∂r Since ∂u and ∂r are tangent to the surface Σ (i. 0 ≤ v ≤ 2π r( u.e.4. z = a sin u . v)i + y( u. . and so computing the cross product gives ∂r ∂u × ∂r ∂v = −a( b + a cos u) cos v cos u i − a( b + a cos u) sin v cos u j − a( b + a cos u) sin u k .3(b)).4. the torus can be parametrized as: x = ( b + a cos u) cos v . Thus. v)k we see that ∂r ∂u ∂r ∂v = ( b + a cos u) cos v i + ( b + a cos u) sin v j + a sin u k = −a sin u cos v i − a sin u sin v j + a cos u k = −( b + a cos u) sin v i + ( b + a cos u) cos v j + 0k . then their cross product ∂u × ∂r is perpendicular to the tangent plane to the surface v at each point of Σ.4 Surface Integrals and the Divergence Theorem 159 circle sweeps out an angle v with the positive x-axis (see Figure 4. So for the position vector y = ( b + a cos u) sin v . lie in the tangent plane to Σ at each point v ∂r ∂ on Σ). ∂r ∂u ∂r ∂v which has magnitude × = a( b + a cos u) . v)j + z( u.
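As a numerical sanity check on Example 4.9, the following Python sketch approximates the double integral in formula (4.27) for the torus with a midpoint rule and compares it with $4\pi^2 ab$. It is only an illustration: the function names `cross` and `torus_area` and the grid size `n` are our own choices, not part of the text.

```python
import math

def cross(p, q):
    """Cross product of two 3-vectors given as tuples."""
    return (p[1] * q[2] - p[2] * q[1],
            p[2] * q[0] - p[0] * q[2],
            p[0] * q[1] - p[1] * q[0])

def torus_area(a, b, n=100):
    """Midpoint-rule estimate of the torus surface area (assumes 0 < a < b)."""
    h = 2 * math.pi / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * h
        w = b + a * math.cos(u)
        for j in range(n):
            v = (j + 0.5) * h
            # Partial derivatives of the parametrization from Example 4.9
            r_u = (-a * math.sin(u) * math.cos(v),
                   -a * math.sin(u) * math.sin(v),
                   a * math.cos(u))
            r_v = (-w * math.sin(v), w * math.cos(v), 0.0)
            normal = cross(r_u, r_v)
            # Surface area element ||r_u x r_v|| du dv
            total += math.sqrt(sum(c * c for c in normal)) * h * h
    return total

if __name__ == "__main__":
    a, b = 1.0, 2.0
    print(torus_area(a, b), 4 * math.pi ** 2 * a * b)
```

The two printed numbers should agree closely, since the integrand is smooth and periodic in both $u$ and $v$.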

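Thus,

$$\iint_\Sigma f(x,y,z)\,d\sigma = \iint_R f(x(u,v), y(u,v), z(u,v))\,\|\mathbf{n}\|\,du\,dv ,$$

where $\mathbf{n} = \frac{\partial \mathbf{r}}{\partial u} \times \frac{\partial \mathbf{r}}{\partial v}$. We say that $\mathbf{n}$ is a normal vector to $\Sigma$.

Recall that normal vectors to a plane can point in two opposite directions. By an outward unit normal vector to a surface $\Sigma$, we will mean the unit vector that is normal to $\Sigma$ and points away from the "top" (or "outer" part) of the surface. This is a hazy definition, but the picture in Figure 4.4.4 gives a better idea of what outward normal vectors look like, in the case of a sphere. With this idea in mind, we make the following definition of a surface integral of a 3-dimensional vector field over a surface:

[Figure 4.4.4: outward normal vectors on a sphere]

Definition 4.4. Let $\Sigma$ be a surface in $\mathbb{R}^3$ and let $\mathbf{f}(x,y,z) = f_1(x,y,z)\,\mathbf{i} + f_2(x,y,z)\,\mathbf{j} + f_3(x,y,z)\,\mathbf{k}$ be a vector field defined on some subset of $\mathbb{R}^3$ that contains $\Sigma$. The surface integral of $\mathbf{f}$ over $\Sigma$ is

$$\iint_\Sigma \mathbf{f} \cdot d\boldsymbol{\sigma} = \iint_\Sigma \mathbf{f} \cdot \mathbf{n}\,d\sigma , \tag{4.30}$$

where, at any point on $\Sigma$, $\mathbf{n}$ is the outward unit normal vector to $\Sigma$.

Note in the above definition that the dot product inside the integral on the right is a real-valued function, and hence we can use Definition 4.3 to evaluate the integral.

Example 4.10. Evaluate the surface integral $\iint_\Sigma \mathbf{f} \cdot d\boldsymbol{\sigma}$, where $\mathbf{f}(x,y,z) = yz\,\mathbf{i} + xz\,\mathbf{j} + xy\,\mathbf{k}$ and $\Sigma$ is the part of the plane $x + y + z = 1$ with $x \ge 0$, $y \ge 0$, and $z \ge 0$, with the outward unit normal $\mathbf{n}$ pointing in the positive $z$ direction (see Figure 4.4.5).

[Figure 4.4.5: the surface $\Sigma$, part of the plane $x + y + z = 1$, with normal $\mathbf{n}$]

Solution: Since the vector $\mathbf{v} = (1,1,1)$ is normal to the plane $x + y + z = 1$ (why?), then dividing $\mathbf{v}$ by its length yields the outward unit normal vector $\mathbf{n} = \left( \frac{1}{\sqrt{3}}, \frac{1}{\sqrt{3}}, \frac{1}{\sqrt{3}} \right)$. We now need to parametrize $\Sigma$. As we can see from Figure 4.4.5, projecting $\Sigma$ onto the $xy$-plane yields a triangular region $R = \{ (x,y) : 0 \le x \le 1,\ 0 \le y \le 1 - x \}$. Thus, using $(u,v)$ instead of $(x,y)$, we see that

$$x = u, \quad y = v, \quad z = 1 - (u + v), \qquad \text{for } 0 \le u \le 1,\ 0 \le v \le 1 - u$$

is a parametrization of $\Sigma$ over $R$ (since $z = 1 - (x + y)$ on $\Sigma$). So on $\Sigma$,

$$\begin{aligned}
\mathbf{f} \cdot \mathbf{n} &= (yz, xz, xy) \cdot \left( \tfrac{1}{\sqrt{3}}, \tfrac{1}{\sqrt{3}}, \tfrac{1}{\sqrt{3}} \right) = \tfrac{1}{\sqrt{3}}(yz + xz + xy) \\
&= \tfrac{1}{\sqrt{3}}\bigl((x + y)z + xy\bigr) = \tfrac{1}{\sqrt{3}}\bigl((u + v)(1 - (u + v)) + uv\bigr) \\
&= \tfrac{1}{\sqrt{3}}\bigl((u + v) - (u + v)^2 + uv\bigr)
\end{aligned}$$

for $(u,v)$ in $R$, and for $\mathbf{r}(u,v) = x(u,v)\,\mathbf{i} + y(u,v)\,\mathbf{j} + z(u,v)\,\mathbf{k} = u\,\mathbf{i} + v\,\mathbf{j} + (1 - (u + v))\,\mathbf{k}$ we have

$$\frac{\partial \mathbf{r}}{\partial u} \times \frac{\partial \mathbf{r}}{\partial v} = (1, 0, -1) \times (0, 1, -1) = (1, 1, 1) \quad\Rightarrow\quad \left\| \frac{\partial \mathbf{r}}{\partial u} \times \frac{\partial \mathbf{r}}{\partial v} \right\| = \sqrt{3} .$$

Thus, integrating over $R$ using vertical slices (e.g. as indicated by the dashed line in Figure 4.4.5) gives

$$\begin{aligned}
\iint_\Sigma \mathbf{f} \cdot d\boldsymbol{\sigma} &= \iint_\Sigma \mathbf{f} \cdot \mathbf{n}\,d\sigma \\
&= \iint_R \bigl(\mathbf{f}(x(u,v), y(u,v), z(u,v)) \cdot \mathbf{n}\bigr) \left\| \frac{\partial \mathbf{r}}{\partial u} \times \frac{\partial \mathbf{r}}{\partial v} \right\| dv\,du \\
&= \int_0^1\!\!\int_0^{1-u} \tfrac{1}{\sqrt{3}}\bigl((u + v) - (u + v)^2 + uv\bigr)\sqrt{3}\,dv\,du \\
&= \int_0^1 \left( \frac{(u + v)^2}{2} - \frac{(u + v)^3}{3} + \frac{uv^2}{2} \,\Bigg|_{v=0}^{v=1-u} \right) du \\
&= \int_0^1 \left( \frac{1}{6} + \frac{u}{2} - \frac{3u^2}{2} + \frac{5u^3}{6} \right) du \\
&= \frac{u}{6} + \frac{u^2}{4} - \frac{u^3}{2} + \frac{5u^4}{24} \,\Bigg|_0^1 = \frac{1}{8} .
\end{aligned}$$

Computing surface integrals can often be tedious, especially when the formula for the outward unit normal vector at each point of $\Sigma$ changes. The following theorem provides an easier way in the case when $\Sigma$ is a closed surface, that is, when $\Sigma$ encloses a bounded solid in $\mathbb{R}^3$. For example, spheres, cubes, and ellipsoids are closed surfaces, but planes and paraboloids are not.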
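The value $\frac{1}{8}$ in Example 4.10 can be checked numerically. The sketch below is our own illustration (the name `flux_through_plane` and the grid size are assumptions, not from the text): it sums $\mathbf{f} \cdot \left( \frac{\partial \mathbf{r}}{\partial u} \times \frac{\partial \mathbf{r}}{\partial v} \right)$ over the triangle $R$ with a midpoint rule, using the fact that the factor $\frac{1}{\sqrt{3}}$ from $\mathbf{n}$ cancels the factor $\sqrt{3}$ from the area element.

```python
def flux_through_plane(n=200):
    """Midpoint-rule estimate of the flux of f = (yz, xz, xy) through the
    part of the plane x + y + z = 1 in the first octant (Example 4.10)."""
    total = 0.0
    du = 1.0 / n
    for i in range(n):
        u = (i + 0.5) * du
        dv = (1.0 - u) / n  # v runs over [0, 1 - u] for this vertical slice
        for j in range(n):
            v = (j + 0.5) * dv
            x, y, z = u, v, 1.0 - (u + v)
            # f . (r_u x r_v), where r_u x r_v = (1, 1, 1)
            total += (y * z + x * z + x * y) * du * dv
    return total

if __name__ == "__main__":
    print(flux_through_plane())  # should be close to 1/8
```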

Theorem 4.8. (Divergence Theorem) Let $\Sigma$ be a closed surface in $\mathbb{R}^3$ which bounds a solid $S$, and let $\mathbf{f}(x,y,z) = f_1(x,y,z)\,\mathbf{i} + f_2(x,y,z)\,\mathbf{j} + f_3(x,y,z)\,\mathbf{k}$ be a vector field defined on some subset of $\mathbb{R}^3$ that contains $\Sigma$. Then

$$\iint_\Sigma \mathbf{f} \cdot d\boldsymbol{\sigma} = \iiint_S \operatorname{div} \mathbf{f}\,dV , \tag{4.31}$$

where

$$\operatorname{div} \mathbf{f} = \frac{\partial f_1}{\partial x} + \frac{\partial f_2}{\partial y} + \frac{\partial f_3}{\partial z} \tag{4.32}$$

is called the divergence of $\mathbf{f}$.

The proof of the Divergence Theorem is very similar to the proof of Green's Theorem, i.e. it is first proved for the simple case when the solid $S$ is bounded above by one surface, bounded below by another surface, and bounded laterally by one or more surfaces. The proof can then be extended to more general solids.[3]

Example 4.11. Evaluate $\iint_\Sigma \mathbf{f} \cdot d\boldsymbol{\sigma}$, where $\mathbf{f}(x,y,z) = x\,\mathbf{i} + y\,\mathbf{j} + z\,\mathbf{k}$ and $\Sigma$ is the unit sphere $x^2 + y^2 + z^2 = 1$.

Solution: We see that $\operatorname{div} \mathbf{f} = 1 + 1 + 1 = 3$, so

$$\iint_\Sigma \mathbf{f} \cdot d\boldsymbol{\sigma} = \iiint_S \operatorname{div} \mathbf{f}\,dV = \iiint_S 3\,dV = 3 \iiint_S 1\,dV = 3 \operatorname{vol}(S) = 3 \cdot \frac{4\pi(1)^3}{3} = 4\pi .$$

In physical applications, the surface integral $\iint_\Sigma \mathbf{f} \cdot d\boldsymbol{\sigma}$ is often referred to as the flux of $\mathbf{f}$ through the surface $\Sigma$. For example, if $\mathbf{f}$ represents the velocity field of a fluid, then the flux is the net quantity of fluid to flow through the surface $\Sigma$ per unit time. A positive flux means there is a net flow out of the surface (i.e. in the direction of the outward unit normal vector $\mathbf{n}$), while a negative flux indicates a net flow inward (in the direction of $-\mathbf{n}$).

The term divergence comes from interpreting $\operatorname{div} \mathbf{f}$ as a measure of how much a vector field "diverges" from a point. This is best seen by using another definition of $\operatorname{div} \mathbf{f}$ which is equivalent[4] to the definition given by formula (4.32). Namely, for a point $(x,y,z)$ in $\mathbb{R}^3$,

$$\operatorname{div} \mathbf{f}(x,y,z) = \lim_{V \to 0} \frac{1}{V} \iint_\Sigma \mathbf{f} \cdot d\boldsymbol{\sigma} , \tag{4.33}$$

where $V$ is the volume enclosed by a closed surface $\Sigma$ around the point $(x,y,z)$. In the limit, $V \to 0$ means that we take smaller and smaller closed surfaces around $(x,y,z)$, which means that the volumes they enclose are going to zero. It can be shown that this limit is independent of the shapes of those surfaces. Notice that the limit being taken is of the ratio of the flux through a surface to the volume enclosed by that surface, which gives a rough measure of the flow "leaving" a point, as we mentioned. Vector fields which have zero divergence are often called solenoidal fields.

The following theorem is a simple consequence of formula (4.33).

Theorem 4.9. If the flux of a vector field $\mathbf{f}$ is zero through every closed surface containing a given point, then $\operatorname{div} \mathbf{f} = 0$ at that point.

Proof: By formula (4.33), at the given point $(x,y,z)$ we have

$$\begin{aligned}
\operatorname{div} \mathbf{f}(x,y,z) &= \lim_{V \to 0} \frac{1}{V} \iint_\Sigma \mathbf{f} \cdot d\boldsymbol{\sigma} \quad \text{for closed surfaces } \Sigma \text{ containing } (x,y,z), \text{ so} \\
&= \lim_{V \to 0} \frac{1}{V}(0) \quad \text{by our assumption that the flux through each } \Sigma \text{ is zero, so} \\
&= \lim_{V \to 0} 0 = 0 .
\end{aligned}$$

QED

Lastly, we note that sometimes the notation

$$\oiint_\Sigma f(x,y,z)\,d\sigma \quad \text{and} \quad \oiint_\Sigma \mathbf{f} \cdot d\boldsymbol{\sigma}$$

is used to denote surface integrals of scalar and vector fields, respectively, over closed surfaces. Especially in physics texts, it is common to see simply $\oint_\Sigma$ instead of $\oiint_\Sigma$.

[3] See Taylor and Mann, § 15.6 for the details.
[4] See Schey, p. 36-39, for an intuitive discussion of this.

Exercises

A

For Exercises 1-4, use the Divergence Theorem to evaluate the surface integral $\iint_\Sigma \mathbf{f} \cdot d\boldsymbol{\sigma}$ of the given vector field $\mathbf{f}(x,y,z)$ over the surface $\Sigma$.

1. $\mathbf{f}(x,y,z) = x\,\mathbf{i} + 2y\,\mathbf{j} + 3z\,\mathbf{k}$, $\Sigma : x^2 + y^2 + z^2 = 9$
2. $\mathbf{f}(x,y,z) = x\,\mathbf{i} + y\,\mathbf{j} + z\,\mathbf{k}$, $\Sigma$ : boundary of the solid cube $S = \{ (x,y,z) : 0 \le x, y, z \le 1 \}$
3. $\mathbf{f}(x,y,z) = x^3\,\mathbf{i} + y^3\,\mathbf{j} + z^3\,\mathbf{k}$, $\Sigma : x^2 + y^2 + z^2 = 1$
4. $\mathbf{f}(x,y,z) = 2\,\mathbf{i} + 3\,\mathbf{j} + 5\,\mathbf{k}$, $\Sigma : x^2 + y^2 + z^2 = 1$

B

5. Show that the flux of any constant vector field through any closed surface is zero.
6. Evaluate the surface integral from Exercise 2 without using the Divergence Theorem, i.e. using only Definition 4.3, as in Example 4.10. Note that there will be a different outward unit normal vector to each of the six faces of the cube.
7. Evaluate the surface integral $\iint_\Sigma \mathbf{f} \cdot d\boldsymbol{\sigma}$, where $\mathbf{f}(x,y,z) = x^2\,\mathbf{i} + xy\,\mathbf{j} + z\,\mathbf{k}$ and $\Sigma$ is the part of the plane $6x + 3y + 2z = 6$ with $x \ge 0$, $y \ge 0$, and $z \ge 0$, with the outward unit normal $\mathbf{n}$ pointing in the positive $z$ direction.
8. Use a surface integral to show that the surface area of a sphere of radius $r$ is $4\pi r^2$. (Hint: Use spherical coordinates to parametrize the sphere.)
9. Use a surface integral to show that the surface area of a right circular cone of radius $R$ and height $h$ is $\pi R \sqrt{h^2 + R^2}$. (Hint: Use the parametrization $x = r\cos\theta$, $y = r\sin\theta$, $z = \frac{h}{R} r$, for $0 \le r \le R$ and $0 \le \theta \le 2\pi$.)
10. The ellipsoid $\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1$ can be parametrized using ellipsoidal coordinates
$$x = a\sin\phi\cos\theta, \quad y = b\sin\phi\sin\theta, \quad z = c\cos\phi, \qquad \text{for } 0 \le \theta \le 2\pi \text{ and } 0 \le \phi \le \pi .$$
Show that the surface area $S$ of the ellipsoid is
$$S = \int_0^{\pi}\!\!\int_0^{2\pi} \sin\phi \sqrt{a^2 b^2 \cos^2\phi + c^2 (a^2 \sin^2\theta + b^2 \cos^2\theta) \sin^2\phi}\;d\theta\,d\phi .$$
(Note: The above double integral can not be evaluated by elementary means. For specific values of $a$, $b$ and $c$ it can be evaluated using numerical methods. An alternative is to express the surface area in terms of elliptic integrals.[5])

C

11. Use Definition 4.3 to prove that the surface area $S$ over a region $R$ in $\mathbb{R}^2$ of a surface $z = f(x,y)$ is given by the formula
$$S = \iint_R \sqrt{1 + \left( \frac{\partial f}{\partial x} \right)^2 + \left( \frac{\partial f}{\partial y} \right)^2}\;dA .$$
(Hint: Think of the parametrization of the surface.)

[5] Bowman, F., Introduction to Elliptic Functions, with Applications. New York: Dover, 1961, § III.7.
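The note in Exercise 10 mentions evaluating the ellipsoid integral numerically for specific $a$, $b$ and $c$. One possible way to do that is sketched below in Python with a midpoint rule; the function name `ellipsoid_area` and the grid size `n` are our own assumptions, and the sketch takes the formula from Exercise 10 as given rather than deriving it.

```python
import math

def ellipsoid_area(a, b, c, n=200):
    """Midpoint-rule estimate of the ellipsoid surface area, using the
    double integral stated in Exercise 10 (phi in [0, pi], theta in [0, 2*pi])."""
    dphi = math.pi / n
    dtheta = 2 * math.pi / n
    total = 0.0
    for i in range(n):
        phi = (i + 0.5) * dphi
        s, co = math.sin(phi), math.cos(phi)
        for j in range(n):
            theta = (j + 0.5) * dtheta
            radicand = (a * a * b * b * co * co
                        + c * c * (a * a * math.sin(theta) ** 2
                                   + b * b * math.cos(theta) ** 2) * s * s)
            total += s * math.sqrt(radicand) * dtheta * dphi
    return total

if __name__ == "__main__":
    # With a = b = c = 1 the ellipsoid is the unit sphere, of area 4*pi.
    print(ellipsoid_area(1.0, 1.0, 1.0), 4 * math.pi)
    print(ellipsoid_area(1.0, 2.0, 3.0))
```

Setting $a = b = c = r$ reduces the integrand to $r^2 \sin\phi$, so the sphere case of Exercise 8 gives a convenient check on the code.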

4.5 Stokes' Theorem

So far the only types of line integrals which we have discussed are those along curves in $\mathbb{R}^2$. But the definitions and properties which were covered in Sections 4.1 and 4.2 can easily be extended to include functions of three variables, so that we can now discuss line integrals along curves in $\mathbb{R}^3$.

Definition 4.5. For a real-valued function $f(x,y,z)$ and a curve $C$ in $\mathbb{R}^3$, parametrized by $x = x(t)$, $y = y(t)$, $z = z(t)$, $a \le t \le b$, the line integral of $f(x,y,z)$ along $C$ with respect to arc length $s$ is

$$\int_C f(x,y,z)\,ds = \int_a^b f(x(t), y(t), z(t)) \sqrt{x'(t)^2 + y'(t)^2 + z'(t)^2}\;dt . \tag{4.34}$$

The line integral of $f(x,y,z)$ along $C$ with respect to $x$ is

$$\int_C f(x,y,z)\,dx = \int_a^b f(x(t), y(t), z(t))\,x'(t)\,dt . \tag{4.35}$$

The line integral of $f(x,y,z)$ along $C$ with respect to $y$ is

$$\int_C f(x,y,z)\,dy = \int_a^b f(x(t), y(t), z(t))\,y'(t)\,dt . \tag{4.36}$$

The line integral of $f(x,y,z)$ along $C$ with respect to $z$ is

$$\int_C f(x,y,z)\,dz = \int_a^b f(x(t), y(t), z(t))\,z'(t)\,dt . \tag{4.37}$$

Similar to the two-variable case, if $f(x,y,z) \ge 0$ then the line integral $\int_C f(x,y,z)\,ds$ can be thought of as the total area of the "picket fence" of height $f(x,y,z)$ at each point along the curve $C$ in $\mathbb{R}^3$.

Vector fields in $\mathbb{R}^3$ are defined in a similar fashion to those in $\mathbb{R}^2$, which allows us to define the line integral of a vector field along a curve in $\mathbb{R}^3$.

Definition 4.6. For a vector field $\mathbf{f}(x,y,z) = P(x,y,z)\,\mathbf{i} + Q(x,y,z)\,\mathbf{j} + R(x,y,z)\,\mathbf{k}$ and a curve $C$ in $\mathbb{R}^3$ with a smooth parametrization $x = x(t)$, $y = y(t)$, $z = z(t)$, $a \le t \le b$, the line integral of $\mathbf{f}$ along $C$ is

$$\begin{aligned}
\int_C \mathbf{f} \cdot d\mathbf{r} &= \int_C P(x,y,z)\,dx + \int_C Q(x,y,z)\,dy + \int_C R(x,y,z)\,dz \qquad &(4.38) \\
&= \int_a^b \mathbf{f}(x(t), y(t), z(t)) \cdot \mathbf{r}'(t)\,dt , &(4.39)
\end{aligned}$$

where $\mathbf{r}(t) = x(t)\,\mathbf{i} + y(t)\,\mathbf{j} + z(t)\,\mathbf{k}$ is the position vector for points on $C$.
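Formulas (4.34) and (4.39) translate directly into a numerical recipe: sample the parameter $t$, evaluate the integrand, and sum. The Python sketch below does this with a midpoint rule. Everything in it is illustrative: the helper names `line_integral_ds` and `line_integral_dr`, the circular helix used as a test curve, and the sample fields are our own choices, not taken from the text.

```python
import math

def line_integral_ds(f, r, dr, a, b, n=1000):
    """Approximate formula (4.34): integral of f along C with respect to arc length.
    r(t) returns (x, y, z); dr(t) returns (x'(t), y'(t), z'(t))."""
    h = (b - a) / n
    total = 0.0
    for k in range(n):
        t = a + (k + 0.5) * h
        x, y, z = r(t)
        dx, dy, dz = dr(t)
        total += f(x, y, z) * math.sqrt(dx * dx + dy * dy + dz * dz) * h
    return total

def line_integral_dr(F, r, dr, a, b, n=1000):
    """Approximate formula (4.39): integral of F . r'(t) dt, with F returning (P, Q, R)."""
    h = (b - a) / n
    total = 0.0
    for k in range(n):
        t = a + (k + 0.5) * h
        x, y, z = r(t)
        dx, dy, dz = dr(t)
        P, Q, R = F(x, y, z)
        total += (P * dx + Q * dy + R * dz) * h
    return total

# Hypothetical test curve: one turn of the helix x = cos t, y = sin t, z = t.
def helix(t):
    return (math.cos(t), math.sin(t), t)

def helix_prime(t):
    return (-math.sin(t), math.cos(t), 1.0)

if __name__ == "__main__":
    # Arc length of one turn (f = 1); the exact value is 2*pi*sqrt(2).
    print(line_integral_ds(lambda x, y, z: 1.0, helix, helix_prime, 0.0, 2 * math.pi))
    # Line integral of F = (-y, x, z); here F . r' = 1 + t, so the exact value is 2*pi + 2*pi**2.
    print(line_integral_dr(lambda x, y, z: (-y, x, z), helix, helix_prime, 0.0, 2 * math.pi))
```

Passing the derivative `dr` explicitly keeps the sketch free of finite-difference error; a caller without a closed form for $\mathbf{r}'(t)$ would need to approximate it.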

Similar to the two-variable case, if $\mathbf{f}(x,y,z)$ represents the force applied to an object at a point $(x,y,z)$ then the line integral $\int_C \mathbf{f} \cdot d\mathbf{r}$ represents the work done by that force in moving the object along the curve $C$ in $\mathbb{R}^3$.

Some of the most important results we will need for line integrals in $\mathbb{R}^3$ are stated below without proof (the proofs are similar to their two-variable equivalents).

Theorem 4.10. For a vector field $\mathbf{f}(x,y,z) = P(x,y,z)\,\mathbf{i} + Q(x,y,z)\,\mathbf{j} + R(x,y,z)\,\mathbf{k}$ and a curve $C$ with a smooth parametrization $x = x(t)$, $y = y(t)$, $z = z(t)$, $a \le t \le b$ and position vector $\mathbf{r}(t) = x(t)\,\mathbf{i} + y(t)\,\mathbf{j} + z(t)\,\mathbf{k}$,

$$\int_C \mathbf{f} \cdot d\mathbf{r} = \int_C \mathbf{f} \cdot \mathbf{T}\,ds , \tag{4.40}$$

where $\mathbf{T}(t) = \dfrac{\mathbf{r}'(t)}{\|\mathbf{r}'(t)\|}$ is the unit tangent vector to $C$ at $(x(t), y(t), z(t))$.

Theorem 4.11. (Chain Rule) If $w = f(x,y,z)$ is a continuously differentiable function of $x$, $y$, and $z$, and $x = x(t)$, $y = y(t)$ and $z = z(t)$ are differentiable functions of $t$, then $w$ is a differentiable function of $t$, and

$$\frac{dw}{dt} = \frac{\partial w}{\partial x}\frac{dx}{dt} + \frac{\partial w}{\partial y}\frac{dy}{dt} + \frac{\partial w}{\partial z}\frac{dz}{dt} . \tag{4.41}$$

Also, if $x = x(t_1, t_2)$, $y = y(t_1, t_2)$ and $z = z(t_1, t_2)$ are continuously differentiable functions of $(t_1, t_2)$, then[6]

$$\frac{\partial w}{\partial t_1} = \frac{\partial w}{\partial x}\frac{\partial x}{\partial t_1} + \frac{\partial w}{\partial y}\frac{\partial y}{\partial t_1} + \frac{\partial w}{\partial z}\frac{\partial z}{\partial t_1} \tag{4.42}$$

and

$$\frac{\partial w}{\partial t_2} = \frac{\partial w}{\partial x}\frac{\partial x}{\partial t_2} + \frac{\partial w}{\partial y}\frac{\partial y}{\partial t_2} + \frac{\partial w}{\partial z}\frac{\partial z}{\partial t_2} . \tag{4.43}$$

Theorem 4.12. Let $\mathbf{f}(x,y,z) = P(x,y,z)\,\mathbf{i} + Q(x,y,z)\,\mathbf{j} + R(x,y,z)\,\mathbf{k}$ be a vector field in some solid $S$, with $P$, $Q$ and $R$ continuously differentiable functions on $S$. Let $C$ be a smooth curve in $S$ parametrized by $x = x(t)$, $y = y(t)$, $z = z(t)$, $a \le t \le b$. Suppose that there is a real-valued function $F(x,y,z)$ such that $\nabla F = \mathbf{f}$ on $S$. Then

$$\int_C \mathbf{f} \cdot d\mathbf{r} = F(B) - F(A) , \tag{4.44}$$

where $A = (x(a), y(a), z(a))$ and $B = (x(b), y(b), z(b))$ are the endpoints of $C$.

Corollary 4.13. If a vector field $\mathbf{f}$ has a potential in a solid $S$, then $\oint_C \mathbf{f} \cdot d\mathbf{r} = 0$ for any closed curve $C$ in $S$ (i.e. $\oint_C \nabla F \cdot d\mathbf{r} = 0$ for any real-valued function $F(x,y,z)$).

[6] See Taylor and Mann, § 6.5 for a proof.

Example 4.12. Let $f(x,y,z) = z$ and let $C$ be the curve in $\mathbb{R}^3$ parametrized by

$$x = t\sin t, \quad y = t\cos t, \quad z = t, \qquad 0 \le t \le 8\pi .$$

Evaluate $\int_C f(x,y,z)\,ds$. (Note: $C$ is called a conical helix. See Figure 4.5.1.)

Solution: Since $x'(t) = \sin t + t\cos t$, $y'(t) = \cos t - t\sin t$, and $z'(t) = 1$, we have

$$\begin{aligned}
x'(t)^2 + y'(t)^2 + z'(t)^2 &= (\sin^2 t + 2t\sin t\cos t + t^2\cos^2 t) + (\cos^2 t - 2t\sin t\cos t + t^2\sin^2 t) + 1 \\
&= t^2(\sin^2 t + \cos^2 t) + \sin^2 t + \cos^2 t + 1 \\
&= t^2 + 2 ,
\end{aligned}$$

so since $f(x(t), y(t), z(t)) = z(t) = t$ along the curve $C$, then

$$\begin{aligned}
\int_C f(x,y,z)\,ds &= \int_0^{8\pi} f(x(t), y(t), z(t)) \sqrt{x'(t)^2 + y'(t)^2 + z'(t)^2}\;dt \\
&= \int_0^{8\pi} t\sqrt{t^2 + 2}\;dt \\
&= \frac{1}{3}(t^2 + 2)^{3/2} \,\Bigg|_0^{8\pi} = \frac{1}{3}\left( (64\pi^2 + 2)^{3/2} - 2\sqrt{2} \right) .
\end{aligned}$$

[Figure 4.5.1: Conical helix $C$, from $t = 0$ to $t = 8\pi$]

Example 4.13. Let $\mathbf{f}(x,y,z) = x\,\mathbf{i} + y\,\mathbf{j} + 2z\,\mathbf{k}$ be a vector field in $\mathbb{R}^3$. Using the same curve $C$ from Example 4.12, evaluate $\int_C \mathbf{f} \cdot d\mathbf{r}$.

Solution: It is easy to see that $F(x,y,z) = \frac{x^2}{2} + \frac{y^2}{2} + z^2$ is a potential for $\mathbf{f}(x,y,z)$ (i.e. $\nabla F = \mathbf{f}$).

So by Theorem 4.12 we know that

$$\begin{aligned}
\int_C \mathbf{f} \cdot d\mathbf{r} &= F(B) - F(A) , \quad \text{where } A = (x(0), y(0), z(0)) \text{ and } B = (x(8\pi), y(8\pi), z(8\pi)), \text{ so} \\
&= F(8\pi\sin 8\pi,\ 8\pi\cos 8\pi,\ 8\pi) - F(0\sin 0,\ 0\cos 0,\ 0) \\
&= F(0, 8\pi, 8\pi) - F(0, 0, 0) \\
&= 0 + \frac{(8\pi)^2}{2} + (8\pi)^2 - (0 + 0 + 0) = 96\pi^2 .
\end{aligned}$$

We will now discuss a generalization of Green's Theorem in $\mathbb{R}^2$ to orientable surfaces in $\mathbb{R}^3$, called Stokes' Theorem. A surface $\Sigma$ in $\mathbb{R}^3$ is orientable if there is a continuous vector field $\mathbf{N}$ in $\mathbb{R}^3$ such that $\mathbf{N}$ is nonzero and normal to $\Sigma$ (i.e. perpendicular to the tangent plane) at each point of $\Sigma$. We say that such an $\mathbf{N}$ is a normal vector field.

For example, the unit sphere $x^2 + y^2 + z^2 = 1$ is orientable, since the continuous vector field $\mathbf{N}(x,y,z) = x\,\mathbf{i} + y\,\mathbf{j} + z\,\mathbf{k}$ is nonzero and normal to the sphere at each point. In fact, $-\mathbf{N}(x,y,z)$ is another normal vector field (see Figure 4.5.2). We see in this case that $\mathbf{N}(x,y,z)$ is what we have called an outward normal vector, and $-\mathbf{N}(x,y,z)$ is an inward normal vector. These "outward" and "inward" normal vector fields on the sphere correspond to an "outer" and "inner" side, respectively, of the sphere. That is, we say that the sphere is a two-sided surface. Roughly, "two-sided" means "orientable". Other examples of two-sided, and hence orientable, surfaces are cylinders, paraboloids, ellipsoids, and planes.

[Figure 4.5.2: the normal vector fields $\mathbf{N}$ and $-\mathbf{N}$ on the unit sphere]

You may be wondering what kind of surface would not have two sides. An example is the Möbius strip, which is constructed by taking a thin rectangle and connecting its ends at the opposite corners, resulting in a "twisted" strip (see Figure 4.5.3).

[Figure 4.5.3: Möbius strip. (a) Connect A to A and B to B along the ends; (b) Not orientable]

If you imagine walking along a line down the center of the Möbius strip, as in Figure 4.5.3(b), then you arrive back at the same place from which you started but upside down! That is, your orientation changed even though your motion was continuous along that center line. Informally, thinking of your vertical direction as a normal vector field along the strip, there is a discontinuity at your starting point (and, in fact, at every point) since your vertical direction takes two different values there. The Möbius strip has only one side, and hence is nonorientable.[7]

For an orientable surface $\Sigma$ which has a boundary curve $C$, pick a unit normal vector $\mathbf{n}$ such that if you walked along $C$ with your head pointing in the direction of $\mathbf{n}$, then the surface would be on your left. We say in this situation that $\mathbf{n}$ is a positive unit normal vector and that $C$ is traversed $\mathbf{n}$-positively. We can now state Stokes' Theorem:

Theorem 4.14. (Stokes' Theorem) Let $\Sigma$ be an orientable surface in $\mathbb{R}^3$ whose boundary is a simple closed curve $C$, and let $\mathbf{f}(x,y,z) = P(x,y,z)\,\mathbf{i} + Q(x,y,z)\,\mathbf{j} + R(x,y,z)\,\mathbf{k}$ be a smooth vector field defined on some subset of $\mathbb{R}^3$ that contains $\Sigma$. Then

$$\oint_C \mathbf{f} \cdot d\mathbf{r} = \iint_\Sigma (\operatorname{curl} \mathbf{f}) \cdot \mathbf{n}\,d\sigma , \tag{4.45}$$

where

$$\operatorname{curl} \mathbf{f} = \left( \frac{\partial R}{\partial y} - \frac{\partial Q}{\partial z} \right)\mathbf{i} + \left( \frac{\partial P}{\partial z} - \frac{\partial R}{\partial x} \right)\mathbf{j} + \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right)\mathbf{k} , \tag{4.46}$$

$\mathbf{n}$ is a positive unit normal vector over $\Sigma$, and $C$ is traversed $\mathbf{n}$-positively.

Proof: As the general case is beyond the scope of this text, we will prove the theorem only for the special case where $\Sigma$ is the graph of $z = z(x,y)$ for some smooth real-valued function $z(x,y)$, with $(x,y)$ varying over a region $D$ in $\mathbb{R}^2$.

[Figure 4.5.4: the surface $\Sigma : z = z(x,y)$ with normal $\mathbf{n}$ and boundary curve $C$, and its projection $D$ with boundary curve $C_D$ in the $xy$-plane]

Projecting $\Sigma$ onto the $xy$-plane, we see that the closed curve $C$ (the boundary curve of $\Sigma$) projects onto a closed curve $C_D$ which is the boundary curve of $D$ (see Figure 4.5.4). Assuming that $C$ has a smooth parametrization, its projection $C_D$ in the $xy$-plane also has a smooth parametrization, say

$$C_D : x = x(t), \quad y = y(t), \qquad a \le t \le b ,$$

and so $C$ can be parametrized (in $\mathbb{R}^3$) as

$$C : x = x(t), \quad y = y(t), \quad z = z(x(t), y(t)), \qquad a \le t \le b ,$$

since the curve $C$ is part of the surface $z = z(x,y)$. Now, by the Chain Rule (Theorem 4.4 in Section 4.2), for $z = z(x(t), y(t))$ as a function of $t$, we know that

$$z'(t) = \frac{\partial z}{\partial x}\,x'(t) + \frac{\partial z}{\partial y}\,y'(t) ,$$

[7] For further discussion of orientability, see O'Neill, § IV.7.

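and so

$$\begin{aligned}
\oint_C \mathbf{f} \cdot d\mathbf{r} &= \int_C P(x,y,z)\,dx + Q(x,y,z)\,dy + R(x,y,z)\,dz \\
&= \int_a^b \left( P\,x'(t) + Q\,y'(t) + R \left( \frac{\partial z}{\partial x}\,x'(t) + \frac{\partial z}{\partial y}\,y'(t) \right) \right) dt \\
&= \int_a^b \left( \left( P + R\,\frac{\partial z}{\partial x} \right) x'(t) + \left( Q + R\,\frac{\partial z}{\partial y} \right) y'(t) \right) dt \\
&= \int_{C_D} \tilde{P}(x,y)\,dx + \tilde{Q}(x,y)\,dy ,
\end{aligned}$$

where

$$\tilde{P}(x,y) = P(x,y,z(x,y)) + R(x,y,z(x,y))\,\frac{\partial z}{\partial x}(x,y) , \quad \text{and}$$
$$\tilde{Q}(x,y) = Q(x,y,z(x,y)) + R(x,y,z(x,y))\,\frac{\partial z}{\partial y}(x,y)$$

for $(x,y)$ in $D$. Thus, by Green's Theorem applied to the region $D$, we have

$$\oint_C \mathbf{f} \cdot d\mathbf{r} = \iint_D \left( \frac{\partial \tilde{Q}}{\partial x} - \frac{\partial \tilde{P}}{\partial y} \right) dA . \tag{4.47}$$

Thus,

$$\begin{aligned}
\frac{\partial \tilde{Q}}{\partial x} &= \frac{\partial}{\partial x} \left( Q(x,y,z(x,y)) + R(x,y,z(x,y))\,\frac{\partial z}{\partial y}(x,y) \right) , \quad \text{so by the Product Rule we get} \\
&= \frac{\partial}{\partial x} \bigl( Q(x,y,z(x,y)) \bigr) + \left( \frac{\partial}{\partial x} R(x,y,z(x,y)) \right) \frac{\partial z}{\partial y}(x,y) + R(x,y,z(x,y))\,\frac{\partial}{\partial x} \left( \frac{\partial z}{\partial y}(x,y) \right) .
\end{aligned}$$

Now, by formula (4.42) in Theorem 4.11, we have

$$\begin{aligned}
\frac{\partial}{\partial x} \bigl( Q(x,y,z(x,y)) \bigr) &= \frac{\partial Q}{\partial x}\frac{\partial x}{\partial x} + \frac{\partial Q}{\partial y}\frac{\partial y}{\partial x} + \frac{\partial Q}{\partial z}\frac{\partial z}{\partial x} \\
&= \frac{\partial Q}{\partial x} \cdot 1 + \frac{\partial Q}{\partial y} \cdot 0 + \frac{\partial Q}{\partial z}\frac{\partial z}{\partial x} \\
&= \frac{\partial Q}{\partial x} + \frac{\partial Q}{\partial z}\frac{\partial z}{\partial x} .
\end{aligned}$$

Similarly,

$$\frac{\partial}{\partial x} \bigl( R(x,y,z(x,y)) \bigr) = \frac{\partial R}{\partial x} + \frac{\partial R}{\partial z}\frac{\partial z}{\partial x} .$$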

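Thus,

$$\begin{aligned}
\frac{\partial \tilde{Q}}{\partial x} &= \frac{\partial Q}{\partial x} + \frac{\partial Q}{\partial z}\frac{\partial z}{\partial x} + \left( \frac{\partial R}{\partial x} + \frac{\partial R}{\partial z}\frac{\partial z}{\partial x} \right) \frac{\partial z}{\partial y} + R(x,y,z(x,y))\,\frac{\partial^2 z}{\partial x\,\partial y} \\
&= \frac{\partial Q}{\partial x} + \frac{\partial Q}{\partial z}\frac{\partial z}{\partial x} + \frac{\partial R}{\partial x}\frac{\partial z}{\partial y} + \frac{\partial R}{\partial z}\frac{\partial z}{\partial x}\frac{\partial z}{\partial y} + R\,\frac{\partial^2 z}{\partial x\,\partial y} .
\end{aligned}$$

In a similar fashion, we can calculate

$$\frac{\partial \tilde{P}}{\partial y} = \frac{\partial P}{\partial y} + \frac{\partial P}{\partial z}\frac{\partial z}{\partial y} + \frac{\partial R}{\partial y}\frac{\partial z}{\partial x} + \frac{\partial R}{\partial z}\frac{\partial z}{\partial y}\frac{\partial z}{\partial x} + R\,\frac{\partial^2 z}{\partial y\,\partial x} .$$

So subtracting gives

$$\frac{\partial \tilde{Q}}{\partial x} - \frac{\partial \tilde{P}}{\partial y} = \left( \frac{\partial Q}{\partial z} - \frac{\partial R}{\partial y} \right) \frac{\partial z}{\partial x} + \left( \frac{\partial R}{\partial x} - \frac{\partial P}{\partial z} \right) \frac{\partial z}{\partial y} + \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \tag{4.48}$$

since $\frac{\partial^2 z}{\partial x\,\partial y} = \frac{\partial^2 z}{\partial y\,\partial x}$ by the smoothness of $z = z(x,y)$. Hence, by equation (4.47),

$$\oint_C \mathbf{f} \cdot d\mathbf{r} = \iint_D \left( -\left( \frac{\partial R}{\partial y} - \frac{\partial Q}{\partial z} \right) \frac{\partial z}{\partial x} - \left( \frac{\partial P}{\partial z} - \frac{\partial R}{\partial x} \right) \frac{\partial z}{\partial y} + \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) \right) dA \tag{4.49}$$

after factoring out a $-1$ from the terms in the first two products in equation (4.48).

Now, recall from Section 2.3 that the vector $\mathbf{N} = -\frac{\partial z}{\partial x}\,\mathbf{i} - \frac{\partial z}{\partial y}\,\mathbf{j} + \mathbf{k}$ is normal to the tangent plane to the surface $z = z(x,y)$ at each point of $\Sigma$. Thus,

$$\mathbf{n} = \frac{\mathbf{N}}{\|\mathbf{N}\|} = \frac{-\frac{\partial z}{\partial x}\,\mathbf{i} - \frac{\partial z}{\partial y}\,\mathbf{j} + \mathbf{k}}{\sqrt{1 + \left( \frac{\partial z}{\partial x} \right)^2 + \left( \frac{\partial z}{\partial y} \right)^2}}$$

is in fact a positive unit normal vector to $\Sigma$ (see Figure 4.5.4). Hence, using the parametrization $\mathbf{r}(x,y) = x\,\mathbf{i} + y\,\mathbf{j} + z(x,y)\,\mathbf{k}$, for $(x,y)$ in $D$, of the surface $\Sigma$, we have $\frac{\partial \mathbf{r}}{\partial x} = \mathbf{i} + \frac{\partial z}{\partial x}\,\mathbf{k}$ and $\frac{\partial \mathbf{r}}{\partial y} = \mathbf{j} + \frac{\partial z}{\partial y}\,\mathbf{k}$, and so

$$\left\| \frac{\partial \mathbf{r}}{\partial x} \times \frac{\partial \mathbf{r}}{\partial y} \right\| = \sqrt{1 + \left( \frac{\partial z}{\partial x} \right)^2 + \left( \frac{\partial z}{\partial y} \right)^2} .$$

So we see that using formula (4.46) for $\operatorname{curl} \mathbf{f}$, we have

$$\begin{aligned}
\iint_\Sigma (\operatorname{curl} \mathbf{f}) \cdot \mathbf{n}\,d\sigma &= \iint_D (\operatorname{curl} \mathbf{f}) \cdot \mathbf{n} \left\| \frac{\partial \mathbf{r}}{\partial x} \times \frac{\partial \mathbf{r}}{\partial y} \right\| dA \\
&= \iint_D \left( \left( \frac{\partial R}{\partial y} - \frac{\partial Q}{\partial z} \right)\mathbf{i} + \left( \frac{\partial P}{\partial z} - \frac{\partial R}{\partial x} \right)\mathbf{j} + \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right)\mathbf{k} \right) \cdot \left( -\frac{\partial z}{\partial x}\,\mathbf{i} - \frac{\partial z}{\partial y}\,\mathbf{j} + \mathbf{k} \right) dA \\
&= \iint_D \left( -\left( \frac{\partial R}{\partial y} - \frac{\partial Q}{\partial z} \right) \frac{\partial z}{\partial x} - \left( \frac{\partial P}{\partial z} - \frac{\partial R}{\partial x} \right) \frac{\partial z}{\partial y} + \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) \right) dA ,
\end{aligned}$$

which, upon comparing to equation (4.49), proves the Theorem. QED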

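Note: The condition in Stokes' Theorem that the surface $\Sigma$ have a (continuously varying) positive unit normal vector $\mathbf{n}$ and a boundary curve $C$ traversed $\mathbf{n}$-positively can be expressed more precisely as follows: if $\mathbf{r}(t)$ is the position vector for $C$ and $\mathbf{T}(t) = \mathbf{r}'(t)/\|\mathbf{r}'(t)\|$ is the unit tangent vector to $C$, then the vectors $\mathbf{T}$, $\mathbf{n}$, $\mathbf{T} \times \mathbf{n}$ form a right-handed system.

Also, it should be noted that Stokes' Theorem holds even when the boundary curve $C$ is piecewise smooth.

Example 4.14. Verify Stokes' Theorem for $\mathbf{f}(x,y,z) = z\,\mathbf{i} + x\,\mathbf{j} + y\,\mathbf{k}$ when $\Sigma$ is the paraboloid $z = x^2 + y^2$ such that $z \le 1$ (see Figure 4.5.5).

[Figure 4.5.5: the paraboloid $z = x^2 + y^2$, with normal $\mathbf{n}$ and boundary curve $C$]

Solution: The positive unit normal vector to the surface $z = z(x,y) = x^2 + y^2$ is

$$\mathbf{n} = \frac{-\frac{\partial z}{\partial x}\,\mathbf{i} - \frac{\partial z}{\partial y}\,\mathbf{j} + \mathbf{k}}{\sqrt{1 + \left( \frac{\partial z}{\partial x} \right)^2 + \left( \frac{\partial z}{\partial y} \right)^2}} = \frac{-2x\,\mathbf{i} - 2y\,\mathbf{j} + \mathbf{k}}{\sqrt{1 + 4x^2 + 4y^2}} ,$$

and $\operatorname{curl} \mathbf{f} = (1 - 0)\,\mathbf{i} + (1 - 0)\,\mathbf{j} + (1 - 0)\,\mathbf{k} = \mathbf{i} + \mathbf{j} + \mathbf{k}$, so

$$(\operatorname{curl} \mathbf{f}) \cdot \mathbf{n} = (-2x - 2y + 1)/\sqrt{1 + 4x^2 + 4y^2} .$$

Since $\Sigma$ can be parametrized as $\mathbf{r}(x,y) = x\,\mathbf{i} + y\,\mathbf{j} + (x^2 + y^2)\,\mathbf{k}$ for $(x,y)$ in the region $D = \{ (x,y) : x^2 + y^2 \le 1 \}$, then

$$\begin{aligned}
\iint_\Sigma (\operatorname{curl} \mathbf{f}) \cdot \mathbf{n}\,d\sigma &= \iint_D (\operatorname{curl} \mathbf{f}) \cdot \mathbf{n} \left\| \frac{\partial \mathbf{r}}{\partial x} \times \frac{\partial \mathbf{r}}{\partial y} \right\| dA \\
&= \iint_D \frac{-2x - 2y + 1}{\sqrt{1 + 4x^2 + 4y^2}} \sqrt{1 + 4x^2 + 4y^2}\;dA \\
&= \iint_D (-2x - 2y + 1)\,dA , \quad \text{so switching to polar coordinates gives} \\
&= \int_0^{2\pi}\!\!\int_0^1 (-2r\cos\theta - 2r\sin\theta + 1)\,r\,dr\,d\theta \\
&= \int_0^{2\pi}\!\!\int_0^1 (-2r^2\cos\theta - 2r^2\sin\theta + r)\,dr\,d\theta \\
&= \int_0^{2\pi} \left( -\frac{2r^3}{3}\cos\theta - \frac{2r^3}{3}\sin\theta + \frac{r^2}{2} \,\Bigg|_{r=0}^{r=1} \right) d\theta \\
&= \int_0^{2\pi} \left( -\frac{2}{3}\cos\theta - \frac{2}{3}\sin\theta + \frac{1}{2} \right) d\theta \\
&= -\frac{2}{3}\sin\theta + \frac{2}{3}\cos\theta + \frac{1}{2}\theta \,\Bigg|_0^{2\pi} = \pi .
\end{aligned}$$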

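The boundary curve $C$ is the unit circle $x^2 + y^2 = 1$ lying in the plane $z = 1$ (see Figure 4.5.5), which can be parametrized as $x = \cos t$, $y = \sin t$, $z = 1$ for $0 \le t \le 2\pi$. So

$$\begin{aligned}
\oint_C \mathbf{f} \cdot d\mathbf{r} &= \int_0^{2\pi} \bigl( (1)(-\sin t) + (\cos t)(\cos t) + (\sin t)(0) \bigr)\,dt \\
&= \int_0^{2\pi} \left( -\sin t + \frac{1 + \cos 2t}{2} \right) dt \quad \left( \text{here we used } \cos^2 t = \frac{1 + \cos 2t}{2} \right) \\
&= \cos t + \frac{t}{2} + \frac{\sin 2t}{4} \,\Bigg|_0^{2\pi} = \pi .
\end{aligned}$$

So we see that $\oint_C \mathbf{f} \cdot d\mathbf{r} = \iint_\Sigma (\operatorname{curl} \mathbf{f}) \cdot \mathbf{n}\,d\sigma$, as predicted by Stokes' Theorem.

The line integral in the preceding example was far simpler to calculate than the surface integral, but this will not always be the case.

Example 4.15. Let $\Sigma$ be the elliptic paraboloid $z = \frac{x^2}{4} + \frac{y^2}{9}$ for $z \le 1$, and let $C$ be its boundary curve. Calculate $\oint_C \mathbf{f} \cdot d\mathbf{r}$ for $\mathbf{f}(x,y,z) = (9xz + 2y)\,\mathbf{i} + (2x + y^2)\,\mathbf{j} + (-2y^2 + 2z)\,\mathbf{k}$, where $C$ is traversed counterclockwise.

Solution: The surface is similar to the one in Example 4.14, except now the boundary curve $C$ is the ellipse $\frac{x^2}{4} + \frac{y^2}{9} = 1$ lying in the plane $z = 1$. In this case, using Stokes' Theorem is easier than computing the line integral directly. As in Example 4.14, at each point $(x, y, z(x,y))$ on the surface $z = z(x,y) = \frac{x^2}{4} + \frac{y^2}{9}$ the vector

$$\mathbf{n} = \frac{-\frac{\partial z}{\partial x}\,\mathbf{i} - \frac{\partial z}{\partial y}\,\mathbf{j} + \mathbf{k}}{\sqrt{1 + \left( \frac{\partial z}{\partial x} \right)^2 + \left( \frac{\partial z}{\partial y} \right)^2}} = \frac{-\frac{x}{2}\,\mathbf{i} - \frac{2y}{9}\,\mathbf{j} + \mathbf{k}}{\sqrt{1 + \frac{x^2}{4} + \frac{4y^2}{81}}}$$

is a positive unit normal vector to $\Sigma$. And calculating the curl of $\mathbf{f}$ gives

$$\operatorname{curl} \mathbf{f} = (-4y - 0)\,\mathbf{i} + (9x - 0)\,\mathbf{j} + (2 - 2)\,\mathbf{k} = -4y\,\mathbf{i} + 9x\,\mathbf{j} + 0\,\mathbf{k} ,$$

so

$$(\operatorname{curl} \mathbf{f}) \cdot \mathbf{n} = \frac{(-4y)\left( -\frac{x}{2} \right) + (9x)\left( -\frac{2y}{9} \right) + (0)(1)}{\sqrt{1 + \frac{x^2}{4} + \frac{4y^2}{81}}} = \frac{2xy - 2xy + 0}{\sqrt{1 + \frac{x^2}{4} + \frac{4y^2}{81}}} = 0 ,$$

and so by Stokes' Theorem

$$\oint_C \mathbf{f} \cdot d\mathbf{r} = \iint_\Sigma (\operatorname{curl} \mathbf{f}) \cdot \mathbf{n}\,d\sigma = \iint_\Sigma 0\,d\sigma = 0 .$$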
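Both sides of Stokes' Theorem in Example 4.14 can also be approximated numerically. The Python sketch below is our own illustration (the names `circulation` and `curl_flux` are hypothetical): the first function sums $\mathbf{f} \cdot \mathbf{r}'(t)$ around the unit circle in the plane $z = 1$, and the second sums the reduced integrand $-2x - 2y + 1$ over the unit disk $D$ in polar coordinates.

```python
import math

def circulation(n=1000):
    """Line integral of f = (z, x, y) around x = cos t, y = sin t, z = 1."""
    h = 2 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        x, y, z = math.cos(t), math.sin(t), 1.0
        dx, dy, dz = -math.sin(t), math.cos(t), 0.0
        P, Q, R = z, x, y  # components of f along the curve
        total += (P * dx + Q * dy + R * dz) * h
    return total

def curl_flux(n=200):
    """Double integral of (curl f) . n ||r_x x r_y|| = -2x - 2y + 1 over the
    unit disk, written in polar coordinates with area element r dr dtheta."""
    dr = 1.0 / n
    dtheta = 2 * math.pi / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * dr
        for j in range(n):
            theta = (j + 0.5) * dtheta
            x, y = r * math.cos(theta), r * math.sin(theta)
            total += (-2 * x - 2 * y + 1) * r * dr * dtheta
    return total

if __name__ == "__main__":
    print(circulation(), curl_flux(), math.pi)
```

All three printed values should agree closely, matching the result $\pi$ obtained by hand.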

In physical applications, for a simple closed curve $C$ the line integral $\oint_C \mathbf{f} \cdot d\mathbf{r}$ is often called the circulation of $\mathbf{f}$ around $C$. For example, if $\mathbf{E}$ represents the electrostatic field due to a point charge, then it turns out[8] that $\operatorname{curl} \mathbf{E} = \mathbf{0}$, which means that the circulation $\oint_C \mathbf{E} \cdot d\mathbf{r} = 0$ by Stokes' Theorem. Vector fields which have zero curl are often called irrotational fields.

In fact, the term curl was created by the 19th century Scottish physicist James Clerk Maxwell in his study of electromagnetism, where it is used extensively. In physics, the curl is interpreted as a measure of circulation density. This is best seen by using another definition of $\operatorname{curl} \mathbf{f}$ which is equivalent[9] to the definition given by formula (4.46). Namely, for a point $(x,y,z)$ in $\mathbb{R}^3$,

$$\mathbf{n} \cdot (\operatorname{curl} \mathbf{f})(x,y,z) = \lim_{S \to 0} \frac{1}{S} \oint_C \mathbf{f} \cdot d\mathbf{r} , \tag{4.50}$$

where $S$ is the surface area of a surface $\Sigma$ containing the point $(x,y,z)$ and with a simple closed boundary curve $C$ and positive unit normal vector $\mathbf{n}$ at $(x,y,z)$. In the limit, think of the curve $C$ shrinking to the point $(x,y,z)$, which causes $\Sigma$, the surface it bounds, to have smaller and smaller surface area. That ratio of circulation to surface area in the limit is what makes the curl a rough measure of circulation density (i.e. circulation per unit area).

[Figure 4.5.6: Curl and rotation]

An idea of how the curl of a vector field is related to rotation is shown in Figure 4.5.6. Suppose we have a vector field $\mathbf{f}(x,y,z)$ which is always parallel to the $xy$-plane at each point $(x,y,z)$ and that the vectors grow larger the further the point $(x,y,z)$ is from the $y$-axis. For example, $\mathbf{f}(x,y,z) = (1 + x^2)\,\mathbf{j}$. Think of the vector field as representing the flow of water, and imagine dropping two wheels with paddles into that water flow, as in Figure 4.5.6. Since the flow is stronger (i.e. the magnitude of $\mathbf{f}$ is larger) as you move away from the $y$-axis, then such a wheel would rotate counterclockwise if it were dropped to the right of the $y$-axis, and it would rotate clockwise if it were dropped to the left of the $y$-axis. In both cases the curl would be nonzero ($\operatorname{curl} \mathbf{f}(x,y,z) = 2x\,\mathbf{k}$ in our example) and would obey the right-hand rule, that is, $\operatorname{curl} \mathbf{f}(x,y,z)$ points in the direction of your thumb as you cup your right hand in the direction of the rotation of the wheel. So the curl points outward (in the positive $z$-direction) if $x > 0$ and points inward (in the negative $z$-direction) if $x < 0$. Notice that if all the vectors had the same direction and the same magnitude, then the wheels would not rotate and hence there would be no curl (which is why such fields are called irrotational, meaning no rotation).

[8] See Ch. 2 in Reitz, Milford and Christy.
[9] See Schey, p. 78-81, for the derivation.
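The paddle-wheel example can be checked numerically by approximating the partial derivatives in formula (4.46) with central differences. The Python sketch below is an illustration under our own naming (`curl` and `flow` are hypothetical helpers, and the step size `h` is an arbitrary choice); for the field $\mathbf{f} = (1 + x^2)\,\mathbf{j}$ it should reproduce $\operatorname{curl} \mathbf{f} = 2x\,\mathbf{k}$ up to rounding.

```python
def curl(F, p, h=1e-5):
    """Central-difference approximation of curl F at the point p = (x, y, z).
    F takes (x, y, z) and returns the components (P, Q, R)."""
    def partial(i, j):
        # Approximates d F_i / d x_j at p
        plus, minus = list(p), list(p)
        plus[j] += h
        minus[j] -= h
        return (F(*plus)[i] - F(*minus)[i]) / (2 * h)
    return (partial(2, 1) - partial(1, 2),   # dR/dy - dQ/dz
            partial(0, 2) - partial(2, 0),   # dP/dz - dR/dx
            partial(1, 0) - partial(0, 1))   # dQ/dx - dP/dy

def flow(x, y, z):
    """The example field f = (1 + x^2) j."""
    return (0.0, 1.0 + x * x, 0.0)

if __name__ == "__main__":
    print(curl(flow, (1.5, 0.0, 0.0)))    # k-component near 3: counterclockwise rotation
    print(curl(flow, (-1.5, 0.0, 0.0)))   # k-component near -3: clockwise rotation
```

The sign change of the $\mathbf{k}$-component across the $y$-axis matches the opposite rotations of the two wheels described above.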

Finally, by Stokes' Theorem, we know that if $C$ is a simple closed curve in some solid region $S$ in $\mathbb{R}^3$ and if $\mathbf{f}(x,y,z)$ is a smooth vector field such that $\operatorname{curl} \mathbf{f} = \mathbf{0}$ in $S$, then

$$\oint_C \mathbf{f} \cdot d\mathbf{r} = \iint_\Sigma (\operatorname{curl} \mathbf{f}) \cdot \mathbf{n}\,d\sigma = \iint_\Sigma \mathbf{0} \cdot \mathbf{n}\,d\sigma = \iint_\Sigma 0\,d\sigma = 0 ,$$

where $\Sigma$ is any orientable surface inside $S$ whose boundary is $C$ (such a surface is sometimes called a capping surface for $C$). So similar to the two-variable case, we have a three-dimensional version of a result from Section 4.3, for solid regions in $\mathbb{R}^3$ which are simply connected (i.e. regions having no holes):

The following statements are equivalent for a simply connected solid region $S$ in $\mathbb{R}^3$:

(a) $\mathbf{f}(x,y,z) = P(x,y,z)\,\mathbf{i} + Q(x,y,z)\,\mathbf{j} + R(x,y,z)\,\mathbf{k}$ has a smooth potential $F(x,y,z)$ in $S$
(b) $\int_C \mathbf{f} \cdot d\mathbf{r}$ is independent of the path for any curve $C$ in $S$
(c) $\oint_C \mathbf{f} \cdot d\mathbf{r} = 0$ for every simple closed curve $C$ in $S$
(d) $\frac{\partial R}{\partial y} = \frac{\partial Q}{\partial z}$, $\frac{\partial P}{\partial z} = \frac{\partial R}{\partial x}$, and $\frac{\partial Q}{\partial x} = \frac{\partial P}{\partial y}$ in $S$ (i.e. $\operatorname{curl} \mathbf{f} = \mathbf{0}$ in $S$)

Part (d) is also a way of saying that the differential form $P\,dx + Q\,dy + R\,dz$ is exact.

Example 4.16. Determine if the vector field $\mathbf{f}(x,y,z) = xyz\,\mathbf{i} + xz\,\mathbf{j} + xy\,\mathbf{k}$ has a potential in $\mathbb{R}^3$.

Solution: Since $\mathbb{R}^3$ is simply connected, we just need to check whether $\operatorname{curl} \mathbf{f} = \mathbf{0}$ throughout $\mathbb{R}^3$, that is,

$$\frac{\partial R}{\partial y} = \frac{\partial Q}{\partial z} , \quad \frac{\partial P}{\partial z} = \frac{\partial R}{\partial x} , \quad \text{and} \quad \frac{\partial Q}{\partial x} = \frac{\partial P}{\partial y}$$

throughout $\mathbb{R}^3$, where $P(x,y,z) = xyz$, $Q(x,y,z) = xz$, and $R(x,y,z) = xy$. But we see that

$$\frac{\partial P}{\partial z} = xy , \quad \frac{\partial R}{\partial x} = y \quad\Rightarrow\quad \frac{\partial P}{\partial z} \ne \frac{\partial R}{\partial x} \text{ for some } (x,y,z) \text{ in } \mathbb{R}^3 .$$

Thus, $\mathbf{f}(x,y,z)$ does not have a potential in $\mathbb{R}^3$.

Exercises

A

For Exercises 1-3, calculate $\int_C f(x,y,z)\,ds$ for the given function $f(x,y,z)$ and curve $C$.

1. $f(x,y,z) = z$; $C : x = \cos t$, $y = \sin t$, $z = t$, $0 \le t \le 2\pi$
2. $f(x,y,z) = \frac{x}{y} + y + 2yz$; $C : x = t^2$, $y = t$, $z = 1$, $1 \le t \le 2$
3. $f(x,y,z) = z^2$; $C : x = t\sin t$, $y = t\cos t$, $z = \frac{2\sqrt{2}}{3} t^{3/2}$, $0 \le t \le 1$

For Exercises 4-9, calculate $\int_C \mathbf{f} \cdot d\mathbf{r}$ for the given vector field $\mathbf{f}(x,y,z)$ and curve $C$.

4. $\mathbf{f}(x,y,z) = \mathbf{i} - \mathbf{j} + \mathbf{k}$; $C : x = 3t$, $y = 2t$, $z = t$, $0 \le t \le 1$
5. $\mathbf{f}(x,y,z) = y\,\mathbf{i} - x\,\mathbf{j} + z\,\mathbf{k}$; $C : x = \cos t$, $y = \sin t$, $z = t$, $0 \le t \le 2\pi$
6. $\mathbf{f}(x,y,z) = x\,\mathbf{i} + y\,\mathbf{j} + z\,\mathbf{k}$; $C : x = \cos t$, $y = \sin t$, $z = 2$, $0 \le t \le 2\pi$
7. $\mathbf{f}(x,y,z) = (y - 2z)\,\mathbf{i} + xy\,\mathbf{j} + (2xz + y)\,\mathbf{k}$; $C : x = t$, $y = 2t$, $z = t^2 - 1$, $0 \le t \le 1$
8. $\mathbf{f}(x,y,z) = yz\,\mathbf{i} + xz\,\mathbf{j} + xy\,\mathbf{k}$; $C$ : the polygonal path from $(0,0,0)$ to $(1,0,0)$ to $(1,2,0)$
9. $\mathbf{f}(x,y,z) = xy\,\mathbf{i} + (z - x)\,\mathbf{j} + 2yz\,\mathbf{k}$; $C$ : the polygonal path from $(0,0,0)$ to $(1,0,0)$ to $(1,2,0)$ to $(1,2,-2)$

For Exercises 10-13, state whether or not the vector field $\mathbf{f}(x,y,z)$ has a potential in $\mathbb{R}^3$ (you do not need to find the potential itself).

10. $\mathbf{f}(x,y,z) = (x + y)\,\mathbf{i} + x\,\mathbf{j} + z^2\,\mathbf{k}$
11. $\mathbf{f}(x,y,z) = y\,\mathbf{i} - x\,\mathbf{j} + z\,\mathbf{k}$
12. $\mathbf{f}(x,y,z) = a\,\mathbf{i} + b\,\mathbf{j} + c\,\mathbf{k}$ ($a$, $b$, $c$ constant)
13. $\mathbf{f}(x,y,z) = xy\,\mathbf{i} - (x - yz^2)\,\mathbf{j} + y^2 z\,\mathbf{k}$

B

For Exercises 14-15, verify Stokes' Theorem for the given vector field $\mathbf{f}(x,y,z)$ and surface $\Sigma$.

14. $\mathbf{f}(x,y,z) = 2y\,\mathbf{i} - x\,\mathbf{j} + z\,\mathbf{k}$; $\Sigma : x^2 + y^2 + z^2 = 1$, $z \ge 0$
15. $\mathbf{f}(x,y,z) = xy\,\mathbf{i} + xz\,\mathbf{j} + yz\,\mathbf{k}$; $\Sigma : z = x^2 + y^2$, $z \le 1$
16. Construct a Möbius strip from a piece of paper, then draw a line down its center (like the dotted line in Figure 4.5.3(b)). Cut the Möbius strip along that center line completely around the strip. How many surfaces does this result in? How would you describe them? Are they orientable?
17. Use Gnuplot (see Appendix C) to plot the Möbius strip parametrized as:
$$\mathbf{r}(u,v) = \cos u \left( 1 + v\cos\tfrac{u}{2} \right)\mathbf{i} + \sin u \left( 1 + v\cos\tfrac{u}{2} \right)\mathbf{j} + v\sin\tfrac{u}{2}\,\mathbf{k} , \qquad 0 \le u \le 2\pi ,\ -\tfrac{1}{2} \le v \le \tfrac{1}{2}$$

C

18. Let $\Sigma$ be a closed surface and $\mathbf{f}(x,y,z)$ a smooth vector field. Show that $\iint_\Sigma (\operatorname{curl} \mathbf{f}) \cdot \mathbf{n}\,d\sigma = 0$. (Hint: Split $\Sigma$ in half.)
19. Show that Green's Theorem is a special case of Stokes' Theorem.
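The nonorientability of the Möbius strip can be seen numerically from the parametrization in Exercise 17. The Python sketch below is our own illustration, not a solution to any exercise (the names `mobius_point` and `mobius_normal` are hypothetical): it computes the normal vector $\frac{\partial \mathbf{r}}{\partial u} \times \frac{\partial \mathbf{r}}{\partial v}$ along the center line $v = 0$ and compares its value at $u = 0$ with its value after one full trip around the strip, at $u = 2\pi$.

```python
import math

def mobius_point(u, v):
    """Point on the Mobius strip parametrized as in Exercise 17."""
    w = 1.0 + v * math.cos(u / 2)
    return (w * math.cos(u), w * math.sin(u), v * math.sin(u / 2))

def mobius_normal(u, v=0.0):
    """Normal vector r_u x r_v of the same parametrization."""
    w = 1.0 + v * math.cos(u / 2)
    dw = -0.5 * v * math.sin(u / 2)  # derivative of w with respect to u
    r_u = (dw * math.cos(u) - w * math.sin(u),
           dw * math.sin(u) + w * math.cos(u),
           0.5 * v * math.cos(u / 2))
    r_v = (math.cos(u / 2) * math.cos(u),
           math.cos(u / 2) * math.sin(u),
           math.sin(u / 2))
    return (r_u[1] * r_v[2] - r_u[2] * r_v[1],
            r_u[2] * r_v[0] - r_u[0] * r_v[2],
            r_u[0] * r_v[1] - r_u[1] * r_v[0])

if __name__ == "__main__":
    # u = 0 and u = 2*pi give the same point on the center line ...
    print(mobius_point(0.0, 0.0), mobius_point(2 * math.pi, 0.0))
    # ... but the continuously transported normal has flipped sign.
    print(mobius_normal(0.0), mobius_normal(2 * math.pi))
```

The two points coincide while the two normals are (up to rounding) negatives of each other, which is the discontinuity described in the discussion of Figure 4.5.3.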

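4.6 Gradient, Divergence, Curl and Laplacian

In this final section we will establish some relationships between the gradient, divergence and curl, and we will also introduce a new quantity called the Laplacian. We will then show how to write these quantities in cylindrical and spherical coordinates.

For a real-valued function $f(x,y,z)$ on $\mathbb{R}^3$, the gradient $\nabla f(x,y,z)$ is a vector-valued function on $\mathbb{R}^3$, that is, its value at a point $(x,y,z)$ is the vector

$$\nabla f(x,y,z) = \left( \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, \frac{\partial f}{\partial z} \right) = \frac{\partial f}{\partial x}\,\mathbf{i} + \frac{\partial f}{\partial y}\,\mathbf{j} + \frac{\partial f}{\partial z}\,\mathbf{k}$$

in $\mathbb{R}^3$, where each of the partial derivatives is evaluated at the point $(x,y,z)$. So in this way, you can think of the symbol $\nabla$ as being "applied" to a real-valued function $f$ to produce a vector $\nabla f$.

It turns out that the divergence and curl can also be expressed in terms of the symbol $\nabla$. This is done by thinking of $\nabla$ as a vector in $\mathbb{R}^3$, namely

$$\nabla = \frac{\partial}{\partial x}\,\mathbf{i} + \frac{\partial}{\partial y}\,\mathbf{j} + \frac{\partial}{\partial z}\,\mathbf{k}. \qquad (4.51)$$

Here, the symbols $\frac{\partial}{\partial x}$, $\frac{\partial}{\partial y}$ and $\frac{\partial}{\partial z}$ are to be thought of as "partial derivative operators" that will get "applied" to a real-valued function, say $f(x,y,z)$, to produce the partial derivatives $\frac{\partial f}{\partial x}$, $\frac{\partial f}{\partial y}$ and $\frac{\partial f}{\partial z}$. For instance, $\frac{\partial}{\partial x}$ "applied" to $f(x,y,z)$ produces $\frac{\partial f}{\partial x}$.

Is $\nabla$ really a vector? Strictly speaking, no, since $\frac{\partial}{\partial x}$, $\frac{\partial}{\partial y}$ and $\frac{\partial}{\partial z}$ are not actual numbers. But it helps to think of $\nabla$ as a vector, especially with the divergence and curl, as we will soon see. The process of "applying" $\frac{\partial}{\partial x}$, $\frac{\partial}{\partial y}$, $\frac{\partial}{\partial z}$ to a real-valued function $f(x,y,z)$ is normally thought of as multiplying the quantities:

$$\left(\frac{\partial}{\partial x}\right)(f) = \frac{\partial f}{\partial x}, \qquad \left(\frac{\partial}{\partial y}\right)(f) = \frac{\partial f}{\partial y}, \qquad \left(\frac{\partial}{\partial z}\right)(f) = \frac{\partial f}{\partial z}$$

For this reason, $\nabla$ is often referred to as the "del operator", since it "operates" on functions.

For example, it is often convenient to write the divergence $\operatorname{div} \mathbf{f}$ as $\nabla \cdot \mathbf{f}$, since for a vector field $\mathbf{f}(x,y,z) = f_1(x,y,z)\,\mathbf{i} + f_2(x,y,z)\,\mathbf{j} + f_3(x,y,z)\,\mathbf{k}$, the dot product of $\mathbf{f}$ with $\nabla$ (thought of as a vector) makes sense:

$$\begin{aligned}
\nabla \cdot \mathbf{f} &= \left( \frac{\partial}{\partial x}\,\mathbf{i} + \frac{\partial}{\partial y}\,\mathbf{j} + \frac{\partial}{\partial z}\,\mathbf{k} \right) \cdot \left( f_1(x,y,z)\,\mathbf{i} + f_2(x,y,z)\,\mathbf{j} + f_3(x,y,z)\,\mathbf{k} \right) \\
&= \left(\frac{\partial}{\partial x}\right)(f_1) + \left(\frac{\partial}{\partial y}\right)(f_2) + \left(\frac{\partial}{\partial z}\right)(f_3) \\
&= \frac{\partial f_1}{\partial x} + \frac{\partial f_2}{\partial y} + \frac{\partial f_3}{\partial z} \\
&= \operatorname{div} \mathbf{f}
\end{aligned}$$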
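As a small illustration of the del-operator notation for the divergence, here is a worked computation for a hypothetical field $\mathbf{f}(x,y,z) = x^2\,\mathbf{i} + xy\,\mathbf{j} + z\,\mathbf{k}$. The field is an assumption chosen only for this sketch; it does not appear in the text.

```latex
\begin{aligned}
\nabla \cdot \mathbf{f}
  &= \frac{\partial}{\partial x}\left(x^2\right)
   + \frac{\partial}{\partial y}\left(xy\right)
   + \frac{\partial}{\partial z}\left(z\right) \\
  &= 2x + x + 1 \\
  &= 3x + 1
\end{aligned}
```

Each component is differentiated only with respect to its own variable, exactly as the dot product with $\nabla$ suggests.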

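We can also write $\operatorname{curl} \mathbf{f}$ in terms of $\nabla$, namely as $\nabla \times \mathbf{f}$, since for a vector field $\mathbf{f}(x,y,z) = P(x,y,z)\,\mathbf{i} + Q(x,y,z)\,\mathbf{j} + R(x,y,z)\,\mathbf{k}$, we have:

$$\begin{aligned}
\nabla \times \mathbf{f} &= \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\[2pt] \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\[6pt] P(x,y,z) & Q(x,y,z) & R(x,y,z) \end{vmatrix} \\
&= \left( \frac{\partial R}{\partial y} - \frac{\partial Q}{\partial z} \right)\mathbf{i} - \left( \frac{\partial R}{\partial x} - \frac{\partial P}{\partial z} \right)\mathbf{j} + \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right)\mathbf{k} \\
&= \left( \frac{\partial R}{\partial y} - \frac{\partial Q}{\partial z} \right)\mathbf{i} + \left( \frac{\partial P}{\partial z} - \frac{\partial R}{\partial x} \right)\mathbf{j} + \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right)\mathbf{k} \\
&= \operatorname{curl} \mathbf{f}
\end{aligned}$$

For a real-valued function $f(x,y,z)$, the gradient $\nabla f(x,y,z) = \frac{\partial f}{\partial x}\,\mathbf{i} + \frac{\partial f}{\partial y}\,\mathbf{j} + \frac{\partial f}{\partial z}\,\mathbf{k}$ is a vector field, so we can take its divergence:

$$\begin{aligned}
\operatorname{div} \nabla f = \nabla \cdot \nabla f &= \left( \frac{\partial}{\partial x}\,\mathbf{i} + \frac{\partial}{\partial y}\,\mathbf{j} + \frac{\partial}{\partial z}\,\mathbf{k} \right) \cdot \left( \frac{\partial f}{\partial x}\,\mathbf{i} + \frac{\partial f}{\partial y}\,\mathbf{j} + \frac{\partial f}{\partial z}\,\mathbf{k} \right) \\
&= \frac{\partial}{\partial x}\left( \frac{\partial f}{\partial x} \right) + \frac{\partial}{\partial y}\left( \frac{\partial f}{\partial y} \right) + \frac{\partial}{\partial z}\left( \frac{\partial f}{\partial z} \right) \\
&= \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} + \frac{\partial^2 f}{\partial z^2}
\end{aligned}$$

Note that this is a real-valued function, to which we will give a special name:

Definition 4.7. For a real-valued function $f(x,y,z)$, the Laplacian of $f$, denoted by $\Delta f$, is given by

$$\Delta f(x,y,z) = \nabla \cdot \nabla f = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} + \frac{\partial^2 f}{\partial z^2}. \qquad (4.52)$$

Often the notation $\nabla^2 f$ is used for the Laplacian instead of $\Delta f$, using the convention $\nabla^2 = \nabla \cdot \nabla$.

Example 4.17. Let $\mathbf{r}(x,y,z) = x\,\mathbf{i} + y\,\mathbf{j} + z\,\mathbf{k}$ be the position vector field on $\mathbb{R}^3$. Then $\|\mathbf{r}(x,y,z)\|^2 = \mathbf{r} \cdot \mathbf{r} = x^2 + y^2 + z^2$ is a real-valued function. Find
(a) the gradient of $\|\mathbf{r}\|^2$
(b) the divergence of $\mathbf{r}$
(c) the curl of $\mathbf{r}$
(d) the Laplacian of $\|\mathbf{r}\|^2$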

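Solution:
(a) $\nabla \|\mathbf{r}\|^2 = 2x\,\mathbf{i} + 2y\,\mathbf{j} + 2z\,\mathbf{k} = 2\,\mathbf{r}$

(b) $\nabla \cdot \mathbf{r} = \frac{\partial}{\partial x}(x) + \frac{\partial}{\partial y}(y) + \frac{\partial}{\partial z}(z) = 1 + 1 + 1 = 3$

(c) $$\nabla \times \mathbf{r} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\[2pt] \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\[6pt] x & y & z \end{vmatrix} = (0 - 0)\,\mathbf{i} - (0 - 0)\,\mathbf{j} + (0 - 0)\,\mathbf{k} = \mathbf{0}$$

(d) $\Delta \|\mathbf{r}\|^2 = \frac{\partial^2}{\partial x^2}(x^2 + y^2 + z^2) + \frac{\partial^2}{\partial y^2}(x^2 + y^2 + z^2) + \frac{\partial^2}{\partial z^2}(x^2 + y^2 + z^2) = 2 + 2 + 2 = 6$

Note that we could have calculated $\Delta \|\mathbf{r}\|^2$ another way, using the $\nabla$ notation along with parts (a) and (b):

$$\Delta \|\mathbf{r}\|^2 = \nabla \cdot \nabla \|\mathbf{r}\|^2 = \nabla \cdot 2\,\mathbf{r} = 2\,\nabla \cdot \mathbf{r} = 2(3) = 6$$

Notice that in Example 4.17 if we take the curl of the gradient of $\|\mathbf{r}\|^2$ we get

$$\nabla \times (\nabla \|\mathbf{r}\|^2) = \nabla \times 2\,\mathbf{r} = 2\,\nabla \times \mathbf{r} = 2\,\mathbf{0} = \mathbf{0}.$$

The following theorem shows that this will be the case in general:

Theorem 4.15. For any smooth real-valued function $f(x,y,z)$, $\nabla \times (\nabla f) = \mathbf{0}$.

Proof: We see by the smoothness of $f$ that

$$\begin{aligned}
\nabla \times (\nabla f) &= \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\[2pt] \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\[6pt] \dfrac{\partial f}{\partial x} & \dfrac{\partial f}{\partial y} & \dfrac{\partial f}{\partial z} \end{vmatrix} \\
&= \left( \frac{\partial^2 f}{\partial y\,\partial z} - \frac{\partial^2 f}{\partial z\,\partial y} \right)\mathbf{i} - \left( \frac{\partial^2 f}{\partial x\,\partial z} - \frac{\partial^2 f}{\partial z\,\partial x} \right)\mathbf{j} + \left( \frac{\partial^2 f}{\partial x\,\partial y} - \frac{\partial^2 f}{\partial y\,\partial x} \right)\mathbf{k} = \mathbf{0},
\end{aligned}$$

since the mixed partial derivatives in each component are equal. QED

Corollary 4.16. If a vector field $\mathbf{f}(x,y,z)$ has a potential, then $\operatorname{curl} \mathbf{f} = \mathbf{0}$.

Another way of stating Theorem 4.15 is that gradients are irrotational. Also, notice that in Example 4.17 if we take the divergence of the curl of $\mathbf{r}$ we trivially get

$$\nabla \cdot (\nabla \times \mathbf{r}) = \nabla \cdot \mathbf{0} = 0.$$

The following theorem shows that this will be the case in general:

Theorem 4.17. For any smooth vector field $\mathbf{f}(x,y,z)$, $\nabla \cdot (\nabla \times \mathbf{f}) = 0$.

The proof is straightforward and left as an exercise for the reader.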
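The results of Example 4.17 and the identity of Theorem 4.15 can be spot-checked numerically. The sketch below is only an illustration under stated assumptions: the helper names (`partial`, `gradient`, `divergence`, `curl`, `laplacian`), the step size and the sample point are all hypothetical choices, and central differences give approximations rather than exact derivatives.

```python
def partial(f, p, i, h=1e-4):
    """Central-difference estimate of the i-th partial derivative of f at p."""
    lo, hi = list(p), list(p)
    lo[i] -= h
    hi[i] += h
    return (f(*hi) - f(*lo)) / (2 * h)

def gradient(f, p):
    """Numerical gradient of a scalar function f at the point p."""
    return [partial(f, p, i) for i in range(3)]

def divergence(F, p):
    """Sum of d(F_i)/d(x_i) for a vector field F returning three components."""
    return sum(partial(lambda *q, i=i: F(*q)[i], p, i) for i in range(3))

def curl(F, p):
    """Numerical curl, using the component formula for P i + Q j + R k."""
    def d(comp, var):  # partial of component `comp` with respect to variable `var`
        return partial(lambda *q: F(*q)[comp], p, var)
    return [d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1)]

def laplacian(f, p):
    """Divergence of the numerical gradient, mirroring Definition 4.7."""
    return divergence(lambda *q: gradient(f, q), p)

def r(x, y, z):
    """Position vector field r = x i + y j + z k."""
    return (x, y, z)

def r_norm_sq(x, y, z):
    """The real-valued function ||r||^2 = x^2 + y^2 + z^2."""
    return x * x + y * y + z * z

p = (1.0, -2.0, 0.5)                       # arbitrary sample point
grad_r2 = gradient(r_norm_sq, p)           # should be close to 2r
div_r = divergence(r, p)                   # should be close to 3
curl_r = curl(r, p)                        # should be close to the zero vector
lap_r2 = laplacian(r_norm_sq, p)           # should be close to 6
curl_of_grad = curl(lambda *q: gradient(r_norm_sq, q), p)  # Theorem 4.15

print(grad_r2, div_r, curl_r, lap_r2, curl_of_grad)
```

Any other smooth test function can be substituted for `r_norm_sq`; the curl of its numerical gradient should remain close to zero, up to rounding and truncation error.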

Corollary 4.18. The flux of the curl of a smooth vector field $\mathbf{f}(x,y,z)$ through any closed surface is zero.

Proof: Let $\Sigma$ be a closed surface which bounds a solid $S$. The flux of $\nabla \times \mathbf{f}$ through $\Sigma$ is

$$\begin{aligned}
\iint_\Sigma (\nabla \times \mathbf{f}) \cdot d\boldsymbol{\sigma} &= \iiint_S \nabla \cdot (\nabla \times \mathbf{f}) \, dV \quad \text{(by the Divergence Theorem)} \\
&= \iiint_S 0 \, dV \quad \text{(by Theorem 4.17)} \\
&= 0. \quad \text{QED}
\end{aligned}$$

There is another method for proving Theorem 4.15 which can be useful, and is often used in physics. Namely, if the surface integral $\iint_\Sigma f(x,y,z)\, d\sigma = 0$ for all surfaces $\Sigma$ in some solid region (usually all of $\mathbb{R}^3$), then we must have $f(x,y,z) = 0$ throughout that region. The proof is not trivial, and physicists do not usually bother to prove it. But the result is true, and can also be applied to double and triple integrals.

For instance, to prove Theorem 4.15, assume that $f(x,y,z)$ is a smooth real-valued function on $\mathbb{R}^3$. Let $C$ be a simple closed curve in $\mathbb{R}^3$ and let $\Sigma$ be any capping surface for $C$ (i.e. $\Sigma$ is orientable and its boundary is $C$). Since $\nabla f$ is a vector field, then

$$\iint_\Sigma (\nabla \times (\nabla f)) \cdot \mathbf{n} \, d\sigma = \oint_C \nabla f \cdot d\mathbf{r} \quad \text{by Stokes' Theorem, so}$$
$$= 0 \quad \text{by Corollary 4.13.}$$

Since the choice of $\Sigma$ was arbitrary, then we must have $(\nabla \times (\nabla f)) \cdot \mathbf{n} = 0$ throughout $\mathbb{R}^3$, where $\mathbf{n}$ is any unit vector. Using $\mathbf{i}$, $\mathbf{j}$ and $\mathbf{k}$ in place of $\mathbf{n}$, we see that we must have $\nabla \times (\nabla f) = \mathbf{0}$ in $\mathbb{R}^3$, which completes the proof.

Example 4.18. A system of electric charges has a charge density $\rho(x,y,z)$ and produces an electrostatic field $\mathbf{E}(x,y,z)$ at points $(x,y,z)$ in space. Gauss' Law states that

$$\iint_\Sigma \mathbf{E} \cdot d\boldsymbol{\sigma} = 4\pi \iiint_S \rho \, dV$$

for any closed surface $\Sigma$ which encloses the charges, with $S$ being the solid region enclosed by $\Sigma$. Show that $\nabla \cdot \mathbf{E} = 4\pi\rho$. This is one of Maxwell's Equations. [10]

[10] In Gaussian (or CGS) units.

Solution: By the Divergence Theorem, we have

$$\begin{aligned}
\iiint_S \nabla \cdot \mathbf{E} \, dV &= \iint_\Sigma \mathbf{E} \cdot d\boldsymbol{\sigma} \\
&= 4\pi \iiint_S \rho \, dV \quad \text{by Gauss' Law, so combining the integrals gives}
\end{aligned}$$
$$\iiint_S (\nabla \cdot \mathbf{E} - 4\pi\rho) \, dV = 0, \text{ so}$$
$$\nabla \cdot \mathbf{E} - 4\pi\rho = 0 \quad \text{since } \Sigma \text{ and hence } S \text{ was arbitrary, so}$$
$$\nabla \cdot \mathbf{E} = 4\pi\rho.$$

Often (especially in physics) it is convenient to use other coordinate systems when dealing with quantities such as the gradient, divergence, curl and Laplacian. We will present the formulas for these in cylindrical and spherical coordinates.

Recall from Section 1.7 that a point $(x,y,z)$ can be represented in cylindrical coordinates $(r,\theta,z)$, where $x = r\cos\theta$, $y = r\sin\theta$, $z = z$. At each point $(r,\theta,z)$, let $\mathbf{e}_r$, $\mathbf{e}_\theta$, $\mathbf{e}_z$ be unit vectors in the direction of increasing $r$, $\theta$, $z$, respectively (see Figure 4.6.1). Then $\mathbf{e}_r$, $\mathbf{e}_\theta$, $\mathbf{e}_z$ form an orthonormal set of vectors. Note, by the right-hand rule, that $\mathbf{e}_z \times \mathbf{e}_r = \mathbf{e}_\theta$.

Figure 4.6.1 Orthonormal vectors $\mathbf{e}_r$, $\mathbf{e}_\theta$, $\mathbf{e}_z$ in cylindrical coordinates

Figure 4.6.2 Orthonormal vectors $\mathbf{e}_\rho$, $\mathbf{e}_\theta$, $\mathbf{e}_\phi$ in spherical coordinates

Similarly, a point $(x,y,z)$ can be represented in spherical coordinates $(\rho,\theta,\phi)$, where $x = \rho\sin\phi\cos\theta$, $y = \rho\sin\phi\sin\theta$, $z = \rho\cos\phi$. At each point $(\rho,\theta,\phi)$, let $\mathbf{e}_\rho$, $\mathbf{e}_\theta$, $\mathbf{e}_\phi$ be unit vectors in the direction of increasing $\rho$, $\theta$, $\phi$, respectively (see Figure 4.6.2). Then the vectors $\mathbf{e}_\rho$, $\mathbf{e}_\theta$, $\mathbf{e}_\phi$ are orthonormal. By the right-hand rule, we see that $\mathbf{e}_\theta \times \mathbf{e}_\rho = \mathbf{e}_\phi$.

We can now summarize the expressions for the gradient, divergence, curl and Laplacian in Cartesian, cylindrical and spherical coordinates in the following tables:

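Cartesian $(x,y,z)$: Scalar function $F$; Vector field $\mathbf{f} = f_1\,\mathbf{i} + f_2\,\mathbf{j} + f_3\,\mathbf{k}$

gradient: $$\nabla F = \frac{\partial F}{\partial x}\,\mathbf{i} + \frac{\partial F}{\partial y}\,\mathbf{j} + \frac{\partial F}{\partial z}\,\mathbf{k}$$
divergence: $$\nabla \cdot \mathbf{f} = \frac{\partial f_1}{\partial x} + \frac{\partial f_2}{\partial y} + \frac{\partial f_3}{\partial z}$$
curl: $$\nabla \times \mathbf{f} = \left( \frac{\partial f_3}{\partial y} - \frac{\partial f_2}{\partial z} \right)\mathbf{i} + \left( \frac{\partial f_1}{\partial z} - \frac{\partial f_3}{\partial x} \right)\mathbf{j} + \left( \frac{\partial f_2}{\partial x} - \frac{\partial f_1}{\partial y} \right)\mathbf{k}$$
Laplacian: $$\Delta F = \frac{\partial^2 F}{\partial x^2} + \frac{\partial^2 F}{\partial y^2} + \frac{\partial^2 F}{\partial z^2}$$

Cylindrical $(r,\theta,z)$: Scalar function $F$; Vector field $\mathbf{f} = f_r\,\mathbf{e}_r + f_\theta\,\mathbf{e}_\theta + f_z\,\mathbf{e}_z$

gradient: $$\nabla F = \frac{\partial F}{\partial r}\,\mathbf{e}_r + \frac{1}{r}\frac{\partial F}{\partial \theta}\,\mathbf{e}_\theta + \frac{\partial F}{\partial z}\,\mathbf{e}_z$$
divergence: $$\nabla \cdot \mathbf{f} = \frac{1}{r}\frac{\partial}{\partial r}(r f_r) + \frac{1}{r}\frac{\partial f_\theta}{\partial \theta} + \frac{\partial f_z}{\partial z}$$
curl: $$\nabla \times \mathbf{f} = \left( \frac{1}{r}\frac{\partial f_z}{\partial \theta} - \frac{\partial f_\theta}{\partial z} \right)\mathbf{e}_r + \left( \frac{\partial f_r}{\partial z} - \frac{\partial f_z}{\partial r} \right)\mathbf{e}_\theta + \frac{1}{r}\left( \frac{\partial}{\partial r}(r f_\theta) - \frac{\partial f_r}{\partial \theta} \right)\mathbf{e}_z$$
Laplacian: $$\Delta F = \frac{1}{r}\frac{\partial}{\partial r}\left( r \frac{\partial F}{\partial r} \right) + \frac{1}{r^2}\frac{\partial^2 F}{\partial \theta^2} + \frac{\partial^2 F}{\partial z^2}$$

Spherical $(\rho,\theta,\phi)$: Scalar function $F$; Vector field $\mathbf{f} = f_\rho\,\mathbf{e}_\rho + f_\theta\,\mathbf{e}_\theta + f_\phi\,\mathbf{e}_\phi$

gradient: $$\nabla F = \frac{\partial F}{\partial \rho}\,\mathbf{e}_\rho + \frac{1}{\rho\sin\phi}\frac{\partial F}{\partial \theta}\,\mathbf{e}_\theta + \frac{1}{\rho}\frac{\partial F}{\partial \phi}\,\mathbf{e}_\phi$$
divergence: $$\nabla \cdot \mathbf{f} = \frac{1}{\rho^2}\frac{\partial}{\partial \rho}(\rho^2 f_\rho) + \frac{1}{\rho\sin\phi}\frac{\partial f_\theta}{\partial \theta} + \frac{1}{\rho\sin\phi}\frac{\partial}{\partial \phi}(\sin\phi\, f_\phi)$$
curl: $$\nabla \times \mathbf{f} = \frac{1}{\rho\sin\phi}\left( \frac{\partial}{\partial \phi}(\sin\phi\, f_\theta) - \frac{\partial f_\phi}{\partial \theta} \right)\mathbf{e}_\rho + \frac{1}{\rho}\left( \frac{\partial}{\partial \rho}(\rho f_\phi) - \frac{\partial f_\rho}{\partial \phi} \right)\mathbf{e}_\theta + \left( \frac{1}{\rho\sin\phi}\frac{\partial f_\rho}{\partial \theta} - \frac{1}{\rho}\frac{\partial}{\partial \rho}(\rho f_\theta) \right)\mathbf{e}_\phi$$
Laplacian: $$\Delta F = \frac{1}{\rho^2}\frac{\partial}{\partial \rho}\left( \rho^2 \frac{\partial F}{\partial \rho} \right) + \frac{1}{\rho^2\sin^2\phi}\frac{\partial^2 F}{\partial \theta^2} + \frac{1}{\rho^2\sin\phi}\frac{\partial}{\partial \phi}\left( \sin\phi \frac{\partial F}{\partial \phi} \right)$$

The derivation of the above formulas for cylindrical and spherical coordinates is straightforward but extremely tedious. The basic idea is to take the Cartesian equivalent of the quantity in question and to substitute into that formula using the appropriate coordinate transformation. As an example, we will derive the formula for the gradient in spherical coordinates.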
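Before the derivation, a quick sanity check of the cylindrical formulas may be helpful. The function $F = x^2 + y^2 = r^2$ is a hypothetical choice made only for this sketch; it is not an example from the text.

```latex
\begin{aligned}
\text{Cartesian:}\quad
  \Delta F &= \frac{\partial^2}{\partial x^2}(x^2 + y^2)
            + \frac{\partial^2}{\partial y^2}(x^2 + y^2)
            + \frac{\partial^2}{\partial z^2}(x^2 + y^2)
            = 2 + 2 + 0 = 4 \\[4pt]
\text{Cylindrical:}\quad
  \Delta F &= \frac{1}{r}\frac{\partial}{\partial r}\!\left( r \cdot 2r \right)
            + \frac{1}{r^2}(0) + 0
            = \frac{1}{r}(4r) = 4 \\[4pt]
\text{Gradient:}\quad
  \nabla F &= 2r\,\mathbf{e}_r = 2r\,(\cos\theta\,\mathbf{i} + \sin\theta\,\mathbf{j})
            = 2x\,\mathbf{i} + 2y\,\mathbf{j}
\end{aligned}
```

Both coordinate systems give the same Laplacian and the same gradient vector, as they must. The gradient line uses $\mathbf{e}_r = \cos\theta\,\mathbf{i} + \sin\theta\,\mathbf{j}$.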

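Goal: Show that the gradient of a real-valued function $F(\rho,\theta,\phi)$ in spherical coordinates is:

$$\nabla F = \frac{\partial F}{\partial \rho}\,\mathbf{e}_\rho + \frac{1}{\rho\sin\phi}\frac{\partial F}{\partial \theta}\,\mathbf{e}_\theta + \frac{1}{\rho}\frac{\partial F}{\partial \phi}\,\mathbf{e}_\phi$$

Idea: In the Cartesian gradient formula $\nabla F(x,y,z) = \frac{\partial F}{\partial x}\,\mathbf{i} + \frac{\partial F}{\partial y}\,\mathbf{j} + \frac{\partial F}{\partial z}\,\mathbf{k}$, put the Cartesian basis vectors $\mathbf{i}$, $\mathbf{j}$, $\mathbf{k}$ in terms of the spherical coordinate basis vectors $\mathbf{e}_\rho$, $\mathbf{e}_\theta$, $\mathbf{e}_\phi$ and functions of $\rho$, $\theta$ and $\phi$. Then put the partial derivatives $\frac{\partial F}{\partial x}$, $\frac{\partial F}{\partial y}$, $\frac{\partial F}{\partial z}$ in terms of $\frac{\partial F}{\partial \rho}$, $\frac{\partial F}{\partial \theta}$, $\frac{\partial F}{\partial \phi}$ and functions of $\rho$, $\theta$ and $\phi$.

Step 1: Get formulas for $\mathbf{e}_\rho$, $\mathbf{e}_\theta$, $\mathbf{e}_\phi$ in terms of $\mathbf{i}$, $\mathbf{j}$, $\mathbf{k}$.

We can see from Figure 4.6.2 that the unit vector $\mathbf{e}_\rho$ in the $\rho$ direction at a general point $(\rho,\theta,\phi)$ is $\mathbf{e}_\rho = \frac{\mathbf{r}}{\|\mathbf{r}\|}$, where $\mathbf{r} = x\,\mathbf{i} + y\,\mathbf{j} + z\,\mathbf{k}$ is the position vector of the point in Cartesian coordinates. Thus,

$$\mathbf{e}_\rho = \frac{\mathbf{r}}{\|\mathbf{r}\|} = \frac{x\,\mathbf{i} + y\,\mathbf{j} + z\,\mathbf{k}}{\sqrt{x^2 + y^2 + z^2}},$$

so using $x = \rho\sin\phi\cos\theta$, $y = \rho\sin\phi\sin\theta$, $z = \rho\cos\phi$, and $\rho = \sqrt{x^2 + y^2 + z^2}$, we get:

$$\mathbf{e}_\rho = \sin\phi\cos\theta\,\mathbf{i} + \sin\phi\sin\theta\,\mathbf{j} + \cos\phi\,\mathbf{k}$$

Now, since the angle $\theta$ is measured in the $xy$-plane, then the unit vector $\mathbf{e}_\theta$ in the $\theta$ direction must be parallel to the $xy$-plane. That is, $\mathbf{e}_\theta$ is of the form $a\,\mathbf{i} + b\,\mathbf{j} + 0\,\mathbf{k}$. To figure out what $a$ and $b$ are, note that since $\mathbf{e}_\theta \perp \mathbf{e}_\rho$, then in particular $\mathbf{e}_\theta \perp \mathbf{e}_\rho$ when $\mathbf{e}_\rho$ is in the $xy$-plane. That occurs when the angle $\phi$ is $\pi/2$. Putting $\phi = \pi/2$ into the formula for $\mathbf{e}_\rho$ gives $\mathbf{e}_\rho = \cos\theta\,\mathbf{i} + \sin\theta\,\mathbf{j} + 0\,\mathbf{k}$, and we see that a vector perpendicular to that is $-\sin\theta\,\mathbf{i} + \cos\theta\,\mathbf{j} + 0\,\mathbf{k}$. Since this vector is also a unit vector and points in the (positive) $\theta$ direction, it must be $\mathbf{e}_\theta$:

$$\mathbf{e}_\theta = -\sin\theta\,\mathbf{i} + \cos\theta\,\mathbf{j} + 0\,\mathbf{k}$$

Lastly, since $\mathbf{e}_\phi = \mathbf{e}_\theta \times \mathbf{e}_\rho$, we get:

$$\mathbf{e}_\phi = \cos\phi\cos\theta\,\mathbf{i} + \cos\phi\sin\theta\,\mathbf{j} - \sin\phi\,\mathbf{k}$$

Step 2: Use the three formulas from Step 1 to solve for $\mathbf{i}$, $\mathbf{j}$, $\mathbf{k}$ in terms of $\mathbf{e}_\rho$, $\mathbf{e}_\theta$, $\mathbf{e}_\phi$.

This comes down to solving a system of three equations in three unknowns. There are many ways of doing this, but we will do it by combining the formulas for $\mathbf{e}_\rho$ and $\mathbf{e}_\phi$ to eliminate $\mathbf{k}$, which will give us an equation involving just $\mathbf{i}$ and $\mathbf{j}$. This, with the formula for $\mathbf{e}_\theta$, will then leave us with a system of two equations in two unknowns ($\mathbf{i}$ and $\mathbf{j}$), which we will use to solve first for $\mathbf{j}$ then for $\mathbf{i}$. Lastly, we will solve for $\mathbf{k}$.

First, note that

$$\sin\phi\,\mathbf{e}_\rho + \cos\phi\,\mathbf{e}_\phi = \cos\theta\,\mathbf{i} + \sin\theta\,\mathbf{j}$$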

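so that

$$\sin\theta\,(\sin\phi\,\mathbf{e}_\rho + \cos\phi\,\mathbf{e}_\phi) + \cos\theta\,\mathbf{e}_\theta = (\sin^2\theta + \cos^2\theta)\,\mathbf{j} = \mathbf{j},$$

and so:

$$\mathbf{j} = \sin\phi\sin\theta\,\mathbf{e}_\rho + \cos\theta\,\mathbf{e}_\theta + \cos\phi\sin\theta\,\mathbf{e}_\phi$$

Likewise, we see that

$$\cos\theta\,(\sin\phi\,\mathbf{e}_\rho + \cos\phi\,\mathbf{e}_\phi) - \sin\theta\,\mathbf{e}_\theta = (\cos^2\theta + \sin^2\theta)\,\mathbf{i} = \mathbf{i},$$

and so:

$$\mathbf{i} = \sin\phi\cos\theta\,\mathbf{e}_\rho - \sin\theta\,\mathbf{e}_\theta + \cos\phi\cos\theta\,\mathbf{e}_\phi$$

Lastly, we see that:

$$\mathbf{k} = \cos\phi\,\mathbf{e}_\rho - \sin\phi\,\mathbf{e}_\phi$$

Step 3: Get formulas for $\frac{\partial F}{\partial \rho}$, $\frac{\partial F}{\partial \theta}$, $\frac{\partial F}{\partial \phi}$ in terms of $\frac{\partial F}{\partial x}$, $\frac{\partial F}{\partial y}$, $\frac{\partial F}{\partial z}$.

By the Chain Rule, we have

$$\begin{aligned}
\frac{\partial F}{\partial \rho} &= \frac{\partial F}{\partial x}\frac{\partial x}{\partial \rho} + \frac{\partial F}{\partial y}\frac{\partial y}{\partial \rho} + \frac{\partial F}{\partial z}\frac{\partial z}{\partial \rho}, \\
\frac{\partial F}{\partial \theta} &= \frac{\partial F}{\partial x}\frac{\partial x}{\partial \theta} + \frac{\partial F}{\partial y}\frac{\partial y}{\partial \theta} + \frac{\partial F}{\partial z}\frac{\partial z}{\partial \theta}, \\
\frac{\partial F}{\partial \phi} &= \frac{\partial F}{\partial x}\frac{\partial x}{\partial \phi} + \frac{\partial F}{\partial y}\frac{\partial y}{\partial \phi} + \frac{\partial F}{\partial z}\frac{\partial z}{\partial \phi},
\end{aligned}$$

which yields:

$$\begin{aligned}
\frac{\partial F}{\partial \rho} &= \sin\phi\cos\theta\,\frac{\partial F}{\partial x} + \sin\phi\sin\theta\,\frac{\partial F}{\partial y} + \cos\phi\,\frac{\partial F}{\partial z} \\
\frac{\partial F}{\partial \theta} &= -\rho\sin\phi\sin\theta\,\frac{\partial F}{\partial x} + \rho\sin\phi\cos\theta\,\frac{\partial F}{\partial y} \\
\frac{\partial F}{\partial \phi} &= \rho\cos\phi\cos\theta\,\frac{\partial F}{\partial x} + \rho\cos\phi\sin\theta\,\frac{\partial F}{\partial y} - \rho\sin\phi\,\frac{\partial F}{\partial z}
\end{aligned}$$

Step 4: Use the three formulas from Step 3 to solve for $\frac{\partial F}{\partial x}$, $\frac{\partial F}{\partial y}$, $\frac{\partial F}{\partial z}$ in terms of $\frac{\partial F}{\partial \rho}$, $\frac{\partial F}{\partial \theta}$, $\frac{\partial F}{\partial \phi}$.

Again, this involves solving a system of three equations in three unknowns. Using a similar process of elimination as in Step 2, we get:

$$\begin{aligned}
\frac{\partial F}{\partial x} &= \frac{1}{\rho\sin\phi}\left( \rho\sin^2\phi\cos\theta\,\frac{\partial F}{\partial \rho} - \sin\theta\,\frac{\partial F}{\partial \theta} + \sin\phi\cos\phi\cos\theta\,\frac{\partial F}{\partial \phi} \right) \\
\frac{\partial F}{\partial y} &= \frac{1}{\rho\sin\phi}\left( \rho\sin^2\phi\sin\theta\,\frac{\partial F}{\partial \rho} + \cos\theta\,\frac{\partial F}{\partial \theta} + \sin\phi\cos\phi\sin\theta\,\frac{\partial F}{\partial \phi} \right) \\
\frac{\partial F}{\partial z} &= \frac{1}{\rho}\left( \rho\cos\phi\,\frac{\partial F}{\partial \rho} - \sin\phi\,\frac{\partial F}{\partial \phi} \right)
\end{aligned}$$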
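The Step 2 formulas can be checked numerically: substituting the Step 1 expressions for the spherical basis vectors should return the Cartesian basis vectors. The following is a minimal sketch under stated assumptions; the function names and the sample angles are hypothetical and chosen only for illustration.

```python
import math

def spherical_basis(theta, phi):
    """Step 1 formulas: (e_rho, e_theta, e_phi) as Cartesian component triples."""
    st, ct = math.sin(theta), math.cos(theta)
    sp, cp = math.sin(phi), math.cos(phi)
    e_rho = (sp * ct, sp * st, cp)
    e_theta = (-st, ct, 0.0)
    e_phi = (cp * ct, cp * st, -sp)
    return e_rho, e_theta, e_phi

def combo(coeffs, vectors):
    """Linear combination sum(c * v) of 3-component vectors."""
    return tuple(sum(c * v[k] for c, v in zip(coeffs, vectors)) for k in range(3))

def cartesian_basis_from_step2(theta, phi):
    """Step 2 formulas: i, j, k written in terms of e_rho, e_theta, e_phi."""
    st, ct = math.sin(theta), math.cos(theta)
    sp, cp = math.sin(phi), math.cos(phi)
    basis = spherical_basis(theta, phi)
    i_vec = combo((sp * ct, -st, cp * ct), basis)
    j_vec = combo((sp * st, ct, cp * st), basis)
    k_vec = combo((cp, 0.0, -sp), basis)
    return i_vec, j_vec, k_vec

theta, phi = 0.7, 1.1  # arbitrary sample angles (radians)
i_vec, j_vec, k_vec = cartesian_basis_from_step2(theta, phi)
print(i_vec, j_vec, k_vec)  # should be close to (1,0,0), (0,1,0), (0,0,1)
```

The coefficient triples in `cartesian_basis_from_step2` are the transposes of the Step 1 rows, which is what one would expect when inverting an orthonormal change of basis.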

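Step 5: Substitute the formulas for $\mathbf{i}$, $\mathbf{j}$, $\mathbf{k}$ from Step 2 and the formulas for $\frac{\partial F}{\partial x}$, $\frac{\partial F}{\partial y}$, $\frac{\partial F}{\partial z}$ from Step 4 into the Cartesian gradient formula $\nabla F(x,y,z) = \frac{\partial F}{\partial x}\,\mathbf{i} + \frac{\partial F}{\partial y}\,\mathbf{j} + \frac{\partial F}{\partial z}\,\mathbf{k}$.

Doing this last step is perhaps the most tedious, since it involves simplifying $3 \times 3 + 3 \times 3 + 2 \times 2 = 22$ terms! Namely,

$$\begin{aligned}
\nabla F = {} & \frac{1}{\rho\sin\phi}\left( \rho\sin^2\phi\cos\theta\,\frac{\partial F}{\partial \rho} - \sin\theta\,\frac{\partial F}{\partial \theta} + \sin\phi\cos\phi\cos\theta\,\frac{\partial F}{\partial \phi} \right)\left( \sin\phi\cos\theta\,\mathbf{e}_\rho - \sin\theta\,\mathbf{e}_\theta + \cos\phi\cos\theta\,\mathbf{e}_\phi \right) \\
& + \frac{1}{\rho\sin\phi}\left( \rho\sin^2\phi\sin\theta\,\frac{\partial F}{\partial \rho} + \cos\theta\,\frac{\partial F}{\partial \theta} + \sin\phi\cos\phi\sin\theta\,\frac{\partial F}{\partial \phi} \right)\left( \sin\phi\sin\theta\,\mathbf{e}_\rho + \cos\theta\,\mathbf{e}_\theta + \cos\phi\sin\theta\,\mathbf{e}_\phi \right) \\
& + \frac{1}{\rho}\left( \rho\cos\phi\,\frac{\partial F}{\partial \rho} - \sin\phi\,\frac{\partial F}{\partial \phi} \right)\left( \cos\phi\,\mathbf{e}_\rho - \sin\phi\,\mathbf{e}_\phi \right),
\end{aligned}$$

which we see has 8 terms involving $\mathbf{e}_\rho$, 6 terms involving $\mathbf{e}_\theta$, and 8 terms involving $\mathbf{e}_\phi$. But the algebra is straightforward and yields the desired result:

$$\nabla F = \frac{\partial F}{\partial \rho}\,\mathbf{e}_\rho + \frac{1}{\rho\sin\phi}\frac{\partial F}{\partial \theta}\,\mathbf{e}_\theta + \frac{1}{\rho}\frac{\partial F}{\partial \phi}\,\mathbf{e}_\phi$$

Example 4.19. In Example 4.17 we showed that $\nabla \|\mathbf{r}\|^2 = 2\,\mathbf{r}$ and $\Delta \|\mathbf{r}\|^2 = 6$, where $\mathbf{r}(x,y,z) = x\,\mathbf{i} + y\,\mathbf{j} + z\,\mathbf{k}$ in Cartesian coordinates. Verify that we get the same answers if we switch to spherical coordinates.

Solution: Since $\|\mathbf{r}\|^2 = x^2 + y^2 + z^2 = \rho^2$ in spherical coordinates, let $F(\rho,\theta,\phi) = \rho^2$ (so that $F(\rho,\theta,\phi) = \|\mathbf{r}\|^2$). The gradient of $F$ in spherical coordinates is

$$\begin{aligned}
\nabla F &= \frac{\partial F}{\partial \rho}\,\mathbf{e}_\rho + \frac{1}{\rho\sin\phi}\frac{\partial F}{\partial \theta}\,\mathbf{e}_\theta + \frac{1}{\rho}\frac{\partial F}{\partial \phi}\,\mathbf{e}_\phi \\
&= 2\rho\,\mathbf{e}_\rho + \frac{1}{\rho\sin\phi}(0)\,\mathbf{e}_\theta + \frac{1}{\rho}(0)\,\mathbf{e}_\phi \\
&= 2\rho\,\mathbf{e}_\rho = 2\rho\,\frac{\mathbf{r}}{\|\mathbf{r}\|}, \text{ as we showed earlier, so} \\
&= 2\rho\,\frac{\mathbf{r}}{\rho} = 2\,\mathbf{r}, \text{ as expected.}
\end{aligned}$$

And the Laplacian is

$$\begin{aligned}
\Delta F &= \frac{1}{\rho^2}\frac{\partial}{\partial \rho}\left( \rho^2 \frac{\partial F}{\partial \rho} \right) + \frac{1}{\rho^2\sin^2\phi}\frac{\partial^2 F}{\partial \theta^2} + \frac{1}{\rho^2\sin\phi}\frac{\partial}{\partial \phi}\left( \sin\phi \frac{\partial F}{\partial \phi} \right) \\
&= \frac{1}{\rho^2}\frac{\partial}{\partial \rho}(\rho^2 \cdot 2\rho) + \frac{1}{\rho^2\sin^2\phi}(0) + \frac{1}{\rho^2\sin\phi}\frac{\partial}{\partial \phi}\left( \sin\phi\,(0) \right) \\
&= \frac{1}{\rho^2}\frac{\partial}{\partial \rho}(2\rho^3) + 0 + 0 \\
&= \frac{1}{\rho^2}(6\rho^2) = 6, \text{ as expected.}
\end{aligned}$$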
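Example 4.19 checks the spherical gradient formula on a function that depends only on $\rho$. For a function that also depends on $\theta$ and $\phi$, the formula can be compared numerically against the Cartesian gradient. The sketch below is an illustration only: the test function `f`, the helper names, the step size and the sample point are all assumptions introduced here, and the partial derivatives are central-difference approximations.

```python
import math

def to_cartesian(rho, theta, phi):
    """x = rho sin(phi) cos(theta), y = rho sin(phi) sin(theta), z = rho cos(phi)."""
    return (rho * math.sin(phi) * math.cos(theta),
            rho * math.sin(phi) * math.sin(theta),
            rho * math.cos(phi))

def f(x, y, z):
    """Hypothetical test function in Cartesian coordinates."""
    return x * z + y * y

def grad_f(x, y, z):
    """Exact Cartesian gradient of f, computed by hand."""
    return (z, 2 * y, x)

def F(rho, theta, phi):
    """The same function expressed in spherical coordinates."""
    return f(*to_cartesian(rho, theta, phi))

def partial(g, p, i, h=1e-5):
    """Central-difference estimate of the i-th partial derivative of g at p."""
    lo, hi = list(p), list(p)
    lo[i] -= h
    hi[i] += h
    return (g(*hi) - g(*lo)) / (2 * h)

def spherical_gradient(G, rho, theta, phi):
    """Spherical gradient formula, returned as Cartesian components."""
    p = (rho, theta, phi)
    G_rho, G_theta, G_phi = (partial(G, p, i) for i in range(3))
    st, ct = math.sin(theta), math.cos(theta)
    sp, cp = math.sin(phi), math.cos(phi)
    e_rho = (sp * ct, sp * st, cp)
    e_theta = (-st, ct, 0.0)
    e_phi = (cp * ct, cp * st, -sp)
    coeffs = (G_rho, G_theta / (rho * sp), G_phi / rho)
    return tuple(sum(c * e[k] for c, e in zip(coeffs, (e_rho, e_theta, e_phi)))
                 for k in range(3))

point = (2.0, 0.7, 1.1)  # arbitrary (rho, theta, phi), away from sin(phi) = 0
from_formula = spherical_gradient(F, *point)
expected = grad_f(*to_cartesian(*point))
print(from_formula)
print(expected)
```

The two printed vectors should agree to several decimal places. Points with $\sin\phi = 0$ (the $z$-axis) are avoided because the $\mathbf{e}_\theta$ coefficient divides by $\rho\sin\phi$.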

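Exercises

A
For Exercises 1-6, find the Laplacian of the function $f(x,y,z)$ in Cartesian coordinates.
1. $f(x,y,z) = x + y + z$
2. $f(x,y,z) = x^5$
3. $f(x,y,z) = (x^2 + y^2 + z^2)^{3/2}$
4. $f(x,y,z) = e^{x+y+z}$
5. $f(x,y,z) = x^3 + y^3 + z^3$
6. $f(x,y,z) = e^{-x^2-y^2-z^2}$
7. Find the Laplacian of the function in Exercise 3 in spherical coordinates.
8. Find the Laplacian of the function in Exercise 6 in spherical coordinates.
9. Let $f(x,y,z) = \dfrac{z}{x^2 + y^2}$ in Cartesian coordinates. Find $\nabla f$ in cylindrical coordinates.
10. For $\mathbf{f}(r,\theta,z) = r\,\mathbf{e}_r + z\sin\theta\,\mathbf{e}_\theta + rz\,\mathbf{e}_z$ in cylindrical coordinates, find $\operatorname{div} \mathbf{f}$ and $\operatorname{curl} \mathbf{f}$.
11. For $\mathbf{f}(\rho,\theta,\phi) = \mathbf{e}_\rho + \rho\cos\theta\,\mathbf{e}_\theta + \rho\,\mathbf{e}_\phi$ in spherical coordinates, find $\operatorname{div} \mathbf{f}$ and $\operatorname{curl} \mathbf{f}$.

B
For Exercises 12-23, prove the given formula ($r = \|\mathbf{r}\|$ is the length of the position vector field $\mathbf{r}(x,y,z) = x\,\mathbf{i} + y\,\mathbf{j} + z\,\mathbf{k}$).
12. $\nabla (1/r) = -\mathbf{r}/r^3$
13. $\Delta (1/r) = 0$
14. $\nabla \cdot (\mathbf{r}/r^3) = 0$
15. $\nabla (\ln r) = \mathbf{r}/r^2$
16. $\operatorname{div}(\mathbf{F} + \mathbf{G}) = \operatorname{div} \mathbf{F} + \operatorname{div} \mathbf{G}$
17. $\operatorname{curl}(\mathbf{F} + \mathbf{G}) = \operatorname{curl} \mathbf{F} + \operatorname{curl} \mathbf{G}$
18. $\operatorname{div}(f\,\mathbf{F}) = f \operatorname{div} \mathbf{F} + \mathbf{F} \cdot \nabla f$
19. $\operatorname{div}(\mathbf{F} \times \mathbf{G}) = \mathbf{G} \cdot \operatorname{curl} \mathbf{F} - \mathbf{F} \cdot \operatorname{curl} \mathbf{G}$
20. $\operatorname{div}(\nabla f \times \nabla g) = 0$
21. $\operatorname{curl}(f\,\mathbf{F}) = f \operatorname{curl} \mathbf{F} + (\nabla f) \times \mathbf{F}$
22. $\operatorname{curl}(\operatorname{curl} \mathbf{F}) = \nabla(\operatorname{div} \mathbf{F}) - \Delta \mathbf{F}$
23. $\Delta (fg) = f\,\Delta g + g\,\Delta f + 2(\nabla f \cdot \nabla g)$

C
24. Prove Theorem 4.17.
25. Derive the gradient formula in cylindrical coordinates: $\nabla F = \frac{\partial F}{\partial r}\,\mathbf{e}_r + \frac{1}{r}\frac{\partial F}{\partial \theta}\,\mathbf{e}_\theta + \frac{\partial F}{\partial z}\,\mathbf{e}_z$
26. Use $\mathbf{f} = u\,\nabla v$ in the Divergence Theorem to prove:
(a) Green's first identity: $$\iiint_S \left( u\,\Delta v + (\nabla u) \cdot (\nabla v) \right) dV = \iint_\Sigma (u\,\nabla v) \cdot d\boldsymbol{\sigma}$$
(b) Green's second identity: $$\iiint_S \left( u\,\Delta v - v\,\Delta u \right) dV = \iint_\Sigma (u\,\nabla v - v\,\nabla u) \cdot d\boldsymbol{\sigma}$$
27. Suppose that $\Delta u = 0$ (i.e. $u$ is harmonic) over $\mathbb{R}^3$. Define the normal derivative $\frac{\partial u}{\partial n}$ of $u$ over a closed surface $\Sigma$ with outward unit normal vector $\mathbf{n}$ by $\frac{\partial u}{\partial n} = D_{\mathbf{n}} u = \mathbf{n} \cdot \nabla u$. Show that $\iint_\Sigma \frac{\partial u}{\partial n}\, d\sigma = 0$. (Hint: Use Green's second identity.)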


x = 1. −2) + t(5. (10. 6. 11 x − 24 y + 21 z − 26 = 0 17. 90◦ 7. 3. (1. v + w is larger. 18) 1. 0 7. (a) r 2 + z2 = 25 (b) ρ = 5 7. 7. center: (−1. 0 and (8. since v · w = 0. 10 3. 4. (−5. 1. 0. 8) (g) (−7. 3. x = 1 + t.6. −2. | v · w | = 0 < 21 5 = v w 13. x = 1 + 2 t.4◦ 5. center: (2. 1 − cos t.816) 3 3 3.2 (p. v + w = 26 < 21 + 5 = v + w 15. 16. 24. z = t 5.4 (p. (a) (4. 4. Hint: Use the distance formula for Cartesian coordinates. (a) r 2 + 9 z2 = 36 (b) ρ 2 (1 + 8 cos2 φ) = 36 10.5 (p. 46) 1. No intersection. −3) (b) x = 2 + 5 t. 14 Section 1.Appendix A Answers and Hints to Selected Exercises Chapter 1 y = 1. z = 3 + t Section 1. y = 1 5. (8. 1). 4 5 11. −4. π . 1) (b) x = 2 + t. z = 0 and a = − b . z = −3 + 8 t 7. −1) 5.72 9. z = −2 − 3 t (c) x−2 = 4 = z−3 5 3. 0 Section 1. No (d) 1 (c) x − 2 = z − 3. y−3 +2 y = 3 + 4 t. v( t) = (1. z = −7 t 21. −10. radius: 1. 3) 11. 5 . 2− c . 50) 1. y = 2 + 3 t. (a) 5 (e) 2 17 (b) 5 2. 11.8 (p. −24) 3. 14) 1. 9/ 35 19. 0) (b) (2 7. sin t). a( t) = (0. 11π . y = z = 1 3. Yes. x = 5 t. sin t. 7. 57) 1. π . (a) (2. 3. −6. 3) + t(1. 2. (a) (−4. (a) Line parallel to c (b) Half-line parallel to c (c) Hint: Think of the Section 1. −23. −4) (h) (−1. radius: 5. θ . f ′ ( t) = (1. No. 2 t. −1) (b) ( 17. −5) 5. 3 t2 ). 39) 1. y = 2 t. −1.1 (p. 1) Section 1.10(c). 3.6 (p. Section 1. 2 cos 2 t. 8) 1. 1. a cot φ) 12. Hint: See Theorem 1. 2) (j) No. −2 30 30 30 (f) (14. f ′ ( t) = (−2 sin 2 t. y = −2 + 7 t. 5) 3. 2) 15.7 (p. 73. 11π . z = 0 13.3 (p. −3) (b) (2. x − 2 y − z + 2 = 0 15. circle x2 + y2 = 4 in the planes z = ± 5 y y x x 9. 4 x − 4 y + 3 z − 10 = 0 13. 2a 2 b 2− c . −6. (a. Yes (c) 17 3. 29) 1. 1) (i) (−2. lines a = b . 9 13. cos t) 9. Section 1. 4. π ) 6 6 2 5. 189 . Section 1. (a) (2. Hint: use Deﬁnition 1. (a) (2 7. −1) 41 41 (d) 2 (e) 2 (c) −1 .65 9. 0◦ 9.

−1).16 21. 1 3 x . (1. 4 . 30 .16 7. x2 + y2 +4 y x2 + y2 +4 ) 15. B( t) = κ( t) = 1/2 1 (sin t. 0) : → (0. x + 2 y = z 7. 2 13. domain: Section 2. 2 y. 3. saddle pt. 100) 1. 88) {( x. 1). ∂ x = x( x2 + y2 )−1/2 . ( x0 .4 (p. ( x0 .2858. ∂ x = −2 xe−( x + y ) . 1). =0 = − x −2 . ∂ x = 2 x. 82) 1. 1). N( t) = 2 −4 9 ∂2 f = (1 + x y) e ∂x ∂ y ∂2 f ∂2 f = 0. 77) Theorem 1. ∂f ∂y = Section 2. ∂f ∂x = ye x y + y. (1. 0 3. Hint: Use f ′ ( t) = f( t) T. domain: R3 . −0. 4 . (−1. 3π 5 2 3. Replace 6. 2 .1 (p. max. 0). 8 . ∂ y2 = 1 ( x2 + y + 4)−1/2 2 ∂f ∂x 5. ∂ y = 2 y 3. 1] 7. 1) : → (1. (−1. range: [−1.5 (p.03256. −2 x + y − z − 2 = 0 9. 11. cos t.7 (p. Example 1.37. ∂ y = 0 ∂f ∂f 9. 2 z) 11.190 Appendix A: Answers and Hints to Selected Exercises ∂2 f ∂x ∂ y functions as position vectors. ∂x ∂ y ∂2 f 1 = − 4 ( x2 + y + 4)−3/2 . and Section 2. (−1. does not exist 11. ( Chapter 2 Section 2. range: [0.2 (p. −1). 2 2 13. local min. 5. does not exist dle pts. min. T( t) = (− sin t. 0. x y cos( x yz)) 9. max. (1. (1/ x. 0) 9. − sin t. = 4 x3 . 95) 2. 0) 7. ∂ x ∂ y = 0 ∂ y2 2 ∂2 f −2 ∂ f . 2 Section 2.20(e). 20). 1/ y) 7. local min. (2 x. 2 y) 3. xz cos( x yz). local min. y0 ) = (1.3998). ∂ x = x( x2 + y + 4)−1/2 . range: [−1. ∞) 5. ∂y ∂2 f ∂f ∂2 f = x cos( x y) 17. 1 1. −1 4 5 5 . − cos t. (−1. (1. 63) 1. ∂ y2 = 2. x = y = 4.3 (p. ∂2 f ∂ y2 xy = x2 e x y . (0. 5 5 5 5 20 . = 12 x2 . 74) depth=10 13. domain: R2 . (0. ∂2 f ∂ x2 1 = − 2 x( x2 + y + 4)−3/2 Section 1. ∂y ∂2 f ∂2 f = 0 19. y0 ) = (0. width = height = Section 2. −1.6 (p. z = 2 ∂f ∂f ∂f 1. 2 15. decrease: (−45. ∞) 3. max. 2 ( x − 1) + 9 ( y − 2) + 12 ( z − 2 11 f ′ ( t) × f ′′ ( t). ∂ x2 = 2. Hint: Use t by 27 s+16 2 Theorem 1. ∂ x = y cos( x y). ∂f ∂y ∂f xe x y + x 7. 2(5 2/3 3/2 − 8) 5. 3 x + 4 y − 5 z = 0 1 N( t). 1) 5. −20) 1. ∂ x2 = ( y + 4)( x2 + y + 4)−3/2 . y) : x2 + y2 ≥ 4}. (− cos t. 8abc 3 3 −4 −2 . 2 x + 3 y − z − 3 = 0 3. 15. ∂y 3 2 2 ∂f ∂f = −2 ye−( x + y ) 15. then write T ′ ( t) in terms of 3 ) = 0 9. 
Hint: Use Exercise 6. ∂ x = 2 x ( x2 + y + 4)−2/3 . −1). differentiate 11 1 4 that to get f ′′ ( t). increase: (45. saddle pt. ∂x ∂ y 2 = −y ∂y = y2 e x y . − 20 . (2 x. sad17. − 30 13 13 13 13 59 9 −9 . local max.94037) Section 2. 3 2 2 ∂f ∂f = 1 ( x2 + y + 4)−2/3 13. local min. put those expressions into 5. +1 23. 0). 0) 9. 3 cos(1) 17. 4. min. ( yz cos( x yz). 2 . ∂ y = y( x2 + y2 )−1/2 ∂f 11. Hint: Theorem 1. 1. local min. 1/2) 11. min. ∂2 f ∂ x2 ∂2 f ∂ x2 25.9 (p. 70) 5.

7/12. Hint: Think of how a vector ﬁeld f( x. Section 3.4 (p. 3 3. 5 π 4 6.5 (p. 2 2 π2 2. 0 3.) 2. 116) 1. 149) 1.705 4. 8π 3. 5 9. 2π 5. 16/15 3. 7/12) Section 4. (0. (17 17 − 5 5)/3 3.168 Section 4. 7. Section 3. 155) 1.2 (p. y) = 4 x2 y + 2 y2 + 3 x Section 3. 5a/12) 9.5 (p. 1 4 7. 123) 1. 23 5.1 (p. 6 11. 12 x2 + y2 + z2 5. F ( x. curl f = cot φ cos θ eρ + 2eθ − 2 cos θ eφ 25. Yes. (2 cos(π2 ) + π4 − 2)/4 2 10. π 2. 9 3. (0. eθ = − sin θ i + cos θ j. Hint: Start by showing that er = cos θ i + sin θ j. y) = x y2 + x3 7. ≈ 0. Yes. 4π 3/2 3 (8 − 3 ) 7. 1 n . 109) 1. 15 1. The values should converge to ≈ 1. −2π 9. 0. 12π/5 7. 112) 1. y) j in R2 can be extended in a natural way to be a vector ﬁeld in R3 . 1 − sin 2 2 9. − 23 er + r2 e z 7. 24π 7. y) i + Q ( x.3 (p. 142) 11. Section 3. 0 3. Hint: Think of how F is deﬁned. Yes 13. 175) 1.191 Chapter 3 Section 3.7182818284590455 in your program.7 (p. Yes. (b) No. Other languages have similar functions. otherwise use e = 2.3 (p. 1 3. 127) 4a 1. 2/5 4.2 (p. 1/2 11. 216π 2.6 (p. 2 5. 2πab Section 3.1 (p. (4ρ 2 − 6) e−ρ r 2 sin θ 11. (1. y) = ax y + bx + c y + d 2 y2 Section 3. No 9. 6 Section 4. 12ρ 8. −5π 5. 7 12 7 6 1 2 Chapter 4 Section 4. 2 9.146 3. 3π/16) 7.6 (p. (0. 1 6 Section 4. 134) 1.exp(x). 1 6 7. No 4. 8/3) 3.318. 1 6. 3π ) 5. F ( x.4 (p. (Hint: In Java the exponential function e x can be obtained with Math. 67/15 9. No 19. 15/4 Section 4. F ( x. 104) 1. ≈ 1. y) = x − 2 2 5. 10. Yes. ≈ 0. (7/12. 1 3. 8 ln 2 − 3 5. 2π(π − 1) 7. 0 3. 186) 1. e z = k. F ( x. 163) 1. Both are n ( n+1)2 ( n+2) 7. 6 10. 6( x + y + z) 2 z 1 9. y) = P ( x. div f = ρ − sin φ + cot φ. 1 3 5.

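Appendix B

Proof of the Right-Hand Rule for the Cross Product

We will prove the right-hand rule for the cross product of two vectors in $\mathbb{R}^3$.

For any vectors $\mathbf{v}$ and $\mathbf{w}$ in $\mathbb{R}^3$, define a new vector, $\mathbf{n}(\mathbf{v},\mathbf{w})$, as follows:

1. If $\mathbf{v}$ and $\mathbf{w}$ are nonzero and not parallel, and $\theta$ is the angle between them, then $\mathbf{n}(\mathbf{v},\mathbf{w})$ is the vector in $\mathbb{R}^3$ such that:
(a) the magnitude of $\mathbf{n}(\mathbf{v},\mathbf{w})$ is $\|\mathbf{v}\|\,\|\mathbf{w}\| \sin\theta$,
(b) $\mathbf{n}(\mathbf{v},\mathbf{w})$ is perpendicular to the plane containing $\mathbf{v}$ and $\mathbf{w}$, and
(c) $\mathbf{v}$, $\mathbf{w}$, $\mathbf{n}(\mathbf{v},\mathbf{w})$ form a right-handed system.
2. If $\mathbf{v}$ and $\mathbf{w}$ are nonzero and parallel, then $\mathbf{n}(\mathbf{v},\mathbf{w}) = \mathbf{0}$.
3. If either $\mathbf{v}$ or $\mathbf{w}$ is $\mathbf{0}$, then $\mathbf{n}(\mathbf{v},\mathbf{w}) = \mathbf{0}$.

The goal is to show that $\mathbf{n}(\mathbf{v},\mathbf{w}) = \mathbf{v} \times \mathbf{w}$ for all $\mathbf{v}$, $\mathbf{w}$ in $\mathbb{R}^3$, which would prove the right-hand rule for the cross product (by part 1(c) of our definition). To do this, we will perform the following steps:

Step 1: Show that $\mathbf{n}(\mathbf{v},\mathbf{w}) = \mathbf{v} \times \mathbf{w}$ if $\mathbf{v}$ and $\mathbf{w}$ are any two of the basis vectors $\mathbf{i}$, $\mathbf{j}$, $\mathbf{k}$.

This was already shown in Example 1.11 in Section 1.4.

Step 2: Show that $\mathbf{n}(a\mathbf{v}, b\mathbf{w}) = ab(\mathbf{v} \times \mathbf{w})$ for any scalars $a$, $b$ if $\mathbf{v}$ and $\mathbf{w}$ are any two of the basis vectors $\mathbf{i}$, $\mathbf{j}$, $\mathbf{k}$.

If either $a = 0$ or $b = 0$ then $\mathbf{n}(a\mathbf{v}, b\mathbf{w}) = \mathbf{0} = ab(\mathbf{v} \times \mathbf{w})$, so the result holds. So assume that $a \neq 0$ and $b \neq 0$. Let $\mathbf{v}$ and $\mathbf{w}$ be any two of the basis vectors $\mathbf{i}$, $\mathbf{j}$, $\mathbf{k}$. For example, we will show that the result holds for $\mathbf{v} = \mathbf{i}$ and $\mathbf{w} = \mathbf{k}$ (the other possibilities follow in a similar fashion).

For $a\mathbf{v} = a\mathbf{i}$ and $b\mathbf{w} = b\mathbf{k}$, the angle $\theta$ between $a\mathbf{v}$ and $b\mathbf{w}$ is $90^\circ$. Hence the magnitude of $\mathbf{n}(a\mathbf{v}, b\mathbf{w})$, by definition, is $\|a\mathbf{i}\|\,\|b\mathbf{k}\| \sin 90^\circ = |ab|$. Also, by definition, $\mathbf{n}(a\mathbf{v}, b\mathbf{w})$ is perpendicular to the plane containing $a\mathbf{i}$ and $b\mathbf{k}$, namely, the $xz$-plane. Thus, $\mathbf{n}(a\mathbf{v}, b\mathbf{w})$ must be a scalar multiple of $\mathbf{j}$. Since its magnitude is $|ab|$, then $\mathbf{n}(a\mathbf{v}, b\mathbf{w})$ must be either $|ab|\,\mathbf{j}$ or $-|ab|\,\mathbf{j}$.

There are four possibilities for the combinations of signs for $a$ and $b$. We will consider the case when $a > 0$ and $b > 0$ (the other three possibilities are handled similarly).

In this case, $\mathbf{n}(a\mathbf{v}, b\mathbf{w})$ must be either $ab\,\mathbf{j}$ or $-ab\,\mathbf{j}$. Now, since $\mathbf{i}$, $\mathbf{j}$, $\mathbf{k}$ form a right-handed system, then $\mathbf{i}$, $\mathbf{k}$, $\mathbf{j}$ form a left-handed system, and so $\mathbf{i}$, $\mathbf{k}$, $-\mathbf{j}$ form a right-handed system. Thus, $a\mathbf{i}$, $b\mathbf{k}$, $-ab\,\mathbf{j}$ form a right-handed system (since $a > 0$, $b > 0$, and $ab > 0$). So since, by definition, $a\mathbf{i}$, $b\mathbf{k}$, $\mathbf{n}(a\mathbf{i}, b\mathbf{k})$ form a right-handed system, and since $\mathbf{n}(a\mathbf{i}, b\mathbf{k})$ has to be either $ab\,\mathbf{j}$ or $-ab\,\mathbf{j}$, this means that we must have $\mathbf{n}(a\mathbf{i}, b\mathbf{k}) = -ab\,\mathbf{j}$.

But we know that $a\mathbf{i} \times b\mathbf{k} = ab(\mathbf{i} \times \mathbf{k}) = ab(-\mathbf{j}) = -ab\,\mathbf{j}$. Therefore, $\mathbf{n}(a\mathbf{i}, b\mathbf{k}) = ab(\mathbf{i} \times \mathbf{k})$, which is what we needed to show.

$\therefore\ \mathbf{n}(a\mathbf{v}, b\mathbf{w}) = ab(\mathbf{v} \times \mathbf{w})$

Step 3: Show that $\mathbf{n}(\mathbf{u}, \mathbf{v} + \mathbf{w}) = \mathbf{n}(\mathbf{u}, \mathbf{v}) + \mathbf{n}(\mathbf{u}, \mathbf{w})$ for any vectors $\mathbf{u}$, $\mathbf{v}$, $\mathbf{w}$.

If $\mathbf{u} = \mathbf{0}$ then the result holds trivially since $\mathbf{n}(\mathbf{u}, \mathbf{v} + \mathbf{w})$, $\mathbf{n}(\mathbf{u}, \mathbf{v})$ and $\mathbf{n}(\mathbf{u}, \mathbf{w})$ are all the zero vector. If $\mathbf{v} = \mathbf{0}$, then the result follows easily since $\mathbf{n}(\mathbf{u}, \mathbf{v} + \mathbf{w}) = \mathbf{n}(\mathbf{u}, \mathbf{0} + \mathbf{w}) = \mathbf{n}(\mathbf{u}, \mathbf{w}) = \mathbf{0} + \mathbf{n}(\mathbf{u}, \mathbf{w}) = \mathbf{n}(\mathbf{u}, \mathbf{0}) + \mathbf{n}(\mathbf{u}, \mathbf{w}) = \mathbf{n}(\mathbf{u}, \mathbf{v}) + \mathbf{n}(\mathbf{u}, \mathbf{w})$. A similar argument shows that the result holds if $\mathbf{w} = \mathbf{0}$.

So now assume that $\mathbf{u}$, $\mathbf{v}$ and $\mathbf{w}$ are all nonzero vectors. We will describe a geometric construction of $\mathbf{n}(\mathbf{u}, \mathbf{v})$, which is shown in the figure below. Let $P$ be a plane perpendicular to $\mathbf{u}$. Multiply the vector $\mathbf{v}$ by the positive scalar $\|\mathbf{u}\|$, then project the vector $\|\mathbf{u}\|\mathbf{v}$ straight down onto the plane $P$. You can think of this projection vector (denoted by $\operatorname{proj}_P \|\mathbf{u}\|\mathbf{v}$) as the shadow of the vector $\|\mathbf{u}\|\mathbf{v}$ on the plane $P$, with the light source directly overhead the terminal point of $\|\mathbf{u}\|\mathbf{v}$. If $\theta$ is the angle between $\mathbf{u}$ and $\mathbf{v}$, then we see that $\operatorname{proj}_P \|\mathbf{u}\|\mathbf{v}$ has magnitude $\|\mathbf{u}\|\,\|\mathbf{v}\| \sin\theta$, which is the magnitude of $\mathbf{n}(\mathbf{u}, \mathbf{v})$. So rotating $\operatorname{proj}_P \|\mathbf{u}\|\mathbf{v}$ by $90^\circ$ in a counter-clockwise direction in the plane $P$ gives a vector whose magnitude is the same as that of $\mathbf{n}(\mathbf{u}, \mathbf{v})$ and which is perpendicular to $\operatorname{proj}_P \|\mathbf{u}\|\mathbf{v}$ (and hence perpendicular to $\mathbf{v}$). Since this vector is in $P$ then it is also perpendicular to $\mathbf{u}$. And we can see that $\mathbf{u}$, $\mathbf{v}$ and this vector form a right-handed system. Hence this vector must be $\mathbf{n}(\mathbf{u}, \mathbf{v})$. Note that this holds even if $\mathbf{u} \parallel \mathbf{v}$, since in that case $\theta = 0^\circ$ and so $\sin\theta = 0$ which means that $\mathbf{n}(\mathbf{u}, \mathbf{v})$ has magnitude 0, which is what we would expect.

[Figure: the vectors $\mathbf{u}$ and $\|\mathbf{u}\|\mathbf{v}$ at angle $\theta$, the projection $\operatorname{proj}_P \|\mathbf{u}\|\mathbf{v}$ onto the plane $P$, and $\mathbf{n}(\mathbf{u}, \mathbf{v})$]

Now apply this same geometric construction to get $\mathbf{n}(\mathbf{u}, \mathbf{w})$ and $\mathbf{n}(\mathbf{u}, \mathbf{v} + \mathbf{w})$. Since $\|\mathbf{u}\|(\mathbf{v} + \mathbf{w})$ is the sum of the vectors $\|\mathbf{u}\|\mathbf{v}$ and $\|\mathbf{u}\|\mathbf{w}$, then the projection vector $\operatorname{proj}_P \|\mathbf{u}\|(\mathbf{v} + \mathbf{w})$ is the sum of the projection vectors $\operatorname{proj}_P \|\mathbf{u}\|\mathbf{v}$ and $\operatorname{proj}_P \|\mathbf{u}\|\mathbf{w}$ (to see this, using the shadow analogy again and the parallelogram rule for vector addition, think of how projecting a parallelogram onto a plane gives you a parallelogram in that plane). So then rotating all three projection vectors by $90^\circ$ in a counter-clockwise direction in the plane $P$ preserves that sum (see the figure below), which means that $\mathbf{n}(\mathbf{u}, \mathbf{v} + \mathbf{w}) = \mathbf{n}(\mathbf{u}, \mathbf{v}) + \mathbf{n}(\mathbf{u}, \mathbf{w})$.

[Figure: the vectors $\|\mathbf{u}\|\mathbf{v}$, $\|\mathbf{u}\|\mathbf{w}$ and $\|\mathbf{u}\|(\mathbf{v} + \mathbf{w})$, their projections onto the plane $P$, and the rotated vectors $\mathbf{n}(\mathbf{u}, \mathbf{v})$, $\mathbf{n}(\mathbf{u}, \mathbf{w})$, $\mathbf{n}(\mathbf{u}, \mathbf{v} + \mathbf{w})$]

Step 4: Show that $\mathbf{n}(\mathbf{w}, \mathbf{v}) = -\mathbf{n}(\mathbf{v}, \mathbf{w})$ for any vectors $\mathbf{v}$, $\mathbf{w}$.

If $\mathbf{v}$ and $\mathbf{w}$ are nonzero and parallel, or if either is $\mathbf{0}$, then $\mathbf{n}(\mathbf{w}, \mathbf{v}) = \mathbf{0} = -\mathbf{n}(\mathbf{v}, \mathbf{w})$, so the result holds. So assume that $\mathbf{v}$ and $\mathbf{w}$ are nonzero and not parallel. Then $\mathbf{n}(\mathbf{w}, \mathbf{v})$ has magnitude $\|\mathbf{w}\|\,\|\mathbf{v}\| \sin\theta$, which is the same as the magnitude of $\mathbf{n}(\mathbf{v}, \mathbf{w})$, and hence is the same as the magnitude of $-\mathbf{n}(\mathbf{v}, \mathbf{w})$. By definition, $\mathbf{n}(\mathbf{v}, \mathbf{w})$ is perpendicular to the plane containing $\mathbf{w}$ and $\mathbf{v}$, and hence so is $-\mathbf{n}(\mathbf{v}, \mathbf{w})$. Also, $\mathbf{v}$, $\mathbf{w}$, $\mathbf{n}(\mathbf{v}, \mathbf{w})$ form a right-handed system, and so $\mathbf{w}$, $\mathbf{v}$, $\mathbf{n}(\mathbf{v}, \mathbf{w})$ form a left-handed system, and hence $\mathbf{w}$, $\mathbf{v}$, $-\mathbf{n}(\mathbf{v}, \mathbf{w})$ form a right-handed system. Thus, we have shown that $-\mathbf{n}(\mathbf{v}, \mathbf{w})$ is a vector with the same magnitude as $\mathbf{n}(\mathbf{w}, \mathbf{v})$ and is perpendicular to the plane containing $\mathbf{w}$ and $\mathbf{v}$, and that $\mathbf{w}$, $\mathbf{v}$, $-\mathbf{n}(\mathbf{v}, \mathbf{w})$ form a right-handed system. So by definition this means that $-\mathbf{n}(\mathbf{v}, \mathbf{w})$ must be $\mathbf{n}(\mathbf{w}, \mathbf{v})$.

Step 5: Show that $\mathbf{n}(\mathbf{v}, \mathbf{w}) = \mathbf{v} \times \mathbf{w}$ for all vectors $\mathbf{v}$, $\mathbf{w}$.

Write $\mathbf{v} = v_1\mathbf{i} + v_2\mathbf{j} + v_3\mathbf{k}$ and $\mathbf{w} = w_1\mathbf{i} + w_2\mathbf{j} + w_3\mathbf{k}$. Then by Steps 3 and 4, we have

$$\begin{aligned}
\mathbf{n}(\mathbf{v}, \mathbf{w}) &= \mathbf{n}(v_1\mathbf{i} + v_2\mathbf{j} + v_3\mathbf{k},\ w_1\mathbf{i} + w_2\mathbf{j} + w_3\mathbf{k}) \\
&= \mathbf{n}(v_1\mathbf{i} + v_2\mathbf{j} + v_3\mathbf{k},\ w_1\mathbf{i}) + \mathbf{n}(v_1\mathbf{i} + v_2\mathbf{j} + v_3\mathbf{k},\ w_2\mathbf{j} + w_3\mathbf{k}) \\
&= \mathbf{n}(v_1\mathbf{i} + v_2\mathbf{j} + v_3\mathbf{k},\ w_1\mathbf{i}) + \mathbf{n}(v_1\mathbf{i} + v_2\mathbf{j} + v_3\mathbf{k},\ w_2\mathbf{j}) + \mathbf{n}(v_1\mathbf{i} + v_2\mathbf{j} + v_3\mathbf{k},\ w_3\mathbf{k}) \\
&= -\mathbf{n}(w_1\mathbf{i},\ v_1\mathbf{i} + v_2\mathbf{j} + v_3\mathbf{k}) - \mathbf{n}(w_2\mathbf{j},\ v_1\mathbf{i} + v_2\mathbf{j} + v_3\mathbf{k}) - \mathbf{n}(w_3\mathbf{k},\ v_1\mathbf{i} + v_2\mathbf{j} + v_3\mathbf{k}).
\end{aligned}$$

We can use Steps 1 and 2 to evaluate the three terms on the right side of the last equation above:

$$\begin{aligned}
-\mathbf{n}(w_1\mathbf{i},\ v_1\mathbf{i} + v_2\mathbf{j} + v_3\mathbf{k}) &= -\mathbf{n}(w_1\mathbf{i}, v_1\mathbf{i}) - \mathbf{n}(w_1\mathbf{i}, v_2\mathbf{j}) - \mathbf{n}(w_1\mathbf{i}, v_3\mathbf{k}) \\
&= -v_1 w_1\,\mathbf{n}(\mathbf{i}, \mathbf{i}) - v_2 w_1\,\mathbf{n}(\mathbf{i}, \mathbf{j}) - v_3 w_1\,\mathbf{n}(\mathbf{i}, \mathbf{k}) \\
&= -v_1 w_1 (\mathbf{i} \times \mathbf{i}) - v_2 w_1 (\mathbf{i} \times \mathbf{j}) - v_3 w_1 (\mathbf{i} \times \mathbf{k}) \\
&= -v_1 w_1\,\mathbf{0} - v_2 w_1\,\mathbf{k} - v_3 w_1 (-\mathbf{j}) \\
&= -v_2 w_1\,\mathbf{k} + v_3 w_1\,\mathbf{j}
\end{aligned}$$

Similarly, we can calculate

$$-\mathbf{n}(w_2\mathbf{j},\ v_1\mathbf{i} + v_2\mathbf{j} + v_3\mathbf{k}) = v_1 w_2\,\mathbf{k} - v_3 w_2\,\mathbf{i}$$

and

$$-\mathbf{n}(w_3\mathbf{k},\ v_1\mathbf{i} + v_2\mathbf{j} + v_3\mathbf{k}) = -v_1 w_3\,\mathbf{j} + v_2 w_3\,\mathbf{i}.$$

Thus, putting it all together, we have

$$\begin{aligned}
\mathbf{n}(\mathbf{v}, \mathbf{w}) &= -v_2 w_1\,\mathbf{k} + v_3 w_1\,\mathbf{j} + v_1 w_2\,\mathbf{k} - v_3 w_2\,\mathbf{i} - v_1 w_3\,\mathbf{j} + v_2 w_3\,\mathbf{i} \\
&= (v_2 w_3 - v_3 w_2)\,\mathbf{i} + (v_3 w_1 - v_1 w_3)\,\mathbf{j} + (v_1 w_2 - v_2 w_1)\,\mathbf{k} \\
&= \mathbf{v} \times \mathbf{w} \quad \text{by definition of the cross product.}
\end{aligned}$$

$\therefore\ \mathbf{n}(\mathbf{v}, \mathbf{w}) = \mathbf{v} \times \mathbf{w}$ for all vectors $\mathbf{v}$, $\mathbf{w}$.

So since $\mathbf{v}$, $\mathbf{w}$, $\mathbf{n}(\mathbf{v}, \mathbf{w})$ form a right-handed system, then $\mathbf{v}$, $\mathbf{w}$, $\mathbf{v} \times \mathbf{w}$ form a right-handed system, which completes the proof.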
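The three defining properties of $\mathbf{n}(\mathbf{v}, \mathbf{w})$ can also be spot-checked numerically for the component formula of the cross product. The sketch below is an illustration, not part of the proof. It makes two assumptions: the magnitude condition is tested in the equivalent squared form $\|\mathbf{v} \times \mathbf{w}\|^2 = \|\mathbf{v}\|^2 \|\mathbf{w}\|^2 - (\mathbf{v} \cdot \mathbf{w})^2$, and right-handedness is tested by the sign of the determinant whose rows are $\mathbf{v}$, $\mathbf{w}$, $\mathbf{v} \times \mathbf{w}$. All function names and the random test vectors are hypothetical.

```python
import random

def cross(v, w):
    """Component formula for the cross product v x w."""
    return (v[1] * w[2] - v[2] * w[1],
            v[2] * w[0] - v[0] * w[2],
            v[0] * w[1] - v[1] * w[0])

def dot(v, w):
    return sum(a * b for a, b in zip(v, w))

def det3(a, b, c):
    """Determinant with rows a, b, c (the scalar triple product a . (b x c))."""
    return dot(a, cross(b, c))

def check_properties(v, w, tol=1e-9):
    """Check magnitude, perpendicularity and orientation of v x w."""
    n = cross(v, w)
    # ||v||^2 ||w||^2 sin^2(theta) written without trigonometric functions
    expected_sq = dot(v, v) * dot(w, w) - dot(v, w) ** 2
    return {
        "magnitude": abs(dot(n, n) - expected_sq) < tol,
        "perpendicular": abs(dot(n, v)) < tol and abs(dot(n, w)) < tol,
        "right_handed": det3(v, w, n) > 0,  # positive for a right-handed triple
    }

random.seed(0)  # fixed seed so the sample vectors are reproducible

def random_vector():
    return tuple(random.uniform(-1.0, 1.0) for _ in range(3))

results = [check_properties(random_vector(), random_vector()) for _ in range(100)]
all_ok = all(all(r.values()) for r in results)
print(all_ok)
```

The orientation test would fail for parallel or zero vectors, where the determinant is 0; this matches cases 2 and 3 of the definition, in which $\mathbf{n}(\mathbf{v}, \mathbf{w}) = \mathbf{0}$ and no right-handed system is claimed.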

Below is a very brief tutorial on how to use Gnuplot to graph functions of several variables.0. which we will now describe. style “Regular”.html and follow the links to download the latest version for your operating system.gnuplot. in Windows you would unzip the Zip ﬁle you downloaded in Step 1 into some folder (use the “Use folder names” option if extracting with WinZip).2.Appendix C 3D Graphing with Gnuplot Gnuplot is a free.2. just type gnuplot in a terminal window. GRAPHING FUNCTIONS The usual way to create 3D graphs in Gnuplot is with the splot command: splot <range> <comma-separated list of functions> 196 .exe from the folder (or bin folder) where you installed Gnuplot. while in Linux it will appear in the terminal window where the gnuplot command was run.0. Versions are available for many operating systems. open-source software package for producing a variety of graphs. All the examples we will discuss require at least version 4. run wgnuplot. which is version 4. 3. In Windows this will appear in a new window. RUNNING GNUPLOT 1.zip. 2.. INSTALLATION 1. size “12” is usually a good choice (that choice can be saved for future sessions by right-clicking in the Gnuplot window again and selecting the option to update wgnuplot.ini). For Windows.info/download. For example. For Windows. You should now get a Gnuplot terminal with a gnuplot> command prompt.” option. Go to http://www. In Windows. if the font is unreadable you can change it by right-clicking on the text part of the Gnuplot window and selecting the “Choose Font. 2. At the gnuplot> command prompt you can now run graphing commands. In Linux. you should get the Zip ﬁle with a name such as gp420win32. For example. Install the downloaded ﬁle. the font “Courier”.

For a function z = f(x, y), <range> is the range of x and y values (and optionally the range of z values) over which to plot. To specify an x range and a y range, use an expression of the form [a:b][c:d], for some numbers a < b and c < d. This will cause the graph to be plotted for a ≤ x ≤ b and c ≤ y ≤ d.

Function definitions use the x and y variables in combination with mathematical operators, listed below:

    Symbol    Operation        Example      Result
    +         Addition         2+3          5
    -         Subtraction      3-2          1
    *         Multiplication   2*3          6
    /         Division         4/2          2
    **        Power            2**3         2^3 = 8
    exp(x)    e^x              exp(2)       e^2
    log(x)    ln x             log(2)       ln 2
    sin(x)    sin x            sin(pi/2)    1
    cos(x)    cos x            cos(pi)      -1
    tan(x)    tan x            tan(pi/4)    1

Example C.1. To graph the function z = 2x^2 + y^2 from x = −1 to x = 1 and from y = −2 to y = 2, type this at the gnuplot> prompt:

    splot [-1:1][-2:2] 2*x**2 + y**2

The result is shown below:

[Figure: surface plot of 2*x**2 + y**2 for −1 ≤ x ≤ 1 and −2 ≤ y ≤ 2]
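The operator table and the function in Example C.1 can be checked outside Gnuplot. The sketch below uses Python only because it happens to share the same operator symbols, including ** for powers; the names `f` and `table` are made up for this illustration, and Python's `math.log` is assumed to play the role of Gnuplot's log(x) (the natural logarithm, as the table states).

```python
import math

def f(x, y):
    """z = 2x^2 + y^2, typed the same way as in the splot command."""
    return 2 * x**2 + y**2

# The "Example" column of the operator table, evaluated in Python.
table = {
    "2+3": 2 + 3,
    "3-2": 3 - 2,
    "2*3": 2 * 3,
    "4/2": 4 / 2,
    "2**3": 2**3,
    "exp(2)": math.exp(2),
    "log(2)": math.log(2),
    "sin(pi/2)": math.sin(math.pi / 2),
    "cos(pi)": math.cos(math.pi),
    "tan(pi/4)": math.tan(math.pi / 4),
}

for expression, value in table.items():
    print(f"{expression:10s} = {value}")

# Values of the surface at the centre and at the corners of the plotted range.
print("f(0, 0)  =", f(0, 0))
for x in (-1, 1):
    for y in (-2, 2):
        print(f"f({x}, {y}) =", f(x, y))
```

The corner values show why the z-axis of the graph only needs to cover a small range: the function is smallest at the origin and largest at the four corners of the rectangle.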

Note that we had to type 2*x**2 to multiply 2 times x^2. For clarity, parentheses can be used to make sure the operations are being performed in the correct order:

    splot [-1:1][-2:2] 2*(x**2) + y**2

In the above example, to also plot the function z = e^(x+y) on the same graph, put a comma after the first function then append the new function:

    splot [-1:1][-2:2] 2*(x**2) + y**2, exp(x+y)

By default, the x-axis and y-axis are not shown in the graph. To display the axes, use this command before the splot command:

    set zeroaxis

Also, by default the x- and y-axes are switched from their usual position. To show the axes with the orientation which we have used throughout the text, use this command:

    set view 60,120,1,1

Also, to label the axes, use these commands:

    set xlabel "x"
    set ylabel "y"
    set zlabel "z"

To show the level curves of the surface z = f(x, y) on both the surface and projected onto the xy-plane, use this command:

    set contour both

The default mesh size for the grid on the surface is 10 units. To get more of a colored/shaded surface, increase the mesh size (to, say, 25) like this:

    set isosamples 25

Putting all this together, we get the following graph with these commands:

    set zeroaxis
    set view 60,120,1,1
    set xlabel "x"
    set ylabel "y"
    set zlabel "z"
    set contour both
    set isosamples 25
    splot [-1:1][-2:2] 2*(x**2) + y**2, exp(x+y)
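To get a feel for the z-values that the combined plot has to cover, both surfaces can be sampled on a grid over the same ranges. The Python sketch below assumes, purely for illustration, that a mesh size of 25 corresponds to a 25-by-25 grid of evenly spaced sample points; the text only calls it the mesh size, and Gnuplot's actual sampling may differ. The helper names `linspace` and `sample` are hypothetical.

```python
import math

def linspace(a, b, n):
    """n evenly spaced values from a to b inclusive (n >= 2)."""
    step = (b - a) / (n - 1)
    return [a + i * step for i in range(n)]

def sample(func, x_range, y_range, n=25):
    """Evaluate func on an n-by-n grid over the given x and y ranges."""
    xs = linspace(x_range[0], x_range[1], n)
    ys = linspace(y_range[0], y_range[1], n)
    return [func(x, y) for x in xs for y in ys]

def paraboloid(x, y):
    """z = 2x^2 + y^2, the first function in the splot command."""
    return 2 * (x**2) + y**2

def exponential(x, y):
    """z = e^(x+y), the second function in the splot command."""
    return math.exp(x + y)

# Same ranges as [-1:1][-2:2] in the splot command.
z1 = sample(paraboloid, (-1, 1), (-2, 2))
z2 = sample(exponential, (-1, 1), (-2, 2))

print("2*(x**2) + y**2 ranges over", min(z1), "to", max(z1))
print("exp(x+y)        ranges over", min(z2), "to", max(z2))
```

The exponential surface reaches e^3 at the corner (1, 2), which is why its graph rises well above the paraboloid there.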

[Figure: surfaces 2*x**2 + y**2 and exp(x+y) plotted with level curves; axes labeled x, y, z]

The numbers listed below the functions in the key in the upper right corner of the graph are the “levels” of the level curves of the corresponding surface. That is, they are the numbers c such that f(x, y) = c. Because of the large number of level curves, the key was put outside the graph with the set key outside command. If you do not want the function key displayed, it can be turned off with this command:

    unset key

PARAMETRIC FUNCTIONS

Gnuplot has the ability to graph surfaces given in various parametric forms. For example, for a surface parametrized in cylindrical coordinates

    x = r cos θ,  y = r sin θ,  z = z

you would do the following:

    set mapping cylindrical
    set parametric
    splot [a:b][c:d] v*cos(u),v*sin(u),f(u,v)

where the variable u represents θ, with a ≤ u ≤ b, the variable v represents r, with c ≤ v ≤ d, and z = f(u, v) is some function of u and v.

Example C.2. The graph of the helicoid z = θ in Example 1.34 from Section 1.7 (p. 49) was created using the following commands:

    set mapping cylindrical
    set parametric
    set view 60,120,1,1
    set xyplane 0
    set xlabel "x"
    set ylabel "y"
    set zlabel "z"
    unset key
    set isosamples 15
    splot [0:4*pi][0:2] v*cos(u),v*sin(u),u

The command set xyplane 0 moves the z-axis so that z = 0 aligns with the xy-plane (which is not the default in Gnuplot). Looking at the graph, you will see that r varies from 0 to 2, and θ varies from 0 to 4π.

PRINTING AND SAVING

In Windows, to print a graph from Gnuplot right-click on the titlebar of the graph’s window, select “Options” and then the “Print...” option. If that does not work on your version of Gnuplot, then go to the File menu on the main Gnuplot menubar, select “Output Device ...”, and enter pdf in the Terminal type? textfield, hit OK. That will allow you to print the graph as a PDF file.

To save a graph, say, as a PNG file, go to the File menu on the main Gnuplot menubar, select “Output Device ...”, and enter png in the Terminal type? textfield, hit OK. Then, in the File menu again, select the “Output ...” option and enter a filename (say, graph.png) in the Output filename? textfield, hit OK. Now run your splot command again and you should see a file called graph.png in the current directory (usually the directory where wgnuplot.exe is located, though you can change that setting using the “Change Directory ...” option in the File menu).

In Linux, to save the graph as a file called graph.png, you would issue the following commands:

    set terminal png
    set output 'graph.png'

and then run your splot command. There are many terminal types (which determine the output format). Run the command set terminal to see all the possible types. In Linux, the postscript terminal type is popular, since the print quality is high and there are many PostScript viewers available.

To quit Gnuplot, type quit at the gnuplot> command prompt.
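The cylindrical parametrization used for the helicoid in Example C.2 can also be checked numerically. The following Python sketch is an illustration only: the function names `helicoid_point` and `helicoid_grid` are made up here, and the 15-by-15 grid is an assumption loosely modeled on the isosamples setting. It computes points (v cos u, v sin u, u) for 0 ≤ u ≤ 4π and 0 ≤ v ≤ 2, and confirms that the radius of each point equals v and its height equals u.

```python
import math

def helicoid_point(u, v):
    """Point (x, y, z) on the helicoid z = theta, with theta = u and r = v."""
    return (v * math.cos(u), v * math.sin(u), u)

def helicoid_grid(n=15):
    """Sample the helicoid on an n-by-n grid with 0 <= u <= 4*pi, 0 <= v <= 2."""
    us = [4 * math.pi * i / (n - 1) for i in range(n)]
    vs = [2.0 * j / (n - 1) for j in range(n)]
    return [helicoid_point(u, v) for u in us for v in vs]

points = helicoid_grid()

# The distance from the z-axis is r and the height is theta.
largest_radius = max(math.hypot(x, y) for x, y, z in points)
lowest = min(z for x, y, z in points)
highest = max(z for x, y, z in points)

print("number of sample points:", len(points))
print("largest radius:", largest_radius)
print("height range:", lowest, "to", highest)
```

Because θ runs from 0 to 4π, the sampled surface makes two full turns around the z-axis, in agreement with the ranges given in the splot command.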

GNU Free Documentation License

Version 1.2, November 2002
Copyright ©2000,2001,2002 Free Software Foundation, Inc.
51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA

Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed.

Preamble

The purpose of this License is to make a manual, textbook, or other functional and useful document "free" in the sense of freedom: to assure everyone the effective freedom to copy and redistribute it, with or without modifying it, either commercially or noncommercially. Secondarily, this License preserves for the author and publisher a way to get credit for their work, while not being considered responsible for modifications made by others.

This License is a kind of "copyleft", which means that derivative works of the document must themselves be free in the same sense. It complements the GNU General Public License, which is a copyleft license designed for free software.

We have designed this License in order to use it for manuals for free software, because free software needs free documentation: a free program should come with manuals providing the same freedoms that the software does. But this License is not limited to software manuals; it can be used for any textual work, regardless of subject matter or whether it is published as a printed book. We recommend this License principally for works whose purpose is instruction or reference.

1. APPLICABILITY AND DEFINITIONS

This License applies to any manual or other work, in any medium, that contains a notice placed by the copyright holder saying it can be distributed under the terms of this License. Such a notice grants a world-wide, royalty-free license, unlimited in duration, to use that work under the conditions stated herein. The "Document", below, refers to any such manual or work. Any member of the public is a licensee, and is addressed as "you". You accept the license if you copy, modify or distribute the work in a way requiring permission under copyright law.

A "Modified Version" of the Document means any work containing the Document or a portion of it, either copied verbatim, or with modifications and/or translated into another language.

A "Secondary Section" is a named appendix or a front-matter section of the Document that deals exclusively with the relationship of the publishers or authors of the Document to the Document's overall subject (or to related matters) and contains nothing that could fall directly within that overall subject. (Thus, if the Document is in part a textbook of mathematics, a Secondary Section may not explain any mathematics.) The relationship could be a matter of historical connection with the subject or with related matters, or of legal, commercial, philosophical, ethical or political position regarding them.

The "Invariant Sections" are certain Secondary Sections whose titles are designated, as being those of Invariant Sections, in the notice that says that the Document is released under this License. If a section does not fit the above definition of Secondary then it is not allowed to be designated as Invariant. The Document may contain zero Invariant Sections. If the Document does not identify any Invariant Sections then there are none.

The "Cover Texts" are certain short passages of text that are listed, as Front-Cover Texts or Back-Cover Texts, in the notice that says that the Document is released under this License. A Front-Cover Text may be at most 5 words, and a Back-Cover Text may be at most 25 words.

A "Transparent" copy of the Document means a machine-readable copy, represented in a format whose specification is available to the general public, that is suitable for revising the document straightforwardly with generic text editors or (for images composed of pixels) generic paint programs or (for drawings) some widely available drawing editor, and that is suitable for input to text formatters or for automatic translation to a variety of formats suitable for input to text formatters. A copy made in an otherwise Transparent file format whose markup, or absence of markup, has been arranged to thwart or discourage subsequent modification by readers is not Transparent. An image format is not Transparent if used for any substantial amount of text. A copy that is not "Transparent" is called "Opaque".

Examples of suitable formats for Transparent copies include plain ASCII without markup, Texinfo input format, LaTeX input format, SGML or XML using a publicly available DTD, and standard-conforming simple HTML, PostScript or PDF designed for human modification. Examples of transparent image formats include PNG, XCF and JPG. Opaque formats include proprietary formats that can be read and edited only by proprietary word processors, SGML or XML for which the DTD and/or processing tools are not generally available, and the machine-generated HTML, PostScript or PDF produced by some word processors for output purposes only.

The "Title Page" means, for a printed book, the title page itself, plus such following pages as are needed to hold, legibly, the material this License requires to appear in the title page. For works in formats which do not have any title page as such, "Title Page" means the text near the most prominent appearance of the work's title, preceding the beginning of the body of the text.

A section "Entitled XYZ" means a named subunit of the Document whose title either is precisely XYZ or contains XYZ in parentheses following text that translates XYZ in another language. (Here XYZ stands for a specific section name mentioned below, such as "Acknowledgments", "Dedications", "Endorsements", or "History".) To "Preserve the

Title" of such a section when you modify the Document means that it remains a section "Entitled XYZ" according to this definition.

The Document may include Warranty Disclaimers next to the notice which states that this License applies to the Document. These Warranty Disclaimers are considered to be included by reference in this License, but only as regards disclaiming warranties: any other implication that these Warranty Disclaimers may have is void and has no effect on the meaning of this License.

2. VERBATIM COPYING

You may copy and distribute the Document in any medium, either commercially or noncommercially, provided that this License, the copyright notices, and the license notice saying this License applies to the Document are reproduced in all copies, and that you add no other conditions whatsoever to those of this License. You may not use technical measures to obstruct or control the reading or further copying of the copies you make or distribute. However, you may accept compensation in exchange for copies. If you distribute a large enough number of copies you must also follow the conditions in section 3.

You may also lend copies, under the same conditions stated above, and you may publicly display copies.

3. COPYING IN QUANTITY

If you publish printed copies (or copies in media that commonly have printed covers) of the Document, numbering more than 100, and the Document's license notice requires Cover Texts, you must enclose the copies in covers that carry, clearly and legibly, all these Cover Texts: Front-Cover Texts on the front cover, and Back-Cover Texts on the back cover. Both covers must also clearly and legibly identify you as the publisher of these copies. The front cover must present the full title with all words of the title equally prominent and visible. You may add other material on the covers in addition. Copying with changes limited to the covers, as long as they preserve the title of the Document and satisfy these conditions, can be treated as verbatim copying in other respects.

If the required texts for either cover are too voluminous to fit legibly, you should put the first ones listed (as many as fit reasonably) on the actual cover, and continue the rest onto adjacent pages.

If you publish or distribute Opaque copies of the Document numbering more than 100, you must either include a machine-readable Transparent copy along with each Opaque copy, or state in or with each Opaque copy a computer-network location from which the general network-using public has access to download using public-standard network protocols a complete Transparent copy of the Document, free of added material. If you use the latter option, you must take reasonably prudent steps, when you begin distribution of Opaque copies in quantity, to ensure that this Transparent copy will remain thus accessible at the stated location until at least one year after the last time you distribute an Opaque copy (directly or through your agents or retailers) of that edition to the public.

It is requested, but not required, that you contact the authors of the Document well before redistributing any large number of copies, to give them a chance to provide you with an updated version of the Document.

4. MODIFICATIONS

You may copy and distribute a Modified Version of the Document under the conditions of sections 2 and 3 above, provided that you release the Modified Version under precisely this License, with the Modified Version filling the role of the Document, thus licensing distribution and modification of the Modified Version to whoever possesses a copy of it. In addition, you must do these things in the Modified Version:

A. Use in the Title Page (and on the covers, if any) a title distinct from that of the Document, and from those of previous versions (which should, if there were any, be listed in the History section of the Document). You may use the same title as a previous version if the original publisher of that version gives permission.

B. List on the Title Page, as authors, one or more persons or entities responsible for authorship of the modifications in the Modified Version, together with at least five of the principal authors of the Document (all of its principal authors, if it has fewer than five), unless they release you from this requirement.

C. State on the Title page the name of the publisher of the Modified Version, as the publisher.

D. Preserve all the copyright notices of the Document.

E. Add an appropriate copyright notice for your modifications adjacent to the other copyright notices.

F. Include, immediately after the copyright notices, a license notice giving the public permission to use the Modified Version under the terms of this License, in the form shown in the Addendum below.

G. Preserve in that license notice the full lists of Invariant Sections and required Cover Texts given in the Document's license notice.

H. Include an unaltered copy of this License.

I. Preserve the section Entitled "History", Preserve its Title, and add to it an item stating at least the title, year, new authors, and publisher of the Modified Version as given on the Title Page. If there is no section Entitled "History" in the Document, create one stating the title, year, authors, and publisher of the Document as given on its Title Page, then add an item describing the Modified Version as stated in the previous sentence.

J. Preserve the network location, if any, given in the Document for public access to a Transparent copy of the Document, and likewise the network locations given in the Document for previous versions it was based on. These may be placed in the "History" section. You may omit a network location for a work that was published at least four years before the Document itself, or if the original publisher of the version it refers to gives permission.

K. For any section Entitled "Acknowledgments" or "Dedications", Preserve the Title of the section, and preserve in the section all the substance and tone of each of the contributor acknowledgments and/or dedications given therein.

L. Preserve all the Invariant Sections of the Document, unaltered in their text and in their titles. Section numbers or the equivalent are not considered part of the section titles.

M. Delete any section Entitled "Endorsements". Such a section may not be included in the Modified Version.

N. Do not retitle any existing section to be Entitled "Endorsements" or to conflict in title with any Invariant Section.

O. Preserve any Warranty Disclaimers.

If the Modified Version includes new front-matter sections or appendices that qualify as Secondary Sections and contain no material copied from the Document, you may at your option designate some or all of these sections as invariant. To do this, add their titles to the list of Invariant Sections in the Modified Version's license notice. These titles must be distinct from any other section titles.

You may add a section Entitled "Endorsements", provided it contains nothing but endorsements of your Modified Version by various parties–for example, statements of peer review or that the text has been approved by an organization as the authoritative definition of a standard.

You may add a passage of up to five words as a Front-Cover Text, and a passage of up to 25 words as a Back-Cover Text, to the end of the list of Cover Texts in the Modified Version. Only one passage of Front-Cover Text and one of Back-Cover Text may be added by (or through arrangements made by) any one entity. If the Document already includes a cover text for the same cover, previously added by you or by arrangement made by the same entity you are acting on behalf of, you may not add another; but you may replace the old one, on explicit permission from the previous publisher that added the old one.

The author(s) and publisher(s) of the Document do not by this License give permission to use their names for publicity for or to assert or imply endorsement of any Modified Version.

5. COMBINING DOCUMENTS

You may combine the Document with other documents released under this License, under the terms defined in section 4 above for modified versions, provided that you include in the

combination all of the Invariant Sections of all of the original documents, unmodified, and list them all as Invariant Sections of your combined work in its license notice, and that you preserve all their Warranty Disclaimers.

The combined work need only contain one copy of this License, and multiple identical Invariant Sections may be replaced with a single copy. If there are multiple Invariant Sections with the same name but different contents, make the title of each such section unique by adding at the end of it, in parentheses, the name of the original author or publisher of that section if known, or else a unique number. Make the same adjustment to the section titles in the list of Invariant Sections in the license notice of the combined work.

In the combination, you must combine any sections Entitled "History" in the various original documents, forming one section Entitled "History"; likewise combine any sections Entitled "Acknowledgments", and any sections Entitled "Dedications". You must delete all sections Entitled "Endorsements".

6. COLLECTIONS OF DOCUMENTS

You may make a collection consisting of the Document and other documents released under this License, and replace the individual copies of this License in the various documents with a single copy that is included in the collection, provided that you follow the rules of this License for verbatim copying of each of the documents in all other respects.

You may extract a single document from such a collection, and distribute it individually under this License, provided you insert a copy of this License into the extracted document, and follow this License in all other respects regarding verbatim copying of that document.

7. AGGREGATION WITH INDEPENDENT WORKS

A compilation of the Document or its derivatives with other separate and independent documents or works, in or on a volume of a storage or distribution medium, is called an "aggregate" if the copyright resulting from the compilation is not used to limit the legal rights of the compilation's users beyond what the individual works permit. When the Document is included in an aggregate, this License does not apply to the other works in the aggregate which are not themselves derivative works of the Document.

If the Cover Text requirement of section 3 is applicable to these copies of the Document, then if the Document is less than one half of the entire aggregate, the Document's Cover Texts may be placed on covers that bracket the Document within the aggregate, or the electronic equivalent of covers if the Document is in electronic form. Otherwise they must appear on printed covers that bracket the whole aggregate.

8. TRANSLATION

Translation is considered a kind of modification, so you may distribute translations of the Document under the terms of section 4. Replacing Invariant Sections with translations requires special permission from their copyright holders, but you may include translations of

some or all Invariant Sections in addition to the original versions of these Invariant Sections. You may include a translation of this License, and all the license notices in the Document, and any Warranty Disclaimers, provided that you also include the original English version of this License and the original versions of those notices and disclaimers. In case of a disagreement between the translation and the original version of this License or a notice or disclaimer, the original version will prevail.

If a section in the Document is Entitled "Acknowledgments", "Dedications", or "History", the requirement (section 4) to Preserve its Title (section 1) will typically require changing the actual title.

9. TERMINATION

You may not copy, modify, sublicense, or distribute the Document except as expressly provided for under this License. Any other attempt to copy, modify, sublicense or distribute the Document is void, and will automatically terminate your rights under this License. However, parties who have received copies, or rights, from you under this License will not have their licenses terminated so long as such parties remain in full compliance.

10. FUTURE REVISIONS OF THIS LICENSE

The Free Software Foundation may publish new, revised versions of the GNU Free Documentation License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns. See http://www.gnu.org/copyleft/.

Each version of the License is given a distinguishing version number. If the Document specifies that a particular numbered version of this License "or any later version" applies to it, you have the option of following the terms and conditions either of that specified version or of any later version that has been published (not as a draft) by the Free Software Foundation. If the Document does not specify a version number of this License, you may choose any version ever published (not as a draft) by the Free Software Foundation.

ADDENDUM: How to use this License for your documents

To use this License in a document you have written, include a copy of the License in the document and put the following copyright and license notices just after the title page:

    Copyright ©YEAR YOUR NAME. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the section entitled "GNU Free Documentation License".

If you have Invariant Sections, Front-Cover Texts and Back-Cover Texts, replace the "with...Texts." line with this:

    with the Invariant Sections being LIST THEIR TITLES, with the Front-Cover Texts being LIST, and with the Back-Cover Texts being LIST.

If you have Invariant Sections without Cover Texts, or some other combination of the three, merge those two alternatives to suit the situation.

If your document contains nontrivial examples of program code, we recommend releasing these examples in parallel under your choice of free software license, such as the GNU General Public License, to permit their use in free software.

History

This section contains the revision history of the book. For persons making modifications to the book, please record the pertinent information here, following the format in the first item below.

1. VERSION: 1.0
   Date: 2008-01-04
   Author(s): Michael Corral
   Title: Vector Calculus
   Modification(s): Initial version

Index

A: acceleration, angle, annulus, area element
B: Bézier curve, Beta function
C: capping surface, circulation, collinear, conical helix, continuously differentiable, coordinates (curvilinear, cylindrical)