Professional Documents
Culture Documents
Frederick
Frederick
Calculus
1 Three-Dimensional Space . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.1 Rectangular Coordinates in R3 5
1.2 Dot Product 7
1.3 Cross Product 9
1.4 Lines and Planes 11
1.5 Parametric Curves 13
2 Partial Differentiations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
2.1 Functions of Several Variables 19
2.2 Partial Derivatives 22
2.3 Chain Rule 26
2.4 Directional Derivatives 30
2.5 Tangent Planes 34
2.6 Local Extrema 36
2.7 Lagrange’s Multiplier 41
2.8 Optimizations 46
3 Multiple Integrations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
3.1 Double Integrals in Rectangular Coordinates 49
3.2 Fubini’s Theorem for General Regions 53
3.3 Double Integrals in Polar Coordinates 57
3.4 Triple Integrals in Rectangular Coordinates 62
3.5 Triple Integrals in Cylindrical Coordinates 67
3.6 Triple Integrals in Spherical Coordinates 70
4 Vector Calculus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
4.1 Vector Fields on R2 and R3 75
4.2 Line Integrals of Vector Fields 83
4.3 Conservative Vector Fields 88
4.4 Green’s Theorem 98
4.5 Parametric Surfaces 105
4.6 Stokes’ Theorem 120
4.7 Divergence Theorem 127
c
( a, b, c)
b
a
y
( a, b, 0)
x
Notation We will use the notation R3 to denote the entire three dimensional space.
Any point on the x-axis has the form ( x, 0, 0), i.e. y = 0 and z = 0. Similarly, points on
the y-axis are of the form (0, y, 0), and points on the z-axis are of the form (0, 0, z). The three
coordinate axes meet at a point with coordinates (0, 0, 0) which is called the origin.
A vector in R3 is an arrow which is based at one point and is pointing at another point. If a
vector v is based at ( x0 , y0 , z0 ) and points toward ( x1 , y1 , z1 ), then the vector is written as:
For example, the vector based at (3, 2, −1) pointing at (5, 2, 0) is expressed as
i Consequently, any two vectors that are pointing at the same direction and have the same
length are considered to be equal, even though they may have different base points. For
instance, a vector v based at (1, 2, 3) pointing at (4, 3, 2), i.e.
is considered to be equal to the vector w based at (0, 0, 0) pointing at (3, 1, −1). In other
words, we can write w = v.
An alternative notation for a vector is the angle bracket h a, b, ci (to save the hassle of writing
down i, j and k:
Notation h a, b, ci = ai + bj + ck.
In this course, we make very little conceptual distinction between a point ( x, y, z) and a
vector based at (0, 0, 0) pointing at the point ( x, y, z). However, speaking of notations, one should
use h x, y, zi or xi + yj + zk to denote a vector and ( x, y, z) to denote a point so as to avoid
confusion.
Vector additions can scalar multiplications are defined as follows:
Definition 1.1 — Vector Additions and Scalar Multiplications. Let a = h a1 , a2 , a3 i and b =
hb1 , b2 , b3 i be two vectors in R3 , and c be a real scalar, then:
a + b = h a1 + b1 , a2 + b2 , a3 + b3 i (vector addition)
ca = hca1 , ca2 , ca3 i (scalar multiplication)
The negative of a vector is defined as: −a = (−1)a. The difference between vectors is
defined as a − b = a + (−b).
a a+b
b
b
a a−b
Property Vector additions and scalar multiplications have the following algebraic properties
1. commutative rule: a + b = b + a
2. associatative rule: (a + b) + c = a + (b + c)
3. distributive rules: (λ + µ)a = λa + µa and λ(a + b) = λa + λb
1.2 Dot Product 7
a · b = a1 b1 + a2 b2 + a3 b3 .
It is important to note that the dot product of a vector a = h a1 , a2 , a3 i by itself is given by:
The length of a vector, i.e. |a|, is also called the norm, or the magnitude, of the vector.
Property It can easily be verified that the dot product satisfies the following algebraic
properties:
1. a · b = b · a.
2. (a + b) · c = a · c + b · c.
3. (λa) · b = λ(a · b)
4. 0 · a = a · 0 = 0.
The following theorem gives the geometric meaning of the dot product:
Theorem 1.1 Let a = h a1 , a2 , a3 i and b = hb1 , b2 , b3 i be two vectors in R3 , and θ be the angle
between these two vectors. Then we have:
Proof. The proof uses the Law of Cosines. Consider the triangle in the diagram below:
b a−b
θ
a
The side opposite to the angle is represented by the vector a − b. Using the Law of Cosines:
One immediate consequence of Equation (1.1) is that it allows us to use the dot product to
find the angle between two vectors. Precisely, the angle θ between two vectors a and b is given
by:
a·b
θ = cos−1 .
|a| |b|
The most important case is that the two vectors a and b are perpendicular, also known
as orthogonal. The angle between the vectors is π2 and so we have the following important
fact:
Corollary 1.2 Two non-zero vectors a and b are orthogonal if and only if a · b = 0.
This corollary is particularly useful to determine whether two vectors are perpendicular.
Example 1.1 Show that any triangle which is inscribed in a circle and has one of its side
coincides the diameter of the circle must be a right-angled triangle.
Solution Let O be the center of the circle. Define vectors a and b as in the diagram below.
−a O a
We would like to show the vectors in red and blue are orthogonal to each other. By basic
vector additions and subtractions:
(−a − b) · (a − b) = −a · a +
a ·
b −
b ·
a +b·b
= − | a |2 + | b |2 (recall v · v = |v|2 )
Since both a and b represent the radii of the circle, they have the same magnitude. Therefore
|a| = |b| and we have (−a − b) · (a − b) = 0. This shows the red and blue vectors are
orthogonal.
1.3 Cross Product 9
From the right-hand grab rule, we can clearly see that a × b and b × a are vectors with the
same length but in opposite direction, i.e. a × b = −b × a.
The magnitude of |a × b|, which is defined to be |a| |b| sin θ, is the area of the parallelo-
gram formed by a and b:
b |b| sin θ
θ
a
The following are algebraic properties of the cross products which are all useful. Based on
the definition of cross products we presented above, the proofs are purely geometric and are
omitted here.
Property The cross product satisfies:
1. a × b = −b × a
2. (a + b) × c = a × c + b × c
3. a×0 = 0
4. a×a = 0
10 Three-Dimensional Space
For simple vectors such as i, j and k, their cross products can be easily found from the
definition:
i × j = k, j × k = i, k × i = j.
For more complicated vectors, the cross product can be computed using the following
determinant formula:
Theorem 1.3 — Determinant Formula of Cross Product. Given two vectors a = a1 i + a2 j + a3 k
and b = b1 i + b2 j + b3 k, their cross product is given by:
i j k
a × b = a1 a2 a3 (1.2)
b1 b2 b3
= ( a2 b3 − a3 b2 )i − ( a1 b3 − a3 b1 )j + ( a1 b2 − a2 b1 )k.
a × b = ( a1 i + a2 j + a3 k) × (b1 i + b2 j + b3 k)
using the algebraic properties of the cross product. It is left as an exercise for readers.
The cross product will be extremely useful for us to find a vector which is orthogonal to a
plane.
Find a vector n which is orthogonal to the plane passing through A, B and C. Moreover, find
the area of the triangle 4 ABC.
Solution A vector n is orthogonal to the plane if and only if it is orthogonal to any two
(non-parallel) vectors on the plane. We will first find two vectors on the plane and then take
the cross product. The outcome will give a vector orthogonal to the two vectors, and hence
to the plane.
The following two vectors lie on the plane:
−→
AB = h4, 0, −1i − h0, 2, −1i = h4, −2, 0i
−→
AC = h7, −3, 0i − h0, 2, −1i = h7, −5, 1i
−→ −→
Taking the cross product: AB × AC = h−2, −4, −6i.
Therefore, the required vector n can be taken to be any scalar multiple of h−2, −4, −6i,
such as h1, 2, 3i or h2, 4, 6i.
−→ −→
The length of the cross product AB × AC is equal to the area of the parallelogram formed
−→ −→
by AB and AC. The area of the triangle 4 ABC is 12 of the area of this parallelogram.
Therefore,
1 −→ − → 1
q √
Area of 4 ABC = AB × BC = (−2)2 + (−4)2 + (−6)2 = 14.
2 2
1.4 Lines and Planes 11
h x, y, zi − h x0 , y0 , z0 i = tv
h x, y, zi = h x0 , y0 , z0 i + t hv1 , v2 , v3 i
Therefore, we have:
x = x0 + tv1
y = y0 + tv2
z = z0 + tv3
which is called the parametric equation of the line L. It is called this way because the variable
t is called the parameter of the line. While intuitively t can be thought as the time, it needs not
always be.
Notation In this course, the vector r is ‘reserved’ to denote the position vector h x, y, zi.
Using this notation, we can also write the parametric equation of the line L in vector form:
−−→
r(t) = OP0 + tv.
The t-variable in r(t) emphasizes the fact that the position vector r depends on t. It can be
omitted if it is clear that t is the parameter letter.
To summarize, in order to write down the parametric equation of a straight-line, we need
two “ingredients":
1. a given point P0 on the line, and
2. the direction v of the line.
P0 P
Example 1.3 Determine whether the following points are co-planar, i.e. contained in a
single plane:
A(0, 2, −1), B(4, 0, −1), C (7, −3, 0), D (1/3, 1/6, 1/9).
Solution Since any given three points determine a plane, we will first find an equation of
the plane through points A, B and C, then we will check whether D lies on the plane by
substituting its coordinates into the equation found.
The two “ingredients" of finding the equation of a plane are
1. a given point on the plane; and
2. a normal vector to the plane.
In order to find the normal vector the plane through A, B and C, we take the cross product
−→ −→
of AB and AC. Incidently, these three points are exactly the same as in Example 1.2, in
which we have worked out that a normal vector to the plane through A, B and C is given by
n = h1, 2, 3i.
Take A(0, 2, −1) to be the given point, then the equation of the plane through A, B and C
is given by:
1x + 2y + 3z = (1) + 2(2) + 3(−1).
After simplification: x + 2y + 3z = 1.
Substitute the coordinates of D into the equation:
1 1 1
LHS = + 2 · + 3 · = 1 = RHS.
3 6 9
Therefore, D lies on the plane through A, B and C. In other words, these four points are
co-planar.
x = cos t
y = sin t.
The former is called the Cartesian equation and the second one is called the parametric
equation.
However, in three dimensions, a single Cartesian equation such as x2 + y2 + z2 = 1
represents a surface instead. Therefore, we will only use the parametric equations to present
curves in three dimensions.
Definition 1.4 — Parametric Equation of a Curve. The parametric equation of a curve is of
the form:
x = f (t)
y = g(t)
z = h(t)
where f (t), g(t) and h(t) are differentiable functions of t. In vector notations, the parametric
equation of this curve is written as:
t
r1 (t) = (cos t)i + (sin t)j + k.
20
It is a curve that goes around the circle but the altitude is constantly increasing. See Figure
1.5a for the computer sketch.
Here is another example of a parametric curve. See Figure 1.5b for the sketch.
1.0
0.5
0.0
-0.5
-1.0
1.0
1.5
0.5
1.0
0.0
0.5
1.0
-0.5
0.5
0.0
-1.0 -1.0
-1.0 0.0
-0.5
-0.5
0.0 0.0 -0.5
0.5 0.5
-1.0
1.0 1.0
When L(t) is a non-zero constant vector (independent of t), we say that the angular momentum
is conserved. The conservation of angular momentum implies that the path of the particle is
contained in a plane. It can be explained as follows:
By the definition of cross product, the angular momentum L(t) is always orthogonal to r(t)
(and to r0 (t) too, but we do not need this). Therefore, at any time t, we have:
L(t) · r(t) = 0.
Let r(t) = x (t)i + y(t)j + z(t)k. If L(t) is a constant vector, it can be expressed as L =
Ai + Bj + Ck where A, B and C are fixed numbers. Then:
Therefore, the point ( x (t), y(t), z(t)) lies on the plane Ax + By + Cz = 0, which is plane with
normal vector L and through the origin. In other words, the path of the particle is confined in
this plane.
Property Given two curves u(t) and v(t), and a scalar function f (t), we have:
d 0 0
1. dt ( f ( t ) u ( t )) = f ( t ) u ( t ) + f ( t ) u ( t )
d 0 0
2. dt ( u ( t ) · v ( t )) = u ( t ) · v ( t ) + u ( t ) · v ( t )
d 0 0
3. dt ( u ( t ) × v ( t )) = u ( t ) × v ( t ) + u ( t ) × v ( t ).
Here is a good example on the use of one of the above product rules.
Example 1.4 Given r ( t ) represents a particle travelling at uniform speed C. Show that its
Solution The particle is travelling at uniform speed C. Therefore, |r0 (t)| ≡ C. We want to
show that r0 (t) · r00 (t) ≡ 0, so it is natural to differentiate |r0 (t)| with respect to t so that the
RHS vanishes and the LHS perhaps may be related to r00 (t).
However, since |r0 (t)| is the form of a square root so it is cumbersome to differentiate it.
1.5 Parametric Curves 15
2 2
Instead, we differentiate |r0 (t)| = C2 using the fact that |r0 (t)| = r0 (t) · r0 (t)):
2
r0 (t) = C2
r0 (t) · r0 (t) = C2
d 0
r (t) · r0 (t) = 0
dt
r00 (t) · r0 (t) + r0 (t) · r00 (t) = 0
2r0 (t) · r00 (t) = 0.
from (0, 0, 0) to 2, 38 , 2 .
Solution It is simple to verify that the initial point corresponds to t = 0 since r (0) = h0, 0, 0i,
If you plot them using Mathematica, you should find out that these two curves are the same,
although their speeds are different. The curve r2 (t) is obtained by replacing every t in r1 (t) by
2t. The initial and final times are adjusted so that the end-points of both r1 and r2 are (0, 0, 0)
and (0, 0, 2π ). We say r2 is a reparametrization of r1 .
If r(s) is a parametric curve such that |r0 (s)| = 1 for any s, we say the curve is parametrized
by arc-length. For such a parametrization, it is conventional to use s to denote the parameter.
Given a parametric curve r(t), in theory one can reparametrize the curve by arc-length, such
that with the new parameter s, the curve r(s) travels at unit speed. To find the arc-length
parametrization, you may follow the procedure:
1. Given a curve r(t) : [ a, b] → R3 , compute the following integral:
ˆ t
s= r0 (τ ) dτ.
a
2. Since the upper limit of the above integral is t, the function s should be a function of t.
Express t in terms of s whenever it is possible, so that t is a function of s, i.e. t = t(s).
3. Finally, replace all t’s by this function of s in the curve r(t).
The new parametrization r(s) will be arc-length parametrized. Let’s see some examples before
we learn why it works:
Therefore, ˆ ˆ t√
t √
s(t) = r0 (τ ) dτ = 2dτ = 2t.
0 0
Express t in terms of s, we get t = √s . Replace all t’s in r(t) by √s , we get an arc-length
2 2
parametrization:
s s s
r(s) = cos √ i + sin √ j + √ k.
2 2 2
1.5 Parametric Curves 17
r0 (t) = t + 1.
Consider: ˆ ˆ
t t
t2
s= r0 (τ ) dτ = (τ + 1)dτ = + t.
0 0 2
To solve t in terms of s, we use the quadratic equation. One should get:
√
−2 + 4 + 8s √
t= = −1 + 1 + 2s.
2
Finally, replace all t’s in r(t) by this function of s, we get an arc-length parametrization:
√
1 √ 2 2 2 √ 3/2 √
r(s) = −1 + 1 + 2s i + −1 + 1 + 2s j + −1 + 1 + 2s k.
2 3
To see why this procedure gives an arc-length parametrization, we need to show |r0 (s)| = 1.
We use the Fundamental Theorem of Calculus and the chain rule:
dr dr dt
r0 (s) = =
ds dt ds
1
= r0 (t) ds .
dt
V (r, h) = πr2 h.
z z Function f assigns to
each point (x, y) in the
domain a real number
z ⫽ f (x, y).
Graph of f
y y
(x, y) (x, y)
x x
Domain of f Domain of f
FIGURE 13.21
Figure 2.1: value of f ( x0 , y0 ) is the height of the surface above the point ( x0 , y0 , 0)
Like functions of one variable, functions of two variables must pass a vertical line
test. A relation of the form F 1x, y, z2 = 0 is a function provided every line parallel to the
z@axis intersects the graph of F at most once. For example, an ellipsoid (discussed in Sec-
tion 13.1) is not the graph of a function because some vertical lines intersect the surface
y
twice. On the other hand, an elliptic paraboloid of the form z = ax 2 + by 2 does represent
d does not pass a function (Figure 13.22).
line test:
ph of a function.
QUICK CHECK 2 Does the graph of a hyperboloid of one sheet represent a function? Does
➤
the graph of a cone with its axis parallel to the x-axis represent a function?
EXAMPLE 2 Graphing two-variable functions Find the domain and2 range of the
(a) the graph of z = x2 − y2 (b) the graph of z = x + y2
following functions. Then sketch a graph.
Figure
a. f 1x, y2 = 2x + 3y - 122.2: graphs of y2
b. g1x, two-variables
= x 2 + y 2 functions
c. h1x, y2 = 21 + x 2 + y 2
2.1.2 LevelSOLUTION
Set Diagrams
Another common way to visualize a two-variable function is through its level sets:
a. Letting z = f 1x, y2, we have the equation z = 2x + 3y - 12, or 2x + 3y - z = 12,
Definition
which2.1describes planeGiven
— Level aSets. a function
with a normal vectorf (8x2, . . .-1
1 , 3, : Rn →
, x9n )(Section R, The
13.1). a level set of the
domain
y
function is a subset of R n of2 the form:
consists of all points in ⺢ , and the range is ⺢. We sketch the surface by noting that
f
the x-intercept is 16, 0, 02 1setting y = z = 02; the y-intercept is 10, 4, 02 and the
araboloid f ( x1 , . . . , x n ) = c
tical line test: z-intercept is 10, 0, -122 (Figure 13.23).
nction.
where c is a constant.
2 z
Given f ( x, y) = x2 + y2 , which is a function from R2 to R. An example of a level set of f is
x + y2 = 1, which is a unit circle on R2 centered atPlane
2
the origin.
| {z } (6, 0, 0) z ⫽ f (x, y) ⫽ 2x ⫹ 3y ⫺ 12
f ( x,y)
By taking c to be different
x
values, we get several level sets on the plane. They are circles
centered at the origin with varying radii depending on the value of c chosen:
(0, 4, 0)
(0, 0, ⫺12)
c=0 x2 +
y y2 = 0 the origin only
2 2
c=1 x +y = 1 radius = 1
2 2
√
c=2 x +y = 2 radius = 2
FIGURE 13.23 √
c=3 x 2 + y2 = 3 radius = 3
The level set diagram of the two-variable function f ( x, y) consists of some representative
level sets of the function on R2 . See Figure 2.3.
2.1 Functions of Several Variables 21
3 16 12 10 14
16
14 6
2 8
12
10
2
1
-1
10
4
12
-2
14
16
-3 14 10 12 16
-3 -2 -1 0 1 2 3
For three-variable functions f ( x, y, z), we do not attempt to visualize its graph, but we can
visualize its level set diagram. The former requires the fourth dimension while the latter can
be visualized in R3 . A generic level set of a three-variable function f ( x, y, z) is a surface in R3 .
∂f
( x, y) := the derivative of f ( x, y) with respect to x regarding y constant
∂x
f ( x + h, y) − f ( x, y)
= lim ;
h →0 h
∂f
( x, y) := the derivative of f ( x, y) with respect to y regarding x constant
∂y
f ( x, y + h) − f ( x, y)
= lim .
h →0 h
∂f ∂f
Notation Alternatively, we sometimes denote ∂x by f x , and ∂y by f y . Note also that we do
not use f 0 ( x, y) for multivariable functions, since it is ambigious to whether it means f x or
fy.
∂f ∂f
Example 2.1 Find ∂x and ∂y for the function:
f ( x, y) = x2 sin( xy).
∂f
Solution To calculate ∂x , we regard y as a constant:
∂f ∂ 2
= x sin( xy) (regarding y constant)
∂x ∂x
∂x2 ∂
= sin( xy) + x2 sin( xy) (product rule)
∂x ∂x
∂
= 2x sin( xy) + x2 · cos( xy) xy
∂x
= 2x sin( xy) + x2 y cos( xy).
∂f
Similarly, to calculate ∂y , regard x as a constant:
∂f ∂ 2
= x sin( xy)
∂y ∂y
∂
= x2 sin( xy) (here x2 is regarded as a constant)
∂y
∂
= x2 cos( xy) · xy
∂y
= x2 cos( xy) · x
= x3 cos( xy).
2.2 Partial Derivatives 23
x b x
a
The limit
f (a ⫹ h, b) ⫺ f (a, b)
lim ... the slope of the curve z ⫽ f (x, b)
h
h 0
equals ...
at (a, b, f (a, b)), which is fx(a, b).
z z
y y
b x b x
a⫹h
a a
FIGURE 13.49
∂f
Figure 2.5: geometric interpretation of ∂x
provided this limit exists. Notice that the y-coordinate is fixed at y = b in this limit. If we
replace
Partial 1a, b2 byofthe
derivatives variable
function more1x,
withpoint y2,two
than then fx becomes a function of x and y.
variables
PartialInderivatives
a similarofway, we can
functions withmove along
more than twothevariables, z =
surfacesay f 1x,
f ( x, y, zy2 from
), are the point
defined in an
1a, b, f 1a,way.
analogous b22 For
in such a way
instance,
∂f
that x = a is fixed and only y varies. Now, the result is
∂x is the derivative with respect to x regarding all other variables,
trace described by z =
i.e. y and z, constant. Although ity2,
a f 1a, which
is not easyisto
the intersection
interpret
∂f of the surface and the plane
∂x geometrically since the graph of
f x( x,=y, a
z)(Figure 13.50). The slope ofspace,
this curve at 1a, b2 is given
∂ f by ∂ f the ordinary
∂f derivative
sits inside a 4-dimensional the way to compute ∂x , ∂y and ∂z are exactly the
of f 1a, y2 with respect
same as two-variable functions. to y. This derivative is called the partial derivative of f with respect
to y, denoted 0f>0y or fy. When evaluated at 1a, b2, it is defined by the limit
f 1a, b + h2 - f 1a, b2
fy1a, b2 = lim ,
hS0 h
provided this limit exists. If we replace 1a, b2 by the variable point 1x, y2, then fy becomes
a function of x and y.
2 3 ∂f
Example 2.2 Given f ( x, y, z) = e x +y + xyz . Compute ∂z .
Solution
∂f ∂ 2 3
= e x +y + xyz
∂z ∂z
2 3 ∂( x2 + y3 + xyz)
= e x +y + xyz · (chain rule)
∂z
x2 +y3 + xyz
=e · (0 + 0 + xy)
x2 +y3 + xyz
= xye .
Example 2.3 Let f ( x, y) = 3x4 y − 2xy + 5xy3 . Compute all first and second partial deriva-
tives.
∂f
= 12x3 y − 2y + 5y3
∂x
∂f
= 3x4 − 2x + 15xy2
∂y
∂ ∂f
Notation Since it is a bit clumsy to write ∂y ∂x every time, we can use the following
short-hand:
∂2 f ∂2 f
∂ ∂f ∂ ∂f
= =
∂x2 ∂x ∂x ∂x∂y ∂x ∂y
∂2 f ∂2 f
∂ ∂f ∂ ∂f
= =
∂y∂x ∂y ∂x ∂y2 ∂y ∂y
Similarly for the subscript notations, we write f xx = ( f x ) x , and f xy = ( f x )y . The later means
to differentiate by x first and then by y. Therefore, it is related to the fraction notation by:
∂2 f
∂ ∂f
f xy = ( f x )y = = .
∂y ∂x ∂y∂x
2.2 Partial Derivatives 25
∂2 f
The above remark seems to suggest that we should be very careful when converting ∂y∂x
into the subscript notation f xy . The order of x and y needs to be switched in the conversion.
However, thanks to the following important theorem, we don’t need to worry about this
too much, since in many cases, we have f xy = f yx .
Theorem 2.1 — Mixed Partials Theorem. Consider the function f ( x, y), if at least one of the
second partials f xy and f yx exists and is continuous, then we must have f xy = f yx .
∂f
Solution Needless to say, it is very tedious and time-consuming to compute ∂x . However,
∂ ∂f
by the Mixed Partials Theorem, we can try to find ∂x ∂y , and if it is continuous, then
∂ ∂ f ∂ ∂ f
∂y ∂x = ∂x ∂y .
∂ ∂f
It is much easier to compute ∂x ∂y since the “monster” term is gone after differentiating
the function by y:
∂f ∂
= 0 − sin( xy) · xy
∂y ∂y
= − x sin( xy)
∂ ∂f ∂
= − ( x sin( xy))
∂x ∂y ∂x
= − sin( xy) − xy cos( xy),
i The Mixed Partials Theorem also applies to functions of more than two variables and to
higher-order derivatives. Let’s say given a function of four variables f ( x, y, z, t), we have:
df d f dx
= .
dt dx dt
One may represent this chain of relations by the schematic diagram:
f
df
dx
x
dx
dt
t
df d f dx
Figure 2.6: the schematic diagram for chain rule formula dt = dx dt .
For multivariable functions, the relation between variables can be more complicated. For
example, let the function u( x, y, z) be the temperature at the point ( x, y, z) in the space. Suppose
a particle moves along the path
Then, the coordinates x, y and z all depend on t, and so u is ultimately a function of t. See
Figure 2.7 for the tree diagram of the variables.
u
∂u ∂u
∂x ∂u ∂z
∂y
x y z
dx dy dz
dt dt dt
t t t
The derivative du
dt is the rate of change of the temperature that the particle “feels” as it
travels. The multivariable chain rule can be read off from the tree diagram 2.7:
du ∂u dx ∂u dy ∂u dz
= + + .
dt ∂x dt ∂y dt ∂z dt
i In this course, our focus is on how to use the chain rule and how to write it down, but not
the proof of the chain rule. The proof is usually taught in advanced real analysis courses.
Let’s see a concrete example of the use of the multivariable chain rule:
2.3 Chain Rule 27
du ∂u dx ∂u dy ∂u dz
= + +
dt ∂x dt ∂y dt ∂z dt
2x ·(− sin t) + 2y · cos
= |{z} t +(−2z ) · 1 (calculate each derivative)
| {z } |{z} |{z} |{z} |{z}
∂u dx ∂u dy ∂z dz
∂x dt ∂y dt ∂t dt
As illustrated in the above example, once the chain rule formula is correctly written
according to the tree diagram, the remaining computations are straight-forward. From now on,
we will investigate how to write down the chain rule under various configuration of variables.
The computations will usually be skipped and are left as exercises for readers.
∂u ∂u
∂x ∂u ∂t
∂y ∂u
∂z
x y z t
dx dy dz
dt dt dt
t t t
du
There are four paths from u to t, so we expect the chain rule formula for dt consists of four
terms:
du ∂u dx ∂u dy ∂u dz ∂u
= + + + .
dt ∂x dt ∂y dt ∂z dt ∂t
i Note that du ∂u
dt is different from ∂t . As the particle travels along its path, the temperature
the particle “feels” is a ultimately a function of t, and so the rate of change of temperature
is represented by du dt (using d instead of the partial ∂).
On the other hand, the partial ∂u∂t is the time derivative of u regarding x, y and z constant!
Therefore, it is the rate of change of temperature when the position is fixed! It is not the
rate of change of temperature for the moving particle!
g
0z0x
0x
= 2 cos 121s
= 2 cos 121s +
28 Partial Differentiations
d
x
∂w
There are three paths from w to s, so the chain rule for ∂s is given by the following:
∂w ∂w ∂x ∂w ∂y ∂w ∂z
∂s
=
∂x ∂s
+
∂y ∂s
+
∂z ∂s
. EXAMPLE 3
EXAMPLE 3 More
Morevar
v
functionof
function ofss and
and t.t.
w
w a. Draw a labeled tree diag
w w a. Draw a labeled tree di
w x w w z b. Write the Chain Rule fo
x w y z
x y z b. Write the Chain Rule f
y
x x SOLUTION
x y y y zz z
s t s t s t
x x y y z z SOLUTION
a. Because w is a function
s ts s t st st s t t labeled with the partial d
FIGURE 13.59 a. Because w is asofunctio
two variables, the low
s t s t s t labeled
b. withTheorem
Extending the partia
13.8
QUICK CHECK 3 If Q is a function of
FIGURE 13.59Figure
w, x, y,2.9:and
treez,diagram
each of which is a func-
two
(redvariables,
branches so
in the
Figure lo1
tion of r, s, and t, how many depen- b. Extending and adding gives the res
QUICK CHECK 3 If Q is a function of
Theorem 13
A “multi-level” example dent variables, intermediate variables, (red branches in Figur
w, x,ofy,z only,
Now suppose w is a function andandzz, each of of
isindependent
a function which
and y,isand
xvariables aare
func-
both x and y are functions
there?
➤
of t, as illustrated in Figure
tion2.10.
of r, s, and t, how many depen- and adding gives the r
Ultimately, w is a function t. There are two paths from w to t in the tree diagram. Each path
dentThe
consists of three segments. variables, intermediate
chain rule for dw
dt is given by:
variables,
➤
and independent
dw
variables are there?
dw ∂z dx dw ∂z dy
It is probably clear by n
= + . among variables. The key i
dt dz ∂x dt dz ∂y dt
tree with the appropriate de
w It is probably
EXAMPLE 4 Aclear b
differen
dw
of x and
among y, and each
variables. of xkey
The an
dz thewith
tree Chain Rule
the formula for
appropriate
z z z
x y SOLUTION The dependent
wx y
paths
EXAMPLE 4 wASdiffer
in the tree: zS
dw dt
dx dy of tree,
x and w is a function of the s
y, and each of x
dt tive dw>dz. The tree below
dz theofChain
the twoRule formula
branches connecf
z zt z t
x FIGUREy 13.60 SOLUTION The depende
dw dw
=
x
Figure 2.10: tree diagram
y
paths in the tree: wdtS z dz
S
tree, w is a function of th
2.3.2 Implicit Differentiation
dx dy
dt dt tive dw>dz. The tree belo
Given an implicit equation such as: Implicit Differentiatio
of the two branches conn
t t sin2 y = 1, Using the Chain Rule for par
x 2 + y3 +
in a larger perspective.
dw Reca
FIGURE 13.60 =
sin xy + py = x, then
it is very difficult (possibly impossible) to express y in terms of x. In single-variableascalculus, 2
dy
we learned how to find dx using implicit differentiation – regard y as a functionAnother dt
of x, and way to compute d
dy
differentiate both sides by x then solve for dx . Notice that the original relat
The multivariable chain rule offers an alternative approach to implicit differentiation.
Implicit Differentiat
Using the Chain Rule for p
in a larger perspective. Re
as sin xy + py 2 = x, the
2.3 Chain Rule 29
F
∂F ∂F
∂x ∂y
x y
dy
dx
dF ∂F ∂F dy
= + .
dx ∂x ∂y dx
dF
Recall that F ( x, y) = 1 is a constant, so dx = 0, which yields:
dy dy Fx
0 = Fx + Fy , and so: =− .
dx dx Fy
dy 2x
=− 2 .
dx 3y + 2 sin y cos y
30 Partial Differentiations
i When u = i, then u1 = 1 and u2 = 0 and so Partial derivatives tell us a lot about the rate of chang
ever, they do not directly answer some important q
( x, y). at a point 1a, b, f 1a, b22 on the surface
d f ( x + t, y) − f ( x, y)are standing
∂f
Di f ( x, y) = f ( x + t, y) = lim =
dt t =0 t →0 t fx and
∂x fy tell you the rate of change (or slope) of the
parallel to the x-axis and y-axis, respectively. But yo
directions from that point and find a different rate of
observation in mind, we pose several questions.
We seek the z
rate of change • Suppose you are standing on a surface and you walk
of f at P0 in the dinate direction—say, northwest or south-southeast
direction of u. function in such a direction?
• Suppose you are standing on a surface and you rele
(a,, b,
b, f (a,
(a b)) In which direction will it roll?
z f (x, y) • If you are hiking up a mountain, in what direction s
you want to follow the steepest path?
These questions will be answered in this section as
tive, followed by one of the central concepts of calcu
x P0(a, b)
u y
Directional Derivatives
Unit vector u
Let 1a, b, f 1a, b22 be a point on the surface z = f 1x
FIGURE 13.63 xy-plane (Figure 13.63). Our aim is to find the rate
Figure 2.12: directional derivative P 1a, b2. In general, this rate of change is neither f 1a
0 x
兩u兩 ⫽ 1 or u = 8 0, 1 9 ), but it turns out to be a combination o
In practice, we do not need to compute uDu f from u2 ⫽ sinthe definition, since we have
Figure 13.64a Theorem
shows the 2.2unit vector u at an angl
below to help us. In order to introduce this theorem, we first define: nents are u = 8 u 1, u 2 9 = 8 cos u, sin u 9 . The deriva
u ⫽ cos / ingradient
Definition 2.4 — Gradient Vector. Given1 a two-variable function f the ( x, yline
), the the xy-plane through P0 in the direction o
vector
of f at ( x, y) is denoted and defined as: (a) h units from P0 along /, has coordinates P1a + h cos
ᐉ
Now imagine the plane Q perpendicular to the xy
y P(a∂ ⫹
f h cos , b ⫹∂hfsin ) the surface z = f 1x, y2 in a curve C. Consider two po
∇ f ( x, y) = ( x, y)i + ( x, y)j.
∂x ∂y they have z-coordinates f 1a, b2 and f 1a + h cos u, b
u h sin of the secant line between these points is
h
As an example, let f ( x, y) = x2 y + x3 , then
f 1a + h cos u, b + h sin u2
P0(a, ∂b)f h cos h
( x, y) = 2xy + 3x2
∂x
u ⫽ 具u1, u2典 ⫽ 具cos , sin 典 The derivative of f in the direction of u is obtained
∂f 2 ists, it is called the directional derivative of f at 1a,
( x, y) = x x
∂y slope of the line tangent to the curve C in the plane Q
(b)
Therefore, FIGURE 13.64
DEFINITION Directional Derivative
∇ f ( x, y) = (2xy + 3x2 )i + x2 j.
Let f be differentiable at 1a, b2 and let u = 8 cos u
The vector ∇ f ( x, y) depends on ( x, y). By taking different values of ( x, y), a different
xy-plane. gradient
The directional derivative of f at 1a, b2
f 1a + h cos u, b +
Du f 1a, b2 = lim
hS0
Du f ( x, y) = ∇ f ( x, y) · u.
θ
∇ f ( a, b)
(a, b)
(c, d)
Proof: A level curve of the function z = f 1x, y2
t f 1x, y2 = z0, where z0 is a constant. By Theorem 1
Level curve: f (x, y) z0
f (c, d)
level curve is y1x2 = - fx >fy.
It follows that any vector that points in the di
x
1a, b2 is a scalar multiple of the vector t = 8 -fy1a
FIGURE 13.71 same point, the gradient points in the direction f
Figure 2.14: ∇ f ( a, b) is orthogonal to the level curve of a, bt )and f 1a, b2 is
f at (of
product
➤ We have used the fact that the vector t # f 1a, b2 = 8 -fy, fx 9 1a,b2 # 8 fx, fy 9 1a,b2
8 a, b 9 has slope b>a.
2.4.2 Directional Derivative of Three-Variable Functions which implies that t and f 1a, b2 are orthogonal.
The directional derivative and the gradient vector for functions f ( x, y, z) of three variables is
An immediate consequence of Theorem 13.12 i
defined in an analogous way as for two-variable functions. Precisely, we have:
line. The curve described by f 1x, y2 = z0 can be view
∂f ∂f ∂f surface. By Theorem 13.12, the line tangent to the cu
∇f = i+ j+ k
∂x ∂y ∂z Therefore, if 1x, y2 is a point on the tangent line,
which, when simplified, gives an equation of the lin
fx1a, b21x - a2 + fy1a, b2
tangent line.
Level curve We now compute the gradient:
for z 2
2.4 Directional Derivatives 33 2x
Tangent to 1 f 1x, y2 = 8 fx, fy 9 = h
level curve 1 y 21 + 2x
at (1, 1)
1.0
t 具1, 2典 0.5
f (1, 1) 具1, q典
It follows that f 11, 12 = 8 1, 9 (Figure 13.72
1
2
0.8
0.7 are orthogonal because
x
0.8
t f 0 t # f 11, 12 = 8 1, -2
f is orthogonal to level curves.
0.4 0.6
b. An equation of the line tangent to the level curv
0.6 FIGURE 13.72
fx11, 121x - 12 + fy11,
d
0.2 Ball is released 1 1
2
0.4 z at (3, 4, 61).
or y = -2x + 3.
z 4 x0.1
2 3y2
0.3
0.2
x 2 + y2 = z2 + 3
Solution First we need to write the equation of the surface in a level set form, i.e.
x 2 + y2 − z2 = 3
Then, n := 4i + 2k is a normal vector of the surface at (2, 0, −1). The equation of the tangent
plane at (2, 0, −1) is given by:
z − f ( x, y) = 0.
Then, one can define g( x, y, z) = z − f ( x, y) so that the graph of the two-variable function
f ( x, y) becomes a level set of a three-variable function g( x, y, z). Let’s look at an example:
2.5 Tangent Planes 35
Example 2.7 Given the function f ( x, y) = x cos y − ye x , find the tangent plane at (0, 0, 0) to
the graph z = x cos y − ye x .
Solution First rearrange the terms so that the graph equation becomes a level set:
z − x cos y + ye x = 0.
Define g( x, y, z) = z − x cos y + ye x , then the surface under consideration is the level set
g( x, y, z) = 0.
∇ g( x, y, z) = (− cos y + ye x )i + ( x sin y + e x )j + k.
The normal vector at (0, 0, 0) is given by:
n = ∇ g(0, 0, 0) = −i + j + k.
− x + y + z = 0.
Generally, one can derive a formula for finding the tangent plane of any graph of a
two-variable function f ( x, y):
Theorem 2.4 The equation of the tangent plane for the graph z = f ( x, y) at the point
( x0 , y0 , f ( x0 , y0 )) is given by:
∂f ∂f
z = f ( x0 , y0 ) + ( x0 , y0 ) · ( x − x0 ) + ( x0 , y0 ) · ( y − y0 ).
∂x ∂y
z − f ( x, y) = 0
and define g( x, y, z) = z − f ( x, y), then the graph can be regarded as a level set g( x, y, z) = 0
of the three-variable function g.
∂f ∂f
∇ g( x, y, z) = − i− j + k.
∂x ∂y
At the point ( x0 , y0 , f ( x0 , y0 )), the normal vector to the surface is therefore given by:
∂f ∂f
n = ∇ g( x0 , y0 , f ( x0 , y0 )) = − ( x0 , y0 )i − ( x0 , y0 )j + k.
∂x ∂y
By rearrangement, we get:
∂f ∂f
z = f ( x0 , y0 ) + ( x0 , y0 ) · ( x − x0 ) + ( x0 , y0 ) · ( y − y0 ),
∂x ∂y
as desired.
36 Partial Differentiations
∂f ∂f
Solution We compute ∂x and ∂y , and then set them to zero and solve for ( x, y):
∂f
0= = y − 2x − 2
∂x
∂f
0= = x − 2y − 2.
∂y
It is a system of equations with unknowns x and y. The first equation gives y = 2x + 2, and
substitute it into the second equation, we get:
x − 2(2x + 2) − 2 = 0 ⇒ x = −2.
When x = −2, we have y = −2, and so ( x, y) = (−2, −2) is a critical point of f ( x, y). See
Figure 2.17a for its graph.
2.6 Local Extrema 37
Example 2.9 Find all critical point(s) of the function f ( x, y) = sin x sin y
Solution Consider:
∂f ∂f
=0 and =0
∂x ∂y
cos x sin y = 0 and sin x cos y = 0
(cos x = 0 or sin y = 0) and (sin x = 0 or cos y = 0)
π π
( x = + kπ or y = mπ ) and ( x = nπ or y = + pπ ).
2 2
Here m, n, k, p are any integers. Some logical deductions show that these imply the following:
π π
x= + kπ and y = + pπ
2 2
or: y = mπ and x = nπ.
f xx (0, 0) = 2
f yy (0, 0) = 2
38 Partial Differentiations
for every ( x, y) on the R2 plane. Both are positive numbers. You may be tempted to conclude
that (0, 0) is a local maximum point. However, if one plots the graph of this function (see
Figure 2.18), one can see easily that (0, 0) is neither a local maximum or a local minimum.
Around (0, 0), the graph is a concave up in some directions but concave down in other
directions. We call this (0, 0) a saddle.
This example shows the signs of f xx and f yy alone could not conclude the nature of the
critical point. In fact, the second derivative test for two-variable functions is slightly more
complicated than that in single-variable calculus:
Theorem 2.5 — Second Derivative Test for Two-Variable Functions. Let f ( x, y) be a twice dif-
ferentiable function and ( x0 , y0 ) is a critical point of f , i.e. ∇ f ( x0 , y0 ) = 0. Then the nature
of this ( x0 , y0 ) is determined by the following table:
critical point
2
f xx f yy − f xy f xx ( x0 , y0 ) ( x0 , y0 ) is a:
( x0 ,y0 )
>0 >0 local minimum
>0 <0 local maximum
<0 anything saddle
Any other cases are inconclusive.
For the function f ( x, y) = x2 + 4xy + y2 in the above example, to determine the nature of (0, 0)
we also need f xy (0, 0), which can be found as equal to 4.
Therefore, we have:
2
f xx f yy − f xy = 2 × 2 − 42 < 0,
(0,0)
f xx (0, 0) = 2 > 0.
From the table in Theorem 2.5, we conclude (0, 0) is a saddle, as expected from the plot of the
its graph.
Let’s look at one more example before we learn the proof of the Second Derivative
Test.
Example 2.10 Let f ( x, y) = 3y2 − 2y3 − 3x2 + 6xy. Find all critical points and determine
the nature of each of them.
∂f
= −6x + 6y = 0,
∂x
∂f
= 6y − 6y2 + 6x = 0.
∂y
From the first equation, we get y = x. Substitute this into the second equation, we yield:
6x − 6x2 + 6x = 0, or equivalently 2x − x2 = 0.
2.6 Local Extrema 39
x = 0 or x = 2.
By noting that y = x, we have two critical points: (0, 0) and (2, 2).
Next we compute the second derivatives of f :
f xx = −6 f xy = 6
f yx = 6 f yy = 6 − 12y
Critical point P f xx ( P) f yy ( P) f xy ( P) 2
f xx f yy − f xy ( P) Nature of P
(0, 0) -6 6 6 -72 saddle
(2, 2) -6 -18 6 72 local maximum
f 00 ( a) f 000 ( a)
f ( x ) = f ( a) + f 0 ( a)( x − a) + ( x − a )2 + ( x − a )3 + . . .
2! 3!
f 00 ( a)
f ( x ) ' f ( a) + ( x − a )2 when x is near a.
2!
f 00 ( a)
The right-hand side f ( a) + 2! ( x − a)2 is a quadratic function. If f 00 ( a) > 0, then the graph
f 00 ( a) f 00 ( a)
y = f ( a) + 2! ( x − a)2 is a concave up parabola and so f ( a) + 2! ( x − a )
2 ≥ f ( a). Therefore,
f 00 ( a) 2
f ( x ), which is approximately f ( a) + 2! ( x − a ) , is also ≥ f ( a) when x is near a. This explains
f ( x ) has a local minimum at x = a.
f 00 ( a)
On the other hand, if f 00 ( a) < 0, then the graph y = f ( a) + 2! ( x − a)2 is a concave down
parabola. Similar argument as above shows f ( x ) has a local maximum at x = a.
10
Figure 2.19: blue graph shows y = f ( x ) where f 0 (0) = 0; yellow graph shows y = f (0) +
f 00 (0) 2
2! x where f 00 (0) > 0
Back to multivariable calculus, we now explain the second derivative test using the Taylor’s
series approach. Given a function f ( x, y), the multivariable Taylor’s series about ( x, y) =
40 Partial Differentiations
( x0 , y0 ) is given by:
f ( x, y) = f ( x0 , y0 ) + f x ( x0 , y0 )( x − x0 ) + f y ( x0 , y0 )(y − y0 )
f xx ( x0 , y0 ) 2 f xy ( x0 , y0 ) f yy ( x0 , y0 )
+ ( x − x0 )2 + ( x − x0 )(y − y0 ) + ( y − y0 )2
2! 2! 2!
+ higher-order terms
The proof is beyond the scope of the course. If ( x0 , y0 ) is a critical point of f ( x, y), then
f x ( x0 , y0 ) = f y ( x0 , y0 ) = 0. For simplicity, denote P = ( x0 , y0 ), then when ( x, y) is near P, we
have:
1
f ( x, y) ' f ( x0 , y0 ) + f xx ( P)( x − x0 )2 + 2 f xy ( P)( x − x0 )(y − y0 ) + f yy ( P)(y − y0 )2 .
2
Therefore, to determine whether ( x0 , y0 ) is a local maximum/minimum or a saddle of f ( x, y),
one should determine whether the quadratic
is positive/negative or neither.
For simplicity, denote
X = x − x0 , Y = y − y0 .
Then, the quadratic expression can be simplified as:
Translate
back to previous notations, we can conclude:
2
f xx f yy − f xy f xx ( x0 , y0 ) ( x0 , y0 ) is a:
( x0 ,y0 )
>0 >0 local minimum
>0 <0 local maximum
<0 anything saddle
This explains the second derivative test for two-variable functions!
2.7 Lagrange’s Multiplier 41
∇ f ( x, y) = λ∇ g( x, y)
g( x, y) = c
i We call this the Lagrange’s Multiplier method because the scalar λ is called the Lagrange’s
Multiplier.
Example 2.11 Let f ( x, y) = x2 + y2 + 2x + 2y, find the maximum and minimum values of
f when ( x, y) is restricted on the constraint x2 + y2 = 1.
∇ f = h2x + 2, 2y + 2i
∇ g = h2x, 2yi.
2x + 2 = 2λx 1
2y + 2 = 2λy 2
2 2
x +y = 1 3
2x + 2 2λx
= .
2y + 2 2λy
42 Partial Differentiations
y( x + 1) = x (y + 1) ⇒ xy + y = xy + x ⇒ y = x.
1.0 3.82843
0.5 2.41421
0.414214
3.12132
0.0
0.292893
-0.5
-1.82843
1.70711
-1.0 -1.12132
-1.0 -0.5 0.0 0.5 1.0
Example 2.12 Let f ( x, y) = x2 − 4x + y2 + 9. Find the maximum and minimum points and
values of f ( x, y) subject to the constraint 4x2 + 9y2 = 36.
Solution Define g( x, y) = 4x2 + 9y2 , then the constraint is the level set g( x, y) = 36. Set-up
the Lagrange’s Multiplier system:
∇ f ( x, y) = λ∇ g( x, y)
g( x, y) = 36
2x − 4 = 8λx 1
2y = 18λy 2
4x2 + 9y2 = 36 3
Case 1: 2 6= 0
By 1 ÷ 2 , we get:
2x − 4 8λx
=
2y 18λy
x−2 4x
= (cancel λ)
y 9y
9y( x − 2) = 4xy (cross multiplication)
9( x − 2) = 4x (cancel y 6= 0)
18
x=
5
18
However, substitute x = 5 into 3 , we get:
2
2 18
9y = 36 − 4
5
which is a negative number, but 9y2 must be positive (or zero)! Therefore, there is no solution
in this case.
Case 2: 2 = 0
In this case, we have 2y = 18λy = 0, so y = 0. From 3 , we have 4x2 = 36 and so x = 3
or x = −3. Therefore, ( x, y) = (3, 0) and ( x, y) = (−3, 0) are the solutions in this case.
Finally, we evaluate f at each boundary critical point:
f (3, 0) = 6
f (−3, 0) = 30
Therefore, minimum point is (3, 0) with value 6; maximum point is (−3, 0) with value 30.
See Figure 2.21
972 Chapter 13 • Functions of Several Variables
0 1 2 3
of f increase and reach a maximum value at
(b) the level set diagram
z
f (x, y) x2 y2 2x 2y
Figure 2.21: f ( x, y) = x2 − 4x + y2 + 9 in Example 2.12
Maximum va
The equation ∇ f = λ∇ g concerns about the geometry of the level sets and the constraint C occurs at P
at the boundary critical point. Given a function f ( x, y) subject to the constraint g( x, y) = c. At
the point ( a, b) on the constraint where the maximum or minimum of f ( x, y) is achieved, the
level set of f ( x, y) at ( a, b) is tangent to the constraint g( x, y) = c. Consequently, the gradient
vector ∇ f ( a, b), which is perpendicular to the level set of f at ( a, b), must be parallel to the
gradient vector ∇ g( a, b), which is perpendicular to the constraint g = c. See Figure 2.22 for an
illustration. Therefore, at such a point, we must have: Constraint cu
C: {(x, y): x2
∇ f ( a, b) = λ∇ g( a, b) x y
C: {(x, y): x2 y2 4}
where λ is a scalar.
(a)
FIGURE 13.95
f (a, b) y Level curves of f
The Lagrange’s Multiplier method also works for three-variable functions, yet the system
of equations may be more complicated. Let’s look at the example:
Example 2.13 Find the distance from (0, 0, 0) to the plane 2x + 3y + 4z = 29 using La-
grange’s Multiplier.
Solution The distance from a point P to a plane is defined to be the shortest possible
distance between the given point P and any point Q on the plane. Let’s first formulate this
problem in a mathematical way. We want to:
q
minimize x 2 + y2 + z2
subject to constraint 2x + 3y + 4z = 29
p p
However, to minimize x2 + y2 + z2 amounts to calculating ∇ x2 + y2 + z2 . As you
p
can imagine, it would be messy. It is useful to observe that minimizing x2 + y2 + z2 is
equivalent to minimizing x2 + y2 + z2 , i.e. the square of the distance from the origin. The
latter is much easier to handle. Let:
f ( x, y, z) = x2 + y2 + z2
g( x, y, z) = 2x + 3y + 4z,
then the constraint is the level set g = 29. Set-up the Lagrange’s Multiplier system ∇ f = λ∇ g
as in previous examples:
2x = 2λ
2y = 3λ
2z = 4λ
2x + 3y + 4z = 29
3λ
Then x = λ, y = 2 and z = 2λ. Substitute them into the constraint equation, we get:
9λ
2λ + + 8λ = 29.
2
It is easy to see that λ = 2, and therefore ( x, y, z) = (2, 3, 4). It gives the unique critical point.
It is intuitive that the minimum point must exist in this problem (and there is no maximum
√ give the minimum point. Since f (2, 3, 4) = 29, the
point), so this unique critical point must
distance from (0, 0, 0) to the plane is 29.
46 Partial Differentiations
2.8 Optimizations
In this section we see more optimization examples of using the gradient and/or Lagrange’s
Multiplier method.
Example 2.14 Many airlines require that the sum of length, width and height of a checked
baggage cannot exceed 62 inches. Find the dimensions of the rectangular baggage that has
the greatest possible volume under this regulation.
Solution Denote l, w, h to be the length, width and height respectively. We need to maximize
the volume of the baggage, which is given by:
The constraint is l + w + h ≤ 62 (inches), but it is intuitively clear that in order to maximize the
volume, the sum l + w + h has better be at maximum possible. Define g(l, w, h) = l + w + h,
then the constraint can be regarded as the level set g = 62. Set up the Lagrange’s Multiplier
system:
∇V = λ ∇ g
g(l, w, h) = 62
which is equivalent to
wh = λ
lh = λ
lw = λ
l + w + h = 62
Although it is not too difficult to solve them by hand, let’s type the following command on
Mathematica to solve them:
Solve[{w h == L, l h == L, l w == L, l + w + h == 62}, {l, w, h, L}]
These are all critical points:
62 62 62
(l, w, h) = , , , (0, 0, 62), (0, 62, , 0), (62, 0, 0).
3 3 3
Only the first one is physically relevant. Therefore, the rectangular baggage with the largest
volume under this restriction is the square cube!
Example 2.15 Three cities A, B and C are located at (5, 2), (−4, 4) and (−1, −3) respectively
on the ( x, y)-plane. There is a railtrack whose equation is y = x3 + 1, and a station is going
to be built on the track so that the sum of squares of the distances from each city to the
station is minimized. Find the coordinates of the station.
The constraint is that the station has to be on the track, i.e. y = x3 + 1. Define g( x, y) = y − x3 ,
then the constraint can be written as g( x, y) = 1. Set up the Lagrange’s Multiplier system
2.8 Optimizations 47
∇ f = λ∇ g and g( x, y) = 1:
2( x − 5) + 2( x + 4) + 2( x + 1) = −3λx2
2( y − 2) + 2( y − 4) + 2( y + 3) = λ
y − x3 = 1
Solving the system, we get ( x, y) = (0, 1). Therefore, the station should be located at (0, 1)
in order to minimize the sum of squares of the distances.
( x1 , y1 ), . . . , ( x N , y N )
on the xy-plane. Find the straight-line y = mx + c such that the sum of squares of distances
between each ( xi , yi ) and ( xi , mxi + c) is minimized.
N
f (m, c) = ∑ (yi − mxi − c)2 .
i =1
Note that ( xi , yi )’s are given so they should be regarded as constants. The variables are m
and c. Note that there is no constraint for m and c, so we can simply solve ∇ f (m, c) = 0 for
critical points.
!
N N N N
∂f
= −2 ∑ (yi − mxi − c) xi = −2 ∑ xi yi − m ∑ xi2 − c ∑ xi
∂m i =1 i =1 i =1 i =1
!
N N N
∂f
= −2 ∑ (yi − mxi − c) = −2 ∑ yi − m ∑ xi − cN
∂c i =1 i =1 i =1
∂f ∂f
Set ∂m = ∂c = 0, regarding all xi ’s and yi ’s to be constants, then:
Am + Bc = E
Bm + Nc = F
where A = ∑iN=1 xi2 , B = ∑in=1 xi , E = ∑iN=1 xi yi and F = ∑iN=1 yi . By solving the system
carefully, one should get:
BF − EN ( ∑ xi ) ( ∑ yi ) − N ( ∑ xi yi )
m= 2
=
B − AN (∑ xi )2 − N ∑ xi2
(∑ xi ) (∑ xi yi ) − ∑ xi2 (∑ yi )
BE − AF
c= 2 =
B − AN ( ∑ x i )2 − N ∑ x 2
i
It is quite intuitive that this pair of m and c should minimize f since f ≥ 0 and so a minimum
must exist.
48 Partial Differentiations
Solution The Lagrange’s Multiplier method finds us the boundary critical points on
4x2 + 9y2 = 36. For the interior 4x2 + 9y2 < 36, the critical points are simply solutions to
∇ f = 0. The general procedure of an optimization problem with a solid domain is that:
1. Find all interior critical points by solving ∇ f = 0;
2. Find all boundary critical points using Lagrange’s Multiplier;
3. Evaluate f at each critical points found, and look for the point that gives greatest/lowest
value of f .
Interior: Set ∇ f = 0, we get:
2x − 4 = 0
2y = 0
Therefore, the only interior critical point is (2, 0), which can be checked easily that it is in the
given region.
Boundary: We have already done in Example 2.12 that the boundary critical points are (3, 0)
and (−3, 0).
Finally, evaluate f at each critical point found:
f (2, 0) = 5
f (3, 0) = 6
f (−3, 0) = 30
Therefore, the absolute minimum is 5 (attained at (2, 0)), and the absolute maximum is
30 (attained at (−3, 0)).
3 — Multiple Integrations
outer
ˆ y =2 ˆ x =1
z }| {
(4 − x − y2 x )dx dy .
y =1 x =0
| {z }
inner
When computing the inner integral (which is respect to x in this example), we regard all
other variable(s) (i.e. y) to be constant(s):
ˆ y =2 ˆ x =1 ˆ y =2 x =1
x2 y2 x 2
(4 − x − y2 x )dxdy = − 4x − dy
y =1 x =0 y =1 2 2 x =0
ˆ y=2
1 y2
= 4− − − 0 dy
y =1 2 2
ˆ y =2 y =2
7 y2 7y y3
7
= − dy = − = .
y =1 2 2 2 6 y =1 3
It is worthwhile to note that if we switch the inner and outer integrals, the final answer is the
50 Multiple Integrations
same!
ˆ x =1 ˆ y =2 ˆ y =2 y =2
y3 x
(4 − x − y2 x )dydx = 4y − xy −
dx
x =0 y =1 y =1 3 y =1
ˆ x =1
8x x
= 8 − 2x − − 4−x− dx
x =0 3 3 14.1 Double Integral
ˆ x =1
10x 7
= 4− dx = . The functions that we encounter in this book ar
x =0 3 3
needed to prove that continuous functions and many
are also integrable.
It is not a coincident! Let’s explain why it is true by learning the geometric meaning of double
integrals. Consider the integral:
Iterated Integrals
ˆ x =b ˆ y=d Evaluating double integrals using limits of Riemann
f ( x, y) dydx. tunately, there is a practical method that reduces a
x=a y=c
variable) integrals. An example illustrates the techniq
The inner integral: ➤ Recall the General Slicing Method. If a Suppose we wish to compute the volume of the
solid is sliced ˆ
parallel
y=d to the y-axis and z = f 1x, y2 = 6 - 2x - y over the rectangular
perpendicular
A( x ) := to the xy-plane,
f ( x, yand
) dythe 0 … y … 2 6 (Figure 14.5). By definition, the volum
cross-sectional area
y=c of the slice at the
point x is A1x2, then the volume of
is an integral with respect to y keeping x fixed. represents the area Vunder
This quantity = f 1x, y2 dA = 16
the solid region is O O
the curve obtained by moving along the y-directionb
from y = c to y = d on the surface R R
z = f ( x, y), while keeping x unchanged. VSee
= Figure 3.1.
A1x2 dx. According to the General Slicing Method (Section 6
La
taking slices through the solid parallel to the y-axis an
ure 14.5). The slice at the point x has a cross-secti
as x varies, the area A1x2 also changes, so we integr
x = 0 to x = 1 to obtain the volume
1
V = A1x2 dx
L0
y
The important observation is that for a fixed val
region under the curve z = 6 - 2x - y. This area is
spect to y from y = 0 to y = 2, holding x fixed; that
2
16 - 2x - A1x2 =
L0
where 0 … x … 1, and x is treated as a constant in th
we have
z
Figure 3.1: geometric meaning of a double integral 1 1 2
V = A1x2 dx = c 16 -
L0 L0 L0
The outer integral integrates the inner integral A( x ) from x = a to x = b, i.e.
y
ˆ x =b ˆ y=d ˆ x =b
f ( x, y) dydx =
A( x ) dx. The expression that appears on the right side o
x=a y=c integral (meaning repeated integral). We first evalua
x=a
holding x fixed; the result is a function of x. Then th
Since A( x ) dx can be thought as the volume of a solid slice with width spect todxx;and cross-section
the result is a real number, which is the vol
area A( x ), by integrating A( x ) dx it means adding up the volume these of these thin are
integrals slices and so
ordinary one-variable integrals.
the double integral
ˆ x =b
EXAMPLE 1 Evaluating an iterated integral E
A( x ) dx 2
A1x2 = 10 16 - 2x - y2 dy.
x=a
is the volume under the graph z = f ( x, y) over the base rectangleSOLUTION bounded by Using
x =thea,Fundamental
x = b, Theorem of Calc
y = c and y = d. It is important to understand the geometric meanings of the inner and outer 2
i
y A1x2
SOLUTION In this case, A1y2 is the a
The expression that appears on the right y inside of this equation
the interval 0 … y … is2.called an
This are
integral (meaning repeated integral). We first x =evaluate the1,inner
0 to x = integral
holding with
y fixed; re
that
holding x fixed; the result is a function of x. Then the outer integral is evaluate
spect to x; the result is a real number, which is the volume of the solid in Figure
A1y2 =1
these integrals are ordinary one-variable integrals.
where 0 … y … 2. 1
EXAMPLE 1 Evaluating an iterated integral Evaluate
Using V = Slicing
the General 10 A1x2Metho
dx, w
2
A1x2 = 10 16 - 2x - y2 dy. 2
V = A1y2 dy
SOLUTION Using the Fundamental Theorem of Calculus, holding L0 x constant, w
2
2 1
A1x2 = 16 - 2x - y2 dy
L0 = c 16 - 2x - y
y
x L0 L0
y 2 2
y
i
y y
= 2a 6y - 2xy - b` A1y2
Fundamental Theorem of Calculus
A slice at a fixed value of x A slice at a fixed value of y has 2 0
has area A(x), where 0 x 1. 2
area A(y), where 0 y 2.
= 112 - 4x - 22 - 0 = arec 16x
Simplify; limits in y. - x 2 - yx2
FIGURE 14.5 FIGURE 14.6 L0
(a) slices with fixed x = 10 fixed
(b) slices with - 4x.y Simplify.
2
Even simpler, one may denote dA := dxdy or dydx and the rectangular region by R. Then, we
can write the integral as:
¨
f ( x, y) dA.
R
52 Multiple Integrations
When setting up a double integral to find the volume of the solid under a graph z = f ( x, y),
it is worthwhile to observe that the lower and upper limits of the integral are not affected by
the function f ( x, y). Therefore, in order to interpret a double integral in a geometric way, one
may simply draw the base region (or in other words, the top-down view) instead of drawing
the solid in the three-dimensional space.
y x=1 y x=1
leaves at y = 2
y=2 y=2
enters at x = 0 leavs at x = 1
x x
enters at y = 0
(a) top-down view of Figure 3.2a (b) top-down view of Figure 3.2b
Figure 3.3: the red arrows represent the cross-section slices in Figures 3.2a and 3.2b.
3.2 Fubini’s Theorem for General Regions 53
Example 3.2 Find the volume of the solid under the plane z = 3 − x − y over the triangle
region R bounded by the x-axis, x = 1 and y = x.
Solution First we choose an order of integration, say dydx. The inner integral should
calculate the area of slices with y varies and x fixed. Since the height 3 − x − y of the solid
does not affect how we set-up the upper/lower limits, we consider the top-down view of the
solid (see Figure 3.4).
The red strip in Figure 3.4b represents a sample slice with fixed x. The strip enters at
y = 0 and leaves at y = x. Hence, the area of this slice is:
ˆ y= x
(3 − x − y) dy.
y =0 | {z }
height
“Summing up” the area of these slices, we integrate by dx over the range of x: 0 ≤ x ≤ 1 as
shown in Figure 3.4b, i.e.
ˆ x =1 ˆ y = x
(3 − x − y) dy dx.
x =0 y =0
| {z }
inner integral
It will give the volume of the solid as required in this problem. The rest of the task is to
compute the integral:
ˆ x =1 ˆ y = x ˆ x =1 y= x
y2
(3 − x − y) dydx = 3y − xy − dx
x =0 y =0 x =0 2 y =0
ˆ x =1
x2
= 3x − x2 − dx
x =0 2
ˆ 1
3x2
= 3x − dx
0 2
3x2 x 3 1
= −
2 2 0
= 1.
Alernatively, we can also integrate first by dx then by dy. Then, the inner integral is
represented by the red strip in Figure 3.4c. It enters at x = y and leaves at x = 1. The double
integral is therefore:
ˆ y =1 ˆ x =1
(3 − x − y) dxdy.
y =0 x =y
y x1
(3, 0, 0) yx
z 5 f(x, y) 5 3 2 x 2 y yx
R
x
(1, 0, 2) 0 y0 1
(b)
y x1
yx
(1, 1, 1)
y xy
x1
(1, 0, 0) (1, 1, 0)
R
R x51
x x
0 1
y5x
(a) (c)
FIGURE 14.12 (a) Prism with a triangular base in the xy-plane. The volume of this prism is defined
Figure 3.4: the graph and the top-down views of the function in Example 3.2
as a double integral over R. To evaluate it as an iterated integral, we may integrate first with respect
to y and then with respect to x, or the other way around (Example 1). (b) Integration limits of
x=1 y=x
ƒsx, yd dy dx.
The fact that we have the freedom to L
choose
x=0 Ly = 0our order of integration is guaranteed by the
Fubini’s Theorem, whose proof is again beyond the scope of this course.
If we integrate first with respect to y, we integrate along a vertical line through R and then integrate
Theorem 3.2from
— Fubini’s Theorem
left to right to includefor General
all the vertical Regions. Let
lines in R. (c) R be a limits
Integration of on the xy-plane and
region
f ( x, y) is a continuous function on R, then
y=1 x=1
¨ ¨ƒsx, yd dx dy.
Ly=0 L
f ( x, y)dxdy =x=y
f ( x, y)dydx
R R
If we integrate first with respect to x, we integrate along a horizontal line through R and then inte-
where the lower/upper
grate from bottom limits
to topof each integral
to include are setlines
all the horizontal up in
according
R. to the shape of the region
R.
Notation As there is no
Although difference
Fubini’s Theorembetween dxdy
assures us thatand dydxintegral
a double as far may
as the upper and
be calculated lower
as an
limits are set according
iterated integral to
in the same
either orderregion, we may
of integration, thesimply
value ofwrite:
one integral may be easier to
find than the value of the other. The next example shows how this can happen.
dA = dxdy or dydx
EXAMPLE 2 Calculate
Although choosing the dxdy-order will yield the same result as the dydx-order, it happens
often that one order is easier while the other sin x
one is harder. Let’s look at the following
x dA,
example: 6
R
where
Example 3.3 LetR R
is be
the the
triangle in the
region inxy-plane
the firstbounded by the
quadrant x-axis,
of the y = x, andby
the line bounded
xy-plane thethe
lineunit
2 x 2 = 1.
circle x + y = 1 and the straight-line x + y = 1. Evaluate the integral
¨ p
1 − x2 dA.
R
772 Chapter 14: Multiple Integrals
Solution The integrand sinx x does not have a simple antiderivative when integrating by dx!
Let’s switch the order of integration first.
The region corresponds to the double integral is formed by strips entering at x = y and
leaving at x = 1. The range of y is from 0 to 1. A sketch of the diagram can be found in
Figure 3.6.
Switching the order of integration, the strip for each x enters at y = 0 and leaves at y = x.
Therefore, Fubini’s Theorem says:
ˆ 1ˆ 1 ˆ 1ˆ x
sin x sin x
dxdy = dydx.
0 y x 0 0 x
(b)
ƒsx, yd dA =
6
R
y Leaves at
Using Horizontal Cross-sections To
3.3 Double Integrals in Polar Coordinates 57
x = r cos θ
y = r sin θ
Here r is the distance from the point to the origin, and θ is the angle made with the positive
x-axis.
In rectangular coordinates, the region defined by inequalities like a ≤ x ≤ b and c ≤ y ≤ d,
where a, b, c and d are constants, describe a rectangle. We have seen that it is very easy to
set-up a double integral when the region is a rectangle.
In polar coordinates, regions defined by inequalities like a ≤ r ≤ b and α ≤ θ ≤ β, where a,
b, α and β are constants, describe a fan shape or a circular sector (see Figure 3.7). It is wise
to use polar coordinates instead of rectangular coordinates when the region of integration is
given by one of these circular shapes.
Chapter 14 • Multiple Integration
y y y
Examples of
polar rectangles
b
R R
R
b
␣ 
b 
a ␣
0 b x x x
FIGURE 14.28
Figure 3.7: examples of regions good for polar coordinates
R {(r, ): a r b, ␣ } Our approach is to divide 3a, b4 into M subintervals of equal length r = 1b - a2>M.
To set-up a doubledivide
We similarly 3a, b4ofinto
integral a region a≤r≤
m subintervals of bequal α ≤ θu
andlength ≤ =β 1b - a2>m.
using polarNow
coordinates,
(rk*, k*) look at the
the upper/lower arcs ofare
limits thesimply:
circles centered at the origin with radii
Ak
r =ˆ a,
θ =rβ =
ˆ ar=+b r, r = a + ˆ2r,
r =bc,
ˆ θr= β= b
␣
and the rays or
rb θ =α r=a r=a θ =α
u = a, u = a + u, u = a + 2u, c, u = b
rdepending
on the order of integration drdθ or dθdr. However, one should be very cautious that
emanating
while dA = dxdy infrom the origin coordinates,
rectangular (Figure 14.29).itThese arcs and rays divide the region R into
is instead:
ra
n = Mm polar rectangles that we number in a convenient way from k = 1 to k = n. The
0
0 area of the kth rectangle is denoted dA =A k, and we
rdrdθ 1r *k , u *k 2 be an arbitrary point in that
or letrdθdr
RE 14.29 rectangle.
Consider the
in polar coordinates. We “box” whose base is theitkth polar rectangle and whose
a fewheight is f 1r *k , u *k 2;
* will explain why is so after learning examples.
Ak rk* r its volume is f 1r k , u k 2 Ak, for k = 1, c. , n. Therefore, the volume of the solid region be-
*
rk*
r f 1r, u2 dA = lim a f 1r *k , u *k 2 Ak.
2 O S0 k=1
R
RE 14.30
The next step is to write the double integral as an iterated integral. In order to do so,
we must express Ak in terms of r and u.
Recall that the area of a sector of a circle Figure 14.30 shows the kth polar rectangle, with an area Ak. The point 1r *k , u *k 2 is
f radius r subtended by an angle u is chosen so that the outer arc of the polar rectangle has radius r *k + r>2 and the inner arc
r 2u. has radius r *k - r>2. The area of the polar rectangle is
Ak = 1area of outer sector2 - 1area of inner sector2
1 r 2 1 r 2 r2
58 Multiple Integrations
However, things go much better if we switch to polar coordinates, since the region is in
the form a ≤ r ≤ b and α ≤ θ ≤ β. From Figure 3.8, the region is defined by:
0 ≤ r ≤ 1, 0 ≤ θ ≤ π.
where R is the semicircular region bounded by the x-axis and the curve y =
(Figure 14.26).
does not have an easy explicit expression. Nonetheless, this integral is extremely important
in probability, heat flow and many physics and engineering subjects. While this integral is
generally hard to compute, it can be magically done, with the help of a double integral, when
the upper/lower limits are:
ˆ ∞
2
e− x dx.
0
60 Multiple Integrations
Now the double integral are split as a product of two single-variable integrals. Note that they
two single-variable integrals are the same as x and y are merely dummy variables! Therefore,
combining everything we have shown above, we have:
ˆ ∞ˆ ∞ ˆ ∞ ˆ ∞ ˆ ∞ 2
2 2 2 2 2
e− x −y dxdy = e− x dx e−y dy = e− x dx .
0 0 0 0 0
ˆ ∞ 2
Consequently, in order two evaluate the single-variable integral e− x dx, one can first
0
evaluate the double integral ˆ ∞ˆ ∞ 2 − y2
e− x dxdy
0 0
ˆ ∞ 2
and then the square root of the double integral will give the value of e− x dx.
0
In contrast to the single-variable integral, the double integral is relatively easy to find
using polar coordinates. The region of integration is the entire first quadrant which, in polar
coordinates, can be described as:
π
0 ≤ r ≤ ∞, 0≤θ≤ .
2
Therefore,
ˆ ∞ˆ ∞ ˆ π/2 ˆ ∞
2 − y2 2
e− x dxdy = e−r r drdθ
0 0 0 0
ˆ
1 −r 2 r = ∞
π/2
d −r 2 2
= − e dθ note that e = e−r · (−2r )
0 2 r =0 dr
ˆ π/2
1
= −0 + dθ
0 2
π
= .
4
Therefore, ˆ √
∞
r
− x2 π π
e dx = = .
0 4 2
2
Furthermore, e− x is an even function, we also have:
ˆ ∞
2 √
e− x dx = π.
−∞
However, this trick does not work well for any similar definite integral such as:
ˆ 1
2
e− x dx.
0
3.3 Double Integrals in Polar Coordinates 61
the double integral is not “polar-friendly” since the region of integration is a square!
1 1 2
∆A =1006(r + ∆r )2 ∆θ14−•
Chapter r ∆θ
Multiple Integration
|2 {z } |2 {z }
area of outer sector y
area of inner sector y y
Examples of
polar rectangles
1
= r2 + 2r∆r + (∆r )2 − r2 ∆θb
2 R R
1
= r∆r · ∆θ + (∆r )2 ∆θ. R
b
2
␣ 
b
a
When ∆r and ∆θ get smaller and smaller, the last term (∆r )2 ∆θ is negligibleb when compared 0 x x
to the other term r∆r∆θ. Therefore, as the region is chopped into infinitesimal
0rb pieces, we 0have:
rb ar
0q ␣ ␣
R {(r, ): a r b, ␣ } Our approach is to divide 3a, b4 into M subintervals of equal length r =
 We similarly divide 3a, b4 into m subintervals of equal length u = 1b
(rk*, k*) look at the arcs of the circles centered at the origin with radii
Ak
r = a, r = a + r, r = a + 2r, c, r = b
␣
and the rays
rb
u = a, u = a + u, u = a + 2u, c, u = b
r
emanating from the origin (Figure 14.29). These arcs and rays divide the
ra
n = Mm polar rectangles that we number in a convenient way from k = 1
0 area of the kth rectangle is denoted Ak, and we let 1r *k , u *k 2 be an arbitrar
0
FIGURE 14.29 rectangle.
Figure 3.10: area of ∆A gets larger when r getsConsider
larger.the “box” whose base is the kth polar rectangle and whose heig
Ak rk* r its volume is f 1r *k , u *k 2 Ak, for k = 1, c. , n. Therefore, the volume of the s
neath the surface z = f 1r, u2 with a base R is approximately
n
(rk*, k*)
V ⬇ a f 1r *k , u *k 2Ak.
k=1
r
rk*
2
r
This approximation to the volume is another Riemann sum. We let be
value of r and u. If f is continuous on R, then as S 0, the sum appro
integral; that is,
n
rk*
r f 1r, u2 dA = lim a f 1r *k , u *k 2 Ak.
2 O S0 k=1
R
FIGURE 14.30
The next step is to write the double integral as an iterated integral. In
we must express Ak in terms of r and u.
➤ Recall that the area of a sector of a circle Figure 14.30 shows the kth polar rectangle, with an area Ak. The po
of radius r subtended by an angle u is chosen so that the outer arc of the polar rectangle has radius r *k + r>2 an
2 r u.
1 2
has radius r *k - r>2. The area of the polar rectangle is
Ak = 1area of outer sector2 - 1area of inner sector2
1 r 2 1 r 2
= a r *k + b u - a r *k - b u Area of sector
2 2 2 2
= r *k r u. Expand and si
Substituting this expression for Ak into the Riemann sum, we have
r
n n
62 Multiple Integrations
Some triple integrals have important physical meanings. For instance, if f ( x, y, z) is the
density function, then the triple integral
ˆ 1ˆ yˆ y
f ( x, y, z) dxdzdy
0 0 z
represents the mass of the solid (whose shape is determined by the upper/lower limits of the
integral). It is because dxdzdy represents (infinitesimal) volume, and
density × volume = mass.
In physics, some calculations such as the center of mass of an object and the moment of inertia
about an axis also involve evaluations of triple integrals.
Pillar-Base Approach
Before we see some practical applications of triple integrals, let’s learn how a triple integral
can be set-up. One common approach is so-called the pillar-base approach or pillar-shadow
approach. Let’s explain this through an example:
Example 3.7 Let D be the tetrahedral solid bounded by the plane z = y − x, the plane
y = 1, the xy-plane and the yz-plane (see Figure 3.11). Evaluate the triple integral:
˚
x2 dzdydx.
D
Solution We demonstrate the pillar-base approach in the solution. The so-called pillar is the
orange ray labeled M in Figure 3.11, and it determines how the inner-most integral is set-up:
ˆ z=?
x2 dz.
z=?
Its lower and upper limits of z are determined by, respectively, where the pillar enters the
solid and where the pillar leaves the solid. According to the diagram, it enters at z = 0, i.e.
the xy-plane; and it leaves at z = y − x. Therefore, the inner-most integral should be:
ˆ z=y− x
x2 dz.
z =0
Note that again the integrand x2 does not affect how we set-up this integral.
Next we call the other two variables x and y to be base variables or shadow variables.
Cast some light on the solid in the direction parallel to the pillar (i.e. z-direction), a shadow
appears on the base xy-plane as a triangle. To way to set-up the middle and outer-most
integrals are just the same as what we did for double integrals.
Since we picked the dydx-order, we draw a sample strip L on the base in the direction of
y. This strip enters at y = x and leaves at y = 1, and therefore the middle and outer-most
integrals should be set-up as:
ˆ x =1 ˆ y =1 ˆ z = y − x
x2 dzdydx.
x =0 y= x z =0
| {z } | {z }
base pillar
3.4 Triple Integrals in Rectangular Coordinates 63
Finally, the easiest step is to evaluate the integral. This is as straight-forward as in double
integrals:
ˆ 1 ˆ 1 ˆ y− x ˆ 1ˆ 1
z=y− x
x2 dzdydx = [ x 2 z ] z =0 dydx
0 x 0 0 x
ˆ 1ˆ 1
= x2 (y − x )dydx
0 x
ˆ 1 y =1
x 2 y2
= − x3 y dx
0 2 y= x
ˆ 1 2
x4
x
= − x3 − + x4 dx
0 2 2
ˆ 1 2
x4
x 3
= −x + dx
0 2 2
3 1
x x4 x5
= − +
6 4 10 0
1 1 1 1
= − + = .
6 4 10 60
14.
L0 Lx L
D
For example, if Fsx, y, zd = 1, we wo
0 (0, 1, 0)
y
x L
(x, y)
y51
V =
L0
y5x
R
=
1
(1, 1, 0) L0
x
=
L0
FIGURE 14.32
Figure The D
3.11: solid tetrahedron
in Example in 3.7.
Example 3 showing how the limits of
The Fubini’s Theoremintegration
also holdsare
forfound
triplefor
integrals, i.e.dzwe
=
the order dycan
dx. evaluate the integral by L0
different order of integration, say dxdydz or dydxdz (there are 6 possible orders), we should get
= c
the same value provided that the upper and lower limits are adjusted to present the same solid. 1
Example 3.8 Let D be the tetrahedral solid in Example 3.7. Evaluate the triple integral 2
˚ 1
x2 dydzdx = .
D
6
using the dydzdx-order.
We get the same result by integrating
V =
L
=
L
64 Multiple Integrations
Solution Now the inner-most integral is with respect to dy, meaning the pillar is pointing
along the positive y-axis. It is the ray labeled as M in Figure 3.12. It enters the solid through
the plane z = y − x, or equivalently y = x + z, and it leaves at y = 1. Therefore, the
inner-most integral should be:
ˆ y =1
x2 dy.
y= x +z
The middle and the outer-most variables are z and x, so the base is the shadow of the
solid on the xz-plane, which is 790
the triangle labeled
Chapter R in
14: Multiple Figure 3.12.
Integrals
Here the order of integration is dzdx, so we draw a sample strip (labeled L) along the
z-axis direction. It enters the region through z = 0 and leaves at theweline
Next x +y-limits
find the z = 1, or
of integration. The line L through (
equivalently, z = 1 − x. Therefore, the whole triple integral should entersbe = - 2s4 - x 2 d>2 and leaves at y = 2s4 - x 2 d>
sety as:
R at
Finally we find the x-limits of integration. As L sweeps acro
ˆ x =1 ˆ z =1− x ˆ y =1 from x = -2 at s -2, 0, 0d to x = 2 at (2, 0, 0). The volume of D
x2 dydzdx.
x =0 z =0 y= x +z V = dz dy dx
9
D
It can be verified by straight-forward computations that the answer should be the2 same 2as 2
2 2s4 - x d>2 8-x -y
we got in Example 3.7: = dz dy dx
ˆ L-2 L-2s4 - x 2d>2 Lx 2 + 3y 2
x =1 ˆ z =1− x ˆ y =1 ˆ 1 ˆ 1− x h i y =1
2s4 - x 2d>2
x2 dydzdx = x2 y dzdx = 2
s8 - 2x 2 - 4y 2 d dy dx
x =0 z =0 y= x +z 0 0 y= x +z
L-2 L-2s4 - x 2d>2
ˆ 1 ˆ 1− x
y = 2s4 - x 2d>2
2
x2 − x2 ( x + z) cs8 - 2x 2 dy - y 3 d
= dzdx 4
= dx
0 0 L-2 3 y = -2s4 - x 2d>2
ˆ 1 z =1− x
x 2 z2
2 3>2
= dx a2s8 - 2x 2 d - a b b dx
= ( x2 − x3 )z − 4 - x2 8 4 - x2
0 2 z =0 L-2 B 2 3 2
ˆ 1
x2 (1 − x2) 4 - x 2 3>2 8 4 - x 2 3>2 2
c8 a dx b - a b d dx =
4 22
= ( x2 − x3 )(1 − x ) − =
2 3 2 3
0 2L-2
ˆ 1 2
x4 1= 8p22. After integration with the substitution x = 2 sin
x
= − x3 + dx = (same!)
0 2 2 60
In the next example, we project D onto the xz-plane instead
how to use a different order of integration.
Although Fubini’s Theorem allows us to switch the order of integral, owever, to ease our
computation, we sometimes need to choose a smart choice of pillar and base variables. Let’s
look at the next example:
Example 3.9 Let D be the solid bounded by the paraboloid y = x2 + z2 and y = 16 − 3x2 −
z2 (see Figure 3.13). Find the volume of the solid, i.e. evaluate the integral:
˚
1dV.
D
Solution After taking a careful look at the diagram, one should see that it would be a bad
choice if we chose either x or z to be the pillar variable. The solid can be decomposed into
two parts as shown in the diagram. If z were chosen to be the pillar direction, then the pillar
would enter the yellowp part in a way different from it does in the blue part. The yellow
part would
p have z = ± 16 − 3x2 − y as the lower/upper limits, while the blue part have
2
z = ± y − x . Even worse, the part near the intersection of the blue and the yellow surfaces
is geometrically complicated – it is not easy to set up the inner integral for that part.
However, life is much easier if we choose y as the pillar variable. Since then the y-pillar
will enter through the blue surface and leave through the yellow surface. The shadow is an
ellipse on the xz-plane.
To set-up the inner integral, we note that the y-pillar enters at y = x2 + z2 and leaves at
y = 16 − 3x2 − z2 :
ˆ y=16−3x2 −z2
dy.
y = x 2 + z2
The shadow (or base) is an ellipse, whose equation can be obtained by setting y = x2 + z2 =
16 − 3x2 − z2 , which gives
Chapter 14 • Multiple Integration
2x2 + z2 = 8.
For the base integral,
SOLUTION Wewe √ can2 the
identify either
rightchoose
boundary dzdx
of√Doras dxdz. = 16
For ythe
the surface - 3x 2 the
former, - zsample
2
; the z-strip
left boundary is y = x + 2z 2
. These surfaces are functions
2 of x and
enters the ellipse at − 8 − 2x and leaves at 8 − 2x . The min/max values of x are −2 z, so they determine
and 2 forthe
thelimits of integration
ellipse. Therefore, for the
theinner integral
middle and in outer
the y-direction.
integrals should:
A key step in the calculation is finding the curve of intersection between the two
ˆ zthe
ˆ xit=2onto √
surfaces and projecting 8−2x2 ˆtoy=
= xz-plane form the2 −boundary
16−3x z2 of the region R. Equat-
ing the y-coordinates of the two surfaces,
√
we have x 2
+ z 2
= 16 - 3x 2 - z 2, which
dydzdx.
becomes the equationxof =− an2 ellipse:
z=− 8−2x 2 2
y= x +z 2
+ 2z = 16, or (but
2
Evaluation of this integral is4xstraight-forward
2
z = { be28 - 2x . It is left as an exercise for
careful).
2
√
The projection
readers. The of the solid
answer should be region
32π D 2. onto the xz-plane is the region R bounded by this
ellipse (centered at the origin with axes of length 4 and 412). Here are the observations
that lead to the limits of integration with the ordering dy dz dx.
163x2z2 163x2z2
冕冕 冕 冕 冕 冕 dy dz dx
2 82x 2
dy dA
R x2 z2 2 82x 2 x2z2
(a) (b)
FIGURE 14.44
Figure 3.13: the pillar-base diagram for the triple integral in Example 3.9
Inner integral with respect to y: A line through the solid parallel to the y-axis enters
the solid at y = x 2 + z 2 and exits at y = 16 - 3x 2 - z 2. Therefore, for fixed values of
x and z, we integrate over the interval x 2 + z 2 … y … 16 - 3x 2 - z 2 (Figure 14.44a).
Middle integral with respect to z: Now we must cover the region R. A line parallel to the
z-axis enters R at z = - 28 - 2x 2 and exits R at z = 28 - 2x 2. Therefore, for a fixed
value of x, we integrate over the interval - 28 - 2x 2 … z … 28 - 2x 2 (Figure 14.44b).
13. 1xy + xz + yz2 dV; D = 5 1x, y, z2: -1 … x … 1,
l
D
- 2 … y … 2, - 3 … z … 3 6
xyze -x - y2
dV; D = 5 1x, y, z2: 0 … x … 1ln 2,
2
66 14. Multiple Integrations
l 19. The wedge above the xy-pl
D
Suppose D be the tetrahedron solid in the first octant bounded by the plane 2x + 3y +x6z
0 … y … 1ln 4, 0 … z … 1 6 + y = 4 is cut by the
2 = 2
(0 0 兹8)
兹 2 2 2 8
ln
1
Cone 5 1r, u
If one sets r = constant, then it describes an infinite cylinder in R3 with z-axis as the central
axis. Therefore, if the solid is cylindrical in shape, it is usually easier to set-up a triple integral
using cylindrical coordinates. Analogous to polar coordinates, the volume element dV is given
by:
Theorem 3.3 Under cylindrical coordinates (r, θ, z), we have:
dV = rdzdrdθ.
Example 3.10 The volume of the solid bounded by two surfaces z = 4 − 4( x2 + y2 ) and
z = ( x2 + y2 )2 − 1, as shown in Figure 3.16.
Solution When using cylindrical coordinates, one often (though not always) set z as the
The shadow is a circle centered at the origin. To find its radius, we solve:
r4 − 1 = z = 4 − 4r2
which gives r = 1. Therefore, the outer and middle integrals have upper and lower limits
given by:
ˆ θ =2π ˆ r=1
.
θ =0 r =0
Combining all of the above, the volume of the solid is given by:
˚ ˆ θ =2π ˆ r =1 ˆ z=4−4r2
dV = 1 rdzdrdθ
D θ =0 r =0 z =r 4 −1
ˆ 2π ˆ 1
2
= [z]rz4=−41−4r rdrdθ
0 0
ˆ 2π ˆ 1
Chapter 14: Multiple Integrals 2 4
= (4 − 4r − r + 1)r drdθ
0 0
ˆ 2π ˆ 1
3 iterated
rated Integrals in Spherical Coordinates = volume (of5rD−as4ran− r5 ) drdθ
triple integral in (a) cylindrical and
s 33–38, (a) find the spherical coordinate limits for the in- 0(b) spherical
0 coordinates. Then (c) find V.
alculates the volume of the given solid and then (b) evalu- ˆ 2π 6 rcut
2 smaller rcap =1
41. Let D be5rthe from a solid ball of radius 2 units by
gral. = a plane − 1 unit r4 − the centerdθ
−from of the sphere. Express the volume
0 2 6 r =0
lid between the sphere r = cos f and the hemisphere ˆ of2πDas an iteratedtriple integral in (a) spherical, (b) cylindrical,
z Ú 0
5 1 8π Then (d) find the volume by eval-
= and (c) rectangular
− 1 − coordinates.
dθ = .
0uating one of the three triple 3
2 6 integrals.
z
42. Express the moment of inertia Iz of the solid hemisphere
r 5 cos f 2 r52 x 2 + y 2 + z 2 … 1, z Ú 0, as an iterated integral in (a) cylindri-
cal and (b) spherical coordinates. Then (c) find Iz.
Volumes
Find the volumes of the solids in Exercises 43–48.
43. 44.
2 2 z z
x y
z 4 4 (x 2 y 2)
lid bounded below by the hemisphere r = 1, z Ú 0, and z512r
by the cardioid of revolution r = 1 + cos f
z 21 21
r 5 1 1 cos f
r51 1 1 y
x
x y
z (x 2 y 2 ) 2 1 z 5 2 1 2 r 2
Figure45.
3.16: the solid in Example
z 46.
3.10. z
z x 2 y 2
z –y
x y
id enclosed by the cardioid of revolution r = 1 - cos f
pper portion cut from the solid in Exercise 35 by the r 3 cos y
e
Example 3.11 Let D be a solid cylinder of radius a with z-axis as the central axis, is bounded
id bounded below by the sphere r = 2 cos f and above by x r –3 cos
by planes z = z0 and z = z0 + h, where z0 and h are constants. Therefore,
x they cylinder has
e z = 2x 2 + y 2
height h. Suppose the solid has uniform density δ. Evaluate the following triple integral:
47. 48.
z
z 5 x2 1 y2
˚
z z
Iz := δ( x2 + y2 )dV
1 x 2 y 2
zD z 5 31 2 x2 2 y2
which is the moment of inertia of the solid about the z-axis.
r 5 2 cos f
x y
x
lid bounded below by the xy-plane, on the sides by the y
r = 2, and above by the cone f = p>3 r sin x y
r 5 cos u
z
p
f5 49. Sphere and cones Find the volume of the portion of the solid
3.5 Triple Integrals in Cylindrical Coordinates 69
Solution We choose the order of integration dzdrdθ. The z-pillar enters the solid at z = z0
and leaves at z = z0 + h. The shadow is the circle with radius a centered at the origin.
Therefore,
ˆ 2π ˆ a ˆ z0 + h
Iz = δ( x2 + y2 ) · rdzdrdθ
0 0 z0 | {z }
r2
ˆ 2π ˆ a ˆ z0 + h
= δr3 dzdrdθ
0 0 z0
ˆ 2π ˆ a
= δr3 hdrdθ
0 0
ˆ 2π r = a
δhr4
= dθ
0 4 r =0
ˆ 2π
δha4
= dθ
0 4
δhπa4
= .
2
Most physics/engineering textbook expresses the moment of inertia in terms of the total
mass m rather than the density δ. To rewrite the above answer in terms of m, we note that:
m
δ=
V
where V is the total volume of the solid, which is πa2 h. Therefore, combining with the above
calculation, we have:
m hπa4 ma2
Iz = · = ,
πa2 h 2 2
which is exactly what you can find in physics or engineering textbooks.
70 Multiple Integrations
z Therefore,
Mxy 32p 1
P(r, f, u) z = =
M 3 8p
f and the centroid is (0, 0, 4 >3). Notice that the centro
r
z 5 r cos f
0 Spherical Coordinates and Integration
x
u Spherical coordinates locate points in space with tw
r 1
y in Figure 14.47. The first coordinate, r = ƒ OP ƒ , is
y Unlike r, the variable r is never negative. The second
x
with the positive z-axis. It is required to lie in the in
FIGURE 14.47 The spherical coordinates the angle u as measured in cylindrical coordinates.
Figure 3.17: spherical coordinates
r, f, and u and their relation to x, y, z, and r.
The ranges of (ρ, θ, ϕ) are: DEFINITION Spherical coordinates represe
triples sr, f, ud in which
ρ ≥ 0, 0 ≤ θ ≤ 2π, 0 ≤ ϕ ≤ π.
1. r is the distance from P to the origin.
From standard trigonometry, one can figure out the following conversion rules: 1
2. f is the angle OP makes with the positive z-a
x = ρ sin ϕ cos θ 3. u is the angle from cylindrical coordinates s0
y = ρ sin ϕ sin θ
z = zρ cos ϕ On maps of the Earth, u is related to the meridia
f 5 f0, latitude, while r is related to elevation above the Ear
Similar to polar and cylindrical coordinates,
r and u vary it is
f0good to keep in mind The
thatequation r = a describes the sphere of radius
P(a, f0, u0) The equation f = f0 describes a single cone whos
ρ2 = x 2 + y2 + z2 .
axis lies along the z-axis. (We broaden our interpre
The equation ρ = ρ0 , where ρ0 is a positive constant, represents cone = p>2.of
thefsphere f0 is ρgreater than p>2, the c
) Ifradius
0
equation u = u0 describes
centered at the origin. The equation ϕ = ϕ0 represents a cone with angle ϕ0 with the origin as the half-plane that conta
its vertex. Therefore, the spherical coordinates are particular usefulwith
when the handling
positive x-axis.
spherical
or conical objects (see Figure 3.18).
To integrate using spherical coordinates, one should note that:
Equations Relating Spherical Coordinates to
u0 (ρ, θ, ϕ ), the volume
Theorem 3.4 Under the spherical coordinates y element is given by:Coordinates
and Cylindrical
1
2. f is the angle OP makes with the positive f = s0
r cosz-axis … ff…
r sin
z 3. u is the angle from cylindrical coordinates s0 … u … 2pd.
3.6 Triple Integrals in Spherical Coordinates 71 cos f = sin f
4 p
z
f = a point
On maps of the Earth, u is related to the meridian of 4
. o
f 5 f0, latitude, while r is related to elevation above the Earth’s surface.
r and u vary f0 Spherical
The equationcoordinates arethe
r = a describes useful
spherefor describing
of radius spher
a centered at th
P(a, f0, u02) The hinged
equationalong
f = the
f z-axis,
describes and
a cones
single conewhose
whose vertices
vertex lie
lies a
z x y 2 0
axisthe
liesz-axis.
along the z-axis. (We
Surfaces likebroaden our interpretation
these have equations oftoconsinclu
cone f = p>2.) If f0 is greater than p>2, the cone f = f0
4
equation u = u0 describes the half-plane that contains the z-axis
with the positive x-axis. r = 4 S
p C
f =
3 a
x y Equations Relating Spherical Coordinates to Cartesian
u0 p H
y and Cylindrical Coordinates u = .
FIGURE 14.50 The cone in Example 4. 3 a
x r = r sin f, x = r cos u = r sin f cos u
When computing z = r costriple y = r sinover
f, integrals u = ar region
sin f sinDu
r 5 a,
f and u vary the region into n spherical 2wedges. The size2 of the k
u 5 u0, r = 2x + y + z = 2r + z 2.
2 2
Volume Differential in Spherical
r and f vary
point srk , fk , uk d, is given by the changes ¢rk , ¢fk ,
Coordinates ical wedge has one edge a circular arc of length rk
FIGUREdV = r2Constant-coordinate
14.48 sin f dr df du length3rk sin
EXAMPLE ¢uk , and
Findfak spherical thickness
coordinate ¢rkfor
equation . The spherica
the sphere x2
Figure 3.18: coordinate
coordinatesplanes
equations in spherical yield of these dimensions when ¢rk , ¢fk , and ¢uk are all
spheres, single cones, and half-planes. Solution We use Equations (1) to substitute for x, y, and z:
that the volume of this spherical wedge ¢Vk is
z srk, fk, uk d a point chosen inside
2
+ ywedge.
x the 2
+ sz - 1d2 = 1
r sin f rThe
2
f cos2 u + r2 sinRiemann
sincorresponding
2 2
f sin2 u +sum forfa -function
sr cos 1d2 = 1
rΔf r sin f Δu n 2
r2 sin2 fscos2 u + sin2 ud + r2 cos f - 2r cos f +2 1 = 1
(''')'''*
Sn = a ƒsrk, fk, uk d rk sin f
1 k=1
r2ssin2 f + cos2 fd = 2r
(''')'''*
As the norm of a partition approaches zero, and th
1
Riemann sums have a limit when ƒ is continuous:
2
r = 2r
Of r = 2
r
lim Sn = ƒsr, f, ud dV = ƒsr
n: q 9 9
y D D
Δr
In spherical coordinates, we have
u
u 1 Δu dV = r2 sin f dr df
x
To evaluate integrals in spherical coordinates, we usu
Figure 3.19:FIGURE 14.51meaning
geometric In spherical = ρ2 sin ϕ The
coordinates
of dV dρdϕdθ.
procedure for finding the limits of integration is
dV = dr # r df # r sin f du to integrating over domains that are solids of revo
= r2 sin f dr df du. thereof ) and for which the limits for u and f are cons
Example 3.12 Consider a solid sphere S of radius r centered at the origin. Suppose it has
uniform density δ. Derive the moment of inertia about z-axis, which is given by:
˚
Iz := δ( x2 + y2 )dV
S
72 Multiple Integrations
0 ≤ θ ≤ 2π, 0 ≤ ϕ ≤ π.
Therefore,
ˆ 2π ˆ π ˆ r
Iz = δ · ρ2 sin2 ϕ · ρ2 sin ϕ dρdϕdθ
0 0 0
ˆ 2π ˆ π ˆ r
= δρ4 sin3 ϕ dρdϕdθ
0 0
ˆ !0 ˆ
2π π ˆ r
3 4
= dθ sin ϕ dϕ δρ dρ
0 0 0
δr5
ˆ π
2
= 2π · (1 − cos ϕ) sin ϕ dϕ ·
0 5
π
cos3 ϕ δr5
= 2π · − cos ϕ + ·
3 0 5
1 1
δr 5
= 2π · 1 − + 1 − · .
3 3 5
8πδr5
= .
15
Although it is a perfectly acceptable answer, the moment of inertia in many physics and
engineering books is expressed in terms of the total mass m rather than of the density. Since
the density in this problem is uniform, one can easily derive the moment of inertia formula
in terms of m:
8πr5 m 2mr2
Iz = · 4πr3 =
15 5
3
which is what one can find in physics or engineering books.
Conical objects are another type of solids which are compatible with spherical coordinates.
The inequality 0 ≤ ϕ ≤ π3 represents an infinite cone above the xy-plane with cone angle
3 counting from the positive z-axis. If one further impose the inequality 0 ≤ ρ ≤ 1 which
π
represents the sphere with radius 1 centered at the origin, then the combined inequalities:
π
0 ≤ ρ ≤ 1, 0≤ϕ≤
3
represents the common part of the sphere and the cone, which is in an ice-cream shape. Let’s
consider the following example:
Example 3.13 Find the volume of the “ice-cream cone” D cut from the solid sphere ρ ≤ 1
by the cone ϕ = π3 , as shown in Figure 3.20.
3.6 Triple Integrals in Spherical Coordinates 73
˚
Solution The volume is given by ρ2 sin ϕ dρdϕdθ. The limits of the inner-most integral
D
are determine by where the ρ-ray, labeled M in the diagram, enters the solid and where it
leaves the solid. Evidently, it enters from the origin ρ = 0. When it leaves, it always leaves
from the sphere part and never the cone part, so the upper limit should be ρ = 1.
The ϕ-angle run from 0 to π3 , and the θ-ray, labeled L in the diagram, sweeps over the
shadow from 0 to 2π.
Combining all these limits, the volume is given by the integral:
ˆ ˆ
θ =2π ˆ
ϕ=π/3 ρ =1 ˆ ! ˆ
2π
! ˆ
π/3
!
1
ρ2 sin ϕ dρdϕdθ = dθ sin ϕ dϕ ρ2 dρ 14.7 Triple Integrals in Cylindri
θ =0 ϕ =0 ρ =0 0 0 0
1 4. Find the u-limits of integration. The ray L sweep
= 2π · [− cos ϕ]0π/3 ·
3 are the u-limits of integration. The integral is
1 1 π
= 2π · − + 1 · = . u = b f = fmax r = g2sf,
2 3 3 ƒsr, f, ud dV =
9 Lu = a Lf = fmin Lr = g1sf, ud
D
Iz = sx 2 + y 2 d
9
Figure 4.1: plot of the gravitational force F, the central sphere is the Sun (not a part of the
vector field)
b. F 1x, y2 = 8 1 - y 2, 0 9 = 11 - y 22 i, for 兩y兩 … 1 (channel flow)
c. F 1x, y2 = 8 -y, x 9 = -y i + x j (a rotation field)
SOLUTION
1
1
a. This vector field is independent of y. Furthermore, because the x-component of F is
zero, all vectors in the field (for x ⬆ 0) point in the y-direction: upward for x 7 0
x
76 and downward for x 6 0. The magnitudes of the vectors in the field increase with Vector
dis- Calculus
tance from the y-axis (Figure 15.4). The flow curves for this field are vertical lines. If
F represents
The general form of aavelocity
vectorfield,
fielda in
particle
R3 isright of theby:
given y-axis moves upward, a particle left
of the y-axis moves downward, and a particle on the y-axis is stationary.
( x,vector
b. In this case,Fthe y, z) field
= Fxis( independent
x, y, z)i + Fofy (xx,and z)j y-component
y, the + Fz ( x, y, zof
)kF is zero.
Because 1 - y 2 7 0 for 兩y兩 6 1, vectors in this region point in the positive
Fx , Fy and
where each ofx-direction. TheFzx-component
is a scalar-valued function.
of the vector field is zero at the boundaries y = {1
and increases to 1 along the center of the strip, y = 0. The vector field might model
GURE 15.4 Do NOT confuse Fx in
with the partial
shallowderivative ∂F
i the flow of water a straight channel (Figure ∂x ! 15.5);
Although
its flowwe usedarethe
curves subscript
hori-
notation F
zontalx lines, indicating motion in the direction of the should
to denote a partial derivative before, we positive avoid
x-axis.using it in this chapter.
Here Fx means the x-component of the vector field F.
➤ Drawing vectors with their actual length c. It often helps to determine the vector field along the coordinate axes.
often leads to cluttered pictures of vector
Many vector
fields. For this reason, most of the vector
• When y = in
fields physics
0 (along the x-axis), we have F 1x, 02 = 8 0, xTo
are three-dimensional. 9 . With 7 0, this
startx with, we will also study
vector
two-dimensional
fields in this chapter are illustrated with fieldvector fields. A two-dimensional vector field isas ax collection
consists of vectors pointing upward, increasing in length of vectors in R2
increases. With
proportional scaling: All vectors that
are are denoted x 6by0,athe vectors point downward,
vector-valued F : R2 →
functionincreasing in length as 兩x兩 increases.
R2 , which assigns to each point ( x, y) a
multiplied by a scalar chosen to make F ( x, y )•. When
vector x = 0 (along
The general formtheofy-axis),
a two-dimensional 8 -y, 0field
we have F 10, y2 =vector 9 . If y is7 given
0, the vectors
by:
the vector field as understandable as point in the negative x-direction, increasing in length as y increases. If y 6 0, the
possible. vectors point in the positive x-direction, increasing in length as 兩y兩 increases.
F( x, y) = Fx ( x, y)i + Fy ( x, y)j.
➤ A useful observation for two-dimensional A few more representative vectors show that the vector field has a counterclockwise
Figure 4.2rotation
vector fields F = 8 f, g 9 is that the slope
showsabout
thethe
plot of the
origin; twomagnitudes
examples of vectors
of the two-dimensional vectorfrom
increase with distance fields:
the F( x, y) =
(1 − y2 )i and F( x, y) = −yi + xj.
of the vector at 1x, y2 is g1x, y2>f 1x, y2. origin (Figure 15.6).
In Example 1a, the slopes are everywhere
undefined; in part (b), the slopes are y
everywhere 0, and in part (c), the slopes
Rotation vector field
are - x>y.
F ⫽ 具⫺y, x典
Channel flow y 2
F ⫽ 具1 ⫺ y2, 0典
1
⫺2 2 x
⫺2
⫺1 1 x
⫺1
i
eθ
er
This is worthwhile to note that unlike i and j in rectangular coordinates, the pair er and eθ
depends on the base point! To express er and eθ in terms of i and j, one can do the following:
4.1 Vector Fields on R2 and R3 77
Take eθ as an example. It is the unit tangent vector along the θ-coordinate curve (with r
fixed). Such a curve can be parametrized by:
r0 (θ )
eθ = = (− sin θ )i + (cos θ )j.
|r0 (θ )|
In essence, one can derive eθ by:
1. differentiate the position vector xi + yj with respect to θ;
2. then divide the resulting vector by its magnitude to get a unit vector.
Let’s perform these procedures to derive the unit vector er :
∂ ∂(r cos θ ) ∂(r sin θ )
( xi + yj) = i+ j
∂r ∂r ∂r
= (cos θ )i + (sin θ )j.
Since it is a unit vector already, we conclude:
where Fr and Fθ are scalar-valued functions, which represent the radial and rotational compo-
nents of F respectively.
It is sometimes more convenient to use polar coordinates to express a radial or a rotational
vector field. For instance, if Fθ = 0, then F is a radial vector field, i.e.
A general vector field F = Fr er + Fθ eθ can be rewritten in terms of i and j using the relations
er = (cos θ )i + (sin θ )j and eθ = (− sin θ )i + (cos θ )j:
We next derive some conversion rules of vector field from rectangular to polar coordinates.
First we need to write i and j in terms of er and eθ . While it can be worked out by solving the
equations er = (cos θ )i + (sin θ )j and eθ = (− sin θ )i + (cos θ )j for i and j, we adopt a more
elegant approach.
Let i = Aer + Beθ where A and B are to be determined. Observing that er and eθ are
orthogonal and they are unit vectors, we consider:
i · er = ( Aer + Beθ ) · er
= Aer · er + Ber · eθ
= A + 0.
With these conversion rules for basis vectors, one can derive conversion rules for components
of a vector field:
Fx i + Fy j = Fr er + Fθ eθ
Fx ((cos θ )er − (sin θ )eθ ) + Fy ((sin θ )er + (cos θ )eθ ) = Fr er + Fθ eθ
( Fx cos θ + Fy sin θ )er + (− Fx sin θ + Fy cos θ )eθ = Fr er + Fθ eθ .
Fr = Fx cos θ + Fy sin θ
Fθ = − Fx sin θ + Fy cos θ
In polar coordinates, the vector field can be expressed as F = rer , which is a radial vector field.
To summarize all conversion rules derived so far, we have:
Polar and Rectangular Conversions
Conversion rules of coordinates:
q
x = r cos θ r= x 2 + y2
y
y = r sin θ θ = tan−1
x
Conversion rules of basis vectors:
Fr x Fθ y
Fx = p −p Fr = Fx cos θ + Fy sin θ
x2 + y2 x 2 + y2
Fr y Fθ x
Fy = p +p Fθ = − Fx sin θ + Fy cos θ
x 2 + y2 x 2 + y2
Generally, we do not usually express a vector field in polar coordinates unless it is radial or
rotational. Take F = (1 − y2 )i as an example, which we have Fx = 1 − y2 and Fy = 0. In polar
coordinates, the components are given by:
Therefore, F = {(1 − r2 sin2 θ ) cos θ }er − {(1 − r2 sin2 θ ) cos θ }eθ , which is way more compli-
cated than it was in rectangular coordinates.
Fr x Fθ y
Fx = p −p Fr = Fx cos θ + Fy sin θ
x 2 + y2 x 2 + y2
Fr y Fθ x
Fy = p +p Fθ = − Fx sin θ + Fy cos θ
x 2 + y2 x 2 + y2
Fz = Fz Fz = Fz
Similarly, differentiate r with respect to θ or ϕ, and divide the resulting vector by its length
to get a unit vector, one can get the expression of eθ and e ϕ as:
eθ = −(sin θ )i + (cos θ )j
e ϕ = (cos ϕ cos θ )i + (cos ϕ sin θ )j − (sin ϕ)k
eρ · eθ = 0, eρ · e ϕ = 0, eθ · e ϕ = 0.
GMm
Fρ = − , Fθ = 0, Fϕ = 0.
ρ2
Therefore,
GMm GMm x GMm
Fx = − (sin ϕ cos θ ) = − 2 · = − 3 x
ρ2 ρ ρ ρ
GMm GMm y GMm
Fy = − 2 (sin ϕ sin θ ) = − 2 · = − 3 y
ρ ρ ρ ρ
GMm GMm z GMm
Fz = − 2 (cos ϕ) = − 2 · = − 3 z,
ρ ρ ρ ρ
and so:
GMm xi + yj + zk
F = Fx i + Fy j + Fz k = − ( xi + yj + zk) = − GMm 2 ,
ρ3 ( x + y2 + z2 )3/2
Work = F · ∆r.
However, if the force is not a constant, meaning that either the direction or the magnitude
is not uniform, the work done by the force is not as simple as stated above. Likewise, if the
path is not a straight-path but a curved one so that ∆r is changing over time, the work done
by the force is again a bit more complicated. Line integrals of vector fields are introduced to
handle these more complicated scenarios.
Definition 4.1 — Line Integrals of Vector Fields. Given a vector field F( x, y, z) and a path C
which is parametrized by r(t), a ≤ t ≤ b, the line integral of F over C is defined to be:
ˆ b
F · r0 (t) dt.
a
Notation In the spirit of dr = r0 (t) dt, we may denote the above line integral in a concise
way as: ˆ
F · dr
C
where C is the given path.
Let’s first look at some computational examples before we explain the physical and geomet-
ric meanings of line integrals.
Example 4.1 Let F( x, y) = −yi + xj and C be´ the counter-clockwise path along the circular
arc from (1, 0) to (0, 1). Find the line integral C F · dr.
Solution First we parametrize C. A quick sketch of the path should tell us that C can be
parametrize by:
π
r(t) = (cos t)i + (sin t)j, 0 ≤ t ≤ .
2
By the definition of line integrals,
ˆ ˆ t=π/2
F · dr = F · r0 (t) dt
C t =0
ˆ π/2
= (−yi + xj) · ((− sin t)i + (cos t)j) dt
0
ˆ π/2
= (y sin t + x cos t) dt.
0
Along the path C, we have x = cos t and y = sin t according to the parametrization, so we
have:
ˆ ˆ π/2
F · dr = sin2 t + cos2 t dt
C 0
ˆ π/2
π
= 1 dt = .
0 2
84 Vector Calculus
A line integral in three dimension can be computed in an exactly the same way as in two
dimension. Let’s see one example:
Solution We are given the parametrization in the problem so we can proceed to the
computation of the line integral:
ˆ ˆ 1 D E
F · dr = y − x2 , z − y2 , x − z2 · r0 (t) dt
C 0
ˆ 1 D E D E
= y − x2 , z − y2 , x − z2 · 1, 2t, 3t2 dt
0
ˆ 1
= y − x2 + 2t(z − y2 ) + 3t2 ( x − z2 ) dt.
0
a
F · r0 (t) dt ' ∑ F · r0 (ti )∆ti
i
Page 4
where we divide the time interval [ a, b] into small subdivisions:
Assume each subdivision is very small that F and r0 are roughly constants in each subdivi-
sion. By definition of derivatives, we have:
∆r(ti )
r 0 ( ti ) ' .
∆ti
4.2 Line Integrals of Vector Fields 85
Therefore,
ˆ b
a
F · r0 (t) dt ' ∑ F · ∆r(ti )
i
Since F · ∆r(ti ) is the work done by F with displacement ∆r(ti ) (as they are roughly
constants), summing them up gives the approximated total work done by the force over the
whole path C. As n → ∞, the subdivisions become infinitesimal and ∑i F · ∆r(ti ) becomes
more accurate and approaches to the total work done by the force.
The value of the integral is determined by many factors, including the length of C, the
magnitude of F and the velocity of r(t). However, the sign of F · r0 (t) is solely determined by
the angle θ between F and the tangent vector r0 (t) of the curve C, since:
F · r0 (t) = |F| r0 (t) cos θ.
It is positive when F and r0 (t) make an acute angle, and is negative when they make an obtuse
angle. Therefore, the sign of the line integral can roughly reveal whether the path C is along or
against the direction of the vector field F. The more positive is the value, the more often the
path is traveling along the vector field.
This geometric meaning is very vague when compared to the physical meaning, but is a
good one for us to develop intuition towards line integrals.
where C is the path from (0, 1) down to (0, 0) along the y-axis, then to (1, 0) along the x-axis.
Solution Note that the path C has two segments. Break C into two segments C1 and C2
where C1 is the path from (0, 1) to (0, 0) along the y-axis, and C2 is the path from (0, 0) to
(1, 0) along the x-axis.
(0, 1)
C1
(1, 0)
(0, 0) C2
Along C1 , we have x = 0, and so dx = 0 too. From this we can immediately tell that
ˆ ˆ
−y dx + x dy = −y d(0) + 0 dy = 0.
C1 C1
∂f ∂f ∂f
∇f = i+ j+ k.
∂x ∂y ∂z
The most preliminary method to determine whether a given vector field is conservative is
to solve for the scalar potential f , as illustrated by the following example:
Determine whether or not F is a conservative vector field. If so, find its potential function f
such that F = ∇ f .
∂f
1 = 2x + y
∂x
∂f
2 = x + z3
∂y
∂f
3 = 3yz2 + 1
∂z
From 1 , one can find f ( x, y, z) by integrating 2x + y by x, regarding y to be a constant. In
single variable calculus, an integration constant will be added after the integration. However,
we are now considering partial derivatives, and not only constants but also y or z will vanish
after differentiating by x! In other words, the “integration constant” is no longer a constant
but instead a function of y and z. Precisely, we have:
4 f ( x, y, z) = x2 + yx + g(y, z)
where g(y, z) is some function of y and z. We will figure out g(y, z) in the remaining steps.
By differentiating both sides of 4 with respect to y, we get:
∂f ∂g
= x+ .
∂y ∂y
∂g
= z3 .
∂y
An integration by y yields:
g(y, z) = yz3 + h(z).
4.3 Conservative Vector Fields 89
Note that from similar principle discussed above for f , the integration “constant” is no
longer just a constant but a function not depending on y and hence a function of z only.
By differentiating both sides with respect to z, we get:
∂g
= 3yz2 + h0 (z).
∂z
Finally, by comparing this result with 3 , one must have h0 (z) = 1, and so clearly h(z) = z + C,
where C is geniunely a constant this time!
Combining all results above, we have
f ( x, y, z) = x2 + yx + yz3 + z + C
It is important to keep in mind that not all vector fields are conservative! Here is one which
such an f does not exist:
Example 4.5 Let F( x, y) = −yi + xj. Determine whether or not F is a conservative vector
field. If so, find its potential function f such that F = ∇ f .
∂f
1 = −y
∂x
∂f
2 =x
∂y
Solving 1 for f by integration, we get f ( x, y) = − xy + g(y) for some function g(y). However,
by differentiating this result with respect to y, we get:
∂f
= − x + g 0 ( y ).
∂y
One important feature of conservative vector fields is the path independence of line integral,
meaning that the line integral depends only on the end-points of the path. Precisely, we have:
Theorem 4.1 Given a conservative vector field F = ∇ f , where f is a potential function, then
along any path C connecting from point P0 ( x0 , y0 , z0 ) to point P1 ( x1 , y1 , z1 ), then the line
integral is given by: ˆ
F · dr = f ( x1 , y1 , z1 ) − f ( x0 , y0 , z0 ).
C
Proof. The proof is a consequence of the multivariable chain rule and the Fundamental Theorem
of Calculus. D E
∂f ∂f ∂f
Suppose F = ∇ f = ∂x , ∂y , ∂z and the path C is parametrized by:
Then,
ˆ ˆ
dr
F · dr = F·
dt
C dt
ˆC
dr
= ∇ f · dt
C dt
ˆ b
∂f ∂f ∂f dx dy dz
= , , · , , dt
a ∂x ∂y ∂z dt dt dt
ˆ b
∂ f dx ∂ f dy ∂ f dz
= + + dt
a ∂x dt ∂y dt ∂z dt
ˆ b
d
= f (r(t))dt (chain rule)
a dt
= f (r(b)) − f (r( a)) (Fundamental Theorem of Calculus)
The significance of this theorem is that the RHS depends only on the initial and final points
of the curve C, but not the intermediate path. In´other words,
´ if C1 and C2 are two paths with
the same initial and final points, then we have C F · dr = C2 F · dr. Moreover, if C is closed
1 ¸
path whose initial and final positions are the same, then C F · dr = 0.
Notation If C is a closed path, meaning that the two endpoints are the same, it is a
convention to use denote the line integral as:
˛
F · dr.
C
To summarize, we have:
Corollary 4.2 For a conservative vector field F, if C1 and C2 are two paths with the same
initial and final positions, then
ˆ ˆ
F · dr = F · dr.
C1 C2
Theorem 4.1 can be applied to find the line integral of a conservative vector field if path is
too complicated.
Example 4.6 Consider the vector field F( x, y, z) = (2x + y)i + ( x + z3 )j + (3yz2 + 1)k which
appeared in Example 4.4, and the path C given by the parametric equation
t
r(t) = (et sin2012 t)i + (cos2013 t)j + k, 0 ≤ t ≤ 2π.
π
´
Find the line integral C F · dr.
4.3 Conservative Vector Fields 91
Solution It is clear that direct computation of this line integral is extremely laborious (and
may be impossible). Fortunately, it was shown in Example 4.4 that F is a conservative vector
field with potential function
f ( x, y, z) = x2 + yx + yz3 + z + C.
The initial and final positions of the given path are respectively:
r(0) = h0, 1, 0i
r(2π ) = h0, 1, 2i.
Alternatively, one can also find this line integral using the path independence property
of conservative vector fields. Let L be the straight path connecting from (0, 1, 0) to (0, 1, 2)
which are the initial and final positions respectively. Since F is conservative, we must have:
ˆ ˆ
F · dr = F · dr.
C L
The latter is much easier to figure out: along the line L, we have
x ≡ 0, y ≡ 1, 0 ≤ z ≤ 2.
Therefore
ˆ ˆ
F · dr = (2x + y)dx + ( x + z3 )dy + (3yz2 + 1)dz
L L
ˆ z =2
= (0 + 1)d(0) + (0 + 23 )d(1) + (3 · 1 · z2 + 1)dz
z =0
ˆ 2
= (3z2 + 1)dz
0
2
= z3 + z = 10.
0
´ ´
By path independence, C F · dr = L F · dr = 10.
1
KE = m|r0 (t)|2 .
2
It is a nice exercise to show that the total energy is conserved:
92 Vector Calculus
Exercise 4.1 Let F = −∇V and r(t) = h x (t), y(t), z(t)i, show that the total energy:
1
E(t) := m|r0 (t)|2 + V (r(t))
2
is a constant. [Hint: multivariable chain rule and Newton’s Second Law F = mr00 (t).]
GMm
F=− eρ .
ρ2
To determine whether this kind of radially symmetric vector field is conservative and find
its potential function, it is more efficient to use the gradient ∇ f in spherical coordinates form:
∂f 1 ∂f 1 ∂f
∇f = eρ + eϕ + e .
∂ρ ρ ∂ϕ ρ sin ϕ ∂θ θ
The derivation of the above conversion is straight-forward but quite tedious. One can first
write
∂f ∂f ∂f
∇f = i+ j+ k,
∂x ∂y ∂z
then convert each of i, j and k into eρ , eθ and e ϕ . Secondly, use the chain rule to express each
∂f ∂f ∂f ∂f ∂f ∂f
of ∂x , ∂y
and ∂z in terms of ∂ρ , ∂θ and ∂ϕ .
We omit this tedious derivation here (readers may verify this as an exercise if you wish).
Here we are going to use the spherical form of ∇ to find a potential function for F.
Firstly, set F = ∇ f and try to solve for f :
GMm ∂f 1 ∂f 1 ∂f
− eρ = ∇ f = eρ + eϕ + e .
ρ2 ∂ρ ρ ∂ϕ ρ sin ϕ ∂θ θ
GMm ∂f
− =
ρ2 ∂ρ
∂f
0=
∂ϕ
∂f
0=
∂θ
The first equation gives
GMm
f (ρ, θ, ϕ) = + g(θ, ϕ)
ρ
where g(θ, ϕ) is an arbitrary function of θ and ϕ. However, the second and the third equations
demand that
∂g ∂g
= = 0.
∂θ ∂ϕ
4.3 Conservative Vector Fields 93
GMm
V=− .
ρ
Using this expression, the vector field F = reθ (which is equal to −yi + xj in rectangular
coordinates) can be seen to be not conservative. Set F = ∇ f , then:
∂f 1 ∂f ∂f
reθ = er + e + k,
∂r r ∂θ θ ∂z
and so:
∂f
0=
∂r
1 ∂f
r=
r ∂θ
∂f
0=
∂z
The second equation shows f (r, θ, z) = r2 θ + g(r, z) for some arbitrary function g. Then, we
∂f ∂g ∂g
will have ∂z = ∂z , and so the third equation demands that ∂z = 0. Therefore, g depends only
on r, and we can simply write g = g(r ). Finally, we have f (r, θ, z) = r2 θ + g(r ). Differentiation
∂f
with respect to r gives ∂r = 2rθ + g0 (r ). Then the first equation demands that:
g 0 (r )
0 = 2rθ + g0 (r ), or equivalently: = −θ.
2r
However, the LHS would be a function of r but the RHS is a function of θ. This cannot be true.
This shows the vector field F is not conservative.
F = Fx i + Fy j + Fz k,
the curl of the vector field F, denoted either by curl(F) or ∇ × F, is defined as:
∂ ∂ ∂
∇×F = i + j + k × Fx i + Fy j + Fz k
∂x ∂y ∂z
i j k
∂ ∂ ∂
= ∂x ∂y ∂z
Fx Fy Fz
∂Fz ∂Fy ∂Fx ∂Fz ∂Fy ∂Fx
= − i+ − j+ − k.
∂y ∂z ∂z ∂x ∂x ∂y
94 Vector Calculus
i The symbol ∇ := ∂ ∂ ∂
∂x i + ∂y j + ∂z k should be considered as an operator rather than a vector.
∂
It carries no physical or geometric meaning. “Multiplying” ∂x by a function P gives the
partial derivative ∂P
∂x as the product.
F = Fx i + Fy j + 0k.
i ∇ × F is called the curl of F because it measures how circular the vector field F is, as a
consequence of the Green’s and Stokes’ Theorems which we will learn very soon.
Solution
i j k
∂ ∂ ∂
∇×F = ∂x ∂y ∂z
2x + y x + z3 2
3yz + 1
∂ 2 ∂ 3 ∂ ∂ 2
= (3yz + 1) − ( x + z ) i + (2x + y) − (3yz + 1) j
∂y ∂z ∂z ∂x
∂ ∂
+ ( x + z3 ) − (2x + y) k
∂x ∂y
= (3z2 − 3z2 )i + (0 − 0)j + (1 − 1)k = 0.
The curl test is a very straight-forward test to check whether a given vector field F is conserva-
tive, without solving for the potential function f .
Theorem 4.3 — Curl Test. Given a vector field F is defined and continuously differentiable
everywhere in R3 (or everywhere in R2 for vector fields in R2 ), then F is conservative if and
only if ∇ × F = 0.
Proof. The theorem has two directions (if and only if). One direction is easy, while the other
direction is more technical. We only present the (⇒)-part here
The (⇒)-part of the proof is a consequence of the Mixed Partial Theorem. Suppose F = ∇ f ,
∂f ∂f ∂f
then F = ∂x i + ∂y j + ∂z k, and so
i j k
∂ ∂ ∂
∇×F = ∂x ∂y ∂z
∂f ∂f ∂f
∂x ∂y ∂z
∂2 f ∂2 f ∂2 f ∂2 f ∂2 f ∂2 f
= − i+ − j+ − k
∂y∂z ∂z∂y ∂z∂x ∂x∂z ∂x∂y ∂y∂x
= 0i + 0j + 0k = 0.
4.3 Conservative Vector Fields 95
y x
F( x, y) = − i+ 2 j.
x 2 + y2 x + y2
This vector field is not defined when ( x, y) 6= (0, 0). It can be easily verified (left as an exercise)
that ∇ × F = 0 for any ( x, y) 6= (0, 0). However, when integrating along the unit circle
C : r(t) = (cos t)i + (sin t)j where 0 ≤ t ≤ 2π:
˛ ˛
−ydx + xdy
F · dr =
C C x 2 + y2
ˆ 2π
− sin td(cos t) + cos td(sin t)
=
0 sin2 t + cos2 t
ˆ 2π 2
(sin t + cos2 t)dt
=
0 sin2 + cos2 t
ˆ 2π
= 1dt = 2π 6= 0.
0
C is a closed path, so F cannot be conservative even thought the curl of F is zero! The curl test
fails here because the vector field is not defined everywhere.
You have seen that if a two-dimensional vector field F is not defined at (0, 0), one cannot
always use the curl test to conclude whether it is conservative. However, one can relax the
everywhere-defined condition a little bit to allow us to use the curl test on vector fields which
can be undefined at some points. First we introduce:
Definition 4.4 — Simply Connected Regions. A region Ω is simply connected if every closed
loop in Ω can be contracted to a point continuously without leaving the region Ω.
contain its boundary points.
An open region R in ⺢2 (or D in ⺢3) is connected
➤ Roughly speaking, connected means points of R by a continuous curve lying in R. An
that R is all in one piece and simply if every closed simple curve in R can be deformed
connected in ⺢2 means that R has (Figure 15.28).
no holes. ⺢2 and ⺢3 are themselves
96 connected and simply connected. Vector Calculus
QUICK CHECK 1 Is a figure-8 curve simple? Closed
This curve cannot be contracted connected?
➤
to a point and remain in R.
eρ ρe ϕ (ρ sin ϕ)eθ
1 ∂ ∂ ∂
∇×F = .
ρ2 sin ϕ ∂ρ ∂ϕ ∂θ
Fρ ρFϕ (ρ sin ϕ) Fθ
Although the derivation of the above formula is extremely tedious, one of its neat con-
sequence is that if F is radially symmetric, i.e. Fϕ = Fθ = 0 and Fρ depends only on ρ,
then:
eρ ρe ϕ (ρ sin ϕ)eθ
1 ∂ ∂ ∂
∇×F = 2 = 0.
ρ sin ϕ ∂ρ ∂ϕ ∂θ
Fρ 0 0
For instance, the gravitational force field is radially symmetric, so its curl is a zero vector.
Although the spherical coordinates are undefined at the origin, it is not an issue when applying
the curl test using spherical coordinates since R3 with the origin removed is a simply connected
region. Therefore, any radially symmetric vector field R3 is conservative.
There is an analogous formula for cylindrical coordinates which is given by the following.
Again, we omit its derivation since it is also tedious to do so.
er reθ k
1
∇×F = ∂ ∂ ∂ .
r ∂r ∂θ ∂z
Fr rFθ Fz
Similarly, it can also be checked that for vector fields with Fθ = Fz = 0 and Fr depends only on
r, then ∇ × F = 0. However, cylindrical coordinates cannot be defined on the z-axis. Since R3
with the z-axis removed is not simply connected, one needs to be very careful when applying
the curl test under cylindrical coordinates!
98 Vector Calculus
C
DEFINITION Simple and Closed Curves
C
Suppose a curve C 1in ⺢2 or ⺢32 is described para
a … t … b. Then C is a simple curve if r1t12 ⬆
a 6 t1 6 t2 6 b; that is, C never intersects itself
Closed, simple Not closed, simple curve C is closed if r1a2 = r1b2; that is, the initia
the same (Figure 15.27).
C
C
The proof of the Green’s Theorem is quite technical if the curve C is complicated. The proof
of the theorem in some special cases can be found in the textbook. Let’s Suppose
firstthat
looktheatcomponents
some of F = 8 f, g, h 9
examples. Not connected, Not connected, tives on a region D in ⺢ 3
. Also assume that F is co
simply connected not simply connected
tion that there is a potential function w such that F
FIGURE 15.28 F and ⵜw, we see that f = wx, g = wy, and h = wz
a function has continuous second partial derivative
➤ The term conservative refers to second partial derivatives does not matter. Under t
conservation of energy. See Exercise 52
the following:
for an example of conservation of energy
in a conservative force field. • wxy = wyx, which implies that fy = gx,
• wxz = wzx, which implies that fz = h x, and
➤ Depending on the context and the • w = w , which implies that g = h .
4.4 Green’s Theorem 99
Example 4.8 Use the Green’s Theorem to evaluate the line integral:
˛
−ydx + xdy
C
Solution The vector field corresponding to this line integral is F = −yi + xj, which is
defined and is smooth everywhere on R2 . The given path C encloses the rectangle R with
vertices (0, 0), (1, 0), (1, 1) and (0, 1).
(0, 1) (1, 1)
(0, 0) (1, 0)
∇ × F = 2k.
Example 4.9 Let F = −y2 i + xyj and C be a simple closed curve in R2 enclosing a region
R which is symmetric across the x-axis. Show that:
˛
F · dr = 0.
C
∇ × F = 3yk.
100 Vector Calculus
Since the region R is symmetric across the x-axis and 3y is an odd function. If we split R
into two: R+ (the part above the x-axis) and R− (the part below the x-axis), then
¨ ¨
3y dA and 3y dA
R+ R−
are numerically the same except that one of the negative of the other. Therefore, by
¨ ¨ ¨
3y dA = 3y dA + 3y dA,
R R+ R−
we get: ¨
3y dA = 0
R
since the R+ -integral gets canceled out by the R− -integral. Therefore,
˛
F · dr = 0.
C
To summarize, applying Green’s Theorem is very strict forward. It often simplifies the
computation of a line integral over a closed loop since there is no need to parametrize the
curve. If the curve encloses a rectangular region, the Green’s Theorem is an effective tool for
evaluating the line integral because a double integral over a rectangle is easy to set up.
In other words,
˛
1
(∇ × F) · k ' F · dr.
area of R C
¸
Since C F · dr indicates how circular the vector field F is around the curve C, the above
result tells us that the quantity (∇ × F) · k at a point measures the density of circulation around
that point. That’s why we call ∇ × F to be curl of F because it is an indicator of curliness of a
vector field!
As an example, you can verify that the curls of the vector fields F = −yi + xj and G =
xi + yj are respectively given by:
4.4 Green’s Theorem 101
(∇ × F) · k = 2
(∇ × G) · k = 0
It suggests that F is more circular than G. By plotting them in Mathematica, we can see that
vectors in F are circling around the origin, while vectors in G are diverging from the origin.
It is not defined at the origin (0, 0). If we let C be the (counter-clockwise) unit circle centered
at the origin, then C can be parametrized by:
However, one can check from direct computations that ∇ × F = 0 at every point except the
origin, and is undefined at the origin. Therefore, the double integral
¨
(∇ × F) · k dA
R | {z }
undefined at (0, 0)
is an improper integral! To summarize, one cannot directly apply the Green’s Theorem if the
given curve encloses a point at which the vector field is not defined. In such cases, we should
either compute the line integral directly by parametrization, or use some other tools to compute
it. We will compute this integral by so-called the hole-drilling technique to be presented in the
next subsection.
If the curve does not enclose the origin (at which F is undefined), then Green’s Theorem
applies to F and such a curve without any issues. For instance, if C is a circle centered at
(3, 0) with radius 2, then it does not enclose the origin which is the only point at which F is
undefined. The Green’s Theorem does show that
˛ ¨ ¨
F · dr = (∇ × F) · k dA = 0 dA = 0.
C R | {z } R
=0 at every point in R
is a famous one which concerns about the winding number of a closed curve. As discussed
before, it is not defined at the origin, and ∇ × F = 0 at every point except the origin. If C is a
closed curve (not necessarily simple closed) in R2 not passing through the origin, there is a
celebrated result in topology that says:
˛
F · dr = 2π × number of times C travels around the origin counter-clockwisely.
C
102 Vector Calculus
1
¸ −y x
Consequently, the quantity 2π C r2 dx + r2 dy is often called the winding number of the curve C.
This number is not only important in pure mathematics but also in physics. Furthermore, the
computation of surface flux of a vector field satisfying the inverse square law, as we will see
later, are also in the same spirit as the computation of the winding number.
Our goal here is to use the Green’s Theorem to explain why this line integral gives the
winding number for some not-too-complicated curves.
Circles Centered at Origin
We start with the simplest case where the curve is a circle with radius ε centered at the origin,
which from now on denoted by Γε . The path is parametrized by:
Γε : r(t) = (ε cos t)i + (ε sin t)j, 0 ≤ t ≤ 2π.
Then, by straight-forward calculations, we get:
˛ ˆ t=2π
y x ε sin t ε cos t
− 2 2
dx + 2 + y2
dy = − 2 d(ε cos t) + d(ε sin t)
Γε x + y x t =0 ε ε2
ˆ t=2π
= sin2 t + cos2 t dt
t =0
ˆ t=2π
= 1 dt = 2π.
t =0
Similarly, R2 does not enclose the origin so the Green’s Theorem can be applied to R2 :
ˆ ˆ ˆ ˆ ¨
F · dr − F · dr − F · dr − F · dr = (∇ × F) ·k dA = 0. (4.2)
C2 L1 Γ2 L2 R2 | {z }
| ¸
{z } =0
boundary of R2 F·dr
By summing up (4.1) and (4.2) and canceling out the line integrals over L1 and L2 , we get:
ˆ ˆ ˆ ˆ
F · dr + F · dr − F · dr − F · dr = 0.
C1 C2 Γ1 Γ2
Self-Intersecting Curves
Next, let’s use the above result to deal with some intersecting curves. Consider the curve C in
Figure 4.7. Page 2
Since we have already know how to deal with any simple closed curve enclosing the origin,
we are going to split the curve C into simple closed curves C1 , C2 and C3 according to Figure
4.7.
104 Vector Calculus
As both C1 and C2 are simple closed curves enclosing the origin, by our previous discussion
we know: ˛ ˛
F · dr = F · dr = 2π.
C1 C2
For C3 , it is a simple closed curve not enclosing the origin. Therefore, the Green’s Theorem
can be applied on C3 without any issues. Hence, we have
˛ ¨
F · dr = (∇ × F) · k dA = 0
−C3 R
where R is the region enclosed by C3 . Here we have again used the fact that ∇ × F = 0 inside
the region R.
According the directions indicated on the diagram, we have:
˛ ˛ ˛ ˛
F · dr = F · dr + F · dr + F · dr = 2π + 2π + 0 = 4π.
C C1 C2 C3
Therefore, the winding number of this curve C is equal to 2. Similar technique can be
applied to more complicated curves which go around the origin a lot of times.
4.5 Parametric Surfaces 105
d S
R
a b u
y
c
x
r(u, v) ⫽ 具a system,
Solution If, under a certain coordinate u, v典 surface hasz one of the coordinates being
cos u, a sin the
constant, then
v giving a parametrization to that surface is fairly easy: simply take the other
a
two coordinates as parameters, and define r according to the conversion rules between this
coordinate system and the rectangular coordinate system.
h
The cylinder described in the problem can be presented by equation
S r = r0 under
R (r, θ, z ). The conversion rule between cylindricalh and rectangular
cylindrical coordinates
x = r cos θ
y = r sin θ
z=z
One can also specify the range of z so that the parametrization gives a finite cylinder. For
instance,
r(θ, z) = (r0 cos θ )i + (r0 sin θ )j + zk, 0 ≤ θ ≤ 2π, 0 ≤ z ≤ 1
gives the finite cylinder with height 1 unit (from z = 0 to z = 1).
Similarly, a cone making π4 angle with the z-axis can be represented by z = r in cylindrical
coordinates, or ϕ = π4 in spherical coordinates. Therefore, it is can be parametrized by two
different ways:
r1 (r, θ ) = (r cos θ )i + (r sin θ )j + rk, 0 ≤ r < ∞, 0 ≤ θ ≤ 2π
π π π
r2 (ρ, θ ) = ρ sin cos θ i + ρ cos sin θ j + ρ cos k
4 4 4
ρ cos θ ρ sin θ ρ
= √ i+ √ j + √ k, 0 ≤ ρ < ∞, 0 ≤ θ ≤ 2π
2 2 2
A sphere with radius 3 can be presented by ρ = 3 in spherical coordinates, and so it can be
parametrized by:
r3 (θ, ϕ) = (3 sin ϕ cos θ )i + (3 sin ϕ sin θ )j + (3 sin ϕ)k, 0 ≤ θ ≤ 2π, 0 ≤ ϕ ≤ π.
Parametrization of Graphs
If a surface can be represented by a Cartesian equation (i.e. level-set form) such as x2 + y − z3 =
1 and that you can write one of the variables as a function of the other two variables (such
as y = z3 − x2 + 1 in this example), then we can use the other two variables as parameters. For
instance, the surface y = z3 − x2 + 1 can be parametrized as:
r( x, z) = xi + (z3 − x2 + 1) j + zk.
| {z }
y
4.5 Parametric Surfaces 107
However,
p x cannot be written as a function of y and z for the surface y = z3 − x2 + 1, since
x = ± z3 − y + 1 and the ± makes it not a function of y and z. On the other hand, z can be
written as a function of x and y, since z3 = x2 + y − 1 and there is exactly one cubic root for
x2 + y − 1. However, the resulting parametrization
1/3
r( x, y) = xi + yj + x2 + y − 1 k
The Möbius strip, a famous object in topology, has the following parametrization
v u v u v u
r(u, v) = 1 + cos cos u i + 1 + cos sin u j + sin k
2 2 2 2 2 2
where 0 ≤ u ≤ 2π and −1 ≤ v ≤ 1.
Since r denotes the position vector xi + yj + zk, we can also write down a parametrization
in an equation form (especially when the expression of the parametrization is too long to fit in
one line). For instance, the Möbius strip parametrization can be equivalently written as:
v u
x = 1 + cos cos u
2 2
v u
y = 1 + cos sin u
2 2
v u
z = sin
2 2
108 Vector Calculus
where 0 ≤ u ≤ 2π and −1 ≤ v ≤ 1.
The following parametrization, which looks intimidating, describes a very beautiful surface
called the Klein bottle (see Figure 4.10):
2
x = − cos u 3 cos v − 30 sin u + 90 cos4 u sin u − 60 cos6 u sin u + 5 cos u sin u cos v
15
1
y = − sin u(3 cos v − 3 cos2 u cos v − 48 cos4 u cos v + 48 cos6 u cos v
15
− 60 sin u + 5 cos u sin u cos v − 5 cos3 u cos v sin u
− 80 cos5 u sin u cos v + 80 cos7 u sin u cos v)
2
z= (3 + 5 cos u sin u) sin v
15
for 0 ≤ u ≤ π and 0 ≤ v ≤ 2π.
is a special type of surface integral where the region of integration R is on the flat xy-plane. A
surface integral is one that the region of integration can be a curved surface. Many geometric
and physical quantities, such as surface area, surface flux, and moment of inertia for some
shell objects, can be computed using surface integrals. The Stokes’ Theorem to be introduced
in the next section also involves surface integrals.
We first state the definition of surface integrals, compute some examples and then explain
its geometric and physical meaning.
Definition 4.6 — Surface Integrals. Given a surface S parametrized by r(u, v) with a ≤ u ≤ b
and c ≤ v ≤ d, and a continuous, scaled-valued function f ( x, y, z), the surface integral of f
over the surface S is denoted and defined to be:
¨ ˆ v=d ˆ u=b
∂r ∂r
f dS = f (r(u, v)) × dudv.
S v=c u= a ∂u ∂v
i The line integral of a vector field over a curve C is independent of how we parametrize C.
It is also true that the surface integral over a surface S is also independent of the surface
parametrization r(u, v). The proof involves change of variables technique in multivariable
calculus and is omitted here.
S
Examples of closed surfaces include spheres and torus, while a hemisphere (only the
spherical part, the flat part is not included) is not closed since it has a circle as its boundary.
110 Vector Calculus
Example 4.11 Let S be the sphere of radius a centered at the origin. Evaluate the surface
integral ‹
x 2 + y2 dS
S
Solution We first parametrize the surface. Using spherical coordinates, the sphere is
presented by ρ = a. Take (θ, ϕ) as parameters, then a parametrization of the sphere is given
by:
r(θ, ϕ) = ( a sin ϕ cos θ )i + ( a sin ϕ sin θ )j + ( a cos ϕ)k
where 0 ≤ θ ≤ 2π and 0 ≤ ϕ ≤ π.
According to the definition of surface integrals, we need to compute ∂r
∂θ × ∂r
∂ϕ . It is
straight-forward, although quite tedious:
∂r
= (− a sin ϕ sin θ )i + a sin ϕ cos θ )j + 0k
∂θ
∂r
= ( a cos ϕ cos θ )i + ( a cos ϕ sin θ )j + (− a sin ϕ)k
∂ϕ
i j k
∂r ∂r
× = − a sin ϕ sin θ a sin ϕ cos θ 0
∂θ ∂ϕ
a cos ϕ cos θ a cos ϕ sin θ − a sin ϕ
= (− a2 sin2 ϕ cos θ )i + (− a2 sin2 ϕ sin θ )j + (− a2 sin ϕ cos ϕ)k
q
∂r ∂r
× = a4 sin4 ϕ(cos2 θ + sin2 θ ) + a4 sin2 ϕ cos2 ϕ
∂θ ∂ϕ
q
= a4 sin2 ϕ(sin2 ϕ + cos2 ϕ)
= a2 sin ϕ.
ˆ ϕ=π ˆ θ =2π
= a4 sin3 ϕ dθdϕ
ϕ =0 θ =0
ˆ ϕ=π
= 2πa4 sin3 ϕ dϕ
ϕ =0
4 8πa4
= 2πa4 · = .
3 3
4.5 Parametric Surfaces 111
Example 4.12 Let S be the plane 3x + 2y + z = 1 defined over the region 0 ≤ x ≤ 1 and
0 ≤ y ≤ 1. Compute the surface integral:
¨
( x + y + z) dS.
S
Solution The equation of the given plane can be written as a graph z = 1 − 3x − 2y over
the given region 0 ≤ x ≤ 1 and 0 ≤ y ≤ 1. We can take x and y as parameters and so a
parametrization of the plane is given by:
r( x, y) = xi + yj + (1 − 3x − 2y)k
where 0 ≤ x ≤ 1 and 0 ≤ y ≤ 1.
Next we compute all the ingredients of the surface integral:
∂r
= i − 3k
∂x
∂r
= j − 2k
∂y
∂r ∂r
× = 3i + 2j + k
∂x ∂y
∂r ∂r p √
× = 32 + 22 + 12 = 14
∂x ∂y
x + y + z = x + y + (1 − 3x − 2y) = 1 − 2x − y
Surface Element
Now we explain the geometric and physical meaning of surface integrals. Given a para-
metric surface S with parametrization r(u, v), a ≤ u ≤ b and c ≤ v ≤ d, from now on we
denote:
112 Vector Calculus
∂r
Notation dS = ∂u × ∂v
∂r
dudv
It is a called the surface element of the integral. If we subdivide the domain in the uv-plane
into small rectangular pieces with area ∆u ∆v, the parametrization r(u, v) transform them into
small pieces ∆S on the surface (see Figure 4.11). Page 3
If the number of subdivisions is very large so that each ∆S is very small, then ∆S is
approximately a parallelogram. The red side of the parallelogram has length approximately
∂r
equal to ∂u ∆u (to see this, regard u is the time then the distance traveled after ∆u unit time
is approximately the speed × time. Similarly, the blue side of the parallelogram has length
approximately equal to ∂r
∂v ∆v. Suppose the angle of the parallelogram is θ, then the area of
∆S is approximately:
∂r ∂r ∂r ∂r ∂r ∂r
∆u · ∆v · sin θ = sin θ · ∆u ∆v = × ∆u ∆v.
∂u ∂v ∂u ∂v ∂u ∂v
| {z }
| ∂u
∂r
|| ∂v |
∂r sin θ
Infinitesimally, the ∆S becomes the surface element dS, and ∆u ∆v becomes dudv. In other
words, the surface element dS presents the area of a very tiny piece of subdivision on the
surface. Summing up the area of all tiny subdivisions, we get the surface area of S, i.e.
¨
dS = surface area of S.
S
Different geometric or physical meanings of the integrand f give different meanings to the
surface integral of f over S. For instance, ¨
If f is each element f dS means f dS means
S
1 area of dS total surface area
surface density mass of dS total mass of the surface
density × ( x2 + y2 ) moment of inertia of dS moment of inertia of S
Let S be the sphere of radius a centered at the origin. We have computed in Example 4.11
that ‹ 8πa4
x2 + y2 dS = .
S 3
Suppose this spherical shell has a uniform surface density δ, then the integral:
‹
δ x2 + y2 dS
S
4.5 Parametric Surfaces 113
is the moment of inertia Iz about z-axis. Since its formula in many physics textbooks is written
in terms of the total mass m of the sphere, let’s rewrite it in terms of m.
‹
Iz = δ x2 + y2 dS
S
8πa4
=δ
3
m 8πa4
= 2
·
4πa 3
2 2
= ma .
3
Figure 4.12: surface flux for uniform vector field through a plane
The force F can be decomposed into two components, one perpendicular to the plane,
another parallel to the plane. As the flux counts only the amount of vectors through the plane,
only those perpendicular to the plane should be counted. Suppose the force F makes an angle
Page 5
θ with the unit normal vector n̂, then the perpendicular component of the force has length
F · n̂ = |F| cos θ.
Furthermore, there are vectors at every point on the plane. The larger the plane, the more
vectors passing through it. Therefore, the flux of F through the plane should be defined as:
Now consider a curved surface (so that n̂ is no longer constant) and F is no longer a constant
vector field. Infinitesimally, each area element can be regarded as a tiny flat parallelogram, and
both F and n̂ can be regarded as constant vectors over each of tiny parallelogram. Recall that
dS is the area of this element. Therefore, the flux of F through this tiny bit of surface is given
by:
F · n̂ dS
Summing up, the total flux of F through the entire surface is given by a surface integral as
stated in the definition below:
Definition 4.7 — Surface Flux. Given a vector field F and a surface S, the surface flux of F
through S is defined to be the surface integral:
¨
F · n̂ dS
S
i Note that there are often two choice of n̂, so the sign of the surface flux depends on which
direction of n̂ is chosen. For a closed surface such as a sphere, it is a convention to choose
the outward unit normal.
i There are some surfaces which you cannot “choose” a unit normal convention. For
instance, if you pick a normal vector on the Möbius strip and let it vary continuously over
the strip, the normal vector may end up pointing at opposite direction when it returns to
the original position! We call these non-orientable surfaces. We do not define surface flux
on non-orientable surfaces.
Generally speaking, we need to parametrize the surface in order to compute the flux.
However, there are some exceptions that can be done without parametrization.
Example 4.13 Consider the gravitational vector field given in spherical coordinates by:
GMm
F=− eρ .
ρ2
Compute the outward flux of F through the sphere S with radius ρ0 centered at the origin, i.e.
‹
F · n̂ dS.
S
Solution Outward flux means the unit normal vector n̂ is chosen to be the outward normal.
For a round sphere centered at the origin, clearly we have n̂ = eρ .
Therefore,
GMm
F · n̂ = − eρ · eρ
ρ2
GMm 2
= − 2 eρ
ρ
GMm
=− 2 .
ρ
4.5 Parametric Surfaces 115
i Using the Divergence Theorem, we will later see that the above flux is always −4πGMm
for any closed orientable surface S provided that it encloses the origin. This fact is
commonly called the Gauss’s Law for Gravity.
Most surface flux cannot be compute as neatly as in the previous example. Generally speak-
ing, the surface flux can be computed by parametrizing the surface. Given a parametrization
r(u, v), with a ≤ u ≤ b and c ≤ v ≤ d, of a surface S, the unit normal vector n is given by:
∂r
× ∂r
n̂ = ± ∂u ∂v
∂r
∂u × ∂r
∂v
∂r ∂r
dS = × dudv.
∂u ∂v
116 Vector Calculus
xi + yj + zk
F = − GMm .
( x2+ y2 + z2 )3/2
Let S be part of the horizontal plane z = 1 over the region x2 + y2 ≤ 1. Compute the surface
flux of F through S, i.e. ¨
F · n̂ dS.
S
Here n̂ is chosen to be the upward normal.
Solution First we parametrize the surface S. Since the plane z = 1 is a graph over the
xy-plane, it seems the easiest way is to use x and y as parameters. However, the base region
x2 + y2 ≤ 1 is not a rectangle so it may be difficult to set up the ranges of x and y for the
parametrization. Since the base region is a solid circle, we can use cylindrical coordinates
too. Let
r(r, θ ) = (r cos θ )i + (r sin θ )j + k (since z = 1),
4.5 Parametric Surfaces 117
∂r
= (cos θ )i + (sin θ )j
∂r
∂r
= (−r sin θ )i + (r cos θ )j
∂θ
i j k
∂r ∂r
× = cos θ sin θ 0
∂r ∂θ
−r sin θ r cos θ 0
= (r cos2 θ + r sin2 θ )k
= rk (which is upward)
(r cos θ )i + (r sin θ )j + zk
F = − GMm
(r2 cos2 θ + r2 sin2 θ + z2 )3/2
(r cos θ )i + (r sin θ )j + k
= − GMm (since z = 1)
(r2 + 1)3/2
∂r ∂r GMmr
F· × =− 2 .
∂r ∂θ (r + 1)3/2
Example 4.15 Let F = xi − yj. Find the upward flux of F over S which is the upper part of
√
the sphere with radius 2 centered at the origin cut out by the plane z = 1.
118 Vector Calculus
Solution Since the surface is spherical, it is usually the best to use spherical coordinates to
√
parametrize it. Under spherical coordinates, the sphere is represented by ρ = 2, so we use
θ and ϕ for the parameters. Let
D√ √ √ E
r(θ, ϕ) = 2 sin ϕ cos θ, 2 sin ϕ sin θ, 2 cos ϕ .
∂r D √ √ E
= − 2 sin ϕ sin θ, 2 sin ϕ cos θ, 0
∂θ
∂r D√ √ √ E
= 2 cos ϕ cos θ, 2 cos ϕ sin θ, − 2 sin ϕ
∂ϕ
∂r ∂r D E
× = −2 cos ϕ sin2 θ, −2 sin ϕ sin2 θ, −2 cos ϕ sin ϕ
∂θ ∂ϕ
D√ √ E
F = xi − yj = 2 sin ϕ cos θ, − 2 sin ϕ sin θ, 0
∂r ∂r
√
F· × = −2 2 cos 2θ sin3 ϕ.
∂θ ∂ϕ
∂r
Note that ∂θ × ∂ϕ
∂r
obtained above is a downward normal since the k-component is negative.
The upward flux is given by:
¨ ˆ 2π ˆ π/4
∂r ∂r
F · n̂ dS = F· − × dϕdθ
S 0 0 ∂θ ∂ϕ
ˆ 2π ˆ π/4 √
= 2 2 cos 2θ sin3 ϕ dϕdθ
0 0
√ ˆ 2π
! ˆ !
π/4
=2 2 cos 2θ dθ sin3 ϕ dϕ = 0.
0 0
| {z }
=0
has unit m3 s−1 and so it measures the net volume of fluid through the surface S in the direction
of n̂. If the flux is positive, then there is more fluid flowing along the direction n̂ than against
it. On the other hand, if the flux is negative, there is more fluid flowing against n̂ than along it.
If S is a closed surface (such as a sphere, a cube, a torus), it is a convention to take n̂ as the
outward unit normal. The surface flux
‹
u · n̂ dS
S
over this closed surface measures the net volume of fluid flowing in the direction of n̂ through
the surface, or in other words, the net volume of fluid flowing out from region D enclosed by
S.
4.5 Parametric Surfaces 119
measures the net mass of the fluid flowing out from the enclosed region D through its boundary
S per unit time. Assuming there is no sink or source inside D, by the conservation of mass, the
rate of change of the total mass of fluid enclosed by S is related to the surface flux by:
˚ ‹
∂
$ dV = − $u · n̂ dS.
∂t D S
| {z }
total mass in D
Later we can apply the Divergence Theorem on the above relation to derive an important
equation to the fluid flow.
Now suppose the vector field J represents the transfer of heat energy in unit Joule per
second. Note that while energy is a scalar, the transfer of heat at different point may have a
different direction and so it is a vector quantity. Again, take S to be a closed surface enclosing
a solid region D, then the surface flux of j through S:
‹
J · n̂ dS
S
measures the amount of heat energy flowing out from the region D through S. This flux
integral is commonly called the heat flux through S by physicists.
If E is a electric field and, again, S is a closed surface, then the flux integral
‹
E · n̂ dS
S
is commonly called the electric flux through S. A result by Gauss claims that this flux integral is
proportional to the total amount of charges enclosed by the surface. If B is a magnetic field
and, again, S is a closed surface, then the flux integral
‹
B · n̂ dS
S
is commonly called the magnetic flux through S. Gauss said that it must be zero.
120 Vector Calculus
where n̂ is the unit normal vector to S, with direction determined by the right-hand rule
(see Figure 4.16).
i ➤ Theofright-hand
By comparing the statements the Green’srule
and tells
Stokes’
youTheorems,
which ofone can easily see that
the Green’s Theorem is a special case of the Stokes’ Theorem, in a sense that the former
applies to plane curve andtwo
the normal vectors
flat region at abypoint
enclosed of S on
the curve use.xy-plane. TheTHEOREM
to the unit Stoke
15.13
normal vector n̂ for the plane region is that
Remember obviously given by of
the direction k ifnormal Let S be a smooth
the plane curve is traveling orient
in the counter-clockwise orientation.
vectors changes continuously on an orientation is consistent
i
oriented surface. field whose components
For closed curves in the three-dimensional space, one cannot say whether they are counter-
clockwise or clockwise as it depends on the direction of observations. Therefore, the
counter-clockwise convention of the Green’s Theorem is generalized to the right-hand rule
condition in the statement of the Stokes’ Theorem.
i The Stokes’ Theorem applies only on orientable surfaces. That says, it may not hold for
where
surfaces such as the Möbius strip. Also, the condition where the vector field F needs to be n is the unit vecto
defined and is continuously differentiable on and near the surface S is crucial. However,
we will mostly deal with vector fields that satisfy this condition.
The condition that S has to be simply-connected is also crucial, but we will later learn how
i QUICK
to modify the Stokes’ Theorem CHECK Suppose
so as to1allow that S is a surface S.
non-simply-connected
The
meaning of Stokes
region in the xy-plane with a boundary Green’s Theorem: Under th
The proof of the Stokes’ Theorem is omitted here. Interested readers may consult the
textbook for a proof of oneoriented counterclockwise.
special case. What is the field over the surface S (as g
Let’s look at some examples.
normal to S? Explain why Stokes’ culation on the boundary of
Theorem becomes the circulation form end of this section. First, we
➤
of Green’s Theorem. theorem.
If F is a conservative v
such that F = ⵜw. Because
4.6 Stokes’ Theorem 121
Since the RHS is a surface integral, we need to parametrize it first in order to compute it.
Using spherical coordinates, the parametrization of S is given by:
where 0 ≤ θ ≤ 2π and 0 ≤ ϕ ≤ π
2. Then,
∂r
= (−2 sin ϕ cos θ )i + (2 sin ϕ cos θ )j + 0k
∂θ
∂r
= (2 cos ϕ cos θ )i + 2 cos ϕ sin θ )j + (−2 sin ϕ)k
∂ϕ
∂r ∂r
× = (−4 sin2 ϕ cos θ )i + (−4 sin2 ϕ sin θ )j + (−4 sin ϕ cos ϕ)k
∂θ ∂ϕ
∂r ∂r
× = (4 sin2 ϕ cos θ )i + (4 sin2 ϕ sin θ )j + (4 sin ϕ cos ϕ)k
∂ϕ ∂θ
∂r
Note that ∂θ × ∂ϕ
∂r
is pointing downward, so we use ∂r
∂ϕ × ∂r
∂θ instead.
Next we need to compute ∇ × F:
i j k
∂ ∂ ∂
∇×F = ∂x ∂y ∂z
z−y x −x
= 2j + 2k.
122 Vector Calculus
i Although the Stokes’ Theorem was used in the previous example (as required by the
problem), it is actually easier to compute the line integral directly by parametrizing the
curve:
Let r(t) = (2 cos t)i + (2 sin t)j + 0k where 0 ≤ t ≤ 2π. Then
˛ ˛
F · dr = F · r0 (t) dt
C C
ˆ 2π
= ((z − y)i + xj − xk) · ((−2 sin t)i + (2 cos t)j + 0k) dt
0
ˆ 2π
= (−2(z − y) sin t + 2x cos t) dt
0
ˆ 2π
= −2(−2 sin t) sin t + 2(2 cos t) cos t dt
0
ˆ 2π
= 4(sin2 t + cos2 t) dt = 8π.
0
The purpose of the previous example is simply to illustrate how to use the Stokes’ Theorem,
although it is not necessary to use it. The line integral in the next example, however, would be
extremely difficult to compute without the Stokes’ Theorem.
Example 4.17 Let C be the curve of intersection of the plane Ax + By + Cz = 0 and the
sphere x2 + y2 + z2 = a2 . Show that
˛
πa2 ( A + B + C )
ydx + zdy + xdz = ± √
C A2 + B2 + C 2
where ± is determined by the orientation of C.
Solution The plane Ax + By + Cz = 0 passes through the origin. Therefore, the curve C is
a great circle on the sphere. However, this great circle is neither horizontal or vertical, so it is
difficult to parametrize C to compute the line integral.
Let’s use the Stokes’ Theorem to see if there is any luck! When using the Stokes’ Theorem,
one needs to pick a surface S whose boundary curve is C. There are three choices:
1. region of the plane enclosed by C
2. the hemisphere above the plane
3. the hemisphere below the plane
4.6 Stokes’ Theorem 123
All of the above choice should give the same conclusion. However, let’s pick the region of
the plane enclosed by C to be the surface S and we will explain why it is the smartest choice
among all three.
Now the given line integral is associated to the vector field F = yi + zj + xk, i.e.
˛ ˛
ydx + zdy + xdz = F · dr.
C C
i j k
∂ ∂ ∂
∇×F = ∂x ∂y ∂z
y z x
= −i − j − k.
We also need the unit normal vector n̂, but since the region S is a plane whose equation is
Ax + By + Cz = 0. The unit normal is a constant vector given by:
Ai + Bj + Ck
n̂ = ± √
A2 + B2 + C 2
where ± is determined by the orientation of C.
Therefore,
Ai + Bj + Ck A+B+C
(∇ × F) · n̂ = ± (−i − j − k) · √ = ±√ .
2 2
A +B +C 2 A2 + B2 + C 2
Next, we apply the Stokes’ Theorem on these C and S:
˛ ¨
F · dr = (∇ × F) · n̂ dS
C S
¨
A+B+C
=± √ dS
S A2 + B2 + C 2
¨
A+B+C
= ±√ dS
A2 + B2 + C2 | S{z }
surface area
πa2 ( A + B + C)
=±√ .
A2 + B2 + C 2
Note that the surface area of the region S on the plane is πa2 , since its boundary is a circle
with radius a.
There are two major reasons why the part of the plane enclosed by C is a smarter choice
for S than the hemispheres. For one thing, both ∇ × F and n̂ are constant vector field if
S is chosen to be a planar region, so that computing its surface integral is very easy – no
parametrization! For another, if any of the hemispheres were chosen to be S, then the surface
integral needs to be computed by parametrization – which can be tedious. It is also very
difficult to determine the range of values of ϕ and θ since the plane cutting the sphere is not
a horizontal one.
Occasionally, the Stokes’ Theorem can be applied to evaluate a surface integral over an
arbitrary or complicated surface, which is not easy to be parametrized. If the given vector field
G can be expressed in the form of F = ∇ × G for another vector field G, by the (backward)
Stokes’ Theorem asserts that:
¨ ¨ ˛
F · n̂ dS = (∇ × G) · n̂ dS = G · dr
S S C
124 Vector Calculus
where C is the boundary curve of the surface S. Very often, the line integral is easier to compute
if one can parametrize it.
i Note that the above discussion holds only when the given vector field F is of the form
F = ∇ × G. If such an F is not in this form, there is no easy way to apply the Stokes’
Theorem backward.
Example 4.18 Let C be an arbitrary simple closed curve on the xy-plane in the xyz-space,
and S be an arbitrary surface above the xy-plane with boundary curve C.
y
1. Verify that i = ∇ × − 2z j + 2 k .
2. Show that: ¨
i · n̂ dS = 0.
S
z i j k
y ∂ ∂ ∂
∇× − j+ k = ∂x ∂y ∂z
2 2 y
0 − 2z z
∂ y ∂ z
= − − i − 0j + 0k
∂y 2 ∂z 2
=i
y
For part (2), we denote G = − 2z j + 2 k for simplicity. Then i = ∇ × G. Applying the
Stokes’ Theorem backward, we get:
¨ ¨
i · n̂ dS = (∇ × G) · n̂ dS
S
˛S
= G · dr.
C
Therefore,
¸
F · dr.
C
(∇ × F) · n̂ ' .
Surface area of S
4.6 Stokes’ Theorem 125
This quantity is large when the vector field F is circular about the normal vector n̂. In other
words, the quantity (∇ × F) · n̂ measures the circulation density around any given point. That’s
why ∇ × F is often called the curl of F.
Page 6
Conservative Vector Fields
Recall that a vector field F is conservative if there exists a scalar function f such that F = ∇ f .
If F is defined and continuously differentiable everywhere (or on a simply-connected domain),
then the curl test asserts that F is conservative if and only if ∇ × F = 0. Using the Stokes’
Theorem, if F is conservative, then:
˛ ¨
F · dr = (∇ × F) ·n̂ dS = 0,
C S | {z }
=0
which recovers the result we proved before in Theorem 4.1. Of course, the Stokes’ Theorem as-
serts more than Theorem 4.1 does because it applies to any vector field, not just the conservative
ones.
Summing up all three equations and cancelling out the Li ’s terms, we get:
ˆ ˆ ˆ ˆ ˆ ˆ ˆ ˆ !
+ + + − − − − F · dr
C1 C2 C3 C4 Γ1− Γ1+ Γ2− Γ2+
¨ ¨ ¨
= + + (∇ × F) · n̂ dS
S1 S2 S3
While the surface integral (RHS) is in the same form as in the usual Stokes’ Theorem, the LHS
is not summing up the boundary line integrals, but rather the outer boundary has a plus sign
in front and the inner boundaries each has a minus sign in front.
For more complicated surfaces (with many, but finitely many, holes), one can apply the
technique illustrated above to establish:
Theorem 4.7 — Stokes’ Theorem for Higher Genusa Surfaces. Let S be an orientable surface
in R3 with n holes. Denote C to be its outer boundary, and Γ1 , Γ2 , . . . , Γn to be its inner
boundaries. Suppose F is a vector field is defined and continuously differentiable on and
near the surface S, then:
˛ n ˛ ¨
F · dr − ∑ F · dr = (∇ × F) · n̂ dS.
C i =1 Γ i S
Here n̂ is the unit normal vector to S with orientation determined by the right-hand rule
applied to the outer boundary C.
a The word genus is the mathematical term for number of “holes” inside the surface.
4.7 Divergence Theorem 127
F = Fx i + Fy j + Fz k,
Note that if the vector field F is given in spherical coordinates, the divergence of F is not
defined as:
∂Fρ ∂F ∂Fϕ
∇·F = + θ+ WRONG!
∂ρ ∂θ ∂ϕ
Using the chain rule and the conversion rules of components, one can convert ∇ × F from
the rectangular coordinate form to both spherical and cylindrical forms:
1 ∂ 2 1 ∂ 1 ∂Fθ
∇·F = ρ Fρ + ( F sin ϕ) +
ρ2 ∂ρ ρ sin ϕ ∂ϕ θ ρ sin ϕ ∂θ
1 ∂ 1 ∂Fθ ∂Fz
= (rFr ) + + .
r ∂r r ∂θ ∂z
The derivation, although straight-forward, is very tedious and is therefore omitted here.
We will see the geometric interpretation of ∇ × F after we state the Divergence Theorem.
Essentially, it measures how diverging the vector field is.
The Divergence Theorem is particularly useful for computing the flux over a closed surface,
as the theorem says we do not need to parametrize the surface and compute the normal vector,
and all is needed is to compute a triple integral instead, which is often easier than computing
surface integrals. Let’s look at some examples.
128 Vector Calculus
Example 4.19 Consider the vector field F = 3xi + 4yj − 5zk. Let S be the sphere with
radius a centered at the origin. Evaluate the flux integral:
‹
F · n̂ dS
S
Solution You can imagine the computation would be quite tedious if we computed this
flux integral directly by parametrizing the sphere. However, since this is a flux integral over
a closed surface, one can try to use the Divergence Theorem to compute it!
By easy computations, we have:
∂ ∂ ∂
∇·F = (3x ) + (4y) + (−5z) = 3 + 4 − 5 = 2.
∂x ∂y ∂z
Denote D to be the solid sphere with radius a centered at the origin, i.e. the solid region
enclosed by S. By the Divergence Theorem, we have:
‹ ˚ ˚
4 8
F · n̂ dS = ∇ · F dV = 2 dV = 2 · πa3 = πa3 .
S D D 3 3
Solution The closed surface S has six faces! If one attempts to compute the flux integral
directly, one needs to split it into six integrals, corresponding to each of its six faces.
However, if one applies the Divergence Theorem, the difficult surface integral becomes a
triple integral over a rectangular region which is very easy to set-up.
We first compute:
∂ 2 ∂ ∂
∇·F = ( x ) + (4xyz) + (ze x ) = 2x + 4xz + e x .
∂x ∂y ∂z
We omit the computational detail of the triple integral above, which is a very straight-forward.
4.7 Divergence Theorem 129
Let’s demonstrate one example using the cylindrical form of the divergence.
Example 4.21 Let F = r (2 + sin2 θ )er + (r sin θ cos θ ) eθ + 3zk, and S be the quarter-cylinder
r = 2, 0 ≤ θ ≤ π and 0 ≤ z ≤ 5. Compute the surface flux:
‹
F · n̂ dS.
S
Solution The cylinder has 5 faces. It would be time-consuming to compute the total flux
face-by-face directly. Let’s try to use the Divergence Theorem (since S is a closed surface).
The vector field F is given in cylindrical form, and its components are:
Fr = r 2 + sin2 θ , Fθ = r sin θ cos θ, Fz = 3z.
Therefore, we have:
1 ∂ 2 1 ∂ ∂(3z)
∇·F = r (2 + sin2 θ ) + (r sin θ cos θ ) +
r ∂r r ∂θ ∂z
1
= · 2r (2 + sin2 θ ) + (cos2 θ − sin2 θ ) + 3
r
= 4 + 2 sin2 θ + cos2 θ − sin2 θ
= 4 + cos2 θ + sin2 θ = 5.
One should note that the surface integral stated in the Divergence Theorem is
‹
F · n̂ dS
S
but not of (∇ × F) · n̂. In fact, using the Divergence Theorem, one can show that
‹
(∇ × F) · n̂ dS = 0
S
for any continuously differentiable vector field F.
It is because:
‹ ˚
(∇ × F) ·n̂ dS = ∇ · G dV
S | {z } D
=:G
˚
= ∇ · (∇ × F) dV.
D
Using the Mixed Partials Theorem, all of the above second derivatives are canceled out, and so
∇ · (∇ × F) = 0.
Therefore, we get:
‹
(∇ × F) · n̂ dS = 0.
S
When the region D is very small, one can regard ∇ × F is nearly a constant, and so we have:
‹
F · n̂ ' (∇ · F) × volume of D.
S
In other words, ∇ × F measures the flux density near a point. The more diverging F is
around a point, the higher the flux over a tiny closed surface around that point, resulting in
greater value of ∇ · F. This justifies the use of name divergence for ∇ · F.
If S is the sphere with radius a centered at the origin, then its outward unit normal is simply
eρ and so
‹ ‹
1 GMm
F · n̂ dS = − GMm eρ · eρ dS = − 2 · 4πa2 = −4πGMm.
S S ρ2 a
If one attempts to apply the Divergence Theorem on this vector field, then by the spherical
form of the divergence operator, we can easily see that:
1 ∂ 2 1 1 ∂
∇ · F = − GMm ρ · 2 = − GMm (1) = 0
ρ ∂ρ ρ ρ ∂ρ
is an improper integral since D (which is the solid sphere enclosed by S) contains the origin at
which F is undefined!
To conclude, one needs to be very careful when applying the Divergence Theorem if the
region D contains some points at which the vector field is not defined.
4.7 Divergence Theorem 131
Gluing S1 , Π and Σ1 together gives a closed surface not enclosing the origin. Denote D1 to
be the solid enclosed by this closed surface. Hence, one can apply the Divergence Theorem
without any issue:
¨ ¨ ¨ ˚
GMm GMm
+ + eρ · n̂ dS = ∇· eρ dV = 0
S1 Π Σ1 ρ2 D1 ρ2
| {z }
=0
where n̂ is the outward unit normal of the boundary surface of D1 . Denote n̂up and n̂down to
be the upward and downward normal vector respectively. The above integrals can be expressed
as:
¨ ¨ ¨
GMm GMm GMm
2
e ρ · n̂ up dS + 2
e ρ · n̂ down dS + eρ · n̂down dS = 0. (4.3)
S1 ρ Π ρ Σ1 ρ2
Similarly, gluing S2 , Π and Σ2 together gives a closed surface not enclosing the origin. By
the Divergence Theorem applied to this surface, we get:
¨ ¨ ¨
GMm GMm GMm
2
e ρ · n̂ down dS + 2
e ρ · n̂ up dS + eρ · n̂up dS = 0. (4.4)
S2 ρ Π ρ Σ2 ρ2
132 Vector Calculus
We then add up the above two equations. First note that S1 and S2 can glue together to
form the closed surface S. Both n̂up of S1 , and n̂down of S2 become the outward normal of S.
Therefore,
¨ ¨ ¨
GMm GMm GMm
2
e ρ · n̂ down dS + 2
e ρ · n̂ up dS = eρ · n̂outward dS.
S1 ρ S2 ρ S ρ2
For the planar surface Π, the downward normal n̂down is in the opposite direction of the
upward normal n̂up , i.e. n̂down = −n̂up . Therefore,
¨ ¨
GMm GMm
2
eρ · n̂down dS + eρ · n̂up dS = 0.
Π ρ Π ρ2
Finally, the surfaces Σ1 and Σ2 glue together to form the closed sphere Σ. The normal
vectors n̂down of Σ1 , and n̂up of Σ2 , are the inward unit normal of Σ. Therefore,
¨ ¨ ¨
GMm GMm GMm
2
eρ · n̂down dS + 2
eρ · n̂up dS = eρ · n̂inward dS.
Σ1 ρ Σ2 ρ Σ ρ2
Therefore,
‹ ¨
GMm GMm
e ρ · n̂ outward dS = − eρ · n̂inward dS
S ρ2 Σ ρ2
‹
GMm
= eρ · n̂outward dS
Σ ρ2
= 4πGMm (computed before)
This holds true for any closed surface S enclosing the origin. This proves the Gauss’s Law
for Gravity (assuming the inverse-square law).
However, if S does not enclose the origin, then one can apply the Divergence Theorem on
the gravitational vector field without any issue.
To conclude, for any closed surface S not passing through the origin, we have:
‹ (
1 4π if S encloses the origin;
2
eρ · n̂ dS =
S ρ 0 otherwise.
qQ
F = QE = eρ
4πε 0 ρ2
where ρ and eρ are evaluated at the point of the Q-particle. Therefore, if both q and Q are of
the same sign (i.e. like charges), the force F exerted on the Q-particle is pointing outward from
the origin. This indicates the two charges repulse each other. On the other hand, if q and Q
are of different sign, the force F exerted on the Q-particle is pointing inward, indicating an
attraction.
Note that the above E-field is not defined at the origin. Physically speaking, it can be
interpreted as having highly concentrated charge density at the origin. We leave it as an
exercise for readers to verify that:
∇·E = 0
∇×E = 0
∂f 1 ∂f 1 ∂f
∇f = eρ + eϕ + e .
∂ρ ρ ∂ϕ ρ sin ϕ ∂θ θ
In physics, we usually define potential energy according to the negative convention: E = −∇V.
Set:
q ∂V 1 ∂V 1 ∂V
eρ = −∇V = − eρ − eϕ − e .
4πε 0 ρ 2 ∂ρ ρ ∂ϕ ρ sin ϕ ∂θ θ
By equating the components, we have
q ∂V
=−
4πε 0 ρ2 ∂ρ
∂V
0=−
∂ϕ
∂V
0=−
∂θ
Clearly, V depends only on ρ, and by the first equation, we have:
ˆ
q q
V=− dρ = + constant.
4πε 0 ρ2 4πε 0 ρ
By picking the constant to be zero (so that the potential energy is 0 at infinity), the electric
potential energy due to the q-particle is therefore given by:
q
V= .
4πε 0 ρ
5.1 Coulomb’s Law 135
The above discussion assumes the point charge is located at the origin, so that the E-field
generated can be conveniently expressed using spherical coordinates. Using rectangular
coordinates, the above E-field and the electric potential V are given by:
q ( xi + yj + zk)
E= 3/2
4πε 0 ( x2 + y2 + z2 )
q
V= p
4πε 0 x + y2 + z2
2
If the point charge is located at ( x0 , y0 , z0 ), then the E-field generated and its electric
potential V are simply translations of the above, i.e.
q (( x − x0 )i + (y − y0 )j + (z − z0 )k) q r − r0
E= · =
4πε 0 (( x − x0 )2 + (y − y0 )2 + (z − z0 4πε 0 |r − r0 |3
)2 )3/2
q q
V= =
4πε 0 |r − r0 |
p
2 2
4πε 0 ( x − x0 ) + (y − y0 ) + (z − z0 ) 2
where r = xi + yj + zk and r0 = x0 i + y0 j + z0 k.
If there are more than one point charges, the resulting E-field generated by these point
charges is the vector sum of all E-fields generated by these charges. Precisely, if these particles
are located at r1 , r2 , . . . with charges q1 , q2 , . . . correspondingly, then the overall E-field is given
by:
q r − r1 q r − r2 q r − ri
E= 1 · 3
+ 2 · 3
+... = ∑ i · .
4πε 0 |r − r1 | 4πε 0 |r − r2 | i
4πε 0 |r − ri |3
Physicists often call this rule the Principle of Superposition.
(E · n̂) A
For a non-uniform E-field across a curved surface S, the electric flux through the surface is
defined using a surface integral: ¨
ΦE = E · n̂ dS.
S
Recall that the E-field generated by a point particle with charge q located at the origin is
given by:
q
E= eρ .
4πε 0 ρ2
Given a closed surface S, we have computed in the previous chapter that:
‹ (
1 4π if S encloses the origin;
e · n̂ dS =
2 ρ
S ρ 0 otherwise.
Therefore, the electric flux through S due to a particle located at the origin is given by:
‹ ( q q
q eρ 4πε 0 4π = ε 0 if S encloses the origin;
ΦE = 2
· n̂ dS =
S 4πε 0 ρ 0 otherwise.
By translation invariance, the electric flux through a closed surface S due to a particle located
at any position is given by:
(q
if S encloses the particle;
ΦE = ε 0
0 otherwise.
136 Topics in Physics and Engineering
For more than one point charges (say with charges q1 , q2 , . . .), the overall E-field is given by
E = E1 + E2 + . . .
where Ei is the electric field generated by the qi -particle. Therefore, the resulting electric flux
through a closed surface S is given by:
‹ ‹
ΦE = (E1 + E2 + . . .) · n̂ dS = ∑ Ei · n̂ dS
S i S
‹
For each i, the flux integral Ei · n̂ dS is zero if S does not enclose the qi -particle, and is equal
S
qi
to ε0if S encloses the qi -particle. Therefore, by summing up each individual electric flux, we
conclude that:
Qenc
ΦE =
ε0
where Qenc is the sum of charges enclosed by S. It is commonly called the Gauss’s Law for
Electricity (in the case of discrete charges).
From practical viewpoint, a continuous charge distribution (i.e. charge cloud) is composed
of densely packed discrete particles. Therefore, physicists assert that the Gauss’s Law for
Electricity is true for any (discrete or continuous) charge distribution:
‹
Qenc
E · n̂ dS =
S ε0
Here E is the electric field, B is the magnetic field and J is the current. Q is the total amount of
charges enclosed by the surface S. Both µ0 and ε 0 are positive constants.
We have discussed the Gauss’s Law for Electricity in the previous section. It can be derived
by assuming the Coulomb’s Inverse-Square Law. The Gauss’s Law for Magnetism asserts that
the magnetic flux through any closed surface must be zero! In physical terms, it means the
amount of magnetic field entering the surface is exactly equal to that leaving the surface.
Under the assumption that the Gauss’s Law for Magnetism is true, magnetic monopoles (i.e.
magnets with only one pole) are ruled out. It is because a magnet with only one pole will
generate a B-field similar to the E-field generated by a point charge, which is radially outward
or inward, giving non-zero flux (positive if radially outward, negative if radially inward) on a
surface enclosing the magnetic monopole. It would lead to a contradiction to the Gauss’s Law
for Magnetism.
Easy experiments tell us that breaking an ordinary bar magnet cannot create a magnetic
monopole. Instead, the magnet will be broken into two magnets each of which contains both
138 Topics in Physics and Engineering
Figure 5.4: breaking a bar magnet does not create magnetic monopoles
south and north poles. If in someday a magnet monopole is discovered (maybe using some
exotic materials), the Gauss’s Law for Magnetism will need to be modified.
The Faraday’s and Ampère’s Laws in integral form look a bit more complicated than the
two Gauss’s Laws. We will discuss them after we derive the differential form of Maxwell’s
Equations.
for any solid region D in R3 , then the functions f and g are identical, i.e. f ( x, y, z) = g( x, y, z)
for any ( x, y, z) in R3 .
Proof. Here we present a heuristic, rather than a rigorous, proof. Since it is assumed that
˚ ˚
f ( x, y, z) dV = g( x, y, z) dV
D D
for any solid region D, it is in particular true for tiny ones. Take D to be a very tiny region in
R3 , then both f and g are approximately constants over the region D. Therefore,
˚ ˚
f ( x, y, z) dV ' g( x, y, z) dV.
D D
˝
By canceling out D dV on both sides, we get f ( x, y, z) = g( x, y, z) over the region D. Since
this tiny region D is arbitrarily chosen, one can repeat the same argument with other tiny
regions to conclude that f ( x, y, z) = g( x, y, z) over the whole R3 .
5.2 Introduction to Maxwell’s Equations 139
One can use this theorem and the Divergence Theorem to derive the differential form of
Gauss’s Laws. The Gauss’s Law for Magnetism asserts that:
‹
B · n̂ dS = 0
S
for any closed surface S. Denote D to be the solid enclosed by S, then the Divergence Theorem
asserts that: ˚ ‹ ˚
∇ · B dV = B · n̂ dS = 0 = 0 dV.
D S D
Note that S, and hence D, are arbitrary. By Theorem 5.1, we have:
∇ · B = 0.
This equation is called the differential form of the Gauss’s Law for Magnetism.
One can also convert the Gauss’s Law for Electricity into a differential form by introducing
the charge density function $, which infinitesimally measures the amount of charges per unit
volume. The amount of charge Q in a solid region D is related to the charge density by:
˚
Q= $ dV.
D
for any closed surface S. Here D is the solid region enclosed by S. Using the Divergence
Theorem, we get: ˚ ˚
$
∇ · E dV = dV.
D ε
D 0
Since S and hence D are arbitrary, Theorem 5.1 shows:
$
∇·E = .
ε0
It is called the differential form of the Gauss’s Law for Electricity. In physics term, this equation
tells us the denser the charges, the more diverging the E-field.
Since it is true for any surface Σ, one can pick a tiny surface Σ so that (∇ × E) · n̂ and − ∂B
∂t · n̂
are approximately constants, and so:
¨ ¨
∂B
(∇ × E) · n̂ dS = − · n̂ dS.
Σ ∂t Σ
˜
By canceling out Σ dS on both sides, we have
∂B
(∇ × E) · n̂ = − · n̂.
∂t
140 Topics in Physics and Engineering
Since the surface Σ is arbitrary, the unit normal n̂ is also arbitrary. In order for the above to be
true, we get the differential form of the Faraday’s Law:
Page 6
∂B
∇×E = − .
∂t
The Faraday’s Law in differential form tells us the a changing B-field (so that ∂B
∂t is a non-zero
vector) will induce a E-field. Moreover, the more rapid is the change of B-field, the more
circular is the induced E-field.
Similarly, the Ampère’s Law in integral form can be converted into differential form by
Stokes’ Theorem:
˛ ¨ ¨
∂
B · dr = µ0 J · n̂ dS + ε 0 E · n̂ dS
C Σ ∂t Σ
¨ ¨
∂E
(∇ × B) · n̂ dS = µ0 J + µ0 ε 0 · n̂ dS
Σ Σ ∂t
∂E
∇ × B = µ0 J + µ0 ε 0 .
∂t
The physical interpretation of the Ampère’s Law is a bit more complicated than that of
Faraday’s Law. For the sake of simplicity, let’s consider only the case where the E-field is
unchanging:
∇ × B = µ0 J (when E is unchanging)
Recall that J is the current. This equation tells us that when the E-field is unchanging, the
presence of an electric current in a wire will induce a magnetic field circling around the wire.
In fact, the equation ∇ × B = µ0 J is the original form of Ampère’s Law. Later Maxwell
found that it is incomplete and correct it by adding a ∂E
∂t term.
J = − a∇u
where J is a vector field (in energy per second) representing the flow of heat energy, and a is a
positive constant depending on the medium. In other words, heat energy diffuses from higher
temperature regions to lower ones, and the rate of diffusion is proportional to the magnitude
of ∇u.
Let D be an arbitrary solid region with boundary surface S. Denote $ to be the energy
density function (in energy per volume), which equals to bu for some positive constant b whose
value depends on the medium. Then the triple integral:
˚
$ dV
D
measures the amount of heat loss through the closed surface S. By the conservation of heat
energy, heat energy must escape through the surface S. In mathematical terms, it is stated as:
˚ ‹
∂
$ dV = − J · n̂ dS.
∂t D S
The negative sign appears on the RHS because of the outward convention of n̂.
Applying the above physical laws, we get:
˚ ‹
∂u
b dV = − (− a∇u) · n̂ dS
∂t
˚D ‹ S
∂u
b dV = a∇u · n̂ dS
∂t
˚D ˚S
∂u
b dV = ∇ · ( a∇u) dV (Divergence Theorem)
D ∂t D
∂2 u ∂2 u ∂2 u
∇ · ∇u = + 2 + 2.
∂x2 ∂y ∂z
Therefore, we can conclude that:
∂2 u ∂2 u ∂2 u
∂u b
= + 2+ 2
∂t a ∂x2 ∂y ∂z
142 Topics in Physics and Engineering
x 2 + y2 + z2
1
Φ( x, y, z, t) = exp − .
(4πkt)3/2 4kt
1
lim Φ(0, 0, 0, t) = lim = ∞.
t →0 (4πkt)3/2
t →0
2 2 2
x +y +z
In contrast, if ( x, y, z) 6= (0, 0, 0), both exp − 4kt and (4πkt)3/2 go to 0 as t → 0.
However, the exponential term goes to 0 faster than the t3/2 term, so
x 2 + y2 + z2
1
lim Φ( x, y, z, t) = lim exp − = 0 when ( x, y, z) 6= (0, 0, 0).
t →0 t→0 (4πkt )3/2 4kt
Therefore, the function Φ( x, y, z, t) represents the heat diffusion starting from a highly concen-
trated heat source at t = 0. As time goes, the temperature distribution becomes more and more
uniform.
In general, if the initial temperature distribution is given by the function g( x, y, z), it can be
shown (proof beyond the scope of the course) that the following function
ˆ ∞ ˆ ∞ ˆ ∞
u( x, y, z, t) = Φ( x − u, y − v, z − w, t) g(u, v, w) dudvdw
−∞ −∞ −∞
∂u
satisfies the heat equation ∂t = k∆u with initial condition g( x, y, z), meaning that
lim u( x, y, z, t) = g( x, y, z).
t →0
In other words, the function u( x, y, z, t) predicts how heat diffuses when given an initial
temperature profile g( x, y, z). However, the integral defining u( x, y, z, t) is in general difficult
to compute explicitly.
∆u = 0.
u( x, y, z) = C for any ( x, y, z) on S.
Next we consider the vector field (u − C )∇u. We leave it as an exercise for readers to verify
from the definition that:
At steady state, we have ∆u = 0 and so ∇ · ((u − C )∇u) = |∇u|2 . Next we integrate this result
over D: ˚ ˚
∇ · ((u − C )∇u) dV = |∇u|2 dV.
D D
Applying the Divergence Theorem on the LHS, we get:
‹ ˚
(u − C )∇u · n̂ dS = |∇u|2 dV.
S D
From our assumption, we have u = C for any point on S. Therefore, the integrand (u − C )∇u · n̂
of the flux integral on LHS is zero, and so:
˚
0= |∇u|2 dV.
D
Since the integrand |∇u|2 is non-negative, the only scenario for the above to happen is that
Therefore, u must be a constant in the region D, and by continuity, this constant must be C.
144 Topics in Physics and Engineering
x 2 + y2 + z2
1
Φ( x, y, z, t) = exp −
(4πkt)3/2 4kt
is the solution to the heat equation starting at t = 0 with a highly concentrated heat source at
the origin. As t → 0, we have shown that:
(
0 if ( x, y, z) 6= (0, 0, 0);
lim Φ( x, y, z, t) =
t →0 ∞ otherwise.
The Dirac delta function in R3 , denoted by δ, is a “function” which can be loosely defined as
follows: (
0 if ( x, y, z) 6= (0, 0, 0);
δ( x, y, z) =
∞ otherwise.
As such, we then have
lim Φ( x, y, z, t) = δ( x, y, z)
t →0
and so the delta function δ can be thought as the initial condition for Φ.
Figure 5.7: the Dirac delta function can be loosely regarded as the above graph
While the above definition of δ is very neat, it is paradoxical. From mathematical viewpoint,
it is a function which is zero almost everywhere in R3 . By the definition of integral, the above
definition of δ will imply:
ˆ ∞ ˆ ∞ ˆ ∞
δ( x, y, z) dxdydz = 0.
−∞ −∞ −∞
represents the total heat energy. It being 0 means there is no heat energy over the entire R3 . As
time goes by, the heat distribution becomes Φ since it solves the heat equation and Φ → δ as
t → 0. However, it can be computed (using spherical coordinates) that:
ˆ ∞ ˆ ∞ ˆ ∞
Φ( x, y, z, t) dxdydz = 1
−∞ −∞ −∞
for any t > 0, meaning the total heat energy in R3 suddenly jumps to 1 as t becomes positive.
It contradicts the conservation of energy!
5.4 Dirac Delta Functions 145
We put “applying” in quote because the vector field ρ12 eρ is not defined at the origin, so
rigorously speaking one cannot applying the Divergence Theorem here. By decreeing the
Divergence Theorem holds for this vector field, we obtain:
˚ (
1 4π if D contains the origin;
∇· e
2 ρ
dV =
D ρ 0 otherwise.
Therefore,
1 1
∇· eρ
4π ρ2
satisfies the definition of the Dirac delta function, and so one can write:
1
∇· e ρ = 4πδ.
ρ2
According to the Coulomb’s Law, a point particle with charge q at the origin generates a
E-field given by:
q
E= eρ .
4πε 0 ρ2
From the above discussion, we get:
q q
∇·E = · 4πδ = δ.
4πε 0 ε0
$
The Gauss’s Law for Electricity in differential form states that ∇ · E = ε 0 where $ is the
charge density. Therefore, using the Dirac delta function (again decree that the Divergence
Theorem is true for the above E-field), one can derive:
$ = qδ.
In practice, one can regard qδ as the charge density distribution for a point particle (with
charge q).