MULTIVARIABLE CALCULUS

T.K.SUBRAHMONIAN MOOTHATHU
The basic idea in Calculus is to approximate smooth objects locally by linear objects.
Suggested textbooks for additional reading:
1. T.M. Apostol, Calculus, Vol. II, 1969.
2. T.M. Apostol, Mathematical Analysis, 1974.
3. J.J. Callahan, Advanced Calculus, 2010.
4. S.R. Ghorpade and B.V. Limaye, A Course in Multivariable Calculus and Analysis, 2010.
5. J.H. Hubbard and B.B. Hubbard, Vector Calculus, Linear Algebra, and Differential Forms, 1999.
6. S. Lang, Calculus of Several Variables, 1987.
7. P.D. Lax and M.S. Terrell, Multivariable Calculus with Applications, 2017.
A general remark about notations: We do not wish to complicate notations unnecessarily; hence
certain notations have to be understood based on the context. For example, the notation ‘x ∈ Rn ’
means x = (x1 , . . . , xn ), where each xj ∈ R; on the other hand, the notation ‘v1 , . . . , vk ∈ Rn ’
means each vi is an n-tuple vi = (vi1 , . . . , vin ) with vij ∈ R.
is a norm on Kn, where the triangle inequality is nothing but Minkowski's inequality. When p = 2, we get the Euclidean norm ∥·∥2 on Kn defined as ∥x∥2 = (∑_{j=1}^n |xj|²)^{1/2}. The metric induced by the Euclidean norm is the Euclidean metric dE on Kn, where dE(x, y) = (∑_{j=1}^n |xj − yj|²)^{1/2}. Two other commonly used norms on Kn are ∥·∥1 (which is the p-norm for p = 1) and ∥·∥∞, defined respectively as ∥x∥1 = ∑_{j=1}^n |xj| and ∥x∥∞ = max{|xj| : 1 ≤ j ≤ n} for x = (x1, . . . , xn) ∈ Kn.
Remark: (i) If ∥ · ∥ is a norm on a vector space X, then |∥x∥ − ∥y∥| ≤ ∥x − y∥ for every x, y ∈ X
(to see this, note by triangle inequality that ∥x∥ ≤ ∥y∥ + ∥x − y∥ and ∥y∥ ≤ ∥x∥ + ∥y − x∥); and
consequently, ∥ · ∥ : X → R is Lipschitz continuous. (ii) Our primary interest is in the normed
space (Rn , ∥ · ∥2 ). As a metric space, Cn can be identified with R2n in a natural manner.
Exercise-1: [Recall from Real Analysis] With respect to dE , the following are true:
(i) Rn is complete and (path) connected.
(ii) Qn is a countable dense subset of Rn .
(iii) Every bounded subset of Rn is totally bounded.
Definition: Two norms ∥ · ∥ and ∥ · ∥0 on a vector space X are said to be equivalent if there are
0 < a < b such that a∥x∥ ≤ ∥x∥0 ≤ b∥x∥ for every x ∈ X. Note that this is equivalent to saying
that the identity map I : (X, ∥ · ∥) → (X, ∥ · ∥0 ) is a homeomorphism.
[101] Any two norms on Rn are equivalent (similarly, any two norms on Cn are equivalent).
Proof. The equivalence of norms is an equivalence relation on the collection of all norms on Rn. Therefore, it suffices to show that an arbitrary norm ∥·∥ on Rn is equivalent to the Euclidean norm ∥·∥2 on Rn. Let b = ∑_{j=1}^n ∥ej∥. For x = ∑_{j=1}^n xj ej ∈ Rn, we have |xj| ≤ ∥x∥2 for every j, and hence ∥x∥ ≤ ∑_{j=1}^n |xj|∥ej∥ ≤ ∥x∥2 ∑_{j=1}^n ∥ej∥ = b∥x∥2. From this, we also note that |∥x∥ − ∥y∥| ≤ ∥x − y∥ ≤ b∥x − y∥2, and thus ∥·∥ : Rn → R is Lipschitz continuous with respect to the Euclidean norm ∥·∥2. Next, to find a > 0 such that a∥x∥2 ≤ ∥x∥ for every x ∈ Rn, we argue as follows. Consider the unit sphere S = {y ∈ Rn : ∥y∥2 = 1}, which is compact, and define f : S → R as f(y) = ∥y∥. Since f is (Lipschitz) continuous and strictly positive on the compact set S, there is a > 0 such that f(y) ≥ a for every y ∈ S. Now for any x ∈ Rn \ {0}, we have y := x/∥x∥2 ∈ S, and therefore a ≤ f(y) = ∥y∥ = ∥x∥/∥x∥2, which means a∥x∥2 ≤ ∥x∥ as required.
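The two-sided estimate in [101] can be observed numerically for a concrete pair of norms; for ∥·∥1 and ∥·∥2 on Rn the optimal constants are in fact a = 1 and b = √n. A small sketch (plain Python; the helper names are ours, not from the notes):

```python
import math
import random

def norm1(x):
    # ||x||_1 = sum of the absolute values of the coordinates
    return sum(abs(t) for t in x)

def norm2(x):
    # Euclidean norm ||x||_2
    return math.sqrt(sum(t * t for t in x))

# Known sharp constants: ||x||_2 <= ||x||_1 <= sqrt(n) ||x||_2 on R^n.
random.seed(0)
n = 5
for _ in range(1000):
    x = [random.uniform(-10, 10) for _ in range(n)]
    assert norm2(x) <= norm1(x) + 1e-9
    assert norm1(x) <= math.sqrt(n) * norm2(x) + 1e-9
```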
Remark: Recall the norms ∥·∥1 and ∥·∥∞ on Rn mentioned earlier. They induce the metrics d1 and d∞ on Rn, where d1(x, y) = ∑_{j=1}^n |xj − yj| and d∞(x, y) = max{|xj − yj| : 1 ≤ j ≤ n} for x, y ∈ Rn. A consequence of [101] is the following:
(iii) [Linearity in the first variable] For each y ∈ X, the map ⟨·, y⟩ : X → K is K-linear, i.e.,
⟨c1 x1 + c2 x2 , y⟩ = c1 ⟨x1 , y⟩ + c2 ⟨x2 , y⟩ for every x1 , x2 ∈ X and c1 , c2 ∈ K.
Any inner product ⟨·, ·⟩ on X induces a norm ∥ · ∥ on X by the rule ∥x∥ := ⟨x, x⟩1/2 . Then we
have the Cauchy-Schwarz inequality |⟨x, y⟩| ≤ ∥x∥∥y∥ for every x, y ∈ X.
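The Cauchy-Schwarz inequality for the standard inner product on Rn is easy to test numerically; a quick random check (helper names are ours):

```python
import math
import random

def inner(x, y):
    # standard inner product on R^n
    return sum(a * b for a, b in zip(x, y))

def norm(x):
    # the induced norm ||x|| = <x, x>^(1/2)
    return math.sqrt(inner(x, x))

random.seed(1)
for _ in range(1000):
    x = [random.uniform(-5, 5) for _ in range(4)]
    y = [random.uniform(-5, 5) for _ in range(4)]
    # Cauchy-Schwarz: |<x, y>| <= ||x|| ||y||
    assert abs(inner(x, y)) <= norm(x) * norm(y) + 1e-9
```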
Remark: If K = R, then condition (ii) becomes ⟨y, x⟩ = ⟨x, y⟩ for every x, y ∈ X, and then it
follows by (iii) that ⟨·, ·⟩ is linear in each variable separately (in other words, any inner product on
a real vector space is in particular a bilinear map).
Example: The standard inner product on Rn is defined as ⟨x, y⟩ = ∑_{j=1}^n xj yj for x, y ∈ Rn, and the standard inner product on Cn is defined as ⟨x, y⟩ = ∑_{j=1}^n xj ȳj for x, y ∈ Cn, where ȳj denotes the complex conjugate of yj. The norm induced by this inner product is nothing but the Euclidean norm ∥·∥2 on Rn (respectively, Cn).
Remark: Let ⟨·, ·⟩ be the standard inner product on Rn, ⟨·, ·⟩0 be an arbitrary inner product on Rn, and A be the n × n matrix whose ijth entry is ⟨ei, ej⟩0. Then we see by the bilinearity of the inner product that ⟨x, y⟩0 = ∑_{i=1}^n ∑_{j=1}^n xi yj ⟨ei, ej⟩0 = ⟨Ax, y⟩ for every x, y ∈ Rn. Note that A is a symmetric real matrix which is positive-definite (i.e., ⟨Av, v⟩ > 0 for every v ∈ Rn \ {0}). Conversely, it can be verified that if A is an n × n positive-definite symmetric real matrix, then (x, y) ↦ ⟨Ax, y⟩ is an inner product on Rn.
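The correspondence between positive-definite matrices and inner products can be checked numerically. Below, A is a hypothetical 2 × 2 symmetric positive-definite matrix chosen for illustration:

```python
import random

def inner_std(x, y):
    # standard inner product on R^n
    return sum(a * b for a, b in zip(x, y))

def inner_A(A, x, y):
    # <x, y>_0 = <Ax, y> for a symmetric positive-definite matrix A
    Ax = [sum(A[i][j] * x[j] for j in range(len(x))) for i in range(len(A))]
    return inner_std(Ax, y)

# an example symmetric positive-definite matrix (our choice)
A = [[2.0, 1.0],
     [1.0, 3.0]]

random.seed(2)
for _ in range(500):
    v = [random.uniform(-5, 5) for _ in range(2)]
    w = [random.uniform(-5, 5) for _ in range(2)]
    if any(abs(t) > 1e-12 for t in v):
        # positive-definiteness: <Av, v> > 0 for v != 0
        assert inner_A(A, v, v) > 0
    # symmetry: <v, w>_0 = <w, v>_0 (since A is symmetric)
    assert abs(inner_A(A, v, w) - inner_A(A, w, v)) < 1e-9
```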
Exercise-4: [Verify the claims] Let a ∈ Rn . (i) For 1 ≤ k ≤ n, any k-dimensional plane in Rn
containing a has the form a + span{v1 , . . . , vk }, where v1 , . . . , vk ∈ Rn are linearly independent
(here, an (n − 1)-dimensional plane in Rn is called a hyperplane). In particular, any 2-dimensional
plane containing a has the form {a + su + tv : s, t ∈ R}, where u, v ∈ Rn are linearly independent.
(ii) If y ∈ Rn \ {0}, then the hyperplane H passing through a and orthogonal to y is given by
H = {x ∈ Rn : ⟨x − a, y⟩ = 0}. Letting r = ⟨a, y⟩, we may also write H = {x ∈ Rn : ⟨x, y⟩ = r}.
For any b ∈ Rn, the line passing through b and orthogonal to H is {b + ty : t ∈ R}. If c ∈ H is the point where this line intersects H, then c = b + t0 y and ⟨b + t0 y − a, y⟩ = 0, so that t0 = ⟨a − b, y⟩/∥y∥². If ∥y∥ = 1, then we get c = b + ⟨a − b, y⟩y, and dist(b, H) = ∥b − c∥ = ∥⟨b − a, y⟩y∥ = |⟨b − a, y⟩|.
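The computation of the foot of the perpendicular c and of dist(b, H) above can be sketched in code (the function name is ours):

```python
import math

def dist_to_hyperplane(b, a, y):
    # distance from b to H = {x : <x - a, y> = 0}; y is assumed nonzero.
    dot = lambda u, v: sum(p * q for p, q in zip(u, v))
    # foot of the perpendicular: c = b + t0*y with t0 = <a - b, y> / ||y||^2
    t0 = dot([p - q for p, q in zip(a, b)], y) / dot(y, y)
    c = [p + t0 * q for p, q in zip(b, y)]
    d = [p - q for p, q in zip(b, c)]
    return math.sqrt(dot(d, d))

# H is the plane x3 = 0 in R^3 (a = origin, y = e3); the distance from (1, 2, 5) is 5.
print(dist_to_hyperplane([1.0, 2.0, 5.0], [0.0, 0.0, 0.0], [0.0, 0.0, 1.0]))  # 5.0
```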
Definition: Any sequence in Rn converging to the origin (0, . . . , 0) ∈ Rn will be called a null
sequence. In particular, a null sequence in R means a sequence converging to 0.
Remark: Some care must be taken while considering limits in Rn . Let U ⊂ R2 be open, f : U → R
be a function and (a, b) ∈ U . The three expressions ‘lim(x,y)→(a,b) f (x, y)’, ‘limx→a limy→b f (x, y)’,
‘limy→b limx→a f (x, y)’ mean three different things. If f is continuous in a neighborhood of the
point (a, b), then the three expressions give the same value, namely f (a, b) (check). If f is not
continuous, then some of the limits may not exist, and even if they exist, they may not be equal.
(i) Let f : R2 → R be f(0, 0) = 0 and f(x, y) = |x|/(|x| + |y|) for (x, y) ≠ (0, 0). Then we have limx→0 (limy→0 f(x, y)) = 1 ≠ 0 = limy→0 (limx→0 f(x, y)), and (hence) lim(x,y)→(0,0) f(x, y) does not exist (for the last assertion, we may also note that f(1/k, 0) = 1 and f(0, 1/k) = 0).
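A quick numerical look at example (i) (the function name is ours):

```python
def f(x, y):
    # f(0,0) = 0 and f(x,y) = |x| / (|x| + |y|) otherwise
    if x == 0 and y == 0:
        return 0.0
    return abs(x) / (abs(x) + abs(y))

# Along the two axes the function takes the constant values 1 and 0:
print([f(1 / k, 0) for k in (1, 10, 100)])  # [1.0, 1.0, 1.0]
print([f(0, 1 / k) for k in (1, 10, 100)])  # [0.0, 0.0, 0.0]
# so lim_(x,y)->(0,0) f(x,y) cannot exist.
```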
(ii) Let f : R2 → R be f(x, y) = g(y)x + g(x)y, where g(x) = 1 for x ≥ 0 and g(x) = −1 for x < 0. Then |f(x, y)| ≤ |x| + |y| and hence by Exercise-2, f is continuous at (0, 0) with lim(x,y)→(0,0) f(x, y) = 0 = f(0, 0). If x ∈ (0, 1), then f(x, 1/k) = x + 1/k and f(x, −1/k) = −x − 1/k. Hence limy→0 f(x, y) does not exist. A similar observation holds with x and y interchanged. Thus the two iterated limits limx→0 limy→0 f(x, y) and limy→0 limx→0 f(x, y) do not exist. Moreover, it follows that f fails to be continuous on any neighborhood of (0, 0).
Remark: Often, we will denote the Euclidean norm ∥ · ∥2 on Rn simply as ∥ · ∥ when no other norm
is being considered. Similarly, the notation ⟨·, ·⟩ will mean the standard inner product.
General tip: Keep track of the dimension: throughout this course, while considering elements of
the Euclidean space and functions between Euclidean spaces, make a mental note of the dimension
of the relevant Euclidean space(s), i.e., observe clearly whether the space is R or Rn or Rm , etc.
This will help you to reduce notational as well as conceptual errors.
a + tv ∈ U for every t ∈ (−ε, ε), and gj : (−ε, ε) → R is defined as gj (t) = fj (a + tv), then
fj′ (a; v) = gj′ (0), if the derivative exists.
(ii) The directional derivatives f′(a; e1), . . . , f′(a; en) in the direction of the standard basis vectors e1, . . . , en ∈ Rn are called the partial derivatives of f at a. We will write f′(a; ej) as ∂f/∂xj (a), and call it the partial derivative of f at a with respect to xj, or the jth partial derivative of f at a. Note that ∂f/∂x1 (a), if it exists, is the derivative of the one-variable function x ↦ f(x, a2, . . . , an) at x = a1, and similarly for the other partial derivatives. Thus the jth partial derivative of f measures the rate of change of f with respect to the jth variable, when the other variables are kept fixed.
Remark: To determine ∂f/∂xj (a), often the following method is used: formally differentiate f with respect to xj and substitute a in the resulting expression. This works provided the partial derivative exists in a neighborhood of a and is continuous at a. If we cannot see the continuity of the partial derivative at a in advance, then the existence and value of ∂f/∂xj (a) should be determined by directly studying the limit limt→0 (f(a + tej) − f(a))/t.
Example: (i) Let f : R2 → R be f(x, y) = x²y. Then for a = (a1, a2), we have ∂f/∂x (a) = 2a1a2 and ∂f/∂y (a) = a1². Moreover, if v = (3, 5), then f′(a; v) = limt→0 (f(a1 + 3t, a2 + 5t) − f(a1, a2))/t = limt→0 ((a1 + 3t)²(a2 + 5t) − a1²a2)/t = 6a1a2 + 5a1² = 3 ∂f/∂x (a) + 5 ∂f/∂y (a).
(ii) The existence of partial derivatives does not ensure the existence of directional derivatives. Let f : R2 → R be f(x, y) = x if 0 ≤ x ≤ y, f(x, y) = y if 0 ≤ y ≤ x, and f(x, y) = 0 otherwise. Check that f is continuous. For a = (0, 0), we have ∂f/∂x (a) = f′(a; e1) = 0 and ∂f/∂y (a) = f′(a; e2) = 0. But if v = (1, 1), then limt→0+ (f(a + tv) − f(a))/t = limt→0+ (f(t, t) − 0)/t = limt→0+ t/t = 1 ≠ 0 = limt→0− (f(a + tv) − f(a))/t, and hence the directional derivative f′(a; v) does not exist (geometrically, the graph of f along the line x = y has a sharp turn at (0, 0)).
(iii) The existence of all directional derivatives does not imply continuity of the function. Let f : R2 → R be f(0, 0) = 0 and f(x, y) = xy²/(x² + y⁴) for (x, y) ≠ (0, 0). We saw earlier that f is not continuous at (0, 0) because f(1/n², 1/n) = 1/2 ̸→ 0. But if a = (0, 0) and v = (v1, v2), then f′(a; v) = 0 if v1 = 0 and f′(a; v) = v2²/v1 if v1 ≠ 0; thus all directional derivatives exist at a = (0, 0). Note that the map v ↦ f′(a; v) from R2 to R is not linear in this example.
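The behavior in example (iii) is easy to see numerically: f is constantly 1/2 along the parabola x = y², yet tends to 0 along every straight line through the origin (function name ours):

```python
def f(x, y):
    # f(0,0) = 0 and f(x,y) = x*y^2 / (x^2 + y^4) otherwise
    if x == 0 and y == 0:
        return 0.0
    return x * y * y / (x * x + y ** 4)

# Along x = y^2 the value is exactly 1/2, so f is not continuous at the origin:
for n in (10, 100, 1000):
    assert abs(f(1 / n ** 2, 1 / n) - 0.5) < 1e-9

# Along straight lines through the origin, however, f(tv) -> 0 = f(0,0):
for v in ((1.0, 2.0), (0.0, 1.0), (-3.0, 1.0)):
    t = 1e-8
    assert abs(f(t * v[0], t * v[1])) < 1e-3
```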
(iv) Let f : R2 → R be f(0, 0) = 0 and f(x, y) = xy³/(x² + y⁶) for (x, y) ≠ (0, 0) =: a. Consider v = (v1, v2) ≠ (0, 0). Note that (f(a + tv) − f(a))/t = tv1v2³/(v1² + t⁴v2⁶). If v1 = 0 and v2 ≠ 0, then (f(a + tv) − f(a))/t = 0, and hence f′(a; v) = 0. If v1 ≠ 0, then f′(a; v) = limt→0 tv1v2³/(v1² + t⁴v2⁶) = 0. Thus all directional derivatives exist at a, and the map v ↦ f′(a; v) from R2 to R is the zero map, which is linear. However, f(1/n³, 1/n) = 1/2 ̸→ 0 = f(a), and therefore f is not continuous at a.
Discussion: In the case of one dimension, we think of the derivative as the 'rate of change'; moreover, if f is differentiable at a, then limx→a (f(x) − f(a))/(x − a) = f′(a), which is equivalent to saying that limx→a (f(x) − f(a) − L(x − a))/(x − a) = 0, where L : R → R is the linear map y ↦ f′(a)y. To define differentiability in higher dimensions, we need to consider the rate of change in different directions. The whole information about the rate of change along various directions is encoded in the map v ↦ f′(a; v). It is nice if this map is a linear map, say L. But this map being linear is not sufficient to guarantee the continuity of f at a (as noted in the previous example). For differentiability to imply continuity, we should also demand that the limit limt→0 (f(a + tv) − f(a))/t exist uniformly for all unit vectors v. This is ensured by demanding that limx→a ∥f(x) − f(a) − L(x − a)∥/∥x − a∥ = 0.
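The condition limx→a ∥f(x) − f(a) − L(x − a)∥/∥x − a∥ = 0 can be watched numerically for a smooth map, taking L to be the Jacobian (a sketch; the map f below is our own example):

```python
import math

def f(x, y):
    # a smooth map f : R^2 -> R^2
    return (x * x * y, x + y)

def L(a, v):
    # Jacobian of f at a = (a1, a2) applied to v: rows (2*a1*a2, a1^2) and (1, 1)
    a1, a2 = a
    return (2 * a1 * a2 * v[0] + a1 * a1 * v[1], v[0] + v[1])

a = (1.0, 2.0)
fa = f(*a)
for r in (1e-2, 1e-3, 1e-4):
    # sample a point x with ||x - a|| = r and watch the quotient shrink
    v = (r * math.cos(0.7), r * math.sin(0.7))
    x = (a[0] + v[0], a[1] + v[1])
    fx = f(*x)
    Lv = L(a, v)
    num = math.hypot(fx[0] - fa[0] - Lv[0], fx[1] - fa[1] - Lv[1])
    assert num / r < 10 * r  # the quotient is O(||x - a||) for this smooth f
```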
[102] Let U ⊂ Rn be open, f : U → Rm , and a ∈ U . Then the following are equivalent:
Proof. Let S = {v ∈ Rn : ∥v∥ = 1}. (i) ⇒ (ii): Let ε > 0. Choose δ > 0 such that ∥f(x) − f(a) − L(x − a)∥/∥x − a∥ < ε whenever 0 < ∥x − a∥ < δ. Then for every t ∈ (−δ, δ) \ {0} and every v ∈ S, putting x = a + tv and noting tL(v) = L(tv), we get that
∥(f(a + tv) − f(a))/t − L(v)∥ = ∥f(a + tv) − f(a) − L(tv)∥/|t| = ∥f(x) − f(a) − L(x − a)∥/∥x − a∥ < ε.
This means (ii) holds with f′(a; v) = L(v).
(ii) ⇒ (i): Let L : Rn → Rm be L(v) = f′(a; v), which is a linear map by (ii). Given ε > 0, choose δ > 0 such that ∥(f(a + tv) − f(a))/t − f′(a; v)∥ < ε for every t ∈ (−δ, δ) \ {0} and every v ∈ S. Then for any x ∈ U with 0 < ∥x − a∥ < δ, putting t = ∥x − a∥, v = (x − a)/∥x − a∥ and noting f′(a; tv) = tf′(a; v), we get that
∥f(x) − f(a) − L(x − a)∥/∥x − a∥ = ∥f(a + tv) − f(a) − f′(a; tv)∥/|t| = ∥(f(a + tv) − f(a))/t − f′(a; v)∥ < ε.
This establishes (i).
lower dimensional theory to a higher dimensional theory often demands such modifications in one’s
perspective. For instance, a real polynomial f : R → R of degree n ≥ 1 has at most n zeroes,
but a polynomial f : R2 → R in two variables of degree n ≥ 1 can have uncountably many zeroes
(example: f (x, y) = xy); so the correct perspective in higher dimension is the ‘dimension’ of the
zero-set and not the number of zeroes.
Definition: Let U ⊂ Rn be open and a ∈ U. (i) If f : U → R is a function such that the partial derivative ∂f/∂xj (a) exists for every j ∈ {1, . . . , n}, then the gradient vector ∇f(a) ∈ Rn of f at a is defined as ∇f(a) = (∂f/∂x1 (a), . . . , ∂f/∂xn (a)). For example, if f : R3 → R is f(x, y, z) = 2x³y − yz⁴, then ∇f(x, y, z) = (6x²y, 2x³ − z⁴, −4yz³).
(ii) If f = (f1, . . . , fm) : U → Rm is a function such that the partial derivative ∂fi/∂xj (a) exists for every i ∈ {1, . . . , m} and every j ∈ {1, . . . , n}, then the Jacobian matrix Jf(a) of f at a is defined as the m × n matrix whose ijth entry is ∂fi/∂xj (a). Note that the ith row of Jf(a) is ∇fi(a).
Exercise-6: [Directional derivative as a linear combination of partial derivatives] Let U ⊂ Rn be open, a ∈ U, and suppose f : U → Rm is differentiable at a. Then,
(i) f′(a; v) = ∑_{j=1}^n vj f′(a; ej) = ∑_{j=1}^n vj ∂f/∂xj (a) for every v = (v1, . . . , vn) = ∑_{j=1}^n vj ej ∈ Rn.
(ii) If m = 1, then f′(a; v) = ⟨∇f(a), v⟩ for every v ∈ Rn.
(iii) If n = 1 and f = (f1, . . . , fm), then f′(a) = (f1′(a), . . . , fm′(a)).
(iv) In the general case, f′(a; v) = (⟨∇f1(a), v⟩, . . . , ⟨∇fm(a), v⟩), where f = (f1, . . . , fm).
[Hint: (i) v ↦ f′(a; v) is linear by [102]. (ii) This follows from (i).]
(iii) The partial derivative ∂fi/∂xj (a) exists for every i ∈ {1, . . . , m} and every j ∈ {1, . . . , n}, and limx→a |fi(x) − fi(a) − ⟨∇fi(a), x − a⟩|/∥x − a∥ = 0 for each i ∈ {1, . . . , m}.
(iv) There exist a linear map L : Rn → Rm and a function F : U → Rm with limx→a F(x) = 0 = F(a) such that f has the Caratheodory representation f(x) − f(a) = L(x − a) + ∥x − a∥F(x) for every x ∈ U (and if this holds, then f′(a; ·) = L).
Proof. (i) ⇔ (ii): Let L : Rn → Rm be a linear map, and write L = (L1, . . . , Lm). Since convergence in Rm is determined coordinatewise, we note that limx→a ∥f(x) − f(a) − L(x − a)∥/∥x − a∥ = 0 iff limx→a |fi(x) − fi(a) − Li(x − a)|/∥x − a∥ = 0 for every i ∈ {1, . . . , m}.
(ii) ⇒ (iii): If Li = fi′(a; ·), then Li(v) = fi′(a; v) = ⟨∇fi(a), v⟩ by [102] and Exercise-6.
(iii) ⇒ (ii): Let Li : Rn → R be the linear map defined as Li(v) = ⟨∇fi(a), v⟩. Then we obtain limx→a |fi(x) − fi(a) − Li(x − a)|/∥x − a∥ = 0 by (iii), and hence fi is differentiable at a.
(i) ⇒ (iv): Let L be as in the definition of differentiability, and define F : U → Rm as F(a) = 0 and F(x) = (f(x) − f(a) − L(x − a))/∥x − a∥ for x ≠ a.
(iv) ⇒ (i): If f(x) − f(a) = L(x − a) + ∥x − a∥F(x), then limx→a ∥f(x) − f(a) − L(x − a)∥/∥x − a∥ = limx→a ∥F(x)∥ = 0.
(ii) ⇒ (i): Let L : Rn → R be L(v) = ⟨F(a), v⟩, which is a linear map. Then for x ≠ a, using (ii) and the Cauchy-Schwarz inequality, we see that |f(x) − f(a) − L(x − a)| = |⟨F(x), x − a⟩ − ⟨F(a), x − a⟩| = |⟨F(x) − F(a), x − a⟩| ≤ ∥F(x) − F(a)∥∥x − a∥. Hence |f(x) − f(a) − L(x − a)|/∥x − a∥ ≤ ∥F(x) − F(a)∥ → 0 as x → a by the continuity of F at a.
(ii) ⇒ (i): Given Φ as in (ii) with Φ(x) = Lx, put L = La. Then f(x) − f(a) − L(x − a) = Lx(x − a) − La(x − a). Hence ∥f(x) − f(a) − L(x − a)∥ ≤ ∥Lx − La∥∥x − a∥ by Exercise-7(iii). So, ∥f(x) − f(a) − L(x − a)∥/∥x − a∥ ≤ ∥Lx − La∥ → 0 as x → a by the continuity of Φ at a.
The following sufficient condition is practically useful to check whether a multivariable function
is differentiable. For u, v ∈ Rn , let [u, v] denote the line segment joining u and v.
Proof. By considering each fi separately, we may suppose f is real-valued (i.e., m = 1). Choose r > 0 such that B(a, r) ⊂ U and all the partial derivatives of f are continuous in B(a, r). Fix x ∈ B(a, r), and note that x − a = ∑_{j=1}^n (xj − aj)ej. Define vectors u0, u1, . . . , un ∈ B(a, r) as follows: u0 = a, u1 = u0 + (x1 − a1)e1, u2 = u1 + (x2 − a2)e2, . . . , un = un−1 + (xn − an)en = x. Observe that uj−1 and uj differ only in the jth coordinate. Define gj : [0, 1] → R as gj(t) = f(uj−1 + t(xj − aj)ej). Applying the one-variable Mean value theorem to gj, we may find a vector vj = uj−1 + tj(xj − aj)ej on the line segment [uj−1, uj] such that f(uj) − f(uj−1) = gj(1) − gj(0) = gj′(tj) = ∂f/∂xj (vj)(xj − aj). Define F : B(a, r) → Rn as F(x) = (∂f/∂x1 (v1), . . . , ∂f/∂xn (vn)). Then we have f(x) − f(a) = ∑_{j=1}^n (f(uj) − f(uj−1)) = ∑_{j=1}^n ∂f/∂xj (vj)(xj − aj) = ⟨F(x), x − a⟩ for every x ∈ B(a, r). Moreover, F(x) → ∇f(a) = F(a) as x → a by the continuity of the partial derivatives of f. Hence f is differentiable at a by [104].
Proof. (i) By [103](iv), f has a Caratheodory representation f(x) − f(a) = L(x − a) + ∥x − a∥F(x) for x ∈ U, where L : Rn → Rm is linear and limx→a F(x) = 0 = F(a). Since L is linear, there is M > 0 such that ∥L(x) − L(y)∥ ≤ M∥x − y∥ for every x, y ∈ Rn. Since limx→a ∥F(x)∥ = 0, there is δ > 0 such that B(a, δ) ⊂ U and ∥F(x)∥ ≤ 1 for every x ∈ B(a, δ). Then ∥f(x) − f(a)∥ ≤ (M + 1)∥x − a∥ for every x ∈ B(a, δ).
Footnote: The existence of ∇(f g)(a) by itself does not imply the differentiability of f g at a.
(ii) The jth column of the matrix of the linear map f′(a; ·) is specified by the vector f′(a; ej) = ∂f/∂xj (a) = (∂f1/∂xj (a), . . . , ∂fm/∂xj (a)).
(iii) If f(x) − f(a) = L1(x − a) + ∥x − a∥F(x) and g(x) − g(a) = L2(x − a) + ∥x − a∥G(x) are the Caratheodory representations as in [103](iv) of f and g respectively, then (c1f + c2g)(x) − (c1f + c2g)(a) = (c1L1 + c2L2)(x − a) + ∥x − a∥(c1F + c2G)(x). Again use [103].
(iv) Let L1 = f′(a; ·), L2 = g′(a; ·), and put L = f(a)L2 + g(a)L1. By adding and subtracting the quantities f(x)g(a) and f(x)L2(x − a), we may note that ∥f(x)g(x) − f(a)g(a) − L(x − a)∥ ≤ ∥f(x)∥∥g(x) − g(a) − L2(x − a)∥ + ∥f(x) − f(a)∥∥L2(x − a)∥ + ∥g(a)∥∥f(x) − f(a) − L1(x − a)∥. Hence, limx→a ∥f(x)g(x) − f(a)g(a) − L(x − a)∥/∥x − a∥ = 0 by hypothesis, by part (i), and by the fact that limx→a L2(x − a) = 0.
We can give another (easier) proof using [104] as follows. Let f (x) − f (a) = ⟨F (x), x − a⟩ and
g(x) − g(a) = ⟨G(x), x − a⟩ be the Caratheodory representations of f and g given by [104]. Then
f (x)g(x) − f (a)g(a) = f (x)g(x) − f (x)g(a) + f (x)g(a) − f (a)g(a) = f (x)(g(x) − g(a)) + g(a)(f (x) −
f (a)) = f (x)⟨G(x), x − a⟩ + g(a)⟨F (x), x − a⟩ = ⟨H(x), x − a⟩, where H(x) := f (x)G(x) + g(a)F (x).
Then H is continuous at a since f, G, F are continuous at a. Hence by [104], f g is differentiable at
a and ∇(f g)(a) = H(a) = f (a)G(a) + g(a)F (a) = f (a)∇g(a) + g(a)∇f (a).
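The product rule ∇(f g)(a) = f(a)∇g(a) + g(a)∇f(a) can be checked against a finite-difference gradient of the product (the example functions f and g below are our own):

```python
def f(x, y):
    return x * x * y          # f(x, y) = x^2 y

def g(x, y):
    return x + 3 * y          # g(x, y) = x + 3y

def grad_f(x, y):
    return (2 * x * y, x * x)  # gradient of f

def grad_g(x, y):
    return (1.0, 3.0)          # gradient of g

a = (2.0, -1.0)
# product rule: grad(fg)(a) = f(a) grad g(a) + g(a) grad f(a)
expected = tuple(f(*a) * dg + g(*a) * df
                 for df, dg in zip(grad_f(*a), grad_g(*a)))

# central-difference gradient of the product f*g at a
h = 1e-6
fg = lambda x, y: f(x, y) * g(x, y)
numeric = ((fg(a[0] + h, a[1]) - fg(a[0] - h, a[1])) / (2 * h),
           (fg(a[0], a[1] + h) - fg(a[0], a[1] - h)) / (2 * h))
assert all(abs(p - q) < 1e-4 for p, q in zip(expected, numeric))
```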
(v) Let f(x) − f(a) = ⟨F(x), x − a⟩ and g(x) − g(a) = ⟨G(x), x − a⟩ be the Caratheodory representations of f and g given by [104]. Then (f/g)(x) − (f/g)(a) = (f(x)g(a) − f(a)g(x))/(g(x)g(a)). But f(x)g(a) − f(a)g(x) = (f(x) − f(a))g(a) − f(a)(g(x) − g(a)) = g(a)⟨F(x), x − a⟩ − f(a)⟨G(x), x − a⟩. Hence (f/g)(x) − (f/g)(a) = ⟨H(x), x − a⟩, where H(x) := (g(a)F(x) − f(a)G(x))/(g(x)g(a)). Clearly, H is continuous at a. Therefore f/g is differentiable at a and ∇(f/g)(a) = H(a) = (g(a)∇f(a) − f(a)∇g(a))/g(a)².
Remark: Identify M(n, R) := {all n × n real matrices} with R^{n×n}. Then A ↦ det(A) and A ↦ trace(A) from R^{n×n} to R are polynomials in the n² matrix entries, and hence differentiable by Exercise-8.
(ii) Suppose k = 1 and f = (f1, . . . , fm) in (i). Then we have that (g ◦ f)′(a; v) = g′(f(a); f′(a; v)) = ⟨∇g(f(a)), f′(a; v)⟩ = ∑_{i=1}^m ∂g/∂yi (f(a)) ⟨∇fi(a), v⟩ for every v ∈ Rn. So, ∇(g ◦ f)(a) = ∑_{i=1}^m ∂g/∂yi (f(a)) ∇fi(a). Moreover, ∂(g ◦ f)/∂xj (a) = ∑_{i=1}^m ∂g/∂yi (f(a)) ∂fi/∂xj (a) = ⟨∇g(f(a)), ∂f/∂xj (a)⟩ for 1 ≤ j ≤ n.
(iii) If n = k = 1 in (i), then (g ◦ f)′(a) = ⟨∇g(f(a)), f′(a)⟩.
(iv) If m = k = 1 in (i), then (g ◦ f)′(a; v) = g′(f(a))f′(a; v) = g′(f(a))⟨∇f(a), v⟩ for every v ∈ Rn.
(v) If n = m = 1 in (i), then (g ◦ f)′(a) = g′(f(a))f′(a) = (g1′(f(a))f′(a), . . . , gk′(f(a))f′(a)), where g = (g1, . . . , gk).
Proof. (i) By [103], f and g have Caratheodory representations f(x) − f(a) = L1(x − a) + ∥x − a∥F(x) and g(y) − g(f(a)) = L2(y − f(a)) + ∥y − f(a)∥G(y), where L1 = f′(a; ·) : Rn → Rm and L2 = g′(f(a); ·) : Rm → Rk are linear, limx→a F(x) = 0 ∈ Rm and limy→f(a) G(y) = 0 ∈ Rk. Let L = L2 ◦ L1 : Rn → Rk, which is again a linear map. We have g(f(x)) − g(f(a)) = L2(f(x) − f(a)) + ∥f(x) − f(a)∥G(f(x)) = L2(L1(x − a) + ∥x − a∥F(x)) + ∥f(x) − f(a)∥G(f(x)) = L(x − a) + ∥x − a∥H(x), where H(x) := L2(F(x)) + (∥f(x) − f(a)∥/∥x − a∥)G(f(x)) for x ≠ a and H(a) := 0 ∈ Rk. Since limx→a ∥H(x)∥ = 0 by [107](i) and by the properties of F, G, and L2, it follows that g ◦ f has the Caratheodory representation g(f(x)) − g(f(a)) = L(x − a) + ∥x − a∥H(x) for x ∈ U. Hence by [103], g ◦ f is differentiable at a with (g ◦ f)′(a; ·) = L = L2 ◦ L1.
Another proof : By Caratheodory lemma [105], there are functions Φ1 : U → L(Rn , Rm ) and
Φ2 : V → L(Rm , Rk ), Φ1 (x) = L1,x and Φ2 (y) = L2,y , such that Φ1 and Φ2 are continuous at
a, f (a) respectively, f (x) − f (a) = L1,x (x − a) for every x ∈ U , and g(y) − g(f (a)) = L2,y (y − f (a))
for every y ∈ V . Let Φ : U → L(Rn , Rk ) be Φ(x) = Lx := L2,f (x) ◦ L1,x . Then g(f (x)) − g(f (a)) =
L2,f (x) (f (x) − f (a)) = L2,f (x) (L1,x (x − a)) = Lx (x − a). Moreover, using Exercise-7(iii), we may
observe that the operator norm ∥Lx − La ∥ = ∥L2,f (x) ◦ L1,x − L2,f (a) ◦ L1,a ∥ = ∥L2,f (x) ◦ (L1,x −
L1,a ) + (L2,f (x) − L2,f (a) ) ◦ L1,a ∥ ≤ ∥L2,f (x) ∥∥L1,x − L1,a ∥ + ∥L2,f (x) − L2,f (a) ∥∥L1,a ∥ → 0 as x → a
by the continuity of f and Φ1 at a, the continuity of Φ2 at f (a), and the continuity of the operator
norm (see Exercise-2). Thus Φ is continuous at a. Therefore by [105], g ◦ f is differentiable at a,
and (g ◦ f )′ (a; ·) = La = L2,f (a) ◦ L1,a = g ′ (f (a); ·) ◦ f ′ (a; ·).
(ii) The first assertion is evident. For the last assertion, argue as follows. We have Jg◦f(a) = Jg(f(a))Jf(a) by (i). Now note that Jg(f(a)) is the 1 × m matrix whose ith entry is ∂g/∂yi (f(a)), and Jf(a) is an m × n matrix whose ijth entry is ∂fi/∂xj (a).
For (iii) and (iv), use part (ii) and Exercise-6(ii). For (v), use part (i) and Exercise-6(iii).
Remark: (i) Suppose n = m = 2 and k = 1 in [105](i). Write f(x, y) = (u(x, y), v(x, y)) and put h = g ◦ f. Then h(x, y) = g(u(x, y), v(x, y)). So the Chain rule as expressed in [105](ii) takes the form ∂h/∂x = (∂g/∂u)(∂u/∂x) + (∂g/∂v)(∂v/∂x) and ∂h/∂y = (∂g/∂u)(∂u/∂y) + (∂g/∂v)(∂v/∂y), an expression usually found in Calculus textbooks. (ii) Two examples illustrating the Chain rule are given below. However, in practical problems, it is often easier to compute (g ◦ f)′(a; ·) directly after determining g ◦ f. The Chain rule is mostly useful in theoretical proofs.
Example: In the two examples below, the functions are differentiable by [106].
(i) [n = k = 1 and m = 2] Let f : R → R2 be f(t) = (t, t²) and g : R2 → R be g(x, y) = x³ + y⁴. Then Jg◦f(t) = Jg(f(t))Jf(t) = Jg(t, t²)Jf(t) = [3t² 4t⁶][1 2t]ᵀ = 3t² + 8t⁷. Hence (g ◦ f)′(t) = 3t² + 8t⁷. Direct computation is easier: (g ◦ f)(t) = t³ + t⁸ and hence (g ◦ f)′(t) = 3t² + 8t⁷.
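The matrix identity Jg◦f(t) = Jg(f(t))Jf(t) in example (i) can be cross-checked against a finite-difference derivative of g ◦ f (helper names ours):

```python
def f(t):
    return (t, t * t)            # f(t) = (t, t^2)

def g(x, y):
    return x ** 3 + y ** 4       # g(x, y) = x^3 + y^4

def gf_prime(t):
    # chain rule: Jg(f(t)) Jf(t) = 3x^2 * 1 + 4y^3 * 2t with (x, y) = f(t)
    x, y = f(t)
    return 3 * x ** 2 * 1 + 4 * y ** 3 * (2 * t)

t = 1.5
h = 1e-6
numeric = (g(*f(t + h)) - g(*f(t - h))) / (2 * h)
assert abs(gf_prime(t) - numeric) < 1e-3
# agrees with the direct computation (g o f)'(t) = 3t^2 + 8t^7
assert abs(gf_prime(t) - (3 * t ** 2 + 8 * t ** 7)) < 1e-9
```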
Exercise-9: Let ϕ, ψ : Rn → R be ϕ(x) = ∥x∥², ψ(x) = ∥x∥, and ρ : Rn × Rn → R be ρ(x, y) = ⟨x, y⟩.
(i) ϕ is differentiable in Rn, ∇ϕ(a) = 2a, and ϕ′(a; v) = 2⟨a, v⟩ for every a, v ∈ Rn.
(ii) ψ is differentiable in Rn \ {0}, ∇ψ(a) = a/∥a∥, and ψ′(a; v) = ⟨a, v⟩/∥a∥ for every a ∈ Rn \ {0} and v ∈ Rn. But ψ is not differentiable at 0 ∈ Rn.
(iii) ρ is differentiable in Rn × Rn, ∇ρ(a, b) = (b, a), and ρ′((a, b); (u, v)) = ⟨b, u⟩ + ⟨a, v⟩ for every (a, b), (u, v) ∈ Rn × Rn.
[Hint: (i) ϕ(x) = ∑_{j=1}^n xj². So ϕ is differentiable by Exercise-8, and ∇ϕ(a) = 2a. (ii) Let ϕ : Rn \ {0} → (0, ∞) be ϕ(x) = ∥x∥², and g : (0, ∞) → R be g(x) = √x. Then ϕ, g are differentiable and ψ = g ◦ ϕ on Rn \ {0}. Hence ψ is differentiable on Rn \ {0}, and ψ′(a; v) = g′(ϕ(a))⟨∇ϕ(a), v⟩. To see ψ is not differentiable at 0 ∈ Rn, note that t ↦ |t| = ψ(t, 0, . . . , 0) is not differentiable at 0 ∈ R. (iii) ρ is differentiable by Exercise-8 since it is a multivariable polynomial.]
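The gradient formulas ∇ϕ(a) = 2a and ∇ψ(a) = a/∥a∥ of Exercise-9 can be verified with a finite-difference gradient (helper names ours):

```python
import math

def phi(x):
    # phi(x) = ||x||^2
    return sum(t * t for t in x)

def grad_numeric(fun, a, h=1e-6):
    # central-difference gradient of fun at a
    g = []
    for j in range(len(a)):
        ap = list(a); ap[j] += h
        am = list(a); am[j] -= h
        g.append((fun(ap) - fun(am)) / (2 * h))
    return g

a = [1.0, -2.0, 3.0]
# Exercise-9(i): grad phi(a) = 2a
assert all(abs(p - 2 * q) < 1e-4 for p, q in zip(grad_numeric(phi, a), a))

psi = lambda x: math.sqrt(phi(x))
# Exercise-9(ii): grad psi(a) = a / ||a|| for a != 0
na = math.sqrt(phi(a))
assert all(abs(p - q / na) < 1e-4 for p, q in zip(grad_numeric(psi, a), a))
```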
[109] Let U ⊂ Rn be open, and a, b ∈ U be distinct vectors such that the line segment [a, b] ⊂ U .
(i) [Mean value theorem for real valued functions] If f : U → R is differentiable, then there is
c ∈ (a, b) such that f (b) − f (a) = f ′ (c; b − a) = ⟨∇f (c), b − a⟩.
(ii) [Mean value theorem for vector-valued functions] If f : U → Rm is differentiable, then for each
z ∈ Rm , there is c ∈ (a, b) (where c depends on z) such that ⟨f (b) − f (a), z⟩ = ⟨f ′ (c; b − a), z⟩.
(iii) [Mean value inequalities for multivariable functions] Let f = (f1, . . . , fm) : U → Rm be differentiable. Let M1 = sup{∥f′(c; ·)∥ : c ∈ [a, b]}, where ∥f′(c; ·)∥ is the operator norm of the linear map f′(c; ·), and M2 = sup{∑_{i=1}^m ∥∇fi(c)∥ : c ∈ [a, b]}. Then ∥f(b) − f(a)∥ ≤ M1∥b − a∥ and ∥f(b) − f(a)∥ ≤ M2∥b − a∥.
Proof. (i) Let g : [0, 1] → R be g(t) = f (a + t(b − a)), which is differentiable, being the composition
of two differentiable functions. Applying the one-variable Mean value theorem to g, we may find
t0 ∈ (0, 1) with f (b) − f (a) = g(1) − g(0) = g ′ (t0 )(1 − 0) = f ′ (c; b − a), where c := a + t0 (b − a).
(ii) Fix z ∈ Rm and define g : [0, 1] → R as g(t) = ⟨f (a + t(b − a)), z⟩, which is differentiable since f
and the maps t 7→ a + t(b − a), y 7→ ⟨y, z⟩ are differentiable. Applying the one-variable Mean value
theorem to g, we may find t0 ∈ (0, 1) with ⟨f (b) − f (a), z⟩ = g(1) − g(0) = g ′ (t0 )(1 − 0) = g ′ (t0 ).
Let c = a + t0(b − a). By the Chain rule and the linearity of the inner product in the first variable, we may observe that g′(t0) = ⟨f′(c; b − a), z⟩ (or directly calculate the limit limt→t0 (g(t) − g(t0))/(t − t0)).
(iii) Taking z = f(b) − f(a) in (ii) and applying the Cauchy-Schwarz inequality, we get ∥f(b) − f(a)∥² = |⟨f(b) − f(a), f(b) − f(a)⟩| = |⟨f′(c; b − a), f(b) − f(a)⟩| ≤ ∥f′(c; b − a)∥∥f(b) − f(a)∥ for some c ∈ (a, b), and hence ∥f(b) − f(a)∥ ≤ ∥f′(c; b − a)∥. Since ∥f′(c; b − a)∥ ≤ ∥f′(c; ·)∥∥b − a∥, it follows that ∥f(b) − f(a)∥ ≤ M1∥b − a∥. Moreover, as f′(c; b − a) = ∑_{i=1}^m ⟨∇fi(c), b − a⟩ei, another application of the Cauchy-Schwarz inequality yields ∥f′(c; b − a)∥ ≤ ∑_{i=1}^m |⟨∇fi(c), b − a⟩| ≤ ∑_{i=1}^m ∥∇fi(c)∥∥b − a∥. This implies ∥f(b) − f(a)∥ ≤ M2∥b − a∥.
Example: Let U ⊂ Rn be open, f : U → Rm be differentiable, and [a, b] ⊂ U. Then there may not exist c ∈ (a, b) with f(b) − f(a) = f′(c; b − a). Consider f : R2 → R2 defined as f(x, y) = (x², y³). Let a = (0, 0) and b = (1, 1). Since b − a = (1, 1) and Jf(c1, c2) is the diagonal matrix diag(2c1, 3c2²) for any c = (c1, c2) ∈ R2, we get f′(c; b − a) = (2c1, 3c2²). If f′(c; b − a) = f(b) − f(a) = (1, 1), then 2c1 = 1 and 3c2² = 1, so c1 = 1/2 and c2 = ±1/√3; thus c1 ≠ c2 and hence c ∉ [a, b].
Definition and Example: Let U ⊂ Rn be open and f : U → Rm be a function for which the partial derivatives ∂f/∂xj : U → Rm exist in U. If ∂²f/∂xi∂xj := ∂/∂xi (∂f/∂xj) exists, then it is called a second order partial derivative of f. Repeating this process, for any k ∈ N, we may define kth order partial derivatives of f, if they exist. For example, let f : R2 → R be f(x, y) = (x³y − xy³)/(x² + y²) if (x, y) ≠ (0, 0) and f(0, 0) = 0. Then ∂f/∂x (x, y) = (x⁴y + 4x²y³ − y⁵)/(x² + y²)² for (x, y) ≠ (0, 0) and ∂f/∂x (0, 0) = 0. Since ∂f/∂x (0, y) = −y, we get ∂²f/∂y∂x (0, y) = −1 for every y ∈ R. Similarly, ∂f/∂y (x, y) = (x⁵ − 4x³y² − xy⁴)/(x² + y²)² for (x, y) ≠ (0, 0) and ∂f/∂y (0, 0) = 0. Since ∂f/∂y (x, 0) = x, we get ∂²f/∂x∂y (x, 0) = 1 for every x ∈ R. Hence ∂²f/∂x∂y (0, 0) = 1 ≠ −1 = ∂²f/∂y∂x (0, 0).
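The inequality of the mixed partials at the origin in this example can be reproduced with finite differences (helper names ours):

```python
def f(x, y):
    # f(0,0) = 0 and f(x,y) = (x^3 y - x y^3) / (x^2 + y^2) otherwise
    if x == 0 and y == 0:
        return 0.0
    return (x ** 3 * y - x * y ** 3) / (x * x + y * y)

def fx(x, y, h=1e-6):
    # first partial derivative in x by central difference
    return (f(x + h, y) - f(x - h, y)) / (2 * h)

def fy(x, y, h=1e-6):
    # first partial derivative in y by central difference
    return (f(x, y + h) - f(x, y - h)) / (2 * h)

h = 1e-4
fyx = (fx(0.0, h) - fx(0.0, -h)) / (2 * h)  # d/dy of f_x at (0,0): about -1
fxy = (fy(h, 0.0) - fy(-h, 0.0)) / (2 * h)  # d/dx of f_y at (0,0): about +1
assert abs(fyx - (-1.0)) < 1e-3
assert abs(fxy - 1.0) < 1e-3
```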
[110] [Equality of mixed partial derivatives] Let U ⊂ Rn be open, f : U → Rm be a function, and 1 ≤ j, k ≤ n. If ∂²f/∂xj∂xk and ∂²f/∂xk∂xj exist and are continuous in U, then ∂²f/∂xj∂xk = ∂²f/∂xk∂xj in U.
Proof. Writing f = (f1, . . . , fm) and considering each fi separately, we may suppose m = 1. Also we may assume n = 2 since we deal with only two variables at a time. So now the function under consideration is f : U ⊂ R2 → R. Let (a, b) ∈ U. Then
∂²f/∂y∂x (a, b) = limy→b (1/(y − b))[∂f/∂x (a, y) − ∂f/∂x (a, b)]
= limy→b limx→a ((f(x, y) − f(a, y)) − (f(x, b) − f(a, b)))/((y − b)(x − a))
= limy→b limx→a (g(x, y) − g(x, b))/((y − b)(x − a)) (where g(x, y) := f(x, y) − f(a, y))
= limy→b limx→a (∂g/∂y (x, y0))/(x − a) (for some y0 ∈ (b, y) by the Mean value theorem)
= limy→b limx→a (∂f/∂y (x, y0) − ∂f/∂y (a, y0))/(x − a)
= limy→b limx→a ∂²f/∂x∂y (x0, y0) (for some x0 ∈ (a, x) by the Mean value theorem)
= ∂²f/∂x∂y (a, b), since ∂²f/∂x∂y is continuous and (x0, y0) → (a, b) as (x, y) → (a, b).
Note that in this proof, we used only the continuity of ∂²f/∂x∂y.
[111] (i) Let a ∈ Rn and 0 < r < s. Then there exist smooth functions (i.e., C ∞ -functions)
f, g : Rn → R with the following properties: 0 ≤ f, g ≤ 1, f (x) > 0 iff ∥a − x∥ < s, f (x) = 0 iff
∥a − x∥ ≥ s, g(x) > 0 iff ∥a − x∥ > r, and g(x) = 0 iff ∥a − x∥ ≤ r.
(ii) Let a ∈ Rn and 0 < r < s. Then there is a smooth function h : Rn → R with 0 ≤ h ≤ 1 such
that h(x) = 1 for every x ∈ B(a, r), and h(x) = 0 for every x ∈ Rn \ B(a, s).
(iii) Let A ⊂ U ⊂ Rn , where A is a nonempty compact set and U is open in Rn . Then there is a
smooth function h : Rn → R with 0 ≤ h ≤ 1 such that h(x) = 1 for every x ∈ A, and h(x) = 0 for
every x ∈ Rn \ U .
Proof. (i) We know from Real Analysis that there is a smooth function ψ : R → R such that 0 ≤ ψ ≤ 1, ψ(x) = 0 for x ≤ 0, and ψ(x) > 0 for x > 0. For example, we may take ψ to be the function ψ(x) = 0 if x ≤ 0 and ψ(x) = e^{−1/x} if x > 0. Note that the function y ↦ ∥y∥² from Rn to R is a multivariable polynomial and hence smooth. Therefore, the functions f, g : Rn → R defined as f(x) = ψ(s² − ∥a − x∥²) and g(x) = ψ(∥a − x∥² − r²) are smooth. Verify the required properties.
(ii) Let f, g be as in the above proof. Note that f(x) + g(x) > 0 for every x ∈ Rn. Define h : Rn → R as h(x) = f(x)/(f(x) + g(x)). Check that this works.
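The construction in (i)–(ii) translates directly into code; the sketch below builds the bump function h = f/(f + g) from ψ (function names ours):

```python
import math

def psi(t):
    # smooth function: 0 for t <= 0 and e^(-1/t) for t > 0
    return 0.0 if t <= 0 else math.exp(-1.0 / t)

def bump(x, a, r, s):
    # h from [111](ii): h = 1 on B(a, r), h = 0 outside B(a, s), 0 <= h <= 1
    d2 = sum((p - q) ** 2 for p, q in zip(x, a))
    f = psi(s * s - d2)   # positive iff ||x - a|| < s
    g = psi(d2 - r * r)   # positive iff ||x - a|| > r
    return f / (f + g)

a, r, s = [0.0, 0.0], 1.0, 2.0
assert bump([0.5, 0.0], a, r, s) == 1.0       # inside B(a, r): g = 0
assert bump([3.0, 0.0], a, r, s) == 0.0       # outside B(a, s): f = 0
assert 0.0 < bump([1.5, 0.0], a, r, s) < 1.0  # smooth transition in between
```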
(iii) For each a ∈ A, there is ra > 0 with B(a, 2ra) ⊂ U. As {B(a, ra) : a ∈ A} is an open cover for the compact set A, there are finitely many vectors a1, . . . , ap ∈ A with A ⊂ ⋃_{j=1}^p B(aj, raj). First proof: By (ii), there are smooth functions h1, . . . , hp : Rn → R such that 0 ≤ hj ≤ 1, hj(x) = 1 for every x ∈ B(aj, raj), and hj(x) = 0 for every x ∈ Rn \ B(aj, 2raj), 1 ≤ j ≤ p. Let h : Rn → R be h(x) = 1 − ∏_{j=1}^p (1 − hj(x)), and check that this works. Second proof: Let U1 = ⋃_{j=1}^p B(aj, raj) and U2 = ⋃_{j=1}^p B(aj, 2raj). Then A ⊂ U1 ⊂ U̅1 ⊂ U2 ⊂ U. For each j, we may choose smooth functions fj, gj : Rn → R by (i) such that fj(x) > 0 iff ∥aj − x∥ < 2raj, fj(x) = 0 iff ∥aj − x∥ ≥ 2raj, gj(x) > 0 iff ∥aj − x∥ > raj, and gj(x) = 0 iff ∥aj − x∥ ≤ raj. Let F, G : Rn → R be F = ∑_{j=1}^p fj and G = ∏_{j=1}^p gj. Then F, G are smooth, F(x) > 0 iff x ∈ U2, F(x) = 0 iff x ∈ Rn \ U2, and G(x) = 0 iff ∥aj − x∥ ≤ raj for some j; in particular, G = 0 on U1 ⊃ A, and F + G > 0 in Rn. Define h : Rn → R as h(x) = F(x)/(F(x) + G(x)). Check that this works.
Remark: Using [111](iii), we may construct smooth partitions of unity; read this from textbooks.
20 T.K.SUBRAHMONIAN MOOTHATHU
Proof. (i) Define g : [0, 1] → R as g(t) = f(a + t(b − a)), which is a C^{q+1}-function. Note that g′(t) = f′(a + t(b − a); b − a) = ⟨∇f(a + t(b − a)), b − a⟩ = (⟨b − a, ∇⟩f)(a + t(b − a)), and hence we may verify inductively that g^(k)(t) = (⟨b − a, ∇⟩^k f)(a + t(b − a)) for 1 ≤ k ≤ q + 1, where g^(k) denotes the kth derivative of g. Applying the one-variable Taylor's theorem to g, we may find t0 ∈ (0, 1) with g(1) = g(0) + ∑_{k=1}^q g^(k)(0)(1 − 0)^k/k! + g^(q+1)(t0)(1 − 0)^{q+1}/(q + 1)!. This means f(b) = f(a) + ∑_{k=1}^q (⟨b − a, ∇⟩^k f)(a)/k! + (⟨b − a, ∇⟩^{q+1} f)(c)/(q + 1)!, where c := a + t0(b − a).
(ii) Let g be as above. Then the integral form of the one-variable Taylor's theorem says that g(1) = g(0) + ∑_{k=1}^q g^(k)(0)(1 − 0)^k/k! + ∫_0^1 ((1 − t)^q/q!) g^(q+1)(t) dt. This yields the required result.
Definition: Let U ⊂ Rn be open and f : U → R be a function for which all the second order partial derivatives exist at a ∈ U. Then the n × n matrix Hf(a) whose ijth entry is ∂²f/∂xi∂xj (a) is called the Hessian matrix of f at a. Note that if the second order partial derivatives of f are continuous in a neighborhood of a, then Hf(a) is a symmetric matrix by [110].
Definition: Let A be an n×n real symmetric matrix. For x ∈ Rn , let Ax = A(x) ∈ Rn be the vector
obtained by applying the linear map specified by A (with respect to the standard basis) to x. Since
A is symmetric, Ax can be obtained in terms of matrix multiplication either by multiplying the
column vector x on the left by A, or the row vector x on the right by A. We say A is positive definite
if ⟨Ax, x⟩ > 0 for every x ∈ Rn \ {0}, and negative definite if ⟨Ax, x⟩ < 0 for every x ∈ Rn \ {0}.
Example: When n = 2, we have that [[1, 0], [0, 1]] is positive definite, [[−1, 0], [0, −1]] is negative definite, and [[1, 0], [0, −1]] is a symmetric matrix which is neither positive definite nor negative definite.
[113] [Facts from Linear Algebra] Let A be an n × n real symmetric matrix. Then,
(i) The eigenvalues, say λ1 , . . . , λn , of A are all real, and there is an orthonormal basis {u1 , . . . , un }
of Rn such that Auj = λj uj for 1 ≤ j ≤ n.
(ii) A is positive definite ⇔ all eigenvalues of A are positive.
(iii) A is negative definite ⇔ all eigenvalues of A are negative.
(iv) Suppose n = 2 and A = [aij ]2×2 . Then, A is positive definite ⇔ det(A) > 0 and a11 > 0.
Similarly, A is negative definite ⇔ det(A) > 0 (warning: not negative) and a11 < 0. If det(A) < 0,
then the two eigenvalues of A have opposite signs because det(A) is equal to the product of the
two eigenvalues of A.
(iii) Note that A is negative definite iff −A is positive definite, or imitate the proof of (ii).
(iv) Note that a12 = a21 since A is symmetric. Since ⟨Ae1, e1⟩ = a11, we may suppose a11 ≠ 0 for proving both the implications. Now for (x, y) ∈ R2, observe that ⟨A(x, y), (x, y)⟩ = a11x² + 2a12xy + a22y² = a11(x + a12y/a11)² + (a22 − a12²/a11)y² = a11(x + a12y/a11)² + det(A)y²/a11. From this expression, it is clear that A is positive definite if a11 > 0 and det(A) > 0. Conversely, if A is positive definite, then from the above expression, a11 = ⟨Ae1, e1⟩ > 0 and det(A)a11 = ⟨A(a12, −a11), (a12, −a11)⟩ > 0 so that det(A) > 0.
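The determinant criterion of [113](iv) can be cross-checked numerically against the eigenvalue criterion of [113](ii). A small Python sketch (the test matrices below are chosen here for illustration):

```python
import math

def eigenvalues_sym2(a11, a12, a22):
    # roots of the characteristic polynomial via the quadratic formula;
    # the discriminant tr^2 - 4 det is >= 0 for a real symmetric matrix
    tr, det = a11 + a22, a11 * a22 - a12 * a12
    d = math.sqrt(tr * tr - 4 * det)
    return (tr - d) / 2, (tr + d) / 2

def positive_definite_113iv(a11, a12, a22):
    # the criterion of [113](iv): det(A) > 0 and a11 > 0
    return a11 * a22 - a12 * a12 > 0 and a11 > 0

for entries in [(2, 1, 3), (1, 2, 1), (-2, 1, -2), (1, 0, -1)]:
    lo, hi = eigenvalues_sym2(*entries)
    # both criteria agree on each example
    print(entries, positive_definite_113iv(*entries), lo > 0 and hi > 0)
```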
Remark: If A ∈ Rn×n , then the function Q : Rn → R given by Q(x) = ⟨Ax, x⟩ is called a quadratic
form. Some of the assertions in [113] can also be stated in the language of a quadratic form.
(iii) If Hf (a) has two eigenvalues λ1 and λ2 with λ1 < 0 < λ2 , then for the corresponding eigen-
vectors x, y ∈ Rn \ {0}, we have that ⟨Hf (a)x, x⟩ < 0 < ⟨Hf (a)y, y⟩. So assume this inequality
holds. Let v = y/∥y∥. We claim that f (a + εv) > f (a) for every sufficiently small ε > 0. Since
δ := ⟨Hf (a)v, v⟩ is positive, we may choose r > 0 such that B(a, r) ⊂ U and the operator norm
∥Hf (c) − Hf (a)∥ < δ for every c ∈ B(a, r). Now consider any ε ∈ (0, r) and put b = a + εv ∈
B(a, r)\{a}. By Exercise-14, there is c ∈ (a, b) ⊂ B(a, r) with 2(f (b)−f (a)) = ⟨Hf (c)(b−a), b−a⟩.
As in the proof of (i), we conclude that 2(f(b) − f(a))/∥b − a∥² = ⟨Hf(a)v, v⟩ + ⟨(Hf(c) − Hf(a))v, v⟩ > 0,
where the last inequality is deduced using the choice of δ and r. This shows that f (a + εv) − f (a) =
f (b) − f (a) > 0 for every ε ∈ (0, r). Similarly, by taking u = x/∥x∥, we may show that
f (a + εu) − f (a) < 0 for every sufficiently small ε > 0. Hence a is a saddle point of f .
Example: (i) Let f : R2 → R be f(x, y) = xy − x² − y². Then we see that ∇f(x, y) = (y − 2x, x − 2y), and hence ∇f(0, 0) = (0, 0). Now, A = [aij] := Hf(0, 0) = [[−2, 1], [1, −2]]. The eigenvalues of this matrix are the roots of its characteristic polynomial, which is λ² + 4λ + 3. Hence the eigenvalues are −1 and −3. Thus (0, 0) is a strict local maximum of f by [114](i). Alternatively, one may use [114](i) and [113](iv) to conclude the same since det(A) > 0 and a11 = −2 < 0.
(ii) Let f : R2 → R be f(x, y) = e^{x²+y²}. Directly, we may see that (0, 0) is the unique minimum of f. Now ∇f(x, y) = (2x e^{x²+y²}, 2y e^{x²+y²}) so that ∇f(0, 0) = (0, 0), and Hf(0, 0) = [[2, 0], [0, 2]], whose eigenvalues are positive. Thus (0, 0) is a local minimum of f by our test.
(iii) Let f : R2 → R be f(x, y) = e^x y². Since e^x > 0 and y² ≥ 0, we see directly that (x, 0) is a local minimum of f for every x ∈ R. However, this does not follow from our test. Note that ∇f(x, y) = (e^x y², 2e^x y) so that ∇f(0, 0) = (0, 0). Since Hf(0, 0) = [[0, 0], [0, 2]], we have det(Hf(0, 0)) = 0, and the tests [113] and [114] are not applicable.
(iv) Let f : R2 → R be f(x, y) = cos x + y sin x. Then ∇f(x, y) = (−sin x + y cos x, sin x) so that ∇f(0, 0) = (0, 0). We have Hf(0, 0) = [[−1, 1], [1, 0]]. Since det(Hf(0, 0)) < 0, we conclude by [113](iv) and [114](iii) that (0, 0) is a saddle point of f.
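The classification used in these examples can be collected into a tiny routine. The sketch below applies the determinant criteria of [113](iv) and [114] to the hand-computed 2 × 2 Hessians at the critical point (0, 0); the labels follow the conclusions reached above.

```python
def classify(h11, h12, h22):
    # second-derivative test for a 2x2 symmetric Hessian [[h11, h12], [h12, h22]]
    det = h11 * h22 - h12 * h12
    if det > 0 and h11 > 0:
        return "strict local minimum"
    if det > 0 and h11 < 0:
        return "strict local maximum"
    if det < 0:
        return "saddle point"
    return "test inconclusive"    # det = 0, as in example (iii)

print(classify(-2, 1, -2))  # example (i):  strict local maximum
print(classify(2, 0, 2))    # example (ii): strict local minimum
print(classify(0, 0, 2))    # example (iii): test inconclusive
print(classify(-1, 1, 0))   # example (iv): saddle point
```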
Under suitable hypotheses, we will show that if the Jacobian matrix Jf(a) of f at a is invertible, then f is locally invertible. We need a little preparation.
Exercise-15: Let L(Rn, Rn) = {all linear maps L : Rn → Rn}, equipped with the operator norm. Recall that L(Rn, Rn) ≅ Rn×n, and the operator norm on L(Rn, Rn) is equivalent to the Euclidean norm on Rn×n. Let U := {L ∈ L(Rn, Rn) : L is invertible} = {A ∈ Rn×n : det(A) ≠ 0}. Then,
(i) U is open in L(Rn, Rn) (equivalently, in Rn×n) because det : Rn×n → R is continuous.
(ii) The map A ↦ A⁻¹ from U to U is continuous.
[Hint: (ii) Consider A ∈ U. Choose 0 < r < 1/(2∥A⁻¹∥) by (i) such that C ∈ U whenever ∥A − C∥ < r. Now consider C ∈ U with ∥A − C∥ < r. Note that C⁻¹ − A⁻¹ = C⁻¹(A − C)A⁻¹. Hence ∥C⁻¹∥ ≤ ∥A⁻¹∥ + ∥C⁻¹∥∥A − C∥∥A⁻¹∥ < ∥A⁻¹∥ + ∥C⁻¹∥/2, and hence ∥C⁻¹∥ ≤ 2∥A⁻¹∥. Therefore, ∥C⁻¹ − A⁻¹∥ ≤ ∥C⁻¹∥∥A − C∥∥A⁻¹∥ < 2∥A⁻¹∥²∥A − C∥, which gives continuity at A.]
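The hint's estimate can be tested numerically. The sketch below uses the Frobenius norm in place of the operator norm (the Frobenius norm is also submultiplicative, so the same algebra applies); the 2 × 2 matrices A and C are chosen here for illustration.

```python
import math

def frob(M):                     # Frobenius norm of a 2x2 matrix
    return math.sqrt(sum(x * x for row in M for x in row))

def inv2(M):                     # inverse of a 2x2 matrix
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def sub(M, N):                   # entrywise difference M - N
    return [[M[i][j] - N[i][j] for j in range(2)] for i in range(2)]

A = [[2.0, 1.0], [0.0, 3.0]]
C = [[2.01, 1.0], [0.0, 2.98]]   # small perturbation, ||A - C|| < 1/(2||A^-1||)
Ainv, Cinv = inv2(A), inv2(C)
lhs = frob(sub(Cinv, Ainv))
rhs = 2 * frob(Ainv) ** 2 * frob(sub(A, C))
print(lhs, rhs)                  # lhs <= rhs, as the hint predicts
```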
Proof. (i) Let U^n = U × ··· × U ⊂ Rn×n. Define ϕ : U^n → R as ϕ(u1, . . . , un) = det([∂fi(ui)/∂xj]_{n×n}). Then ϕ is continuous because f is a C1-function and det : Rn×n → R is continuous. Since ϕ(a, . . . , a) = det(Jf(a)) ≠ 0, we may find r > 0 with B(a, r) ⊂ U such that ϕ(c1, . . . , cn) ≠ 0 for every (c1, . . . , cn) ∈ B(a, r) × ··· × B(a, r) = B(a, r)^n. In particular, det(Jf(x)) = ϕ(x, . . . , x) ≠ 0 for every x ∈ B(a, r). Now suppose u, x ∈ B(a, r) are distinct. By Exercise-16, there are c1, . . . , cn ∈ (u, x) with f(x) − f(u) = [∂fi(ci)/∂xj](x − u). Since x − u ≠ 0 and det([∂fi(ci)/∂xj]) = ϕ(c1, . . . , cn) ≠ 0, it follows that f(x) − f(u) ≠ 0. Thus f is injective on B(a, r).
(ii) Let U0 ⊂ B(a, r) be open and u ∈ U0. We have to find δ > 0 with B(f(u), δ) ⊂ f(U0). Choose ε > 0 with cl(B(u, ε)) ⊂ U0. Since the boundary ∂B(u, ε) is compact, its image f(∂B(u, ε)) is also compact by the continuity of f. Moreover, f(u) ∉ f(∂B(u, ε)) by injectivity. Hence there is δ > 0 such that ∥f(x) − f(u)∥ ≥ 2δ for every x ∈ ∂B(u, ε). Fix y ∈ B(f(u), δ) and define ψ : cl(B(u, ε)) → R as ψ(x) = ∥f(x) − y∥² = ∑_{i=1}^n (fi(x) − yi)². Then ψ is continuous and attains its minimum at some z ∈ cl(B(u, ε)) by compactness. Observe that if x ∈ ∂B(u, ε), then ψ(x) ≥ (∥f(x) − f(u)∥ − ∥f(u) − y∥)² ≥ (2δ − δ)² = δ² > ψ(u). Therefore, ψ cannot attain its minimum on the boundary of B(u, ε). In other words, we must have z ∈ B(u, ε). Since ψ is differentiable, we conclude by Exercise-12 that ∇ψ(z) = 0. This means 0 = ∂ψ/∂xj (z) = 2∑_{i=1}^n (fi(z) − yi) ∂fi/∂xj (z) for 1 ≤ j ≤ n. In matrix form, this is equivalent to (f(z) − y)Jf(z) = 0 ∈ Rn. As det(Jf(z)) ≠ 0 by (i), the only possibility is f(z) = y. This shows B(f(u), δ) ⊂ f(B(u, ε)) ⊂ f(U0).
(iii) Fix v = f(u) ∈ V, where u ∈ B(a, r). We claim that g is differentiable at v with Jg(v) = Jf(u)⁻¹. Consider y ∈ V \ {v}. By (i), there is x ∈ B(a, r) \ {u} with f(x) = y. Then by Exercise-16, there are c1, . . . , cn ∈ (u, x) with f(x) − f(u) = [∂fi(ci)/∂xj](x − u). This means y − v = [∂fi(ci)/∂xj](g(y) − g(v)), and hence g(y) − g(v) = [∂fi(ci)/∂xj]⁻¹(y − v). As y → v, we have x → u since g is continuous by (ii); and consequently [∂fi(ci)/∂xj]⁻¹ → Jf(u)⁻¹ in Rn×n by the C1-property of f and Exercise-15(ii). Therefore, it follows by [105] (or by direct calculation) that g is differentiable at v with Jg(v) = Jf(u)⁻¹. This proves the claim, and thus g is differentiable in V with Jg(v) = Jf(u)⁻¹ for every v = f(u), where u ∈ B(a, r). Since f is a C1-function, it now follows by Exercise-15(ii) that g is also a C1-function (or note that the entries of Jg(v) are rational functions of the entries of Jf(u)). Similarly, if f is a C^k-function, we may deduce that g is a C^k-function.
Remark: (i) The Inverse function theorem guarantees only a local inverse for f. Even if det(Jf(a)) ≠ 0 for every a ∈ U, f may not have a global inverse. For example, let f : R2 → R2 be f(x, y) = (e^x cos y, e^x sin y). Then f(x, y + 2π) = f(x, y) so that f fails to be injective, and hence f has no global inverse. But det(Jf(x, y)) = det([[e^x cos y, −e^x sin y], [e^x sin y, e^x cos y]]) = e^{2x} ≠ 0 for every (x, y) ∈ R2.
(ii) For another proof of [115] using Banach fixed point theorem for contraction maps, see Theorem
9.24 in Rudin, Principles of Mathematical Analysis.
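The example in (i) can be made concrete: on the strip y ∈ (−π, π), an explicit local inverse of f(x, y) = (e^x cos y, e^x sin y) is available via log and atan2. A Python sketch (the sample point is an arbitrary choice for illustration):

```python
import math

def f(x, y):
    return (math.exp(x) * math.cos(y), math.exp(x) * math.sin(y))

def g(u, v):
    # a local inverse of f, valid when the original y lies in (-pi, pi)
    return (0.5 * math.log(u * u + v * v), math.atan2(v, u))

x, y = 0.4, 1.1
u, v = f(x, y)
print(g(u, v))                    # recovers (0.4, 1.1)
u2, v2 = f(x, y + 2 * math.pi)    # f is 2*pi-periodic in y, so no global inverse
print(abs(u2 - u) + abs(v2 - v))  # ~0
```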
To motivate our next result, we ask the following question: if f : Rn+m → Rm is a function,
then from the expression f (x, y) = 0 (where x ∈ Rn and y ∈ Rm ), can we solve y as a function of
x, at least locally? In other words, can we represent the zero set {(x, y) ∈ Rn+m : f (x, y) = 0}, at
least locally, as the graph G(g) of a function g from an open set in Rn to Rm ? This question about
the ‘implicit’ existence of a function g is answered by the Implicit function theorem stated as [116]
below. First let us consider some examples.
(ii) Let f : R2 → R be f (x, y) = xy. Here, y cannot be solved as a function of x globally from
‘f (x, y) = 0’ since f (0, y) = 0 for every y ∈ R. However, if x ̸= 0, then we have y = 0 whenever
f (x, y) = 0, and hence y may be thought of as the zero function of x for x ∈ R \ {0}.
(iii) Let f : R2 → R be f(x, y) = x² + y² − 1. Then {(x, y) : f(x, y) = 0} is the unit circle in R2. The full circle is not the graph of any function. But any arc of the unit circle which projects injectively to the x-axis is clearly the graph of a function; for example, {(x, y) : f(x, y) = 0 and y > 0} is the graph of the function g : (−1, 1) → R defined as g(x) = √(1 − x²). If an open arc of the unit circle contains either (1, 0) or (−1, 0), then that arc is not the graph of any function of x. Here, note that (1, 0) and (−1, 0) are precisely the points where ∂f/∂y vanishes.
(iv) Let f : Rn+m → Rm be linear. Let L1 : Rn → Rm and L2 : Rm → Rm be L1(x) = f(x, 0) and L2(y) = f(0, y) so that f(x, y) = L1(x) + L2(y). Now from 'f(x, y) = 0', we can solve y as y = −L2⁻¹(L1(x)) provided L2 is invertible. Let A be the m × (n + m) matrix of the linear map f with respect to the standard bases of Rn+m and Rm. Then Jf(x, y) = A for every (x, y) ∈ Rn+m. Note that the (n + j)th column of A is the same as the jth column of the matrix of the linear map L2 since L2(y) = f(0, y). Therefore L2 is invertible iff the m × m matrix [∂f/∂y (x, y)] is invertible.
[116] [Implicit function theorem] Let U ⊂ Rn+m be open, f : U → Rm be a C1-function, and (a, b) ∈ U be such that f(a, b) = 0 ∈ Rm and the m × m matrix [∂f/∂y (a, b)] is invertible. Then, there exist r > 0, an open neighborhood A ⊂ Rn of a, and a C1-function g : A → Rm such that
(i) B((a, b), r) ⊂ U, g(a) = b, and {(x, y) ∈ B((a, b), r) : f(x, y) = 0} = {(x, g(x)) : x ∈ A}.
(ii) Jg(x) = −[∂f/∂y (x, g(x))]⁻¹ [∂f/∂x (x, g(x))] for every x ∈ A.
Proof. (i) Let F : U → Rn+m be F(x, y) = (x, f(x, y)), which is a C1-function with F(a, b) = (a, 0). As JF(x, y) = [[In, 0], [∂f/∂x (x, y), ∂f/∂y (x, y)]], we see det(JF(a, b)) = det([∂f/∂y (a, b)]) ≠ 0. By the Inverse function theorem [115], there exist r > 0, an open set V ⊂ Rn+m, and a C1-function G : V → B((a, b), r) such that B((a, b), r) ⊂ U, det(JF(x, y)) = det([∂f/∂y (x, y)]) ≠ 0 for every (x, y) ∈ B((a, b), r), F|B((a,b),r) : B((a, b), r) → V is a bijective open map, and G is the inverse of F|B((a,b),r). Moreover, G must be of the form G(x, y) = (x, h(x, y)) since F(x, y) = (x, f(x, y)).
(ii) Let ϕ : A → Rn+m be ϕ(x) = (x, g(x)) = G(x, 0), which is a C1-function with f ∘ ϕ ≡ 0. We deduce by the Chain rule that 0_{m×n} = J_{f∘ϕ}(x) = Jf(ϕ(x))Jϕ(x) for every x ∈ A. But Jf(ϕ(x)) = [∂f/∂x (x, g(x)), ∂f/∂y (x, g(x))]_{m×(n+m)} and Jϕ(x) = [[In], [Jg(x)]]_{(n+m)×n}. Hence we get 0_{m×n} = [∂f/∂x (x, g(x))] + [∂f/∂y (x, g(x))] Jg(x). So, Jg(x) = −[∂f/∂y (x, g(x))]⁻¹ [∂f/∂x (x, g(x))].
Remark: The required function g in [116] can be produced in a simpler way when n = m = 1. Let U ⊂ R2 be open, f : U → R be a C1-function, and (a, b) ∈ U be such that f(a, b) = 0 and ∂f/∂y (a, b) ≠ 0. Replacing f with −f if necessary, assume ∂f/∂y (a, b) > 0. By the C1-property of f, we may choose δ > 0 with ∂f/∂y > 0 in [a − δ, a + δ] × [b − δ, b + δ]. Then f(x, ·) is strictly increasing in [b − δ, b + δ] for each x ∈ [a − δ, a + δ]. In particular, f(a, b − δ) < 0 < f(a, b + δ). Therefore, we may choose ε ∈ (0, δ) such that f(x, b − δ) < 0 < f(x, b + δ) whenever |x − a| < ε. For each x ∈ (a − ε, a + ε), applying the intermediate value property to the strictly increasing function f(x, ·), we may find a unique y ∈ (b − δ, b + δ) with f(x, y) = 0. Define g : (a − ε, a + ε) → R as g(x) = y.
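For the circle example above, the conclusion of [116](ii) can be checked numerically: with f(x, y) = x² + y² − 1 near (a, b) = (0, 1) we have g(x) = √(1 − x²), and the formula predicts g′(x) = −(∂f/∂y)⁻¹(∂f/∂x). A sketch (the sample point 0.3 is chosen for illustration):

```python
import math

def g(x):                        # explicit solution of f(x, y) = 0 with y > 0
    return math.sqrt(1 - x * x)

def g_prime_formula(x):
    # -[df/dy]^(-1)[df/dx] evaluated at (x, g(x)), as in [116](ii)
    fx, fy = 2 * x, 2 * g(x)
    return -fx / fy

def g_prime_numeric(x, h=1e-6):
    # central-difference derivative of g for comparison
    return (g(x + h) - g(x - h)) / (2 * h)

x0 = 0.3
print(g_prime_formula(x0), g_prime_numeric(x0))  # both near -0.3145
```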
To find local extrema of a function f restricted to a subset of its original domain (i.e., under some constraints), a technique called Lagrange's Multiplier Method (LMM) is often useful. This method does not pinpoint the local extrema, but helps us narrow down our search to a small set of candidates. First we will explain theoretically why this method works. In this context, we will briefly mention what a tangent space to a level set is; to learn more about tangent vectors and tangent spaces, see a textbook on Differential Geometry.
(ii) There exist δ > 0 and a C 1 -path α : (−δ, δ) → Rn+m such that α(t) ∈ S for every t ∈ (−δ, δ),
α(0) = p, and α′ (0) = v.
Proof. (i) By [117], Tp S is equal to the kernel of the surjective linear map f′(p ; ·) : Rn+m → Rm.
(ii) These vectors are the rows of Jf (p), which has rank m since f ′ (p ; ·) : Rn+m → Rm is surjective.
(iii) Let v ∈ Tp S. Then f ′ (p ; v) = (⟨∇f1 (p), v⟩, . . . , ⟨∇fm (p), v⟩) = 0 ∈ Rm by [117]. Or argue as
follows. Choose α as in [117](ii). Since f ◦ α is constant, we have that fi ◦ α is constant for each i.
Hence by Chain rule, 0 = (fi ◦ α)′ (0) = ⟨∇fi (α(0)), α′ (0)⟩ = ⟨∇fi (p), v⟩.
(ii) Let f : R2 → R be f (x, y) = xy and S = f −1 (0). Then S is equal to the union of x-axis and
y-axis. Geometrically, the collection of all tangent vectors to S at (0, 0) is equal to S itself, and this
is not a vector subspace of R2 . Here, the problem is that ∇f (0, 0) = (0, 0), and therefore the linear
map f′((0, 0) ; ·) : R2 → R is not surjective; in fact, this linear map is the zero map, and hence
its kernel is the whole of R2 , which is strictly larger than the collection of all geometric tangent
vectors to S at (0, 0).
Now we will state the result justifying Lagrange's Multiplier Method (LMM), and then we will illustrate its use through several examples.
Proof. We claim that ∇g(p) ⊥ Tp S. To prove the claim, consider v ∈ Tp S. Then by [117],
there is a C 1 -path α : (−δ, δ) → Rn+m such that α(t) ∈ S for every t ∈ (−δ, δ), α(0) = p, and
α′ (0) = v. Then g ◦ α : (−δ, δ) → R is a differentiable function having a local extremum at 0. Hence
0 = (g ◦ α)′ (0) = g ′ (α(0); α′ (0)) = g ′ (p; v) = ⟨∇g(p), v⟩. This proves the claim. By the claim and
[118](iv), it follows that ∇g(p) ∈ span{∇f1 (p), . . . , ∇fm (p)}.
The Lagrange’s Multiplier Method (LMM) may be explained roughly as follows. Suppose g is a
real-valued function defined on an open subset U of Rn+m , and we need to find the (local) extrema of
g|S , where S := f −1 (c) is the level set of a function f = (f1 , . . . , fm ) : U → Rm . If the assumptions
of [119] are satisfied, then any local extremum p ∈ S of g|S must satisfy the following: (i) f(p) = c, and (ii) ∇g(p) = ∑_{i=1}^m λi ∇fi(p) for some λ1, . . . , λm ∈ R. Let S0 = {p ∈ U : p satisfies (i) and (ii)}.
Based on the given problem, we may be able to identify the subset S0 of S (often, S0 is a finite
subset). In this manner, LMM helps us to narrow down our search. Next, by examining each
p ∈ S0 , we need to determine using other considerations whether p is a (local) extremum of g|S .
Example: (i) We wish to find the maximum/minimum of x + y subject to the constraint that x² + y² = 1. Define g, f : R2 → R as g(x, y) = x + y and f(x, y) = x² + y², which are C1-functions. Let S = f⁻¹(1), and note that ∇f(p) = 2p ≠ 0 for every p ∈ S. If p = (a, b) ∈ S is a local extremum of g|S, then ∇g(p) = λ∇f(p) for some λ ∈ R by [119]. This gives (1, 1) = 2λ(a, b). Also a² + b² = 1 since (a, b) ∈ S. Therefore, λ = ±1/√2 and (a, b) = ±(1/√2, 1/√2). LMM gives only this much information. Now by direct examination, we deduce that x + y attains its maximum and minimum subject to the constraint x² + y² = 1 respectively at (1/√2, 1/√2) and (−1/√2, −1/√2).
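The conclusion of example (i) can be confirmed by brute force: parametrizing the constraint circle as (cos t, sin t) and maximizing x + y over a fine grid recovers the LMM candidate (1/√2, 1/√2). A Python sketch:

```python
import math

# grid search over t in [0, 2*pi) for the maximum of cos t + sin t
best_k = max(range(100000), key=lambda k: math.cos(k * 2 * math.pi / 100000)
                                          + math.sin(k * 2 * math.pi / 100000))
t = best_k * 2 * math.pi / 100000
x, y = math.cos(t), math.sin(t)
print(x + y, math.sqrt(2))   # maximum value is sqrt(2)
print(x, y)                  # both near 1/sqrt(2)
```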
(ii) We wish to find the unique point on the line y = x + 5 nearest to (2, 1). Define
g, f : R2 → R as g(x, y) = (x − 2)2 + (y − 1)2 , f (x, y) = y − x (which are C 1 -functions), and
put S = f −1 (5). We know geometrically that g|S has a unique minimum. Note that ∇f (x, y) =
(−1, 1) ̸= (0, 0) for every (x, y) ∈ S. Hence if (a, b) ∈ S is where g|S attains its minimum, then
∇g(a, b) = λ∇f (a, b) for some λ ∈ R by [119]. Solving the equations 2((a − 2), (b − 1)) = λ(−1, 1)
and b = a + 5, we get (a, b) = (−1, 4), which must be the required point.
Note that Jf(x, y, z) = [[1, −1, 0], [1, 0, 2z]] has rank 2 for every (x, y, z) ∈ S. If p = (x, y, z) ∈ S is a local
extremum of g|S , then by [119], ∇g(p) = λ1 ∇f1 (p)+ λ2 ∇f2 (p) and f (p) = (1, 1). The first equation
implies λ1 = −1, λ2 = 2, and z = 1/4. Then using f (p) = (1, 1), we get x = 1 − z 2 = 15/16 and
y = x − 1 = −1/16. Thus p = (15/16, −1/16, 1/4), and g(p) = 9/8. Observe that any point q ∈ S
must be of the form q = (1 − z 2 , −z 2 , z). If |z| > 1, then g(q) = 1 + z − 2z 2 < 0 < g(p). Moreover,
{(x, y, z) ∈ S : |z| ≤ 1} is compact, and g must have a maximum on this compact set. Therefore,
g|S must attain its maximum at p, and max g(S) = g(p) = 9/8.
(iv) We wish to find the maximum volume of a 3-dimensional rectangular box A with surface area c > 0. Suppose that A = {∑_{j=1}^3 tj ej : 0 ≤ t1 ≤ x, 0 ≤ t2 ≤ y, 0 ≤ t3 ≤ z}, where x, y, z are positive. Then the volume of A is xyz and the surface area of A is 2(xy + yz + xz). Let U = {(x, y, z) ∈ R3 : x > 0, y > 0, z > 0}. Define g, f : U → R as g(x, y, z) = xyz and f(x, y, z) = xy + yz + xz, and put S = f⁻¹(c/2). For (x, y, z) ∈ S, from xy + yz + xz = c/2, we have xy < c/2 and xz < c/2, and therefore g(x, y, z) = xyz < c²/(4x) → 0 as x → ∞. Similarly, g(x, y, z) → 0 when y → ∞, and also when z → ∞, for (x, y, z) ∈ S. Moreover, xyz = 0 if one of x, y, z is 0. These observations imply that g|S must have a maximum, say at p = (x0, y0, z0). Since ∇f(p) = (y0 + z0, x0 + z0, x0 + y0) ≠ 0 ∈ R3 by the definition of U, we deduce by [119] that ∇g(p) = λ∇f(p) for some λ ∈ R. Then y0 = λx0/(x0 − λ) = z0, and similarly, x0 = λy0/(y0 − λ) = z0. This means the volume is maximum when the box is a cube. Using f(p) = c/2, we get x0 = (c/6)^{1/2} and hence the maximum volume is (c/6)^{3/2}.
(v) Let n ≥ 2 and A ∈ Rn×n be a symmetric matrix. We will show A has a real eigenvalue using
LMM. Define g, f : Rn → R as g(x) = ⟨Ax, x⟩, f (x) = ∥x∥2 = ⟨x, x⟩ (which are C 1 -functions),
and put S = f −1 (1), which is the unit sphere in Rn . By Exercise-10(iii) and the symmetry of
A, we have ⟨∇g(x), v⟩ = g ′ (x; v) = ⟨Ax, v⟩ + ⟨x, Av⟩ = 2⟨Ax, v⟩, and hence ∇g(x) = 2Ax. Also,
∇f (x) = 2x ̸= 0 for x ∈ S. Since S is compact, there is x ∈ S where g|S attains its maximum. By
[119], we get ∇g(x) = λ∇f (x) for some λ ∈ R, which implies Ax = λx. Then λ is an eigenvalue of
A because x is a unit vector (∵ x ∈ S). We remark that this type of argument can be continued to
show A is diagonalizable, and all its eigenvalues are real (for the next step, take the unit sphere in
Y := {y ∈ Rn : ⟨x, y⟩ = 0} in the place of S).
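Example (v) can be imitated numerically for a concrete 2 × 2 symmetric matrix: maximizing ⟨Ax, x⟩ over the unit circle by a grid search recovers the largest eigenvalue and a corresponding unit eigenvector. The matrix below (eigenvalues 1 and 3) is chosen here for illustration.

```python
import math

A = [[2.0, 1.0], [1.0, 2.0]]   # symmetric

def quad(t):
    # <A(x, y), (x, y)> at the unit vector (cos t, sin t)
    x, y = math.cos(t), math.sin(t)
    return 2 * x * x + 2 * x * y + 2 * y * y

best_k = max(range(100000), key=lambda k: quad(k * math.pi / 100000))
t = best_k * math.pi / 100000
lam = quad(t)                                   # maximum of the quadratic form
x, y = math.cos(t), math.sin(t)
residual = (A[0][0] * x + A[0][1] * y - lam * x,
            A[1][0] * x + A[1][1] * y - lam * y)  # A(x, y) - lam*(x, y)
print(lam, residual)                            # lam near 3, residual near (0, 0)
```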
Exercise-17: (i) Maximize ∏_{j=1}^n xj² subject to ∑_{j=1}^n xj² = 1.
(ii) [Geometric mean ≤ Arithmetic mean] If a1, . . . , an ∈ (0, ∞), then (∏_{j=1}^n aj)^{1/n} ≤ (∑_{j=1}^n aj)/n.
[Hint: (i) Let g(x1, . . . , xn) = ∏_{j=1}^n xj², f(x1, . . . , xn) = ∑_{j=1}^n xj², and S = f⁻¹(1). Then g has a positive maximum on the compact set S, say at p = (x1, . . . , xn), where xj ≠ 0 for every j. Since ∇f(p) = 2p ≠ 0, LMM gives ∇g(p) = λ∇f(p) for some λ ∈ R. This implies g(p) = λxj² for every j, and hence ng(p) = ∑_{j=1}^n λxj² = λ. Therefore, g(p) = ng(p)xj², i.e., xj² = 1/n for every j. Thus g(p) = 1/nⁿ. (ii) Let bj = √aj and b = (b1, . . . , bn). Then 1/nⁿ = g(p) ≥ g(b/∥b∥) = (∏_{j=1}^n aj)/(∑_{j=1}^n aj)ⁿ.]
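Exercise-17(ii) is easy to sanity-check numerically on random positive inputs; a small sketch:

```python
import random

# geometric mean <= arithmetic mean on random positive samples
random.seed(0)
for _ in range(1000):
    a = [random.uniform(0.01, 10.0) for _ in range(5)]
    gm = 1.0
    for x in a:
        gm *= x
    gm **= 1.0 / len(a)            # geometric mean
    am = sum(a) / len(a)           # arithmetic mean
    assert gm <= am + 1e-12
print("AM-GM holds on 1000 random samples")
```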
(ii) The lower Riemann integral L(f) and upper Riemann integral U(f) of f over A are defined as L(f) = sup{L(f, P) : P is a partition of A} and U(f) = inf{U(f, P) : P is a partition of A}. By the last sentence of (i) above, we have that L(f) ≤ U(f) always. If L(f) = U(f) =: y ∈ R, then we say f is Riemann integrable over A and we write ∫_A f(x) dx = y (or simply ∫_A f dx = y).
Notation: Let R(A) denote the collection of all Riemann integrable functions f : A → R.
The results about Riemann integration in the multivariable case are analogous to the results in
the one-variable case. Some of the results are stated below and the proofs are left to the student³.
³See my notes Real Analysis for the proofs in the one-variable case.
with ∥P∥ < δ, and let D1, . . . , Dk be the sub n-boxes of A determined by P. Then diam(f(Di)) ≤ ε/(µ(A) + 1) for every i by the choice of δ. Now (i) may be applied.]
Exercise-19: Let A ⊂ Rn be an n-box and f ∈ R(A). Then for every ε > 0, there is a δ > 0 such that the following are true for every partition P of A with ∥P∥ < δ:
(i) 0 ≤ U(f, P) − L(f, P) ≤ ε and L(f, P) ≤ ∫_A f dx ≤ U(f, P).
(ii) |∫_A f dx − S(f, P, T)| ≤ ε for any Riemann sum S(f, P, T) of f with respect to P.
Consequently, for any f ∈ R(A) and any sequence (Pn) of partitions of A with (∥Pn∥) → 0, the following are true: ∫_A f dx = lim_{n→∞} L(f, Pn) = lim_{n→∞} U(f, Pn) = lim_{n→∞} S(f, Pn, Tn) for any choice of tags Tn of Pn.
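The convergence statement in Exercise-19 can be seen numerically. The sketch below takes f(x, y) = x²y on the 2-box [0, 1]² (whose integral is 1/6) and computes Riemann sums on uniform partitions with midpoint tags; the errors shrink as the mesh 1/n shrinks.

```python
def midpoint_sum(n):
    # Riemann sum of f(x, y) = x^2 * y over [0, 1]^2 with n x n uniform sub-boxes
    h = 1.0 / n
    s = 0.0
    for i in range(n):
        for j in range(n):
            x, y = (i + 0.5) * h, (j + 0.5) * h   # tag = center of the sub-box
            s += x * x * y * h * h                 # f(tag) * volume of the sub-box
    return s

for n in (4, 16, 64):
    print(n, abs(midpoint_sum(n) - 1.0 / 6))   # error decreases with the mesh
```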
For further discussion of Riemann integration, we need the notion of a null set (a set of Lebesgue
measure zero) in Rn - defined below - and some of the results pertaining to null sets.
Definition: We say X ⊂ Rn is a null set (or a set of Lebesgue measure zero) in Rn if for every ε > 0, there is a sequence (Ak) of n-boxes with X ⊂ ∪_{k=1}^∞ Ak and ∑_{k=1}^∞ µ(Ak) ≤ ε. For example, if A ⊂ Rn is an n-box, then ∂A is a null set (because each face of A is an n-box of zero volume).
Exercise-23: Let X ⊂ Rn. (i) If X is a null set, then every subset of X is a null set.
(ii) X is a null set ⇔ for every ε > 0, there is a sequence (Ak) of n-boxes in Rn such that X ⊂ ∪_{k=1}^∞ int(Ak) and ∑_{k=1}^∞ µ(Ak) ≤ ε ⇔ for every ε > 0, there is a sequence (Ak) of n-cubes in Rn such that X ⊂ ∪_{k=1}^∞ Ak and ∑_{k=1}^∞ µ(Ak) ≤ ε.
(iii) If X is a compact null set, then for every ε > 0, there are finitely many n-boxes A1, . . . , Ap in Rn with X ⊂ ∪_{k=1}^p int(Ak) and ∑_{k=1}^p µ(Ak) ≤ ε.
(iv) If X is equal to a countable union of null sets in Rn, then X is a null set.
(v) If X is countable, then X is a null set by (iv) because every singleton is a null set.
Exercise-24: (i) Let X ⊂ Rn be compact and f : X → Rm be continuous. Then its graph G(f) is a null set in Rn+m.
(ii) Let f : Rn → Rm be continuous. Then its graph G(f) is a null set in Rn+m.
[Hint: (i) Let A be an n-box with X ⊂ A and µn(A) > 0. Consider ε > 0. Choose ε0 > 0 with (2ε0)^m < ε/µn(A), and let δ > 0 be such that ∥f(x) − f(y)∥ < ε0 for every x, y ∈ X with ∥x − y∥ < δ. Let P be a partition of A with ∥P∥ < δ. Then A gets divided into finitely many sub n-boxes of diameter < δ. Let D1, . . . , Dk be a listing of those sub n-boxes intersecting X. Since diam(f(Di ∩ X)) ≤ ε0, there is an m-cube Ei ⊂ Rm of side length 2ε0 with f(Di ∩ X) ⊂ Ei. Note that µm(Ei) = (2ε0)^m < ε/µn(A). Now G(f) ⊂ ∪_{i=1}^k (Di × Ei), and ∑_{i=1}^k µn+m(Di × Ei) = ∑_{i=1}^k µn(Di)µm(Ei) ≤ (ε/µn(A)) ∑_{i=1}^k µn(Di) = (ε/µn(A)) × µn(A) = ε. (ii) Write Rn as a countable union of compact sets, and use (i) and Exercise-23(iv).]
[120] Let U ⊂ Rn be a nonempty open set. Then, (i) There is a sequence (Kj) of compact sets in Rn such that U = ∪_{j=1}^∞ Kj and Kj ⊂ int(Kj+1) for every j ∈ N.
(ii) In addition, we may choose the Kj's in such a way that each Kj is a finite union of n-cubes with pairwise disjoint interiors.
Proof. (i) If U = Rn, let Kj = cl(B(0, j)). Else, let Kj = cl(B(0, j)) ∩ {x ∈ Rn : dist(x, Rn \ U) ≥ 1/j}.
(ii) Choose Kj ’s as in (i). Now fix j ∈ N. Choose δ > 0 such that the δ-neighborhood Nδ (Kj ) :=
{x ∈ Rn : dist(x, Kj ) < δ} of Kj is included in int(Kj+1 ). Let A be an n-cube containing Kj , and
P be a partition of A with ∥P ∥ < δ. The partition P divides A into sub n-boxes. Let Y1 , . . . , Yk
∪
be a listing of those sub n-boxes of A intersecting Kj , and put Ej = ki=1 Yi . Then Ej is compact,
∪
and Kj ⊂ Ej ⊂ Nδ (Kj ) ⊂ int(Kj+1 ). Carry out this construction for each j. Then U = ∞ j=1 Ej
and Ej ⊂ int(Ej+1 ). Thus the new collection (Ej ) of compact sets satisfies the requirement.
Proof. (i) We have λ < ∞ since f is a C1-function and A is compact. Consider a, b ∈ A. Then the line segment [a, b] ⊂ A because A is convex. By the Mean value inequality [109](iii), we see that ∥f(b) − f(a)∥ ≤ λ∥b − a∥. Next, consider an n-cube D ⊂ A with side-length δ. Then diam(D) = δ√n, and hence diam(f(D)) ≤ δλ√n. Consequently, f(D) can be put inside some n-cube of side-length 2δλ√n.
(ii) First suppose there is an n-cube A with X ⊂ A ⊂ U. Since U is open, by enlarging A a little bit we may suppose that there is δ > 0 such that the δ-neighborhood Nδ(X) of X is included in A. By part (i), f|A is λ-Lipschitz for some λ > 0. Consider ε > 0, and choose ε0 > 0 with (2λ√n)ⁿ ε0 ≤ ε. By Exercise-23(ii), there is a sequence (Ck) of n-cubes with X ⊂ ∪_{k=1}^∞ Ck and ∑_{k=1}^∞ µ(Ck) ≤ ε0. By partitioning the Ck's into smaller cubes if necessary, we may suppose Ck ⊂ Nδ(X) ⊂ A for every k ∈ N. Let δk be the side-length of Ck. By (i), there is an n-cube Ek ⊂ Rn of side-length 2δkλ√n with f(Ck) ⊂ Ek. Hence f(X) ⊂ ∪_{k=1}^∞ Ek and ∑_{k=1}^∞ µ(Ek) = ∑_{k=1}^∞ (2δkλ√n)ⁿ = (2λ√n)ⁿ ∑_{k=1}^∞ µ(Ck) ≤ (2λ√n)ⁿ ε0 ≤ ε. Thus f(X) is a null set.
In the general case, using [120], first write U as a countable union of n-cubes, U = ∪_{i=1}^∞ Ai. By what is proved above, f(X ∩ Ai) is a null set for each i ∈ N. Since a countable union of null sets is again a null set, it follows that f(X) = ∪_{i=1}^∞ f(X ∩ Ai) is also a null set.
Proof. (i) ⇒: Note that X = {x ∈ A : ω(f, x) > 0} = ∪_{q=1}^∞ Xq, where Xq := {x ∈ A : ω(f, x) ≥ 1/q}. Since a countable union of null sets is a null set, it suffices to show each Xq is a null set in Rn. So fix q ∈ N and consider ε > 0. Since f ∈ R(A), there is a partition P of A with U(f, P) − L(f, P) ≤ ε/(2q). Let D1, . . . , Dk be the sub n-boxes of A determined by P, and Γ = {1 ≤ i ≤ k : Xq ∩ int(Di) ≠ ∅}. Write Xq = Xq′ ∪ Xq′′, where Xq′ = ∪_{i∈Γ} (Xq ∩ int(Di)) and Xq′′ = Xq ∩ (∪_{i=1}^k ∂Di). Choose finitely many n-boxes E1, . . . , Em such that ∪_{i=1}^k ∂Di ⊂ ∪_{j=1}^m Ej and ∑_{j=1}^m µ(Ej) ≤ ε/2 (in fact, we may choose the Ej's to be the (n − 1)-dimensional faces of the Di's, and then µ(Ej) = 0 for each j). If i ∈ Γ, then Xq ∩ int(Di) ≠ ∅, and therefore diam(f(Di)) ≥ 1/q. Hence ε/(2q) ≥ U(f, P) − L(f, P) ≥ ∑_{i∈Γ} diam(f(Di))µ(Di) ≥ (1/q) ∑_{i∈Γ} µ(Di), which implies ∑_{i∈Γ} µ(Di) ≤ ε/2. Thus Xq = Xq′ ∪ Xq′′ ⊂ (∪_{i∈Γ} Di) ∪ (∪_{j=1}^m Ej) and ∑_{i∈Γ} µ(Di) + ∑_{j=1}^m µ(Ej) ≤ ε/2 + ε/2 = ε.
(i) ⇐: Let ε > 0 be given. We need to find a partition P of A with U(f, P) − L(f, P) ≤ ε. Let M > |f| and ε0 = ε/(2M + µ(A)). Since X is a null set, there is a sequence (Aj) of n-boxes with X ⊂ ∪_{j=1}^∞ int(Aj) and ∑_{j=1}^∞ µ(Aj) ≤ ε0 by Exercise-23(ii). For each a ∈ A \ X, choose (by the continuity of f at a) an n-box Ea such that a ∈ int(Ea) and diam(f(Ea)) < ε0. Then {int(Aj) : j ∈ N} ∪ {int(Ea) : a ∈ A \ X} is an open cover for the compact set A. Hence there exist m ∈ N and a1, . . . , am ∈ A \ X such that {int(Aj) : 1 ≤ j ≤ m} ∪ {int(Eaj) : 1 ≤ j ≤ m} is an open cover for A. Let Aj′ = A ∩ Aj and Ej′ = A ∩ Eaj. Then we have A = (∪_{j=1}^m Aj′) ∪ (∪_{j=1}^m Ej′), ∑_{j=1}^m µ(Aj′) ≤ ε0, and diam(f(Ej′)) < ε0 for 1 ≤ j ≤ m. Write each Aj′ and each Ej′ as products of closed intervals, and use the endpoints of those closed intervals to define a partition P of A. Then the sub n-boxes D1, . . . , Dk of A determined by P satisfy the following: for each i ∈ {1, . . . , k}, there is j ∈ {1, . . . , m} such that either Di ⊂ Aj′ or Di ⊂ Ej′. Let Γ1 = {1 ≤ i ≤ k : Di ⊂ Aj′ for some j} and Γ2 = {1 ≤ i ≤ k : Di ⊂ Ej′ for some j}. Then U(f, P) − L(f, P) ≤ ∑_{i∈Γ1} diam(f(Di))µ(Di) + ∑_{i∈Γ2} diam(f(Di))µ(Di) ≤ 2M ∑_{i∈Γ1} µ(Di) + ε0 ∑_{i∈Γ2} µ(Di) ≤ 2M ∑_{j=1}^m µ(Aj′) + ε0 ∑_{i=1}^k µ(Di) ≤ 2M ε0 + ε0 µ(A) = ε.
(ii) This is a corollary of (i) because any countable set in Rn is a null set.
with [122](i) gives another proof of the fact that f + g, f g ∈ R(A) whenever f, g ∈ R(A). Similarly,
we can give another reasoning for Exercise-22(iii) using [122](ii) because Xg◦f ⊂ Xf when g is
continuous.
(ii) We wish to point out that the continuity of g is necessary in Exercise-22(iii). Let f : [0, 1] → [0, 1]
be f (0) = 1, f (x) = 0 if x is irrational, and f (p/q) = 1/q if p, q ∈ N are coprime with p ≤ q. Then
{x ∈ [0, 1] : f is not continuous at x} = [0, 1] ∩ Q, which is a countable set, and hence f is Riemann
integrable by [122]. Let g : [0, 1] → R be g(0) = 0 and g(x) = 1 if x > 0. Then g is also Riemann
integrable by [122]. But g ◦ f : [0, 1] → R is the indicator function 1[0,1]∩Q of [0, 1] ∩ Q, which is
discontinuous at every point of [0, 1]. Since [0, 1] is not a null set, we see by [122](i) (or by directly
calculating U (g ◦ f, P ) and L(g ◦ f, P )) that g ◦ f is not Riemann integrable.
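As an aside (an added numerical sketch, not part of the original notes), the Riemann integrability of the Thomae-type f above can be observed directly: for the uniform partition of [0, 1] into n subintervals, every lower sum is 0 (the irrationals are dense), while the upper sums computed below shrink towards 0 as n grows. The sketch assumes f(0) = f(1) = 1 and f(p/q) = 1/q as in the text.

```python
import math

def sup_on(a, b):
    """Supremum of f on [a, b]: it equals 1/q for the smallest denominator q
    such that some rational p/q lies in [a, b] (q = 1 also covers 0 and 1)."""
    q = 1
    while True:
        if math.floor(b * q) >= math.ceil(a * q):  # some integer p has a <= p/q <= b
            return 1.0 / q
        q += 1

def upper_sum(n):
    # U(f, P) for the uniform partition of [0, 1] into n subintervals.
    return sum(sup_on(i / n, (i + 1) / n) for i in range(n)) / n
```

The lower sums need no computation: each subinterval contains irrationals, so L(f, P) = 0, matching ∫ f = 0.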
Remark: Those who know Lebesgue measure theory can give an easier proof for the converse part
of Exercise-25(i) by writing Y = ∪_{k=1}^∞ Yk, where Yk := {y ∈ A : f(y) ≥ 1/k}, and noting that ∫_A f dx = ∫_A f dµ ≥ ∫_{Yk} f dµ ≥ µ(Yk)/k.
Proof. (i) As in Exercise-26, let fL,1 (x) = L(f (x, ·)), fU,1 (x) = U (f (x, ·)), fL,2 (y) = L(f (·, y)), and
fU,2 (y) = U (f (·, y)). Let P = P1 × P2 be a partition of A, where P1 , P2 are partitions of A1 , A2
respectively. Using Exercise-26(i) and the inequality fL,1 ≤ fU,1 , we get
(*) L(f, P ) ≤ L(fL,1 , P1 ) ≤ U (fL,1 , P1 ) ≤ U (fU,1 , P1 ) ≤ U (f, P ), and
(ii) By hypothesis L(f (x, ·)) = U (f (x, ·)) for each x ∈ A1 , and L(f (·, y)) = U (f (·, y)) for each
y ∈ A2 . So this is a corollary of (i).
Exercise-27: Let A = ∏_{j=1}^n [aj, bj] be an n-box, f : A → R be continuous, and σ be any permutation of {1, . . . , n}. Then ∫_A f = ∫_{a_{σ(1)}}^{b_{σ(1)}} (· · · (∫_{a_{σ(n)}}^{b_{σ(n)}} f dx_{σ(n)}) · · · )dx_{σ(1)} (which means we can integrate f in any order over the intervals [aj, bj]). [Hint: Repeated application of [123](iii).]
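Exercise-27 can be illustrated numerically (an added sanity-check sketch with a hypothetical continuous integrand and box): midpoint-rule approximations of the two iterated integrals over [0, 1] × [0, 2] agree.

```python
def iterated(f, a, b, c, d, n=400):
    # midpoint rule for the iterated integral ∫_a^b ( ∫_c^d f(x, y) dy ) dx
    hx, hy = (b - a) / n, (d - c) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * hx
        inner = sum(f(x, c + (j + 0.5) * hy) for j in range(n)) * hy
        total += inner * hx
    return total

f = lambda x, y: x * x * y                                   # continuous on [0,1] x [0,2]
I_xy = iterated(f, 0.0, 1.0, 0.0, 2.0)                       # dy inside, dx outside
I_yx = iterated(lambda y, x: f(x, y), 0.0, 2.0, 0.0, 1.0)    # order of integration swapped
```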
(iii) Let A = [0, 1]2 and f : A → R be f (x, y) = 1 if x = 0 and y ∈ Q, and f (x, y) = 0 otherwise.
Then f is continuous in (0, 1] × [0, 1] (where f ≡ 0), whose complement {0} × [0, 1] is a null set in
R2. Hence f ∈ R(A) by [122] and ∫_A f = 0. If y ∈ [0, 1] is fixed, then f(·, y) is continuous in (0, 1], whose complement {0} is a null set in R. Hence ∫_0^1 f(x, y)dx exists and is equal to 0. Therefore ∫_0^1 (∫_0^1 f(x, y)dx)dy exists and is equal to 0. But if we fix x = 0, then f(0, ·) fails to be continuous at every point of [0, 1]. Hence the integrals ∫_0^1 f(0, y)dy and ∫_0^1 (∫_0^1 f(x, y)dy)dx do not exist.
(iv) Let A = [0, 1]2 , and S ⊂ A be a dense subset of A with the property that S contains at
most one point from each horizontal line and at most one point from each vertical line. Such a
set S can be constructed as follows. Let D1 , D2 , . . . ⊂ A be a listing of all sub-rectangles of A
having rational coordinates for the vertices and having nonempty interiors. Let (x1 , y1 ) ∈ D1 .
Having chosen (xj , yj ) ∈ Dj for 1 ≤ j ≤ n, we choose (xn+1 , yn+1 ) ∈ Dn+1 in such a way that
xn+1 ̸= xj for 1 ≤ j ≤ n and yn+1 ̸= yj for 1 ≤ j ≤ n. Then the set S := {(xn , yn ) : n ∈ N} has
the required properties. Now define f : A → R as the indicator function 1S of S. Then f fails
to be continuous at every point of A, and hence ∫_A f does not exist. If x ∈ [0, 1] is fixed, then f(x, y) = 0 for every y ∈ [0, 1] with one possible exception. Hence ∫_0^1 f(x, y)dy = 0, and therefore ∫_0^1 (∫_0^1 f(x, y)dy)dx = 0. Similarly, the iterated integral ∫_0^1 (∫_0^1 f(x, y)dx)dy exists and is equal to 0.
Remark: (i) Let U ⊂ R2 be open and f : U → R be a C 2 -function. We can give another proof
for the fact ∂²f/∂x∂y = ∂²f/∂y∂x using Fubini's theorem. Let F = ∂²f/∂x∂y and G = ∂²f/∂y∂x, which are continuous since f is a C2-function. Consider A = [a, b] × [c, d] ⊂ U. Then ∫_A F = ∫_a^b (∫_c^d F dy)dx = ∫_c^d (∫_a^b F dx)dy and ∫_A G = ∫_a^b (∫_c^d G dy)dx = ∫_c^d (∫_a^b G dx)dy by Fubini's theorem. Now, using the Fundamental theorem of calculus, we note that ∫_c^d (∫_a^b F dx)dy = ∫_c^d ((∂f/∂y)(b, y) − (∂f/∂y)(a, y))dy = f(b, d) − f(b, c) − f(a, d) + f(a, c), and ∫_a^b (∫_c^d G dy)dx = ∫_a^b ((∂f/∂x)(x, d) − (∂f/∂x)(x, c))dx = f(b, d) − f(a, d) − f(b, c) + f(a, c). Hence ∫_A F = ∫_A G for every 2-box (rectangle) A ⊂ U. If F(x0, y0) ≠ G(x0, y0) for some (x0, y0) ∈ U, say F > G at (x0, y0), then we can find a rectangle A ⊂ U with (x0, y0) ∈ int(A) and ε > 0 such that F(x, y) > G(x, y) + ε for every (x, y) ∈ A; then ∫_A F ≥ ∫_A G + εµ(A) > ∫_A G,
a contradiction to what we have already proved.
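The equality of mixed partials that this remark reproves can also be observed numerically with nested central differences (an added sketch; the C²-function below is a hypothetical example).

```python
def d_dx(g, x, y, h=1e-4):
    # central difference in the first variable
    return (g(x + h, y) - g(x - h, y)) / (2 * h)

def d_dy(g, x, y, h=1e-4):
    # central difference in the second variable
    return (g(x, y + h) - g(x, y - h)) / (2 * h)

f = lambda x, y: x**3 * y**2   # a smooth function; its mixed partial is 6 x^2 y

fxy = d_dx(lambda u, v: d_dy(f, u, v), 1.2, 0.7)   # approximates ∂²f/∂x∂y at (1.2, 0.7)
fyx = d_dy(lambda u, v: d_dx(f, u, v), 1.2, 0.7)   # approximates ∂²f/∂y∂x at (1.2, 0.7)
```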
(ii) Conversely⁴, Fubini's theorem for a continuous real-valued function f defined on a rectangle
A = [a, b] × [c, d] ⊂ U can be deduced using the equality of mixed partial derivatives of suitable
functions defined in terms of certain integral expressions of f .
[124] [Interchanging differentiation and integration] (i) Let A ⊂ Rn be an n-box and f : A × [c, d] → R be a function such that f(·, t) ∈ R(A) for each t ∈ [c, d], and ∂f/∂t : A × [c, d] → R is continuous. Then t ↦ ∫_A f(x, t)dx from [c, d] to R is differentiable and (d/dt)(∫_A f(x, t)dx) = ∫_A (∂f/∂t)(x, t)dx.
(ii) Let A ⊂ Rn be an n-box and f : [a, b] × A → R be a function such that f(s, ·) ∈ R(A) for each s ∈ [a, b], and ∂f/∂s : [a, b] × A → R is continuous. Then s ↦ ∫_A f(s, x)dx from [a, b] to R is differentiable and (d/ds)(∫_A f(s, x)dx) = ∫_A (∂f/∂s)(s, x)dx.
(iii) Let U ⊂ Rn be open, and f : U × [c, d] → R be a C1-function such that f(x, ·) is Riemann integrable over [c, d] for each x ∈ U. Then F : U → R defined as F(x) = ∫_c^d f(x, t)dt has the property that all partial derivatives of F exist and (∂F/∂xj)(x) = ∫_c^d (∂f/∂xj)(x, t)dt for every j ∈ {1, . . . , n} and every x ∈ U.
(iv) Let U ⊂ Rn be open, and f : [a, b] × U → R be a C1-function such that f(·, x) is Riemann integrable over [a, b] for each x ∈ U. Then F : U → R defined as F(x) = ∫_a^b f(s, x)ds has the property that all partial derivatives of F exist and (∂F/∂xj)(x) = ∫_a^b (∂f/∂xj)(s, x)ds for every j ∈ {1, . . . , n} and every x ∈ U.
⁴See A. Aksoy, M. Martelli, Mixed partial derivatives and Fubini's theorem, College Math. J., 33, (2002).
Proof. (i) Fix w ∈ [c, d], and let Q(x, t) = (f(x, t) − f(x, w))/(t − w) for t ≠ w. We need to show that lim_{t→w} ∫_A Q(x, t)dx = ∫_A (∂f/∂t)(x, w)dx. Consider ε > 0. Since ∂f/∂t is uniformly continuous on the compact set A × [c, d], there is δ > 0 such that |(∂f/∂t)(x, t) − (∂f/∂t)(x, w)| < ε/(µ(A) + 1) for every x ∈ A and every t ∈ [c, d] with |t − w| < δ. Now consider t ∈ [c, d] with 0 < |t − w| < δ. For each x ∈ A, applying the Mean value theorem to f(x, ·), we may find tx between t and w with Q(x, t) = (∂f/∂t)(x, tx). Hence |∫_A Q(x, t)dx − ∫_A (∂f/∂t)(x, w)dx| ≤ ∫_A |(∂f/∂t)(x, tx) − (∂f/∂t)(x, w)|dx ≤ ∫_A ε/(µ(A) + 1) dx < ε.
(ii) The proof is similar to that of (i).
(iii) Fix j ∈ {1, . . . , n} and x ∈ U. Choose δ > 0 such that x + sej ∈ U for every s ∈ [−δ, δ], and define g : [−δ, δ] × [c, d] → R as g(s, t) = f(x + sej, t). Note that (∂g/∂s)(s, t) = (∂f/∂xj)(x + sej, t). Hence ∂g/∂s : [−δ, δ] × [c, d] → R is continuous by the C1-property of f, and moreover (∂g/∂s)(0, t) = (∂f/∂xj)(x, t). Applying (ii) to g, we see that (d/ds)(∫_c^d g(s, t)dt) = ∫_c^d (∂g/∂s)(s, t)dt. At s = 0, the right hand side is equal to ∫_c^d (∂f/∂xj)(x, t)dt, and the left hand side is equal to lim_{s→0} ∫_c^d (g(s, t) − g(0, t))/s dt = lim_{s→0} ∫_c^d (f(x + sej, t) − f(x, t))/s dt = lim_{s→0} (F(x + sej) − F(x))/s = (∂F/∂xj)(x).
(iv) The proof is similar to that of (iii).
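Statement (ii) of [124] can be sanity-checked numerically (an added sketch; the integrand f(s, x) = sin(sx) on [a, b] × [0, 1] is a hypothetical choice): a difference quotient of s ↦ ∫ f(s, x)dx agrees with the integral of ∂f/∂s.

```python
import math

def F(s, n=2000):
    # midpoint rule for F(s) = ∫_0^1 sin(s x) dx
    return sum(math.sin(s * (j + 0.5) / n) for j in range(n)) / n

def dF_inside(s, n=2000):
    # midpoint rule for ∫_0^1 (∂/∂s) sin(s x) dx = ∫_0^1 x cos(s x) dx
    return sum(((j + 0.5) / n) * math.cos(s * (j + 0.5) / n) for j in range(n)) / n

s0, h = 1.3, 1e-5
dF_outside = (F(s0 + h) - F(s0 - h)) / (2 * h)   # differentiate after integrating
```

Here F(s) = (1 − cos s)/s in closed form, so the answer can also be checked against F′(s) = sin(s)/s + (cos(s) − 1)/s².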
As we go through the finer details of Riemann integration theory (such as the theory of Jordan
measurable sets), we will also see some of its disadvantages. To a certain extent, these disadvantages
are rectified in Lebesgue integration theory by the use of Lebesgue measurable sets (which we will
not discuss here).
we may modify f on ∂(A ∩ E) and also suppose that f (x) = 0 for every x ∈ ∂(A ∩ E). Choose
a sequence (Pn) of partitions of A such that lim_{n→∞} L(f|A, Pn) = lim_{n→∞} U(f|A, Pn) = ∫_A f.
By refining Pn ’s, assume that there is a sequence (Qn ) of partitions of A ∩ E such that Pn is an
extension of Qn . Now choose a sequence (Pen ) of partitions of E such that Pen is an extension
of Qn . Since f ≡ 0 outside int(A ∩ E), we get L(f |E , Pen ) = L(f |A∩E , Qn ) = L(f |A , Pn ) and
∫
U (f |E , Pen ) = U (f |A∩E , Qn ) = U (f |A , Pn ) for every n ∈ N. It follows that L(f |E ) = f = U (f |E ).]
A
Proof. We get (i) ⇔ (iii), and (ii) ⇔ (iv) because LJ(X) = L(1X ) and U J(X) = U (1X ). Moreover,
we have (i) ⇔ (ii) because of Exercise-28.
(ii) ⇒ (v) ⇒ (i): Let A be an n-box with X ⊂ int(A). Then {x ∈ A : 1X is not continuous at x} =
∂X. Now use Lebesgue’s criterion [122].
Definition: Let X ⊂ Rn be a bounded set. If the constant function 1 is Riemann integrable over X,
i.e., if the indicator function 1X ∈ R(A) for some n-box A containing X, then we say⁵ the set X is Jordan measurable, and we define the Jordan measure µ(X) of X as µ(X) = ∫_X 1 dx = ∫_A 1X dx.
[126] (i) The definition of Jordan measurability (and Jordan measure) of a bounded set X ⊂ Rn
is independent of the particular choice of an n-box A containing X because of Exercise-28.
(ii) By [125], a bounded set X ⊂ Rn is Jordan measurable ⇔ ∂X is a null set in Rn ⇔ LJ(X) =
U J(X) with respect to some/every n-box A ⊂ Rn containing X.
(iii) If X itself is an n-box, then (by taking A = X, we may see that) X is Jordan measurable and
its Jordan measure coincides with its n-dimensional volume.
(iv) If X ⊂ Rn is Jordan measurable, then ∫_X c = c∫_A 1X = cµ(X) for every c ∈ R; in particular,
by taking c = 1, we observe using Exercise-25(i) that µ(X) = 0 iff X is a null set (i.e., its Lebesgue
measure is zero) in Rn .
(iii) By considering f + and f − separately, assume f ≥ 0. Then fX = max{fX1 , fX2 }, fX1 ∩X2 =
min{fX1 , fX2 }, and fX = fX1 + fX2 − fX1 ∩X2 . Now use Exercise-29(vi) and Exercise-29(i).
(iv) We may suppose k = 2; the general case can be proved by a repeated application of this case.
When k = 2, the result follows from (iii) and Exercise-29(viii).
(iii) By [127](i), X̄0 is also Jordan measurable and µ(X̄0) = µ(X0) = 0. Note that {x ∈ A : fX(x) ≠ gX(x)} ⊂ X̄0 ∪ ∂X, and the set on the right hand side is a closed null set in Rn. Now
apply Exercise-25(iv).
Example: (i) X := [0, 1] ∩ Q is not Jordan measurable because ∂X = [0, 1] is not a null set in R.
This example shows also that a bounded set which is a countable union of Jordan measurable sets
need not be Jordan measurable.
(ii) We will construct a bounded open set X in R which is not Jordan measurable. Let ε ∈ (0, 1/2)
and {xn : n ∈ N} ⊂ (0, 1) be a dense subset of [0, 1]. For each n ∈ N, choose an open interval
Jn ⊂ (0, 1) containing xn with µ(Jn) < ε/2ⁿ, and put X = ∪_{n=1}^∞ Jn, which is an open set in R with X̄ = [0, 1]. If ∂X is a null set, there is a sequence (J̃n) of open intervals with ∂X ⊂ ∪_{n=1}^∞ J̃n and ∑_{n=1}^∞ µ(J̃n) < ε. Then {Jn : n ∈ N} ∪ {J̃n : n ∈ N} is an open cover for the compact set X ∪ ∂X = [0, 1]. Extract a finite subcover {Jn : 1 ≤ n ≤ p} ∪ {J̃n : 1 ≤ n ≤ p}. Then 1 = µ([0, 1]) ≤ ∑_{n=1}^p µ(Jn) + ∑_{n=1}^p µ(J̃n) ≤ ε + ε = 2ε < 1, a contradiction (here we used: if J1, . . . , Jp are intervals covering an interval J, then µ(J) ≤ ∑_{n=1}^p µ(Jn)).
(iii) If X is as in (ii) above, and K = [0, 1] \ X, then K is a compact set which is not Jordan
measurable because ∂K = ∂X.
(iv) There are path connected Jordan measurable sets which are not Borel sets. Let n ≥ 2, and
A ⊂ Rn be an n-box with int(A) ̸= ∅. Since ∂A is an uncountable compact set, there is a non-Borel
set Y ⊂ ∂A (∵ the cardinality of the collection of Borel subsets of Rn is equal to that of R whereas
the cardinality of the power set P(∂A) is equal to that of P(R)). Let X = int(A) ∪ Y , which is not
a Borel set. But X is Jordan measurable because ∂X ⊂ ∂A, and clearly X is path connected.
Remark: (i) Every Jordan measurable set X ⊂ Rn is a (bounded) Lebesgue measurable set because
X = int(X) ∪ (X ∩ ∂X), where int(X) is an open set and hence Lebesgue measurable, and X ∩ ∂X
is also Lebesgue measurable because it is a subset of the null set ∂X.
(ii) Let X ⊂ Rn be a bounded set. The existence of some f in R(X) does not imply the Jordan
measurability of X: trivially, 0 ∈ R(X) for every bounded set X.
Remark: Exercise-31 is useful in evaluating certain integrals. For example, let X = {(x, y) ∈ R2 :
0 ≤ x ≤ 1 and 0 ≤ y ≤ x²} and f : X → R be f(x, y) = xy². Then ∫_X f = ∫_0^1 (∫_0^{x²} xy² dy)dx = ∫_0^1 (x⁷/3)dx = 1/24. In certain cases, instead of bounding y with functions of x, we may bound x with
functions of y and interchange the order of integration. For example, let X = {(x, y) ∈ R2 : 0 ≤
x ≤ 1 and x ≤ y ≤ 1} and f : X → R be f(x, y) = e^{y²}. Then ∫_X f = ∫_0^1 (∫_x^1 e^{y²} dy)dx, but the inner integral is not easy to evaluate. However note that X = {(x, y) ∈ R2 : 0 ≤ y ≤ 1 and 0 ≤ x ≤ y}, and hence ∫_X f = ∫_0^1 (∫_0^y e^{y²} dx)dy = ∫_0^1 y e^{y²} dy = (1/2)∫_0^1 e^t dt = (e − 1)/2 by the substitution t = y².
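The interchange of order in this example can be confirmed numerically (an added check; both orders are approximated with midpoint rules):

```python
import math

def x_first(n=600):
    # ∫_0^1 ( ∫_x^1 e^{y^2} dy ) dx, inner integral hard symbolically but easy numerically
    total, hx = 0.0, 1.0 / n
    for i in range(n):
        x = (i + 0.5) * hx
        hy = (1.0 - x) / n
        inner = sum(math.exp((x + (j + 0.5) * hy) ** 2) for j in range(n)) * hy
        total += inner * hx
    return total

def y_first(n=600):
    # ∫_0^1 ( ∫_0^y e^{y^2} dx ) dy = ∫_0^1 y e^{y^2} dy
    hy = 1.0 / n
    return sum((j + 0.5) * hy * math.exp(((j + 0.5) * hy) ** 2) for j in range(n)) * hy
```

Both values approach (e − 1)/2 ≈ 0.8591.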
Exercise-32: Let X ⊂ Rn be a Jordan measurable compact set, and f : X × [c, d] → R be a function such that f(·, y) ∈ R(X) for each y ∈ [c, d], and ∂f/∂y : X × [c, d] → R is continuous. Then the function y ↦ ∫_X f(x, y)dx from [c, d] to R is differentiable and (d/dy)(∫_X f(x, y)dx) = ∫_X (∂f/∂y)(x, y)dx. [Hint: ∂f/∂y ∈ R(X) by [128](ii). Now imitate the proof of [124].]
We may write the Change of variable formula in one-variable theory in the following form:
Our aim is to generalize Exercise-33 to higher dimensions by replacing the ‘local magnification
factor’ |g ′ (x)| with | det(Jg (x))|. The reason for | det(Jg (x))| to be the ‘local magnification factor’
in higher dimensions stems from the result [129] stated below.
Exercise-34: [Fact from Linear Algebra] Every invertible linear map L : Rn → Rn can be written
as a finite product of elementary linear maps. [Hint: For the corresponding result in terms of
matrices, see Theorem 12 in Section 1.6 of Hoffman and Kunze, Linear Algebra.]
Convention: Let L ∈ L(Rn , Rn ). Then L′ (x; ·) = L, and hence JL (x) is equal to the matrix of L
with respect to the standard basis of Rn for each x ∈ Rn . Identifying L with its matrix, we will
write det(L) to mean the determinant of the matrix of L; with this convention, det(L) = det(JL (x))
for every x ∈ Rn .
Proof. Note that E(X) and L(X) are Jordan measurable because of Exercise-30(i).
(i) Since X is Jordan measurable, we have LJ(X) = U J(X) with respect to any n-cube containing
X by [126](ii). Hence X can be approximated with a finite union of n-cubes with pairwise disjoint
interiors, and therefore, we may suppose X itself is an n-cube. Since E(cx + y) = cE(x) + y for
c ∈ R \ {0} and x, y ∈ Rn , we may also suppose after a scaling and translation that X is the unit
cube in Rn . In particular, µ(X) = 1, and thus we need to just show µ(E(X)) = | det(E)|. Keep in
mind that E(X) = {∑_{k=1}^n ck E(ek) : ck ∈ [0, 1] for every k} since X is the unit n-cube.
If E is of type-1, then there are λ ∈ R \ {0} and j ∈ {1, . . . , n} such that E(ej ) = λej and
E(ek ) = ek for every k ̸= j. Hence | det(E)| = |λ|. As E(X) is an n-box whose jth edge has length
|λ| and other edges have unit length, we conclude µ(E(X)) = |λ| = | det(E)|. If E is of type-2, then
| det(E)| = 1 and E(X) = X so that µ(E(X)) = 1 = µ(X). If E is of type-3, then there are i ̸= j
in {1, . . . , n} such that E(ej ) = ei + ej and E(ek ) = ek for every k ̸= j. Then | det(E)| = 1. In the
xi xj -plane, E maps the unit square to the parallelogram with vertices 0, ei , ei + ej , and 2ei + ej ,
whose area is 1. Consequently, µ(E(X)) = 1 = | det(E)| in this case also.
(ii) Write L = E1 · · · Ep , a finite product of elementary linear maps, and we will use induction
on p. The case p = 1 is covered by (i). Put L0 = E1 · · · Ep−1 so that L = L0 Ep . Since Y :=
Ep (X) is also Jordan measurable, we get by induction assumption on p − 1 that µ(L(X)) =
µ(L0 (Y )) = | det(L0 )|µ(Y ). Now, µ(Y ) = µ(Ep (X)) = | det(Ep )|µ(X) by (i), and det(L0 ) =
det(E1) · · · det(Ep−1). It follows that µ(L(X)) = ∏_{i=1}^p | det(Ei)| µ(X) = | det(L)|µ(X).
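The scaling law µ(L(X)) = | det(L)|µ(X) can be tested numerically for the unit square X = [0, 1]² and a hypothetical invertible L (an added sketch): count midpoints of a fine grid over a bounding box that land in L(X), testing membership by applying L⁻¹.

```python
L = [[2.0, 1.0], [0.5, 3.0]]                       # a hypothetical invertible map, det = 5.5
detL = L[0][0] * L[1][1] - L[0][1] * L[1][0]
inv = [[ L[1][1] / detL, -L[0][1] / detL],
       [-L[1][0] / detL,  L[0][0] / detL]]         # matrix of L^{-1}

def area_of_image(N=800):
    # corners of L([0,1]^2) give a bounding box for the image parallelogram
    corners = [(0.0, 0.0), (L[0][0], L[1][0]), (L[0][1], L[1][1]),
               (L[0][0] + L[0][1], L[1][0] + L[1][1])]
    x0, x1 = min(c[0] for c in corners), max(c[0] for c in corners)
    y0, y1 = min(c[1] for c in corners), max(c[1] for c in corners)
    count = 0
    for i in range(N):
        for j in range(N):
            px = x0 + (i + 0.5) * (x1 - x0) / N
            py = y0 + (j + 0.5) * (y1 - y0) / N
            u = inv[0][0] * px + inv[0][1] * py    # (u, v) = L^{-1}(px, py)
            v = inv[1][0] * px + inv[1][1] * py
            if 0.0 <= u <= 1.0 and 0.0 <= v <= 1.0:
                count += 1
    return count * (x1 - x0) * (y1 - y0) / (N * N)
```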
[130] [Linear change of variable] Let L ∈ L(Rn , Rn ) be invertible, and X ⊂ Rn be Jordan measur-
able. If f : L(X) → R is Riemann integrable, then F : X → R defined as F (x) = f (L(x))| det(L)|
is Riemann integrable over X, and ∫_{L(X)} f = ∫_X F.
One advantage of n-cubes over n-dimensional balls is that subsets of Rn can be covered with
finitely many or countably many n-cubes of the same size with pairwise disjoint interiors (whereas
any covering using balls will have overlapping of the balls in general, which makes it difficult to add
up estimates from different balls). We need to make an estimate about the volume of the image of
an n-box under a C 1 -map. For this purpose, it is convenient to use certain special norms:
Exercise-35: The quantity ∥ · ∥0 defined above is a norm on the vector space L(Rn, Rn) ≅ Rn×n with ∥−T∥0 = ∥T∥0 and ∥I∥0 = 1. Moreover,
(i) ∥T x∥∞ ≤ ∥T ∥0 ∥x∥∞ for every T ∈ L(Rn , Rn ) and x ∈ Rn .
(ii) For every T ∈ L(Rn , Rn ), there is x ∈ Rn with ∥x∥∞ = 1 and ∥T x∥∞ = ∥T ∥0 .
(iii) ∥S ◦ T ∥0 ≤ ∥S∥0 ∥T ∥0 for every S, T ∈ L(Rn , Rn ).
[Hint: (i) |∑_{j=1}^n tij xj| ≤ ∑_{j=1}^n |tij| ∥x∥∞ for 1 ≤ i ≤ n. (ii) Let i ∈ {1, . . . , n} be with ∑_{j=1}^n |tij| =
∥T ∥0 . Define x ∈ Rn as xj = 1 if tij ≥ 0 and xj = −1 if tij < 0. (iii) Choose x ∈ Rn with ∥x∥∞ = 1
and ∥(S ◦ T )x∥∞ = ∥S ◦ T ∥0 . Then ∥S ◦ T ∥0 ≤ ∥S∥0 ∥T x∥∞ ≤ ∥S∥0 ∥T ∥0 ∥x∥∞ = ∥S∥0 ∥T ∥0 .]
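Concretely, ∥T∥0 is the maximum absolute row sum of the matrix of T (the operator norm induced by ∥·∥∞). A small sketch with hypothetical 2×2 matrices checks (i)–(iii) of Exercise-35:

```python
def norm0(T):
    # maximum absolute row sum
    return max(sum(abs(t) for t in row) for row in T)

def apply(T, x):
    return [sum(T[i][j] * x[j] for j in range(len(x))) for i in range(len(T))]

def matmul(S, T):
    n = len(S)
    return [[sum(S[i][k] * T[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

S = [[1.0, -2.0], [3.0, 0.5]]
T = [[0.5, 1.0], [-1.0, 2.0]]

# the maximizing x from the hint: signs of the entries of the row achieving the max
i = max(range(len(T)), key=lambda r: sum(abs(t) for t in T[r]))
x = [1.0 if t >= 0 else -1.0 for t in T[i]]
```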
Since Φ(x, x) = I and since A × A is compact also with respect to the supremum norm, we may
find δ > 0 such that ∥Jg (x)−1 Jg (y)∥0 < 1 + ε for every x, y ∈ A with ∥x − y∥∞ < δ.
Since the side-length r of D is < δ, we get ∥x − y∥∞ ≤ r < δ for every y ∈ D. Hence
∥Jg (x)−1 Jg (y)∥0 < 1 + ε for every y ∈ D by the choice of δ. Observe that Jf (y) = Jg (x)−1 Jg (y) by
the definition of f , and hence λ := sup{∥Jf (y)∥0 : y ∈ D} ≤ 1+ε. Let a ∈ D be the center of D, and
consider b ∈ D. Note that ∥b−a∥∞ ≤ r/2. By Exercise-36, ∥f (b)−f (a)∥∞ ≤ λ∥b−a∥∞ ≤ (1+ε)r/2.
Therefore, f (D) is contained in an n-cube with center f (a) and side-length (1 + ε)r. This implies
µ(f (D)) ≤ (1 + ε)n rn = (1 + ε)n µ(D). Combining this with the estimate of the previous paragraph,
we conclude µ(g(D)) ≤ (1 + ε)n | det(Jg (x))|µ(D).
Proof. (i) As in the proof of [130], we may see that g(A) is Jordan measurable and F ∈ R(A). Let
ε > 0. By [131], there is δ > 0 such that µ(g(D)) ≤ (1 + ε)n | det(Jg (x))|µ(D) for every n-cube
D ⊂ A with side-length < δ. Choose a partition P of A such that the sub n-boxes D1 , . . . , Dk of A
determined by P are n-cubes with side-length < δ, and such that U(F, P) ≤ ε + ∫_A F. Since g is a C1-diffeomorphism, g(Di) is Jordan measurable for every i, and g(Di) ∩ g(Dj) = g(Di ∩ Dj) is a null set for every i ≠ j by [121](ii). Therefore, ∫_{g(A)} f = ∑_{i=1}^k ∫_{g(Di)} f ≤ ∑_{i=1}^k sup(f(g(Di)))µ(g(Di)) ≤ ε + ∑_{i=1}^k f(g(xi))µ(g(Di)) for some choice of points xi ∈ Di.
By the choice of δ, we have µ(g(Di )) ≤ (1 + ε)n | det(Jg (xi ))|µ(Di ) for 1 ≤ i ≤ k. This can be
combined with the previous inequality because f ≥ 0. Thus we get
∫_{g(A)} f ≤ ε + (1 + ε)ⁿ ∑_{i=1}^k f(g(xi))| det(Jg(xi))|µ(Di) = ε + (1 + ε)ⁿ ∑_{i=1}^k F(xi)µ(Di)
≤ ε + (1 + ε)ⁿ U(F, P) ≤ ε + (1 + ε)ⁿ (ε + ∫_A F).
Since ε > 0 is arbitrary, we deduce that ∫_{g(A)} f ≤ ∫_A F.
(ii) By considering f + and f − separately, assume f ≥ 0. Let Y = g(X). The continuous map x 7→
| det(Jg (x))| is bounded on the compact set X, and therefore F is bounded. Let fY , FX : Rn → R
be the extended functions which are zero respectively outside Y and X. Since X ⊂ U , there are
finitely many n-cubes A1, . . . , Ak with pairwise disjoint interiors such that the set K := ∪_{i=1}^k Ai
satisfies X ⊂ K ⊂ U . Since X and Ai are Jordan measurable, the sets Y = g(X) and g(Ai ) are
Jordan measurable. As f ∈ R(Y ), it follows that fY ∈ R(g(Ai )) by [127](v). Applying part (i)
to fY and FX, we get that FX ∈ R(Ai) and ∫_{g(Ai)} fY ≤ ∫_{Ai} FX for 1 ≤ i ≤ k. It follows that FX ∈ R(K) and ∫_{g(K)} fY ≤ ∫_K FX since Ai ∩ Aj and g(Ai) ∩ g(Aj) are null sets for i ≠ j (see [127](iv)). This implies F ∈ R(X) and ∫_Y f ≤ ∫_X F. Now observe that if x ∈ X and y = g(x), then det(J_{g⁻¹}(y)) = 1/det(Jg(x)), and hence f(y) = F(g⁻¹(y))| det(J_{g⁻¹}(y))|. This allows us to interchange the roles of f and F (and interchange g and g⁻¹) to establish the reverse inequality ∫_X F ≤ ∫_Y f. Thus ∫_Y f = ∫_X F.
Remark: For two other proofs of the Change of variable theorem see (i) Chapter 4 of Munkres,
Analysis on Manifolds, and (ii) P.D. Lax, Change of variables in multiple integrals, American
Mathematical Monthly, 1999.
(i) Let X ⊂ Rn be a Jordan measurable set with X ⊂ U . Then x 7→ | det(Jg (x))| is Riemann
integrable over X and µ(g(X)) = ∫_X | det(Jg(x))|dx.
(ii) Assume the open sets U, V are Jordan measurable, and x 7→ det(Jg (x)) is bounded on U . Then
µ(V) = ∫_U | det(Jg(x))|dx.
Proof. (i) Let f : g(X) → R be f ≡ 1, and note that f ∈ R(g(X)) by [128](ii). Now by [132](ii), we have µ(g(X)) = ∫_{g(X)} 1 = ∫_{g(X)} f = ∫_X | det(Jg(x))|dx.
One important use of the Change of variable theorem is in transforming Euclidean coordinates to polar, cylindrical, or spherical coordinates.
Definition: (i) [Polar coordinates in R2 ] Let U = {(r, θ) ∈ R2 : r > 0 and 0 < θ < 2π}, and
g : U → R2 be g(r, θ) = (r cos θ, r sin θ). Then V := g(U ) = R2 \ {(x, 0) : x ≥ 0}, where
{(x, 0) : x ≥ 0} is a closed null set in R2. The function g : U → V is a bijective C1-function with Jg(r, θ) = [cos θ, −r sin θ; sin θ, r cos θ], so that det(Jg(r, θ)) = r ≠ 0 for every (r, θ) ∈ U. Hence
g : U → V is a C 1 -diffeomorphism by Inverse function theorem. If (x, y) ∈ V and (x, y) = g(r, θ),
then (r cos θ, r sin θ) is said to be the polar coordinate representation of (x, y). Here note that
r2 = x2 + y 2 , and θ is the angle (measured in the anticlockwise direction) from the positive x-axis
to the line segment joining (0, 0) and (x, y).
(ii) [Cylindrical coordinates in R3 ] Let U = {(r, θ, z) ∈ R3 : r > 0, 0 < θ < 2π, and z ∈ R} and
g : U → R3 be g(r, θ, z) = (r cos θ, r sin θ, z) (this means using polar coordinates in the xy-plane
and keeping the z-coordinate unchanged). Let V = R3 \ {(x, 0, z) : x ≥ 0 and z ∈ R}. Then
{(x, 0, z) : x ≥ 0 and z ∈ R} is a closed null set in R3 , and g : U → V is a bijective C 1 -function
with Jg(r, θ, z) = [cos θ, −r sin θ, 0; sin θ, r cos θ, 0; 0, 0, 1], so that det(Jg(r, θ, z)) = r ≠ 0 for every (r, θ, z) ∈ U. Hence g : U → V is a C1-diffeomorphism. If (x, y, z) ∈ V and (x, y, z) = g(r, θ, z), then (r cos θ, r sin θ, z)
is said to be the cylindrical coordinate representation of (x, y, z). Here note that r2 = x2 + y 2 , and
θ is the angle (measured in the anticlockwise direction) from the positive x-axis to the line segment
joining (0, 0, 0) and (x, y, 0).
(iii) [Spherical coordinates in R3 ] Note that A := {(x, 0, z) : x ≥ 0 and z ∈ R} is a closed null set
in R3 . Let V = R3 \ A, and consider (x, y, z) ∈ V . Define r > 0 and t > 0 by the conditions
that r2 = x2 + y 2 + z 2 and t2 = x2 + y 2 . In the xy-plane, we may use polar coordinates and write
(x, y, 0) = (t cos θ, t sin θ), where θ ∈ (0, 2π) is the angle (measured in the anticlockwise direction)
from the positive x-axis to the line segment joining (0, 0, 0) and (x, y, 0). Let η ∈ (0, π) be the angle
between the positive z-axis and the line segment joining (0, 0, 0) and (x, y, z). Then z = r cos η and
t = r sin η so that (x, y, z) = (r cos θ sin η, r sin θ sin η, r cos η).
Let U = {(r, θ, η) ∈ R3 : r > 0, 0 < θ < 2π, and 0 < η < π} and g : U → V be
g(r, θ, η) = (r cos θ sin η, r sin θ sin η, r cos η). Then g is a bijective C1-function with Jg(r, θ, η) = [cos θ sin η, −r sin θ sin η, r cos θ cos η; sin θ sin η, r cos θ sin η, r sin θ cos η; cos η, 0, −r sin η], so that det(Jg(r, θ, η)) = −r² sin η ≠ 0 for every (r, θ, η)
in U . Hence g : U → V is a C 1 -diffeomorphism. If (x, y, z) ∈ V and (x, y, z) = g(r, θ, η), then
(r cos θ sin η, r sin θ sin η, r cos η) is said to be the spherical coordinate representation of (x, y, z).
Exercise-37: (i) Use polar coordinates to see the area of B(0, λ) ⊂ R2 is πλ2 .
(ii) Use spherical coordinates to see the volume of B(0, λ) ⊂ R3 is 4πλ³/3.
[Hint: (i) Let U = (0, λ) × (0, 2π) and V = B(0, λ) \ {(x, 0) : x ≥ 0}. Note that {(x, 0) : x ≥ 0} is a
closed null set in R2 and g : U → V given by g(r, θ) = (r cos θ, r sin θ) is a C 1 -diffeomorphism with
det(Jg(r, θ)) = r. Hence by [133](ii), µ(B(0, λ)) = µ(V) = ∫_U | det(Jg(r, θ))| = ∫_0^λ ∫_0^{2π} r dθ dr = πλ².
(ii) Let U = (0, λ) × (0, 2π) × (0, π), g : U → R3 be g(r, θ, η) = (r cos θ sin η, r sin θ sin η, r cos η),
and V = g(U ). Then V is equal to B(0, λ) minus a closed null set, and g : U → V is a
C 1 -diffeomorphism with |det(Jg (r, θ, η))| = r2 sin η. Hence by [133](ii), µ(B(0, λ)) = µ(V ) =
∫_U r² sin η = ∫_0^λ ∫_0^{2π} ∫_0^π r² sin η dη dθ dr = 4πλ³/3.]
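The spherical-coordinates computation in the hint can be reproduced numerically (an added check; the θ-integrand is constant, so that factor is simply 2π):

```python
import math

def ball_volume(lam=1.0, n=300):
    # midpoint rule for ∫_0^λ ∫_0^{2π} ∫_0^π r^2 sin(η) dη dθ dr
    hr, he = lam / n, math.pi / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * hr
        for k in range(n):
            eta = (k + 0.5) * he
            total += r * r * math.sin(eta) * hr * he
    return 2 * math.pi * total   # the θ factor
```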
Example: We wish to evaluate ∫_V f, where V = {(x, y) ∈ R2 : x > 0, y > 0, and x² + y² < λ²}, and f : V → R is f(x, y) = x²y. Let U = (0, λ) × (0, π/2) and note g : U → V given by g(r, θ) = (r cos θ, r sin θ) is a C1-diffeomorphism with det(Jg(r, θ)) = r. By [132](ii), ∫_V f = ∫_U f(g(r, θ))| det(Jg(r, θ))| = ∫_0^λ ∫_0^{π/2} r⁴ cos²θ sin θ dθ dr = ∫_0^λ ∫_0^1 r⁴ t² dt dr = λ⁵/15 (where t = cos θ).
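A midpoint-rule evaluation of the transformed integral reproduces λ⁵/15 (an added numerical check):

```python
import math

def polar_integral(lam=2.0, n=400):
    # midpoint rule for ∫_0^λ ∫_0^{π/2} r^4 cos^2(θ) sin(θ) dθ dr,
    # i.e. the integrand f(g(r,θ)) |det Jg(r,θ)| from the example
    hr, ht = lam / n, (math.pi / 2) / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * hr
        for j in range(n):
            th = (j + 0.5) * ht
            total += r**4 * math.cos(th)**2 * math.sin(th) * hr * ht
    return total
```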
[134] For n ∈ N and λ > 0, let v(n, λ) denote the n-dimensional volume of B(0, λ) ⊂ Rn . Then,
(i) v(n, λ) = λn v(n, 1).
(ii) v(n + 2, 1) = 2πv(n, 1)/(n + 2).
(iii) v(3, λ) = 4πλ3 /3 and v(4, λ) = π 2 λ4 /2.
Proof. (i) Let g : Rn → Rn be g(x) = λx. Then g is an invertible linear map (in particular a
C 1 -diffeomorphism) with g(B(0, 1)) = B(0, λ). The matrix of g is a diagonal matrix where all
diagonal entries are equal to λ, and hence | det(Jg (x))| = λn for every x ∈ Rn . By [133](ii),
v(n, λ) = ∫_{B(0,1)} | det(Jg(x))|dx = ∫_{B(0,1)} λⁿ = λⁿ v(n, 1).
(ii) In Rn+2, put y = xn+1 and z = xn+2. With the help of part (i), we see v(n + 2, 1)
= ∫_{y²+z²<1} (∫_{x1²+···+xn²<1−(y²+z²)} 1) dy dz
= ∫_{y²+z²<1} v(n, √(1 − (y² + z²))) dy dz = v(n, 1) ∫_{y²+z²<1} (1 − (y² + z²))^{n/2} dy dz.
Now applying [133](ii) to the polar coordinates in the yz-plane, note that
∫_{y²+z²<1} (1 − (y² + z²))^{n/2} dy dz = ∫_0^1 ∫_0^{2π} (1 − r²)^{n/2} r dθ dr = 2π/(n + 2) by putting t = 1 − r².
(iii) Since v(1, 1) = 2, we get by (i) and (ii) that v(3, λ) = λ³v(3, 1) = λ³ × 2πv(1, 1)/3 = 4πλ³/3.
Since v(2, 1) = π, we get by (i) and (ii) that v(4, λ) = λ⁴v(4, 1) = λ⁴ × 2πv(2, 1)/4 = π²λ⁴/2.
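The recursion in [134](ii), started from v(1, 1) = 2 and v(2, 1) = π, generates all unit-ball volumes; a short sketch:

```python
import math

def unit_ball_volumes(max_dim=8):
    # v(1,1) = 2, v(2,1) = π, then v(n+2,1) = 2π v(n,1)/(n+2)
    v = {1: 2.0, 2: math.pi}
    for n in range(1, max_dim - 1):
        if n in v:
            v[n + 2] = 2 * math.pi * v[n] / (n + 2)
    return v
```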
Line integral refers to the integral of a function f over a path α, and there are two types:
(i) line integrals of scalar fields (i.e., real-valued) f , and this line integral will be independent of
the orientation of the path α, and
(ii) line integrals of vector fields (i.e., vector-valued) f , and this line integral will be sensitive to
the orientation of the path α.
Definition: The length l(α) of a C1-path α : [a, b] → Rn is defined as l(α) = ∫_α 1 = ∫_a^b ∥α′(t)∥dt (i.e., take f ≡ 1 in the line integral defined above). If α := ∑_{j=1}^k α[j] : [a, b] → Rn is a piecewise C1 path, its length is defined as l(α) = ∑_{j=1}^k l(α[j]).
Example: (i) Let α : [0, π/2] → R2 be α(t) = (2 cos t, 2 sin t) and f : R2 → R be f (x, y) = x + 5y.
Then f(α(t)) = 2 cos t + 10 sin t and ∥α′∥ ≡ 2. Hence ∫_α f = ∫_0^{π/2} (4 cos t + 20 sin t)dt = 24.
(ii) Let α, β : [0, 2π] → R2 be α(t) = (cos t, sin t) and β(t) = (cos 3t, sin 3t). Then α and β have the
same image (the unit circle), but l(α) = 2π ̸= 6π = l(β) because ∥α′ ∥ ≡ 1 and ∥β ′ ∥ ≡ 3. Moreover,
if f : R2 → R is f(x, y) = x² + y², then f ◦ α ≡ 1 ≡ f ◦ β, and therefore ∫_α f = 2π ≠ 6π = ∫_β f.
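The computation in example (i) can be checked with a generic midpoint-rule line integral for scalar fields (an added sketch):

```python
import math

def line_integral_scalar(f, alpha, dalpha, a, b, n=4000):
    # midpoint rule for ∫_α f = ∫_a^b f(α(t)) ||α'(t)|| dt
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        t = a + (i + 0.5) * h
        x, y = alpha(t)
        dx, dy = dalpha(t)
        total += f(x, y) * math.hypot(dx, dy) * h
    return total

alpha  = lambda t: (2 * math.cos(t), 2 * math.sin(t))   # quarter circle of radius 2
dalpha = lambda t: (-2 * math.sin(t), 2 * math.cos(t))
f      = lambda x, y: x + 5 * y

val = line_integral_scalar(f, alpha, dalpha, 0.0, math.pi / 2)
# taking f ≡ 1 gives the arc length of the quarter circle, namely π
length = line_integral_scalar(lambda x, y: 1.0, alpha, dalpha, 0.0, math.pi / 2)
```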
[135] [Line integral of a scalar field remains invariant under an equivalent reparametrization of the
path] Let U ⊂ Rn be open, and f : U → R be continuous.
(i) Let α : [a, b] → U be a piecewise C 1 path, g : [c, d] → [a, b] be a C 1 -diffeomorphism, and
β : [c, d] → U be β = α ◦ g. Then ∫_α f = ∫_β f.
(ii) Let α : [a, b] → U be a piecewise C 1 path, and define −α : [a, b] → Rn as (−α)(t) = α(a + b − t)
(the path in the reverse direction). Then ∫_α f = ∫_{−α} f.
Proof. (i) By the additivity of the integral, we may suppose that α is a C 1 -path; and then β
is also a C 1 -path. Let h : [a, b] → R be h(t) = f (α(t))∥α′ (t)∥. Then by Change of variable
theorem, ∫_α f = ∫_a^b h(t)dt = ∫_c^d h(g(s))|g′(s)|ds. By Chain rule, β′(s) = g′(s)α′(g(s)), and therefore ∫_β f = ∫_c^d f(β(s))∥β′(s)∥ds = ∫_c^d f(β(s))∥α′(g(s))∥|g′(s)|ds = ∫_c^d h(g(s))|g′(s)|ds = ∫_α f.
Remark: When we have to integrate a scalar field over a circle or the boundary of a rectangle,
etc., we should consider the natural parametrization in the anticlockwise direction. For example,
let A = [a, b] × [c, d], and suppose we wish to evaluate ∫_{∂A} 1. Let α, β : [a, b] → R2 be α(t) = (t, c), β(t) = (t, d); and γ, σ : [c, d] → R2 be γ(t) = (a, t), σ(t) = (b, t). Then the anticlockwise parametrization of ∂A is given by the path α + σ − β − γ. By [135](ii), ∫_{∂A} 1 = ∫_α 1 + ∫_σ 1 + ∫_β 1 + ∫_γ 1 = ∫_a^b (∥α′(t)∥ + ∥β′(t)∥) + ∫_c^d (∥σ′(t)∥ + ∥γ′(t)∥) = ∫_a^b 2 + ∫_c^d 2 = 2((b − a) + (d − c)), which is the perimeter of the rectangle A.
which is a Riemann sum of the continuous function t 7→ ⟨f (α(t)), α′ (t)⟩ from [a, b] to R. Motivated
by this observation, we define:
(ii) Let α : [0, π/2] → R2 be α(t) = (2 cos t, 2 sin t) and f : R2 → R2 be f (x, y) = (3x, 5y). Then
∫_α f = ∫_0^{π/2} ⟨f(α(t)), α′(t)⟩dt = ∫_0^{π/2} ⟨(6 cos t, 10 sin t), (−2 sin t, 2 cos t)⟩dt = 8∫_0^{π/2} cos t sin t dt = 8∫_0^1 s ds = 4 by putting s = sin t.
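This vector-field line integral, and its sign change under reversal of the path, can be checked numerically (an added sketch):

```python
import math

def line_integral_vector(f, alpha, dalpha, a, b, n=4000):
    # midpoint rule for ∫_α f = ∫_a^b <f(α(t)), α'(t)> dt
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        t = a + (i + 0.5) * h
        fx, fy = f(*alpha(t))
        dx, dy = dalpha(t)
        total += (fx * dx + fy * dy) * h
    return total

alpha  = lambda t: (2 * math.cos(t), 2 * math.sin(t))
dalpha = lambda t: (-2 * math.sin(t), 2 * math.cos(t))
f      = lambda x, y: (3 * x, 5 * y)

val = line_integral_vector(f, alpha, dalpha, 0.0, math.pi / 2)
# the reversed path (−α)(t) = α(a + b − t) flips the sign of the integral
rev = line_integral_vector(f,
                           lambda t: alpha(math.pi / 2 - t),
                           lambda t: tuple(-c for c in dalpha(math.pi / 2 - t)),
                           0.0, math.pi / 2)
```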
[136] [Line integral of a vector field is sensitive to the orientation of the path; but is invariant
under an orientation-preserving equivalent reparametrization of the path] Let U ⊂ Rn be open,
f : U → Rn be continuous, and α : [a, b] → U be a piecewise C 1 path. Then,
(i) ∫_{−α} f = −∫_α f, where −α : [a, b] → U is given by (−α)(t) = α(a + b − t).
(ii) Let g : [c, d] → [a, b] be a C1-diffeomorphism, and β : [c, d] → U be β = α ◦ g. Then, ∫_α f = ∫_β f if g′ > 0; and ∫_α f = −∫_β f if g′ < 0.
∫ ∫b ′
∫b ′
∫
Proof. (i) −α f = a ⟨f (α(a + b − t), α (a + b − t)⟩dt = − a ⟨f (α(s), α (s)⟩ds = − αf by putting
s = a + b − t.
(ii) Let h : [a, b] → R be h(t) = ⟨f(α(t)), α′(t)⟩. Then by Change of variable, ∫_α f = ∫_a^b h(t)dt = ∫_c^d h(g(s))|g′(s)|ds. Also, ∫_β f = ∫_c^d ⟨f(β(s)), β′(s)⟩ds = ∫_c^d h(g(s))g′(s)ds since β′(s) = g′(s)α′(g(s)) by Chain rule. Now it is clear that ∫_α f = ∫_β f if g′ > 0, and ∫_α f = −∫_β f if g′ < 0.
Example: Let A = [a, b] × [c, d]. We wish to evaluate ∫_{∂A} ((x + y)dx + (x − y)dy). Let α, β : [a, b] → R2 be α(t) = (t, c), β(t) = (t, d); and γ, σ : [c, d] → R2 be γ(t) = (a, t), σ(t) = (b, t). Then the anticlockwise parametrization of ∂A is given by the path α + σ − β − γ. Moreover, observe that dy = 0 along α and β; and dx = 0 along γ and σ. Therefore by [136](i), ∫_{∂A} (x + y)dx + (x − y)dy
= ∫_α (x + y)dx + ∫_σ (x − y)dy − ∫_β (x + y)dx − ∫_γ (x − y)dy
= ∫_a^b (t + c)dt + ∫_c^d (b − t)dt − ∫_a^b (t + d)dt − ∫_c^d (a − t)dt = ∫_a^b (c − d)dt + ∫_c^d (b − a)dt = 0.
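The cancellation over the four edges can be reproduced numerically for a concrete (hypothetical) rectangle; since each edge integrand is linear in t, the midpoint rule is exact up to rounding.

```python
def edge_integral(x, y, dx, dy, t0, t1, n=1000):
    # midpoint rule for ∫ (x+y) dx + (x-y) dy along t ↦ (x(t), y(t)), t ∈ [t0, t1]
    h = (t1 - t0) / n
    total = 0.0
    for i in range(n):
        t = t0 + (i + 0.5) * h
        total += ((x(t) + y(t)) * dx(t) + (x(t) - y(t)) * dy(t)) * h
    return total

a, b, c, d = 0.0, 2.0, 1.0, 4.0     # a hypothetical rectangle [a,b] x [c,d]
bottom = edge_integral(lambda t: t,         lambda t: c,         lambda t: 1.0,  lambda t: 0.0,  a, b)
right  = edge_integral(lambda t: b,         lambda t: t,         lambda t: 0.0,  lambda t: 1.0,  c, d)
top    = edge_integral(lambda t: a + b - t, lambda t: d,         lambda t: -1.0, lambda t: 0.0,  a, b)
left   = edge_integral(lambda t: a,         lambda t: c + d - t, lambda t: 0.0,  lambda t: -1.0, c, d)
total_boundary = bottom + right + top + left
```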
Definition: Let U ⊂ Rn be a connected open set and f : U → Rn be continuous. We say f has path
independent line integral in U if for any two piecewise C 1 paths α, β : [a, b] → U with α(a) = β(a)
and α(b) = β(b), we have that ∫_α f = ∫_β f.
Recall from Exercise-11 that if U ⊂ Rn is a connected open set, then for every x, y ∈ U , there is
a polygonal path (i.e., a continuous path consisting of finitely many line segments, and in particular
a piecewise C 1 path) in U from x to y. Here note that a path α : [a, b] → U is said to be a path
from x to y if α(a) = x and α(b) = y.
[137] [Fundamental theorem of Calculus for line integrals of a vector field] Let U ⊂ Rn be a
connected open set, and f : U → Rn be continuous.
(i) Assume there is a function F : U → R with ∇F = f. Then ∫_α f = F(α(b)) − F(α(a)) for any piecewise C1 path α : [a, b] → U.
∫
(ii) Assume f has path independent line integral in U . Fix z ∈ U . Define F : U → R as F (x) = α f,
where α is any piecewise C 1 path in U from z to x. Then F is a C 1 -function with ∇F = f .
Proof. (i) Since ∇F = f and f is continuous, it follows that F is a C1-function, and in particular differentiable. Now, ∫_α f = ∫_a^b ⟨f(α(t)), α′(t)⟩dt = ∫_a^b ⟨∇F(α(t)), α′(t)⟩dt = ∫_a^b (F ◦ α)′(t)dt = F(α(b)) − F(α(a)) by the Chain rule [108](iii) and the Fundamental theorem of Calculus of one-variable theory.
(ii) It suffices to show ∇F = f, and then the continuity of f will imply that F is a C1-function. Let f = (f1, . . . , fn). Fix x ∈ U and j ∈ {1, . . . , n}. We need to show that lim_{t→0} (F(x + tej) − F(x))/t = fj(x). Let α be a piecewise C1 path in U from z to x. Then F(x) = ∫_α f. Choose an open ball B ⊂ U centered at x and consider t ≠ 0 with x + tej ∈ B. Let β : [0, 1] → U be β(s) = x + stej, i.e., β is a parametrization of the line segment joining x and x + tej. Then α + β is a path in U from z to x + tej, and therefore F(x + tej) = ∫_{α+β} f = ∫_α f + ∫_β f = F(x) + ∫_β f. Hence F(x + tej) − F(x) = ∫_β f = ∫_0^1 ⟨f(x + stej), tej⟩ds = ∫_0^t ⟨f(x + λej), ej⟩dλ = ∫_0^t fj(x + λej)dλ by putting λ = st. Moreover, fj(x) = (1/t) ∫_0^t fj(x)dλ. Consequently, |(F(x + tej) − F(x))/t − fj(x)| ≤ (1/|t|) ∫_0^t |fj(x + λej) − fj(x)|dλ, and the right hand side goes to 0 as t → 0 by the continuity of fj.
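Statement (i) of [137] can be tested numerically; a sketch under assumed sample data (the potential F(x, y) = x2y + sin y and the path α(t) = (cos t, t) on [0, π] are hypothetical choices, not from the notes):

```python
import math

def line_integral(f, alpha, alpha_prime, a, b, n=20000):
    """Midpoint Riemann sum for the line integral of f along alpha on [a, b]."""
    h = (b - a) / n
    s = 0.0
    for i in range(n):
        t = a + (i + 0.5) * h
        fx, fy = f(*alpha(t))
        dx, dy = alpha_prime(t)
        s += (fx * dx + fy * dy) * h
    return s

# hypothetical potential and its gradient field f = grad F
F = lambda x, y: x * x * y + math.sin(y)
grad_F = lambda x, y: (2 * x * y, x * x + math.cos(y))

alpha = lambda t: (math.cos(t), t)
alpha_prime = lambda t: (-math.sin(t), 1.0)

lhs = line_integral(grad_F, alpha, alpha_prime, 0.0, math.pi)
rhs = F(*alpha(math.pi)) - F(*alpha(0.0))
print(lhs, rhs)
```

Both numbers agree (here they equal F(−1, π) − F(1, 0) = π), illustrating that the line integral of a gradient field depends only on the endpoints.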
[138] Let U ⊂ Rn be a connected open set, and f : U → Rn be continuous. Then the following are equivalent:
(i) ∫_α f = 0 for every piecewise C1 closed path α in U.
(ii) f has path independent line integral in U.
(iii) There is a C1-function F : U → R with ∇F = f.
MULTIVARIABLE CALCULUS 57
Proof. (i) ⇒ (ii): Let α, β : [a, b] → U be piecewise C1 paths with α(a) = β(a) and α(b) = β(b). Then α − β is a piecewise C1 closed path, and hence 0 = ∫_{α−β} f = ∫_α f − ∫_β f by (i) and [136](i).
The implication '(ii) ⇒ (iii)' is established in [137](ii), and '(iii) ⇒ (i)' follows from [137](i).
→ −(∂f1/∂y)(a, b) + (∂f2/∂x)(a, b) as ε → 0. This proves the main assertion.
Statement (i) is an immediate corollary. To deduce (ii) from (i), note that Jf (a, b) is the transpose
of the Hessian matrix HF (a, b), and HF (a, b) is symmetric by [110] because F is C 2 (as f is C 1 ).
Example: (i) Let f : R2 → R2 be f(x, y) = (−y, x). We know that this flow represents a rotation. Since ∂f2/∂x − ∂f1/∂y ≡ 2, the circulation density of f at (a, b) is 2 for every (a, b) ∈ R2. From [139](ii), we deduce that there does not exist any C2-function F : R2 → R with ∇F = f.
(ii) Let f : R2 → R2 be f(x, y) = (x, y), which represents a flow expanding in all directions from the origin with increasing speed. Here, ∂f2/∂x − ∂f1/∂y ≡ 0, and thus the circulation density of f at (a, b) is 0 for every (a, b) ∈ R2. If F : R2 → R is F(x, y) = (x2 + y2)/2, then ∇F = f.
If f : R2 → R2 is a C1-function with ∂f2/∂x − ∂f1/∂y ≡ 0, we may ask whether there is a C2-function F : R2 → R with ∇F = f, or equivalently whether f has path independent line integral in R2. The affirmative answer is given by [140] below. Another related result is Green's theorem (stated as [141] below), which is true for regions bounded by piecewise C1 paths, but we will prove only a special case of Green's theorem.
A = {(x, y) : a ≤ x ≤ b and ϕ(x) ≤ y ≤ ψ(x)} = {(x, y) : c ≤ y ≤ d and ϕ̃(y) ≤ x ≤ ψ̃(y)}.
Clearly, every rectangle is an elementary region. On the other hand, the set {(x, y) ∈ R2 : −1 ≤ x ≤ 1 and x2 ≤ y ≤ x2 + 1} is not an elementary region because the second representation fails.
Proof. (i) Let f = (f1, f2). Then ∫_∂A f = ∫_∂A (f1 dx + f2 dy). We will show that ∫_A (−∂f1/∂y) = ∫_∂A f1 dx and ∫_A ∂f2/∂x = ∫_∂A f2 dy. Choose piecewise C1 paths ϕ, ψ, ϕ̃, and ψ̃ such that
A = {(x, y) : a ≤ x ≤ b and ϕ(x) ≤ y ≤ ψ(x)} = {(x, y) : c ≤ y ≤ d and ϕ̃(y) ≤ x ≤ ψ̃(y)}.
The graphs of ϕ, ψ, ϕ̃, ψ̃ are null sets in R2. Since ∂A consists of these graphs and at most two horizontal and at most two vertical line segments, ∂A is also a null set in R2. Hence the compact set A is Jordan measurable, and therefore the continuous function ∂f2/∂x − ∂f1/∂y is indeed Riemann integrable over A by [128](ii).
We have ∫_A (−∂f1/∂y) = ∫_a^b (∫_{ϕ(x)}^{ψ(x)} (−∂f1/∂y)dy)dx = ∫_a^b (f1(x, ϕ(x)) − f1(x, ψ(x)))dx by the first representation of A. Let α, β : [a, b] → R2 be α(t) = (t, ϕ(t)) and β(t) = (t, ψ(t)). Then dx = dt along both α and β. Note that the vertical line segments of ∂A (if any) do not contribute to the integral ∫_∂A f1 dx since dx = 0 along vertical lines. Therefore, by the first representation of A, we get ∫_∂A f1 dx = ∫_α f1 dx − ∫_β f1 dx = ∫_a^b (f1(t, ϕ(t)) − f1(t, ψ(t)))dt = ∫_A (−∂f1/∂y).
Next, ∫_A ∂f2/∂x = ∫_c^d (∫_{ϕ̃(y)}^{ψ̃(y)} (∂f2/∂x)dx)dy = ∫_c^d (f2(ψ̃(y), y) − f2(ϕ̃(y), y))dy by the second representation of A. Let γ, σ : [c, d] → R2 be γ(t) = (ϕ̃(t), t) and σ(t) = (ψ̃(t), t). Then dy = dt along both γ and σ. Note that the horizontal line segments of ∂A (if any) do not contribute to the integral ∫_∂A f2 dy since dy = 0 along horizontal lines. Therefore, by the second representation of A, we get ∫_∂A f2 dy = ∫_σ f2 dy − ∫_γ f2 dy = ∫_c^d (f2(ψ̃(t), t) − f2(ϕ̃(t), t))dt = ∫_A ∂f2/∂x.
Remark: The equality ∫_A (∂f2/∂x − ∂f1/∂y) = ∫_∂A f in Green's theorem may be interpreted as follows: the net amount of anticlockwise rotation of a 2-dimensional flow f in a planar region A is equal to the net amount of the flow f along the boundary of A in the anticlockwise direction.
Exercise-41: Let A ⊂ R2 be the region bounded by the ellipse x2/a2 + y2/b2 = r2, where a, b > 0. Then µ(A) = πabr2 by an application of Green's theorem. [Hint: Choose a simple enough C1-function f : R2 → R2 with ∂f2/∂x − ∂f1/∂y ≡ 1, say f(x, y) = (0, x). Then µ(A) = ∫_A 1 = ∫_A (∂f2/∂x − ∂f1/∂y) = ∫_∂A f by [141]. Parametrizing ∂A with α : [0, 2π] → R2 given by α(t) = (ar cos t, br sin t), we see ∫_∂A f = ∫_0^{2π} ⟨f(α(t)), α′(t)⟩dt = ∫_0^{2π} abr2 cos2 t dt = abr2 ∫_0^{2π} ((1 + cos(2t))/2)dt = πabr2.]
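The hint can be checked numerically; a minimal sketch with assumed sample values a = 2, b = 3, r = 1.5 (any positive values would do), approximating the boundary integral by a midpoint Riemann sum:

```python
import math

def ellipse_area_by_green(a, b, r, n=10000):
    """Boundary integral of f(x,y) = (0, x) along alpha(t) = (a r cos t, b r sin t),
    which by Green's theorem equals the area enclosed by the ellipse."""
    h = 2 * math.pi / n
    s = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        x = a * r * math.cos(t)          # first coordinate of alpha(t)
        dy = b * r * math.cos(t)         # derivative of the second coordinate
        s += x * dy * h                  # <f(alpha(t)), alpha'(t)> = x * y'(t)
    return s

a, b, r = 2.0, 3.0, 1.5                  # assumed sample values
approx, exact = ellipse_area_by_green(a, b, r), math.pi * a * b * r * r
print(approx, exact)
```

For a periodic smooth integrand such as cos2 t, the midpoint sum agrees with πabr2 essentially to machine precision.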
As in the case of line integrals, we will define two types of surface integrals - (i) for scalar fields,
and (ii) for vector fields. We will think of a surface as a function rather than as a set (as in the
case of a path). First, we need to recall the notion of a cross product (vector product).
Definition: The cross product (also called vector product) is a binary operation on R3 defined by
the following conditions: (i) e1 × e2 = e3 , e2 × e3 = e1 , and e3 × e1 = e2 .
(ii) ej × ei = −(ei × ej ) for 1 ≤ i, j ≤ 3, and in particular ej × ej = 0 ∈ R3 for 1 ≤ j ≤ 3.
(iii) u × v = Σ_{i=1}^{3} Σ_{j=1}^{3} u_i v_j (e_i × e_j) for every u = (u1, u2, u3) and v = (v1, v2, v3) in R3.
(v) ⟨u × v, w⟩ = det[w1 w2 w3; u1 u2 u3; v1 v2 v3] = det[u1 u2 u3; v1 v2 v3; w1 w2 w3] = det[u1 v1 w1; u2 v2 w2; u3 v3 w3], where the rows of each matrix are separated by semicolons.
(vi) ⟨u × v, w⟩ ̸= 0 ⇔ {u, v, w} is linearly independent (this follows from (v)). In particular, u × v
is perpendicular to both u and v, i.e., ⟨u × v, u⟩ = 0 = ⟨u × v, v⟩.
(vii) ∥u × v∥2 = ∥u∥2 ∥v∥2 − |⟨u, v⟩|2 by (iii). It follows that ∥u × v∥ = ∥u∥∥v∥ sin θ if θ ∈ [0, π] is
the angle between u and v because ⟨u, v⟩ = ∥u∥∥v∥ cos θ.
(viii) ∥u × v∥ is the area of the parallelogram in R3 with vertices 0, u, v, and u + v by (vii).
(ix) |⟨u × v, w⟩| is the area of the parallelepiped in R3 specified by the three vectors u, v, and w (to
see this, note that if η is the angle between u × v and w, then |⟨u × v, w⟩| = ∥u × v∥∥w∥| cos η|).
(x) u × (v × w) = ⟨u, w⟩v − ⟨u, v⟩w.
(xi) [Jacobi identity] (u × v) × w + (v × w) × u + (w × u) × v = 0 ∈ R3 .
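Properties (vi), (vii), (x), and (xi) are straightforward to verify numerically with a direct implementation of the coordinate formula; the sample vectors below are arbitrary choices of ours:

```python
def cross(u, v):
    """Cross product of u, v in R^3 (coordinate formula)."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

u, v, w = (1.0, 2.0, 3.0), (-2.0, 0.5, 4.0), (0.0, 1.0, -1.0)

# (vi): u x v is perpendicular to both u and v
assert abs(dot(cross(u, v), u)) < 1e-12 and abs(dot(cross(u, v), v)) < 1e-12
# (vii): Lagrange's identity ||u x v||^2 = ||u||^2 ||v||^2 - <u, v>^2
assert abs(dot(cross(u, v), cross(u, v))
           - (dot(u, u) * dot(v, v) - dot(u, v) ** 2)) < 1e-9
# (x): u x (v x w) = <u, w> v - <u, v> w
lhs = cross(u, cross(v, w))
rhs = tuple(dot(u, w) * vi - dot(u, v) * wi for vi, wi in zip(v, w))
assert all(abs(p - q) < 1e-9 for p, q in zip(lhs, rhs))
# (xi): Jacobi identity
jac = tuple(p + q + s for p, q, s in zip(cross(cross(u, v), w),
                                         cross(cross(v, w), u),
                                         cross(cross(w, u), v)))
assert all(abs(c) < 1e-9 for c in jac)
print("cross product identities verified")
```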
There are different approaches to the definition of a surface. We will consider only the restricted
notion of a parametric surface in R3 . As in the case of a path, we will define a parametric surface as
a function; and the image of this function will be what we geometrically think of as a surface. There
will be a little bit of ambiguity in the definition of a parametric surface since minor modifications
will be needed depending on the context.
∂P/∂x(x, y) × ∂P/∂y(x, y) = det[e1, e2, e3; −r sin x sin y, r cos x sin y, 0; r cos x cos y, r sin x cos y, −r sin y] = −r sin y P(x, y) ≠ (0, 0, 0) for every (x, y) ∈ int(X). Thus P is a parametric surface. In this example, ∂P/∂x(x, y) × ∂P/∂y(x, y) is the inward normal to the sphere at P(x, y) because of the negative sign in '−r sin y P(x, y)', and ∥∂P/∂x(x, y) × ∂P/∂y(x, y)∥ = r2 sin y because ∥P(x, y)∥ = r and sin y ≥ 0 for y ∈ [0, π].
(ii) [Cylinder] Let r > 0, h > 0, X = [0, 2π] × [0, h], and define a C1-function P : X → R3 as P(x, y) = (r cos x, r sin x, y), which is injective on int(X). Note that the image P(X) is a vertical cylinder of height h and radius r (without the top and bottom discs) with the center of the bottom disc placed at the origin of R3. We have that
∂P/∂x(x, y) × ∂P/∂y(x, y) = det[e1, e2, e3; −r sin x, r cos x, 0; 0, 0, 1] = (r cos x, r sin x, 0) = P(x, 0) ≠ (0, 0, 0) for every (x, y) ∈ X. Thus P is a parametric surface. In this example, ∂P/∂x(x, y) × ∂P/∂y(x, y) is the outward normal to the cylinder at P(x, y), and ∥∂P/∂x(x, y) × ∂P/∂y(x, y)∥ = ∥P(x, 0)∥ = r.
Exercise-43: (i) [Observation] If P = (P1, P2, P3) : X → R3 is a parametric surface and a ∈ X, then by Exercise-42(iii), we see that
∂P/∂x(a) × ∂P/∂y(a) = det[e1, ∂P1/∂x(a), ∂P1/∂y(a); e2, ∂P2/∂x(a), ∂P2/∂y(a); e3, ∂P3/∂x(a), ∂P3/∂y(a)] = det[E JP(a)], where E := [e1; e2; e3] is the formal 3×1 column with entries e1, e2, e3, and [E JP(a)] is the 3×3 matrix whose first column is E and whose other two columns are those of JP(a).
This suggests that ∥∂P/∂x(a) × ∂P/∂y(a)∥ is the 'local magnification factor' of P at a ∈ X.
(ii) Let P : X → R3 and P̃ : X̃ → R3 be parametric surfaces, and suppose g : X → X̃ is a C1-diffeomorphism with P = P̃ ◦ g. Then for every a ∈ X and b := g(a) ∈ X̃, we have that
∂P/∂x(a) × ∂P/∂y(a) = det(Jg(a)) (∂P̃/∂x(b) × ∂P̃/∂y(b)).
[Hint: (ii) JP(a) = JP̃(b)Jg(a) by the Chain rule. Now use (i).]
Example: (i) Recall the examples of the sphere and the cylinder from the previous page. In the case of the sphere with radius r, we have ∥∂P/∂x(x, y) × ∂P/∂y(x, y)∥ = r2 sin y, and hence the surface area of this sphere is ∫_0^{2π} (∫_0^π r2 sin y dy)dx = ∫_0^{2π} 2r2 dx = 4πr2. In the case of the cylinder with height h and radius r, we have ∥∂P/∂x × ∂P/∂y∥ ≡ r, and hence the surface area of this cylinder (without the top and bottom discs) is ∫_0^{2π} (∫_0^h r dy)dx = 2πrh.
(ii) Let h > 0, r > 0, X = [0, π/2] × [0, h], and P : X → R3 be P(x, y) = (r cos x, r sin x, y). Note that ∥∂P/∂x × ∂P/∂y∥ ≡ r. If f : R3 → R is f(x, y, z) = x + y + z, then ∫_P f = ∫_X (f ◦ P)∥∂P/∂x × ∂P/∂y∥ = ∫_0^{π/2} (∫_0^h (r2 cos x + r2 sin x + ry)dy)dx = ∫_0^{π/2} (r2h cos x + r2h sin x + rh2/2)dx = 2r2h + πrh2/4.
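Both area computations in (i) can be reproduced by a two-dimensional midpoint Riemann sum of ∥∂P/∂x × ∂P/∂y∥ over X; the helper below and the values r = 2, h = 5 are our own choices for a sketch:

```python
import math

def surface_area(norm_cross, x0, x1, y0, y1, nx=400, ny=400):
    """Midpoint Riemann sum of ||P_x x P_y|| over X = [x0,x1] x [y0,y1]."""
    hx, hy = (x1 - x0) / nx, (y1 - y0) / ny
    return sum(norm_cross(x0 + (i + 0.5) * hx, y0 + (j + 0.5) * hy)
               for i in range(nx) for j in range(ny)) * hx * hy

r, h = 2.0, 5.0                                     # assumed radius and height
# sphere: ||P_x x P_y|| = r^2 sin y on [0, 2pi] x [0, pi]
sphere = surface_area(lambda x, y: r * r * math.sin(y), 0, 2 * math.pi, 0, math.pi)
# cylinder: ||P_x x P_y|| = r on [0, 2pi] x [0, h]
cylinder = surface_area(lambda x, y: r, 0, 2 * math.pi, 0, h)
print(sphere, 4 * math.pi * r * r)
print(cylinder, 2 * math.pi * r * h)
```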
Exercise-44: Let P : X → R3 be a parametric surface of the form P(x, y) = (x, y, ϕ(x, y)), where ϕ : X → R is a C1-function. Then P(X) is the graph of ϕ, P is injective on X, and ∂P/∂x × ∂P/∂y = det[e1, e2, e3; 1, 0, ∂ϕ/∂x; 0, 1, ∂ϕ/∂y] = (−∂ϕ/∂x, −∂ϕ/∂y, 1) ≠ 0 ∈ R3 in X. Hence area(P(X)) = ∫_X √(1 + (∂ϕ/∂x)2 + (∂ϕ/∂y)2).
Exercise-45: Let U ⊂ R3 be open, f, g : U → R be continuous, and P : X → U be a parametric surface. Then, (i) ∫_P (c1f + c2g) = c1 ∫_P f + c2 ∫_P g for every c1, c2 ∈ R.
(ii) If f ≥ g, then ∫_P f ≥ ∫_P g. In particular, if f ≥ 0, then ∫_P f ≥ 0.
Proof. If a ∈ X and b = g(a) ∈ X̃, then ∥∂P/∂x(a) × ∂P/∂y(a)∥ = ∥∂P̃/∂x(b) × ∂P̃/∂y(b)∥ |det(Jg(a))| by Exercise-43(ii). Let h : X̃ → R be h(b) = (f ◦ P̃)(b) ∥∂P̃/∂x(b) × ∂P̃/∂y(b)∥. Then by the Change of variable theorem and the initial observation, we see that ∫_P̃ f = ∫_X̃ h = ∫_X (h ◦ g)|det(Jg(·))| = ∫_X (f ◦ P)∥∂P/∂x × ∂P/∂y∥ = ∫_P f.
[143] (i) Let 0 ≤ a < b, and ϕ : [a, b] → R be a C1-function. Assume that the graph of ϕ lies in the xz-plane in R3. Then the area of the 'surface of revolution' obtained by rotating the graph of ϕ around the z-axis is 2π ∫_a^b x√(1 + (ϕ′(x))2) dx.
(ii) The surface area of the cone with height h > 0 and radius r > 0 (without the disc) is πr√(r2 + h2).
Proof. (i) Let X = [a, b] × [0, 2π]. The 'surface of revolution' is parametrized by P : X → R3 given by P(x, y) = (x cos y, x sin y, ϕ(x)). Now, ∂P/∂x(x, y) × ∂P/∂y(x, y) = det[e1, e2, e3; cos y, sin y, ϕ′(x); −x sin y, x cos y, 0] = (−xϕ′(x) cos y, −xϕ′(x) sin y, x) so that ∥∂P/∂x(x, y) × ∂P/∂y(x, y)∥ = x√(1 + (ϕ′(x))2). Therefore, area(P(X)) = ∫_a^b (∫_0^{2π} x√(1 + (ϕ′(x))2) dy)dx = 2π ∫_a^b x√(1 + (ϕ′(x))2) dx.
(ii) The surface of the cone (without the disc) can be obtained as a 'surface of revolution' as described in (i) if we take ϕ : [0, r] → R as ϕ(x) = hx/r. Hence by (i), the area of the cone = 2π ∫_0^r x√(1 + (ϕ′(x))2) dx = 2π ∫_0^r x√(1 + (h/r)2) dx = πr2√(1 + (h/r)2) = πr√(r2 + h2).
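A numerical check of (ii) using the formula from (i), with assumed sample values r = 3, h = 4 (so the slant height is 5 and the area is 15π); the helper name is ours:

```python
import math

def revolution_area(phi_prime, a, b, n=20000):
    """2*pi * integral_a^b x*sqrt(1 + phi'(x)^2) dx by the midpoint rule,
    as in [143](i)."""
    h = (b - a) / n
    s = 0.0
    for i in range(n):
        x = a + (i + 0.5) * h
        s += x * math.sqrt(1.0 + phi_prime(x) ** 2) * h
    return 2 * math.pi * s

r, ht = 3.0, 4.0                       # assumed cone: radius 3, height 4
cone = revolution_area(lambda x: ht / r, 0.0, r)
print(cone, math.pi * r * math.sqrt(r * r + ht * ht))   # both 15*pi ≈ 47.12
```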
defined as ∫_P f = ∫_X ⟨f ◦ P, ∂P/∂x × ∂P/∂y⟩.
Remark: Sometimes we are interested in calculating the flow f across S := P(X) in the direction of −(∂P/∂x × ∂P/∂y) (depends on the context). Then we consider −∫_X ⟨f ◦ P, ∂P/∂x × ∂P/∂y⟩ as the surface integral. The surface integral is also denoted as ∫_P f · n̂, or as ∫_S f · n̂ dS, where S = P(X) and n̂ = ±(∂P/∂x × ∂P/∂y)/∥∂P/∂x × ∂P/∂y∥ (the unit normal).
∂x ∂y ∂x ∂y
Exercise-46: Let U ⊂ R3 be open, f, g : U → R3 be C1-functions, and P : X → U be a parametric surface. Then ∫_P (c1f + c2g) = c1 ∫_P f + c2 ∫_P g for every c1, c2 ∈ R.
Example: Let h, r > 0, X = [0, 2π] × [0, h], and P : X → R3 be P(x, y) = (r cos x, r sin x, y). We know that P(X) is a cylinder with height h and radius r whose axis is the z-axis. (i) Let f : R3 → R3 be f(x, y, z) = (−y, x, 0). Then f represents a rotation around the z-axis, and hence there is no flow out of P(X) so that we expect ∫_P f = 0. Indeed ∫_P f = ∫_X ⟨(−r sin x, r cos x, 0), (r cos x, r sin x, 0)⟩ = ∫_X 0 = 0. (ii) Let f : R3 → R3 be f(x, y, z) = (x, y, 0); then there is flow out of P(X); in fact, ∫_P f = ∫_X ⟨(r cos x, r sin x, 0), (r cos x, r sin x, 0)⟩ = ∫_X r2 = r2 µ(X) = 2πr2h.
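Both flux computations can be confirmed by a midpoint Riemann sum of ⟨f ◦ P, ∂P/∂x × ∂P/∂y⟩ over X; the helper `flux` and the sample values r = 1.5, h = 2 are our own choices:

```python
import math

def flux(integrand, x0, x1, y0, y1, nx=500, ny=500):
    """Midpoint Riemann sum of <f o P, P_x x P_y> over X = [x0,x1] x [y0,y1]."""
    hx, hy = (x1 - x0) / nx, (y1 - y0) / ny
    return sum(integrand(x0 + (i + 0.5) * hx, y0 + (j + 0.5) * hy)
               for i in range(nx) for j in range(ny)) * hx * hy

r, h = 1.5, 2.0                          # assumed cylinder
# P(x,y) = (r cos x, r sin x, y) with P_x x P_y = (r cos x, r sin x, 0)
rotation = flux(lambda x, y: (-r * math.sin(x)) * (r * math.cos(x))
                             + (r * math.cos(x)) * (r * math.sin(x)),
                0, 2 * math.pi, 0, h)    # case (i): integrand is identically 0
radial = flux(lambda x, y: r * r, 0, 2 * math.pi, 0, h)   # case (ii)
print(rotation, radial, 2 * math.pi * r * r * h)
```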
Another notation for the surface integral of a vector field: Let U ⊂ R3 be open, f = (f1, f2, f3) : U → R3 be continuous, and P = (P1, P2, P3) : X → U be a parametric surface. Write elements of X as (x, y) and elements of U as (u1, u2, u3). Then f(P(x, y)) = f(u1, u2, u3), where u1 = P1(x, y), u2 = P2(x, y), and u3 = P3(x, y). For distinct j, k ∈ {1, 2, 3}, letting
duj ∧ duk = det[∂Pj/∂x, ∂Pj/∂y; ∂Pk/∂x, ∂Pk/∂y],
we see ⟨f ◦ P, ∂P/∂x × ∂P/∂y⟩ = f1 du2 ∧ du3 + f2 du3 ∧ du1 + f3 du1 ∧ du2. Also, let S = P(X). Then the surface integral may be written as
∫_P f = ∫_S (f1 du2 ∧ du3 + f2 du3 ∧ du1 + f3 du1 ∧ du2).
Often, duj ∧ duk is written simply as duj duk (but it should be noted that this does not mean a
simple double integral as in Fubini’s theorem). Since the calculation of the surface integral does
not involve the Chain rule, it is not essential to use disjoint sets of variables for f and P : we may
write the variables of f as x, y, z also. Then the notation for the surface integral becomes
∫_P f = ∫_S (f1 dydz + f2 dzdx + f3 dxdy).
Exercise-47: Let f : R3 → R3 be f(x, y, z) = (x, y, 0), and S ⊂ R3 be the upper half of the sphere with radius r > 0 centered at the origin. Compute the surface integral ∫_S f with respect to the outward unit normal to S by considering the following parametrizations:
(i) P : [0, 2π] × [0, π/2] → R3, P(x, y) = (r cos x sin y, r sin x sin y, r cos y).
(ii) P : B(0, r) ⊂ R2 → R3, P(x, y) = (x, y, √(r2 − x2 − y2)).
[Hint: (i) We know ∂P/∂x(x, y) × ∂P/∂y(x, y) = −r sin y P(x, y), which is an inward normal. Let X = [0, 2π] × [0, π/2]. Then ∫_S f = −∫_X ⟨f ◦ P, ∂P/∂x × ∂P/∂y⟩
= −∫_X ⟨(r cos x sin y, r sin x sin y, 0), −r sin y (r cos x sin y, r sin x sin y, r cos y)⟩
= ∫_0^{π/2} ∫_0^{2π} r3 sin3 y dx dy = 2πr3 ∫_0^{π/2} sin3 y dy = 2πr3 ∫_0^{π/2} (1 − cos2 y) sin y dy = 2πr3 ∫_0^1 (1 − λ2)dλ = 4πr3/3 by putting λ = cos y.
(ii) Letting ϕ(x, y) = √(r2 − x2 − y2) and using Exercise-44, we have ∂P/∂x × ∂P/∂y = (−∂ϕ/∂x, −∂ϕ/∂y, 1) = P/ϕ, which is the outward normal. Let X = B(0, r) ⊂ R2. Then ∫_S f = ∫_X ⟨f ◦ P, ∂P/∂x × ∂P/∂y⟩ = ∫_X ⟨(x, y, 0), (x/√(r2 − x2 − y2), y/√(r2 − x2 − y2), 1)⟩ = ∫_X (x2 + y2)/√(r2 − x2 − y2). Using the polar coordinates (x, y) = (t cos θ, t sin θ) and the Change of variable theorem, this integral is equal to ∫_0^r ∫_0^{2π} (t2 · t)/√(r2 − t2) dθ dt = ∫_0^r 2πt3/√(r2 − t2) dt = ∫_0^r 2π(r2 − λ2)dλ = 4πr3/3 by putting λ = √(r2 − t2) (then dλ = −t dt/√(r2 − t2) and t2 = r2 − λ2).]
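Parametrization (i) of the hint can be checked numerically; this sketch evaluates −⟨f ◦ P, ∂P/∂x × ∂P/∂y⟩ by a midpoint sum (the radius r = 2 and the helper `midpoint2d` are our own choices):

```python
import math

def midpoint2d(g, x0, x1, y0, y1, nx=600, ny=600):
    """Midpoint Riemann sum of g over [x0,x1] x [y0,y1]."""
    hx, hy = (x1 - x0) / nx, (y1 - y0) / ny
    return sum(g(x0 + (i + 0.5) * hx, y0 + (j + 0.5) * hy)
               for i in range(nx) for j in range(ny)) * hx * hy

r = 2.0                                  # assumed radius

def g(x, y):
    # f(P(x,y)) = (r cos x sin y, r sin x sin y, 0), and
    # P_x x P_y = -r sin(y) P(x,y) is the inward normal, so flip the sign.
    px = r * math.cos(x) * math.sin(y)
    py = r * math.sin(x) * math.sin(y)
    nx_, ny_ = -r * math.sin(y) * px, -r * math.sin(y) * py
    return -(px * nx_ + py * ny_)        # = r^3 sin^3 y

flux = midpoint2d(g, 0.0, 2 * math.pi, 0.0, math.pi / 2)
print(flux, 4 * math.pi * r ** 3 / 3)
```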
[144] Let U ⊂ R3 be open, f : U → R3 be continuous, and P : X → U and P̃ : X̃ → U be parametric surfaces. Suppose there is a C1-diffeomorphism g : X → X̃ with P = P̃ ◦ g (if necessary assume g is defined in a neighborhood of X).
(i) If det(Jg(a)) > 0 for every a ∈ X, then ∫_P f = ∫_P̃ f.
(ii) If det(Jg(a)) < 0 for every a ∈ X, then ∫_P f = −∫_P̃ f.
Proof. If a ∈ X and b = g(a) ∈ X̃, then ∂P/∂x(a) × ∂P/∂y(a) = det(Jg(a)) (∂P̃/∂x(b) × ∂P̃/∂y(b)) by Exercise-43(ii). Let h : X̃ → R be h(b) = ⟨(f ◦ P̃)(b), ∂P̃/∂x(b) × ∂P̃/∂y(b)⟩. Then by the Change of variable theorem and the initial observation, we see that ∫_P̃ f = ∫_X̃ h = ∫_X (h ◦ g)|det(Jg(·))| = ±∫_X ⟨f ◦ P, ∂P/∂x(a) × ∂P/∂y(a)⟩ = ±∫_P f, where we have the plus sign if det(Jg(·)) > 0 and the minus sign if det(Jg(·)) < 0.
We will introduce the notions of divergence and curl for a vector field: divergence measures
expansion/compression (positive divergence indicates expansion and negative divergence indicates
compression), and curl measures the circulation density (the direction of the curl vector indicates
the axis around which maximal rotation happens and the magnitude of the curl vector measures
the speed of rotation).
Definition: (i) Let U ⊂ Rn be open and f = (f1, . . . , fn) : U → Rn be a C1-function (or just assume that all the first order partial derivatives exist). Then the divergence of f is the function divf : U → R defined as divf = ⟨∇, f⟩ = Σ_{i=1}^{n} ∂fi/∂xi.
(ii) Let U ⊂ R3 be open and f = (f1, f2, f3) : U → R3 be a C1-function (or just assume that all the first order partial derivatives exist). Then the curl of f is the vector-valued function curlf : U → R3 defined as
curlf = ∇ × f = det[e1, e2, e3; ∂/∂x, ∂/∂y, ∂/∂z; f1, f2, f3] = (∂f3/∂y − ∂f2/∂z, ∂f1/∂z − ∂f3/∂x, ∂f2/∂x − ∂f1/∂y).
Note that if f is a Ck-function, then curlf is a Ck−1-function. Also observe that if a ∈ U and Jf(a) is a symmetric matrix, then (curlf)(a) = 0.
Warning: Let U ⊂ Rn be open, f : U → Rn be a C1-function. Then ⟨∇, f⟩ ≠ ⟨f, ∇⟩: the right hand side is the partial differential operator Σ_{i=1}^{n} fi ∂/∂xi.
[145] Let U ⊂ Rn be open. (i) Let f, g : U → Rn be C 1 -functions, and c1 , c2 ∈ R.
Then div(c1 f + c2 g) = c1 divf + c2 divg. If n = 3, then curl(c1 f + c2 g) = c1 curlf + c2 curlg.
(ii) Let f : U → Rn and ϕ : U → R be C 1 -functions. Then
div(ϕf ) = ⟨∇, ϕf ⟩ = ⟨∇ϕ, f ⟩ + ϕ⟨∇, f ⟩ = ⟨∇ϕ, f ⟩ + ϕ divf . If n = 3, then
curl (ϕf ) = ∇ × (ϕf ) = ∇ϕ × f + ϕ(∇ × f ) = ∇ϕ × f + ϕ curlf .
(iii) Assume n = 3, and let f, g : U → R3 be C 1 -functions. Then
div(f × g) = ⟨∇, f × g⟩ = ⟨g, ∇ × f ⟩ − ⟨f, ∇ × g⟩ = ⟨g, curlf ⟩ − ⟨f, curlg⟩, and
curl(f × g) = ∇ × (f × g) = ⟨g, ∇⟩f − ⟨f, ∇⟩g + ⟨∇, g⟩f − ⟨∇, f ⟩g
= ⟨g, ∇⟩f − ⟨f, ∇⟩g + (divg)f − (divf )g.
[146] Let U ⊂ R3 be open. (i) If F : U → R is a C2-function and f = ∇F, then curlf ≡ 0; briefly, curl(grad) ≡ 0. (ii) If g : U → R3 is a C2-function, then div(curlg) ≡ 0; briefly, div(curl) ≡ 0.
Proof. (i) Suppose f = ∇F , and consider a ∈ U . Then Jf (a) is equal to the transpose of the
Hessian matrix HF (a). But F is a C 2 -function (since f is C 1 ), and hence HF (a) is symmetric by
[110]. Thus Jf (a) is symmetric, which implies (curlf )(a) = 0 by the definition of curlf .
(ii) Use the equality of second order mixed partial derivatives of g given by [110].
Remark: (i) Compare [146](i) with [139](i) and [139](ii). Another way to understand [146](i) is: if U is connected and f = ∇F, then ∫_α f = 0 by [138] for every piecewise C1 closed path α in U, and therefore the 'circulation density' of f is zero everywhere in U; so curlf ≡ 0.
(ii) At a formal level, [146](ii) says ⟨∇, ∇ × g⟩ = 0, which is similar to the fact ⟨u, u × v⟩ = 0.
Further intuition about [146](ii) is given by Gauss’ divergence theorem [149] (see Exercise-51).
Example: (i) Let f : R3 → R3 be f(x, y, z) = (x, y, z). This represents a flow originating from (0, 0, 0) and spreading outwards with increasing speed. There is no rotation involved. We see that divf ≡ 3 and curlf ≡ 0 ∈ R3.
(ii) Let f : R3 → R3 be f (x, y, z) = (−y, x, 0), which is a rotation around the z-axis. There is
neither expansion nor compression. We see that divf ≡ 0 and (curlf )(x, y, z) = (0, 0, 2).
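The divergence and the curl in these two examples can be recovered by central finite differences; the helpers below (and the sample point p) are our own sketch, not from the notes:

```python
def partial(f, i, p, h=1e-5):
    """Central-difference partial derivative of a field f : R^3 -> R^3
    with respect to the i-th coordinate, at the point p."""
    q1, q2 = list(p), list(p)
    q1[i] += h
    q2[i] -= h
    return [(a - b) / (2 * h) for a, b in zip(f(*q1), f(*q2))]

def div(f, p):
    return sum(partial(f, i, p)[i] for i in range(3))

def curl(f, p):
    d = [partial(f, i, p) for i in range(3)]    # d[i][j] = d f_j / d x_i
    return (d[1][2] - d[2][1], d[2][0] - d[0][2], d[0][1] - d[1][0])

expanding = lambda x, y, z: (x, y, z)       # example (i)
rotating = lambda x, y, z: (-y, x, 0.0)     # example (ii)

p = (0.3, -1.2, 0.7)                        # arbitrary sample point
print(div(expanding, p), curl(expanding, p))   # ≈ 3 and (0, 0, 0)
print(div(rotating, p), curl(rotating, p))     # ≈ 0 and (0, 0, 2)
```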
The converse parts of [146](i) and [146](ii) are true in certain special cases:
Proof. (i) The implication ⇒ is given by [146](i). And the reverse implication follows from [140]
because Jf (a) is symmetric for every a ∈ U if curlf ≡ 0.
(ii) The implication ⇒ is given by [146](ii). For the reverse implication, see Theorem 12.5 of
Apostol, Calculus-II (left as a reading assignment).
The result analogous to Green’s theorem in dimension 3 is called Stokes’ theorem. Roughly
speaking, it says that if P : X ⊂ R2 → R3 is a parametric surface, then the net amount of rotation
of a flow tangential to P (X) is equal to the net amount of anticlockwise flow along the boundary
of P (X). A technical point: if necessary assume P is defined in a neighborhood of X.
Proof. Write f = (f1, f2, f3). Then f = (f1, 0, 0) + (0, f2, 0) + (0, 0, f3). Since the curl and the integrals (the surface integral ∫_P and the line integral ∫_{P◦α}) are linear operators, it suffices to prove the result for each of the three functions (f1, 0, 0), (0, f2, 0), (0, 0, f3) separately. We will prove the result for the function (0, 0, f3); the proofs for the other two functions are similar. So assume f = (0, 0, f3) for the rest of the proof. We will use Green's theorem in the proof.
⟨(curlf)(P(a)), ∂P/∂x(a) × ∂P/∂y(a)⟩ = (∂g2/∂x − ∂g1/∂y)(a).
Hence (*) implies that ∫_P curlf = ∫_X ⟨(curlf) ◦ P, ∂P/∂x × ∂P/∂y⟩ = ∫_X (∂g2/∂x − ∂g1/∂y) = ∫_{P◦α} f.
Remark: (i) Stokes’ theorem can be extended to ‘surfaces’ obtained by ‘pasting together’ finitely
many (images of) parametric surfaces provided on any common boundary of two distinct parametric
surfaces, the line integrals are in opposite directions and cancel each other. On the other hand,
Stokes’ theorem cannot be extended to ‘non-orientable’ surfaces such as the Mobius band (which
has only ‘one side’). For more details on this topic, see Section 12.8 of Apostol, Calculus-II.
(ii) Stokes' theorem provides an insight into the identity curl(grad) ≡ 0 as follows. Suppose f = ∇F. Then by taking β = P ◦ α in [148], we see ∫_P curlf = ∫_β f = F(β(b)) − F(β(a)) = 0 by [137](i) since β is a closed path. As P is an arbitrary parametric surface, we may deduce that curlf ≡ 0.
Exercise-50: (i) Let r > 0 and S ⊂ R3 be the sphere with radius r centered at the origin, with upper half S1 and lower half S2. Let X = {(x, y) ∈ R2 : x2 + y2 ≤ r2}, and P, Q : X → R3 be P(x, y) = (x, y, ϕ(x, y)) and Q(x, y) = (x, −y, −ϕ(x, y)), where ϕ(x, y) = √(r2 − x2 − y2). Then P parametrizes S1, Q parametrizes S2, and the common boundary ∂S1 ∩ ∂S2 (which is the equator of S) is parametrized by P and Q in opposite directions (because of the minus sign for y in the expression for Q). Moreover, ∂P/∂x × ∂P/∂y = (−∂ϕ/∂x, −∂ϕ/∂y, 1) and ∂Q/∂x × ∂Q/∂y = (−∂ϕ/∂x, ∂ϕ/∂y, −1) give outward normals to S1 and S2 respectively (in the first case, the z-coordinate is positive, which indicates an upward normal; and in the second case, the z-coordinate is negative, which indicates a downward normal). Hence if U ⊂ R3 is an open neighborhood of S and f : U → R3 is a C1-function, then ∫_S f = ∫_P f + ∫_Q f.
(ii) Let U ⊂ R3 be open, f : U → R3 be a C1-function, and suppose f = curlg for some g : U → R3. Then ∫_S f = 0 for any sphere S ⊂ U when the integral ∫_S f is considered with respect to the outward unit normal to S.
[Hint: (ii) For simplicity assume S = {(x, y, z) ∈ R3 : x2 + y2 + z2 = r2}. Write S = S1 ∪ S2, where S1 is the upper half and S2 is the lower half of S. Let P, Q be as in part (i). Since the common boundary ∂S1 ∩ ∂S2 is parametrized in opposite directions by P and Q, and since ∂P/∂x × ∂P/∂y and ∂Q/∂x × ∂Q/∂y give outward normals to S1 and S2 respectively, we may extend Stokes' theorem to S for the function g and conclude ∫_S f = ∫_{S1} f + ∫_{S2} f = ∫_P curlg + ∫_Q curlg = ∫_{∂S1} g + ∫_{∂S2} g = 0.]
Similarly, ∫_Q f = ∫_X ⟨f ◦ Q, ∂Q/∂x × ∂Q/∂y⟩
= ∫_X ⟨(x, −y, −√(1 − x2 − y2)), (x/√(1 − x2 − y2), −y/√(1 − x2 − y2), −1)⟩ = ∫_X 1/√(1 − x2 − y2) = 2π.
Both ∂P/∂x × ∂P/∂y and ∂Q/∂x × ∂Q/∂y give outward normals to the unit sphere; see Exercise-50(i). Hence ∫_S f = ∫_{S1} f + ∫_{S2} f = ∫_P f + ∫_Q f = 4π ≠ 0. Thus we conclude by Exercise-50(ii) that f ≠ curlg
for any C 2 -function g : U → R3 . (Remark: In this example, the failure of the existence of any g
with curlg = f is essentially due to the fact that the open set U has a ‘hole’).
Remark: In the theorems of Green, Stokes, and Gauss, we have an equality between an integral over an n-dimensional region (n = 2 or n = 3) and an integral over the boundary of the region. In this sense, all these three theorems can be thought of as generalizations of the Fundamental theorem of calculus (which says ∫_a^b f(x)dx = F(b) − F(a) if F′ = f and f is continuous).
Proof. Write f = (f1 , f2 , f3 ). Then f = (f1 , 0, 0) + (0, f2 , 0) + (0, 0, f3 ). Since div and the integrals
are linear operators, it is enough to prove the result for each of the three functions (f1 , 0, 0),
(0, f2 , 0), (0, 0, f3 ) separately. We will prove the result for the function (0, 0, f3 ); and the proofs for
the other two functions are similar. So assume f = (0, 0, f3 ) for the rest of the proof.
As per definition, the elementary solid V has three representations, out of which choose the
following representation: V = {(x, y, z) ∈ R3 : (x, y) ∈ X and ϕ(x, y) ≤ z ≤ ψ(x, y)}, where
X ⊂ R2 is a compact connected set bounded by a piecewise C1 simple closed path, and ϕ, ψ : X → R
are C 1 -functions with ϕ < ψ on int(X). Since f = (0, 0, f3 ), we get by Fubini’s theorem that
∫_V divf = ∫_V ∂f3/∂z = ∫_{(x,y)∈X} (∫_{ϕ(x,y)}^{ψ(x,y)} (∂f3/∂z) dz) d(x, y) = ∫_X (f3(x, y, ψ(x, y)) − f3(x, y, ϕ(x, y)))  (*)
Let P, Q : X → U be P (x, y) = (x, y, ψ(x, y)) and Q(x, y) = (x, y, ϕ(x, y)). Then V is the region
between the images of the parametric surfaces P (upper part) and Q (lower part). We may write
∂V = P (X) ∪ S ∪ Q(X), where S is the part of ∂V between P (X) and Q(X). Note the following:
(i) ∂P/∂x × ∂P/∂y = (−∂ψ/∂x, −∂ψ/∂y, 1), which is an outward normal to the upper part P(X) of ∂V because the z-coordinate is positive (upward).
(ii) ∂Q/∂x × ∂Q/∂y = (−∂ϕ/∂x, −∂ϕ/∂y, 1), which is an inward normal to the lower part Q(X) of ∂V because the z-coordinate is positive (upward).
(iii) Any outward normal to the 'middle part' S of ∂V is parallel to the xy-plane and hence has the z-coordinate zero. Consequently, ∫_S f = 0 because f = (0, 0, f3) by assumption.
By the above observations, the value of ∫_∂V f with respect to the outward unit normal is:
∫_∂V f = ∫_P f − ∫_Q f = ∫_X ⟨f ◦ P, ∂P/∂x × ∂P/∂y⟩ − ∫_X ⟨f ◦ Q, ∂Q/∂x × ∂Q/∂y⟩ = ∫_X ((f3 ◦ P) × 1) − ∫_X ((f3 ◦ Q) × 1)
= ∫_X (f3(x, y, ψ(x, y)) − f3(x, y, ϕ(x, y))) = ∫_V divf by (*).
Remark: Gauss’ divergence theorem can be extended to more general 3-dimensional solid regions
which are formed by ‘pasting together’ finitely many elementary solids V1 , . . . , Vk provided on any
shared boundary of Vi and Vj (for i ̸= j), the outward unit normals are in opposite directions (so
that the respective parts of surface integrals over ∂Vi and ∂Vj cancel each other).
Exercise-51: Derive 'div(curl) ≡ 0' using Gauss' divergence theorem. [Hint: Let U ⊂ R3 be open, f : U → R3 be a C1-function, and assume f = curlg for some g : U → R3. If (divf)(u) ≠ 0 for some u ∈ U, assume (divf)(u) > 0. Choose δ > 0 and a small solid sphere V ⊂ U centered at u such that divf ≥ δ in V. Then ∫_V divf ≥ δµ(V) > 0. Hence by [149], ∫_∂V f = ∫_V divf > 0. On the other hand, ∫_∂V f = 0 by Exercise-50(ii), a contradiction.]
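The identity div(curl) ≡ 0 can also be observed numerically: differentiate a hand-computed curl by central differences at a few sample points (the field g(x, y, z) = (sin(yz), x2z, exy) and the sample points are hypothetical choices of ours):

```python
import math

# curl of the hypothetical field g(x,y,z) = (sin(yz), x^2 z, e^x y),
# with the three components differentiated by hand:
def curl_g(x, y, z):
    return (math.exp(x) - x * x,                       # dg3/dy - dg2/dz
            y * math.cos(y * z) - math.exp(x) * y,     # dg1/dz - dg3/dx
            2 * x * z - z * math.cos(y * z))           # dg2/dx - dg1/dy

def div_numeric(f, p, h=1e-5):
    """Central-difference approximation of (div f)(p) for f : R^3 -> R^3."""
    s = 0.0
    for i in range(3):
        q1, q2 = list(p), list(p)
        q1[i] += h
        q2[i] -= h
        s += (f(*q1)[i] - f(*q2)[i]) / (2 * h)
    return s

points = [(0.3, -1.2, 0.7), (1.0, 0.5, -0.4), (-0.8, 0.1, 2.0)]
vals = [div_numeric(curl_g, p) for p in points]
print(vals)   # all close to 0
```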
*****