MHZ4553: Unit III Session 8: Applications of Differentiation of several variables

Session 8
Applications of Differentiation of
several variables

Contents:
Introduction
8.1 Maximum and Minimum Values
8.2 Convex and Concave Functions
8.3 Classifying Quadratic Forms
8.4 Unconstrained Maximization and Minimization
8.5 Lagrange Multipliers
Solutions of Activities
Summary
Learning Outcomes

Introduction

This session consists entirely of applications of the theory of functions of several variables. In the real world, we sometimes need to find the extreme value(s) (in other words, the optimal value(s)) of a function. We refer to this process as optimization.

There are two types of optimization problems. The first type is unconstrained optimization: no constraints are imposed on the domain of the function. The second type is constrained optimization: constraints are imposed on the domain of the function, and the goal is to find the extreme values of the function subject to these constraints. In such cases we use the so-called method of Lagrange multipliers. In this session we deal with both types of optimization problems.

Copyright © 2018, The Open University of Sri Lanka

8.1 Maximum and Minimum Values

Definition
Let $f$ be a real-valued function defined on some $D \subseteq \mathbb{R}^n$ and let $\mathbf{x}_0 \in D$.
i. The function $f$ is said to have a global maximum (or absolute maximum) at $\mathbf{x}_0$ if $f(\mathbf{x}_0) \ge f(\mathbf{x})$ for each $\mathbf{x} \in D$. The number $f(\mathbf{x}_0)$ is called the maximum value of $f$ on $D$.

ii. The function $f$ is said to have a global minimum (or absolute minimum) at $\mathbf{x}_0$ if $f(\mathbf{x}_0) \le f(\mathbf{x})$ for each $\mathbf{x} \in D$. The number $f(\mathbf{x}_0)$ is called the minimum value of $f$ on $D$.

iii. The function $f$ is said to have a local maximum (or relative maximum) at $\mathbf{x}_0$ if there exists an open ball $B(\mathbf{x}_0, \delta)$, where $\delta > 0$, such that $f(\mathbf{x}_0) \ge f(\mathbf{x})$ for each $\mathbf{x} \in B(\mathbf{x}_0, \delta) \cap D$. The number $f(\mathbf{x}_0)$ is called the maximum value of $f$ on $B(\mathbf{x}_0, \delta)$.

iv. The function $f$ is said to have a local minimum (or relative minimum) at $\mathbf{x}_0$ if there exists an open ball $B(\mathbf{x}_0, \delta)$, where $\delta > 0$, such that $f(\mathbf{x}_0) \le f(\mathbf{x})$ for each $\mathbf{x} \in B(\mathbf{x}_0, \delta) \cap D$. The number $f(\mathbf{x}_0)$ is called the minimum value of $f$ on $B(\mathbf{x}_0, \delta)$.

v. In each of the above cases, the function $f$ is said to have a strict global/local maximum or a strict global/local minimum at $\mathbf{x}_0$ if the inequality is strict for each $\mathbf{x} \ne \mathbf{x}_0$.

Note

$f$ is said to have a global extremum (or absolute extremum) at $\mathbf{x}_0$ whenever $f$ has either a global maximum or a global minimum at $\mathbf{x}_0$. Similarly, $f$ is said to have a local extremum (or relative extremum) at $\mathbf{x}_0$ whenever $f$ has either a local maximum or a local minimum at $\mathbf{x}_0$. Thus, an extremum of $f$ can be either a maximum (global or local) or a minimum (global or local) of $f$.

Taken together, we refer to the global maxima (plural of maximum) and global minima (plural of minimum) of a function $f$ as the global extrema (plural of extremum) or absolute extrema of $f$. Thus, the global extrema of $f$ occur at those points in the domain of $f$ at which $f$ has either a global maximum or a global minimum.

Similarly, we refer to the local maxima and local minima of $f$ as the local extrema or relative extrema of $f$. Thus, the local extrema of $f$ occur at those points in the domain of $f$ at which $f$ has either a local maximum or a local minimum.

The following theorem gives a necessary condition for a function $f$ to have a local extremum at an interior point $\mathbf{x}_0$ of the domain of $f$.

Theorem 8.1

Let $f(x_1, x_2, \dots, x_n)$ be a real-valued function defined on some set $D \subseteq \mathbb{R}^n$ and let $\mathbf{x}_0 \in D$ be an interior point of $D$. If the function $f$ has a local extremum at $\mathbf{x}_0$ and all the first-order partial derivatives of $f$ exist at $\mathbf{x}_0$, then $\frac{\partial f}{\partial x_i}(\mathbf{x}_0) = 0$ for each $i = 1, 2, \dots, n$.

Remark

It is possible that one or more of these partial derivatives do not exist at $\mathbf{x}_0$ and yet $f$ has an extremum at $\mathbf{x}_0$ (compare with $|x|$ defined on the interval $[-1, 1]$).


Figure 8.1: |𝑥| is not differentiable at 0. But, |𝑥| has a strict global
minimum at 0.

We can use this theorem to locate the points, if any, in the interior of the domain of a function $f$ at which $f$ has a local extremum, provided that all the first-order partial derivatives of $f$ exist at those points (compare this with the analogous result in the one-variable case).

The converse of the above theorem, however, is not always true. It is possible for all the first-order partial derivatives of a function $f$ to be equal to 0 at some point $\mathbf{x}$ and yet for $f$ to have neither a local maximum nor a local minimum at $\mathbf{x}$. The following example illustrates this situation.

Example 8.1
Let $f(x, y) = x^2 - y^2$.

Notice that
$$\frac{\partial f}{\partial x}(0, 0) = 0 \quad \text{and} \quad \frac{\partial f}{\partial y}(0, 0) = 0.$$

Let us show that $f$ has neither a local maximum nor a local minimum at $(0, 0)$. To this end, let $\delta > 0$.

Put $\mathbf{x} = \left(\frac{\delta}{2}, 0\right)$ and $\mathbf{y} = \left(0, \frac{\delta}{2}\right)$. Then $\mathbf{x}, \mathbf{y} \in B(\mathbf{0}, \delta)$.

Notice that $f(\mathbf{x}) = \left(\frac{\delta}{2}\right)^2 - 0 = \frac{\delta^2}{4} > 0 = f(0, 0)$ and hence $f$ does not have a local maximum at $(0, 0)$.

Also, $f(\mathbf{y}) = 0 - \left(\frac{\delta}{2}\right)^2 = -\frac{\delta^2}{4} < 0 = f(0, 0)$ and hence $f$ does not have a local minimum at $(0, 0)$. See Figure 8.2.

Figure 8.2: $z = x^2 - y^2$ has no extremum at $(0, 0, 0)$.
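The argument of Example 8.1 can be spot-checked numerically. The following is a minimal sketch, not part of the original notes; the helper name `witness_values` and the sample radii are illustrative assumptions.

```python
def f(x, y):
    """The function of Example 8.1."""
    return x**2 - y**2


def witness_values(delta):
    """Values of f at the two points (delta/2, 0) and (0, delta/2) used in Example 8.1."""
    return f(delta / 2, 0.0), f(0.0, delta / 2)


for delta in (1.0, 0.1, 0.001):
    above, below = witness_values(delta)
    # above equals delta^2/4 and below equals -delta^2/4, however small delta is
    print(delta, above, below)
```

For every radius tried, one value lies above $f(0, 0) = 0$ and the other below it, which is the behaviour the example proves for all $\delta > 0$.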

Note

A point $\mathbf{x}$ in the domain of $f$ is said to be a critical point or stationary point of $f$ if $\frac{\partial f}{\partial x_i}(\mathbf{x}) = 0$ for each $i = 1, 2, \dots, n$, or if at least one of these partial derivatives does not exist at $\mathbf{x}$. Thus, any interior point of the domain at which $f$ has a local extremum is a critical point of $f$. However, as in single-variable calculus, not all critical points give rise to extrema. At a critical point, a function could have a local maximum, a local minimum, or neither (cf. Example 8.1).


Definition
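A critical point $\mathbf{x}_0$ is called a saddle point if for each $\epsilon > 0$ there exist $\mathbf{x}, \mathbf{y} \in B(\mathbf{x}_0, \epsilon)$ such that $f(\mathbf{x}) > f(\mathbf{x}_0)$ and $f(\mathbf{y}) < f(\mathbf{x}_0)$.
In Example 8.1, the point $(0, 0)$ is a saddle point.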

Theorem 8.2

Second Derivative Test


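Let $f(x, y)$ be a real-valued function defined on some open set $D \subseteq \mathbb{R}^2$ and let $\mathbf{x}_0 \in D$. Suppose that $f$ has continuous second-order partial derivatives on an open ball $B(\mathbf{x}_0, \delta) \subseteq D$, where $\delta > 0$, and that $\frac{\partial f}{\partial x}(\mathbf{x}_0) = \frac{\partial f}{\partial y}(\mathbf{x}_0) = 0$.
Let $\Delta(\mathbf{x}_0) = f_{xx}(\mathbf{x}_0) f_{yy}(\mathbf{x}_0) - [f_{xy}(\mathbf{x}_0)]^2$.
a. If $\Delta(\mathbf{x}_0) > 0$ and $f_{xx}(\mathbf{x}_0) > 0$, then $f$ has a local minimum at $\mathbf{x}_0$.
b. If $\Delta(\mathbf{x}_0) > 0$ and $f_{xx}(\mathbf{x}_0) < 0$, then $f$ has a local maximum at $\mathbf{x}_0$.
c. If $\Delta(\mathbf{x}_0) < 0$, then $f$ does not have a local extremum at $\mathbf{x}_0$.
d. If $\Delta(\mathbf{x}_0) = 0$, then the test is inconclusive.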
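The four cases of the test translate directly into a small decision procedure. The sketch below is only an illustration; the function name `second_derivative_test` and its return strings are assumptions, and the caller is expected to supply the three second-order partial derivatives evaluated at a critical point.

```python
def second_derivative_test(fxx, fyy, fxy):
    """Apply cases a-d of Theorem 8.2 to the second partials at a critical point."""
    delta = fxx * fyy - fxy**2
    if delta > 0 and fxx > 0:
        return "local minimum"        # case a
    if delta > 0 and fxx < 0:
        return "local maximum"        # case b
    if delta < 0:
        return "no local extremum"    # case c
    return "inconclusive"             # case d: delta == 0


# Second partials of x^2 - y^2 (Example 8.1) at (0, 0): fxx = 2, fyy = -2, fxy = 0
print(second_derivative_test(2, -2, 0))
```

Note that $\Delta > 0$ forces $f_{xx} \ne 0$, so the four branches cover every possibility.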

Note

In case c. of the above theorem, the point $\mathbf{x}_0$ is a saddle point of $f$. Can you explain why the above test is inconclusive when $\Delta(\mathbf{x}_0) = 0$? Try to give examples!


Figure 8.3

Figure 8.4

Example 8.2
Find the critical points of the following functions and determine their nature.
a. $f(x, y) = x^2 + y^2 + 2x - 4y + 3$
b. $f(x, y) = y^2 - x^2$

Solution


a. Notice that $(a, b)$ is a critical point of $f$ if and only if
$\frac{\partial f}{\partial x}(a, b) = \frac{\partial f}{\partial y}(a, b) = 0$ if and only if
$2a + 2 = 0$ and $2b - 4 = 0$ if and only if
$a = -1$ and $b = 2$.
So, $(-1, 2)$ is the only critical point of $f$.
Now,
$$\Delta(-1, 2) = f_{xx}(-1, 2) f_{yy}(-1, 2) - [f_{xy}(-1, 2)]^2 = 2 \cdot 2 - 0 = 4 > 0$$
and
$f_{xx}(-1, 2) = 2 > 0$.
Thus $f$ has a local minimum at $(-1, 2)$.

b. Notice that $(a, b)$ is a critical point of $f$ if and only if
$\frac{\partial f}{\partial x}(a, b) = \frac{\partial f}{\partial y}(a, b) = 0$ if and only if
$-2a = 0$ and $2b = 0$ if and only if
$a = b = 0$.

So, $f$ has only one critical point, namely $(0, 0)$.

Notice that
$$\Delta(0, 0) = f_{xx}(0, 0) \cdot f_{yy}(0, 0) - [f_{xy}(0, 0)]^2 = (-2)(2) - 0 = -4 < 0.$$
Thus $(0, 0)$ is a saddle point of $f$.
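The values of $\Delta$ found in this solution can be cross-checked without differentiating by hand. The sketch below approximates the second-order partial derivatives by central differences; the helper names and the step size `h` are assumptions chosen for illustration, and the results are approximate rather than exact.

```python
def second_partials(f, a, b, h=1e-3):
    """Central-difference approximations of f_xx, f_yy and f_xy at (a, b)."""
    fxx = (f(a + h, b) - 2 * f(a, b) + f(a - h, b)) / h**2
    fyy = (f(a, b + h) - 2 * f(a, b) + f(a, b - h)) / h**2
    fxy = (f(a + h, b + h) - f(a + h, b - h)
           - f(a - h, b + h) + f(a - h, b - h)) / (4 * h**2)
    return fxx, fyy, fxy


def discriminant(f, a, b):
    """Return (Delta, f_xx) at (a, b), as used in the second derivative test."""
    fxx, fyy, fxy = second_partials(f, a, b)
    return fxx * fyy - fxy**2, fxx


f_a = lambda x, y: x**2 + y**2 + 2 * x - 4 * y + 3   # part a
f_b = lambda x, y: y**2 - x**2                       # part b

print(discriminant(f_a, -1.0, 2.0))  # close to (4, 2): local minimum
print(discriminant(f_b, 0.0, 0.0))   # Delta close to -4: saddle point
```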

8.2 Convex and Concave Functions


Definition

A set $S \subseteq \mathbb{R}^n$ is said to be convex if for each $\mathbf{x}', \mathbf{x}'' \in S$ and for each $\lambda \in [0, 1]$, $\lambda\mathbf{x}' + (1 - \lambda)\mathbf{x}'' \in S$.

Definition
Let $f(x_1, \dots, x_n)$ be a function defined on a convex set $S \subseteq \mathbb{R}^n$. Then $f$ is said to be a convex function if for all $\mathbf{x}', \mathbf{x}'' \in S$ and for all $0 \le \lambda \le 1$,
$$f(\lambda\mathbf{x}' + (1 - \lambda)\mathbf{x}'') \le \lambda f(\mathbf{x}') + (1 - \lambda) f(\mathbf{x}'').$$

Example 8.2

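Show that the function $f(x_1, x_2) = x_1^2 + x_2^2$ is a convex function.

Solution:

Let $\mathbf{x}' = (x_1', x_2')$ and $\mathbf{x}'' = (x_1'', x_2'')$ be points of $\mathbb{R}^2$ and let $0 \le \lambda \le 1$.

Want: $f(\lambda\mathbf{x}' + (1 - \lambda)\mathbf{x}'') \le \lambda f(\mathbf{x}') + (1 - \lambda) f(\mathbf{x}'')$.

$$\text{L.H.S.} = f(\lambda(x_1', x_2') + (1 - \lambda)(x_1'', x_2''))$$
$$= f(\lambda x_1' + (1 - \lambda)x_1'',\ \lambda x_2' + (1 - \lambda)x_2'')$$
$$= [\lambda x_1' + (1 - \lambda)x_1'']^2 + [\lambda x_2' + (1 - \lambda)x_2'']^2.$$

Since the single-variable function $g(t) = t^2$ is convex,
$$g(\lambda x_1' + (1 - \lambda)x_1'') \le \lambda g(x_1') + (1 - \lambda) g(x_1''),$$
that is,
$$[\lambda x_1' + (1 - \lambda)x_1'']^2 \le \lambda (x_1')^2 + (1 - \lambda)(x_1'')^2.$$
Similarly,
$$[\lambda x_2' + (1 - \lambda)x_2'']^2 \le \lambda (x_2')^2 + (1 - \lambda)(x_2'')^2.$$

Hence
$$\text{L.H.S.} \le \lambda (x_1')^2 + (1 - \lambda)(x_1'')^2 + \lambda (x_2')^2 + (1 - \lambda)(x_2'')^2$$
$$= \lambda[(x_1')^2 + (x_2')^2] + (1 - \lambda)[(x_1'')^2 + (x_2'')^2]$$
$$= \lambda f(\mathbf{x}') + (1 - \lambda) f(\mathbf{x}'').$$
$\therefore f(\lambda\mathbf{x}' + (1 - \lambda)\mathbf{x}'') \le \lambda f(\mathbf{x}') + (1 - \lambda) f(\mathbf{x}'')$.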
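A random spot check cannot replace the proof above, but it is a quick way to catch algebra slips. The sketch below samples pairs of points and values of $\lambda$ and records the smallest gap between the two sides of the convexity inequality; the helper name `convexity_gap`, the sampling range and the fixed seed are illustrative assumptions.

```python
import random


def f(x1, x2):
    """The function shown to be convex in the example above."""
    return x1**2 + x2**2


def convexity_gap(xp, xpp, lam):
    """Right-hand side minus left-hand side of the convexity inequality."""
    mix = (lam * xp[0] + (1 - lam) * xpp[0], lam * xp[1] + (1 - lam) * xpp[1])
    return lam * f(*xp) + (1 - lam) * f(*xpp) - f(*mix)


random.seed(0)  # fixed seed so that the check is reproducible
worst = min(
    convexity_gap(
        (random.uniform(-10, 10), random.uniform(-10, 10)),
        (random.uniform(-10, 10), random.uniform(-10, 10)),
        random.random(),
    )
    for _ in range(10_000)
)
# The gap should never be negative beyond floating-point rounding
print(worst >= -1e-9)
```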

Definition


Let $f(x_1, \dots, x_n)$ be a function defined on a convex set $S \subseteq \mathbb{R}^n$. Then $f$ is said to be a concave function if for all $\mathbf{x}', \mathbf{x}'' \in S$ and for all $0 \le \lambda \le 1$,
$$f(\lambda\mathbf{x}' + (1 - \lambda)\mathbf{x}'') \ge \lambda f(\mathbf{x}') + (1 - \lambda) f(\mathbf{x}'').$$

Remark

Let $f(x_1, \dots, x_n)$ be a function defined on a convex set $S \subseteq \mathbb{R}^n$. Then,
i. $f(x_1, \dots, x_n)$ is a convex function if and only if $-f(x_1, \dots, x_n)$ is a concave function.
ii. $f(x_1, \dots, x_n)$ is a concave function if and only if $-f(x_1, \dots, x_n)$ is a convex function.

Example 8.3
• $f(x) = x^2$ and $f(x) = e^x$ are convex functions on $\mathbb{R}$.
• A linear function is both a convex and a concave function.
• The sum of two convex (concave) functions is convex (concave).
• If $f(x_1, \dots, x_n)$ is a convex function on a convex set $S$, then for $c \ge 0$, $cf(x_1, \dots, x_n)$ is a convex function on $S$. If $c \le 0$, $cf(x_1, \dots, x_n)$ is a concave function on $S$.
• $\sin(x + y)$ and $\cos(x + y)$ are neither convex nor concave on $\mathbb{R}^2$.
• $-x^2 - y^2$ is concave on $\mathbb{R}^2$.
• $f(x, y) = x^2 y^2$ is neither convex nor concave on $\mathbb{R}^2$.
• $f(x, y) = (x + y)^2$ is convex on $\mathbb{R}^2$.


Figure 8.5 Figure 8.6

Figure 8.7

8.2.1 Test for Convex and Concave functions

Definition
An $i$th principal minor of an $n \times n$ matrix is the determinant of any $i \times i$ matrix obtained by deleting $n - i$ rows and the corresponding $n - i$ columns of the matrix.

Example 8.4
Find all the 1st, 2nd and 3rd principal minors of the matrix
$$\begin{pmatrix} 1 & 1 & 2 \\ 1 & 1 & 3 \\ 2 & 3 & 2 \end{pmatrix}.$$
Solution
Notice that there are three first principal minors, three second principal minors and one third principal minor. Let us denote the $i$th row of the matrix by $R_i$ and the $i$th column of the matrix by $C_i$. Then, the three first principal minors of the matrix are obtained by deleting:
i. $R_2, R_3, C_2, C_3$
ii. $R_1, R_3, C_1, C_3$ and
iii. $R_1, R_2, C_1, C_2$.
The three second principal minors of the matrix are obtained by deleting:
i. $R_1, C_1$
ii. $R_2, C_2$ and
iii. $R_3, C_3$.
The third principal minor of the matrix is just the determinant of the matrix.
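The enumeration in Example 8.4 can be carried out mechanically by choosing which row and column indices to keep. The following sketch is an illustration only; the helper names `det` and `principal_minors` are assumptions, and the cofactor expansion is meant for small matrices such as this one.

```python
from itertools import combinations


def det(m):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        sub = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(sub)
    return total


def principal_minors(m, k):
    """All k-th principal minors: keep the same k row and column indices."""
    n = len(m)
    return [det([[m[i][j] for j in keep] for i in keep])
            for keep in combinations(range(n), k)]


A = [[1, 1, 2], [1, 1, 3], [2, 3, 2]]
for k in (1, 2, 3):
    # For k = 2 the kept index pairs are (1,2), (1,3), (2,3) in 1-based numbering,
    # i.e. the minors obtained by deleting R3,C3, then R2,C2, then R1,C1.
    print(k, principal_minors(A, k))
```

Keeping $k$ indices is the same as deleting the other $n - k$ rows and the corresponding columns, which is how the definition above is phrased.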

Definition
The $k$th leading principal minor of an $n \times n$ matrix is the determinant of the $k \times k$ matrix obtained by deleting the last $n - k$ rows and columns of the matrix.

Definition
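Let $f(x_1, \dots, x_n)$ be a function defined on an open subset $U$ of $\mathbb{R}^n$ and let $\mathbf{a} \in U$. Suppose $f$ has first-order partial derivatives on $U$ with respect to each variable $x_i$ and that each second-order partial derivative of $f$ exists at $\mathbf{a}$. Then the Hessian of $f$ at $\mathbf{a}$, denoted by $Hf(\mathbf{a})$, is the $n \times n$ matrix
$$Hf(\mathbf{a}) = (h_{ij}(\mathbf{a})) = \left(\frac{\partial^2 f}{\partial x_i \partial x_j}(\mathbf{a})\right) = \begin{pmatrix} \frac{\partial^2 f}{\partial x_1 \partial x_1}(\mathbf{a}) & \frac{\partial^2 f}{\partial x_1 \partial x_2}(\mathbf{a}) & \cdots & \frac{\partial^2 f}{\partial x_1 \partial x_n}(\mathbf{a}) \\ \frac{\partial^2 f}{\partial x_2 \partial x_1}(\mathbf{a}) & \frac{\partial^2 f}{\partial x_2 \partial x_2}(\mathbf{a}) & \cdots & \frac{\partial^2 f}{\partial x_2 \partial x_n}(\mathbf{a}) \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial^2 f}{\partial x_n \partial x_1}(\mathbf{a}) & \frac{\partial^2 f}{\partial x_n \partial x_2}(\mathbf{a}) & \cdots & \frac{\partial^2 f}{\partial x_n \partial x_n}(\mathbf{a}) \end{pmatrix}.$$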

$Hf(\mathbf{x})$ is a function of $\mathbf{x} = (x_1, \dots, x_n)$ and plays an important role in the theory of optimization because it can be used to classify the critical points of a function. Notice that if $f$ has continuous second-order partial derivatives on $U$, then $Hf(\mathbf{a})$ is symmetric (Theorem 3.1).

If $n = 2$ and $f$ has continuous second-order partial derivatives on $U$, the determinant of $Hf(\mathbf{a})$ equals $\Delta(\mathbf{a}) = f_{x_1 x_1}(\mathbf{a}) f_{x_2 x_2}(\mathbf{a}) - [f_{x_1 x_2}(\mathbf{a})]^2$, which is used in the second derivative test for functions of two variables.

Theorem 8.3

Let $S$ be an open set which is also convex. Suppose $f(x_1, \dots, x_n)$ has continuous second-order partial derivatives at each point $\mathbf{x} = (x_1, \dots, x_n) \in S$. Then $f(x_1, \dots, x_n)$ is a convex function on $S$ if and only if for each $\mathbf{x} \in S$, all the principal minors of $Hf(\mathbf{x})$ are nonnegative.

Theorem 8.4

Let $S$ be an open set which is also convex. Suppose $f(x_1, \dots, x_n)$ has continuous second-order partial derivatives at each point $\mathbf{x} = (x_1, \dots, x_n) \in S$. Then $f(x_1, \dots, x_n)$ is a concave function on $S$ if and only if for each $\mathbf{x} \in S$ and $k = 1, 2, \dots, n$, all nonzero principal minors of $Hf(\mathbf{x})$ of order $k$ have the same sign as $(-1)^k$.
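For a quadratic function of two variables the Hessian is the same at every point, so the conditions of Theorems 8.3 and 8.4 reduce to checks on a single $2 \times 2$ matrix. The sketch below applies them to two functions listed in Example 8.3; the function names are illustrative assumptions, and the shortcut is valid only because the Hessians used here are constant.

```python
def is_convex_constant_hessian(h):
    """Theorem 8.3 for a constant 2x2 Hessian: every principal minor is >= 0."""
    (a, b), (c, d) = h
    return a >= 0 and d >= 0 and a * d - b * c >= 0


def is_concave_constant_hessian(h):
    """Theorem 8.4 for a constant 2x2 Hessian: nonzero minors of order k have the sign of (-1)^k."""
    (a, b), (c, d) = h
    return a <= 0 and d <= 0 and a * d - b * c >= 0


H_square_of_sum = [[2, 2], [2, 2]]      # Hessian of (x + y)^2
H_neg_squares = [[-2, 0], [0, -2]]      # Hessian of -x^2 - y^2

print(is_convex_constant_hessian(H_square_of_sum))   # True
print(is_concave_constant_hessian(H_neg_squares))    # True
```

For a non-quadratic function the Hessian varies with $\mathbf{x}$, and the minors would have to be checked at every point of $S$, not at one matrix.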


Activity 8.1

1. Show that $f(x_1, x_2) = x_1^2 + 2x_1 x_2 + x_2^2$ is a convex function on $\mathbb{R}^2$.

2. Show that $f(x_1, x_2) = -x_1^2 - x_1 x_2 - 2x_2^2$ is a concave function on $\mathbb{R}^2$.

3. Show that $f(x_1, x_2) = x_1^2 - 3x_1 x_2 + 2x_2^2$ is neither convex nor concave on $\mathbb{R}^2$.

4. Show that $f(x_1, x_2, x_3) = x_1^2 + x_2^2 + 2x_3^2 - x_1 x_2 - x_2 x_3 - x_1 x_3$ is a convex function.

5. What about the function $f(x_1, x_2)$ defined on $\mathbb{R}^2$ by $f(x_1, x_2) = x_1^3 + 2x_1 x_2 + x_2^2$?

8.3 Classifying Quadratic Forms

At this point we would like to introduce you to one of the most compelling areas of linear algebra, known as quadratic forms, because of its extensive use in optimization problems arising in engineering. You can find more details on this topic in any linear algebra text.

Unless stated otherwise, from now on we regard points in $\mathbb{R}^n$ as $1 \times n$ row matrices.

Definition
A function $Q: \mathbb{R}^n \to \mathbb{R}$ is said to be a quadratic form on $\mathbb{R}^n$ if there exists an $n \times n$ symmetric matrix $A$ such that $Q(\mathbf{x}) = \mathbf{x} A \mathbf{x}^T$ for each $\mathbf{x} \in \mathbb{R}^n$. The matrix $A$ is called the matrix of the quadratic form.

Example 8.5
Let $\mathbf{x} = [x_1 \ \ x_2]$. Compute $\mathbf{x} A \mathbf{x}^T$ for the matrices
$$A_1 = \begin{bmatrix} 3 & 0 \\ 0 & 5 \end{bmatrix}, \quad A_2 = \begin{bmatrix} -4 & 1 \\ 1 & 2 \end{bmatrix}.$$

Solution
When $A = A_1$,
$$\mathbf{x} A_1 \mathbf{x}^T = [x_1 \ \ x_2] \begin{bmatrix} 3 & 0 \\ 0 & 5 \end{bmatrix} [x_1 \ \ x_2]^T = [3x_1 \ \ 5x_2][x_1 \ \ x_2]^T = 3x_1^2 + 5x_2^2.$$

When $A = A_2$,
$$\mathbf{x} A_2 \mathbf{x}^T = [x_1 \ \ x_2] \begin{bmatrix} -4 & 1 \\ 1 & 2 \end{bmatrix} [x_1 \ \ x_2]^T = [-4x_1 + x_2 \ \ \ x_1 + 2x_2][x_1 \ \ x_2]^T = -4x_1^2 + 2x_1 x_2 + 2x_2^2.$$
Thus, $A_1$ and $A_2$ are the matrices of the quadratic forms $3x_1^2 + 5x_2^2$ and $-4x_1^2 + 2x_1 x_2 + 2x_2^2$, respectively.

Example 8.6
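For $\mathbf{x} \in \mathbb{R}^3$, define $Q(\mathbf{x}) = 7x_1^2 + 5x_2^2 + 3x_3^2 - 2x_1 x_2 + 6x_2 x_3$. Find a symmetric matrix $A$ such that $Q(\mathbf{x}) = \mathbf{x} A \mathbf{x}^T$ for each $\mathbf{x} \in \mathbb{R}^3$.

Solution
Let $A = \begin{bmatrix} 7 & -1 & 0 \\ -1 & 5 & 3 \\ 0 & 3 & 3 \end{bmatrix}$. Then for each $\mathbf{x} = (x_1, x_2, x_3) \in \mathbb{R}^3$,
$$[x_1 \ \ x_2 \ \ x_3] \begin{bmatrix} 7 & -1 & 0 \\ -1 & 5 & 3 \\ 0 & 3 & 3 \end{bmatrix} [x_1 \ \ x_2 \ \ x_3]^T$$
$$= [7x_1 - x_2 \ \ \ -x_1 + 5x_2 + 3x_3 \ \ \ 3x_2 + 3x_3][x_1 \ \ x_2 \ \ x_3]^T$$
$$= 7x_1^2 - 2x_1 x_2 + 5x_2^2 + 6x_2 x_3 + 3x_3^2 = Q(\mathbf{x}).$$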

Example 8.7


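For $\mathbf{x} \in \mathbb{R}^2$, let $Q(\mathbf{x}) = x_1^2 + 4x_1 x_2 + 2x_2^2$. Find a symmetric matrix $A$ such that $Q(\mathbf{x}) = \mathbf{x} A \mathbf{x}^T$ for each $\mathbf{x} \in \mathbb{R}^2$.
Solution
Let $A = \begin{bmatrix} 1 & 2 \\ 2 & 2 \end{bmatrix}$.
Then for each $\mathbf{x} = (x_1, x_2) \in \mathbb{R}^2$,
$$[x_1 \ \ x_2] \begin{bmatrix} 1 & 2 \\ 2 & 2 \end{bmatrix} [x_1 \ \ x_2]^T = [x_1 + 2x_2 \ \ \ 2x_1 + 2x_2][x_1 \ \ x_2]^T = x_1^2 + 4x_1 x_2 + 2x_2^2 = Q(\mathbf{x}).$$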
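The matrices found in Examples 8.6 and 8.7 can be checked by evaluating $\mathbf{x} A \mathbf{x}^T$ as a double sum and comparing it with the given polynomial. This is a minimal sketch; the names `quadratic_form`, `A_86`, `A_87`, `Q_86` and `Q_87` are labels introduced here for illustration.

```python
def quadratic_form(A, x):
    """Evaluate x A x^T for a row vector x and a square matrix A."""
    n = len(x)
    return sum(x[i] * A[i][j] * x[j] for i in range(n) for j in range(n))


A_86 = [[7, -1, 0], [-1, 5, 3], [0, 3, 3]]   # matrix of Example 8.6
A_87 = [[1, 2], [2, 2]]                      # matrix of Example 8.7


def Q_86(x1, x2, x3):
    return 7 * x1**2 + 5 * x2**2 + 3 * x3**2 - 2 * x1 * x2 + 6 * x2 * x3


def Q_87(x1, x2):
    return x1**2 + 4 * x1 * x2 + 2 * x2**2


print(quadratic_form(A_86, (1, 2, 3)), Q_86(1, 2, 3))
print(quadratic_form(A_87, (1, 2)), Q_87(1, 2))
```

Each off-diagonal entry of the symmetric matrix is half the coefficient of the corresponding cross term, because the terms $a_{ij} x_i x_j$ and $a_{ji} x_j x_i$ are added together.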

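8.3.1 Classification of Quadratic Forms

Definition
A quadratic form $Q$ on $\mathbb{R}^n$ is said to be
1. positive definite if $Q(\mathbf{x}) > 0$ for all $\mathbf{x} \ne \mathbf{0}$,
2. positive semidefinite if $Q(\mathbf{x}) \ge 0$ for all $\mathbf{x} \in \mathbb{R}^n$,
3. negative definite if $Q(\mathbf{x}) < 0$ for all $\mathbf{x} \ne \mathbf{0}$,
4. negative semidefinite if $Q(\mathbf{x}) \le 0$ for all $\mathbf{x} \in \mathbb{R}^n$,
5. indefinite if there exist $\mathbf{x}, \mathbf{y} \in \mathbb{R}^n$ such that $Q(\mathbf{x}) > 0$ and $Q(\mathbf{y}) < 0$.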

Theorem 8.5

Let $A$ be an $n \times n$ symmetric matrix. Then the quadratic form $\mathbf{x} A \mathbf{x}^T$ is
1. positive definite if and only if the eigenvalues of $A$ are all positive (i.e. strictly greater than zero);
2. positive semidefinite if and only if the eigenvalues of $A$ are all non-negative;
3. negative definite if and only if the eigenvalues of $A$ are all negative (i.e. strictly less than zero);
4. negative semidefinite if and only if the eigenvalues of $A$ are all non-positive;
5. indefinite if and only if $A$ has both positive and negative eigenvalues.

Example 8.8
Classify the quadratic form $Q(\mathbf{x}) = 3x_1^2 - 2x_2^2 + 2x_3^2 - 2x_1 x_3$.
Solution
Notice that $Q(\mathbf{x}) = [x_1 \ \ x_2 \ \ x_3]\, A\, [x_1 \ \ x_2 \ \ x_3]^T$, where
$$A = \begin{bmatrix} 3 & 0 & -1 \\ 0 & -2 & 0 \\ -1 & 0 & 2 \end{bmatrix}.$$

Now let us find the eigenvalues of $A$. They are given by the equation $|A - \lambda I| = 0$.

Solving this equation, we get the eigenvalues
$$\lambda = -2, \quad \lambda = \frac{5 + \sqrt{5}}{2} \quad \text{and} \quad \lambda = \frac{5 - \sqrt{5}}{2}.$$

Since $A$ has both positive and negative eigenvalues, the quadratic form $Q$ is indefinite.
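The eigenvalues quoted in Example 8.8 can be verified by substituting them into $|A - \lambda I|$ and checking that the result vanishes up to rounding. The sketch below does exactly that; the function name `char_det` is an assumption, and the determinant is written out for the $3 \times 3$ case only.

```python
import math

A = [[3, 0, -1], [0, -2, 0], [-1, 0, 2]]


def char_det(lam):
    """det(A - lam*I) for the 3x3 matrix A, expanded along the first row."""
    m = [[A[i][j] - (lam if i == j else 0) for j in range(3)] for i in range(3)]
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


eigenvalues = [-2, (5 + math.sqrt(5)) / 2, (5 - math.sqrt(5)) / 2]
for lam in eigenvalues:
    print(lam, char_det(lam))  # each determinant should be zero up to rounding
```

Since one of the three values is negative and the other two are positive, Theorem 8.5 gives the classification stated in the example.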

Definition
An $n \times n$ symmetric matrix $A$ is said to be
1. positive definite (PD) if the quadratic form $\mathbf{x} A \mathbf{x}^T$ is positive definite,
2. positive semidefinite (PSD) if the quadratic form $\mathbf{x} A \mathbf{x}^T$ is positive semidefinite,
3. negative definite (ND) if the quadratic form $\mathbf{x} A \mathbf{x}^T$ is negative definite,
4. negative semidefinite (NSD) if the quadratic form $\mathbf{x} A \mathbf{x}^T$ is negative semidefinite,
5. indefinite (ID) if the quadratic form $\mathbf{x} A \mathbf{x}^T$ is indefinite.

Remark

Let $A$ be a symmetric matrix. Then we have the following two results.
i. If $A$ is positive definite, then $A$ is positive semidefinite.
ii. If $A$ is negative definite, then $A$ is negative semidefinite.

Activity 8.2

1. What are the positive definite and positive semidefinite matrices in $M_1$, the set of all $1 \times 1$ matrices?

2. Explain why $\begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}$ is positive semidefinite but not positive definite.

3. Is $A = \begin{pmatrix} 1 & 4 \\ 4 & 1 \end{pmatrix}$ positive definite? (Observe that all the entries of $A$ are positive.)

4. Determine whether the matrix $\begin{pmatrix} -1 & 1 \\ 1 & -4 \end{pmatrix}$ is PD, PSD, ND or NSD.

5. Consider the diagonal matrix $A = \begin{pmatrix} a & 0 & 0 \\ 0 & b & 0 \\ 0 & 0 & c \end{pmatrix}$. Determine the values of $a, b, c$ which make $A$
(a) PD (b) PSD (c) ND (d) NSD (e) ID.

Theorem 8.6

Let $A$ be an $n \times n$ real symmetric matrix. The following statements are equivalent.
I. $A$ is positive definite.
II. All the leading principal minors of $A$ are positive (i.e. strictly greater than zero).


Example 8.9
Show that $A = \begin{pmatrix} 2 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 2 \end{pmatrix}$ is positive definite.

Solution
Notice that the leading principal minors of $A$ are 2, 3, and 4. Since they are all positive, $A$ is positive definite.
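The leading principal minors used in Example 8.9 are the determinants of the top-left $k \times k$ blocks, so the criterion of Theorem 8.6 is short to code. The following sketch is illustrative; the helper names are assumptions, and the cofactor-expansion determinant is intended for small matrices only.

```python
def det(m):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))


def leading_principal_minors(m):
    """Determinants of the top-left k x k blocks, for k = 1, ..., n."""
    return [det([row[:k] for row in m[:k]]) for k in range(1, len(m) + 1)]


def is_positive_definite(m):
    """Criterion of Theorem 8.6 for a real symmetric matrix."""
    return all(d > 0 for d in leading_principal_minors(m))


A = [[2, -1, 0], [-1, 2, -1], [0, -1, 2]]
print(leading_principal_minors(A))  # [2, 3, 4]
print(is_positive_definite(A))      # True
```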

Theorem 8.7

Let $A$ be an $n \times n$ real symmetric matrix. The following statements are equivalent.
I. $A$ is positive semidefinite.
II. All the principal minors of $A$ are non-negative.

Note

Observe that condition II applies to all the principal minors of $A$, not only to the leading principal minors. Otherwise, it would not be possible to distinguish between two matrices whose leading principal minors are all zero.

For example, $\begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}$ is positive semidefinite and $\begin{pmatrix} 0 & 0 \\ 0 & -1 \end{pmatrix}$ is negative semidefinite. But all the leading principal minors of both matrices are equal to zero.
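The two matrices in this note can be compared directly. The sketch below is restricted to $2 \times 2$ matrices and uses hypothetical helper names; it shows that the leading principal minors agree while the full set of principal minors does not. Negative semidefiniteness is tested here by checking that $-A$ is positive semidefinite, which follows from the definition of the quadratic form.

```python
def leading_minors_2x2(m):
    """The two leading principal minors of a 2x2 matrix."""
    (a, b), (c, d) = m
    return [a, a * d - b * c]


def all_principal_minors_2x2(m):
    """Both first-order principal minors and the single second-order one."""
    (a, b), (c, d) = m
    return [a, d], a * d - b * c


def is_psd_2x2(m):
    """Criterion of Theorem 8.7 for a symmetric 2x2 matrix."""
    firsts, second = all_principal_minors_2x2(m)
    return all(v >= 0 for v in firsts) and second >= 0


P = [[0, 0], [0, 1]]
N = [[0, 0], [0, -1]]
neg_N = [[-v for v in row] for row in N]

print(leading_minors_2x2(P), leading_minors_2x2(N))  # identical: [0, 0] and [0, 0]
print(is_psd_2x2(P), is_psd_2x2(N), is_psd_2x2(neg_N))
```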

Example 8.10
Show that $A = \begin{pmatrix} 2 & -1 & -1 \\ -1 & 2 & -1 \\ -1 & -1 & 2 \end{pmatrix}$ is positive semidefinite.

Solution
Notice that all the first principal minors of $A$ are equal to 2 and all the second principal minors of $A$ are equal to 3. The determinant of $A$ is 0. Since all the principal minors of $A$ are nonnegative, it follows that $A$ is positive semidefinite.

Theorem 8.8

Let $A$ be an $n \times n$ real symmetric matrix. For each $k \in \mathbb{N}$ with $1 \le k \le n$, let $\Delta_k$ denote the $k$th leading principal minor of $A$. The following statements are equivalent.
I. $A$ is negative definite.
II. $(-1)^k \Delta_k > 0$ for all $1 \le k \le n$.

Theorem 8.9

Let $A$ be an $n \times n$ real symmetric matrix. For each $k \in \mathbb{N}$ with $1 \le k \le n$, let $\left\{|A_{ki}| : 1 \le i \le \binom{n}{n-k},\ i \in \mathbb{N}\right\}$ be the set of all $k$th-order principal minors of $A$. The following statements are equivalent.
I. $A$ is negative semidefinite.
II. For each $1 \le k \le n$, if $|A_{ki}| \ne 0$ then $(-1)^k |A_{ki}| > 0$ (i.e. all the nonzero principal minors of order $k$, $1 \le k \le n$, have the same sign as $(-1)^k$).

8.4 Unconstrained Maximization and Minimization

In this section, all the functions are assumed to be defined on $\mathbb{R}^n$ and to have continuous first- and second-order partial derivatives on $\mathbb{R}^n$.

The following three theorems give conditions (involving the Hessian of a function $f$) under which $f$ has a strict local minimum (and hence a local minimum), a strict local maximum (and hence a local maximum), or no local extremum at a critical point.

Let $f$ be a function defined on $\mathbb{R}^n$ and let $\mathbf{x}_0$ be a critical point of $f$.

Theorem 8.10

If $Hf(\mathbf{x}_0)$ is positive definite, then $f$ has a strict local minimum at $\mathbf{x}_0$.

Theorem 8.11

If $Hf(\mathbf{x}_0)$ is negative definite, then $f$ has a strict local maximum at $\mathbf{x}_0$.

Theorem 8.12

If $|Hf(\mathbf{x}_0)| \ne 0$ and $Hf(\mathbf{x}_0)$ is neither positive definite nor negative definite, then $\mathbf{x}_0$ is a saddle point.

Example 8.11
Define $f: \mathbb{R}^2 \to \mathbb{R}$ by $f(x_1, x_2) = x_1^2 + x_2^2$. Notice that $(0, 0)$ is a critical point of $f$ (indeed $(0, 0)$ is the only critical point of $f$).

The Hessian of $f$ at $\mathbf{x}$ is $Hf(\mathbf{x}) = \begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix}$. In particular, the Hessian of $f$ at $(0, 0)$ is also $\begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix}$, which is clearly positive definite. Thus, by Theorem 8.10, $f$ has a strict local minimum at $(0, 0)$ (indeed $f$ has a strict global minimum at $(0, 0)$). See Figure 8.8.


Figure 8.8

Example 8.12
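Define $f: \mathbb{R}^2 \to \mathbb{R}$ by $f(x_1, x_2) = 4x_1 x_2$ for each $(x_1, x_2) \in \mathbb{R}^2$. Notice that $(0, 0)$ is a critical point of $f$, that $Hf(0, 0) = \begin{pmatrix} 0 & 4 \\ 4 & 0 \end{pmatrix}$ and that $|Hf(0, 0)| = -16 \ne 0$. Clearly $Hf(0, 0)$ is neither positive definite nor negative definite. Therefore, by Theorem 8.12, $(0, 0)$ is a saddle point.
We can obtain the same conclusion in the following way.
Let $\epsilon > 0$. Then $\left(\frac{\epsilon}{2}, \frac{\epsilon}{2}\right), \left(-\frac{\epsilon}{2}, \frac{\epsilon}{2}\right) \in B(\mathbf{0}, \epsilon)$. Notice that $f\left(\frac{\epsilon}{2}, \frac{\epsilon}{2}\right) = \epsilon^2 > 0$ and $f\left(-\frac{\epsilon}{2}, \frac{\epsilon}{2}\right) = -\epsilon^2 < 0$. Since $\epsilon > 0$ is arbitrary, it follows from the definition of a saddle point that $(0, 0)$ is a saddle point. See Figure 8.9.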
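For a function of two variables, Theorems 8.10 to 8.12 can be combined with Theorem 8.5 by computing the two eigenvalues of the Hessian in closed form. The sketch below is an illustration under that assumption; the function names and return strings are hypothetical, and the exact comparison with zero is adequate for the integer Hessians used here but would need a tolerance for general floating-point input.

```python
import math


def eigenvalues_sym_2x2(h):
    """Eigenvalues (smaller first) of a symmetric 2x2 matrix [[a, b], [b, d]]."""
    (a, b), (_, d) = h
    mean = (a + d) / 2
    radius = math.sqrt(((a - d) / 2) ** 2 + b**2)
    return mean - radius, mean + radius


def classify_critical_point(h):
    """Classify a critical point from its 2x2 Hessian h."""
    lo, hi = eigenvalues_sym_2x2(h)
    if lo > 0:
        return "strict local minimum"   # positive definite, Theorem 8.10
    if hi < 0:
        return "strict local maximum"   # negative definite, Theorem 8.11
    if lo * hi != 0:
        return "saddle point"           # nonzero determinant, Theorem 8.12
    return "inconclusive"               # determinant is zero


print(classify_critical_point([[2, 0], [0, 2]]))  # Hessian of Example 8.11
print(classify_critical_point([[0, 4], [4, 0]]))  # Hessian of Example 8.12
```

The product of the two eigenvalues equals the determinant of the Hessian, which is why the third branch corresponds to the hypothesis $|Hf(\mathbf{x}_0)| \ne 0$.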


Figure 8.9

Remark

If $|Hf(\mathbf{x}_0)| = 0$, then $f$ may have a local extremum at $\mathbf{x}_0$ or $\mathbf{x}_0$ can be a saddle point, and the preceding tests are inconclusive. Also, if the Hessian is positive semidefinite or negative semidefinite at a critical point, then it cannot be concluded that $f$ necessarily has an extremum at that point.

Example 8.13
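Define $f: \mathbb{R}^2 \to \mathbb{R}$ by $f(x_1, x_2) = x_1^4 - x_2^4$ for all $(x_1, x_2) \in \mathbb{R}^2$.
Then $\nabla f(x_1, x_2) = (4x_1^3, -4x_2^3)$, which yields the critical point $(0, 0)$.
Also, $Hf(x_1, x_2) = \begin{pmatrix} 12x_1^2 & 0 \\ 0 & -12x_2^2 \end{pmatrix}$.
Now, $Hf(0, 0)$ is the zero matrix, which is positive semidefinite (and also negative semidefinite). Observe that $f$ has neither a local maximum nor a local minimum at $(0, 0)$, since $f(t, 0) = t^4 > 0 = f(0, 0)$ and $f(0, t) = -t^4 < 0 = f(0, 0)$ for every $t \ne 0$.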

Example 8.14


Define $f: \mathbb{R}^2 \to \mathbb{R}$ by $f(x_1, x_2) = -x_1^4 - x_2^4$ for all $(x_1, x_2) \in \mathbb{R}^2$.
Notice that $(0, 0)$ is a critical point of $f$ and $Hf(0, 0)$ is the zero matrix. However, $f$ has a strict global maximum at $(0, 0)$, since $f(x_1, x_2) < 0 = f(0, 0)$ for every $(x_1, x_2) \ne (0, 0)$.

Theorem 8.13

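Let $f$ be a function defined on $\mathbb{R}^n$ and let $\mathbf{x}_0$ be a critical point of $f$.
I. If $Hf(\mathbf{x})$ is positive semidefinite for all $\mathbf{x} \in \mathbb{R}^n$, then $f$ has a global minimum at $\mathbf{x}_0$.
II. If $Hf(\mathbf{x})$ is negative semidefinite for all $\mathbf{x} \in \mathbb{R}^n$, then $f$ has a global maximum at $\mathbf{x}_0$.
III. If $Hf(\mathbf{x})$ is positive definite for all $\mathbf{x} \in \mathbb{R}^n$, then $f$ has a strict global minimum at $\mathbf{x}_0$.
IV. If $Hf(\mathbf{x})$ is negative definite for all $\mathbf{x} \in \mathbb{R}^n$, then $f$ has a strict global maximum at $\mathbf{x}_0$.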

Activity 8.3

1. Find all points at which $f$ has local maxima, local minima, and saddle points for

i. $f(x_1, x_2) = x_1^2 x_2 + x_2^3 x_1 - x_1 x_2$.

ii. $f(x_1, x_2) = x_1^3 - 3x_1 x_2^2 + x_2^4$.

iii. $f(x_1, x_2, x_3) = x_1 x_2 + x_2 x_3 + x_1 x_3$.

2. Determine the nature of the critical points (if any) of the following functions.

i. $f(x_1, x_2, x_3) = -x_1^2 - x_2^2 - x_3^2 + x_1 x_2 - x_2 x_3 + x_1 x_3$.

ii. $f(x_1, x_2) = e^{x_1 - x_2} + e^{x_2 - x_1}$.


iii. $f(x_1, x_2, x_3) = e^{x_1 - x_2} + e^{x_2 - x_1} + e^{x_1^2} + e^{x_3^2}$.

iv. $f(x_1, x_2) = e^{x_1 - x_2} + e^{x_1 + x_2}$.

3. Show that the functions $f(x_1, x_2) = x_1^2 + x_2^3$ and $g(x_1, x_2) = x_1^2 + x_2^4$ both have a critical point at $(x_1, x_2) = (0, 0)$ and that their associated Hessians at $(0, 0)$ are PSD. Show that $g$ has a local (global) minimum at $(0, 0)$ and that $f$ has no extremum at $(0, 0)$.

Definition
Let $\emptyset \ne A \subseteq \mathbb{R}^n$ and let $f$ be a function defined on $A$. $f$ is said to have a maximum on $A$ if there exists $\mathbf{a} \in A$ such that $f(\mathbf{a}) \ge f(\mathbf{x})$ for each $\mathbf{x} \in A$. $f$ is said to have a minimum on $A$ if there exists $\mathbf{b} \in A$ such that $f(\mathbf{b}) \le f(\mathbf{x})$ for each $\mathbf{x} \in A$. The values $f(\mathbf{a})$ and $f(\mathbf{b})$ are called, respectively, the maximum value of $f$ on $A$ and the minimum value of $f$ on $A$.

Example 8.15
The function $f: (0, 1) \to \mathbb{R}$ defined by $f(x) = x^2$ for each $x \in (0, 1)$ has neither a maximum nor a minimum on $(0, 1)$.

Example 8.16
Define $f: \mathbb{R}^2 \to \mathbb{R}$ by $f(x_1, x_2) = x_1^2 + x_2^2$ for each $(x_1, x_2) \in \mathbb{R}^2$. Then $f$ has no maximum on the open unit disc $\|\mathbf{x}\| < 1$. However, $f$ has a minimum on the open unit disc and $f$ takes its minimum at $(0, 0)$.

8.5 Lagrange Multipliers

Almost all the functions that we discussed in the previous section were defined and differentiable on $\mathbb{R}^n$. So, the problem of finding points at which a function has local extrema was easily solved by invoking Theorem 8.1. The functions that we consider in this section are also differentiable, but their domain will no longer be all of $\mathbb{R}^n$. There are situations in which we need to impose constraints on the domain of the function, and these constraints restrict the domain. Our goal in this section is to optimize a given function $f$, i.e. to find the maximum or minimum values of $f$, on this kind of restricted domain. This is called a constrained optimization problem. The method which we are going to introduce shortly is known as the method of Lagrange multipliers and is used to tackle such problems. To begin with, let us first consider some simple examples.

Example 8.17
Find the minimum value of $f(x, y) = 4x^2 + 3xy + 6y^2$ subject to the constraint $x + y = 56$.

Solution
The function $f$ is defined for all $(x, y) \in \mathbb{R}^2$ and is differentiable on $\mathbb{R}^2$. But we need to investigate the function on the set $\{(x, y) \in \mathbb{R}^2 : x + y = 56\}$. First let us find the critical points of $f$.

Notice that
$\nabla f = \mathbf{0}$ if and only if
$(8x + 3y, 3x + 12y) = \mathbf{0}$ if and only if
$8x + 3y = 0$ and $3x + 12y = 0$ if and only if
$x = y = 0$.

So, $f$ has only one critical point, namely $(0, 0)$. Clearly $(0, 0)$ does not satisfy $x + y = 56$. In other words, $(0, 0) \notin \{(x, y) \in \mathbb{R}^2 : x + y = 56\}$, the domain of interest.

Since $\Delta(0, 0) = f_{xx}(0, 0) f_{yy}(0, 0) - [f_{xy}(0, 0)]^2 = 8 \cdot 12 - 3^2 = 87 > 0$ and $f_{xx}(0, 0) = 8 > 0$, the second derivative test implies that $f$ has a local minimum at $(0, 0)$. By writing $f$ in the equivalent form
$$f(x, y) = 3x^2 + \left(x + \frac{3y}{2}\right)^2 + \frac{15y^2}{4},$$
we can see that $f(x, y) \ge 0 = f(0, 0)$ for all $(x, y) \in \mathbb{R}^2$. Hence, $f$ has not only a local minimum at $(0, 0)$ but also its unique global minimum there.


However, this is not what we are looking for.

As we pointed out earlier, our points $(x, y)$ should satisfy the condition $x + y = 56$. So, how do we solve this problem? Let us write $y$ in terms of $x$ using the equation $x + y = 56$. This yields $y = 56 - x$, where $x \in \mathbb{R}$. Substituting this for $y$ in $f(x, y)$, we get
$$f(x, y) = f(x, 56 - x) = 4x^2 + 3x(56 - x) + 6(56 - x)^2, \quad x \in \mathbb{R}.$$
We have now obtained a function of the single variable $x$, and you know how to find extreme values of such functions. Let $F(x) = f(x, 56 - x)$. Observe that $F$ is a quadratic function and the coefficient of its $x^2$ term is positive (it equals $4 - 3 + 6 = 7$). This guarantees that $F$, and hence the constrained problem, has a minimum. To find the minimum we set $\frac{dF}{dx} = 0$.

Now,
$\frac{dF}{dx} = 0$ if and only if
$8x - 3x + 3(56 - x) - 12(56 - x) = 0$ if and only if
$x = 36$.
When $x = 36$, $y = 20$.

So, the minimum value of $f$ subject to the constraint $x + y = 56$ is $f(36, 20) = 9744$, and it occurs at $(36, 20)$.
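The substitution method of Example 8.17 can be replayed with exact rational arithmetic. The sketch below recovers the coefficients of the quadratic $F(x) = f(x, 56 - x)$ from three sample values and then locates its vertex; the variable names and the sampling trick are assumptions made for illustration and are not part of the original solution.

```python
from fractions import Fraction


def f(x, y):
    """Objective function of Example 8.17."""
    return 4 * x**2 + 3 * x * y + 6 * y**2


def F(x):
    """f restricted to the constraint line x + y = 56."""
    return f(x, 56 - x)


# F(x) = a*x^2 + b*x + c; recover a, b, c from the values F(-1), F(0), F(1)
c = F(Fraction(0))
a = (F(Fraction(1)) + F(Fraction(-1)) - 2 * c) / 2
b = (F(Fraction(1)) - F(Fraction(-1))) / 2

x_star = -b / (2 * a)   # vertex of the parabola; a minimum because a > 0
y_star = 56 - x_star
print(a, b, c)
print(x_star, y_star, f(x_star, y_star))
```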

We now state a useful result from analysis which guarantees the existence of extrema of a continuous function defined on a closed and bounded subset of $\mathbb{R}^n$.

Theorem 8.14

Let $f$ be a continuous function defined on a closed and bounded set $D \subseteq \mathbb{R}^n$. Then $f$ attains a maximum and a minimum value on $D$. That is, there exist $\mathbf{x}', \mathbf{x}'' \in D$ such that $f(\mathbf{x}') \ge f(\mathbf{x})$ for each $\mathbf{x} \in D$ and $f(\mathbf{x}'') \le f(\mathbf{x})$ for each $\mathbf{x} \in D$.

Example 8.18
Find the maximum and the minimum value of $f(x, y) = 4x^2 + 9y^2$ on the closed disk determined by $x^2 + y^2 \le 9$.

Solution
In this problem we would like to find the maximum and minimum values of $f$ on the closed disk given by $x^2 + y^2 \le 9$.

Observe that $f$ is continuous on $\mathbb{R}^2$ and in particular on the disk $D = \{(x, y) \in \mathbb{R}^2 : x^2 + y^2 \le 9\}$. Since $D$ is closed and bounded in $\mathbb{R}^2$, Theorem 8.14 implies that $f$ attains its maximum and minimum values on $D$. That is, there exist $\mathbf{x}_0, \mathbf{y}_0 \in D$ such that $f(\mathbf{x}_0) \ge f(\mathbf{x})$ for all $\mathbf{x} \in D$ and $f(\mathbf{y}_0) \le f(\mathbf{x})$ for all $\mathbf{x} \in D$. Therefore, the problem makes sense. Since $f$ is differentiable on $\mathbb{R}^2$, $\nabla f$ exists at each point of $\mathbb{R}^2$. Hence, the critical points of $f$ are obtained by setting $\nabla f = \mathbf{0}$.

Now, $\nabla f = \mathbf{0}$ if and only if $(8x, 18y) = (0, 0)$ if and only if $x = 0$ and $y = 0$. So, the only critical point of $f$ is $(0, 0)$, and it is in $D$. It is easy to see that $f(0, 0) \le f(x, y)$ for all $(x, y) \in D$.

Therefore the minimum value of $f$ on $D$ is $f(0, 0) = 0$ and it occurs at $(0, 0)$.

Now what about the maximum? Can $f$ attain its maximum value at an interior point of $D$? The answer is no. For suppose there exists $\mathbf{x}_0 \in \operatorname{int}(D)$ such that $f(\mathbf{x}_0) \ge f(\mathbf{x})$ for all $\mathbf{x} \in D$. Then differentiability of $f$ at $\mathbf{x}_0$ implies that $\nabla f(\mathbf{x}_0) = \mathbf{0}$ (Theorem 8.1).

But the only point in $\operatorname{int}(D)$ which satisfies $\nabla f(\mathbf{x}) = \mathbf{0}$ is $(0, 0)$, and $f(0, 0) = 0$ is the minimum value of $f$ on $D$, which is smaller than, for example, $f(1, 0) = 4$. Thus there is no point $\mathbf{x}_0$ in $\operatorname{int}(D)$ at which $f$ attains its maximum. So, the maximum of $f$ must occur on the boundary of $D$. That is, if $f$ takes its maximum at $(a, b)$, then $(a, b) \in \{(x, y) \in \mathbb{R}^2 : x^2 + y^2 = 9\}$. Now, what happens to $f$ on this boundary?

Let us parameterize the circle 𝑥² + 𝑦² = 9 by 𝑥(𝑡) = 3 cos 𝑡 and
𝑦(𝑡) = 3 sin 𝑡, where 0 ≤ 𝑡 ≤ 2𝜋.
Define 𝐹: [0, 2𝜋] → ℝ by 𝐹(𝑡) = 𝑓(𝑥(𝑡), 𝑦(𝑡)). Then
𝐹(𝑡) = 4(3 cos 𝑡)² + 9(3 sin 𝑡)² = 36 cos² 𝑡 + 81 sin² 𝑡 = 36 + 45 sin² 𝑡.

Let’s find the extreme values of 𝐹 on [0, 2𝜋].

Clearly 𝐹 is continuous on the closed and bounded interval [0, 2𝜋] and
hence 𝐹 must have a maximum and a minimum on [0, 2𝜋]. To find the
extreme values of 𝐹 in (0, 2𝜋), set 𝐹′(𝑡) = 0. Now,
𝐹′(𝑡) = 90 sin 𝑡 cos 𝑡 = 45 sin 2𝑡, so 𝐹′(𝑡) = 0 if and only if
sin 2𝑡 = 0 if and only if 2𝑡 = 𝑛𝜋, 𝑛 ∈ ℤ. However, in our case
𝑡 ∈ (0, 2𝜋). Thus, we get

𝑡 = 𝜋/2, 𝜋, 3𝜋/2.

We need to check the two end points 0 and 2𝜋 separately.


Notice that
𝐹(0) = 36,
𝐹(𝜋/2) = 36 + 45 = 81,
𝐹(𝜋) = 36,
𝐹(3𝜋/2) = 81 and
𝐹(2𝜋) = 36.

Now,
when 𝑡 = 0, 𝑥 = 𝑥(𝑡) = 3 and 𝑦 = 𝑦(𝑡) = 0
when 𝑡 = 𝜋/2, 𝑥 = 0 and 𝑦 = 3
when 𝑡 = 𝜋, 𝑥 = −3 and 𝑦 = 0
when 𝑡 = 3𝜋/2, 𝑥 = 0 and 𝑦 = −3
when 𝑡 = 2𝜋, 𝑥 = 3 and 𝑦 = 0

Hence,
𝑓(3, 0) = 36,
𝑓(0, 3) = 81,
𝑓(−3, 0) = 36,
𝑓(0, −3) = 81.

Therefore 𝑓 takes its maximum value at the two points (0, 3), (0, −3)
and the maximum value is 81.
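
As a rough numerical cross-check of the boundary analysis, one can sample 𝐹(𝑡) = 𝑓(3 cos 𝑡, 3 sin 𝑡) on a fine grid of [0, 2𝜋]. This is only a sketch: the sample count `n` is an arbitrary illustrative choice, and sampling supports the conclusion without proving it.

```python
import math

def f(x, y):
    return 4 * x**2 + 9 * y**2

def F(t):
    # f restricted to the boundary circle x = 3 cos t, y = 3 sin t.
    return f(3 * math.cos(t), 3 * math.sin(t))

n = 3600  # illustrative number of sample intervals
ts = [2 * math.pi * k / n for k in range(n + 1)]
values = [F(t) for t in ts]

boundary_max = max(values)
boundary_min = min(values)
print(boundary_max, boundary_min)

# The closed form 36 + 45 sin^2 t derived in the text agrees with F.
closed_form_ok = all(
    abs(F(t) - (36 + 45 * math.sin(t) ** 2)) < 1e-9 for t in ts
)
print(closed_form_ok)
```

Up to rounding, the sampled maximum and minimum on the boundary agree with the values 81 and 36 found above; the overall minimum 0 of 𝑓 on the disk comes from the interior critical point, not from the boundary.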

Remark

If there are points in the domain of the function at which at least one of the
first-order partial derivatives does not exist, then you need to check those
points separately to determine whether 𝑓 has extrema at such points.

Now let’s move on to the main theorem in this section.

Theorem 8.15

Let 𝑈 ⊆ ℝⁿ be open and let 𝑓: 𝑈 → ℝ, 𝑔𝑖: 𝑈 → ℝ be continuously
differentiable functions on 𝑈, where 𝑖 = 1, 2, …, 𝑚 with 𝑚 < 𝑛.
Let 𝐶 = {𝒙 : 𝒙 ∈ 𝑈 and 𝑔𝑖(𝒙) = 0 for each 𝑖 = 1, 2, …, 𝑚}. Let
𝒂 ∈ 𝐶 and assume that there exists 𝜖 > 0 such that 𝑓(𝒂) ≥ 𝑓(𝒙) for each
𝒙 ∈ 𝐶 ∩ 𝐵(𝒂, 𝜖) or 𝑓(𝒂) ≤ 𝑓(𝒙) for each 𝒙 ∈ 𝐶 ∩ 𝐵(𝒂, 𝜖). Assume also
that rank 𝐷 = 𝑚, where

\[
D =
\begin{bmatrix}
\dfrac{\partial g_1}{\partial x_1}(\mathbf{a}) & \dfrac{\partial g_1}{\partial x_2}(\mathbf{a}) & \cdots & \dfrac{\partial g_1}{\partial x_n}(\mathbf{a}) \\
\dfrac{\partial g_2}{\partial x_1}(\mathbf{a}) & \dfrac{\partial g_2}{\partial x_2}(\mathbf{a}) & \cdots & \dfrac{\partial g_2}{\partial x_n}(\mathbf{a}) \\
\vdots & \vdots & \ddots & \vdots \\
\dfrac{\partial g_m}{\partial x_1}(\mathbf{a}) & \dfrac{\partial g_m}{\partial x_2}(\mathbf{a}) & \cdots & \dfrac{\partial g_m}{\partial x_n}(\mathbf{a})
\end{bmatrix}
=
\begin{bmatrix}
\nabla g_1(\mathbf{a}) \\
\nabla g_2(\mathbf{a}) \\
\vdots \\
\nabla g_m(\mathbf{a})
\end{bmatrix}.
\]

(Here we have considered ∇𝑔𝑖(𝒂) as a 1 × 𝑛 row vector.) Then
there exist 𝑚 real numbers 𝜆1′, …, 𝜆𝑚′ such that
∇𝑓(𝒂) = 𝜆1′∇𝑔1(𝒂) + 𝜆2′∇𝑔2(𝒂) + ⋯ + 𝜆𝑚′∇𝑔𝑚(𝒂).

Several remarks are in order at this point.


The condition that there exists 𝜖 > 0 such that 𝑓(𝒂) ≥ 𝑓(𝒙) for each
𝒙 ∈ 𝐶 ∩ 𝐵(𝒂, 𝜖) says that 𝑓 has a local maximum at 𝒂 in 𝐶. Likewise, the
condition that there exists 𝜖 > 0 such that 𝑓(𝒂) ≤ 𝑓(𝒙) for each
𝒙 ∈ 𝐶 ∩ 𝐵(𝒂, 𝜖) says that 𝑓 has a local minimum at 𝒂 in 𝐶.

If 𝑓(𝒂) ≥ 𝑓(𝒙) (respectively 𝑓(𝒂) ≤ 𝑓(𝒙)) for each 𝒙 ∈ 𝐶, then 𝑓 is said to
have a global maximum (respectively a global minimum) at 𝒂 in 𝐶.

We have used the phrase “in 𝐶” to emphasize not only that 𝑓 has a local
or global extremum at 𝒂 = (𝑎1, …, 𝑎𝑛), but also that the point 𝒂
satisfies the 𝑚 equality constraints 𝑔𝑖(𝒙) = 0, 𝑖 = 1, …, 𝑚, which
define 𝐶.

• Notice that altogether we have 𝑚 + 𝑛 equations, namely the 𝑛 equations
coming from the vector equation

∇𝑓(𝒙) = 𝜆1∇𝑔1(𝒙) + 𝜆2∇𝑔2(𝒙) + ⋯ + 𝜆𝑚∇𝑔𝑚(𝒙)

and the 𝑚 equations (coming from the 𝑚 equality constraints)

𝑔𝑖(𝒙) = 0 for each 𝑖 = 1, …, 𝑚.


Let us define a function 𝐿: 𝑈 × ℝᵐ → ℝ by
𝐿(𝑥1, …, 𝑥𝑛, 𝜆1, …, 𝜆𝑚) = 𝑓(𝒙) − 𝜆1𝑔1(𝒙) − 𝜆2𝑔2(𝒙) − ⋯ − 𝜆𝑚𝑔𝑚(𝒙).

Then for (𝑎1, …, 𝑎𝑛, 𝜆1′, …, 𝜆𝑚′) ∈ ℝⁿ⁺ᵐ, where 𝒂 = (𝑎1, …, 𝑎𝑛) and the
𝜆𝑖′ are as in Theorem 8.15, ∇𝐿(𝑎1, …, 𝑎𝑛, 𝜆1′, …, 𝜆𝑚′) = 𝟎. Here
𝟎 ∈ ℝⁿ⁺ᵐ. In other words, (𝑎1, …, 𝑎𝑛, 𝜆1′, …, 𝜆𝑚′) is a critical point
of 𝐿.

Thus, if 𝒂 ∈ ℝⁿ is a solution of the constrained problem, then 𝒂 together
with some 𝜆1′, …, 𝜆𝑚′ ∈ ℝ should satisfy the 𝑛 + 𝑚 equations obtained by
setting
∇𝐿(𝑥1, …, 𝑥𝑛, 𝜆1, …, 𝜆𝑚) = 𝟎.
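
To see why the single condition ∇𝐿 = 𝟎 packages both groups of equations, it may help to write out the partial derivatives of 𝐿. The following is a routine computation from the definition of 𝐿 above, using no additional assumptions:

```latex
\[
\frac{\partial L}{\partial x_j}
  = \frac{\partial f}{\partial x_j}(\mathbf{x})
    - \sum_{i=1}^{m} \lambda_i \,\frac{\partial g_i}{\partial x_j}(\mathbf{x}),
  \qquad j = 1, \dots, n,
\]
\[
\frac{\partial L}{\partial \lambda_i} = -\,g_i(\mathbf{x}),
  \qquad i = 1, \dots, m.
\]
```

Setting the first 𝑛 partial derivatives to zero recovers the vector equation ∇𝑓(𝒙) = 𝜆1∇𝑔1(𝒙) + ⋯ + 𝜆𝑚∇𝑔𝑚(𝒙), and setting the last 𝑚 to zero recovers the constraints 𝑔𝑖(𝒙) = 0.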

• Therefore, in a problem, you need to solve these 𝑛 + 𝑚 equations for the
𝑥𝑖’s and 𝜆𝑖’s (this could be a laborious task and sometimes we won’t be
able to come up with a solution at all). However, it is not necessary to
find the 𝜆𝑖’s explicitly since we are interested in the 𝑥𝑖’s only.

• The points (𝑥1, …, 𝑥𝑛) obtained in this manner must then be checked to
determine whether they yield a maximum, a minimum, or neither. The
reason is that the Lagrange method does not provide a criterion for
determining the nature of the extrema.

• The values 𝜆1′, …, 𝜆𝑚′ are known as Lagrange multipliers.

• The condition rank 𝐷 = 𝑚 is called the “nondegenerate constraint
qualification (NDCQ)”.

Since we are dealing with functions of two or three variables in most cases,
we would like to give the following two versions of the above theorem.


Theorem 8.16

Let 𝑈 ⊆ ℝⁿ (𝑛 = 2 or 3) be an open set and let 𝑓, 𝑔 be two
continuously differentiable real-valued functions defined on 𝑈. Let
𝐶 = {𝒙 : 𝒙 ∈ 𝑈 and 𝑔(𝒙) = 0}. Suppose 𝒂 ∈ 𝐶 and 𝑓 has a local extremum
at 𝒂 in 𝐶. Suppose further that ∇𝑔(𝒂) ≠ 𝟎. Then there exists 𝜆 ∈ ℝ
such that ∇𝑓(𝒂) = 𝜆∇𝑔(𝒂).

Theorem 8.17

Let 𝑈 ⊆ ℝ³ be an open set and let 𝑓, 𝑔1, 𝑔2 be three continuously
differentiable real-valued functions defined on 𝑈.
Let 𝐶 = {𝒙 ∈ 𝑈 : 𝑔1(𝒙) = 0 and 𝑔2(𝒙) = 0}. Assume that 𝒂 ∈ 𝐶 and that 𝑓
has a local extremum at 𝒂 in 𝐶.
Assume also that ∇𝑔1(𝒂) and ∇𝑔2(𝒂) are linearly independent (if
∇𝑔1(𝒂) and ∇𝑔2(𝒂) are viewed as vectors in ℝ³, this means
that ∇𝑔1(𝒂) and ∇𝑔2(𝒂) are nonzero and non-parallel).
Then there exist 𝜆1, 𝜆2 ∈ ℝ such that
∇𝑓(𝒂) = 𝜆1∇𝑔1(𝒂) + 𝜆2∇𝑔2(𝒂).

Let us solve the two examples we discussed at the beginning of this section
using the method of Lagrange multipliers.

Example 8.19
Find the minimum value of 𝑓(𝑥, 𝑦) = 4𝑥² + 3𝑥𝑦 + 6𝑦² subject to the
constraint curve 𝐶: 𝑥 + 𝑦 = 56.

Solution
Put 𝑔(𝑥, 𝑦) = 𝑥 + 𝑦 − 56. Then 𝑓 and 𝑔 are continuously differentiable
on ℝ².

Notice that ∇𝑔(𝒙) = (1, 1) ≠ 𝟎 for each (𝑥, 𝑦) ∈ ℝ² and hence for
each (𝑥, 𝑦) ∈ 𝐶.

Now, the possible candidates for points at which 𝑓 has local extrema in
𝐶 are obtained by setting
∇𝑓(𝒙) = 𝜆∇𝑔(𝒙), where 𝜆 ∈ ℝ, and 𝑥 + 𝑦 = 56.

This yields the following equations:

8𝑥 + 3𝑦 = 𝜆
3𝑥 + 12𝑦 = 𝜆
𝑥 + 𝑦 = 56

Equating the left-hand sides of the first two equations gives
5𝑥 − 9𝑦 = 0.

This, along with the equation 𝑥 + 𝑦 = 56, gives
𝑥 = 36 and 𝑦 = 20.

In this special case we already know, from the substitution argument
given earlier, that the point (36, 20) gives rise to a minimum in 𝐶.
The minimum value of 𝑓 is 𝑓(36, 20) = 9744.
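
The three equations above are linear in (𝑥, 𝑦, 𝜆), so they can also be solved mechanically. The sketch below uses a small exact Gauss-Jordan elimination; the function name `solve_linear` is a hypothetical helper written for this illustration, not something from the text.

```python
from fractions import Fraction

def solve_linear(A, b):
    """Solve A z = b by Gauss-Jordan elimination with exact fractions."""
    n = len(A)
    # Augmented matrix [A | b] with exact rational entries.
    M = [[Fraction(v) for v in row] + [Fraction(rhs)]
         for row, rhs in zip(A, b)]
    for col in range(n):
        # Find a row with a nonzero pivot in this column and swap it up.
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        # Scale the pivot row so that the pivot equals 1.
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return [M[r][n] for r in range(n)]

# Unknowns (x, y, lam):  8x + 3y - lam = 0,  3x + 12y - lam = 0,  x + y = 56.
A = [[8, 3, -1],
     [3, 12, -1],
     [1, 1, 0]]
b = [0, 0, 56]
x, y, lam = solve_linear(A, b)
print(x, y, lam)
```

The multiplier is obtained as a by-product; as remarked earlier, only 𝑥 and 𝑦 are needed to answer the question.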

Example 8.20
Find the maximum and the minimum values of 𝑓(𝑥, 𝑦) = 4𝑥² + 9𝑦² on
the closed disk determined by 𝑥² + 𝑦² ≤ 9.

Solution
We know how to deal with the interior of the disk. Thus, let us consider
the boundary. Notice that points on the boundary satisfy 𝑥² + 𝑦² = 9.
So, put 𝑔(𝑥, 𝑦) = 𝑥² + 𝑦² − 9.

Now 𝑓 and 𝑔 are continuously differentiable on ℝ². Notice that
∇𝑔(𝒙) = (2𝑥, 2𝑦) ≠ 𝟎 for each point on the boundary. The possible
candidates for points at which 𝑓 has local extrema on the boundary are
then obtained by setting
∇𝑓(𝒙) = 𝜆∇𝑔(𝒙), where 𝜆 ∈ ℝ and 𝑥² + 𝑦² − 9 = 0.

This yields the following equations:

8𝑥 = 2𝜆𝑥 or equivalently 𝑥(4 − 𝜆) = 0
18𝑦 = 2𝜆𝑦 or equivalently 𝑦(9 − 𝜆) = 0
𝑥² + 𝑦² = 9.

The first equation implies 𝑥 = 0 or 𝜆 = 4 (notice that these two cases
cannot occur simultaneously: if 𝑥 = 0 and 𝜆 = 4, the second equation
gives 𝑦 = 0, contradicting 𝑥² + 𝑦² = 9). If 𝑥 = 0, then 𝑦 = ±3 (from the
third equation) and 𝜆 = 9. If 𝜆 = 4, then 𝑦 = 0 and it follows that 𝑥 = ±3.
Thus we get (0, 3), (0, −3), (3, 0) and (−3, 0) as possible candidates at
which 𝑓 is expected to have local extrema. Observe that a similar
investigation of the second equation gives rise to the same four points.

Clearly 𝑓 has only one critical point in the interior of the disk and it is (0, 0).


Now,
𝑓(0, 0) = 0
𝑓(0, 3) = 81
𝑓(0, −3) = 81
𝑓(3, 0) = 36
𝑓(−3, 0) = 36

So, 𝑓 attains its maximum at the two points (0, 3) and (0, −3) on the
disk and the maximum value of 𝑓 is 81. The minimum value of 𝑓 is 0 and it
occurs at the origin.
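
A quick way to guard against algebra slips is to substitute each candidate back into the Lagrange equations. The sketch below does this for the four boundary candidates of this example, pairing each point with the multiplier found above; the dictionary layout is just an illustrative choice.

```python
def f(x, y):
    return 4 * x**2 + 9 * y**2

def grad_f(x, y):
    return (8 * x, 18 * y)

def grad_g(x, y):
    # g(x, y) = x^2 + y^2 - 9
    return (2 * x, 2 * y)

# Candidate point -> multiplier, as obtained in the solution above.
candidates = {(0, 3): 9, (0, -3): 9, (3, 0): 4, (-3, 0): 4}

checks = {}
for (x, y), lam in candidates.items():
    gf = grad_f(x, y)
    gg = grad_g(x, y)
    on_circle = x**2 + y**2 == 9
    stationary = gf == (lam * gg[0], lam * gg[1])
    checks[(x, y)] = on_circle and stationary
    print((x, y), lam, f(x, y), checks[(x, y)])
```

Every candidate passes both tests, and comparing the printed values of 𝑓 with 𝑓(0, 0) = 0 reproduces the maximum and minimum stated above.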

Example 8.21
Find the maximum and the minimum values of 𝑓(𝑥, 𝑦, 𝑧) = 𝑥𝑦 + 𝑧²
subject to 𝑥² + 𝑦² + 𝑧² = 4 and 𝑦 − 𝑥 = 0.

Solution
Here we have two constraints.
Put 𝑔1(𝑥, 𝑦, 𝑧) = 𝑥² + 𝑦² + 𝑧² − 4 and 𝑔2(𝑥, 𝑦, 𝑧) = 𝑦 − 𝑥.
Clearly 𝑓, 𝑔1, 𝑔2 are continuously differentiable on ℝ³.

Observe that ∇𝑔1(𝒙) = (2𝑥, 2𝑦, 2𝑧) and ∇𝑔2(𝒙) = (−1, 1, 0) are
linearly independent for each (𝑥, 𝑦, 𝑧) ∈ ℝ³ satisfying the two equality
constraints 𝑔1(𝑥, 𝑦, 𝑧) = 0 and 𝑔2(𝑥, 𝑦, 𝑧) = 0. Indeed, if (2𝑥, 2𝑦, 2𝑧)
were a multiple of (−1, 1, 0), then 𝑧 = 0 and 𝑦 = −𝑥, which together with
𝑦 = 𝑥 gives 𝑥 = 𝑦 = 𝑧 = 0, contradicting 𝑥² + 𝑦² + 𝑧² = 4.
To find possible candidates for points at which 𝑓 has local extrema, we
set
∇𝑓(𝒙) = 𝜆1∇𝑔1(𝒙) + 𝜆2∇𝑔2(𝒙) for 𝜆1, 𝜆2 ∈ ℝ.

This gives the following three equations:

𝑦 = 2𝑥𝜆1 − 𝜆2
𝑥 = 2𝑦𝜆1 + 𝜆2
2𝑧 = 2𝑧𝜆1.

We also have the two equality constraints

𝑥² + 𝑦² + 𝑧² = 4 and
𝑦 = 𝑥.

Notice that the first two equations together with 𝑦 = 𝑥 imply 𝜆2 = 0.
Then the above five equations reduce to the following equations:
𝑥(1 − 2𝜆1) = 0
𝑧(1 − 𝜆1) = 0
2𝑥² + 𝑧² = 4.

Now, the first equation implies 𝑥 = 0 or 𝜆1 = 1⁄2 (observe that these two
cases cannot occur simultaneously: if 𝑥 = 0 and 𝜆1 = 1⁄2, the second
equation gives 𝑧 = 0, contradicting 2𝑥² + 𝑧² = 4).


Suppose 𝑥 = 0. Then 𝑧 = ±2, and this gives 𝜆1 = 1. Now suppose
𝜆1 = 1⁄2. Then 𝑧 = 0 and hence 𝑥 = ±√2. Notice that the same set of
solutions is obtained when similar reasoning is applied to the second
equation.
So, the possible local extrema of 𝑓 occur at
(0, 0, 2), (0, 0, −2), (√2, √2, 0) and (−√2, −√2, 0).

At these points,
𝑓(0, 0, 2) = 4
𝑓(0, 0, −2) = 4
𝑓(√2, √2, 0) = 2

𝑓(−√2, −√2, 0) = 2

So, 𝑓 attains its maximum value 4 at the two points (0, 0, 2), (0, 0, −2)
and its minimum value 2 at the two points (√2, √2, 0), (−√2, −√2, 0).
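
These values can be cross-checked numerically. With 𝑦 = 𝑥 the sphere equation becomes 2𝑥² + 𝑧² = 4, which is an ellipse. The sketch below assumes the parameterization 𝑥 = 𝑦 = √2 cos 𝑡, 𝑧 = 2 sin 𝑡 (an illustrative choice, not taken from the text) and samples 𝑓 along it.

```python
import math

def f(x, y, z):
    return x * y + z**2

def point(t):
    # Assumed parameterization of the constraint set:
    # y = x and 2x^2 + z^2 = 4 hold for every t.
    x = math.sqrt(2) * math.cos(t)
    return (x, x, 2 * math.sin(t))

n = 3600  # illustrative number of sample intervals
points = [point(2 * math.pi * k / n) for k in range(n + 1)]

# Every sampled point satisfies both constraints up to rounding.
feasible = all(
    abs(x**2 + y**2 + z**2 - 4) < 1e-9 and y == x for x, y, z in points
)

values = [f(*p) for p in points]
max_value = max(values)
min_value = min(values)
print(feasible, max_value, min_value)
```

Along this curve 𝑓 equals 2 cos² 𝑡 + 4 sin² 𝑡 = 2 + 2 sin² 𝑡, which makes the range [2, 4] visible directly.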

One question still remains to be answered: how do we know that 𝑓, subject
to the two given constraints, has a maximum and a minimum? To answer
this question we need some knowledge of closed sets in ℝⁿ. We can
prove that the two sets 𝐶1 = {(𝑥, 𝑦, 𝑧) ∈ ℝ³ : 𝑥² + 𝑦² + 𝑧² = 4} and
𝐶2 = {(𝑥, 𝑦, 𝑧) ∈ ℝ³ : 𝑦 = 𝑥} are closed in ℝ³ and that their intersection
𝐶1 ∩ 𝐶2 is closed in ℝ³ (notice that a point (𝑥, 𝑦, 𝑧) satisfies 𝑔1(𝒙) = 0
and 𝑔2(𝒙) = 0 if and only if (𝑥, 𝑦, 𝑧) ∈ 𝐶1 ∩ 𝐶2).

Also, it is clear that 𝐶1 ∩ 𝐶2 is a bounded set in ℝ³ as 𝐶1 is a bounded
set in ℝ³.

Now since 𝑓 is continuous on ℝ³, and in particular on 𝐶1 ∩ 𝐶2, Theorem
8.14 implies that 𝑓 attains a maximum and a minimum on 𝐶1 ∩ 𝐶2.

Example 8.22
Let 𝑓, 𝑔: ℝ² → ℝ be two functions defined by 𝑓(𝑥, 𝑦) = 2𝑥 + 4𝑦 for all
(𝑥, 𝑦) ∈ ℝ² and 𝑔(𝑥, 𝑦) = 𝑥² + 𝑦² for all (𝑥, 𝑦) ∈ ℝ². Find the
maximum of 𝑓 subject to 𝑔(𝑥, 𝑦) = 0.

Solution
Setting ∇𝑓(𝒙) = 𝜆∇𝑔(𝒙) gives the following equations:
2 = 2𝜆𝑥
4 = 2𝜆𝑦

These two equations force 𝑥, 𝑦, 𝜆 ≠ 0. Thus we get 𝑦 = 2𝑥. But
then the equation 𝑥² + 𝑦² = 0 implies that 𝑥 = 0 and 𝑦 = 0, a contradiction.
So, what goes wrong here?

Observe that ∇𝑔(𝒙) = (2𝑥, 2𝑦) = 𝟎 if and only if 𝑥 = 0 and 𝑦 = 0.
Since {(𝑥, 𝑦) ∈ ℝ² : 𝑔(𝑥, 𝑦) = 𝑥² + 𝑦² = 0} = {(0, 0)}, ∇𝑔(𝒙) = 𝟎 on
the set on which 𝑓 is to be maximized. Therefore, we cannot apply the
method of Lagrange multipliers in this problem. However, the constraint
set consists of the single point (0, 0), so the maximum value taken by 𝑓
on this set is 𝑓(0, 0) = 0.

Example 8.23
Consider the problem of maximizing 𝑓(𝑥, 𝑦) = 𝑥 + 𝑦 subject to 𝑥𝑦 = 16.
In this problem 𝑔(𝑥, 𝑦) = 𝑥𝑦 − 16. Clearly 𝑓 and 𝑔 are continuously
differentiable on ℝ². Also, ∇𝑔(𝒙) = (𝑦, 𝑥) ≠ 𝟎 for all points (𝑥, 𝑦)
satisfying 𝑥𝑦 = 16.
By setting ∇𝑓(𝒙) = 𝜆∇𝑔(𝒙) we get the following equations:

1 = 𝜆𝑦
1 = 𝜆𝑥

It is easy to see that 𝑥, 𝑦, 𝜆 ≠ 0. Dividing both sides of each of the
equations by 𝜆 gives 𝑥 = 𝑦. Now from the equation 𝑥𝑦 = 16, we
identify the two points (4, 4) and (−4, −4) as possible candidates at
which 𝑓 is expected to take on extreme values. Since 𝑓(−4, −4) = −8
and 𝑓(4, 4) = 8 and 8 > −8, one might be tempted to conclude at this
point that 𝑓 has a maximum at (4, 4) and that the maximum value of 𝑓 is 8.
This conclusion is, however, wrong.

Let 𝑀 > 0 be arbitrary. Put 𝑥 = 2𝑀 and 𝑦 = 16/(2𝑀). Then 𝑥 ⋅ 𝑦 = 16.
Notice that 𝑓(2𝑀, 16/(2𝑀)) = 2𝑀 + 16/(2𝑀) > 𝑀. Since 𝑀 > 0 is arbitrarily
chosen, it follows that 𝑓 can be made as large as you please
without violating the condition 𝑥𝑦 = 16.
So, what are these two points then? Think!
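
The unboundedness argument is easy to try out numerically. The sketch below evaluates 𝑓 at the feasible points (2𝑀, 16/(2𝑀)) used in the text for a few illustrative values of 𝑀; the helper name `feasible_point` is made up for this example.

```python
def f(x, y):
    return x + y

def feasible_point(M):
    # The point (2M, 16/(2M)) from the text; it satisfies xy = 16.
    return (2 * M, 16 / (2 * M))

results = []
for M in [10, 1_000, 1_000_000]:
    x, y = feasible_point(M)
    on_constraint = abs(x * y - 16) < 1e-9
    results.append((M, on_constraint, f(x, y) > M))
    print(M, on_constraint, f(x, y))

# f already exceeds the candidate value f(4, 4) = 8 at the first point.
beats_candidate = f(*feasible_point(10)) > f(4, 4)
print(beats_candidate)
```

Each sampled point stays on the curve 𝑥𝑦 = 16 while 𝑓 grows without bound, so no maximum exists.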

Remark

Notice that the condition ∇𝑓(𝒙) = 𝜆∇𝑔(𝒙) is only a necessary
condition for a function 𝑓 to have an extremum at 𝒙 subject to 𝑔(𝒙) = 0. It
does not guarantee the existence of an optimal solution to the problem!

Solutions of Activities

Activity 8.1

1. Clearly ℝ² is open and convex. It is easy to see that 𝑓 has continuous
second-order partial derivatives at each point 𝒙 ∈ ℝ². Now let 𝒙 ∈ ℝ².
Notice that

\[ Hf(\mathbf{x}) = \begin{bmatrix} 2 & 2 \\ 2 & 2 \end{bmatrix}. \]

There are two first-order principal minors of 𝐻𝑓(𝒙) and both of them are equal to 2.
There is only one second-order principal minor of 𝐻𝑓(𝒙) and it is equal to 0. Since
all the principal minors are nonnegative and 𝒙 ∈ ℝ² is arbitrarily chosen, 𝑓 is
convex on ℝ².

2. Observe that 𝑓 has continuous second-order partial derivatives at each point
𝒙 ∈ ℝ². Let 𝒙 ∈ ℝ². Then

\[ Hf(\mathbf{x}) = \begin{bmatrix} -2 & -1 \\ -1 & -4 \end{bmatrix}. \]

The two first-order principal minors are −2 and −4. The second-order principal minor
of 𝐻𝑓(𝒙) is 7. Since both first-order principal minors are negative and the
second-order principal minor is positive, 𝑓 is concave on ℝ².

3. Let 𝒙 ∈ ℝ². Notice that

\[ Hf(\mathbf{x}) = \begin{bmatrix} 2 & -3 \\ -3 & 4 \end{bmatrix}. \]

The two first-order principal minors of 𝐻𝑓(𝒙) are 2 and 4. The second-order principal
minor of 𝐻𝑓(𝒙) is −1. Since the first-order principal minors of 𝐻𝑓(𝒙) are positive,
𝑓 is not concave. Since the second-order principal minor of 𝐻𝑓(𝒙) is negative, 𝑓 is
not convex.

4. Let 𝒙 ∈ ℝ³. Notice that

\[ Hf(\mathbf{x}) = \begin{bmatrix} 2 & -1 & -1 \\ -1 & 2 & -1 \\ -1 & -1 & 4 \end{bmatrix}. \]

The three first-order principal minors of 𝐻𝑓(𝒙) are 2, 2, 4. The three second-order
principal minors of 𝐻𝑓(𝒙) are 7, 7, and 3. The only third-order principal minor of
𝐻𝑓(𝒙) is 6. Since all the principal minors of 𝐻𝑓(𝒙) are nonnegative, it follows from
Theorem 8.3 that 𝑓 is convex on ℝ³.

5. Let 𝒙 ∈ ℝ². Notice that

\[ Hf(\mathbf{x}) = \begin{bmatrix} 6x_1 & 2 \\ 2 & 2 \end{bmatrix}. \]

In this problem 𝑥1 appears in the Hessian. Notice that the two first-order principal
minors of 𝐻𝑓(𝒙) are 6𝑥1 and 2. On the other hand, the second-order principal minor
of 𝐻𝑓(𝒙) is 12𝑥1 − 4. Since 2 > 0, 𝑓 is not concave. Now 6𝑥1 ≥ 0 if and only if
𝑥1 ≥ 0, and 12𝑥1 − 4 ≥ 0 if and only if 𝑥1 ≥ 1/3. Therefore, all the principal minors
of 𝐻𝑓(𝒙) are nonnegative if and only if 𝑥1 ≥ 1/3.

It follows that 𝑓 is convex on the open set {(𝑥1, 𝑥2) ∈ ℝ² : 𝑥1 > 1/3}.
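
The principal minors listed in these solutions can be recomputed mechanically. The sketch below enumerates every principal minor of a small symmetric matrix, using the Hessian from item 4 as input; `det` and `principal_minors` are hypothetical helpers written for this illustration, and the cofactor expansion used is only sensible for small matrices.

```python
from fractions import Fraction
from itertools import combinations

def det(M):
    """Determinant by cofactor expansion along the first row."""
    if len(M) == 1:
        return Fraction(M[0][0])
    total = Fraction(0)
    for j, entry in enumerate(M[0]):
        # Delete row 0 and column j to form the minor.
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * Fraction(entry) * det(minor)
    return total

def principal_minors(H):
    """Return {k: list of all k-th order principal minors of H}."""
    n = len(H)
    result = {}
    for k in range(1, n + 1):
        # Keep the same index set for rows and columns.
        result[k] = [
            det([[H[i][j] for j in idx] for i in idx])
            for idx in combinations(range(n), k)
        ]
    return result

# Hessian from item 4 of Activity 8.1.
H = [[2, -1, -1],
     [-1, 2, -1],
     [-1, -1, 4]]
minors = principal_minors(H)
for order, values in minors.items():
    print(order, [int(v) for v in values])

all_nonnegative = all(v >= 0 for vs in minors.values() for v in vs)
print(all_nonnegative)
```

All seven principal minors are nonnegative, in agreement with the convexity conclusion of item 4.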

Activity 8.2

1. Let 𝑎 ∈ ℝ. Then the matrix (𝑎) is positive definite if and only if
𝑥 ⋅ 𝑎 ⋅ 𝑥 = 𝑎𝑥² > 0 for each 𝑥 ∈ ℝ ∖ {0} if and only if 𝑎 > 0.

Let 𝑎 ∈ ℝ. Then the matrix (𝑎) is positive semidefinite if and only if
𝑥 ⋅ 𝑎 ⋅ 𝑥 = 𝑎𝑥² ≥ 0 for each 𝑥 ∈ ℝ if and only if 𝑎 ≥ 0.

2. Let

\[ A = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} \]

and let (𝑥, 𝑦) ∈ ℝ².

Then (𝑥 𝑦) 𝐴 (𝑥 𝑦)ᵀ = 𝑥² + 2𝑥𝑦 + 𝑦² = (𝑥 + 𝑦)².

Clearly for each 𝑥, 𝑦 ∈ ℝ, (𝑥 + 𝑦)² ≥ 0. Therefore 𝐴 is positive semidefinite.

However, (1, −1) ≠ (0, 0) and (1 + (−1))² = 0. Therefore 𝐴 is not positive
definite.

On the other hand, the eigenvalues of 𝐴 are given by the equation (1 − 𝜆)² − 1 = 0,
or equivalently by the equation −𝜆(2 − 𝜆) = 0. So, the eigenvalues of 𝐴 are 2
and 0. Since all the eigenvalues are nonnegative, 𝐴 is positive semidefinite.
But since 0 is an eigenvalue of 𝐴, 𝐴 is not positive definite.

3. Notice that the eigenvalues of 𝐴 are −3 and 5. Since −3 < 0, 𝐴 is not positive
definite.


4. Let

\[ A = \begin{pmatrix} -1 & 1 \\ 1 & -4 \end{pmatrix}. \]

Then the eigenvalues of 𝐴 are given by the equation
(−1 − 𝜆)(−4 − 𝜆) − 1 = 0.
So, the eigenvalues of 𝐴 are (−5 + √13)/2 and (−5 − √13)/2. Since both
eigenvalues of 𝐴 are negative, it follows that 𝐴 is negative definite (ND).

5. Notice that the eigenvalues of 𝐴 are 𝑎, 𝑏, and 𝑐.

Therefore,
𝐴 is PD if and only if 𝑎, 𝑏, 𝑐 ∈ (0, ∞)

𝐴 is PSD if and only if 𝑎, 𝑏, 𝑐 ∈ [0, ∞)

𝐴 is ND if and only if 𝑎, 𝑏, 𝑐 ∈ (−∞, 0)

𝐴 is NSD if and only if 𝑎, 𝑏, 𝑐 ∈ (−∞, 0]

𝐴 is ID if and only if 𝑎 = 0 and 𝑏𝑐 < 0, or 𝑏 = 0 and 𝑎𝑐 < 0, or 𝑐 = 0 and
𝑎𝑏 < 0, or 𝑎𝑏𝑐 < 0 and {𝑎, 𝑏, 𝑐} ⊈ (−∞, 0), or 𝑎𝑏𝑐 > 0 and {𝑎, 𝑏, 𝑐} ⊈ (0, ∞).
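
The eigenvalue tests used in these solutions can be sketched in a few lines for 2 × 2 symmetric matrices, whose eigenvalues follow in closed form from the characteristic equation. The function names below are hypothetical, and the labels PD, PSD, ND, NSD, ID follow the abbreviations used in item 5.

```python
import math

def symmetric_2x2_eigenvalues(a, b, d):
    """Eigenvalues of [[a, b], [b, d]] from the characteristic equation."""
    mean = (a + d) / 2
    radius = math.sqrt(((a - d) / 2) ** 2 + b ** 2)
    return (mean - radius, mean + radius)

def classify(eigenvalues):
    """Classify a symmetric matrix by the signs of its eigenvalues.

    The strict cases are tested first; the zero matrix is reported as PSD
    here although it is also NSD.
    """
    if all(v > 0 for v in eigenvalues):
        return "PD"
    if all(v < 0 for v in eigenvalues):
        return "ND"
    if all(v >= 0 for v in eigenvalues):
        return "PSD"
    if all(v <= 0 for v in eigenvalues):
        return "NSD"
    return "ID"

item2 = symmetric_2x2_eigenvalues(1, 1, 1)    # matrix of item 2
item4 = symmetric_2x2_eigenvalues(-1, 1, -4)  # matrix of item 4
print(item2, classify(item2))
print(item4, classify(item4))
print(classify((-3, 5)))  # eigenvalues quoted in item 3
```

Floating-point eigenvalues that are mathematically zero may come out as tiny nonzero numbers for other inputs, so a tolerance would be needed in more serious use.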

Activity 8.3

1.
i) (0, 0) − 𝑓 has no extremum at (0, 0).
(0, 1) − 𝑓 has no extremum at (0, 1).
(0, −1) − 𝑓 has no extremum at (0, −1).
(1, 0) − 𝑓 has no extremum at (1, 0).
(2/5, 1/√5) − 𝑓 has a local minimum at (2/5, 1/√5).
(2/5, −1/√5) − 𝑓 has a local maximum at (2/5, −1/√5).

ii) (0, 0) − 𝑓 has no extremum at (0, 0).
(3/2, 3/2) − 𝑓 has a local minimum at (3/2, 3/2).
(3/2, −3/2) − 𝑓 has a local minimum at (3/2, −3/2).

iii) (0, 0, 0) − (0, 0, 0) is a saddle point.


2.
i) (0, 0, 0) − 𝑓 has a strict local maximum at (0, 0, 0).
ii) Critical points of 𝑓 are of the form (𝑎, 𝑎), where 𝑎 ∈ ℝ. At each critical point (𝑎, 𝑎),
𝑎 ∈ ℝ, 𝑓 has a global minimum.
iii) (0, 0, 0) − 𝑓 has a strict local minimum at (0, 0, 0).
iv) 𝑓 has no critical points in ℝ2 .

3.
It is easy to see that (0, 0) is a critical point of both 𝑓 and 𝑔.
Since 𝑔(𝑥1, 𝑥2) = 𝑥1² + 𝑥2⁴ ≥ 0 for each (𝑥1, 𝑥2) ∈ ℝ² and 𝑔(0, 0) = 0, it follows
that 𝑔 has a local (actually, global) minimum at (0, 0).

Now let 𝛿 > 0. Notice that (𝛿/2, 0) ∈ 𝐵(𝟎, 𝛿) and 𝑓(𝛿/2, 0) = 𝛿²/4 > 0 = 𝑓(0, 0).

Hence 𝑓 has no local maximum at (0, 0).

Also, (0, −𝛿/2) ∈ 𝐵(𝟎, 𝛿) and 𝑓(0, −𝛿/2) = −𝛿³/8 < 0 = 𝑓(0, 0).

Hence 𝑓 has no local minimum at (0, 0). Therefore, 𝑓 has no extremum at (0, 0).

Summary

• If a function 𝑓 possesses all the first-order partial derivatives at an
interior point 𝒙𝟎 of the domain of 𝑓, then a necessary condition for 𝑓 to
have a local extremum at 𝒙𝟎 is that each of the first-order partial
derivatives of 𝑓 at 𝒙𝟎 be equal to 0.

• A point 𝒙 in the domain of a function 𝑓 is defined as a critical point of 𝑓
if either each of the first-order partial derivatives of 𝑓 evaluated at 𝒙 is
equal to zero or at least one of the partial derivatives does not exist at 𝒙.
