Session 8
Applications of Differentiation of
several variables
Contents:
Introduction
8.1 Maximum and Minimum Values
8.2 Convex and Concave Functions
8.3 Classifying Quadratic Forms
8.4 Unconstrained Maximization and Minimization
8.5 Lagrange Multipliers
Solutions of Activities
Summary
Learning Outcomes
Introduction
There are two types of optimization problems. The first type is unconstrained
optimization. In this case, there are no constraints imposed on the domain of
the function. The second type is constrained optimization. Here, several
constraints are imposed on the domain of the function and the goal is then to
find extreme values of the function subject to these constraints. We use the
so-called method of Lagrange multipliers to find extreme values in such
cases. In this session we deal with both types of optimization problems.
Definition
Let $f$ be a real-valued function defined on some $D \subseteq \mathbb{R}^n$ and let $\mathbf{x}_0 \in D$.
i. The function $f$ is said to have a global maximum (or absolute
maximum) at $\mathbf{x}_0$ if $f(\mathbf{x}_0) \ge f(\mathbf{x})$ for each $\mathbf{x} \in D$. The number
$f(\mathbf{x}_0)$ is called the maximum value of $f$ on $D$.
Note
Theorem 8.1
1, 2, … , 𝑛.
Remark
Sometimes it is possible that one (or perhaps more than one) of these partial
derivatives does not exist and yet 𝑓 has an extremum at 𝒙𝟎 (compare with
|𝑥| defined on the interval [−1, 1] ).
Figure 8.1: |𝑥| is not differentiable at 0. But, |𝑥| has a strict global
minimum at 0.
We can use this theorem to locate points, if any, in the interior of the
domain of a function 𝑓 at which 𝑓 has a local extremum provided that all
the first-order partial derivatives of 𝑓 exist at those points (compare this
with the analogous result in one variable case).
Example 8.1
Let $f(x, y) = x^2 - y^2$.
Notice that
\[ \frac{\partial f}{\partial x}(0, 0) = 0 \quad\text{and}\quad \frac{\partial f}{\partial y}(0, 0) = 0. \]
Let us show that 𝑓 has neither a local maximum nor a local minimum at
(0, 0). To this end let 𝛿 > 0.
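Put $\mathbf{x} = (\delta/2, 0)$ and $\mathbf{y} = (0, \delta/2)$. Then $\mathbf{x}, \mathbf{y} \in B(\mathbf{0}, \delta)$.
Notice that $f(\mathbf{x}) = (\delta/2)^2 - 0 = \delta^2/4 > 0 = f(0, 0)$ and hence $f$ does not
have a local maximum at $(0, 0)$. Similarly, $f(\mathbf{y}) = 0 - (\delta/2)^2 = -\delta^2/4 < 0 = f(0, 0)$
and hence $f$ does not have a local minimum at $(0, 0)$.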
Note
derivatives does not exist at 𝒙. Thus, any point at which 𝑓 has a local
extremum is a critical point of 𝑓. However, as in the single variable
calculus, not all critical points give rise to extrema. At a critical point, a
function could have a local maximum or a local minimum or neither (cf.
Example 8.1).
Definition
A critical point 𝒙𝟎 is called a saddle point if for each 𝜖 > 0, there exist
𝒙, 𝒚 ∈ 𝐵(𝒙𝟎 , 𝜖) such that 𝑓(𝒙) > 𝑓(𝒙𝟎 ) and 𝑓(𝒚) < 𝑓(𝒙𝟎 ).
In Example 8.1, the point (0, 0) is a saddle point.
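The saddle-point definition can be illustrated numerically for Example 8.1. The following is only a sketch, not a proof: for a few sample radii δ it evaluates the function at the two points (δ/2, 0) and (0, δ/2) used in the example. The helper name `witnesses` is an illustrative choice, not something from the text.

```python
import math

def f(x, y):
    """The function of Example 8.1."""
    return x**2 - y**2

def witnesses(delta):
    """Return two points of the open ball B(0, delta): one where f exceeds
    f(0, 0) and one where f is smaller than f(0, 0)."""
    higher = (delta / 2, 0.0)   # f = delta^2 / 4 > 0
    lower = (0.0, delta / 2)    # f = -delta^2 / 4 < 0
    return higher, lower

for delta in (1.0, 0.1, 0.001):
    p, q = witnesses(delta)
    # both points lie strictly inside the ball of radius delta
    assert math.hypot(*p) < delta and math.hypot(*q) < delta
    print(delta, f(*p) > f(0, 0), f(*q) < f(0, 0))
```

However small the radius is chosen, both comparisons hold, which is what the definition of a saddle point asks for.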
Theorem 8.2
Note
Figure 8.3
Figure 8.4
Example 8.2
Find the critical points of the following functions and determine the nature of
each critical point.
a. $f(x, y) = x^2 + y^2 + 2x - 4y + 3$
b. $f(x, y) = y^2 - x^2$
Solution
<0
Thus (0, 0) is a saddle point of 𝑓.
Definition
Let 𝑓(𝑥1 , … , 𝑥𝑛 ) be a function defined on a convex set 𝑆 ⊆ ℝ𝑛 .
Example 8.2
Solution:
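\[
\begin{aligned}
\text{L.H.S.} &\le \lambda {x_1'}^2 + (1 - \lambda) {x_1''}^2 + \lambda {x_2'}^2 + (1 - \lambda) {x_2''}^2 \\
&= \lambda \left({x_1'}^2 + {x_2'}^2\right) + (1 - \lambda) \left({x_1''}^2 + {x_2''}^2\right) \\
&= \lambda f(x') + (1 - \lambda) f(x'').
\end{aligned}
\]
$\therefore\ f(\lambda x' + (1 - \lambda) x'') \le \lambda f(x') + (1 - \lambda) f(x'')$.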
Definition
Remark
Example 8.3
$f(x) = x^2$ and $f(x) = e^x$ are convex functions on $\mathbb{R}$.
A linear function is both a convex and a concave function.
The sum of two convex (concave) functions is convex (concave).
If $f(x_1, \dots, x_n)$ is a convex function on a convex set $S$, then for $c \ge 0$, $cf(x_1, \dots, x_n)$ is a convex function on $S$. If $c \le 0$, $cf(x_1, \dots, x_n)$ is a concave function on $S$.
$\sin(x + y)$ and $\cos(x + y)$ are neither convex nor concave on $\mathbb{R}^2$.
$-x^2 - y^2$ is concave on $\mathbb{R}^2$.
$f(x, y) = x^2 y^2$ is neither convex nor concave on $\mathbb{R}^2$.
$f(x, y) = (x + y)^2$ is convex on $\mathbb{R}^2$.
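The claim that x²y² is neither convex nor concave can be checked directly from the defining inequality. The sketch below uses two hand-picked pairs of points (chosen here for illustration, not taken from the text) and the hypothetical helper `chord_gap`, which measures how far the chord lies above the function at an intermediate point.

```python
def f(x, y):
    return x**2 * y**2

def chord_gap(func, p, q, lam=0.5):
    """Return lam*func(p) + (1-lam)*func(q) - func(lam*p + (1-lam)*q).
    Convexity requires this to be >= 0 for all p, q and lam in [0, 1];
    concavity requires it to be <= 0."""
    mx = lam * p[0] + (1 - lam) * q[0]
    my = lam * p[1] + (1 - lam) * q[1]
    return lam * func(*p) + (1 - lam) * func(*q) - func(mx, my)

# Negative gap: the convexity inequality fails for this pair of points.
print(chord_gap(f, (1.0, 0.0), (0.0, 1.0)))
# Positive gap: the concavity inequality fails for this pair of points.
print(chord_gap(f, (1.0, 1.0), (-1.0, -1.0)))
```

A single violating pair is enough to rule out convexity (or concavity); no finite set of sample points could confirm either property.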
Figure 8.7
Definition
An $i$th principal minor of an $n \times n$ matrix is the determinant of any $i \times i$
matrix obtained by deleting $n - i$ rows and the corresponding $n - i$
columns of the matrix.
Example 8.4
Find all the 1st, 2nd and 3rd principal minors of the matrix
\[ \begin{pmatrix} 1 & 1 & 2 \\ 1 & 1 & 3 \\ 2 & 3 & 2 \end{pmatrix}. \]
Solution
Notice that there are three first principal minors, three second principal
minors and one third principal minor. Let us denote the 𝑖 th row of the
matrix by 𝑅𝑖 and the 𝑖 th column of the matrix by 𝐶𝑖 . Then, the three first
principal minors of the matrix are obtained by deleting:
i. 𝑅2 , 𝑅3 , 𝐶2 , 𝐶3
ii. 𝑅1 , 𝑅3 , 𝐶1 , 𝐶3 and
iii. 𝑅1 , 𝑅2 , 𝐶1 , 𝐶2 .
The three second principal minors of the matrix are obtained by
deleting:
i. 𝑅1 , 𝐶1
ii. 𝑅2 , 𝐶2 and
iii. 𝑅3 , 𝐶3 .
The third principal minor of the matrix is just the determinant of the matrix.
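The enumeration in Example 8.4 can be sketched in code. This is one possible approach, assuming a plain list-of-lists matrix and a cofactor-expansion determinant that is adequate only for small matrices; the function names `det` and `principal_minors` are illustrative.

```python
from itertools import combinations

def det(m):
    """Determinant by cofactor expansion along the first row."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for j in range(n):
        # delete row 0 and column j
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total

def principal_minors(m, order):
    """All principal minors of the given order: keep the same index set
    for rows and columns, i.e. delete the complementary rows and columns."""
    n = len(m)
    result = []
    for keep in combinations(range(n), order):
        sub = [[m[i][j] for j in keep] for i in keep]
        result.append(det(sub))
    return result

A = [[1, 1, 2],
     [1, 1, 3],
     [2, 3, 2]]

for k in (1, 2, 3):
    print(k, principal_minors(A, k))
```

For this matrix the first principal minors are 1, 1, 2, the second principal minors are 0, −2, −7 (listed by the rows and columns kept, so in the reverse of the deletion order used in the solution), and the third principal minor is −1.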
Definition
The $k$th leading principal minor of an $n \times n$ matrix is the determinant of the
$k \times k$ matrix obtained by deleting the last $n - k$ rows and columns of the
matrix.
Definition
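Let $f(x_1, \dots, x_n)$ be a function defined on an open subset $U$ of $\mathbb{R}^n$ and let
$\mathbf{a} \in U$. Suppose $f$ has first-order partial derivatives on $U$ with respect to
each variable $x_i$ and that each second-order partial derivative of $f$ exists at $\mathbf{a}$.
Then the Hessian of $f$ at $\mathbf{a}$, denoted by $Hf(\mathbf{a})$, is the $n \times n$ matrix
\[
Hf(\mathbf{a}) = \bigl(h_{ij}(\mathbf{a})\bigr) = \left(\frac{\partial^2 f}{\partial x_i \partial x_j}(\mathbf{a})\right)
= \begin{pmatrix}
\dfrac{\partial^2 f}{\partial x_1 \partial x_1}(\mathbf{a}) & \dfrac{\partial^2 f}{\partial x_1 \partial x_2}(\mathbf{a}) & \cdots & \dfrac{\partial^2 f}{\partial x_1 \partial x_n}(\mathbf{a}) \\
\dfrac{\partial^2 f}{\partial x_2 \partial x_1}(\mathbf{a}) & \dfrac{\partial^2 f}{\partial x_2 \partial x_2}(\mathbf{a}) & \cdots & \dfrac{\partial^2 f}{\partial x_2 \partial x_n}(\mathbf{a}) \\
\vdots & \vdots & \ddots & \vdots \\
\dfrac{\partial^2 f}{\partial x_n \partial x_1}(\mathbf{a}) & \dfrac{\partial^2 f}{\partial x_n \partial x_2}(\mathbf{a}) & \cdots & \dfrac{\partial^2 f}{\partial x_n \partial x_n}(\mathbf{a})
\end{pmatrix}.
\]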
Theorem 8.3
Theorem 8.4
Activity 8.1
3. Show that $f(x_1, x_2) = x_1^2 - 3x_1x_2 + 2x_2^2$ is neither convex nor concave on $\mathbb{R}^2$.
5. What about the function $f(x_1, x_2)$ defined on $\mathbb{R}^2$ by $f(x_1, x_2) = x_1^3 + 2x_1x_2 + x_2^2$?
At this point we would like to introduce you to one of the most compelling
areas in linear algebra, known as quadratic forms, because of its
extensive use in optimization problems that arise in engineering. You can
find more details on this topic in any linear algebra text.
Definition
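A function $Q: \mathbb{R}^n \to \mathbb{R}$ is said to be a quadratic form on $\mathbb{R}^n$ if there exists
an $n \times n$ symmetric matrix $A$ such that $Q(\mathbf{x}) = \mathbf{x} A \mathbf{x}^T$ for each $\mathbf{x} \in \mathbb{R}^n$. The
matrix $A$ is called the matrix of the quadratic form.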
Example 8.5
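Let $\mathbf{x} = \begin{bmatrix} x_1 & x_2 \end{bmatrix}$. Compute $\mathbf{x} A \mathbf{x}^T$ for the matrices
\[ A_1 = \begin{bmatrix} 3 & 0 \\ 0 & 5 \end{bmatrix}, \quad A_2 = \begin{bmatrix} -4 & 1 \\ 1 & 2 \end{bmatrix}. \]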
Solution
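When $A_1 = \begin{bmatrix} 3 & 0 \\ 0 & 5 \end{bmatrix}$,
\[
\begin{aligned}
\mathbf{x} A_1 \mathbf{x}^T &= \begin{bmatrix} x_1 & x_2 \end{bmatrix} \begin{bmatrix} 3 & 0 \\ 0 & 5 \end{bmatrix} \begin{bmatrix} x_1 & x_2 \end{bmatrix}^T \\
&= \begin{bmatrix} 3x_1 & 5x_2 \end{bmatrix} \begin{bmatrix} x_1 & x_2 \end{bmatrix}^T \\
&= 3x_1^2 + 5x_2^2.
\end{aligned}
\]
When $A_2 = \begin{bmatrix} -4 & 1 \\ 1 & 2 \end{bmatrix}$,
\[
\begin{aligned}
\mathbf{x} A_2 \mathbf{x}^T &= \begin{bmatrix} x_1 & x_2 \end{bmatrix} \begin{bmatrix} -4 & 1 \\ 1 & 2 \end{bmatrix} \begin{bmatrix} x_1 & x_2 \end{bmatrix}^T \\
&= \begin{bmatrix} -4x_1 + x_2 & x_1 + 2x_2 \end{bmatrix} \begin{bmatrix} x_1 & x_2 \end{bmatrix}^T \\
&= -4x_1^2 + 2x_1x_2 + 2x_2^2.
\end{aligned}
\]
Thus, $A_1$ and $A_2$ are the matrices of the quadratic forms $3x_1^2 + 5x_2^2$ and $-4x_1^2 + 2x_1x_2 + 2x_2^2$, respectively.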
Example 8.6
For $\mathbf{x} \in \mathbb{R}^3$, define $Q(\mathbf{x}) = 7x_1^2 + 5x_2^2 + 3x_3^2 - 2x_1x_2 + 6x_2x_3$. Find a
symmetric matrix $A$ such that $Q(\mathbf{x}) = \mathbf{x} A \mathbf{x}^T$ for each $\mathbf{x} \in \mathbb{R}^3$.
Solution
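Let $A = \begin{bmatrix} 7 & -1 & 0 \\ -1 & 5 & 3 \\ 0 & 3 & 3 \end{bmatrix}$. Then for each $\mathbf{x} = (x_1, x_2, x_3) \in \mathbb{R}^3$,
\[
\begin{aligned}
\begin{bmatrix} x_1 & x_2 & x_3 \end{bmatrix} \begin{bmatrix} 7 & -1 & 0 \\ -1 & 5 & 3 \\ 0 & 3 & 3 \end{bmatrix} \begin{bmatrix} x_1 & x_2 & x_3 \end{bmatrix}^T
&= \begin{bmatrix} 7x_1 - x_2 & -x_1 + 5x_2 + 3x_3 & 3x_2 + 3x_3 \end{bmatrix} \begin{bmatrix} x_1 & x_2 & x_3 \end{bmatrix}^T \\
&= 7x_1^2 - 2x_1x_2 + 5x_2^2 + 6x_2x_3 + 3x_3^2 = Q(\mathbf{x}).
\end{aligned}
\]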
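The rule behind Example 8.6 (a squared term's coefficient goes on the diagonal, a cross term's coefficient is split evenly between the two symmetric off-diagonal entries) can be sketched as follows. The function names and the dictionary layout for the coefficients are assumptions made for this illustration; exact fractions are used so that halving a coefficient never introduces rounding.

```python
from fractions import Fraction

def symmetric_matrix(n, coeffs):
    """Build the symmetric matrix of a quadratic form in n variables.
    coeffs maps 0-based index pairs (i, j), i <= j, to the coefficient of x_i * x_j."""
    A = [[Fraction(0)] * n for _ in range(n)]
    for (i, j), c in coeffs.items():
        if i == j:
            A[i][i] = Fraction(c)
        else:
            # split the cross-term coefficient evenly between a_ij and a_ji
            A[i][j] = A[j][i] = Fraction(c, 2)
    return A

def quadratic_form(A, x):
    """Evaluate x A x^T for a row vector x."""
    n = len(x)
    return sum(x[i] * A[i][j] * x[j] for i in range(n) for j in range(n))

# Q(x) = 7 x1^2 + 5 x2^2 + 3 x3^2 - 2 x1 x2 + 6 x2 x3  (Example 8.6)
coeffs = {(0, 0): 7, (1, 1): 5, (2, 2): 3, (0, 1): -2, (1, 2): 6}
A = symmetric_matrix(3, coeffs)
print([[int(a) for a in row] for row in A])

x = (1, 2, -1)  # an arbitrary test vector
direct = 7*x[0]**2 + 5*x[1]**2 + 3*x[2]**2 - 2*x[0]*x[1] + 6*x[1]*x[2]
print(quadratic_form(A, x), direct)
```

The matrix produced agrees with the one in the solution, and evaluating the form through the matrix gives the same value as evaluating the polynomial directly.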
Example 8.7
Definition
A quadratic form 𝑄 on ℝ𝑛 is said to be
1. positive definite if 𝑄(𝒙) > 0 for all 𝒙 ≠ 𝟎,
2. positive semidefinite if 𝑄(𝒙) ≥ 0 for all 𝒙 ∈ ℝ𝑛 ,
3. negative definite if 𝑄(𝒙) < 0 for all 𝒙 ≠ 𝟎,
4. negative semidefinite if 𝑄(𝒙) ≤ 0 for all 𝒙 ∈ ℝ𝑛 ,
5. indefinite if there exist 𝒙, 𝒚 ∈ ℝ𝑛 such that 𝑄(𝒙) > 0 and 𝑄(𝒚) < 0.
Theorem 8.5
Example 8.8
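Classify the quadratic form $Q(\mathbf{x}) = 3x_1^2 - 2x_2^2 + 2x_3^2 - 2x_1x_3$.
Solution
Notice that $Q(\mathbf{x}) = \begin{bmatrix} x_1 & x_2 & x_3 \end{bmatrix} A \begin{bmatrix} x_1 & x_2 & x_3 \end{bmatrix}^T$, where
\[ A = \begin{bmatrix} 3 & 0 & -1 \\ 0 & -2 & 0 \\ -1 & 0 & 2 \end{bmatrix}. \]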
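Now let us find the eigenvalues of $A$. The eigenvalues of $A$ are given by the
equation $|A - \lambda I| = 0$. Expanding the determinant along the second row gives
$(-2 - \lambda)\bigl[(3 - \lambda)(2 - \lambda) - 1\bigr] = (-2 - \lambda)(\lambda^2 - 5\lambda + 5) = 0$,
so the eigenvalues are $-2$ and $\frac{5 \pm \sqrt{5}}{2}$.
Since the eigenvalues of $A$ are both positive and negative, the
quadratic form $Q$ is indefinite.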
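Indefiniteness can also be confirmed straight from the definition, without eigenvalues, by exhibiting one point where the form is positive and one where it is negative. The two test points below are chosen for this sketch and are not taken from the text.

```python
def Q(x1, x2, x3):
    """The quadratic form of Example 8.8."""
    return 3*x1**2 - 2*x2**2 + 2*x3**2 - 2*x1*x3

positive_witness = (1, 0, 0)   # Q = 3 > 0
negative_witness = (0, 1, 0)   # Q = -2 < 0

print(Q(*positive_witness), Q(*negative_witness))
```

Since the form takes both a positive and a negative value, it is indefinite in the sense of item 5 of the definition.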
Definition
An $n \times n$ symmetric matrix $A$ is said to be
1. Positive definite (PD) if the quadratic form $\mathbf{x} A \mathbf{x}^T$ is positive definite.
2. Positive semidefinite (PSD) if the quadratic form $\mathbf{x} A \mathbf{x}^T$ is positive semidefinite.
3. Negative definite (ND) if the quadratic form $\mathbf{x} A \mathbf{x}^T$ is negative definite.
4. Negative semidefinite (NSD) if the quadratic form $\mathbf{x} A \mathbf{x}^T$ is negative semidefinite.
Remark
Activity 8.2
1. What are the positive definite and positive semidefinite matrices in $M_1$, the set of all $1 \times 1$ matrices?
2. Explain why $\begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}$ is positive semidefinite but not positive definite.
3. Is $A = \begin{pmatrix} 1 & 4 \\ 4 & 1 \end{pmatrix}$ positive definite? (Observe that all the entries of $A$ are positive.)
4. Determine whether the matrix $\begin{pmatrix} -1 & 1 \\ 1 & -4 \end{pmatrix}$ is PD, PSD, ND or NSD.
5. Consider the diagonal matrix $A = \begin{pmatrix} a & 0 & 0 \\ 0 & b & 0 \\ 0 & 0 & c \end{pmatrix}$. Determine the values of $a, b, c$
which make $A$,
Theorem 8.6
Example 8.9
Show that $A = \begin{pmatrix} 2 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 2 \end{pmatrix}$ is positive definite.
Solution
Notice that, the leading principal minors of 𝐴 are 2, 3, and 4. Since they
are all positive, 𝐴 is positive definite.
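The leading principal minors quoted in the solution can be recomputed with a few lines of code. This is a minimal sketch, assuming a list-of-lists matrix and a cofactor-expansion determinant suitable only for small sizes; the function names are illustrative.

```python
def det(m):
    """Determinant by cofactor expansion along the first row."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total

def leading_principal_minors(m):
    """Determinants of the top-left k x k blocks for k = 1, ..., n."""
    return [det([row[:k] for row in m[:k]]) for k in range(1, len(m) + 1)]

A = [[2, -1, 0],
     [-1, 2, -1],
     [0, -1, 2]]

minors = leading_principal_minors(A)
print(minors)                      # [2, 3, 4]
print(all(d > 0 for d in minors))  # True
```

All three values are positive, which is the condition the solution relies on to conclude that the matrix is positive definite.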
Theorem 8.7
Note
Observe that condition II applies to all the principal minors of $A$, not only to
the leading principal minors. Otherwise, it would not be possible to distinguish
between two matrices whose leading principal minors are all zero.
For example, $\begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}$ is positive semidefinite and $\begin{pmatrix} 0 & 0 \\ 0 & -1 \end{pmatrix}$ is negative
semidefinite. But all the leading principal minors of both matrices are equal
to zero.
Example 8.10
Show that $A = \begin{pmatrix} 2 & -1 & -1 \\ -1 & 2 & -1 \\ -1 & -1 & 2 \end{pmatrix}$ is positive semidefinite.
Solution
Notice that all the first principal minors of 𝐴 are equal to 2 and all the
second principal minors of 𝐴 are equal to 3. The determinant of 𝐴 is 0.
Since all the principal minors of 𝐴 are nonnegative, it follows that 𝐴 is
positive semidefinite.
Theorem 8.8
Theorem 8.9
Let $A$ be an $n \times n$ real symmetric matrix. Let $\{|A_{ki}| : 1 \le i \le \binom{n}{n-k}$,
In this section, all the functions are assumed to be defined on ℝ𝑛 . Also, let
us assume that they have continuous first and second order partial
derivatives on ℝ𝑛 .
Theorem 8.10
Theorem 8.11
Theorem 8.12
Example 8.11
Define $f: \mathbb{R}^2 \to \mathbb{R}$ by $f(x_1, x_2) = x_1^2 + x_2^2$. Notice that $(0, 0)$ is a
critical point of $f$ (indeed $(0, 0)$ is the only critical point of $f$).
The Hessian of $f$ at $\mathbf{x}$ is $Hf(\mathbf{x}) = \begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix}$. In particular, the Hessian of $f$
at $(0, 0)$ is also $\begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix}$. Clearly $\begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix}$ is positive definite. Thus, by
Theorem 8.10, $f$ has a strict local minimum at $(0, 0)$ (indeed $f$ has a
strict global minimum at $(0, 0)$). See Figure 8.8.
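When second-order partial derivatives are tedious to compute by hand, the Hessian can be approximated numerically. The sketch below uses central differences with a step size `h` chosen arbitrarily for illustration; it is an approximation technique added here, not the method of the text, and the function name `hessian` is hypothetical.

```python
def f(x1, x2):
    """The function of Example 8.11."""
    return x1**2 + x2**2

def hessian(func, point, h=1e-3):
    """Approximate the Hessian of func at point by central differences."""
    n = len(point)
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            def shifted(si, sj):
                # move coordinate i by si*h and coordinate j by sj*h
                y = list(point)
                y[i] += si * h
                y[j] += sj * h
                return func(*y)
            H[i][j] = (shifted(1, 1) - shifted(1, -1)
                       - shifted(-1, 1) + shifted(-1, -1)) / (4 * h * h)
    return H

H = hessian(f, (0.0, 0.0))
# For a symmetric 2 x 2 matrix: positive definite when h11 > 0 and det > 0.
det_H = H[0][0] * H[1][1] - H[0][1] * H[1][0]
print(H)
print(H[0][0] > 0 and det_H > 0)
```

Up to rounding, the approximation reproduces the matrix with 2 on the diagonal and 0 elsewhere, and the positivity check agrees with the conclusion of the example.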
Figure 8.8
Example 8.12
Define $f: \mathbb{R}^2 \to \mathbb{R}$ by $f(x_1, x_2) = 4x_1x_2$ for each $(x_1, x_2) \in \mathbb{R}^2$. Notice
that $Hf(0, 0) = \begin{pmatrix} 0 & 4 \\ 4 & 0 \end{pmatrix}$ and $|Hf(0, 0)| \ne 0$. Clearly $Hf(0, 0)$ is neither
positive definite nor negative definite. Therefore $(0, 0)$ is a saddle point.
We can obtain the same conclusion in the following way.
Let $\epsilon > 0$. Then $(\epsilon/2, \epsilon/2), (-\epsilon/2, \epsilon/2) \in B(\mathbf{0}, \epsilon)$. Notice that $f(\epsilon/2, \epsilon/2) = \epsilon^2 > 0$
and $f(-\epsilon/2, \epsilon/2) = -\epsilon^2 < 0$. Since $\epsilon > 0$ is arbitrary it follows from
Figure 8.9
Remark
Example 8.13
Define $f: \mathbb{R}^2 \to \mathbb{R}$ by $f(x_1, x_2) = x_1^4 - x_2^4$ for all $(x_1, x_2) \in \mathbb{R}^2$.
Then $\nabla f(x_1, x_2) = (4x_1^3, -4x_2^3)$, which yields the critical point $(0, 0)$.
Also, $Hf(x_1, x_2) = \begin{pmatrix} 12x_1^2 & 0 \\ 0 & -12x_2^2 \end{pmatrix}$.
Now, $Hf(0, 0)$ is the zero matrix, which is positive semidefinite (and also
negative semidefinite). Observe that $f$ has neither a local maximum nor
a local minimum at $(0, 0)$.
Example 8.14
Theorem 8.13
Activity 8.3
1. Find all points at which 𝑓 has local maxima, local minima, and saddle points
for,
iii. 𝑓(𝑥1 , 𝑥2 , 𝑥3 ) = 𝑥1 𝑥2 + 𝑥2 𝑥3 + 𝑥1 𝑥3 .
2. Determine the nature of the critical points (if any) of following functions.
3. Show that the functions $f(x_1, x_2) = x_1^2 + x_2^3$ and $g(x_1, x_2) = x_1^2 + x_2^4$ both have a
critical point at $(x_1, x_2) = (0, 0)$ and that their associated Hessians are
PSD. Show that $g$ has a local (global) minimum at $(0, 0)$ and that $f$ has no
extremum at $(0, 0)$.
Definition
Let $\emptyset \ne A \subseteq \mathbb{R}^n$ and let $f$ be a function defined on $A$. $f$ is said to have a
maximum on $A$ if there exists $\mathbf{a} \in A$ such that $f(\mathbf{a}) \ge f(\mathbf{x})$ for each $\mathbf{x} \in A$.
$f$ is said to have a minimum on $A$ if there exists $\mathbf{b} \in A$ such that $f(\mathbf{b}) \le f(\mathbf{x})$
for each $\mathbf{x} \in A$. The values $f(\mathbf{a})$ and $f(\mathbf{b})$ are called, respectively,
the maximum value of $f$ on $A$ and the minimum value of $f$ on $A$.
Example 8.15
The function $f: (0, 1) \to \mathbb{R}$ defined by $f(x) = x^2$ for each $x \in (0, 1)$ has
neither a maximum nor a minimum on $(0, 1)$.
Example 8.16
Define $f: \mathbb{R}^2 \to \mathbb{R}$ by $f(x_1, x_2) = x_1^2 + x_2^2$ for each $(x_1, x_2) \in \mathbb{R}^2$. Then
$f$ has no maximum on the open unit disc $\|\mathbf{x}\| < 1$. However, $f$ has a
minimum on the open unit disc and $f$ takes its minimum at $(0, 0)$.
Almost all the functions that we discussed in the previous section were
defined on ℝ𝑛 and are differentiable on ℝ𝑛 . So, the problem of finding
points at which a function has local extrema was easily solved by invoking
Theorem 8.1. Functions that we are going to consider in this section are also
differentiable. But, the domain will no longer be ℝ𝑛 . There are certain
situations in which we need to impose constraints on the domain of the
function. These constraints then restrict the domain of the function. Our goal
in this section is actually to optimize a given function 𝑓, i.e. to find
maximum or minimum values of 𝑓, on this kind of restricted domain. It is
Example 8.17
Find the minimum value of $f(x, y) = 4x^2 + 3xy + 6y^2$ subject to the
constraint $x + y = 56$.
Solution
Obviously, the function $f$ is defined for all $(x, y) \in \mathbb{R}^2$ and
differentiable on $\mathbb{R}^2$. But we need to investigate the function on the set
$\{(x, y) \in \mathbb{R}^2 : x + y = 56\}$. First let us find the critical points of $f$.
Notice that
$\nabla f = \mathbf{0}$ if and only if
$(8x + 3y, 3x + 12y) = \mathbf{0}$ if and only if
$8x + 3y = 0$ and $3x + 12y = 0$ if and only if
$x = y = 0$.
So, $f$ has only one critical point and it is $(0, 0)$. Clearly $(0, 0)$ does not
satisfy $x + y = 56$. In other words, $(0, 0) \notin \{(x, y) \in \mathbb{R}^2 : x + y = 56\}$,
the domain of interest.
Since $\Delta(0, 0) = f_{xx}(0, 0) f_{yy}(0, 0) - \bigl(f_{xy}(0, 0)\bigr)^2 = 8 \cdot 12 - 3^2 > 0$ and
$f_{xx}(0, 0) = 8 > 0$, the second derivative test implies that $f$ has a local
minimum at $(0, 0)$. By writing $f$ in the equivalent form
$f(x, y) = 3x^2 + \left(x + \frac{3y}{2}\right)^2 + \frac{15y^2}{4}$, we can see that $f(x, y) \ge 0 = f(0, 0)$ for all $(x, y) \in \mathbb{R}^2$.
Hence, not only does $f$ have a local minimum at $(0, 0)$, but it also has its
unique global minimum at $(0, 0)$.
As we pointed out earlier, our points $(x, y)$ should satisfy the condition
$x + y = 56$. So, how do we solve this problem? To this end let us write
$y$ in terms of $x$ using the equation $x + y = 56$. This yields $y = 56 - x$,
where $x \in \mathbb{R}$. By substituting this for $y$ in $f(x, y)$ we get
$f(x, y) = f(x, y(x)) = 4x^2 + 3x(56 - x) + 6(56 - x)^2$, $x \in \mathbb{R}$. Now we have
obtained a function of the single variable $x$, and you know how to find
extreme values of such functions. Let $F(x) = f(x, 56 - x)$. Observe
that $F(x)$ is a quadratic function and the coefficient of the $x^2$ term is
positive. This guarantees that $F$, and hence the constrained problem, has
a minimum. As you know, to find the minimum we need to set $\frac{dF}{dx} = 0$.
Now,
$\frac{dF}{dx} = 0$ if and only if
Theorem 8.14
Example 8.18
Find the maximum and the minimum values of $f(x, y) = 4x^2 + 9y^2$ on
the closed disk determined by $x^2 + y^2 \le 9$.
Solution
In this problem we would like to find the maximum and minimum values of
$f$ on the closed disk given by $x^2 + y^2 \le 9$.
But the only point in $\operatorname{int}(D)$ which satisfies $\nabla f(\mathbf{x}) = \mathbf{0}$ is $(0, 0)$. Thus
there is no such point $\mathbf{x}_0$ in $\operatorname{int}(D)$ which gives a maximum. So, the
maximum of $f$ should occur on the boundary of $D$. That is, if $f$ takes its
maximum at $(a, b)$, then $(a, b) \in \{(x, y) \in \mathbb{R}^2 : x^2 + y^2 = 9\}$. Now,
what happens to $f$ on this boundary?
𝐹(2𝜋) = 36.
Now,
when $t = 0$, $x = x(t) = 3$ and $y = y(t) = 0$
when $t = \frac{\pi}{2}$, $x = 0$ and $y = 3$
when $t = \pi$, $x = -3$ and $y = 0$
when $t = \frac{3\pi}{2}$, $x = 0$ and $y = -3$
Hence,
𝑓(3, 0) = 36,
𝑓(0, 3) = 81,
𝑓(−3, 0) = 36,
𝑓(0, −3) = 81.
Therefore 𝑓 takes its maximum value at the two points (0, 3), (0, −3)
and the maximum value is 81.
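The boundary analysis can be cross-checked numerically. The sketch below assumes the parametrization x = 3 cos t, y = 3 sin t of the circle, which is consistent with the values of x and y listed above for t = 0, π/2, π, 3π/2, and simply samples the restricted function on a grid of angles. The grid size is an arbitrary choice, and sampling only approximates the extrema in general.

```python
import math

def f(x, y):
    return 4*x**2 + 9*y**2

def F(t):
    """f restricted to the boundary circle x = 3 cos t, y = 3 sin t."""
    return f(3*math.cos(t), 3*math.sin(t))

N = 3600  # number of sample angles in [0, 2*pi); an arbitrary choice
samples = [2*math.pi*k/N for k in range(N)]

t_max = max(samples, key=F)
t_min = min(samples, key=F)
print("largest sampled boundary value:", round(F(t_max), 6))
print("smallest sampled boundary value:", round(F(t_min), 6))
```

The largest sampled value agrees with the maximum 81 found in the example. The smallest boundary value, 36, is not the minimum of the function on the whole disk, because the interior still has to be examined separately.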
Remark
If there are points in the domain of the function at which at least one of the
first-order partial derivatives does not exist, then you need to check those
points separately to determine whether $f$ has extrema at them.
Theorem 8.15
We have used the phrase “in $C$” to emphasize that not only does $f$ have a local
or global extremum at $\mathbf{a} = (a_1, \dots, a_n)$, but also that the point $\mathbf{a}$
satisfies the $m$ equality constraints $g_i(\mathbf{x}) = 0$, $i = 1, \dots, m$, from which $C$ is
made up.
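Then for $(a_1, \dots, a_n, \lambda_1', \dots, \lambda_m') \in \mathbb{R}^{n+m}$, where $\mathbf{a} = (a_1, \dots, a_n)$ and the $\lambda_i'$ are
as in Theorem 8.15, $\nabla L(a_1, \dots, a_n, \lambda_1', \dots, \lambda_m') = \mathbf{0}$. Here $\mathbf{0} \in \mathbb{R}^{n+m}$. In other
words, $(a_1, \dots, a_n, \lambda_1', \dots, \lambda_m')$ is a critical point of $L$.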
Since we are dealing with functions of two or three variables in most cases,
we would like to give the following two versions of the above theorem.
Theorem 8.16
Theorem 8.17
Let us solve the two examples we discussed at the beginning of this section
using the method of Lagrange multipliers.
Example 8.19
Find the minimum value of $f(x, y) = 4x^2 + 3xy + 6y^2$ subject to the
constraint curve $C$: $x + y = 56$.
Solution
Put $g(x, y) = x + y - 56$. Then $f$ and $g$ are continuously differentiable
on $\mathbb{R}^2$.
Notice that $\nabla g(\mathbf{x}) = (1, 1) \ne \mathbf{0}$ for each $(x, y) \in \mathbb{R}^2$ and hence for
each $(x, y) \in C$.
Now, the possible candidates for points at which 𝑓 has local extrema in
𝐶 are obtained by setting
∇𝑓(𝒙) = λ∇𝑔(𝒙) where 𝜆 ∈ ℝ and 𝑥 + 𝑦 = 56
In this special case we know that the point (36, 20) gives rise to a local
minimum in 𝐶. The minimum value of 𝑓 is 𝑓(36,20) = 9744.
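For this example the Lagrange conditions form a linear system in the unknowns x, y and λ, so the candidate point can be recovered mechanically. The sketch below writes the conditions ∇f = λ∇g and x + y = 56 as three linear equations and solves them exactly; the Gauss-Jordan helper `solve` and the variable names are illustrative choices, not part of the text.

```python
from fractions import Fraction

def solve(M, b):
    """Solve the square linear system M z = b exactly by Gauss-Jordan elimination."""
    n = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(bi)] for row, bi in zip(M, b)]
    for col in range(n):
        # pick a row with a nonzero pivot and swap it into place
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]

# Unknowns (x, y, lam):
#   8x + 3y  - lam = 0    (df/dx = lam * dg/dx)
#   3x + 12y - lam = 0    (df/dy = lam * dg/dy)
#   x + y = 56            (the constraint)
M = [[8, 3, -1],
     [3, 12, -1],
     [1, 1, 0]]
b = [0, 0, 56]

x, y, lam = solve(M, b)
value = 4*x**2 + 3*x*y + 6*y**2
print(x, y, lam, value)
```

The solution is x = 36, y = 20 with multiplier 348, and the function value there is 9744, in agreement with the example.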
Example 8.20
Find the maximum and the minimum values of $f(x, y) = 4x^2 + 9y^2$ on
the closed disk determined by $x^2 + y^2 \le 9$.
Solution
We know how to deal with the interior of the disk. Thus, let us consider
the boundary. Notice that points on the boundary satisfy $x^2 + y^2 = 9$.
So, put $g(x, y) = x^2 + y^2 - 9$.
So, 𝑓 attains its maximum at the two points (0, 3) and (0, −3) on the
disk and maximum value of 𝑓 is 81. The minimum value of 𝑓 is 0 and it
occurs at the origin.
Example 8.21
Solution
Here we have two constraints.
Put $g_1(x, y, z) = x^2 + y^2 + z^2 - 4$ and $g_2(x, y, z) = y - x$.
Clearly $f, g_1, g_2$ are continuously differentiable on $\mathbb{R}^3$.
Observe that $\nabla g_1(\mathbf{x}) = (2x, 2y, 2z)$ and $\nabla g_2(\mathbf{x}) = (-1, 1, 0)$ are
linearly independent for each $(x, y, z) \in \mathbb{R}^3$ satisfying the two equality
constraints $g_1(x, y, z) = 0$ and $g_2(x, y, z) = 0$.
To find possible candidates for points at which 𝑓 has local extrema, we
set
∇𝑓(𝒙) = 𝜆1 ∇𝑔1 (𝒙) + 𝜆2 ∇𝑔2 (𝒙) for 𝜆1 , 𝜆2 ∈ ℝ.
At these points,
𝑓(0, 0, 2) = 4
𝑓(0, 0, −2) = 4
𝑓(√2, √2, 0) = 2
𝑓(−√2, −√2, 0) = 2
So, 𝑓 attains its maximum value 4 at the two points (0, 0, 2), (0, 0, −2)
and its minimum value 2 at the two points (√2, √2, 0), (−√2, −√2, 0).
Example 8.22
Let $f, g: \mathbb{R}^2 \to \mathbb{R}$ be two functions defined by $f(x, y) = 2x + 4y$ for all
$(x, y) \in \mathbb{R}^2$ and $g(x, y) = x^2 + y^2$ for all $(x, y) \in \mathbb{R}^2$. Find the
maximum of $f$ subject to $g(x, y) = 0$.
Solution
Setting ∇𝑓(𝑥) = 𝜆∇𝑔(𝑥) gives the following equations:
2 = 2𝜆𝑥
4 = 2𝜆𝑦
Example 8.23
Consider the problem of maximizing 𝑓(𝑥, 𝑦) = 𝑥 + 𝑦 subject to 𝑥𝑦 =
16.
In this problem $g(x, y) = xy - 16$. Clearly $f$ and $g$ are continuously
differentiable on $\mathbb{R}^2$. Also, $\nabla g(\mathbf{x}) = (y, x) \ne \mathbf{0}$ for all points $(x, y)$
satisfying $xy = 16$.
By setting ∇𝑓(𝑥) = 𝜆∇𝑔(𝑥) we get the following equations:
1 = 𝜆𝑦
1 = 𝜆𝑥
Let $M > 0$ be arbitrary. Put $x = 2M$ and $y = \frac{16}{2M}$. Then $x \cdot y = 16$.
Notice that $f\left(2M, \frac{16}{2M}\right) = 2M + \frac{16}{2M} > M$. Since $M > 0$ is arbitrarily
chosen it follows that $f$ can be made as large as you please
without violating the condition $xy = 16$.
So, what are these two points then? Think!
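The unboundedness argument of Example 8.23 can be tried out on concrete numbers. The sketch below evaluates the function at the feasible point (2M, 16/(2M)) from the text for a few sample values of M; exact fractions are used so that the constraint can be checked without rounding. The helper name `feasible_point` and the sample values are choices made for this illustration.

```python
from fractions import Fraction

def f(x, y):
    return x + y

def feasible_point(M):
    """The point (2M, 16/(2M)) used in the text; it satisfies x*y = 16."""
    x = Fraction(2 * M)
    return x, Fraction(16) / x

for M in (1, 10, 1000, 10**6):
    x, y = feasible_point(M)
    # constraint holds exactly, and the objective exceeds the chosen bound M
    print(M, x * y == 16, f(x, y) > M)
```

Both checks hold for every bound tried, so no value of f can serve as a maximum on the constraint curve, even though the Lagrange equations do have solutions.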
Remark
Solutions of Activities
Activity 8.1
1. Clearly $\mathbb{R}^2$ is open and convex. It is easy to see that $f$ has continuous second-order
partial derivatives at each point $\mathbf{x} \in \mathbb{R}^2$. Now let $\mathbf{x} \in \mathbb{R}^2$.
Notice that
\[ Hf(\mathbf{x}) = \begin{bmatrix} 2 & 2 \\ 2 & 2 \end{bmatrix}. \]
There are two 1st principal minors of $Hf(\mathbf{x})$ and both of them are equal to 2. There
is only one 2nd principal minor of $Hf(\mathbf{x})$ and it is equal to 0. Since all the principal
minors are nonnegative and $\mathbf{x} \in \mathbb{R}^2$ is arbitrarily chosen, $f$ is convex on $\mathbb{R}^2$.
The two 1st principal minors are $-4$ and $-2$. The 2nd principal minor of $Hf(\mathbf{x})$ is 7.
Since both 1st principal minors are negative and the 2nd principal minor is positive, $f$ is
concave on $\mathbb{R}^2$.
The three 1st principal minors of $Hf(\mathbf{x})$ are 2, 2, 4. The three 2nd principal minors of
$Hf(\mathbf{x})$ are 7, 7, and 3. The only 3rd principal minor of $Hf(\mathbf{x})$ is 6. Since all the
principal minors of $Hf(\mathbf{x})$ are nonnegative it follows from Theorem 8.3 that $f$ is
convex on $\mathbb{R}^3$.
In this problem $x_1$ appears in the Hessian. Notice that the two 1st principal
minors of $Hf(\mathbf{x})$ are $6x_1$ and 2. On the other hand the 2nd principal minor of $Hf(\mathbf{x})$
is $12x_1 - 4$. Since $2 > 0$, $f$ is not concave. Now $6x_1 \ge 0$ if and only if $x_1 \ge 0$ and
$12x_1 - 4 \ge 0$ if and only if $x_1 \ge \frac{1}{3}$. Therefore, all the principal minors of $Hf(\mathbf{x})$ are
nonnegative if and only if $x_1 \ge \frac{1}{3}$.
It follows that $f$ is convex on the open set $\{(x_1, x_2) \in \mathbb{R}^2 : x_1 > \frac{1}{3}\}$.
Activity 8.2
1. Let $a \in \mathbb{R}$. Then the matrix $(a)$ is positive definite if and only if $x \cdot a \cdot x = ax^2 > 0$
for each $x \in \mathbb{R} \setminus \{0\}$ if and only if $a > 0$.
2. Let $A = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}$ and let $(x, y) \in \mathbb{R}^2$.
On the other hand the eigenvalues of $A$ are given by the equation $(1 - \lambda)^2 - 1 = 0$,
or equivalently by the equation $-\lambda(2 - \lambda) = 0$. So, the eigenvalues of $A$ are 2
and 0. Since all the eigenvalues are nonnegative, $A$ is positive semidefinite.
But since 0 is an eigenvalue of $A$, $A$ is not positive definite.
3. Notice that the eigenvalues of $A$ are $-3$ and 5. Since $-3 < 0$, $A$ is not positive
definite.
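4. Let $A = \begin{pmatrix} -1 & 1 \\ 1 & -4 \end{pmatrix}$. Then the eigenvalues of $A$ are given by the equation
$(-1 - \lambda)(-4 - \lambda) - 1 = 0$.
So, the eigenvalues of $A$ are $\frac{-5 + \sqrt{13}}{2}$ and $\frac{-5 - \sqrt{13}}{2}$. Since both eigenvalues of $A$ are
negative, $A$ is negative definite.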
Therefore,
𝐴 is PD if and only if 𝑎, 𝑏, 𝑐 ∈ (0, ∞)
Activity 8.3
1.
i) (0, 0) − 𝑓 has no extremum at (0, 0).
2.
i) (0, 0, 0) − 𝑓 has a strict local maximum at (0, 0, 0).
ii) Critical points of 𝑓 are of the form (𝑎, 𝑎), where 𝑎 ∈ ℝ. At each critical point (𝑎, 𝑎),
𝑎 ∈ ℝ, 𝑓 has a global minimum.
iii) (0, 0, 0) − 𝑓 has a strict local minimum at (0, 0, 0).
iv) 𝑓 has no critical points in ℝ2 .
3.
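It is easy to see that $(0, 0)$ is a critical point of both $f$ and $g$.
Since $g(x_1, x_2) = x_1^2 + x_2^4 \ge 0$ for each $(x_1, x_2) \in \mathbb{R}^2$ and $g(0, 0) = 0$ it follows that $g$ has
a local (actually, global) minimum at $(0, 0)$.
Now let $\delta > 0$. Notice that $\left(\frac{\delta}{2}, 0\right) \in B(\mathbf{0}, \delta)$ and $f\left(\frac{\delta}{2}, 0\right) = \frac{\delta^2}{4} > 0 = f(0, 0)$.
Also, $\left(0, -\frac{\delta}{2}\right) \in B(\mathbf{0}, \delta)$ and $f\left(0, -\frac{\delta}{2}\right) = -\frac{\delta^3}{8} < 0 = f(0, 0)$.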
Summary