
Chapter 6

Multivariate optimisation

Essential reading

(For full publication details, see Chapter 1.)

Binmore and Davies (2002) Sections 4.6, 4.7, 6.3–6.8.


Anthony and Biggs (1996) Chapter 13, parts of Chapters 14 and 21.

Further reading

Simon and Blume (1994) parts of Chapters 17, 18 and 19.


Adams and Essex (2010) parts of Sections 13.1–13.3.

Aims and objectives

The objectives of this chapter are as follows.

To use partial derivatives to solve problems where a function needs to be optimised.


To solve problems where a function needs to be optimised subject to a constraint.
Specific learning outcomes can be found near the end of this chapter.

6.1 Introduction
Having seen how to find partial derivatives and gained some insight into what they tell
us about a function of two variables in the last chapter, we now see how they can be
used to optimise such a function. In particular, we will see how the first-order partial
derivatives allow us to find the stationary points of a function and its second-order
partial derivatives allow us to see whether such a point is a maximum or a minimum. We
will also see how to optimise a function of two variables in cases where the variables are
constrained, i.e. they are required to satisfy some extra condition known as a constraint.

6.2 Unconstrained optimisation


We start by considering unconstrained optimisation, i.e. we are looking for the places
where a function of two variables, f (x, y), attains its maximum or minimum values
when x and y are independent and free to take any values in R2 .


6.2.1 Stationary points


We define a stationary point of a function, f (x, y), to be any point that satisfies the
equations
fx (x, y) = 0 and fy (x, y) = 0,
simultaneously.
Let’s look at some examples to see how this works.

Example 6.1 Find the stationary points of the function


f (x, y) = x4 + 2x2 y + 2y 2 + y.

The first-order partial derivatives of this function are

fx (x, y) = 4x3 + 4xy and fy (x, y) = 2x2 + 4y + 1.

At a stationary point, both of the first-order partial derivatives are zero, i.e. we must
have fx (x, y) = 0 and fy (x, y) = 0. Thus, to find the stationary points we have to
solve the simultaneous equations
4x3 + 4xy = 0 and 2x2 + 4y + 1 = 0.

If we start by looking at the first equation, this gives us

4x3 + 4xy = 0 =⇒ 4x(x2 + y) = 0 =⇒ x = 0 or y = −x2 .

And so, to satisfy the second equation with:

x = 0 we must have

2(0)^2 + 4y + 1 = 0 =⇒ y = −1/4,

i.e. (0, −1/4) is a stationary point.

y = −x^2 we must have

2x^2 + 4(−x^2) + 1 = 0 =⇒ 2x^2 = 1 =⇒ x^2 = 1/2 =⇒ x = ±1/√2,

which in turn gives us

y = −(±1/√2)^2 = −1/2,

i.e. (1/√2, −1/2) and (−1/√2, −1/2) are stationary points.
Consequently, the points

(0, −1/4), (1/√2, −1/2) and (−1/√2, −1/2)

are stationary points of this function.
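If you have access to Python with the sympy library, calculations like this one can be verified symbolically. The sketch below is an optional check, not part of the method you are expected to show in your working; it simply solves the two first-order conditions for the function in Example 6.1.

import sympy as sp

x, y = sp.symbols('x y', real=True)
f = x**4 + 2*x**2*y + 2*y**2 + y          # the function from Example 6.1

# A stationary point is a simultaneous solution of fx = 0 and fy = 0.
fx, fy = sp.diff(f, x), sp.diff(f, y)
for s in sp.solve([fx, fy], [x, y], dict=True):
    if all(v.is_real for v in s.values()):    # keep only the real solutions
        print(s)                              # expect (0, -1/4) and (±1/sqrt(2), -1/2)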


Example 6.2 Find the stationary points of the function


f (x, y) = 4x3 − 60xy + 5y 2 + 400y − 35.

The first-order partial derivatives of this function are

fx (x, y) = 12x2 − 60y and fy (x, y) = −60x + 10y + 400.

At a stationary point, both of the first-order partial derivatives are zero, i.e. we must
have fx (x, y) = 0 and fy (x, y) = 0. Thus, to find the stationary points we have to
solve the simultaneous equations

12x2 − 60y = 0 and − 60x + 10y + 400 = 0.

We start by simplifying these equations to get,

x2 − 5y = 0 and − 6x + y + 40 = 0,

and then notice that the first equation gives us y = x2 /5. Substituting this into the
second equation then allows us to see that
−6x + x^2/5 + 40 = 0 =⇒ x^2 − 30x + 200 = 0 =⇒ (x − 10)(x − 20) = 0 =⇒ x = 10 or x = 20,

and, since y = x^2/5, we have

y = 10^2/5 = 20 or y = 20^2/5 = 80,
respectively. Thus, this function has two stationary points, namely the points
(10, 20) and (20, 80).

Activity 6.1 Find the stationary points of the function


f (x, y) = x2 − 4x + y 2 + 4y + 8.

Activity 6.2 Find the stationary points of the function

f (x, y) = 3x3 + 9x2 − 72x + 2y 3 − 12y 2 − 126y + 19.

We have now seen how to find the stationary points of a function, f (x, y), but what do
they look like? Generally speaking, we will find that there are three kinds of stationary
point — namely local minima, saddle points and local maxima — and these are
illustrated in Figure 6.1(a), (b) and (c) respectively. We now consider what criteria we
can use to determine exactly what kind of stationary point we have found.


(a) local minimum (b) saddle point (c) local maximum

Figure 6.1: Each of these surfaces has the indicated kind of stationary point at (0, 0, 0).

6.2.2 Classifying stationary points


Let’s say that we have found that (a, b) is a stationary point of the function, f (x, y), so
that
fx (a, b) = 0 and fy (a, b) = 0.
In order to classify this stationary point, we use the Hessian, which is the function
H(x, y) = fxx (x, y)fyy (x, y) − [fxy (x, y)]2 ,
and then note that

If H(a, b) > 0 and fxx (a, b) > 0, then this stationary point is a local minimum.
If H(a, b) > 0 and fxx (a, b) < 0, then this stationary point is a local maximum.
If H(a, b) < 0, then this stationary point is a saddle point.
In particular, if H(a, b) = 0, we can draw no conclusions about the nature of the
stationary point by using this method.
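These sign conditions are easy to check with a computer algebra system. As a hedged sketch (again Python with sympy, taking the function from Example 6.2 purely as an illustration), the following prints H and fxx at each stationary point so the rules above can be applied.

import sympy as sp

x, y = sp.symbols('x y', real=True)
f = 4*x**3 - 60*x*y + 5*y**2 + 400*y - 35   # the function from Example 6.2

fxx, fyy, fxy = sp.diff(f, x, 2), sp.diff(f, y, 2), sp.diff(f, x, y)
H = sp.expand(fxx*fyy - fxy**2)             # the Hessian, H = fxx*fyy - (fxy)^2

for s in sp.solve([sp.diff(f, x), sp.diff(f, y)], [x, y], dict=True):
    # Apply the rules above by inspecting the signs of H(a, b) and fxx(a, b).
    print(s, 'H =', H.subs(s), 'fxx =', fxx.subs(s))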
Let’s look at some examples of how this works in practice.

Example 6.3 Classify the stationary points we found in Example 6.1.

Using the first-order partial derivatives we found in Example 6.1, we find that the
second-order partial derivatives are

fxx (x, y) = 12x2 + 4y, fxy (x, y) = 4x = fyx (x, y) and fyy (x, y) = 4,

and, as such, the Hessian is given by

H(x, y) = (12x2 + 4y)(4) − (4x)2 = 48x2 + 16y − 16x2 = 16(2x2 + y).

Evaluating this at each of the stationary points we then find that:

At (0, −1/4), the Hessian is

H(0, −1/4) = 16(−1/4) < 0,

and so this is a saddle point.



At (1/√2, −1/2), the Hessian is

H(1/√2, −1/2) = 16(1/2) > 0 and fxx (1/√2, −1/2) = 6 − 2 > 0,

so this is a local minimum.



At (−1/√2, −1/2), the Hessian is

H(−1/√2, −1/2) = 16(1/2) > 0 and fxx (−1/√2, −1/2) = 6 − 2 > 0,

so this is a local minimum.


Thus, the stationary points we found in Example 6.1, i.e.

(0, −1/4), (1/√2, −1/2) and (−1/√2, −1/2),
are a saddle point and two local minima respectively.

Example 6.4 Classify the stationary points we found in Example 6.2.


Using the first-order partial derivatives we found in Example 6.2, we find that the
second-order partial derivatives are

fxx (x, y) = 24x, fxy (x, y) = −60 = fyx (x, y) and fyy (x, y) = 10,

and, as such, the Hessian is given by

H(x, y) = (24x)(10) − (−60)2 = 240x − 3600 = 240(x − 15).

Evaluating this at each of the stationary points we then find that:

At (10, 20), the Hessian is

H(10, 20) = 240(−5) < 0,

and so this is a saddle point.

At (20, 80), the Hessian is

H(20, 80) = 240(5) > 0 and fxx (20, 80) = 24(20) > 0,

so this is a local minimum.


Thus, the stationary points (10, 20) and (20, 80) are a saddle point and a local
minimum respectively.

Activity 6.3 Classify the stationary points we found in Activity 6.1.

Activity 6.4 Classify the stationary points we found in Activity 6.2.


Lastly, we have remarked above that in cases where the Hessian is zero at a stationary
point, the method that we have used so far fails. Indeed, in such cases, the stationary
point could be a local minimum, a local maximum or a saddle point and, to determine
which, we would have to think more carefully about what is happening. Let’s consider
an example of a function where this kind of problem occurs.

Example 6.5 Find the stationary point of the function f (x, y) = x3 − y 3 and show
that we can’t determine its nature using the method above. What kind of stationary
point do we have here?

The first-order partial derivatives of this function are

fx (x, y) = 3x2 and fy (x, y) = −3y 2 .

So, clearly, the only stationary point is at (0, 0). The second-order partial derivatives
of this function are given by

fxx (x, y) = 6x, fxy (x, y) = 0 = fyx (x, y) and fyy (x, y) = −6y,
and, as such, the Hessian is given by

H(x, y) = (6x)(−6y) − 02 = −36xy.

Indeed, evaluating this at the stationary point gives H(0, 0) = 0 and so the method
we used above fails.

However, if we consider the surface z = f (x, y), notice that the y = 0 section of our
function gives z = f (x, 0) = x3 . As such, if we look at this section around the
stationary point (0, 0) where z = f (0, 0) = 0, we can see that

if x > 0, we have f (x, 0) > f (0, 0) and so this stationary point can’t be a local
maximum, whereas

if x < 0, we have f (x, 0) < f (0, 0) and so this stationary point can’t be a local
minimum.
Indeed, if we look at the x = 0 section of our function, i.e. z = f (0, y) = −y 3 , this
leads us to a similar conclusion. In fact, looking at the sections, we can see that this
is a kind of saddle point, albeit one which ‘looks different’ to the one that we saw
before in Figure 6.1(b), and it is illustrated in Figure 6.2.
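For what it is worth, a quick symbolic check confirms that the Hessian test really is inconclusive here and that looking at a section settles the matter. The following sketch (Python with sympy, assuming the function of Example 6.5) is only illustrative.

import sympy as sp

x, y = sp.symbols('x y', real=True)
f = x**3 - y**3                              # the function from Example 6.5

H = sp.diff(f, x, 2)*sp.diff(f, y, 2) - sp.diff(f, x, y)**2
print(H.subs({x: 0, y: 0}))                  # 0, so the Hessian test tells us nothing

# Values of f along the y = 0 section lie both above and below f(0, 0) = 0,
# so (0, 0) can be neither a local maximum nor a local minimum.
print([f.subs({x: t, y: 0}) for t in (-1, sp.Rational(-1, 2), sp.Rational(1, 2), 1)])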

Activity 6.5 Find the stationary point of the function

f (x, y) = (x − 1)4 + (y − 1)4 ,

and show that we can’t determine its nature using the method above. What kind of
stationary point do we have here?




Figure 6.2: Some useful pictures for Example 6.5. (a) The y = 0 section, z = f (x, 0) = x3 .
(b) The x = 0 section, z = f (0, y) = −y 3 . (c) The surface z = f (x, y) = x3 −y 3 displaying
a ‘different kind’ of saddle point at (0, 0, 0).

6.2.3 Applications
Optimisation problems are very common in economics and we now introduce two ways
in which they can arise in that subject. The first is their use in cost minimisation and
the second will be another instance of profit maximisation.
Cost minimisation

Suppose a firm is using quantities x and y of two commodities and this incurs a cost
given by the cost function, C(x, y). One might reasonably ask: What quantities should
they be using if they want to minimise their costs?

Example 6.6 A data processing company employs both senior and junior
programmers. A particularly large project will cost

C(x, y) = 2000 + 2x3 − 12xy + y 2 ,

pounds, where x and y represent the number of junior and senior programmers used
respectively. How many employees of each kind should be assigned to the project in
order to minimise its cost? What is this minimum cost?

To minimise the cost, we need to find the stationary points of C(x, y) and determine
which of them gives us a minimum. So, as before, we start by finding the first-order
partial derivatives of C(x, y), i.e.

Cx (x, y) = 6x2 − 12y and Cy (x, y) = −12x + 2y.

At a stationary point, both of these first-order partial derivatives are zero, i.e. we
must have Cx (x, y) = 0 and Cy (x, y) = 0. Thus, to find the stationary points, we
have to solve the simultaneous equations

6x2 − 12y = 0 and − 12x + 2y = 0.

We start by simplifying these equations to get

x2 − 2y = 0 and − 6x + y = 0,


and then notice that the second equation gives us y = 6x. Substituting this into the
first equation then allows us to see that

x2 − 2(6x) = 0 =⇒ x2 − 12x = 0 =⇒ x(x − 12) = 0 =⇒ x = 0 or x = 12,

and, since y = 6x, we have

y = 6(0) = 0 or y = 6(12) = 72,

respectively. Thus, the cost function, C(x, y), has two stationary points, namely the
points (0, 0) and (12, 72).

To classify these stationary points, we look at the second-order partial derivatives of


C(x, y), which are

Cxx (x, y) = 12x, Cxy (x, y) = −12 = Cyx (x, y) and Cyy (x, y) = 2,

and, as such, the Hessian is given by

H(x, y) = (12x)(2) − (−12)2 = 24x − 144 = 24(x − 6).


Evaluating this at each of the stationary points we then find that:

At (0, 0), the Hessian is


H(0, 0) = 24(−6) < 0,
and so this is a saddle point.

At (12, 72), the Hessian is

H(12, 72) = 24(+6) > 0 and Cxx (12, 72) = 12(12) > 0,

so this is a local minimum.


Consequently, to minimise the cost we want to use 12 junior and 72 senior
programmers. If we do this we find that the minimum cost is given by

C(12, 72) = 2000 + 3456 − 10368 + 5184 = 272,

i.e. the minimum cost is £272, which, thinking about it, is far less than the value of C(x, y) at the other stationary point since C(0, 0) = 2000.
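As with the earlier examples, this calculation can be checked symbolically. The sketch below (Python with sympy, using the cost function of Example 6.6; it is an optional check rather than part of the worked answer) finds both stationary points, evaluates the Hessian test and prints the corresponding costs.

import sympy as sp

x, y = sp.symbols('x y', real=True)
C = 2000 + 2*x**3 - 12*x*y + y**2            # cost, in pounds, from Example 6.6

Cx, Cy = sp.diff(C, x), sp.diff(C, y)
Cxx = sp.diff(C, x, 2)
H = Cxx*sp.diff(C, y, 2) - sp.diff(C, x, y)**2

for s in sp.solve([Cx, Cy], [x, y], dict=True):
    print(s, 'H =', H.subs(s), 'Cxx =', Cxx.subs(s), 'cost =', C.subs(s))
# The point (12, 72) gives H > 0 and Cxx > 0, i.e. a local minimum with cost 272.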

Profit maximisation

We now describe the problem of maximising the profit of a firm which makes two
products, X and Y. Generally, if pX and pY are the selling prices of one unit of X and
one unit of Y respectively, then the total revenue, TR(x, y), obtained from producing
amounts x of product X and y of product Y is

TR(x, y) = xpX + ypY .




Of course, there are a number of ways in which the prices pX and pY may be related to
the quantities x and y. For instance:

If the goods were related, pX and pY could both depend on x and y (e.g. if we were
considering a music company producing an album on both CD and cassette).
If the goods were unrelated, pX and pY could depend only on x and y respectively
(e.g. a pharmaceuticals company producing paracetamol and insulin).
The firm will also have a joint total cost function, TC(x, y), which tells us how much it
costs to produce x units of X and y units of Y. Clearly, given TR(x, y) and TC(x, y), we
can consider the profit function of the firm, π(x, y), which is given by
π(x, y) = TR(x, y) − TC(x, y) = xpX + ypY − TC(x, y),
and we can maximise this function of x and y using the techniques described above.
Let’s look at an example.

Example 6.7 Suppose that a firm is the sole supplier of X and Y (in other words,
it has a monopoly on these goods) and that the demands for X and Y, in tonnes, are
given by
x = 2 − 2pX + pY and y = 13 + pX − 2pY ,
respectively when each unit of X and Y sells at a price, in pounds, of pX and pY ,
respectively. If the joint total cost function of the firm is TC(x, y) = 5 + x2 − xy + y 2 ,
find the quantities of X and Y the firm should produce in order to maximise its
profit. What are the corresponding prices? What is the maximum profit?

We start by rearranging the equations to find expressions for pX and pY. (Note that if the price of X was fixed and the price of Y was increased, then the demand for X would rise and the demand for Y would fall; this is the behaviour one might expect if X and Y were two related commodities, e.g. if they were two different types of chocolate bar.) The first
equation tells us that pY = x − 2 + 2pX and so substituting this into the second
equation yields

y = 13 + pX − 2(x − 2 + 2pX ) =⇒ y = 13 + pX − 2x + 4 − 4pX =⇒ 3pX = 17 − 2x − y.

As such, we have
pX = (17 − 2x − y)/3,

and so substituting this into pY = x − 2 + 2pX , we find that

pY = x − 2 + 2(17 − 2x − y)/3 =⇒ pY = (3x − 6 + 34 − 4x − 2y)/3 =⇒ pY = (28 − x − 2y)/3.
Consequently, the profit function in this case is given by

π(x, y) = xpX + ypY − TC(x, y)

        = x(17 − 2x − y)/3 + y(28 − x − 2y)/3 − (5 + x^2 − xy + y^2)

        = (1/3)[(17x − 2x^2 − xy) + (28y − xy − 2y^2) − (15 + 3x^2 − 3xy + 3y^2)]

∴ π(x, y) = (1/3)(−15 + 17x + 28y − 5x^2 − 5y^2 + xy),
and we can now maximise this profit function using the method above.
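The algebra above, i.e. inverting the demand system and expanding the profit function, is exactly the kind of routine manipulation a computer algebra system can confirm. A possible sketch in Python's sympy follows; note that it also solves the first-order conditions, so it gives away Activity 6.6 and you may prefer to attempt that activity by hand first.

import sympy as sp

x, y, pX, pY = sp.symbols('x y p_X p_Y', real=True)

# Invert the demand system to express the prices in terms of the quantities.
prices = sp.solve([sp.Eq(x, 2 - 2*pX + pY), sp.Eq(y, 13 + pX - 2*pY)], [pX, pY])

TC = 5 + x**2 - x*y + y**2
profit = sp.expand(x*prices[pX] + y*prices[pY] - TC)
print(profit)                                 # the profit function pi(x, y)

# Unconstrained maximisation via the first-order conditions.
print(sp.solve([sp.diff(profit, x), sp.diff(profit, y)], [x, y]))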


Activity 6.6 Finish the problem started in Example 6.7. That is, find the values of
x and y that maximise the profit function π(x, y) found in the example, the
corresponding prices pX and pY , and the maximum profit.

6.3 Constrained optimisation


We now turn our attention to the problem of constrained optimisation, i.e. the problem
of optimising a function, f (x, y), in the case where the values of x and y we are
considering are constrained by the requirement that they must lie in some region, R, of
R2 . In particular, we will see that the optimal point we seek will

either be a point inside the region, in which case it will be a stationary point of
f (x, y) that happens to be in the region,

or it will be a point on the boundary of the region, in which case it need not be a
stationary point of f (x, y) even though it optimises this function over points in the
region.
Of course, in the former case, we can find and classify the stationary point in the region
using the method in the previous section and then, checking that this point is more
optimal than any point on the boundary of the region, we will have our answer. Let’s
look at a quick example.

Example 6.8 Minimise the function f (x, y) = (x − 1)2 + (y − 1)2 given that (x, y)
must lie in the region defined by the inequalities x ≥ 0, y ≥ 0 and x + y ≤ 3.

The first-order partial derivatives of this function are

fx (x, y) = 2(x − 1) and fy (x, y) = 2(y − 1),

and so, setting these equal to zero, we see that (1, 1) is the only stationary point of
this function. The second-order partial derivatives of this function are

fxx (x, y) = 2, fxy (x, y) = 0 = fyx (x, y) and fyy (x, y) = 2,

which means that the Hessian is given by

H(x, y) = (2)(2) − 02 = 4,

and so we see that H(1, 1) = 4 > 0 and fxx (1, 1) = 2 > 0 which means that this
point is a local minimum. Indeed, as this point satisfies the inequalities given above (the point (1, 1) clearly satisfies x ≥ 0 and y ≥ 0 as well as x + y ≤ 3 since 1 + 1 = 2 < 3),
this point is in the specified region and so f (1, 1) = 0 is a candidate for the
minimum value of f (x, y) for (x, y) that lie in the region. However, we must check
that nothing ‘odd’ is happening due to the points on the boundary of the region and
to do this we note that:

If we are on the x = 0 boundary of the region (so, technically, 0 ≤ y ≤ 3) we


have f (0, y) = 1 + (y − 1)2 ≥ 1 > 0.

If we are on the y = 0 boundary of the region (so, technically, 0 ≤ x ≤ 3) we


have f (x, 0) = (x − 1)2 + 1 ≥ 1 > 0.

If we are on the x + y = 3 boundary of the region we have x = 3 − y (and,


technically, 0 ≤ y ≤ 3) which means that
f (3 − y, y) = (2 − y)^2 + (y − 1)^2 = 2y^2 − 6y + 5 = 2(y − 3/2)^2 + 1/2,

if we complete the square, but this means that f (3 − y, y) ≥ 1/2 > 0.
Thus, we can’t find values of f (x, y) as small as f (1, 1) = 0 on any of the boundaries
of the region and so the minimum value of f (x, y) for points in this region is zero
and this occurs at the point (1, 1).
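For problems like this one, a numerical optimiser can also be used as a rough check that nothing has been missed on the boundary. The following is a minimal sketch using scipy.optimize (this library is not part of the subject guide's methods, and the starting point chosen is arbitrary).

import numpy as np
from scipy.optimize import minimize

f = lambda v: (v[0] - 1)**2 + (v[1] - 1)**2   # the objective from Example 6.8

result = minimize(
    f,
    x0=np.array([2.0, 0.5]),                              # any point in the region
    method='SLSQP',
    bounds=[(0, None), (0, None)],                        # x >= 0 and y >= 0
    constraints=[{'type': 'ineq',
                  'fun': lambda v: 3 - v[0] - v[1]}],     # x + y <= 3
)
print(result.x, result.fun)    # expect approximately (1, 1) with value 0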

Activity 6.7 Explain why the answer we found in the previous example is obvious!
However, in what follows we will be more interested in solving constrained optimisation
problems where the optimal point occurs on the boundary of the region since the
methods we have developed so far will not help us in that case.

6.3.1 Finding optimal points on the boundary of a region


Generally speaking, when the optimal point occurs on the boundary of a region, we will
be able to find it by considering the contours of the function we are optimising in
relation to the region we are optimising the function over. Indeed, when doing this, we
will find that we are in one of the two cases below.

The optimal point is at a ‘corner’ of the boundary

The following example should clarify what we should do in this case.

Example 6.9 Maximise the function f (x, y) = x2 + y 2 given that (x, y) must lie in
the region defined by the inequalities x ≥ 0, y ≥ 0 and x + 2y ≤ 4.

We start by sketching the region which is the shaded triangle in Figure 6.3(a) and
some typical contours of the surface z = f (x, y). Indeed, notice that here, the
contour z = c has equation
x2 + y 2 = c,

and so it will be a circle of radius √c centred on the origin. In the figure, we have
sketched the z = 4 and z = 16 contours and, in particular, we notice that as the
contours move away from the origin, the value of z increases as indicated by the
arrow.

Now, to find the maximum value of f (x, y) in this region we need a point which both

lies in the region, and

gives us the largest value of z.


That is, in this case, we want the point (4, 0) which is a ‘corner’ of the boundary. In
particular, notice that with this point on the z = 16 contour:

we get a higher value of z than we do from any point on a contour with z < 16
(like, say, the z = 4 contour), and

we can’t have any point on a contour with z > 16 as none of these contours will
give us a point in the region.
That is, the point (4, 0) which gives us z = 16 must indeed maximise the function
f (x, y) given that (x, y) must lie in the specified region.
Figure 6.3: (a) The region for Example 6.9 is the shaded triangle and the z = 4 and z = 16
contours are indicated. (b) The region for Example 6.10 is the same shaded triangle and
the z = 1 and z = 2 contours are indicated. Note, in both cases, the direction in which z
increases.

The optimal point is on the boundary but it isn’t a ‘corner’

This is the case that is going to concern us the most and so, for the moment, we just
look at an example to see what is happening before we come to the recommended
method for solving such problems.

Example 6.10 Maximise the function f (x, y) = xy given that (x, y) must lie in the
region defined by the inequalities x ≥ 0, y ≥ 0 and x + 2y ≤ 4.

We start by sketching the region which is the shaded triangle in Figure 6.3(b) and
some typical contours of the surface z = f (x, y). Indeed, notice that here, the
contour z = c has equation
xy = c,


and so it will be a rectangular hyperbola with the x and y-axes as its asymptotes. In
the figure, we have sketched the z = 1 and z = 2 contours and, in particular, we
notice that as the contours move away from the origin, the value of z increases as
indicated by the arrow.

Now, to find the maximum value of f (x, y) in this region we need a point which both

lies in the region, and

gives us the largest value of z.


That is, in this case, we want the point (x∗ , y ∗ ) which is not a ‘corner’ of the
boundary. In particular, notice that with this point on the z = 2 contour:

we get a higher value of z than we do from any point on a contour with z < 2
(like, say, the z = 1 contour), and

we can’t have any point on a contour with z > 2 as none of these contours will
give us a point in the region.
That is, the point (x∗ , y ∗ ) which gives us z = 2 must indeed maximise the function
f (x, y) given that (x, y) must lie in the specified region. But, how do we find this
point?

One way to find this point is to see that it is a point where, for some constant c, we
have a contour f (x, y) = c which is both

tangential to the line x + 2y = 4, and

touching the line x + 2y = 4.


Indeed, as the gradient of f (x, y) = c is given by

dy/dx = −(∂f /∂x)/(∂f /∂y) = −y/x,

as we saw in Section 5.3.3, and the gradient of the line x + 2y = 4 is given by

y = 2 − x/2 =⇒ dy/dx = −1/2,

the first condition means that we must have a point which satisfies the equation

−y/x = −1/2 =⇒ y = x/2,
whereas the second condition means that we must have a point which satisfies the
equation x + 2y = 4. Solving these equations simultaneously, we find that this gives
us the point (x∗ , y ∗ ) = (2, 1). (And, at this point, z = f (2, 1) = 2 as expected from above. But, in general, we would not know the optimal value of z = f (x, y) beforehand; we have just used it here to help illustrate what is going on.)

Now, in such cases, we could always proceed in this way but, as we shall see in a moment, there is a way of turning this idea into a much more general method. And, it is this new method that we will generally use in such cases.

6.3.2 The method of Lagrange multipliers


Suppose that we have been asked to optimise the function, f (x, y), given that (x, y)
must lie in some region and, by looking at the contours as above, we have determined
that the optimal point occurs on the boundary given by some equation g(x, y) = 0. In
particular, we are concerned with the case where the optimal point is not a ‘corner’ of
the boundary, i.e. we want a point where, for some constant c, the contour f (x, y) = c is
both

tangential to the boundary given by g(x, y) = 0, and


touching the boundary given by g(x, y) = 0.
Now, for tangency, we require that the gradient of the contour f (x, y) = c, i.e.

dy/dx = −fx (x, y)/fy (x, y),

is equal to the gradient of the boundary given by g(x, y) = 0, i.e.

dy/dx = −gx (x, y)/gy (x, y),

where we have used what we saw in Section 5.3.3 twice. But, if these are equal, we have

−fx (x, y)/fy (x, y) = −gx (x, y)/gy (x, y) =⇒ fx (x, y)/gx (x, y) = fy (x, y)/gy (x, y),

and we denote this common value by λ, i.e. we have

λ = fx (x, y)/gx (x, y) = fy (x, y)/gy (x, y).

Rearranging this we then get two equations, namely

fx (x, y) − λgx (x, y) = 0 and fy (x, y) − λgy (x, y) = 0,

or, more simply,

∂/∂x [f (x, y) − λg(x, y)] = 0 and ∂/∂y [f (x, y) − λg(x, y)] = 0.

So, any point which satisfies these two equations is a point where the contour f (x, y) = c is tangential to the boundary g(x, y) = 0. We also note that the equation

∂/∂λ [f (x, y) − λg(x, y)] = 0 =⇒ g(x, y) = 0,
and so, any point which satisfies this equation lies on the boundary. Consequently, we
define the Lagrangean to be the function

L(x, y, λ) = f (x, y) − λg(x, y),


and we call λ the Lagrange multiplier. In particular, the point we seek will be amongst
the stationary points of the Lagrangean since it must satisfy the equations
∂L/∂x = 0, ∂L/∂y = 0 and ∂L/∂λ = 0,
which we have derived above. In such cases, we call the function we are optimising,
f (x, y), the objective function and we call the equation of the boundary, which must be
written in the form g(x, y) = 0, the constraint. Let’s see how we can use this method to
solve the constrained optimisation problem we saw in Example 6.10.

Example 6.11 Solve the constrained optimisation problem in Example 6.10 using
the method of Lagrange multipliers.

We have already seen that the optimal point we seek occurs when the function
f (x, y) = xy is tangential to the boundary given by the line x + 2y = 4. Writing the
equation of the line in the form g(x, y) = x + 2y − 4 = 0 we see that the Lagrangean
is
L(x, y, λ) = xy − λ(x + 2y − 4),
where λ is the Lagrange multiplier. We now find the stationary points of the
Lagrangean by finding its first-order partial derivatives, i.e.
Lx (x, y, λ) = y − λ, Ly (x, y, λ) = x − 2λ and Lλ (x, y, λ) = −(x + 2y − 4),

and setting them equal to zero to get the equations

y − λ = 0, x − 2λ = 0 and x + 2y − 4 = 0.

We now eliminate λ from the first two equations to get

λ = y = x/2 =⇒ y = x/2,

and this, as you should expect, is our tangency condition from Example 6.10. On the
other hand, the third equation is just

x + 2y = 4,

which, as you should expect, is our constraint. Solving these two equations
simultaneously, we then get the point (2, 1) as the only solution and so this must be
the optimal point we seek in agreement with what we found in Example 6.10.
Obviously, at this point, we find that f (2, 1) = 2 is the maximum value of f subject
to the constraint.
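Setting up and solving the three stationarity equations of the Lagrangean is also easy to hand over to a computer algebra system. A possible sketch (Python with sympy, applied to the objective function and constraint of Examples 6.10 and 6.11) is given below; it is a check on the working, not a replacement for it.

import sympy as sp

x, y, lam = sp.symbols('x y lam', real=True)

f = x*y                         # objective function
g = x + 2*y - 4                 # constraint written in the form g(x, y) = 0
L = f - lam*g                   # the Lagrangean

eqs = [sp.diff(L, v) for v in (x, y, lam)]
print(sp.solve(eqs, [x, y, lam], dict=True))   # expect x = 2, y = 1, lam = 1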

Sometimes we will see questions where we are just asked to use this method to solve a
constrained optimisation problem. In such cases, we will be given the objective function,
f (x, y), and the constraint, g(x, y) = 0, which we should be using. In particular, unless
we are explicitly asked to look at contours, we will just apply the method and assume that the answer we find is the appropriate kind of optimal point (although, sometimes, the Lagrangean may have several stationary points and, if that happens, it should be fairly straightforward to see which of these is the one we want). Let's look at an example of such a problem.

Example 6.12 Given the function

f (x, y) = 160x − 3x2 − 2xy − 2y 2 + 120y − 18,

find the maximum value of f (x, y) subject to the constraint x + y = 34.

We write the constraint x + y = 34 as x + y − 34 = 0 so that it is in the form


g(x, y) = 0 with g(x, y) = x + y − 34. This allows us to write the Lagrangean as

L(x, y, λ) = 160x − 3x2 − 2xy − 2y 2 + 120y − 18 − λ(x + y − 34),

where λ is the Lagrange multiplier. To find the stationary points of the Lagrangean
we find its first-order partial derivatives, i.e.

Lx (x, y, λ) = 160 − 6x − 2y − λ,
Ly (x, y, λ) = −2x − 4y + 120 − λ and
Lλ (x, y, λ) = −(x + y − 34),

and set them equal to zero to get the equations


160 − 6x − 2y − λ = 0, −2x − 4y + 120 − λ = 0 and x + y − 34 = 0.

The first two equations give us

λ = 160 − 6x − 2y and λ = −2x − 4y + 120,

and so we can eliminate λ to get

160 − 6x − 2y = −2x − 4y + 120 =⇒ 2y = 4x − 40 =⇒ y = 2x − 20,

whereas the third equation gives us x + y = 34 which is, of course, just our
constraint. So, as this gives y = 34 − x, we can use it and the y = 2x − 20 that we
have just found to eliminate y and get

34 − x = 2x − 20 =⇒ 3x = 54 =⇒ x = 18.

And, if x = 18, then the constraint y = 34 − x gives us y = 34 − 18 = 16. Thus, the


point (18, 16) is the only stationary point of the Lagrangean and so it must be the
optimal point we seek. Thus, the maximum of f (x, y) subject to the constraint
g(x, y) = 0 is f (18, 16) = 2, 722.

Note that, although we have only used this method to find maxima in the examples
above, it will find minima as well and we will see an example of this when we consider
cost minimisation problems in Section 6.3.4.

6.3.3 The meaning of the Lagrange multiplier

In addition to allowing us to solve certain constrained optimisation problems, the


method of Lagrange multipliers has another use which will be important when we come


to consider its applications in Section 6.3.4. To see this, consider that, when we are
asked to optimise f (x, y) subject to the constraint g(x, y) = c where c is a constant we
would proceed as follows.
Writing the constraint in the form g(x, y) − c = 0, we have the Lagrangean

L(x, y, λ) = f (x, y) − λ(g(x, y) − c),

where λ is the Lagrange multiplier. Its first-order partial derivatives are given by

Lx (x, y, λ) = fx (x, y) − λgx (x, y),


Ly (x, y, λ) = fy (x, y) − λgy (x, y) and
Lλ (x, y, λ) = −(g(x, y) − c)

and we find that the stationary points occur when we set these equal to zero to get the
equations

fx (x, y) − λgx (x, y) = 0, fy (x, y) − λgy (x, y) = 0 and g(x, y) − c = 0.

Now, the first two equations tell us that


fx (x, y) = λgx (x, y) and fy (x, y) = λgy (x, y),

and, clearly, neither of these depend on c. However, when we solve these equations in
the standard way and use the constraint, g(x, y) = c, we find the point (x∗ , y ∗ ) which
optimises f (x, y) subject to the constraint. Of course, since we have used the constraint
to find the point (x∗ , y ∗ ), the values of x and y we found will depend on c, i.e. we have
the functions x∗ = x(c) and y ∗ = y(c) of c. In particular, this means that the optimal
value of f (x, y) subject to the constraint that we have found also depends on c, let’s call
this F (c), i.e. we have
F (c) = f (x∗ , y ∗ ) = f (x(c), y(c)).
Now, if we differentiate this with respect to c using the chain rule (see Section 5.3.3), we
have
dF/dc = (∂f /∂x)(dx/dc) + (∂f /∂y)(dy/dc),

so that, using our expressions for fx (x, y) and fy (x, y) above, we get

dF/dc = λ(∂g/∂x)(dx/dc) + λ(∂g/∂y)(dy/dc) = λ[(∂g/∂x)(dx/dc) + (∂g/∂y)(dy/dc)].

However, given the constraint g(x, y) = c, we see that differentiating both sides with
respect to c we get
(∂g/∂x)(dx/dc) + (∂g/∂y)(dy/dc) = 1,
where we have used the chain rule again on the left-hand-side. Putting these last two
equations together, we find that
dF/dc = λ,
i.e. the Lagrange multiplier is the rate of change of the optimal value of f (x, y) subject
to the constraint g(x, y) = c with respect to c. In particular, if we allowed our constraint


to change from g(x, y) = c to g(x, y) = c + ∆c we would find that the change in the
optimal value of f (x, y) subject to this constraint, i.e. F (c), is given by
∆F/∆c ≈ λ =⇒ ∆F ≈ λ∆c,
provided that ∆c is suitably small. Let’s see how this works in the context of
Example 6.12.

Example 6.13 Using what we found in Example 6.12, find λ and hence find the
approximate change in the maximum value of f (x, y) subject to the constraint
x + y = 34 if the constraint is changed to x + y = 35.

We have found that the maximum value of f (x, y) subject to the constraint
x + y = 34 is f (18, 16) = 2, 722. As this occurs at the point (18, 16) we can use either
of the first two equations we found in Example 6.12 to find λ so, using the first, we
have
160 − 6x − 2y − λ = 0 =⇒ λ = 160 − 6(18) − 2(16) = 20.
Consequently, using the theory above, we have a change in the constraint from
x + y = 34 to x + y = 35 which gives ∆c = 1 and so the change in the maximum
value of f (x, y) subject to this constraint is approximately 20.
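If you want to see the interpretation of λ at work symbolically, you can keep the right-hand side of the constraint as a symbol c and let a computer algebra system do the differentiation. The sketch below (Python with sympy, using the data of Examples 6.12 and 6.13) is only a confirmation of the theory above.

import sympy as sp

x, y, lam, c = sp.symbols('x y lam c', real=True)

f = 160*x - 3*x**2 - 2*x*y - 2*y**2 + 120*y - 18
L = f - lam*(x + y - c)         # constraint x + y = c with a symbolic right-hand side

sol = sp.solve([sp.diff(L, x), sp.diff(L, y), sp.diff(L, lam)], [x, y, lam], dict=True)[0]
F = f.subs({x: sol[x], y: sol[y]})             # optimal value as a function of c

print(sp.simplify(sp.diff(F, c) - sol[lam]))   # 0, i.e. dF/dc equals the multiplier
print(sol[lam].subs(c, 34), F.subs(c, 35) - F.subs(c, 34))   # 20, and the exact change for a unit increase in c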

We now turn to some applications of constrained optimisation in economics.

6.3.4 Applications
Constrained optimisation problems are very common in economics and we now
introduce two ways in which they can arise in that subject. The first is their use when a
consumer wants to maximise their utility subject to a constraint imposed by their
budget and the second is when a firm wants to minimise its costs subject to a constraint
on its level of production.

Utility maximisation subject to a budget constraint

Suppose that a consumer is interested in buying some combination of two goods. Let’s
say the price of the first good is p1 per unit, the price of the second good is p2 per unit
and the consumer has an amount M to spend on them. Indeed, if he wants to purchase
the bundle, (x1 , x2 ), which contains quantities x1 and x2 of the first and second good
respectively, it will cost him
p 1 x1 + p 2 x2 ,
and he can afford this bundle if he satisfies the budget constraint given by

p1 x1 + p2 x2 ≤ M,

where x1 , x2 ≥ 0 as they represent quantities. This gives us a budget set, i.e. the set of
all bundles that the consumer can afford given the prices of the goods and his budget.
Indeed, geometrically, the bundles he can afford are contained in the triangular region
illustrated in Figure 6.4(a).


Figure 6.4: (a) The budget set for our consumer. (b) Adding three contours, u(x1 , x2 ) =
c, where the direction in which u(x1 , x2 ) is increasing is as indicated. Clearly, we are
interested in the point which is indicated in the figure.

Now, if his utility function is u(x1 , x2 ), the consumer wants to maximise this subject to
the constraint that he must be able to afford the bundle. That is, he must maximise
u(x1 , x2 ) subject to the constraint that the bundle he chooses is in the budget set. Let's
assume that, in this case, the utility function has contours u(x1 , x2 ) = c, where c is a constant (these contours are called indifference curves as each point on such a contour gives our consumer the same utility, i.e. he will be indifferent between the bundles represented by points on the same contour), that look like the ones illustrated in Figure 6.4(b) and that the direction of
increasing utility is as indicated. Indeed, we observe in this case that the maximum
value of u(x1 , x2 ) subject to the constraint imposed by the budget set occurs at the
point indicated, i.e. a point where we have a contour of u(x1 , x2 ) which is both

tangential to the line p1 x1 + p2 x2 = M , and

touching the line p1 x1 + p2 x2 = M .


As such, we could use the method of Lagrange multipliers to solve this problem, i.e. we
would write the constraint as p1 x1 + p2 x2 − M = 0 and use the Lagrangean

L(x1 , x2 , λ) = u(x1 , x2 ) − λ(p1 x1 + p2 x2 − M ),

to find the point (x∗1 , x∗2 ) which maximises the consumer’s utility subject to the
constraint. Indeed, having done this, we can define the function

U (M ) = u(x∗1 , x∗2 ),

which tells us the maximum utility of the consumer given his budget, M . In particular,
using the theory in Section 6.3.3, we see that the value of the Lagrange multiplier we
get from solving the equations will satisfy
dU/dM = λ,
i.e. it gives us the consumer’s marginal utility of [budgetary] money if he is purchasing
in a way that maximises his utility subject to his budget set. Let’s look at an example.


Example 6.14 Suppose cats cost £2 each and dogs cost £1 each. If a consumer
has a utility function given by
u(x1 , x2 ) = x1^2 x2^2,
when he buys x1 cats and x2 dogs, how many cats and dogs should he buy if he
wants to maximise his utility given that he has £M to spend? Find, U (M ), the
maximum utility he can attain if he has a budget of M and verify that U′(M ) = λ
where λ is the Lagrange multiplier.

In this case, the budget set will be the region defined by the inequalities
2x1 + x2 ≤ M,
and x1 , x2 ≥ 0 which looks like the one in Figure 6.4(a) whereas the contours
u(x1 , x2 ) = c where u(x1 , x2 ) = x1^2 x2^2 look like the ones sketched in Figure 6.4(b). As
such, we are in the situation described above and so we need to maximise u(x1 , x2 )
subject to the constraint that
2x1 + x2 = M =⇒ 2x1 + x2 − M = 0,
if we want the constraint in the right form. Thus, we have the Lagrangean
L(x1 , x2 , λ) = x1^2 x2^2 − λ(2x1 + x2 − M ),
and we seek the points which simultaneously satisfy the equations Lx1 (x1 , x2 , λ) = 0,
Lx2 (x1 , x2 , λ) = 0 and Lλ (x1 , x2 , λ) = 0. The first-order partial derivatives of
L(x1 , x2 , λ) are
Lx1 (x1 , x2 , λ) = 2x1 x2^2 − 2λ,
Lx2 (x1 , x2 , λ) = 2x1^2 x2 − λ and
Lλ (x1 , x2 , λ) = −(2x1 + x2 − M ),

and we set these equal to zero to yield the equations

2x1 x2^2 − 2λ = 0, 2x1^2 x2 − λ = 0 and 2x1 + x2 − M = 0.

We now solve these by eliminating λ from the first two equations, i.e. we get

λ = x1 x2^2 = 2x1^2 x2 =⇒ x1 x2 (x2 − 2x1 ) = 0 =⇒ x2 = 2x1 ,
where we reject the solutions where x1 = 0 and x2 = 0 as these give a utility of zero
which, clearly, won’t give us the maximum we seek. We then use this new
relationship between x1 and x2 in the third equation, which is just the constraint
2x1 + x2 = M , to get
2x1 + 2x1 = M =⇒ 4x1 = M =⇒ x1 = M/4,
and then, using this in the equation x2 = 2x1 , we get x2 = M/2. Thus, these values
of x1 and x2 maximise our consumer’s utility if he has a budget of M and his
maximum utility is then given by
U (M ) = u(M/4, M/2) = (M/4)^2 (M/2)^2 = M^4/64,


which means that

U′(M ) = 4M^3/64 = M^3/16.
Of course, we can also find the value of λ using, say, the equation
λ = x1 x2^2 =⇒ λ = (M/4)(M/2)^2 = M^3/16,

which verifies that U′(M ) = λ.
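A symbolic verification of this answer, with the budget M left as a parameter, might look like the following sketch (Python with sympy; the candidate solution is the one found above, so this only confirms the working).

import sympy as sp

M = sp.symbols('M', positive=True)
x1, x2, lam = sp.symbols('x1 x2 lam', positive=True)

u = x1**2 * x2**2
L = u - lam*(2*x1 + x2 - M)                    # the Lagrangean from Example 6.14

sol = {x1: M/4, x2: M/2, lam: M**3/16}         # candidate solution from the text

# All three first-order conditions should vanish at the candidate point...
print([sp.simplify(sp.diff(L, v).subs(sol)) for v in (x1, x2, lam)])

# ...and the maximum utility U(M) should satisfy U'(M) = lambda.
U = u.subs(sol)
print(sp.simplify(sp.diff(U, M) - sol[lam]))   # expect 0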

Activity 6.8 Another consumer has a budget of £4 to buy cats and dogs at the
prices in Example 6.14 and her utility function is u(x1 , x2 ) = 3x1 + x2 when she buys
x1 cats and x2 dogs. Sketch the budget set and some contours u(x1 , x2 ) = c where c
is a constant for this consumer. How many cats and dogs should she buy if she wants
to maximise her utility given her budget?

Cost minimisation subject to a production constraint

Suppose that capital costs £v per unit and labour costs £w per unit. This means that a firm which uses an amount k of capital and l of labour will incur costs given by the cost function

C(k, l) = vk + wl.
Also suppose that these inputs allow the firm to produce an amount given by the
production function, q(k, l). We want to ask: How much capital and labour should the
firm use if it needs to produce an amount Q of its product? That is, we want to solve
the constrained optimisation problem
minimise C(k, l) subject to the constraint q(k, l) = Q,
where k, l ≥ 0 as they are quantities. Let’s assume that, in this case, the constraint
q(k, l) = Q looks like the curve in Figure 6.5(a) for k, l ≥ 0. If we also sketch some contours of the cost function (these contours are called isocosts as each point on such a contour costs the firm the same amount of money), we can identify the direction in which costs are decreasing as indicated in Figure 6.5(b). Indeed, we observe in this case that the
minimum value of C(k, l) subject to the constraint q(k, l) = Q occurs at the point
indicated, i.e. a point where we have a contour of C(k, l) which is both

tangential to the constraint q(k, l) = Q, and


touching the constraint q(k, l) = Q.
As such, we could use the method of Lagrange multipliers to solve this problem, i.e. we
would write the constraint as q(k, l) − Q = 0 and use the Lagrangean
L(k, l, λ) = C(k, l) − λ(q(k, l) − Q),
to find the point (k ∗ , l∗ ) which minimises the costs subject to the constraint. Indeed,
having done this, we can define the function
Ĉ(Q) = C(k ∗ , l∗ ),

Figure 6.5: (a) The constraint q(k, l) = Q. (b) Adding three contours, C(k, l) = c, where
the direction in which C(k, l) is decreasing is as indicated. Clearly, we are interested in
the point which is indicated in the figure.

which tells us the minimum cost of producing an amount, Q. In particular, using the
theory in Section 6.3.3, we see that the value of the Lagrange multiplier we get from
solving the equations will satisfy

dĈ/dQ = λ,

i.e. it gives us the marginal cost of the firm if it is producing in a way that minimises its
costs subject to the constraint that it is producing an amount, Q. Let’s look at an
example.

Example 6.15 Suppose capital, k, costs £16 per unit and labour, l, costs £1 per
unit. If a firm can produce an amount given by the production function

q(k, l) = 10k^(1/4) l^(1/4),

what values of k and l will minimise the cost of producing Q units? Find, Ĉ(Q), the
minimum cost of producing Q and verify that Ĉ′(Q) = λ where λ is the Lagrange
multiplier.

In this case, the constraint q(k, l) = Q will look like the curve in Figure 6.5(a) for
k, l ≥ 0 and so we are in the situation described above. Indeed, here the cost
function is
C(k, l) = 16k + l,
and, writing the constraint in the form q(k, l) − Q = 0, we get the Lagrangean

L(k, l, λ) = 16k + l − λ(q(k, l) − Q).

We seek the points which simultaneously satisfy the equations Lk (k, l, λ) = 0,


Ll (k, l, λ) = 0 and Lλ (k, l, λ) = 0 so we find the first-order partial derivatives of


L(k, l, λ), i.e.

Lk (k, l, λ) = 16 − λ(10/4)k^(−3/4) l^(1/4),
Ll (k, l, λ) = 1 − λ(10/4)k^(1/4) l^(−3/4) and
Lλ (k, l, λ) = −(10k^(1/4) l^(1/4) − Q),

and set these equal to zero to yield the equations

16 − (5/2)λk^(−3/4) l^(1/4) = 0,   1 − (5/2)λk^(1/4) l^(−3/4) = 0   and   10k^(1/4) l^(1/4) − Q = 0.
We now solve these by eliminating λ from the first two equations, i.e. we get

16 − (5/2)λ l^(1/4)/k^(3/4) = 0 =⇒ λ = 16(2/5) k^(3/4)/l^(1/4),

from the first equation, and

1 − (5/2)λ k^(1/4)/l^(3/4) = 0 =⇒ λ = (2/5) l^(3/4)/k^(1/4),

from the second equation. As such, we can equate these expressions for λ to get

16(2/5) k^(3/4)/l^(1/4) = (2/5) l^(3/4)/k^(1/4) =⇒ 16k = l.
We then use this new relationship between k and l in the third equation, which is just the constraint 10k^(1/4) l^(1/4) = Q, to get

Q = 10k^(1/4) (16k)^(1/4) =⇒ Q = 20k^(1/2) =⇒ k^(1/2) = Q/20 =⇒ k = Q^2/400,

and then, using this in the equation l = 16k, we get

l = 16(Q^2/400) = Q^2/25.
Thus, these values of k and l minimise the cost of producing Q units. The minimum
cost is then given by
Ĉ(Q) = C(Q^2/400, Q^2/25) = 16(Q^2/400) + Q^2/25 = 2Q^2/25,

and so, we have

Ĉ′(Q) = 4Q/25.

Of course, we can also find the value of λ using, say, the equation

λ = (2/5) l^(3/4)/k^(1/4) =⇒ λ = (2/5) (Q^2/25)^(3/4)/(Q^2/400)^(1/4) = 4Q/25,

which verifies that Ĉ′(Q) = λ.
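As in the previous application, the answer can be verified symbolically with Q left as a parameter. The sketch below (Python with sympy; it simply substitutes the solution found above into the first-order conditions) also confirms that Ĉ′(Q) = λ.

import sympy as sp

Q = sp.symbols('Q', positive=True)
k, l, lam = sp.symbols('k l lam', positive=True)

C = 16*k + l
q = 10*k**sp.Rational(1, 4) * l**sp.Rational(1, 4)
L = C - lam*(q - Q)                             # the Lagrangean from Example 6.15

sol = {k: Q**2/400, l: Q**2/25, lam: 4*Q/25}    # candidate solution from the text

print([sp.simplify(sp.diff(L, v).subs(sol)) for v in (k, l, lam)])   # expect [0, 0, 0]

Chat = C.subs(sol)                              # minimum cost as a function of Q
print(sp.simplify(sp.diff(Chat, Q) - sol[lam])) # expect 0, i.e. C'(Q) = lambda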


Learning outcomes
At the end of this chapter and having completed the relevant reading and activities, you
should be able to:

find and classify the stationary points of a function of two variables;

solve problems from economics-based subjects that involve unconstrained


optimisation;

optimise a function in the presence of constraints;

solve problems from economics-based subjects that involve constrained


optimisation.

Solutions to activities
Solution to activity 6.1
The first-order partial derivatives of the function are

fx (x, y) = 2x − 4 and fy (x, y) = 2y + 4.

At a stationary point, both of the first-order partial derivatives are zero, i.e. we must
have fx (x, y) = 0 and fy (x, y) = 0. Thus, to find the stationary points we have to solve
the simultaneous equations

2x − 4 = 0 and 2y + 4 = 0.

But, clearly, the first of these equations gives x = 2 and the second gives y = −2. Thus,
(2, −2) is the only stationary point of f (x, y).

Solution to activity 6.2


The first-order partial derivatives of the function are

fx (x, y) = 9x2 + 18x − 72 and fy (x, y) = 6y 2 − 24y − 126.

At a stationary point, both of the first-order partial derivatives are zero, i.e. we must
have fx (x, y) = 0 and fy (x, y) = 0. Thus, to find the stationary points we have to solve
the simultaneous equations

9x2 + 18x − 72 = 0 and 6y 2 − 24y − 126 = 0.

Now, notice that the first equation contains no ‘y’s and the second equation contains no
‘x’s. As such, the first equation tells us everything there is to know about x, i.e.

9x2 + 18x − 72 = 0 =⇒ x2 + 2x − 8 = 0 =⇒ (x + 4)(x − 2) = 0 =⇒ x = −4 or x = 2,

whereas the second equation tells us everything we need to know about y, i.e.

6y 2 − 24y − 126 = 0 =⇒ y 2 − 4y − 21 = 0 =⇒ (y + 3)(y − 7) = 0 =⇒ y = −3 or y = 7.


As such, since we can take any of the x values with any of the y values we can see that
this function has four stationary points, namely (−4, −3), (−4, 7), (2, −3) and (2, 7).

Solution to activity 6.3


Using the first-order partial derivatives we found in Activity 6.1, we find that the
second-order partial derivatives are

fxx (x, y) = 2, fxy (x, y) = 0 = fyx (x, y) and fyy (x, y) = 2.

As these are constants, they take these values at the stationary point (and, indeed, at
all other points). Thus, we can see that the Hessian at the stationary point is given by

H(2, −2) = (2)(2) − (0)2 = 4 > 0 and fxx (2, −2) = 2 > 0,

so this is a local minimum.

Solution to activity 6.4


Using the first-order partial derivatives we found in Activity 6.2, we find that the
second-order partial derivatives are
fxx (x, y) = 18x + 18, fxy (x, y) = 0 = fyx (x, y) and fyy (x, y) = 12y − 24,

and, as such, the Hessian is given by

H(x, y) = (18x + 18)(12y − 24) − 02 = 216(x + 1)(y − 2).

Evaluating this at each of the stationary points we find that:

At (−4, −3), the Hessian is

H(−4, −3) = 216(−3)(−5) > 0 and fxx (−4, −3) = 18(−4) + 18 < 0,

so this is a local maximum.

At (−4, 7), the Hessian is

H(−4, 7) = 216(−3)(+5) < 0,

and so this is a saddle point.

At (2, −3), the Hessian is

H(2, −3) = 216(+3)(−5) < 0,

and so this is a saddle point.

At (2, 7), the Hessian is

H(2, 7) = 216(+3)(+5) > 0 and fxx (2, 7) = 18(2) + 18 > 0,

so this is a local minimum.


Thus, the stationary point (−4, −3) is a local maximum, (−4, 7) and (2, −3) are saddle
points and (2, 7) is a local minimum.

Solution to activity 6.5


The first-order partial derivatives of this function are

fx (x, y) = 4(x − 1)3 and fy (x, y) = 4(y − 1)3 .

So, clearly, the only stationary point is at (1, 1) as this is the only point that makes
fx (x, y) = 0 and fy (x, y) = 0. The second-order partial derivatives of this function are
given by

fxx (x, y) = 12(x − 1)2 , fxy (x, y) = 0 = fyx (x, y) and fyy (x, y) = 12(y − 1)2 ,

and, as such, the Hessian is given by

H(x, y) = [12(x − 1)2 ][12(y − 1)2 ] − 02 = 144(x − 1)2 (y − 1)2 .

Indeed, evaluating this at the stationary point gives H(1, 1) = 0 and so the method we
used above fails.

However, if we consider the surface z = f (x, y), notice that we have z = f (1, 1) = 0 at
the stationary point and for all other x, y ∈ R, we have

z = f (x, y) = (x − 1)4 + (y − 1)4 > 0,

i.e. f (x, y) ≥ f (1, 1) for all x, y ∈ R. Consequently, it should be clear that this function
has a local minimum at (1, 1) and this minimum value is zero. (Actually, this is not only a local minimum, it is a global minimum as this is truly the smallest value the function can take for x, y ∈ R.)

Solution to activity 6.6


We have found that the profit function is given by
 
1 2 2
π(x, y) = − 15 + 17x + 28y − 5x − 5y + xy ,
3

and, to maximise this, we need to find its stationary points and determine which of
them gives us a maximum. So, we start by finding the first-order partial derivatives of
π(x, y), i.e.
   
1 1
πx (x, y) = 17 − 10x + y and πy (x, y) = 28 − 10y + x .
3 3

At a stationary point, both of these first-order partial derivatives are zero, i.e. we must
have πx (x, y) = 0 and πy (x, y) = 0. Thus, to find the stationary points, we have to solve
the simultaneous equations

10x − y = 17 and x − 10y = −28.



We start by noticing that the first equation gives us y = 10x − 17 and so, substituting
this into the second equation, we get
 
x − 10(10x − 17) = −28 =⇒ −99x = −198 =⇒ x = 2,

and then, using y = 10x − 17 again, we get y = 3. Thus, the profit function, π(x, y), has
(2, 3) as its only stationary point.

To classify this stationary point, we look at the second-order partial derivatives of


π(x, y), which are
πxx (x, y) = −10/3, πxy (x, y) = 1/3 = πyx (x, y) and πyy (x, y) = −10/3,

and, as such, the Hessian is given by

H(x, y) = (−10/3)(−10/3) − (1/3)^2 = 100/9 − 1/9 = 11.
Clearly, at (2, 3), we have H(2, 3) > 0 and πxx (2, 3) < 0, which means that the
stationary point we have found is indeed a local maximum. Consequently, to maximise
its profit, the firm should produce 2 units of X and 3 units of Y so that it can sell them
at prices, in pounds, of
pX = (17 − 2(2) − 3)/3 = 10/3 ≈ 3.33 and pY = (28 − 2 − 2(3))/3 = 20/3 ≈ 6.67,
respectively and, in doing so, the firm will make a maximum profit of
 
π(2, 3) = (1/3)(−15 + 17(2) + 28(3) − 5(2)^2 − 5(3)^2 + (2)(3)) = 44/3 ≈ 14.67,
pounds.

Solution to activity 6.7


Of course, this should have been obvious as

f (x, y) = (x − 1)2 + (y − 1)2 ≥ 0,

for all points (x, y) ∈ R2 with a minimum of zero at (1, 1). Thus, we see that we have
found the minimum of f (x, y) for all (x, y) ∈ R2 and so it must be the minimum in the
given region too since it is in that region.

Solution to activity 6.8


Given the prices in Example 6.14 and the consumer's budget of £4, we see that the
budget set is given by
2x1 + x2 ≤ 4,
where x1 , x2 ≥ 0 as they are quantities. This is sketched in Figure 6.6(a).

We are now asked to sketch some contours u(x1 , x2 ) = c where c is a constant and

u(x1 , x2 ) = 3x1 + x2 ,


for this consumer. Indeed, looking at the budget set, it makes sense to choose the
contours where c = 4 and c = 6 and these are illustrated in Figure 6.6(b). This allows us
to see the direction of increasing utility, which is indicated in the figure, and allows us
to see that the point (2, 0) is the one where we get the highest utility if we are
constrained to stay within the budget set. Consequently, this consumer should buy two
cats and no dogs if she wants to maximise her utility subject to her budget constraint.

Figure 6.6: The sketches for Activity 6.8. (a) The budget set for our consumer. (b) Adding
two contours, u(x1 , x2 ) = c, where c = 4 and c = 6. The direction in which u(x1 , x2 ) is
increasing is as indicated and we are interested in the point which is indicated in the
figure.

Exercises
Exercise 6.1
Find and classify the stationary points of the function f (x, y) = x3 − y 3 − 3xy.

Exercise 6.2
The function
f (x, y) = x2 ln y − y ln y,
is defined for y > 0 and all x ∈ R. Find its stationary points and classify them.

Exercise 6.3
Suppose that a firm can sell its product in a domestic and a foreign market and that
the inverse demand functions for these two markets are

p1 = 30 − 4q1 and p2 = 50 − 5q2 ,

where p1 and p2 are the prices (in pounds) if they sell quantities q1 and q2 (in tonnes) in
the domestic and foreign markets respectively. Given that the total cost function of the
firm (in pounds) is
TC(q) = 10 + 10q,
where q is the quantity produced (in tonnes) and that the firm has a monopoly in both
markets, find the quantities it should sell in these markets if they want to maximise
their profit. What are the corresponding prices? What is the maximum profit?


Exercise 6.4
Use the method of Lagrange multipliers to optimise the function

f (x, y) = x^(3/8) y^(2/3),

subject to the constraint x2 + y 2 = 25 where x, y > 0.

By sketching the constraint and some contours of f , justify your use of the method of
Lagrange multipliers and determine whether the point you have found maximises or
minimises f subject to the constraint.

Exercise 6.5
Given an amount of capital, k, and labour, l, a firm produces a quantity of goods,
q(k, l), where
q(k, l) = ln k + ln l,
for k, l > 0. Suppose that each unit of capital costs £2 and each unit of labour costs £3.
Use the method of Lagrange multipliers to find the values of k and l that maximise the
firm’s production given that their total budget for capital and labour is £M .
Hence show that the maximum production the firm can achieve given a budget of £M
is given by

Q(M ) = 2 ln(M/(2√6)),

and verify that Q′(M ) = λ where λ is the Lagrange multiplier.

Solutions to exercises
Solution to exercise 6.1
Given that
f (x, y) = x3 − y 3 − 3xy,
we see that the first-order partial derivatives of this function are

fx (x, y) = 3x2 − 3y and fy (x, y) = −3y 2 − 3x.

At a stationary point, both of the first-order partial derivatives are zero, i.e. we must
have fx (x, y) = 0 and fy (x, y) = 0. Thus, to find the stationary points we have to solve
the simultaneous equations

3x2 − 3y = 0 and − 3y 2 − 3x = 0.

If we start by looking at the first equation, this gives us y = x2 and, substituting this
into the second equation, we get

−3(x2 )2 − 3x = 0 =⇒ x4 + x = 0 =⇒ x(x3 + 1) = 0 =⇒ x = 0 or x = −1.

Thus, as y = x2 , we see that the stationary points of this function are (0, 0) and (−1, 1).


To classify these stationary points, we note that the second-order partial derivatives are

fxx (x, y) = 6x, fxy (x, y) = −3 = fyx (x, y) and fyy (x, y) = −6y,

and, as such, the Hessian is given by

H(x, y) = (6x)(−6y) − (−3)2 = −9(4xy + 1).

Evaluating this at each of the stationary points we then find that:

At (0, 0), the Hessian is


H(0, 0) = −9(1) < 0,
and so this is a saddle point.

At (−1, 1), the Hessian is

H(−1, 1) = −9(−4 + 1) > 0 and fxx (−1, 1) = 6(−1) < 0,

and so this is a local maximum.


Thus, the stationary points (0, 0) and (−1, 1) are a saddle point and a local maximum
respectively.
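If you wish, you can check this classification with the same kind of sympy sketch used earlier in the chapter (optional, and only a confirmation of the working above).

import sympy as sp

x, y = sp.symbols('x y', real=True)
f = x**3 - y**3 - 3*x*y                         # the function from Exercise 6.1

fxx = sp.diff(f, x, 2)
H = fxx*sp.diff(f, y, 2) - sp.diff(f, x, y)**2

for s in sp.solve([sp.diff(f, x), sp.diff(f, y)], [x, y], dict=True):
    if all(v.is_real for v in s.values()):      # ignore any complex solutions
        print(s, 'H =', H.subs(s), 'fxx =', fxx.subs(s))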

Solution to exercise 6.2


Given that
f (x, y) = x2 ln y − y ln y,
for y > 0 and all x ∈ R, we see that the first-order partial derivatives of this function are

fx (x, y) = 2x ln y and fy (x, y) = x^2/y − (ln y + 1),

where we have used the product rule when finding fy (x, y). At a stationary point, both
of the first-order partial derivatives are zero, i.e. we must have fx (x, y) = 0 and
fy (x, y) = 0. Thus, to find the stationary points we have to solve the simultaneous
equations
2x ln y = 0 and x^2/y − ln y − 1 = 0.
If we start by looking at the first equation, this gives us

x ln y = 0 =⇒ x = 0 or ln y = 0 =⇒ x = 0 or y = 1.

And so, to satisfy the second equation with:

x = 0 we must have

0 − ln y − 1 = 0 =⇒ ln y = −1 =⇒ y = e−1 ,

i.e. (0, e−1 ) is a stationary point.


y = 1 we must have
x^2/1 − ln 1 − 1 = 0 =⇒ x^2 = 1 =⇒ x = ±1,
i.e. (1, 1) and (−1, 1) are stationary points.
Consequently, the points (0, e−1 ), (1, 1) and (−1, 1) are stationary points of this
function.

To classify these stationary points, we note that the second-order partial derivatives are
fxx (x, y) = 2 ln y, fxy (x, y) = 2x/y = fyx (x, y) and fyy (x, y) = −x^2/y^2 − 1/y,

and, as such, the Hessian is given by

H(x, y) = (2 ln y)(−x^2/y^2 − 1/y) − (2x/y)^2 = −[2(x^2 + y) ln y + 4x^2]/y^2.
Evaluating this at each of the stationary points we then find that:

At (0, e−1 ), the Hessian is

H(0, e−1 ) = −(2 e−1 ln(e−1 ))/e−2 = 2e > 0 and fxx (0, e−1 ) = 2 ln(e−1 ) = −2 < 0,

as ln(e−1 ) = −1 and so this is a local maximum.

At (1, 1), the Hessian is


H(1, 1) = −4/1 < 0,
as ln 1 = 0 and so this is a saddle point.

At (−1, 1), the Hessian is


H(−1, 1) = −4/1 < 0,
as ln 1 = 0 and so this is a saddle point.
Thus, the stationary points (0, e−1 ), (1, 1) and (−1, 1) are a local maximum and two
saddle points respectively.

Solution to exercise 6.3


Here the firm is a monopoly and so, as it is the sole supplier of its product in both
markets, when it supplies quantities q1 and q2 to the domestic and foreign markets
respectively, the prices will be given by the inverse demand functions

p1 = 30 − 4q1 and p2 = 50 − 5q2 ,

respectively.9 This means that their total revenue is given by

TR(q1 , q2 ) = p1 q1 + p2 q2 = (30 − 4q1 )q1 + (50 − 5q2 )q2 ,


9 Note that the situation described here, where a producer charges different prices in different
markets, is sometimes known as price discrimination.


and their total costs are given by

TC(q) = 10 + 10q =⇒ TC(q1 , q2 ) = 10 + 10(q1 + q2 ),

as q = q1 + q2 is the quantity being produced. As such, their profit function is

π(q1 , q2 ) = TR(q1 , q2 ) − TC(q1 , q2 ) = 20q1 + 40q2 − 4q12 − 5q22 − 10,

and we need to find the values of q1 and q2 that maximise this.

To do this, we see that the first-order partial derivatives of π(q1 , q2 ) are

πq1 (q1 , q2 ) = 20 − 8q1 and πq2 (q1 , q2 ) = 40 − 10q2 ,

and so, as a stationary point occurs when πq1 (q1 , q2 ) = 0 and πq2 (q1 , q2 ) = 0, we need to
solve the simultaneous equations

20 − 8q1 = 0 and 40 − 10q2 = 0.

But, of course, the first equation gives q1 = 5/2 and the second equation gives q2 = 4
which means that (5/2, 4) is the only stationary point of π(q1 , q2 ).

To check that this is a maximum, we look at the second-order partial derivatives of


π(q1 , q2 ), which are

πq1 q1 (q1 , q2 ) = −8, πq1 q2 (q1 , q2 ) = 0 = πq2 q1 (q1 , q2 ) and πq2 q2 (q1 , q2 ) = −10,

and, as such the Hessian is given by

H(q1 , q2 ) = (−8)(−10) − 02 = 80.

Clearly, at (5/2, 4), we have H(5/2, 4) > 0 and πq1 q1 (5/2, 4) < 0 which means that the
stationary point we have found is indeed a local maximum. Consequently, to maximise
its profit, the firm should supply 5/2 tonnes of its product to the domestic market and 4
tonnes of its product to the foreign market so that it can sell them at prices, in pounds,
of

p1 = 30 − 4(5/2) = 20 and p2 = 50 − 5(4) = 30,
respectively and, in doing so, the firm will make a maximum profit of

π(5/2, 4) = 20(5/2) + 40(4) − 4(5/2)2 − 5(4)2 − 10 = 95,

pounds.
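As an optional check, again assuming Python with SymPy, the stationary point, the
second-order conditions and the maximum profit can all be confirmed directly from
the profit function.

import sympy as sp

q1, q2 = sp.symbols('q1 q2', positive=True)
profit = (30 - 4*q1)*q1 + (50 - 5*q2)*q2 - (10 + 10*(q1 + q2))

# First-order conditions: the only stationary point is (5/2, 4).
sol = sp.solve([sp.diff(profit, q1), sp.diff(profit, q2)], [q1, q2])
print(sol)                              # {q1: 5/2, q2: 4}

# Second-order conditions: H > 0 and pi_{q1 q1} < 0, so this is a local maximum.
H = sp.diff(profit, q1, 2)*sp.diff(profit, q2, 2) - sp.diff(profit, q1, q2)**2
print(H, sp.diff(profit, q1, 2))        # 80 and -8

print(profit.subs(sol))                 # the maximum profit, 95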

Solution to exercise 6.4


Writing the constraint in the form x2 + y 2 − 25 = 0, we get the Lagrangean

L(x, y, λ) = x3/8 y 2/3 − λ(x2 + y 2 − 25),


and we seek the points which simultaneously satisfy the equations Lx (x, y, λ) = 0,
Ly (x, y, λ) = 0 and Lλ (x, y, λ) = 0. So we find the first-order partial derivatives of
L(x, y, λ), i.e.
Lx (x, y, λ) = (3/8) x−5/8 y2/3 − 2xλ,
Ly (x, y, λ) = (2/3) x3/8 y−1/3 − 2yλ and
Lλ (x, y, λ) = −(x2 + y2 − 25),

and set these equal to zero to yield the equations


(3/8) x−5/8 y2/3 − 2xλ = 0,   (2/3) x3/8 y−1/3 − 2yλ = 0   and   x2 + y2 − 25 = 0.
We now solve these by eliminating λ from the first two equations, i.e. we get

(3/8) x−5/8 y2/3 − 2xλ = 0 =⇒ λ = (3/16) y2/3/x13/8 ,

from the first equation, and


 
(2/3) x3/8 y−1/3 − 2yλ = 0 =⇒ λ = (1/3) x3/8/y4/3 ,

from the second equation. As such, we can equate these expressions for λ to get

(3/16) y2/3/x13/8 = (1/3) x3/8/y4/3 =⇒ y2 = (16/9) x2 .

We then use this new relationship between x and y in the third equation, which is just
the constraint x2 + y 2 = 25, to get
x2 + (16/9) x2 = 25 =⇒ (25/9) x2 = 25 =⇒ x2 = 9 =⇒ x = 3,
as x > 0. Then, using this in the equation y 2 = 16x2 /9, we get
y2 = (16/9)(32) = 16 =⇒ y = 4,
9
as y > 0. Thus, x = 3 and y = 4 will optimise f (x, y) subject to the constraint.

The constraint is x2 + y 2 = 25 and this is a circle of radius five centred on the origin
which, for x, y > 0, is illustrated in Figure 6.7(a). The objective function,
f (x, y) = x3/8 y 2/3 has contours f (x, y) = c, where c is a constant, that look a bit like
rectangular hyperbolae as illustrated in Figure 6.7(b). The direction in which f (x, y) is
increasing is indicated in this figure along with the point we found above using the
Lagrange multiplier method — i.e. a point where we have a contour of f (x, y) which is
both tangential to the constraint and touching the constraint. Having seen this, it
should be clear that this point will maximise f subject to the constraint.
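As an optional check, assuming Python with SymPy as in the earlier sketches, we can
verify that the point (3, 4) satisfies the constraint and that the two expressions
for λ obtained from the first-order conditions agree there.

import sympy as sp

x, y = sp.symbols('x y', positive=True)
f = x**sp.Rational(3, 8)*y**sp.Rational(2, 3)

# From L_x = 0 and L_y = 0 we get two expressions for the multiplier lambda.
lam_from_x = sp.diff(f, x)/(2*x)
lam_from_y = sp.diff(f, y)/(2*y)

point = {x: 3, y: 4}
print(sp.simplify(lam_from_x.subs(point) - lam_from_y.subs(point)))  # 0: the lambdas agree
print((x**2 + y**2 - 25).subs(point))                                # 0: constraint satisfied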

Figure 6.7: The sketches for Exercise 6.4. (a) The constraint x2 + y2 = 25 for x, y > 0. (b)
Adding three contours, f (x, y) = c, where the direction in which f (x, y) is increasing is
as indicated. Clearly, we are interested in the point (3, 4) which is indicated in the figure.

Solution to exercise 6.5

The firm has £M to spend on capital and labour where each unit of capital costs £2
and each unit of labour costs £3. As such, the cost of using k units of capital and l
units of labour is 2k + 3l and this gives us the constraint 2k + 3l = M .10 So, to
maximise the quantity
q(k, l) = ln k + ln l,
that the firm can produce subject to the constraint 2k + 3l = M where k, l > 0 we use
the Lagrangean
L(k, l, λ) = ln(k) + ln(l) − λ(2k + 3l − M ).
We seek the points which simultaneously satisfy the equations Lk (k, l, λ) = 0,
Ll (k, l, λ) = 0 and Lλ (k, l, λ) = 0. The first-order derivatives of L(k, l, λ) are

Lk (k, l, λ) = 1/k − 2λ,   Ll (k, l, λ) = 1/l − 3λ   and   Lλ (k, l, λ) = −(2k + 3l − M ),
and we set these equal to zero to yield the equations

1/k − 2λ = 0,   1/l − 3λ = 0   and   2k + 3l − M = 0.
We now solve these by eliminating λ from the first two equations, i.e. we get

λ = 1/(2k) = 1/(3l) =⇒ 3l = 2k =⇒ k = (3/2) l.
We then use this new relationship between k and l in the third equation, which is just
the constraint 2k + 3l = M , to get

2(3l/2) + 3l = M =⇒ 6l = M =⇒ l = M/6,
10 Strictly, the constraint is 2k + 3l ≤ M where k, l > 0, but we can see that if we chose a point
where 2k + 3l < M , we could not maximise the quantity produced since, by spending more on capital
and labour to reach a point where 2k + 3l = M , we would produce a larger quantity. This should make
sense if you consider the discussion of budget constraints in Section 6.3.4.


and then, using this in the equation k = 3l/2, we get

k = (3/2) × (M/6) = M/4.
Thus the values of k and l that maximise q(k, l) subject to the constraint are k = M/4
and l = M/6.

In this case, the maximum production achievable, given a budget of £M , is

Q(M ) = q(M/4, M/6) = ln(M/4) + ln(M/6) = ln(M2/24) = 2 ln(M/(2√6)),
as required. Further, we can find the value of λ using, say, the equation

λ = 1/(2k) =⇒ λ = 1/(2(M/4)) = 2/M,

and we can see that

Q(M ) = 2 ln(M/(2√6)),

can be written as

Q(M ) = 2 ln M − 2 ln(2√6) =⇒ Q′(M ) = 2/M,
which verifies that Q′(M ) = λ.
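As a final optional check, again assuming Python with SymPy, the values of k, l and λ,
and the property Q′(M ) = λ, can be confirmed by solving the first-order conditions of
the Lagrangean symbolically.

import sympy as sp

k, l, lam, M = sp.symbols('k l lam M', positive=True)
L = sp.log(k) + sp.log(l) - lam*(2*k + 3*l - M)

# Solve the three first-order conditions for k, l and lambda in terms of M.
sol = sp.solve([sp.diff(L, v) for v in (k, l, lam)], [k, l, lam], dict=True)[0]
print(sol)                                    # {k: M/4, l: M/6, lam: 2/M}

# Maximum production Q(M) and the envelope property Q'(M) = lambda.
Q = sp.log(sol[k]) + sp.log(sol[l])
print(sp.simplify(sp.diff(Q, M) - sol[lam]))  # 0, so Q'(M) = lambda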

Note: Although this question is similar to what we saw in Example 6.15, notice that
here we are maximising production subject to a budget constraint whereas in
Example 6.15 we were minimising costs subject to a production constraint. In
particular, this means that you should always read the question carefully to ensure that
you are using the correct objective function and constraint! Further, we were not asked
to justify the assertion that the optimal point we found was a maximum here and so we
haven’t, but sometimes, as in Exercise 6.4, we will be asked to provide such a
justification.
