
EC3314: Mathematics for Economists

Lecture Notes 6: Optimization 1

Jingfeng Lu (ecsljf@nus.edu.sg), AS2, 05-27.

Economics Department
National University of Singapore

March 16, 2023

1 / 57
Readings: Chapters 8, 13, 14 of EMEA; Chapter 3 of FMEA;
Chapters 9, 11, 12, 13 of FMME

2 / 57
Optimization: single-variable case

3 / 57
Optimization

• In economics, we frequently want to find the choice that gives us
the most (or least) of something.
• Problems of finding such choices are known as optimization
problems.
• We usually go about this by formulating mathematical models.
• Most of the optimization done in economics uses functions of
multiple variables, but the insights from those problems come
from single-variable optimization problems.
• We will use calculus to find the extreme points of functions.

4 / 57
Extrema
• The points at which a function reaches its highest and lowest values
are the maximum and minimum points, or the points where the
maxima and minima occur.
• Jointly, these are called extreme points, or the points where
extrema (singular: extremum) occur.
• More precisely, if $f(x)$ has domain $D$,

$c \in D$ is a maximum point for $f$ iff $f(x) \le f(c)$ for all $x \in D$

$d \in D$ is a minimum point for $f$ iff $f(x) \ge f(d)$ for all $x \in D$

• Also, $f(c)$ is the maximum value and $f(d)$ is the minimum value.
• In the above definitions, $c$ and $d$ are strict maximum or minimum
points if the inequalities hold strictly ($<$ or $>$) for all $x \ne c$
(respectively $x \ne d$).

5 / 57
Local Extrema

• Before looking for overall maxima and minima of a function, we
look for its maxima and minima in small regions of the domain.
• $f$ has a local maximum (or minimum) at $c$ if there exists an
interval around $c$ such that $f(x) \le f(c)$ (or $f(x) \ge f(c)$) for all $x$ in
the interval that are also in the domain of $f$.

6 / 57
Finding Local Extrema

• There are various ways extrema could occur.
• One common way is in the interior of the domain of a function
(away from boundaries), at a point where the derivative of the
function exists.
• In this case, if a local extremum exists, then the derivative of the
function must be 0 at the extremum. A point where the derivative
is 0 is called a stationary point.
• This is only a necessary condition: a local extremum of $f(x)$ at such
a point $x = c$ $\Rightarrow$ $f'(c) = 0$.
• There are cases where $f'(c) = 0$ and there is no local extremum.

7 / 57
Local Extrema
[Figure: graph of $f(x)$ against $x$]

8 / 57
Local Extrema
[Figure: graph of $f(x)$ against $x$]

9 / 57
First Derivative Test

• We want to tell the difference between local maxima and local
minima. We can use the first derivative test:
• Assume that $c$ is an interior point of the domain of the
function, and the function is differentiable in some interval $(a, b)$
around the point $x = c$, where $f'(c) = 0$.
  • If $f'(x) \ge 0$ on $(a, c)$ and $f'(x) \le 0$ on $(c, b)$, then $x = c$ is a local
  maximum point for $f$.
  • If $f'(x) \le 0$ on $(a, c)$ and $f'(x) \ge 0$ on $(c, b)$, then $x = c$ is a local
  minimum point for $f$.
  • If $f'(x) > 0$ on $(a, c)$ and on $(c, b)$, then $x = c$ is NOT a local
  extremum for $f$.
  • If $f'(x) < 0$ on $(a, c)$ and on $(c, b)$, then $x = c$ is NOT a local
  extremum for $f$.
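
The sign pattern in the first derivative test can be checked numerically. The following is a minimal sketch, not part of the original notes: the function name `first_derivative_test` and the step size `h` are assumptions made for illustration, and sampling one point on each side of $c$ only suggests the sign of $f'$ on the whole interval rather than proving it.

```python
def first_derivative_test(df, c, h=1e-3):
    """Classify a stationary point c from the sign of f' just left and right of c.

    df is the derivative f' supplied by the user; h is a hypothetical
    half-width standing in for the interval (a, b) around c.
    """
    left, right = df(c - h), df(c + h)
    if left > 0 and right < 0:
        return "local maximum"
    if left < 0 and right > 0:
        return "local minimum"
    if left * right > 0:
        # Same strict sign on both sides: f keeps rising or keeps falling.
        return "not a local extremum"
    return "inconclusive"


print(first_derivative_test(lambda x: 2 * x, 0.0))       # f(x) = x^2
print(first_derivative_test(lambda x: -2 * x, 0.0))      # f(x) = -x^2
print(first_derivative_test(lambda x: 3 * x ** 2, 0.0))  # f(x) = x^3
```

The three calls use the derivatives of $x^2$, $-x^2$ and $x^3$, the same functions plotted on the slides that follow.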

10 / 57
$f(x) = x^2$
[Figure: graphs of $f(x)$ and $f'(x)$ against $x$]

11 / 57
$f(x) = -x^2$
[Figure: graphs of $f(x)$ and $f'(x)$ against $x$]

12 / 57
$f(x) = x^3$
[Figure: graphs of $f(x)$ and $f'(x)$ against $x$]

13 / 57
Second Derivative Test

• It may be the case that you do not know the derivative of a
function at points other than the possible extreme point itself.
Then you cannot use the first derivative test.
• If this is the case, and the function is twice differentiable at the
possible extreme point (i.e. $f''(c)$ exists), then we can use the
second derivative test:
• Assume that $c$ is an interior point of the domain of the
function, and the function is twice differentiable at $x = c$, where
$f'(c) = 0$.
  • If $f''(c) < 0$, then $x = c$ is a strict local maximum point.
  • If $f''(c) > 0$, then $x = c$ is a strict local minimum point.
  • If $f''(c) = 0$, the test says nothing about the possible extreme point.
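
As a small illustration of the three cases, here is a sketch under the assumption that the value $f''(c)$ has already been computed by hand; the helper name `second_derivative_test` is hypothetical and not from the notes.

```python
def second_derivative_test(d2f_at_c):
    """Classify a stationary point c from the value of f''(c)."""
    if d2f_at_c < 0:
        return "strict local maximum"
    if d2f_at_c > 0:
        return "strict local minimum"
    # f''(c) = 0: the test gives no information.
    return "inconclusive"


# f(x) = -x^2 has f''(x) = -2 everywhere.
print(second_derivative_test(-2.0))
# f(x) = -x^4 has f''(x) = -12x^2, so f''(0) = 0 even though x = 0 is a maximum.
print(second_derivative_test(-12 * 0.0 ** 2))
# f(x) = x^3 has f''(x) = 6x, so f''(0) = 0 and x = 0 is not an extremum.
print(second_derivative_test(6 * 0.0))
```

The last two calls both return the inconclusive case, although $-x^4$ has a maximum at 0 and $x^3$ does not, which is why a zero second derivative needs a different argument.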

14 / 57
$f(x) = -x^2$
[Figure: graphs of $f(x)$, $f'(x)$ and $f''(x)$ against $x$]

15 / 57
$f(x) = -x^4$
[Figure: graphs of $f(x)$, $f'(x)$ and $f''(x)$ against $x$]

16 / 57
Inflection Points

• If $f''(c) = 0$, there could be a maximum, a minimum, or an
inflection point.
• An inflection point occurs where the concavity of the function
changes (either convex to concave or concave to convex).
• More precisely, $x = c$ is an inflection point of $f$ if there exists an
interval $(a, b)$ around $c$ such that
  • $f''(x) \ge 0$ on $(a, c)$ and $f''(x) \le 0$ on $(c, b)$, OR
  • $f''(x) \le 0$ on $(a, c)$ and $f''(x) \ge 0$ on $(c, b)$

17 / 57
$f(x) = x^3$
[Figure: graphs of $f(x)$, $f'(x)$ and $f''(x)$ against $x$]

18 / 57
Example

Find the local maximum and minimum points of
$f(x) = \frac{x^4}{4} - \frac{2x^3}{3} + \frac{x^2}{2} - 1$
using both the first and second derivative tests.
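
One way to work this example is sketched below. The factorisation of $f'$ is a step added here for illustration and is not taken from the slides.

```latex
\begin{align*}
f(x)   &= \tfrac{x^4}{4} - \tfrac{2x^3}{3} + \tfrac{x^2}{2} - 1 \\
f'(x)  &= x^3 - 2x^2 + x = x(x-1)^2
          \quad\Rightarrow\quad \text{stationary points } x = 0 \text{ and } x = 1 \\
f''(x) &= 3x^2 - 4x + 1, \qquad f''(0) = 1 > 0, \qquad f''(1) = 0
\end{align*}
% Second derivative test: x = 0 is a strict local minimum; the test is silent at x = 1.
% First derivative test: f'(x) < 0 for x < 0 and f'(x) > 0 for 0 < x < 1,
% so x = 0 is a local minimum; f'(x) > 0 on both sides of x = 1,
% so x = 1 is not a local extremum.
```

On this reading the function has a single local minimum point, $x = 0$, and no local maximum point.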

19 / 57
$f(x) = \frac{x^4}{4} - \frac{2x^3}{3} + \frac{x^2}{2} - 1$
[Figure: graphs of $f(x)$, $f'(x)$ and $f''(x)$ against $x$]

20 / 57
Other Local Extrema

• As we stated, the methods we've used only apply to interior points
of the domain at which the function is differentiable.
• This implies we should check two other sets of points:
  • Boundary points: points at the edges of the function's domain
  • Points where the derivative does not exist: corners, discontinuities,
  asymptotes, etc. These points, along with points where $f'(x) = 0$,
  are collectively known as critical points.
• We can check whether the values of the function around these points
(or to one side of boundary points) are all less (or greater) than
the value at the point itself to identify a maximum (or minimum).

21 / 57

f (x) = x
[Figure: graph of $f(x)$ against $x$]

22 / 57
$f(x) = |x|$
[Figure: graph of $f(x)$ against $x$]

23 / 57
Global Extrema

• A global maximum (or minimum) is simply the highest (or lowest)
value achieved by the function in its entire domain.
• Finding global minima and maxima is similar to finding local
minima and maxima.
• Given the definitions of local and global extrema, it must be the
case that all global extrema are also local extrema (if a point
gives the highest overall value of a function, then it must also
give the highest value of the function around the point).
• So, provided global extrema exist, we simply need to find all the
local extrema and pick the ones that give the highest and lowest values.

24 / 57
Extreme Value Theorem

• First case: $f$ is a continuous function on a closed and bounded
interval $[a, b]$.
• The extreme value theorem: in this case, there exist a point $d$
and a point $c$ in $[a, b]$ such that

$f(d) \le f(x) \le f(c)$ for all $x$ in $[a, b]$

• In other words, there is always a maximum and a minimum.
• For example, even a constant function $f(x) = 5$ has a maximum
and a minimum. They just both happen to be 5.

25 / 57
Global Extrema on [a, b]

• In the case where $f$ is differentiable on $[a, b]$, we need to find all
stationary points ($x = c$ where $f'(c) = 0$) and all boundary points
($x = a$ and $x = b$).
• We then evaluate the function at each of these points.
• The points that give the highest and lowest $f(x)$ values are the
maximum and minimum points, respectively.
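
The candidate-comparison procedure can be sketched in a few lines. This is an illustrative sketch only: the helper `global_extrema_on_interval` and the example function $x^3 - 3x$ on $[0, 3]$ are assumptions chosen here, and the stationary points are assumed to have been found by hand.

```python
def global_extrema_on_interval(f, stationary_points, a, b):
    """Compare f at the boundary points a, b and at stationary points inside (a, b)."""
    candidates = [a, b] + [c for c in stationary_points if a < c < b]
    x_max = max(candidates, key=f)
    x_min = min(candidates, key=f)
    return (x_max, f(x_max)), (x_min, f(x_min))


# Hypothetical example: f(x) = x^3 - 3x on [0, 3].
# f'(x) = 3x^2 - 3 = 0 at x = -1 and x = 1; only x = 1 lies inside the interval.
def f(x):
    return x ** 3 - 3 * x


maximum, minimum = global_extrema_on_interval(f, [-1.0, 1.0], 0.0, 3.0)
print("maximum point and value:", maximum)
print("minimum point and value:", minimum)
```

Here the maximum sits at the boundary point $x = 3$ and the minimum at the interior stationary point $x = 1$, so both kinds of candidate matter.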

26 / 57
Global Extrema without Boundaries

• Sometimes, functions will increase or decrease without bound.
• In these cases, the behaviour of the function at the extremes of
its domain must be found.
• This can be done by taking the limits of the function as $x \to -\infty$
and as $x \to \infty$.

27 / 57
General Procedure for Global Extrema

• Find all stationary points of the function.
• Find all points where the derivative of the function does not exist.
• Find all boundary points of the function (there might be no
boundaries).
• Compare the values of the function at each of these points. If
there are no boundaries, or a boundary on only one side of the
domain, also take the limit as $x$ goes to the appropriate $\pm\infty$.
• The highest value gives the maximum, and the lowest value
gives the minimum. If the highest (or lowest) of these is a limit such as
$\infty$ (or $-\infty$) that is not attained at any point, then that
extremum does not exist.

28 / 57
Example

Find the global maximum and minimum points of
$f(x) = \frac{x^4}{4} - \frac{2x^3}{3} + \frac{x^2}{2} - 1$.
Also find the minimum value attained by the function.
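
A sketch of the argument, using the general procedure for global extrema. The factorisation of $f'$ is added here for illustration.

```latex
\begin{align*}
f'(x) &= x^3 - 2x^2 + x = x(x-1)^2, \qquad
         f'(x) \le 0 \text{ for } x \le 0, \quad f'(x) \ge 0 \text{ for } x \ge 0 \\
\lim_{x \to \pm\infty} f(x) &= \infty
         \quad\Rightarrow\quad \text{no global maximum point} \\
f(0) &= -1
         \quad\text{(global minimum value, attained at } x = 0\text{)}
\end{align*}
% f is decreasing on (-infinity, 0] and increasing on [0, infinity),
% so x = 0 is the global minimum point; the other stationary point, x = 1,
% gives f(1) = -11/12 > -1 and is not an extremum.
```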

29 / 57
Example
[Figure: graphs of $f(x)$, $f'(x)$ and $f''(x)$ against $x$]

30 / 57
Optimization: multi-variable case

31 / 57
Multi-variable

• Definition of maxima and minima: If $f(x) \equiv f(x_1, x_2, \ldots, x_n)$ is a
function of $n$ variables with domain $D$, then $x^* = (x_1^*, \ldots, x_n^*) \in D$ is a
global maximum point for $f$ if

$f(x) \le f(x^*)$ for all $x \in D$

If the inequality is strict for every $x \ne x^*$, then $x^*$ is a strict maximum
point.
The opposite relation holds for global (strict) minimum points.
• A stationary point for a function of $n$ variables is a point where all
first-order partial derivatives are equal to 0.

32 / 57
Multi-variable: first-order condition

• Definition: a point $x$ is called an interior point of $D$ if there exists
$\varepsilon > 0$ such that every point at distance less than $\varepsilon$ from $x$ lies in
$D$, i.e., $B_\varepsilon(x) \subset D$, where $B_\varepsilon(x) = \{y \in \mathbb{R}^n : \|y - x\| < \varepsilon\}$.
• First-order condition: If $x^*$ is an interior point of the domain of $f$,
a differentiable function of $n$ variables, and is either a maximum or a minimum
point for $f$, then it must be the case that $x^*$ is a stationary point
for $f$. That is, $x^*$ satisfies the $n$ equations

$f_i'(x^*) = 0$ for $i = 1, \ldots, n$.

• Equivalently, $\nabla f(x^*) = 0$.

33 / 57
Intuition

• Assume $f$ attains its maximum (or minimum) value at an interior
point $x^*$ of $S$.
• Consider the function defined by fixing $x_2, x_3, \ldots, x_n$ at
$x_2^*, x_3^*, \ldots, x_n^*$ but varying $x_1$:

$h(x_1) = f(x_1, x_2^*, \ldots, x_n^*)$

• $h(x_1)$ must have a maximum (or minimum) point at $x_1 = x_1^*$ (why?)
• From our arguments for functions of one variable, we know that
$h'(x_1^*) = 0$, or $f_1'(x^*) = 0$.
• Similarly, $f_k'(x^*) = 0$ for $k = 2, \ldots, n$.

34 / 57
First-Order (Necessary) Conditions: two-variable example

35 / 57
Formal proof: maximum case

• For any vector $a \in \mathbb{R}^n$, define a single-variable function

$g(t) := f(x^* + ta)$

which is defined in an open interval around zero. Since $g$ is
maximized at 0 (why?), $g'(0) = 0$, or $\nabla f(x^*) \cdot a = 0$. Since
this is true for any $a$, it must be the case that $\nabla f(x^*) = 0$ (why?
take $a = \nabla f(x^*)$).
• Question: why interior point?

36 / 57
Example

Find a stationary point for

$f(x, y) = -2x^2 - 2xy - 2y^2 + 36x + 36y - 158$.
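
The two first-order conditions for this example form a linear system, which can be solved by Cramer's rule. The sketch below assumes the partial derivatives have been computed by hand; the helper `solve_2x2` is a hypothetical name introduced for illustration.

```python
def solve_2x2(a11, a12, a21, a22, b1, b2):
    """Solve a11*x + a12*y = b1 and a21*x + a22*y = b2 by Cramer's rule."""
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("system has no unique solution")
    return (b1 * a22 - a12 * b2) / det, (a11 * b2 - a21 * b1) / det


def f(x, y):
    return -2 * x ** 2 - 2 * x * y - 2 * y ** 2 + 36 * x + 36 * y - 158


# First-order conditions, derived by hand:
#   f_1'(x, y) = -4x - 2y + 36 = 0   ->   4x + 2y = 36
#   f_2'(x, y) = -2x - 4y + 36 = 0   ->   2x + 4y = 36
x_star, y_star = solve_2x2(4, 2, 2, 4, 36, 36)
print("stationary point:", (x_star, y_star))
print("value of f there:", f(x_star, y_star))
```

Solving the system gives the single stationary point $(6, 6)$, where $f$ takes the value 58.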

37 / 57
Multi-variable: second-order condition

• Suppose $f(x)$ is defined in a convex set $S$ in $\mathbb{R}^n$ and let $c$ be an
interior point of $S$.
[1] If $f$ is concave in $S$, then $c$ is a (global) maximum point for $f$
in $S$ if and only if $c$ is a stationary point of $f$.
[2] If $f$ is convex in $S$, then $c$ is a (global) minimum point for $f$ in
$S$ if and only if $c$ is a stationary point of $f$.
• Intuition: when $f$ is concave, the tangent plane at $c$ always lies
on or above the graph of $f$:

$f(x) \le f(c) + \nabla f(c) \cdot (x - c)$ for all $x \in S$

When $c$ is a stationary point, $\nabla f(c) = 0$, so $f(x) \le f(c)$.

38 / 57
Example

Find a maximum point for

$f(x, y) = -2x^2 - 2xy - 2y^2 + 36x + 36y - 158$.

Recall that the Hessian matrix

$D^2 f(x, y) = -\begin{pmatrix} 4 & 2 \\ 2 & 4 \end{pmatrix}$

is negative definite at every $(x, y)$, so $f$ is (strictly) concave.
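
A sketch of how the pieces fit together for this example. The leading-principal-minor check of negative definiteness is one standard route and is an assumption about the intended method.

```latex
\begin{align*}
f_1'(x, y) &= -4x - 2y + 36 = 0, \qquad f_2'(x, y) = -2x - 4y + 36 = 0
  \quad\Rightarrow\quad (x, y) = (6, 6) \\
D^2 f &= \begin{pmatrix} -4 & -2 \\ -2 & -4 \end{pmatrix}: \qquad
  -4 < 0, \qquad (-4)(-4) - (-2)^2 = 12 > 0 \\
f(6, 6) &= -72 - 72 - 72 + 216 + 216 - 158 = 58
\end{align*}
% The Hessian is negative definite everywhere, so f is strictly concave and
% the stationary point (6, 6) is the global maximum point, with maximum value 58.
```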

39 / 57
Local Extreme Points

• As in the case with 1 variable, we also want to examine local
extreme points, not just global extreme points.
• $x^*$ is a local maximum point (or local minimum point) of $f$ if
$f(x) \le f(x^*)$ (or $f(x) \ge f(x^*)$) for all $x$ that are "close" to $x^*$.
• If the inequalities above are strict for $x \ne x^*$, then $x^*$ is a strict
local maximum point or a strict local minimum point.
• As before, a global extreme point must be a local extreme point,
but not necessarily vice versa.

40 / 57
FOCs and Saddle Points

• The first-order conditions for local extrema are similar to those for
global extrema:
• At a local extreme point in the interior of the domain of a
differentiable function, all first-order partial derivatives are 0.
• As usual, a stationary point does not have to be a local extreme
point.
• You can also have saddle points. A stationary point $x^*$ is a saddle point when there
are simultaneously points arbitrarily close to $x^*$ that give a larger
value of the function and other points that give a smaller value of
the function.

41 / 57
Saddle Points: two-variable case
$z(x, y) = x^2 - y^2$ at $(x, y) = (0, 0)$

42 / 57
Second-Derivative Test: sufficient conditions

• Suppose $f(x)$ is defined in a set $S$ in $\mathbb{R}^n$ and let $x^*$ be an interior
stationary point. Then
(a) If $D^2 f(x^*)$ is positive definite, then $x^*$ is a local minimum point.
(b) If $D^2 f(x^*)$ is negative definite, then $x^*$ is a local maximum
point.
(c) If $D^2 f(x^*)$ is indefinite, then $x^*$ is a saddle point.

NOTE: in the positive (negative) semidefinite case, further
conditions are needed.

43 / 57
Second-Derivative Test: necessary conditions

• Suppose $f(x)$ is defined in a set $S$ in $\mathbb{R}^n$ and let $x^*$ be an interior
point. Then
(a) $x^*$ is a local minimum point $\Longrightarrow$ $D^2 f(x^*)$ is positive
semidefinite.
(b) $x^*$ is a local maximum point $\Longrightarrow$ $D^2 f(x^*)$ is negative
semidefinite.

44 / 57
Intuitive argument

• For any nonzero vector $a \in \mathbb{R}^n$, define a single-variable function

$g(t) := f(x^* + ta)$

• Then we have

$g'(t) = \nabla f(x^* + ta) \cdot a$, and $g''(t) = a' D^2 f(x^* + ta) a$

• If $x^*$ is a local minimum point, then 0 is a local minimum point of
$g$, so $g''(0) \ge 0$, or $a' D^2 f(x^*) a \ge 0$ for all $a \ne 0$. Hence $D^2 f(x^*)$ is
positive semidefinite.
• On the other hand, if $x^*$ is a stationary point and $D^2 f(x^*)$ is
positive definite, then $g'(0) = 0$ and $g''(0) = a' D^2 f(x^*) a > 0$, so 0
is a local minimum of $g$. Since this holds for every nonzero vector
$a$, $x^*$ is a local minimum point of $f$.

45 / 57
Second-Derivative Test: two variable case
Suppose $f(x, y)$ has continuous second-order partial derivatives in a
domain $S$, and let $(x_0, y_0)$ be an interior stationary point of $f$. Also, let

$A = f_{11}''(x_0, y_0)$, $B = f_{12}''(x_0, y_0)$, and $C = f_{22}''(x_0, y_0)$

Then
1. If $A < 0$ and $AC - B^2 > 0$, then $(x_0, y_0)$ is a (strict) local
maximum point.
2. If $A > 0$ and $AC - B^2 > 0$, then $(x_0, y_0)$ is a (strict) local
minimum point.
3. If $AC - B^2 < 0$, then $(x_0, y_0)$ is a saddle point.
4. If $AC - B^2 = 0$, then $(x_0, y_0)$ could be a local maximum, a local
minimum, or a saddle point.
Recall the Hessian matrix at $(x_0, y_0)$:

$D^2 f(x_0, y_0) = \begin{pmatrix} A & B \\ B & C \end{pmatrix}$
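
The four cases translate directly into a small decision function. This is a minimal sketch: the name `classify_stationary_point` is hypothetical, and the second-order partials $A$, $B$, $C$ are assumed to have been evaluated by hand at the stationary point.

```python
def classify_stationary_point(A, B, C):
    """Apply the two-variable second-derivative test to A = f11'', B = f12'', C = f22''."""
    det = A * C - B ** 2
    if det > 0 and A < 0:
        return "strict local maximum"
    if det > 0 and A > 0:
        return "strict local minimum"
    if det < 0:
        return "saddle point"
    # det == 0: the test cannot decide.
    return "inconclusive"


# f(x, y) = -2x^2 - 2xy - 2y^2 + 36x + 36y - 158: A = -4, B = -2, C = -4 everywhere.
print(classify_stationary_point(-4, -2, -4))
# z(x, y) = x^2 - y^2 at (0, 0): A = 2, B = 0, C = -2.
print(classify_stationary_point(2, 0, -2))
# A hypothetical case with all second-order partials equal to 0 at the point.
print(classify_stationary_point(0, 0, 0))
```

The first two calls reuse functions that appear earlier in these notes; the third shows the case in which the test gives no answer.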

46 / 57
Example

For each of the following functions, $(x, y) = (0, 0)$ is a stationary point.
Determine whether it is a local max, a local min, or a saddle point.
[1] $f(x, y) = x^2 + y^2$
[2] $g(x, y) = x^2 - y^2$
[3] $h(x, y) = -(x^2 + y^2)$
[4] $p(x, y) = x^2 + y^4$
[5] $q(x, y) = x^2 - y^4$
[6] $s(x, y) = x^2 y^2$
[7] $t(x, y) = -x^2 y^2$
[8] $r(x, y) = x^3 + y^4$
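
One possible set of answers is sketched below. For [4] to [8], $AC - B^2 = 0$ at the origin, so the second-derivative test is inconclusive and the conclusion comes from comparing function values near $(0, 0)$ directly. The table layout and the wording of the reasons are additions made here.

```latex
\begin{tabular}{lll}
Function & $AC - B^2$ at $(0,0)$ & Conclusion \\
\hline
{[1]} $f = x^2 + y^2$    & $4 > 0$, $A = 2 > 0$  & strict local minimum \\
{[2]} $g = x^2 - y^2$    & $-4 < 0$              & saddle point \\
{[3]} $h = -(x^2 + y^2)$ & $4 > 0$, $A = -2 < 0$ & strict local maximum \\
{[4]} $p = x^2 + y^4$    & $0$ & local minimum, since $p(x, y) > 0 = p(0, 0)$ for $(x, y) \ne (0, 0)$ \\
{[5]} $q = x^2 - y^4$    & $0$ & saddle point, since $q(x, 0) > 0$ and $q(0, y) < 0$ near the origin \\
{[6]} $s = x^2 y^2$      & $0$ & local minimum (not strict), since $s \ge 0 = s(0, 0)$ \\
{[7]} $t = -x^2 y^2$     & $0$ & local maximum (not strict), since $t \le 0 = t(0, 0)$ \\
{[8]} $r = x^3 + y^4$    & $0$ & saddle point, since $r(x, 0) = x^3$ changes sign at $x = 0$
\end{tabular}
```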

47 / 57
Increasing Transformations

Optimizing a function is equivalent to optimizing a strictly increasing
transformation of that function.
• Assume that we are trying to find the maximum points of $f(x, y)$
over a set $S$. This is equivalent to finding the maximum points of
the following:
  • $a f(x, y) + b$ (with $a > 0$)
  • $e^{f(x, y)}$
  • $\ln f(x, y)$, assuming $f(x, y) > 0$
• The maximum points will be the same in each of these cases,
but the maximum values will generally be different.

48 / 57
Increasing Transformations

More generally, take $f(x_1, x_2, \ldots, x_n)$, which is defined over a set $S$,
and a function $F$ which is defined over the range of $f$. Also take
$c = (c_1, \ldots, c_n)$, which is a point in $S$. Define $g$ over $S$ by

$g(x_1, \ldots, x_n) = F(f(x_1, \ldots, x_n))$

Then:
1. If $F$ is increasing and $c$ maximizes (minimizes) $f$ over $S$, then $c$
also maximizes (minimizes) $g$ over $S$.
($F$ increasing and $c$ optimizes $f$ $\Rightarrow$ $c$ optimizes $g$)
2. If $F$ is strictly increasing, then $c$ maximizes (minimizes) $f$ over $S$
if and only if $c$ maximizes (minimizes) $g$ over $S$.
($F$ strictly increasing $\Rightarrow$ ($c$ optimizes $f$ $\Leftrightarrow$ $c$ optimizes $g$))

49 / 57
Example

The following function has a unique maximum point. What is it?

$f(x, y, z) = \exp(2x - x^2 + 10y - y^2 + 3 - z^2)$
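
Because $\exp$ is strictly increasing, it is enough to maximize the exponent. A sketch of the calculation, where the name $u$ for the exponent is introduced here for illustration:

```latex
\begin{align*}
u(x, y, z) &= 2x - x^2 + 10y - y^2 + 3 - z^2 \\
u_1' &= 2 - 2x = 0, \quad u_2' = 10 - 2y = 0, \quad u_3' = -2z = 0
  \quad\Rightarrow\quad (x, y, z) = (1, 5, 0) \\
D^2 u &= \operatorname{diag}(-2, -2, -2)
  \quad\text{(negative definite, so $u$ is strictly concave)} \\
f(1, 5, 0) &= \exp\bigl(u(1, 5, 0)\bigr) = \exp(1 + 25 + 3) = e^{29}
\end{align*}
```

Since $u$ is strictly concave, its stationary point $(1, 5, 0)$ is its unique maximum point, and therefore also the unique maximum point of $f$.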

50 / 57
Economic Applications

(Discriminating Monopolist) Consider a firm that costlessly sells a
product in two isolated geographical areas. It can charge different
prices in the two areas. The inverse demand curves for the two
markets are given by

$P_1 = a_1 - b_1 Q_1$, $P_2 = a_2 - b_2 Q_2$

• What are the profit-maximizing quantity and price in each market?
• What is the firm's maximum profit?
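
A sketch of the solution, assuming $a_i > 0$ and $b_i > 0$ (an assumption not stated on the slide) so that the profit function is strictly concave:

```latex
\begin{align*}
\pi(Q_1, Q_2) &= (a_1 - b_1 Q_1) Q_1 + (a_2 - b_2 Q_2) Q_2 \\
\frac{\partial \pi}{\partial Q_i} &= a_i - 2 b_i Q_i = 0
  \quad\Rightarrow\quad Q_i^* = \frac{a_i}{2 b_i}, \qquad
  P_i^* = a_i - b_i Q_i^* = \frac{a_i}{2}, \qquad i = 1, 2 \\
\pi^* &= P_1^* Q_1^* + P_2^* Q_2^* = \frac{a_1^2}{4 b_1} + \frac{a_2^2}{4 b_2}
\end{align*}
% The Hessian of pi is diag(-2 b_1, -2 b_2), which is negative definite when
% b_1, b_2 > 0, so the stationary point is the global maximum point.
```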

51 / 57
Economic Applications

(Collusion) Consider two firms that sell an identical product in the
same market. The inverse demand curve for the product depends on
the total amount of the product produced by the two firms:

$P = 80 - Q_1 - Q_2$

It costs firm 1 $C_1(Q_1) = Q_1^2$ to produce $Q_1$ units of the product, and it costs
firm 2 $C_2(Q_2) = Q_2^2/2$ to produce $Q_2$ units.
• Instead of competing with each other, the firms decide to join
forces to maximize their total profits. How much of the product
should each firm produce?
• What is the maximum total profit of the two firms?
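
The joint-profit problem can be checked numerically. The sketch below solves the two first-order conditions (derived by hand) with Cramer's rule and then compares the profit with nearby quantity pairs; the helper names and the perturbation size 0.1 are assumptions made for illustration.

```python
def total_profit(q1, q2):
    """Joint profit: revenue at price 80 - q1 - q2 minus both firms' costs."""
    price = 80 - q1 - q2
    return price * (q1 + q2) - q1 ** 2 - q2 ** 2 / 2


def solve_2x2(a11, a12, a21, a22, b1, b2):
    """Solve a11*x + a12*y = b1 and a21*x + a22*y = b2 by Cramer's rule."""
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("system has no unique solution")
    return (b1 * a22 - a12 * b2) / det, (a11 * b2 - a21 * b1) / det


# First-order conditions, derived by hand:
#   d(profit)/dQ1 = 80 - 4*Q1 - 2*Q2 = 0   ->   4*Q1 + 2*Q2 = 80
#   d(profit)/dQ2 = 80 - 2*Q1 - 3*Q2 = 0   ->   2*Q1 + 3*Q2 = 80
q1, q2 = solve_2x2(4, 2, 2, 3, 80, 80)
best = total_profit(q1, q2)
print("quantities:", (q1, q2), "price:", 80 - q1 - q2, "total profit:", best)

# Sanity check: small changes in either quantity should lower the joint profit.
steps = [(-0.1, 0.0), (0.1, 0.0), (0.0, -0.1), (0.0, 0.1)]
nearby_is_lower = all(total_profit(q1 + d1, q2 + d2) < best for d1, d2 in steps)
print("nearby points give lower profit:", nearby_is_lower)
```

The Hessian of the joint profit has $A = -4$, $B = -2$, $C = -3$, so $A < 0$ and $AC - B^2 = 8 > 0$. The profit function is therefore strictly concave, and the stationary point is the global maximum point.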

52 / 57
Comparative Statics

• Assume that you have found an optimal point for a function, but the
value of the function also depends on the value of a
parameter that is set exogenously (i.e. not determined by the
model).
• For example, in the discriminating monopolist example, the profit
function has 4 parameters, $a_1$, $a_2$, $b_1$, and $b_2$, and the equation for
the maximum profit depends on these 4 parameters:

$\pi^*(a_1, a_2, b_1, b_2) = \frac{a_1^2}{4b_1} + \frac{a_2^2}{4b_2}$

• In this example, $\pi^*$ is called the value function, and is the value
of the objective function when the variables of interest have been
optimized (in this case $Q_1$, $Q_2$, $P_1$ and $P_2$).

53 / 57
Comparative Statics

• More generally, if we have $f(x, r)$ where $x$ is a variable and $r$ is a
parameter, $f^*(r) = f(x^*(r), r)$ is the value function, where $x^*(r)$
is the value of $x$ that maximizes $f$. Generally, $x^*$ is a function of $r$.
• What happens to the value function when one of the parameters
changes?
• We can use the envelope theorem:

$\frac{df^*(r)}{dr} = \frac{\partial f(x^*(r), r)}{\partial r}$

• This is the envelope theorem in one dimension. The general
version will be mentioned later.

54 / 57
Envelope Theorem
• Using the chain rule on $f^*(r)$,

$\frac{df^*(r)}{dr} = f_1'(x^*(r), r) \frac{dx^*(r)}{dr} + f_2'(x^*(r), r)$

• $f^*(r)$ changes when $r$ changes because (a) the maximizing
value of the variable changes, and (b) the function depends
directly on $r$.
• Since $x^*(r)$ maximizes $f(x, r)$, we know that $f_1'(x^*(r), r) = 0$
(from the first-order condition) as long as we're at an interior point.
• This gives us the envelope theorem:

$\frac{df^*(r)}{dr} = f_2'(x^*(r), r) = \frac{\partial f(x^*(r), r)}{\partial r}$

• This means that we can ignore the indirect effect of the
parameter on the value function (through changing the optimizing
point), and focus only on the direct change to the function.
55 / 57
Example

Take the example of the discriminating monopolist. Find $\frac{\partial \pi^*}{\partial a_1}$ by
(a) using the envelope theorem, and
(b) directly computing the derivative from $\pi^*(a_1, a_2, b_1, b_2)$.
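
Both routes can be compared numerically. The sketch below uses central finite differences. The parameter values, the step `h` and the function names are hypothetical choices made for illustration, and the optimal quantities $Q_i^* = a_i / (2 b_i)$ are taken from the first-order conditions of the monopolist's problem.

```python
def profit(q1, q2, a1, a2, b1, b2):
    """Monopolist's profit with zero costs and inverse demands P_i = a_i - b_i * Q_i."""
    return (a1 - b1 * q1) * q1 + (a2 - b2 * q2) * q2


def max_profit(a1, a2, b1, b2):
    """Value function: profit evaluated at the optimal quantities Q_i* = a_i / (2 b_i)."""
    return profit(a1 / (2 * b1), a2 / (2 * b2), a1, a2, b1, b2)


a1, a2, b1, b2 = 10.0, 8.0, 2.0, 1.0  # hypothetical parameter values
h = 1e-6                              # finite-difference step
q1_star, q2_star = a1 / (2 * b1), a2 / (2 * b2)

# (a) Envelope theorem: differentiate profit in a1 with quantities held at the optimum.
envelope = (profit(q1_star, q2_star, a1 + h, a2, b1, b2)
            - profit(q1_star, q2_star, a1 - h, a2, b1, b2)) / (2 * h)

# (b) Direct route: differentiate the value function, letting quantities re-optimize.
direct = (max_profit(a1 + h, a2, b1, b2) - max_profit(a1 - h, a2, b1, b2)) / (2 * h)

print("envelope:", envelope, "direct:", direct, "a1 / (2 b1):", a1 / (2 * b1))
```

Both numbers should agree with the analytical answer $\partial \pi^* / \partial a_1 = a_1 / (2 b_1) = Q_1^*$ up to finite-difference error.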

56 / 57
Envelope Theorem

57 / 57
