
Optimization: Existence

Reading: Sundaram
Ch. 3; Ch. 4.1-4.5 (4.6 optional);
Ch. 5.1-5.5 (5.2.3, 5.6 optional); Ch. 6.1-6.4 (6.1.2, 6.1.3 optional)
Weierstrass’ Theorem
Let S ⊆ R^N be a compact set and let f : R^N → R be a continuous function. Then f attains a maximum and a minimum on S. That is, there exist m1, m2 ∈ S such that f(m2) ≤ f(x) ≤ f(m1) for all x ∈ S. These are also called the global maximum and the global minimum.
Proof: First we show that f(S) is a compact set. Then we show that every compact set in R has a maximum and a minimum.
Step 1: Take any sequence {y_n} in f(S). y_n ∈ f(S) implies there is x_n ∈ S such that y_n = f(x_n). Consider the sequence {x_n} in S. Since S is compact, {x_n} has a subsequence {z_n} which converges to a point z ∈ S. Since f is continuous, {f(z_n)} converges to f(z). Of course {f(z_n)} is a subsequence of {y_n}, and f(z) ∈ f(S) (because z ∈ S). Thus every sequence in f(S) has a subsequence converging to a point of f(S), so f(S) is compact.
Step 2: f(S) compact in R implies f(S) is closed and bounded. Since f(S) is bounded, it has a supremum (say M1) and an infimum (say M2). Since M1 (similarly M2) satisfies the ε-test, it is a limit point of f(S) or belongs to f(S). Since f(S) is closed, M1 must be in f(S). Thus there exists m1 ∈ S such that f(m1) = M1 ≥ f(x) for all x ∈ S.

Examples
1. f : R → R, f(x) = x. There is no maximum. Here f is continuous but the domain is not compact, because it is not bounded.
2. f : (0, 1) → R, f(x) = x. There is no maximum. f is continuous but once again the domain is not compact, because it is not closed.
3. f : [0, 1] → R, f(x) = x for all x ≠ 1 and f(1) = 0. There is no maximum. Here the domain is compact but f is not continuous.
Weierstrass' Theorem gives a 'sufficient condition' for the existence of a maximum and a minimum. These properties are not necessary for an optimum to exist.
4. Take f : [0, 1] → R, f(x) = x for all x ≠ 1/2 and f(1/2) = 0. This has a maximum and a minimum.
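Examples 2 and 3 can be illustrated numerically. The following Python sketch is my own (the helper `supremum_on_grid` is hypothetical, not from the notes): it samples each domain on a grid and shows that the supremum 1 is attained only in the compact-and-continuous case.

```python
# Illustration of the Weierstrass examples (my own sketch, not from the notes).
# On the compact domain [0, 1] with continuous f(x) = x the maximum is attained;
# on the open domain (0, 1), or with a discontinuous f, the supremum is not.

def supremum_on_grid(f, points):
    """Return the largest value of f over a finite sample of the domain."""
    return max(f(x) for x in points)

f = lambda x: x

# Compact domain [0, 1]: the maximum f(1) = 1 is attained.
closed_grid = [i / 1000 for i in range(1001)]      # includes both endpoints
assert supremum_on_grid(f, closed_grid) == 1.0

# Example 2: open domain (0, 1): sample points never reach 1,
# mirroring the fact that sup f = 1 is not attained.
open_grid = [i / 1000 for i in range(1, 1000)]     # excludes 0 and 1
assert supremum_on_grid(f, open_grid) < 1.0

# Example 3: compact domain but discontinuous f: f(x) = x for x != 1, f(1) = 0.
g = lambda x: 0.0 if x == 1.0 else x
assert supremum_on_grid(g, closed_grid) < 1.0      # sup = 1 is not attained
```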
Finding Interior Optima
Interior Optima
Let f : S → R, where S ⊆ R^N. We want to maximize (or minimize) f over S. We shall focus on maxima; note that a minimum of f is the same as a maximum of −f.
Instead of finding the global maximum directly, we start by finding interior local maxima and use them to find the global maximum.
x* is called an interior local maximum if
(i) x* is an interior point of S;
(ii) there exists ε > 0 such that f(x*) ≥ f(x) for all x ∈ B_ε(x*) ⊆ S.
Note: At this moment we are excluding points that are not interior points of S.
Note: It is a strict local maximum if the above inequality is strict for x ≠ x*.
First order necessary condition:
x* is an interior local maximum (or minimum) and f is differentiable at x* ⇒ Df(x*) = 0.
Proof: Take N = 1. Take two sequences {y_n} and {z_n} such that lim y_n = lim z_n = x*, with y_n < x* and z_n > x* for all n. Since x* is a local maximum, for all large n,

(f(y_n) − f(x*)) / (y_n − x*) ≥ 0 ≥ (f(z_n) − f(x*)) / (z_n − x*)

Taking limits, we get f′(x*) ≥ 0 ≥ f′(x*). Thus f′(x*) = 0.
When N > 1, apply the same argument coordinate by coordinate: ∂f/∂x_j (x*) = 0 for all j = 1, 2, . . . , N.

Observation: This is not a sufficient condition. Example: S = [−1, 1], f(x) = x³. There is a critical point x = 0, but it is not an optimum.
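The observation can be checked directly; the following sketch (mine, with a hypothetical `numerical_derivative` helper) confirms that x = 0 is a critical point of f(x) = x³ yet neither a maximum nor a minimum.

```python
# Sketch: the first order condition Df(x*) = 0 is necessary but not sufficient.
# For f(x) = x^3, x = 0 is a critical point yet not an optimum, since f takes
# both signs in every neighbourhood of 0.

def numerical_derivative(f, x, h=1e-6):
    """Central difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

f = lambda x: x ** 3

assert abs(numerical_derivative(f, 0.0)) < 1e-9   # f'(0) = 0: critical point
assert f(0.1) > f(0.0) > f(-0.1)                  # but 0 is not an optimum
```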
Digression: Useful Results
Rolle's Theorem: Let [a, b] ⊂ R and let f : [a, b] → R be continuous and have a derivative at each point of (a, b). If f(a) = f(b) then there is c ∈ (a, b) such that f′(c) = 0.
Proof: [a, b] is a compact set and f is a continuous function, so f has a global maximum and minimum on [a, b]. If the maximum or the minimum is attained at an interior point, we have obtained a c ∈ (a, b) such that f′(c) = 0. Otherwise both the maximum and the minimum are attained at a and b; since f(a) = f(b), this implies that f is constant on [a, b]. Hence for all c ∈ (a, b), f′(c) = 0.
Mean Value Theorem: Let [a, b] ⊂ R and let f : [a, b] → R be continuous and have a derivative at each point of (a, b). Then there is c ∈ (a, b) such that

(f(b) − f(a)) / (b − a) = f′(c)

Proof: Define h(x) = f(x)(b − a) − x[f(b) − f(a)]. Note that h is continuous on [a, b] and has a derivative at each point of (a, b). Moreover h(a) = h(b), since both equal f(a)b − f(b)a. By Rolle's Theorem, there is c ∈ (a, b) such that h′(c) = f′(c)(b − a) − [f(b) − f(a)] = 0.
Result: [a, b] ⊂ R. f : [a, b] → R is a continuous function and has derivative at each
point of (a, b).
(i) If f 0 (x) > 0 for all x ∈ (a, b) then f is strictly increasing on [a, b].
(ii) If f 0 (x) < 0 for all x ∈ (a, b) then f is strictly decreasing on [a, b].
(iii) If f 0 (x) = 0 for all x ∈ (a, b) then f is constant on [a, b].
Proof: We just prove part (i); the rest are similar. Take any two points x1 < x2 in the interval [a, b]. By the Mean Value Theorem, there exists c ∈ (x1, x2) such that f(x2) − f(x1) = (x2 − x1) f′(c). The right hand side is strictly positive because x2 > x1 and f′(c) > 0. Hence f(x2) > f(x1).
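A quick numerical companion (my own example): for f(x) = x² on [1, 3] the Mean Value Theorem point is the midpoint c = 2, and part (i) of the Result shows up as strict monotonicity.

```python
# Numerical check of the Mean Value Theorem for f(x) = x^2 on [1, 3]
# (an illustration of the results above, not part of the notes).

f = lambda x: x ** 2
df = lambda x: 2 * x

a, b = 1.0, 3.0
secant_slope = (f(b) - f(a)) / (b - a)   # (9 - 1) / 2 = 4
c = (a + b) / 2                          # for a quadratic, c is the midpoint
assert abs(df(c) - secant_slope) < 1e-12

# Part (i) of the Result: f'(x) = 2x > 0 on (1, 3), so f is strictly increasing.
xs = [1.0 + 0.1 * k for k in range(21)]
assert all(f(x1) < f(x2) for x1, x2 in zip(xs, xs[1:]))
```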
Second Order Conditions
Second order conditions: f ∈ C²
1. (necessary) x* is an interior local maximum (resp. minimum) ⇒ z^T [D²f(x*)] z ≤ 0 (resp. z^T [D²f(x*)] z ≥ 0) for all z ∈ R^N with z ≠ 0.
2. (sufficient) x* is a critical point and z^T [D²f(x*)] z < 0 (resp. z^T [D²f(x*)] z > 0) for all z ∈ R^N with z ≠ 0 ⇒ x* is a strict local maximum (resp. strict local minimum).


Proof (N = 1): First we show part 2. Take an interior point x* such that f′(x*) = 0 and f″(x*) < 0 (the Hessian is a scalar here). Since f ∈ C², f″ is a continuous function. Therefore we can find a neighbourhood B_r(x*) around x* such that f″(x) < 0 for all x ∈ B_r(x*). Hence f′ is strictly decreasing in B_r(x*). Since f′(x*) = 0, we must have
(i) f′(z) > 0 for z < x*, z ∈ B_r(x*); (ii) f′(z) < 0 for z > x*, z ∈ B_r(x*).
(i) ⇒ f is strictly increasing for z < x* in B_r(x*) ⇒ f(x*) > f(z) for all z < x*, z ∈ B_r(x*).
(ii) ⇒ f is strictly decreasing for z > x* in B_r(x*) ⇒ f(x*) > f(z) for all z > x*, z ∈ B_r(x*).
Combining the two cases, we obtain f(x*) > f(z) for all z ∈ B_r(x*) with z ≠ x*. Hence x* is a strict local maximum. Part 1 follows immediately from part 2.
(N > 1): We shall only prove part 1. Pick any z ∈ R^N, z ≠ 0. Define g(t) = f(x* + tz). Note that g is the composition of two functions: f (which is from R^N to R) and φ, defined as φ(t) = x* + tz. Note further that g(0) = f(x*) ≥ g(t) for all t such that x* + tz belongs to the neighbourhood on which x* is an interior local maximum. Since g is a function from R to R with a local maximum at t = 0, we conclude that g″(0) ≤ 0. Applying the chain rule twice, we get g″(0) = z^T [D²f(x*)] z. Thus z^T [D²f(x*)] z ≤ 0.
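In practice the sign condition z^T [D²f(x*)] z < 0 is checked via the eigenvalues of the Hessian. A small sketch (my own; the finite-difference `hessian` helper is hypothetical) classifies a critical point this way:

```python
# Sketch of the second order test: classify a critical point by the sign of
# z^T D^2 f z, checked via Hessian eigenvalues (my own illustration).
import numpy as np

def hessian(f, x, h=1e-4):
    """Finite-difference Hessian of f: R^N -> R at point x."""
    n = len(x)
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            e_i, e_j = np.zeros(n), np.zeros(n)
            e_i[i], e_j[j] = h, h
            H[i, j] = (f(x + e_i + e_j) - f(x + e_i - e_j)
                       - f(x - e_i + e_j) + f(x - e_i - e_j)) / (4 * h * h)
    return H

# f(x1, x2) = -x1^2 - 2*x2^2 has a strict local (indeed global) maximum at (0, 0).
f = lambda v: -v[0] ** 2 - 2 * v[1] ** 2
eigs = np.linalg.eigvalsh(hessian(f, np.array([0.0, 0.0])))
assert np.all(eigs < 0)   # negative definite Hessian => strict local maximum
```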
Pathway
Caution

Example (Caution 1): Be aware that these are only necessary conditions.
S = R², f(x1, x2) = x1³ − 3 x1 x2². The only critical point is x* = (0, 0), with f(0, 0) = 0 and D²f(x*) equal to the 2×2 zero matrix, which is both PSD and NSD. However x* is neither a local min nor a local max: for ε > 0, f(ε, 0) = ε³ > 0 and f(−ε, 0) = −ε³ < 0.
Example (Caution 2): A local maximum need not be a global maximum. The global maximum and minimum may not exist.
S = R, f(x) = 2x³ − 3x². Critical points are x* = 0, 1. The second order conditions imply that 0 is a strict local maximum and 1 is a strict local minimum.
Our algorithm will say that the global maximum is attained at 0 and the global minimum at 1.
However, there is no global maximum or minimum: f strictly increases for x > 1 and for x < 0, so f is unbounded in both directions. Of course a local maximum is attained at 0 and a local minimum at 1.
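Caution 2 can be verified directly (a numerical companion of my own, not from the notes):

```python
# f(x) = 2x^3 - 3x^2 has a strict local maximum at 0 and a strict local
# minimum at 1, but no global optimum on R.

f = lambda x: 2 * x ** 3 - 3 * x ** 2
df = lambda x: 6 * x ** 2 - 6 * x          # critical points: x = 0 and x = 1
d2f = lambda x: 12 * x - 6

assert df(0) == 0 and d2f(0) < 0           # second order: strict local max at 0
assert df(1) == 0 and d2f(1) > 0           # strict local min at 1

# Yet f is unbounded in both directions, so no global max or min exists.
assert f(100) > f(0) and f(-100) < f(1)
```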
Optimization with Equality Constraints
Suppose that the objective function is f : R^N → R and there are K (< N) constraint functions g^1, g^2, . . . , g^K : R^N → R. We want to maximize (or minimize) f(x) over all x ∈ R^N such that g^1(x) = 0, g^2(x) = 0, . . . , g^K(x) = 0.
Once again, we start from local maxima x* that satisfy the K constraints. That is,
(i) g^1(x*) = 0, g^2(x*) = 0, . . . , g^K(x*) = 0;
(ii) there exists an open set U containing x* such that f(x*) ≥ f(x) for all x ∈ U with g^1(x) = 0, g^2(x) = 0, . . . , g^K(x) = 0.
Note: K < N makes the constraint set non-trivial.
First order necessary condition (Theorem of Lagrange):
Suppose that f, g^1, . . . , g^K are C^1 functions.
x* is a local maximum (or minimum) of f ⇒ there exist λ*_1, λ*_2, . . . , λ*_K ∈ R such that

Df(x*) + Σ_{j=1}^{K} λ*_j Dg^j(x*) = 0

However, this holds only when rank(Dg(x*)) = K. This condition is called the constraint qualification (CQ). Here Dg is the K × N Jacobian matrix whose j-th row is Dg^j = (∂g^j/∂x_1, ∂g^j/∂x_2, . . . , ∂g^j/∂x_N).
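As a concrete instance of the Theorem of Lagrange (my own hand-solved example, not from the notes): maximize f(x1, x2) = x1 x2 subject to g(x1, x2) = x1 + x2 − 2 = 0. The solution is x* = (1, 1) with λ* = −1, and CQ holds since Dg = (1, 1) has rank 1 everywhere.

```python
# Verify the Lagrange first order condition Df(x*) + lambda* Dg(x*) = 0
# for max x1*x2 s.t. x1 + x2 - 2 = 0 (my own worked example).

f_grad = lambda x1, x2: (x2, x1)      # Df
g_grad = lambda x1, x2: (1.0, 1.0)    # Dg (rank 1 everywhere: CQ holds)

x_star = (1.0, 1.0)
lam_star = -1.0

# Check the first order condition coordinate by coordinate.
fx = f_grad(*x_star)
gx = g_grad(*x_star)
assert all(abs(fx[i] + lam_star * gx[i]) < 1e-12 for i in range(2))

# Check feasibility of x*.
assert abs(x_star[0] + x_star[1] - 2.0) < 1e-12
```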
Second Order Condition

Second order conditions: f, g^1, . . . , g^K ∈ C²
1-a) (necessary) x* is a local maximum ⇒ z^T [D²f(x*) + Σ_{j=1}^{K} λ*_j D²g^j(x*)] z ≤ 0 for all z ∈ R^N such that z ≠ 0 and Dg(x*) z = 0.
1-b) (necessary) x* is a local minimum ⇒ z^T [D²f(x*) + Σ_{j=1}^{K} λ*_j D²g^j(x*)] z ≥ 0 for all z ∈ R^N such that z ≠ 0 and Dg(x*) z = 0.
2-a) (sufficient) x* is a critical point and z^T [D²f(x*) + Σ_{j=1}^{K} λ*_j D²g^j(x*)] z < 0 for all z ∈ R^N such that z ≠ 0 and Dg(x*) z = 0 ⇒ x* is a strict local maximum.
2-b) (sufficient) x* is a critical point and z^T [D²f(x*) + Σ_{j=1}^{K} λ*_j D²g^j(x*)] z > 0 for all z ∈ R^N such that z ≠ 0 and Dg(x*) z = 0 ⇒ x* is a strict local minimum.
Pathway and Caution
Finding the global maximum:
Step 1: Find all candidate points using the first order condition and the constraints.
Step 2: Compare f(x*) at all points obtained in Step 1.
Note 1: There is no need for the second order condition in this procedure.
Note 2: Usually the constraint qualification holds at all points in the domain. If it does not, points where CQ fails must be examined separately, since the first order condition need not hold there.
Caution 1: Be aware that these are only necessary conditions.
Suppose f(x1, x2) = x1³ + x2³ and g^1(x1, x2) = x1 − x2. Note that CQ is met at all points.
First order conditions:

3x1² + λ1 = 0   (1)
3x2² − λ1 = 0   (2)
x1 − x2 = 0     (3)

The only solution is x1 = x2 = λ1 = 0. Even the second order conditions cannot rule out this point as a local maximum or minimum. However, (0, 0) is neither a local maximum nor a local minimum: for ε > 0, f(ε, ε) = 2ε³ > 0 = f(0, 0) and f(−ε, −ε) = −2ε³ < 0.
This also suggests that there is no global maximum or minimum, which is indeed true: f(ε, ε) = 2ε³ → ∞ as ε → ∞ and → −∞ as ε → −∞.

Caution 2: The algorithm may fail because CQ is not satisfied.
Suppose f(x1, x2) = −x2 and g^1(x1, x2) = x2³ − x1². This problem has a unique global maximum at (0, 0). Note that CQ fails at (0, 0).
The first order conditions have no solution:

−2λ1 x1 = 0        (4)
−1 + 3λ1 x2² = 0   (5)
x2³ − x1² = 0      (6)

(From (5), λ1 ≠ 0 and x2 ≠ 0; then (4) forces x1 = 0, and (6) forces x2 = 0, a contradiction.)
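The CQ failure in Caution 2 can be seen numerically (my own sketch): on the constraint set x2 = x1^(2/3) ≥ 0, so f = −x2 ≤ 0 with equality only at (0, 0), while Dg(0, 0) is the zero vector.

```python
# Caution 2 check: maximize f = -x2 subject to g = x2^3 - x1^2 = 0.
# (0, 0) is the global maximum, yet CQ fails there because Dg(0,0) has rank 0.

f = lambda x1, x2: -x2
g_grad = lambda x1, x2: (-2 * x1, 3 * x2 ** 2)   # Dg

# CQ fails at the optimum (0, 0): the Jacobian row is the zero vector.
assert g_grad(0.0, 0.0) == (0.0, 0.0)

# Sample points on the constraint set x2 = |x1|^(2/3): f <= f(0, 0) = 0 everywhere.
feasible = [(t, abs(t) ** (2.0 / 3.0)) for t in [-1.0, -0.5, 0.0, 0.5, 1.0]]
assert all(f(x1, x2) <= f(0.0, 0.0) for x1, x2 in feasible)
```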
Caution 3: Global maximum and minimum may not exist.
Suppose f(x1, x2) = (1/3)x1³ − (3/2)x2² + 2x1 and g^1(x1, x2) = x1 − x2. First order conditions:

x1² + 2 + λ1 = 0   (7)
−3x2 − λ1 = 0      (8)
x1 − x2 = 0        (9)

This system of equations has two solutions: x1 = x2 = 1, λ1 = −3 and x1 = x2 = 2, λ1 = −6, with f(1, 1) = 5/6 and f(2, 2) = 2/3. However, the global maximum and minimum do not exist: f(ε, ε) is an increasing function for ε > 2 and for ε < 1, so f is unbounded along the constraint. Of course (1, 1) is a local maximum and (2, 2) is a local minimum.
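A numerical companion to Caution 3 (mine, not from the notes), verifying the two solutions of (7)-(9), the function values, and the unboundedness:

```python
# f(x1, x2) = (1/3)x1^3 - (3/2)x2^2 + 2*x1 subject to x1 = x2.

f = lambda x1, x2: x1 ** 3 / 3 - 1.5 * x2 ** 2 + 2 * x1

def foc_residuals(x1, x2, lam):
    """The three first order equations (7)-(9)."""
    return (x1 ** 2 + 2 + lam, -3 * x2 - lam, x1 - x2)

for (x1, x2, lam) in [(1.0, 1.0, -3.0), (2.0, 2.0, -6.0)]:
    assert all(abs(r) < 1e-12 for r in foc_residuals(x1, x2, lam))

assert abs(f(1, 1) - 5 / 6) < 1e-12 and abs(f(2, 2) - 2 / 3) < 1e-12

# But f(t, t) is unbounded along the constraint, so no global optimum exists.
assert f(100, 100) > f(1, 1) and f(-100, -100) < f(2, 2)
```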
Proof of FOC
Sketch of a proof: We prove the special case N = 2 and K = 1. That is, we maximize f(x1, x2) subject to g(x1, x2) = 0, where f and g are C^1 functions.
CQ implies (∂g/∂x1, ∂g/∂x2) ≠ (0, 0) at the optimum (x1*, x2*). Assume that ∂g/∂x2 (x1*, x2*) ≠ 0.
By the Implicit Function Theorem, there exists h ∈ C^1 such that x2 = h(x1) in an open set containing (x1*, x2*). Thus:
(A) g(x1, h(x1)) = 0. By the chain rule, ∂g/∂x1 + (∂g/∂x2) h′(x1) = 0, so h′(x1*) = − (∂g/∂x1)(x1*, x2*) / (∂g/∂x2)(x1*, x2*).
(B) f(x1, h(x1)) is maximized at x1*. By the chain rule and the unconstrained FOC, ∂f/∂x1 (x1*, x2*) + (∂f/∂x2)(x1*, x2*) h′(x1*) = 0.
We want to show that we can find λ* ∈ R such that

∂f/∂x1 (x1*, x2*) + λ* ∂g/∂x1 (x1*, x2*) = 0   (10)
∂f/∂x2 (x1*, x2*) + λ* ∂g/∂x2 (x1*, x2*) = 0   (11)

Equation (11) gives λ* = − (∂f/∂x2)(x1*, x2*) / (∂g/∂x2)(x1*, x2*). It remains to show that Equation (10) holds for this λ*:

∂f/∂x1 (x1*, x2*) + λ* ∂g/∂x1 (x1*, x2*) = ∂f/∂x1 (x1*, x2*) + (∂f/∂x2)(x1*, x2*) h′(x1*)   (by A)
= 0   (by B)
Implicit Function Theorem

Take a C^1 function φ : D → R where D ⊆ R^N is an open set.
Suppose that φ(x1*, x2*, . . . , xN*) = 0 and ∂φ/∂x1 (x1*, x2*, . . . , xN*) ≠ 0.
Then we can find a neighbourhood V_δ(x2*, . . . , xN*) ⊆ R^(N−1) and an open set W ⊂ R containing x1* such that:
(i) there exists a C^1 function ψ : V_δ(x2*, . . . , xN*) → W with x1* = ψ(x2*, . . . , xN*) and φ(ψ(x2, . . . , xN), x2, . . . , xN) = 0 on V_δ;
(ii) moreover, ∂ψ/∂xk (x2*, . . . , xN*) = − (∂φ/∂xk)(x1*, . . . , xN*) / (∂φ/∂x1)(x1*, . . . , xN*) for all k = 2, . . . , N.
Optimization with Inequality Constraints
Suppose that the objective function is f : R^N → R and there are K constraint functions h^1, h^2, . . . , h^K : R^N → R. We want to maximize (or minimize) f(x) over all x ∈ R^N such that h^1(x) ≥ 0, h^2(x) ≥ 0, . . . , h^K(x) ≥ 0.
Once again, we start from local maxima x* that satisfy the K constraints. That is,
(i) h^1(x*) ≥ 0, h^2(x*) ≥ 0, . . . , h^K(x*) ≥ 0;
(ii) there exists an open set U containing x* such that f(x*) ≥ f(x) for all x ∈ U with h^1(x) ≥ 0, h^2(x) ≥ 0, . . . , h^K(x) ≥ 0.
First order necessary condition (Theorem of Kuhn and Tucker):
Suppose that f, h^1, . . . , h^K are C^1 functions.
x* is a local maximum of f ⇒ there exist λ*_1, λ*_2, . . . , λ*_K ∈ R such that
(KT1) λ*_j ≥ 0 and λ*_j h^j(x*) = 0 for all j = 1, 2, . . . , K;
(KT2) Df(x*) + Σ_{j=1}^{K} λ*_j Dh^j(x*) = 0,
under an appropriate CQ condition (which we shall ignore).
In case of a minimum, replace f by −f to obtain
(KT1) λ*_j ≥ 0 and λ*_j h^j(x*) = 0 for all j = 1, 2, . . . , K;
(KT2) Df(x*) − Σ_{j=1}^{K} λ*_j Dh^j(x*) = 0.

Finding global maximum:


Step 1: Find all local maxima using First order condition.
Step 2: Compare f (x ? ) at all points obtained in Step 1.
Note: The usual cautions apply.
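A worked Kuhn-Tucker instance (my own example, not from the notes): maximize f(x) = −(x − 2)² subject to h(x) = 1 − x ≥ 0. The constraint binds at the optimum x* = 1, with multiplier λ* = 2.

```python
# Verify (KT1) and (KT2) for max -(x - 2)^2 s.t. 1 - x >= 0.

f = lambda x: -(x - 2) ** 2
df = lambda x: -2 * (x - 2)    # Df
dh = lambda x: -1.0            # Dh
h = lambda x: 1 - x

x_star, lam_star = 1.0, 2.0

# (KT1): lambda* >= 0 and complementary slackness lambda* * h(x*) = 0.
assert lam_star >= 0 and lam_star * h(x_star) == 0

# (KT2): Df(x*) + lambda* Dh(x*) = 0.
assert abs(df(x_star) + lam_star * dh(x_star)) < 1e-12

# x* = 1 beats nearby feasible points (feasibility means x <= 1).
assert all(f(x_star) >= f(x) for x in [0.0, 0.5, 0.9, 1.0])
```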
Envelope Theorem
The objective function f : R^(N+1) → R is a function of the variables (x1, x2, . . . , xN) ∈ R^N and also of a parameter α ∈ R. The parameter is held constant when f is maximized over R^N subject to h(x1, x2, . . . , xN, α) ≥ 0.
Suppose that for every α there is a unique interior maximizer, so that x*(α) is the function that represents this relationship between parameter and maximizer. Suppose f ∈ C^1 and x*(α) is differentiable. Then V(α) = f(x*(α), α) is the maximum value of f given α and is called the value function.
The change of V(α) with respect to α is given by

V′(α) = ∂L/∂α (x*(α), λ*(α), α), where L = f(x, α) + λ h(x, α)

Proof (with a bit of hand-waving): V(α) = f(x*(α), α). By the chain rule,

V′(α) = ∂f/∂α (x*(α), α) + Σ_{j=1}^{N} ∂f/∂xj (x*(α), α) · dxj*/dα (α)

By (KT2), ∂f/∂xj (x*(α), α) = −λ* ∂h/∂xj (x*(α), α) for all j = 1, 2, . . . , N. Hence

V′(α) = ∂f/∂α (x*(α), α) − λ* Σ_{j=1}^{N} ∂h/∂xj (x*(α), α) · dxj*/dα (α)

By (KT1), either λ* = 0 or h(x*(α), α) = 0.
Case 1: If λ* = 0 then V′(α) = ∂f/∂α (x*(α), α) = ∂L/∂α (x*(α), λ*(α), α).
Case 2: Define H(α) = h(x*(α), α). Thus H(α) = 0 for all α. Differentiating with respect to α, we get

H′(α) = ∂h/∂α (x*(α), α) + Σ_{j=1}^{N} ∂h/∂xj (x*(α), α) · dxj*/dα (α) = 0

Therefore V′(α) = ∂f/∂α (x*(α), α) + λ* ∂h/∂α (x*(α), α) = ∂L/∂α (x*(α), λ*(α), α).
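A numerical check of the theorem in Case 1 (my own example, with a slack constraint so λ* = 0): for f(x, α) = −x² + αx the maximizer is x*(α) = α/2 and V(α) = α²/4, so V′(α) should equal ∂f/∂α evaluated at x*(α).

```python
# Envelope theorem check for f(x, a) = -x^2 + a*x with a slack constraint
# (lambda* = 0, Case 1 above): V'(a) = df/da (x*(a), a) = x*(a) = a/2.

V = lambda a: a ** 2 / 4          # value function f(x*(a), a)
df_da = lambda x, a: x            # partial of f(x, a) = -x^2 + a*x w.r.t. a

a = 1.5
h = 1e-6
V_prime = (V(a + h) - V(a - h)) / (2 * h)   # numerical V'(a)

x_star = a / 2
assert abs(V_prime - df_da(x_star, a)) < 1e-9   # V'(a) = dL/da at x*(a)
```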
