
Constrained and Unconstrained Optimization

Carlos Hurtado

Department of Economics
University of Illinois at Urbana-Champaign
hrtdmrt2@illinois.edu

Oct 10th, 2017



On the Agenda

1 Numerical Optimization
2 Minimization of Scalar Function
3 Golden Search
4 Newton’s Method
5 Polytope Method
6 Newton’s Method Reloaded
7 Quasi-Newton Methods
8 Non-linear Least-Square
9 Constrained Optimization





Numerical Optimization

Numerical Optimization

I In some economic problems, we would like to find the value that maximizes or minimizes a function.
I We are going to focus on the minimization problems:

    min_x f(x)

or

    min_x f(x)   s.t.   x ∈ B

I Notice that minimization and maximization are equivalent because we can maximize f(x) by minimizing −f(x).



Numerical Optimization

Numerical Optimization

I We want to solve this problem in a reasonable time.
I Most often, the CPU time is dominated by the cost of evaluating f(x).
I We would like to keep the number of evaluations of f(x) as small as possible.
I There are two types of objectives:
- Finding a global minimum: the lowest possible value of the function over the range.
- Finding a local minimum: the smallest value within a bounded neighborhood.





Minimization of Scalar Function

Bracketing Method

I We would like to find the minimum of a scalar function f(x), such that f : R → R.
I The Bracketing method is a direct method that does not use curvature or a local approximation.
I We start with a bracket:

    (a, b, c)   s.t.   a < b < c and f(a) > f(b) and f(c) > f(b)
I We will search for the minimum by selecting a trial point in one of the
intervals.
I If c − b > b − a, take d = (b + c)/2.
I Else, if c − b ≤ b − a, take d = (a + b)/2.
I If f(d) > f(b), there is a new bracket (d, b, c) or (a, b, d).
I If f(d) < f(b), there is a new bracket (a, d, c).
I Continue until the distance between the extremes of the bracket is small (a minimal sketch of this loop follows below).
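
I A minimal Python sketch of this bracketing loop (the function name and tolerance are illustrative assumptions; the update keeps the tightest valid bracket at each step):

def bracket_minimize(f, a, b, c, tol=1e-8):
    # assumes a < b < c with f(a) > f(b) and f(c) > f(b)
    fb = f(b)
    while c - a > tol:
        # bisect the larger of the two sub-intervals
        d = 0.5 * (b + c) if (c - b) > (b - a) else 0.5 * (a + b)
        fd = f(d)
        if fd < fb:
            # d is the new best point; the old b becomes an endpoint
            if d > b:
                a, b, fb = b, d, fd      # new bracket (b, d, c)
            else:
                c, b, fb = b, d, fd      # new bracket (a, d, b)
        else:
            # b stays the best point; d truncates the bracket on its side
            if d > b:
                c = d                    # new bracket (a, b, d)
            else:
                a = d                    # new bracket (d, b, c)
    return b
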
Minimization of Scalar Function

Bracketing Method
I We selected the new point using the midpoint between the extremes, but what is the best location for the new point d?

[Diagram: the bracket points on a line, a < b < d < c]
I One possibility is to minimize the size of the next search interval.
I The next search interval will be either from a to d or from b to c
I The proportion of the left interval is

    w = (b − a)/(c − a)

I The proportion of the new interval is

    z = (d − b)/(c − a)


Golden Search

Golden Search
I The proportion of the new segment will be

    1 − w = (c − b)/(c − a)

or

    w + z = (d − a)/(c − a)

I Moreover, if d is the new candidate to minimize the function,

    z/(1 − w) = [(d − b)/(c − a)] / [(c − b)/(c − a)] = (d − b)/(c − b)

I Ideally we will have

    z = 1 − 2w

and

    z/(1 − w) = w
Golden Search

Golden Search

I The previous equations imply w² − 3w + 1 = 0, or

    w = (3 − √5)/2 ≈ 0.38197

I In mathematics, the golden ratio is φ = (1 + √5)/2
I This goes back to Pythagoras
I Notice that 1 − 1/φ = (3 − √5)/2
I The Golden Search algorithm uses the golden ratio to set the new point (using a weighted average); a minimal sketch follows below.
I This reduces the bracket by about 40% at each iteration.
I The performance is independent of the function that is being minimized.
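
I A minimal Python sketch of golden-section search on a bracket [a, c] (function name and tolerance are illustrative assumptions):

import math

def golden_search(f, a, c, tol=1e-8):
    w = (3 - math.sqrt(5)) / 2      # the golden proportion, roughly 0.38197
    b = a + w * (c - a)             # lower interior point
    d = c - w * (c - a)             # upper interior point
    fb, fd = f(b), f(d)
    while c - a > tol:
        if fb < fd:
            # the minimum lies in [a, d]; the old b becomes the new upper interior point
            c, d, fd = d, b, fb
            b = a + w * (c - a)
            fb = f(b)
        else:
            # the minimum lies in [b, c]; the old d becomes the new lower interior point
            a, b, fb = b, d, fd
            d = c - w * (c - a)
            fd = f(d)
    return 0.5 * (a + c)

I Each iteration needs only one new function evaluation and shrinks the bracket by the same fixed proportion.
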
Golden Search

Golden Search

I Sometimes the performance can be improved substantially when a local approximation is used.
I When we use a combination of local approximation and golden search we get Brent's method.
I Let us suppose that we want to minimize y = x(x − 2)(x + 2)²


[Figure: plot of y = x(x − 2)(x + 2)² for x between −2 and 2]



Golden Search

Golden Search

I We can use the minimize_scalar function from the scipy.optimize module.

>>> def f(x):
...     return (x - 2) * x * (x + 2)**2
>>> from scipy.optimize import minimize_scalar
>>> opt_res = minimize_scalar(f)
>>> print(opt_res.x)
1.28077640403
>>> opt_res = minimize_scalar(f, method='golden')
>>> print(opt_res.x)
1.28077640147
>>> opt_res = minimize_scalar(f, bounds=(-3, -1), method='bounded')
>>> print(opt_res.x)
-2.0000002026





Newton’s Method

Newton’s Method
I Let us assume that the function f(x) : R → R is infinitely differentiable.
I We would like to find x* such that f(x*) ≤ f(x) for all x ∈ R.
I Idea: Use a Taylor approximation of the function f(x).
I The polynomial approximation of order two around a is:

    p(x) = f(a) + f'(a)(x − a) + (1/2) f''(a)(x − a)²

I To find an optimal value for p(x) we use the FOC:

    p'(x) = f'(a) + (x − a) f''(a) = 0

I Hence,

    x = a − f'(a)/f''(a)
Newton’s Method

Newton’s Method
I Newton's method starts with a given x1.
I To compute the next candidate to minimize the function we use

    x_{n+1} = x_n − f'(x_n)/f''(x_n)

I Do this until

    |x_{n+1} − x_n| < ε   and   |f'(x_{n+1})| < ε

I Newton’s method is very fast (quadratic convergence).


I Theorem (quadratic convergence):

    |x_{n+1} − x*| < (|f'''(x*)| / (2 |f''(x*)|)) |x_n − x*|²
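
I A minimal Python sketch of this iteration, assuming the user supplies f' and f'' (the function name and tolerances are illustrative):

def newton_minimize(fp, fpp, x0, eps=1e-10, max_iter=100):
    # fp is f', fpp is f''
    x = x0
    for _ in range(max_iter):
        x_new = x - fp(x) / fpp(x)
        if abs(x_new - x) < eps and abs(fp(x_new)) < eps:
            return x_new
        x = x_new
    return x

# For y = x(x - 2)(x + 2)**2 from the earlier slides:
#   fp  = lambda x: 4*x**3 + 6*x**2 - 8*x - 8
#   fpp = lambda x: 12*x**2 + 12*x - 8
#   newton_minimize(fp, fpp, x0=1.0)   # converges to about 1.28078
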
Newton’s Method

Newton’s Method
I A Quick Detour: Root Finding
I Consider the problem of finding zeros for p(x )
I Assume that you know a point a where p(a) is positive and a point b
where p(b) is negative.
I If p(x) is continuous between a and b, we could approximate it as:

    p(x) ≈ p(a) + (x − a) p'(a)

I The approximate zero is then:

    x = a − p(a)/p'(a)

I The idea is the same as before. Newton's method also works for finding roots, as in the sketch below.
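
I A minimal Python sketch of the root-finding iteration (names are illustrative assumptions):

def newton_root(p, dp, a, eps=1e-12, max_iter=100):
    # dp is p'
    for _ in range(max_iter):
        step = p(a) / dp(a)
        a = a - step
        if abs(step) < eps:
            return a
    return a
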
Newton’s Method

Newton’s Method

I There are several issues with Newton's method:
- The iteration point is stationary
- The starting point enters a cycle
- The derivative does not exist
- The derivative is discontinuous
I Newton's method finds a local optimum, not necessarily a global optimum.





Polytope Method

Polytope Method

I The Polytope (a.k.a. Nelder-Meade) Method is a direct method to


find the solution of
min f (x )
x

where f : Rn → R.
I We start with the points x1 , x2 and x3 , such that
f (x1 ) ≥ f (x2 ) ≥ f (x3 )
I Using the midpoint between x2 and x3 , we reflect x1 to the point y1
I Check if f (y1 ) < f (x1 ).
I If true, you have a new polytope.
I If not, try x2 . If not, try x3
I If nothing works, shrink the polytope toward x3 .
I Stop when the size of the polytope is smaller then ε
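
I A simplified Python sketch of this reflect-or-shrink idea (a stripped-down illustration, not the full Nelder–Mead algorithm; names and constants are assumptions):

import numpy as np

def polytope_minimize(f, x1, x2, x3, eps=1e-8, max_iter=500):
    simplex = [np.asarray(x1, float), np.asarray(x2, float), np.asarray(x3, float)]
    for _ in range(max_iter):
        # order the vertices so that f(simplex[0]) >= f(simplex[1]) >= f(simplex[2])
        simplex.sort(key=f, reverse=True)
        worst, best = simplex[0], simplex[2]
        if max(np.linalg.norm(v - best) for v in simplex) < eps:
            break
        mid = 0.5 * (simplex[1] + simplex[2])   # midpoint of the two better vertices
        y = 2.0 * mid - worst                   # reflect the worst vertex through it
        if f(y) < f(worst):
            simplex[0] = y                      # accept the reflected point
        else:
            # nothing better was found: shrink the polytope toward the best vertex
            simplex = [best + 0.5 * (v - best) for v in simplex]
    return min(simplex, key=f)
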
Polytope Method

Polytope Method
I Let us consider the following function:

    f(x0, x1) = (1 − x0)² + 100 (x1 − x0²)²

I The function looks like:


[Figure: 3-D surface plot of f(x0, x1)]


[Figure: contour plot of f(x0, x1)]



Polytope Method

Polytope Method

I In Python we can do:

>>> def f2(x):
...     return (1 - x[0])**2 + 100*(x[1] - x[0]**2)**2
>>> from scipy.optimize import fmin
>>> opt = fmin(func=f2, x0=[0, 0])
>>> print(opt)
[ 1.00000439  1.00001064]





Newton’s Method Reloaded

Newton’s Method

I What can we do if we want to use Newton's Method for a function f : R^n → R?
I We can use a quadratic approximation at a' = (a1, · · · , an):

    p(x) = f(a) + ∇f(a)(x − a) + (1/2)(x − a)' H(a)(x − a)

where x' = (x1, · · · , xn).
I The gradient ∇f(x) is a multi-variable generalization of the derivative:

    ∇f(x)' = (∂f(x)/∂x1, · · · , ∂f(x)/∂xn)



Newton’s Method Reloaded

Newton’s Method
I The Hessian matrix H(x) is a square matrix of second-order partial derivatives that describes the local curvature of a function of many variables:

           [ ∂²f(x)/∂x1²     ∂²f(x)/∂x1∂x2   · · ·   ∂²f(x)/∂x1∂xn ]
    H(x) = [ ∂²f(x)/∂x2∂x1   ∂²f(x)/∂x2²     · · ·   ∂²f(x)/∂x2∂xn ]
           [      ...             ...        · · ·        ...      ]
           [ ∂²f(x)/∂xn∂x1   ∂²f(x)/∂xn∂x2   · · ·   ∂²f(x)/∂xn²   ]

I The FOC is:

    ∇p = ∇f(a) + H(a)(x − a) = 0

I We can solve this to get:

    x = a − H(a)⁻¹ ∇f(a)
Newton’s Method Reloaded

Newton’s Method

I Following the same logic as in the one-dimensional case:

    x^{k+1} = x^k − H(x^k)⁻¹ ∇f(x^k)

I How do we compute H(x^k)⁻¹ ∇f(x^k)?
I We can solve:

    H(x^k)⁻¹ ∇f(x^k) = s
    ∇f(x^k) = H(x^k) s

I The search direction, s, is the solution of a system of equations (and we know how to solve that!); a minimal sketch follows below.
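
I A minimal Python sketch of the multivariate Newton iteration, obtaining the step s from the linear system H(x^k) s = ∇f(x^k) (names are illustrative assumptions):

import numpy as np

def newton_minimize_nd(grad, hess, x0, eps=1e-10, max_iter=100):
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < eps:
            break
        s = np.linalg.solve(hess(x), g)   # search direction from a linear system
        x = x - s
    return x
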





Quasi-Newton Methods

Quasi-Newton Methods

I For Newton's method we need the Hessian of the function.
I If the Hessian is unavailable, the "full" Newton's method cannot be used.
I Any method that replaces the Hessian with an approximation is a quasi-Newton method.
I One advantage of quasi-Newton methods is that the Hessian matrix does not need to be inverted.
I Newton's method requires the Hessian to be inverted, which is typically implemented by solving a system of equations.
I Quasi-Newton methods usually generate an estimate of the inverse directly.



Quasi-Newton Methods

Quasi-Newton Methods

I In the Broyden–Fletcher–Goldfarb–Shanno (BFGS) algorithm, the Hessian matrix is approximated using updates specified by gradient evaluations (or approximate gradient evaluations).
I In Python:

>>> import numpy as np
>>> from scipy.optimize import fmin_bfgs
>>> def f(x):
...     return (1 - x[0])**2 + 100*(x[1] - x[0]**2)**2
>>> opt = fmin_bfgs(f, x0=[0.5, 0.5])

I Using the gradient we can improve the approximation:

>>> def gradient(x):
...     return np.array((-2*(1 - x[0]) - 100*4*x[0]*(x[1] - x[0]**2),
...                      200*(x[1] - x[0]**2)))
>>> opt2 = fmin_bfgs(f, x0=[10, 10], fprime=gradient)



Quasi-Newton Methods

Quasi-Newton Methods

I One of the methods that requires the fewest function calls (and is therefore very fast) is the Newton-Conjugate-Gradient (NCG) method.
I The method uses a conjugate gradient algorithm to (approximately) invert the local Hessian.
I If the Hessian is positive definite, then the local minimum of this function can be found by setting the gradient of the quadratic form to zero.
I In Python:

>>> from scipy.optimize import fmin_ncg
>>> opt3 = fmin_ncg(f, x0=[10, 10], fprime=gradient)





Non-linear Least-Square

Non-linear Least-Square

I Suppose it is desired to fit a set of data {xi, yi} to a model, y = f(x; p), where p is a vector of parameters for the model that need to be found.
I A common method for determining which parameter vector gives the best fit to the data is to minimize the sum of squared errors. (why?)
I The error is usually defined for each observed data point as:

    ei(yi, xi; p) = ||yi − f(xi; p)||

I The sum of the squares of the errors is:

    S(p; x, y) = Σ_{i=1}^{N} ei²(yi, xi; p)



Non-linear Least-Square

Non-linear Least-Square

I Suppose that we model some population data observed at several times:

    yi = f(ti; (A, b)) = A e^{b ti}

I The parameters A and b are unknown to the economist.
I We would like to minimize the sum of squared errors to approximate the data; a sketch is given below.
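
I A Python sketch using scipy.optimize.curve_fit, which minimizes the sum of squared errors defined above (the data points below are made up purely for illustration):

import numpy as np
from scipy.optimize import curve_fit

def model(t, A, b):
    return A * np.exp(b * t)

t_data = np.array([0.0, 1.0, 2.0, 3.0, 4.0])     # hypothetical observation times
y_data = np.array([2.1, 2.9, 4.2, 6.1, 8.8])     # hypothetical population data

(A_hat, b_hat), cov = curve_fit(model, t_data, y_data, p0=[1.0, 0.1])
print(A_hat, b_hat)
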





Constrained Optimization

Constrained Optimization

I Let us find the minimum of a scalar function subject to constraints:

    min_{x ∈ R^n} f(x)   s.t.   g(x) = a and h(x) ≥ b

I Here we have g : R^n → R^m and h : R^n → R^k.
I Notice that we can re-write the problem as an unconstrained version:

    min_{x ∈ R^n}  f(x) + p [ (1/2) Σ_{i=1}^{m} (gi(x) − ai)² + Σ_{j=1}^{k} max{0, bj − hj(x)} ]

I For a "very large" value of p, the constraints need to be satisfied (penalty method); a sketch follows below.
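
I A Python sketch of the penalty reformulation (the helper name, penalty weight, and the small example problem are illustrative assumptions; the max term is non-smooth, so this is only a rough device):

import numpy as np
from scipy.optimize import fmin

def penalized(f, g, h, a, b, p):
    # unconstrained objective: f(x) plus a penalty for violating g(x) = a and h(x) >= b
    def fp(x):
        eq = np.asarray(g(x)) - a
        viol = np.maximum(0.0, b - np.asarray(h(x)))   # positive where h(x) >= b fails
        return f(x) + p * (0.5 * np.sum(eq**2) + np.sum(viol))
    return fp

# Example: minimize x0**2 + x1**2 subject to x0 + x1 = 1 and x0 >= 0.2
# f = lambda x: x[0]**2 + x[1]**2
# g = lambda x: [x[0] + x[1]];  a = np.array([1.0])
# h = lambda x: [x[0]];         b = np.array([0.2])
# sol = fmin(penalized(f, g, h, a, b, p=1e4), x0=[0.0, 0.0])
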



Constrained Optimization

Constrained Optimization

I If the objective function is quadratic, the optimization problem looks like

    min_{x ∈ R^n}  q(x) = (1/2) x'Gx + x'c   s.t.   g(x) = a and h(x) ≥ b

I The structure of this type of problem can be efficiently exploited.
I This forms the basis for Augmented Lagrangian and Sequential Quadratic Programming methods.



Constrained Optimization

Constrained Optimization

I The Augmented Lagrangian Methods use a mix of the Lagrangian with the penalty method.
I The Sequential Quadratic Programming Algorithms (SQPA) solve the problem by using quadratic approximations of the Lagrangian function.
I The SQPA is the analogue of Newton's method for the case of constraints.
I How does the algorithm solve the problem? It is possible with extensions of the simplex method, which we will not cover.
I The previous extensions can also be solved with the BFGS algorithm.


Constrained Optimization

Constrained Optimization

I Let us consider the Utility Maximization problem of an agent with a constant elasticity of substitution (CES) utility function:

    U(x, y) = (α x^ρ + (1 − α) y^ρ)^(1/ρ)

I Denote by px and py the prices of goods x and y respectively.
I The constrained optimization problem for the consumer is:

    max_{x,y} U(x, y; ρ, α)   subject to   x ≥ 0, y ≥ 0 and px·x + py·y = M

(a sketch of a numerical solution follows below)
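
I A Python sketch of this consumer problem using scipy.optimize.minimize with an equality constraint (the values of α, ρ, px, py and M are illustrative assumptions):

from scipy.optimize import minimize

alpha, rho, px, py, M = 0.3, 0.5, 1.0, 2.0, 10.0     # hypothetical parameters

def neg_utility(z):
    # minimize the negative of U(x, y) to maximize utility
    x, y = z
    return -(alpha * x**rho + (1 - alpha) * y**rho)**(1.0 / rho)

budget = {'type': 'eq', 'fun': lambda z: M - px * z[0] - py * z[1]}
res = minimize(neg_utility, x0=[1.0, 1.0], method='SLSQP',
               bounds=[(0, None), (0, None)], constraints=[budget])
print(res.x)   # the optimal bundle (x, y)
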

