Matlab - Optimization and Integration

Paul Schrimpf

January 14, 2009
Outline

- Optimization
- Integration
Section Search

- Form a bracket x1 < x2 < x3 such that f(x2) < f(x1) and f(x2) < f(x3)
- Try a new x in (x1, x3), update the bracket
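As an illustration, a minimal golden section search in Matlab (a sketch; the function name, stopping rule, and re-evaluation of f on each pass are choices made here, not code from the lecture):

function x = sectionSearch(f, x1, x3, tol)
% golden section search for a minimum of f on [x1,x3]
r = (sqrt(5)-1)/2;                  % golden ratio fraction, r^2 = 1-r
x2 = x3 - r*(x3-x1);                % interior points
x4 = x1 + r*(x3-x1);
while (x3 - x1) > tol
  if f(x2) < f(x4)                  % minimum bracketed by [x1,x4]
    x3 = x4; x4 = x2; x2 = x3 - r*(x3-x1);
  else                              % minimum bracketed by [x2,x3]
    x1 = x2; x2 = x4; x4 = x1 + r*(x3-x1);
  end
end
x = (x1+x3)/2;
end

For example, sectionSearch(@(x) (x-2).^2, 0, 5, 1e-8) returns approximately 2.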
Quadratic Interpolation

- Fit a quadratic through the bracketing points and jump to its minimum; faster than section search once near the solution
Newton's Method

- Iterate x_{n+1} = x_n - f'(x_n)/f''(x_n)
- Quasi-Newton: approximate f''(x_n) with (f'(x_n) - f'(x_{n-1}))/(x_n - x_{n-1})
- BHHH: approximate f''(x) with (1/N) sum_{i=1}^N f_i'(x) f_i'(x)'
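A minimal Newton iteration for minimization (a sketch; fp and fpp are assumed handles for f' and f''):

function x = newtonMin(fp, fpp, x0, tol, maxIter)
% find a root of f'(x) = 0 by Newton steps; works for scalar or vector x
x = x0;
for n = 1:maxIter
  step = fpp(x) \ fp(x);   % Newton step: f''(x)^{-1} f'(x)
  x = x - step;
  if norm(step) < tol, return; end
end
warning('newtonMin did not converge');
end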
Rates of Convergence

- Newton's method converges quadratically, i.e. in a neighborhood of the solution,

  lim_{n -> infinity} ||x_{n+1} - x*|| / ||x_n - x*||^2 = C
Trust Regions

- Minimize a local (usually quadratic) model of f over a region where the model is trusted; expand or shrink the region depending on how well the model predicts the actual change in f
Matlab Implementation

- fminbnd() minimizes a function of one variable on a bounded interval, combining section search and quadratic interpolation
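A usage sketch (the objective, interval, and tolerances are assumed for illustration):

% minimize a one-dimensional function on [0, 2*pi]
f = @(x) sin(x) + 0.1*x.^2;
opt = optimset('Display','iter','TolX',1e-8);
[xmin, fval, exitflag] = fminbnd(f, 0, 2*pi, opt);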
1 Choose a direction
2 Minimize in that direction (line search)
Matlab Implementation

fminunc() offers two algorithms:

1 optimset('LargeScale','off'): quasi-Newton with approximate line search
  - Needs the gradient; will compute a finite difference approximation if the gradient is not supplied
  - The Hessian approximation is underdetermined by f'(x_n) and f'(x_{n-1}) in dimension > 1
  - Build the Hessian approximation with a recurrence relation: optimset('HessUpdate','bfgs') (usually better) or optimset('HessUpdate','dfp')
2 optimset('LargeScale','on'): hybrid method with a 2-d trust region
  - optimset('GradObj','on','DerivativeCheck','on')
  - optimset('Hessian','on')
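A sketch putting these options together (the Rosenbrock objective and starting point are assumed for illustration):

function [f, g] = rosenbrock(x)
% objective and its gradient, returned as a second output for 'GradObj'
f = 100*(x(2)-x(1)^2)^2 + (1-x(1))^2;
g = [-400*(x(2)-x(1)^2)*x(1) - 2*(1-x(1));
      200*(x(2)-x(1)^2)];
end

opt = optimset('LargeScale','off','GradObj','on', ...
               'DerivativeCheck','on','HessUpdate','bfgs');
[xmin, fval] = fminunc(@rosenbrock, [-1; 2], opt);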
Nelder-Mead Simplex

- N + 1 points in N dimensions form a polyhedron; move the polyhedron by reflecting, expanding, and contracting it away from high function values
- fminsearch() uses it
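A usage sketch (objective assumed for illustration); no derivatives are needed:

f = @(x) 100*(x(2)-x(1)^2)^2 + (1-x(1))^2;
opt = optimset('Display','final','MaxFunEvals',2000);
[xmin, fval] = fminsearch(f, [-1; 2], opt);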
Pattern Search

- Evaluate f(x_i + d_k) for each step d_k in a pattern scaled by the current step size
- If f(x_i + d_k) < f(x_i), set x_{i+1} = x_i + d_k and increase the step size
- If min_k f(x_i + d_k) > f(x_i), set x_{i+1} = x_i and decrease the step size
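A compact coordinate-direction variant (a sketch; the doubling and halving of the step size are assumed choices):

function x = patternSearch(f, x, delta, tol)
% pattern search with steps of +/- delta along each coordinate
n = numel(x);
D = [eye(n), -eye(n)];                       % 2n coordinate directions
while delta > tol
  fx = f(x);
  [fbest, k] = min(arrayfun(@(j) f(x + delta*D(:,j)), 1:2*n));
  if fbest < fx
    x = x + delta*D(:,k);                    % success: move and increase step
    delta = 2*delta;
  else
    delta = delta/2;                         % failure: keep x, decrease step
  end
end
end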
Genetic Algorithm

- Can find a global optimum (but I do not know whether this has been formally proven)
- Begin with a random population of points, {x_n^0}, then repeat:
  1 Evaluate the fitness of each point
  2 Select fitter points as parents and combine them to produce new points
  3 Randomly mutate some of the new points
Simulated Annealing

- Accept a candidate point x in place of the current point x_i with probability 1/(1 + e^{(f(x) - f(x_i))/tau}), and gradually lower the temperature tau
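A sketch of one annealing pass (the Gaussian proposal and geometric cooling schedule are assumptions):

function x = simAnneal(f, x, tau, nIter)
% simulated annealing with the logistic acceptance rule above
for t = 1:nIter
  y = x + randn(size(x));                     % propose a nearby point
  if rand < 1/(1 + exp((f(y) - f(x))/tau))    % may accept even if f(y) > f(x)
    x = y;
  end
  tau = 0.99*tau;                             % lower the temperature
end
end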
Constrained Optimization

  min_x f(x)          (1)
  s.t. h(x) <= 0      (2)
  g(x) = 0            (3)

- fmincon() handles constrained problems; supply the nonlinear constraints as a function returning h(x) and g(x)
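A usage sketch (objective and constraints assumed for illustration):

function [c, ceq] = myCon(x)
% nonlinear constraints: c(x) <= 0 and ceq(x) = 0
c = 1 - x(1)*x(2);       % requires x1*x2 >= 1
ceq = [];
end

f = @(x) x(1)^2 + x(2)^2;
Aeq = [1, -1]; beq = 0;  % linear equality x1 = x2
[xmin, fval] = fmincon(f, [2; 2], [], [], Aeq, beq, [], [], @myCon);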
Derivatives

- The best minimization algorithms require derivatives
- Can use a finite difference approximation
  - Forward differences need one extra function evaluation per dimension; central differences are more accurate but need two
  - The step size trades off truncation error against roundoff error
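A minimal forward difference gradient (a sketch; the sqrt(eps) step size is the usual rule of thumb):

function g = fdGradient(f, x)
% forward difference approximation to the gradient of f at x
n = numel(x);
g = zeros(n,1);
fx = f(x);
h = sqrt(eps)*max(abs(x), 1);   % per-coordinate step size
for i = 1:n
  xi = x;
  xi(i) = xi(i) + h(i);
  g(i) = (f(xi) - fx)/h(i);
end
end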
simulateBinChoice()

function data=simulateBinChoice(dataIn,parm)
% ... comments and error checking omitted ...
data = dataIn;                 % assumed: carry N, nX, etc. over from the input
data.x = randn(data.N,data.nX);
epsilon = randn(data.N,1);     % assumed: standard normal shocks
data.y = (data.x*parm.beta + epsilon > 0);
end
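For reference, a minimal log-likelihood consistent with the gradient and Hessian listings below (a sketch; the function name and signature are assumptions, but b, xb, l_i, parm.dist, and data follow those listings):

function [loglike, l_i, xb] = binChoiceLike(b, data, parm)
% log-likelihood of the binary choice model y = (x*b + epsilon > 0)
xb = data.x*b;
l_i = parm.dist.cdf(xb);           % P(y=1|x) = F(x*b)
l_i(~data.y) = 1 - l_i(~data.y);   % P(y=0|x) = 1 - F(x*b)
loglike = sum(log(l_i));
end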
% gradient of l_i
g_i = (parm.dist.pdf(xb)*ones(1,length(b))).*data.x;
g_i(~data.y,:) = -g_i(~data.y,:);
% change to gradient of loglike
grad = sum(g_i./(l_i*ones(1,length(b))),1)';
% calculate hessian
h_i = zeros(length(xb),length(b),length(b));
for i=1:length(xb)
  h_i(i,:,:) = (parm.dist.dpdf(xb(i))* ...
                data.x(i,:)'*data.x(i,:)) ...
               ... % make hessian of loglike
               / l_i(i);
end
h_i(~data.y,:,:) = -h_i(~data.y,:,:);
% make hessian of loglike
hess = squeeze(sum(h_i,1)) - g_i'*(g_i./(l_i.^2*ones(1,length(b))));
Example: binChoicePlot.m
Numerical Integration

Want:

  F = integral_a^b f(x) w(x) dx

Approximate:

  F ~ sum_{i=1}^n f(x_i) w_i
Quadrature

- Suppose the x_i and n are given; need to choose the w_i
- Let {e_j(x)} be a basis for the space of functions such that integral_a^b f(x) w(x) dx < infinity
- The weights {w_i}_{i=1}^n solve, for each basis function e_j,

  sum_{i=1}^n w_i e_j(x_i) = integral_a^b e_j(x) w(x) dx      (4)

- Example: Newton-Cotes
  - basis = polynomials
  - w(x) = 1
  - Resulting rule is exact for all polynomials of degree less than or equal to n
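A sketch of solving (4) directly for given nodes, with monomials as the basis and w(x) = 1 (the nodes and n are assumptions for illustration):

% weights that match the moments of 1, x, ..., x^(n-1) on [a,b]
n = 5; a = -1; b = 1;
x = linspace(a, b, n)';              % given, evenly spaced nodes
E = fliplr(vander(x));               % E(i,j+1) = x_i^j
j = (0:n-1)';
m = (b.^(j+1) - a.^(j+1))./(j+1);    % integral of x^j over [a,b]
w = E' \ m;                          % solve sum_i w_i x_i^j = m_j for all j
% check: w'*(x.^3) reproduces the integral of x^3 exactly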
Gaussian Quadrature

- Now suppose n is given, but we can choose both the w_i and the x_i
- Same idea: {w_i, x_i}_{i=1}^n solve, for each basis function e_j,

  sum_{i=1}^n w_i e_j(x_i) = integral_a^b e_j(x) w(x) dx      (5)

- With 2n free parameters, the rule is exact for polynomials of degree up to 2n - 1
Interval       w(x)                      Name
[-1, 1]        1                         Legendre
(-1, 1)        (1-x)^alpha (1+x)^beta    Jacobi
(-1, 1)        1/sqrt(1 - x^2)           Chebyshev (first kind)
[-1, 1]        sqrt(1 - x^2)             Chebyshev (second kind)
[0, inf)       e^{-x}                    Laguerre
(-inf, inf)    e^{-x^2}                  Hermite
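For example, Gauss-Hermite nodes and weights can be computed from the Jacobi matrix of the Hermite recurrence (the Golub-Welsch method; a sketch, not code from the lecture):

function [x, w] = gaussHermite(n)
% nodes and weights for integrating f(x)*exp(-x^2) over the real line
b = sqrt((1:n-1)/2);             % off-diagonal of the symmetric Jacobi matrix
J = diag(b,1) + diag(b,-1);
[V, D] = eig(J);
[x, idx] = sort(diag(D));        % nodes are the eigenvalues
w = sqrt(pi) * V(1,idx)'.^2;     % weights from the first eigenvector components
end

Then sum(w.*f(x)) approximates the integral of f(x)*exp(-x^2); for instance, with f(x) = x.^2 it returns sqrt(pi)/2 to machine precision.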
Quadrature in Matlab

- Not built in
- Many implementations are available online
- In multiple dimensions, the product rule
  sum_{i_1=1}^n ... sum_{i_m=1}^n f(x_{i_1}, .., x_{i_m}) w_{i_1} .. w_{i_m}
  works, but needs n^m function evaluations
- More sophisticated methods exist, called cubature, sparse grids, or complete polynomials; see e.g. the Encyclopedia of Cubature Formulas
Adaptive Integration

- Approximate F = integral_a^b f(x) w(x) dx by refining the rule where the estimated error is largest, adding points until a tolerance is met
Monte Carlo Integration

- Randomly draw x_i from a distribution p(x) proportional to w(x), set w_i = (1/n) integral w(x) dx
- Many methods for sampling from p(x): inverse cdf, acceptance sampling, importance sampling, Gibbs sampling, Metropolis-Hastings
- Pros: simple to understand, easy to implement, scales well, requires little a priori knowledge of f(x)
- Cons: inefficient; for a fixed n, the right quadrature rule will do much better
  - But when computing something like sum_i g(sum_s f(y_i, x_{i,s}) w_s), errors in g(sum_s f(y_i, x_{i,s}) w_s) for different i can offset one another
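A minimal sketch for the Gauss-Hermite weight function (the integrand is an assumed example); here p(x) = exp(-x^2)/sqrt(pi) is the N(0, 1/2) density:

n = 1e5;
x = randn(n,1)/sqrt(2);        % draws from p(x) proportional to exp(-x^2)
w = sqrt(pi)/n;                % w_i = (1/n) * integral of exp(-x^2)
mcEstimate = w*sum(x.^2);      % approximates sqrt(pi)/2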
Integration Example

clear;
% some integration experiments
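The experiments might proceed along these lines (a sketch comparing the two approaches, reusing the gaussHermite function sketched earlier; everything here is illustrative):

f = @(x) x.^2;
exact = sqrt(pi)/2;                            % integral of x^2*exp(-x^2)
[xq, wq] = gaussHermite(10);
quadErr = wq'*f(xq) - exact;                   % tiny with only 10 points
mc = sqrt(pi)*mean(f(randn(1e4,1)/sqrt(2)));
mcErr = mc - exact;                            % O(1/sqrt(n)) error
fprintf('quadrature error %g, monte carlo error %g\n', quadErr, mcErr);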
MCMC

- Markov chain Monte Carlo: construct a Markov chain whose stationary distribution is the target distribution p(x), then treat draws from the chain as (correlated) draws from p(x)
Metropolis-Hastings

- Draw y ~ q(x_t, .)
- Compute alpha(x_t, y) = min{1, [p(y) q(y, x_t)] / [p(x_t) q(x_t, y)]}
- Draw u ~ U[0, 1]
- If u < alpha(x_t, y) set x_{t+1} = y, otherwise set x_{t+1} = x_t
- Example: metropolisHastings.m
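A minimal random walk version (a sketch in the spirit of metropolisHastings.m, not the lecture's file; p is the target density, up to a normalizing constant):

function x = rwMetropolis(p, x0, nDraws)
% random walk Metropolis: symmetric proposal, so q cancels in alpha
x = zeros(nDraws, numel(x0));
x(1,:) = x0(:)';
for t = 1:nDraws-1
  y = x(t,:) + randn(1, numel(x0));   % propose
  if rand < min(1, p(y)/p(x(t,:)))    % accept with probability alpha
    x(t+1,:) = y;
  else
    x(t+1,:) = x(t,:);                % reject: repeat the current point
  end
end
end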
Exercises

1 If you have written any code that involves optimization or integration, try modifying it to use a different method.
2 Modify the dynamic programming code from lecture 1 to allow for a continuous distribution for income. If you are clever, you will be able to evaluate E[v(x', y')] exactly, even if E[v(x', y')] does not have an analytic solution.
3 We know that value function iteration converges linearly. Gauss-Newton and other quadratic approximation based methods converge at faster rates. Change the dynamic programming code from lecture 1 to use one of these methods instead of value function iteration.
4 Change the binary choice model into a multinomial choice model. Allow for correlation between the shocks. Try to preserve the generality of the binary model, but feel free to limit the choice of distribution if it helps.
More Exercises

1 (hard) I don't know much about cubature, but I'd like to learn more. Read about a few methods. Find or write some Matlab code for one of them. Explore the accuracy of the method. To maintain a desired level of accuracy, how does the number of points grow with the number of dimensions? Compare it to Monte Carlo integration.
2 (hard) Matlab lacks an implementation of an optimization algorithm that uses interpolation in multiple dimensions. Remedy this situation. Find or develop an algorithm and implement it.
3 (hard, but not as hard) Write a function for computing arbitrary Gaussian quadrature rules with polynomials as a basis. Given integration limits, a weight function w(x), and the number of points n, your function should return the integration points and weights. You might want to use the following facts, taken from Numerical Recipes. Let <f|g> = integral_a^b f(x) g(x) w(x) dx denote the inner product. Then the following recurrence relation will construct a set of orthogonal polynomials:

  p_{-1}(x) = 0
  p_0(x) = 1
  p_{j+1}(x) = (x - <x p_j | p_j> / <p_j | p_j>) p_j(x) - (<p_j | p_j> / <p_{j-1} | p_{j-1}>) p_{j-1}(x)

  Recall that the roots of the degree-n polynomial are the integration points. Given the roots {x_j}_{j=1}^n, the weights are

  w_j = <p_{n-1} | p_{n-1}> / (p_{n-1}(x_j) p_n'(x_j))