
http://ocw.mit.edu

Spring 2008

For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.

16.323 Lecture 1

Nonlinear Optimization

- Unconstrained nonlinear optimization
- Line search methods

Spr 2008 16.323 1-1

Basics - Unconstrained

- Goal: minimize a scalar objective F(x) with respect to the parameters x.

- Assume that F(x) is scalar: x* = arg min_x F(x)

- Define two types of minima:

  - Strong: objective function increases locally in all directions.
    A point x* is a strong minimum of a function F(x) if a scalar δ > 0
    exists such that F(x*) < F(x* + Δx) for all Δx such that 0 < ‖Δx‖ ≤ δ.

  - Weak: objective function remains the same in some directions, and
    increases locally in other directions.
    A point x* is a weak minimum of a function F(x) if it is not a strong
    minimum and a scalar δ > 0 exists such that F(x*) ≤ F(x* + Δx) for all
    Δx such that 0 < ‖Δx‖ ≤ δ.

- Note that a minimum is a unique global minimum if the definitions hold
  for δ = ∞. Otherwise these are local minima.
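These definitions can be probed numerically. A minimal sketch (not from the lecture; the example functions and the `probe` helper are illustrative assumptions):

```python
# Probe F in random directions of size delta around a candidate point: a
# strong minimum increases in every direction; a weak one never decreases.
import numpy as np

def probe(F, xstar, delta=1e-3, ndirs=200, seed=0):
    """Smallest change F(x* + dx) - F(x*) over random perturbations |dx| = delta."""
    rng = np.random.default_rng(seed)
    d = rng.normal(size=(ndirs, len(xstar)))
    d *= delta / np.linalg.norm(d, axis=1, keepdims=True)
    return min(F(xstar + di) - F(xstar) for di in d)

F_strong = lambda x: x[0]**2 + x[1]**2   # strong minimum at the origin
F_weak   = lambda x: x[0]**2             # flat along x_2: weak minimum

print(probe(F_strong, np.array([0.0, 0.0])) > 0)   # True: increases in all directions
print(probe(F_weak,   np.array([0.0, 0.0])) >= 0)  # True: never decreases
```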

[Figure: F(x) vs. x, illustrating strong and weak local minima.]

- Expand F(x) in the neighborhood of an arbitrary point x using a Taylor
  series:

  F(x + Δx) ≈ F(x) + Δx^T g(x) + (1/2) Δx^T G(x) Δx + ...

  where g ≡ gradient of F and G ≡ second derivative (Hessian) of F:

  x = [x_1; ...; x_n],   g = ∂F/∂x = [∂F/∂x_1; ...; ∂F/∂x_n]

      [ ∂²F/∂x_1²      ...  ∂²F/∂x_1∂x_n ]
  G = [     :           .        :       ]
      [ ∂²F/∂x_n∂x_1   ...  ∂²F/∂x_n²   ]

- Given the ambiguity of the sign of the term Δx^T g(x), we can only avoid
  a cost decrease F(x + Δx) < F(x) if g(x) = 0

  - Obtain further information from higher derivatives.

- g(x) = 0 is a necessary and sufficient condition for a point to be a
  stationary point; it is a necessary, but not sufficient, condition to be
  a minimum.

  - A stationary point could also be a maximum or a saddle point.

- At a stationary point, set g(x) = 0, in which case:

  F(x + Δx) ≈ F(x) + (1/2) Δx^T G(x) Δx + ...

- For a strong minimum, need Δx^T G(x) Δx > 0 for all Δx, which is
  sufficient to ensure that F(x + Δx) > F(x).

  - To be true for arbitrary Δx ≠ 0, the sufficient condition is that
    G(x) > 0 (PD).¹

- The second-order necessary condition for a strong minimum is that
  G(x) ≥ 0 (PSD), because in this case the higher-order terms in the
  expansion can play an important role, i.e. Δx^T G(x) Δx = 0 but the
  third term in the Taylor series expansion is positive.

- Summary: g(x) = 0 and G(x) > 0 (sufficient), or G(x) ≥ 0 (necessary).

¹ Positive definite matrix
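These second-order conditions translate directly into an eigenvalue test. A sketch (the function name is an illustrative assumption, not from the notes):

```python
# Classify a stationary point (g(x) = 0) from the eigenvalues of the
# Hessian G: all positive => strong minimum; mixed signs => saddle.
import numpy as np

def classify_stationary(G, tol=1e-12):
    ev = np.linalg.eigvalsh(G)          # G is symmetric: use eigvalsh
    if np.all(ev > tol):
        return "strong minimum (G positive definite)"
    if np.all(ev > -tol):
        return "inconclusive (G positive semidefinite)"
    if np.all(ev < -tol):
        return "maximum"
    return "saddle point"

print(classify_stationary(np.array([[2.0, 0.0], [0.0, 3.0]])))   # strong minimum
print(classify_stationary(np.array([[2.0, 0.0], [0.0, -3.0]])))  # saddle point
```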

Solution Methods

- Given: an initial estimate x_k of the optimizing value of x, and a
  search direction p_k

- Find: x_(k+1) = x_k + α_k p_k, for some scalar α_k ≠ 0

- Sounds good, but there are some questions:

  - How to find p_k?

  - How to find α_k? (line search)

  - How to find the initial condition x_0, and how sensitive is the answer
    to that choice?

- Search direction: Taylor series expansion of F(x) about the current
  estimate x_k:

  F_(k+1) ≡ F(x_k + α_k p_k) ≈ F(x_k) + g_k^T (x_(k+1) - x_k)
          = F_k + g_k^T (α_k p_k)

  - So for the cost to decrease (i.e. F_(k+1) < F_k), set

    g_k^T p_k < 0

  - Steepest descent given by p_k = -g_k

- Summary: gradient search methods (first-order methods) use estimate
  updates of the form:

  x_(k+1) = x_k - α_k g_k
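The update above can be sketched in a few lines (illustrative code with a fixed step size α, not the lecture's implementation):

```python
# Minimal steepest descent: x_{k+1} = x_k - alpha * g_k, fixed step size.
import numpy as np

def steepest_descent(grad, x0, alpha=0.1, iters=200):
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        x = x - alpha * grad(x)         # move against the gradient
    return x

# Gradient of F(x) = x_1^2 + x_1 x_2 + x_2^2 (the quadratic used later)
grad = lambda x: np.array([2*x[0] + x[1], x[0] + 2*x[1]])
x = steepest_descent(grad, [1.0, 1.0])  # converges to the minimum at the origin
```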

Line Search

- Line search: given a search direction, must decide how far to step.

  - The expression x_(k+1) = x_k + α_k p_k gives a new solution for all
    possible values of α; what is the right value to pick?

  - Note that p_k defines a slice through the solution space; it is a very
    specific combination of how the elements of x will change together.

- Would like to pick α_k to minimize F(x_k + α_k p_k)

  - Can do this line search in gory detail, but that would be very
    time-consuming.

  - Often want this process to be fast, accurate, and easy, especially if
    you are not that confident in the choice of p_k.

- Example: take x_0 = [1; 1] and p_0 = [0; 2], so that

  x_1 = x_0 + α p_0 = [1; 1 + 2α]

  which gives F = 1 + (1 + 2α) + (1 + 2α)², so that

  ∂F/∂α = 2 + 2(1 + 2α)(2) = 0  =>  α = -3/4
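A quick numerical check of this worked example (illustrative sketch): the minimizing step length solves 2 + 4(1 + 2α) = 0, i.e. α = -3/4.

```python
# Minimize F(alpha) = 1 + (1 + 2a) + (1 + 2a)^2 over the step length a.
import numpy as np

F = lambda a: 1 + (1 + 2*a) + (1 + 2*a)**2

a = np.linspace(-2.0, 1.0, 300001)      # fine grid over the step length
a_star = a[np.argmin(F(a))]
print(round(float(a_star), 6))          # -0.75, matching dF/da = 0
print(F(-0.75))                         # 0.75: the cost at x_1 = [1, -1/2]
```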

[Figure 1.2: F(x) = x_1² + x_1 x_2 + x_2², doing a line search in an arbitrary direction.]

Line Search II

- First step: search along the line until you think you have bracketed a
  local minimum.

[Figure: nested brackets [a_i, b_i] closing in on a local minimum of F(x).
Figure by MIT OpenCourseWare.]

- Once you think you have a bracket on the local min, what is the smallest
  number of function evaluations that can be made to reduce the size of
  the bracket?

- Many ways to do this:

  - Golden section search

  - Bisection

  - Polynomial approximations

  The first two have linear convergence; the last has superlinear
  convergence.

- Polynomial approximation approach: approximate the function as a
  quadratic/cubic in the interval and use the minimum of that polynomial
  as the estimate of the local min.

  - Use with care, since it can go very wrong, but it is a good
    termination approach.
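Of these, golden-section search is the simplest to sketch (illustrative code, not from the notes): each iteration shrinks the bracket by the golden ratio with a single new function evaluation.

```python
# Golden-section search: reduce bracket [a, b] by ~0.618 per iteration,
# reusing one interior evaluation each time (linear convergence).
import math

def golden_section(F, a, b, tol=1e-8):
    r = (math.sqrt(5) - 1) / 2              # golden ratio, ~0.618
    c, d = b - r*(b - a), a + r*(b - a)     # two interior points
    Fc, Fd = F(c), F(d)
    while b - a > tol:
        if Fc < Fd:                         # minimum lies in [a, d]
            b, d, Fd = d, c, Fc
            c = b - r*(b - a); Fc = F(c)
        else:                               # minimum lies in [c, b]
            a, c, Fc = c, d, Fd
            d = a + r*(b - a); Fd = F(d)
    return 0.5*(a + b)

print(round(golden_section(lambda x: (x - 2.0)**2, 0.0, 5.0), 6))  # 2.0
```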

- Cubic polynomial fit:

  F̃(x) = px³ + qx² + rx + s

  g̃(x) = 3px² + 2qx + r   (= 0 at the min)

- Then x* is the point (pick the root for which G̃(x*) = 6px* + 2q > 0):

  x* = (-q ± (q² - 3pr)^(1/2)) / (3p)

- Great, but how do we find x* in terms of what we know (F(x) and g(x) at
  the ends of the bracket [a, b])?

  x* = a + (b - a) [ 1 - (g_b + v - w) / (g_b - g_a + 2v) ]

  where

  v = (w² - g_a g_b)^(1/2)   and   w = 3(F_a - F_b)/(b - a) + g_a + g_b

Content from: Scales, L. E. Introduction to Non-Linear Optimization. New York, NY: Springer, 1985, pp. 40.

Removed due to copyright restrictions.
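This bracket formula can be sanity-checked in a few lines (a sketch using the formula as reconstructed above; for a quadratic the cubic fit is exact, so the step should land on the true minimizer):

```python
# Cubic-interpolation line-search step: fit a cubic to the function values
# (Fa, Fb) and gradients (ga, gb) at the bracket ends, return its minimizer.
import math

def cubic_step(a, b, Fa, Fb, ga, gb):
    w = 3*(Fa - Fb)/(b - a) + ga + gb
    v = math.sqrt(w*w - ga*gb)
    return a + (b - a)*(1 - (gb + v - w)/(gb - ga + 2*v))

# For the quadratic F(x) = (x - 1)^2 on [0, 2]: Fa = Fb = 1, ga = -2, gb = 2,
# and the step lands exactly on the minimizer x = 1.
print(cubic_step(0.0, 2.0, 1.0, 1.0, -2.0, 2.0))   # 1.0
```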

- Observations:

  - Tends to work well near a function local minimum (good convergence
    behavior), but can be very poor far away, so use a hybrid approach of
    bisection followed by the cubic fit.

  - Rule of thumb: do not bother making the line search too accurate,
    especially at the beginning; it is a waste of time and effort. Check
    the min tolerance and reduce it as you think you are approaching the
    overall solution.

- Assume F is quadratic, and expand the gradient g_(k+1) at x_(k+1):

  g_(k+1) ≡ g(x_k + p_k) = g_k + G_k (x_(k+1) - x_k)
          = g_k + G_k p_k

  where there are no other terms because of the assumption that F is
  quadratic, and

  x_k = [x_1; ...; x_n],   g_k = ∂F/∂x = [∂F/∂x_1; ...; ∂F/∂x_n]

        [ ∂²F/∂x_1²      ...  ∂²F/∂x_1∂x_n ]
  G_k = [     :           .        :       ]
        [ ∂²F/∂x_n∂x_1   ...  ∂²F/∂x_n²   ]

  with g_k and G_k evaluated at x_k.

- So for g_(k+1) = 0, take the Newton step

  p_k = -G_k⁻¹ g_k

- Problem is that F(x) is typically not quadratic, so the solution x_(k+1)
  is not at the minimum; need to iterate.

- Note that for a complicated F(x), we may not have explicit gradients
  (should always compute them if you can).

  - But can always approximate them using finite-difference techniques,
    though it is pretty expensive to find G that way.

  - Use quasi-Newton approximation methods instead, such as BFGS
    (Broyden-Fletcher-Goldfarb-Shanno).
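The one-step property this derivation implies, that for a quadratic F the step p_k = -G_k⁻¹ g_k lands exactly on the minimum, is easy to verify (illustrative sketch):

```python
# For a quadratic F, a single Newton step reaches the minimizer exactly.
import numpy as np

G = np.array([[2.0, 1.0], [1.0, 2.0]])   # Hessian of F = x1^2 + x1*x2 + x2^2
grad = lambda x: G @ x                   # gradient; zero at the origin

x0 = np.array([5.0, -3.0])
x1 = x0 - np.linalg.solve(G, grad(x0))   # one Newton step: p = -G^{-1} g
print(x1)                                # essentially [0, 0], the minimizer
```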

FMINUNC Example

- FMINUNC does quasi-Newton and gradient search.

  - No gradients need to be formed.

  - Mixture of cubic and quadratic line searches.

- Performance shown on a complex function by Rosenbrock:

  F(x_1, x_2) = 100(x_1² - x_2)² + (1 - x_1)²

- Start at x = [-1.9, 2]. The known global min is at x = [1, 1].

[Figures: contour and surface plots of the Rosenbrock function, with the
iterate paths for "Rosenbrock with BFGS" and "Rosenbrock with GS"
overlaid.]

- BFGS converges (35 function calls), but gradient search (steepest
  descent) fails (very close, though), even after 2000 function calls
  (550 iterations).
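A rough modern analogue of this experiment (assumed environment: scipy, which is not used in the lecture; its BFGS method corresponds to fminunc's quasi-Newton mode, and the analytic gradient is supplied):

```python
# BFGS on the Rosenbrock function from the lecture's starting point.
import numpy as np
from scipy.optimize import minimize

rosen = lambda x: 100*(x[0]**2 - x[1])**2 + (1 - x[0])**2
rosen_grad = lambda x: np.array([400*x[0]*(x[0]**2 - x[1]) + 2*(x[0] - 1),
                                 -200*(x[0]**2 - x[1])])

res = minimize(rosen, x0=[-1.9, 2.0], jac=rosen_grad, method="BFGS")
print(np.round(res.x, 4))   # converges near the global minimum [1, 1]
```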

June 18, 2008

[Figures: additional contour and surface views of the "Rosenbrock with GS"
run.]

- Observations:

  1. Typically not a good idea to start the optimization with QN; I often
     find that it is better to do GS for 100 iterations, and then switch
     over to QN for the termination phase.

  2. x_0 tends to be very important; the standard process is to try many
     different cases to see if you can find consistency in the answers.

  [Figure 1.7: Shows how the point of convergence changes as a function of
  the initial condition.]

  4. Are there any guarantees on getting a good final answer in a
     reasonable amount of time? Typically yes, but not always.

function [F,G]=rosen(x)
%global xpath

%F=100*(x(1)^2-x(2))^2+(1-x(1))^2;
F=100*(x(:,2)-x(:,1).^2).^2+(1-x(:,1)).^2;
G=[100*(4*x(1)^3-4*x(1)*x(2))+2*x(1)-2; 100*(2*x(2)-2*x(1)^2)];

return


global xpath

% (grid setup for x1, x2, and N is missing from the source)

clear FF
for ii=1:N,
  for jj=1:N,
    FF(ii,jj)=rosen([x1(ii) x2(jj)]);
  end,
end

% quasi-newton
xpath=[];t0=clock;
opt=optimset('fminunc');
opt=optimset(opt,'Hessupdate','bfgs','gradobj','on','Display','Iter',...
    'LargeScale','off','InitialHessType','identity',...
    'MaxFunEvals',150,'OutputFcn',@outftn);

x0=[-1.9 2];

xout1=fminunc('rosen',x0,opt) % quasi-newton
xbfgs=xpath;

% gradient search
xpath=[];
opt=optimset('fminunc');
opt=optimset(opt,'Hessupdate','steepdesc','gradobj','on','Display','Iter',...
    'LargeScale','off','InitialHessType','identity','MaxFunEvals',2000,'MaxIter',1000,'OutputFcn',@outftn);
xout=fminunc('rosen',x0,opt)
xgs=xpath;

% hybrid: a few gradient-search steps, then quasi-newton
xpath=[];
opt=optimset('fminunc');
opt=optimset(opt,'Hessupdate','steepdesc','gradobj','on','Display','Iter',...
    'LargeScale','off','InitialHessType','identity','MaxFunEvals',5,'OutputFcn',@outftn);
xout=fminunc('rosen',x0,opt)
opt=optimset('fminunc');
opt=optimset(opt,'Hessupdate','bfgs','gradobj','on','Display','Iter',...
    'LargeScale','off','InitialHessType','identity','MaxFunEvals',150,'OutputFcn',@outftn);
xout=fminunc('rosen',xout,opt)

xhyb=xpath;

figure(1);clf
contour(x1,x2,FF,[0:2:10 15:50:1000])
hold on
plot(x0(1),x0(2),'ro','Markersize',12)
plot(1,1,'rs','Markersize',12)
plot(xbfgs(:,1),xbfgs(:,2),'bd','Markersize',12)
hold off
xlabel('x_1')
ylabel('x_2')

figure(2);clf
contour(x1,x2,FF,[0:2:10 15:50:1000])
hold on
xlabel('x_1')
ylabel('x_2')
plot(x0(1),x0(2),'ro','Markersize',12)
plot(1,1,'rs','Markersize',12)
plot(xgs(:,1),xgs(:,2),'m+','Markersize',12)
hold off

figure(3);clf
contour(x1,x2,FF,[0:2:10 15:50:1000])
hold on
xlabel('x_1')
ylabel('x_2')
plot(x0(1),x0(2),'ro','Markersize',12)
plot(1,1,'rs','Markersize',12)
plot(xhyb(:,1),xhyb(:,2),'m+','Markersize',12)
hold off

figure(4);clf
mesh(x1,x2,FF)
hold on
plot3(x0(1),x0(2),rosen(x0)+5,'ro','Markersize',12,'MarkerFaceColor','r')
plot3(1,1,rosen([1 1]),'ms','Markersize',12,'MarkerFaceColor','m')
plot3(xbfgs(:,1),xbfgs(:,2),rosen(xbfgs)+5,'gd','MarkerFaceColor','g')
%plot3(xgs(:,1),xgs(:,2),rosen(xgs)+5,'m+')
hold off
axis([-3 3 -3 3 0 1000])
hh=get(gcf,'children');
xlabel('x_1')
ylabel('x_2')
set(hh,'View',[-177 89.861],'CameraPosition',[-0.585976 11.1811 5116.63]);%
print -depsc rosen2.eps;jpdf(rosen2)


% (function declaration line lost in the source; the standard fminunc
% OutputFcn signature is: function stop=outftn(x,optimValues,state))
global xpath
stop=0;
xpath=[xpath;x];

return
