
Optimization

MATLAB Exercises
Assoc. Prof. Dr. Pelin GÜNDEŞ
Introduction to MATLAB
• % All text after the % sign is a comment. MATLAB
ignores anything to the right of the % sign.

• ; A semicolon at the end of a line prevents MATLAB
from echoing the information you enter on the screen.

• … A succession of three periods at the end of a line
informs MATLAB that the code will continue on the next line.
You cannot split a variable name across two lines. You
cannot continue a comment on another line.
Introduction to MATLAB
• ^c You can stop MATLAB execution and get
back the command prompt by typing ^c (Ctrl-C), that is, by
holding down 'Ctrl' and 'c' together.

• help command_name Displays help text for the named command.

• helpwin Opens a help text window that provides
more information on all of the MATLAB resources
installed on your system.

• helpdesk Provides help using a browser window.


Introduction to MATLAB
• = The assignment operator. The variable on the
left-hand side of the sign is assigned the value of the
right-hand side.

• == The equality comparison operator, used within an if construct.

• MATLAB is case sensitive. An a is different from A. All
built-in MATLAB commands are in lower case.

• MATLAB does not need a type definition or dimension
statement to introduce variables.
Introduction to MATLAB
• Variable names start with a letter and contain up to 31
characters (only letters, digits and underscore).

• MATLAB uses some built-in variable names. Avoid using
built-in variable names.

• Scientific notation is expressed with the letter e, for
example, 2.0e-03, 1.07e23, -1.732e+03.

• Imaginary numbers use either i or j as a suffix, for
example, 1i, -3.14j, 3e5i.
Introduction to MATLAB
Arithmetic Operators

• + Addition

• - Subtraction

• * Multiplication

• / Division

• ^ Power

• ' Complex conjugate transpose (also array transpose)


Introduction to MATLAB
• In the case of arrays, each of these operators can be
used with a period prefixed to the operator, for example,
(.*) or (.^) or (./). This implies element-by-element
operation in MATLAB.
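The element-by-element distinction carries over to other array languages; as a point of comparison, a short Python/NumPy sketch of the same idea (note the conventions are reversed: NumPy's `*` is element-wise like MATLAB's `.*`, while `@` is matrix multiplication like MATLAB's `*`):

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.array([[5.0, 6.0], [7.0, 8.0]])

elementwise = A * B   # NumPy's * acts like MATLAB's .*
matrixprod = A @ B    # NumPy's @ acts like MATLAB's *

print(elementwise)
print(matrixprod)
```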

• , A comma at the end of a line will cause the information to echo on the screen


Exercise
>> a=2;b=3;c=4,d=5;e=6,

c=
4

e=
6
% why did only c and e echo on the screen?
Exercise
>> who % lists all the variables on the screen
Your variables are:
a b c d e

>> a % gives the value stored in a


a=
2
Exercise
>> A=1.5 % Variable A

A=
1.5000

>> a, A % Case matters

a =
2

A =
1.5000
Exercise
>> one=a;two=b;three=c;
>> % assigning values to new variables

>> four=d;five=e;six=pi; % value of pi available


>> f=7;
>> A1=[a b c;d e f]; % A1 is a 2 by 3 matrix
% space separates columns
% semi-colon separates rows
>> A1(2,2) % accesses the matrix element in the second row and
% second column
ans =
6
Exercise
>> size(A1) % gives you the size of the matrix (row, columns)

ans =

2 3

>> AA1=size(A1) % What should happen here? From previous


% statement the size of A1 contains two numbers arranged as a row
% matrix. This is assigned to AA1

AA1 =

2 3
Exercise
>> size(AA1) % AA1 is a one by two matrix

ans =

1 2

>> A1' % this transposes the matrix A1

ans =

2 5
3 6
4 7
Exercise
>> B1=A1' % the transpose of matrix A1 is assigned to B1. B1 is a
% three by two matrix

B1 =

2 5
3 6
4 7

>> C1=A1*B1 % Matrix multiplication

C1 =
29 56
56 110
Exercise
>> C2=B1*A1

C2 =

29 36 43
36 45 54
43 54 65

>> C1*C2 % Read the error matrix


??? Error using ==> *
Inner matrix dimensions must agree.
Exercise
>> D1=[1 2]' % D1 is a column vector
D1 =
1
2

>> C1,C3=[C1 D1] % C1 is augmented by an extra column

C1 =
29 56
56 110

C3 =
29 56 1
56 110 2
Exercise
>> C2

C2 =
29 36 43
36 45 54
43 54 65

>> C3=[C3;C2(3,:)] % The colon represents all the columns

C3 =
29 56 1
56 110 2
43 54 65
Exercise
>> C4=C2*C3

C4 =
4706 7906 2896
5886 9882 3636
7066 11858 4376

>> C5=C2.*C3 % The .* represents the product of each element of


% C2 with the corresponding element of C3

C5 =
841 2016 43
2016 4950 108
1849 2916 4225
Exercise
>> C6=inverse(C2)
??? Undefined function or variable 'inverse'.

% Apparently, inverse is not a command in MATLAB. If the command
% name is known, it is easy to obtain help

>> lookfor inverse % this command will find all files where it comes
% across the word 'inverse' in the initial comment lines. The
% command we need appears to be INV which says inverse of a
% matrix. The actual command is in lower case. To find out how to use
% it:
>> help inv
inv(C2) % inverse of C2
Exercise
>> for i=1:20
f(i)=i^2;
end % The for loop is terminated with “end”

>> plot(sin(0.01*f)',cos(0.03*f))
>> xlabel('sin(0.01f)')
>> ylabel('cos(0.03*f)')
>> legend('Example')
>> title('A Plot Example')
>> grid
>> exit % finished with MATLAB
Graphical optimization
Minimize f(x1,x2) = (x1-3)^2 + (x2-2)^2

subject to: h1(x1,x2): 2x1 + x2 = 8

h2(x1,x2): (x1-1)^2 + (x2-4)^2 = 4

g1(x1,x2): x1 + x2 ≤ 7

g2(x1,x2): x1 - 0.25x2^2 ≤ 0

0 ≤ x1 ≤ 10; 0 ≤ x2 ≤ 10


Example 1
% Example 1 (modified graphics)(Sec 2.1- 2.2)
% Section:2.3.4 Tweaking the display
%
% graphical solution using matlab (two design variables)
% the following script should allow the graphical solution
% to example [ problem 3-90 from text]
%
% Minimize f(x1,x2) = (x1-3)^2 + (x2-2)^2
%
% h1(x1,x2) : 2x1 + x2 = 8
% h2(x1,x2) : (x1-1)^2 + (x2-4)^2 = 4
% g1(x1,x2) : x1 + x2 <= 7
% g2(x1,x2) : x1 - 0.25x2^2 <= 0.0
%
% 0 <= x1 <= 10 ; 0 <= x2 <= 10
%
%
%
Example 1
%
%
% WARNING : The hash marks for the inequality constraints must
% be determined and drawn outside of the plot
% generated by matlab
%
%----------------------------------------------------------------
x1=0:0.1:10; % the semi-colon at the end prevents the echo
x2=0:0.1:10; % these are also the side constraints
% x1 and x2 are vectors filled with numbers starting
% at 0 and ending at 10.0 with values at intervals of 0.1

[X1 X2] = meshgrid(x1,x2);


% generates matrices X1 and X2 corresponding to the
% vectors x1 and x2
Example 1 cont'd
f1 = obj_ex1(X1,X2);% the objective function is evaluated over the entire mesh
ineq1 = inecon1(X1,X2);% the inequality g1 is evaluated over the mesh
ineq2 = inecon2(X1,X2);% the inequality g2 is evaluated over the mesh

eq1 = eqcon1(X1,X2);% the equality 1 is evaluated over the mesh


eq2 = eqcon2(X1,X2);% the equality 2 is evaluated over the mesh

[C1,h1] = contour(x1,x2,ineq1,[7,7],'r-');
clabel(C1,h1);
set(h1,'LineWidth',2)
% ineq1 is plotted [at the contour value of 7]
hold on % allows multiple plots
k1 = gtext('g1');
set(k1,'FontName','Times','FontWeight','bold','FontSize',14,'Color','red')
% will place the string 'g1' on the plot where mouse is clicked
Example 1 cont'd
[C2,h2] = contour(x1,x2,ineq2,[0,0],'r--');
clabel(C2,h2);
set(h2,'LineWidth',2)
k2 = gtext('g2');
set(k2,'FontName','Times','FontWeight','bold','FontSize',14,'Color','red')
[C3,h3] = contour(x1,x2,eq1,[8,8],'b-');
clabel(C3,h3);
set(h3,'LineWidth',2)
k3 = gtext('h1');
set(k3,'FontName','Times','FontWeight','bold','FontSize',14,'Color','blue')
% will place the string 'h1' on the plot where mouse is clicked
[C4,h4] = contour(x1,x2,eq2,[4,4],'b--');
clabel(C4,h4);
set(h4,'LineWidth',2)
k4 = gtext('h2');
set(k4,'FontName','Times','FontWeight','bold','FontSize',14,'Color','blue')
Example 1 cont'd
[C,h] = contour(x1,x2,f1,'g');
clabel(C,h);
set(h,'LineWidth',1)
% the equality and inequality constraints are not written with 0 on the right hand
% side. If you do write them that way you would have to include [0,0] in the
% contour commands
xlabel(' x_1 values','FontName','times','FontSize',12,'FontWeight','bold');
% label for x-axes
ylabel(' x_2 values','FontName','times','FontSize',12,'FontWeight','bold');
set(gca,'xtick',[0 2 4 6 8 10])
set(gca,'ytick',[0 2.5 5.0 7.5 10])
k5 = gtext({'Chapter 2: Example 1','pretty graphical display'})
set(k5,'FontName','Times','FontSize',12,'FontWeight','bold')
clear C C1 C2 C3 C4 h h1 h2 h3 h4 k1 k2 k3 k4 k5
grid
hold off
Example 1 cont'd
Objective function

function retval = obj_ex1(X1,X2)


retval = (X1 - 3).*(X1 - 3) +(X2 - 2).*(X2 - 2);

The first inequality

function retval = inecon1(X1, X2)


retval = X1 + X2;

The second inequality

function retval = inecon2(X1,X2)


retval = X1 - 0.25*X2.^2;
Example 1 cont'd
The first equality

function retval = eqcon1(X1,X2)


retval = 2.0*X1 + X2;

The second equality

function retval = eqcon2(X1,X2)


retval = (X1 - 1).*(X1 - 1) + (X2 - 4).*(X2 - 4);
Example 2- Graphical solution
f(x1, x2) = a x1^2 + b x2^2 - c cos(p x1) - d cos(q x2) + c + d
with
a = 1, b = 2, c = 0.3, d = 0.4, p = 3π, q = 4π
Example 3- Solution of LP problems
using MATLAB
Maximize

f(X): 990x1 + 900x2 + 5250

Subject to:
g1(X): 0.4x1 + 0.6x2 ≤ 8.5
g2(X): 3x1 - x2 ≤ 25
g3(X): 3x1 + 6x2 ≤ 70
x1 ≥ 0; x2 ≥ 0

The problem was also transformed to the standard format as follows:


Example 3- Solution of LP problems
using MATLAB
Minimize

f(X): -990x1 - 900x2 - 5250

Subject to:
g1(X): 0.4x1 + 0.6x2 + x3 = 8.5
g2(X): 3x1 - x2 + x4 = 25
g3(X): 3x1 + 6x2 + x5 = 70
x1 ≥ 0; x2 ≥ 0; x3 ≥ 0; x4 ≥ 0; x5 ≥ 0

where x3, x4 and x5 are the slack variables.
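For readers checking the arithmetic outside MATLAB, the same LP can be solved with SciPy's `linprog`; a sketch (SciPy, like MATLAB's `linprog`, minimizes, so the maximization objective is negated and the constant term is dropped, as in the MATLAB session of Example 4):

```python
from scipy.optimize import linprog

c = [-990.0, -900.0]                          # negated objective coefficients
A_ub = [[0.4, 0.6], [3.0, -1.0], [3.0, 6.0]]  # g1, g2, g3 coefficient rows
b_ub = [8.5, 25.0, 70.0]
res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None), (0, None)])
print(res.x)    # optimal (x1, x2), about (10.476, 6.429)
print(res.fun)  # about -16157.14 before restoring the constant
```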


Example 3- Solution of LP problems
using MATLAB
% The code presented here is captured
% using MATLAB's diary command.
% DIARY OFF suspends it.
% DIARY ON turns it back on.
% Use the functional form of DIARY, such as DIARY('file'). Write 'help
% format' to see different formats in MATLAB.
format compact
format rational
A=[0.4 0.6 1 0 0 8.5;3 -1 0 1 0 25;3 6 0 0 1 70;-990 -900 0 0 0 5280]
A=
2/5 3/5 1 0 0 17/2
3 -1 0 1 0 25
3 6 0 0 1 70
-990 -900 0 0 0 5280
Example 3- Solution of LP problems
using MATLAB
'This is Table 1, note the canonical form';
>> 'EBV is x1-first column';
>> 'To find LBV divide last column by coeff. in EBV column';
>> 'Take the minimum of the positive values';
>> 'The row identifies the pivot row to create the next table';
>> A(:,6)/A(:,1)
ans =
0 0 0 -17/1980
0 0 0 -5/198
0 0 0 -7/99
0 0 0 -16/3
Example 3- Solution of LP problems
using MATLAB
'Something is not right. The above division should
have been an element-by-element one';
>> A(:,6)./A(:,1)
ans =
85/4
25/3
70/3
-16/3
Example 3- Solution of LP problems
using MATLAB
>> format short
>> A(:,6)./A(:,1)
ans =
21.2500
8.3333
23.3333
-5.3333
'Second row is the pivot row and x4 is LBV';
'Constructing Table 2';
'Note the scaling factor for the matrix';
Example 3- Solution of LP problems
using MATLAB
A
A=
1.0e+003 *
0.0004 0.0006 0.0010 0 0 0.0085
0.0030 -0.0010 0 0.0010 0 0.0250
0.0030 0.0060 0 0 0.0010 0.0700
-0.9900 -0.9000 0 0 0 5.2800
'The element at A(2,1) must be 1';
Example 3- Solution of LP problems
using MATLAB
>> A(2,:)=A(2,:)/A(2,1)
A=
1.0e+003 *
0.0004 0.0006 0.0010 0 0 0.0085
0.0010 -0.0003 0 0.0003 0 0.0083
0.0030 0.0060 0 0 0.0010 0.0700
-0.9900 -0.9000 0 0 0 5.2800

'The element at A(1,1) must be a zero';


>> A(1,:)=A(1,:)-0.4*A(2,:);
>> A
A=
1.0e+003 *
0 0.0007 0.0010 -0.0001 0 0.0052
0.0010 -0.0003 0 0.0003 0 0.0083
0.0030 0.0060 0 0 0.0010 0.0700
-0.9900 -0.9000 0 0 0 5.2800
Example 3- Solution of LP problems
using MATLAB
'Element at A(3,1) must be a zero';
>> A(3,:)=A(3,:)-3*A(2,:)
A=
1.0e+003 *
0 0.0007 0.0010 -0.0001 0 0.0052
0.0010 -0.0003 0 0.0003 0 0.0083
0 0.0070 0 -0.0010 0.0010 0.0450
-0.9900 -0.9000 0 0 0 5.2800

'Element at A(4,1) must be a 0';


>> A(4,:)=A(4,:)-A(4,1)*A(2,:)
A=
1.0e+004 *
0 0.0001 0.0001 -0.0000 0 0.0005
0.0001 -0.0000 0 0.0000 0 0.0008
0 0.0007 0 -0.0001 0.0001 0.0045
0 -0.1230 0 0.0330 0 1.3530
Example 3- Solution of LP problems
using MATLAB
'Table 2 complete- and canonical form is present';
>> 'Solution has not converged because of A(4,2)';
>> 'EBV is x2';
>> 'Calculation of the LBV';
>> A(:,6)./A(:,2)
ans =
7.0455
-25.0000
6.4286
-11.0000
'Pivot row is third row and LBV is x5';
>> 'Calculation of the LBV';
>> 'Construction of Table 3-no echo of calculations';
>> 'A(3,2) must have value 1'
ans =
A(3,2) must have value 1
>> A(3,:)=A(3,:)/A(3,2);
Example 3- Solution of LP problems
using MATLAB
'A(1,2) must have a value of 0';
>> A(1,:)=A(1,:)-A(1,2)*A(3,:);
>> 'A(2,2) must have a value of 0';
>> A(2,:)=A(2,:)-A(2,2)*A(3,:);
>> 'A(4,2) must have a value of 0';
>> A(4,:)=A(4,:)-A(4,2)*A(3,:);
>> A
A=
1.0e+004 *
0 0 0.0001 -0.0000 -0.0000 0.0000
0.0001 0 0 0.0000 0.0000 0.0010
0 0.0001 0 -0.0000 0.0000 0.0006
0 0 0 0.0154 0.0176 2.1437
Example 3- Solution of LP problems
using MATLAB
>> format rational
>> A
A=
0 0 1 -1/35 -11/105 19/42
1 0 0 2/7 1/21 220/21
0 1 0 -1/7 1/7 45/7
0 0 0 1080/7 1230/7 150060/7
>> 'No further iterations are necessary';
>> diary off
Example 4- Solution using
MATLAB's Optimization Toolbox
f=[-990;-900];
A=[0.4 0.6;3 -1;3 6];
b=[8.5 25 70]';
[x,fval]=linprog(f,A,b)
Optimization terminated successfully.
x=
10.47619047619146
6.42857142857378
fval =
-1.615714285714595e+004
'To this solution we must add the constant -5280 which was omitted in problem
definition for MATLAB'
ans =
To this solution we must add the constant -5280 which was omitted in problem
definition for MATLAB
diary off
Nonlinear programming
The functions in the exercises are:

f ( x)  12  ( x  1) 2 ( x  2)( x  3)
g1 ( x, y ) : 20 x  15 y  30
g 2 ( x, y )  x 2 / 4  y 2  1
Nonlinear programming
x=sym('x') %defining x as a single symbolic object
x=

syms y f g1 g2 g %definition of multiple objects


whos %types of variables in the workspace
Name Size Bytes Class

f 1x1 126 sym object


g 1x1 126 sym object
g1 1x1 128 sym object
g2 1x1 128 sym object
x 1x1 126 sym object
y 1x1 126 sym object

Grand total is 14 elements using 760 bytes


Nonlinear programming
f=12+(x-1)*(x-1)*(x-2)*(x-3) % constructing f

f=

12+(x-1)^2*(x-2)*(x-3)

diff(f) % first derivative

ans =

2*(x-1)*(x-2)*(x-3)+(x-1)^2*(x-3)+(x-1)^2*(x-2)

% note the chain rule for the derivatives


% note that the independent variable is assumed to be x
Nonlinear programming
diff(f,x,2) % the second derivative wrt x

ans =

2*(x-2)*(x-3)+4*(x-1)*(x-3)+4*(x-1)*(x-2)+2*(x-1)^2

diff(f,x,3) % the third derivative wrt x

ans =

24*x-42

g1=20*x+15*y-30 % define g1

g1 =

20*x+15*y-30
Nonlinear programming
g2=0.25*x+y-1; % define g2
% g1,g2 can only have partial derivatives
% independent variables have to be identified
diff(g1,x) % partial derivative

ans =

20

diff(g1,y) % partial derivative

ans =

15
Nonlinear programming
g=[g1;g2] % g column vector based on g1, g2
g=
[ 20*x+15*y-30]
[ 1/4*x+y-1]

% g can be the constraint vector in optimization problems


% the partial derivatives of g wrt the design variables form the
% Jacobian matrix
% the properties of this matrix are important for numerical
% techniques
xy=[x y]; % row vector of variables
J=jacobian(g,xy) % calculating the jacobian
J=

[ 20, 15]
[ 1/4, 1]
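The same computation is available in Python's SymPy, for comparison; a sketch that builds the Jacobian of the constraint vector above and also forms a Hessian matrix of second partials (the Hessian reappears later in the chapter):

```python
import sympy as sp

x, y = sp.symbols('x y')
g1 = 20*x + 15*y - 30
g2 = sp.Rational(1, 4)*x + y - 1   # g2 exactly as defined in this session

# Jacobian: the gradient of each function forms a row
J = sp.Matrix([g1, g2]).jacobian([x, y])
print(J)   # Matrix([[20, 15], [1/4, 1]])

# Hessian of a scalar function: the matrix of second partial derivatives
f = 12 + (x - 1)**2 * (x - 2) * (x - 3)
H = sp.hessian(f, [x, y])   # only the (x,x) entry is nonzero, f has no y
print(H[0, 0])
```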
Nonlinear programming
ezplot(f) % a plot of f for -2*pi <= x <= 2*pi (default)
ezplot(f,[0,4]) % a plot between 0<=x <=4
df=diff(f);
hold on
ezplot(df,[0,4]) %plotting function and derivative
%combine with MATLAB graphics- draw a line
line([0 4],[0 0],'Color','r')
g

g=

[ 20*x+15*y-30]
[ 1/4*x+y-1]

Nonlinear programming

% to evaluate g at x=1,y=2.5
subs(g,{x,y},{1,2.5})

ans =

27.5000
1.7500

diary off
Nonlinear programming
tangent.m
% Illustration of the derivative
% Optimization Using MATLAB
% Dr. P.Venkataraman
%
% section 4.2.2
% This example illustrates the limiting process
% in the definition of the derivative
% In the figure animation
% 1. note the scales
% 2. as the displacement gets smaller the function
% and the straight line coincide, suggesting that
% the line is tangent to the curve at the point
%

Nonlinear programming
tangent.m

syms x f deriv % symbolic variables
f=12+(x-1)*(x-1)*(x-2)*(x-3); % definition of f(x)
deriv=diff(f); % computing the derivative

xp = 3.0; % point at which the


% derivative will be computed
fp =subs(f,xp); % function value at xp
dfp=subs(deriv,xp); % actual value of derivative at xp
ezplot(f,[0,4]) % symbolic plot of the original function
% between 0 and 4
Nonlinear programming
tangent.m

% draw a line at value of 12 for reference


line([0 4],[12 12],'Color','g','LineWidth',1)
line([2.5 3.5],[10,14],'Color','k', ...
'LineStyle','--','LineWidth',2)
axis([0 4 8 24])
title('tangent - slope - derivative at x=3')
text(.6,21,'f=12+(x-1)*(x-1)*(x-2)*(x-3)')
ylabel('f(x)')
text(3.2,12.5,'\theta')
text(2.7,10.5,'tangent')
grid
Derivative
% Illustration of the derivative
% Dr. P.Venkataraman
% This example illustrates the limiting process in the definition of the
% derivative. In the figure animation:
% 1. note the scales
% 2. as the displacement gets smaller the function and the straight
% line coincide, suggesting that the line is tangent to the curve at the point

syms x f deriv % symbolic variables


f=12+(x-1)*(x-1)*(x-2)*(x-3); % definition of f(x)
deriv=diff(f); % computing the derivative

xp = 3.0; % point at which the


% derivative will be computed
delx=[1 .1 .01 .001]; % decreasing displacements - vector
xvar =xp + delx; % neighboring points - vector
Derivative
fvar =subs(f,xvar);% function values at neighboring points
fp =subs(f,xp); % function value at xp
dfp=subs(deriv,xp); % actual value of derivative at xp

delf = fvar-fp; % change in the function values


derv= delf./delx ; % derivative using definition
% limiting process is being invoked
% as displacement is getting smaller
ezplot(f,[0,4]) % symbolic plot of the original function
% between 0 and 4

% draw a line at value of 12 for reference


line([0 4],[12 12],'Color','g','LineWidth',1)

figure % use a new figure for animation; figures are drawn as
% if zooming in
Derivative
for i = 1:length(delx)
clf % clear reference figure
ezplot(f,[xp,xvar(i)])
% plot function within the displacement value only
line([xp xvar(i)],[fp subs(f,xvar(i))],'Color','r')
pause(2) % pause for 2 seconds - animation effect
end

xpstack=[xp xp xp xp]; % dummy vector for display


dfpstack=[dfp dfp dfp dfp]; % same

[xpstack' delx' xvar' delf' derv' dfpstack']
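The limiting table the script prints can be reproduced with plain Python; a sketch of the same forward-difference quotient for f at the point x = 3, where the analytic derivative is 4:

```python
def f(x):
    # f = 12 + (x-1)^2 (x-2)(x-3), the function used throughout this section
    return 12 + (x - 1)**2 * (x - 2) * (x - 3)

xp = 3.0
exact = 4.0   # f'(3) from the symbolic derivative
approx = None
for dx in [1.0, 0.1, 0.01, 0.001]:
    approx = (f(xp + dx) - f(xp)) / dx   # forward-difference quotient
    print(dx, approx)                    # approaches 4 as dx shrinks
```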


Gradient of the function
• In the function with one variable, the derivative was
associated with the slope.

• In two or more variables, the slope is equivalent to the
gradient.

• The gradient is a vector and at any point represents the
direction in which the function will increase most rapidly.

• The gradient is composed of the partial derivatives
organized as a vector.
Gradient of the function
• The gradient is defined as:

∇f = [ ∂f/∂x   ∂f/∂y ]^T
Gradient and tangent line at a point
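The definition can also be checked symbolically; a Python/SymPy sketch using the contour function f = x^2/4 + y^2 - 1 that appears in the scripts that follow:

```python
import sympy as sp

x, y = sp.symbols('x y')
f = x**2 / 4 + y**2 - 1

# gradient: the partial derivatives organized as a vector
grad = [sp.diff(f, x), sp.diff(f, y)]
print(grad)                 # [x/2, 2*y]

# its value at a point gives the direction of steepest increase there
gnum = [g.subs({x: 2, y: 1}) for g in grad]
print(gnum)                 # [1, 2]
```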
Gradient of the function
x=0:.05:3;
y=0:0.05:3;
[X Y]=meshgrid(x,y); % X,Y are matrices

fval=0.25*X.*X + Y.*Y -1; % could have used an m-file

subplot(2,1,1) % divides figure window into 2 rows


% and addresses top row

[c1,h1]= contour3(x,y,fval,[0 1 2 3]);


% draws the specified contour in 3D
% and establishes handles to set properties of plot
Gradient of the function
set(h1,'LineWidth',2);
clabel(c1); % labels contours
xlabel('x');
ylabel('y');
title('3D contour for f(x,y) = x^2/4 + y^2 -1');
% switches to second row (lower plot)
subplot(2,1,2)
[c2,h2]=contour(x,y,fval,[0 1 2 3]);
set(h2,'LineWidth',2);
clabel(c2)
xlabel('x');
ylabel('y');
grid
Gradient of the function
% processing using the symbolic toolbox
% identify a point on contour f = 0
xf0 = 1.0; % x-value of point
syms f xx yy;
f=0.25*xx*xx + yy*yy -1; % the contour f=0
fy0=subs(f,xx,xf0); % substitute for x
yf0=solve(fy0,yy); % solve for y value of point
yf0d=double(yf0(1)); % express it as a decimal
% identify a point on contour f = 2
xf2 = 2.0; % x-value of point
sym fxy2;
fxy2=0.25*xx*xx + yy*yy -1-2; % contour f = 2
fy2=subs(fxy2,xx,xf2); % substitute for x
yf2=solve(fy2,yy);
yf2d=double(yf2(1)); % decimal value for y value
Gradient of the function
% draw blue line connecting the two points in both plots
% line PQ in Figure 4.3
subplot(2,1,1)
line([xf0 xf2],[yf0d yf2d],[0 2],'Color','b', ...
'LineWidth',2);
subplot(2,1,2)
line([xf0 xf2],[yf0d yf2d],'Color','b','LineWidth',2);
% the value of f at the point R in the Figure 4.3
fxy02=subs(f,{xx,yy},{xf2,yf0d});
Gradient of the function
% Line PR in Figure 4.3

line([xf0 xf2],[yf0d yf0d],'Color','g', ...


'LineWidth',2,'LineStyle','--')

% Line RQ in Figure 4.3

line([xf2 xf2],[yf0d yf2d],'Color','g', ...


'LineWidth',2,'LineStyle','--')

axis square % this is useful for identifying the tangent
% and gradient
Jacobian
• In the Jacobian, the gradients of the functions appear in the same row:

        | ∂f/∂x  ∂f/∂y  ∂f/∂z |
[J] =   | ∂g/∂x  ∂g/∂y  ∂g/∂z |

• The Hessian matrix [H] is the same as the matrix of second
derivatives of a function of several variables. For f(x,y):

        | ∂²f/∂x²   ∂²f/∂x∂y |
[H] =   | ∂²f/∂x∂y  ∂²f/∂y²  |
Unconstrained problem
• Minimize

f(x1, x2): (x1 - 1)^2 + (x2 - 1)^2 - x1x2
0 ≤ x1 ≤ 3; 0 ≤ x2 ≤ 3
Unconstrained problem
x1=0:.05:4;
x2=0:0.05:4;
[X1 X2]=meshgrid(x1,x2); % X,Y are matrices

fval=(X1-1).^2 +(X2-1).^2 -X1.*X2; % could have used an m-file

colormap(gray) % sets the default colors


meshc(X1,X2,fval) % draws a mesh with contours underneath
rotate3d % allows you to interactively rotate the figure
% for better view
xlabel('x_1');
ylabel('x_2');
zlabel('f(x_1,x_2)')
% adds a tangent (plane) surface at the minimum
patch([1 3 3 1], [1 1 3 3],[-2 -2 -2 -2],'y')
grid
Method of Lagrange

Minimize f(x1, x2): -x1x2
Subject to h1(x1, x2): x1^2/4 + x2^2 = 1
0 ≤ x1 ≤ 3; 0 ≤ x2 ≤ 3
Example-Lagrange method
Minimize

F(x1, x2, λ1) = -x1x2 + λ1(x1^2/4 + x2^2 - 1)

Subject to:
h1(x1, x2): x1^2/4 + x2^2 = 1
0 ≤ x1 ≤ 3; 0 ≤ x2 ≤ 3

The necessary conditions are obtained as:
∂F/∂x1 = ∂f/∂x1 + λ1 ∂h1/∂x1 = 0
∂F/∂x2 = ∂f/∂x2 + λ1 ∂h1/∂x2 = 0
∂F/∂λ1 = h1 = 0
Example-Lagrange method
Applying the necessary conditions to the problem, we obtain:

∂F/∂x1 = -x2 + λ1 x1/2 = 0

∂F/∂x2 = -x1 + 2 λ1 x2 = 0

∂F/∂λ1 = h1 = x1^2/4 + x2^2 - 1 = 0
In MATLAB, there are two ways to solve the above equations. The first
is by using symbolic support functions or using the numerical support
functions. The symbolic function is solve and the numerical function is
fsolve. The numerical technique is an iterative one and requires you to
choose an initial guess to start the procedure.
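The same two routes exist outside MATLAB as well. As a numerical check, a sketch with SciPy's `fsolve` applied to the three necessary conditions above, using the same initial guess x1 = x2 = 1, λ1 = 0.5 that appears in the MATLAB session:

```python
from scipy.optimize import fsolve

def conditions(v):
    # v = [x1, x2, lam1]; the three necessary conditions of the example
    x1, x2, lam1 = v
    return [-x2 + 0.5 * lam1 * x1,
            -x1 + 2.0 * lam1 * x2,
            0.25 * x1**2 + x2**2 - 1.0]

sol = fsolve(conditions, [1.0, 1.0, 0.5])
print(sol)   # approximately [1.4142, 0.7071, 1.0]
```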
Lagrange method-Example
% Necessary/Sufficient conditions for
% Equality constrained problem
% Minimize f(x1,x2) = -x1x2
%
%-------------------------
% symbolic procedure
%------------------------
% define symbolic variables
format compact
syms x1 x2 lam1 h1 F

% define F
F = -x1*x2 + lam1*(x1*x1/4 + x2*x2 -1);
h1 = x1*x1/4 +x2*x2 -1;
Lagrange method-Example
%the gradient of F
sym grad;
grad1 = diff(F,x1);
grad2 = diff(F,x2);
% optimal values satisfaction of necessary conditions
[lams1 xs1 xs2] = solve(grad1,grad2,h1,'x1,x2,lam1');
% the solution is returned as a vector of the three unknowns; in case of
% multiple solutions lams1 is the solution vector for lam1, etc.
% IMPORTANT: the results are sorted alphabetically. fprintf is used to
% print a string in the command window; disp is used to print values of a matrix
f = -xs1.*xs2;
fprintf('The solution (x1*,x2*,lam1*, f*):\n'), ...
disp(double([xs1 xs2 lams1 f]))
Lagrange method-Example
%------------------------------
% Numerical procedure
%----------------------------
% solution to non-linear system using fsolve see help fsolve
% the unknowns have to be defined as a vector
% the functions have to be set up in a m-file

% define initial values
xinit=[1 1 0.5]'; % initial guess for x1, x2, lam1

% the equations to be solved are available in


% eqns4_4_2.m

xfinal = fsolve('eqns4_4_2',xinit,'Display','final');

fprintf('The numerical solution (x1*,x2*,lam1*): \n'), ...


disp(xfinal);
Lagrange method-Example
• The symbolic computation generates four solutions. Only the first
one is valid for this problem. This is decided by the side constraints
expressed by the equation:

0 ≤ x1 ≤ 3; 0 ≤ x2 ≤ 3

• On the other hand, the numerical technique provides only one
solution to the problem. This is a function of the initial guess.
Generally, the numerical techniques will deliver solutions closest to
the point they started from. The solution is:

x1* = 1.4142; x2* = 0.7071; λ1* = 1.0


Lagrange method-Example
function ret = eqns4_4_2(x)

% x is a vector
% x(1) = x1, x(2) = x2, x(3) = lam1
ret=[(-x(2) + 0.5*x(1)*x(3)), ...
(-x(1) + 2*x(2)*x(3)), ...
(0.25*x(1)*x(1) + x(2)*x(2) -1)];
Scaling-Example
Minimize

f(x1, x2) = 6.0559×10^5 (x1^2 - x2^2)

subject to
g1(x1, x2): 7.4969×10^5 x1^2 + 40000x1 - 9.7418×10^6 (x1^4 - x2^4) ≤ 0
g2(x1, x2): (5000 + 1.4994×10^5 x1)(x1^2 + x1x2 + x2^2) - 1.7083×10^7 (x1^4 - x2^4) ≤ 0
g3(x1, x2): 1.9091×10^-3 x1 + 6.1116×10^-4 - 0.05(x1^4 - x2^4) ≤ 0
g4(x1, x2): x2 - x1 + 0.001 ≤ 0
0.02 ≤ x1 ≤ 1.0; 0.02 ≤ x2 ≤ 1.0
Scaling-Example
• Consider the order of the magnitude in the equations:

g1(x1, x2): 7.4969×10^5 x1^2 + 40000x1 - 9.7418×10^6 (x1^4 - x2^4) ≤ 0

g3(x1, x2): 1.9091×10^-3 x1 + 6.1116×10^-4 - 0.05(x1^4 - x2^4) ≤ 0

• Numerical calculations are driven by larger magnitudes. The second inequality will
be ignored in relation to the other functions even though the graphical solution
indicates that g3 is active.

• This is a frequent occurrence in all kinds of numerical techniques. The standard


approach to minimize the impact of large variations in magnitudes among different
equations is to normalize the relations.

• In practice, this is also extended to the variables. This is referred to as scaling the
variables and scaling the functions. Many current software will scale the problem
without user intervention.
Scaling-Example
Scaling variables: The presence of side constraints in problem
formulation allows a natural definition of scaled variables. The user
defined upper and lower bounds are used to scale each variable between 0
and 1. Therefore,

~ x  x l
~
xi  ui il ; xi  scaled ith variable
xi  xi
xi  ~
xi ( xiu  xil )  xil

In the original problem, the above equations is used to substitute for the
original variables after which the problem can be expressed in terms of
scaled variable.
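The two substitution formulas take only a few lines in any language; a sketch using the bounds 0.02 ≤ xi ≤ 1.0 of this example:

```python
def scale(x, lo=0.02, hi=1.0):
    # map x in [lo, hi] to the scaled variable in [0, 1]
    return (x - lo) / (hi - lo)

def unscale(xt, lo=0.02, hi=1.0):
    # recover the original variable from the scaled one
    return xt * (hi - lo) + lo

xt = scale(0.6)
print(xt)            # about 0.5918
print(unscale(xt))   # 0.6 recovered
```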
Scaling-Example
• An alternate formulation is to use only the upper value of the side constraint to
scale the design variable:
xˆi  xi / xiu
x  xu ~
i i x i

• While the above option limits the higher scaled value to 1, it does not set the
lower scaled value to zero. For the example of this section, there is no necessity
for scaling the design variables since their order of magnitude is one, which is
exactly what scaling attempts to achieve.

• Scaling of the functions in the problem is usually critical for a successful


solution. Numerical techniques used in optimization are iterative. In each
iteration, usually the gradient of the functions at the current value of the design
variables is involved in the calculations. This gradient expressed as a matrix, is
called the Jacobian matrix or simply the Jacobian.
Scaling-Example
• Sophisticated scaling techniques employ the diagonal entries of the
Jacobian matrix as metrics to scale the respective functions. These
entries are evaluated at the starting values of the design variables. The
function can also be scaled in the same manner as the equations.
~ xi  xil ~
xi  u ; xi  scaled ith variable
xi  xi l

x ~
i x ( xu  xl )  xl
i i i i

xˆi  xi / xiu
x  xu ~
i i x i

• In this example, the constraints will be scaled using relations similar to


the relations expressed by the second scaling option.
Scaling-Example
• The scaling factor for each constraint will be determined using the starting
value or the initial guess for the variables. A starting value of 0.6 for both
design variables is selected to compute the values necessary for scaling the
functions. The scaling constants for the equations are calculated as:

>> syms g1 x1 x2
>> g1=7.4969*10^5*x1*x1+40000*x1-9.7418*10^6*(x1^4-x2^4)
g1 =

749690*x1^2+40000*x1-9741800*x1^4+9741800*x2^4

>> subs(g1,{x1,x2},{0.6,0.6})

ans =

2.9389e+005
Scaling-Example
>> syms g2 x1 x2
>> g2=(5000+1.4994*10^5*x1)*(x1*x1+x1*x2+x2*x2)-
1.7083*10^7*(x1^4-x2^4)
g2 =
(5000+149940*x1)*(x1^2+x1*x2+x2^2)-
17083000*x1^4+17083000*x2^4
>> subs(g2,{x1,x2},{0.6,0.6})
ans =
1.0256e+005
Scaling-Example
>> syms g3 x1 x2
>> g3=1.9091*10^(-3)*x1+6.1116*10^(-4)-0.05*(x1^4-x2^4)
g3 =
8804169777779727/4611686018427387904*x1+56369560
54044165/9223372036854775808-1/20*x1^4+1/20*x2^4
>> subs(g3,{x1,x2},{0.6,0.6})
ans =
0.0018
Scaling-Example
• The scaling constants for the equations are calculated as:
g~10  293888.4; g~20  102561.12;
g~30  1.7570E  03; g~20  1

• The first three constraints are divided through by their scaling constants. The last
equation is unchanged. The objective function has a coefficient of one. The scaled
problem is:
f~ = x1^2 - x2^2
subject to
g~1: 2.5509x1^2 + 0.1361x1 - 33.148(x1^4 - x2^4) ≤ 0
g~2: (0.0488 + 1.4619x1)(x1^2 + x1x2 + x2^2) - 166.5641(x1^4 - x2^4) ≤ 0
g~3: 1.0868x1 + 0.3482 - 28.4641(x1^4 - x2^4) ≤ 0
g~4(x1, x2): x2 - x1 + 0.001 ≤ 0

0.02 ≤ x1 ≤ 1.0; 0.02 ≤ x2 ≤ 1.0


Scaling-Example
syms x1s x2s g1s g2s g3s g4s fs b1s b2s b3s b4s Fs
g1s = 2.5509*x1s*x1s +.1361*x1s-33.148*(x1s^4-x2s^4);
g2s = (0.0488+ 1.4619*x1s)*(x1s*x1s+x1s*x2s+x2s*x2s) ...
-166.5641*(x1s^4-x2s^4);
g3s = 1.0868*x1s + 0.3482 -28.4641*(x1s^4-x2s^4);
g4s = x2s-x1s + 0.001;

fs = x1s*x1s -x2s*x2s;
Fs = fs + b1s*g1s+ b2s*g2s + b3s*g3s +b4s*g4s;
Scaling-Example
% the gradient of F
syms grad1s grad2s
grad1s = diff(Fs,x1s);
grad2s = diff(Fs,x2s);
% solution
[xs1 xs2] = solve(g1s,g3s,'x1s,x2s');
fss = xs1.^2 - xs2.^2;
gs1 = 2.5509*xs1.*xs1 +0.1361*xs1-33.148*(xs1.^4-xs2.^4);
gs2 = (.0488+ 1.4619*xs1).*(xs1.*xs1+xs1.*xs2+xs2.*xs2) ...
-166.5641*(xs1.^4-xs2.^4);
gs3 = 1.0868*xs1+0.3482 -28.4641*(xs1.^4-xs2.^4);
gs4 = xs2-xs1 +0.001;
Scaling-Example
fprintf('\n\nThe solution *** Case a ***(x1*,x2*, f*, g1, g2 g3
g4):\n'), ...
disp(double([xs1 xs2 fss gs1 gs2 gs3 gs4]))
%unlike the previous case all the solutions are displayed
%
x1s=double(xs1(1));
fprintf('\n x1 = '),disp(x1s)
x2s=double(xs2(1));
fprintf('\n x2 = '),disp(x2s)
Scaling-Example
fprintf('\nConstraint:')
fprintf('\ng1: '),disp(subs(g1s))
fprintf('\ng2: '),disp(subs(g2s))
fprintf('\ng3: '),disp(subs(g3s))
fprintf('\ng4: '),disp(subs(g4s))
b2s=0.0; b4s = 0.0;

[b1s b3s]=solve(subs(grad1s),subs(grad2s),'b1s,b3s');

fprintf('Multipliers b1 and b3 : ')


fprintf('\nb1: '),disp(double(b1s))
fprintf('\nb3: '),disp(double(b3s))
Scaling-Example
A note on Kuhn-Tucker Conditions:
The FOC associated with the general optimization problem in the following
equations is termed the Kuhn-Tucker conditions. The general optimization
problem is:

Minimize f(x1, x2, ..., xn)
Subject to hk(x1, x2, ..., xn) = 0, k = 1, 2, ..., l
gj(x1, x2, ..., xn) ≤ 0, j = 1, 2, ..., m
xi^l ≤ xi ≤ xi^u, i = 1, 2, ..., n

The Lagrangian:
F(x1, ..., xn, λ1, ..., λl, β1, ..., βm) = f(x1, x2, ..., xn) + λ1 h1 + ... + λl hl + β1 g1 + ... + βm gm

There are n+l+m unknowns. The same number of equations are required to solve the
problem. These are provided by the FOC or the Kuhn-Tucker conditions.
Scaling-Example
• n equations are obtained as:
F f h h g g
  1 1    l l  1 1     m m  0; i  1,2,, n
xi xi xi xi xi xi

• l equations are obtained directly through the equality constraints

hk ( x1 , x2 ,, xn )  0; k  1,2,, l

• m equations are applied through the 2^m cases. This implies that there are 2^m
possible solutions. Each case sets either the multiplier βj or the corresponding inequality
constraint gj to zero. If the multiplier is set to zero, then the corresponding
constraint must be feasible for an acceptable solution. If the constraint is set to
zero (active constraint), then the corresponding multiplier must be positive for a
minimum.
Scaling-Example
• With this in mind, the m equations can be expressed as:
 j g j  0  if  j  0 then g j  0
if g j  0 then  j  0
• If the above equations are not met, the design is not acceptable.

• In our example, n = 2 and m = 4, so there are 2^4 = 16 cases that must be investigated as
part of the Kuhn-Tucker conditions. It can also be identified that g1 and g3 are
active constraints. If g1 and g3 are active constraints, then the multipliers β1 and β3
must be positive. By the same reasoning, the multipliers associated with the inactive
constraints g2 and g4, that is β2 and β4, must be set to zero. This information on the
active constraints can be used to solve for x1* and x2*, as this is a system of two
equations in two unknowns.
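The case-by-case logic can be illustrated on a deliberately small, hypothetical problem (not the slide's example, whose constraints are defined elsewhere): minimize (x - 2)^2 subject to g1: x - 1 ≤ 0 and g2: -x ≤ 0. A Python sketch enumerating all 2^m = 4 multiplier/constraint cases:

```python
import itertools

def kt_cases():
    # Hypothetical illustration: minimize (x - 2)^2
    # subject to g1: x - 1 <= 0 and g2: -x <= 0.
    # Stationarity: 2(x - 2) + b1*(1) + b2*(-1) = 0
    solutions = []
    for g1_active, g2_active in itertools.product([False, True], repeat=2):
        if g1_active and g2_active:
            continue                      # x = 1 and x = 0 cannot both hold
        if g1_active:                     # g1 = 0 fixes x = 1; b2 = 0
            x, b1, b2 = 1.0, -2.0*(1.0 - 2.0), 0.0
        elif g2_active:                   # g2 = 0 fixes x = 0; b1 = 0
            x, b1, b2 = 0.0, 0.0, 2.0*(0.0 - 2.0)
        else:                             # both multipliers zero: 2(x - 2) = 0
            x, b1, b2 = 2.0, 0.0, 0.0
        feasible = (x - 1 <= 0.0) and (-x <= 0.0)
        if feasible and b1 >= 0.0 and b2 >= 0.0:
            solutions.append((x, b1, b2))
    return solutions
```

The only case that survives both the feasibility and the multiplier-sign checks is the one with g1 active: x* = 1 with β1 = 2 and β2 = 0, exactly the kind of screening the 16 cases of the slide's example require.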
Newton-Raphson Method
Minimize
f ( )  (  1) 2 (  2)(  3)

Subject to
g ( ) : 0.75 * 2  1.5 *   1  0
0   4

Solution:
The Newton-Raphson iteration applied to φ(α) = f′(α) is

α(i+1) = α(i) - φ(α(i)) / φ′(α(i))

syms al f g phi phial
f = (al-1)^2*(al-2)*(al-3);
g = -1 - 1.5*al + 0.75*al*al;
phi = diff(f);
phial = diff(phi);
ezplot(f,[0 4])
l1 =line([0 4],[0 0]);
Newton-Raphson Method
set(l1,'Color','k','LineWidth',1,'LineStyle','-')
hold on
ezplot(phi,[0 4])
grid
hold off
xlabel('\alpha')
ylabel('f(\alpha), \phi(\alpha)')
title('Example 5.1')
axis([0 4 -2 10])

alpha(1) = 0.5;
fprintf('iterations alpha phi(i) d(alpha) phi(i+1) f\n')
Newton-Raphson Method
for i = 1:20;
index(i) = i;
al = alpha(i);
phicur(i)=subs(phi);
delalpha(i) = -subs(phi)/subs(phial);
al = alpha(i)+delalpha(i);
phinext(i)=subs(phi);
fun(i)=subs(f);
if (i > 1)
l1=line([alpha(i-1) alpha(i)], [phicur(i-1),phicur(i)]);
set(l1,'Color','r','LineWidth',2)
pause(2)
end
if (abs(phinext(i)) <= 1.0e-08) % the convergence or the stopping criterion
disp([index' alpha' phicur' delalpha' phinext' fun'])
return
else
alpha(i+1)=al;
end
end
i % echo the iteration count if convergence was not achieved
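The same iteration can be sketched in plain Python (used here only so the loop can be checked outside MATLAB), with φ = f′ and φ′ = f″ of f(α) = (α - 1)²(α - 2)(α - 3) expanded by hand:

```python
def newton_raphson(a=0.5, tol=1.0e-8, maxit=20):
    # phi(a) = f'(a) and phi'(a) = f''(a) for f(a) = (a-1)^2 (a-2)(a-3),
    # expanded by hand: f = a^4 - 7a^3 + 17a^2 - 17a + 6
    phi  = lambda a: 4.0*a**3 - 21.0*a**2 + 34.0*a - 17.0
    dphi = lambda a: 12.0*a**2 - 42.0*a + 34.0
    for _ in range(maxit):
        a = a - phi(a)/dphi(a)      # alpha(i+1) = alpha(i) - phi/phi'
        if abs(phi(a)) <= tol:      # same stopping criterion as the m-file
            break
    return a

alpha_nr = newton_raphson()
```

Starting from α = 0.5 this converges to α = 1, the stationary point nearest the start (a local minimum, since f″(1) = 4 > 0).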
Newton-Raphson Method

In summary, most iterative numerical techniques are designed to converge to
solutions that are close to where they start from.
Bisection Technique (Interval
Halving Procedure)
• Unlike the Newton-Raphson method, this procedure does not require the
evaluation of the gradient of the function whose root is being sought.

syms al f phi phial


f = (al-1)^2*(al-2)*(al-3);
phi = diff(f);
% plot of overall function
ezplot(f,[0 4])
l1 =line([0 4],[0 0]);
set(l1,'Color','k','LineWidth',1,'LineStyle','-')
hold on
ezplot(phi,[0 4])
grid
hold off
Bisection Technique (Interval
Halving Procedure)
xlabel('\alpha')
ylabel('f(\alpha), \phi(\alpha)')
title('Example 5.1')
axis([0 4 -2 10])
%number of iterations = 20
aone(1) = 0.0; al = aone(1); phi1(1) = subs(phi);
atwo(1) = 4.0; al = atwo(1); phi2(1) = subs(phi);
% display the trapping of the functions for the first seven iterations
figure
% organizing columns
% values are stored in a vector for later printing
fprintf('iterations alpha-a alpha-b alpha phi(alpha) f\n')
Bisection Technique (Interval
Halving Procedure)
for i = 1:20;
index(i) = i;
al = aone(i) + 0.5*(atwo(i) - aone(i)); % step 2
% step 3
alf(i) = al;
phi_al(i) =subs(phi);
fun(i) = subs(f);
if ((atwo(i) - aone(i)) <= 1.0e-04)
break;
end
if (abs(phi_al(i)) <= 1.0e-08)
break;
end
Bisection Technique (Interval
Halving Procedure)
if ((phi1(1)*phi_al(i)) > 0.0)
aone(i+1)=al;atwo(i+1) = atwo(i);
phi2(i+1)=phi2(i); phi1(i+1) = phi_al(i);
else atwo(i+1)=al; phi2(i+1) = phi_al(i);
aone(i+1) = aone(i); phi1(i+1)=phi1(i);
end
% plots of the technique/trapping the minimum
if (i <=7)
ezplot(phi,[aone(i),atwo(i)])
xlabel('\alpha')
ylabel('\phi(\alpha)')
Bisection Technique (Interval
Halving Procedure)
title('Example 5.1')
l1=line([aone(i) aone(i)], [0,phi1(i)]);set(l1,'Color','r','LineWidth',1)
l2=line([atwo(i) atwo(i)], [0,phi2(i)]);
set(l2,'Color','g','LineWidth',1)
l3=line([aone(i) atwo(i)],[0 0]);
set(l3,'Color','k','LineWidth',1,'LineStyle','-')
axis([0.8*aone(i) 1.2*atwo(i) phi1(i) phi2(i)]);pause(2)
end
end
% print out the values
disp([index' aone' atwo' alf' phi_al' fun'])
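For comparison, the same interval-halving idea in a plain Python sketch, applied to φ(α) = f′(α) = 4α³ - 21α² + 34α - 17 on [0, 4]:

```python
def bisect(lo=0.0, hi=4.0, tol=1.0e-6):
    # phi(a) = f'(a); the root being trapped is a stationary point of f
    phi = lambda a: 4.0*a**3 - 21.0*a**2 + 34.0*a - 17.0
    flo = phi(lo)                      # phi(0) < 0 and phi(4) > 0
    while hi - lo > tol:
        mid = lo + 0.5*(hi - lo)       # step 2: halve the interval
        fmid = phi(mid)
        if flo*fmid > 0.0:             # sign change lies in the upper half
            lo, flo = mid, fmid
        else:                          # sign change lies in the lower half
            hi = mid
    return 0.5*(lo + hi)

root = bisect()
```

With this bracket the sign changes, and the halving closes in on the stationary point at α = (17 + √17)/8 ≈ 2.6404 without ever evaluating φ′.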
Polynomial Approximation
• This method is simple in concept. Instead of minimizing a difficult function of
one variable, minimize a polynomial that approximates the function.

• The optimal value of the variable that minimizes the polynomial is then
considered to approximate the optimal value of the variable for the original
function.

• It is rare for the degree of the approximating polynomial to exceed three. A
quadratic approximation is standard unless the third degree is warranted.

• It is clear that serious errors in approximation are expected if the polynomial
is to simulate the behaviour of the original function over a large range of
values of the variable.

• Mathematical theorems exist that justify a quadratic representation of the
function, with a prescribed degree of error, within a small neighborhood of
the minimum. What this ensures is that the polynomial approximation gets
better as the minimum is being approached.
Example
Minimize
f(α) = (α - 1)²(α - 2)(α - 3)

Subject to 0 ≤ α ≤ 4

A quadratic polynomial P(α) is used for the approximation. This polynomial is
expressed as
P(α) = b0 + b1 α + b2 α²

Solution: Two elements need to be understood prior to the following discussion. The first
concerns the evaluation of the polynomial, and the second concerns the inclusion of
the expression

Subject to 0 ≤ α ≤ 4
Example
• The polynomial is completely defined if the coefficients b0, b1, b2 are known. To
determine them, three data points [(α1,f1), (α2,f2), (α3,f3)] are generated from the equation

f(α) = (α - 1)²(α - 2)(α - 3)

• This sets up a linear system of three equations in three unknowns by requiring that the
values of the function and the polynomial must be the same at the three points. The
solution of this system of equations is the values of the coefficients. The consideration of
the expression
Subject to 0 ≤ α ≤ 4

depends on the type of one-dimensional problem being solved. If the one-dimensional
problem is a genuine single-variable design problem, then the above expression needs to
be present. If the one-dimensional problem is a subproblem from the multidimensional
optimization problem, then the above expression is not available. In that case, a scanning
procedure is used to define α1, α2 and α3.
Scanning procedure
• This process is started from the lower limit for α. A value of zero for this limit
can be justified since it refers to values at the current iteration. A constant interval
for α, Δα, is also identified. For a well-scaled problem, this value is usually 1.
Starting at the lower limit, the interval is doubled until three points are determined
such that the minimum is bracketed between them. With respect to this particular
example, the scanning procedure generates the following values:

α = 0: f(0) = 6;  α1 = 0, f1 = 6
α = 1: f(1) = 0;  α2 = 1, f2 = 0
α = 2: f(2) = 0;  this cannot be α3 as the minimum is not yet trapped
α = 4: f(4) = 18; α3 = 4, f3 = 18
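The fit itself is small enough to check by hand. A Python sketch (an illustration, not the book's m-file) that passes P(α) = b0 + b1 α + b2 α² through the three scanned points (0, 6), (1, 0), (4, 18) and minimizes it:

```python
def quadratic_min(pts):
    (x1, f1), (x2, f2), (x3, f3) = pts
    # Newton divided differences give the coefficients of b0 + b1*a + b2*a^2
    d1 = (f2 - f1) / (x2 - x1)
    d2 = ((f3 - f2) / (x3 - x2) - d1) / (x3 - x1)
    b2 = d2
    b1 = d1 - d2*(x1 + x2)
    b0 = f1 - b1*x1 - b2*x1*x1
    a_star = -b1 / (2.0*b2)                 # vertex of the parabola
    return a_star, b0 + b1*a_star + b2*a_star**2

# the three points produced by the scanning procedure above
alpha, p = quadratic_min([(0.0, 6.0), (1.0, 0.0), (4.0, 18.0)])
```

The fitted polynomial works out to P(α) = 6 - 9α + 3α², whose vertex α* = 1.5 (with P(1.5) = -0.75) approximates the minimizer of the original quartic.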
Scanning procedure
• A process such as that illustrated is essential as it is both
indifferent to the problem being solved and can be
programmed easily. The important requirement of any such
process is to ensure that the minimum lies between the limits
established by the procedure. This procedure is developed as
MATLAB m-file below.

% Numerical Techniques - 1 D optimization
% Generic Scanning Procedure - Single Variable
% copyright Dr. P.Venkataraman
% An m-file to bracket the minimum of a function of a single variable
% Lower bound is known; only the upper bound is found
% This procedure will be used along with Polynomial Approximation
% or with the Golden Section Method
% the following information are passed to the function
Scanning procedure
% the name of the function 'functname'
% the function should be available as a function m-file and
should return the value of the function
% the inputs:

% the initial value a0


% the incremental value da
% the number of scanning steps ns
% sample calling statement
% UpperBound_1Var('Example5_1',0,1,10)
% this should give you a value of [4 18] as in the text
Scanning procedure
function ReturnValue = UpperBound_1Var(functname,a0,da,ns)

format compact
% ntrials are used to bisect/double values of da
if (ns ~= 0) ntrials = ns;
else ntrials = 20; % default
end

if (da ~= 0) das = da;
else das = 1; %default
end
Scanning procedure
for i = 1:ntrials;
j = 0; dela = j*das; a00 = a0 + dela;
f0 = feval(functname,a00);
j = 1; dela = j*das; a01 = a0 + dela;
f1 = feval(functname,a01);
f1s =f1;
if f1 < f0
for j = 2:ntrials
aa01 = a0 + j*das;
af1 = feval(functname,aa01);
f1s=min(f1s,af1);
if af1 > f1s
ReturnValue = [aa01 af1];
return;
end
end
Scanning procedure
% after ntrials the value is still less than the start value; return the last value
fprintf('\n cannot increase function value\n')
ReturnValue = [aa01 af1];
return;
else
das = 0.5*das;
end
end
Scanning procedure
fprintf('\ncannot decrease function value \n')
fprintf('\ninitial slope may be greater than zero')
fprintf('\nlower bound needs adjusting \n')
% return start value
ReturnValue =[a0 f0];
UpperBound_1Var.m
• This code segment implements the determination of the upper bound of a function
of one variable.

• The input to the function is the name of the function m-file, for example
'Example5_1', the start value for the scan (a0), the scanning interval (da), and the
number of scanning steps (ns). The function outputs a vector of two values. The first
is the value of the variable and the second is the corresponding value of the
function.

• The function referenced by the code must be a MATLAB m file in the same
directory (Example5_1.m)

function retval = Example5_1(a)


retval = (a - 1)*(a - 1)*(a -2)*(a -3);
UpperBound_1Var.m
>> UpperBound_1Var('Example5_1',0,1,10)
ans =
4 18
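The bracketing scan reduces to a few lines in any language. A Python sketch of the same idea, stepping linearly from a0 like the m-file's inner loop (simplified: the step-halving branch is assumed unnecessary, i.e. the first step is downhill):

```python
def upper_bound(f, a0=0.0, da=1.0, ns=10):
    # assumes the first step is downhill: f(a0 + da) < f(a0)
    a = a0 + da
    fmin = f(a)
    for j in range(2, ns + 1):
        a = a0 + j*da
        fa = f(a)
        fmin = min(fmin, fa)
        if fa > fmin:        # the value rose above the running minimum,
            return a, fa     # so the minimum is bracketed below a
    raise ValueError("could not bracket the minimum in ns steps")

f51 = lambda a: (a - 1.0)**2 * (a - 2.0) * (a - 3.0)   # Example5_1
result = upper_bound(f51)
```

For Example5_1 with a0 = 0 and da = 1 this reproduces the [4 18] result shown above.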
PolyApprox_1Var.m
% Numerical Techniques - 1 D optimization
% Generic Polynomial Approximation Method - Single Variable
% copyright Dr. P.Venkataraman
% An m-file to apply the Polynomial Approximation Method
%************************************
% requires: UpperBound_1Var.m
%***************************************
% This procedure will be used along with Polynomial Approximation or with the
% Golden Section Method
% the following information are passed to the function:
% the name of the function 'functname'
% the function should be available as a function m-file and should return
% the value of the function
% the inputs:
% the initial value a0
% the incremental value da
% the number of scanning steps ns
% sample calling statement: UpperBound_1Var('Example5_1',0,1,10)
% this should give you a value of [4 18] as in the text
PolyApprox_1Var.m

function ReturnValue = ...
PolyApprox_1Var(functname,order,lowbound,intvlstep,ntrials)

format compact
lowval = 0.0
up = UpperBound_1Var(functname,lowbound,intvlstep,ntrials)
upval = up(1)
PolyApprox_1Var.m
if (order == 2)
val1 = lowval + (upval -lowval)*.5
f1 = feval(functname,lowval)
f2 = feval(functname,val1)
f3 = feval(functname,upval)

A = [1 lowval lowval^2 ; 1 val1 val1*val1 ; 1 upval upval^2];
coeff = A\[f1 f2 f3]'; % solve the linear system for the polynomial coefficients
polyopt = -coeff(2)/(2*coeff(3))
fpoly = coeff(1) + coeff(2)*polyopt +coeff(3)*polyopt*polyopt
ReturnValue = [polyopt fpoly];
end
PolyApprox_1Var.m
• This code segment implements the polynomial approximation method for a
function of one variable. This function uses UpperBound_1Var.m to
determine the range of the variable.

• The input to the function is the name of the function; the order (2 or 3) of the
approximation; lowbound - the start value of the scan passed to
UpperBound_1Var.m; intvlstep - the scanning interval passed to
UpperBound_1Var.m; ntrials - the number of scanning steps passed to
UpperBound_1Var.m.

• The output of the program is a vector of two values. The first element of the
vector is the location of the minimum of the approximating polynomial, and
the second is the function value at this location.

• The function referenced by the code must be a MATLAB m-file in the same
directory (Example5_1.m). The input for example Example5_1 is the value
at which the function needs to be computed, and its output is the value of
the function.

• Usage: Value=PolyApprox_1Var('Example5_1',2,0,1,10)
Golden Section Method
(figures: the interval-update steps of the method; when f2 > f1 the upper
portion of the interval is discarded, otherwise the lower portion)
GoldSection_1Var.m
• The code translates the algorithm for the golden section method into
MATLAB code.

• The input to the function is the name of the function (functname)
whose minimum is being sought; the tolerance (tol) of the
approximation; the start value (lowbound) of the scan passed to
UpperBound_1Var.m; the scanning interval (intvl) passed to
UpperBound_1Var.m; the number of scanning steps (ntrials) passed to
UpperBound_1Var.m.

• The output of the program is a vector of four pairs of variable and
function values after the final iteration.

• Usage: Value=GoldSection_1Var('Example5_1',0.001,0,1,10)
GoldSection_1Var.m
% Numerical Techniques - 1 D optimization
% Generic Golden Section Method - Single Variable
% copyright (code) Dr. P.Venkataraman
% An m-file to apply the Golden Section Method
%************************************
% requires: UpperBound_1Var.m
%***************************************
%the following information are passed to the function
% the name of the function 'functname'
% this function should be available as a function m-file
% and should return the value of the function
% the tolerance tol, e.g. 0.001
GoldSection_1Var.m
% following needed for UpperBound_1Var
% the initial value lowbound
% the incremental value intvl
% the number of scanning steps ntrials
% the function returns a row vector of four sets of
% variable and function values for the last iteration
% sample calling statement
% GoldSection_1Var('Example5_1',0.001,0,1,10)
GoldSection_1Var.m
function ReturnValue = ...
GoldSection_1Var(functname,tol,lowbound,intvl,ntrials)
format compact;
% gets upperbound
upval = UpperBound_1Var(functname,lowbound,intvl,ntrials);
au=upval(1); fau = upval(2);
if (tol == 0) tol = 0.0001; %default
end
eps1 = tol/(au - lowbound); %default
tau = 0.38197; % golden ratio
nmax = round(-2.078*log(eps1)); % number of iterations
GoldSection_1Var.m
aL = lowbound; faL = feval(functname,aL);
a1 = (1-tau)*aL + tau*au; fa1 = feval(functname,a1);
a2 = tau*aL + (1 - tau)*au; fa2 = feval(functname,a2);
% storing all the four values for printing
fprintf('start \n')
fprintf('alpha(low) alpha(1) alpha(2) alpha(up) \n')
avec = [aL a1 a2 au;faL fa1 fa2 fau];
disp([avec])
GoldSection_1Var.m
for i = 1:nmax
if fa1 >= fa2
aL = a1; faL = fa1;
a1 = a2; fa1 = fa2;
a2 = tau*aL + (1 - tau)*au; fa2 = feval(functname,a2);
au = au; fau = fau; % not necessary - just for clarity
fprintf('\niteration '),disp(i)
fprintf('alpha(low) alpha(1) alpha(2) alpha(up) \n')
avec = [aL a1 a2 au;faL fa1 fa2 fau];
disp([avec])
GoldSection_1Var.m
else
au = a2; fau = fa2;
a2 = a1; fa2 = fa1;
a1 = (1-tau)*aL + tau*au; fa1 = feval(functname,a1);
aL = aL; faL = faL; % not necessary
fprintf('\niteration '),disp(i)
fprintf('alpha(low) alpha(1) alpha(2) alpha(up) \n')
avec = [aL a1 a2 au;faL fa1 fa2 fau];
disp([avec])
end
end
% returns the value at the last iteration
ReturnValue =[aL faL a1 fa1 a2 fa2 au fau];
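The same loop in a Python sketch, using τ ≈ 0.618 directly (the m-file's tau = 0.38197 is 1 - τ):

```python
import math

def golden_section(f, aL, aU, tol=1e-3):
    tau = (math.sqrt(5.0) - 1.0) / 2.0   # 0.618...; the m-file uses 1 - tau
    a1 = aL + (1 - tau) * (aU - aL)
    a2 = aL + tau * (aU - aL)
    f1, f2 = f(a1), f(a2)
    while aU - aL > tol:
        if f1 >= f2:                     # discard the lower portion [aL, a1]
            aL, a1, f1 = a1, a2, f2
            a2 = aL + tau * (aU - aL)
            f2 = f(a2)
        else:                            # discard the upper portion [a2, aU]
            aU, a2, f2 = a2, a1, f1
            a1 = aL + (1 - tau) * (aU - aL)
            f1 = f(a1)
    return 0.5 * (aL + aU)

f = lambda a: (a - 1)**2 * (a - 2) * (a - 3)
alpha_star = golden_section(f, 0.0, 4.0)
```

On [0, 4] the first comparison discards the left portion containing the minimum at α = 1, and the iteration settles into the right-hand valley near α ≈ 2.6404.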
Golden Section Method
• The golden section method is an iterative technique.

• The number of iterations depends on the tolerance expected in the final result and is
known prior to the start of the iterations.

• This is a significant improvement relative to the Newton-Raphson method, where the
number of iterations cannot be predicted a priori.

Example 2: Minimize

f(x1, x2, x3) = (x1 - x2)² + 2(x2 - x3)² + 3(x3 - 1)²
Golden Section Method
Extension for the multivariable case

• For the single variable:
Usage: UpperBound_1Var('Example5_1',0,1,10)

• For many variables:
Usage: UpperBound_nVar('Example5_2',x,s,0,1,10)

x: current position vector or design vector
s: prescribed search direction
UpperBound_nVar
% Ch 5: Numerical Techniques - 1 D optimization
% Generic Scanning Procedure - n Variables
% copyright Dr. P.Venkataraman
% An m-file to bracket the minimum of a function of a single variable
% Lower bound is known; only the upper bound is found
% This procedure will be used along with Polynomial Approximation or with the
% Golden Section Method
% the following information are passed to the function
% the name of the function: 'functname'; the function should be available as a
% function m-file and should return the value of the function for a design vector
% the current position vector x
% the current search direction vector s
% the initial step a0
% the incremental step da
% the number of bracketing steps ns
UpperBound_nVar
function ReturnValue = UpperBound_nVar(functname,x,s,a0,da,ns)
format compact
% ntrials are used to bisect/double values of da
if (ns ~= 0) ntrials = ns;
else ntrials = 10; % default
end
if (da ~= 0) das = da;
else das = 1; %default
end
% finds a value of function greater than or equal
% to the previous lower value
for i = 1:ntrials;
j = 0; dela = j*das; a00 = a0 + dela;
dx0 = a00*s; x0 = x + dx0; f0 = feval(functname,x0);
j = j+1; dela = j*das; a01 = a0 + dela;
dx1 = a01*s; x1 = x + dx1; f1 = feval(functname,x1);
f1s = f1;
if f1 < f0
UpperBound_nVar
for j = 2:ntrials
a01 = a0 + j*das; dx1 = a01*s;
x1 = x + dx1; f1 = feval(functname,x1);
f1s = min(f1s,f1);
if f1 > f1s
ReturnValue = [a01 f1 x1];
return;
end
end
fprintf('\nCannot increase function in ntrials')
ReturnValue = [a01 f1 x1];
return,
UpperBound_nVar

else % f1 >= f0
das = 0.5*das;
end
end
fprintf('\n returned after ntrials - check problem')
ReturnValue =[a0 f0 x0];
GoldSection_nVar.m
Usage: For one variable
Value=GoldSection_1Var('Example5_1',0.001,0,1,10)

Usage: For many variables
Value=GoldSection_nVar('Example5_2',0.001,x,s,0,1,10)

The code is run from the command window using the following listing:
x=[0 0 0];
s=[0 0 6];
Value=GoldSection_nVar('Example5_2',0.001,x,s,0,1,10)
The result is:
Value =
0.1000 1.2000 0 0 0.5973
α1 = 0.1; f(α1) = 1.2; x1 = 0; x2 = 0; x3 = 0.5973
GoldSection_nVar.m
% Ch 5: Numerical Techniques - 1 D optimization
% Golden Section Method - many variables
% copyright (code) Dr. P.Venkataraman
% An m-file to apply the Golden Section Method
%************************************
% requires: UpperBound_nVar.m
%***************************************
%the following information are passed to the function
% the name of the function 'functname'
% this function should be available as a function m-file
% and should return the value of the function
% corresponding to a design vector given a vector
% the tolerance: 0.001
% following needed for UpperBound_nVar
% the current position vector x
% the current search direction s
% the initial value lowbound
% the incremental value intvl
% the number of scanning steps ntrials
GoldSection_nVar.m
% the function returns a row vector of the following
% alpha(1),f(alpha1), design variables at alpha(1)
% for the last iteration
% sample calling statement
% GoldSection_nVar('Example5_2',0.001,[0 0 0],[0 0 6],0,0.1,10)
function ReturnValue = ...
GoldSection_nVar(functname,tol,x,s,lowbound,intvl,ntrials)
format compact;
% find upper bound
upval = UpperBound_nVar(functname,x,s,lowbound,intvl,ntrials);
au=upval(1); fau = upval(2);
if (tol == 0) tol = 0.0001; %default
end
eps1 = tol/(au - lowbound);
tau = 0.38197;
GoldSection_nVar.m
nmax = round(-2.078*log(eps1)); % no. of iterations
aL = lowbound; xL = x + aL*s; faL = feval(functname,xL);
a1 = (1-tau)*aL + tau*au; x1 = x + a1*s; fa1 = feval(functname,x1);
a2 = tau*aL + (1 - tau)*au; x2 = x + a2*s; fa2 = feval(functname,x2);
% storing all the four values for printing
% remember to suppress printing after debugging
fprintf('start \n')
fprintf('alpha(low) alpha(1) alpha(2) alpha(up) \n')
avec = [aL a1 a2 au;faL fa1 fa2 fau];
disp([avec])
for i = 1:nmax
if fa1 >= fa2
aL = a1; faL = fa1;
a1 = a2; fa1 = fa2;
a2 = tau*aL + (1 - tau)*au; x2 = x + a2*s;
fa2 = feval(functname,x2);
au = au; fau = fau; % not necessary -just for clarity
GoldSection_nVar.m
fprintf('\niteration '),disp(i)
fprintf('alpha(low) alpha(1) alpha(2) alpha(up) \n')
avec = [aL a1 a2 au;faL fa1 fa2 fau];
disp([avec])
else
au = a2; fau = fa2;
a2 = a1; fa2 = fa1;
a1 = (1-tau)*aL + tau*au; x1 = x + a1*s;
fa1 = feval(functname,x1);
aL = aL; faL = faL; % not necessary
fprintf('\niteration '),disp(i)
fprintf('alpha(low) alpha(1) alpha(2) alpha(up) \n')
avec = [aL a1 a2 au;faL fa1 fa2 fau];
disp([avec])
end
end
% returns the value at the last iteration
ReturnValue =[a1 fa1 x1];
Pattern Search
• The pattern search method is a minor modification to the univariate method
with a major impact. In the univariate method, each design variable
(considered a coordinate) provides a search direction.

• This is also referred to as a coordinate direction, and is easily expressed
through the unit vector for that coordinate.

• It can be shown by application that for problems with considerable
nonlinearity, the univariate method tends to get locked into a zigzag pattern
of smaller and smaller moves as it approaches the solution.

• The pattern search procedure attempts to disrupt this zigzag behaviour by
executing one additional iteration for each cycle. In each cycle, at the end of
the n univariate directions, the (n+1)th search direction is assembled as a linear
combination of the previous n search directions and the optimum value of the
step size for each direction. A one-dimensional optimal step size is then
computed and the next cycle of iterations begins.
Powell's Method
• If there were only one zero-order method that must be programmed, the
overwhelming choice would be Powell's method. The principal reason for
the decision would be that it has the property of quadratic convergence,
namely, for a quadratic problem with n variables, convergence will be
achieved in less than or equal to n Powell cycles.

• A quadratic problem is an unconstrained minimization of a function that is
expressed as a quadratic polynomial - a polynomial with no term having a
degree greater than two. The following is an example of a
quadratic polynomial in two variables.

• retval = 3 + (x(1) - 1.5*x(2))^2 + (x(2) - 2)^2;


Powell's Method
• Engineering design optimization problems are rarely described by a
quadratic polynomial. This does not imply that you cannot use Powell's
method. What this means is that the solution should not be expected to
converge quadratically.

• For nonquadratic problems, as the solution is approached iteratively, the
objective can be approximated very well by a quadratic function. It is at
this stage that the quadratic convergence property is realized in the
computations.
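Powell's quadratic convergence can be demonstrated on the quadratic above. The sketch below is a simplified, hypothetical illustration: it hardcodes the gradient and Hessian of f(x) = 3 + (x1 - 1.5x2)² + (x2 - 2)², so each line search can use the exact step α = -(g·d)/(d·Hd) for a quadratic instead of a numerical search:

```python
def powell_quadratic(x, cycles=2):
    # hardcoded for f(x) = 3 + (x1 - 1.5*x2)^2 + (x2 - 2)^2:
    # gradient and Hessian written out by hand
    H = [[2.0, -3.0], [-3.0, 6.5]]
    grad = lambda x: [2.0*(x[0] - 1.5*x[1]),
                      -3.0*(x[0] - 1.5*x[1]) + 2.0*(x[1] - 2.0)]

    def line_min(x, d):
        # exact minimizing step along d for a quadratic: a = -(g.d)/(d.Hd)
        g = grad(x)
        Hd = [H[0][0]*d[0] + H[0][1]*d[1], H[1][0]*d[0] + H[1][1]*d[1]]
        a = -(g[0]*d[0] + g[1]*d[1]) / (d[0]*Hd[0] + d[1]*Hd[1])
        return [x[0] + a*d[0], x[1] + a*d[1]]

    dirs = [[1.0, 0.0], [0.0, 1.0]]          # start with coordinate directions
    for _ in range(cycles):
        x0 = x[:]
        for d in dirs:                       # n univariate-style searches
            x = line_min(x, d)
        pattern = [x[0] - x0[0], x[1] - x0[1]]
        x = line_min(x, pattern)             # extra search along the pattern
        dirs = [dirs[1], pattern]            # replace the oldest direction
    return x

x_min = powell_quadratic([0.5, 0.5])
```

Starting from (0.5, 0.5), two cycles are enough to land on the minimum (3, 2), matching the n-cycle convergence claim for n = 2.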
Gradient Based Methods
• These methods are also known as first-order methods. The search
directions will be constructed using the gradient of the objective
function. Since gradients are being computed, the Kuhn-Tucker
conditions (FOC) for unconstrained problems, ∇f = 0, can be used to
check for convergence.

• The SOC are hardly ever applied. One of the reasons is that it would
involve the computation of an n x n second derivative matrix which is
considered computationally expensive, particularly if the evaluation
of the objective function requires a call to a finite element method for
generating required information.

• Another reason for not calculating the Hessian is that the existence
of the second derivative in a real design problem is not certain even
though it is computationally possible or feasible. For problems that
can be described by symbolic calculations, MATLAB should be able
to handle computation of second derivative at the possible solution
and its eigenvalues.
Gradient Based Methods
• Without SOC, these methods require the user's vigilance to ensure that the
solution obtained is a minimum rather than a maximum or a saddle point. A
simple way to verify this is to perturb the objective function through
perturbations in the design variables at the solution and verify that it is a local
minimum.

• This brings up an important property of these methods - they only find local
minima. Usually, the minimum found will be close to the design where the
iterations begin. Before concluding the design exploration, it is necessary to
execute the method from several starting points to discover whatever local minima
exist and select the best one by head-to-head comparison. The bulk of the
existing unconstrained and constrained optimization methods belong to this
category.

• Four methods are presented. The first is the Steepest Descent Method.
While this method is not used in practice, it provides an excellent example
for understanding the algorithmic principles for the gradient-based
techniques. The second is the conjugate gradient technique which is a
classical workhorse particularly in industry usage. The third and the fourth
belong to the category of Variable Metric Methods, or Quasi-Newton methods,
as they are also called.
Steepest Descent Method
• The gradient of a function at a point is the
direction of the most rapid increase in the value
of the function at that point. The descent
direction can be obtained by reversing the gradient
(or multiplying it by -1). The next step is
to regard the descent vector as a search
direction; after all, we are attempting to decrease
the function through successive iterations.
• SteepestDescent.m:
SteepestDescent.m
• For two variables it will draw the contour plot.
• For two variables, the design vector changes can be seen
graphically in slow motion with steps in different colour.
• The design variables, the function value, and the square of the
length of the gradient vector (called the KT value) at each iteration
are displayed in the Command window at completion of the number
of iterations.
• The gradient of the function is numerically computed using first
forward finite difference. The gradient computation is therefore
automatic.

Usage: SteepestDescent('Example6_1',[0.5 0.5],20,0.0001,0,1,20)


SteepestDescent.m
(figure: contour plot showing the steepest descent iterations)
SteepestDescent.m
• A close observation of the figure illustrates the
zigzag motion, which was referred to earlier with
respect to the univariate method. Good methods
are expected to overcome this pattern. In the
case of the univariate method this was achieved
with the pattern search method in the zero-order
family.

• An iteration breaking out of the zigzag pattern
(or preventing getting locked into one) is
necessary to improve the method.
SteepestDescent.m
• The steepest descent method is woefully inadequate compared to Powell's
method, even though the latter is a zero-order method.

% Ch 6: Numerical Techniques for Unconstrained Optimization


% Optimization with MATLAB, Section 6.3.1
% Steepest Descent Method
% copyright (code) Dr. P.Venkataraman
%
% An m-file for the Steepest Descent Method
%************************************
% requires: UpperBound_nVar.m
% GoldSection_nVar.m
% and the problem m-file: Example6_1.m
% the following information are passed to the function
% the name of the function 'functname'
% functname.m : returns scalar for vector input
% the gradient calculation is in gradfunction.m
% gradfunction.m: returns vector for vector input
SteepestDescent.m
% initial design vector dvar0
% number of iterations niter

%------for golden section


% the tolerance (for golden section) tol
%
%-------for upper bound calculation
% the initial value of stepsize lowbound
% the incremental value intvl
% the number of scanning steps ntrials
%
% the function returns the final design and the objective function

% sample calling statement
% SteepestDescent('Example6_1',[0.5 0.5],20,0.0001,0,1,20)


%
SteepestDescent.m
function ReturnValue = SteepestDescent(functname, ...
dvar0,niter,tol,lowbound,intvl,ntrials)
clf % clear figure
e3 = 1.0e-08; nvar = length(dvar0); % length of design vector or number of
% variables obtained from start vector
if (nvar == 2)
%*******************
% plotting contours
%*******************
% for 2 var problems the contour plot can be drawn
x1 = 0:0.1:5;x2 = 0:0.1:5;
x1len = length(x1);x2len = length(x2);
for i = 1:x1len;
for j = 1:x2len;
x1x2 =[x1(i) x2(j)];
fun(j,i) = feval(functname,x1x2);
end
end
SteepestDescent.m
c1 = contour(x1,x2,fun, ...
[3.1 3.25 3.5 4 6 10 15 20 25],'k');
%clabel(c1); % remove labelling to mark iteration
grid
xlabel('x_1')
ylabel('x_2')
% replacing _ by - in the function name
funname = strrep(functname,'_','-');
% adding file name to the title
title(strcat('Steepest Descent Using :',funname));
%*************************
% finished plotting contour
%*************************
% note that contour values are problem dependent
% the range is problem dependent
end
SteepestDescent.m
%*********************
% Numerical Procedure
%*********************
% design vector, alpha , and function value is stored
xs(1,:) = dvar0;
x = dvar0;
Lc = 'r';
fs(1) = feval(functname,x); % value of function at start
as(1)=0;
s = -(gradfunction(functname,x)); % steepest descent
convg(1)=s*s';
for i = 1:niter-1
% determine search direction
output = GoldSection_nVar(functname,tol,x, s,lowbound,intvl,ntrials);
as(i+1) = output(1);
fs(i+1) = output(2);
for k = 1:nvar
xs(i+1,k)=output(2+k);
x(k)=output(2+k);
end
s = -(gradfunction(functname,x)); % steepest descent
convg(i+1)=s*s';
%***********
% draw lines
%************
if (nvar == 2)
line([xs(i,1) xs(i+1,1)],[xs(i,2) xs(i+1,2)],'LineWidth',2, 'Color',Lc)
itr = int2str(i);
x1loc = 0.5*(xs(i,1)+xs(i+1,1));
x2loc = 0.5*(xs(i,2)+xs(i+1,2));
%text(x1loc,x2loc,itr); % writes iteration number on the line
if strcmp(Lc,'r')
Lc = 'k';
else
Lc = 'r';
end
pause(1)
%***********************
% finished drawing lines
%***********************
end
if(convg(i+1)<= e3) break; end; % convergence criteria

%***************************************
% complete the other stopping criteria
%****************************************
end
len=length(as);
%for kk = 1:nvar
designvar=xs(length(as),:);

fprintf('The problem: '),disp(functname)
fprintf('\n - The design vector, function value and KT value \nduring the iterations\n')
disp([xs fs' convg'])
ReturnValue = [designvar fs(len)];
gradfunction.m
function Return = gradfunction(functname,x)
% numerical computation of gradient
% this allows automatic gradient computation
% first forward finite difference
% hstep = 0.001; - programmed in
hstep = 0.001;
n = length(x);
f = feval(functname,x);
for i = 1:n
xs = x;
xs(i) = xs(i) + hstep;
gradx(i)= (feval(functname,xs) -f)/hstep;
end
Return = gradx;
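The pieces of the method (numerical gradient, step bracketing, golden-section line search, descent loop) can be combined in a self-contained Python sketch. It assumes Example6_1 is the quadratic f = 3 + (x1 - 1.5x2)² + (x2 - 2)² quoted earlier (the contour levels in the m-file suggest this), whose minimum is 3 at (3, 2):

```python
import math

def grad_fd(f, x, h=1.0e-6):
    # first forward finite difference, as in gradfunction.m
    fx = f(x)
    g = []
    for i in range(len(x)):
        xs = x[:]
        xs[i] += h
        g.append((f(xs) - fx) / h)
    return g

def golden(phi, aL, aU, tol=1.0e-6):
    tau = (math.sqrt(5.0) - 1.0) / 2.0
    a1, a2 = aL + (1.0 - tau)*(aU - aL), aL + tau*(aU - aL)
    f1, f2 = phi(a1), phi(a2)
    while aU - aL > tol:
        if f1 >= f2:
            aL, a1, f1 = a1, a2, f2
            a2 = aL + tau*(aU - aL); f2 = phi(a2)
        else:
            aU, a2, f2 = a2, a1, f1
            a1 = aL + (1.0 - tau)*(aU - aL); f1 = phi(a1)
    return 0.5*(aL + aU)

def steepest_descent(f, x, niter=100, eps=1.0e-10):
    for _ in range(niter):
        s = [-gi for gi in grad_fd(f, x)]          # steepest descent direction
        if sum(si*si for si in s) <= eps:          # the "KT value" |grad f|^2
            break
        phi = lambda a: f([xi + a*si for xi, si in zip(x, s)])
        hi = 1.0                                   # bracket the step by doubling
        while phi(2.0*hi) < phi(hi):
            hi *= 2.0
        a = golden(phi, 0.0, 2.0*hi)
        x = [xi + a*si for xi, si in zip(x, s)]
    return x

f = lambda x: 3.0 + (x[0] - 1.5*x[1])**2 + (x[1] - 2.0)**2
xstar = steepest_descent(f, [0.5, 0.5])
```

Even with the zigzag behaviour noted above, the near-exact line search brings the iterates to the neighborhood of (3, 2) well within the iteration budget.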