

MA5171 Metoda Optimasi Lanjut


Task 2 : Nonlinear Least Square Problem
By : Nicholas Malvin / 20119020

Abstract
In this task we use optimization methods, namely line-search methods and the Levenberg-Marquardt
method, for nonlinear least-squares regression. Within the line-search framework we use three ways
to compute the descent direction: steepest descent, Hybrid I (Steepest-Newton), and Gauss-Newton,
while the step size must satisfy the strong Wolfe conditions. We also use the Dogleg,
Levenberg-Marquardt, and Hybrid II (LM-QN) methods. Finally, we compare
these methods based on their computational speed (number of iterations) and accuracy.
Keywords : Nonlinear least squares, Line search methods, Levenberg-Marquardt.

I. Introduction
1.1 The Objective Function (SSE)
Before we talk about the objective function, we begin by introducing the data. Here, we
generate the data using MATLAB (see Appendix B) by creating an array of independent
variables $t_i$ and dependent variables $y_i$, where

$$y_i = 10e^{t_i} - 5e^{2t_i} + \epsilon_i, \qquad \epsilon_i \sim N(0, 0.03), \qquad i = 1, 2, \ldots, 30.$$

Figure 1. The data $(t_i, y_i)$, $i = 1, \ldots, 30$, generated by MATLAB.


The error terms $\epsilon_i$ are randomly generated from a normal distribution with parameters
$\mu = 0$ and $\sigma^2 = 0.03$, so that the data $(t_i, y_i)$ are scattered along the curve
$y = 10e^{t} - 5e^{2t}$ with errors $\epsilon_i$. Furthermore, the array of $t_i$'s is generated such that


$$t_1 = -2, \quad t_2 = -2 + \Delta t, \quad \ldots, \quad t_i = -2 + (i-1)\Delta t, \quad \ldots, \quad t_m = 0.5 - \Delta t.$$


We choose $\Delta t = 2.5/30$, so that the number of data points is $m = 30$. The generated data can
be seen in Figure 1, and the data array $y_1, \ldots, y_m$ is saved as data_v2.mat, so that exactly
the same data (the $y_i$ are not re-randomized) are used for all methods later.
Now, consider the following functions:

$$F(x) = \frac{1}{2}\sum_{j=1}^{m} f_j(x)^2, \qquad f_j(x) = y_j - M(x, t_j), \quad j = 1, \ldots, m, \quad x \in \mathbb{R}^4,$$

where $M$ is called the model function, chosen in this case as $M(x, t) = x_3 e^{x_1 t} + x_4 e^{x_2 t}$.
Of course, this is a nonlinear model, since the parameters $x_1, x_2$ appear in the exponents.
Moreover, $F$ is the objective function, which is half of the sum of squared errors (SSE), with the
$f_j$ being the errors (residuals). Note that the term "error" for the $f_j$'s is completely
different from the $\epsilon_i$'s. We seek the best parameters $x_1, \ldots, x_4$ which give a
minimum value of $F$, and therefore we call this a nonlinear least-squares problem.
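As a quick illustration, the model, the residuals, and the objective can be evaluated in a few vectorized MATLAB lines (a minimal sketch only; the loop-based helper functions actually used by the solvers are listed in Appendix A):

% Model M(x,t) = x3*exp(x1*t) + x4*exp(x2*t), residuals f_j(x) = y_j - M(x,t_j),
% and objective F(x) = 0.5*sum_j f_j(x)^2. Vectorized sketch; t and y are row vectors.
M = @(x,t) x(3)*exp(x(1)*t) + x(4)*exp(x(2)*t);
f = @(x,t,y) (y - M(x,t)).';               % m-by-1 residual vector
F = @(x,t,y) 0.5*(f(x,t,y).'*f(x,t,y));    % half of the SSE

% Example: at the parameters used to generate the (noise-free) data, F is zero.
t = -2:2.5/30:0.5-2.5/30;
y = 10*exp(t) - 5*exp(2*t);                % illustration only, without the noise term
F([1; 2; 10; -5], t, y)                    % returns (numerically) 0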
1.2 Derivatives of F
Before we discuss each of the numerical methods used to find the minimizer, we first
compute the gradient $\nabla F$, the Jacobian $J$, and the Hessian $\nabla^2 F$ for later use in some of the
methods. Denoting $x = (x_1, \ldots, x_4)^T$, we have

$$F_{x_1} = \sum_{j=1}^{m} F_1(j)\, f_j(x), \quad F_{x_2} = \sum_{j=1}^{m} F_2(j)\, f_j(x), \quad F_{x_3} = \sum_{j=1}^{m} F_3(j)\, f_j(x), \quad F_{x_4} = \sum_{j=1}^{m} F_4(j)\, f_j(x),$$

where the subscripts denote partial derivatives and

$$F_1(j) = -x_3 t_j e^{x_1 t_j}, \quad F_2(j) = -x_4 t_j e^{x_2 t_j}, \quad F_3(j) = -e^{x_1 t_j}, \quad F_4(j) = -e^{x_2 t_j}.$$

Then we can write

$$\nabla F = \begin{pmatrix} F_{x_1} \\ F_{x_2} \\ F_{x_3} \\ F_{x_4} \end{pmatrix}, \qquad J = \begin{pmatrix} F_1(1) & F_2(1) & F_3(1) & F_4(1) \\ \vdots & \vdots & \vdots & \vdots \\ F_1(m) & F_2(m) & F_3(m) & F_4(m) \end{pmatrix}.$$
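Since each $F_{x_i} = \sum_j F_i(j)\, f_j(x)$ and row $j$ of $J$ collects $(F_1(j), \ldots, F_4(j))$, the gradient can be written compactly as $\nabla F = J^T \bar f$ with $\bar f = (f_1(x), \ldots, f_m(x))^T$; this is the form used by the Levenberg-Marquardt method in Section 3.3. A quick numerical check of this identity, as a sketch that reuses only the data-generation line of Appendix B (noise-free here, purely for illustration):

% Check that grad F = J'*fbar for the model M(x,t) = x3*exp(x1*t) + x4*exp(x2*t).
t = -2:2.5/30:0.5-2.5/30;                  % same abscissae as Appendix B
y = 10*exp(t) - 5*exp(2*t);                % noise-free data, for illustration only
x = [2; 3; 15; -7];                        % the starting point used later
fbar = (y - x(3)*exp(x(1)*t) - x(4)*exp(x(2)*t)).';          % m-by-1 residuals
J = [-x(3)*t.'.*exp(x(1)*t.'), -x(4)*t.'.*exp(x(2)*t.'), ...
     -exp(x(1)*t.'), -exp(x(2)*t.')];                        % m-by-4 Jacobian
g = J.'*fbar;              % matches the component formulas F_{x1},...,F_{x4} above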
Furthermore,

$$F_{x_1 x_1} = \sum_{j=1}^{m} \Big[ F_1(j)^2 + f_j(x)\big({-x_3 t_j^2 e^{x_1 t_j}}\big) \Big], \qquad F_{x_2 x_2} = \sum_{j=1}^{m} \Big[ F_2(j)^2 + f_j(x)\big({-x_4 t_j^2 e^{x_2 t_j}}\big) \Big],$$

$$F_{x_3 x_3} = \sum_{j=1}^{m} e^{2 x_1 t_j}, \qquad F_{x_4 x_4} = \sum_{j=1}^{m} e^{2 x_2 t_j}, \qquad F_{x_3 x_4} = \sum_{j=1}^{m} e^{(x_1 + x_2) t_j},$$

$$F_{x_1 x_2} = \sum_{j=1}^{m} F_1(j) F_2(j), \qquad F_{x_1 x_3} = \sum_{j=1}^{m} \Big[ x_3 t_j e^{2 x_1 t_j} - f_j(x)\, t_j e^{x_1 t_j} \Big],$$

$$F_{x_1 x_4} = \sum_{j=1}^{m} F_1(j) F_4(j), \qquad F_{x_2 x_3} = \sum_{j=1}^{m} F_2(j) F_3(j), \qquad F_{x_2 x_4} = \sum_{j=1}^{m} \Big[ x_4 t_j e^{2 x_2 t_j} - f_j(x)\, t_j e^{x_2 t_j} \Big],$$

so that we can write the Hessian of $F$ as

$$\nabla^2 F = [H_{ij}], \qquad H_{ij} = F_{x_i x_j}, \qquad (i, j) \in \{1,2,3,4\} \times \{1,2,3,4\}.$$

With only four variables we already have to compute ten second-order derivatives just to assemble the
Hessian matrix. This is why Newton's method, which uses the Hessian in its computations, is not
really preferable here; the cost of the Hessian may be counted as a disadvantage.
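Because assembling these ten expressions by hand is error-prone, a finite-difference comparison is a cheap sanity check. The following sketch assumes the helper functions gf and hess from Appendix A, (A.1) are on the MATLAB path and compares the analytic Hessian with a central-difference Hessian of the gradient:

% Finite-difference spot check of the analytic Hessian entries above (sketch).
t = -2:2.5/30:0.5-2.5/30;  y = 10*exp(t) - 5*exp(2*t);   % illustration data
x = [2; 3; 15; -7];  h = 1e-6;  Hfd = zeros(4);
for i = 1:4
    e = zeros(4,1); e(i) = h;
    Hfd(:,i) = (gf(x+e,t,y) - gf(x-e,t,y)) / (2*h);      % central difference of grad F
end
max(max(abs(Hfd - hess(x,t,y))))                          % should be small relative to the Hessian entries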
1.3 General Computation in MATLAB
Once again, our goal is to fit the model to the saved data (see Appendix B). We use
exactly the same data for all methods, so that we can fairly compare the effectiveness of
each method. We divide the methods into two categories. The first category contains the
line-search methods (steepest descent, Hybrid I (Steepest-Newton), Gauss-Newton), whose
complete code is given in Appendix A, (A.1). The remaining methods (Dogleg,
Levenberg-Marquardt, Hybrid II (LM-QN)) form the second category, and their code is given in Appendix A, (A.2).
We choose the same starting point (initial point) for all methods, namely $x_0 =
[2, 3, 15, -7]^T$. We choose this starting point so that all of the chosen methods succeed in
converging to a minimum; note that some starting points may lead to failure for some methods, which
will be discussed later. Furthermore, for each method we record the point of convergence $x^*$ as the
model's parameters, along with the value of $F(x^*)$ and the number of iterations.

II. Line Search Methods


2.1 Computation using MATLAB
As in Appendix A, (A.1), we run all three line-search methods (steepest descent,
Hybrid I, Gauss-Newton) in the same script for ease of comparison. We denote the
methods by numbers as follows:

 Method 1 := Steepest Descent


 Method 2 := Hybrid I (Steepest-Newton)
 Method 3 := Gauss-Newton


We then plot the fitted curves from these three methods in the same figure.

The termination condition for all methods is $\|\nabla F\|_2 < 10^{-3}$, i.e. the 2-norm of the gradient
is close to zero. The step direction at iteration $k$ is denoted $p_k$ (pk in the code), with step size
$\alpha_k$ (alpha in the code). The number of iterations is saved in the variable iter, the parameters
$x^*$ are saved as Xmin, and $F(x^*)$ is saved as Fmin. These variables are then collected into the
arrays param1, param2, param3.

The step size $\alpha_k$ is computed using Algorithms 3.2 and 3.3 in [1], pp. 59-60, which enforce the
strong Wolfe conditions with $c_1 = 10^{-4}$, $c_2 = 0.9$. The interpolation part of the zoom
function uses successive quadratic interpolation, coded as the MATLAB function
quadmin (see Appendix A).
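For reference, the strong Wolfe conditions checked by the code are, in terms of $\varphi(\alpha) = F(x_k + \alpha p_k)$ (phi in the code) and $\varphi'(\alpha) = \nabla F(x_k + \alpha p_k)^T p_k$ (dphi in the code),

$$\varphi(\alpha_k) \le \varphi(0) + c_1 \alpha_k \varphi'(0), \qquad |\varphi'(\alpha_k)| \le c_2\, |\varphi'(0)|,$$

the first inequality being the sufficient-decrease (Armijo) condition and the second the curvature condition. Since $p_k$ is a descent direction we have $\varphi'(0) < 0$, which is why the code compares $|\varphi'(\alpha)|$ against $-c_2\,\varphi'(0)$.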

Figure 2. The resulting fitted curves along with the data, using the steepest descent method
(blue), the Hybrid I method (orange), and the Gauss-Newton method (yellow).

2.2 Steepest Descent Method


The steepest descent method simply chooses the direction in which $F$ decreases most rapidly,
$p_k = -\nabla F_k$, so that $\nabla F_k^T p_k = -\|\nabla F_k\|^2 < 0$, where $p_k$ is the step direction.
Although this is the simplest method to implement, it has various disadvantages. Applying this
method to our problem yields the following results:

 Fitted curve : $y(t) = 13.4396e^{1.0994t} - 8.4729e^{1.7459t}$
 Number of iterations : 298
 SSE : $F(x^*) = 0.0164$

The fitted curve can be seen in Figure 2 as the thick blue line. The small SSE is a good indication
that the process converged to a minimum, yet it is too early to say whether this SSE is good enough
compared to the other methods. This very simple method evidently sacrifices computational speed,
since it requires a large number of iterations.


2.3 Hybrid I (Steepest-Newton) Method


The Hybrid I method combines the steepest descent method with Newton's method. When
$x_k$ is far from a minimum, the Hessian matrix needed for the Newton direction
$p_k = -(\nabla^2 F_k)^{-1} \nabla F_k$ does not guarantee a descent direction, since the Hessian may
fail to be positive definite. Therefore, whenever such a situation occurs we simply replace
the Newton direction with the steepest descent direction (a condensed sketch of this selection is
given at the end of this subsection); from the previous task we saw that the steepest descent
direction works well during the early steps, when $x_k$ is not yet close to a minimum. As the iterates
approach the minimum, we can safely use the Newton direction, for which local convergence is
guaranteed. Implementing this idea is not really difficult, and the results are as follows:
 Fitted curve : $y(t) = 9.3392e^{0.9732t} - 4.3518e^{2.0761t}$
 Number of iterations : 119
 SSE : $F(x^*) = 0.0111$
The fitted curve can be seen in Figure 2 as the orange line. Here we obtain a smaller SSE than with
the previous method, which means the resulting parameters $x_1, \ldots, x_4$ are more desirable. The
number of iterations is also better than for steepest descent, but the implementation requires the
laborious computation of the second derivatives in the Hessian matrix. It is also worth noting that
convergence of the Hybrid I method is guaranteed, unlike for the plain Newton method, which is the
reason we do not discuss Newton's method directly here.
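The direction selection referred to above can be condensed as follows (a sketch only; the script in Appendix A, (A.1) uses an equivalent eigenvalue test, whereas here a Cholesky factorization is used to test positive definiteness):

% Hybrid I direction: Newton step if the Hessian is positive definite,
% otherwise fall back to steepest descent. hess and gf are the helper
% functions from Appendix A, (A.1); xk is the current iterate.
H = hess(xk, t, y);
g = gf(xk, t, y);
[~, notPD] = chol(H);      % second output of chol is nonzero if H is not positive definite
if notPD
    pk = -g;               % steepest descent direction
else
    pk = -H \ g;           % Newton direction
end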
2.4 Gauss-Newton Method
The principle of the Gauss-Newton method is not so different from Newton's. However,
instead of computing the exact Hessian of $F$, we approximate it by $J^T J$, where $J$
denotes the Jacobian of $(f_1(x), \ldots, f_m(x))^T$. The descent direction is therefore
written as $p_k = -(J_k^T J_k)^{-1} \nabla F_k$. This approximation saves us some hard computation,
since we no longer need the exact Hessian. However, when $x_0$ has elements with equal values
(for example $x_1 = x_2$, which makes two columns of $J$ identical), this method may run into
problems, since the matrix $J^T J$ can then be singular; we may take this as a disadvantage of the
method. The results of this method are as follows:
 Fitted curve : $y(t) = 9.3386e^{0.9732t} - 4.3512e^{2.0762t}$
 Number of iterations : 26
 SSE : $F(x^*) = 0.0111$
The fitted curve can be seen in Figure 2 as the yellow dashes. The parameters and SSE
do not differ much from the Hybrid I method. However, the number of iterations is much smaller
than for both previous line-search methods, and we may conclude that, among the line-search
methods used here, this one is the best.
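Computationally, one Gauss-Newton direction amounts to a single linear solve; a sketch (assuming the helper functions J and gf from Appendix A, (A.1), and using MATLAB's backslash instead of an explicit inverse):

% One Gauss-Newton direction: solve (Jk'*Jk) pk = -grad F(xk) = -Jk'*fbar.
Jk = J(xk, t);                     % m-by-4 Jacobian at the current iterate
pk = -(Jk.'*Jk) \ gf(xk, t, y);    % Gauss-Newton direction
% Equivalently, pk = -(Jk \ fbar), i.e. the linear least-squares solution
% with fbar the residual vector (fvec in Appendix A, (A.2)).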


III. Other Methods


3.1 Computation using MATLAB
As in Appendix A, (A.2), we run the Dogleg method, the Levenberg-Marquardt method,
and the Hybrid II (LM-QN) method in the same script for ease of comparison. We denote the
methods by numbers as follows:

 Method 1 := Dogleg
 Method 2 := Levenberg-Marquardt
 Method 3 := Hybrid II (LM-QN)
We then plot the fitted curves from these three methods in the same figure (see Figure
3). The number of iterations is saved in the variable iter, the parameters $x^*$ are saved
as Xmin, and $F(x^*)$ is saved as Fmin. These variables are then collected into the arrays
param1, param2, param3.

Figure 3. The resulting fitted curves along with the data, using the Dogleg method (blue),
the Levenberg-Marquardt method (orange), and the Hybrid II method (yellow).

3.2 Dogleg Method


The dogleg method combines Gauss-Newton steps with steepest descent steps, and is one of the
trust-region methods. However, we cannot display the steps together with their trust regions as in
the previous task, since we are now working in the four-dimensional space $\mathbb{R}^4$. The dogleg
method requires us to compute the Jacobian, as in Gauss-Newton. Using Algorithm 3.21 in [2], p. 32,
we choose the parameters $\varepsilon_1 = \varepsilon_2 = \varepsilon_3 = 0.01$, $\Delta_0 = 1$, which
results in the following (a sketch of the dogleg step itself is given at the end of this subsection):
 Fitted curve : $y(t) = 9.3225e^{0.9744t} - 4.335e^{2.0794t}$
 Number of iterations : 32
 SSE : $F(x^*) = 0.0115$


The fitted curve can be seen in Figure 3 as the thick blue line. Compared to steepest
descent and the Hybrid I method, this method is clearly superior in terms of
computational speed, while its SSE differs only slightly from 0.0111, not by enough to be
considered significantly different. However, Gauss-Newton needs fewer iterations, so we would
still prefer the Gauss-Newton method for this case. Note that this is only a rough comparison,
and it is hard to say which method is better (GN or Dogleg), since choosing
different parameters for the dogleg method may give a different number of iterations, and
perhaps a different SSE. We therefore cannot really conclude which method is better (GN or
Dogleg), but we may say that the dogleg method is quite good, since its number of iterations is
quite small and the computations are not difficult. Its disadvantage, however, lies in the choice
of the parameters, where a bad choice can sometimes give a bad result.
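The dogleg step itself interpolates between the steepest-descent (Cauchy) step and the Gauss-Newton step inside the trust region. Below is a sketch in the notation of Algorithm 3.21 in [2], assuming the gradient g = J'*fbar, the Jacobian Jk, the Gauss-Newton step hgn, and the trust-region radius D are given; the loop in Appendix A, (A.2) implements the same case distinction:

% Dogleg step selection inside a trust region of radius D (sketch).
hsd = -g;                                   % steepest-descent direction
alp = (g.'*g) / norm(Jk*g)^2;               % step length to the Cauchy point
a   = alp*hsd;                              % Cauchy step
if norm(hgn) <= D
    hdl = hgn;                              % Gauss-Newton step fits inside the region
elseif norm(a) >= D
    hdl = (D/norm(hsd))*hsd;                % truncated steepest-descent step
else
    b = hgn;  c = a.'*(b - a);  d2 = (b - a).'*(b - a);
    % beta in (0,1) chosen so that ||a + beta*(b - a)|| = D,
    % written in the cancellation-free form used in [2]
    if c <= 0
        beta = (-c + sqrt(c^2 + d2*(D^2 - a.'*a))) / d2;
    else
        beta = (D^2 - a.'*a) / (c + sqrt(c^2 + d2*(D^2 - a.'*a)));
    end
    hdl = a + beta*(b - a);                 % point on the dogleg path with norm D
end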
3.3 Levenberg-Marquardt (LM) Method
One of the damped methods is the Levenberg-Marquardt (LM) method. Its step $h_{lm}$ is determined by

$$(J^T J + \mu I)\, h_{lm} = -J^T \bar f, \qquad \bar f = (f_1(x), \ldots, f_m(x))^T.$$

The term $J^T J + \mu I$ approximates the Hessian, while $J^T \bar f$ is exactly $\nabla F$. When $\mu$ is large,
the LM step is close to a (short) steepest descent step, while a small value of $\mu$ gives a step close
to the Gauss-Newton step; we may therefore see this method as somewhat of a middle ground between
steepest descent and Gauss-Newton. Implementing Algorithm 3.16 in [2], p. 27, with parameters
$\varepsilon_1 = \varepsilon_2 = 0.01$, $\tau = 10^{-6}$, yields the following results:

 Fitted curve : $y(t) = 9.3326e^{0.9729t} - 4.3451e^{2.077t}$
 Number of iterations : 8
 SSE : $F(x^*) = 0.0111$
The fitted curve can be seen in Figure 3 as the orange line. The SSE is the same as for
Gauss-Newton, which is the best value so far. Furthermore, only 8 iterations are needed, which is
very efficient compared to all the other methods discussed so far. It would be easy to declare the
LM method the best one compared to the previous ones. However, the Hybrid II method can, in
principle, provide an even better convergence rate.
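One LM iteration consists of a damped linear solve followed by a gain-ratio test that adapts $\mu$. Below is a sketch of the update of Algorithm 3.16 in [2], assuming the helper functions J, fvec, and F from Appendix A, (A.2), with x the current iterate, mu the damping parameter, and v the growth factor (initially 2):

% One Levenberg-Marquardt iteration (sketch).
A = J(x,t).'*J(x,t);   g = J(x,t).'*fvec(x,t,y);
hlm = -(A + mu*eye(4)) \ g;                                % damped Gauss-Newton step
xnew = x + hlm;
r = (F(x,t,y) - F(xnew,t,y)) / (0.5*hlm.'*(mu*hlm - g));   % gain ratio
if r > 0                                                   % step accepted
    x = xnew;
    mu = mu*max(1/3, 1 - (2*r - 1)^3);  v = 2;             % decrease the damping
else                                                       % step rejected
    mu = mu*v;  v = 2*v;                                   % increase the damping
end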

3.4 Hybrid II (LM-QN) Method


The last method we discuss is a hybrid of the LM method with a Quasi-Newton (QN) method,
namely BFGS. The reason for combining these two is to achieve an even better
convergence rate (Quasi-Newton has superlinear convergence). In principle, we simply
switch between the two methods from one iteration step to the next, subject to certain conditions;
the switching mechanism is described in [2], pp. 35-36. Implementing the algorithm (see [2], pp.
36-38) yields:
 Fitted curve : $y(t) = 9.3326e^{0.9729t} - 4.3451e^{2.077t}$
 Number of iterations : 8


 SSE : $F(x^*) = 0.0111$


This is exactly the same result as for the LM method, and the fitted curve is represented by the
yellow dashes in Figure 3. This is possible when the method never switches at all, i.e. every
iteration uses the LM step. In this case, since there are only 8 iterations, such behaviour is highly
plausible. In theory, however, if we chose a different initial point we might encounter some steps
that use the QN method.
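For completeness, the QN part maintains a matrix $B$ (initialized to the identity) that is updated with the BFGS formula whenever the curvature condition $h^T y > 0$ holds, with $h = x_{new} - x$ and $y = J_{new}^T J_{new} h + (J_{new} - J)^T \bar f(x_{new})$, as in the code of Appendix A, (A.2):

$$B_{new} = B + \frac{y\, y^T}{h^T y} - \frac{(B h)(B h)^T}{h^T B h}.$$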

IV. Conclusions
We finish this report with the following conclusions:
 All of the methods discussed (steepest descent, Hybrid I, Gauss-Newton, Dogleg,
LM, Hybrid II) can be used to solve the nonlinear least-squares problem.
 Among all methods, the smallest value obtained for $F$ is 0.0111, with the minimizing
parameters $x_1, \ldots, x_4$ of the model $M$ close to the parameters used to generate
the data.
 The implementations of all methods are given in Appendix A.
 Based on the above discussion, the best methods are LM and Hybrid II.

References
[1] J. Nocedal and S. J. Wright, Numerical Optimization, Springer, 1999.
[2] K. Madsen and H. B. Nielsen, Methods for Non-Linear Least Squares Problems,
IMM, DTU, 2004.


Appendix A
A.1 Complete MATLAB code for Line-search methods :
% Nonlinear Least Square (fitting) with Line Search
% Set model to M(x1...x4,t)=x3*exp(x1*t)+x4*exp(x2*t)
% Created by : Nicholas Malvin 20119020
clear all; clc; close all;

%% Generate data (t,y)


t=-2:2.5/30:0.5-2.5/30;
y=importdata('data_v2.mat');

%% Line Search Procedure


%Line Search Parameters
a0=0; a1=0.9; amax=1;
c1=10^-4; c2=0.9; maxiter=2000;

for method=1:3 %1(Steepest)/2(SN_Hybrid)/3(Gauss-Newton)

%Initial guess point


x0=[2;3;15;-7];

for k=1:maxiter
xk=x0;
X(1,k)=xk(1);X(2,k)=xk(2);X(3,k)=xk(3);X(4,k)=xk(4);
if Norm(gf(xk,t,y))<10^-3
break
end

%Descent direction pk
if method==1 % Steepest
pk=-gf(xk,t,y);
end
if method==2 % Hybrid(Stee-Newt)
H=hess(xk,t,y);E=eig(H);
if (E(1)<=0 || E(2)<=0) || (E(3)<=0 || E(4)<=0)
pk=-gf(xk,t,y);
else
pk=-H^(-1)*gf(xk,t,y);
end
end
if method==3 % Gauss-Newton
H=J(xk,t)'*J(xk,t);
pk=-H^(-1)*gf(xk,t,y);
end

%Determine step size (Wolfe Condition)


for j=1:maxiter
if j==maxiter
alpha=amax;
end
if phi(a1,xk,pk,t,y)>phi(0,xk,pk,t,y)+c1*a1*dphi(0,xk,pk,t,y) ...
|| (phi(a1,xk,pk,t,y)>=phi(a0,xk,pk,t,y) && j>1)
alpha=zoom(a0,a1,xk,pk,c1,c2,t,y); break
end


if abs(dphi(a1,xk,pk,t,y))<=-c2*dphi(0,xk,pk,t,y)
alpha=a1; break
end
if dphi(a1,xk,pk,t,y)>=0
alpha=zoom(a1,a0,xk,pk,c1,c2,t,y); break
end
a0=a1;
a1=(a1+amax)/2;
end
x0=x0+alpha*pk;
end
Xmin=x0; % minimizer
Fmin=F(Xmin,t,y); % value of f at minimizer
iter=k; % Number of iterations

%Plotting regression
x1=Xmin(1);x2=Xmin(2);x3=Xmin(3);x4=Xmin(4);
figure(2)
hold on
tplot=-2:2.5/100:0.5;
if method==1
plot(tplot,x3*exp(x1*tplot)+x4*exp(x2*tplot),'color',[0, 0.447, 0.741],'linewidth',9)
param1=[x1 x2 x3 x4 Fmin iter];
end
if method==2
plot(tplot,x3*exp(x1*tplot)+x4*exp(x2*tplot),'color',[0.8500, 0.3250, 0.0980],'linewidth',5)
param2=[x1 x2 x3 x4 Fmin iter];
end
if method==3
plot(tplot,x3*exp(x1*tplot)+x4*exp(x2*tplot),'-.','color',[0.9290, 0.6940, 0.1250],'linewidth',2)
plot(t,y,'o','MarkerFaceColor','w','MarkerEdgeColor','k','MarkerSize',8,'linewidth',1);
param3=[x1 x2 x3 x4 Fmin iter];
end
axis([-3 1 0.9*min(y) 1.1*max(y)]);
Leg=legend('Steepest','Hybrid(Stee-Newt)','Gauss-newton','Data');
set(Leg,'location','NorthWest');
end

%% Functions
function z=f(j,x,t,y)
z=y(j)-x(3)*exp(x(1)*t(j))-x(4)*exp(x(2)*t(j));
end

function Ja=J(x,t) %Jacobian


Ja=[-x(3)*t(1)*exp(x(1)*t(1)) -x(4)*t(1)*exp(x(2)*t(1)) -exp(x(1)*t(1)) -exp(x(2)*t(1))];
for k=2:length(t)
j=[-x(3)*t(k)*exp(x(1)*t(k)) -x(4)*t(k)*exp(x(2)*t(k)) -exp(x(1)*t(k)) -exp(x(2)*t(k))];
Ja=[Ja;j];
end
end

function z=F(x,t,y) %Objective function (SSE)


z=0;
for j=1:length(t)
z=z+(f(j,x,t,y))^2;
end
z=0.5*z;
end

function z=gf(x,t,y) %Gradient of F


x1=x(1);x2=x(2);x3=x(3);x4=x(4);
z1=0;z2=0;z3=0;z4=0;
for j=1:length(t)
z1=z1+f(j,x,t,y)*(-x3*t(j))*exp(x1*t(j));
z2=z2+f(j,x,t,y)*(-x4*t(j))*exp(x2*t(j));
z3=z3-f(j,x,t,y)*exp(x1*t(j));
z4=z4-f(j,x,t,y)*exp(x2*t(j));
end
z=[z1;z2;z3;z4];
end

function h=hess(x,t,y) %Hessian of F


x1=x(1);x2=x(2);x3=x(3);x4=x(4);
h11=0;h22=0;h33=0;h44=0;h12=0;h13=0;h14=0;h23=0;h24=0;h34=0;
for j=1:length(t)
h11=h11+(x3*t(j)*exp(x1*t(j)))^2+f(j,x,t,y)*(-x3*t(j)^2*exp(x1*t(j)));
h22=h22+(x4*t(j)*exp(x2*t(j)))^2+f(j,x,t,y)*(-x4*t(j)^2*exp(x2*t(j)));
h33=h33+exp(2*x1*t(j));
h44=h44+exp(2*x2*t(j));
h12=h12+x3*x4*t(j)^2*exp(x1*t(j)+x2*t(j));
h13=h13+x3*t(j)*exp(2*x1*t(j))+f(j,x,t,y)*(-t(j)*exp(x1*t(j)));
h14=h14+x3*t(j)*exp(x1*t(j)+x2*t(j));
h23=h23+x4*t(j)*exp(x1*t(j)+x2*t(j));
h24=h24+x4*t(j)*exp(2*x2*t(j))+f(j,x,t,y)*(-t(j)*exp(x2*t(j)));
h34=h34+exp(x1*t(j)+x2*t(j));
end
h=[h11 h12 h13 h14;
h12 h22 h23 h24;
h13 h23 h33 h34;
h14 h24 h34 h44];
end

function p=phi(a,xk,pk,t,y) %phi(a)=f(xk+a*pk)


p=F(xk+a*pk,t,y);
end
function dp=dphi(a,xk,pk,t,y)
dp=gf(xk+a*pk,t,y)'*pk;
end

function N=Norm(x)
N=sqrt(x(1)^2+x(2)^2+x(3)^2+x(4)^2);
end

%SUCCESIVE QUADRATIC INTERPOLATION


function y=quadmin(a0,a1,xk,pk,t,y)
maxiter=100;
x=[a0 a1 (a0+a1)/2];
for j=1:3


P(j)=F(xk+x(j)*pk,t,y);
end
tol=0.001; %tolerance

for i=1:maxiter
p=polyfit(x,P,2);
xmin=-p(2)/2/p(1);
x=[x(2:end) xmin];
e=abs(x(3)-x(2))/abs(x(3));
if e<tol
break
end
for j=1:3
P(j)=F(xk+x(j)*pk,t,y);
end
end
y=xmin;
end

%ZOOM FUNCTION
function z = zoom(a0,a1,xk,pk,c1,c2,t,y)
for k=1:1000
if k==1000 %if zoom not converge, we safeguard the steplength as 1
z=1;
end
a=quadmin(a0,a1,xk,pk,t,y);
if phi(a,xk,pk,t,y)>phi(0,xk,pk,t,y)+c1*a*dphi(0,xk,pk,t,y) || phi(a,xk,pk,t,y)>=phi(a0,xk,pk,t,y)
a1=a;
else
if abs(dphi(a,xk,pk,t,y))<=-c2*dphi(0,xk,pk,t,y)
z=a; return
end
if dphi(a,xk,pk,t,y)*(a1-a0)>=0
a1=a0;
end
a0=a;
end
end
end


A.2 Complete MATLAB code for non-linesearch methods :


% Nonlinear Least Square (fitting) with Levenberg&Dogleg
% Set model to M(x1...x4,t)=x3*exp(x1*t)+x4*exp(x2*t)
% Created by : Nicholas Malvin 20119020
clear all; clc; close all;

%% Generate data (t,y)


t=-2:2.5/30:0.5-2.5/30;
y=importdata('data_v2.mat');

%% Applying Multiple methods


maxiter=1000;k=0;
for method=1:3 %1(Dogleg)/2(Lev-Mar)/3(Hybrid LM-QN)

%Initial guess point


x0=[2;3;15;-7];

if method==1 %DOGLEG with all eps1,eps2,eps3=0.01


x=x0;D=1;g=J(x,t)'*fvec(x,t,y);
found=(max(abs(fvec(x,t,y)))<=0.01)||(max(abs(g))<=0.01);
while found==0 && k<maxiter
k=k+1;
alp=(g'*g)/(g'*J(x,t)'*J(x,t)*g);
hsd=-alp*g; hgn=-inv(J(x,t)'*J(x,t))*J(x,t)'*fvec(x,t,y);

if Norm(hgn)<=D
hdl=hgn;
else if Norm(alp*hsd)>=D
hdl=(D/Norm(hsd))*hsd;
else
a=alp*hsd;b=hgn;c=a'*(b-a);
if c<=0
beta=(-c+sqrt(c^2+(Norm(b-a))^2*(D^2-a'*a)))/((b-a)'*(b-a));
else
beta=(D^2-a'*a)/(c+sqrt(c^2+(Norm(b-a))^2*(D^2-a'*a)));
end
hdl=alp*hsd+beta*(hgn-alp*hsd);
end
end

if Norm(hdl)<=0.01*(Norm(x)+0.01)
found=1;
else
xnew=x+hdl;
r=(F(x,t,y)-F(xnew,t,y))/(L([0;0;0;0],x,t,y)-L(hdl,x,t,y));
if r>0
x=xnew; g=J(x,t)'*fvec(x,t,y);
found=(max(abs(fvec(x,t,y)))<=0.001)||(max(abs(g))<=0.01);
end
if r>0.75
D=max(D,3*Norm(hdl));
else if r<0.25
D=D/2; found=(D<=0.01*(Norm(x)+0.01));


end
end
end
end
end

if method==2 %Levenberg-Marquardt
k=0; v=2; x=x0; A=J(x,t)'*J(x,t);
g=J(x,t)'*fvec(x,t,y); tau=10^-6;
found=(max(abs(g))<=0.01); mu=tau*max(diag(A));
I=diag(ones(1,4));
while found==0 && k<maxiter
k=k+1; hlm=-((A+mu*I)^(-1))*g;
if Norm(hlm)<=0.001*(Norm(x)+0.01)
found=1;
else
xnew=x+hlm;
r=(F(x,t,y)-F(xnew,t,y))/(0.5*hlm'*(mu*hlm-g));
if r>0
x=xnew; A=J(x,t)'*J(x,t); g=J(x,t)'*fvec(x,t,y);
found=(max(abs(g))<=0.01);
mu=mu*max(1/3,1-(2*r-1)^3); v=2;
else
mu=mu*v; v=2*v;
end
end
end
end

if method==3 %Hybrid (LM-QN)


k=0;maxiter=100;x=x0;
tau=10^-6; A0=J(x,t)'*J(x,t); mu=tau*max(max(diag(A0)));
I=diag(ones(1,4)); B=I;
found=(max(abs(gf(x,t,y)))<=0.01); met='LM'; count=0; v=2;
while found==0 && k<maxiter
k=k+1;
if met=='LM'
xnew=x; met='LM';
hlm=-((J(x,t)'*J(x,t)+mu*I)^(-1))*gf(x,t,y);
if Norm(hlm)<=0.01*(Norm(x)+0.01)
found=1;
else
xnew=x+hlm;
r=(F(x,t,y)-F(x+hlm,t,y))/(0.5*hlm'*(mu*hlm-J(x,t)'*fvec(x,t,y)));
if r>0
better=1; found=(max(abs(gf(xnew,t,y)))<=0.01);
mu=mu*max(1/3,1-(2*r-1)^3); v=2;
if max(abs(gf(xnew,t,y)))<0.02*F(xnew,t,y)
count=count+1;
if count==3
met='QN';
else
count=0;
end
else
count=0;
end


else
count=0; better=0; mu=mu*v; v=2*v;
end
end
else
xnew=x; met='QN'; better=0; hqn=-B*gf(x,t,y);
if Norm(hqn)<=0.01*(Norm(x)+0.01)
found=1;
else
xnew=x+hqn; % quasi-Newton trial step
better=(F(xnew,t,y)<F(x,t,y))||(F(xnew,t,y)<=(1+10^-6)*F(x,t,y) && max(abs(gf(xnew,t,y)))<max(abs(gf(x,t,y))));
if max(abs(gf(xnew,t,y)))>=max(abs(gf(x,t,y)))
met='LM';
end
end
end
h=xnew-x;Y=J(xnew,t)'*J(xnew,t)*h+(J(xnew,t)-J(x,t))'*fvec(xnew,t,y);
if h'*Y>0
V=B*h; B=B+(1/(h'*Y))*Y*Y'-(1/(h'*V)*V)*V';
end
if better==1
x=xnew;
end
end
end

Xmin=x; % minimizer
Fmin=F(Xmin,t,y); % value of f at minimizer
iter=k; % Number of iterations

%Plotting regression
x1=Xmin(1);x2=Xmin(2);x3=Xmin(3);x4=Xmin(4);
figure(2)
hold on
tplot=-2:2.5/100:0.5;
if method==1
plot(tplot,x3*exp(x1*tplot)+x4*exp(x2*tplot),'color',[0, 0.4470, 0.7410],'linewidth',9)
param1=[x1 x2 x3 x4 Fmin iter];
end
if method==2
plot(tplot,x3*exp(x1*tplot)+x4*exp(x2*tplot),'color',[0.8500, 0.3250, 0.0980],'linewidth',5)
param2=[x1 x2 x3 x4 Fmin iter];
end
if method==3
plot(tplot,x3*exp(x1*tplot)+x4*exp(x2*tplot),'-.','color',[0.9290, 0.6940, 0.1250],'linewidth',2)
plot(t,y,'o','MarkerFaceColor','w','MarkerEdgeColor','k','MarkerSize',8,'linewidth',1);
param3=[x1 x2 x3 x4 Fmin iter];
end
axis([-3 1 0.9*min(y) 1.1*max(y)]);
Leg=legend('Dogleg','Lev-Mar','Hybrid(LM-QN)','Data');
set(Leg,'location','NorthWest');
end


%% Functions
function z=f(j,x,t,y)
z=y(j)-x(3)*exp(x(1)*t(j))-x(4)*exp(x(2)*t(j));
end

function z=fvec(x,t,y)
z=f(1,x,t,y);
for j=2:length(t)
z=[z;f(j,x,t,y)];
end
end

function Ja=J(x,t)
Ja=[-x(3)*t(1)*exp(x(1)*t(1)) -x(4)*t(1)*exp(x(2)*t(1)) -exp(x(1)*t(1)) -exp(x(2)*t(1))];
for k=2:length(t)
j=[-x(3)*t(k)*exp(x(1)*t(k)) -x(4)*t(k)*exp(x(2)*t(k)) -exp(x(1)*t(k)) -exp(x(2)*t(k))];
Ja=[Ja;j];
end
end

function z=F(x,t,y) %Objective function (SSE)


z=0;
for j=1:length(t)
z=z+(f(j,x,t,y))^2;
end
z=0.5*z;
end

function z=gf(x,t,y) %Gradient of F


x1=x(1);x2=x(2);x3=x(3);x4=x(4);
z1=0;z2=0;z3=0;z4=0;
for j=1:length(t)
z1=z1+f(j,x,t,y)*(-x3*t(j))*exp(x1*t(j));
z2=z2+f(j,x,t,y)*(-x4*t(j))*exp(x2*t(j));
z3=z3-f(j,x,t,y)*exp(x1*t(j));
z4=z4-f(j,x,t,y)*exp(x2*t(j));
end
z=[z1;z2;z3;z4];
end

function N=Norm(x)
N=sqrt(x(1)^2+x(2)^2+x(3)^2+x(4)^2);
end

function Lin=L(h,x,t,y)
Lin=0.5*(Norm(fvec(x,t,y)+J(x,t)*h))^2;
end


Appendix B
Generating data by MATLAB :
%% Generate data (t,y)
t=-2:2.5/30:0.5-2.5/30;m=length(t);
y=10*exp(t)-5*exp(2*t)+normrnd(0,0.03,[1 m]);

figure(1)
plot(t,y,'o','MarkerFaceColor','w','MarkerEdgeColor','k','MarkerSize',8,'linewidth',1);
axis([-3 1 0.9*min(y) 1.1*max(y)]);
Leg=legend('Data (t_i,y_i)');
set(Leg,'location','NorthWest');

Saved data (data_v2.mat) :

i    t_i                   y_i
1    -2                    1.29241004146288
2    -1.91666666666667     1.33655843638144
3    -1.83333333333333     1.48343080355583
4    -1.75000000000000     1.59720575339143
5    -1.66666666666667     1.72086369413927
6    -1.58333333333333     1.82029994022383
7    -1.50000000000000     1.99217146710787
8    -1.41666666666667     2.11568193915654
9    -1.33333333333333     2.26166074052819
10   -1.25000000000000     2.41852492988996
11   -1.16666666666667     2.66030686900851
12   -1.08333333333333     2.78648170473390
13   -1                    2.99693058020358
14   -0.916666666666667    3.16283825140780
15   -0.833333333333334    3.39269026688311
16   -0.750000000000000    3.51105359278980
17   -0.666666666666667    3.78357672286345
18   -0.583333333333333    3.98054225334309
19   -0.500000000000000    4.19547586823796
20   -0.416666666666667    4.41301724381983
21   -0.333333333333333    4.58846707716412
22   -0.250000000000000    4.81368646590987
23   -0.166666666666667    4.86500749948671
24   -0.0833333333333333   4.96053455349911
25   5.55111512312578e-17  4.95292053588978
26   0.0833333333333334    4.94791695097324
27   0.166666666666667     4.79540270361915
28   0.250000000000000     4.59755678409042
29   0.333333333333333     4.24304664861307
30   0.416666666666667     3.67621611337129

