Nonlinear Least Squares Problems
This lecture is based on the book by P. C. Hansen, V. Pereyra, and G. Scherer.
Non-linearity
For the model

M(x1, x2, t) = x1 e^(−x2 t)

we have:
• ∂M/∂x1 = e^(−x2 t), which is independent of x1,
• ∂M/∂x2 = −t x1 e^(−x2 t), which depends on both x1 and x2.
Thus M is a nonlinear model with the parameter x2 appearing
nonlinearly.
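As a quick numerical check, here is a minimal Python sketch (the parameter and time values are arbitrary, chosen only for illustration) that evaluates the two partial derivatives and shows that the first is unaffected by x1 while the second changes with the parameters:

import numpy as np

def dM_dx1(x1, x2, t):
    # dM/dx1 = exp(-x2*t): independent of x1, so M is linear in x1.
    return np.exp(-x2 * t)

def dM_dx2(x1, x2, t):
    # dM/dx2 = -t*x1*exp(-x2*t): depends on the parameters, so M is nonlinear in x2.
    return -t * x1 * np.exp(-x2 * t)

t = 1.5
print(dM_dx1(1.0, 0.3, t), dM_dx1(5.0, 0.3, t))  # identical values: no dependence on x1
print(dM_dx2(1.0, 0.3, t), dM_dx2(1.0, 0.7, t))  # different values: depends on x2 (and x1)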
The nonlinear least squares problem is

min_x f(x) ≡ min_x (1/2) ∥r(x)∥₂² = min_x (1/2) ∑_{i=1}^{m} r_i(x)²,

with

r_i(x) = y_i − M(x, t_i),   i = 1, …, m.

Here y_i are the measured data corresponding to t_i. The nonlinearity arises only from M(x, t).
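A minimal Python sketch of r(x) and f(x) for the exponential model above; the data (t_i, y_i) are synthetic and generated here only for illustration:

import numpy as np

# Synthetic data for illustration; in practice (t_i, y_i) come from measurements.
t = np.linspace(0.0, 4.0, 20)
rng = np.random.default_rng(0)
y = 2.0 * np.exp(-0.8 * t) + 0.01 * rng.standard_normal(t.size)

def residual(x):
    # r_i(x) = y_i - M(x, t_i) for M(x, t) = x1 * exp(-x2 * t)
    return y - x[0] * np.exp(-x[1] * t)

def f(x):
    # f(x) = 0.5 * ||r(x)||_2^2
    r = residual(x)
    return 0.5 * (r @ r)

print(f(np.array([2.0, 0.8])))  # near the data-generating parameters: small value
print(f(np.array([1.0, 2.0])))  # far away: much larger value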
where

[∇² r_i(x)]_{kℓ} = −[∇² M(x, t_i)]_{kℓ} = − ∂²M(x, t_i) / (∂x_k ∂x_ℓ),   k, ℓ = 1, …, n,

with n denoting the number of parameters in x.
The first – and often dominant – term J(x)^T J(x) of the Hessian contains only the Jacobian matrix J(x), i.e., only first derivatives!
In the second term, the second derivatives are multiplied by the
residuals. If the model is adequate then the residuals will be small
near the solution and this term will be of secondary importance.
In this case one gets an important part of the Hessian “for free” if
one has already computed the Jacobian.
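For the exponential model this is easy to see in code: the sketch below (reusing the convention r_i = y_i − M(x, t_i) and the arrays t and y from the earlier sketch) forms the gradient J^T r and the Gauss-Newton approximation J^T J of the Hessian from the Jacobian alone:

import numpy as np

def jacobian(x, t):
    # J_ij = dr_i/dx_j = -dM(x, t_i)/dx_j for M(x, t) = x1 * exp(-x2 * t)
    e = np.exp(-x[1] * t)
    return np.column_stack((-e, x[0] * t * e))

def grad_and_gn_hessian(x, t, y):
    # Gradient grad f = J^T r and the first Hessian term J^T J (first derivatives only).
    r = y - x[0] * np.exp(-x[1] * t)
    J = jacobian(x, t)
    return J.T @ r, J.T @ J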
The covariance matrix of the least squares solution x∗ is

Cov(x∗) = J(x∗)† Cov(y) (J(x∗)†)^T.
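A minimal sketch of evaluating this covariance estimate numerically, assuming the Jacobian at the solution and Cov(y) are available as NumPy arrays (the argument names here are hypothetical):

import numpy as np

def parameter_covariance(J_star, cov_y):
    # Cov(x*) = J(x*)^dagger Cov(y) (J(x*)^dagger)^T via the pseudoinverse
    J_pinv = np.linalg.pinv(J_star)
    return J_pinv @ cov_y @ J_pinv.T

# For independent measurements with equal variance sigma^2, Cov(y) = sigma^2 * I,
# and in the full-rank case this reduces to sigma^2 * (J^T J)^{-1}.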
Newton’s method
If f(x) is twice continuously differentiable, then we can use Newton's method to solve the nonlinear equation

∇f(x) = J(x)^T r(x) = 0,

which provides local stationary points of f(x). This version of the Newton iteration takes the form, for k = 0, 1, 2, …,
x_{k+1} = x_k − (∇²f(x_k))^{−1} ∇f(x_k)
        = x_k − (J(x_k)^T J(x_k) + S(x_k))^{−1} J(x_k)^T r(x_k),

where S(x_k) denotes the matrix

S(x_k) = ∑_{i=1}^{m} r_i(x_k) ∇² r_i(x_k).
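A sketch of a single Newton step for the exponential model, with the second derivatives of M written out by hand; the data arrays t and y are assumed given, and no safeguards (line search, positive-definiteness checks) are included:

import numpy as np

def newton_step(x, t, y):
    # One step x+ = x - (J^T J + S)^{-1} J^T r for M(x, t) = x1 * exp(-x2 * t).
    e = np.exp(-x[1] * t)
    r = y - x[0] * e                            # r_i = y_i - M(x, t_i)
    J = np.column_stack((-e, x[0] * t * e))     # Jacobian of r
    S = np.zeros((2, 2))
    for ri, ti, ei in zip(r, t, e):
        # Hessian of M at (x, t_i); note that grad^2 r_i = -grad^2 M(x, t_i).
        H_M = np.array([[0.0,                -ti * ei],
                        [-ti * ei, ti**2 * x[0] * ei]])
        S += ri * (-H_M)
    return x - np.linalg.solve(J.T @ J + S, J.T @ r)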
In the Gauss-Newton (G-N) method the second-derivative term S(x_k) is neglected, and the step Δx_k^GN is computed from

J(x_k)^T J(x_k) Δx_k^GN = −J(x_k)^T r(x_k).

Note that in the full-rank case these are exactly the normal equations for the linear least squares problem

min_{Δx_k} ∥ J(x_k) Δx_k − (−r(x_k)) ∥₂².

The step Δx_k^GN is used as a search direction, and the step length α_k is chosen by a line search satisfying the sufficient-decrease condition

f(x_k + α_k Δx_k^GN) < f(x_k) + c1 α_k ∇f(x_k)^T Δx_k^GN.
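Putting the pieces together, a minimal Gauss-Newton sketch for the exponential model, with a simple backtracking line search enforcing the condition above (c1 and the tolerances are illustrative choices, not prescribed by the lecture):

import numpy as np

def gauss_newton(x, t, y, c1=1e-4, max_iter=50):
    # Gauss-Newton with backtracking line search for M(x, t) = x1 * exp(-x2 * t).
    for _ in range(max_iter):
        e = np.exp(-x[1] * t)
        r = y - x[0] * e
        J = np.column_stack((-e, x[0] * t * e))
        g = J.T @ r                               # gradient of f
        if np.linalg.norm(g) < 1e-10:
            break
        # G-N step: solve the linear LSQ problem min || J dx - (-r) ||_2
        dx = np.linalg.lstsq(J, -r, rcond=None)[0]
        f0 = 0.5 * (r @ r)
        alpha = 1.0
        while alpha > 1e-12:                      # backtrack until sufficient decrease
            r_new = y - (x[0] + alpha * dx[0]) * np.exp(-(x[1] + alpha * dx[1]) * t)
            if 0.5 * (r_new @ r_new) < f0 + c1 * alpha * (g @ dx):
                break
            alpha *= 0.5
        x = x + alpha * dx
    return x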
Levenberg-Marquardt
Very similar to G-N, except that we replace the line search with a trust-region strategy where the norm of the step is limited:
min_{Δx} ∥ J(x_k) Δx + r(x_k) ∥₂²   subject to   ∥Δx∥₂ ≤ bound.
Algorithm: Levenberg-Marquardt
• Start with the initial point x0 and iterate for k = 0, 1, 2, . . .
• At each step k choose the Lagrange parameter λk .
• Solve the linear LSQ problem
min_{Δx} ∥ [ J(x_k) ; λ_k^{1/2} I ] Δx − [ −r(x_k) ; 0 ] ∥₂²,

where [ J(x_k) ; λ_k^{1/2} I ] denotes J(x_k) stacked on top of λ_k^{1/2} I, and [ −r(x_k) ; 0 ] denotes −r(x_k) stacked on top of a zero vector.
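A minimal sketch of one L-M step, computed by solving the stacked linear least squares problem above (the strategy for choosing and updating λ_k between iterations is not shown):

import numpy as np

def lm_step(J, r, lam):
    # Solve min || [J; sqrt(lam) I] dx - [-r; 0] ||_2, which is equivalent to
    # (J^T J + lam I) dx = -J^T r but avoids forming J^T J explicitly.
    n = J.shape[1]
    A = np.vstack((J, np.sqrt(lam) * np.eye(n)))
    b = np.concatenate((-r, np.zeros(n)))
    return np.linalg.lstsq(A, b, rcond=None)[0]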
[Figures: plots in the parameter coordinates x1, x2, and x3.]