
Applied Numerical Optimization

Prof. Alexander Mitsos, Ph.D.

Basic solution methods for unconstrained problems


Solution Methods for Unconstrained Optimization: $\min_{\mathbf{x} \in \mathbb{R}^n} f(\mathbf{x})$

Unconstrained optimization methods:
• Direct methods
• Indirect methods
Indirect Methods – Concept

• First-order necessary conditions:

$\nabla f(\mathbf{x}) = \mathbf{0} \;\Leftrightarrow\; \frac{\partial f}{\partial x_1}\Big|_{\mathbf{x}} = 0 = g_1(\mathbf{x}),\; \frac{\partial f}{\partial x_2}\Big|_{\mathbf{x}} = 0 = g_2(\mathbf{x}),\; \ldots,\; \frac{\partial f}{\partial x_n}\Big|_{\mathbf{x}} = 0 = g_n(\mathbf{x}) \;\Leftrightarrow\; \mathbf{g}(\mathbf{x}) = \mathbf{0}$

a nonlinear system of equations.

• The optimal solution is found by solving the system of equations analytically or numerically
(e.g., by Newton’s method).

• Differentiation and solution of the system of equations are challenging for complex problems!

Solution Methods for Unconstrained Optimization

Unconstrained optimization methods:
• Direct methods
• Indirect methods: the optimal solution is found by solving the system of equations (optimality conditions) $\nabla f(\mathbf{x}) = \mathbf{0}$ analytically or numerically.
Direct Methods – Concept

Idea: construct a convergent sequence $\{\mathbf{x}^{(k)}\}_{k=1}^{\infty}$ which fulfills the following conditions:

$\exists\, \bar{k} \ge 0:\; f(\mathbf{x}^{(k+1)}) < f(\mathbf{x}^{(k)}) \;\;\forall k > \bar{k} \quad\text{and}\quad \lim_{k\to\infty} \mathbf{x}^{(k)} = \mathbf{x}^* \in \mathbb{R}^n$

(Figure: iterates $\mathbf{x}^{(0)}, \mathbf{x}^{(1)}, \ldots, \mathbf{x}^{(5)}$ converging to $\mathbf{x}^*$ in $\Omega = \mathbb{R}^n$.)

Definition: Rate of Convergence

Idea: construct a convergent sequence $\{\mathbf{x}^{(k)}\}_{k=1}^{\infty}$ which fulfills the following conditions:

$\exists\, \bar{k} \ge 0:\; f(\mathbf{x}^{(k+1)}) < f(\mathbf{x}^{(k)}) \;\;\forall k > \bar{k} \quad\text{and}\quad \lim_{k\to\infty} \mathbf{x}^{(k)} = \mathbf{x}^* \in \mathbb{R}^n$

Rate of convergence:
• Linear: if there exists a constant $C \in (0,1)$ such that for sufficiently large $k$:
$\|\mathbf{x}^{(k+1)} - \mathbf{x}^*\| \le C\,\|\mathbf{x}^{(k)} - \mathbf{x}^*\|$
• Order $p$ (often $p = 2$): if there exists a constant $M > 0$ such that
$\|\mathbf{x}^{(k+1)} - \mathbf{x}^*\| \le M\,\|\mathbf{x}^{(k)} - \mathbf{x}^*\|^p$
• Superlinear: if there exists a sequence $c_k$ converging to zero, i.e., $\lim_{k\to\infty} c_k = 0$, such that
$\|\mathbf{x}^{(k+1)} - \mathbf{x}^*\| \le c_k\,\|\mathbf{x}^{(k)} - \mathbf{x}^*\|$
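The three rates can be illustrated with toy error sequences $e_k = \|\mathbf{x}^{(k)} - \mathbf{x}^*\|$ (a sketch; the constants $C = 0.5$ and $M = 1$ are arbitrary illustrative choices):

```python
# Toy error sequences e_k = ||x^(k) - x*|| (illustrative constants).
linear = [0.5]      # linear rate:  e_{k+1} = C * e_k    with C = 0.5
quadratic = [0.5]   # order p = 2:  e_{k+1} = M * e_k^2  with M = 1
for _ in range(5):
    linear.append(0.5 * linear[-1])
    quadratic.append(quadratic[-1] ** 2)

print(linear)     # the error halves in every step
print(quadratic)  # the number of correct digits roughly doubles per step
```

After the same number of steps the quadratic sequence is already near machine precision, which is why the rate of convergence matters in practice.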

Solution Methods for Unconstrained Optimization

Unconstrained optimization methods:
• Direct methods: the optimal solution is found by directly improving the objective function via iterative descent.
• Indirect methods: the optimal solution is found by solving the system of equations (optimality conditions) $\nabla f(\mathbf{x}) = \mathbf{0}$ analytically or numerically.
Direct vs Indirect: Nomenclature not consistent in Literature

• Throughout this class we use "direct" and "indirect":
 "indirect methods": 1. set up the optimality conditions and then 2. try to solve the system of equations (or equations and inequalities)
 "direct methods": directly aim to improve the objective function (or objective function and constraints); these methods hope to eventually satisfy the optimality conditions.
• In the literature there are many alternative uses of the words, including
 exactly the opposite of ours
 "direct": without the use of derivatives; "indirect": using derivatives
 only in the context of dynamic optimization problems:
 "direct": first convert to a nonlinear program
 "indirect": first set up the optimality conditions
 only in the context of constrained problems:
 "direct": only feasible iterates
 "indirect": infeasible iterates are allowed

Solution Methods for Unconstrained Optimization

Unconstrained optimization methods:
• Direct methods: iterative descent
 Line-search approach
 Trust-region approach
• Indirect methods: the optimal solution is found by solving the system of equations (optimality conditions) $\nabla f(\mathbf{x}) = \mathbf{0}$ analytically or numerically.
Check Yourself

• What are direct vs indirect methods?


• Which direct methods did we learn?
• Which convergence rates exist? Why is the convergence rate important?

Applied Numerical Optimization
Prof. Alexander Mitsos, Ph.D.

Line search: basic idea and step length


Direct Methods – Line-Search Approach
Definition (descent direction):
A vector $\mathbf{p}$ is called a descent direction at $\mathbf{x}^{(k)}$ if $\nabla f(\mathbf{x}^{(k)})^T \mathbf{p} < 0$ holds.

Basic algorithm (line search):
1. Choose a descent direction $\mathbf{p}^{(k)}$ such that $\nabla f(\mathbf{x}^{(k)})^T \mathbf{p}^{(k)} < 0$
2. Determine a step length $\alpha_k$
3. Set $\mathbf{x}^{(k+1)} = \mathbf{x}^{(k)} + \alpha_k \mathbf{p}^{(k)}$

(Figure: step from $\mathbf{x}^{(k)}$ along $\alpha_k \mathbf{p}^{(k)}$ to $\mathbf{x}^{(k+1)}$ in $\Omega = \mathbb{R}^n$.)

Open issues:
• How to determine the descent direction $\mathbf{p}^{(k)}$?
• How to calculate the step length $\alpha_k$?

Calculation of Step Length 𝜶𝒌
The exact line-search strategy:
1. Define the one-dimensional function along the descent direction $\mathbf{p}^{(k)}$:
$\phi(\alpha) = f(\mathbf{x}^{(k)} + \alpha\mathbf{p}^{(k)})$
2. Solve the one-dimensional minimization problem
$\min_{\alpha > 0} \phi(\alpha)$

Remarks
1. Naively speaking, it would be ideal to globally minimize $\phi(\alpha)$. Generally, it is very expensive to find this solution, and it is not necessarily worthwhile since the search is only one-dimensional.
2. One could also search for some local solution. But this is often also too expensive (it needs function and/or gradient evaluations at a number of points).
3. Practical strategies (so-called non-exact line searches): find $\alpha$ such that $f(\mathbf{x}^{(k+1)})$ becomes as small as possible with minimal effort.
Practical Line-Search Strategies

(Figure: contour lines of $f(\mathbf{x})$ in the $(x_1, x_2)$-plane with the search path, and the corresponding one-dimensional function $\phi(\alpha) = f(\mathbf{x}^{(k)} + \alpha\mathbf{p}^{(k)})$ plotted over $\alpha$.)
Armijo Condition
Theorem[1]:
Let $f$ be continuously differentiable, $\mathbf{p}^{(k)}$ a descent direction, and let $c_1 \in (0,1)$ be given. Then there exists an $\alpha > 0$ such that for $\phi(\alpha) := f(\mathbf{x}^{(k)} + \alpha\mathbf{p}^{(k)})$ the condition $\phi(\alpha) \le \phi(0) + \alpha c_1 \phi'(0)$ holds.

Geometrical interpretation:

(Figure: $\phi(\alpha)$ together with the lines $\phi_l(\alpha) = \phi(0) + \alpha c_1 \phi'(0)$ for $c_1 = 0, 0.1, 0.25, 0.5, 1$, where $\phi'(0) = \nabla f(\mathbf{x}^{(k)})^T \mathbf{p}^{(k)} < 0$; the feasible domain is shown for $c_1 = 0.25$.)

[1] Nocedal J., Wright S. J., Numerical Optimization, 2nd Edition, Springer, 2006
Simple Line-Search Algorithm
Remarks:
1. The choice of a step length which fulfills the Armijo condition guarantees the descent of $f$:
$\phi'(0) = \nabla f(\mathbf{x}^{(k)})^T \mathbf{p}^{(k)} < 0$ ($\mathbf{p}^{(k)}$ is a descent direction)
$\phi(\alpha) \le \phi(0) + \alpha c_1 \phi'(0) \;\Rightarrow\; \phi(\alpha) < \phi(0)$
⇒ a descent is guaranteed!

2. The choice of $c_1$ is crucial:
• A large $c_1$ leads to small values of $\alpha$, such that $\mathbf{x}^{(k+1)} \approx \mathbf{x}^{(k)}$.
• A small $c_1$ potentially results in a small reduction of $f$ and therefore slower convergence.

Simple line-search algorithm:

choose $\alpha_1 > 0$; $\rho, c_1 \in (0,1)$
set $\alpha = \alpha_1$
repeat $\alpha \leftarrow \rho\alpha$ until $\phi(\alpha) \le \phi(0) + \alpha c_1 \phi'(0)$
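The backtracking loop above can be sketched in a few lines of Python (a sketch; the quadratic test function and the parameter values $\rho = 0.5$, $c_1 = 10^{-4}$ are illustrative choices):

```python
import numpy as np

def backtracking(f, grad_f, x, p, alpha0=1.0, rho=0.5, c1=1e-4):
    """Shrink alpha until the Armijo condition
    phi(alpha) <= phi(0) + alpha*c1*phi'(0) holds."""
    phi0 = f(x)
    dphi0 = grad_f(x) @ p          # phi'(0), negative for a descent direction
    alpha = alpha0
    while f(x + alpha * p) > phi0 + alpha * c1 * dphi0:
        alpha *= rho
    return alpha

# Example: f(x) = x1^2 + 4*x2^2, steepest-descent direction at x = (2, 1)
f = lambda x: x[0]**2 + 4.0 * x[1]**2
g = lambda x: np.array([2.0 * x[0], 8.0 * x[1]])
x = np.array([2.0, 1.0])
p = -g(x)
alpha = backtracking(f, g, x, p)
print(alpha, f(x + alpha * p) < f(x))  # 0.25 True
```

Termination is guaranteed by the Armijo theorem above, since $\mathbf{p}$ is a descent direction.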

Improved Line-Search Algorithm

choose $\alpha_0 > 0$ and $c_1 \in (0,1)$

if $\phi(\alpha_0) \le \phi(0) + \alpha_0 c_1 \phi'(0)$ STOP, else

find a better $\alpha \in (0, \alpha_0)$ through quadratic interpolation of the available data:

$\alpha_1 = -\dfrac{\phi'(0)\,\alpha_0^2}{2\,[\phi(\alpha_0) - \phi(0) - \phi'(0)\,\alpha_0]}$

if $\phi(\alpha_1) \le \phi(0) + \alpha_1 c_1 \phi'(0)$ STOP, else

find a better $\alpha \in (0, \alpha_1)$ through cubic interpolation of the available data (how?)

repeat the procedure of cubic interpolation until the condition is fulfilled
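The quadratic interpolation step fits a parabola to $\phi(0)$, $\phi'(0)$ and $\phi(\alpha_0)$ and returns its minimizer (a sketch; the helper name is illustrative):

```python
def quad_interp_step(phi0, dphi0, alpha0, phi_a0):
    """Minimizer of the quadratic q with q(0)=phi0, q'(0)=dphi0, q(alpha0)=phi_a0."""
    return -dphi0 * alpha0**2 / (2.0 * (phi_a0 - phi0 - dphi0 * alpha0))

# For phi(alpha) = (alpha - 1)^2 the quadratic model is exact, so one
# interpolation step recovers the exact minimizer alpha = 1:
phi = lambda a: (a - 1.0) ** 2
alpha1 = quad_interp_step(phi(0.0), -2.0, 2.0, phi(2.0))
print(alpha1)  # -> 1.0
```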

Wolfe Conditions
Theorem[1]:
Let $f$ be continuously differentiable, $\mathbf{p}^{(k)}$ a descent direction, and $c_1 \in (0,1)$, $c_2 \in (c_1, 1)$. Then there exists an $\alpha > 0$ such that
$\phi(\alpha) \le \phi(0) + \alpha c_1 \phi'(0)$ (Armijo condition)
$\phi'(\alpha) \ge c_2 \phi'(0)$ (slope condition)

Geometric interpretation: the slope condition guarantees a minimum step length!

(Figure: $\phi(\alpha)$ with the slopes $\phi'(0)$, $c_1\phi'(0)$ and $c_2\phi'(0)$, and the resulting feasible values of $\alpha$.)

Relevance: the Wolfe conditions promote convergence to a stationary point[1].
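A minimal check of both Wolfe conditions for a candidate step length might look as follows (a sketch; the helper name and the values $c_1 = 10^{-4}$, $c_2 = 0.9$ are illustrative):

```python
import numpy as np

def wolfe_ok(f, grad_f, x, p, alpha, c1=1e-4, c2=0.9):
    """True iff alpha satisfies the Armijo condition and the slope condition."""
    phi0, dphi0 = f(x), grad_f(x) @ p
    armijo = f(x + alpha * p) <= phi0 + alpha * c1 * dphi0
    slope = grad_f(x + alpha * p) @ p >= c2 * dphi0
    return armijo and slope

# phi(alpha) = (alpha - 1)^2 along p = 1 from x = 0 for f(x) = (x - 1)^2:
f = lambda x: (x[0] - 1.0) ** 2
g = lambda x: np.array([2.0 * (x[0] - 1.0)])
x, p = np.array([0.0]), np.array([1.0])
ok_full = wolfe_ok(f, g, x, p, 1.0)    # full step to the minimizer
ok_tiny = wolfe_ok(f, g, x, p, 1e-6)   # tiny step: Armijo holds, slope fails
print(ok_full, ok_tiny)  # True False
```

The tiny step is rejected by the slope condition only, which is exactly the "minimum step length" role noted above.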

[1] Nocedal J., Wright S. J., Numerical Optimization, 2nd Edition, Springer, 2006
Check Yourself

• Explain the basic ideas of the line-search method.


• What is a descent direction? How is it defined?
• Explain the Armijo rule and its potential drawbacks.
• Explain the Wolfe conditions and their advantage compared to the Armijo rule.

Applied Numerical Optimization
Prof. Alexander Mitsos, Ph.D.

Line search: simple directions


Determination of a Descent Direction: A Toolbox
Line-search approaches differ from each other with respect to the determination of descent direction and step length.

Determine the direction $\mathbf{p}^{(k)}$:
• Steepest-descent direction
• Conjugate directions
• Newton's-step direction
• … special cases

Many gradient methods use a symmetric positive definite matrix $\mathbf{D}^{(k)}$ and calculate
$\mathbf{x}^{(k+1)} = \mathbf{x}^{(k)} - \alpha_k \mathbf{D}^{(k)} \nabla f(\mathbf{x}^{(k)})$
(Extra work: prove that this guarantees descent!)
Steepest-Descent Direction (1)
Taylor series: $f(\mathbf{x}^{(k)} + \alpha\mathbf{p}^{(k)}) = f(\mathbf{x}^{(k)}) + \alpha\,\nabla f(\mathbf{x}^{(k)})^T \mathbf{p}^{(k)} + O(\alpha^2)$

The rate of change of $f$ at $\mathbf{x}^{(k)}$ along the direction $\mathbf{p}^{(k)}$ is the coefficient of the linear term: $\nabla f(\mathbf{x}^{(k)})^T \mathbf{p}^{(k)}$

The unit direction $\mathbf{p}^{(k)}$ with the highest rate of change is the solution of the following problem:

$\min_{\mathbf{p}^{(k)} \in \mathbb{R}^n} \nabla f(\mathbf{x}^{(k)})^T \mathbf{p}^{(k)} \quad \text{s.t.} \quad \|\mathbf{p}^{(k)}\| = 1$

Note that $\nabla f(\mathbf{x}^{(k)})^T \mathbf{p}^{(k)} = \|\nabla f(\mathbf{x}^{(k)})\|\,\|\mathbf{p}^{(k)}\|\cos\theta$

The solution of the problem is achieved for $\cos\theta = -1 \Rightarrow \theta = \pi$

$\Rightarrow \mathbf{p}^{(k)} = -\nabla f(\mathbf{x}^{(k)})\,/\,\|\nabla f(\mathbf{x}^{(k)})\|$

Here the choice of $\mathbf{D}^{(k)}$ is the identity matrix $\mathbf{I}$.

Steepest-Descent Direction (2)

Descent direction: $\nabla f(\mathbf{x}^{(k)})^T \mathbf{p} < 0$

(Figure: contour line $f(\mathbf{x}) = C$ through $\mathbf{x}^{(k)}$, separating $f(\mathbf{x}) > C$ from $f(\mathbf{x}) < C$; a direction $\mathbf{p}^{(1)}$ with $\nabla f(\mathbf{x}^{(k)})^T \mathbf{p}^{(1)} > 0$ (ascent), a direction $\mathbf{p}^{(2)}$ with $\nabla f(\mathbf{x}^{(k)})^T \mathbf{p}^{(2)} < 0$ (descent), and the steepest-descent direction $-\nabla f(\mathbf{x}^{(k)})$. Recall $\nabla f(\mathbf{x}^{(k)})^T \mathbf{p}^{(k)} = \|\nabla f(\mathbf{x}^{(k)})\|\,\|\mathbf{p}^{(k)}\|\cos\theta$.)
Method of Steepest-Descent

Algorithm:
choose $\mathbf{x}^{(0)}$
for $k = 0, 1, \ldots$
  if $\|\nabla f(\mathbf{x}^{(k)})\| \le \varepsilon$ stop, else
  set $\mathbf{p}^{(k)} = -\nabla f(\mathbf{x}^{(k)})$
  determine the step length $\alpha_k$ (e.g., using the Armijo rule)
  set $\mathbf{x}^{(k+1)} = \mathbf{x}^{(k)} + \alpha_k \mathbf{p}^{(k)}$
end for

(Figure: iterates approaching $\mathbf{x}^*$ on "well scaled" and "poorly scaled" contours; successive directions become perpendicular.)

Figure: Edgar T. F., Himmelblau D. M., Optimization of Chemical Processes, 2nd Edition, McGraw-Hill, 2001
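The algorithm above can be sketched with an Armijo backtracking step (a sketch; the poorly scaled quadratic test function is an illustrative choice):

```python
import numpy as np

def steepest_descent(f, grad_f, x0, eps=1e-6, max_iter=10_000):
    """Steepest descent with a simple Armijo backtracking line search."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad_f(x)
        if np.linalg.norm(g) <= eps:
            break
        p = -g
        alpha, c1, rho = 1.0, 1e-4, 0.5
        while f(x + alpha * p) > f(x) + alpha * c1 * (g @ p):
            alpha *= rho
        x = x + alpha * p
    return x

# Poorly scaled quadratic: f(x) = x1^2 + 10*x2^2, minimum at the origin
f = lambda x: x[0]**2 + 10.0 * x[1]**2
g = lambda x: np.array([2.0 * x[0], 20.0 * x[1]])
print(steepest_descent(f, g, [3.0, 1.0]))  # close to (0, 0)
```

On this kind of elongated contour the iterates zig-zag, which is exactly the "poorly scaled" behavior shown in the figure.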
Newton’s Descent Direction
Quadratic approximation of $f$ at $\mathbf{x}^{(k)}$:

$m(\mathbf{x}^{(k+1)}) = f(\mathbf{x}^{(k)}) + \nabla f(\mathbf{x}^{(k)})^T(\mathbf{x}^{(k+1)} - \mathbf{x}^{(k)}) + \frac{1}{2}(\mathbf{x}^{(k+1)} - \mathbf{x}^{(k)})^T \nabla^2 f(\mathbf{x}^{(k)})(\mathbf{x}^{(k+1)} - \mathbf{x}^{(k)})$

First-order necessary optimality condition for $m$:

$\mathbf{0} = \nabla m(\mathbf{x}^{(k+1)}) = \nabla f(\mathbf{x}^{(k)}) + \nabla^2 f(\mathbf{x}^{(k)})(\mathbf{x}^{(k+1)} - \mathbf{x}^{(k)})$

$\Rightarrow \mathbf{x}^{(k+1)} = \mathbf{x}^{(k)} - [\nabla^2 f(\mathbf{x}^{(k)})]^{-1}\nabla f(\mathbf{x}^{(k)})$

$\Rightarrow \mathbf{p}^{(k)} = -[\nabla^2 f(\mathbf{x}^{(k)})]^{-1}\nabla f(\mathbf{x}^{(k)})$, the minimum of the quadratic approximation, with $\alpha_k = 1$.

Here the choice of $\mathbf{D}^{(k)}$ is the inverse of the Hessian.

(Figure: comparison of the steepest-descent and Newton directions from the viewpoint of objective-function approximation: the linear and the quadratic approximation of $f$ at $\mathbf{x}^{(k)}$, with $\mathbf{x}^{(k+1)}$ the minimum of the quadratic approximation.)

Newton’s Method

Algorithm:
choose $\mathbf{x}^{(0)}$
for $k = 0, 1, \ldots$
  if $\|\nabla f(\mathbf{x}^{(k)})\| \le \varepsilon$ stop, else
  set $\mathbf{p}^{(k)} = -[\nabla^2 f(\mathbf{x}^{(k)})]^{-1}\nabla f(\mathbf{x}^{(k)})$
  set $\mathbf{x}^{(k+1)} = \mathbf{x}^{(k)} + \mathbf{p}^{(k)}$
end for

Remarks:
1. Line search? $\mathbf{x}^{(k+1)} = \mathbf{x}^{(k)} + \alpha_k\mathbf{p}^{(k)}$ with $\mathbf{p}^{(k)} = -[\nabla^2 f(\mathbf{x}^{(k)})]^{-1}\nabla f(\mathbf{x}^{(k)})$ and $\alpha_k = 1$
2. (+) locally quadratic convergence if $\mathbf{x}^{(k)}$ is close to $\mathbf{x}^*$
   (−) 2nd derivatives & inversion (expensive for large systems of equations)
3. If $f$ is quadratic, the algorithm converges in one iteration.
4. Convergence to a minimum is not guaranteed! Why?
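Remark 3 can be checked directly: for a quadratic objective one full Newton step reaches the minimizer (a sketch; the matrix $A$ and vector $b$ are illustrative):

```python
import numpy as np

def newton(grad_f, hess_f, x0, eps=1e-8, max_iter=50):
    """Pure Newton iteration with full step alpha_k = 1."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad_f(x)
        if np.linalg.norm(g) <= eps:
            break
        # Solve H p = -g instead of forming the inverse explicitly
        p = np.linalg.solve(hess_f(x), -g)
        x = x + p
    return x

# Quadratic f(x) = 0.5 x^T A x - b^T x: one Newton step reaches the minimizer
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 1.0])
g = lambda x: A @ x - b
H = lambda x: A
print(newton(g, H, [10.0, -10.0]))  # equals solve(A, b) after one iteration
```

Note that the iteration only seeks $\nabla f = \mathbf{0}$; without safeguards it may just as well converge to a saddle point or maximum (remark 4).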

Check Yourself

• Explain the basic ideas of the line-search method.


• Explain the steepest descent method.
• What additional requirements does Newton's method put on the objective function?
• Explain the Newton direction. Is it better than other descent directions? Why is the Newton step length equal to one?

Applied Numerical Optimization
Prof. Alexander Mitsos, Ph.D.

Line search: complexity and examples


Complexity Analysis

Nesterov (2004) proves: “In general, optimization problems are unsolvable” *

Let $F$ denote a class of problems, e.g., Lipschitz-continuous functions with Lipschitz constant $L$, i.e., $|f(\mathbf{x}) - f(\mathbf{y})| < L\|\mathbf{x} - \mathbf{y}\|$; $L$ is assumed to be fixed for all $P \in F$.

“Performance of a method 𝑀 on a problem 𝑃 ∈ 𝐹 is the total amount of computational effort that is required by 𝑀
to solve 𝑃.” *

“To solve the problem means to find an approximate solution to 𝑃 with an accuracy ε > 0.” *

For unconstrained problems, the accuracy ε > 0 can be defined as the norm of the objective’s gradient.

* Yurii Nesterov, Introductory Lectures on Convex Optimization – A Basic Course, Kluwer Academic Publishers, (2004)

Complexity Analysis – Measuring Computational Effort

Unit of measurement: query to an oracle

It is assumed that the objective function is unknown and that the algorithm solves the optimization problem by querying an oracle for local information about the unknown objective function. An oracle is simply a "black box" capable of answering any query of the form:
• Given $\mathbf{x}$, return the value $f(\mathbf{x})$ (zeroth-order oracle)
• Given $\mathbf{x}$, return $f(\mathbf{x})$ and the gradient $\nabla f(\mathbf{x})$ (first-order oracle)
• Given $\mathbf{x}$, return $f(\mathbf{x})$, $\nabla f(\mathbf{x})$ and the Hessian $\nabla^2 f(\mathbf{x})$ (second-order oracle)

(Figure: the algorithm sends queries $\mathbf{x}^{(1)}, \mathbf{x}^{(2)}, \ldots$ to the black box ("oracle") $f$ and receives $f(\mathbf{x}^{(1)}), f(\mathbf{x}^{(2)}), \ldots$ in return.)

Analytical Complexity: The smallest number of queries to an oracle to solve Problem 𝑃 to accuracy ε. [1]

Arithmetical Complexity: The smallest number of arithmetic operations (including work of the oracle and
work of method), required to solve problem 𝑃 up to accuracy ε. [1]

[1] Yurii Nesterov, Introductory Lectures on Convex Optimization – A Basic Course, Kluwer Academic Publishers, 2004.
Analytical Complexity of Steepest Descent Method
Algorithm:
choose $\mathbf{x}^{(0)}$
for $k = 0, 1, \ldots$
  if $\|\nabla f(\mathbf{x}^{(k)})\| \le \varepsilon$ stop, else
  set $\mathbf{p}^{(k)} = -\nabla f(\mathbf{x}^{(k)})$
  determine the step length $\alpha_k$ (e.g., using the Armijo rule)
  set $\mathbf{x}^{(k+1)} = \mathbf{x}^{(k)} + \alpha_k\mathbf{p}^{(k)}$
end for

• Problem class: $f$ is continuously differentiable and $\nabla f(\mathbf{x})$ is Lipschitz-continuous with fixed Lipschitz constant $L$, i.e., $\|\nabla f(\mathbf{x}) - \nabla f(\mathbf{y})\| < L\|\mathbf{x} - \mathbf{y}\|$

• First-order oracle: returns $f(\mathbf{x})$ and the gradient $\nabla f(\mathbf{x})$

• Worst-case analytical complexity (queries to the oracle): $O\!\left(\frac{1}{\varepsilon^2}\right)$

Rosenbrock Function

$\min_{\mathbf{x}\in\mathbb{R}^2} f(\mathbf{x}) = 100\,(x_2 - x_1^2)^2 + (1 - x_1)^2$

• Solution point is $\mathbf{x}^* = (1,1)^T$ - why?

(Figure: surface plot of $f(\mathbf{x})$ [1] and contour lines of $f(\mathbf{x})$ in the $(x_1, x_2)$-plane.)
[1] https://commons.wikimedia.org/w/index.php?curid=9941741
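The claim can be checked directly: both squared terms vanish at $(1,1)^T$, so $f \ge 0$ with equality exactly there, and the first-order condition holds (a small sketch):

```python
import numpy as np

def rosenbrock(x):
    return 100.0 * (x[1] - x[0]**2)**2 + (1.0 - x[0])**2

def rosenbrock_grad(x):
    return np.array([
        -400.0 * x[0] * (x[1] - x[0]**2) - 2.0 * (1.0 - x[0]),
        200.0 * (x[1] - x[0]**2),
    ])

x_star = np.array([1.0, 1.0])
print(rosenbrock(x_star))        # 0.0: both squared terms vanish
print(rosenbrock_grad(x_star))   # [0. 0.]: first-order condition holds
```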
Illustration of Convergence (1)

$\min_{\mathbf{x}\in\mathbb{R}^2} f(\mathbf{x}) = 100\,(x_2 - x_1^2)^2 + (1 - x_1)^2$

• Steepest descent with Armijo line search

(Figure: iterates on the surface and contour lines of $f(\mathbf{x})$ [1].)
[1] https://commons.wikimedia.org/w/index.php?curid=9941741
Illustration of Convergence (2)

$\min_{\mathbf{x}\in\mathbb{R}^2} f(\mathbf{x}) = 100\,(x_2 - x_1^2)^2 + (1 - x_1)^2$

• Steepest descent with Wolfe line search

(Figure: iterates on the surface and contour lines of $f(\mathbf{x})$ [1].)
[1] https://commons.wikimedia.org/w/index.php?curid=9941741
Analytical Complexity of Newton’s Method

• Problem class: $f$ is twice continuously differentiable and $\nabla^2 f(\mathbf{x})$ is Lipschitz-continuous with fixed Lipschitz constant $L$, i.e., $\|\nabla^2 f(\mathbf{x}) - \nabla^2 f(\mathbf{y})\| < L\|\mathbf{x} - \mathbf{y}\|$

• Second-order oracle: returns $f(\mathbf{x})$, $\nabla f(\mathbf{x})$ and the Hessian $\nabla^2 f(\mathbf{x})$

• Quadratic approximation of $f$ around $\mathbf{x}^{(k)}$, with line search:

$m(\mathbf{x}^{(k+1)}) = f(\mathbf{x}^{(k)}) + \nabla f(\mathbf{x}^{(k)})^T(\mathbf{x}^{(k+1)} - \mathbf{x}^{(k)}) + \frac{1}{2}(\mathbf{x}^{(k+1)} - \mathbf{x}^{(k)})^T\nabla^2 f(\mathbf{x}^{(k)})(\mathbf{x}^{(k+1)} - \mathbf{x}^{(k)})$

• Worst-case analytical complexity: $O\!\left(\frac{1}{\varepsilon^{2-\tau}}\right)$, $1 > \tau > 0$, arbitrary but fixed for a given problem

• Quadratic approximation of $f$ around $\mathbf{x}^{(k)}$ with cubic regularization, with line search:

$m_{\text{regularized}}(\mathbf{x}^{(k+1)}) = m(\mathbf{x}^{(k+1)}) + \frac{1}{3}\sigma_k\|\mathbf{x}^{(k+1)} - \mathbf{x}^{(k)}\|^3$

• Worst-case analytical complexity: $O\!\left(\frac{1}{\varepsilon^{3/2}}\right)$

Illustration of Convergence (3)

$\min_{\mathbf{x}\in\mathbb{R}^2} f(\mathbf{x}) = 100\,(x_2 - x_1^2)^2 + (1 - x_1)^2$

• Modified Newton (Armijo line search; if the Hessian is not positive definite, switch to steepest descent)

(Figure: iterates on the surface and contour lines of $f(\mathbf{x})$ [1].)
[1] https://commons.wikimedia.org/w/index.php?curid=9941741
Check Yourself

• What does the term complexity analysis refer to?


• What is the difference between analytical and arithmetical complexity?
• Which method has better analytical complexity: Newton vs. steepest descent?

Applied Numerical Optimization
Prof. Alexander Mitsos, Ph.D.

Line search: advanced directions


Determination of a Descent Direction: A Toolbox
Line-search approaches differ from each other with respect to the determination of descent direction and step length.

Determine the direction $\mathbf{p}^{(k)}$:
• Steepest-descent direction
• Conjugate directions
• Newton's-step direction (and variants)
• … special cases
Inexact Newton Method (1)

Define: $f^{(k)} := f(\mathbf{x}^{(k)})$, $\mathbf{g}^{(k)} := \nabla f(\mathbf{x}^{(k)})$ and $\mathbf{H}^{(k)} := \nabla^2 f(\mathbf{x}^{(k)})$

From Newton's method:
$\mathbf{x}^{(k+1)} = \mathbf{x}^{(k)} + \mathbf{p}^{(k)}$
$\nabla^2 f(\mathbf{x}^{(k)})\,\mathbf{p}^{(k)} = -\nabla f(\mathbf{x}^{(k)}) \;\Rightarrow\; \mathbf{H}^{(k)}\mathbf{p}^{(k)} = -\mathbf{g}^{(k)}$

Idea:
• The linear equation system $\mathbf{H}^{(k)}\mathbf{p}^{(k)} = -\mathbf{g}^{(k)}$ is solved approximately by an iterative method, e.g., by CG (conjugate gradients) if $\mathbf{H}^{(k)}$ is positive definite.

Comments:
• LU or Cholesky decomposition: very high computational effort!
• Large errors occur for ill-conditioned problems.
• The exact solution is not needed.

Inexact Newton Method (2)

Newton-CG method (Newton's method with the CG method to determine $\mathbf{p}^{(k)}$):

Algorithm:
choose $\mathbf{x}^{(0)}$
for $k = 0, 1, \ldots$
  if $\|\nabla f(\mathbf{x}^{(k)})\| \le \varepsilon$ stop, else
  calculate $\mathbf{g}^{(k)} := \nabla f(\mathbf{x}^{(k)})$ and $\mathbf{H}^{(k)} := \nabla^2 f(\mathbf{x}^{(k)})$
  approximately solve $\mathbf{H}^{(k)}\mathbf{p}^{(k)} = -\mathbf{g}^{(k)}$ for $\mathbf{p}^{(k)}$ with the CG method
  set $\mathbf{x}^{(k+1)} = \mathbf{x}^{(k)} + \alpha_k\mathbf{p}^{(k)}$
end for

Some line-search strategy is needed. (Why?)
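The inner CG solve of $\mathbf{H}^{(k)}\mathbf{p}^{(k)} = -\mathbf{g}^{(k)}$ can be sketched as follows (a textbook CG iteration for a symmetric positive definite matrix; the test matrix is illustrative):

```python
import numpy as np

def cg(H, g, tol=1e-10, max_iter=100):
    """Conjugate-gradient solve of H p = -g for symmetric positive definite H."""
    p = np.zeros_like(g)
    r = -g - H @ p            # residual of H p = -g
    d = r.copy()
    for _ in range(max_iter):
        if np.linalg.norm(r) <= tol:
            break
        Hd = H @ d
        alpha = (r @ r) / (d @ Hd)
        p = p + alpha * d
        r_new = r - alpha * Hd
        beta = (r_new @ r_new) / (r @ r)
        d = r_new + beta * d
        r = r_new
    return p

H = np.array([[4.0, 1.0], [1.0, 3.0]])   # symmetric positive definite
g = np.array([1.0, 2.0])
p = cg(H, g)
print(np.allclose(H @ p, -g))  # True
```

In exact arithmetic CG terminates after at most $n$ iterations; truncating it earlier gives the "inexact" Newton direction.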


Modified Newton Method

Motivation: what if
• $\mathbf{H}^{(k)} := \nabla^2 f(\mathbf{x}^{(k)})$ is singular or almost singular (poorly conditioned)?
• $\mathbf{H}^{(k)}$ is not positive definite?

Idea: replace $\mathbf{H}^{(k)}$ by the approximation $\mathbf{B}^{(k)} = \mathbf{H}^{(k)} + \mathbf{E}^{(k)} \approx \mathbf{H}^{(k)}$ with $\mathbf{E}^{(k)} = \tau_k\mathbf{I}$, $\tau_k \ge 0$ smartly chosen; the method converges to steepest descent for $\tau_k \to \infty$.

$\mathbf{B}^{(k)}\mathbf{p}^{(k)} = -\mathbf{g}^{(k)}$
$\mathbf{x}^{(k+1)} = \mathbf{x}^{(k)} + \alpha_k\mathbf{p}^{(k)}$ ($\alpha_k$ from the line search)

Alternatives exist, e.g., see [1].
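One simple way to choose $\tau_k$ is to increase it until a Cholesky factorization of $\mathbf{B}^{(k)} = \mathbf{H}^{(k)} + \tau_k\mathbf{I}$ succeeds (a sketch; the indefinite test Hessian and the starting value are illustrative):

```python
import numpy as np

def modified_newton_direction(H, g, tau0=1e-3):
    """Add tau*I until B = H + tau*I is positive definite (Cholesky succeeds),
    then solve B p = -g."""
    tau = 0.0
    while True:
        B = H + tau * np.eye(len(g))
        try:
            np.linalg.cholesky(B)   # raises LinAlgError if B is not pos. def.
            break
        except np.linalg.LinAlgError:
            tau = max(2.0 * tau, tau0)
    return np.linalg.solve(B, -g)

# Indefinite Hessian: the plain Newton direction would not guarantee descent
H = np.array([[1.0, 0.0], [0.0, -2.0]])
g = np.array([1.0, 1.0])
p = modified_newton_direction(H, g)
print(g @ p < 0)  # True: the modified direction is a descent direction
```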

[1] Nocedal J., Wright S. J., Numerical Optimization, 2nd Edition, Springer, 2006, Chapter 3
Quasi-Newton Methods (1)

Idea: reduce complexity by a simplified calculation of $\mathbf{H}^{(k)} := \nabla^2 f(\mathbf{x}^{(k)})$ (Davidon):
• replace $\mathbf{H}^{(k)}$ by an approximation $\mathbf{B}^{(k)}$
• instead of calculating $\mathbf{B}^{(k)}$ anew, look for a simple update using information from the last iterations ($\mathbf{g}^{(k)} := \nabla f(\mathbf{x}^{(k)})$, $f^{(k)} := f(\mathbf{x}^{(k)})$)

Approach:
• Consider the quadratic approximation of $f$ at $\mathbf{x}^{(k)}$: $m^{(k)}(\mathbf{p}) = f^{(k)} + \mathbf{g}^{(k)T}\mathbf{p} + \frac{1}{2}\mathbf{p}^T\mathbf{B}^{(k)}\mathbf{p}$, with $\mathbf{B}^{(k)}$ symmetric positive definite.
• First-order optimality condition: $\mathbf{p}^{(k)} = -\mathbf{B}^{(k)\,-1}\mathbf{g}^{(k)}$
• By convexity this is necessary and sufficient for the minimization of $m^{(k)}(\mathbf{p})$.
• Construct the quadratic approximation at $\mathbf{x}^{(k+1)} = \mathbf{x}^{(k)} + \alpha_k\mathbf{p}^{(k)}$:
$m^{(k+1)}(\mathbf{p}) = f^{(k+1)} + \mathbf{g}^{(k+1)T}\mathbf{p} + \frac{1}{2}\mathbf{p}^T\mathbf{B}^{(k+1)}\mathbf{p}$
• What conditions must $\mathbf{B}^{(k+1)}$ satisfy?
Quasi-Newton Methods (2)
Conditions on $\mathbf{B}^{(k+1)}$:

1. The gradient of $m^{(k+1)}$ at $\mathbf{x}^{(k)}$ and at $\mathbf{x}^{(k+1)}$ must be equal to the gradient of $f$:

$\nabla m^{(k+1)}(\mathbf{p}) = \mathbf{g}^{(k+1)} + \mathbf{B}^{(k+1)}\mathbf{p}$

At $\mathbf{x} = \mathbf{x}^{(k+1)}$, i.e., $\mathbf{p} = \mathbf{0}$: we want $\nabla m^{(k+1)}(\mathbf{0}) = \mathbf{g}^{(k+1)}$ (automatically satisfied).

At $\mathbf{x} = \mathbf{x}^{(k)}$, i.e., $\mathbf{p} = -\alpha_k\mathbf{p}^{(k)}$: we want $\nabla m^{(k+1)}(-\alpha_k\mathbf{p}^{(k)}) = \mathbf{g}^{(k)}$

$\Rightarrow \mathbf{g}^{(k+1)} - \alpha_k\mathbf{B}^{(k+1)}\mathbf{p}^{(k)} = \mathbf{g}^{(k)}$
$\Rightarrow \mathbf{B}^{(k+1)}\,\underbrace{\alpha_k\mathbf{p}^{(k)}}_{=\,\mathbf{x}^{(k+1)} - \mathbf{x}^{(k)}} = \mathbf{g}^{(k+1)} - \mathbf{g}^{(k)}$
$\Rightarrow \mathbf{B}^{(k+1)}\mathbf{s}^{(k)} = \mathbf{y}^{(k)}$, where $\mathbf{s}^{(k)} = \mathbf{x}^{(k+1)} - \mathbf{x}^{(k)}$ and $\mathbf{y}^{(k)} = \mathbf{g}^{(k+1)} - \mathbf{g}^{(k)}$

2. Since $\mathbf{B}^{(k+1)}$ is symmetric positive definite: $\mathbf{s}^{(k)T}\mathbf{B}^{(k+1)}\mathbf{s}^{(k)} > 0 \;\;\forall\mathbf{s}^{(k)} \ne \mathbf{0} \;\Rightarrow\; \mathbf{s}^{(k)T}\mathbf{y}^{(k)} > 0$

The Wolfe conditions (line search) guarantee these constraints for all $f$, even when $f$ is non-convex.
Quasi-Newton Methods (3)
Conditions on $\mathbf{B}^{(k+1)}$:
$\mathbf{B}^{(k+1)}\mathbf{s}^{(k)} = \mathbf{y}^{(k)}$ admits many solutions for $\mathbf{B}^{(k+1)}$.

• Unique solution: $\mathbf{B}^{(k+1)}$ should be close to $\mathbf{B}^{(k)}$ in a weighted Frobenius norm:

$\min_{\mathbf{B}} \|\mathbf{B} - \mathbf{B}^{(k)}\|_W \quad \text{s.t.}\quad \mathbf{B}^T = \mathbf{B},\;\; \mathbf{B}\mathbf{s}^{(k)} = \mathbf{y}^{(k)}$

with $\|\mathbf{A}\|_W = \|W^{1/2}\mathbf{A}W^{1/2}\|_F$ for any $W$ such that $W\mathbf{y}^{(k)} = \mathbf{s}^{(k)}$, and $\|\cdot\|_F: \mathbb{R}^{n\times n} \to \mathbb{R}_{\ge 0}$, $\|\mathbf{C}\|_F^2 = \sum_{i=1}^n\sum_{j=1}^n c_{ij}^2$.

$\Rightarrow \mathbf{B}^{(k+1)} = \left(\mathbf{I} - \frac{1}{\mathbf{y}^{(k)T}\mathbf{s}^{(k)}}\mathbf{y}^{(k)}\mathbf{s}^{(k)T}\right)\mathbf{B}^{(k)}\left(\mathbf{I} - \frac{1}{\mathbf{y}^{(k)T}\mathbf{s}^{(k)}}\mathbf{s}^{(k)}\mathbf{y}^{(k)T}\right) + \frac{1}{\mathbf{y}^{(k)T}\mathbf{s}^{(k)}}\mathbf{y}^{(k)}\mathbf{y}^{(k)T}$ → DFP formula

$\Rightarrow \mathbf{B}^{(k+1)\,-1} = \left(\mathbf{I} - \frac{1}{\mathbf{y}^{(k)T}\mathbf{s}^{(k)}}\mathbf{s}^{(k)}\mathbf{y}^{(k)T}\right)\mathbf{B}^{(k)\,-1}\left(\mathbf{I} - \frac{1}{\mathbf{y}^{(k)T}\mathbf{s}^{(k)}}\mathbf{y}^{(k)}\mathbf{s}^{(k)T}\right) + \frac{1}{\mathbf{y}^{(k)T}\mathbf{s}^{(k)}}\mathbf{s}^{(k)}\mathbf{s}^{(k)T}$ → BFGS formula
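The DFP update can be verified numerically against the two conditions derived above, the secant condition $\mathbf{B}^{(k+1)}\mathbf{s}^{(k)} = \mathbf{y}^{(k)}$ and symmetry (a sketch; the vectors are illustrative and satisfy $\mathbf{y}^T\mathbf{s} > 0$):

```python
import numpy as np

def dfp_update(B, s, y):
    """DFP update of the Hessian approximation B from step s and gradient change y."""
    rho = 1.0 / (y @ s)            # requires the curvature condition y^T s > 0
    I = np.eye(len(s))
    V = I - rho * np.outer(y, s)
    return V @ B @ V.T + rho * np.outer(y, y)

B = np.eye(2)                       # common starting approximation B^(0) = I
s = np.array([1.0, 0.5])            # s^(k) = x^(k+1) - x^(k)
y = np.array([2.0, 1.5])            # y^(k) = g^(k+1) - g^(k), here y^T s > 0
B_new = dfp_update(B, s, y)
print(np.allclose(B_new @ s, y))    # True: the secant condition holds
print(np.allclose(B_new, B_new.T))  # True: symmetry is preserved
```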

Check Yourself

• Explain the inexact Newton method.


• What is the main idea of the modified and quasi-Newton methods? Why are these methods advantageous?
• Why is it necessary to introduce a step-length control mechanism (line-search) into modified and quasi-Newton
methods?

Applied Numerical Optimization
Prof. Alexander Mitsos, Ph.D.

Parameter estimation
Regression Problems: Least-Squares Formulation
Example:
Consider a batch reactor with the reaction A → B at constant temperature $T_R$. The reagent concentration $c_A$ is measured at time instants $t_j$. The reaction is of first order, therefore we can write the analytic solution:

$\frac{dc_A}{dt} = -k \cdot c_A(t) \;\rightarrow\; c_A(t) = c_A|_{t=0} \cdot e^{-kt}$

The reaction constant $k$ and the reagent concentration at initial time $c_A(t=0)$ are unknown and should be determined from the measurements.

Optimization formulation, using $x_1 = c_A(t=0)$ and $x_2 = k$:
$c_{A,\text{theoretical}}(t_j) = \varphi(\mathbf{x}, t_j) = x_1 \cdot e^{-x_2 t_j}$ (model)
$c_{A,\text{measured}}(t_j) = y_j$ (measurement)
$\varepsilon_j = y_j - \varphi(\mathbf{x}, t_j), \;\; j = 1, \ldots, m$ (residual / error)

$\min_{\mathbf{x}\in\mathbb{R}^2} \frac{1}{2}\|\boldsymbol{\varepsilon}(\mathbf{x})\|_2^2 = \frac{1}{2}\sum_j \left(y_j - \varphi(\mathbf{x}, t_j)\right)^2$

(Figure: measured values vs. the fitted model for $c_A$ over time $t$.)
Gauss-Newton Method
• $\min_{\mathbf{x}\in\mathbb{R}^2} f(\mathbf{x}) = \min_{\mathbf{x}\in\mathbb{R}^2}\frac{1}{2}\|\boldsymbol{\varepsilon}(\mathbf{x})\|_2^2 = \min_{\mathbf{x}\in\mathbb{R}^2}\frac{1}{2}\boldsymbol{\varepsilon}(\mathbf{x})^T\boldsymbol{\varepsilon}(\mathbf{x})$

• Define the Jacobian of the residuals: $\mathbf{J}(\mathbf{x}) := \nabla\boldsymbol{\varepsilon}(\mathbf{x}) \in \mathbb{R}^{m\times 2}$

$\Rightarrow \nabla f(\mathbf{x}) = \mathbf{J}(\mathbf{x})^T\boldsymbol{\varepsilon}(\mathbf{x})$

$\Rightarrow \nabla^2 f(\mathbf{x}) = \mathbf{J}(\mathbf{x})^T\mathbf{J}(\mathbf{x}) + \sum_{j=1}^m \varepsilon_j(\mathbf{x})\,\nabla^2\varepsilon_j(\mathbf{x})$

• The Hessian can be approximated by the first term in the case of almost linear problems (i.e., $\nabla^2\varepsilon_j(\mathbf{x}) \approx \mathbf{0}$) or good starting values (i.e., small $\varepsilon_j(\mathbf{x})$)

• Newton's direction: $\nabla^2 f(\mathbf{x}^{(k)})\,\mathbf{p}^{(k)} = -\nabla f(\mathbf{x}^{(k)})$

• With the Hessian approximation: $\mathbf{J}^{(k)T}\mathbf{J}^{(k)}\mathbf{p}^{(k)} = -\mathbf{J}^{(k)T}\boldsymbol{\varepsilon}^{(k)}$
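A minimal Gauss-Newton sketch for the batch-reactor model $\varphi(\mathbf{x}, t_j) = x_1 e^{-x_2 t_j}$, fitted to synthetic noise-free data (the true parameters, time grid and starting point are illustrative assumptions):

```python
import numpy as np

def gauss_newton(residual, jac, x0, n_iter=10):
    """Gauss-Newton: solve (J^T J) p = -J^T eps at each iterate (full step)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iter):
        J, eps = jac(x), residual(x)
        p = np.linalg.solve(J.T @ J, -J.T @ eps)
        x = x + p
    return x

# Batch-reactor model c_A(t) = x1 * exp(-x2 * t); true parameters (2.0, 0.3)
t = np.linspace(0.0, 10.0, 20)
y = 2.0 * np.exp(-0.3 * t)                    # synthetic measurements
residual = lambda x: y - x[0] * np.exp(-x[1] * t)
jac = lambda x: np.column_stack([
    -np.exp(-x[1] * t),                       # d(eps_j)/d(x1)
    x[0] * t * np.exp(-x[1] * t),             # d(eps_j)/d(x2)
])
print(gauss_newton(residual, jac, [1.8, 0.35]))  # close to [2.0, 0.3]
```

Since the data are noise-free this is a zero-residual problem, for which the Gauss-Newton approximation of the Hessian becomes exact at the solution.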

Remarks on Gauss-Newton Method

• If $\mathbf{J}^{(k)}$ has full rank, $\mathbf{p}^{(k)}$ is always a descent direction:

$\mathbf{p}^{(k)T}\nabla f(\mathbf{x}^{(k)}) = \mathbf{p}^{(k)T}\mathbf{J}^{(k)T}\boldsymbol{\varepsilon}^{(k)} = -\mathbf{p}^{(k)T}\mathbf{J}^{(k)T}\mathbf{J}^{(k)}\mathbf{p}^{(k)} = -\|\mathbf{J}^{(k)}\mathbf{p}^{(k)}\|_2^2 \le 0$

The inequality is strict unless $\mathbf{J}^{(k)}\mathbf{p}^{(k)} = \mathbf{0} \Leftrightarrow \mathbf{J}^{(k)T}\boldsymbol{\varepsilon}^{(k)} = \nabla f^{(k)} = \mathbf{0}$, i.e., at an optimum.

• Along the descent direction $\mathbf{p}^{(k)}$, the step length is determined as per the Wolfe conditions.

• For linear models the Jacobian matrix $\mathbf{J}$ is constant.

• The minimization condition corresponds to the normal equations.

• If $\mathbf{J}^{(k)}$ is singular or almost singular, the descent direction $\mathbf{p}^{(k)}$ is usually not reliable and the method converges very poorly. Quasi-Newton methods are then more efficient.

• It is a local method.

Check Yourself

• Where can least-squares problems be applied?


• When is a least-squares problem linear or nonlinear?
• What is the key idea of the Gauss-Newton method?

Applied Numerical Optimization
Prof. Alexander Mitsos, Ph.D.

Trust region method


Trust Region Method (1)

Idea:

• Approximate $f$ at $\mathbf{x}^{(k)}$ by the quadratic model function $m^{(k)}$:
$m^{(k)}(\mathbf{p}) = f^{(k)} + \mathbf{g}^{(k)T}\mathbf{p} + \frac{1}{2}\mathbf{p}^T\mathbf{B}^{(k)}\mathbf{p}$
where $f^{(k)} = f(\mathbf{x}^{(k)})$, $\mathbf{g}^{(k)} = \nabla f(\mathbf{x}^{(k)})$ and $\mathbf{B}^{(k)}$ is symmetric

• Taylor series: the approximation error is small for small $\|\mathbf{p}\|$

• For each iteration $k = 0, 1, \ldots$ choose a trust-region radius $\Delta^{(k)}$

• Solve the minimization problem $\min_{\mathbf{p}} m^{(k)}(\mathbf{p})$ s.t. $\|\mathbf{p}\| \le \Delta^{(k)}$ and set $\mathbf{p}^{(k)}$ to the solution found

• Set $\mathbf{x}^{(k+1)} = \mathbf{x}^{(k)} + \mathbf{p}^{(k)}$

Trust Region Method (2)

How to update the radius $\Delta^{(k)}$?

• Compare the agreement between the model function $m^{(k)}$ and the objective function $f$ at the previous iterations. Define the contraction rate $\rho_k$ as:

$\rho_k = \frac{f(\mathbf{x}^{(k)}) - f(\mathbf{x}^{(k)} + \mathbf{p}^{(k)})}{m^{(k)}(\mathbf{0}) - m^{(k)}(\mathbf{p}^{(k)})} = \frac{\text{actual reduction}}{\text{predicted reduction}}$

Since $m^{(k)}$ is minimized over a domain containing $\mathbf{0}$: $m^{(k)}(\mathbf{0}) - m^{(k)}(\mathbf{p}^{(k)}) > 0$

If $\rho_k < 0$: reject this step (ascent)
If $\rho_k \approx 1$: increase the radius (good agreement)
If $\rho_k \approx 0$: decrease the radius (poor agreement)

Trust Region Method (3)

Basic algorithm:
choose $\Delta^{(max)} > 0$, $\Delta^{(0)} \in (0, \Delta^{(max)})$ and $\eta \in [0, \tfrac{1}{4})$

for $k = 0, 1, \ldots$
  calculate the direction $\mathbf{p}^{(k)}$ and the contraction rate $\rho_k$
  if $\rho_k < \tfrac{1}{4}$: $\Delta^{(k+1)} = \tfrac{1}{4}\|\mathbf{p}^{(k)}\|$
  else if $\rho_k > \tfrac{3}{4}$ and $\|\mathbf{p}^{(k)}\| = \Delta^{(k)}$: $\Delta^{(k+1)} = \min(2\Delta^{(k)}, \Delta^{(max)})$
  else: $\Delta^{(k+1)} = \Delta^{(k)}$
  if $\rho_k > \eta$: $\mathbf{x}^{(k+1)} = \mathbf{x}^{(k)} + \mathbf{p}^{(k)}$
  else: $\mathbf{x}^{(k+1)} = \mathbf{x}^{(k)}$
Trust Region Method (3)

Remarks:

• $\Delta^{(k)}$ is increased only if $\|\mathbf{p}^{(k)}\|$ reaches the boundary of the domain.

• Strategies for the efficient solution of the minimization problem for $\mathbf{p}^{(k)}$:
 The Cauchy point: minimum along the steepest-descent direction ($-\mathbf{g}^{(k)}$); slow
 The dogleg method: applicable when $\mathbf{B}^{(k)}$ is positive definite; fast (superlinear)
 Steihaug's approach for large sparse matrices

The Dogleg Method
Idea:
• For a large $\Delta^{(k)}$: Newton step, $\mathbf{p}^{(k)} = \mathbf{p}^B = -\mathbf{B}^{(k)\,-1}\mathbf{g}^{(k)}$, where $\mathbf{p}^B$ is the unconstrained minimum of $m^{(k)}$ and $\|\mathbf{p}^B\| \le \Delta^{(k)}$.

• For a small $\Delta^{(k)}$: search for the solution along the direction $-\mathbf{g}^{(k)}$.

• For an intermediate $\Delta^{(k)}$: additionally calculate $\mathbf{p}^U = -\dfrac{\mathbf{g}^{(k)T}\mathbf{g}^{(k)}}{\mathbf{g}^{(k)T}\mathbf{B}^{(k)}\mathbf{g}^{(k)}}\,\mathbf{g}^{(k)}$, where $\mathbf{p}^U$ is the unconstrained minimum of $m^{(k)}$ in the steepest-descent direction.

• Then calculate the search direction along the dogleg path, a piecewise-linear combination of $\mathbf{p}^U$ and $\mathbf{p}^B$:

$\mathbf{p}^{(k)}(\tau) = \begin{cases} \tau\mathbf{p}^U & 0 \le \tau \le 1 \\ \mathbf{p}^U + (\tau - 1)(\mathbf{p}^B - \mathbf{p}^U) & 1 \le \tau \le 2 \end{cases}$

with $\|\mathbf{p}^{(k)}(\tau^*)\| = \Delta^{(k)}$.

(Figure: dogleg path from $\mathbf{x}^{(k)}$ via $\mathbf{p}^U$ toward $\mathbf{p}^B$, intersecting the trust-region boundary of radius $\Delta^{(k)}$ at $\mathbf{p}^{(k)}(\tau^*)$.)
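The dogleg step can be sketched as follows (a sketch for symmetric positive definite $\mathbf{B}^{(k)}$; the test matrix and radius are illustrative):

```python
import numpy as np

def dogleg(B, g, delta):
    """Dogleg step for min m(p) = g.p + 0.5 p.B.p s.t. ||p|| <= delta,
    assuming B is symmetric positive definite."""
    pB = np.linalg.solve(B, -g)                    # full Newton step
    if np.linalg.norm(pB) <= delta:
        return pB
    pU = -(g @ g) / (g @ (B @ g)) * g              # Cauchy point
    if np.linalg.norm(pU) >= delta:
        return delta * pU / np.linalg.norm(pU)     # scaled steepest descent
    # Intersect p = pU + t*(pB - pU) with the boundary ||p|| = delta (t in [0,1])
    d = pB - pU
    a, b, c = d @ d, 2.0 * (pU @ d), pU @ pU - delta**2
    t = (-b + np.sqrt(b**2 - 4.0 * a * c)) / (2.0 * a)
    return pU + t * d

B = np.diag([1.0, 4.0])
g = np.array([1.0, 1.0])
p = dogleg(B, g, delta=0.8)
print(np.isclose(np.linalg.norm(p), 0.8))  # True: step lies on the boundary
```

With a large radius (e.g. delta = 2.0 here) the same routine returns the unconstrained Newton step.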
Illustration of Convergence – Rosenbrock Function

$\min_{\mathbf{x}\in\mathbb{R}^2} f(\mathbf{x}) = 100\,(x_2 - x_1^2)^2 + (1 - x_1)^2$

• Solution point is $\mathbf{x}^* = (1,1)^T$ - why?

(Figure: surface plot of $f(\mathbf{x})$ [1] and contour lines of $f(\mathbf{x})$ in the $(x_1, x_2)$-plane.)
[1] https://commons.wikimedia.org/w/index.php?curid=9941741
Illustration of Convergence – BFGS (Quasi-Newton Method)

$\min_{\mathbf{x}\in\mathbb{R}^2} f(\mathbf{x}) = 100\,(x_2 - x_1^2)^2 + (1 - x_1)^2$

• Matlab trust-region (fminunc)

(Figure: iterates on the surface and contour lines of $f(\mathbf{x})$ [1].)
[1] https://commons.wikimedia.org/w/index.php?curid=9941741
Check Yourself

• Explain the trust-region method.
• Which model problem is solved in trust-region methods?
• How is the trust-region radius updated?
