
APPLIED OPTIMIZATION

Lydie NOUVELIERE
Associate Professor-HDR
University Evry-Val-d’Essonne – University Paris-Saclay
IBISC Laboratory

M2 E3A Systèmes Automatiques Mobiles


Quiz, Oct. 7th
Course questions!

Duality
• Lagrange dual function:
  g(λ, μ) = inf_x L(x, λ, μ)
• g is always concave (an infimum of functions affine in (λ, μ))
• Considered for μ ≥ 0

• Dual problem (Q) associated with the primal problem (P):
  maximize g(λ, μ)
  over (λ, μ) ∈ ℝ^p × ℝ^q with μ ≥ 0 (maximization on ℝ^(p+q))
• Duality gap: J(x*) − g(λ*, μ*) ≥ 0 (weak duality; the gap is zero exactly when strong duality holds)
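As a worked illustration (an example added here, not from the slides), take the convex problem min x² subject to 1 − x ≤ 0; the dual function can be computed in closed form and the gap is zero:

```latex
\min_{x}\ x^2 \quad \text{s.t.}\quad 1 - x \le 0,
\qquad L(x,\mu) = x^2 + \mu(1 - x)
% inner minimization: \partial_x L = 2x - \mu = 0 \Rightarrow x = \mu/2
g(\mu) = \inf_x L(x,\mu) = \mu - \tfrac{\mu^2}{4}
% dual maximization on \mu \ge 0: g'(\mu) = 1 - \mu/2 = 0 \Rightarrow \mu^* = 2,\ g(\mu^*) = 1
% primal optimum: x^* = 1,\ J(x^*) = 1,\ so\ J(x^*) - g(\mu^*) = 0\ \text{(no gap)}
```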
Saddle point
• Symmetrization of (P):
  inf_x sup_{λ, μ ≥ 0} L(x, λ, μ)

• (P) is equivalent to this inf–sup problem

• The dual problem (Q) is the sup–inf problem:
  sup_{λ, μ ≥ 0} inf_x L(x, λ, μ)

• Saddle point: minimal with respect to one variable, maximal with respect to the other

• (x*, λ*, μ*) is a saddle point of the Lagrangian if, for all x, λ, μ (with μ* ≥ 0 and μ ≥ 0):
  L(x*, λ, μ) ≤ L(x*, λ*, μ*) ≤ L(x, λ*, μ*)
  Used in the Uzawa algorithm
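The saddle-point characterization is exactly what the Uzawa algorithm exploits: minimize the Lagrangian in x for fixed multipliers, then take a projected gradient-ascent step on the multipliers. A minimal sketch on a hypothetical one-dimensional problem (not from the course), where the inner minimization has a closed form:

```python
# Uzawa's algorithm (dual ascent) on the toy problem
#   minimize (x - 2)^2   subject to   g(x) = x - 1 <= 0.
# The Lagrangian L(x, mu) = (x - 2)^2 + mu * (x - 1) is minimized in x
# in closed form (x = 2 - mu/2); mu is then projected back onto mu >= 0.
# rho is a hypothetical step size (convergence needs rho small enough).

def uzawa(rho=1.0, iters=60):
    mu = 0.0
    for _ in range(iters):
        x = 2.0 - mu / 2.0                    # x^{k+1} = argmin_x L(x, mu^k)
        mu = max(0.0, mu + rho * (x - 1.0))   # projected ascent on the dual
    return x, mu

x_star, mu_star = uzawa()   # converges to x* = 1, mu* = 2
```

At the limit the constraint is active (x* = 1) with multiplier μ* = 2, which is indeed a saddle point of L.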
Duality Theorem
• (x*, λ*, μ*) is a saddle point with μ* ≥ 0 if and only if x* is a solution of (P),
  (λ*, μ*) is a solution of (Q), and the duality gap is zero.
• Interest: to solve the problem, one can then search for a saddle point of the
  Lagrangian.
• Remark: a saddle point of the Lagrangian satisfies the KKT conditions (with no
  hypothesis other than J, h and g being C¹).

• If the problem is convex: saddle point ⟺ KKT


Regularity
• Condition stronger than Mangasarian–Fromovitz:
  • x* is feasible
  • the ∇h_i(x*), 1 ≤ i ≤ p, and the ∇g_j(x*), j ∈ I(x*), are linearly independent

• Strongly active constraints:
  I⁺(x*) = { j ∈ I(x*) : μ_j* > 0 }

• If I⁺(x*) = I(x*), one obtains strict complementarity.

2nd Order necessary conditions
• Hypothesis :
• J, h and g are C2
• x* is a solution of (PO) and strongly regular
Then there exist λ* = (λ₁*, …, λ_p*) and μ* = (μ₁*, …, μ_q*) such that:
• the KKT conditions are verified
• and for all d verifying:
  • ⟨∇h_i(x*), d⟩ = 0, 1 ≤ i ≤ p
  • ⟨∇g_j(x*), d⟩ = 0, j ∈ I⁺(x*)
  • ⟨∇g_j(x*), d⟩ ≤ 0, j ∈ I(x*) \ I⁺(x*)
one obtains:
  ⟨∇²ₓₓL(x*, λ*, μ*) d, d⟩ ≥ 0
2nd Order Sufficient condition
• Hypothesis :
• J, h and g are C2
• (𝑥 ∗ , 𝜆∗ , 𝜇∗ ) verifies KKT conditions
If the matrix ∇²ₓₓL(x*, λ*, μ*) is positive definite on the cone
  { d ≠ 0 : ⟨∇h_i(x*), d⟩ = 0, 1 ≤ i ≤ p, and ⟨∇g_j(x*), d⟩ = 0, j ∈ I⁺(x*) }
then x* is a strict local minimum of J on the feasible set.
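Numerically, positive definiteness of the Hessian of the Lagrangian can be tested via a Cholesky factorization (here on the whole space rather than only on the cone, which is a stronger requirement). A sketch with a hypothetical 2×2 matrix standing in for ∇²ₓₓL(x*, λ*, μ*):

```python
import numpy as np

# Hypothetical Hessian of the Lagrangian at (x*, lambda*, mu*).
hess_L = np.array([[2.0, 0.5],
                   [0.5, 1.0]])

def is_positive_definite(H):
    """A symmetric matrix is positive definite iff its Cholesky factorization exists."""
    try:
        np.linalg.cholesky(H)
        return True
    except np.linalg.LinAlgError:
        return False

check = is_positive_definite(hess_L)   # True for the matrix above
```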


Method : Gradient descent
Direction of descent:

Directional derivative of f at x along d:
  f′(x; d) = ⟨∇f(x), d⟩

d is a direction of descent at x if:
  ⟨∇f(x), d⟩ < 0

The direction of steepest slope d⁺ is the gradient direction:
  d⁺ = ∇f(x) / ‖∇f(x)‖

The direction of steepest descent d⁻ is opposite to the gradient direction:
  d⁻ = −∇f(x) / ‖∇f(x)‖
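The descent test is just the sign of the scalar product with the gradient. A short sketch on the hypothetical function f(x, y) = x² + y², whose gradient is (2x, 2y):

```python
import numpy as np

x = np.array([1.0, 2.0])
g = 2.0 * x                          # gradient of f(x, y) = x^2 + y^2 at x: (2, 4)

d_minus = -g / np.linalg.norm(g)     # steepest-descent direction d-
d_bad = np.array([1.0, 0.0])         # <g, d_bad> = 2 > 0: not a descent direction

def is_descent(d):
    """d is a descent direction iff its scalar product with the gradient is negative."""
    return float(g @ d) < 0.0
```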

Method : Gradient descent

➔ d1 is not possible because the scalar product ⟨∇f(x), d1⟩ > 0
➔ d2 is possible, but d⁻ is better than d2: its scalar product with ∇f(x) is
more negative
Method : Gradient descent
Variation along a direction:
Displacement from x0 along a direction of displacement d:
  x(t) = x0 + t·d  (no displacement for t = 0)
Quadratic model in the neighborhood of x0, a function from ℝ to ℝ:
  q(t) = f(x0) + t ⟨g0, d⟩ + (t²/2) ⟨H0 d, d⟩
with g0 = ∇f(x0) (gradient) and H0 = ∇²f(x0) (Hessian)
Best direction according to the quadratic model: d2 < d⁻ < d1

Method : Gradient descent
Local minimization:
Two particular points are defined from the quadratic model of f at x0:
• Newton point x_n: minimization of the full quadratic model with respect to d
• Cauchy point x_c: minimization along d0 = −g0

Newton point x_n:
  x_n = x0 − H0⁻¹ g0
x_n exists if H0 is positive definite.

Cauchy point x_c:
  x_c = x0 − (⟨g0, g0⟩ / ⟨H0 g0, g0⟩) g0
x_c exists if f is convex along g0, i.e. ⟨H0 g0, g0⟩ > 0
(a condition less strong than H0 positive definite).
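Both points have closed forms in terms of g0 and H0. A sketch with hypothetical values (H0 positive definite, so both points exist):

```python
import numpy as np

x0 = np.array([0.0, 0.0])
g0 = np.array([2.0, 1.0])            # gradient of f at x0 (hypothetical)
H0 = np.array([[2.0, 0.0],
               [0.0, 4.0]])          # Hessian of f at x0, positive definite

# Newton point: minimizer of the full quadratic model.
x_newton = x0 - np.linalg.solve(H0, g0)            # x_n = x0 - H0^{-1} g0

# Cauchy point: exact minimization along d0 = -g0.
t_star = (g0 @ g0) / (g0 @ H0 @ g0)                # optimal step; needs g0.H0.g0 > 0
x_cauchy = x0 - t_star * g0
```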
Example
100 random points

An error to be minimized
Step 1 consists in defining an objective function to be minimized: a function of a
parameter "a" that returns an error, namely the error between the points and the
regression model.

Objective function:
  E(a) = (1/n) Σᵢ (yᵢ − a·xᵢ)²

"a": parameter of the regression model y = ax, to be estimated
yᵢ: output value at point i
a·xᵢ: value predicted by the regression model at point i
n: number of points
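The objective function can be written directly in code. The data below is hypothetical (100 points scattered around y = 3x), since the slide's dataset is not given:

```python
import numpy as np

# Hypothetical dataset: 100 random points around the line y = 3x.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 10.0, 100)
y = 3.0 * x + rng.normal(0.0, 1.0, 100)

def error(a):
    """Objective E(a) = (1/n) * sum_i (y_i - a*x_i)^2 (mean squared error)."""
    return float(np.mean((y - a * x) ** 2))
```

Evaluating `error` for several values of "a" reproduces the bowl-shaped curve of the following slides, with its minimum near a = 3.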
Example
The gradient descent method consists of taking a value of "a" at random and
varying it more or less strongly according to the slope of the objective function.

Instead of testing all possible values of "a", one can vary its value with
variable steps that get smaller and smaller as one gets closer to the minimum.

Thinking back to high school, this slope is the derivative, also called a partial
derivative when working with several parameters. It is calculated as follows:
  dE/da = −(2/n) Σᵢ xᵢ (yᵢ − a·xᵢ)
Example
The figure below shows the slope values for different values of "a" ranging
from −10 to 10.
The closer we get to the minimum, the smaller the slope becomes. It is negative to
the left of the minimum and positive to the right. To find the right value of "a",
it suffices to vary "a" proportionally to this gradient: if the slope is negative
we increase "a", if it is positive we decrease "a".
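The update rule described above ("vary a proportionally to this gradient") can be sketched as follows, again on hypothetical data around y = 3x; the starting value and learning rate are assumptions:

```python
import numpy as np

# Hypothetical dataset: 100 random points around the line y = 3x.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 10.0, 100)
y = 3.0 * x + rng.normal(0.0, 1.0, 100)

def fit(a=10.0, lr=0.01, iters=200):
    """Gradient descent on E(a) = (1/n) sum (y_i - a*x_i)^2."""
    n = len(x)
    for _ in range(iters):
        slope = -(2.0 / n) * np.sum(x * (y - a * x))   # dE/da
        a -= lr * slope                                # move against the slope
    return a

a_hat = fit()   # converges close to the least-squares estimate, near 3
```

Because the step is proportional to the slope, the updates automatically shrink near the minimum, which is exactly the behavior described in the text.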
