Panos Patrinos
STADIUS, Department of Electrical
Engineering, KU Leuven
1 – Outline 2/47
1 Quadratic programs
2 Nonlinear programs
$A = \begin{bmatrix} a_1^\top \\ a_2^\top \\ \vdots \\ a_m^\top \end{bmatrix} \in \mathbb{R}^{m \times n}$
very important problem. arises in
1 algorithms for NLPs (SQP)
2 optimal control of linear systems (MPC)
many algorithms: active set, interior point, first-order methods, ...
we assume that B ⪰ 0 (convex QP)
1 – KKT conditions 4/47
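$\mu_i \ge 0, \quad i \in \mathcal{A}(x^\star) = \{\, i \mid A_i x^\star = b_i \,\}$
$\mu_i = 0, \quad i \in \mathcal{I}(x^\star) = \{\, i \mid A_i x^\star < b_i \,\}$
$\mathcal{A}(x^\star) = \{\, i \mid A_i x^\star = b_i \,\}$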
$A_{\mathcal{A}} x = b_{\mathcal{A}}$
to mean
$a_i^\top x = b_i, \quad i \in \mathcal{A}$
1 – Active set methods 6/47
$A_{\mathcal{I}} \bar{x} \le b_{\mathcal{I}}$
$\bar{\mu} \ge 0$
$W_k \subseteq \mathcal{A}(x_k) = \{\, i \mid a_i^\top x_k = b_i \,\}$
dk is direction of descent:
f (xk + αdk ) < f (xk )
for α sufficiently small
1 – Primal active set method 10/47
where gk = Bxk + g.
if dk = 0, from KKT conditions for subproblem
if µ̃ ≥ 0, then xk and
µWk = µ̃, µi = 0, i ∉ Wk
satisfy the KKT conditions of the QP, so xk is optimal
if µ̃i < 0 for some i ∈ Wk we can further reduce cost by
solving
$i = \operatorname{argmin}_{j \in W_k} \tilde{\mu}_j$
1 – Primal active set method 13/47
for k = 0, 1, 2, . . . do
  solve the equality-constrained subproblem for dk, with multipliers µ̃
  if dk = 0 then
    if µ̃i ≥ 0 for all i ∈ Wk then
      stop with x⋆ = xk
    else
      j ← argmin over i ∈ Wk of µ̃i, Wk+1 ← Wk \ {j}, xk+1 ← xk
    end if
  else
    xk+1 = xk + αk dk, with αk ∈ [0, 1] the largest step that keeps xk+1 feasible
    if αk < 1 then
      Wk+1 = Wk ∪ {j}, where j is a blocking constraint
    else
      Wk+1 = Wk
    end if
  end if
end for
1 – Example [Nocedal & Wright (2006), Ex.16.4] 14/47
minimize $(x_1 - 1)^2 + (x_2 - 2.5)^2$
subject to $-x_1 + 2x_2 \le 2$
$x_1 + 2x_2 \le 6$
$x_1 - 2x_2 \le 2$
$x_1 \ge 0, \; x_2 \ge 0$
[Figure: feasible region in the (x1, x2) plane with labeled points (0,1), (2,2), (4,1), (2,0) and the active-set iterates x0 to x5]
1 – Example [Nocedal & Wright (2006), Ex.16.4] 15/47
[Figure 16.3: Iterates of the active-set method, plotted in the (x1, x2) plane]
x0 = (2, 0), W0 = {3, 5}, d0 = (0, 0), (µ̃3, µ̃5) = (−2, −1)
x1 = (2, 0), W1 = {5}, d1 = (−1, 0), α1 = 1
x2 = (1, 0), W2 = {5}, d2 = (0, 0), µ̃5 = −5
x3 = (1, 0), W3 = ∅, d3 = (0, 2.5), α3 = 0.6
x4 = (1, 1.5), W4 = {1}, d4 = (0.4, 0.2), α4 = 1
x5 = (1.4, 1.7), W5 = {1}, d5 = (0, 0), µ̃1 = 0.8, solution found
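The iterates listed above can be reproduced with a few lines of NumPy. This is a minimal sketch of the primal active-set iteration, not a robust solver: the function name `active_set_qp`, the dense KKT solve, the tolerance and the 0-based constraint indexing (constraint 1 of the slides is row 0 here) are assumptions made for illustration. The data are those of Ex. 16.4 written as min ½ xᵀBx + gᵀx subject to Ax ≤ b.

```python
import numpy as np

def active_set_qp(B, g, A, b, x0, W0, tol=1e-9, max_iter=50):
    """Primal active-set sketch for: min 0.5 x'Bx + g'x  s.t.  Ax <= b."""
    x = np.asarray(x0, dtype=float)
    W = list(W0)                       # working set (0-based rows of A)
    n = x.size
    for _ in range(max_iter):
        gk = B @ x + g                 # gradient of the cost at x_k
        Aw = A[np.array(W, dtype=int)]
        m = len(W)
        # KKT system of the equality-constrained subproblem in d
        K = np.zeros((n + m, n + m))
        K[:n, :n] = B
        K[:n, n:] = Aw.T
        K[n:, :n] = Aw
        sol = np.linalg.solve(K, np.concatenate([-gk, np.zeros(m)]))
        d, mu = sol[:n], sol[n:]
        if np.linalg.norm(d) < tol:
            if m == 0 or mu.min() >= -tol:
                return x, dict(zip(W, mu))      # KKT point found
            W.pop(int(np.argmin(mu)))           # drop most negative multiplier
        else:
            alpha, blocking = 1.0, None
            for i in range(A.shape[0]):
                ad = A[i] @ d
                if i not in W and ad > tol:     # constraint i can block the step
                    step = (b[i] - A[i] @ x) / ad
                    if step < alpha:
                        alpha, blocking = step, i
            x = x + alpha * d
            if blocking is not None:
                W.append(blocking)
    raise RuntimeError("active-set iteration did not terminate")

# Ex. 16.4: (x1 - 1)^2 + (x2 - 2.5)^2 equals 0.5 x'Bx + g'x up to a constant
B = 2.0 * np.eye(2)
g = np.array([-2.0, -5.0])
A = np.array([[-1.0, 2.0], [1.0, 2.0], [1.0, -2.0], [-1.0, 0.0], [0.0, -1.0]])
b = np.array([2.0, 6.0, 2.0, 0.0, 0.0])

x_opt, mu_opt = active_set_qp(B, g, A, b, x0=[2.0, 0.0], W0=[2, 4])
print(x_opt, mu_opt)
```

With 0-based indexing, W0 = {3, 5} becomes `[2, 4]`. The run should end at x ≈ (1.4, 1.7) with a multiplier of about 0.8 on the first constraint, in line with the last iterate listed above.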
1 – Initial feasible point 16/47
w0 = max{0, Ax0 − b}
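A short sketch of why this choice of w0 is convenient. It assumes, since the phase-I problem itself is not reproduced in this text, that the constraints are relaxed to Ax − w ≤ b, w ≥ 0. Under that assumption the pair (x0, w0) is feasible for any x0, and w0 = 0 exactly when x0 already satisfies Ax ≤ b. The constraint data are those of Ex. 16.4; the starting point `x0` is an arbitrary, hypothetical choice.

```python
import numpy as np

A = np.array([[-1.0, 2.0], [1.0, 2.0], [1.0, -2.0], [-1.0, 0.0], [0.0, -1.0]])
b = np.array([2.0, 6.0, 2.0, 0.0, 0.0])

x0 = np.array([5.0, 3.0])                 # hypothetical infeasible guess
w0 = np.maximum(0.0, A @ x0 - b)          # componentwise constraint violation

# (x0, w0) satisfies the relaxed constraints A x - w <= b, w >= 0
relaxed_feasible = bool(np.all(A @ x0 - w0 <= b + 1e-12) and np.all(w0 >= 0.0))
print(w0, relaxed_feasible)
```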
$x^\star = -B^{-1}(A^\top \mu^\star + g)$
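As a quick numerical check of this recovery formula (assuming B ≻ 0 so that the inverse exists), one can plug in the multipliers found at the end of Ex. 16.4: 0.8 on constraint 1 and zero elsewhere. The variable names below are illustrative.

```python
import numpy as np

# Ex. 16.4 data in the form min 0.5 x'Bx + g'x  s.t.  Ax <= b
B = 2.0 * np.eye(2)
g = np.array([-2.0, -5.0])
A = np.array([[-1.0, 2.0], [1.0, 2.0], [1.0, -2.0], [-1.0, 0.0], [0.0, -1.0]])

mu_star = np.array([0.8, 0.0, 0.0, 0.0, 0.0])    # multipliers from the example

# x* = -B^{-1} (A' mu* + g), computed with a linear solve instead of an inverse
x_star = -np.linalg.solve(B, A.T @ mu_star + g)
print(x_star)
```

The result should agree with the final iterate (1.4, 1.7) of the active-set example.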
2 – Outline 18/47
1 Quadratic programs
2 Nonlinear programs
minimize f (x)
subject to hi (x) = 0, i = 1, . . . , l
gi (x) ≤ 0, i = 1, . . . , m
at iteration k solve QP
minimize $\tfrac{1}{2} d^\top B_k d + \nabla f(x_k)^\top d$
subject to $h_i(x_k) + \nabla h_i(x_k)^\top d = 0, \quad i = 1, \dots, l$
$g_i(x_k) + \nabla g_i(x_k)^\top d \le 0, \quad i = 1, \dots, m$
and let
xk+1 = xk + dk
(λk+1 ,µk+1 ) Lagrange multipliers for equalities/inequalities
Newton
$B_k = \nabla^2_{xx} L(x_k, \lambda_k, \mu_k)$
quadratic convergence (under assumptions)
with
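$s_k = x_{k+1} - x_k,$
$y_k = \nabla_x L(x_{k+1}, \lambda_{k+1}, \mu_{k+1}) - \nabla_x L(x_k, \lambda_{k+1}, \mu_{k+1})$
$\tilde{y}_k = \theta_k y_k + (1 - \theta_k) B_k s_k$
$\theta_k = \begin{cases} 1, & \text{if } s_k^\top y_k \ge 0.2\, s_k^\top B_k s_k \\ \dfrac{0.8\, s_k^\top B_k s_k}{s_k^\top B_k s_k - s_k^\top y_k}, & \text{if } s_k^\top y_k < 0.2\, s_k^\top B_k s_k \end{cases}$
hence $B_{k+1} \succ 0$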
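The damping rule can be tried out numerically. The sketch below assumes the standard BFGS formula with y replaced by ỹ, that is B⁺ = B − (Bs)(Bs)ᵀ/(sᵀBs) + ỹỹᵀ/(sᵀỹ); the update formula itself is not shown in this text, so treat it as an assumption. The function name `damped_bfgs_update` and the test data are illustrative, chosen so that sᵀy < 0 and an undamped update would lose positive definiteness.

```python
import numpy as np

def damped_bfgs_update(B, s, y):
    """Damped BFGS update of B; B is assumed symmetric positive definite."""
    Bs = B @ s
    sBs = s @ Bs
    sy = s @ y
    # theta = 1 keeps y; otherwise y is blended with B s so that s'y~ = 0.2 s'Bs
    theta = 1.0 if sy >= 0.2 * sBs else 0.8 * sBs / (sBs - sy)
    y_tilde = theta * y + (1.0 - theta) * Bs
    return B - np.outer(Bs, Bs) / sBs + np.outer(y_tilde, y_tilde) / (s @ y_tilde)

B = np.eye(2)
s = np.array([1.0, 0.0])
y = np.array([-1.0, 0.0])          # negative curvature along s: s'y = -1

B_next = damped_bfgs_update(B, s, y)
min_eig = np.linalg.eigvalsh(B_next).min()
print(B_next, min_eig)
```

Here θ = 0.4 and ỹ = (0.2, 0), so the updated matrix stays positive definite even though sᵀy is negative.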
2 – Convergence rate of SQP 23/47
suppose that
1 (x⋆, λ⋆, µ⋆) is a KKT point
2 LICQ holds
3 SOSC holds
minimize f (x)
subject to hi (x) = 0, i = 1, . . . , l
gi (x) = 0, i ∈ A(x⋆)
2 – Local SQP example 25/47
minimize $-d_1 - d_2$
subject to $d_1 - d_2 \le \tfrac{3}{4}$
$d_1 + 2 d_2 \le -\tfrac{1}{4}$
to obtain
next iterate
k = 1:
$B_1 = \nabla^2_{xx} L(x_1, \mu_1) = \begin{bmatrix} 2 & 0 \\ 0 & \tfrac{4}{3} \end{bmatrix}$
solve
to obtain
next iterate
k   xk               µk              g1(xk)   g2(xk)   dk
0   (0.5, 1)         (0, 0)          −0.75    0.25     (0.417, −0.333)
1   (0.917, 0.667)   (0.33, 0.667)   0.174    0.285    (−0.170, −0.020)
2   (0.747, 0.686)   (0, 0.73)       −0.128   0.029    (−0.038, 0.021)
3   (0.709, 0.707)   (0, 0.707)      −0.204   0.002    (−0.002, 0.00)
4   (0.707, 0.707)   (0, 0.707)      −0.207   0        (0, 0)
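The first row of the table can be checked numerically. The QP at k = 0 has a linear objective, −d1 − d2, and two linearized constraints, d1 − d2 ≤ 3/4 and d1 + 2d2 ≤ −1/4. The sketch below assumes both constraints are active at the solution (an assumption that is consistent with the multipliers it returns being nonnegative) and solves the two resulting linear systems. Variable names are illustrative.

```python
import numpy as np

c = np.array([-1.0, -1.0])                 # objective gradient of the k = 0 QP
A = np.array([[1.0, -1.0], [1.0, 2.0]])    # gradients of the linearized constraints
rhs = np.array([0.75, -0.25])              # right-hand sides

d0 = np.linalg.solve(A, rhs)               # both constraints active: A d = rhs
mu1 = np.linalg.solve(A.T, -c)             # stationarity: c + A' mu = 0
x1 = np.array([0.5, 1.0]) + d0             # full step x1 = x0 + d0

print(d0, mu1, x1)
```

The values should match the table up to rounding: d0 ≈ (0.417, −0.333), µ1 ≈ (0.33, 0.667) and x1 ≈ (0.917, 0.667).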
2 – Globalization of SQP 29/47
where
$P(x) = \sum_{i=1}^{l} |h_i(x)| + \sum_{i=1}^{m} \max\{g_i(x), 0\}$
$\phantom{P(x)} = \|h(x)\|_1 + \|\max\{g(x), 0\}\|_1$
if x⋆ is
1 local minimum for penalized problem
2 feasible for constrained problem
then x⋆ is local minimum for constrained problem
if (x⋆, λ⋆, µ⋆) satisfy KKT and SOSC for constrained problem and
minimize 0
subject to $x^3 + 3x^2 + 3 = 0$
[Figure: plot against x on the interval [−4, 2]]
2 – Exact penalization 32/47
minimize x
subject to x ≥ 1
global minimum x⋆ = 1, Lagrange multiplier λ⋆ = 1
ϕc (x) = x + c max{1 − x, 0} is an exact penalty for c > 1
[Figure: ϕc(x) for c = 0.9, c = 1.2, c = 1.5, c = 1.8]
[Figure: ϕc(x) for c = 2, c = 4, c = 8, c = 16]
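The threshold c > 1 is easy to see numerically. The following sketch evaluates ϕc on a grid and reports the grid minimizer for a few values of c; the grid limits and the selected values of c are arbitrary choices for illustration. For c < 1 the function is unbounded below as x → −∞, so the grid minimizer simply lands on the left end of the grid.

```python
import numpy as np

def phi(x, c):
    """l1 penalty function for: minimize x subject to x >= 1."""
    return x + c * np.maximum(1.0 - x, 0.0)

xs = np.linspace(-5.0, 3.0, 801)           # grid with spacing 0.01
minimizers = {c: float(xs[np.argmin(phi(xs, c))]) for c in (0.9, 1.2, 2.0, 16.0)}
print(minimizers)
```

For every c > 1 in the list the minimizer should be x = 1, the solution of the constrained problem, while c = 0.9 gives the left grid boundary.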
$\varphi_c'(x; d) = \lim_{\alpha \downarrow 0} \dfrac{\varphi_c(x + \alpha d) - \varphi_c(x)}{\alpha}$
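Because ϕc is not differentiable at kinks, the one-sided limit matters. As a small illustration, take the earlier example ϕc(x) = x + c max{1 − x, 0} at the kink x = 1 and approximate the directional derivative by a one-sided difference quotient. The step size `alpha` and the helper name `dir_deriv` are assumptions of this sketch.

```python
def phi(x, c):
    """Penalty function of the example: minimize x subject to x >= 1."""
    return x + c * max(1.0 - x, 0.0)

def dir_deriv(x, d, c, alpha=1e-6):
    """One-sided difference quotient approximating phi_c'(x; d)."""
    return (phi(x + alpha * d, c) - phi(x, c)) / alpha

c = 1.5
right = dir_deriv(1.0, +1.0, c)    # moving into the feasible region
left = dir_deriv(1.0, -1.0, c)     # moving into the infeasible region
print(right, left)
```

The two values are approximately 1 and c − 1 = 0.5. Both are positive, so no direction is a descent direction at x = 1, which is consistent with x = 1 minimizing ϕc when c > 1.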
If dk = 0 stop.
Choose $c_k \ge \|(\lambda_{k+1}, \mu_{k+1})\|_\infty + \bar{c}$ and compute
Set xk+1 = xk + αk dk
end for
2 – Convergence of global SQP 37/47
assume that
1 f , h, g are smooth with Lipschitz gradient
2 ck remains constant after some iteration k
then every accumulation point of the sequence
{(xk , λk , µk )} is a KKT point
3 – Outline 38/47
1 Quadratic programs
2 Nonlinear programs
minimize f (x)
subject to hi (x) = 0, i = 1, . . . , l
gi (x) + si = 0, i = 1, . . . , m
si ≥ 0, i = 1, . . . , m
s ≥ 0, µi ≥ 0, si µi = 0
s ≥ 0, µi ≥ 0, si µi = τ
where S = diag(s1 , . . . , sm ).
solution for fixed τ ≥ 0: (x(τ ), s(τ ), λ(τ ), µ(τ ))
trajectory as τ varies is called primal-dual central path
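To make the central path concrete, the sketch below follows it for the toy problem used earlier for exact penalties: minimize x subject to x ≥ 1, written as g(x) = 1 − x ≤ 0 with slack s. The perturbed KKT system is then 1 − µ = 0, 1 − x + s = 0, sµ = τ, so the central path is x(τ) = 1 + τ, s(τ) = τ, µ(τ) = 1. The Newton loop, the fraction-to-boundary factor 0.995, the starting point and the τ schedule are all illustrative assumptions, not taken from the slides.

```python
import numpy as np

def residual(z, tau):
    """Perturbed KKT residual for: minimize x  s.t.  1 - x + s = 0, s >= 0."""
    x, s, mu = z
    return np.array([1.0 - mu, 1.0 - x + s, s * mu - tau])

def jacobian(z):
    x, s, mu = z
    return np.array([[0.0, 0.0, -1.0],
                     [-1.0, 1.0, 0.0],
                     [0.0, mu, s]])

def solve_perturbed_kkt(z, tau, tol=1e-10, max_iter=50):
    """Damped Newton iteration that keeps s > 0 and mu > 0."""
    for _ in range(max_iter):
        F = residual(z, tau)
        if np.linalg.norm(F) < tol:
            return z
        dz = np.linalg.solve(jacobian(z), -F)
        alpha = 1.0
        for idx in (1, 2):                      # components s and mu
            if dz[idx] < 0:
                alpha = min(alpha, 0.995 * (-z[idx] / dz[idx]))
        z = z + alpha * dz
    raise RuntimeError("Newton iteration did not converge")

z = np.array([3.0, 2.0, 0.5])                   # strictly positive s and mu
path = []
for tau in (1.0, 0.1, 0.01):                    # warm-start each solve
    z = solve_perturbed_kkt(z, tau)
    path.append((tau, float(z[0])))
print(path)
```

Each computed x(τ) should equal 1 + τ up to the tolerance, so the iterates approach the solution x = 1 as τ decreases.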
minimize $f(x) - \tau \sum_{i=1}^{m} \log s_i$
subject to hi (x) = 0, i = 1, . . . , l
gi (x) + si = 0, i = 1, . . . , m
[Figure: barrier term −τ log(s) versus s for τ = 2, τ = 1, τ = 0.5]
3 – Example 45/47
as τk decreases, xk → x⋆
3 – Example 46/47
[Figure: two panels, τ = 0.3 and τ = 0.03]
3 – Interior point methods 47/47