
5. Duality

• Lagrange dual problem
• weak and strong duality
• geometric interpretation
• optimality conditions
• perturbation and sensitivity analysis
• examples
• generalized inequalities

5–1

Lagrangian

standard form problem (not necessarily convex)

minimize    f_0(x)
subject to  f_i(x) ≤ 0, i = 1, . . . , m
            h_i(x) = 0, i = 1, . . . , p

variable x ∈ R^n, domain D, optimal value p⋆

Lagrangian: L : R^n × R^m × R^p → R, with dom L = D × R^m × R^p,

L(x, λ, ν) = f_0(x) + ∑_{i=1}^m λ_i f_i(x) + ∑_{i=1}^p ν_i h_i(x)

• weighted sum of objective and constraint functions
• λ_i is Lagrange multiplier associated with f_i(x) ≤ 0
• ν_i is Lagrange multiplier associated with h_i(x) = 0

Duality 5–2

Lagrange dual function

Lagrange dual function: g : R^m × R^p → R,

g(λ, ν) = inf_{x∈D} L(x, λ, ν)
        = inf_{x∈D} ( f_0(x) + ∑_{i=1}^m λ_i f_i(x) + ∑_{i=1}^p ν_i h_i(x) )

g is concave, can be −∞ for some λ, ν

lower bound property: if λ ⪰ 0, then g(λ, ν) ≤ p⋆

proof: if x̃ is feasible and λ ⪰ 0, then

f_0(x̃) ≥ L(x̃, λ, ν) ≥ inf_{x∈D} L(x, λ, ν) = g(λ, ν)

minimizing over all feasible x̃ gives p⋆ ≥ g(λ, ν)

Duality 5–3

Least-norm solution of linear equations

minimize    x^T x
subject to  Ax = b

dual function

• Lagrangian is L(x, ν) = x^T x + ν^T (Ax − b)
• to minimize L over x, set gradient equal to zero:

∇_x L(x, ν) = 2x + A^T ν = 0  =⇒  x = −(1/2) A^T ν

• plug into L to obtain g:

g(ν) = L((−1/2) A^T ν, ν) = −(1/4) ν^T A A^T ν − b^T ν

a concave function of ν

lower bound property: p⋆ ≥ −(1/4) ν^T A A^T ν − b^T ν for all ν

Duality 5–4
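The closed-form dual above can be checked numerically. A minimal NumPy sketch on a hypothetical random instance (sizes and seed are arbitrary): it verifies the lower bound property for random ν, and that ν⋆ = −2(AA^T)^{−1}b, obtained by maximizing the concave quadratic g, closes the gap.

```python
import numpy as np

# Hypothetical small instance: A is 2x4 with full row rank, so AA^T is invertible.
rng = np.random.default_rng(0)
A = rng.standard_normal((2, 4))
b = rng.standard_normal(2)

K = A @ A.T                            # AA^T (positive definite here)
x_star = A.T @ np.linalg.solve(K, b)   # primal optimum x* = A^T (AA^T)^{-1} b
p_star = x_star @ x_star               # optimal value p* = b^T (AA^T)^{-1} b

def g(nu):
    """Dual function g(nu) = -(1/4) nu^T AA^T nu - b^T nu."""
    return -0.25 * nu @ K @ nu - b @ nu

# lower bound property: g(nu) <= p* for every nu
for _ in range(100):
    nu = rng.standard_normal(2)
    assert g(nu) <= p_star + 1e-9

# dual optimum nu* = -2 (AA^T)^{-1} b closes the gap: g(nu*) = p*
nu_star = -2 * np.linalg.solve(K, b)
assert abs(g(nu_star) - p_star) < 1e-9
```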

Standard form LP

minimize    c^T x
subject to  Ax = b, x ⪰ 0

dual function

• Lagrangian is

L(x, λ, ν) = c^T x + ν^T (Ax − b) − λ^T x
           = −b^T ν + (c + A^T ν − λ)^T x

• L is linear in x, hence

g(λ, ν) = inf_x L(x, λ, ν) = −b^T ν if A^T ν − λ + c = 0, and −∞ otherwise

g is linear on affine domain {(λ, ν) | A^T ν − λ + c = 0}, hence concave

lower bound property: p⋆ ≥ −b^T ν if A^T ν + c ⪰ 0

Duality 5–5
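This primal/dual pair can be exercised with `scipy.optimize.linprog`. A sketch on assumed data: b is built from a nonnegative x_0 so the primal is feasible, and c > 0 keeps it bounded below on {x ⪰ 0}; for LPs the two optimal values in fact coincide.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical instance: b = A @ x0 with x0 >= 0 makes the primal feasible,
# and c > 0 makes it bounded below on the nonnegative orthant.
rng = np.random.default_rng(1)
A = rng.standard_normal((3, 6))
x0 = rng.random(6)
b = A @ x0
c = rng.random(6) + 0.1

# primal: minimize c^T x  subject to  Ax = b, x >= 0
primal = linprog(c, A_eq=A, b_eq=b, bounds=[(0, None)] * 6)

# dual: maximize -b^T nu  subject to  A^T nu + c >= 0,
# written for linprog as: minimize b^T nu  subject to  -A^T nu <= c
dual = linprog(b, A_ub=-A.T, b_ub=c, bounds=[(None, None)] * 3)

p_star = primal.fun
d_star = -dual.fun
# strong duality for LP: p* = d*
assert abs(p_star - d_star) < 1e-6
```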

Equality constrained norm minimization

minimize    ‖x‖
subject to  Ax = b

dual function

g(ν) = inf_x ( ‖x‖ − ν^T Ax + b^T ν ) = b^T ν if ‖A^T ν‖_∗ ≤ 1, and −∞ otherwise

where ‖v‖_∗ = sup_{‖u‖≤1} u^T v is dual norm of ‖ · ‖

proof: follows from inf_x ( ‖x‖ − y^T x ) = 0 if ‖y‖_∗ ≤ 1, −∞ otherwise

• if ‖y‖_∗ ≤ 1, then ‖x‖ − y^T x ≥ 0 for all x, with equality if x = 0
• if ‖y‖_∗ > 1, choose x = tu where ‖u‖ ≤ 1, u^T y = ‖y‖_∗ > 1:

‖x‖ − y^T x = t( ‖u‖ − ‖y‖_∗ ) → −∞ as t → ∞

lower bound property: p⋆ ≥ b^T ν if ‖A^T ν‖_∗ ≤ 1

Duality 5–6

Two-way partitioning

minimize    x^T W x
subject to  x_i^2 = 1, i = 1, . . . , n

• a nonconvex problem; feasible set contains 2^n discrete points
• interpretation: partition {1, . . . , n} in two sets; W_ij is cost of assigning i, j to the same set; −W_ij is cost of assigning to different sets

dual function

g(ν) = inf_x ( x^T W x + ∑_i ν_i (x_i^2 − 1) ) = inf_x ( x^T (W + diag(ν)) x − 1^T ν )
     = −1^T ν if W + diag(ν) ⪰ 0, and −∞ otherwise

lower bound property: p⋆ ≥ −1^T ν if W + diag(ν) ⪰ 0

example: ν = −λ_min(W) 1 gives bound p⋆ ≥ n λ_min(W)

Duality 5–7
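For small n the eigenvalue bound can be compared against the exact optimum by enumerating all 2^n feasible points. A sketch on a hypothetical instance with n = 8:

```python
import numpy as np
from itertools import product

# Hypothetical small instance: symmetric W with n = 8, so the 2^n feasible
# points can be enumerated exactly.
rng = np.random.default_rng(2)
n = 8
M = rng.standard_normal((n, n))
W = (M + M.T) / 2

# exact optimum by brute force over all x in {-1, +1}^n
p_star = min(np.array(x) @ W @ np.array(x) for x in product([-1, 1], repeat=n))

# dual bound from nu = -lambda_min(W) * 1:  p* >= n * lambda_min(W)
bound = n * np.linalg.eigvalsh(W)[0]   # eigvalsh returns ascending eigenvalues
assert bound <= p_star + 1e-9
```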

Lagrange dual and conjugate function

minimize    f_0(x)
subject to  Ax ⪯ b, Cx = d

dual function

g(λ, ν) = inf_{x∈dom f_0} ( f_0(x) + (A^T λ + C^T ν)^T x − b^T λ − d^T ν )
        = −f_0^∗(−A^T λ − C^T ν) − b^T λ − d^T ν

• recall definition of conjugate f^∗(y) = sup_{x∈dom f} ( y^T x − f(x) )
• simplifies derivation of dual if conjugate of f_0 is known

example: entropy maximization

f_0(x) = ∑_{i=1}^n x_i log x_i,    f_0^∗(y) = ∑_{i=1}^n e^{y_i − 1}

Duality 5–8
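The entropy conjugate is separable, so it can be spot-checked per coordinate with a scalar maximization. A sketch; the search bracket (0, 50] is an arbitrary choice that contains the maximizer x = e^{y−1} for the test values used.

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Per-coordinate check of the entropy conjugate:
# sup_{x>0} (y*x - x*log(x)) should equal exp(y - 1), attained at x = exp(y - 1).
for y in [-1.0, 0.0, 0.5, 2.0]:
    res = minimize_scalar(lambda x: -(y * x - x * np.log(x)),
                          bounds=(1e-9, 50.0), method='bounded')
    assert abs(-res.fun - np.exp(y - 1)) < 1e-5
```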

The dual problem

Lagrange dual problem

maximize    g(λ, ν)
subject to  λ ⪰ 0

• finds best lower bound on p⋆, obtained from Lagrange dual function
• a convex optimization problem; optimal value denoted d⋆
• λ, ν are dual feasible if λ ⪰ 0, (λ, ν) ∈ dom g
• often simplified by making implicit constraint (λ, ν) ∈ dom g explicit

example: standard form LP and its dual (page 5–5)

minimize    c^T x
subject to  Ax = b, x ⪰ 0

maximize    −b^T ν
subject to  A^T ν + c ⪰ 0

Duality 5–9

Weak and strong duality

weak duality: d⋆ ≤ p⋆

• always holds (for convex and nonconvex problems)
• can be used to find nontrivial lower bounds for difficult problems

for example, solving the SDP

maximize    −1^T ν
subject to  W + diag(ν) ⪰ 0

gives a lower bound for the two-way partitioning problem on page 5–7

strong duality: d⋆ = p⋆

• does not hold in general
• (usually) holds for convex problems
• conditions that guarantee strong duality in convex problems are called constraint qualifications

Duality 5–10

Slater’s constraint qualification

strong duality holds for a convex problem

minimize    f_0(x)
subject to  f_i(x) ≤ 0, i = 1, . . . , m
            Ax = b

if it is strictly feasible, i.e.,

∃x ∈ int D :  f_i(x) < 0, i = 1, . . . , m,  Ax = b

• also guarantees that the dual optimum is attained (if p⋆ > −∞)
• can be sharpened: e.g., can replace int D with relint D (interior relative to affine hull); linear inequalities do not need to hold with strict inequality, . . .
• there exist many other types of constraint qualifications

Duality 5–11

Inequality form LP

primal problem

minimize    c^T x
subject to  Ax ⪯ b

dual function

g(λ) = inf_x ( (c + A^T λ)^T x − b^T λ ) = −b^T λ if A^T λ + c = 0, and −∞ otherwise

dual problem

maximize    −b^T λ
subject to  A^T λ + c = 0, λ ⪰ 0

• from Slater’s condition: p⋆ = d⋆ if Ax̃ ≺ b for some x̃
• in fact, p⋆ = d⋆ except when primal and dual are infeasible

Duality 5–12

Quadratic program

primal problem (assume P ∈ S^n_{++})

minimize    x^T P x
subject to  Ax ⪯ b

dual function

g(λ) = inf_x ( x^T P x + λ^T (Ax − b) ) = −(1/4) λ^T A P^{−1} A^T λ − b^T λ

dual problem

maximize    −(1/4) λ^T A P^{−1} A^T λ − b^T λ
subject to  λ ⪰ 0

• from Slater’s condition: p⋆ = d⋆ if Ax̃ ≺ b for some x̃
• in fact, p⋆ = d⋆ always

Duality 5–13
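Both claims can be tested numerically with `scipy.optimize.minimize`. A sketch on a hypothetical strictly feasible instance (the construction of P, A, b and the solver tolerances are arbitrary choices for illustration):

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical instance: P = M^T M + I is positive definite; b = A @ xt + s
# with s > 0 makes xt strictly feasible, so Slater's condition holds.
rng = np.random.default_rng(3)
n, m = 4, 6
M = rng.standard_normal((n, n))
P = M.T @ M + np.eye(n)
A = rng.standard_normal((m, n))
xt = rng.standard_normal(n)
b = A @ xt + rng.random(m) + 0.1

# primal: minimize x^T P x  subject to  Ax <= b  (SLSQP, started at xt)
primal = minimize(lambda x: x @ P @ x, xt,
                  constraints=[{'type': 'ineq', 'fun': lambda x: b - A @ x}],
                  options={'ftol': 1e-12, 'maxiter': 500})
p_star = primal.fun

Pinv = np.linalg.inv(P)
def g(lam):
    """Dual function g(lam) = -(1/4) lam^T A P^{-1} A^T lam - b^T lam."""
    return -0.25 * lam @ A @ Pinv @ A.T @ lam - b @ lam

# weak duality: g(lam) <= p* for every lam >= 0
for _ in range(100):
    lam = rng.random(m)
    assert g(lam) <= p_star + 1e-5

# strong duality: maximizing the concave g over lam >= 0 recovers p*
dual = minimize(lambda lam: -g(lam), np.ones(m), bounds=[(0, None)] * m)
assert abs(-dual.fun - p_star) < 1e-4
```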

A nonconvex problem with strong duality

minimize    x^T A x + 2 b^T x
subject to  x^T x ≤ 1

A ⋡ 0, hence nonconvex

dual function: g(λ) = inf_x ( x^T (A + λI) x + 2 b^T x − λ )

• unbounded below if A + λI ⋡ 0 or if A + λI ⪰ 0 and b ∉ R(A + λI)
• minimized by x = −(A + λI)^† b otherwise: g(λ) = −b^T (A + λI)^† b − λ

dual problem and equivalent SDP:

maximize    −b^T (A + λI)^† b − λ
subject to  A + λI ⪰ 0, b ∈ R(A + λI)

maximize    −t − λ
subject to  [ A + λI   b ; b^T   t ] ⪰ 0

strong duality although primal problem is not convex (not easy to show)

Duality 5–14
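Weak duality is easy to check numerically even here: any sampled feasible point upper-bounds p⋆, so g(λ) must lie below the best sample. A NumPy sketch on a hypothetical indefinite A (the shift by −2I is an arbitrary way to make A ⋡ 0 likely):

```python
import numpy as np

# Hypothetical instance: symmetric A shifted down so it is (almost surely)
# not positive semidefinite, making the primal nonconvex.
rng = np.random.default_rng(4)
n = 3
M = rng.standard_normal((n, n))
A = (M + M.T) / 2 - 2 * np.eye(n)
b = rng.standard_normal(n)

# upper bound on p*: best objective over many sampled points with ||x|| <= 1
X = rng.standard_normal((20000, n))
X /= np.maximum(np.linalg.norm(X, axis=1, keepdims=True), 1.0)
vals = np.einsum('ij,jk,ik->i', X, A, X) + 2 * X @ b   # x^T A x + 2 b^T x per row
p_upper = vals.min()

# dual function for lam with A + lam*I positive definite:
# g(lam) = -b^T (A + lam I)^{-1} b - lam
lam_min = np.linalg.eigvalsh(A)[0]
for lam in np.linspace(-lam_min + 0.1, -lam_min + 5.0, 20):
    g = -b @ np.linalg.solve(A + lam * np.eye(n), b) - lam
    # weak duality: g(lam) <= p* <= p_upper
    assert g <= p_upper + 1e-9
```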

Geometric interpretation

for simplicity, consider problem with one constraint f_1(x) ≤ 0

interpretation of dual function:

g(λ) = inf_{(u,t)∈G} ( t + λu ),  where G = {(f_1(x), f_0(x)) | x ∈ D}

[figure: two plots of the set G in the (u, t)-plane, marking p⋆, g(λ), d⋆, and the line λu + t = g(λ)]

• λu + t = g(λ) is (non-vertical) supporting hyperplane to G
• hyperplane intersects t-axis at t = g(λ)

Duality 5–15

epigraph variation: same interpretation if G is replaced with

A = {(u, t) | f_1(x) ≤ u, f_0(x) ≤ t for some x ∈ D}

[figure: the set A with supporting line λu + t = g(λ) through (0, p⋆)]

strong duality

• holds if there is a non-vertical supporting hyperplane to A at (0, p⋆)
• for convex problem, A is convex, hence has supp. hyperplane at (0, p⋆)
• Slater’s condition: if there exist (ũ, t̃) ∈ A with ũ < 0, then supporting hyperplanes at (0, p⋆) must be non-vertical

Duality 5–16

Complementary slackness

assume strong duality holds, x⋆ is primal optimal, (λ⋆, ν⋆) is dual optimal

f_0(x⋆) = g(λ⋆, ν⋆) = inf_x ( f_0(x) + ∑_{i=1}^m λ⋆_i f_i(x) + ∑_{i=1}^p ν⋆_i h_i(x) )
                    ≤ f_0(x⋆) + ∑_{i=1}^m λ⋆_i f_i(x⋆) + ∑_{i=1}^p ν⋆_i h_i(x⋆)
                    ≤ f_0(x⋆)

hence, the two inequalities hold with equality

• x⋆ minimizes L(x, λ⋆, ν⋆)
• λ⋆_i f_i(x⋆) = 0 for i = 1, . . . , m (known as complementary slackness):

λ⋆_i > 0 =⇒ f_i(x⋆) = 0,    f_i(x⋆) < 0 =⇒ λ⋆_i = 0

Duality 5–17
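Complementary slackness can be observed on an inequality-form LP by solving primal and dual separately. A sketch on a hypothetical instance built so both are feasible: c = −A^T λ_0 with λ_0 ⪰ 0 makes the dual feasible (hence the primal bounded), and b = A x_0 + s with s ≻ 0 makes the primal strictly feasible.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical feasible, bounded instance (see construction in the lead-in).
rng = np.random.default_rng(5)
m, n = 6, 3
A = rng.standard_normal((m, n))
b = A @ rng.standard_normal(n) + rng.random(m) + 0.1
c = -A.T @ rng.random(m)

# primal: minimize c^T x  subject to  Ax <= b
primal = linprog(c, A_ub=A, b_ub=b, bounds=[(None, None)] * n)
# dual: maximize -b^T lam  s.t.  A^T lam + c = 0, lam >= 0,
# written for linprog as: minimize b^T lam  s.t.  A^T lam = -c, lam >= 0
dual = linprog(b, A_eq=A.T, b_eq=-c, bounds=[(0, None)] * m)

x_star, lam_star = primal.x, dual.x
assert abs(primal.fun - (-dual.fun)) < 1e-6     # strong duality
# complementary slackness: lam*_i * (b - A x*)_i = 0 for every i
slack = b - A @ x_star
assert np.all(np.abs(lam_star * slack) < 1e-6)
```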

Karush-Kuhn-Tucker (KKT) conditions

the following four conditions are called KKT conditions (for a problem with differentiable f_i, h_i):

1. primal constraints: f_i(x) ≤ 0, i = 1, . . . , m, h_i(x) = 0, i = 1, . . . , p
2. dual constraints: λ ⪰ 0
3. complementary slackness: λ_i f_i(x) = 0, i = 1, . . . , m
4. gradient of Lagrangian with respect to x vanishes:

∇f_0(x) + ∑_{i=1}^m λ_i ∇f_i(x) + ∑_{i=1}^p ν_i ∇h_i(x) = 0

from page 5–17: if strong duality holds and x, λ, ν are optimal, then they must satisfy the KKT conditions

Duality 5–18

KKT conditions for convex problem

if x̃, λ̃, ν̃ satisfy KKT for a convex problem, then they are optimal:

• from complementary slackness: f_0(x̃) = L(x̃, λ̃, ν̃)
• from 4th condition (and convexity): g(λ̃, ν̃) = L(x̃, λ̃, ν̃)

hence, f_0(x̃) = g(λ̃, ν̃)

if Slater’s condition is satisfied:

x is optimal if and only if there exist λ, ν that satisfy KKT conditions

• recall that Slater implies strong duality, and dual optimum is attained
• generalizes optimality condition ∇f_0(x) = 0 for unconstrained problem

Duality 5–19

example: water-filling (assume α_i > 0)

minimize    −∑_{i=1}^n log(x_i + α_i)
subject to  x ⪰ 0, 1^T x = 1

x is optimal iff x ⪰ 0, 1^T x = 1, and there exist λ ∈ R^n, ν ∈ R such that

λ ⪰ 0,    λ_i x_i = 0,    1/(x_i + α_i) + λ_i = ν

• if ν < 1/α_i: λ_i = 0 and x_i = 1/ν − α_i
• if ν ≥ 1/α_i: λ_i = ν − 1/α_i and x_i = 0
• determine ν from 1^T x = ∑_{i=1}^n max{0, 1/ν − α_i} = 1

interpretation

• n patches; level of patch i is at height α_i
• flood area with unit amount of water
• resulting level is 1/ν⋆

[figure: water-filling illustration showing patch heights α_i, fill depths x_i, and water level 1/ν⋆]

Duality 5–20
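The optimality conditions above give the usual water-filling algorithm: bisect on the water level 1/ν until the total water ∑_i max{0, 1/ν − α_i} equals 1. A NumPy sketch; the patch heights are a hypothetical instance.

```python
import numpy as np

# Water-filling by bisection on the level 1/nu: x_i = max(0, 1/nu - alpha_i),
# with the level chosen so that sum_i x_i = 1.
rng = np.random.default_rng(6)
alpha = rng.random(8) + 0.1            # hypothetical patch heights, alpha_i > 0

def total_water(level):
    return np.sum(np.maximum(0.0, level - alpha))

lo, hi = alpha.min(), alpha.max() + 1.0   # brackets: total is 0 at lo, > 1 at hi
for _ in range(100):
    mid = (lo + hi) / 2
    if total_water(mid) < 1.0:
        lo = mid
    else:
        hi = mid
level = (lo + hi) / 2
x = np.maximum(0.0, level - alpha)

assert abs(x.sum() - 1.0) < 1e-9           # 1^T x = 1
assert np.all(x >= 0)                      # x >= 0
# KKT: wherever x_i > 0, 1/(x_i + alpha_i) equals nu = 1/level
active = x > 0
assert np.allclose(1.0 / (x[active] + alpha[active]), 1.0 / level)
```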

Perturbation and sensitivity analysis

(unperturbed) optimization problem and its dual

minimize    f_0(x)
subject to  f_i(x) ≤ 0, i = 1, . . . , m
            h_i(x) = 0, i = 1, . . . , p

maximize    g(λ, ν)
subject to  λ ⪰ 0

perturbed problem and its dual

min.  f_0(x)
s.t.  f_i(x) ≤ u_i, i = 1, . . . , m
      h_i(x) = v_i, i = 1, . . . , p

max.  g(λ, ν) − u^T λ − v^T ν
s.t.  λ ⪰ 0

• x is primal variable; u, v are parameters
• p⋆(u, v) is optimal value as a function of u, v
• we are interested in information about p⋆(u, v) that we can obtain from the solution of the unperturbed problem and its dual

Duality 5–21

global sensitivity result

assume strong duality holds for unperturbed problem, and that λ⋆, ν⋆ are dual optimal for unperturbed problem

apply weak duality to perturbed problem:

p⋆(u, v) ≥ g(λ⋆, ν⋆) − u^T λ⋆ − v^T ν⋆ = p⋆(0, 0) − u^T λ⋆ − v^T ν⋆

sensitivity interpretation

• if λ⋆_i large: p⋆ increases greatly if we tighten constraint i (u_i < 0)
• if λ⋆_i small: p⋆ does not decrease much if we loosen constraint i (u_i > 0)
• if ν⋆_i large and positive: p⋆ increases greatly if we take v_i < 0; if ν⋆_i large and negative: p⋆ increases greatly if we take v_i > 0
• if ν⋆_i small and positive: p⋆ does not decrease much if we take v_i > 0; if ν⋆_i small and negative: p⋆ does not decrease much if we take v_i < 0

Duality 5–22

local sensitivity: if (in addition) p⋆(u, v) is differentiable at (0, 0), then

λ⋆_i = −∂p⋆(0, 0)/∂u_i,    ν⋆_i = −∂p⋆(0, 0)/∂v_i

proof (for λ⋆_i): from global sensitivity result,

∂p⋆(0, 0)/∂u_i = lim_{t↘0} ( p⋆(t e_i, 0) − p⋆(0, 0) ) / t ≥ −λ⋆_i
∂p⋆(0, 0)/∂u_i = lim_{t↗0} ( p⋆(t e_i, 0) − p⋆(0, 0) ) / t ≤ −λ⋆_i

hence, equality

[figure: p⋆(u) for a problem with one (inequality) constraint, with the affine global lower bound p⋆(0) − λ⋆ u tangent at u = 0]

Duality 5–23
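The local sensitivity formula can be checked by central finite differences on a small inequality-form LP (the dual from page 5–12). A sketch on a hypothetical random instance; it assumes p⋆ is differentiable at 0 and λ⋆ is unique, which hold generically for random data.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical feasible, bounded instance: c = -A^T lam0 with lam0 > 0 makes
# the dual feasible; b = A x0 + s with s > 0 makes the primal strictly feasible.
rng = np.random.default_rng(7)
m, n = 6, 3
A = rng.standard_normal((m, n))
b = A @ rng.standard_normal(n) + rng.random(m) + 0.1
c = -A.T @ (rng.random(m) + 0.1)

def p_star(u):
    """Optimal value of the perturbed problem: min c^T x s.t. Ax - b <= u."""
    return linprog(c, A_ub=A, b_ub=b + u, bounds=[(None, None)] * n).fun

# dual optimal lambda*: maximize -b^T lam s.t. A^T lam + c = 0, lam >= 0,
# written for linprog as: minimize b^T lam s.t. A^T lam = -c, lam >= 0
lam_star = linprog(b, A_eq=A.T, b_eq=-c, bounds=[(0, None)] * m).x

# local sensitivity: lambda*_i = -d p*(0,0) / d u_i (central finite difference)
t = 1e-5
for i in range(m):
    e = np.zeros(m)
    e[i] = t
    deriv = (p_star(e) - p_star(-e)) / (2 * t)
    assert abs(lam_star[i] + deriv) < 1e-3
```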

Duality and problem reformulations

• equivalent formulations of a problem can lead to very different duals
• reformulating the primal problem can be useful when the dual is difficult to derive, or uninteresting

common reformulations

• introduce new variables and equality constraints
• make explicit constraints implicit or vice-versa
• transform objective or constraint functions, e.g., replace f_0(x) by φ(f_0(x)) with φ convex, increasing

Duality 5–24

Introducing new variables and equality constraints

minimize f_0(Ax + b)

• dual function is constant: g = inf_x L(x) = inf_x f_0(Ax + b) = p⋆
• we have strong duality, but dual is quite useless

reformulated problem and its dual

minimize    f_0(y)
subject to  Ax + b − y = 0

maximize    b^T ν − f_0^∗(ν)
subject to  A^T ν = 0

dual function follows from

g(ν) = inf_{x,y} ( f_0(y) − ν^T y + ν^T Ax + b^T ν )
     = −f_0^∗(ν) + b^T ν if A^T ν = 0, and −∞ otherwise

Duality 5–25

norm approximation problem: minimize ‖Ax − b‖

minimize    ‖y‖
subject to  y = Ax − b

can look up conjugate of ‖ · ‖, or derive dual directly

g(ν) = inf_{x,y} ( ‖y‖ + ν^T y − ν^T Ax + b^T ν )
     = b^T ν + inf_y ( ‖y‖ + ν^T y ) if A^T ν = 0, and −∞ otherwise
     = b^T ν if A^T ν = 0, ‖ν‖_∗ ≤ 1, and −∞ otherwise

(see page 5–6)

dual of norm approximation problem

maximize    b^T ν
subject to  A^T ν = 0, ‖ν‖_∗ ≤ 1

Duality 5–26
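For the Euclidean norm (which is its own dual norm) strong duality can be verified in closed form: the least-squares residual r = Ax⋆ − b satisfies A^T r = 0, so ν = −r/‖r‖₂ is dual feasible and attains b^T ν = ‖r‖₂ = p⋆. A NumPy sketch on an arbitrary overdetermined instance:

```python
import numpy as np

# Hypothetical overdetermined instance (8 equations, 3 unknowns).
rng = np.random.default_rng(8)
A = rng.standard_normal((8, 3))
b = rng.standard_normal(8)

x_star = np.linalg.lstsq(A, b, rcond=None)[0]   # minimizes ||Ax - b||_2
r = A @ x_star - b
p_star = np.linalg.norm(r)

# dual feasible point nu = -r / ||r||_2: A^T nu = 0 and ||nu||_2 = 1,
# and b^T nu attains p* (strong duality)
nu = -r / p_star
assert np.allclose(A.T @ nu, 0, atol=1e-9)
assert abs(np.linalg.norm(nu) - 1) < 1e-12
assert abs(b @ nu - p_star) < 1e-9
```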

Implicit constraints

LP with box constraints: primal and dual problem

minimize    c^T x
subject to  Ax = b
            −1 ⪯ x ⪯ 1

maximize    −b^T ν − 1^T λ_1 − 1^T λ_2
subject to  c + A^T ν + λ_1 − λ_2 = 0
            λ_1 ⪰ 0, λ_2 ⪰ 0

reformulation with box constraints made implicit

minimize    f_0(x) = c^T x if −1 ⪯ x ⪯ 1, and ∞ otherwise
subject to  Ax = b

dual function

g(ν) = inf_{−1⪯x⪯1} ( c^T x + ν^T (Ax − b) ) = −b^T ν − ‖A^T ν + c‖_1

dual problem: maximize −b^T ν − ‖A^T ν + c‖_1

Duality 5–27
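The key step, inf over the box of (c + A^T ν)^T x = −‖A^T ν + c‖₁ (attained at x = −sign(A^T ν + c)), is easy to confirm numerically. A sketch on arbitrary random data:

```python
import numpy as np

# Check g(nu) = -b^T nu - ||A^T nu + c||_1 against direct minimization of the
# Lagrangian over the box -1 <= x <= 1 (hypothetical instance).
rng = np.random.default_rng(9)
m, n = 3, 5
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)
c = rng.standard_normal(n)
nu = rng.standard_normal(m)

w = c + A.T @ nu
# the linear function w^T x over the box is minimized at x = -sign(w)
x_min = -np.sign(w)
direct = w @ x_min - b @ nu          # inf over box of c^T x + nu^T (Ax - b)
closed_form = -b @ nu - np.linalg.norm(w, 1)
assert abs(direct - closed_form) < 1e-12

# sanity check: no box corner gives a smaller Lagrangian value
for _ in range(200):
    x = rng.choice([-1.0, 1.0], size=n)
    assert w @ x - b @ nu >= direct - 1e-12
```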

Problems with generalized inequalities

minimize    f_0(x)
subject to  f_i(x) ⪯_{K_i} 0, i = 1, . . . , m
            h_i(x) = 0, i = 1, . . . , p

⪯_{K_i} is generalized inequality on R^{k_i}

definitions are parallel to scalar case:

• Lagrange multiplier for f_i(x) ⪯_{K_i} 0 is vector λ_i ∈ R^{k_i}
• Lagrangian L : R^n × R^{k_1} × · · · × R^{k_m} × R^p → R, is defined as

L(x, λ_1, . . . , λ_m, ν) = f_0(x) + ∑_{i=1}^m λ_i^T f_i(x) + ∑_{i=1}^p ν_i h_i(x)

• dual function g : R^{k_1} × · · · × R^{k_m} × R^p → R, is defined as

g(λ_1, . . . , λ_m, ν) = inf_{x∈D} L(x, λ_1, . . . , λ_m, ν)

Duality 5–28

lower bound property: if λ_i ⪰_{K_i^∗} 0, then g(λ_1, . . . , λ_m, ν) ≤ p⋆

proof: if x̃ is feasible and λ_i ⪰_{K_i^∗} 0, then

f_0(x̃) ≥ f_0(x̃) + ∑_{i=1}^m λ_i^T f_i(x̃) + ∑_{i=1}^p ν_i h_i(x̃)
       ≥ inf_{x∈D} L(x, λ_1, . . . , λ_m, ν)
       = g(λ_1, . . . , λ_m, ν)

minimizing over all feasible x̃ gives p⋆ ≥ g(λ_1, . . . , λ_m, ν)

dual problem

maximize    g(λ_1, . . . , λ_m, ν)
subject to  λ_i ⪰_{K_i^∗} 0, i = 1, . . . , m

• weak duality: p⋆ ≥ d⋆ always
• strong duality: p⋆ = d⋆ for convex problem with constraint qualification (for example, Slater’s: primal problem is strictly feasible)

Duality 5–29

Semidefinite program

primal SDP (F_i, G ∈ S^k)

minimize    c^T x
subject to  x_1 F_1 + · · · + x_n F_n ⪯ G

• Lagrange multiplier is matrix Z ∈ S^k
• Lagrangian L(x, Z) = c^T x + tr( Z (x_1 F_1 + · · · + x_n F_n − G) )
• dual function

g(Z) = inf_x L(x, Z) = −tr(GZ) if tr(F_i Z) + c_i = 0, i = 1, . . . , n, and −∞ otherwise

dual SDP

maximize    −tr(GZ)
subject to  Z ⪰ 0, tr(F_i Z) + c_i = 0, i = 1, . . . , n

p⋆ = d⋆ if primal SDP is strictly feasible (∃x with x_1 F_1 + · · · + x_n F_n ≺ G)

Duality 5–30
