Convex Problems, Separation Theorems: September 17, 2008

Lecture 7
Convex Problems, Separation Theorems
September 17, 2008

Outline
• Preliminary for Duality Theory
  • Separation Theorems (Ch. 2.5 of Boyd and Vandenberghe’s book)
    • Supporting Hyperplane Theorem
    • Separating Hyperplane Theorems
• Duality
  • Motivation
  • Visualization of Primal-Dual Framework
  • Primal-Dual Constrained Optimization Problems
  • Dual Function Properties
  • Weak and Strong Duality
  • Examples

Some Terminology
Given a hyperplane $H = \{x \in \mathbb{R}^n \mid a^T x = b\}$, we say that

• The hyperplane $H$ passes through a vector $x_0$ when
$x_0 \in H \iff a^T x_0 = b$
• The hyperplane $H$ contains a set $C$ in one of its halfspaces when
either $a^T z \le b$ for all $z \in C$ or $a^T z \ge b$ for all $z \in C$
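The two definitions above can be checked numerically when the set is the convex hull of finitely many points, since a linear function attains its extremes over such a hull at the listed points. The following is a minimal sketch under that assumption; the helper names `dot`, `passes_through`, and `contains_in_halfspace` are illustrative and not part of the lecture.

```python
def dot(a, z):
    """Inner product a^T z of two equal-length sequences."""
    return sum(ai * zi for ai, zi in zip(a, z))


def passes_through(a, b, x0, tol=1e-12):
    """True when the hyperplane {x | a^T x = b} passes through x0."""
    return abs(dot(a, x0) - b) <= tol


def contains_in_halfspace(a, b, points, tol=1e-12):
    """True when all points satisfy a^T z <= b, or all satisfy a^T z >= b."""
    below = all(dot(a, z) <= b + tol for z in points)
    above = all(dot(a, z) >= b - tol for z in points)
    return below or above


# Corners of the unit square; their convex hull plays the role of C.
square = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]

# The hyperplane x1 + x2 = 2 touches the square at the corner (1, 1).
touching = contains_in_halfspace((1.0, 1.0), 2.0, square)

# The hyperplane x1 - x2 = 0 cuts through the square.
cutting = contains_in_halfspace((1.0, -1.0), 0.0, square)
```

Here `touching` is true and `cutting` is false: the first hyperplane keeps the square on one side, the second splits it.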

Supporting Hyperplane Theorem
Th. Let $C \subseteq \mathbb{R}^n$ be a nonempty convex set. Let $x_0$ be such that

either $x_0 \in \operatorname{bd} C$ or $x_0 \notin C$
Then, there exists a hyperplane passing through $x_0$ and containing the set
$C$ in one of its halfspaces, i.e., there is a vector $a \in \mathbb{R}^n$, $a \ne 0$, such that
$$\sup_{z \in C} a^T z \le a^T x_0$$
• A hyperplane with this property is referred to as a supporting hyperplane

Proof of the Supporting Hyperplane Theorem

Proof for the case when $C$ is closed:
Let $\{x_k\}$ be a sequence with $x_k \notin C$ for all $k$ and $x_k \to x_0$. (Why does it exist?)
Let $z_k^*$ be the projection of $x_k$ on the set $C$ for each $k$. Consider
$$a_k = \frac{x_k - z_k^*}{\|x_k - z_k^*\|} \quad \text{for } k \ge 1$$
This sequence is bounded and, therefore, it has a limit point, say $a \in \mathbb{R}^n$.
Since $\|a_k\| = 1$ for all $k$, we have $\|a\| = 1$ and hence $a \ne 0$.
Let $\{a_k\}_K$ be a subsequence of $\{a_k\}$ converging to $a$. Since $z_k^*$ is the
projection of $x_k$ for each $k$, by the Projection Theorem (b), it follows that
for each $k$
$$(z_k^* - x_k)^T (z - z_k^*) \ge 0 \quad \text{for all } z \in C$$
implying that for all $k$
$$a_k^T z \le a_k^T z_k^* \quad \text{for all } z \in C$$
Because $a_k^T z_k^* = a_k^T (z_k^* - x_k) + a_k^T x_k < a_k^T x_k$ for all $k$, we obtain
$$a_k^T z < a_k^T x_k \quad \text{for all } z \in C$$
Since $x_k \to x_0$ and $a_k \to a$ over $k \in K$, it follows that
$$a^T z \le a^T x_0 \quad \text{for all } z \in C$$
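The construction in the proof can be traced on a concrete closed convex set. The sketch below assumes $C = [0,1]^2$ (whose projection is a coordinate-wise clamp), the boundary point $x_0 = (1, 0.5)$, and the outside sequence $x_k = x_0 + (1/k, 0)$; these choices and the function names are illustrative assumptions, not taken from the lecture.

```python
import math


def dot(a, z):
    """Inner product a^T z."""
    return sum(ai * zi for ai, zi in zip(a, z))


def project_onto_box(x, lo=0.0, hi=1.0):
    """Euclidean projection onto the box [lo, hi]^n (coordinate-wise clamp)."""
    return tuple(min(max(xi, lo), hi) for xi in x)


x0 = (1.0, 0.5)  # a boundary point of C = [0, 1]^2


def normal_at(k):
    """Unit vector a_k = (x_k - z_k*) / ||x_k - z_k*|| from the proof."""
    xk = (x0[0] + 1.0 / k, x0[1])      # x_k lies outside C and tends to x0
    zk = project_onto_box(xk)          # z_k* is the projection of x_k on C
    diff = tuple(xi - zi for xi, zi in zip(xk, zk))
    norm = math.sqrt(dot(diff, diff))
    return tuple(d / norm for d in diff)


a = normal_at(1000)

# A linear function over the box is maximized at a corner, so checking the
# corners verifies sup_{z in C} a^T z <= a^T x0.
corners = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
supports = all(dot(a, z) <= dot(a, x0) + 1e-9 for z in corners)
```

For this choice every $a_k$ points along the first coordinate axis, so the limit direction is $(1, 0)$ and the supporting hyperplane is the line $x_1 = 1$.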

Separating Hyperplane Theorems
Th. Let $C, D \subseteq \mathbb{R}^n$ be nonempty convex disjoint sets, i.e., $C \cap D = \emptyset$.

Then, there exists a hyperplane separating these sets, i.e.,
there is $a \in \mathbb{R}^n$, $a \ne 0$, such that
$$\sup_{x \in C} a^T x \le \inf_{z \in D} a^T z$$

Proof of the Separating Hyperplane Theorem

Consider the set $Y = C - D = \{x - z \mid x \in C,\ z \in D\}$. This is a (nonempty) convex set.
Since $C \cap D = \emptyset$, it follows that $0 \notin Y$.
By the Supporting Hyperplane Theorem applied to the set $Y$ and the point $x_0 = 0$,
there exists $a \in \mathbb{R}^n$, $a \ne 0$, such that
$$\sup_{y \in Y} a^T y \le 0.$$
Hence, $a^T y \le 0$ for all $y \in Y$. Because $Y = C - D$, we have
$$a^T x \le a^T z \quad \text{for all } x \in C \text{ and all } z \in D$$
Taking the supremum over $x \in C$ and the infimum over $z \in D$, we obtain
$$\sup_{x \in C} a^T x \le \inf_{z \in D} a^T z$$
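As a small numerical illustration of the conclusion, consider two disjoint closed disks. The sketch assumes the direction $a$ is taken as the difference of the centers, which is a convenient choice for disks and not the construction used in the proof; for a ball of radius $r$ centered at $c$, the supremum of $a^T x$ equals $a^T c + r\|a\|$ and the infimum equals $a^T c - r\|a\|$.

```python
import math


def dot(a, z):
    """Inner product a^T z."""
    return sum(ai * zi for ai, zi in zip(a, z))


def separate_disks(c1, r1, c2, r2):
    """Return (a, sup_C, inf_D) for C = ball(c1, r1) and D = ball(c2, r2).

    The direction a = c2 - c1 is an illustrative choice for this example.
    """
    a = tuple(q - p for p, q in zip(c1, c2))
    norm_a = math.sqrt(dot(a, a))
    sup_C = dot(a, c1) + r1 * norm_a   # largest value of a^T x over C
    inf_D = dot(a, c2) - r2 * norm_a   # smallest value of a^T z over D
    return a, sup_C, inf_D


a, sup_C, inf_D = separate_disks((0.0, 0.0), 1.0, (3.0, 0.0), 1.0)
separated = sup_C <= inf_D
```

Any $b$ between `sup_C` and `inf_D` then gives a hyperplane $\{x \mid a^T x = b\}$ with the two disks in opposite halfspaces.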

Strictly Separating Hyperplane Theorem
Th. Let $C, D \subseteq \mathbb{R}^n$ be nonempty convex disjoint sets.

Assume that $C - D$ is closed. Then, there exists a hyperplane strictly
separating the sets, i.e., there is $a \in \mathbb{R}^n$, $a \ne 0$, such that
$$\sup_{x \in C} a^T x < \inf_{z \in D} a^T z$$
Proof: Homework assignment.

• When is $C - D$ closed?
• One sufficient condition: $C$ is closed and $D$ is compact

Duality Theory
• An important part of optimization theory
• Its implications are far-reaching in both theory and practice
• A powerful tool providing:
  • A basis for the development of a rich class of optimization algorithms
  • A general systematic way of developing bounding strategies (in both
    continuous and discrete optimization)
  • A basis for sensitivity analysis

Main Idea and Issues in Duality Theory
• Associate an “equivalent dual problem” with a given (primal) problem
• The methodology is applicable to a general constrained optimization problem
• Investigate:
  • Is there a general relation between the primal and its associated dual problem?
  • Under which conditions do the primal and the dual problems have the same optimal value?
  • Under which conditions do primal and dual optimal solutions exist?
  • What are the relations between primal and dual optimal solutions?
  • What kind of information do the dual optimal solutions provide about the primal problem?

Geometric Visualization of Duality
We illustrate duality using an abstract “geometric framework”
• This framework provides insights into:
  • Weak duality
  • Strong duality (zero duality gap)
  • Existence of duality gap
Within this setting, we define:
  • A “geometric primal problem” using an abstract set $V \subseteq \mathbb{R}^m \times \mathbb{R}$
  • A corresponding “geometric dual problem” using the hyperplanes that support the set $V$

Geometric Primal
Consider an abstract (nonempty) set $V$ of vectors $(u, w) \in \mathbb{R}^m \times \mathbb{R}$

The set $V$ intersects the $w$-axis, i.e.,
$(0, w) \in V$ for some $w \in \mathbb{R}$
The set $V$ extends “north” and “east”:
[North] For any $(u, w) \in V$ and $\tilde w \in \mathbb{R}$ with $w \le \tilde w$, we have $(u, \tilde w) \in V$
[East] For any $(u, w) \in V$ and $\tilde u \in \mathbb{R}^m$ with $u \preceq \tilde u$, we have $(\tilde u, w) \in V$
• Geometric Primal Problem
Determine the minimum intercept of the set $V$ and the $w$-axis:
minimize $w$
subject to $(0, w) \in V$
The minimum intercept value is denoted by $f^*$, i.e., $f^* = \inf_{(0,w) \in V} w$.
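
Nonvertical Hyperplanes
A hyperplane in $\mathbb{R}^m \times \mathbb{R}$: $\{(u, w) \mid \mu^T u + \mu_0 w = \xi\}$, with $\mu \in \mathbb{R}^m$ and $\mu_0, \xi \in \mathbb{R}$
• We say that a hyperplane is nonvertical when $\mu_0 \ne 0$
• Let $H_{\mu,\xi}$ denote a nonvertical hyperplane in $\mathbb{R}^m \times \mathbb{R}$ (normalized so that $\mu_0 = 1$), i.e.,
$H_{\mu,\xi} = \{(u, w) \mid \mu^T u + w = \xi\}$ with $\mu \in \mathbb{R}^m$, $\xi \in \mathbb{R}$
• Let $q(\mu)$ be the infimum of $\mu^T u + w$ over $(u, w) \in V$, i.e.,
$q(\mu) = \inf_{(u,w) \in V} \{\mu^T u + w\}$
• A nonvertical hyperplane $H_{\mu,\hat\xi}$ supports the set $V$ when $\hat\xi = q(\mu)$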

Geometric Dual Problem
• A hyperplane supporting the set $V$ intersects the $w$-axis at $(0, q(\mu))$

• Geometric Dual Problem: Determine the maximum intercept with the
$w$-axis for the nonvertical hyperplanes that support the set $V$:
maximize $q(\mu)$
subject to $\mu \in \mathbb{R}^m$
• Note: $q(\mu) = \inf_{(u,w) \in V} \{\mu^T u + w\}$, and $q(\mu) = -\infty$ whenever $\mu \not\succeq 0$
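The note about $q(\mu) = -\infty$ follows from the assumption that $V$ extends in the direction of increasing $u$. A short derivation, where $e_i$ denotes the $i$-th unit vector of $\mathbb{R}^m$ (notation introduced here for illustration):

```latex
% Suppose \mu_i < 0 for some index i, and fix any (u, w) \in V.
% Because V extends in the direction of increasing u,
% (u + t e_i, w) \in V for every t \ge 0. Therefore
q(\mu) \;\le\; \mu^T (u + t e_i) + w \;=\; \mu^T u + w + t\,\mu_i
\;\longrightarrow\; -\infty \quad \text{as } t \to \infty .
```

So only vectors $\mu \succeq 0$ can give a finite dual value, which is why the dual problem may be restricted to $\mu \succeq 0$ without changing its optimal value.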

Observations
Primal: minimize $w$ subject to $(0, w) \in V$
Dual: maximize $q(\mu)$ subject to $\mu \succeq 0$
• Dual values $q(\mu)$ never exceed $f^*$, nor any $w$ with $(0, w) \in V$
• The dual optimal value $q^*$ never exceeds the primal optimal value $f^*$:
$q^* \le f^*$ (Weak Duality)
• The weak duality inequality may be strict, i.e., $q^* < f^*$, in which case there is a Duality Gap
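Weak duality in the geometric framework follows in two lines from the definitions of $q(\mu)$ and $f^*$. A sketch of the argument:

```latex
% For any \mu \in \mathbb{R}^m and any w with (0, w) \in V,
% the point (0, w) is one of the candidates in the infimum defining q:
q(\mu) \;=\; \inf_{(u, w') \in V} \{\mu^T u + w'\} \;\le\; \mu^T 0 + w \;=\; w .
% Taking the infimum over all such w, then the supremum over \mu:
q^* \;=\; \sup_{\mu} q(\mu) \;\le\; \inf_{(0, w) \in V} w \;=\; f^* .
```

No convexity of $V$ is used anywhere in this argument.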

Duality Gap Illustrations
• A duality gap may exist even for a convex set $V$

Strong Duality
• We may have $q^* = f^*$ (Strong Duality)
• However, a nonvertical hyperplane achieving the maximum intercept may not exist
• With or without convexity of $V$, only one relation is guaranteed: $q^* \le f^*$

Constrained Optimization Duality

Primal Problem (not necessarily convex)
minimize $f(x)$
subject to $g_j(x) \le 0$, $j = 1, \ldots, m$
$x \in X$
with variable $x \in \mathbb{R}^n$; the problem is assumed feasible, with optimal value $f^* > -\infty$

Geometric Framework:
• Define the set $V$ as follows:
$V = \{(u, w) \mid \text{there is } x \in X \text{ such that } g(x) \preceq u,\ f(x) \le w\}$
• Dual function is:
$q(\mu) = \inf_{(u,w) \in V} \{w + \mu^T u\} = \inf_{x \in X} \{f(x) + \mu^T g(x)\}$, $\mu \succeq 0$
• Dual Problem:
maximize $q(\mu)$
subject to $\mu \succeq 0$
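The equality between the two expressions for the dual function uses $\mu \succeq 0$ in one direction only. A brief sketch of why the two infima coincide:

```latex
\begin{aligned}
&\text{($\ge$)}\quad (u, w) \in V \text{ and } \mu \succeq 0
  \;\Longrightarrow\; \exists\, x \in X:\;
  w + \mu^T u \;\ge\; f(x) + \mu^T g(x) \;\ge\; \inf_{x \in X} \{f(x) + \mu^T g(x)\}, \\
&\text{($\le$)}\quad (g(x), f(x)) \in V \text{ for every } x \in X
  \;\Longrightarrow\;
  \inf_{(u, w) \in V} \{w + \mu^T u\} \;\le\; f(x) + \mu^T g(x) \text{ for every } x \in X .
\end{aligned}
```

The first line is where $\mu \succeq 0$ matters: it turns $g(x) \preceq u$ into $\mu^T g(x) \le \mu^T u$.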

General Case
Primal Problem (not necessarily convex)
minimize $f(x)$
subject to $g_j(x) \le 0$, $j = 1, \ldots, m$
$h_j(x) = 0$, $j = 1, \ldots, p$
$x \in X$
with variable $x \in \mathbb{R}^n$; the problem is assumed feasible, with optimal value $f^* > -\infty$
Lagrangian Function: $L : \mathbb{R}^n \times \mathbb{R}^m \times \mathbb{R}^p \to \mathbb{R}$ given by
$$L(x, \mu, \lambda) = f(x) + \sum_{j=1}^{m} \mu_j g_j(x) + \sum_{j=1}^{p} \lambda_j h_j(x) = f(x) + \mu^T g(x) + \lambda^T h(x)$$
• Weighted sum of the objective and constraint functions
• $\mu \in \mathbb{R}^m$ is the Lagrange multiplier associated with $g = (g_1, \ldots, g_m)$
• $\lambda \in \mathbb{R}^p$ is the Lagrange multiplier associated with $h = (h_1, \ldots, h_p)$

Dual Problem
Dual Function:
$$q(\mu, \lambda) = \inf_{x \in X} L(x, \mu, \lambda) = \inf_{x \in X} \{f(x) + \mu^T g(x) + \lambda^T h(x)\}$$
The infimum above has an implicit constraint: $x$ is restricted to the primal problem domain
• Dual Problem:
maximize $q(\mu, \lambda)$
subject to $\mu \succeq 0$, $\lambda \in \mathbb{R}^p$
• Important properties that hold without any assumptions on the primal:
  • Concave Dual: $q(\mu, \lambda)$ is concave and the constraint set is convex
  • Lower Bound: For any $\mu \succeq 0$ and $\lambda \in \mathbb{R}^p$, we have $q(\mu, \lambda) \le f^*$
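Both properties follow directly from the definition of the dual function as an infimum of the Lagrangian. A short sketch of the two arguments:

```latex
% Lower bound: let x be primal feasible, i.e., x \in X, g(x) \preceq 0, h(x) = 0,
% and let \mu \succeq 0. Then \mu^T g(x) \le 0 and \lambda^T h(x) = 0, so
q(\mu, \lambda) \;\le\; f(x) + \mu^T g(x) + \lambda^T h(x) \;\le\; f(x) .
% Taking the infimum over all primal feasible x gives q(\mu, \lambda) \le f^* .

% Concavity: for each fixed x \in X, the map
(\mu, \lambda) \;\mapsto\; f(x) + \mu^T g(x) + \lambda^T h(x)
% is affine, and a pointwise infimum of affine functions is concave.
```

Neither argument uses convexity of $f$, $g_j$, or $X$.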
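
Least-Norm Solution of Linear Equations

minimize $x^T x$
subject to $Ax = b$
Dual Function:
• Lagrangian is: $L(x, \lambda) = x^T x + \lambda^T (Ax - b)$
• To minimize $L$ over $x \in \mathbb{R}^n$, set the gradient $\nabla_x L$ equal to zero:
$$\nabla_x L(x, \lambda) = 2x + A^T \lambda = 0 \implies x_\lambda = -\tfrac{1}{2} A^T \lambda$$
• Plug $x_\lambda$ in $L$ to obtain $q(\lambda)$:
$$q(\lambda) = L(x_\lambda, \lambda) = -\tfrac{1}{4} \lambda^T A A^T \lambda - b^T \lambda,$$
a concave function of $\lambda$
Lower Bound Property: $-\tfrac{1}{4} \lambda^T A A^T \lambda - b^T \lambda \le f^*$ for all $\lambda$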
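The formulas above can be checked on a tiny instance. The sketch assumes a single-row matrix $A = [1\ \ 2]$ and $b = 5$ (illustrative data, not from the lecture), so that $AA^T$ is a scalar and $\lambda$ is a single number.

```python
A = (1.0, 2.0)   # single-row matrix, so A A^T is a scalar
b = 5.0

AAt = sum(ai * ai for ai in A)   # A A^T


def q(lam):
    """Dual function q(lam) = -(1/4) lam (A A^T) lam - b lam."""
    return -0.25 * lam * AAt * lam - b * lam


# Maximize q: dq/dlam = -(1/2) (A A^T) lam - b = 0.
lam_star = -2.0 * b / AAt

# Primal candidate from the stationarity condition x = -(1/2) A^T lam.
x_star = tuple(-0.5 * ai * lam_star for ai in A)
f_at_x_star = sum(xi * xi for xi in x_star)
residual = sum(ai * xi for ai, xi in zip(A, x_star)) - b   # A x - b

# Sample a few multipliers to observe the lower bound property.
samples = [q(lam) for lam in (-4.0, -3.0, -2.0, -1.0, 0.0, 1.0)]
```

In this instance `x_star` is feasible and `q(lam_star)` equals `f_at_x_star`, so by the lower bound property `x_star` is primal optimal and there is no duality gap; every sampled dual value stays at or below that common value.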
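
Standard Form LP
minimize $c^T x$
subject to $Ax = b$, $x \succeq 0$
Dual Function:
• Lagrangian is:
$$L(x, \mu, \lambda) = c^T x + \lambda^T (Ax - b) - \mu^T x = -b^T \lambda + (c + A^T \lambda - \mu)^T x$$
• $L$ is affine in $x$, hence
$$q(\mu, \lambda) = \inf_x L(x, \mu, \lambda) = \begin{cases} -b^T \lambda & \text{when } A^T \lambda - \mu + c = 0 \\ -\infty & \text{otherwise} \end{cases}$$
$q$ is linear on the affine domain $\{(\mu, \lambda) \mid A^T \lambda - \mu + c = 0\}$, hence concave

Lower Bound Property: $-b^T \lambda \le f^*$ when $A^T \lambda + c \succeq 0$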
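The lower bound property can be observed on a two-variable LP. The sketch assumes the illustrative data $c = (1, 2)$, $A = [1\ \ 1]$, $b = 1$; the feasible set is then the segment between $(1,0)$ and $(0,1)$, and a linear objective attains its minimum at one of these two endpoints.

```python
c = (1.0, 2.0)
A = (1.0, 1.0)   # single equality constraint x1 + x2 = 1
b = 1.0


def dot(a, z):
    """Inner product a^T z."""
    return sum(ai * zi for ai, zi in zip(a, z))


# Endpoints of the feasible segment {x >= 0, x1 + x2 = 1}.
vertices = [(1.0, 0.0), (0.0, 1.0)]
f_star = min(dot(c, x) for x in vertices)


def dual_bound(lam):
    """Return -b*lam when A^T lam + c >= 0 (componentwise), else -inf."""
    slack = tuple(ai * lam + ci for ai, ci in zip(A, c))
    if all(s >= 0.0 for s in slack):
        return -b * lam
    return float("-inf")


# Scan a grid of multipliers and keep the best (largest) lower bound.
lams = [k / 10 for k in range(-30, 31)]
best_bound = max(dual_bound(lam) for lam in lams)
```

On this grid the best bound is attained at $\lambda = -1$ and matches `f_star`, while every other admissible $\lambda$ gives a weaker bound.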
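
Two-Way Partitioning
minimize $x^T W x$
subject to $x_i^2 = 1$, $i = 1, \ldots, n$
• A nonconvex problem: the feasible set contains $2^n$ discrete points
• Interpretation: partition $\{1, \ldots, n\}$ into two sets; $W_{ij}$ is the cost of assigning
$i, j$ to the same set; $-W_{ij}$ is the cost of assigning them to different sets
Dual Function:
$$q(\lambda) = \inf_x \Big\{ x^T W x + \sum_i \lambda_i (x_i^2 - 1) \Big\} = \inf_x \big\{ x^T [W + \operatorname{diag}(\lambda)]\, x \big\} - \mathbf{1}^T \lambda = \begin{cases} -\mathbf{1}^T \lambda & \text{when } W + \operatorname{diag}(\lambda) \succeq 0 \\ -\infty & \text{otherwise} \end{cases}$$
Lower Bound Property: $-\mathbf{1}^T \lambda \le f^*$ when $W + \operatorname{diag}(\lambda) \succeq 0$

example: $\lambda = -\lambda_{\min}(W)\mathbf{1}$ gives the bound $n \lambda_{\min}(W) \le f^*$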
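For very small $n$ the eigenvalue bound can be compared with the exact optimum found by enumeration. The sketch assumes an illustrative symmetric $2 \times 2$ matrix $W$ and uses the closed-form smallest eigenvalue of a symmetric $2 \times 2$ matrix; neither the data nor the helper names come from the lecture.

```python
import itertools
import math

W = ((1.0, 2.0),
     (2.0, -1.0))   # symmetric 2x2 cost matrix (illustrative data)
n = 2


def quad(x):
    """Objective x^T W x."""
    return sum(W[i][j] * x[i] * x[j] for i in range(n) for j in range(n))


# Exact optimum by enumerating all 2^n sign vectors.
f_star = min(quad(x) for x in itertools.product((-1, 1), repeat=n))

# Smallest eigenvalue of the symmetric 2x2 matrix [[p, r], [r, s]].
p, r, s = W[0][0], W[0][1], W[1][1]
lam_min = 0.5 * (p + s) - math.sqrt((0.5 * (p - s)) ** 2 + r * r)

# Dual lower bound obtained from lambda = -lam_min * 1.
bound = n * lam_min
```

Here the bound is valid but not tight: `bound` is about $-4.47$ while `f_star` is $-4$.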

Weak and Strong Duality

Weak Duality: $q^* \le f^*$
• Always holds, without any assumptions on the primal problem
• Can be used to compute nontrivial lower bounds for difficult problems
For example, a lower bound for the two-way partitioning problem can be
obtained by solving the SDP
maximize $-\mathbf{1}^T \lambda$
subject to $W + \operatorname{diag}(\lambda) \succeq 0$
Nonzero Duality Gap: $q^* < f^*$
• May occur even for a convex primal problem
Zero Duality Gap: $q^* = f^*$ (Strong Duality)
• Does not hold in general
• Even for convex primal problems, additional conditions are needed
• The conditions that guarantee strong duality in convex problems are
referred to as constraint qualifications

Examples with Zero Duality Gap

Examples show that, in general, the relation $q^* = f^*$ provides no information
about the existence of dual optimal solutions
• Unique Dual Optimal (two examples):
minimize $x_1 - x_2$
subject to $x_1 + x_2 \le 1$, $x_1 \ge 0$, $x_2 \ge 0$
and
minimize $\tfrac{1}{2}(x_1^2 + x_2^2)$
subject to $x_1 \le 1$
• Multiple Dual Optimal:
minimize $|x_1| - x_2$
subject to $x_1 \le 0$, $x_2 \ge 0$
• No Dual Optimal:
minimize $x$
subject to $x^2 \le 0$
Similarly, we can construct examples where $q^* = f^*$ and there is no
information about the existence of primal optimal solutions
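The "No Dual Optimal" example can be worked out explicitly. The only feasible point is $x = 0$, so $f^* = 0$. Assuming $X = \mathbb{R}$, for $\mu > 0$ the function $x + \mu x^2$ is minimized at $x = -1/(2\mu)$, giving $q(\mu) = -1/(4\mu)$, while for $\mu = 0$ it is unbounded below. A minimal sketch evaluating this dual function:

```python
def q(mu):
    """Dual function of: minimize x subject to x**2 <= 0, with X = R."""
    if mu <= 0.0:
        # x + mu * x**2 is unbounded below when mu <= 0.
        return float("-inf")
    x_min = -1.0 / (2.0 * mu)          # stationary point of x + mu * x**2
    return x_min + mu * x_min ** 2     # equals -1 / (4 * mu)


f_star = 0.0   # x = 0 is the only feasible point

mus = (1.0, 10.0, 100.0, 1.0e4)
values = [q(mu) for mu in mus]
```

The dual values increase toward $0$ as $\mu$ grows but stay strictly negative, so $q^* = \sup_\mu q(\mu) = 0 = f^*$ while no $\mu$ attains the supremum.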

Examples with Duality Gap
• Discrete Optimization:
minimize $-x$
subject to $x \le 1$, $x \in \{0, 2\}$
• Convex Optimization:
minimize $e^{-\sqrt{x_1 x_2}}$
subject to $x_1^2 \le 0$, $x_1 \ge 0$, $x_2 \ge 0$
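The gap in the discrete example can be computed directly. The sketch below takes $X = \{0, 2\}$ and writes the constraint $x \le 1$ as $g(x) = x - 1 \le 0$; the grid of multipliers is an illustrative choice used to approximate the dual maximum.

```python
X = (0, 2)


def f(x):
    """Objective of the discrete example."""
    return -x


def g(x):
    """Constraint x <= 1 written as g(x) <= 0."""
    return x - 1


# Primal optimal value: only x = 0 is feasible.
f_star = min(f(x) for x in X if g(x) <= 0)


def q(mu):
    """Dual function q(mu) = min over x in X of f(x) + mu * g(x)."""
    return min(f(x) + mu * g(x) for x in X)


# q(mu) = min(-mu, mu - 2), which peaks at mu = 1.
mus = [k / 100 for k in range(0, 301)]
q_star = max(q(mu) for mu in mus)
gap = f_star - q_star
```

The dual function is the minimum of the two lines $-\mu$ and $\mu - 2$, so its maximum is $-1$ at $\mu = 1$, strictly below $f^* = 0$.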
