
Lecture 7

Convex Problems, Separation Theorems

September 17, 2008



Outline

• Preliminary for Duality Theory


• Separation Theorems (Ch. 2.5 of Boyd and Vandenberghe’s book)
  - Supporting Hyperplane Theorem
  - Separating Hyperplane Theorems

• Duality
  - Motivation
  - Visualization of Primal-Dual Framework
  - Primal-Dual Constrained Optimization Problems
  - Dual Function Properties
  - Weak and Strong Duality
  - Examples

Convex Optimization 1

Some Terminology

Given a hyperplane $H = \{x \in \mathbb{R}^n \mid a^T x = b\}$, we say that

• The hyperplane $H$ passes through a vector $x_0$ when

$$x_0 \in H \iff a^T x_0 = b$$

• The hyperplane $H$ contains a set $C$ in one of its halfspaces when

$$\text{either } a^T z \le b \text{ for all } z \in C \quad \text{or} \quad a^T z \ge b \text{ for all } z \in C$$


Supporting Hyperplane Theorem

Th. Let $C \subseteq \mathbb{R}^n$ be a nonempty convex set. Let $x_0$ be such that

$$\text{either } x_0 \in \mathrm{bd}\, C \quad \text{or} \quad x_0 \notin C$$

Then there exists a hyperplane passing through $x_0$ and containing the set $C$ in one of its halfspaces, i.e., there is a vector $a \in \mathbb{R}^n$, $a \ne 0$, such that

$$\sup_{z \in C} a^T z \le a^T x_0$$

• A hyperplane with this property is referred to as a supporting hyperplane


Proof of the Supporting Hyperplane Theorem

Proof for the case when $C$ is closed:
Let $\{x_k\}$ be a sequence with $x_k \notin C$ for all $k$ and $x_k \to x_0$. (Why does it exist?)
Let $z_k^*$ be the projection of $x_k$ on the set $C$ for each $k$. Consider

$$a_k = \frac{x_k - z_k^*}{\|x_k - z_k^*\|} \quad \text{for } k \ge 1$$

This sequence is bounded (each $a_k$ has unit norm) and, therefore, it has a limit point, say $a \in \mathbb{R}^n$. Since $\|a\| = 1$, we have $a \ne 0$.
Let $\{a_k\}_{k \in K}$ be a subsequence of $\{a_k\}$ converging to $a$. Since $z_k^*$ is the projection of $x_k$ for each $k$, by the Projection Theorem (b), it follows that for each $k$

$$(z_k^* - x_k)^T (z - z_k^*) \ge 0 \quad \text{for all } z \in C$$

implying that for all $k$

$$a_k^T z \le a_k^T z_k^* \quad \text{for all } z \in C$$

Because $a_k^T z_k^* = a_k^T (z_k^* - x_k) + a_k^T x_k < a_k^T x_k$ for all $k$, we obtain

$$a_k^T z < a_k^T x_k \quad \text{for all } z \in C$$

Since $x_k \to x_0$ and $a_k \to a$ over $k \in K$, it follows that

$$a^T z \le a^T x_0 \quad \text{for all } z \in C$$
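
The construction in the proof can be traced numerically. The sketch below is only an illustration under assumed data that are not in the lecture: the set C is taken to be the unit box [0,1]^2 (so projection is componentwise clipping), x0 = (1, 0.5) is a boundary point, and x_k = (1 + 1/k, 0.5) is a hypothetical exterior sequence converging to x0.

```python
import math

def project_box(x, lo=0.0, hi=1.0):
    """Euclidean projection onto the box [lo, hi]^n (componentwise clipping)."""
    return [min(max(xi, lo), hi) for xi in x]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def normal_from_projection(xk):
    """a_k = (x_k - z_k*) / ||x_k - z_k*|| for a point x_k outside the box."""
    zk = project_box(xk)
    diff = [xi - zi for xi, zi in zip(xk, zk)]
    norm = math.sqrt(dot(diff, diff))
    return [d / norm for d in diff]

x0 = [1.0, 0.5]  # assumed boundary point of the box

# Exterior points x_k -> x0 and their unit normals a_k.
normals = [normal_from_projection([1.0 + 1.0 / k, 0.5]) for k in range(1, 6)]
a = normals[-1]  # in this example every a_k is (numerically) (1, 0)

# A linear function over the box attains its supremum at a vertex.
vertices = [[i, j] for i in (0.0, 1.0) for j in (0.0, 1.0)]
sup_over_C = max(dot(a, v) for v in vertices)

print("a =", a)
print("sup over C of a^T z =", sup_over_C, " a^T x0 =", dot(a, x0))
```

In this example the limit normal is a = (1, 0), and the hyperplane {z | z_1 = 1} supports the box at x0.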


Separating Hyperplane Theorems

Th. Let $C, D \subseteq \mathbb{R}^n$ be nonempty convex disjoint sets, i.e., $C \cap D = \emptyset$.
Then there exists a hyperplane separating these sets, i.e., there is $a \in \mathbb{R}^n$, $a \ne 0$, such that

$$\sup_{x \in C} a^T x \le \inf_{z \in D} a^T z$$


Proof of the Separating Hyperplane Theorem


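Consider the set $Y = C - D = \{x - z \mid x \in C,\ z \in D\}$. This is a (nonempty) convex set.
Since $C \cap D = \emptyset$, it follows that $0 \notin Y$.
By the Supporting Hyperplane Theorem applied to $Y$ with $x_0 = 0$, there exists $a \in \mathbb{R}^n$, $a \ne 0$, such that

$$\sup_{y \in Y} a^T y \le a^T 0 = 0$$

Hence, $a^T y \le 0$ for all $y \in Y$. Because $Y = C - D$, we have

$$a^T x \le a^T z \quad \text{for all } x \in C \text{ and all } z \in D$$

Taking the supremum over $x \in C$ and the infimum over $z \in D$, we obtain

$$\sup_{x \in C} a^T x \le \inf_{z \in D} a^T z$$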
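
As a small sanity check of the separation inequality, the sketch below uses two disjoint Euclidean balls, which are assumed example sets and not from the lecture. For a ball with center c and radius r, the supremum of a^T x is a^T c + r‖a‖ and the infimum is a^T c − r‖a‖. The direction a = c_D − c_C is one natural choice for balls, not the vector produced by the proof.

```python
import math

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def norm(u):
    return math.sqrt(dot(u, u))

def sup_over_ball(a, center, radius):
    """sup of a^T x over the ball {x : ||x - center|| <= radius}."""
    return dot(a, center) + radius * norm(a)

def inf_over_ball(a, center, radius):
    """inf of a^T z over the ball {z : ||z - center|| <= radius}."""
    return dot(a, center) - radius * norm(a)

# Assumed example data: C and D are unit balls centered at (0,0) and (3,0).
c_C, r_C = [0.0, 0.0], 1.0
c_D, r_D = [3.0, 0.0], 1.0

# Candidate normal: points from the center of C to the center of D.
a = [d - c for c, d in zip(c_C, c_D)]

sup_C = sup_over_ball(a, c_C, r_C)
inf_D = inf_over_ball(a, c_D, r_D)
print("sup over C:", sup_C, " inf over D:", inf_D)
```

Here the gap inf − sup equals ‖a‖(‖a‖ − r_C − r_D), which is nonnegative exactly when the balls do not overlap.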


Strictly Separating Hyperplane Theorem

Th. Let $C, D \subseteq \mathbb{R}^n$ be nonempty convex disjoint sets.
Assume that $C - D$ is closed. Then there exists a hyperplane strictly separating the sets, i.e., there is $a \in \mathbb{R}^n$, $a \ne 0$, such that

$$\sup_{x \in C} a^T x < \inf_{z \in D} a^T z$$

Proof: Homework assignment.


• When is C − D closed?
• One sufficient condition: $C$ is closed and $D$ is compact


Duality Theory

• An important part of optimization theory

• Its implications are far reaching both in theory and practice

• A powerful tool providing:


• A basis for the development of a rich class of optimization algorithms
• A general systematic way for developing bounding strategies (both in
continuous and discrete optimization)
• A basis for sensitivity analysis


Main Idea and Issues in Duality Theory

• Associate an “equivalent dual problem” with a given (primal) problem

• Methodology applicable to a general constrained optimization problem

• Investigate:
• Is there a general relation between the primal and its associated dual problem?
• Under which conditions do the primal and the dual problems have the same optimal value?
• Under which conditions do primal and dual optimal solutions exist?
• What are the relations between primal and dual optimal solutions?
• What kind of information do the dual optimal solutions provide about the primal problem?


Geometric Visualization of Duality

We illustrate duality using an abstract “geometric framework”

• This framework provides insights into:


• Weak duality
• Strong duality (zero duality gap)
• Existence of duality gap

Within this setting, we define:

• A “geometric primal problem” using an abstract set $V \subseteq \mathbb{R}^m \times \mathbb{R}$

• A corresponding “geometric dual problem” using the hyperplanes that support the set $V$


Geometric Primal

Consider an abstract (nonempty) set $V$ of vectors $(u, w) \in \mathbb{R}^m \times \mathbb{R}$

The set $V$ intersects the $w$-axis, i.e.,
$(0, w) \in V$ for some $w \in \mathbb{R}$
The set $V$ extends “north” and “east” (with $w$ on the vertical axis):
[North] For any $(u, w) \in V$ and $\tilde w \in \mathbb{R}$ with $w \le \tilde w$, we have $(u, \tilde w) \in V$
[East] For any $(u, w) \in V$ and $\tilde u \in \mathbb{R}^m$ with $u \preceq \tilde u$, we have $(\tilde u, w) \in V$
• Geometric Primal Problem
Determine the minimum intercept of the set $V$ and the $w$-axis:

minimize   $w$
subject to $(0, w) \in V$

The minimum intercept value is denoted by $f^*$, i.e., $f^* = \inf_{(0,w) \in V} w$.


Nonvertical Hyperplanes
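A hyperplane in $\mathbb{R}^m \times \mathbb{R}$: $\{(u, w) \mid \mu^T u + \mu_0 w = \xi\}$, with $\mu \in \mathbb{R}^m$ and $\mu_0, \xi \in \mathbb{R}$
• We say that a hyperplane is nonvertical when $\mu_0 \ne 0$
• Let $H_{\mu,\xi}$ denote a nonvertical hyperplane in $\mathbb{R}^m \times \mathbb{R}$, normalized so that $\mu_0 = 1$, i.e.,
$$H_{\mu,\xi} = \{(u, w) \mid \mu^T u + w = \xi\} \quad \text{with } \mu \in \mathbb{R}^m,\ \xi \in \mathbb{R}$$
• Let $q(\mu)$ be the minimum value of $\mu^T u + w$ for $(u, w) \in V$, i.e.,
$$q(\mu) = \inf_{(u,w) \in V} \{\mu^T u + w\}$$
• A nonvertical hyperplane $H_{\mu,\hat\xi}$ supports a set $V$ when $\hat\xi = q(\mu)$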


Geometric Dual Problem

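• A hyperplane $H_{\mu, q(\mu)}$ supporting the set $V$ intersects the $w$-axis at $(0, q(\mu))$

• Geometric Dual Problem: Determine the maximum intercept with the $w$-axis for the nonvertical hyperplanes that support the set $V$:

maximize   $q(\mu)$
subject to $\mu \in \mathbb{R}^m$
• Note: $q(\mu) = \inf_{(u,w) \in V} \{\mu^T u + w\}$, so $q(\mu) = -\infty$ for $\mu \not\succeq 0$, because $V$ is unbounded in the direction of increasing $u$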


Observations

Primal:
minimize   $w$
subject to $(0, w) \in V$

Dual:
maximize   $q(\mu)$
subject to $\mu \succeq 0$

• Dual values $q(\mu)$ never exceed $f^*$, nor any $w$ with $(0, w) \in V$
• Dual optimal value $q^*$ never exceeds the primal optimal value $f^*$:
$$q^* \le f^* \quad \text{(Weak Duality)}$$
• The weak duality inequality may be strict, i.e., $q^* < f^*$, in which case there is a Duality Gap
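
To make the intercept picture concrete, here is a minimal sketch for an assumed set that is not from the lecture: V = {(u, w) | w ≥ e^{-u}} with m = 1, which extends north and east and meets the w-axis at w ≥ 1, so f* = 1. For μ > 0 the infimum of μu + w over V is attained at u = −ln μ, giving q(μ) = μ(1 − ln μ). The grid over u is an approximation chosen only for illustration.

```python
import math

def q_numeric(mu, us):
    """Approximate q(mu) = inf over V of mu*u + w on a grid of u values.
    For fixed u, the smallest w with (u, w) in V is exp(-u)."""
    return min(mu * u + math.exp(-u) for u in us)

def q_closed(mu):
    """Closed form mu * (1 - ln mu), valid for mu > 0 (minimizer u = -ln mu)."""
    return mu * (1.0 - math.log(mu))

us = [i / 1000.0 for i in range(-10000, 10001)]  # grid on [-10, 10], contains u = 0
f_star = math.exp(-0.0)                          # minimum intercept of V with the w-axis

mus = [0.25, 0.5, 1.0, 2.0, 4.0]
q_vals = [q_numeric(mu, us) for mu in mus]
for mu, qv in zip(mus, q_vals):
    print(f"mu = {mu:4.2f}  q(mu) ~ {qv:.6f}  closed form {q_closed(mu):.6f}")
```

Every computed q(μ) stays at or below f* = 1, and the largest value occurs at μ = 1, where the supporting hyperplane passes through (0, f*).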


Duality Gap Illustrations

• A duality gap may exist even for a convex set V


Strong Duality

• We may have $q^* = f^*$ (Strong Duality)

• However, a nonvertical hyperplane achieving the maximum intercept may not exist

• With or without convexity of $V$, only one relation is sure: $q^* \le f^*$


Constrained Optimization Duality


Primal Problem (not necessarily convex)

minimize   $f(x)$
subject to $g_j(x) \le 0$, $j = 1, \ldots, m$
           $x \in X$

with variable $x \in \mathbb{R}^n$; the problem is feasible, with optimal value $f^* > -\infty$

Geometric Framework:
• Define the set $V$ as follows:
$$V = \{(u, w) \mid \text{there is } x \in X \text{ such that } g(x) \preceq u,\ f(x) \le w\}$$
• Dual function is:
$$q(\mu) = \inf_{(u,w) \in V} \{w + \mu^T u\} = \inf_{x \in X} \{f(x) + \mu^T g(x)\}, \quad \mu \succeq 0$$
• Dual Problem:
maximize   $q(\mu)$
subject to $\mu \succeq 0$
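
A minimal numerical sketch of this primal-dual pair, for an assumed toy problem that is not from the lecture: minimize x^2 subject to 1 − x ≤ 0 with X = R. Here q(μ) = inf_x {x^2 + μ(1 − x)} = μ − μ^2/4, attained at x = μ/2. The grids over x and μ are illustrative approximations.

```python
def f(x):
    return x * x

def g(x):
    return 1.0 - x  # g(x) <= 0 means x >= 1

def q_numeric(mu, xs):
    """Approximate q(mu) = inf over x of f(x) + mu * g(x) on a grid."""
    return min(f(x) + mu * g(x) for x in xs)

def q_closed(mu):
    """Closed form mu - mu^2 / 4 (minimizer x = mu / 2)."""
    return mu - mu * mu / 4.0

xs = [i / 1000.0 for i in range(-5000, 5001)]   # grid on [-5, 5]
f_star = min(f(x) for x in xs if g(x) <= 0.0)   # primal optimal value on the grid

mus = [i / 10.0 for i in range(0, 41)]          # mu in [0, 4]
q_vals = [q_numeric(mu, xs) for mu in mus]
q_star = max(q_vals)

print("f* =", f_star, " q* ~", q_star)
```

For this convex example the best dual value, reached at μ = 2, matches f* = 1, and every other q(μ) is a valid lower bound.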


General Case
Primal Problem (not necessarily convex)
minimize   $f(x)$
subject to $g_j(x) \le 0$, $j = 1, \ldots, m$
           $h_j(x) = 0$, $j = 1, \ldots, p$
           $x \in X$
with variable $x \in \mathbb{R}^n$; the problem is feasible, with optimal value $f^* > -\infty$
Lagrangian Function: $L : \mathbb{R}^n \times \mathbb{R}^m \times \mathbb{R}^p \to \mathbb{R}$ given by

$$L(x, \mu, \lambda) = f(x) + \sum_{j=1}^m \mu_j g_j(x) + \sum_{j=1}^p \lambda_j h_j(x) = f(x) + \mu^T g(x) + \lambda^T h(x)$$

• Weighted sum of the objective and constraint functions
• $\mu \in \mathbb{R}^m$ is the Lagrange multiplier associated with $g = (g_1, \ldots, g_m)$
• $\lambda \in \mathbb{R}^p$ is the Lagrange multiplier associated with $h = (h_1, \ldots, h_p)$


Dual Problem
Dual Function:

$$q(\mu, \lambda) = \inf_{x \in X} L(x, \mu, \lambda) = \inf_{x \in X} \left\{ f(x) + \mu^T g(x) + \lambda^T h(x) \right\}$$

The infimum above is taken over $x \in X$, so the primal problem domain enters as an implicit constraint
• Dual Problem:
maximize   $q(\mu, \lambda)$
subject to $\mu \succeq 0$, $\lambda \in \mathbb{R}^p$

• Important properties that hold without any assumptions on the primal:
• Concave Dual: $q(\mu, \lambda)$ is concave, and the constraint set is convex
• Lower Bound: For any $\mu \succeq 0$ and $\lambda \in \mathbb{R}^p$, we have $q(\mu, \lambda) \le f^*$
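
The lower bound property follows in two steps, sketched below. The symbol $\bar x$ is introduced here only for illustration and denotes any primal feasible point, i.e., $\bar x \in X$, $g(\bar x) \preceq 0$, $h(\bar x) = 0$.

```latex
\begin{align*}
q(\mu, \lambda) &= \inf_{x \in X} L(x, \mu, \lambda) \\
  &\le L(\bar x, \mu, \lambda)
   = f(\bar x) + \mu^T g(\bar x) + \lambda^T h(\bar x) \\
  &\le f(\bar x)
  \qquad \text{since } \mu \succeq 0,\ g(\bar x) \preceq 0,\ h(\bar x) = 0 .
\end{align*}
```

Taking the infimum over all feasible $\bar x$ gives $q(\mu, \lambda) \le f^*$. Concavity holds because, for each fixed $x$, $L(x, \mu, \lambda)$ is affine in $(\mu, \lambda)$, and a pointwise infimum of affine functions is concave.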


Least-Norm Solution of Linear Equations


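minimize   $x^T x$
subject to $Ax = b$
Dual Function:
• Lagrangian is: $L(x, \lambda) = x^T x + \lambda^T (Ax - b)$
• To minimize $L$ over $x \in \mathbb{R}^n$, set the gradient $\nabla_x L$ equal to zero:

$$\nabla_x L(x, \lambda) = 2x + A^T \lambda = 0 \implies x_\lambda = -\tfrac{1}{2} A^T \lambda$$

• Plug $x_\lambda$ in $L$ to obtain $q(\lambda)$:

$$q(\lambda) = L(x_\lambda, \lambda) = -\tfrac{1}{4} \lambda^T A A^T \lambda - b^T \lambda$$

a concave function of $\lambda$
Lower Bound Property: $-\tfrac{1}{4} \lambda^T A A^T \lambda - b^T \lambda \le f^*$ for all $\lambda$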
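
The bound can be checked on a tiny instance. The data below (A = [1 1], b = 2) are assumed for illustration and are not from the lecture. For this instance the least-norm solution is x* = (1, 1), so f* = 2, and the dual function reduces to q(λ) = −λ^2/2 − 2λ.

```python
def q(lam, A, b):
    """Dual function q(lam) = -(1/4) lam^T A A^T lam - b^T lam, with A as a list of rows."""
    p, n = len(A), len(A[0])
    At_lam = [sum(A[i][j] * lam[i] for i in range(p)) for j in range(n)]  # A^T lam
    quad = sum(v * v for v in At_lam)                                     # lam^T A A^T lam
    return -0.25 * quad - sum(bi * li for bi, li in zip(b, lam))

# Assumed example data: one equation x1 + x2 = 2.
A = [[1.0, 1.0]]
b = [2.0]

x_star = [1.0, 1.0]                   # least-norm solution of x1 + x2 = 2
f_star = sum(v * v for v in x_star)   # primal optimal value

lams = [[i / 10.0] for i in range(-60, 21)]   # lambda in [-6, 2]
q_vals = [q(lam, A, b) for lam in lams]

print("f* =", f_star, " best dual value on grid =", max(q_vals))
```

On this instance the bound is tight at λ = −2, where q(λ) = 2 = f*.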


Standard Form LP
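minimize   $c^T x$
subject to $Ax = b$, $x \succeq 0$
Dual Function:
• Lagrangian is:

$$L(x, \mu, \lambda) = c^T x + \lambda^T (Ax - b) - \mu^T x = -b^T \lambda + (c + A^T \lambda - \mu)^T x$$

• $L$ is linear in $x$, hence

$$q(\mu, \lambda) = \inf_x L(x, \mu, \lambda) = \begin{cases} -b^T \lambda & \text{when } A^T \lambda - \mu + c = 0 \\ -\infty & \text{otherwise} \end{cases}$$

$q$ is linear on the affine domain $\{(\mu, \lambda) \mid A^T \lambda - \mu + c = 0\}$, hence concave

Lower Bound Property: $-b^T \lambda \le f^*$ when $A^T \lambda + c \succeq 0$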
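
A small illustration of the LP lower bound, using assumed data that are not from the lecture: c = (1, 2), A = [1 1], b = 1. The feasible set is the segment between (1, 0) and (0, 1), so the primal value is found here by a simple scan of that segment. The helper `dual_bound` is a hypothetical name for this sketch.

```python
def dual_bound(lam, A, b, c, tol=1e-12):
    """Return -b^T lam if A^T lam + c >= 0 componentwise, else -infinity."""
    p, n = len(A), len(A[0])
    reduced = [sum(A[i][j] * lam[i] for i in range(p)) + c[j] for j in range(n)]
    if all(r >= -tol for r in reduced):
        return -sum(bi * li for bi, li in zip(b, lam))
    return float("-inf")

# Assumed example data: minimize x1 + 2*x2 subject to x1 + x2 = 1, x >= 0.
c = [1.0, 2.0]
A = [[1.0, 1.0]]
b = [1.0]

# Primal value by scanning the feasible segment x = (t, 1 - t), t in [0, 1].
ts = [i / 100.0 for i in range(0, 101)]
f_star = min(c[0] * t + c[1] * (1.0 - t) for t in ts)

lams = [[i / 10.0] for i in range(-30, 31)]     # lambda in [-3, 3]
bounds = [dual_bound(lam, A, b, c) for lam in lams]

print("f* =", f_star, " best dual bound on grid =", max(bounds))
```

Dual feasibility here means λ ≥ −1, and the best bound −λ = 1 at λ = −1 equals the primal optimal value.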


Two-Way Partitioning
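minimize   $x^T W x$
subject to $x_i^2 = 1$, $i = 1, \ldots, n$
• A nonconvex problem: the feasible set contains $2^n$ discrete points
• Interpretation: partition $\{1, \ldots, n\}$ into two sets; $W_{ij}$ is the cost of assigning $i, j$ to the same set; $-W_{ij}$ is the cost of assigning them to different sets
Dual Function:

$$q(\lambda) = \inf_x \Big\{ x^T W x + \sum_i \lambda_i (x_i^2 - 1) \Big\} = \inf_x \, x^T [W + \mathrm{diag}(\lambda)] x - \mathbf{1}^T \lambda = \begin{cases} -\mathbf{1}^T \lambda & \text{when } W + \mathrm{diag}(\lambda) \succeq 0 \\ -\infty & \text{otherwise} \end{cases}$$

Lower Bound Property: $-\mathbf{1}^T \lambda \le f^*$ when $W + \mathrm{diag}(\lambda) \succeq 0$

example: $\lambda = -\lambda_{\min}(W) \mathbf{1}$ gives the bound $n \lambda_{\min}(W) \le f^*$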
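
The eigenvalue bound can be compared with a brute-force solution on a tiny instance. The 2×2 matrix W below is assumed for illustration and is not from the lecture. For a symmetric 2×2 matrix the smallest eigenvalue has a closed form, which avoids any linear algebra library.

```python
import math
from itertools import product

# Assumed example data: a symmetric 2x2 cost matrix.
W = [[2.0, 1.0],
     [1.0, -1.0]]
n = len(W)

def quad(W, x):
    """Evaluate x^T W x."""
    return sum(W[i][j] * x[i] * x[j] for i in range(n) for j in range(n))

# Brute force over all 2^n sign vectors.
f_star = min(quad(W, x) for x in product((-1.0, 1.0), repeat=n))

# Smallest eigenvalue of a symmetric 2x2 matrix [[a, b], [b, d]].
a, b, d = W[0][0], W[0][1], W[1][1]
lam_min = (a + d) / 2.0 - math.sqrt(((a - d) / 2.0) ** 2 + b ** 2)

bound = n * lam_min   # dual bound from lambda = -lam_min * 1
print("f* =", f_star, " n * lambda_min(W) =", bound)
```

For this W the bound is roughly −2.61 while f* = −1, so the bound is valid but not tight. Solving the full dual problem over λ can only improve on this particular choice.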


Weak and Strong Duality


Weak Duality: $q^* \le f^*$
• Always holds, without any assumptions on the primal problem
• Can be used to compute nontrivial lower bounds for difficult problems
For example, a lower bound for the two-way partitioning problem can be obtained by solving the SDP

maximize   $-\mathbf{1}^T \lambda$
subject to $W + \mathrm{diag}(\lambda) \succeq 0$

Nonzero Duality Gap: $q^* < f^*$
• May occur even for a convex primal problem
Zero Duality Gap: $q^* = f^*$ (Strong Duality)
• Does not hold in general
• Even for convex primal problems, additional conditions are needed
• The conditions that guarantee strong duality in convex problems are referred to as constraint qualifications


Examples with Zero Duality Gap


Examples show that, in general, the relation $q^* = f^*$ provides no information about the existence of dual optimal solutions
• Unique Dual Optimal:
minimize   $x_1 - x_2$
subject to $x_1 + x_2 \le 1$
           $x_1 \ge 0$, $x_2 \ge 0$

minimize   $\tfrac{1}{2}(x_1^2 + x_2^2)$
subject to $x_1 \le 1$
• Multiple Dual Optimal:
minimize   $|x_1| - x_2$
subject to $x_1 \le 0$, $x_2 \ge 0$
• No Dual Optimal:
minimize   $x$
subject to $x^2 \le 0$
Similarly, we can construct examples in which $q^* = f^*$ gives no information about the existence of primal optimal solutions
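
The "No Dual Optimal" example above can be worked out explicitly. The only feasible point is x = 0, so f* = 0. For μ > 0, minimizing x + μx^2 over x gives x = −1/(2μ) and q(μ) = −1/(4μ), while q(0) = −∞. The sketch below, an illustration rather than part of the lecture, evaluates this closed form for growing μ.

```python
def q(mu):
    """Dual function of: minimize x subject to x^2 <= 0.
    q(mu) = inf over x of x + mu * x^2 = -1 / (4 mu) for mu > 0, and -inf at mu = 0."""
    if mu <= 0.0:
        return float("-inf")
    return -1.0 / (4.0 * mu)

def lagrangian(x, mu):
    return x + mu * x * x

f_star = 0.0   # x = 0 is the only feasible point

mus = [10.0 ** k for k in range(0, 7)]   # 1, 10, ..., 1e6
vals = [q(mu) for mu in mus]
for mu, v in zip(mus, vals):
    print(f"mu = {mu:9.0f}  q(mu) = {v:.3e}")
```

The values increase toward 0 = f* but stay strictly negative for every finite μ, so q* = f* while no μ attains the supremum.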


Examples with Duality Gap

• Discrete Optimization:

minimize −x
subject to x ≤ 1, x ∈ {0, 2}

• Convex Optimization:

minimize   $e^{-\sqrt{x_1 x_2}}$
subject to $x_1^2 \le 0$, $x_1 \ge 0$, $x_2 \ge 0$
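
The gap in the discrete example on this slide can be computed directly. The sketch below treats x ∈ {0, 2} as the set X and x − 1 ≤ 0 as the dualized constraint, which is one reading of the slide. The grid over μ is an illustrative choice.

```python
X = (0.0, 2.0)   # the discrete set X

def f(x):
    return -x

def g(x):
    return x - 1.0   # constraint x <= 1 written as g(x) <= 0

# Primal: only x = 0 is feasible.
f_star = min(f(x) for x in X if g(x) <= 0.0)

def q(mu):
    """Dual function q(mu) = min over x in X of f(x) + mu * g(x) = min(-mu, mu - 2)."""
    return min(f(x) + mu * g(x) for x in X)

mus = [i / 100.0 for i in range(0, 501)]   # mu in [0, 5]
q_star = max(q(mu) for mu in mus)
gap = f_star - q_star

print("q* =", q_star, " duality gap =", gap)
```

The dual function peaks at μ = 1 with q* = −1, while f* = 0, so the duality gap equals 1.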
