
INDE6372: Lecture 9

Convexity in LO

Jiming Peng
Department of Industrial Engineering
University of Houston

October 15, 2019

Introduction: I
A mathematical optimization problem has the following form

min f(x)
s.t. gᵢ(x) ≤ bᵢ, i = 1, · · · , m;
     x ≥ 0.

Let us define

X = {x : gᵢ(x) ≤ bᵢ, i = 1, · · · , m; x ≥ 0}.

Then we can rewrite the problem as

min f (x).
x∈X

A solution x∗ ∈ X is said to be optimal if

f(x∗) ≤ f(x), ∀x ∈ X.

Convexity

In LP, all the functions involved are linear (or affine). For a linear function f(x), we have

f(αx + βy) = αf(x) + βf(y), for all α, β ≥ 0 with α + β = 1.

The point αx + βy is called a convex combination of x and y .


We say a function f is convex if

f(αx + βy) ≤ αf(x) + βf(y), for all α, β ≥ 0 with α + β = 1.

Observation: a linear function is convex. In LP, all the functions are convex.
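
To make the definition concrete, here is a small numerical sanity check (an illustrative sketch, not part of the lecture; the helper name `looks_convex`, the sampling range, and the tolerance are our own choices). It samples the convexity inequality for a few univariate functions:

```python
import numpy as np

def looks_convex(f, num_trials=10_000, lo=-10.0, hi=10.0, rng=np.random.default_rng(0)):
    """Sample the inequality f(ax + by) <= a f(x) + b f(y) with a + b = 1, a, b >= 0.
    Returns False as soon as a violation is found."""
    for _ in range(num_trials):
        x, y = rng.uniform(lo, hi, size=2)
        a = rng.uniform(0.0, 1.0)
        if f(a * x + (1 - a) * y) > a * f(x) + (1 - a) * f(y) + 1e-9:
            return False
    return True

print(looks_convex(lambda x: x**2))        # True:  x^2 satisfies the inequality
print(looks_convex(abs))                   # True:  |x| satisfies the inequality
print(looks_convex(lambda x: np.sin(x)))   # False: a violation is found, sin is not convex
```

Note that such sampling can only refute convexity by exhibiting a violation; it never proves it.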

Example

The univariate function f(x) = x² is convex.


The absolute value function |x| is convex.
The sum of two convex functions is still convex.
More examples?

(Figure: a convex function vs. a nonconvex function.)

Affine and Convex Set

Consider a convex combination

y = αx¹ + βx², α + β = 1, α, β ≥ 0.

We can write

y = αx¹ + (1 − α)x² = x² + α(x¹ − x²).

A set S is said to be convex if for any two points x¹, x² ∈ S, the convex combination αx¹ + (1 − α)x² ∈ S for any α ∈ [0, 1].
A set S is said to be affine if for any two points x¹, x² ∈ S, the point αx¹ + (1 − α)x² ∈ S for any α ∈ ℝ.

Example

(Figure: a convex set vs. a nonconvex set.)

Affine and Convex Set

Theorem
An affine set is convex.
Example: the straight line aᵀx = b is affine.
The convex hull of a set S is defined by

Hull(S) = {α₁x¹ + · · · + αₖxᵏ : x¹, · · · , xᵏ ∈ S, α₁ + · · · + αₖ = 1, αᵢ ≥ 0}.

Hull(S) is the smallest convex set containing S.
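
As a quick illustration (our own sketch; the points below are arbitrary), scipy.spatial.ConvexHull computes the hull of a finite point set, and any convex combination of the points stays inside it:

```python
import numpy as np
from scipy.spatial import ConvexHull

# Five points in the plane; the last one lies inside the hull of the others.
S = np.array([[0.0, 0.0],
              [1.0, 0.0],
              [1.0, 1.0],
              [0.0, 1.0],
              [0.5, 0.5]])

hull = ConvexHull(S)
print(S[hull.vertices])                 # corners of Hull(S); the interior point is dropped

# A random convex combination of the points of S stays inside Hull(S).
rng = np.random.default_rng(0)
alpha = rng.dirichlet(np.ones(len(S)))  # alpha_i >= 0, sum = 1
print(alpha @ S)                        # a point of the unit square, i.e. of Hull(S)
```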

Example

(Figures: the convex hulls of two example point sets.)

Intersection of Convex Sets
Theorem
Given any collection of convex sets (finite, countable or
uncountable), their intersection is itself a convex set.
Proof: If the intersection is empty or consists of a single point, the theorem is true by definition. Otherwise, take any two points A, B in the intersection. Since each set in the collection is convex and contains both A and B, the line segment joining A and B lies wholly within each set, hence wholly within their intersection.
Question: How about the union of convex sets?

Convex Set

Theorem
Let S = {x : gᵢ(x) ≤ 0, i = 1, · · · , m}. If all the functions gᵢ(x) are convex, then S is convex.
Proof:
Let Sᵢ = {x : gᵢ(x) ≤ 0}. Because gᵢ(x) is a convex function, for any x¹, x² ∈ Sᵢ we have

gᵢ(αx¹ + (1 − α)x²) ≤ αgᵢ(x¹) + (1 − α)gᵢ(x²) ≤ 0, ∀α ∈ [0, 1].

Thus Sᵢ is a convex set. Since S = ∩ᵢ Sᵢ, using the theorem from the previous slide we can conclude that S is convex.

Cone

A set S is called a cone if for every x ∈ S and α ≥ 0, we have αx ∈ S.
Example: the set X = {x ∈ ℝⁿ : x₁, x₂, · · · , xₙ ≥ 0} is a cone. Such a cone is also called the nonnegative orthant, denoted by ℝⁿ₊.
The set

S = {x ∈ ℝ³ : √(x₁² + x₂²) ≤ x₃}

is a cone, the ice-cream cone!
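
A small numerical check of the cone property for the ice-cream cone (an illustrative sketch; the test point and the tolerance are our own choices):

```python
import numpy as np

def in_ice_cream_cone(x):
    """Membership test for S = {x in R^3 : sqrt(x1^2 + x2^2) <= x3}."""
    return np.hypot(x[0], x[1]) <= x[2] + 1e-12

rng = np.random.default_rng(0)
x = np.array([0.3, -0.4, 1.0])           # sqrt(0.09 + 0.16) = 0.5 <= 1.0, so x is in S
print(in_ice_cream_cone(x))              # True
print(all(in_ice_cream_cone(a * x)       # nonnegative scalings of x stay in S
          for a in rng.uniform(0.0, 10.0, size=100)))
print(in_ice_cream_cone(-x))             # False: S is a cone but not a subspace
```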

Hyperplane and Halfspace

A hyperplane is a set of the form

{x : aᵀx = b},

where a ∈ ℝⁿ and b ∈ ℝ.

The feasible set of an LP in standard form is the intersection of the nonnegative orthant and some hyperplanes.
The set {x : aᵀx ≤ b} is called a halfspace.

Local and Global Minimum
To describe a local minimizer of an optimization problem, we define the neighborhood of a point:

N(x, r) = {y : ‖y − x‖ ≤ r}.

Consider the optimization problem

min f(x)
x∈X

We say x∗ is a local minimal solution to the problem if there exists a radius r > 0 such that

f(x∗) ≤ f(x), ∀x ∈ N(x∗, r) ∩ X.

Theorem
If f(x) is convex and the constraint set X is also convex, then a local minimal solution to the problem is also a global minimal solution.
Proof of the Theorem

Suppose to the contrary that x∗ is not a global minimal solution; then there exists another point y∗ ∈ X satisfying f(y∗) < f(x∗). Now consider the convex combination x∗ + α(y∗ − x∗) for α ∈ [0, 1]. Since X is convex, we have

x∗, y∗ ∈ X =⇒ x∗ + α(y∗ − x∗) ∈ X, ∀α ∈ [0, 1].

Because f(x) is convex, it follows that

f(x∗ + α(y∗ − x∗)) ≤ (1 − α)f(x∗) + αf(y∗) ≤ f(x∗),

and the second inequality holds strictly if 0 < α ≤ 1. Recall that

‖α(y∗ − x∗)‖ = α‖y∗ − x∗‖.

For fixed x∗, y∗, we can therefore choose the parameter α > 0 so small that

α‖y∗ − x∗‖ ≤ r.

This shows that the point x∗ + α(y∗ − x∗) lies in the neighborhood N(x∗, r) ∩ X but has a strictly smaller objective value, which violates the assumption that x∗ is a local minimum. From this contradiction we conclude that x∗ must be a global minimal solution.
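
As an illustration of the theorem (our own sketch, not from the slides; the objective functions below are arbitrary examples), local searches on a convex objective over a convex set all return the same global minimizer, while a nonconvex objective can trap them at different local minima:

```python
import numpy as np
from scipy.optimize import minimize

# Convex objective on a convex set: every local search ends at the same (global) minimizer.
f = lambda x: (x[0] - 1.0) ** 2
starts = [-3.0, 0.0, 2.5]
print([round(minimize(f, x0=[s], bounds=[(-3.0, 3.0)]).x[0], 3) for s in starts])  # all ~1.0

# Nonconvex objective: different starting points can get stuck at different local minima.
g = lambda x: np.sin(3.0 * x[0]) + 0.1 * x[0] ** 2
print([round(minimize(g, x0=[s], bounds=[(-3.0, 3.0)]).x[0], 3) for s in starts])
```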

Farkas Lemma

Lemma
The system Ax ≤ b has no solution if and only if there is a y such that
Aᵀy = 0, y ≥ 0, bᵀy < 0.

Proof: consider the primal LP

max 0
s.t. Ax ≤ b

and its dual

min bᵀy
s.t. Aᵀy = 0, y ≥ 0.

Proof of Farkas Lemma: Continuation

Clearly the dual is feasible (take y = 0). If the primal is feasible, then by weak duality the dual is bounded (bᵀy ≥ 0 for every dual feasible y). Conversely, by LP duality, if the primal is infeasible, then the feasible dual must be unbounded. Hence the primal is infeasible if and only if the dual is unbounded. It remains to observe that the dual is unbounded if and only if the relations in the lemma hold: the dual feasible set is a cone, so the objective is unbounded below exactly when there is a feasible y with bᵀy < 0 (recall the definition of an unbounded LP). This finishes the proof of Farkas Lemma.
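
A small computational sketch of the lemma (the data below are made up for illustration). Since any positive multiple of a Farkas certificate is again a certificate, one convenient normalization, chosen here purely for illustration, is to require bᵀy ≤ −1 and solve a feasibility LP with scipy.optimize.linprog:

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical data: the system  x <= 1,  -x <= -2  (i.e. x >= 2) is infeasible.
A = np.array([[1.0], [-1.0]])
b = np.array([1.0, -2.0])

# Is Ax <= b feasible?  Solve  max 0  s.t.  Ax <= b  (the objective is irrelevant).
feas = linprog(c=np.zeros(A.shape[1]), A_ub=A, b_ub=b,
               bounds=[(None, None)] * A.shape[1])
print(feas.status)   # status 2 means "infeasible"

# Farkas certificate: y >= 0 with A^T y = 0 and b^T y < 0, normalized to b^T y <= -1.
cert = linprog(c=np.zeros(A.shape[0]),
               A_eq=A.T, b_eq=np.zeros(A.shape[1]),
               A_ub=b.reshape(1, -1), b_ub=np.array([-1.0]),
               bounds=[(0, None)] * A.shape[0])
print(cert.x)        # e.g. y = [1, 1]: A^T y = 0 and b^T y = -1 < 0
```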

Farkas Lemma: Application Example
Consider an interbank market consisting of n banks. The liability relationship between any two banks is described by a liability matrix L ∈ ℝⁿˣⁿ, where Lᵢⱼ ≥ 0 denotes the liability of bank i to bank j. Let bᵢ denote the exogenous operating cash flow received by bank i, which can be used to compensate the potential shortfall on incoming cash flow, and let pᵢ = Σⱼ Lᵢⱼ denote the total liability of bank i. Let A = diag(p) − Lᵀ and let e be the all-one vector. Given b, the systemic risk of a financial system can be estimated via solving the following LO problem

max eᵀx   (1)
s.t. Ax ≤ b, 0 ≤ x ≤ e.

Key Issue: The operating cash flow b is subject to market fluctuations. When does the system become infeasible (i.e., when will some bank go bankrupt)?
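
A minimal sketch of how problem (1) can be set up with scipy.optimize.linprog, using a hypothetical 3-bank liability matrix (the numbers are invented for illustration; infeasibility would show up through the solver status):

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical 3-bank example (not from the lecture).
L = np.array([[0.0, 2.0, 1.0],    # L[i, j] = liability of bank i to bank j
              [1.0, 0.0, 2.0],
              [0.5, 0.5, 0.0]])
b = np.array([1.0, 0.5, 0.8])     # exogenous operating cash flows
p = L.sum(axis=1)                 # total liability of each bank
A = np.diag(p) - L.T
e = np.ones(3)

# Problem (1):  max e^T x  s.t.  Ax <= b,  0 <= x <= e  (linprog minimizes, so use -e).
res = linprog(c=-e, A_ub=A, b_ub=b, bounds=[(0.0, 1.0)] * 3)
print(res.status, res.x, -res.fun)   # status 0 = solved; status 2 would mean infeasible
```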
The separation theorem

A polyhedron is the intersection of a finite number of halfspaces:

{x ∈ ℝⁿ : Ax ≤ b}, where A ∈ ℝᵐˣⁿ, b ∈ ℝᵐ.

Theorem
Let P and P̄ be two disjoint nonempty polyhedra in ℝⁿ. Then there exist disjoint halfspaces H and H̄ such that P ⊂ H and P̄ ⊂ H̄.

Proof of the separation theorem

Let

P = {x ∈ ℝⁿ : Ax ≤ b}, P̄ = {x ∈ ℝⁿ : Āx ≤ b̄}, where A ∈ ℝᵐˣⁿ, Ā ∈ ℝᵏˣⁿ.

Since the two polyhedra do not intersect, the set

P̃ = {x : Ax ≤ b, Āx ≤ b̄}

is empty. By the Farkas lemma (applied to the stacked system defining P̃), there exists (y, ȳ) satisfying

Aᵀy + Āᵀȳ = 0, bᵀy + b̄ᵀȳ < 0, (y, ȳ) ≥ 0.

From the second relation, we have either bᵀy < 0 or b̄ᵀȳ < 0 (or both). Without loss of generality, we assume that bᵀy < 0. Because P is nonempty, we must have Aᵀy ≠ 0 (otherwise the Farkas lemma, applied to Ax ≤ b with the certificate y, would imply that P is empty).

Proof of the separation theorem: Cont.

Define

H = {x : yᵀAx ≤ bᵀy}, H̄ = {x : yᵀAx ≥ −b̄ᵀȳ}.

Since Aᵀy ≠ 0, these two sets are halfspaces. Moreover, they have no intersection: otherwise some x would satisfy

bᵀy ≥ yᵀAx ≥ −b̄ᵀȳ =⇒ bᵀy + b̄ᵀȳ ≥ 0,

which contradicts bᵀy + b̄ᵀȳ < 0.


Next we show P ⊂ H and P̄ ⊂ H̄. Because P = {x : Ax ≤ b} and y ≥ 0, for every x ∈ P we have yᵀAx ≤ yᵀb = bᵀy, which implies x ∈ H. Thus P ⊂ H. The inclusion P̄ ⊂ H̄ follows similarly, using yᵀA = −ȳᵀĀ.

Separation Example
We consider the following example:

P = {x ∈ ℝ² : 0 ≤ x₁, x₂ ≤ 1}, P̄ = {x ∈ ℝ² : 2 ≤ x₁, x₂ ≤ 3},   (2)
or P = {x ∈ ℝ² : Ax ≤ b}, P̄ = {x ∈ ℝ² : Āx ≤ b̄},   (3)
where A = [I; −I] (the 2×2 identity stacked on top of −I), b = (1, 1, 0, 0)ᵀ, Ā = A, b̄ = (3, 3, −2, −2)ᵀ.   (4)

Clearly, the system Ax ≤ b, Āx ≤ b̄ is infeasible. By Farkas Lemma, the following system

y₁ − y₃ + ȳ₁ − ȳ₃ = 0,   (5)
y₂ − y₄ + ȳ₂ − ȳ₄ = 0,   (6)
y₁ + y₂ + 3(ȳ₁ + ȳ₂) − 2(ȳ₃ + ȳ₄) < 0,   (7)
(y, ȳ) ≥ 0,   (8)

is feasible, e.g., y = (0, 1, 0, 0), ȳ = (0, 0, 0, 1).


Separation Example

Note that for the example in the previous slide, with y = (0, 1, 0, 0) and ȳ = (0, 0, 0, 1), we have

yᵀAx = x₂, bᵀy = 1, b̄ᵀȳ = −2.

From the above relations, we obtain the following two halfspaces

H = {x ∈ ℝ² : x₂ ≤ 1}, H̄ = {x ∈ ℝ² : x₂ ≥ −b̄ᵀȳ = 2}.

It is easy to verify that P ⊂ H and P̄ ⊂ H̄.
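
A quick numerical verification of this example (a numpy sketch; checking the corners suffices because P and P̄ are polytopes):

```python
import numpy as np

I2 = np.eye(2)
A = np.vstack([I2, -I2]);  b = np.array([1.0, 1.0, 0.0, 0.0])
Abar = A.copy();           bbar = np.array([3.0, 3.0, -2.0, -2.0])

y    = np.array([0.0, 1.0, 0.0, 0.0])
ybar = np.array([0.0, 0.0, 0.0, 1.0])

# Farkas-type certificate for the combined system (5)-(8).
print(A.T @ y + Abar.T @ ybar)              # [0, 0]
print(b @ y + bbar @ ybar)                  # -1 < 0

# Separation: every corner of P has x2 <= 1, every corner of Pbar has x2 >= 2.
P_corners    = np.array([[0, 0], [1, 0], [0, 1], [1, 1]], dtype=float)
Pbar_corners = P_corners + 2.0
print(np.all(P_corners @ (y @ A) <= b @ y))             # True: P    in H
print(np.all(Pbar_corners @ (y @ A) >= -(bbar @ ybar)))  # True: Pbar in Hbar
```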

Separation Example

Note: The separation theorem does not hold for nonconvex sets.

Linear Optimization for Classification: I

Let a₁, · · · , aₙ represent the data points in a data set, which consists of two groups ('good' and 'bad'). We would like to find a classification rule that separates the two groups. A linear classifier is defined by a hyperplane
aᵀx = b.
A classical LO model is described as follows:

aᵢᵀx < b, if aᵢ is in group 'good';   (9)
aᵢᵀx ≥ b, if aᵢ is in group 'bad'.   (10)

Linear Optimization for Classification: II

To find a good classifier, we solve the following optimization problem

max α₁ + · · · + αₙ   (11)
s.t. aᵢᵀx < b − αᵢ, if aᵢ is in group 'good';   (12)
     aᵢᵀx ≥ b + αᵢ, if aᵢ is in group 'bad';   (13)
     α₁, · · · , αₙ ≥ 0.   (14)
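
A minimal numerical sketch in the spirit of (11)–(14), with two illustrative assumptions that are not from the lecture: the strict inequality in (12) is implemented as '≤', and each margin is capped by αᵢ ≤ 1 so that the LP stays bounded. The toy data are invented.

```python
import numpy as np
from scipy.optimize import linprog

# Toy 2-D data (illustrative only): 'good' points near the origin, 'bad' points shifted.
good = np.array([[0.0, 0.0], [1.0, 0.5], [0.5, 1.0]])
bad  = np.array([[3.0, 3.0], [4.0, 2.5], [2.5, 4.0]])
n_g, n_b, d = len(good), len(bad), 2

# Variables z = (x in R^d, b, alpha in R^(n_g+n_b)); maximize sum(alpha) = minimize -sum(alpha).
n_alpha = n_g + n_b
c = np.concatenate([np.zeros(d + 1), -np.ones(n_alpha)])

# 'good':  a_i^T x - b + alpha_i <= 0      (strict '<' relaxed to '<=')
# 'bad' : -a_i^T x + b + alpha_i <= 0
A_ub = np.zeros((n_alpha, d + 1 + n_alpha))
A_ub[:n_g, :d] = good;   A_ub[:n_g, d] = -1.0
A_ub[n_g:, :d] = -bad;   A_ub[n_g:, d] = 1.0
A_ub[:, d + 1:] = np.eye(n_alpha)
b_ub = np.zeros(n_alpha)

# x and b are free; margins are capped at 1 to keep the LP bounded (an extra assumption).
bounds = [(None, None)] * (d + 1) + [(0.0, 1.0)] * n_alpha
res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
x, b, alpha = res.x[:d], res.x[d], res.x[d + 1:]
print(x, b, alpha)   # a separating direction x, threshold b, and the achieved margins
```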

Linear Optimization for Classification: III

Note: Classification has been widely used in many applications such as loan applications, fraud detection, customer segmentation, image and text analysis, etc.

Strict Complementary Conditions

Consider the primal

min cᵀx
s.t. Ax = b, x ≥ 0,

and its dual

max bᵀy
s.t. Aᵀy + s = c, s ≥ 0.

Theorem
If both the primal and its dual are feasible, then there exist an optimal solution x∗ to the primal problem and an optimal solution (y∗, s∗) to the dual problem such that x∗ + s∗ > 0.

Proof of the Strict Complementary Conditions
Fix an index j and suppose that xⱼ = 0 at every optimal solution of the primal (otherwise there is nothing to prove for this index). Let z∗ be the optimal value of the primal and consider the following problem
min −xⱼ
s.t. Ax = b, x ≥ 0,
     cᵀx ≤ z∗.
By the assumption on j, the optimal value of the above problem is 0. From the strong duality theory, we know its dual
max bᵀy − τz∗
s.t. Aᵀy − τc ≤ −eⱼ, τ ≥ 0
also has optimal value 0. Here eⱼ is the vector with zero entries except for a 1 at its j-th position. If (ȳ, τ = 0) is an optimal solution, then it is easy to see that y∗ + ȳ is also an optimal solution to the original dual, now with slack variable sⱼ > 0.
If τ > 0, then it must hold that bᵀȳ = τz∗; thus the point ȳ/τ is an optimal solution to the original dual with slack variable sⱼ > 0.
Repeating the argument for every index j and taking a convex combination of the resulting optimal solutions yields a pair of optimal solutions with x∗ + s∗ > 0.
Example

min x₁ + 2x₂
s.t. x₁ + x₂ ≤ 2;
     2x₁ − x₂ ≥ 1;
     x₁, x₂ ≥ 0.

Which algorithm should we use to solve the above problem?

Answer: the dual simplex method. If we write out the standard form of the above problem, the all-slack basic solution is infeasible for the primal, but the corresponding dual solution is feasible (all reduced costs are nonnegative).
What is the solution to the dual problem?

Solution to Example in Standard Form

min x₁ + 2x₂ + 0x₃ + 0x₄
s.t. x₁ + x₂ + x₃ = 2;
     −2x₁ + x₂ + x₄ = −1;
     x₁, x₂, x₃, x₄ ≥ 0.

Solution to the primal problem: x₁ = 0.5, x₂ = 0, x₃ = 1.5, x₄ = 0.
Solution to the dual problem: y₁ = 0, y₂ = −0.5, s₁ = 0, s₂ = 2.5, s₃ = 0, s₄ = 0.5.
Checking the above solutions, we indeed have x + s > 0!
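
A quick numerical check of these claimed solutions (a numpy sketch using the standard-form data above):

```python
import numpy as np

# Standard-form data for the example.
A = np.array([[ 1.0, 1.0, 1.0, 0.0],
              [-2.0, 1.0, 0.0, 1.0]])
b = np.array([2.0, -1.0])
c = np.array([1.0, 2.0, 0.0, 0.0])

x = np.array([0.5, 0.0, 1.5, 0.0])       # claimed primal solution
y = np.array([0.0, -0.5])                # claimed dual solution
s = c - A.T @ y                          # dual slacks

print(np.allclose(A @ x, b), np.all(x >= 0))   # primal feasibility
print(np.all(s >= -1e-12))                     # dual feasibility
print(np.isclose(c @ x, b @ y))                # equal objective values, hence optimality
print(s)                                       # [0, 2.5, 0, 0.5]
print(np.all(x + s > 0))                       # strict complementarity: True
```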
