
INDE6372: Lecture 9

Convexity in LO

Jiming Peng
Department of Industrial Engineering
University of Houston

October 15, 2019

Introduction: I
A mathematical optimization problem has the following form

min f(x)
s.t. gᵢ(x) ≤ bᵢ, i = 1, · · · , m;
     x ≥ 0.

Let us define

X = {x : gᵢ(x) ≤ bᵢ, i = 1, · · · , m; x ≥ 0}.

Then we can rewrite the problem as

min f (x).
x∈X

A solution x∗ ∈ X is said to be optimal if

f(x∗) ≤ f(x), ∀x ∈ X.

Convexity

In LP, all the functions involved are linear (or affine). For a linear function f(x), we have

f(αx + βy) = αf(x) + βf(y), for all α, β ≥ 0 with α + β = 1.

The point αx + βy is called a convex combination of x and y .


We say a function f is convex if

f(αx + βy) ≤ αf(x) + βf(y), for all α, β ≥ 0 with α + β = 1.

Observation: a linear function is convex. In LP, all the functions are convex.
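
To make the definition concrete, here is a small numerical sanity check (an illustrative sketch, not part of the lecture; the helper name `looks_convex`, the sampling range, and the tolerance are our own choices). It samples the convexity inequality for a few univariate functions:

```python
import numpy as np

def looks_convex(f, num_trials=10_000, lo=-10.0, hi=10.0, rng=np.random.default_rng(0)):
    """Sample the inequality f(ax + by) <= a f(x) + b f(y) with a + b = 1, a, b >= 0.
    Returns False as soon as a violation is found."""
    for _ in range(num_trials):
        x, y = rng.uniform(lo, hi, size=2)
        a = rng.uniform(0.0, 1.0)
        if f(a * x + (1 - a) * y) > a * f(x) + (1 - a) * f(y) + 1e-9:
            return False
    return True

print(looks_convex(lambda x: x**2))        # True:  x^2 satisfies the inequality
print(looks_convex(abs))                   # True:  |x| satisfies the inequality
print(looks_convex(lambda x: np.sin(x)))   # False: a violation is found, sin is not convex
```

Note that such sampling can only refute convexity by exhibiting a violation; it never proves it.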

Example

The univariate function f(x) = x² is convex.


The absolute value function |x| is convex.
The sum of two convex functions is still convex.
More examples?

(Figure: a convex function vs. a nonconvex function.)

Affine and Convex Set

Consider a convex combination

y = αx¹ + βx², α + β = 1, α, β ≥ 0.

We can write

y = αx¹ + (1 − α)x² = x² + α(x¹ − x²).

A set S is said to be convex if for any two points x¹, x² ∈ S, the convex combination αx¹ + (1 − α)x² ∈ S for any α ∈ [0, 1].
A set S is said to be affine if for any two points x¹, x² ∈ S, the point αx¹ + (1 − α)x² ∈ S for any α ∈ ℝ.

Example

(Figure: a convex set vs. a nonconvex set.)

Affine and Convex Set

Theorem
An affine set is convex.
Example: the straight line aᵀx = b is affine.
The convex hull of a set S is defined by

Hull(S) = {α₁x¹ + · · · + αₖxᵏ : x¹, · · · , xᵏ ∈ S, α₁ + · · · + αₖ = 1, αᵢ ≥ 0}.

Hull(S) is the smallest convex set containing S.
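
As a quick illustration (our own sketch; the points below are arbitrary), scipy.spatial.ConvexHull computes the hull of a finite point set, and any convex combination of the points stays inside it:

```python
import numpy as np
from scipy.spatial import ConvexHull

# Five points in the plane; the last one lies inside the hull of the others.
S = np.array([[0.0, 0.0],
              [1.0, 0.0],
              [1.0, 1.0],
              [0.0, 1.0],
              [0.5, 0.5]])

hull = ConvexHull(S)
print(S[hull.vertices])                 # corners of Hull(S); the interior point is dropped

# A random convex combination of the points of S stays inside Hull(S).
rng = np.random.default_rng(0)
alpha = rng.dirichlet(np.ones(len(S)))  # alpha_i >= 0, sum = 1
print(alpha @ S)                        # a point of the unit square, i.e. of Hull(S)
```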

Example

(Figures: the convex hulls of two example point sets.)

Intersection of Convex Sets
Theorem
Given any collection of convex sets (finite, countable or
uncountable), their intersection is itself a convex set.
Proof: If the intersection is empty or consists of a single point, the theorem is true by definition. Otherwise, take any two points A, B in the intersection. Since each set in the collection is convex and contains both A and B, the line segment joining A and B lies wholly within each set, hence wholly within their intersection.
Question: How about the union of convex sets?

Convex Set

Theorem
Let S = {x : gᵢ(x) ≤ 0, i = 1, · · · , m}. If all the functions gᵢ(x) are convex, then S is convex.
Proof:
Let Sᵢ = {x : gᵢ(x) ≤ 0}. Because gᵢ(x) is a convex function, for any x¹, x² ∈ Sᵢ we have

gᵢ(αx¹ + (1 − α)x²) ≤ αgᵢ(x¹) + (1 − α)gᵢ(x²) ≤ 0, ∀α ∈ [0, 1].

Thus Sᵢ is a convex set. Since S = ∩ᵢ Sᵢ, using the theorem from the previous slide we can conclude that S is convex.

Cone

A set S is called a cone if for every x ∈ S and α ≥ 0, we have αx ∈ S.
Example: the set X = {x ∈ ℝⁿ : x₁, x₂, · · · , xₙ ≥ 0} is a cone. Such a cone is also called the nonnegative orthant, denoted by ℝⁿ₊.
The set

S = {x ∈ ℝ³ : √(x₁² + x₂²) ≤ x₃}

is a cone, the ice-cream cone!
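
A small numerical check of the cone property for the ice-cream cone (an illustrative sketch; the test point and the tolerance are our own choices):

```python
import numpy as np

def in_ice_cream_cone(x):
    """Membership test for S = {x in R^3 : sqrt(x1^2 + x2^2) <= x3}."""
    return np.hypot(x[0], x[1]) <= x[2] + 1e-12

rng = np.random.default_rng(0)
x = np.array([0.3, -0.4, 1.0])           # sqrt(0.09 + 0.16) = 0.5 <= 1.0, so x is in S
print(in_ice_cream_cone(x))              # True
print(all(in_ice_cream_cone(a * x)       # nonnegative scalings of x stay in S
          for a in rng.uniform(0.0, 10.0, size=100)))
print(in_ice_cream_cone(-x))             # False: S is a cone but not a subspace
```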

Hyperplane and Halfspace

A hyperplane is a set of the form

{x : aᵀx = b},

where a ∈ ℝⁿ and b ∈ ℝ.

The feasible set of an LP in standard form is the intersection of the nonnegative orthant and some hyperplanes.
The set {x : aᵀx ≤ b} is called a halfspace.

Local and Global Minimum
To describe a local minimizer of an optimization problem, we define the neighborhood of a point:

N(x, r) = {y : ‖y − x‖ ≤ r}.

Consider the optimization problem

min f(x)
x∈X

We say x∗ is a local minimal solution to the problem if there exists a radius r > 0 such that

f(x∗) ≤ f(x), ∀x ∈ N(x∗, r) ∩ X.

Theorem
If f(x) is convex and the constraint set X is also convex, then a local minimal solution to the problem is also a global minimal solution.
Proof of the Theorem

Suppose to the contrary that x∗ is not a global minimal solution; then there exists another point y∗ ∈ X satisfying f(y∗) < f(x∗). Now consider the convex combination x∗ + α(y∗ − x∗) for α ∈ [0, 1]. Since X is convex, we have

x∗, y∗ ∈ X =⇒ x∗ + α(y∗ − x∗) ∈ X, ∀α ∈ [0, 1].

Because f(x) is convex, it follows that

f(x∗ + α(y∗ − x∗)) ≤ (1 − α)f(x∗) + αf(y∗) ≤ f(x∗),

and the second inequality holds strictly if 0 < α ≤ 1. Recall that

‖α(y∗ − x∗)‖ = α‖y∗ − x∗‖.

For fixed x∗, y∗, we can therefore choose the parameter α > 0 so small that

α‖y∗ − x∗‖ ≤ r.

This shows that the point x∗ + α(y∗ − x∗) lies in the neighborhood N(x∗, r) ∩ X but has a strictly smaller objective value, which violates the assumption that x∗ is a local minimum. From this contradiction we conclude that x∗ must be a global minimal solution.
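
As an illustration of the theorem (our own sketch, not from the slides; the objective functions below are arbitrary examples), local searches on a convex objective over a convex set all return the same global minimizer, while a nonconvex objective can trap them at different local minima:

```python
import numpy as np
from scipy.optimize import minimize

# Convex objective on a convex set: every local search ends at the same (global) minimizer.
f = lambda x: (x[0] - 1.0) ** 2
starts = [-3.0, 0.0, 2.5]
print([round(minimize(f, x0=[s], bounds=[(-3.0, 3.0)]).x[0], 3) for s in starts])  # all ~1.0

# Nonconvex objective: different starting points can get stuck at different local minima.
g = lambda x: np.sin(3.0 * x[0]) + 0.1 * x[0] ** 2
print([round(minimize(g, x0=[s], bounds=[(-3.0, 3.0)]).x[0], 3) for s in starts])
```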

Farkas Lemma

Lemma
The system Ax ≤ b has no solution if and only if there is a y such that
Aᵀy = 0, y ≥ 0, bᵀy < 0.

Proof: consider the primal LP

max 0
s.t. Ax ≤ b

and its dual

min bᵀy
s.t. Aᵀy = 0, y ≥ 0.

Proof of Farkas Lemma: Continuation

Clearly the dual is feasible (take y = 0). If the primal is feasible, then by weak duality the dual is bounded (bᵀy ≥ 0 for every dual feasible y). Conversely, by LP duality, if the primal is infeasible, then the feasible dual must be unbounded. Hence the primal is infeasible if and only if the dual is unbounded. It remains to observe that the dual is unbounded if and only if the relations in the lemma hold: the dual feasible set is a cone, so the objective is unbounded below exactly when there is a feasible y with bᵀy < 0 (recall the definition of an unbounded LP). This finishes the proof of Farkas Lemma.
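
A small computational sketch of the lemma (the data below are made up for illustration). Since any positive multiple of a Farkas certificate is again a certificate, one convenient normalization, chosen here purely for illustration, is to require bᵀy ≤ −1 and solve a feasibility LP with scipy.optimize.linprog:

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical data: the system  x <= 1,  -x <= -2  (i.e. x >= 2) is infeasible.
A = np.array([[1.0], [-1.0]])
b = np.array([1.0, -2.0])

# Is Ax <= b feasible?  Solve  max 0  s.t.  Ax <= b  (the objective is irrelevant).
feas = linprog(c=np.zeros(A.shape[1]), A_ub=A, b_ub=b,
               bounds=[(None, None)] * A.shape[1])
print(feas.status)   # status 2 means "infeasible"

# Farkas certificate: y >= 0 with A^T y = 0 and b^T y < 0, normalized to b^T y <= -1.
cert = linprog(c=np.zeros(A.shape[0]),
               A_eq=A.T, b_eq=np.zeros(A.shape[1]),
               A_ub=b.reshape(1, -1), b_ub=np.array([-1.0]),
               bounds=[(0, None)] * A.shape[0])
print(cert.x)        # e.g. y = [1, 1]: A^T y = 0 and b^T y = -1 < 0
```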

Farkas Lemma: Application Example
Consider an interbank market consisting of n banks. The liability relationship between any two banks is described by a liability matrix L ∈ ℝⁿˣⁿ, where Lᵢⱼ ≥ 0 denotes the liability of bank i to bank j. Let bᵢ denote the exogenous operating cash flow received by bank i, which can be used to compensate the potential shortfall on incoming cash flow, and let pᵢ = Σⱼ Lᵢⱼ denote the total liability of bank i. Let A = diag(p) − Lᵀ and let e be the all-one vector. Given b, the systemic risk of a financial system can be estimated via solving the following LO problem

max eᵀx   (1)
s.t. Ax ≤ b, 0 ≤ x ≤ e.

Key Issue: The operating cash flow b is subject to market fluctuations. When does the system become infeasible (i.e., when will some bank go bankrupt)?
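
A minimal sketch of how problem (1) can be set up with scipy.optimize.linprog, using a hypothetical 3-bank liability matrix (the numbers are invented for illustration; infeasibility would show up through the solver status):

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical 3-bank example (not from the lecture).
L = np.array([[0.0, 2.0, 1.0],    # L[i, j] = liability of bank i to bank j
              [1.0, 0.0, 2.0],
              [0.5, 0.5, 0.0]])
b = np.array([1.0, 0.5, 0.8])     # exogenous operating cash flows
p = L.sum(axis=1)                 # total liability of each bank
A = np.diag(p) - L.T
e = np.ones(3)

# Problem (1):  max e^T x  s.t.  Ax <= b,  0 <= x <= e  (linprog minimizes, so use -e).
res = linprog(c=-e, A_ub=A, b_ub=b, bounds=[(0.0, 1.0)] * 3)
print(res.status, res.x, -res.fun)   # status 0 = solved; status 2 would mean infeasible
```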
The separation theorem

A polyhedron is the intersection of a finite number of halfspaces:

{x ∈ ℝⁿ : Ax ≤ b}, where A ∈ ℝᵐˣⁿ, b ∈ ℝᵐ.

Theorem
Let P and P̄ be two disjoint nonempty polyhedra in ℝⁿ. Then there exist disjoint halfspaces H and H̄ such that P ⊂ H and P̄ ⊂ H̄.

Proof of the separation theorem

Let

P = {x ∈ ℝⁿ : Ax ≤ b}, P̄ = {x ∈ ℝⁿ : Āx ≤ b̄}, where A ∈ ℝᵐˣⁿ, Ā ∈ ℝᵏˣⁿ.

Since the two polyhedra do not intersect, the set

P̃ = {x : Ax ≤ b, Āx ≤ b̄}

is empty. By the Farkas lemma (applied to the stacked system defining P̃), there exists (y, ȳ) satisfying

Aᵀy + Āᵀȳ = 0, bᵀy + b̄ᵀȳ < 0, (y, ȳ) ≥ 0.

From the second relation, we have either bᵀy < 0 or b̄ᵀȳ < 0 (or both). Without loss of generality, we assume that bᵀy < 0. Because P is nonempty, we must have Aᵀy ≠ 0 (otherwise the Farkas lemma, applied to Ax ≤ b with the certificate y, would imply that P is empty).

Proof of the separation theorem: Cont.

Define

H = {x : yᵀAx ≤ bᵀy}, H̄ = {x : yᵀAx ≥ −b̄ᵀȳ}.

Since Aᵀy ≠ 0, these two sets are halfspaces. Moreover, they have no intersection: otherwise some x would satisfy

bᵀy ≥ yᵀAx ≥ −b̄ᵀȳ =⇒ bᵀy + b̄ᵀȳ ≥ 0,

which contradicts bᵀy + b̄ᵀȳ < 0.


Next we show P ⊂ H and P̄ ⊂ H̄. Because P = {x : Ax ≤ b} and y ≥ 0, for every x ∈ P we have yᵀAx ≤ yᵀb = bᵀy, which implies x ∈ H. Thus P ⊂ H. The inclusion P̄ ⊂ H̄ follows similarly, using yᵀA = −ȳᵀĀ.

Separation Example
We consider the following example:

P = {x ∈ ℝ² : 0 ≤ x₁, x₂ ≤ 1}, P̄ = {x ∈ ℝ² : 2 ≤ x₁, x₂ ≤ 3},   (2)
or P = {x ∈ ℝ² : Ax ≤ b}, P̄ = {x ∈ ℝ² : Āx ≤ b̄},   (3)
where A = [I; −I] (the 2×2 identity stacked on top of −I), b = (1, 1, 0, 0)ᵀ, Ā = A, b̄ = (3, 3, −2, −2)ᵀ.   (4)

Clearly, the system Ax ≤ b, Āx ≤ b̄ is infeasible. By Farkas Lemma, the following system

y₁ − y₃ + ȳ₁ − ȳ₃ = 0,   (5)
y₂ − y₄ + ȳ₂ − ȳ₄ = 0,   (6)
y₁ + y₂ + 3(ȳ₁ + ȳ₂) − 2(ȳ₃ + ȳ₄) < 0,   (7)
(y, ȳ) ≥ 0,   (8)

is feasible, e.g., y = (0, 1, 0, 0), ȳ = (0, 0, 0, 1).


Separation Example

Note that for the example in the previous slide, with y = (0, 1, 0, 0) and ȳ = (0, 0, 0, 1), we have

yᵀAx = x₂, bᵀy = 1, b̄ᵀȳ = −2.

From the above relations, we obtain the following two halfspaces

H = {x ∈ ℝ² : x₂ ≤ 1}, H̄ = {x ∈ ℝ² : x₂ ≥ −b̄ᵀȳ = 2}.

It is easy to verify that P ⊂ H and P̄ ⊂ H̄.
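
A quick numerical verification of this example (a numpy sketch; checking the corners suffices because P and P̄ are polytopes):

```python
import numpy as np

I2 = np.eye(2)
A = np.vstack([I2, -I2]);  b = np.array([1.0, 1.0, 0.0, 0.0])
Abar = A.copy();           bbar = np.array([3.0, 3.0, -2.0, -2.0])

y    = np.array([0.0, 1.0, 0.0, 0.0])
ybar = np.array([0.0, 0.0, 0.0, 1.0])

# Farkas-type certificate for the combined system (5)-(8).
print(A.T @ y + Abar.T @ ybar)              # [0, 0]
print(b @ y + bbar @ ybar)                  # -1 < 0

# Separation: every corner of P has x2 <= 1, every corner of Pbar has x2 >= 2.
P_corners    = np.array([[0, 0], [1, 0], [0, 1], [1, 1]], dtype=float)
Pbar_corners = P_corners + 2.0
print(np.all(P_corners @ (y @ A) <= b @ y))             # True: P    in H
print(np.all(Pbar_corners @ (y @ A) >= -(bbar @ ybar)))  # True: Pbar in Hbar
```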

Separation Example

Note: The separation theorem does not hold for nonconvex sets.

Linear Optimization for Classification: I

Let a₁, · · · , aₙ represent the data points in a data set, which consists of two groups ('good' and 'bad'). We would like to find a classification rule that separates the two groups. A linear classifier is defined by a hyperplane
aᵀx = b.
A classical LO model is described as follows:

aᵢᵀx < b, if aᵢ is in group 'good';   (9)
aᵢᵀx ≥ b, if aᵢ is in group 'bad'.   (10)

Linear Optimization for Classification: II

To find a good classifier, we solve the following optimization problem

max α₁ + · · · + αₙ   (11)
s.t. aᵢᵀx < b − αᵢ, if aᵢ is in group 'good';   (12)
     aᵢᵀx ≥ b + αᵢ, if aᵢ is in group 'bad';   (13)
     α₁, · · · , αₙ ≥ 0.   (14)
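
A minimal numerical sketch in the spirit of (11)–(14), with two illustrative assumptions that are not from the lecture: the strict inequality in (12) is implemented as '≤', and each margin is capped by αᵢ ≤ 1 so that the LP stays bounded. The toy data are invented.

```python
import numpy as np
from scipy.optimize import linprog

# Toy 2-D data (illustrative only): 'good' points near the origin, 'bad' points shifted.
good = np.array([[0.0, 0.0], [1.0, 0.5], [0.5, 1.0]])
bad  = np.array([[3.0, 3.0], [4.0, 2.5], [2.5, 4.0]])
n_g, n_b, d = len(good), len(bad), 2

# Variables z = (x in R^d, b, alpha in R^(n_g+n_b)); maximize sum(alpha) = minimize -sum(alpha).
n_alpha = n_g + n_b
c = np.concatenate([np.zeros(d + 1), -np.ones(n_alpha)])

# 'good':  a_i^T x - b + alpha_i <= 0      (strict '<' relaxed to '<=')
# 'bad' : -a_i^T x + b + alpha_i <= 0
A_ub = np.zeros((n_alpha, d + 1 + n_alpha))
A_ub[:n_g, :d] = good;   A_ub[:n_g, d] = -1.0
A_ub[n_g:, :d] = -bad;   A_ub[n_g:, d] = 1.0
A_ub[:, d + 1:] = np.eye(n_alpha)
b_ub = np.zeros(n_alpha)

# x and b are free; margins are capped at 1 to keep the LP bounded (an extra assumption).
bounds = [(None, None)] * (d + 1) + [(0.0, 1.0)] * n_alpha
res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
x, b, alpha = res.x[:d], res.x[d], res.x[d + 1:]
print(x, b, alpha)   # a separating direction x, threshold b, and the achieved margins
```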

Linear Optimization for Classification: III

Note: Classification has been widely used in many applications such as loan applications, fraud detection, customer segmentation, image and text analysis, etc.

Strict Complementary Conditions

Consider the primal

min cᵀx
s.t. Ax = b, x ≥ 0,

and its dual

max bᵀy
s.t. Aᵀy + s = c, s ≥ 0.

Theorem
If both the primal and its dual are feasible, then there exist an optimal solution x∗ to the primal problem and an optimal solution (y∗, s∗) to the dual problem such that x∗ + s∗ > 0.

Proof of the Strict Complementary Conditions
Fix an index j and suppose that xⱼ = 0 at every optimal solution of the primal (otherwise there is nothing to prove for this index). Let z∗ be the optimal value of the primal and consider the following problem
min −xⱼ
s.t. Ax = b, x ≥ 0,
     cᵀx ≤ z∗.
By the assumption on j, the optimal value of the above problem is 0. From the strong duality theory, we know its dual
max bᵀy − τz∗
s.t. Aᵀy − τc ≤ −eⱼ, τ ≥ 0
also has optimal value 0. Here eⱼ is the vector with zero entries except for a 1 at its j-th position. If (ȳ, τ = 0) is an optimal solution, then it is easy to see that y∗ + ȳ is also an optimal solution to the original dual, now with slack variable sⱼ > 0.
If τ > 0, then it must hold that bᵀȳ = τz∗; thus the point ȳ/τ is an optimal solution to the original dual with slack variable sⱼ > 0.
Repeating the argument for every index j and taking a convex combination of the resulting optimal solutions yields a pair of optimal solutions with x∗ + s∗ > 0.
Example

min x₁ + 2x₂
s.t. x₁ + x₂ ≤ 2;
     2x₁ − x₂ ≥ 1;
     x₁, x₂ ≥ 0.

Which algorithm should we use to solve the above problem?

Answer: the dual simplex method. If we write out the standard form of the above problem, the all-slack basic solution is infeasible for the primal, but the corresponding dual solution is feasible (all reduced costs are nonnegative).
What is the solution to the dual problem?

Solution to Example in Standard Form

min x₁ + 2x₂ + 0x₃ + 0x₄
s.t. x₁ + x₂ + x₃ = 2;
     −2x₁ + x₂ + x₄ = −1;
     x₁, x₂, x₃, x₄ ≥ 0.

Solution to the primal problem: x₁ = 0.5, x₂ = 0, x₃ = 1.5, x₄ = 0.
Solution to the dual problem: y₁ = 0, y₂ = −0.5, s₁ = 0, s₂ = 2.5, s₃ = 0, s₄ = 0.5.
Checking the above solutions, we indeed have x + s > 0!
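
A quick numerical check of these claimed solutions (a numpy sketch using the standard-form data above):

```python
import numpy as np

# Standard-form data for the example.
A = np.array([[ 1.0, 1.0, 1.0, 0.0],
              [-2.0, 1.0, 0.0, 1.0]])
b = np.array([2.0, -1.0])
c = np.array([1.0, 2.0, 0.0, 0.0])

x = np.array([0.5, 0.0, 1.5, 0.0])       # claimed primal solution
y = np.array([0.0, -0.5])                # claimed dual solution
s = c - A.T @ y                          # dual slacks

print(np.allclose(A @ x, b), np.all(x >= 0))   # primal feasibility
print(np.all(s >= -1e-12))                     # dual feasibility
print(np.isclose(c @ x, b @ y))                # equal objective values, hence optimality
print(s)                                       # [0, 2.5, 0, 0.5]
print(np.all(x + s > 0))                       # strict complementarity: True
```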
