Lecture notes
Philine Schiewe
Assistant Professor for Operations Research
Aalto University, School of Science
Contents

1 Introduction
  1.1 What is optimisation?
  1.2 Types of optimisation problems / mathematical programming
  1.3 Modelling real-world problems using optimisation
  1.4 Modelling problems as LP and graphical solution approach
      1.4.1 Graphical sensitivity analysis

2 Linear programming
  2.1 Representations of linear programs
  2.2 Geometric and algebraic basics
  2.3 Simplex method
  2.4 Gauss-Jordan elimination and simplex tableaus
  2.5 Artificial variables and feasible initial solutions
      2.5.1 The M-method
      2.5.2 Two-phase method
  2.6 Special cases
  2.7 Duality in linear programming
      2.7.1 Primal-dual relationship
      2.7.2 Dual simplex
  2.8 Sensitivity analysis
      2.8.1 Changes in the independent term (b)
      2.8.2 Changes in the objective function coefficients (c)

3 Integer programming
  3.1 Integer programming problems
  3.2 Modelling with integer variables
      3.2.1 Fixed cost
      3.2.2 Disjunctions and implications
  3.3 Solving general integer programs
      3.3.1 Branch-and-bound method

4 Nonlinear optimisation
  4.1 Revision of calculus
  4.2 Nonlinear optimisation models
  4.3 Optimality conditions - Unconstrained problems
  4.4 Convexity of functions
  4.5 One-dimensional optimisation methods - Line search
      4.5.1 Bisection method
      4.5.2 Newton's method
  4.6 Multidimensional functions
  4.7 Optimality conditions
  4.8 Multidimensional optimisation methods
      4.8.1 Steepest descent / gradient descent method
      4.8.2 Newton's method
  4.9 Optimality conditions for constrained problems
      4.9.1 Karush-Kuhn-Tucker (KKT) conditions
  4.10 Solution approaches for constrained problems
      4.10.1 Newton's method for constrained problems
      4.10.2 Barrier method
      4.10.3 Primal-dual interior point method
Chapter 1
Introduction
[Figure: the modelling and optimisation cycle: a real-world problem is translated into a mathematical model (modelling), the model is solved (optimisation), and the resulting solution is interpreted (interpretation), possibly leading to a revised problem.]
Starting from our problem, we first have to derive a mathematical model to represent it. We have to identify the decision variables, the constraints they have to satisfy, and the objective function to be optimised.
This lecture focuses on the next step, optimisation. The goal is to find a feasible solution, i.e., a variable assignment that does not violate any constraint, that maximises (or minimises) the objective function we are considering. This can be achieved by applying a suitable optimisation method.
During the course of this lecture, we will get to know different classes of optimisation problems and consider appropriate solution approaches.
When we have computed an optimal (or near-optimal) solution, we might or might not be done.
We might notice that the problem we just solved was not the problem we actually wanted to
consider: the fastest way to work might be by car but that will not work if you don’t own a car.
Thus, a careful interpretation of the solution is necessary and can lead to changes in the model –
starting the algorithm engineering cycle all over again.
(P): min f(x)
     s.t. g_i(x) ≤ 0,  i = 1, …, m
          h_j(x) = 0,  j = 1, …, l
          x ∈ X.
Note that we call g_i(x) ≤ 0 for i = 1, …, m inequality constraints and h_j(x) = 0 for j = 1, …, l equality constraints.
We are interested in feasible and especially optimal solutions. While we write "min" in (P), a minimum (or optimal solution) does not necessarily exist.
Definition 1.2 (Feasible and optimal solutions). Let P be an optimisation problem as in Definition 1.1.
Our goal will be to solve variations of the general problem P. The simpler the assumptions are that define a type of problem, the better the methods to solve such problems. We focus on the following four types of optimisation problems.
• Linear programming (LP): all functions f, g_i and h_j are linear (affine) and X = R^n;
• Nonlinear programming (NLP): some (or all) of the functions f, g_i or h_j are nonlinear;
• (Mixed-)integer programming ((M)IP): LP where (some of the) variables are binary (or integer), i.e., X = R^k × {0, 1}^{n−k} (or X = R^k × Z^{n−k});
• Mixed-integer nonlinear programming (MINLP): some (or all) of the functions f, g_i or h_j are nonlinear and (some of the) variables are binary (or integer), i.e., X = R^k × {0, 1}^{n−k} (or X = R^k × Z^{n−k}).
Table 1.1: Resources and selling price for tables and chairs.
Remark. Note that models are simplified representations of reality. Simplifying assumptions in
this example include:
The most suitable optimisation method for solving an optimisation model depends on the model’s
mathematical properties.
In this course, we will learn how to specify a suitable method for a model given its properties. For
now, we will concentrate on (continuous) linear (optimisation) models (LPs).
Linear models have particular properties that can be exploited to devise an efficient optimisation
method.
Example 1.3 (continued). Consider again the model for calculating the optimal number of tables and chairs to produce. We can interpret the constraints for available labour and available wood as half-spaces in R^2. The feasible region is the intersection of these two half-spaces and R^2_+ = {(x_1, x_2) : x_1, x_2 ≥ 0}, see Figure 1.3a.
[Figure 1.3: (a) feasible region; (b) finding an optimal solution via level sets z = 2400 and z = 7478.3, with optimum x^* = (x_1^* = 6.08, x_2^* = 4.34).]
To find an optimal solution in the feasible region, we consider level sets of the objective function. For a given objective value z, the level set {(x_1, x_2) ∈ R^2 : 800x_1 + 600x_2 = z} describes all solutions with objective value z. Note that in our case, the level sets are lines in R^2 with the same slope.
Figure 1.3b shows how we can graphically find the largest value z such that the corresponding level
set intersects with the feasible region. This intersection is the set of optimal solutions.
A more complex real-world problem to be modelled is the production planning problem, a classical problem in operations research (OR).
Example 1.4 (Real-world model: production planning problems (OR)). In the production planning problem, we need to plan the production and distribution of goods. The transportation costs are proportional to the distance travelled, factories have a capacity limit and clients have known demands.
[Figure 1.4: factories (Seattle, San Diego) and clients (New York, Chicago, Miami) as a bipartite network.]
                      Clients
Factory       NY     Chicago   Miami    Capacity
Seattle       2.5    1.7       1.8      350
San Diego     3.5    1.8       1.4      600
Demands       325    300       275      -

Table 1.2: Problem data: unit transportation costs, demands and capacities.
Let i ∈ I = {Seattle, San Diego} be the index set representing the factories. Similarly, let j ∈
J = {New York, Chicago, Miami} be the index set representing the clients. Figure 1.4 shows the
factories and clients while Table 1.2 gives the corresponding costs, demands and capacities.
To model the problem mathematically, we follow the same three key steps as above: identifying the decision variables, the constraints and the objective function.
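The resulting model is a small transportation LP that an off-the-shelf solver handles directly. The following sketch (our own illustration, not part of the notes; the variable layout x[i*n + j] for the amount shipped from factory i to client j is our own choice) solves it with scipy.optimize.linprog:

```python
# Sketch (ours): the transportation model of Table 1.2 with scipy.
import numpy as np
from scipy.optimize import linprog

cost = np.array([[2.5, 1.7, 1.8],    # Seattle   -> NY, Chicago, Miami
                 [3.5, 1.8, 1.4]])   # San Diego -> NY, Chicago, Miami
capacity = [350, 600]
demand = [325, 300, 275]
m, n = cost.shape

A_ub = np.zeros((m, m * n))          # capacity: sum_j x_ij <= capacity_i
for i in range(m):
    A_ub[i, i * n:(i + 1) * n] = 1
A_eq = np.zeros((n, m * n))          # demand: sum_i x_ij = demand_j
for j in range(n):
    A_eq[j, j::n] = 1

res = linprog(cost.flatten(), A_ub=A_ub, b_ub=capacity,
              A_eq=A_eq, b_eq=demand)  # bounds default to x >= 0
print(res.x.reshape(m, n), res.fun)
```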
[Figure 1.5: positive and negative observations in R^2 (left) and a linear classifier separating them (right).]
Our task is to obtain a function f : R^n → R from a given family of functions such that f(x_i) ≥ 0 for positive observations x_i ∈ I^+ and f(x_i) ≤ 0 for negative observations x_i ∈ I^-.
Here, f is selected as a linear classifier, i.e., f(x_i) = a⊤x_i − b, in which we try to set optimal a and b considering the classification error.
The best possible classifier is that which minimises misclassification.
Let us define the following error measures:
e^-(x_i ∈ I^-; a, b) =
    0             if a⊤x_i − b ≤ 0,
    a⊤x_i − b     if a⊤x_i − b > 0.

e^+(x_i ∈ I^+; a, b) =
    0             if a⊤x_i − b ≥ 0,
    b − a⊤x_i     if a⊤x_i − b < 0.
(LC′): min Σ_{x_i ∈ I^-} e^-(x_i; a, b) + Σ_{x_i ∈ I^+} e^+(x_i; a, b)
       s.t. a ∈ R^n
            b ∈ R.
We introduce slack variables (u_i)_{i=1}^{N} to model e^- and for each x_i ∈ I^- we add the following constraints.

a⊤x_i − b − u_i ≤ 0    (1.1)
u_i ≥ 0                (1.2)

a⊤x_i − b ≤ 0 ⇒ a⊤x_i − b − 0 ≤ 0 ⇒ e^-(x_i; a, b) = 0 =: u_i ≥ 0
a⊤x_i − b > 0 ⇒ a⊤x_i − b − (a⊤x_i − b) ≤ 0 ⇒ e^-(x_i; a, b) = a⊤x_i − b =: u_i ≥ 0.
Note that choosing a smaller value for ui would lead to a contradiction. As ui is minimised,
constraints (1.1) and (1.2) model e− correctly.
Analogously, we introduce slack variables (v_i)_{i=N+1}^{M} to model e^+ and for each x_i ∈ I^+ we add the following constraints.

a⊤x_i − b + v_i ≥ 0    (1.3)
v_i ≥ 0                (1.4)

a⊤x_i − b ≥ 0 ⇒ a⊤x_i − b + 0 ≥ 0 ⇒ e^+(x_i; a, b) = 0 =: v_i ≥ 0
a⊤x_i − b < 0 ⇒ a⊤x_i − b + (b − a⊤x_i) ≥ 0 ⇒ e^+(x_i; a, b) = b − a⊤x_i =: v_i ≥ 0.
Note that choosing a smaller value for vi would lead to a contradiction. As vi is minimised,
constraints (1.3) and (1.4) model e+ correctly.
(LC): min Σ_{i=1}^{N} u_i + Σ_{i=N+1}^{M} v_i
      s.t. a⊤x_i − b − u_i ≤ 0,  i = 1, …, N
           a⊤x_i − b + v_i ≥ 0,  i = N+1, …, M
           a ∈ R^n
           b ∈ R
           u_i ≥ 0,  i = 1, …, N
           v_i ≥ 0,  i = N+1, …, M.
Another classical family of LP applications are blending problems, which arise, e.g., in:
• feed composition;
• fuel specification;
• drug manufacturing.
Example 1.6 (Diet problem). A farm uses at least 800 lb of a special feed daily. The special feed is a mixture of corn and soybean meal with the following compositions:

Feedstuff       Protein (lb/lb)   Fiber (lb/lb)   Cost ($/lb)
Corn            0.09              0.02            0.30
Soybean meal    0.60              0.06            0.90

The dietary requirements of the special feed are at least 30% protein and at most 5% fiber. With x_1 and x_2 denoting the daily amounts (in lb) of corn and soybean meal, the model reads

min z = 0.30x_1 + 0.90x_2
s.t. x_1 + x_2 ≥ 800              (1.6)
     0.21x_1 − 0.30x_2 ≤ 0        (1.7)
     0.03x_1 − 0.01x_2 ≥ 0        (1.8)
     x_1, x_2 ≥ 0.                (1.9)
It is convenient to reformulate problems to a format with variables on the left-hand side and
constants on the right-hand side.
Σ_{j=1}^{n} a_ij x_j ≤ b_i,   i = 1, …, m,

where the sum on the left-hand side is the LHS and the constant b_i is the RHS.
Remark. In practice, optimisation problems have many more variables. Two-variable problems are, however, graphically representable, which is useful to infer geometrical properties of linear programs (LPs).
Example 1.6 (Diet problem, continued). To compute an optimal solution to the diet problem, we first graph the feasible set. Note that due to the domain of the variables (1.9), we only need to consider R^2_+ = {(x_1, x_2) : x_1 ≥ 0, x_2 ≥ 0}.
Each of the constraints (1.6), (1.7) and (1.8) defines a half-space in R^2, see Figure 1.6. The feasible region is the intersection of these half-spaces and R^2_+.
[Figure 1.6: building up the feasible region: (a) constraint (1.6); (b) adding constraint (1.7); (c) adding constraint (1.8); (d) the resulting feasible region.]
To determine an optimal solution we consider the level set {(x_1, x_2) ∈ R^2 : 0.3x_1 + 0.9x_2 = z} for varying values of z, see Figure 1.7a. Notice that in R^2, level sets are parallel lines. By reducing z as long as possible while the corresponding level set still intersects the feasible region, we find an optimal solution, see Figure 1.7b.
[Figure 1.7: (a) level sets of z = 0.30x_1 + 0.90x_2 for z = 1800 and z = 1300; (b) optimal solution x^* = (x_1^* = 470.6, x_2^* = 329.4) with z^* = 437.64.]
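The graphical solution can be cross-checked numerically; a short sketch of ours using constraints (1.6)-(1.8) with scipy.optimize.linprog:

```python
# Sketch (ours): the diet problem solved with scipy.optimize.linprog.
from scipy.optimize import linprog

c = [0.30, 0.90]               # cost of corn (x1) and soybean meal (x2)
A_ub = [[-1.0, -1.0],          # (1.6): x1 + x2 >= 800
        [0.21, -0.30],         # (1.7): 0.21 x1 - 0.30 x2 <= 0
        [-0.03, 0.01]]         # (1.8): 0.03 x1 - 0.01 x2 >= 0
b_ub = [-800.0, 0.0, 0.0]

res = linprog(c, A_ub=A_ub, b_ub=b_ub)  # default bounds give x >= 0
print(res.x, res.fun)                   # ~ (470.6, 329.4), z* ~ 437.6
```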
Remark. The diet problem example illustrates some important concepts of the geometry of linear programs. For any feasible solution, i.e., any point in the feasible region, we differentiate between active and inactive constraints.
The intersection of two (linearly independent) active constraints forms a vertex of the feasible region in R^2. We will see later that it suffices to consider vertices as candidates for optimal solutions.
Example 1.7 (Production planning). A paint factory produces exterior and interior paint from raw materials M1 and M2, see Table 1.4. The maximum demand for interior paint is 2 tons/day. Moreover, the amount of interior paint produced cannot exceed that of exterior paint by more than 1 ton/day.

              Exterior   Interior   Daily availability
M1 (tons)     6          4          24
M2 (tons)     1          2          6

Table 1.4: Raw material consumption per ton of paint and daily availability.

Our goal is to determine optimal paint production.
max z = c⊤ x
Ax ≤ b
x≥0
where
c = (5, 4)⊤,  x = (x_1, x_2)⊤,

    ⎛  6   4 ⎞
A = ⎜  1   2 ⎟   and   b = (24, 6, 1, 2)⊤.
    ⎜ −1   1 ⎟
    ⎝  0   1 ⎠
The feasible region as well as the level sets {(x1 , x2 ) : 5x1 + 4x2 = z} for the objective function are
given in Figure 1.8a and Figure 1.8b, respectively.
[Figure 1.8: (a) feasible region defined by 6x_1 + 4x_2 ≤ 24, x_1 + 2x_2 ≤ 6, −x_1 + x_2 ≤ 1 and x_2 ≤ 2; (b) level sets of z = 5x_1 + 4x_2 for z = 5 and z = 15.]
[Figure 1.8 (c): gradient and optimal solution x^* = (x_1^* = 3, x_2^* = 1.5) with z^* = 21.]
To find the direction in which the objective function increases, we can consider the gradient ∇z = (∂z/∂x_1, ∂z/∂x_2)⊤ = (5, 4)⊤, see Figure 1.8c. Moving the level set in the direction of ∇z as long as the intersection with the feasible region is non-empty gives us an optimal solution.
Remark. In case of minimisation instead of maximisation, we move towards −∇z.
1.4.1 Graphical sensitivity analysis

After an optimal vertex has been determined graphically, two questions arise:

1. For which changes in the coefficients of the objective function (c) does this vertex remain optimal?
2. For which changes in the right-hand side (b) does this set of active constraints remain optimal?
Remark. Note that, when changing the right-hand side b, the optimal solution x, i.e., the coordi-
nates of the optimal solution, will change, but not the intersection of active constraints that forms
it.
We conduct a sensitivity analysis for the production planning problem in Example 1.7.
Example 1.7 (Production planning, continued). We first consider the case where the coefficients of the objective function c can change. For which changes do the optimal vertex and the corresponding solution x^* remain optimal?
Let z′ = c_1 x_1 + c_2 x_2 be the perturbed objective function. Then x^* remains optimal if the slope of z′ lies between those of (1.10) (blue) and (1.11) (orange), see Figure 1.9a. This is the same as requiring

1/2 ≤ c_1/c_2 ≤ 6/4.
Next we consider changes in the right-hand side b. For which changes in bi , i ∈ {1, 2, 3}, does the
set of active constraints remain optimal?
We first consider changing b1 , i.e., the right-hand side of (1.10), restricting the availability of M1,
see Figure 1.9b. The set of active constraints remains the same for changes in b1 between 20 and
36 (pre-calculated). Then, the marginal value y1 for b1 ∈ [20, 36] is
Δz/Δb_1 = (z(D) − z(G)) / (b_1(D) − b_1(G)) = 750 ($/ton).
Next, we consider changing b_2, i.e., the right-hand side of (1.11), restricting the availability of M2, see Figure 1.9c. The set of active constraints remains the same for changes in b_2 between 4 and 20/3 (pre-calculated). Then, the marginal value y_2 for b_2 ∈ [4, 20/3] is

Δz/Δb_2 = (z(H) − z(F)) / (b_2(H) − b_2(F)) = 500 ($/ton).
Note that the remaining constraints are inactive. When the corresponding right-hand side is
changed a little, there is thus no influence on the optimal solution. However, it can happen that
the right-hand side is changed so much that the originally optimal solution is no longer feasible,
and thus also no longer optimal.
[Figure 1.9: (a) rotating the objective between the slopes of constraints (1.10) and (1.11); (b, c) shifting the right-hand sides b_1 and b_2, e.g., 6x_1 + 4x_2 ≤ 36 and 6x_1 + 4x_2 ≤ 20, with the vertices (B)-(H) and the optimum x^* = (3, 1.5), z^* = 21 marked.]
Chapter 2
Linear programming

In this chapter, we consider linear optimisation problems and the simplex method for solving them. In order to apply the simplex method, we have to reformulate LPs in standard form.
(P): min c⊤x
     s.t. a_j⊤x ≤ b_j,  j = 1, …, m_1
          a_j⊤x ≥ b_j,  j = m_1 + 1, …, m_2
          a_j⊤x = b_j,  j = m_2 + 1, …, m
          x_i ≥ 0,  i ∈ I_1
          x_i ≤ 0,  i ∈ I_2
          x_i ∈ R,  i ∈ I_3

or, in matrix notation,

(P): min c⊤x
     s.t. A_1 x ≤ b_1
          A_2 x ≥ b_2
          A_3 x = b_3
          x_i ≥ 0,  i ∈ I_1
          x_i ≤ 0,  i ∈ I_2
          x_i ∈ R,  i ∈ I_3

with A_1 ∈ R^{m_1×n}, b_1 ∈ R^{m_1}, A_2 ∈ R^{(m_2−m_1)×n}, b_2 ∈ R^{m_2−m_1}, A_3 ∈ R^{(m−m_2)×n}, b_3 ∈ R^{m−m_2}.
To ease the notation for the following chapter, we additionally define a standard form for linear
programs as well as a ≤-form.
Standard form:

(P): max c⊤x
     s.t. Ax = b
          x ≥ 0

with A ∈ R^{m×n}, b ∈ R^m_+.
≤-form:

(P): min c⊤x
     s.t. Ax ≤ b
          x ∈ R^n

with A ∈ R^{m×n}, b ∈ R^m.
Remark. Note that these terms are not used consistently in the literature, i.e., there exist multiple,
slightly varying, standard forms.
Note that a linear program in general form can be transformed into standard form or ≤-form. Thus, it suffices to consider LPs either in standard or ≤-form, depending on which is better suited.
To obtain the ≤-form (P′), each constraint a_j⊤x ≥ b_j is replaced by

−a_j⊤x ≤ −b_j

and each constraint a_j⊤x = b_j is replaced by the pair

a_j⊤x ≤ b_j
−a_j⊤x ≤ −b_j

in (P′). For each variable x_i ≥ 0 we add the constraint −x_i ≤ 0 and for each variable x_i ≤ 0 we add the constraint x_i ≤ 0.
For the standard form, inequality constraints are converted as follows.
• a_j⊤x ≤ b_j becomes a_j⊤x + s_j = b_j with s_j = b_j − a_j⊤x and s_j ≥ 0 (slack).
• a_j⊤x ≥ b_j becomes a_j⊤x − s_j = b_j with s_j = a_j⊤x − b_j and s_j ≥ 0 (surplus).
Variables from the standard form are processed as follows. Nonpositive variables x_i ≤ 0 are replaced with −y_i, where y_i ≥ 0. Unrestricted variables x_i ∈ R are replaced with y_i^+ − y_i^-, where y_i^+, y_i^- ≥ 0.
All equality constraints a_i⊤x = b_i with negative b_i are replaced by −a_i⊤x = −b_i.
To convert the minimisation to maximisation, min z = c⊤x is replaced with max −z = −c⊤x. Notice the sign change for z.
In ≤-form we get the correspondingly rewritten constraints. To transform the problem to standard form, we first replace the variables: x_1 is replaced by −y_1, x_2 is replaced by y_2^+ − y_2^-. Additionally, we introduce a surplus variable s_1 to handle the ≥-constraint and negate the objective function. Thus we get the corresponding standard form.
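For the frequent special case x ≥ 0 with b ≥ 0, the conversion from ≤-form to standard form is purely mechanical; the following Python sketch (our own illustration, not part of the notes) appends one slack column per row and negates the objective:

```python
# Sketch (ours): <=-form -> standard form, assuming x >= 0 and b >= 0.
import numpy as np

def leq_to_standard(A, b, c):
    """min c^T x, Ax <= b, x >= 0  ->  max c'^T y, A'y = b, y >= 0."""
    m, _ = A.shape
    A_std = np.hstack([A, np.eye(m)])            # append one slack per row
    c_std = np.concatenate([-c, np.zeros(m)])    # min -> max: negate c
    return A_std, b, c_std

# Toy data: min -4x1 - 3x2 s.t. 2x1 + x2 <= 4, x1 + 2x2 <= 4, x >= 0.
A_std, b_std, c_std = leq_to_standard(np.array([[2.0, 1], [1, 2]]),
                                      np.array([4.0, 4]),
                                      np.array([-4.0, -3]))
print(A_std, b_std, c_std)
```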
• H^≤_{a,b} = {x ∈ R^n : a⊤x ≤ b} and H^≥_{a,b} = {x ∈ R^n : a⊤x ≥ b} are the half-spaces generated by the hyperplane H_{a,b}.
Definition 2.7. A polyhedron Q is the intersection of finitely many half-spaces, i.e.,

Q = ⋂_{j=1,…,m} {x ∈ R^n : a_j⊤x ≤ b_j}

with a_j ∈ R^n, b_j ∈ R.
Definition 2.8 (Convex set). A set M ⊆ R^n is called convex if for all x, y ∈ M and all λ ∈ [0, 1],

λx + (1 − λ)y ∈ M

is satisfied.
Lemma 2.9.
1. Any half-space is convex.
2. The intersection of finitely many convex sets is convex.
3. Polyhedra are convex.
As we can reformulate any linear program into ≤-form, we can easily see that the feasible set F
of a linear program is a polyhedron and therefore convex.
Next, we consider the connection between extreme points and vertices of polyhedra.
Definition 2.10. Consider a polyhedron Q = ⋂_{j=1,…,m} {x ∈ R^n : a_j⊤x ≤ b_j} with a_j ∈ R^n, b_j ∈ R.
With this notation, we can formulate the fundamental theorem of linear optimisation.
Theorem 2.11 (Fundamental theorem of linear optimisation). Consider a linear program (P) max{c⊤x : Ax ≤ b, x ∈ R^n} with feasible set F = Q. Note that Q is a polyhedron. Suppose Q has at least one extreme point. If (P) has an optimal solution, then (P) has an optimal solution that is an extreme point of Q.
Additionally, one can show the equivalence of vertices and extreme points of polyhedra.
Theorem 2.12 (Equivalence of extreme points and vertices). Let Q be a non-empty polyhedron.
Then z is an extreme point of Q if and only if z is a vertex of Q.
Using these two theorems, we have the fundamental ideas for the simplex method.
• It suffices to consider vertices (extreme points) of the feasible region when looking for optimal solutions.
• There are only finitely many vertices, as we have a total of m constraints, of which n have to be active. Thus, there are at most (m choose n) vertices to consider.

Instead of iterating over all possible vertices, the simplex method exploits a neighbourhood structure to speed up the process, as detailed in the next section.
(P): max c⊤x
     s.t. Ax = b
          x ≥ 0

with A ∈ R^{m×n}, b ∈ R^m_+, where n ≥ m and rank A = m.
Remark. Assuming that rank A = m, i.e., that A has full rank, is not really a restriction. Suppose rank A = k < m. Then there are two possibilities:
• rank (A|b) = rank A = k: In this case, there are redundant constraints in Ax = b, i.e.,
constraints that are a linear combination of other constraints. As we can remove these
constraints iteratively, we end up with a matrix of full rank eventually.
• rank (A|b) > rank A = k: In this case, the system Ax = b is infeasible and we do not need
to consider it further.
Definition 2.14. Consider a linear program in standard form (P) max{c⊤x : Ax = b, x ≥ 0}.
• An index set B = {B(1), . . . , B(m)} ⊂ {1, . . . , n} is called basis if the corresponding columns
of A are linearly independent. We call the matrix corresponding to these columns AB .
• The variables xB(1) , . . . , xB(m) are called basic variables. We abbreviate this as xB =
(xB(1) , . . . , xB(m) ).
• The remaining indices N = {1, . . . , n} \ {B(1), . . . , B(m)} are called nonbasic indices. We
call the matrix corresponding to these columns AN .
• The variables xk , k ∈ N , are called nonbasic variables. We abbreviate this as xN = (xk )k∈N .
• Similarly, we write c_B for the cost vector of the basic variables and c_N for the cost vector of the nonbasic variables.
Remark. As the columns of A_B are linearly independent for a basis B, they form a basis of R^m, i.e., the corresponding matrix is non-singular. Thus, fixing the nonbasic variables x_N also fixes the basic variables x_B.
Remark. Note that a linear program (P) in standard form with m constraints and n variables can be represented as a linear program (P′) in ≤-form with 2m + n constraints and n variables. A basic solution satisfies n linearly independent constraints in (P′) with equality: of the 2m constraints corresponding to the equality constraints in (P), m are linearly independent, and the remaining n − m linearly independent constraints correspond to the nonbasic variables being set to zero. Thus, a basic solution of (P) corresponds to a vertex of (P′).
Example 2.18. Consider the linear program

max z = 4x_1 + 3x_2
s.t. 2x_1 + x_2 + s_1 = 4
     x_1 + 2x_2 + s_2 = 4
     x_1, x_2, s_1, s_2 ≥ 0.

Note that any two columns of the corresponding matrix are linearly independent, such that any two column indices form a basis. By setting the corresponding nonbasic variables to zero, we get the six basic solutions (A) to (F) as given in Table 2.1 and Figure 2.1. All points are defined by the intersection of two linearly independent, active constraints. Note that the points (A), (B), (D) and (E) are feasible basic solutions, i.e., all variables are greater or equal to zero. They form the vertices and thus the extreme points of the feasible region of (P).
Point   x1     x2     s1     s2     z
(A)     0      0      4      4      0
(B)     2      0      0      2      8
(C)     4      0      -4     0      16
(D)     4/3    4/3    0      0      28/3
(E)     0      2      2      0      6
(F)     0      4      0      -4     12

Table 2.1: Basic solutions.
[Figure: the constraints 2x_1 + x_2 = 4 and x_1 + 2x_2 = 4 with the basic solutions (A)-(F); (D) is optimal.]
Figure 2.1: Basic solutions, feasible set and optimal solution.
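The enumeration in Table 2.1 can be reproduced mechanically: for every pair of columns with nonsingular A_B, solve A_B x_B = b and set the nonbasic variables to zero. A small sketch of ours:

```python
# Sketch (ours): reproduce Table 2.1 by enumerating all bases of
# Example 2.18, i.e., solving A_B x_B = b for every nonsingular A_B.
from itertools import combinations
import numpy as np

A = np.array([[2.0, 1, 1, 0], [1, 2, 0, 1]])
b = np.array([4.0, 4])
c = np.array([4.0, 3, 0, 0])

for B in combinations(range(4), 2):
    AB = A[:, B]
    if abs(np.linalg.det(AB)) < 1e-9:
        continue                         # columns dependent: not a basis
    x = np.zeros(4)
    x[list(B)] = np.linalg.solve(AB, b)  # nonbasic variables stay zero
    status = "feasible" if (x >= -1e-9).all() else "infeasible"
    print(B, x, "z =", c @ x, status)
```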
When all basic solutions are known as in Example 2.18, we can determine an optimal solution by
iterating over all feasible basic solutions and picking the best one. However, we do not want to
precompute all basic solutions. Thus, the simplex method goes from a basis to an adjacent basis by
swapping a nonbasic variable and a basic variable until no further improvement can be observed in
the objective function or more precisely the reduced costs. Basis transformations such as swapping
a basic and a non-basic variable do not influence the feasible region of the linear program but we
have to make sure that the reduced costs are transformed in the same manner.
Definition 2.19. Let B be a basis of A. For a nonbasic variable x_k, k ∈ N, the reduced costs are given as

c̄_k = c_k − c_B⊤ A_B⁻¹ A_k.
The reduced costs of nonbasic variables can be used to determine whether a basic solution can be improved upon.
Theorem 2.20. If x^* is a feasible basic solution for basis B and the reduced costs satisfy c̄_k ≤ 0 for all nonbasic variables x_k^*, k ∈ N, then x^* is an optimal solution of (P) max{c⊤x : Ax = b, x ≥ 0}.
For a basis B and a feasible solution x, the constraint Ax = b yields

x_B = A_B⁻¹ b − A_B⁻¹ A_N x_N.

Therefore,

c⊤x = c_B⊤ x_B + c_N⊤ x_N
    = c_B⊤ (A_B⁻¹ b − A_B⁻¹ A_N x_N) + c_N⊤ x_N
    = c_B⊤ A_B⁻¹ b + (c_N⊤ − c_B⊤ A_B⁻¹ A_N) x_N,

where the coefficient vector of x_N consists of the reduced costs.
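For Example 2.18 this formula is easy to evaluate numerically. A short sketch of ours for the basis B = {x_1, x_4}:

```python
# Sketch (ours): reduced costs c_N - c_B^T A_B^{-1} A_N for Example 2.18
# with basis B = {x1, x4} (0-based column indices 0 and 3).
import numpy as np

A = np.array([[2.0, 1, 1, 0], [1, 2, 0, 1]])
c = np.array([4.0, 3, 0, 0])
B, N = [0, 3], [1, 2]
cbar = c[N] - c[B] @ np.linalg.inv(A[:, B]) @ A[:, N]
print(cbar)  # [ 1. -2.]: reduced costs of x2 are positive, x2 improves z
```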
On the other hand, this shows how we can improve a given basic solution that does not have nonpositive reduced costs: by choosing a nonbasic variable x_k, e.g., the one with highest reduced costs, and increasing its value to x_k = δ, we (potentially) get a better solution. Note that when changing x_k, we have to recompute x_B to ensure feasibility.
Lemma 2.21. Let ã_{i,k} = (A_B⁻¹ A_k)_i and b̃_i = (A_B⁻¹ b)_i. Let x̃ = (x̃_B, x̃_N) be the solution defined by increasing x_k to δ, i.e., setting x̃_k = δ ≥ 0 for a k ∈ N and x̃_{k′} = 0 for k′ ∈ N \ {k}. Then x̃ is feasible for

δ = min_{i=1,…,m} { b̃_i / ã_{i,k} : ã_{i,k} > 0 }.
Proof. Note that by definition, x̃_N ≥ 0. Also, by definition Ax̃ = b as we set x̃_B = A_B⁻¹ b − A_B⁻¹ A_N x̃_N. Thus, we have to guarantee x̃_B ≥ 0:

x̃_B = A_B⁻¹ b − A_B⁻¹ A_N x̃_N = A_B⁻¹ b − A_B⁻¹ A_k δ ≥ 0
⟺ b̃_i − ã_{i,k} δ ≥ 0,  i = 1, …, m.

If ã_{i,k} ≤ 0, this is satisfied anyway. For ã_{i,k} > 0 we get δ ≤ b̃_i / ã_{i,k} and thus the minimum as claimed.
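For the first pivot of Example 2.18 (starting basis {x_3, x_4}, entering variable x_1) the formula gives δ = min{4/2, 4/1} = 2; a one-line check of ours:

```python
# Sketch (ours): ratio test of Lemma 2.21 for Example 2.18, entering x1.
import numpy as np

a_tilde = np.array([2.0, 1.0])  # A_B^{-1} A_k (A_B = I for the slack basis)
b_tilde = np.array([4.0, 4.0])  # A_B^{-1} b
pos = a_tilde > 0
print(np.min(b_tilde[pos] / a_tilde[pos]))  # 2.0: x1 enters with value 2
```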
We can even show that this choice of x̃ does not decrease the objective value and ensures that x̃
is a feasible basic solution. We omit the technical proof here.
Theorem 2.22. Let B be a basis, c̄_k > 0 for nonbasic variable x_k and

δ = b̃_r / ã_{r,k} = min_{i=1,…,m} { b̃_i / ã_{i,k} : ã_{i,k} > 0 }.

Then B′ = (B \ {B(r)}) ∪ {k} is again a basis, and the corresponding feasible basic solution x̃ satisfies c⊤x̃ ≥ c⊤x for the basic solution x of B.
Thus, we now have the following information that we can use in the simplex method.
• Determine which nonbasic variable will enter the basis. Here, we choose the one which has
the highest reduced costs and thus can potentially improve the objective value the most.
• Determine which basic variable will leave the basis such that the objective value is not reduced
and the columns corresponding to the basis are linearly independent.
• Update the reduced costs and determine whether more simplex steps are necessary.
If the problem we are considering was transformed from a ≤-form with nonnegative right-hand side b to standard form, we have an easy basis to start with: all original variables are chosen as nonbasic variables and set to zero, and all newly introduced slack variables form a standard basis of R^m and are used as basic variables. Other cases are discussed in Section 2.5.
The second question is discussed in the following section.
1. The system's coefficients are laid out as a matrix, including the objective function, forming a so-called tableau.
2. An identity (sub)matrix is formed for the selected basis, which is equivalent to solving the system for this basis.
3. Coefficients of basic variables are made zero in the objective function row, such that the row corresponds to the reduced costs.
4. Each new system solution is obtained by performing elementary row operations (Gauss-Jordan elimination):
• Row permutation.
• Multiply a row by a non-zero scalar.
• Add to one row a scalar multiple of another.
Note that these operations do not change the feasible region of the corresponding linear
program.
−c_1 … −c_n   0 … 0   0
Ā             I       b
Remark. Note that the first row of the tableau represents the equation

z − c⊤x = 0.

Therefore, the tableau would need one more column at the beginning, representing a variable z corresponding to the objective value. However, none of the transformations used in the simplex method has any effect on this additional column, such that we do not use it in our notation.
For a basis B, applying Gauss-Jordan elimination to obtain an identity matrix in the columns of B transforms the tableau into

−c⊤ + c_B⊤ A_B⁻¹ A          c_B⊤ A_B⁻¹ b
A_B⁻¹ Ā        A_B⁻¹        A_B⁻¹ b

i.e., the first row now contains the negative reduced costs −c̄ = −c⊤ + c_B⊤ A_B⁻¹ A together with the current objective value c_B⊤ A_B⁻¹ b.
Thus, we can read the following information, as used in Theorem 2.22, directly from the tableau:
• Determine the entering variable x_k (pivot column PC): the negative coefficient −c̄_k with largest absolute value in the first row.
• Determine the leaving variable (pivot row PR): arg min_{i=1,…,m} { b̃_i / ã_{i,k} : ã_{i,k} > 0 }.
• Use row operations with pivot row PR until the pivot column corresponds to the basic vector e_i, with a 0 in the first row corresponding to the negative reduced costs.
To test whether the new basis B ′ is optimal, we only need to consider the values of the negative
reduced costs in the first row. As long as there is a negative value, the steps above are iterated.
When all values are non-negative, an optimal solution is found.
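The whole iteration can be condensed into a few lines of Python. The sketch below is our own illustration (not the notes' algorithm) and assumes the easy-start case: b ≥ 0 and the last m columns of A forming an identity (slack) basis:

```python
# Sketch (ours): tableau simplex for max c^T x, Ax = b, x >= 0.
import numpy as np

def simplex(A, b, c):
    """Assumes b >= 0 and that the last m columns of A are an identity."""
    m, n = A.shape
    T = np.zeros((m + 1, n + 1))
    T[0, :n] = -c                     # first row: negative reduced costs
    T[1:, :n], T[1:, n] = A, b
    basis = list(range(n - m, n))     # start from the slack basis
    while True:
        k = int(np.argmin(T[0, :n]))  # entering: most negative -c̄ entry
        if T[0, k] >= -1e-9:
            break                     # all c̄ <= 0: current basis optimal
        col = T[1:, k]
        if (col <= 1e-9).all():
            raise ValueError("problem is unbounded")  # ratio test fails
        safe = np.where(col > 1e-9, col, np.nan)
        r = int(np.nanargmin(T[1:, n] / safe))        # leaving: pivot row
        T[r + 1] /= T[r + 1, k]       # Gauss-Jordan step on the pivot row
        for i in range(m + 1):
            if i != r + 1:
                T[i] -= T[i, k] * T[r + 1]
        basis[r] = k
    x = np.zeros(n)
    x[basis] = T[1:, n]
    return x, T[0, n]

# Example 2.18: max 4x1 + 3x2 s.t. 2x1 + x2 <= 4, x1 + 2x2 <= 4, x >= 0.
A = np.array([[2.0, 1, 1, 0], [1, 2, 0, 1]])
print(simplex(A, np.array([4.0, 4]), np.array([4.0, 3, 0, 0])))
# -> x = (4/3, 4/3, 0, 0), z = 28/3
```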
Example 2.23. Consider again the linear program of Example 2.18. The initial tableau is:
x1 x2 x3 x4 Sol.
−c̄ -4 -3 0 0 0
x3 2 1 1 0 4
x4 1 2 0 1 4
Note that the first column contains information on the current basis and the first row marks the
variables with s1 = x3 , s2 = x4 .
The first pivot column belongs to x1 as −c̄1 is the negative coefficient with largest absolute value.
x1 x2 x3 x4 Sol.
−c̄ -4 -3 0 0 0
x3 2 1 1 0 4
x4 1 2 0 1 4
The leaving variable is x_3, the first basic variable: arg min_{i=1,…,m} { b̃_i / ã_{i,k} : ã_{i,k} > 0 }.
After performing suitable row operations, we obtain:
x1 x2 x3 x4 Sol. Operations
−c̄ 0 -1 2 0 8 + (4) × P R
x1 1 1/2 1/2 0 2 ×(1/2) : P R
x4 0 3/2 -1/2 1 2 + (−1) × P R
where the row operations are performed to turn PC into a part of the basis.
As there is still a negative entry in −c̄, the method proceeds...
x1 x2 x3 x4 Sol. Operations
−c̄ 0 0 5/3 2/3 28/3 + (1) × P R
x1 1 0 2/3 -1/3 4/3 + (−1/2) × P R
x2 0 1 -1/3 2/3 4/3 ×2/3 : P R
The initial basis results in the feasible basic solution (A). To (A), both the basic solutions (B) and (E) are adjacent.

x1 x2 x3 x4 Sol.
−c̄ -4 -3 0 0 0
x3 2 1 1 0 4
x4 1 2 0 1 4

[Figure: the current basic solution (A) in the feasible region of Figure 2.1.]
By making x_1 a basic variable, we get to the adjacent feasible basic solution (B).

x1 x2 x3 x4 Sol.
−c̄ 0 -1 2 0 8
x1 1 1/2 1/2 0 2
x4 0 3/2 -1/2 1 2

[Figure: the current basic solution (B) in the feasible region of Figure 2.1.]
By choosing to make x_2 a basic variable, we get to the feasible basic solution (D). As all entries in the −c̄ row are nonnegative, (D) is an optimal solution.

x1 x2 x3 x4 Sol.
−c̄ 0 0 5/3 2/3 28/3
x1 1 0 2/3 -1/3 4/3
x2 0 1 -1/3 2/3 4/3

[Figure: the optimal basic solution (D), x^* = (4/3, 4/3) with z^* = 28/3, in the feasible region of Figure 2.1.]
To end this section, we summarize the simplex method in algorithmic form.
Remark. For minimisation problems, we consider positive elements in row −c̄ and choose the largest of these values.
Modern implementations of the simplex method rely on efficient computational algebra (factorisa-
tion) and a minimum representation of the problem (see revised simplex method).
In theory, the simplex method is an algorithm with exponential runtime. A total of (n choose m) vertices might need to be visited.
Algorithm 1 does not necessarily terminate. It is possible that the same vertices are visited over
and over again in a loop. We get to this case later.
Example 2.24. The origin is not feasible for the following LP.
min z = 4x1 + x2
s.t. 3x1 + x2 = 3
4x1 + 3x2 ≥ 6
x1 + 2x2 ≤ 4
x1 , x2 ≥ 0.
[Figure: feasible region defined by 3x_1 + x_2 = 3, 4x_1 + 3x_2 ≥ 6 and x_1 + 2x_2 ≤ 4.]
Figure 2.2: Feasible region.
To circumvent this issue, we rely on artificial variables, which have the role of accumulating infea-
sibility.
Each (≥)- or (=)-constraint is augmented with an artificial variable. Then, we minimise their
value, i.e., the total infeasibility.
Example 2.24 (continued). To get to a linear program in standard form, we first add a slack and a surplus variable.
min z = 4x1 + x2
s.t. 3x1 + x2 = 3
4x1 + 3x2 − x3 = 6
x1 + 2x2 + x4 = 4
x1 , x2 , x3 , x4 ≥ 0
As, however, the origin is not a feasible basic solution, we add two further artificial variables r_1 and r_2.
min z = 4x1 + x2
s.t. 3x1 + x2 + r1 = 3
4x1 + 3x2 − x3 + r2 = 6
x1 + 2x2 + x4 = 4
x1 , x2 , x3 , x4 , r1 , r2 ≥ 0.
If a solution with zero infeasibility (i.e., artificial variables are nonbasic) is found, a basic feasible
solution is available. If the minimal (optimal) accumulated infeasibility is not zero (has basic
artificial variables), no basic feasible solution exists.
2.5.1 The M-method

In the M-method, the artificial variables are penalised in the objective function with a large constant M > 0:

min z = 4x_1 + x_2 + M r_1 + M r_2
s.t. 3x_1 + x_2 + r_1 = 3
     4x_1 + 3x_2 − x_3 + r_2 = 6
     x_1 + 2x_2 + x_4 = 4
     x_1, x_2, x_3, x_4, r_1, r_2 ≥ 0.
Note that we cannot simply build a feasible initial tableau from this formulation, as now there are basic variables for which the negative reduced costs are not zero (here, M = 100 is used).
x1 x2 x3 r1 r2 x4 Sol.
−c̄ -4 -1 0 -100 -100 0 0
r1 3 1 0 1 0 0 3
r2 4 3 -1 0 1 0 6
x4 1 2 0 0 0 1 4
Thus, we first have to do some row operations to get the following correct initial tableau.
x1 x2 x3 r1 r2 x4 Sol. Operation
−c̄ 696 399 -100 0 0 0 900 + 100 × Rr1 + 100 × Rr2
r1 3 1 0 1 0 0 3 (Rr1 )
r2 4 3 -1 0 1 0 6 (Rr2 )
x4 1 2 0 0 0 1 4
Now, we can proceed with the method as before. Note that we are solving a minimisation problem.
x1 x2 x3 r1 r2 x4 Sol.
−c̄ 696 399 -100 0 0 0 900
r1 3 1 0 1 0 0 3
r2 4 3 -1 0 1 0 6
x4 1 2 0 0 0 1 4
x1 x2 x3 r1 r2 x4 Sol.
−c̄ 0 167 -100 -232 0 0 204
x1 1 1/3 0 1/3 0 0 1
r2 0 5/3 -1 -4/3 1 0 2
x4 0 5/3 0 -1/3 0 1 3
[Figure: feasible region and optimal solution x^* = (2/5, 9/5) with z^* = 17/5.]
Figure 2.3: Optimal solution.
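As a cross-check, an off-the-shelf solver confirms this optimum (a sketch of ours; linprog handles the ≥- and =-constraints directly rather than via a big-M construction):

```python
# Sketch (ours): Example 2.24 solved with scipy.optimize.linprog.
from scipy.optimize import linprog

res = linprog([4.0, 1.0],
              A_ub=[[-4.0, -3.0],   # 4x1 + 3x2 >= 6
                    [1.0, 2.0]],    # x1 + 2x2 <= 4
              b_ub=[-6.0, 4.0],
              A_eq=[[3.0, 1.0]],    # 3x1 + x2 = 3
              b_eq=[3.0])           # default bounds give x >= 0
print(res.x, res.fun)               # ~ (0.4, 1.8), z* = 17/5
```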
2.5.2 Two-phase method

Alternatively, we can use the two-phase method to find a feasible basic solution. The advantage of the two-phase method is that it does not need parametrisation. It is more often used in modern solvers and uses an artificial objective function measuring infeasibility.
Example 2.24 (continued). Consider again the example from before. Instead of augmenting the objective function, we replace it by an artificial objective that only minimises the artificial variables and thus needs no parametrisation.
min z = r1 + r2
s.t. 3x1 + x2 + r1 = 3
4x1 + 3x2 − x3 + r2 = 6
x1 + 2x2 + x4 = 4
x1 , x2 , x3 , x4 , r1 , r2 ≥ 0.
Note that we cannot simply build a feasible initial tableau from this formulation, as now there are basic variables for which the negative reduced costs are not zero.
x1 x2 x3 r1 r2 x4 Sol.
−c̄ 0 0 0 -1 -1 0 0
r1 3 1 0 1 0 0 3
r2 4 3 -1 0 1 0 6
x4 1 2 0 0 0 1 4
Thus, we first have to do some row operations to get the following correct initial tableau.
x1 x2 x3 r1 r2 x4 Sol. Operation
−c̄ 7 4 -1 0 0 0 9 + Rr1 + Rr2
r1 3 1 0 1 0 0 3 (Rr1 )
r2 4 3 -1 0 1 0 6 (Rr2 )
x4 1 2 0 0 0 1 4
A few iterations of the simplex method take us from this tableau to the following optimal tableau, in which the total infeasibility is zero, see Figure 2.4.
x1 x2 x3 r1 r2 x4 Sol.
−c̄ 0 0 0 -1 -1 0 0
x1 1 0 1/5 3/5 -1/5 0 3/5
x2 0 1 -3/5 -4/5 3/5 0 6/5
x4 0 0 1 1 -1 1 1
As a basic feasible solution is available, the second phase proceeds. The second phase consists of
applying the simplex method from the basic feasible solution obtained from the first-phase. We
can remove all artificial variables and reintroduce the objective function, rewriting it accordingly.
By reintroducing the objective function, we again obtain a tableau that is not a correct initial tableau, as the negative reduced costs of the basic variables x_1 and x_2 are not zero.
x1 x2 x3 x4 Sol.
−c̄ -4 -1 0 0 0
x1 1 0 1/5 0 3/5
x2 0 1 -3/5 0 6/5
x4 0 0 1 1 1
Thus, we have to apply some row operations to get the following correct initial tableau.
x1 x2 x3 x4 Sol.
−c̄ 0 0 1/5 0 18/5
x1 1 0 1/5 0 3/5
x2 0 1 -3/5 0 6/5
x4 0 0 1 1 1
[Figure: feasible region with the basic feasible solution x = (3/5, 6/5), z = 18/5, found in Phase 1.]
Figure 2.4: Feasible region and optimal solution of Phase 1.
Example 2.25 (Degeneracy). Consider the linear program max z = 3x_1 + 9x_2 subject to x_1 + 4x_2 ≤ 8, x_1 + 2x_2 ≤ 4 and x_1, x_2 ≥ 0, with initial tableau

x1 x2 x3 x4 Sol.
−c̄ -3 -9 0 0 0
x3 1 4 1 0 8
x4 1 2 0 1 4
The ratio test for pivot column x2 leads to a tie, such that both x3 and x4 could be chosen as
leaving basic variables. Here, we arbitrarily choose x3 .
This results in a basic variable x4 = 0. By continuing the process, the solution does not change,
i.e., the point (x1 , x2 ) = (0, 2) is “visited twice”.
Remark. Degeneracy can cause cycling of the simplex method, i.e., the algorithm does not terminate. Simple rules (see Bland's rule, for example) can prevent it at the cost of performance. Modern codes employ conditional basis perturbation and shifting to prevent cycles.
Degeneracy can be a symptom of redundancy in the model specification.
[Figure: feasible region defined by x_1 + 2x_2 ≤ 5 and x_1 + x_2 ≤ 4 with level sets z = 3 and z = 6 of z = 2x_1 + 4x_2.]
Figure 2.6: Feasible region and level sets.
Next, we consider the case where multiple solutions are optimal. This is the case when the objective function is parallel to an active constraint. While this means that there are infinitely many optimal solutions, the method only visits the extreme points of the optimal set.
Example 2.26. Consider the following linear program.

max z = 2x_1 + 4x_2
s.t. x_1 + 2x_2 ≤ 5
     x_1 + x_2 ≤ 4
     x_1, x_2 ≥ 0
x1 x2 x3 x4 Sol.
−c̄ -2 -4 0 0 0
x3 1 2 1 0 5
x4 1 1 0 1 4
−c̄ 0 0 2 0 10
x2 1/2 1 1/2 0 5/2
x4 1/2 0 -1/2 1 3/2
−c̄ 0 0 2 0 10
x2 0 1 1 -1 1
x1 1 0 -1 2 3
In the second tableau, the nonbasic variable x_1 has reduced costs of zero and thus can be made basic without changing the objective value. For λ ∈ [0, 1], any (x_1, x_2) = λ(0, 5/2) + (1 − λ)(3, 1) is optimal, see also Figure 2.7.
Next, we consider unboundedness. In this case, the improvement of a solution is not constrained.
[Figure: feasible region with the optimal level set z = 10 touching the edge between x^* = (0, 5/2) and x^{**} = (3, 1).]
Figure 2.7: Feasible region and level sets.
Unboundedness is often caused by a model specification issue. It occurs when the feasible region contains an extreme ray along which the objective value improves.
max z = 2x1 + x2
s.t. x1 − x2 ≤ 10
2x1 ≤ 40
x1 , x2 ≥ 0
[Figure: unbounded feasible region defined by x_1 − x_2 ≤ 10 and 2x_1 ≤ 40.]
Figure 2.8: Feasible region.
x1 x2 x3 x4 Sol.
−c̄ -2 -1 0 0 0
x3 1 -1 1 0 10
x4 2 0 0 1 40
Consider the column of nonbasic variable x_2. Here, the negative reduced costs are negative, i.e., making x_2 basic would improve the objective value. However, this column does not induce any restriction on how much x_2 can be increased: the ratio test "fails" as there is no positive entry in the column. Thus, the objective value can be arbitrarily high and the problem is unbounded, see also Figure 2.9.
[Figure: unbounded feasible region with level sets z = 20, 40, 60 of z = 2x_1 + x_2 increasing along an extreme ray.]
Figure 2.9: Feasible region and level sets.
Lastly, we consider infeasibility. Here, the feasible region is empty, which is usually also due to poorly specified models. We can identify infeasible models using the two-phase method or the M-method by adding artificial variables.
Note that infeasibility does not occur for models in ≤-form when b ∈ R^m_+, as the origin is then feasible.
[Figure: constraints 2x_1 + x_2 ≤ 2 and 3x_1 + 4x_2 ≥ 12 with empty intersection.]
Figure 2.10: Constraints, empty feasible region.
Solving the problem with the artificial objective min r_1, we get the following tableaus.
x1 x2 x3 x4 r1 Sol.
−c̄ 3 4 0 -1 0 12
x3 2 1 1 0 0 2
r1 3 4 0 -1 1 12

−c̄ -5 0 -4 -1 0 4
x2 2 1 1 0 0 2
r1 -5 0 -4 -1 1 4
In the optimal solution, the artificial variable r1 is basic and strictly positive. Thus, the objective
value is positive as well, indicating that no feasible solution exists.
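LP solvers report this outcome directly; a small sketch of ours with scipy.optimize.linprog:

```python
# Sketch (ours): scipy.optimize.linprog detects the infeasibility of
# 2x1 + x2 <= 2, 3x1 + 4x2 >= 12, x >= 0 (any objective, here 0).
from scipy.optimize import linprog

res = linprog([0.0, 0.0],
              A_ub=[[2.0, 1.0],     # 2x1 + x2 <= 2
                    [-3.0, -4.0]],  # 3x1 + 4x2 >= 12
              b_ub=[2.0, -12.0])
print(res.status, res.message)      # status 2: problem is infeasible
```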