ATHENA SCIENTIFIC SERIES
IN OPTIMIZATION AND NEURAL COMPUTATION

1. Dynamic Programming and Optimal Control, Vols. I and II, by Dimitri P. Bertsekas, 1995.
2. Nonlinear Programming, by Dimitri P. Bertsekas, 1995.
3. Neuro-Dynamic Programming, by Dimitri P. Bertsekas and John N. Tsitsiklis, 1996.
4. Constrained Optimization and Lagrange Multiplier Methods, by Dimitri P. Bertsekas, 1996.
5. Stochastic Optimal Control: The Discrete-Time Case, by Dimitri P. Bertsekas and Steven E. Shreve, 1996.
6. Introduction to Linear Optimization, by Dimitris Bertsimas and John N. Tsitsiklis, 1997.
Introduction
to Linear Optimization

Dimitris Bertsimas
John N. Tsitsiklis
Massachusetts Institute of Technology

Athena Scientific, Belmont, Massachusetts

Athena Scientific
Post Office Box 391
Belmont, Mass. 02178-9998
U.S.A.
Email: athenasc@world.std.com
WWW information and orders: http://world.std.com/~athenasc/
Cover Design: Ann Gallager
© 1997 Dimitris Bertsimas and John N. Tsitsiklis
All rights reserved. No part of this book may be reproduced in any form
by any electronic or mechanical means (including photocopying, recording,
or information storage and retrieval) without permission in writing from
the publisher.
Publisher's Cataloging-in-Publication Data

Bertsimas, Dimitris, Tsitsiklis, John N.
Introduction to Linear Optimization
Includes bibliographical references and index
1. Linear programming. 2. Mathematical optimization.
3. Integer programming. I. Title.
T57.74.B465 1997    519.7    96-78786
ISBN 1-886529-19-1
To Georgia,
and to George Michael, who left us so early

To Alexandra and Melina
Contents

Preface

1. Introduction
1.1. Variants of the linear programming problem
1.2. Examples of linear programming problems
1.3. Piecewise linear convex objective functions
1.4. Graphical representation and solution
1.5. Linear algebra background and notation
1.6. Algorithms and operation counts
1.7. Exercises
1.8. History, notes, and sources
2. The geometry of linear programming
2.1. Polyhedra and convex sets
2.2. Extreme points, vertices, and basic feasible solutions
2.3. Polyhedra in standard form
2.4. Degeneracy
2.5. Existence of extreme points
2.6. Optimality of extreme points
2.7. Representation of bounded polyhedra*
2.8. Projections of polyhedra: Fourier-Motzkin elimination*
2.9. Summary
2.10. Exercises
2.11. Notes and sources
3. The simplex method
3.1. Optimality conditions
3.2. Development of the simplex method
3.3. Implementations of the simplex method
3.4. Anticycling: lexicography and Bland's rule
3.5. Finding an initial basic feasible solution
3.6. Column geometry and the simplex method
3.7. Computational efficiency of the simplex method
3.8. Summary
3.9. Exercises
3.10. Notes and sources
4. Duality theory
4.1. Motivation
4.2. The dual problem
4.3. The duality theorem
4.4. Optimal dual variables as marginal costs
4.5. Standard form problems and the dual simplex method
4.6. Farkas' lemma and linear inequalities
4.7. From separating hyperplanes to duality*
4.8. Cones and extreme rays
4.9. Representation of polyhedra
4.10. General linear programming duality*
4.11. Summary
4.12. Exercises
4.13. Notes and sources
5. Sensitivity analysis
5.1. Local sensitivity analysis
5.2. Global dependence on the right-hand side vector
5.3. The set of all dual optimal solutions*
5.4. Global dependence on the cost vector
5.5. Parametric programming
5.6. Summary
5.7. Exercises
5.8. Notes and sources
6. Large scale optimization
6.1. Delayed column generation
6.2. The cutting stock problem
6.3. Cutting plane methods
6.4. Dantzig-Wolfe decomposition
6.5. Stochastic programming and Benders decomposition
6.6. Summary
6.7. Exercises
6.8. Notes and sources
7. Network flow problems
7.1. Graphs
7.2. Formulation of the network flow problem
7.3. The network simplex algorithm
7.4. The negative cost cycle algorithm
7.5. The maximum flow problem
7.6. Duality in network flow problems
7.7. Dual ascent methods*
7.8. The assignment problem and the auction algorithm
7.9. The shortest path problem
7.10. The minimum spanning tree problem
7.11. Summary
7.12. Exercises
7.13. Notes and sources
8. Complexity of linear programming and the ellipsoid method
8.1. Efficient algorithms and computational complexity
8.2. The key geometric result behind the ellipsoid method
8.3. The ellipsoid method for the feasibility problem
8.4. The ellipsoid method for optimization
8.5. Problems with exponentially many constraints*
8.6. Summary
8.7. Exercises
8.8. Notes and sources
9. Interior point methods
9.1. The affine scaling algorithm
9.2. Convergence of affine scaling*
9.3. The potential reduction algorithm
9.4. The primal path following algorithm
9.5. The primal-dual path following algorithm
9.6. An overview
9.7. Exercises
9.8. Notes and sources
10. Integer programming formulations
10.1. Modeling techniques
10.2. Guidelines for strong formulations
10.3. Modeling with exponentially many constraints
10.4. Summary
10.5. Exercises
10.6. Notes and sources
11. Integer programming methods
11.1. Cutting plane methods
11.2. Branch and bound
11.3. Dynamic programming
11.4. Integer programming duality
11.5. Approximation algorithms
11.6. Local search
11.7. Simulated annealing
11.8. Complexity theory
11.9. Summary
11.10. Exercises
11.11. Notes and sources
12. The art in linear optimization
12.1. Modeling languages for linear optimization
12.2. Linear optimization libraries and general observations
12.3. The fleet assignment problem
12.4. The air traffic flow management problem
12.5. The job shop scheduling problem
12.6. Summary
12.7. Exercises
12.8. Notes and sources
References

Index
Preface
The purpose of this book is to provide a unified, insightful, and modern
treatment of linear optimization, that is, linear programming, network flow
problems, and discrete linear optimization. We discuss both classical topics
and the state of the art. We give special attention to theory, but
also cover applications and present case studies. Our main objective is to
help the reader become a sophisticated practitioner of (linear) optimization,
or a researcher. More specifically, we wish to develop the ability to
formulate fairly complex optimization problems, provide an appreciation
of the main classes of problems that are practically solvable, describe the
available solution methods, and build an understanding of the qualitative
properties of the solutions they provide.
Our general philosophy is that insight matters most. For the subject
matter of this book, this necessarily requires a geometric view. On
the other hand, problems are solved by algorithms, and these can only
be described algebraically. Hence, our focus is on the beautiful interplay
between algebra and geometry. We build understanding using figures and
geometric arguments, and then translate ideas into algebraic formulas and
algorithms. Given enough time, we expect that the reader will develop the
ability to pass from one domain to the other without much effort.
Another of our objectives is to be comprehensive, but economical. We
have made an effort to cover and highlight all of the principal ideas in this
field. However, we have not tried to be encyclopedic, or to discuss every
possible detail relevant to a particular algorithm. Our premise is that once
a mature understanding of the basic principles is in place, further details can
be acquired by the reader with little additional effort.
Our last objective is to bring the reader up to date with respect to the
state of the art. This is especially true in our treatment of interior point
methods, large scale optimization, and the presentation of case studies that
stretch the limits of currently available algorithms and computers.
The success of any optimization methodology hinges on its ability to
deal with large and important problems. In that sense, the last chapter,
on the art of linear optimization, is a critical part of this book. It will, we
hope, convince the reader that progress on challenging problems requires
both problem specific insight, as well as a deeper understanding of the
underlying theory.
In any book dealing with linear programming, there are some important
choices to be made regarding the treatment of the simplex method.
Traditionally, the simplex method is developed in terms of the full simplex
tableau, which tends to become the central topic. We have found that the
full simplex tableau is a useful device for working out numerical examples.
But other than that, we have tried not to overemphasize its importance.
Let us also mention another departure from many other textbooks.
Introductory treatments often focus on standard form problems, which is
sufficient for the purposes of the simplex method. On the other hand, this
approach often leaves the reader wondering whether certain properties are
generally true, and can hinder a deeper understanding of the subject. We
depart from this tradition: we consider the general form of linear
programming problems and define key concepts (e.g., extreme points) within this
context. (Of course, when it comes to algorithms, we often have to
specialize to the standard form.) In the same spirit, we separate the structural
understanding of linear programming from the particulars of the simplex
method. For example, we include a derivation of duality theory that does
not rely on the simplex method.
Finally, this book contains a treatment of several important topics
that are not commonly covered. These include a discussion of the column
geometry and of the insights it provides into the efficiency of the
simplex method, the connection between duality and the pricing of financial
assets, a unified view of delayed column generation and cutting plane
methods, stochastic programming and Benders decomposition, the auction
algorithm for the assignment problem, certain theoretical implications of
the ellipsoid algorithm, a thorough treatment of interior point methods,
and a whole chapter on the practice of linear optimization. There are
also several noteworthy topics that are covered in the exercises, such as
Leontief systems, strict complementarity, options pricing, von Neumann's
algorithm, submodular function minimization, and bounds for a number of
integer programming problems.
Here is a chapter by chapter description of the book.
Chapter 1: Introduces the linear programming problem, together with
a number of examples, and provides some background material on linear
algebra.
Chapter 2: Deals with the basic geometric properties of polyhedra, focusing
on the definition and the existence of extreme points, and emphasizing
the interplay between the geometric and the algebraic viewpoints.
Chapter 3: Contains more or less the classical material associated with the
simplex method, as well as a discussion of the column geometry. It starts
with a high-level and geometrically motivated derivation of the simplex
method. It then introduces the revised simplex method, and concludes
with the simplex tableau. The usual topics of Phase I and anticycling are
also covered.
Chapter 4: A comprehensive treatment of linear programming duality.
The duality theorem is first obtained as a corollary of the simplex
method. A more abstract derivation is also provided, based on the separating
hyperplane theorem, which is developed from first principles. It ends
with a deeper look into the geometry of polyhedra.
Chapter 5: Discusses sensitivity analysis, that is, the dependence of
solutions and the optimal cost on the problem data, including parametric
programming. It also develops a characterization of dual optimal solutions
as subgradients of a suitably defined optimal cost function.
Chapter 6: Presents the complementary ideas of delayed column generation
and cutting planes. These methods are first developed at a high
level, and are then made concrete by discussing the cutting stock problem,
Dantzig-Wolfe decomposition, stochastic programming, and Benders
decomposition.
Chapter 7: Provides a comprehensive review of the principal results and
methods for the different variants of the network flow problem. It contains
representatives from all major types of algorithms: primal descent (the
simplex method), dual ascent (the primal-dual method), and approximate
dual ascent (the auction algorithm). The focus is on the major algorithmic
ideas, rather than on the refinements that can lead to better complexity
estimates.
Chapter 8: Includes a discussion of complexity, a development of the
ellipsoid method, and a proof of the polynomiality of linear programming. It
also discusses the equivalence of separation and optimization, and provides
examples where the ellipsoid algorithm can be used to derive polynomial
time results for problems involving an exponential number of constraints.
Chapter 9: Contains an overview of all major classes of interior point
methods, including affine scaling, potential reduction, and path following
(both primal and primal-dual) methods. It includes a discussion of the
underlying geometric ideas and computational issues, as well as convergence
proofs and complexity analysis.
Chapter 10: Introduces integer programming formulations of discrete
optimization problems. It provides a number of examples, as well as some
intuition as to what constitutes a "strong" formulation.
Chapter 11: Covers the major classes of integer programming algorithms,
including exact methods (branch and bound, cutting planes, dynamic
programming), approximation algorithms, and heuristic methods (local search
and simulated annealing). It also introduces a duality theory for integer
programming.
Chapter 12: Deals with the art in linear optimization, i.e., the process
of modeling, exploiting problem structure, and fine tuning of optimization
algorithms. We discuss the relative performance of interior point methods
and different variants of the simplex method, in a realistic large scale
setting. We also give some indication of the size of problems that can
currently be solved.
An important theme that runs through several chapters is the modeling,
complexity, and algorithms for problems with an exponential number of
constraints. We discuss modeling in Section 10.3, complexity in Section 8.5,
algorithmic approaches in Chapters 6 and 11, and we conclude with a case
study in Section 12.5.
There is a fair number of exercises that are given at the end of each
chapter. Most of them are intended to deepen the understanding of the
subject, or to explore extensions of the theory in the text, as opposed
to routine drills. However, several numerical exercises are also included.
Starred exercises are supposed to be fairly hard. A solutions manual for
qualified instructors can be obtained from the authors.
We have made a special effort to keep the text as modular as possible,
allowing the reader to omit certain topics without loss of continuity. For
example, much of the material in Chapters 5 and 6 is rarely used in the
rest of the book. Furthermore, in Chapter 7 (on network flow problems), a
reader who has gone through the problem formulation (Sections 7.1-7.2) can
immediately move to any later section in that chapter. Also, the interior
point algorithms of Chapter 9 are not used later, with the exception of
some of the applications in Chapter 12. Even within the core chapters
(Chapters 1-4), there are many sections that can be skipped during a first
reading. Some sections have been marked with a star indicating that they
contain somewhat more advanced material that is not usually covered in
an introductory course.
The book was developed while we took turns teaching a first-year
graduate course at M.I.T., for students in engineering and operations
research. The only prerequisite is a working knowledge of linear algebra. In
fact, it is only a small subset of linear algebra that is needed (e.g., the
concepts of subspaces, linear independence, and the rank of a matrix).
However, these elementary tools are sometimes used in subtle ways, and
some mathematical maturity on the part of the reader can lead to a better
appreciation of the subject.
The book can be used to teach several different types of courses. The
first two suggestions below are one-semester variants that we have tried at
M.I.T., but there are also other meaningful alternatives, depending on the
students' background and the course's objectives.
(a) Cover most of Chapters 1-7, and if time permits, cover a small number
of topics from Chapters 9-12.
(b) An alternative could be the same as above, except that interior point
algorithms (Chapter 9) are fully covered, replacing network flow problems
(Chapter 7).
(c) A broad overview course can be constructed by concentrating on the
easier material in most of the chapters. The core of such a course
could consist of Chapter 1, Sections 2.1-2.4, 3.1-3.5, 4.1-4.3, 5.1,
7.1-7.3, 9.1, 10.1, some of the easier material in Chapter 11, and an
application from Chapter 12.
(d) Finally, the book is also suitable for a half-course on integer
programming, based on parts of Chapters 1 and 8, as well as Chapters 10-12.
There is a truly large literature on linear optimization, and we make
no attempt to provide a comprehensive bibliography. To a great extent, the
sources that we cite are either original references of historical interest, or
recent texts where additional information can be found. For those topics,
however, that touch upon current research, we also provide pointers to
recent journal articles.
We would like to express our thanks to a number of individuals. We
are grateful to our colleagues Dimitri Bertsekas and Rob Freund, for many
discussions on the subjects in this book, as well as for reading parts of
the manuscript. Several of our students, colleagues, and friends have
contributed by reading parts of the manuscript, providing critical comments,
and working on the exercises: Jim Christodouleas, Thalia Chryssikou,
Austin Frakt, David Gamarnik, Leon Hsu, Spyros Kontogiorgis, Peter
Marbach, Gina Mourtzinou, Yannis Paschalidis, Georgia Perakis, Lakis
Polymenakos, Jay Sethuraman, Sarah Stock, Paul Tseng, and Ben Van Roy.
But mostly, we are grateful to our families for their patience, love, and
support in the course of this long project.
Dimitris Bertsimas
John N. Tsitsiklis
Cambridge, January 1 997
Chapter 1

Introduction
Contents
1.1. Variants of the linear programming problem
1.2. Examples of linear programming problems
1.3. Piecewise linear convex objective functions
1.4. Graphical representation and solution
1.5. Linear algebra background and notation
1.6. Algorithms and operation counts
1.7. Exercises
1.8. History, notes, and sources
2 Chap. 1 Introduction
In this chapter, we introduce linear programming, the problem of
minimizing a linear cost function subject to linear equality and inequality
constraints. We consider a few equivalent forms and then present a number
of examples to illustrate the applicability of linear programming to a wide
variety of contexts. We also solve a few simple examples and obtain some
basic geometric intuition on the nature of the problem. The chapter ends
with a review of linear algebra and of the conventions used in describing
the computational requirements (operation count) of algorithms.
1.1 Variants of the linear programming problem
In this section, we pose the linear programming problem, discuss a few
special forms that it takes, and establish some standard notation that we
will be using. Rather than starting abstractly, we first state a concrete
example, which is meant to facilitate understanding of the formal definition
that will follow. The example we give is devoid of any interpretation. Later
on, in Section 1.2, we will have ample opportunity to develop examples that
arise in practical settings.
Example 1.1 The following is a linear programming problem:

    minimize    2x1 - x2 + 4x3
    subject to   x1 + x2 + x4 ≤ 2
                3x2 - x3      = 5
                 x3 + x4      ≥ 3
                 x1           ≥ 0
                 x3           ≤ 0.

Here x1, x2, x3, and x4 are variables whose values are to be chosen to minimize
the linear cost function 2x1 - x2 + 4x3, subject to a set of linear equality and
inequality constraints. Some of these constraints, such as x1 ≥ 0 and x3 ≤ 0,
amount to simple restrictions on the sign of certain variables. The remaining
constraints are of the form a'x ≤ b, a'x = b, or a'x ≥ b, where a = (a1, a2, a3, a4)
is a given vector, x = (x1, x2, x3, x4) is the vector of decision variables, a'x is
their inner product a1x1 + a2x2 + a3x3 + a4x4, and b is a given scalar. For example,
in the first constraint, we have a = (1, 1, 0, 1) and b = 2.
We now generalize. In a general linear programming problem, we are
given a cost vector c = (c1, ..., cn), and we seek to minimize a linear cost
function c'x = Σ_{i=1}^n ci xi over all n-dimensional vectors x = (x1, ..., xn),

As discussed further in Section 1.5, all vectors are assumed to be column vectors, and
are treated as such in matrix-vector products. Row vectors are indicated as transposes
of (column) vectors. However, whenever we refer to a vector x inside the text, we
use the more economical notation x = (x1, ..., xn), even though x is a column vector.
The reader who is unfamiliar with our notation may wish to consult Section 1.5 before
continuing.
Sec. 1 . 1 Variants of the linear programming problem 1
subject to a set of linear equality and inequality constraints. In particular,
let M1, M2, and M3 be some finite index sets, and suppose that for every i in
any one of these sets, we are given an n-dimensional vector ai and a scalar
bi, that will be used to form the ith constraint. Let also N1 and N2 be
subsets of {1, ..., n} that indicate which variables xj are constrained to be
nonnegative or nonpositive, respectively. We then consider the problem
    minimize    c'x
    subject to  ai'x ≥ bi,   i ∈ M1,
                ai'x ≤ bi,   i ∈ M2,           (1.1)
                ai'x = bi,   i ∈ M3,
                xj ≥ 0,      j ∈ N1,
                xj ≤ 0,      j ∈ N2.
The variables x1, ..., xn are called decision variables, and a vector x
satisfying all of the constraints is called a feasible solution or feasible vector.
The set of all feasible solutions is called the feasible set or feasible region.
If j is in neither N1 nor N2, there are no restrictions on the sign of xj, in
which case we say that xj is a free or unrestricted variable. The function
c'x is called the objective function or cost function. A feasible solution x*
that minimizes the objective function (that is, c'x* ≤ c'x, for all feasible x)
is called an optimal feasible solution or, simply, an optimal solution. The
value of c'x* is then called the optimal cost. On the other hand, if for
every real number K we can find a feasible solution x whose cost is less
than K, we say that the optimal cost is -∞ or that the cost is unbounded
below. (Sometimes, we will abuse terminology and say that the problem is
unbounded.) We finally note that there is no need to study maximization
problems separately, because maximizing c'x is equivalent to minimizing
the linear cost function -c'x.
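The unboundedness terminology can be made concrete with a minimal numerical sketch. The example below is our own illustration, not part of the book's text; it uses SciPy's linprog solver, and the status code is SciPy's convention:

```python
from scipy.optimize import linprog

# Minimize -x1 subject only to x1 >= 0.  For any real number K we can
# take x1 > K, so the cost -x1 drops below -K: the optimal cost is
# -infinity and the problem is unbounded.
res = linprog(c=[-1.0], bounds=[(0, None)])

# linprog reports this situation with status code 3 ("unbounded").
assert res.status == 3
```

A solver thus distinguishes an unbounded problem from one with an optimal solution, rather than iterating forever toward cost -∞.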
An equality constraint ai'x = bi is equivalent to the two constraints
ai'x ≤ bi and ai'x ≥ bi. In addition, any constraint of the form ai'x ≤ bi can
be rewritten as (-ai)'x ≥ -bi. Finally, constraints of the form xj ≥ 0 or
xj ≤ 0 are special cases of constraints of the form ai'x ≥ bi, where ai is a
unit vector and bi = 0. We conclude that the feasible set in a general linear
programming problem can be expressed exclusively in terms of inequality
constraints of the form ai'x ≥ bi. Suppose that there is a total of m such
constraints, indexed by i = 1, ..., m, let b = (b1, ..., bm), and let A be the
m × n matrix whose rows are the row vectors a1', ..., am', that is,

    A = [ a1' ]
        [  :  ]
        [ am' ]

Then, the constraints ai'x ≥ bi, i = 1, ..., m, can be expressed compactly
in the form Ax ≥ b, and the linear programming problem can be written as

    minimize    c'x
    subject to  Ax ≥ b.                        (1.2)

Inequalities such as Ax ≥ b will always be interpreted componentwise; that
is, for every i, the ith component of the vector Ax, which is ai'x, is greater
than or equal to the ith component bi of the vector b.
Example 1.2 The linear programming problem in Example 1.1 can be rewritten as

    minimize    2x1 - x2 + 4x3
    subject to  -x1 - x2 - x4 ≥ -2
                3x2 - x3      ≥ 5
               -3x2 + x3      ≥ -5
                 x3 + x4      ≥ 3
                 x1           ≥ 0
                -x3           ≥ 0,

which is of the same form as the problem (1.2), with c = (2, -1, 4, 0),

        [ -1  -1   0  -1 ]
        [  0   3  -1   0 ]
    A = [  0  -3   1   0 ]
        [  0   0   1   1 ]
        [  1   0   0   0 ]
        [  0   0  -1   0 ]

and b = (-2, 5, -5, 3, 0, 0).
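The passage from Example 1.1 to the data (c, A, b) can also be checked numerically. The sketch below is our own illustration in numpy: it verifies, on a few sample points, that the original constraints and the single componentwise inequality Ax ≥ b accept exactly the same vectors:

```python
import numpy as np

# Data (c, A, b) of the reformulated problem (1.2), as in Example 1.2.
c = np.array([2.0, -1.0, 4.0, 0.0])
A = np.array([
    [-1.0, -1.0,  0.0, -1.0],  # x1 + x2 + x4 <= 2, negated
    [ 0.0,  3.0, -1.0,  0.0],  # 3x2 - x3 = 5, ">=" half
    [ 0.0, -3.0,  1.0,  0.0],  # 3x2 - x3 = 5, "<=" half, negated
    [ 0.0,  0.0,  1.0,  1.0],  # x3 + x4 >= 3
    [ 1.0,  0.0,  0.0,  0.0],  # x1 >= 0
    [ 0.0,  0.0, -1.0,  0.0],  # x3 <= 0, negated
])
b = np.array([-2.0, 5.0, -5.0, 3.0, 0.0, 0.0])

def feasible_original(x):
    """Constraints of Example 1.1, checked one by one."""
    x1, x2, x3, x4 = x
    return (x1 + x2 + x4 <= 2 and 3 * x2 - x3 == 5
            and x3 + x4 >= 3 and x1 >= 0 and x3 <= 0)

def feasible_compact(x):
    """The single componentwise inequality Ax >= b of (1.2)."""
    return bool(np.all(A @ x >= b))

# The two descriptions agree; the sample points below violate
# different constraints of the original formulation.
points = [np.array(p, dtype=float) for p in
          [(1, 2, 0, 0), (0, 5/3, 0, 3), (4, 3, 4, -1), (0, 0, 0, 0)]]
agree = all(feasible_original(x) == feasible_compact(x) for x in points)
```

The function names are ours; the point of the sketch is only that the transformation rules of the text (negating "≤" rows, splitting the equality into two opposing inequalities) preserve the feasible set.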
Standard form problems

A linear programming problem of the form

    minimize    c'x
    subject to  Ax = b                         (1.3)
                x ≥ 0,

is said to be in standard form. We provide an interpretation of problems in
standard form. Suppose that x has dimension n and let A1, ..., An be the
columns of A. Then, the constraint Ax = b can be written in the form

    Σ_{i=1}^n Ai xi = b.
Intuitively, there are n available resource vectors A1, ..., An, and a target
vector b. We wish to "synthesize" the target vector b by using a nonnegative
amount xi of each resource vector Ai, while minimizing the cost
Σ_{i=1}^n ci xi, where ci is the unit cost of the ith resource. The following is a
more concrete example.
Example 1.3 (The diet problem) Suppose that there are n different foods
and m different nutrients, and that we are given the following table with the
nutritional content of a unit of each food:

                 food 1   ...   food n
    nutrient 1    a11     ...    a1n
       ...        ...     ...    ...
    nutrient m    am1     ...    amn

Let A be the m × n matrix with entries a_ij. Note that the jth column Aj
of this matrix represents the nutritional content of the jth food. Let b be a
vector with the requirements of an ideal diet or, equivalently, a specification of
the nutritional contents of an "ideal food." We then interpret the standard form
problem as the problem of mixing nonnegative quantities xj of the available foods,
to synthesize the ideal food at minimal cost. In a variant of this problem, the
vector b specifies the minimal requirements of an adequate diet; in that case, the
constraints Ax = b are replaced by Ax ≥ b, and the problem is not in standard
form.
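As a numerical sketch of the variant with minimal requirements (our own toy instance, two nutrients and three foods; all the numbers are made up for illustration), the problem minimize c'x subject to Ax ≥ b, x ≥ 0 can be handed to SciPy's linprog after negating the inequality, since linprog expects "≤" constraints:

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical data: a_ij = amount of nutrient i in one unit of food j.
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 2.0, 3.0]])
b = np.array([8.0, 10.0])      # minimal requirement for each nutrient
c = np.array([1.5, 2.0, 1.0])  # cost per unit of each food

# minimize c'x  subject to  Ax >= b, x >= 0.
# linprog uses A_ub x <= b_ub, so Ax >= b becomes -Ax <= -b.
res = linprog(c, A_ub=-A, b_ub=-b, bounds=[(0, None)] * 3)

assert res.status == 0                  # an optimal diet exists
assert np.all(A @ res.x >= b - 1e-7)    # all requirements are met
cheapest_diet_cost = res.fun
```

The negation trick is exactly the rewriting a'x ≤ b as (-a)'x ≥ -b from the previous section, applied in the opposite direction.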
Reduction to standard form

As argued earlier, any linear programming problem, including the standard
form problem (1.3), is a special case of the general form (1.1). We now
argue that the converse is also true and that a general linear programming
problem can be transformed into an equivalent problem in standard form.
Here, when we say that the two problems are equivalent, we mean that given
a feasible solution to one problem, we can construct a feasible solution to
the other, with the same cost. In particular, the two problems have the
same optimal cost and given an optimal solution to one problem, we can
construct an optimal solution to the other. The problem transformation
we have in mind involves two steps:

(a) Elimination of free variables: Given an unrestricted variable xj in a
problem in general form, we replace it by xj⁺ - xj⁻, where xj⁺ and xj⁻
are new variables on which we impose the sign constraints xj⁺ ≥ 0
and xj⁻ ≥ 0. The underlying idea is that any real number can be
written as the difference of two nonnegative numbers.

(b) Elimination of inequality constraints: Given an inequality constraint
of the form

    Σ_{j=1}^n a_ij xj ≤ bi,
we introduce a new variable si and the standard form constraints

    Σ_{j=1}^n a_ij xj + si = bi,
    si ≥ 0.

Such a variable si is called a slack variable. Similarly, an inequality
constraint of the form Σ_{j=1}^n a_ij xj ≥ bi can be put in standard form by
introducing a surplus variable si and the constraints Σ_{j=1}^n a_ij xj - si = bi,
si ≥ 0.

We conclude that a general problem can be brought into standard form
and, therefore, we only need to develop methods that are capable of solving
standard form problems.
Example 1.4 The problem

    minimize    2x1 + 4x2
    subject to   x1 +  x2 ≥ 3
                3x1 + 2x2 = 14
                 x1 ≥ 0,

is equivalent to the standard form problem

    minimize    2x1 + 4x2⁺ - 4x2⁻
    subject to   x1 +  x2⁺ -  x2⁻ - x3 = 3
                3x1 + 2x2⁺ - 2x2⁻      = 14
                 x1, x2⁺, x2⁻, x3 ≥ 0.

For example, given the feasible solution (x1, x2) = (6, -2) to the original problem,
we obtain the feasible solution (x1, x2⁺, x2⁻, x3) = (6, 0, 2, 1) to the standard
form problem, which has the same cost. Conversely, given the feasible solution
(x1, x2⁺, x2⁻, x3) = (8, 1, 6, 0) to the standard form problem, we obtain the feasible
solution (x1, x2) = (8, -5) to the original problem with the same cost.
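The equivalence claimed in Example 1.4 is easy to verify numerically. The sketch below is ours; it encodes both forms and checks the two correspondences from the text:

```python
import numpy as np

# Cost and constraints of the original problem of Example 1.4.
def original_cost(x1, x2):
    return 2 * x1 + 4 * x2

def original_feasible(x1, x2):
    return (x1 + x2 >= 3) and np.isclose(3 * x1 + 2 * x2, 14) and (x1 >= 0)

# Cost and constraints of the standard form problem, with variables
# (x1, x2p, x2m, x3): the free x2 was split as x2p - x2m, and x3 is
# the surplus variable of the constraint x1 + x2 >= 3.
def standard_cost(x1, x2p, x2m, x3):
    return 2 * x1 + 4 * x2p - 4 * x2m

def standard_feasible(x1, x2p, x2m, x3):
    return (np.isclose(x1 + x2p - x2m - x3, 3)
            and np.isclose(3 * x1 + 2 * x2p - 2 * x2m, 14)
            and min(x1, x2p, x2m, x3) >= 0)

# The correspondences given in the text preserve feasibility and cost.
assert original_feasible(6, -2) and standard_feasible(6, 0, 2, 1)
assert original_cost(6, -2) == standard_cost(6, 0, 2, 1) == 4
assert standard_feasible(8, 1, 6, 0) and original_feasible(8, -5)
assert standard_cost(8, 1, 6, 0) == original_cost(8, -5) == -4
```

Note that the standard form cost 2x1 + 4x2⁺ - 4x2⁻ is just the original cost with x2 replaced by x2⁺ - x2⁻, which is why the costs agree automatically.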
In the sequel, we will often use the general form Ax ≥ b to develop
the theory of linear programming. However, when it comes to algorithms,
and especially the simplex and interior point methods, we will be focusing
on the standard form Ax = b, x ≥ 0, which is computationally more
convenient.
1.2 Examples of linear programming problems
In this section, we discuss a number of examples of linear programming
problems. One of our purposes is to indicate the vast range of situations to
which linear programming can be applied. Another purpose is to develop
some familiarity with the art of constructing mathematical formulations of
loosely defined optimization problems.
Sec. 1 . 2 Examples of linear programming problems z
A production problem
A firm produces n different goods using m different raw materials. Let bi,
i = 1, ..., m, be the available amount of the ith raw material. The jth
good, j = 1, ..., n, requires a_ij units of the ith material and results in a
revenue of cj per unit produced. The firm faces the problem of deciding
how much of each good to produce in order to maximize its total revenue.
In this example, the choice of the decision variables is simple. Let xj,
j = 1, ..., n, be the amount of the jth good. Then, the problem facing the
firm can be formulated as follows:

    maximize    c1 x1 + ... + cn xn
    subject to  a_i1 x1 + ... + a_in xn ≤ bi,   i = 1, ..., m,
                xj ≥ 0,                         j = 1, ..., n.
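Since most LP solvers minimize, a maximization like this one is handled by minimizing -c'x. A sketch with SciPy's linprog on a tiny made-up instance (two materials, two goods; the numbers are ours, not from the text):

```python
import numpy as np
from scipy.optimize import linprog

# Made-up data: a[i][j] = units of material i consumed per unit of good j.
a = np.array([[1.0, 2.0],
              [3.0, 1.0]])
b = np.array([14.0, 18.0])  # available amounts of the two raw materials
c = np.array([2.0, 3.0])    # revenue per unit of each good

# maximize c'x  subject to  ax <= b, x >= 0,
# posed to linprog as:  minimize -c'x  (bounds default to x >= 0).
res = linprog(-c, A_ub=a, b_ub=b)

assert res.status == 0
max_revenue = -res.fun                 # total revenue of an optimal plan
assert np.all(a @ res.x <= b + 1e-7)   # the plan respects material supplies
```

The sign flip changes only the reported objective value, not the optimal production plan itself.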
Production planning by a computer manufacturer
The example that we consider here is a problem that Digital Equipment
Corporation (DEC) had faced in the fourth quarter of 1988. It illustrates
the complexities and uncertainties of real world applications, as well as
the usefulness of mathematical modeling for making important strategic
decisions.
In the second quarter of 1988, DEC introduced a new family of (single
CPU) computer systems and workstations: GP-1, GP-2, and GP-3, which
are general purpose computer systems with different memory, disk storage,
and expansion capabilities, as well as WS-1 and WS-2, which are
workstations. In Table 1.1, we list the models, the list prices, the average disk
usage per system, and the memory usage. For example, GP-1 uses four
256K memory boards, and 3 out of every 10 units are produced with a disk
drive.
    System    Price      # disk drives    # 256K boards
    GP-1      $60,000        0.3                4
    GP-2      $40,000        1.7                2
    GP-3      $30,000        0                  2
    WS-1      $30,000        1.4                2
    WS-2      $15,000        0                  1

    Table 1.1: Features of the five different DEC systems.
Shipments of this new family of products started in the third quarter
and ramped slowly during the fourth quarter. The following difficulties
were anticipated for the next quarter:
(a) The in-house supplier of CPUs could provide at most 7,000 units, due
to debugging problems.
(b) The supply of disk drives was uncertain and was estimated by the
manufacturer to be in the range of 3,000 to 7,000 units.
(c) The supply of 256K memory boards was also limited in the range of
8,000 to 16,000 units.
On the demand side, the marketing department established that the
maximum demand for the first quarter of 1989 would be 1,800 for GP-1
systems, 300 for GP-3 systems, 3,800 systems for the whole GP family, and
3,200 systems for the WS family. Included in these projections were 500
orders for GP-2, 500 orders for WS-1, and 400 orders for WS-2 that had
already been received and had to be fulfilled in the next quarter.
In the previous quarters, in order to address the disk drive shortage, DEC had produced GP1, GP3, and WS2 with no disk drive (although 3 out of 10 customers for GP1 systems wanted a disk drive), and GP2, WS1 with one disk drive. We refer to this way of configuring the systems as the constrained mode of production.

In addition, DEC could address the shortage of 256K memory boards by using two alternative boards, instead of four 256K memory boards, in the GP1 system. DEC could provide 4,000 alternative boards for the next quarter.
It was clear to the manufacturing staff that the problem had become complex, as revenue, profitability, and customer satisfaction were at risk. The following decisions needed to be made:

(a) The production plan for the first quarter of 1989.

(b) Concerning disk drive usage, should DEC continue to manufacture products in the constrained mode, or should it plan to satisfy customer preferences?

(c) Concerning memory boards, should DEC use alternative memory boards for its GP1 systems?

(d) A final decision that had to be made was related to trade-offs between shortages of disk drives and of 256K memory boards. The manufacturing staff would like to concentrate their efforts on either decreasing the shortage of disks or decreasing the shortage of 256K memory boards. Hence, they would like to know which alternative would have a larger effect on revenue.
In order to model the problem that DEC faced, we introduce variables x1, x2, x3, x4, x5 that represent the number (in thousands) of GP1, GP2, GP3, WS1, and WS2 systems, respectively, to be produced in the next quarter. Strictly speaking, since 1000xi stands for a number of units, it must be an integer. This can be accomplished by truncating each xi after the third decimal point; given the size of the demand and the size of the variables xi, this has a negligible effect and the integrality constraint on 1000xi can be ignored.

DEC had to make two distinct decisions: whether to use the constrained mode of production regarding disk drive usage, and whether to use alternative memory boards for the GP1 system. As a result, there are four different combinations of possible choices.

We first develop a model for the case where alternative memory boards are not used and the constrained mode of production of disk drives is selected. The problem can be formulated as follows:
maximize    60x1 + 40x2 + 30x3 + 30x4 + 15x5    (total revenue)

subject to the following constraints:

    x1 + x2 + x3 + x4 + x5 ≤ 7          (CPU availability)
    4x1 + 2x2 + 2x3 + 2x4 + x5 ≤ 8      (256K availability)
    x2 + x4 ≤ 3                         (disk drive availability)
    x1 ≤ 1.8                            (max demand for GP1)
    x3 ≤ 0.3                            (max demand for GP3)
    x1 + x2 + x3 ≤ 3.8                  (max demand for GP)
    x4 + x5 ≤ 3.2                       (max demand for WS)
    x2 ≥ 0.5                            (min demand for GP2)
    x4 ≥ 0.5                            (min demand for WS1)
    x5 ≥ 0.4                            (min demand for WS2)
    x1, x2, x3, x4, x5 ≥ 0.
Notice that the objective function is in millions of dollars. In some respects, this is a pessimistic formulation, because the 256K memory and disk drive availability were set to 8 and 3, respectively, which is the lowest value in the range that was estimated. It is actually of interest to determine the solution to this problem as the 256K memory availability ranges from 8 to 16 and the disk drive availability ranges from 3 to 7, because this provides valuable information on the sensitivity of the optimal solution on availability. In another respect, the formulation is optimistic because, for example, it assumes that the revenue from GP1 systems is 60x1 for any x1 ≤ 1.8, even though a demand for 1,800 GP1 systems is not guaranteed.
In order to accommodate the other three choices that DEC had, some of the problem constraints have to be modified, as follows. If we use the unconstrained mode of production for disk drives, the constraint x2 + x4 ≤ 3 is replaced by

    0.3x1 + 1.7x2 + 1.4x4 ≤ 3.

Furthermore, if we wish to use alternative memory boards in GP1 systems, we replace the constraint 4x1 + 2x2 + 2x3 + 2x4 + x5 ≤ 8 by the two constraints

    2x1 ≤ 4,
    2x2 + 2x3 + 2x4 + x5 ≤ 8.
The four combinations of choices lead to four different linear programming problems, each of which needs to be solved for a variety of parameter values because, as discussed earlier, the right-hand side of some of the constraints is only known to lie within a certain range. Methods for solving linear programming problems when certain parameters are allowed to vary will be studied in Chapter 5, where this case study is revisited.
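As a minimal illustration, the constrained-mode model above can be stored as plain data and a candidate production plan checked for feasibility and revenue. This is only a sketch, not DEC's method; the candidate plan is made up, while the constraint data follow the formulation above.

```python
# Sketch: the constrained-mode DEC model as plain data.
# Variables x = (x1, ..., x5) are thousands of GP1, GP2, GP3, WS1, WS2 units.

# Each entry is (coefficients, bound), meaning sum(c_i * x_i) <= bound.
UPPER = [
    ([1, 1, 1, 1, 1], 7.0),    # CPU availability
    ([4, 2, 2, 2, 1], 8.0),    # 256K board availability
    ([0, 1, 0, 1, 0], 3.0),    # disk drives (constrained mode)
    ([1, 0, 0, 0, 0], 1.8),    # max demand, GP1
    ([0, 0, 1, 0, 0], 0.3),    # max demand, GP3
    ([1, 1, 1, 0, 0], 3.8),    # max demand, GP family
    ([0, 0, 0, 1, 1], 3.2),    # max demand, WS family
]
LOWER = [
    ([0, 1, 0, 0, 0], 0.5),    # committed GP2 orders
    ([0, 0, 0, 1, 0], 0.5),    # committed WS1 orders
    ([0, 0, 0, 0, 1], 0.4),    # committed WS2 orders
]
REVENUE = [60, 40, 30, 30, 15]   # $ millions per thousand units

def dot(c, x):
    return sum(ci * xi for ci, xi in zip(c, x))

def feasible(x, tol=1e-9):
    """True if the plan x satisfies every constraint of the model."""
    return (all(xi >= -tol for xi in x)
            and all(dot(c, x) <= b + tol for c, b in UPPER)
            and all(dot(c, x) >= b - tol for c, b in LOWER))

def revenue(x):
    return dot(REVENUE, x)

plan = [0.5, 0.5, 0.3, 0.5, 0.4]   # a plan meeting all committed orders
print(feasible(plan), round(revenue(plan), 2))   # True 80.0
```

An LP solver would search over all feasible plans for the one maximizing `revenue`; the check above is the feasibility test such a solver enforces.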
Multiperiod planning of electric power capacity
A state wants to plan its electricity capacity for the next T years. The state has a forecast of d_t megawatts, presumed accurate, of the demand for electricity during year t = 1, . . . , T. The existing capacity, which is in oil-fired plants, that will not be retired and will be available during year t, is e_t. There are two alternatives for expanding electric capacity: coal-fired or nuclear power plants. There is a capital cost of c_t per megawatt of coal-fired capacity that becomes operational at the beginning of year t. The corresponding capital cost for nuclear power plants is n_t. For various political and safety reasons, it has been decided that no more than 20% of the total capacity should ever be nuclear. Coal plants last for 20 years, while nuclear plants last for 15 years. A least cost capacity expansion plan is desired.
The first step in formulating this problem as a linear programming problem is to define the decision variables. Let x_t and y_t be the amount of coal (respectively, nuclear) capacity brought on line at the beginning of year t. Let w_t and z_t be the total coal (respectively, nuclear) capacity available in year t. The cost of a capacity expansion plan is therefore

    Σ_{t=1}^T (c_t x_t + n_t y_t).
Since coal-fired plants last for 20 years, we have

    w_t = Σ_{s=max{1,t−19}}^t x_s,    t = 1, . . . , T.

Similarly, for nuclear power plants,

    z_t = Σ_{s=max{1,t−14}}^t y_s,    t = 1, . . . , T.
Since the available capacity must meet the forecasted demand, we require

    w_t + z_t + e_t ≥ d_t,    t = 1, . . . , T.

Finally, since no more than 20% of the total capacity should ever be nuclear, we have

    z_t / (w_t + z_t + e_t) ≤ 0.2,

which can be written as

    0.8 z_t − 0.2 w_t − 0.2 e_t ≤ 0.
Summarizing, the capacity expansion problem is as follows:

minimize    Σ_{t=1}^T (c_t x_t + n_t y_t)

subject to  w_t − Σ_{s=max{1,t−19}}^t x_s = 0,    t = 1, . . . , T,
            z_t − Σ_{s=max{1,t−14}}^t y_s = 0,    t = 1, . . . , T,
            w_t + z_t + e_t ≥ d_t,                t = 1, . . . , T,
            0.8 z_t − 0.2 w_t − 0.2 e_t ≤ 0,      t = 1, . . . , T,
            x_t, y_t, w_t, z_t ≥ 0,               t = 1, . . . , T.
We note that this formulation is not entirely realistic, because it disregards certain economies of scale that may favor larger plants. However, it can provide a ballpark estimate of the true cost.
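The sliding-window accounting behind w_t and z_t can be sketched directly. The numbers below are invented purely for illustration; the lifetimes (20 and 15 years) and the demand and 20%-nuclear checks follow the formulation above.

```python
# Sketch: given expansion decisions x[t] (coal) and y[t] (nuclear) for
# t = 1, ..., T, compute the available capacities
#   w[t] = sum of x[s] for s = max(1, t-19), ..., t   (coal lasts 20 years)
#   z[t] = sum of y[s] for s = max(1, t-14), ..., t   (nuclear lasts 15 years)
# and check the demand and 20%-nuclear constraints.

def available(plan, lifetime, t):
    """Capacity on line in year t from units built in the last `lifetime` years."""
    return sum(plan[s] for s in range(max(1, t - lifetime + 1), t + 1))

def check_plan(x, y, e, d, T):
    ok = True
    for t in range(1, T + 1):
        w = available(x, 20, t)
        z = available(y, 15, t)
        total = w + z + e[t]
        ok &= total >= d[t]              # meet the forecasted demand
        ok &= z <= 0.2 * total + 1e-9    # nuclear at most 20% of the total
    return ok

T = 3
x = {1: 100.0, 2: 0.0, 3: 50.0}     # MW of coal capacity brought on line
y = {1: 20.0, 2: 10.0, 3: 0.0}      # MW of nuclear capacity brought on line
e = {1: 300.0, 2: 280.0, 3: 260.0}  # surviving oil-fired capacity
d = {1: 400.0, 2: 400.0, 3: 420.0}  # demand forecast
print(check_plan(x, y, e, d, T))    # True
```

The LP chooses x and y minimizing total capital cost among all plans passing this check.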
A scheduling problem

In the previous examples, the choice of the decision variables was fairly straightforward. We now discuss an example where this choice is less obvious.

A hospital wants to make a weekly night shift (12 p.m.–8 a.m.) schedule for its nurses. The demand for nurses for the night shift on day j is an integer d_j, j = 1, . . . , 7. Every nurse works 5 days in a row on the night shift. The problem is to find the minimal number of nurses the hospital needs to hire.

One could try using a decision variable y_j equal to the number of nurses that work on day j. With this definition, however, we would not be able to capture the constraint that every nurse works 5 days in a row. For this reason, we choose the decision variables differently, and define x_j as the number of nurses starting their week on day j. (For example, a nurse whose week starts on day 5 will work days 5, 6, 7, 1, 2.) We then have the following problem formulation:
minimize    x1 + x2 + x3 + x4 + x5 + x6 + x7

subject to  x1 + x4 + x5 + x6 + x7 ≥ d1
            x1 + x2 + x5 + x6 + x7 ≥ d2
            x1 + x2 + x3 + x6 + x7 ≥ d3
            x1 + x2 + x3 + x4 + x7 ≥ d4
            x1 + x2 + x3 + x4 + x5 ≥ d5
            x2 + x3 + x4 + x5 + x6 ≥ d6
            x3 + x4 + x5 + x6 + x7 ≥ d7
            x_j ≥ 0,    x_j integer.
This would be a linear programming problem, except for the constraint that each x_j must be an integer, and we actually have a linear integer programming problem. One way of dealing with this issue is to ignore ("relax") the integrality constraints and obtain the so-called linear programming relaxation of the original problem. Because the linear programming problem has fewer constraints, and therefore more options, the optimal cost will be less than or equal to the optimal cost of the original problem. If the optimal solution to the linear programming relaxation happens to be integer, then it is also an optimal solution to the original problem. If it is not integer, we can round each x_j upwards, thus obtaining a feasible, but not necessarily optimal, solution to the original problem. It turns out that for this particular problem, an optimal solution can be found without too much effort. However, this is the exception rather than the rule: finding optimal solutions to general integer programming problems is typically difficult; some methods will be discussed in Chapter 11.
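The coverage constraints and the rounding step can be sketched as follows. The demand vector and the fractional solution are invented for illustration; the five-consecutive-days rule and the round-up heuristic follow the discussion above.

```python
# Sketch: nurses starting on day j work days j, j+1, ..., j+4 (cyclically).
# Given start counts x[0..6] (0-indexed days), check the coverage
# constraints, and round a fractional (LP-relaxation-style) solution up
# to a feasible integer one.

import math

def coverage(x):
    """on_duty[d] = number of nurses working the night shift on day d."""
    on_duty = [0] * 7
    for j in range(7):          # start day
        for k in range(5):      # five consecutive working days
            on_duty[(j + k) % 7] += x[j]
    return on_duty

def feasible(x, d):
    return all(c >= dem for c, dem in zip(coverage(x), d))

demand = [3, 2, 4, 4, 3, 3, 2]
fractional = [1.2, 0.0, 0.8, 0.0, 1.0, 0.4, 0.6]   # hypothetical LP solution
rounded = [math.ceil(v) for v in fractional]
print(rounded, feasible(rounded, demand))   # [2, 0, 1, 0, 1, 1, 1] True
```

Rounding up can only increase coverage, so any feasible fractional solution yields a feasible integer one, at the price of possibly hiring a few extra nurses.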
Choosing paths in a communication network

Consider a communication network consisting of n nodes. Nodes are connected by communication links. A link allowing one-way transmission from node i to node j is described by an ordered pair (i, j). Let A be the set of all links. We assume that each link (i, j) ∈ A can carry up to u_ij bits per second. There is a positive charge c_ij per bit transmitted along that link. Each node k generates data, at the rate of b^{kℓ} bits per second, that have to be transmitted to node ℓ, either through a direct link (k, ℓ) or by tracing a sequence of links. The problem is to choose paths along which all data reach their intended destinations, while minimizing the total cost. We allow the data with the same origin and destination to be split and be transmitted along different paths.
In order to formulate this problem as a linear programming problem, we introduce variables x_ij^{kℓ} indicating the amount of data with origin k and destination ℓ that traverse link (i, j). Let

    b_i^{kℓ} =   b^{kℓ},    if i = k,
                −b^{kℓ},    if i = ℓ,
                 0,         otherwise.

Thus, b_i^{kℓ} is the net inflow at node i, from outside the network, of data with origin k and destination ℓ. We then have the following formulation:
minimize    Σ_{(i,j)∈A} Σ_{k=1}^n Σ_{ℓ=1}^n c_ij x_ij^{kℓ}

subject to  Σ_{{j|(i,j)∈A}} x_ij^{kℓ} − Σ_{{j|(j,i)∈A}} x_ji^{kℓ} = b_i^{kℓ},    i, k, ℓ = 1, . . . , n,

            Σ_{k=1}^n Σ_{ℓ=1}^n x_ij^{kℓ} ≤ u_ij,    (i, j) ∈ A,

            x_ij^{kℓ} ≥ 0,    (i, j) ∈ A,    k, ℓ = 1, . . . , n.
The first constraint is a flow conservation constraint at node i for data with origin k and destination ℓ. The expression

    Σ_{{j|(i,j)∈A}} x_ij^{kℓ}

represents the amount of data with origin and destination k and ℓ, respectively, that leave node i along some link. The expression

    Σ_{{j|(j,i)∈A}} x_ji^{kℓ}

represents the amount of data with the same origin and destination that enter node i through some link. Finally, b_i^{kℓ} is the net amount of such data that enter node i from outside the network. The second constraint expresses the requirement that the total traffic through a link (i, j) cannot exceed the link's capacity.
This problem is known as the multicommodity flow problem, with the traffic corresponding to each origin-destination pair viewed as a different commodity. A mathematically similar problem arises when we consider a transportation company that wishes to transport several commodities from their origins to their destinations through a network. There is a version of this problem, known as the minimum cost network flow problem, in which we do not distinguish between different commodities. Instead, we are given the amount b_i of external supply or demand at each node i, and the objective is to transport material from the supply nodes to the demand nodes, at minimum cost. The network flow problem, which is the subject of Chapter 7, contains as special cases some important problems such as the shortest path problem, the maximum flow problem, and the assignment problem.
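The definition of the external supply data b_i^{kℓ} can be made concrete on a toy network; the node count and rate below are made up. Summed over all nodes, the external inflows cancel, which is what makes the flow conservation constraints consistent.

```python
# Sketch: the net inflow data b_i^{kl} for one origin-destination pair on a
# toy network; it sums to zero over the nodes, as flow conservation requires.

def b(i, k, l, rate):
    """Net inflow at node i, from outside the network, of (k, l)-traffic."""
    if i == k:
        return rate       # the origin injects the data
    if i == l:
        return -rate      # the destination absorbs it
    return 0.0            # intermediate nodes only relay

n, k, l, rate = 4, 1, 3, 10.0      # 4 nodes; 10 bits/s from node 1 to node 3
values = [b(i, k, l, rate) for i in range(1, n + 1)]
print(values, sum(values))   # [10.0, 0.0, -10.0, 0.0] 0.0
```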
Pattern classification

We are given m examples of objects and for each one, say the ith one, a description of its features in terms of an n-dimensional vector a_i. Objects belong to one of two classes, and for each example we are told the class that it belongs to.

More concretely, suppose that each object is an image of an apple or an orange (these are our two classes). In this context, we can use a three-dimensional feature vector a_i to summarize the contents of the ith image. The three components of a_i (the features) could be the ellipticity of the object, the length of its stem, and its color, as measured in some scale. We are interested in designing a classifier which, given a new object (other than the originally available examples), will figure out whether it is an image of an apple or of an orange.
A linear classifier is defined in terms of an n-dimensional vector x and a scalar x_{n+1}, and operates as follows. Given a new object with feature vector a, the classifier declares it to be an object of the first class if

    a'x ≥ x_{n+1},

and of the second class if

    a'x < x_{n+1}.

In words, a linear classifier makes decisions on the basis of a linear combination of the different features. Our objective is to use the available examples in order to design a "good" linear classifier.

There are many ways of approaching this problem, but a reasonable starting point could be the requirement that the classifier must give the correct answer for each one of the available examples. Let S be the set of examples of the first class. We are then looking for some x and x_{n+1} that satisfy the constraints

    a_i'x ≥ x_{n+1},    i ∈ S,
    a_i'x < x_{n+1},    i ∉ S.
Note that the second set of constraints involves a strict inequality and is not quite of the form arising in linear programming. This issue can be bypassed by observing that if some choice of x and x_{n+1} satisfies all of the above constraints, then there exists some other choice (obtained by multiplying x and x_{n+1} by a suitably large positive scalar) that satisfies

    a_i'x ≥ x_{n+1},        i ∈ S,
    a_i'x ≤ x_{n+1} − 1,    i ∉ S.

We conclude that the search for a linear classifier consistent with all available examples is a problem of finding a feasible solution to a linear programming problem.
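The decision rule itself, and what it means for (x, x_{n+1}) to be consistent with the examples, can be sketched directly. The feature vectors, labels, and the particular (x, x_{n+1}) below are made up; finding such a pair in general is the LP feasibility problem just described.

```python
# Sketch: a linear classifier (x, x_{n+1}) declares a feature vector a to be
# in the first class exactly when a'x >= x_{n+1}.

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def classify(x, threshold, a):
    """Return 1 (first class) if a'x >= threshold, else 2 (second class)."""
    return 1 if dot(a, x) >= threshold else 2

def consistent(x, threshold, examples):
    """True if (x, threshold) reproduces the class of every labeled example."""
    return all(classify(x, threshold, a) == cls for a, cls in examples)

# (feature vector, class): class 1 = "apple", class 2 = "orange" (illustrative)
examples = [((0.9, 0.1), 1), ((0.8, 0.3), 1), ((0.2, 0.9), 2), ((0.1, 0.7), 2)]
x, threshold = (1.0, -1.0), 0.0
print(consistent(x, threshold, examples))   # True
```

Any feasible solution of the LP plays the role of (x, threshold) here; the LP has no objective to optimize, only constraints to satisfy.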
1.3 Piecewise linear convex objective functions
All of the examples in the preceding section involved a linear objective function. However, there is an important class of optimization problems with a nonlinear objective function that can be cast as linear programming problems; these are examined next.

We first need some definitions:
Definition 1.1
(a) A function f : R^n → R is called convex if for every x, y ∈ R^n, and every λ ∈ [0, 1], we have

    f(λx + (1 − λ)y) ≤ λf(x) + (1 − λ)f(y).

(b) A function f : R^n → R is called concave if for every x, y ∈ R^n, and every λ ∈ [0, 1], we have

    f(λx + (1 − λ)y) ≥ λf(x) + (1 − λ)f(y).
Note that if x and y are vectors in R^n and if λ ranges in [0, 1], then points of the form λx + (1 − λ)y belong to the line segment joining x and y. The definition of a convex function refers to the values of f, as its argument traces this segment. If f were linear, the inequality in part (a) of the definition would hold with equality. The inequality therefore means that when we restrict attention to such a segment, the graph of the function lies no higher than the graph of a corresponding linear function; see Figure 1.1(a).

It is easily seen that a function f is convex if and only if the function −f is concave. Note that a function of the form f(x) = a_0 + Σ_{i=1}^n a_i x_i, where a_0, . . . , a_n are scalars, called an affine function, is both convex and concave. (It turns out that affine functions are the only functions that are both convex and concave.) Convex (as well as concave) functions play a central role in optimization.

We say that a vector x is a local minimum of f if f(x) ≤ f(y) for all y in the vicinity of x. We also say that x is a global minimum if f(x) ≤ f(y) for all y. A convex function cannot have local minima that fail to be global minima (see Figure 1.1), and this property is of great help in designing efficient optimization algorithms.
Let c_1, . . . , c_m be vectors in R^n, let d_1, . . . , d_m be scalars, and consider the function f : R^n → R defined by

    f(x) = max_{i=1,...,m} (c_i'x + d_i)
Figure 1.1: (a) Illustration of the definition of a convex function. (b) A concave function. (c) A function that is neither convex nor concave; note that A is a local, but not global, minimum.
[see Figure 1.2(a)]. Such a function is convex, as a consequence of the following result.

Theorem 1.1 Let f_1, . . . , f_m : R^n → R be convex functions. Then, the function f defined by f(x) = max_{i=1,...,m} f_i(x) is also convex.

Proof. Let x, y ∈ R^n and let λ ∈ [0, 1]. We have

    f(λx + (1 − λ)y) = max_{i=1,...,m} f_i(λx + (1 − λ)y)
                     ≤ max_{i=1,...,m} (λf_i(x) + (1 − λ)f_i(y))
                     ≤ max_{i=1,...,m} λf_i(x) + max_{i=1,...,m} (1 − λ)f_i(y)
                     = λf(x) + (1 − λ)f(y).
A function of the form max_{i=1,...,m} (c_i'x + d_i) is called a piecewise linear convex function. A simple example is the absolute value function defined by f(x) = |x| = max{x, −x}. As illustrated in Figure 1.2(b), a piecewise linear convex function can be used to approximate a general convex function.

We now consider a generalization of linear programming, where the objective function is piecewise linear and convex rather than linear:

minimize    max_{i=1,...,m} (c_i'x + d_i)
subject to  Ax ≥ b.
Figure 1.2: (a) A piecewise linear convex function of a single variable. (b) An approximation of a convex function by a piecewise linear convex function.
Note that max_{i=1,...,m} (c_i'x + d_i) is equal to the smallest number z that satisfies z ≥ c_i'x + d_i for all i. For this reason, the optimization problem under consideration is equivalent to the linear programming problem

minimize    z
subject to  z ≥ c_i'x + d_i,    i = 1, . . . , m,
            Ax ≥ b,

where the decision variables are z and x.

To summarize, linear programming can be used to solve problems with piecewise linear convex cost functions, and the latter class of functions can be used as an approximation of more general convex cost functions. On the other hand, such a piecewise linear approximation is not always a good idea because it can turn a smooth function into a nonsmooth one (piecewise linear functions have discontinuous derivatives).

We finally note that if we are given a constraint of the form f(x) ≤ h, where f is the piecewise linear convex function f(x) = max_{i=1,...,m} (f_i'x + g_i), such a constraint can be rewritten as

    f_i'x + g_i ≤ h,    i = 1, . . . , m,

and linear programming is again applicable.
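The two facts driving this reformulation can be checked numerically on a small instance: f(x) is the least z satisfying all the inequalities z ≥ c_i x + d_i, and f satisfies the convexity inequality of Definition 1.1. The three affine pieces below are made up for illustration.

```python
# Sketch: f(x) = max_i (c_i*x + d_i) for a scalar x with illustrative pieces.
# We check (i) f(x) is the least feasible z in the epigraph inequalities, and
# (ii) the convexity inequality of Definition 1.1 on sample points.

pieces = [(-1.0, 0.0), (0.0, 1.0), (2.0, -2.0)]   # (c_i, d_i)

def f(x):
    return max(c * x + d for c, d in pieces)

def feasible_z(x, z):
    """True if z satisfies every inequality z >= c_i*x + d_i."""
    return all(z >= c * x + d for c, d in pieces)

x0 = 0.5
# f(x0) is feasible, and nothing strictly smaller is:
assert feasible_z(x0, f(x0)) and not feasible_z(x0, f(x0) - 1e-9)

# Convexity: f(lam*x + (1-lam)*y) <= lam*f(x) + (1-lam)*f(y)
for x, y in [(-2.0, 3.0), (0.0, 1.5), (-1.0, 0.25)]:
    for lam in [0.0, 0.25, 0.5, 0.75, 1.0]:
        mid = lam * x + (1 - lam) * y
        assert f(mid) <= lam * f(x) + (1 - lam) * f(y) + 1e-12
print("checks passed")
```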
Problems involving absolute values

Consider a problem of the form

minimize    Σ_{i=1}^n c_i |x_i|
subject to  Ax ≥ b,
where x = (x_1, . . . , x_n), and where the cost coefficients c_i are assumed to be nonnegative. The cost criterion, being the sum of the piecewise linear convex functions c_i |x_i|, is easily shown to be piecewise linear and convex (Exercise 1.2). However, expressing this cost criterion in the form max_j (c_j'x + d_j) is a bit involved, and a more direct route is preferable.

We observe that |x_i| is the smallest number z_i that satisfies x_i ≤ z_i and −x_i ≤ z_i, and we obtain the linear programming formulation

minimize    Σ_{i=1}^n c_i z_i
subject to  Ax ≥ b,
            x_i ≤ z_i,     i = 1, . . . , n,
            −x_i ≤ z_i,    i = 1, . . . , n.
An alternative method for dealing with absolute values is to introduce new variables x_i^+, x_i^−, constrained to be nonnegative, and let x_i = x_i^+ − x_i^−. (Our intention is to have x_i = x_i^+ or x_i = −x_i^−, depending on whether x_i is positive or negative.) We then replace every occurrence of |x_i| with x_i^+ + x_i^− and obtain the alternative formulation

minimize    Σ_{i=1}^n c_i x_i^+ + Σ_{i=1}^n c_i x_i^−
subject to  Ax^+ − Ax^− ≥ b,
            x^+, x^− ≥ 0,

where x^+ = (x_1^+, . . . , x_n^+) and x^− = (x_1^−, . . . , x_n^−).

The relations x_i = x_i^+ − x_i^−, x_i^+ ≥ 0, x_i^− ≥ 0, are not enough to guarantee that |x_i| = x_i^+ + x_i^−, and the validity of this reformulation may not be entirely obvious. Let us assume for simplicity that c_i > 0 for all i. At an optimal solution to the reformulated problem, and for each i, we must have either x_i^+ = 0 or x_i^− = 0, because otherwise we could reduce both x_i^+ and x_i^− by the same amount and preserve feasibility, while reducing the cost, in contradiction of optimality. Having guaranteed that either x_i^+ = 0 or x_i^− = 0, the desired relation |x_i| = x_i^+ + x_i^− now follows.

The formal correctness of the two reformulations that have been presented here, and in a somewhat more general setting, is the subject of Exercise 1.5. We also note that the nonnegativity assumption on the cost coefficients c_i is crucial because, otherwise, the cost criterion is nonconvex.
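The key property of the split — x_i^+ + x_i^− equals |x_i| exactly when one of the two parts is zero — can be illustrated with a few numbers (all made up):

```python
# Sketch: the split x = x_plus - x_minus with x_plus, x_minus >= 0 represents
# |x| as x_plus + x_minus exactly when one of the two parts is zero, which is
# what optimality forces when the cost coefficients are positive.

def split(x):
    """The minimal split: one of the two parts is always zero."""
    return (max(x, 0.0), max(-x, 0.0))

for x in [3.5, 0.0, -2.0]:
    p, m = split(x)
    assert p >= 0 and m >= 0 and p - m == x   # represents x
    assert p + m == abs(x)                    # and recovers |x|

# A wasteful split still represents x, but overstates |x| -- such points
# cannot be optimal when c_i > 0, since both parts could be reduced:
p, m = 5.0, 2.0                               # x = 3, but p + m = 7 > |3|
assert p - m == 3.0 and p + m > abs(3.0)
print("split checks passed")
```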
Example 1.5 Consider the problem

minimize    2|x_1| + x_2
subject to  x_1 + x_2 ≥ 4.

Our first reformulation yields

minimize    2z_1 + x_2
subject to  x_1 + x_2 ≥ 4
            x_1 ≤ z_1
            −x_1 ≤ z_1,

while the second yields

minimize    2x_1^+ + 2x_1^− + x_2
subject to  x_1^+ − x_1^− + x_2 ≥ 4
            x_1^+ ≥ 0
            x_1^− ≥ 0.
We now continue with some applications involving piecewise linear
convex objective functions.
Data fitting

We are given m data points of the form (a_i, b_i), i = 1, . . . , m, where a_i ∈ R^n and b_i ∈ R, and wish to build a model that predicts the value of the variable b from knowledge of the vector a. In such a situation, one often uses a linear model of the form b = a'x, where x is a parameter vector to be determined. Given a particular parameter vector x, the residual, or prediction error, at the ith data point is defined as |b_i − a_i'x|. Given a choice between alternative models, one should choose a model that "explains" the available data as best as possible, i.e., a model that results in small residuals.
One possibility is to minimize the largest residual. This is the problem of minimizing

    max_i |b_i − a_i'x|,

with respect to x, subject to no constraints. Note that we are dealing here with a piecewise linear convex cost criterion. The following is an equivalent linear programming formulation:

minimize    z
subject to  b_i − a_i'x ≤ z,     i = 1, . . . , m,
            −b_i + a_i'x ≤ z,    i = 1, . . . , m,

the decision variables being z and x.
In an alternative formulation, we could adopt the cost criterion

    Σ_{i=1}^m |b_i − a_i'x|.

Since |b_i − a_i'x| is the smallest number z_i that satisfies b_i − a_i'x ≤ z_i and −b_i + a_i'x ≤ z_i, we obtain the formulation

minimize    z_1 + · · · + z_m
subject to  b_i − a_i'x ≤ z_i,     i = 1, . . . , m,
            −b_i + a_i'x ≤ z_i,    i = 1, . . . , m.

In practice, one may wish to use the quadratic cost criterion Σ_{i=1}^m (b_i − a_i'x)^2, in order to obtain a "least squares fit." This is a problem which is easier than linear programming; it can be solved using calculus methods, but its discussion is outside the scope of this book.
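The two piecewise linear fitting criteria can be evaluated directly for a given parameter vector; the three data points below are invented, lying near the line b = 2a_1 + 1 (so a = (a_1, 1) and x = (2, 1)):

```python
# Sketch: the two piecewise linear fitting criteria on a made-up data set.
# The residual of data point (a_i, b_i) under parameters x is |b_i - a_i'x|.

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def residuals(data, x):
    return [abs(b - dot(a, x)) for a, b in data]

def max_residual(data, x):          # criterion: max_i |b_i - a_i'x|
    return max(residuals(data, x))

def sum_residuals(data, x):         # criterion: sum_i |b_i - a_i'x|
    return sum(residuals(data, x))

# Points near the line b = 2*a1 + 1, i.e. x = (2, 1) with a = (a1, 1):
data = [((1.0, 1.0), 3.1), ((2.0, 1.0), 4.9), ((3.0, 1.0), 7.0)]
x = (2.0, 1.0)
print(round(max_residual(data, x), 2), round(sum_residuals(data, x), 2))
```

The two LPs above search over x for the vector minimizing, respectively, `max_residual` and `sum_residuals`.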
Optimal control of linear systems

Consider a dynamical system that evolves according to a model of the form

    x(t + 1) = Ax(t) + Bu(t),
    y(t) = c'x(t).

Here x(t) is the state of the system at time t, y(t) is the system output, assumed scalar, and u(t) is a control vector that we are free to choose, subject to linear constraints of the form Du(t) ≤ d [these might include saturation constraints, i.e., hard bounds on the magnitude of each component of u(t)]. To mention some possible applications, this could be a model of an airplane, an engine, an electrical circuit, a mechanical system, a manufacturing system, or even a model of economic growth. We are also given the initial state x(0). In one possible problem, we are to choose the values of the control variables u(0), . . . , u(T − 1), to drive the state x(T) to a target state, assumed for simplicity to be zero. In addition to zeroing the state, it is often desirable to keep the magnitude of the output small at all intermediate times, and we may wish to minimize

    max_t |y(t)|.

We then obtain the following linear programming problem:

minimize    z
subject to  −z ≤ y(t) ≤ z,               t = 1, . . . , T − 1,
            x(t + 1) = Ax(t) + Bu(t),    t = 0, . . . , T − 1,
            y(t) = c'x(t),               t = 1, . . . , T − 1,
            Du(t) ≤ d,                   t = 0, . . . , T − 1,
            x(T) = 0,
            x(0) given.
Additional linear constraints on the state vectors x(t), or a more general piecewise linear convex cost function of the state and the control, can also be incorporated.
Rocket control

Consider a rocket that travels along a straight path. Let x_t, v_t, and a_t be the position, velocity, and acceleration, respectively, of the rocket at time t. By discretizing time and by taking the time increment to be unity, we obtain an approximate discrete-time model of the form

    x_{t+1} = x_t + v_t,
    v_{t+1} = v_t + a_t.

We assume that the acceleration a_t is under our control, as it is determined by the rocket thrust. In a rough model, the magnitude |a_t| of the acceleration can be assumed to be proportional to the rate of fuel consumption at time t.

Suppose that the rocket is initially at rest at the origin, that is, x_0 = 0 and v_0 = 0. We wish the rocket to take off and "land softly" at unit distance from the origin after T time units, that is, x_T = 1 and v_T = 0. Furthermore, we wish to accomplish this in an economical fashion. One possibility is to minimize the total fuel Σ_t |a_t| spent subject to the preceding constraints. Alternatively, we may wish to minimize the maximum thrust required, which is max_t |a_t|. Under either alternative, the problem can be formulated as a linear programming problem (see the exercises at the end of the chapter).
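The discretized dynamics are simple enough to simulate directly. With T = 2, the control sequence a = (1, −1) — one step of full thrust, one of full braking — happens to satisfy both endpoint conditions; this is only a feasibility check, not the output of either LP:

```python
# Sketch: simulate x_{t+1} = x_t + v_t, v_{t+1} = v_t + a_t from rest at the
# origin, and record the two cost criteria discussed above.

def simulate(accels):
    """Return the position and velocity trajectories under the given accelerations."""
    xs, vs = [0.0], [0.0]           # x_0 = 0, v_0 = 0 (at rest at the origin)
    for a in accels:
        xs.append(xs[-1] + vs[-1])  # x_{t+1} = x_t + v_t
        vs.append(vs[-1] + a)       # v_{t+1} = v_t + a_t
    return xs, vs

accels = [1.0, -1.0]                # T = 2: thrust, then brake
xs, vs = simulate(accels)
total_fuel = sum(abs(a) for a in accels)    # criterion: sum_t |a_t|
max_thrust = max(abs(a) for a in accels)    # criterion: max_t |a_t|
print(xs[-1], vs[-1], total_fuel, max_thrust)   # 1.0 0.0 2.0 1.0
```

Note that x_T = 1 and v_T = 0, so the rocket lands softly at unit distance, as required.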
1.4 Graphical representation and solution

In this section, we consider a few simple examples that provide useful geometric insights into the nature of linear programming problems. Our first example involves the graphical solution of a linear programming problem with two variables.

Example 1.6 Consider the problem

minimize    −x_1 − x_2
subject to  x_1 + 2x_2 ≤ 3
            2x_1 + x_2 ≤ 3
            x_1, x_2 ≥ 0.

The feasible set is the shaded region in Figure 1.3. In order to find an optimal solution, we proceed as follows. For any given scalar z, we consider the set of all points whose cost c'x is equal to z; this is the line described by the equation −x_1 − x_2 = z. Note that this line is perpendicular to the vector c = (−1, −1). Different values of z lead to different lines, all of them parallel to each other. In particular, increasing z corresponds to moving the line z = −x_1 − x_2 along the direction of the vector c. Since we are interested in minimizing z, we would like to move the line as much as possible in the direction of −c, as long as we do not leave the feasible region. The best we can do is z = −2 (see Figure 1.3), and the vector x = (1, 1) is an optimal solution. Note that this is a corner of the feasible set. (The concept of a "corner" will be defined formally in Chapter 2.)

Figure 1.3: Graphical solution of the problem in Example 1.6.
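The graphical conclusion can be double-checked by evaluating the cost at the corners of the feasible set (found by intersecting pairs of constraint boundaries); this confirms the optimum rather than deriving it:

```python
# Sketch: evaluate the cost -x1 - x2 of Example 1.6 at the corners of its
# feasible set; the optimum z = -2 is attained at the corner (1, 1).

corners = [(0.0, 0.0), (1.5, 0.0), (1.0, 1.0), (0.0, 1.5)]

def cost(p):
    x1, x2 = p
    return -x1 - x2

def feasible(p):
    x1, x2 = p
    return x1 + 2 * x2 <= 3 and 2 * x1 + x2 <= 3 and x1 >= 0 and x2 >= 0

assert all(feasible(p) for p in corners)
best = min(corners, key=cost)
print(best, cost(best))   # (1.0, 1.0) -2.0
```

That it suffices to look at corners is exactly the general fact previewed at the end of this section and proved in Chapter 2.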
For a problem in three dimensions, the same approach can be used except that the set of points with the same value of c'x is a plane, instead of a line. This plane is again perpendicular to the vector c, and the objective is to slide this plane as much as possible in the direction of −c, as long as we do not leave the feasible set.

Example 1.7 Suppose that the feasible set is the unit cube, described by the constraints 0 ≤ x_i ≤ 1, i = 1, 2, 3, and that c = (−1, −1, −1). Then, the vector x = (1, 1, 1) is an optimal solution. Once more, the optimal solution happens to be a corner of the feasible set (Figure 1.4).
Figure 1.4: The three-dimensional linear programming problem in Example 1.7.
In both of the preceding examples, the feasible set is bounded (does not extend to infinity), and the problem has a unique optimal solution. This is not always the case and we have some additional possibilities that are illustrated by the example that follows.
Example 1.8 Consider the feasible set in R^2 defined by the constraints

    −x_1 + x_2 ≤ 1
    x_1 ≥ 0
    x_2 ≥ 0,

which is shown in Figure 1.5.

(a) For the cost vector c = (1, 1), it is clear that x = (0, 0) is the unique optimal solution.

(b) For the cost vector c = (1, 0), there are multiple optimal solutions, namely, every vector x of the form x = (0, x_2), with 0 ≤ x_2 ≤ 1, is optimal. Note that the set of optimal solutions is bounded.

(c) For the cost vector c = (0, 1), there are multiple optimal solutions, namely, every vector x of the form x = (x_1, 0), with x_1 ≥ 0, is optimal. In this case, the set of optimal solutions is unbounded (contains vectors of arbitrarily large magnitude).

(d) Consider the cost vector c = (−1, −1). For any feasible solution (x_1, x_2), we can always produce another feasible solution with less cost, by increasing the value of x_1. Therefore, no feasible solution is optimal. Furthermore, by considering vectors (x_1, x_2) with ever increasing values of x_1 and x_2, we can obtain a sequence of feasible solutions whose cost converges to −∞. We therefore say that the optimal cost is −∞.

(e) If we impose an additional constraint of the form x_1 + x_2 ≤ −2, it is evident that no feasible solution exists.
Figure 1.5: The feasible set in Example 1.8. For each choice of c, an optimal solution is obtained by moving as much as possible in the direction of −c.
To summarize the insights obtained from Example 1.8, we have the following possibilities:

(a) There exists a unique optimal solution.

(b) There exist multiple optimal solutions; in this case, the set of optimal solutions can be either bounded or unbounded.

(c) The optimal cost is −∞, and no feasible solution is optimal.

(d) The feasible set is empty.

In principle, there is an additional possibility: an optimal solution does not exist even though the problem is feasible and the optimal cost is not −∞; this is the case, for example, in the problem of minimizing 1/x subject to x > 0 (for every feasible solution, there exists another with less cost, but the optimal cost is not −∞). We will see later in this book that this possibility never arises in linear programming.
In the examples that we have considered, if the problem has at least
one optimal solution, then an optimal solution can be found among the
corners of the feasible set. In Chapter 2, we will show that this is a general
feature of linear programming problems, as long as the feasible set has at
least one corner.
Visualizing standard form problems

We now discuss a method that allows us to visualize standard form problems even if the dimension n of the vector x is greater than three. The reason for wishing to do so is that when n ≤ 3, the feasible set of a standard form problem does not have much variety and does not provide enough insight into the general case. (In contrast, if the feasible set is described by constraints of the form Ax ≥ b, enough variety is obtained even if x has dimension three.)

Suppose that we have a standard form problem, and that the matrix A has dimensions m × n. In particular, the decision vector x is of dimension n and we have m equality constraints. We assume that m ≤ n and that the constraints Ax = b force x to lie on an (n − m)-dimensional set. (Intuitively, each constraint removes one of the "degrees of freedom" of x.) If we "stand" on that (n − m)-dimensional set and ignore the m dimensions orthogonal to it, the feasible set is only constrained by the linear inequality constraints x_i ≥ 0, i = 1, . . . , n. In particular, if n − m = 2, the feasible set can be drawn as a two-dimensional set defined by n linear inequality constraints.
To illustrate this approach, consider the feasible set in R^3 defined by the constraints x_1 + x_2 + x_3 = 1 and x_1, x_2, x_3 ≥ 0 [Figure 1.6(a)], and note that n = 3 and m = 1. If we stand on the plane defined by the constraint x_1 + x_2 + x_3 = 1, then the feasible set has the appearance of a triangle in two-dimensional space. Furthermore, each edge of the triangle corresponds to one of the constraints x_1, x_2, x_3 ≥ 0; see Figure 1.6(b).

Figure 1.6: (a) An n-dimensional view of the feasible set. (b) An (n − m)-dimensional view of the same set.
1.5 Linear algebra background and notation
This section provides a summary of the main notational conventions that
we will be employing. It also contains a brief review of those results from
linear algebra that are used in the sequel.
Set theoretic notation
If S is a set and x is an element of S, we write x ∈ S. A set can be
specified in the form S = {x | x satisfies P}, as the set of all elements
having property P. The cardinality of a finite set S is denoted by |S|. The
union of two sets S and T is denoted by S ∪ T, and their intersection by
S ∩ T. We use S \ T to denote the set of all elements of S that do not belong
to T. The notation S ⊂ T means that S is a subset of T, i.e., every element
of S is also an element of T; in particular, S could be equal to T. If in
addition S ≠ T, we say that S is a proper subset of T. We use ∅ to denote
the empty set. The symbols ∃ and ∀ have the meanings "there exists" and
"for all," respectively.
We use ℜ to denote the set of real numbers. For any real numbers a
and b, we define the closed and open intervals [a, b] and (a, b), respectively,
by

    [a, b] = {x ∈ ℜ | a ≤ x ≤ b},

and

    (a, b) = {x ∈ ℜ | a < x < b}.
Vectors and matrices

A matrix of dimensions m × n is an array of real numbers a_ij.
Matrices will always be denoted by upper case boldface characters. If A
is a matrix, we use the notation a_ij or [A]_ij to refer to its (i, j)th entry.
A row vector is a matrix with m = 1 and a column vector is a matrix
with n = 1. The word vector will always mean column vector unless the
contrary is explicitly stated. Vectors will usually be denoted by lower case
boldface characters. We use the notation ℜ^n to indicate the set of all
n-dimensional vectors. For any vector x ∈ ℜ^n, we use x_1, x_2, ..., x_n to
indicate its components. The more economical notation x = (x_1, x_2, ..., x_n)
will also be used even if we are referring to column vectors. We use 0 to
denote the vector with all components equal to zero. The ith unit vector
e_i is the vector with all components equal to zero except for the ith
component, which is equal to one.

The transpose A' of an m × n matrix A is the n × m matrix whose
entries are given by [A']_ij = [A]_ji. Similarly, if x is a vector in ℜ^n, its
transpose x' is the row vector with the same entries.
If x and y are two vectors in ℜ^n, then

    x'y = y'x = Σ_{i=1}^n x_i y_i.

This quantity is called the inner product of x and y. Two vectors are
called orthogonal if their inner product is zero. Note that x'x ≥ 0 for every
vector x, with equality holding if and only if x = 0. The expression √(x'x)
is the Euclidean norm of x and is denoted by ||x||. The Schwarz inequality
asserts that for any two vectors of the same dimension, we have

    |x'y| ≤ ||x|| · ||y||,

with equality holding if and only if one of the two vectors is a scalar multiple
of the other.
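A minimal sketch of these definitions in code (our own illustration; the names `inner` and `norm` are our choices, not the book's notation):

```python
import math

def inner(x, y):
    """Inner product x'y of two vectors of the same dimension."""
    assert len(x) == len(y)
    return sum(xi * yi for xi, yi in zip(x, y))

def norm(x):
    """Euclidean norm ||x|| = sqrt(x'x)."""
    return math.sqrt(inner(x, x))

x, y = [1.0, 2.0, 3.0], [4.0, 5.0, 6.0]
lhs = abs(inner(x, y))       # |x'y|
rhs = norm(x) * norm(y)      # ||x|| ||y||
# Schwarz inequality: lhs <= rhs, with equality iff x and y are proportional.
```

For x = (1, 2, 3) and y = (4, 5, 6), the inner product is 4 + 10 + 18 = 32, and the inequality is strict since the vectors are not proportional.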
If A is an m × n matrix, we use A_j to denote its jth column, that is,
A_j = (a_{1j}, a_{2j}, ..., a_{mj}). (This is our only exception to the rule of using
lower case characters to represent vectors.) We also use a_i to denote the
vector formed by the entries of the ith row, that is, a_i = (a_{i1}, a_{i2}, ..., a_{in}).
Thus, the columns of A are A_1, ..., A_n, and its rows are a_1', ..., a_m'.
Given two matrices A, B of dimensions m × n and n × k, respectively,
their product AB is a matrix of dimensions m × k whose entries are given
by

    [AB]_ij = Σ_{ℓ=1}^n [A]_{iℓ} [B]_{ℓj} = a_i' B_j,

where a_i' is the ith row of A, and B_j is the jth column of B. Matrix
multiplication is associative, i.e., (AB)C = A(BC), but, in general, it is
not commutative, that is, the equality AB = BA is not always true. We
also have (AB)' = B'A'.
Let A be an m × n matrix with columns A_i. We then have Ae_i = A_i.
Any vector x ∈ ℜ^n can be written in the form x = Σ_{i=1}^n x_i e_i, which leads
to

    Ax = A Σ_{i=1}^n e_i x_i = Σ_{i=1}^n A e_i x_i = Σ_{i=1}^n A_i x_i.

A different representation of the matrix-vector product Ax is provided by
the formula

    Ax = (a_1'x, a_2'x, ..., a_m'x),

where a_1', ..., a_m' are the rows of A.
A matrix is called square if the number m of its rows is equal to the
number n of its columns. We use I to denote the identity matrix, which is
a square matrix whose diagonal entries are equal to one and its off-diagonal
entries are equal to zero. The identity matrix satisfies IA = A and BI = B
for any matrices A, B of dimensions compatible with those of I.

If x is a vector, the notation x ≥ 0 and x > 0 means that every
component of x is nonnegative (respectively, positive). If A is a matrix,
the inequalities A ≥ 0 and A > 0 have a similar meaning.
Matrix inversion
Let A be a square matrix. If there exists a square matrix B of the same
dimensions satisfying AB = BA = I, we say that A is invertible or non-
singular. Such a matrix B, called the inverse of A, is unique and is de-
noted by A^{-1}. We note that (A')^{-1} = (A^{-1})'. Also, if A and B are
invertible matrices of the same dimensions, then AB is also invertible and
(AB)^{-1} = B^{-1}A^{-1}.
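These two identities are easy to check numerically. A quick sketch (our own, assuming NumPy; random matrices with a fixed seed are generically invertible):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
B = rng.standard_normal((4, 4))   # generic random matrices are invertible

# (A')^{-1} = (A^{-1})'
assert np.allclose(np.linalg.inv(A.T), np.linalg.inv(A).T)

# (AB)^{-1} = B^{-1} A^{-1}
assert np.allclose(np.linalg.inv(A @ B),
                   np.linalg.inv(B) @ np.linalg.inv(A))
```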
Given a finite collection of vectors x^1, ..., x^K ∈ ℜ^n, we say that they
are linearly dependent if there exist real numbers λ_1, ..., λ_K, not all of
them zero, such that Σ_{k=1}^K λ_k x^k = 0; otherwise, they are called linearly
independent. An equivalent definition of linear independence requires that
none of the vectors x^1, ..., x^K is a linear combination of the remaining
vectors (Exercise 1.18). We have the following result.
Theorem 1.2 Let A be a square matrix. Then, the following state-
ments are equivalent:
(a) The matrix A is invertible.
(b) The matrix A' is invertible.
(c) The determinant of A is nonzero.
(d) The rows of A are linearly independent.
(e) The columns of A are linearly independent.
(f) For every vector b, the linear system Ax = b has a unique
    solution.
(g) There exists some vector b such that the linear system Ax = b
    has a unique solution.
Assuming that A is an invertible square matrix, an explicit formula
for the solution x = A^{-1}b of the system Ax = b is given by Cramer's
rule. Specifically, the jth component of x is given by

    x_j = det(A^j) / det(A),

where A^j is the same matrix as A, except that its jth column is replaced
by b. Here, as well as later, the notation det(A) is used to denote the
determinant of a square matrix A.
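Cramer's rule translates directly into code. The sketch below is our own illustration (assuming NumPy), not a practical solver, since each determinant itself costs O(n^3):

```python
import numpy as np

def cramer_solve(A, b):
    """Solve Ax = b via Cramer's rule: x_j = det(A^j)/det(A),
    where A^j is A with its jth column replaced by b."""
    n = len(b)
    dA = np.linalg.det(A)
    x = np.empty(n)
    for j in range(n):
        Aj = A.copy()
        Aj[:, j] = b          # replace the jth column by b
        x[j] = np.linalg.det(Aj) / dA
    return x

A = np.array([[2.0, 1.0], [1.0, 3.0]])
b = np.array([3.0, 5.0])
x = cramer_solve(A, b)        # det(A) = 5, so x = (4/5, 7/5)
```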
Subspaces and bases
A nonempty subset S of ℜ^n is called a subspace of ℜ^n if ax + by ∈ S for
every x, y ∈ S and every a, b ∈ ℜ. If, in addition, S ≠ ℜ^n, we say that S is
a proper subspace. Note that every subspace must contain the zero vector.

The span of a finite number of vectors x^1, ..., x^K in ℜ^n is the subspace
of ℜ^n defined as the set of all vectors y of the form y = Σ_{k=1}^K λ_k x^k, where
each λ_k is a real number. Any such vector y is called a linear combination
of x^1, ..., x^K.

Given a subspace S of ℜ^n, with S ≠ {0}, a basis of S is a collection of
vectors that are linearly independent and whose span is equal to S. Every
basis of a given subspace has the same number of vectors and this number
is called the dimension of the subspace. In particular, the dimension of
ℜ^n is equal to n and every proper subspace of ℜ^n has dimension smaller
than n. Note that one-dimensional subspaces are lines through the origin;
two-dimensional subspaces are planes through the origin. Finally, the set
{0} is a subspace and its dimension is defined to be zero.
If S is a proper subspace of ℜ^n, then there exists a nonzero vector a
which is orthogonal to S, that is, a'x = 0 for every x ∈ S. More generally,
if S has dimension m < n, there exist n − m linearly independent vectors
that are orthogonal to S.

The result that follows provides some important facts regarding bases
and linear independence.
Theorem 1.3 Suppose that the span S of the vectors x^1, ..., x^K has
dimension m. Then:
(a) There exists a basis of S consisting of m of the vectors x^1, ..., x^K.
(b) If k ≤ m and x^1, ..., x^k are linearly independent, we can form a
    basis of S by starting with x^1, ..., x^k, and choosing m − k of the
    vectors x^{k+1}, ..., x^K.
Proof. We only prove part (b), because (a) is the special case of part
(b) with k = 0. If every vector x^{k+1}, ..., x^K can be expressed as a linear
combination of x^1, ..., x^k, then every vector in the span of x^1, ..., x^K is
also a linear combination of x^1, ..., x^k, and the latter vectors form a basis.
(In particular, m = k.) Otherwise, at least one of the vectors x^{k+1}, ..., x^K
is linearly independent from x^1, ..., x^k. By picking one such vector, we
now have k + 1 of the vectors x^1, ..., x^K that are linearly independent. By
repeating this process m − k times, we end up with the desired basis of S.
Let A be a matrix of dimensions m × n. The column space of A
is the subspace of ℜ^m spanned by the columns of A. The row space of
A is the subspace of ℜ^n spanned by the rows of A. The dimension of
the column space is always equal to the dimension of the row space, and
this number is called the rank of A. Clearly, rank(A) ≤ min{m, n}. The
matrix A is said to have full rank if rank(A) = min{m, n}. Finally, the set
{x ∈ ℜ^n | Ax = 0} is called the nullspace of A; it is a subspace of ℜ^n and
its dimension is equal to n − rank(A).
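A short numerical illustration of these definitions (our own example, assuming NumPy):

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],     # a multiple of the first row
              [0.0, 1.0, 1.0]])

r = np.linalg.matrix_rank(A)       # rank(A) <= min{m, n}; here rank(A) = 2

# The nullspace has dimension n - rank(A).
n = A.shape[1]
nullity = n - r
```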
Affine subspaces

Let S_0 be a subspace of ℜ^n and let x^0 be some vector. If we add x^0 to
every element of S_0, this amounts to translating S_0 by x^0. The resulting
set S can be defined formally by

    S = S_0 + x^0 = {x + x^0 | x ∈ S_0}.

In general, S is not a subspace, because it does not necessarily contain
the zero vector, and it is called an affine subspace. The dimension of S is
defined to be equal to the dimension of the underlying subspace S_0.
As an example, let x^0, x^1, ..., x^k be some vectors in ℜ^n and consider
the set S of all vectors of the form

    x^0 + λ_1 x^1 + ⋯ + λ_k x^k,

where λ_1, ..., λ_k are arbitrary scalars. For this case, S_0 can be identified
with the span of the vectors x^1, ..., x^k, and S is an affine subspace. If
the vectors x^1, ..., x^k are linearly independent, their span has dimension
k, and the affine subspace S also has dimension k.
For a second example, we are given an m × n matrix A and a vector
b ∈ ℜ^m, and we consider the set

    S = {x ∈ ℜ^n | Ax = b},

which we assume to be nonempty. Let us fix some x^0 such that Ax^0 = b.
An arbitrary vector x belongs to S if and only if Ax = b = Ax^0, or
A(x − x^0) = 0. Hence, x ∈ S if and only if x − x^0 belongs to the subspace
S_0 = {y | Ay = 0}. We conclude that S = {y + x^0 | y ∈ S_0}, and
S is an affine subspace of ℜ^n. If A has m linearly independent rows, its
nullspace S_0 has dimension n − m. Hence, the affine subspace S also has
dimension n − m. Intuitively, if a_i' are the rows of A, each one of the
constraints a_i'x = b_i removes one degree of freedom from x, thus reducing
the dimension from n to n − m; see Figure 1.7 for an illustration.
Figure 1.7. Consider a set S in ℜ^3 defined by a single equality
constraint a'x = b. Let x* be an element of S. The vector a is
perpendicular to S. If x^1 and x^2 are linearly independent vectors
that are orthogonal to a, then every x ∈ S is of the form x =
x* + λ_1 x^1 + λ_2 x^2. In particular, S is a two-dimensional affine
subspace.
1.6 Algorithms and operation counts
Optimization problems such as linear programming and, more generally,
all computational problems are solved by algorithms. Loosely speaking, an
algorithm is a finite set of instructions of the type used in common pro-
gramming languages (arithmetic operations, conditional statements, read
and write statements, etc.). Although the running time of an algorithm
may depend substantially on clever programming or on the computer hard-
ware available, we are interested in comparing algorithms without having
to examine the details of a particular implementation. As a first approx-
imation, this can be accomplished by counting the number of arithmetic
operations (additions, multiplications, divisions, comparisons) required by
an algorithm. This approach is often adequate even though it ignores the
fact that adding or multiplying large integers or high-precision floating
point numbers is more demanding than adding or multiplying single-digit
integers. A more refined approach will be discussed briefly in Chapter 8.
Example 1.9
(a) Let a and b be vectors in ℜ^n. The natural algorithm for computing a'b
    requires n multiplications and n − 1 additions, for a total of 2n − 1 arithmetic
    operations.
(b) Let A and B be matrices of dimensions n × n. The traditional way of
    computing AB forms the inner product of a row of A and a column of B
    to obtain an entry of AB. Since there are n^2 entries to be evaluated, a
    total of (2n − 1)n^2 arithmetic operations are involved.
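The counts in Example 1.9 can be written as simple formulas (an illustrative sketch, not from the text; the function names are our own):

```python
def inner_product_ops(n):
    """Arithmetic operations for a'b with n-vectors:
    n multiplications + (n - 1) additions."""
    return 2 * n - 1

def matrix_product_ops(n):
    """Operations for the n x n product AB: one inner product
    per entry, and n^2 entries to evaluate."""
    return (2 * n - 1) * n ** 2
```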
In Example 1.9, an exact operation count was possible. However,
for more complicated problems and algorithms, an exact count is usually
very difficult. For this reason, we will settle for an estimate of the rate of
growth of the number of arithmetic operations, as a function of the problem
parameters. Thus, in Example 1.9, we might be content to say that the
number of operations in the computation of an inner product increases
linearly with n, and the number of operations in matrix multiplication
increases cubically with n. This leads us to the order of magnitude notation
that we define next.
Definition 1.2 Let f and g be functions that map positive numbers
to positive numbers.
(a) We write f(n) = O(g(n)) if there exist positive numbers n_0 and
    c such that f(n) ≤ cg(n) for all n ≥ n_0.
(b) We write f(n) = Ω(g(n)) if there exist positive numbers n_0 and
    c such that f(n) ≥ cg(n) for all n ≥ n_0.
(c) We write f(n) = Θ(g(n)) if both f(n) = O(g(n)) and f(n) =
    Ω(g(n)) hold.
For example, we have 3n^3 + n^2 + 10 = Θ(n^3), n log n = O(n^2), and
n log n = Ω(n).
While the running time of the algorithms considered in Example 1.9 is
predictable, the running time of more complicated algorithms often depends
on the numerical values of the input data. In such cases, instead of trying
to estimate the running time for each possible choice of the input, it is
customary to estimate the running time for the worst possible input data of a
given "size." For example, if we have an algorithm for linear programming,
we might be interested in estimating its worst-case running time over all
problems with a given number of variables and constraints. This emphasis
on the worst case is somewhat conservative and, in practice, the "average"
running time of an algorithm might be more relevant. However, the average
running time is much more difficult to estimate, or even to define, and for
this reason, the worst-case approach is widely used.
Example 1.10 (Operation count of linear system solvers and matrix
inversion) Consider the problem of solving a system of n linear equations in n
unknowns. The classical method that eliminates one variable at a time (Gaussian
elimination) is known to require O(n^3) arithmetic operations in order to either
compute a solution or to decide that no solution exists. Practical methods for
matrix inversion also require O(n^3) arithmetic operations. These facts will be of
use later on.
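For concreteness, here is a sketch of Gaussian elimination with partial pivoting (our own implementation, assuming NumPy; the text does not prescribe a particular variant). The two nested loops, each touching O(n) entries for each of the O(n) pivots, account for the O(n^3) operation count.

```python
import numpy as np

def gaussian_solve(A, b):
    """Solve Ax = b by Gaussian elimination with partial pivoting:
    roughly (2/3)n^3 operations for elimination, O(n^2) for back
    substitution, O(n^3) overall."""
    A = A.astype(float).copy()
    b = b.astype(float).copy()
    n = len(b)
    for k in range(n - 1):
        p = k + np.argmax(np.abs(A[k:, k]))        # pivot row
        A[[k, p]], b[[k, p]] = A[[p, k]], b[[p, k]]
        for i in range(k + 1, n):                  # eliminate below pivot
            m = A[i, k] / A[k, k]
            A[i, k:] -= m * A[k, k:]
            b[i] -= m * b[k]
    x = np.empty(n)
    for i in range(n - 1, -1, -1):                 # back substitution
        x[i] = (b[i] - A[i, i + 1:] @ x[i + 1:]) / A[i, i]
    return x
```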
Is the O(n^3) running time of Gaussian elimination good or bad? Some
perspective into this question is provided by the following observation: each
time that technological advances lead to computer hardware that is faster
by a factor of 8 (presumably every few years), we can solve problems of twice
the size than earlier possible. A similar argument applies to algorithms
whose running time is O(n^k) for some positive integer k. Such algorithms
are said to run in polynomial time.

Algorithms also exist whose running time is O(2^{cn}), where n is a
parameter representing problem size and c is a constant; these are said to
take at least exponential time. For such algorithms and if c = 1, each time
that computer hardware becomes faster by a factor of 2, we can increase
the value of n that we can handle only by 1. It is then reasonable to expect
that no matter how much technology improves, problems with truly large
values of n will always be difficult to handle.
Example 1.11 Suppose that we have a choice of two algorithms. The running
time of the first is 10^n/100 (exponential) and the running time of the second
is 10n^3 (polynomial). For very small n, e.g., for n = 3, the exponential time
algorithm is preferable. To gain some perspective as to what happens for larger
n, suppose that we have access to a workstation that can execute 10^7 arithmetic
operations per second and that we are willing to let it run for 1000 seconds.
Let us figure out what size problems can each algorithm handle within this time
frame. The equation 10^n/100 = 10^7 × 1000 yields n = 12, whereas the equation
10n^3 = 10^7 × 1000 yields n = 1000, indicating that the polynomial time algorithm
allows us to solve much larger problems.
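The arithmetic in Example 1.11 can be reproduced directly (an illustrative sketch; the variable names are our own):

```python
import math

budget = 1e7 * 1000                    # operations: 10^7 ops/s for 1000 s

# Exponential algorithm: 10^n / 100 = budget  =>  n = log10(100 * budget)
n_exp = math.log10(100 * budget)       # yields n = 12

# Polynomial algorithm: 10 n^3 = budget  =>  n = (budget / 10)^(1/3)
n_poly = (budget / 10) ** (1.0 / 3.0)  # yields n = 1000
```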
The point of view emerging from the above discussion is that, as a first
cut, it is useful to juxtapose polynomial and exponential time algorithms,
the former being viewed as relatively fast and efficient, and the latter as
relatively slow. This point of view is justified in many, but not all,
contexts and we will be returning to it later in this book.
1.7 Exercises
Exercise 1.1 Suppose that a function f : ℜ^n → ℜ is both concave and convex.
Prove that f is an affine function.
Exercise 1.2 Suppose that f_1, ..., f_m are convex functions from ℜ^n into ℜ and
let f(x) = Σ_{i=1}^m f_i(x).
(a) Show that if each f_i is convex, so is f.
(b) Show that if each f_i is piecewise linear and convex, so is f.
Exercise 1.3 Consider the problem of minimizing a cost function of the form
c'x + f(d'x), subject to the linear constraints Ax ≥ b. Here, d is a given
vector and the function f : ℜ → ℜ is as specified in Figure 1.8. Provide a linear
programming formulation of this problem.
Figure 1.8. The function f of Exercise 1.3.
Exercise 1.4 Consider the problem

    minimize   2x_1 + 3|x_2 − 10|
    subject to |x_1 + 2| + |x_2| ≤ 5,

and reformulate it as a linear programming problem.
Exercise 1.5 Consider a linear optimization problem, with absolute values, of
the following form:

    minimize   c'x + d'y
    subject to Ax + By ≤ b
               y_i = |x_i|,   ∀ i.

Assume that all entries of B and d are nonnegative.
(a) Provide two different linear programming formulations, along the lines dis-
    cussed in Section 1.3.
(b) Show that the original problem and the two reformulations are equivalent
    in the sense that either all three are infeasible, or all three have the same
    optimal cost.
(c) Provide an example to show that if B has negative entries, the problem
    may have a local minimum that is not a global minimum. (It will be seen
    in Chapter 2 that this is never the case in linear programming problems.
    Hence, in the presence of such negative entries, a linear programming re-
    formulation is implausible.)
Exercise 1.6 Provide linear programming formulations of the two variants of
the rocket control problem discussed at the end of Section 1.3.
Exercise 1.7 (The moment problem) Suppose that Z is a random variable
taking values in the set {0, 1, ..., K}, with probabilities p_0, p_1, ..., p_K, respectively.
We are given the values of the first two moments E[Z] = Σ_{k=0}^K k p_k and
E[Z^2] = Σ_{k=0}^K k^2 p_k of Z and we would like to obtain upper and lower bounds
on the value of the fourth moment E[Z^4] = Σ_{k=0}^K k^4 p_k of Z. Show how linear
programming can be used to approach this problem.
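One possible formulation, sketched in code (our own, assuming SciPy's `linprog` is available; skip it if you want to solve the exercise yourself): treat the probabilities p_0, ..., p_K as decision variables, impose the known moments as linear equality constraints, and minimize or maximize the linear objective Σ k^4 p_k.

```python
import numpy as np
from scipy.optimize import linprog

def fourth_moment_bounds(K, m1, m2):
    """Bounds on E[Z^4] given E[Z] = m1 and E[Z^2] = m2, for Z
    taking values in {0, ..., K}. Variables: p_0, ..., p_K >= 0."""
    k = np.arange(K + 1)
    A_eq = np.vstack([np.ones(K + 1), k, k ** 2])  # probabilities, moments
    b_eq = np.array([1.0, m1, m2])
    c = k.astype(float) ** 4                       # fourth-moment objective
    lo = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=(0, None))
    hi = linprog(-c, A_eq=A_eq, b_eq=b_eq, bounds=(0, None))
    return lo.fun, -hi.fun
```

For instance, with K = 3 and the moments of the uniform distribution on {0, 1, 2} (m1 = 1, m2 = 5/3), the interval returned must contain the true fourth moment 17/3.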
Exercise 1.8 (Road lighting) Consider a road divided into n segments that is
illuminated by m lamps. Let p_j be the power of the jth lamp. The illumination I_i
of the ith segment is assumed to be Σ_{j=1}^m a_ij p_j, where a_ij are known coefficients.
Let I_i* be the desired illumination of road i.

We are interested in choosing the lamp powers p_j so that the illuminations
I_i are close to the desired illuminations I_i*. Provide a reasonable linear program-
ming formulation of this problem. Note that the wording of the problem is loose
and there is more than one possible formulation.
Exercise 1.9 Consider a school district with I neighborhoods, J schools, and
G grades at each school. Each school j has a capacity of C_jg for grade g. In each
neighborhood i, the student population of grade g is S_ig. Finally, the distance
of school j from neighborhood i is d_ij. Formulate a linear programming problem
whose objective is to assign all students to schools, while minimizing the total
distance traveled by all students. (You may ignore the fact that numbers of
students must be integer.)
Exercise 1.10 (Production and inventory planning) A company must de-
liver d_i units of its product at the end of the ith month. Material produced during
a month can be delivered either at the end of the same month or can be stored
as inventory and delivered at the end of a subsequent month; however, there is
a storage cost of c_1 dollars per month for each unit of product held in inventory.
The year begins with zero inventory. If the company produces x_i units in month
i and x_{i+1} units in month i + 1, it incurs a cost of c_2|x_{i+1} − x_i| dollars, reflecting
the cost of switching to a new production level. Formulate a linear programming
problem whose objective is to minimize the total cost of the production and in-
ventory schedule over a period of twelve months. Assume that inventory left at
the end of the year has no value and does not incur any storage costs.
Exercise 1.11 (Optimal currency conversion) Suppose that there are N
available currencies, and assume that one unit of currency i can be exchanged for
r_ij units of currency j. (Naturally, we assume that r_ij > 0.) There also exist certain
regulations that impose a limit u_i on the total amount of currency i that can be
exchanged on any given day. Suppose that we start with B units of currency 1 and
that we would like to maximize the number of units of currency N that we end up
with at the end of the day, through a sequence of currency transactions. Provide
a linear programming formulation of this problem. Assume that for any sequence
i_1, ..., i_k of currencies, we have r_{i_1 i_2} r_{i_2 i_3} ⋯ r_{i_{k−1} i_k} r_{i_k i_1} ≤ 1, which means that
wealth cannot be multiplied by going through a cycle of currencies.
Exercise 1.12 (Chebychev center) Consider a set P described by linear
inequality constraints, that is, P = {x ∈ ℜ^n | a_i'x ≤ b_i, i = 1, ..., m}. A ball
with center y and radius r is defined as the set of all points within (Euclidean)
distance r from y. We are interested in finding a ball with the largest possible
radius, which is entirely contained within the set P. (The center of such a ball is
called the Chebychev center of P.) Provide a linear programming formulation of
this problem.
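One well-known formulation, sketched below (our own, assuming SciPy; it anticipates the intended answer, so skip it if you want to work the exercise yourself): the ball {x : ||x − y|| ≤ r} is contained in the halfspace a_i'x ≤ b_i if and only if a_i'y + ||a_i|| r ≤ b_i, which is linear in the variables (y, r).

```python
import numpy as np
from scipy.optimize import linprog

def chebyshev_center(A, b):
    """Largest ball {x : ||x - y|| <= r} inside P = {x : Ax <= b}:
    maximize r subject to a_i'y + ||a_i|| r <= b_i for every row i."""
    norms = np.linalg.norm(A, axis=1)
    A_ub = np.hstack([A, norms[:, None]])     # variables are (y, r)
    c = np.zeros(A.shape[1] + 1)
    c[-1] = -1.0                              # linprog minimizes, so -r
    res = linprog(c, A_ub=A_ub, b_ub=b,
                  bounds=[(None, None)] * A.shape[1] + [(0, None)])
    return res.x[:-1], res.x[-1]
```

For the unit square {0 ≤ x_1 ≤ 1, 0 ≤ x_2 ≤ 1}, this returns the center (1/2, 1/2) and radius 1/2.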
Exercise 1.13 (Linear fractional programming) Consider the problem

    minimize   (c'x + d)/(f'x + g)
    subject to Ax ≤ b
               f'x + g > 0.

Suppose that we have some prior knowledge that the optimal cost belongs to an
interval [K, L]. Provide a procedure, that uses linear programming as a subrou-
tine, and that allows us to compute the optimal cost within any desired accuracy.
Hint: Consider the problem of deciding whether the optimal cost is less than or
equal to a certain number.
Exercise 1.14 A company produces and sells two different products. The de-
mand for each product is unlimited, but the company is constrained by cash
availability and machine capacity.

Each unit of the first and second product requires 3 and 4 machine hours,
respectively. There are 20,000 machine hours available in the current production
period. The production costs are $3 and $2 per unit of the first and second
product, respectively. The selling prices of the first and second product are $6
and $5.40 per unit, respectively. The available cash is $4,000; furthermore, 45%
of the sales revenues from the first product and 30% of the sales revenues from the
second product will be made available to finance operations during the current
period.
(a) Formulate a linear programming problem that aims at maximizing net in-
    come subject to the cash availability and machine capacity limitations.
(b) Solve the problem graphically to obtain an optimal solution.
(c) Suppose that the company could increase its available machine hours by
    2,000, after spending $400 for certain repairs. Should the investment be
    made?
Exercise 1.15 A company produces two kinds of products. A product of the
first type requires 1/4 hours of assembly labor, 1/8 hours of testing, and $1.2
worth of raw materials. A product of the second type requires 1/3 hours of
assembly, 1/3 hours of testing, and $0.9 worth of raw materials. Given the current
personnel of the company, there can be at most 90 hours of assembly labor and
80 hours of testing, each day. Products of the first and second type have a market
value of $9 and $8, respectively.
(a) Formulate a linear programming problem that can be used to maximize the
    daily profit of the company.
(b) Consider the following two modifications to the original problem:
    (i) Suppose that up to 50 hours of overtime assembly labor can be sched-
        uled, at a cost of $7 per hour.
    (ii) Suppose that the raw material supplier provides a 10% discount if
        the daily bill is above $300.
    Which of the above two elements can be easily incorporated into the lin-
    ear programming formulation and how? If one or both are not easy to
    incorporate, indicate how you might nevertheless solve the problem.
Exercise 1.16 A manager of an oil refinery has 8 million barrels of crude oil A
and 5 million barrels of crude oil B allocated for production during the coming
month. These resources can be used to make either gasoline, which sells for $38
per barrel, or home heating oil, which sells for $33 per barrel. There are three
production processes with the following characteristics:

                        Process 1   Process 2   Process 3
    Input crude A           3           1           5
    Input crude B           5           1           3
    Output gasoline         4           1           3
    Output heating oil      3           1           4
    Cost                   $51         $11         $40

All quantities are in barrels. For example, with the first process, 3 barrels of
crude A and 5 barrels of crude B are used to produce 4 barrels of gasoline and
3 barrels of heating oil. The costs in this table refer to variable and allocated
overhead costs, and there are no separate cost items for the cost of the crudes.
Formulate a linear programming problem that would help the manager maximize
net revenue over the next month.
Exercise 1.17 (Investment under taxation) An investor has a portfolio of
n different stocks. He has bought s_i shares of stock i at price p_i, i = 1, ..., n.
The current price of one share of stock i is q_i. The investor expects that the price
of one share of stock i in one year will be r_i. If he sells shares, the investor pays
transaction costs at the rate of 1% of the amount transacted. In addition, the
investor pays taxes at the rate of 30% on capital gains. For example, suppose that
the investor sells 1,000 shares of a stock at $50 per share. He has bought these
shares at $30 per share. He receives $50,000. However, he owes 0.30 × (50,000 −
30,000) = $6,000 on capital gain taxes and 0.01 × (50,000) = $500 on transaction
costs. So, by selling 1,000 shares of this stock he nets 50,000 − 6,000 − 500 =
$43,500. Formulate the problem of selecting how many shares the investor needs
to sell in order to raise an amount of money K, net of capital gains and transaction
costs, while maximizing the expected value of his portfolio next year.
Exercise 1.18 Show that the vectors in a given finite collection are linearly
independent if and only if none of the vectors can be expressed as a linear
combination of the others.
Exercise 1.19 Suppose that we are given a set of vectors in ℜ^n that form a
basis, and let y be an arbitrary vector in ℜ^n. We wish to express y as a linear
combination of the basis vectors. How can this be accomplished?
Exercise 1.20
(a) Let S = {Ax | x ∈ ℜ^n}, where A is a given m × n matrix. Show that S is a
    subspace of ℜ^m.
(b) Assume that S is a proper subspace of ℜ^m. Show that there exists a matrix
    B such that S = {y ∈ ℜ^m | By = 0}. Hint: Use vectors that are orthogonal
    to S to form the matrix B.
(c) Suppose that V is an m-dimensional affine subspace of ℜ^n, with m <
    n. Show that there exist linearly independent vectors a_1, ..., a_{n−m}, and
    scalars b_1, ..., b_{n−m}, such that

        V = {y | a_i'y = b_i, i = 1, ..., n − m}.
1.8 History, notes, and sources
The word "programming" has been used traditionally by planners to de-
scribe the process of operations planning and resource allocation. In the
1940s, it was realized that this process could often be aided by solving op-
timization problems involving linear constraints and linear objectives. The
term "linear programming" then emerged. The initial impetus came in the
aftermath of World War II, within the context of military planning prob-
lems. In 1947, Dantzig proposed an algorithm, the simplex method, which
made the solution of linear programming problems practical. There fol-
lowed a period of intense activity during which many important problems in
transportation, economics, military operations, scheduling, etc., were cast
in this framework. Since then, computer technology has advanced rapidly,
the range of applications has expanded, new powerful methods have been
discovered, and the underlying mathematical understanding has become
deeper and more comprehensive. Today, linear programming is a routinely
used tool that can be found in some spreadsheet software packages.

Dantzig's development of the simplex method has been a defining
moment in the history of the field, because it came at a time of grow-
ing practical needs and of advances in computing technology. But, as is
the case with most "scientific revolutions," the history of the field is much
richer. Early work goes back to Fourier, who in 1824 developed an algo-
rithm for solving systems of linear inequalities. Fourier's method is far less
efficient than the simplex method, but this issue was not relevant at the
time. In 1910, de la Vallee Poussin developed a method, similar to the sim-
plex method, for minimizing max_i |b_i − a_i'x|, a problem that we discussed
in Section 1.3.
In the late 1930s, the Soviet mathematician Kantorovich became in-
terested in problems of optimal resource allocation in a centrally planned
economy, for which he gave linear programming formulations. He also pro-
vided a solution method, but his work did not become widely known at the
time. Around the same time, several models arising in classical, Walrasian,
economics were studied and refined, and led to formulations closely related
to linear programming. Koopmans, an economist, played an important role
and eventually (in 1975) shared the Nobel Prize in economic science with
Kantorovich.
On the theoretical front, the mathematical structures that under-
lie linear programming were independently studied, in the period 1870-
1930, by many prominent mathematicians, such as Farkas, Minkowski,
Caratheodory, and others. Also, in 1928, von Neumann developed an im-
portant result in game theory that would later prove to have strong con-
nections with the deeper structure of linear programming.
Subsequent to Dantzig' s work, there has been much and important
research in areas such as large scale optimization, network optimization,
interior point methods, integer programming, and complexity theory. We
defer the discussion of this research to the notes and sources sections of later
chapters. For a more detailed account of the history of linear programming,
the reader is referred to Schrijver ( 1986) , Orden ( 1993) , and the volume
edited by Lenstra, Rinnooy Kan, and Schrijver ( 1991) (see especially the
article by Dantzig in that volume) .
There are several texts that cover the general subject of linear pro-
gramming, starting with a comprehensive one by Dantzig (1963). Some
more recent texts are Papadimitriou and Steiglitz (1982), Chvatal (1983),
Murty (1983), Luenberger (1984), Bazaraa, Jarvis, and Sherali (1990). Fi-
nally, Schrijver (1986) is a comprehensive, but more advanced reference on
the subject.
¡. ¡. The formulation of the diet problem is due t o Stigler ( 1945) .
1.2. The case study on DEC's production planning was developed by Freund
and Shannahan (1992). Methods for dealing with the nurse
scheduling and other cyclic problems are studied by Bartholdi, Orlin,
and Ratliff (1980). More information on pattern classification can be
found in Duda and Hart (1973), or Haykin (1994).
1.3. A deep and comprehensive treatment of convex functions and their
properties is provided by Rockafellar (1970). Linear programming
arises in control problems, in ways that are more sophisticated than
what is described here; see, e.g., Dahleh and Diaz-Bobillo (1995).
1.5. For an introduction to linear algebra, see Strang (1988).
1.6. For a more detailed treatment of algorithms and their computational
requirements, see Lewis and Papadimitriou (1981), Papadimitriou
and Steiglitz (1982), or Cormen, Leiserson, and Rivest (1990).
1.7. Exercise 1.8 is adapted from Boyd and Vandenberghe (1995). Exercises
1.9 and 1.14 are adapted from Bradley, Hax, and Magnanti (1977).
Exercise 1.11 is adapted from Ahuja, Magnanti, and Orlin (1993).
Chapter 2

The geometry of linear programming
Contents
2.1. Polyhedra and convex sets
2.2. Extreme points, vertices, and basic feasible solutions
2.3. Polyhedra in standard form
2.4. Degeneracy
2.5. Existence of extreme points
2.6. Optimality of extreme points
2.7. Representation of bounded polyhedra*
2.8. Projections of polyhedra: Fourier-Motzkin elimination*
2.9. Summary
2.10. Exercises
2.11. Notes and sources
In this chapter, we define a polyhedron as a set described by a finite number
of linear equality and inequality constraints. In particular, the feasible set
in a linear programming problem is a polyhedron. We study the basic
geometric properties of polyhedra in some detail, with emphasis on their
"corner points" (vertices). As it turns out, common geometric intuition
derived from the familiar three-dimensional polyhedra is essentially correct
when applied to higher-dimensional polyhedra. Another interesting aspect
of the development in this chapter is that certain concepts (e.g., the concept
of a vertex) can be defined either geometrically or algebraically. While the
geometric view may be more natural, the algebraic approach is essential for
carrying out computations. Much of the richness of the subject lies in the
interplay between the geometric and the algebraic points of view.
Our development starts with a characterization of the corner points
of feasible sets in the general form {x | Ax ≥ b}. Later on, we focus on the
case where the feasible set is in the standard form {x | Ax = b, x ≥ 0},
and we derive a simple algebraic characterization of the corner points. The
latter characterization will play a central role in the development of the
simplex method in Chapter 3.
The main results of this chapter state that a nonempty polyhedron has
at least one corner point if and only if it does not contain a line, and if this
is the case, the search for optimal solutions to linear programming problems
can be restricted to corner points. These results are proved for the most
general case of linear programming problems using geometric arguments.
The same results will also be proved in the next chapter, for the case of
standard form problems, as a corollary of our development of the simplex
method. Thus, the reader who wishes to focus on standard form problems
may skip the proofs in Sections 2.5 and 2.6. Finally, Sections 2.7 and 2.8 can
also be skipped during a first reading; any results in these sections that are
needed later on will be rederived in Chapter 4, using different techniques.
2. 1 Polyhedra and convex sets
In this section, we introduce some important concepts that will be used
to study the geometry of linear programming, including a discussion of
convexity.
Hyperplanes, halfspaces, and polyhedra
We start with the formal definition of a polyhedron.
Definition 2.1 A polyhedron is a set that can be described in the
form {x ∈ R^n | Ax ≥ b}, where A is an m × n matrix and b is a
vector in R^m.
As discussed in Section 1.1, the feasible set of any linear programming
problem can be described by inequality constraints of the form Ax ≥ b,
and is therefore a polyhedron. In particular, a set of the form {x ∈ R^n |
Ax = b, x ≥ 0} is also a polyhedron and will be referred to as a polyhedron
in standard form.
A polyhedron can either "extend to infinity," or can be confined in a
finite region. The definition that follows refers to this distinction.
Definition 2.2 A set S ⊂ R^n is bounded if there exists a constant
K such that the absolute value of every component of every element
of S is less than or equal to K.
The next definition deals with polyhedra determined by a single linear
constraint.
Definition 2.3 Let a be a nonzero vector in R^n and let b be a scalar.
(a) The set {x ∈ R^n | a'x = b} is called a hyperplane.
(b) The set {x ∈ R^n | a'x ≥ b} is called a halfspace.
Note that a hyperplane is the boundary of a corresponding halfspace.
In addition, the vector a in the definition of the hyperplane is perpendicular
to the hyperplane itself. [To see this, note that if x and y belong to the
same hyperplane, then a'x = a'y. Hence, a'(x - y) = 0 and therefore a
is orthogonal to any direction vector confined to the hyperplane.] Finally,
note that a polyhedron is equal to the intersection of a finite number of
halfspaces; see Figure 2.1.
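Since a polyhedron is exactly the intersection of its halfspaces, membership can be tested by checking Ax ≥ b row by row. The following sketch (not from the text; the numbers and the function name are made up for illustration, and NumPy is assumed) encodes the triangle x_1 ≥ 0, x_2 ≥ 0, x_1 + x_2 ≤ 2 in the form Ax ≥ b:

```python
import numpy as np

# Each row a_i' x >= b_i is one halfspace; note x1 + x2 <= 2 is
# rewritten as -x1 - x2 >= -2 to fit the Ax >= b convention.
A = np.array([[ 1.0,  0.0],
              [ 0.0,  1.0],
              [-1.0, -1.0]])
b = np.array([0.0, 0.0, -2.0])

def in_polyhedron(x, A, b, tol=1e-9):
    """x belongs to {x | Ax >= b} iff every halfspace constraint holds."""
    return bool(np.all(A @ x >= b - tol))

print(in_polyhedron(np.array([0.5, 0.5]), A, b))  # True: inside the triangle
print(in_polyhedron(np.array([3.0, 0.0]), A, b))  # False: violates x1 + x2 <= 2
```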
Convex Sets
We now define the important notion of a convex set.
Definition 2.4 A set S ⊂ R^n is convex if for any x, y ∈ S, and any
λ ∈ [0, 1], we have λx + (1 - λ)y ∈ S.
Note that if λ ∈ [0, 1], then λx + (1 - λ)y is a weighted average of
the vectors x, y, and therefore belongs to the line segment joining x and
y. Thus, a set is convex if the segment joining any two of its elements is
contained in the set; see Figure 2.2.
Our next definition refers to weighted averages of a finite number of
vectors; see Figure 2.3.
Figure 2.1. (a) A hyperplane and two halfspaces. (b) The polyhedron
{x | a_i'x ≥ b_i, i = 1, ..., 5} is the intersection of five halfspaces.
Note that each vector a_i is perpendicular to the hyperplane
{x | a_i'x = b_i}.
Definition 2.5 Let x_1, ..., x_k be vectors in R^n and let λ_1, ..., λ_k be
nonnegative scalars whose sum is unity.
(a) The vector Σ_{i=1}^{k} λ_i x_i is said to be a convex combination of
the vectors x_1, ..., x_k.
(b) The convex hull of the vectors x_1, ..., x_k is the set of all convex
combinations of these vectors.
The result that follows establishes some important facts related to
convexity.
Theorem 2.1
(a) The intersection of convex sets is convex.
(b) Every polyhedron is a convex set.
(c) A convex combination of a finite number of elements of a convex
set also belongs to that set.
(d) The convex hull of a finite number of vectors is a convex set.
Figure 2.2. The set S is convex, but the set Q is not, because
the segment joining x and y is not contained in Q.
Figure 2.3. The convex hull of seven points in R^2.
Proof.
(a) Let S_i, i ∈ I, be convex sets, where I is some index set, and suppose
that x and y belong to the intersection ∩_{i∈I} S_i. Let λ ∈ [0, 1]. Since
each S_i is convex and contains x and y, we have λx + (1 - λ)y ∈ S_i, which
proves that λx + (1 - λ)y also belongs to the intersection of the sets
S_i. Therefore, ∩_{i∈I} S_i is convex.
(b) Let a be a vector and let b be a scalar. Suppose that x and y satisfy
a'x ≥ b and a'y ≥ b, respectively, and therefore belong to the same
halfspace. Let λ ∈ [0, 1]. Then, a'(λx + (1 - λ)y) ≥ λb + (1 - λ)b = b,
which proves that λx + (1 - λ)y also belongs to the same halfspace.
Therefore a halfspace is convex. Since a polyhedron is the intersection
of a finite number of halfspaces, the result follows from part (a).
(c) A convex combination of two elements of a convex set lies in that
set, by the definition of convexity. Let us assume, as an induction
hypothesis, that a convex combination of k elements of a convex set
belongs to that set. Consider k + 1 elements x_1, ..., x_{k+1} of a convex
set S and let λ_1, ..., λ_{k+1} be nonnegative scalars that sum to 1. We
assume, without loss of generality, that λ_{k+1} ≠ 1. We then have

Σ_{i=1}^{k+1} λ_i x_i = λ_{k+1} x_{k+1} + (1 - λ_{k+1}) Σ_{i=1}^{k} (λ_i / (1 - λ_{k+1})) x_i.    (2.1)

The coefficients λ_i/(1 - λ_{k+1}), i = 1, ..., k, are nonnegative and sum
to unity; using the induction hypothesis, Σ_{i=1}^{k} (λ_i/(1 - λ_{k+1})) x_i ∈ S.
Then, the fact that S is convex and Eq. (2.1) imply that
Σ_{i=1}^{k+1} λ_i x_i ∈ S, and the induction step is complete.
(d) Let S be the convex hull of the vectors x_1, ..., x_k and let y =
Σ_{i=1}^{k} ξ_i x_i, z = Σ_{i=1}^{k} θ_i x_i be two elements of S, where ξ_i ≥ 0, θ_i ≥ 0,
and Σ_{i=1}^{k} ξ_i = Σ_{i=1}^{k} θ_i = 1. Let λ ∈ [0, 1]. Then,

λy + (1 - λ)z = λ Σ_{i=1}^{k} ξ_i x_i + (1 - λ) Σ_{i=1}^{k} θ_i x_i = Σ_{i=1}^{k} (λξ_i + (1 - λ)θ_i) x_i.

We note that the coefficients λξ_i + (1 - λ)θ_i, i = 1, ..., k, are nonnegative
and sum to unity. This shows that λy + (1 - λ)z is a convex
combination of x_1, ..., x_k and, therefore, belongs to S. This establishes
the convexity of S.
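A convex combination as in Definition 2.5 is just a weighted average with nonnegative weights summing to one, which is easy to compute directly. A NumPy sketch (the function name and the numbers are illustrative, not from the text):

```python
import numpy as np

def convex_combination(points, weights):
    """Return sum_i w_i * p_i for nonnegative weights summing to one."""
    w = np.asarray(weights, dtype=float)
    assert np.all(w >= 0) and abs(w.sum() - 1.0) < 1e-9
    return np.asarray(points, dtype=float).T @ w

# Three points in R^2; the result lies in their convex hull (a triangle).
pts = [np.array([0.0, 0.0]), np.array([2.0, 0.0]), np.array([0.0, 2.0])]
y = convex_combination(pts, [0.25, 0.25, 0.5])
print(y)  # [0.5 1. ]
```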
2. 2 Extreme points, vertices, and basic
feasible solutions
We observed in Section 1.1 that an optimal solution to a linear programming
problem tends to occur at a "corner" of the polyhedron over which we are
optimizing. In this section, we suggest three different ways of defining the
concept of a "corner" and then show that all three definitions are equivalent.
Our first definition defines an extreme point of a polyhedron as a point
that cannot be expressed as a convex combination of two other elements of
the polyhedron, and is illustrated in Figure 2.4. Notice that this definition
is entirely geometric and does not refer to a specific representation of a
polyhedron in terms of linear constraints.
Definition 2.6 Let P be a polyhedron. A vector x ∈ P is an extreme
point of P if we cannot find two vectors y, z ∈ P, both different
from x, and a scalar λ ∈ [0, 1], such that x = λy + (1 - λ)z.
Figure 2.4. The vector w is not an extreme point because it is a
convex combination of u and v. The vector x is an extreme point:
if x = λy + (1 - λ)z and λ ∈ [0, 1], then either y ∉ P, or z ∉ P, or
x = y, or x = z.
An alternative geometric definition defines a vertex of a polyhedron
P as the unique optimal solution to some linear programming problem with
feasible set P.

Definition 2.7 Let P be a polyhedron. A vector x ∈ P is a vertex
of P if there exists some c such that c'x < c'y for all y satisfying
y ∈ P and y ≠ x.

In other words, x is a vertex of P if and only if P is on one side of
a hyperplane (the hyperplane {y | c'y = c'x}) which meets P only at the
point x; see Figure 2.5.
The two geometric definitions that we have given so far are not easy
to work with from an algorithmic point of view. We would like to have a
definition that relies on a representation of a polyhedron in terms of linear
constraints and which reduces to an algebraic test. In order to provide such
a definition, we need some more terminology.
Consider a polyhedron P ⊂ R^n defined in terms of the linear equality
and inequality constraints

a_i'x ≥ b_i,  i ∈ M_1,
a_i'x ≤ b_i,  i ∈ M_2,
a_i'x = b_i,  i ∈ M_3,

where M_1, M_2, and M_3 are finite index sets, each a_i is a vector in R^n, and
Figure 2.5. The line at the bottom touches P at a single point
and x is a vertex. On the other hand, w is not a vertex because
there is no hyperplane that meets P only at w.
each b_i is a scalar. The definition that follows is illustrated in Figure 2.6.
Definition 2.8 If a vector x* satisfies a_i'x* = b_i for some i in M_1, M_2,
or M_3, we say that the corresponding constraint is active or binding
at x*.
If there are n constraints that are active at a vector x*, then x* satisfies
a certain system of n linear equations in n unknowns. This system has a
unique solution if and only if these n equations are "linearly independent."
The result that follows gives a precise meaning to this statement, together
with a slight generalization.
Theorem 2.2 Let x* be an element of R^n and let I = {i | a_i'x* = b_i}
be the set of indices of constraints that are active at x*. Then, the
following are equivalent:
(a) There exist n vectors in the set {a_i | i ∈ I}, which are linearly
independent.
(b) The span of the vectors a_i, i ∈ I, is all of R^n, that is, every
element of R^n can be expressed as a linear combination of the
vectors a_i, i ∈ I.
(c) The system of equations a_i'x = b_i, i ∈ I, has a unique solution.
Proof. Suppose that the vectors a_i, i ∈ I, span R^n. Then, the span of
these vectors has dimension n. By Theorem 1.3(a) in Section 1.5, n of
Figure 2.6. Let P = {(x_1, x_2, x_3) | x_1 + x_2 + x_3 = 1, x_1, x_2, x_3 ≥
0}. There are three constraints that are active at each one of the
points A, B, C, and D. There are only two constraints that are
active at point E, namely x_1 + x_2 + x_3 = 1 and x_2 = 0.
these vectors form a basis of R^n and are therefore linearly independent.
Conversely, suppose that n of the vectors a_i, i ∈ I, are linearly independent.
Then, the subspace spanned by these n vectors is n-dimensional and must
be equal to R^n. Hence, every element of R^n is a linear combination of the
vectors a_i, i ∈ I. This establishes the equivalence of (a) and (b).
If the system of equations a_i'x = b_i, i ∈ I, has multiple solutions, say
x^1 and x^2, then the nonzero vector d = x^1 - x^2 satisfies a_i'd = 0 for all
i ∈ I. Since d is orthogonal to every vector a_i, i ∈ I, d is not a linear
combination of these vectors and it follows that the vectors a_i, i ∈ I, do
not span R^n. Conversely, if the vectors a_i, i ∈ I, do not span R^n, choose
a nonzero vector d which is orthogonal to the subspace spanned by these
vectors. If x satisfies a_i'x = b_i for all i ∈ I, we also have a_i'(x + d) = b_i for
all i ∈ I, thus obtaining multiple solutions. We have therefore established
that (b) and (c) are equivalent.
With a slight abuse of language, we will often say that certain constraints
are linearly independent, meaning that the corresponding vectors
a_i are linearly independent. With this terminology, statement (a) in Theorem
2.2 requires that there exist n linearly independent constraints that
are active at x*.
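The criterion of Theorem 2.2 reduces to a rank computation: collect the rows a_i' of the constraints active at a point and check whether their rank equals n. A NumPy sketch (illustrative, not from the text), using the polyhedron of Figure 2.6 with the equality x_1 + x_2 + x_3 = 1 listed as row 0:

```python
import numpy as np

def active_rows(A, b, x, tol=1e-9):
    """Indices i with a_i' x = b_i, i.e. constraints active at x."""
    return np.where(np.abs(A @ x - b) < tol)[0]

def n_independent_active(A, b, x):
    """True iff the active constraints at x contain n linearly
    independent ones, i.e. the active rows have rank n (Theorem 2.2)."""
    n = A.shape[1]
    return bool(np.linalg.matrix_rank(A[active_rows(A, b, x)]) == n)

A = np.array([[1.0, 1.0, 1.0],   # x1 + x2 + x3 = 1
              [1.0, 0.0, 0.0],   # x1 >= 0
              [0.0, 1.0, 0.0],   # x2 >= 0
              [0.0, 0.0, 1.0]])  # x3 >= 0
b = np.array([1.0, 0.0, 0.0, 0.0])
print(n_independent_active(A, b, np.array([1.0, 0.0, 0.0])))  # True: a corner
print(n_independent_active(A, b, np.array([0.5, 0.5, 0.0])))  # False: only 2 active
```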
We are now ready to provide an algebraic definition of a corner point,
as a feasible solution at which there are n linearly independent active constraints.
Note that since we are interested in a feasible solution, all equality
constraints must be active. This suggests the following way of looking for
corner points: first impose the equality constraints and then require that
enough additional constraints be active, so that we get a total of n linearly
independent active constraints. Once we have n linearly independent active
constraints, a unique vector x* is determined (Theorem 2.2). However, this
procedure has no guarantee of leading to a feasible vector x*, because some
of the inactive constraints could be violated; in the latter case we say that
we have a basic (but not basic feasible) solution.
Definition 2.9 Consider a polyhedron P defined by linear equality
and inequality constraints, and let x* be an element of R^n.
(a) The vector x* is a basic solution if:
(i) All equality constraints are active;
(ii) Out of the constraints that are active at x*, there are n of
them that are linearly independent.
(b) If x* is a basic solution that satisfies all of the constraints, we
say that it is a basic feasible solution.
In reference to Figure 2.6, we note that points A, B, and C are
basic feasible solutions. Point D is not a basic solution because it fails to
satisfy the equality constraint. Point E is feasible, but not basic. If the
equality constraint x_1 + x_2 + x_3 = 1 were to be replaced by the constraints
x_1 + x_2 + x_3 ≤ 1 and x_1 + x_2 + x_3 ≥ 1, then D would be a basic solution,
according to our definition. This shows that whether a point is a basic
solution or not may depend on the way that a polyhedron is represented.
Definition 2.9 is also illustrated in Figure 2.7.
Note that if the number m of constraints used to define a polyhedron
P ⊂ R^n is less than n, the number of active constraints at any given point
must also be less than n, and there are no basic or basic feasible solutions.
We have given so far three different definitions that are meant to capture
the same concept; two of them are geometric (extreme point, vertex)
and the third is algebraic (basic feasible solution). Fortunately, all three
definitions are equivalent as we prove next and, for this reason, the three
terms can be used interchangeably.
Theorem 2.3 Let P be a nonempty polyhedron and let x* ∈ P.
Then, the following are equivalent:
(a) x* is a vertex;
(b) x* is an extreme point;
(c) x* is a basic feasible solution.
Figure 2.7. The points A, B, C, D, E, F are all basic solutions
because at each one of them, there are two linearly independent
constraints that are active. Points C, D, E, F are basic feasible
solutions.
Proof. For the purposes of this proof and without loss of generality, we
assume that P is represented in terms of constraints of the form a_i'x ≥ b_i
and a_i'x = b_i.
Vertex ⇒ Extreme point
Suppose that x* ∈ P is a vertex. Then, by Definition 2.7, there exists
some c ∈ R^n such that c'x* < c'y for every y satisfying y ∈ P and
y ≠ x*. If y ∈ P, z ∈ P, y ≠ x*, z ≠ x*, and 0 ≤ λ ≤ 1, then
c'x* < c'y and c'x* < c'z, which implies that c'x* < c'(λy + (1 - λ)z)
and, therefore, x* ≠ λy + (1 - λ)z. Thus, x* cannot be expressed as a convex
combination of two other elements of P and is, therefore, an extreme point
(cf. Definition 2.6).
Extreme point ⇒ Basic feasible solution
Suppose that x* ∈ P is not a basic feasible solution. We will show that x*
is not an extreme point of P. Let I = {i | a_i'x* = b_i}. Since x* is not a
basic feasible solution, there do not exist n linearly independent vectors in
the family a_i, i ∈ I. Thus, the vectors a_i, i ∈ I, lie in a proper subspace
of R^n, and there exists some nonzero vector d ∈ R^n such that a_i'd = 0,
for all i ∈ I. Let ε be a small positive number and consider the vectors
y = x* + εd and z = x* - εd. Notice that a_i'y = a_i'x* = b_i for i ∈ I.
Furthermore, for i ∉ I, we have a_i'x* > b_i and, provided that ε is small, we
will also have a_i'y > b_i. (It suffices to choose ε so that ε|a_i'd| < a_i'x* - b_i for
all i ∉ I.) Thus, when ε is small enough, y ∈ P and, by a similar argument,
z ∈ P. We finally notice that x* = (y + z)/2, which implies that x* is not
an extreme point.
Basic feasible solution ⇒ Vertex
Let x* be a basic feasible solution and let I = {i | a_i'x* = b_i}. Let

c = Σ_{i∈I} a_i.

We then have

c'x* = Σ_{i∈I} a_i'x* = Σ_{i∈I} b_i.

Furthermore, for any x ∈ P and any i ∈ I, we have a_i'x ≥ b_i, and

c'x = Σ_{i∈I} a_i'x ≥ Σ_{i∈I} b_i.    (2.2)

This shows that x* is an optimal solution to the problem of minimizing c'x
over the set P. Furthermore, equality holds in (2.2) if and only if a_i'x = b_i
for all i ∈ I. Since x* is a basic feasible solution, there are n linearly
independent constraints that are active at x*, and x* is the unique solution
to the system of equations a_i'x = b_i, i ∈ I (Theorem 2.2). It follows that x*
is the unique minimizer of c'x over the set P and, therefore, x* is a vertex
of P.
Since a vector is a basic feasible solution if and only if it is an extreme
point, and since the definition of an extreme point does not refer to any
particular representation of a polyhedron, we conclude that the property
of being a basic feasible solution is also independent of the representation
used. (This is in contrast to the definition of a basic solution, which is
representation dependent, as pointed out in the discussion that followed
Definition 2.9.)
We finally note the following important fact.
Corollary 2.1 Given a finite number of linear inequality constraints,
there can only be a finite number of basic or basic feasible solutions.
Proof. Consider a system of m linear inequality constraints imposed on
a vector x ∈ R^n. At any basic solution, there are n linearly independent
active constraints. Since any n linearly independent active constraints define
a unique point, it follows that different basic solutions correspond to
different sets of n linearly independent active constraints. Therefore, the
number of basic solutions is bounded above by the number of ways that we
can choose n constraints out of a total of m, which is finite.
Although the number of basic and, therefore, basic feasible solutions
is guaranteed to be finite, it can be very large. For example, the unit cube
{x ∈ R^n | 0 ≤ x_i ≤ 1, i = 1, ..., n} is defined in terms of 2n constraints,
but has 2^n basic feasible solutions.
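The 2^n count can be confirmed for small n by enumeration: every 0/1 vector makes n of the 2n bound constraints active, and each such vector is a basic feasible solution of the cube. A quick sketch using only the standard library (illustrative, not from the text):

```python
from itertools import product

# Vertices of the unit cube in R^n: all 0/1 vectors. Each makes
# exactly n bounds active (x_i = 0 or x_i = 1 for every i).
n = 3
vertices = list(product([0, 1], repeat=n))
print(len(vertices))  # 8 = 2**3, although the cube has only 2n = 6 constraints
```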
Adjacent basic solutions
Two distinct basic solutions to a set of linear constraints in R^n are said to
be adjacent if we can find n - 1 linearly independent constraints that are
active at both of them. In reference to Figure 2.7, D and E are adjacent
to B; also, A and C are adjacent to D. If two adjacent basic solutions are
also feasible, then the line segment that joins them is called an edge of the
feasible set (see also Exercise 2.15).
2. 3 Polyhedra in standard form
The definition of a basic solution (Definition 2.9) refers to general polyhedra.
We will now specialize to polyhedra in standard form. The definitions
and the results in this section are central to the development of the simplex
method in the next chapter.
Let P = {x ∈ R^n | Ax = b, x ≥ 0} be a polyhedron in standard
form, and let the dimensions of A be m × n, where m is the number of
equality constraints. In most of our discussion of standard form problems,
we will make the assumption that the m rows of the matrix A are linearly
independent. (Since the rows are n-dimensional, this requires that
m ≤ n.) At the end of this section, we show that when P is nonempty,
linearly dependent rows of A correspond to redundant constraints that can
be discarded; therefore, our linear independence assumption can be made
without loss of generality.
Recall that at any basic solution, there must be n linearly independent
constraints that are active. Furthermore, every basic solution must
satisfy the equality constraints Ax = b, which provides us with m active
constraints; these are linearly independent because of our assumption on
the rows of A. In order to obtain a total of n active constraints, we need
to choose n - m of the variables x_i and set them to zero, which makes the
corresponding nonnegativity constraints x_i ≥ 0 active. However, for the
resulting set of n active constraints to be linearly independent, the choice
of these n - m variables is not entirely arbitrary, as shown by the following
result.
Theorem 2.4 Consider the constraints Ax = b and x ≥ 0 and assume
that the m × n matrix A has linearly independent rows. A vector
x ∈ R^n is a basic solution if and only if we have Ax = b, and there
exist indices B(1), ..., B(m) such that:
(a) The columns A_B(1), ..., A_B(m) are linearly independent;
(b) If i ≠ B(1), ..., B(m), then x_i = 0.
Proof. Consider some x ∈ R^n and suppose that there are indices B(1), ...,
B(m) that satisfy (a) and (b) in the statement of the theorem. The active
constraints x_i = 0, i ≠ B(1), ..., B(m), and Ax = b imply that

Σ_{i=1}^{m} A_B(i) x_B(i) = Σ_{i=1}^{n} A_i x_i = Ax = b.

Since the columns A_B(i), i = 1, ..., m, are linearly independent, x_B(1), ...,
x_B(m) are uniquely determined. Thus, the system of equations formed by
the active constraints has a unique solution. By Theorem 2.2, there are n
linearly independent active constraints, and this implies that x is a basic
solution.
For the converse, we assume that x is a basic solution and we will
show that conditions (a) and (b) in the statement of the theorem are satisfied.
Let x_B(1), ..., x_B(k) be the components of x that are nonzero. Since
x is a basic solution, the system of equations formed by the active constraints
Σ_{i=1}^{n} A_i x_i = b and x_i = 0, i ≠ B(1), ..., B(k), has a unique
solution (cf. Theorem 2.2); equivalently, the equation Σ_{i=1}^{k} A_B(i) x_B(i) = b
has a unique solution. It follows that the columns A_B(1), ..., A_B(k) are
linearly independent. [If they were not, we could find scalars λ_1, ..., λ_k,
not all of them zero, such that Σ_{i=1}^{k} A_B(i) λ_i = 0. This would imply that
Σ_{i=1}^{k} A_B(i) (x_B(i) + λ_i) = b, contradicting the uniqueness of the solution.]
We have shown that the columns A_B(1), ..., A_B(k) are linearly independent
and this implies that k ≤ m. Since A has m linearly independent
rows, it also has m linearly independent columns, which span R^m. It follows
[cf. Theorem 1.3(b) in Section 1.5] that we can find m - k additional columns
A_B(k+1), ..., A_B(m) so that the columns A_B(i), i = 1, ..., m, are linearly
independent. In addition, if i ≠ B(1), ..., B(m), then i ≠ B(1), ..., B(k)
(because k ≤ m), and x_i = 0. Therefore, both conditions (a) and (b) in
the statement of the theorem are satisfied.
In view of Theorem 2.4, all basic solutions to a standard form polyhedron
can be constructed according to the following procedure.

Procedure for constructing basic solutions
1. Choose m linearly independent columns A_B(1), ..., A_B(m).
2. Let x_i = 0 for all i ≠ B(1), ..., B(m).
3. Solve the system of m equations Ax = b for the unknowns x_B(1), ..., x_B(m).
If a basic solution constructed according to this procedure is nonnegative,
then it is feasible, and it is a basic feasible solution. Conversely, since
every basic feasible solution is a basic solution, it can be obtained from this
procedure. If x is a basic solution, the variables x_B(1), ..., x_B(m) are called
basic variables; the remaining variables are called nonbasic. The columns
A_B(1), ..., A_B(m) are called the basic columns and, since they are linearly
independent, they form a basis of R^m. We will sometimes talk about two
bases being distinct or different; our convention is that distinct bases involve
different sets {B(1), ..., B(m)} of basic indices; if two bases involve
the same set of indices in a different order, they will be viewed as one and
the same basis.
By arranging the m basic columns next to each other, we obtain an
m × m matrix B, called a basis matrix. (Note that this matrix is invertible
because the basic columns are required to be linearly independent.) We can
similarly define a vector x_B with the values of the basic variables. Thus,

B = [ A_B(1)  A_B(2)  ...  A_B(m) ],        x_B = ( x_B(1), ..., x_B(m) )'.

The basic variables are determined by solving the equation B x_B = b, whose
unique solution is given by

x_B = B^{-1} b.
Example 2.1 Let the constraint Ax = b be of the form

[ 1 1 2 1 0 0 0 ]         [  8 ]
[ 0 1 6 0 1 0 0 ]  x  =   [ 12 ]
[ 1 0 0 0 0 1 0 ]         [  4 ]
[ 0 1 0 0 0 0 1 ]         [  6 ].

Let us choose A_4, A_5, A_6, A_7 as our basic columns. Note that they are linearly
independent and the corresponding basis matrix is the identity. We then obtain
the basic solution x = (0, 0, 0, 8, 12, 4, 6), which is nonnegative and, therefore,
is a basic feasible solution. Another basis is obtained by choosing the columns
A_3, A_5, A_6, A_7 (note that they are linearly independent). The corresponding
basic solution is x = (0, 0, 4, 0, -12, 4, 6), which is not feasible because x_5 =
-12 < 0.
Suppose now that there was an eighth column A_8, identical to A_7. Then,
the two sets of columns {A_4, A_5, A_6, A_7} and {A_4, A_5, A_6, A_8} coincide. On
the other hand, the corresponding sets of basic indices, which are {4, 5, 6, 7} and
{4, 5, 6, 8}, are different and we have two different bases, according to our conventions.
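The construction procedure, applied to the data of Example 2.1, amounts to extracting the basis matrix B and solving B x_B = b. A NumPy sketch (illustrative, not from the text; column indices are zero-based, so the basis [3, 4, 5, 6] selects the columns A_4, ..., A_7):

```python
import numpy as np

A = np.array([[1.0, 1.0, 2.0, 1.0, 0.0, 0.0, 0.0],
              [0.0, 1.0, 6.0, 0.0, 1.0, 0.0, 0.0],
              [1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0],
              [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0]])
b = np.array([8.0, 12.0, 4.0, 6.0])

def basic_solution(A, b, basis):
    """Steps 1-3: fix m basic columns, zero the rest, solve B x_B = b."""
    m, n = A.shape
    B = A[:, basis]                       # basis matrix; must be invertible
    x = np.zeros(n)
    x[list(basis)] = np.linalg.solve(B, b)
    return x

print(basic_solution(A, b, [3, 4, 5, 6]))  # (0,0,0,8,12,4,6): basic feasible
print(basic_solution(A, b, [2, 4, 5, 6]))  # x5 = -12 < 0: basic but infeasible
```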
For an intuitive view of basic solutions, recall our interpretation of
the constraint Ax = b, or Σ_{i=1}^{n} A_i x_i = b, as a requirement to synthesize
the vector b ∈ R^m using the resource vectors A_i (Section 1.1). In a basic
solution, we use only m of the resource vectors, those associated with the
basic variables. Furthermore, in a basic feasible solution, this is accomplished
using a nonnegative amount of each basic vector; see Figure 2.8.
Figure 2.8. Consider a standard form problem with n = 4 and
m = 2, and let the vectors b, A_1, ..., A_4 be as shown. The vectors
A_1, A_2 form a basis; the corresponding basic solution is infeasible
because a negative value of x_2 is needed to synthesize b from A_1,
A_2. The vectors A_1, A_3 form another basis; the corresponding
basic solution is feasible. Finally, the vectors A_1, A_4 do not form
a basis because they are linearly dependent.
Correspondence of bases and basic solutions
We now elaborate on the correspondence between basic solutions and bases.
Different basic solutions must correspond to different bases, because a basis
uniquely determines a basic solution. However, two different bases may lead
to the same basic solution. (For an extreme example, if we have b = 0,
then every basis matrix leads to the same basic solution, namely, the zero
vector.) This phenomenon has some important algorithmic implications,
and is closely related to degeneracy, which is the subject of the next section.
Adjacent basic solutions and adjacent bases
Recall that two distinct basic solutions are said to be adjacent if there are
n - 1 linearly independent constraints that are active at both of them.
For standard form problems, we also say that two bases are adjacent if
they share all but one basic column. Then, it is not hard to check that
adjacent basic solutions can always be obtained from two adjacent bases.
Conversely, if two adjacent bases lead to distinct basic solutions, then the
latter are adjacent.
Example 2.2 In reference to Example 2.1, the bases {A_4, A_5, A_6, A_7} and
{A_3, A_5, A_6, A_7} are adjacent because all but one of their columns are the same.
The corresponding basic solutions x = (0, 0, 0, 8, 12, 4, 6) and x = (0, 0, 4, 0, -12, 4, 6)
are adjacent: we have n = 7 and a total of six common linearly independent
active constraints; these are x_1 ≥ 0, x_2 ≥ 0, and the four equality constraints.
The full row rank assumption on A
We close this section by showing that the full row rank assumption on the
matrix A results in no loss of generality.
Theorem 2.5 Let P = {x | Ax = b, x ≥ 0} be a nonempty polyhedron,
where A is a matrix of dimensions m × n, with rows a_1', ..., a_m'.
Suppose that rank(A) = k < m and that the rows a_{i_1}', ..., a_{i_k}' are
linearly independent. Consider the polyhedron

Q = {x | a_{i_1}'x = b_{i_1}, ..., a_{i_k}'x = b_{i_k}, x ≥ 0}.

Then Q = P.
Proof. We provide the proof for the case where i_1 = 1, ..., i_k = k, that
is, the first k rows of A are linearly independent. The general case can be
reduced to this one by rearranging the rows of A.
Clearly P ⊂ Q, since any element of P automatically satisfies the
constraints defining Q. We will now show that Q ⊂ P.
Since rank(A) = k, the row space of A has dimension k and the rows
a_1', ..., a_k' form a basis of the row space. Therefore, every row a_i' of A can
be expressed in the form a_i' = Σ_{j=1}^{k} λ_ij a_j', for some scalars λ_ij. Let x be
an element of P and note that

b_i = a_i'x = Σ_{j=1}^{k} λ_ij a_j'x = Σ_{j=1}^{k} λ_ij b_j,    i = 1, ..., m.

Consider now an element y of Q. We will show that it belongs to P. Indeed,
for any i,

a_i'y = Σ_{j=1}^{k} λ_ij a_j'y = Σ_{j=1}^{k} λ_ij b_j = b_i,

which establishes that y ∈ P and Q ⊂ P.
Notice that the polyhedron Q in Theorem 2.5 is in standard form;
namely, Q = {x | Dx = f, x ≥ 0}, where D is a k × n submatrix of A,
with rank equal to k, and f is a k-dimensional subvector of b. We conclude
that as long as the feasible set is nonempty, a linear programming problem
in standard form can be reduced to an equivalent standard form problem
(with the same feasible set) in which the equality constraints are linearly
independent.
Example 2.3 Consider the (nonempty) polyhedron defined by the constraints

2x_1 + x_2 + x_3 = 2
x_1 + x_2 = 1
x_1 + x_3 = 1
x_1, x_2, x_3 ≥ 0.

The corresponding matrix A has rank two. This is because the last two rows
(1, 1, 0) and (1, 0, 1) are linearly independent, but the first row is equal to the
sum of the other two. Thus, the first constraint is redundant and after it is
eliminated, we still have the same polyhedron.
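The rank claim in Example 2.3 can be checked numerically. A NumPy sketch (illustrative, not from the text):

```python
import numpy as np

# Constraint matrix of Example 2.3.
A = np.array([[2.0, 1.0, 1.0],   # 2x1 + x2 + x3 = 2
              [1.0, 1.0, 0.0],   # x1 + x2      = 1
              [1.0, 0.0, 1.0]])  # x1      + x3 = 1
rank = np.linalg.matrix_rank(A)
print(rank)  # 2: row 0 equals row 1 + row 2, so one constraint is redundant
```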
2. 4 Degeneracy
According to our definition, at a basic solution, we must have n linearly
independent active constraints. This allows for the possibility that the
number of active constraints is greater than n. (Of course, in n dimensions,
no more than n of them can be linearly independent.) In this case, we say
that we have a degenerate basic solution. In other words, at a degenerate
basic solution, the number of active constraints is greater than the minimum
necessary.
Definition 2.10 A basic solution x ∈ ℝ^n is said to be degenerate if more than n of the constraints are active at x.
In two dimensions, a degenerate basic solution is at the intersection of three or more lines; in three dimensions, a degenerate basic solution is at the intersection of four or more planes; see Figure 2.9 for an illustration. It turns out that the presence of degeneracy can strongly affect the behavior of linear programming algorithms and for this reason, we will now develop some more intuition.
Example 2.4 Consider the polyhedron P defined by the constraints

x1 + x2 + 2x3 ≤ 8
     x2 + 6x3 ≤ 12
x1            ≤ 4
     x2       ≤ 6
x1, x2, x3 ≥ 0.

The vector x = (2, 6, 0) is a nondegenerate basic feasible solution, because there are exactly three active and linearly independent constraints, namely, x1 + x2 + 2x3 ≤ 8, x2 ≤ 6, and x3 ≥ 0. The vector x = (4, 0, 2) is a degenerate basic feasible solution, because there are four active constraints, three of them linearly independent, namely, x1 + x2 + 2x3 ≤ 8, x2 + 6x3 ≤ 12, x1 ≤ 4, and x2 ≥ 0.
Figure 2.9: The points A and C are degenerate basic feasible solutions. The points B and D are nondegenerate basic feasible solutions. The point E is a degenerate basic solution.
Degeneracy in standard form polyhedra
At a basic solution of a polyhedron in standard form, the m equality constraints are always active. Therefore, having more than n active constraints is the same as having more than n − m variables at zero level. This leads us to the next definition, which is a special case of Definition 2.10.
Definition 2.11 Consider the standard form polyhedron P = {x ∈ ℝ^n | Ax = b, x ≥ 0} and let x be a basic solution. Let m be the number of rows of A. The vector x is a degenerate basic solution if more than n − m of the components of x are zero.
Example 2.5 Consider once more the polyhedron of Example 2.4. By introducing the slack variables x4, ..., x7, we can transform it into the standard form P = {x = (x1, ..., x7) | Ax = b, x ≥ 0}, where

    A = [ 1  1  2  1  0  0  0 ]        b = [  8 ]
        [ 0  1  6  0  1  0  0 ]            [ 12 ]
        [ 1  0  0  0  0  1  0 ]            [  4 ]
        [ 0  1  0  0  0  0  1 ],           [  6 ].

Consider the basis consisting of the linearly independent columns A1, A2, A3, A7. To calculate the corresponding basic solution, we first set the nonbasic variables x4, x5, and x6 to zero, and then solve the system Ax = b for the remaining variables, to obtain x = (4, 0, 2, 0, 0, 0, 6). This is a degenerate basic feasible solution, because we have a total of four variables that are zero, whereas n − m = 7 − 4 = 3. Thus, while we initially set only the three nonbasic variables to zero, the solution to the system Ax = b turned out to satisfy one more of the constraints (namely, the constraint x2 ≥ 0) with equality. Consider now the basis consisting of the linearly independent columns A1, A3, A4, and A7. The corresponding basic feasible solution is again x = (4, 0, 2, 0, 0, 0, 6).
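The recipe used in this example (fix the nonbasic variables at zero, then solve B x_B = b for the basic ones) can be sketched as follows; the NumPy encoding and the helper name `basic_solution` are our own:

```python
import numpy as np

# Standard form data of Example 2.5: A is 4 x 7 (slacks x4..x7 appended).
A = np.array([
    [1.0, 1.0, 2.0, 1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 6.0, 0.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0],
])
b = np.array([8.0, 12.0, 4.0, 6.0])

def basic_solution(A, b, basis):
    """Set the nonbasic variables to zero and solve B x_B = b."""
    m, n = A.shape
    B = A[:, basis]
    x = np.zeros(n)
    x[list(basis)] = np.linalg.solve(B, b)
    return x

# Basis {A1, A2, A3, A7} corresponds to zero-indexed columns [0, 1, 2, 6].
x = basic_solution(A, b, [0, 1, 2, 6])
print(x)                              # [4. 0. 2. 0. 0. 0. 6.]
n_minus_m = A.shape[1] - A.shape[0]   # 3
print(int(np.sum(np.isclose(x, 0.0))) > n_minus_m)   # True -> degenerate
```

Running the same function with the basis [0, 2, 3, 6] (columns A1, A3, A4, A7) produces the identical vector, illustrating that several bases can correspond to one degenerate basic solution.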
The preceding example suggests that we can think of degeneracy in the following terms. We pick a basic solution by picking n linearly independent constraints to be satisfied with equality, and we realize that certain other constraints are also satisfied with equality. If the entries of A or b were chosen at random, this would almost never happen. Also, Figure 2.10 illustrates that if the coefficients of the active constraints are slightly perturbed, degeneracy can disappear (cf. Exercise 2.18). In practical problems, however, the entries of A and b often have a special (nonrandom) structure, and degeneracy is more common than the preceding argument would seem to suggest.
Figure 2.10: Small changes in the constraining inequalities can remove degeneracy.
In order to visualize degeneracy in standard form polyhedra, we assume that n − m = 2 and we draw the feasible set as a subset of the two-dimensional set defined by the equality constraints Ax = b; see Figure 2.11. At a nondegenerate basic solution, exactly n − m of the constraints x_i ≥ 0 are active; the corresponding variables are nonbasic. In the case of a degenerate basic solution, more than n − m of the constraints x_i ≥ 0 are active, and there are usually several ways of choosing which n − m variables to call nonbasic; in that case, there are several bases corresponding to that same basic solution. (This discussion refers to the typical case. However, there are examples of degenerate basic solutions to which there corresponds only one basis.)
Degeneracy is not a purely geometric property
We close this section by pointing out that degeneracy of basic feasible solu
tions is not, in general, a geometric (representation independent) property,
Figure 2.11: An (n − m)-dimensional illustration of degeneracy. Here, n = 6 and m = 4. The basic feasible solution A is nondegenerate and the basic variables are x1, x2, x4, x6. The basic feasible solution B is degenerate. We can choose x1, x3 as the nonbasic variables. Other possibilities are to choose x1, x5, or to choose x3, x5. Thus, there are three possible bases, for the same basic feasible solution B.
but rather it may depend on the particular representation of a polyhedron. To illustrate this point, consider the standard form polyhedron (cf. Figure 2.12)

P = {(x1, x2, x3) | x1 − x2 = 0, x1 + x2 + 2x3 = 2, x1, x2, x3 ≥ 0}.

We have n = 3, m = 2, and n − m = 1. The vector (1, 1, 0) is nondegenerate because only one variable is zero. The vector (0, 0, 1) is degenerate because two variables are zero. However, the same polyhedron can also be described
Figure 2.12: An example of degeneracy in a standard form problem.
Figure 2.13: The polyhedron P contains a line and does not have an extreme point, while Q does not contain a line and has extreme points.
in the (nonstandard) form

P = {(x1, x2, x3) | x1 − x2 = 0, x1 + x2 + 2x3 = 2, x1 ≥ 0, x3 ≥ 0}.
The vector ( 0, 0, 1 ) is now a nondegenerate basic feasible solution, because
there are only three active constraints.
For another example, consider a nondegenerate basic feasible solution x* of a standard form polyhedron P = {x | Ax = b, x ≥ 0}, where A is of dimensions m × n. In particular, exactly n − m of the variables x_i are equal to zero. Let us now represent P in the form P = {x | Ax ≥ b, −Ax ≥ −b, x ≥ 0}. Then, at the basic feasible solution x*, we have n − m variables set to zero and an additional 2m inequality constraints are satisfied with equality. We therefore have n + m active constraints and x* is degenerate. Hence, under the second representation, every basic feasible solution is degenerate.
We have established that a degenerate basic feasible solution under one representation could be nondegenerate under another representation. Still, it can be shown that if a basic feasible solution is degenerate under one particular standard form representation, then it is degenerate under every standard form representation of the same polyhedron (Exercise 2.19).
2.5 Existence of extreme points
We obtain in this section necessary and sufficient conditions for a polyhedron to have at least one extreme point. We first observe that not every polyhedron has this property. For example, if n > 1, a halfspace in ℝ^n is a polyhedron without extreme points. Also, as argued in Section 2.2 (cf. the discussion after Definition 2.9), if the matrix A has fewer than n rows, then the polyhedron {x ∈ ℝ^n | Ax ≥ b} does not have a basic feasible solution.
It turns out that the existence of an extreme point depends on whether a polyhedron contains an infinite line or not; see Figure 2.13. We need the following definition.
Definition 2.12 A polyhedron P ⊂ ℝ^n contains a line if there exists a vector x ∈ P and a nonzero vector d ∈ ℝ^n such that x + λd ∈ P for all scalars λ.
We then have the following result.

Theorem 2.6 Suppose that the polyhedron P = {x ∈ ℝ^n | a_i'x ≥ b_i, i = 1, ..., m} is nonempty. Then, the following are equivalent:

(a) The polyhedron P has at least one extreme point.
(b) The polyhedron P does not contain a line.
(c) There exist n vectors out of the family a_1, ..., a_m, which are linearly independent.
Proof. (b) ⇒ (a)
We first prove that if P does not contain a line, then it has a basic feasible solution and, therefore, an extreme point. A geometric interpretation of this proof is provided in Figure 2.14.

Let x be an element of P and let I = {i | a_i'x = b_i}. If n of the vectors a_i, i ∈ I, corresponding to the active constraints are linearly independent, then x is, by definition, a basic feasible solution and, therefore, a basic feasible solution exists. If this is not the case, then all of the vectors a_i, i ∈ I, lie in a proper subspace of ℝ^n, and there exists a nonzero vector d ∈ ℝ^n such that a_i'd = 0, for every i ∈ I. Let us consider the line consisting of all points of the form y = x + λd, where λ is an arbitrary scalar. For i ∈ I, we have a_i'y = a_i'x + λa_i'd = a_i'x = b_i. Thus, those constraints that were active at x remain active at all points on the line. However, since the polyhedron is assumed to contain no lines, it follows that as we vary λ, some constraint will eventually be violated. At the point where some constraint is about to be violated, a new constraint must become active, and we conclude that there exists some λ* and some j ∉ I such that a_j'(x + λ*d) = b_j.

We claim that a_j is not a linear combination of the vectors a_i, i ∈ I. Indeed, we have a_j'x ≠ b_j (because j ∉ I) and a_j'(x + λ*d) = b_j (by the definition of λ*). Thus, a_j'd ≠ 0. On the other hand, a_i'd = 0 for every i ∈ I (by the definition of d) and, therefore, d is orthogonal to any linear combination of the vectors a_i, i ∈ I. Since d is not orthogonal to a_j, we
Figure 2.14: Starting from an arbitrary point of a polyhedron, we choose a direction along which all currently active constraints remain active. We then move along that direction until a new constraint is about to be violated. At that point, the number of linearly independent active constraints has increased by at least one. We repeat this procedure until we end up with n linearly independent active constraints, at which point we have a basic feasible solution.
conclude that a_j is not a linear combination of the vectors a_i, i ∈ I. Thus, by moving from x to x + λ*d, the number of linearly independent active constraints has been increased by at least one. By repeating the same argument, as many times as needed, we eventually end up with a point at which there are n linearly independent active constraints. Such a point is, by definition, a basic solution; it is also feasible since we have stayed within the feasible set.
(a) ⇒ (c)
If P has an extreme point x, then x is also a basic feasible solution (cf. Theorem 2.3), and there exist n constraints that are active at x, with the corresponding vectors a_i being linearly independent.

(c) ⇒ (b)
Suppose that n of the vectors a_i are linearly independent and, without loss of generality, let us assume that a_1, ..., a_n are linearly independent. Suppose that P contains a line x + λd, where d is a nonzero vector. We then have a_i'(x + λd) ≥ b_i for all i and all λ. We conclude that a_i'd = 0 for all i. (If a_i'd < 0, we can violate the constraint by picking λ very large; a symmetric argument applies if a_i'd > 0.) Since the vectors a_i, i = 1, ..., n, are linearly independent, this implies that d = 0. This is a contradiction and establishes that P does not contain a line. ∎
Notice that a bounded polyhedron does not contain a line. Similarly, the positive orthant {x | x ≥ 0} does not contain a line. Since a polyhedron in standard form is contained in the positive orthant, it does not contain a line either. These observations establish the following important corollary of Theorem 2.6.
Corollary 2.2 Every nonempty bounded polyhedron and every nonempty polyhedron in standard form has at least one basic feasible solution.
2.6 Optimality of extreme points
Having established the conditions for the existence of extreme points, we will now confirm the intuition developed in Chapter 1: as long as a linear programming problem has an optimal solution and as long as the feasible set has at least one extreme point, we can always find an optimal solution within the set of extreme points of the feasible set. Later in this section, we prove a somewhat stronger result, at the expense of a more complicated proof.
Theorem 2.7 Consider the linear programming problem of minimizing c'x over a polyhedron P. Suppose that P has at least one extreme point and that there exists an optimal solution. Then, there exists an optimal solution which is an extreme point of P.
Proof. (See Figure 2.15 for an illustration.) Let Q be the set of all optimal solutions, which we have assumed to be nonempty. Let P be of the form P = {x ∈ ℝ^n | Ax ≥ b} and let v be the optimal value of the cost c'x. Then, Q = {x ∈ ℝ^n | Ax ≥ b, c'x = v}, which is also a polyhedron. Since
Figure 2.15: Illustration of the proof of Theorem 2.7. Here, Q is the set of optimal solutions and an extreme point x* of Q is also an extreme point of P.
Q ⊂ P and since P contains no lines (cf. Theorem 2.6), Q contains no lines either. Therefore, Q has an extreme point.

Let x* be an extreme point of Q. We will show that x* is also an extreme point of P. Suppose, in order to derive a contradiction, that x* is not an extreme point of P. Then, there exist y ∈ P and z ∈ P such that y ≠ x*, z ≠ x*, and some λ ∈ (0, 1) such that x* = λy + (1 − λ)z. It follows that v = c'x* = λc'y + (1 − λ)c'z. Furthermore, since v is the optimal cost, c'y ≥ v and c'z ≥ v. This implies that c'y = c'z = v and, therefore, z ∈ Q and y ∈ Q. But this contradicts the fact that x* is an extreme point of Q. The contradiction establishes that x* is an extreme point of P. In addition, since x* belongs to Q, it is optimal. ∎
The above theorem applies to polyhedra in standard form, as well as to bounded polyhedra, since they do not contain a line.

Our next result is stronger than Theorem 2.7. It shows that the existence of an optimal solution can be taken for granted, as long as the optimal cost is finite.
Theorem 2.8 Consider the linear programming problem of minimizing c'x over a polyhedron P. Suppose that P has at least one extreme point. Then, either the optimal cost is equal to −∞, or there exists an extreme point which is optimal.
Proof. The proof is essentially a repetition of the proof of Theorem 2.6. The difference is that as we move towards a basic feasible solution, we will also make sure that the costs do not increase. We will use the following terminology: an element x of P has rank k if we can find k, but not more than k, linearly independent constraints that are active at x.

Let us assume that the optimal cost is finite. Let P = {x ∈ ℝ^n | Ax ≥ b} and consider some x ∈ P of rank k < n. We will show that there exists some y ∈ P which has greater rank and satisfies c'y ≤ c'x. Let I = {i | a_i'x = b_i}, where a_i' is the ith row of A. Since k < n, the vectors a_i, i ∈ I, lie in a proper subspace of ℝ^n, and we can choose some nonzero d ∈ ℝ^n orthogonal to every a_i, i ∈ I. Furthermore, by possibly taking the negative of d, we can assume that c'd ≤ 0.

Suppose that c'd < 0. Let us consider the half-line y = x + λd, where λ is a positive scalar. As in the proof of Theorem 2.6, all points on this half-line satisfy the relations a_i'y = b_i, i ∈ I. If the entire half-line were contained in P, the optimal cost would be −∞, which we have assumed not to be the case. Therefore, the half-line eventually exits P. When this is about to happen, we have some λ* > 0 and j ∉ I such that a_j'(x + λ*d) = b_j. We let y = x + λ*d and note that c'y < c'x. As in the proof of Theorem 2.6, a_j is linearly independent from a_i, i ∈ I, and the rank of y is at least k + 1.
Suppose now that c'd = 0. We consider the line y = x + λd, where λ is an arbitrary scalar. Since P contains no lines, the line must eventually exit P and when that is about to happen, we are again at a vector y of rank greater than that of x. Furthermore, since c'd = 0, we have c'y = c'x.

In either case, we have found a new point y such that c'y ≤ c'x, and whose rank is greater than that of x. By repeating this process as many times as needed, we end up with a vector w of rank n (thus, w is a basic feasible solution) such that c'w ≤ c'x.

Let w^1, ..., w^r be the basic feasible solutions in P and let w* be a basic feasible solution such that c'w* ≤ c'w^i for all i. We have already shown that for every x there exists some i such that c'w^i ≤ c'x. It follows that c'w* ≤ c'x for all x ∈ P, and the basic feasible solution w* is optimal. ∎
For a general linear programming problem, if the feasible set has no extreme points, then Theorem 2.8 does not apply directly. On the other hand, any linear programming problem can be transformed into an equivalent problem in standard form to which Theorem 2.8 does apply. This establishes the following corollary.

Corollary 2.3 Consider the linear programming problem of minimizing c'x over a nonempty polyhedron. Then, either the optimal cost is equal to −∞ or there exists an optimal solution.
The result in Corollary 2.3 should be contrasted with what may happen in optimization problems with a nonlinear cost function. For example, in the problem of minimizing 1/x subject to x ≥ 1, the optimal cost is not −∞, but an optimal solution does not exist.
2.7 Representation of bounded polyhedra*
So far, we have been representing polyhedra in terms of their defining inequalities. In this section, we provide an alternative, by showing that a bounded polyhedron can also be represented as the convex hull of its extreme points. The proof that we give here is elementary and constructive, and its main idea is summarized in Figure 2.16. There is a similar representation of unbounded polyhedra involving extreme points and "extreme rays" (edges that extend to infinity). This representation can be developed using the tools that we already have, at the expense of a more complicated proof. A more elegant argument, based on duality theory, will be presented in Section 4.9 and will also result in an alternative proof of Theorem 2.9 below.
Figure 2.16: Given the vector z, we express it as a convex combination of y and u. The vector u belongs to the polyhedron Q whose dimension is lower than that of P. Using induction on dimension, we can express the vector u as a convex combination of extreme points of Q. These are also extreme points of P.
Theorem 2.9 A nonempty and bounded polyhedron is the convex hull of its extreme points.
Proof. Every convex combination of extreme points is an element of the polyhedron, since polyhedra are convex sets. Thus, we only need to prove the converse result and show that every element of a bounded polyhedron can be represented as a convex combination of extreme points.

We define the dimension of a polyhedron P ⊂ ℝ^n as the smallest integer k such that P is contained in some k-dimensional affine subspace of ℝ^n. (Recall from Section 1.5, that a k-dimensional affine subspace is a translation of a k-dimensional subspace.) Our proof proceeds by induction on the dimension of the polyhedron P. If P is zero-dimensional, it consists of a single point. This point is an extreme point of P and the result is true.

Let us assume that the result is true for all polyhedra of dimension less than k. Let P = {x ∈ ℝ^n | a_i'x ≥ b_i, i = 1, ..., m} be a nonempty bounded k-dimensional polyhedron. Then, P is contained in a k-dimensional affine subspace S of ℝ^n, which can be assumed to be of the form

S = {x^0 + λ_1 x^1 + ··· + λ_k x^k | λ_1, ..., λ_k ∈ ℝ},

where x^0, x^1, ..., x^k are some vectors in ℝ^n. Let f_1, f_2, ..., f_{n−k} be n − k linearly independent vectors that are orthogonal to x^1, ..., x^k. Let g_i = f_i'x^0, for i = 1, ..., n − k. Then, every element x of S satisfies

f_i'x = g_i,   i = 1, ..., n − k.   (2.3)

Since P ⊂ S, the same must be true for every element of P.
Let z be an element of P. If z is an extreme point of P, then z is a trivial convex combination of the extreme points of P and there is nothing more to be proved. If z is not an extreme point of P, let us choose an arbitrary extreme point y of P and form the half-line consisting of all points of the form z + λ(z − y), where λ is a nonnegative scalar. Since P is bounded, this half-line must eventually exit P and violate one of the constraints, say the constraint a_{i*}'x ≥ b_{i*}. By considering what happens when this constraint is just about to be violated, we find some λ* ≥ 0 and u ∈ P, such that

u = z + λ*(z − y),

and a_{i*}'u = b_{i*}. Since the constraint a_{i*}'x ≥ b_{i*} is violated if λ grows beyond λ*, it follows that a_{i*}'(z − y) < 0.
Let Q be the polyhedron defined by

Q = {x ∈ P | a_{i*}'x = b_{i*}}
  = {x ∈ ℝ^n | a_i'x ≥ b_i, i = 1, ..., m, a_{i*}'x = b_{i*}}.

Since z, y ∈ P, we have f_i'z = f_i'y = g_i, which shows that z − y is orthogonal to each vector f_i, for i = 1, ..., n − k. On the other hand, we have shown that a_{i*}'(z − y) < 0, which implies that the vector a_{i*} is not a linear combination of, and is therefore linearly independent from, the vectors f_i. Note that

Q ⊂ {x ∈ ℝ^n | a_{i*}'x = b_{i*}, f_i'x = g_i, i = 1, ..., n − k},

since Eq. (2.3) holds for every element of P. The set on the right is defined by n − k + 1 linearly independent equality constraints. Hence, it is an affine subspace of dimension k − 1 (see the discussion at the end of Section 1.5). Therefore, Q has dimension at most k − 1.
By applying the induction hypothesis to Q and u, we see that u can be expressed as a convex combination

u = Σ_i λ_i v^i

of the extreme points v^i of Q, where λ_i are nonnegative scalars that sum to one. Note that at an extreme point v of Q, we must have a_i'v = b_i for n linearly independent vectors a_i; therefore, v must also be an extreme point of P. Using the definition of λ*, we also have

z = (u + λ*y) / (1 + λ*).

Therefore,

z = Σ_i (λ_i / (1 + λ*)) v^i + (λ* / (1 + λ*)) y,

which shows that z is a convex combination of the extreme points of P. ∎
Example 2.6 Consider the polyhedron

P = {(x1, x2, x3) | x1 + x2 + x3 ≤ 1, x1, x2, x3 ≥ 0}.

It has four extreme points, namely, x^1 = (1, 0, 0), x^2 = (0, 1, 0), x^3 = (0, 0, 1), and x^4 = (0, 0, 0). The vector x = (1/3, 1/3, 1/4) belongs to P. It can be represented as

x = (1/3)x^1 + (1/3)x^2 + (1/4)x^3 + (1/12)x^4.
There is a converse to Theorem 2.9 asserting that the convex hull of a finite number of points is a polyhedron. This result is proved in the next section and again in Section 4.9.
2.8 Projections of polyhedra: Fourier-Motzkin elimination*
In this section, we present perhaps the oldest method for solving linear pro
gramming problems. This method is not practical because it requires a very
large number of steps, but it has some interesting theoretical corollaries.
The key to this method is the concept of a projection, defined as follows: if x = (x1, ..., xn) is a vector in ℝ^n and k ≤ n, the projection mapping π_k : ℝ^n → ℝ^k projects x onto its first k coordinates:

π_k(x) = π_k(x1, ..., xn) = (x1, ..., xk).

We also define the projection Π_k(S) of a set S ⊂ ℝ^n by letting

Π_k(S) = {π_k(x) | x ∈ S};

see Figure 2.17 for an illustration. Note that S is nonempty if and only if Π_k(S) is nonempty. An equivalent definition is

Π_k(S) = {(x1, ..., xk) | there exist x_{k+1}, ..., xn such that (x1, ..., xn) ∈ S}.
Suppose now that we wish to decide whether a given polyhedron P ⊂ ℝ^n is nonempty. If we can somehow eliminate the variable xn and construct the set Π_{n−1}(P) ⊂ ℝ^{n−1}, we can instead consider the presumably easier problem of deciding whether Π_{n−1}(P) is nonempty. If we keep eliminating variables one by one, we eventually arrive at the set Π_1(P) that
Figure 2.17: The projections Π_1(S) and Π_2(S) of a rotated three-dimensional cube.
involves a single variable, and whose emptiness is easy to check. The main disadvantage of this method is that while each step reduces the dimension by one, a large number of constraints is usually added. Exercise 2.20 deals with a family of examples in which the number of constraints increases exponentially with the problem dimension.
We now describe the elimination method. We are given a polyhedron P in terms of linear inequality constraints of the form

Σ_{j=1}^n a_{ij} x_j ≥ b_i,   i = 1, ..., m.

We wish to eliminate x_n and construct the projection Π_{n−1}(P).
Elimination algorithm

1. Rewrite each constraint Σ_{j=1}^n a_{ij} x_j ≥ b_i in the form

   a_{in} x_n ≥ −Σ_{j=1}^{n−1} a_{ij} x_j + b_i,   i = 1, ..., m;

   if a_{in} ≠ 0, divide both sides by |a_{in}|. By letting x̄ = (x1, ..., x_{n−1}), we obtain an equivalent representation of P involving the following constraints:

   x_n ≥ d_i + f_i'x̄,   if a_{in} > 0,   (2.4)
   d_j + f_j'x̄ ≥ x_n,   if a_{jn} < 0,   (2.5)
   0 ≥ d_k + f_k'x̄,   if a_{kn} = 0.   (2.6)

   Here, each d_i, d_j, d_k is a scalar, and each f_i, f_j, f_k is a vector in ℝ^{n−1}.

2. Let Q be the polyhedron in ℝ^{n−1} defined by the constraints

   d_j + f_j'x̄ ≥ d_i + f_i'x̄,   if a_{in} > 0 and a_{jn} < 0,   (2.7)
   0 ≥ d_k + f_k'x̄,   if a_{kn} = 0.   (2.8)
Example 2.7 Consider the polyhedron defined by the constraints

  x1 + x2        ≥ 1
  x1 + x2 + 2x3 ≥ 2
 2x1      + 3x3 ≥ 3
  x1      − 4x3 ≥ 4
 −2x1 + x2 − x3 ≥ 5.

We rewrite these constraints in the form

  0 ≥ 1 − x1 − x2
 x3 ≥ 1 − (x1/2) − (x2/2)
 x3 ≥ 1 − (2x1/3)
 −1 + (x1/4) ≥ x3
 −5 − 2x1 + x2 ≥ x3.

Then, the set Q is defined by the constraints

  0 ≥ 1 − x1 − x2
 −1 + (x1/4) ≥ 1 − (x1/2) − (x2/2)
 −1 + (x1/4) ≥ 1 − (2x1/3)
 −5 − 2x1 + x2 ≥ 1 − (x1/2) − (x2/2)
 −5 − 2x1 + x2 ≥ 1 − (2x1/3).
Theorem 2.10 The polyhedron Q constructed by the elimination algorithm is equal to the projection Π_{n−1}(P) of P.

Proof. If x̄ ∈ Π_{n−1}(P), there exists some x_n such that (x̄, x_n) ∈ P. In particular, the vector x = (x̄, x_n) satisfies Eqs. (2.4)–(2.6), from which it follows immediately that x̄ satisfies Eqs. (2.7)–(2.8), and x̄ ∈ Q. This shows that Π_{n−1}(P) ⊂ Q.

We will now prove that Q ⊂ Π_{n−1}(P). Let x̄ ∈ Q. It follows from Eq. (2.7) that

min_{ {j | a_{jn} < 0} } (d_j + f_j'x̄) ≥ max_{ {i | a_{in} > 0} } (d_i + f_i'x̄).

Let x_n be any number between the two sides of the above inequality. It then follows that (x̄, x_n) satisfies Eqs. (2.4)–(2.6) and, therefore, belongs to the polyhedron P. ∎
Notice that for any vector x = (x1, ..., xn), we have

π_{n−2}(x) = π_{n−2}(π_{n−1}(x)).

Accordingly, for any polyhedron P, we also have

Π_{n−2}(P) = Π_{n−2}(Π_{n−1}(P)).

By generalizing this observation, we see that if we apply the elimination algorithm k times, we end up with the set Π_{n−k}(P); if we apply it n − 1 times, we end up with Π_1(P). Unfortunately, each application of the elimination algorithm can increase the number of constraints substantially, leading to a polyhedron Π_1(P) described by a very large number of constraints. Of course, since Π_1(P) is one-dimensional, almost all of these constraints will be redundant, but this is of no help: in order to decide which ones are redundant, we must, in general, enumerate them.

The elimination algorithm has an important theoretical consequence: since the projection Π_k(P) can be generated by repeated application of the elimination algorithm, and since the elimination algorithm always produces a polyhedron, it follows that a projection Π_k(P) of a polyhedron is also a polyhedron. This fact might be considered obvious, but a proof simpler than the one we gave is not apparent. We now restate it in somewhat different language.
Corollary 2.4 Let P ⊂ ℝ^{n+k} be a polyhedron. Then, the set

{x ∈ ℝ^n | there exists y ∈ ℝ^k such that (x, y) ∈ P}

is also a polyhedron.
A variation of Corollary 2.4 states that the image of a polyhedron under a linear mapping is also a polyhedron.

Corollary 2.5 Let P ⊂ ℝ^n be a polyhedron and let A be an m × n matrix. Then, the set Q = {Ax | x ∈ P} is also a polyhedron.
Proof. We have Q = {y ∈ ℝ^m | there exists x ∈ ℝ^n such that Ax = y, x ∈ P}. Therefore, Q is the projection of the polyhedron {(x, y) ∈ ℝ^{n+m} | Ax = y, x ∈ P} onto the y coordinates. ∎
Corollary 2.6 The convex hull of a finite number of vectors is a polyhedron.

Proof. The convex hull

conv({x^1, ..., x^k}) = { Σ_{i=1}^k λ_i x^i | Σ_{i=1}^k λ_i = 1, λ_i ≥ 0 }

of a finite number of vectors x^1, ..., x^k is the image of the polyhedron

{ (λ_1, ..., λ_k) | Σ_{i=1}^k λ_i = 1, λ_i ≥ 0 }

under the linear mapping that maps (λ_1, ..., λ_k) to Σ_{i=1}^k λ_i x^i and is, therefore, a polyhedron. ∎
We finally indicate how the elimination algorithm can be used to solve linear programming problems. Consider the problem of minimizing c'x subject to x belonging to a polyhedron P. We define a new variable x_0 and introduce the constraint x_0 = c'x. If we use the elimination algorithm n times to eliminate the variables x_1, ..., x_n, we are left with the set

Q = {x_0 | there exists x ∈ P such that x_0 = c'x},

and the optimal cost is equal to the smallest element of Q. An optimal solution x can be recovered by backtracking (Exercise 2.21).
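The scheme just described can be sketched end to end for a tiny problem: introduce x_0, encode x_0 = c'x as two inequalities, eliminate all original variables, and read the smallest element of Q off the surviving lower bounds on x_0. Exact rationals avoid round-off; the example problem and helper names are our own, and the sketch assumes the problem is feasible with finite optimal cost:

```python
from fractions import Fraction as F

def eliminate_last(rows):
    # One Fourier-Motzkin step on rows [a_1, ..., a_k, b] meaning a'x >= b;
    # cross-multiplication keeps everything rational.
    pos = [r for r in rows if r[-2] > 0]
    neg = [r for r in rows if r[-2] < 0]
    out = [r[:-2] + r[-1:] for r in rows if r[-2] == 0]
    for p in pos:
        for q in neg:
            cp, cq = p[-2], -q[-2]
            out.append([cq * p[i] + cp * q[i]
                        for i in list(range(len(p) - 2)) + [-1]])
    return out

def lp_min_by_elimination(c, rows):
    """Minimize c'x over {x : a'x >= b for the given rows}.  We prepend a
    variable x0 tied to the cost by two inequalities, eliminate the x_i,
    and return the best surviving lower bound on x0."""
    n = len(c)
    sys_ = [[F(1)] + [F(-ci) for ci in c] + [F(0)],   # x0 - c'x >= 0
            [F(-1)] + [F(ci) for ci in c] + [F(0)]]   # c'x - x0 >= 0
    sys_ += [[F(0)] + [F(a) for a in r[:-1]] + [F(r[-1])] for r in rows]
    for _ in range(n):                                # eliminate x_n, ..., x_1
        sys_ = eliminate_last(sys_)
    return max(r[-1] / r[0] for r in sys_ if r[0] > 0)

# minimize x1 + x2  subject to  x1 + 2x2 >= 4  and  3x1 + x2 >= 6
cost = lp_min_by_elimination([1, 1], [[1, 2, 4], [3, 1, 6]])
print(cost)  # 14/5
```

The optimum 14/5 is attained at the intersection of the two constraints, which is exactly what the backtracking step of the full method would recover.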
2.9 Summary
We summarize our main conclusions so far regarding the solutions to linear
programming problems.
(a) If the feasible set is nonempty and bounded, there exists an optimal solution. Furthermore, there exists an optimal solution which is an extreme point.

(b) If the feasible set is unbounded, there are the following possibilities:

(i) There exists an optimal solution which is an extreme point.

(ii) There exists an optimal solution, but no optimal solution is an extreme point. (This can only happen if the feasible set has no extreme points; it never happens when the problem is in standard form.)

(iii) The optimal cost is −∞.

Suppose now that the optimal cost is finite and that the feasible set contains at least one extreme point. Since there are only finitely many extreme points, the problem can be solved in a finite number of steps, by enumerating all extreme points and evaluating the cost of each one. This is hardly a practical algorithm because the number of extreme points can increase exponentially with the number of variables and constraints. In the next chapter, we will exploit the geometry of the feasible set and develop the simplex method, a systematic procedure that moves from one extreme point to another, without having to enumerate all extreme points.
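The brute-force method dismissed above is nonetheless easy to state in code and is instructive for small instances: try every set of m columns as a candidate basis, keep the basic solutions that are feasible, and evaluate the cost at each. A NumPy sketch, exponential in n exactly as the text warns (helper names are our own):

```python
import numpy as np
from itertools import combinations

def enumerate_bfs(A, b, tol=1e-9):
    """Every basic feasible solution of {x : Ax = b, x >= 0}, found by
    trying all C(n, m) column subsets as bases."""
    m, n = A.shape
    found = []
    for basis in combinations(range(n), m):
        B = A[:, basis]
        if np.linalg.matrix_rank(B) < m:
            continue                      # columns do not form a basis
        x = np.zeros(n)
        x[list(basis)] = np.linalg.solve(B, b)
        if np.all(x >= -tol) and not any(np.allclose(x, y) for y in found):
            found.append(x)
    return found

# minimize 2x1 + x2 + 3x3 over the simplex x1 + x2 + x3 = 1, x >= 0
A = np.array([[1.0, 1.0, 1.0]])
b = np.array([1.0])
c = np.array([2.0, 1.0, 3.0])
vertices = enumerate_bfs(A, b)
best = min(vertices, key=lambda v: float(c @ v))
print(best, float(c @ best))  # [0. 1. 0.] 1.0
```

By Theorem 2.8, checking the cost at every basic feasible solution is guaranteed to find an optimum whenever the optimal cost is finite; the simplex method of the next chapter visits these vertices selectively instead of exhaustively.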
An interesting aspect of the material in this chapter is the distinction between geometric (representation independent) properties of a polyhedron and those properties that depend on a particular representation. In that respect, we have established the following:

(a) Whether or not a point is an extreme point (equivalently, vertex, or basic feasible solution) is a geometric property.

(b) Whether or not a point is a basic solution may depend on the way that a polyhedron is represented.

(c) Whether or not a basic or basic feasible solution is degenerate may depend on the way that a polyhedron is represented.
2.10 Exercises
Exercise 2.1 For each one of the following sets, determine whether it is a polyhedron.

(a) The set of all (x, y) ∈ ℝ² satisfying the constraints

x cos θ + y sin θ ≤ 1,   for all θ ∈ [0, π/2],
x ≥ 0,
y ≥ 0.
(b) The set of all x ∈ ℝ satisfying the constraint x² − 8x + 15 ≤ 0.

(c) The empty set.
Exercise 2.2 Let f : ℝ^n → ℝ be a convex function and let c be some constant. Show that the set S = {x ∈ ℝ^n | f(x) ≤ c} is convex.
Exercise 2.3 (Basic feasible solutions in standard form polyhedra with upper bounds) Consider a polyhedron defined by the constraints Ax = b and 0 ≤ x ≤ u, and assume that the matrix A has linearly independent rows. Provide a procedure analogous to the one in Section 2.3 for constructing basic solutions, and prove an analog of Theorem 2.4.
Exercise 2.4 We know that every linear programming problem can be converted to an equivalent problem in standard form. We also know that nonempty polyhedra in standard form have at least one extreme point. We are then tempted to conclude that every nonempty polyhedron has at least one extreme point. Explain what is wrong with this argument.
Exercise 2.5 (Extreme points of isomorphic polyhedra) A mapping f is called affine if it is of the form f(x) = Ax + b, where A is a matrix and b is a vector. Let P and Q be polyhedra in R^n and R^m, respectively. We say that P and Q are isomorphic if there exist affine mappings f : P → Q and g : Q → P such that g(f(x)) = x for all x ∈ P, and f(g(y)) = y for all y ∈ Q. (Intuitively, isomorphic polyhedra have the same shape.)
(a) If P and Q are isomorphic, show that there exists a one-to-one correspondence between their extreme points. In particular, if f and g are as above, show that x is an extreme point of P if and only if f(x) is an extreme point of Q.
(b) (Introducing slack variables leads to an isomorphic polyhedron) Let P = {x ∈ R^n | Ax ≥ b, x ≥ 0}, where A is a matrix of dimensions k × n. Let Q = {(x, z) ∈ R^{n+k} | Ax − z = b, x ≥ 0, z ≥ 0}. Show that P and Q are isomorphic.
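The isomorphism in part (b) can be made concrete: f appends the surplus components z = Ax − b, and g drops them. The sketch below (an illustration with arbitrary 2 × 2 data, not the book's notation) checks that the two compositions act as identities.

```python
# Illustration of Exercise 2.5(b): for P = {x | Ax >= b, x >= 0}, the
# affine maps f(x) = (x, Ax - b) and g(x, z) = x are inverse to each
# other between P and Q = {(x, z) | Ax - z = b, x >= 0, z >= 0}.
from fractions import Fraction as F

A = [[F(1), F(1)], [F(2), F(0)]]   # an arbitrary 2x2 example matrix
b = [F(1), F(1)]

def matvec(M, v):
    return [sum(row[i] * v[i] for i in range(len(v))) for row in M]

def f(x):                          # P -> Q: append the surplus z = Ax - b
    z = [ai - bi for ai, bi in zip(matvec(A, x), b)]
    return x + z

def g(xz):                         # Q -> P: drop the surplus components
    return xz[:len(A[0])]

x = [F(1), F(2)]                   # a point of P, since Ax = (3, 2) >= b
assert g(f(x)) == x                # g(f(x)) = x on P
assert f(g(f(x))) == f(x)          # f(g(y)) = y on the image point y = f(x)
print("compositions are identities on this point")
```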
Exercise 2.6 (Carathéodory's theorem) Let A_1, . . . , A_n be a collection of vectors in R^m.
(a) Let
C = { Σ_{i=1}^{n} λ_i A_i | λ_1, . . . , λ_n ≥ 0 }.
Show that any element of C can be expressed in the form Σ_{i=1}^{n} λ_i A_i, with λ_i ≥ 0, and with at most m of the coefficients λ_i being nonzero. Hint: Consider the polyhedron
{ (λ_1, . . . , λ_n) | Σ_{i=1}^{n} λ_i A_i = y, λ_1, . . . , λ_n ≥ 0 }.
(b) Let P be the convex hull of the vectors A_1, . . . , A_n. Show that any element of P can be expressed in the form Σ_{i=1}^{n} λ_i A_i, where Σ_{i=1}^{n} λ_i = 1 and λ_i ≥ 0 for all i, with at most m + 1 of the coefficients λ_i being nonzero.
Exercise 2.7 Suppose that {x ∈ R^n | a_i'x ≥ b_i, i = 1, . . . , m} and {x ∈ R^n | g_i'x ≥ h_i, i = 1, . . . , k} are two representations of the same nonempty polyhedron. Suppose that the vectors a_1, . . . , a_m span R^n. Show that the same must be true for the vectors g_1, . . . , g_k.
Exercise 2.8 Consider the standard form polyhedron {x | Ax = b, x ≥ 0}, and assume that the rows of the matrix A are linearly independent. Let x be a basic solution, and let J = {i | x_i ≠ 0}. Show that a basis is associated with the basic solution x if and only if every column A_i, i ∈ J, is in the basis.
Exercise 2.9 Consider the standard form polyhedron {x | Ax = b, x ≥ 0}, and assume that the rows of the matrix A are linearly independent.
(a) Suppose that two different bases lead to the same basic solution. Show that the basic solution is degenerate.
(b) Consider a degenerate basic solution. Is it true that it corresponds to two or more distinct bases? Prove or give a counterexample.
(c) Suppose that a basic solution is degenerate. Is it true that there exists an adjacent basic solution which is degenerate? Prove or give a counterexample.
Exercise 2.10 Consider the standard form polyhedron P = {x | Ax = b, x ≥ 0}. Suppose that the matrix A has dimensions m × n and that its rows are linearly independent. For each one of the following statements, state whether it is true or false. If true, provide a proof, else, provide a counterexample.
(a) If n = m + 1, then P has at most two basic feasible solutions.
(b) The set of all optimal solutions is bounded.
(c) At every optimal solution, no more than m variables can be positive.
(d) If there is more than one optimal solution, then there are uncountably many optimal solutions.
(e) If there are several optimal solutions, then there exist at least two basic feasible solutions that are optimal.
(f) Consider the problem of minimizing max{c'x, d'x} over the set P. If this problem has an optimal solution, it must have an optimal solution which is an extreme point of P.
Exercise 2.11 Let P = {x ∈ R^n | Ax ≥ b}. Suppose that at a particular basic feasible solution, there are k active constraints, with k > n. Is it true that there exist exactly (k choose n) bases that lead to this basic feasible solution? Here (k choose n) = k!/(n!(k − n)!) is the number of ways that we can choose n out of k given items.
Exercise 2.12 Consider a nonempty polyhedron P and suppose that for each variable x_i we have either the constraint x_i ≥ 0 or the constraint x_i ≤ 0. Is it true that P has at least one basic feasible solution?
Exercise 2.13 Consider the standard form polyhedron P = {x | Ax = b, x ≥ 0}. Suppose that the matrix A, of dimensions m × n, has linearly independent rows, and that all basic feasible solutions are nondegenerate. Let x be an element of P that has exactly m positive components.
(a) Show that x is a basic feasible solution.
(b) Show that the result of part (a) is false if the nondegeneracy assumption is
removed.
Exercise 2.14 Let P be a bounded polyhedron in R^n, let a be a vector in R^n, and let b be some scalar. We define
Q = {x ∈ P | a'x = b}.
Show that every extreme point of Q is either an extreme point of P or a convex combination of two adjacent extreme points of P.
Exercise 2.15 (Edges joining adjacent vertices) Consider the polyhedron P = {x ∈ R^n | a_i'x ≥ b_i, i = 1, . . . , m}. Suppose that u and v are distinct basic feasible solutions that satisfy a_i'u = a_i'v = b_i, i = 1, . . . , n − 1, and that the vectors a_1, . . . , a_{n−1} are linearly independent. (In particular, u and v are adjacent.) Let L = {λu + (1 − λ)v | 0 ≤ λ ≤ 1} be the segment that joins u and v. Prove that L = {z ∈ P | a_i'z = b_i, i = 1, . . . , n − 1}.
Exercise 2.16 Consider the set {x ∈ R^n | x_1 = · · · = x_{n−1} = 0, 0 ≤ x_n ≤ 1}. Could this be the feasible set of a problem in standard form?
Exercise 2.17 Consider the polyhedron {x ∈ R^n | Ax ≤ b, x ≥ 0} and a nondegenerate basic feasible solution x*. We introduce slack variables z and construct a corresponding polyhedron {(x, z) | Ax + z = b, x ≥ 0, z ≥ 0} in standard form. Show that (x*, b − Ax*) is a nondegenerate basic feasible solution for the new polyhedron.
Exercise 2.18 Consider a polyhedron P = {x | Ax ≥ b}. Given any ε > 0, show that there exists some b̃ with the following two properties:
(a) The absolute value of every component of b − b̃ is bounded by ε.
(b) Every basic feasible solution in the polyhedron P̃ = {x | Ax ≥ b̃} is nondegenerate.
Exercise 2.19* Let P ⊂ R^n be a polyhedron in standard form whose definition involves m linearly independent equality constraints. Its dimension is defined as the smallest integer k such that P is contained in some k-dimensional affine subspace of R^n.
(a) Explain why the dimension of P is at most n − m.
(b) Suppose that P has a nondegenerate basic feasible solution. Show that the dimension of P is equal to n − m.
(c) Suppose that x is a degenerate basic feasible solution. Show that x is degenerate under every standard form representation of the same polyhedron (in the same space R^n). Hint: Using parts (a) and (b), compare the number of equality constraints in two representations of P under which x is degenerate and nondegenerate, respectively. Then, count active constraints.
Exercise 2.20* Consider the Fourier-Motzkin elimination algorithm.
(a) Suppose that the number m of constraints defining a polyhedron P is even. Show, by means of an example, that the elimination algorithm may produce a description of the polyhedron Π_{n−1}(P) involving as many as m^2/4 linear constraints, but no more than that.
(b) Show that the elimination algorithm produces a description of the one-dimensional polyhedron Π_1(P) involving no more than m^{2^{n−1}} / 2^{2^n − 2} constraints.
(c) Let n = 2^p + p + 2, where p is a nonnegative integer. Consider a polyhedron in R^n defined by the 8(n choose 3) constraints
±x_i ± x_j ± x_k ≤ 1, 1 ≤ i < j < k ≤ n,
where all possible combinations of signs are present. Show that after p eliminations, we have at least
2^{2^p + 2}
constraints. (Note that this number increases exponentially with n.)
Exercise 2.21 Suppose that Fourier-Motzkin elimination is used in the manner described at the end of Section 2.8 to find the optimal cost in a linear programming problem. Show how this approach can be augmented to obtain an optimal solution as well.
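One step of Fourier-Motzkin elimination can be sketched in a few lines; the code below is an illustration, not the book's algorithm statement. Constraints are stored as pairs (a, β) meaning a'x ≤ β; eliminating x_k pairs every constraint with a positive x_k coefficient against every constraint with a negative one, which is exactly the source of the m^2/4 growth in Exercise 2.20(a).

```python
# One Fourier-Motzkin elimination step on constraints a'x <= beta.
from fractions import Fraction as F

def eliminate(constraints, k):
    pos = [c for c in constraints if c[0][k] > 0]
    neg = [c for c in constraints if c[0][k] < 0]
    zero = [c for c in constraints if c[0][k] == 0]
    out = list(zero)
    for (ap, bp) in pos:
        for (an, bn) in neg:
            # Scale so the x_k coefficients become +1 and -1, then add.
            # Dividing by the negative coefficient an[k] flips that
            # inequality, so the difference below is a valid consequence.
            a = [x / ap[k] - y / an[k] for x, y in zip(ap, an)]
            beta = bp / ap[k] - bn / an[k]
            a[k] = F(0)                      # x_k is eliminated exactly
            out.append((a, beta))
    return out

# Four constraints on (x, y): y <= 1, x + y <= 2, -y <= 0, x - y <= 1.
cons = [([F(0), F(1)], F(1)),
        ([F(1), F(1)], F(2)),
        ([F(0), F(-1)], F(0)),
        ([F(1), F(-1)], F(1))]
projected = eliminate(cons, 1)               # eliminate y
assert len(projected) == 4                   # 2 x 2 pairings: m^2/4 for m = 4
```

With two upper bounds and two lower bounds on y, the projection onto x is described by 2 × 2 = 4 constraints, matching the m^2/4 bound.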
Exercise 2.22 Let P and Q be polyhedra in R^n. Let P + Q = {x + y | x ∈ P, y ∈ Q}.
(a) Show that P + Q is a polyhedron.
(b) Show that every extreme point of P + Q is the sum of an extreme point of P and an extreme point of Q.
2.11 Notes and sources
The relation between algebra and geometry goes far back in the history of mathematics, but was limited to two and three-dimensional spaces. The insight that the same relation goes through in higher dimensions only came in the middle of the nineteenth century.
2.2. Our algebraic definition of basic (feasible) solutions for general polyhedra, in terms of the number of linearly independent active constraints, is not common. Nevertheless, we consider it to be quite central, because it provides the main bridge between the algebraic and geometric viewpoints, it allows for a unified treatment, and shows that there is not much that is special about standard form problems.
2.8. Fourier-Motzkin elimination is due to Fourier (1827), Dines (1918), and Motzkin (1936).
Chapter 3
The simplex method
Contents
3.1. Optimality conditions
3.2. Development of the simplex method
3.3. Implementations of the simplex method
3.4. Anticycling: lexicography and Bland's rule
3.5. Finding an initial basic feasible solution
3.6. Column geometry and the simplex method
3.7. Computational efficiency of the simplex method
3.8. Summary
3.9. Exercises
3.10. Notes and sources
We saw in Chapter 2 that if a linear programming problem in standard form has an optimal solution, then there exists a basic feasible solution that is optimal. The simplex method is based on this fact and searches for an optimal solution by moving from one basic feasible solution to another, along the edges of the feasible set, always in a cost reducing direction. Eventually, a basic feasible solution is reached at which none of the available edges leads to a cost reduction; such a basic feasible solution is optimal and the algorithm terminates. In this chapter, we provide a detailed development of the simplex method and discuss a few different implementations, including the simplex tableau and the revised simplex method. We also address some difficulties that may arise in the presence of degeneracy. We provide an interpretation of the simplex method in terms of column geometry, and we conclude with a discussion of its running time, as a function of the dimension of the problem being solved.
Throughout this chapter, we consider the standard form problem
minimize c'x
subject to Ax = b
x ≥ 0,
and we let P be the corresponding feasible set. We assume that the dimensions of the matrix A are m × n and that its rows are linearly independent. We continue using our previous notation: A_i is the ith column of the matrix A, and a_i' is its ith row.
3.1 Optimality conditions
Many optimization algorithms are structured as follows: given a feasible solution, we search its neighborhood to find a nearby feasible solution with lower cost. If no nearby feasible solution leads to a cost improvement, the algorithm terminates and we have a locally optimal solution. For general optimization problems, a locally optimal solution need not be (globally) optimal. Fortunately, in linear programming, local optimality implies global optimality; this is because we are minimizing a convex function over a convex set (cf. Exercise 3.1). In this section, we concentrate on the problem of searching for a direction of cost decrease in a neighborhood of a given basic feasible solution, and on the associated optimality conditions.
Suppose that we are at a point x ∈ P and that we contemplate moving away from x, in the direction of a vector d ∈ R^n. Clearly, we should only consider those choices of d that do not immediately take us outside the feasible set. This leads to the following definition, illustrated in Figure 3.1.
Figure 3.1: Feasible directions at different points of a polyhedron.
Definition 3.1 Let x be an element of a polyhedron P. A vector d ∈ R^n is said to be a feasible direction at x, if there exists a positive scalar θ for which x + θd ∈ P.
Let x be a basic feasible solution to the standard form problem, let B(1), . . . , B(m) be the indices of the basic variables, and let B = [A_B(1) · · · A_B(m)] be the corresponding basis matrix. In particular, we have x_i = 0 for every nonbasic variable, while the vector x_B = (x_B(1), . . . , x_B(m)) of basic variables is given by
x_B = B^{-1}b.
We consider the possibility of moving away from x, to a new vector x + θd, by selecting a nonbasic variable x_j (which is initially at zero level), and increasing it to a positive value θ, while keeping the remaining nonbasic variables at zero. Algebraically, d_j = 1, and d_i = 0 for every nonbasic index i other than j. At the same time, the vector x_B of basic variables changes to x_B + θd_B, where d_B = (d_B(1), d_B(2), . . . , d_B(m)) is the vector with those components of d that correspond to the basic variables.
Given that we are only interested in feasible solutions, we require A(x + θd) = b, and since x is feasible, we also have Ax = b. Thus, for the equality constraints to be satisfied for θ > 0, we need Ad = 0. Recall now that d_j = 1, and that d_i = 0 for all other nonbasic indices i. Then,
0 = Ad = Σ_{i=1}^{n} A_i d_i = Σ_{i=1}^{m} A_B(i) d_B(i) + A_j = B d_B + A_j.
Since the basis matrix B is invertible, we obtain
d_B = −B^{-1}A_j.    (3.1)
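The computation behind Eq. (3.1) amounts to solving the linear system B d_B = −A_j. The following sketch does this with exact rational arithmetic; the particular 2 × 2 basis matrix and entering column are illustrative data only, not fixed by the development above.

```python
# Solving B d_B = -A_j for the basic components of a basic direction.
from fractions import Fraction as F

def solve(M, rhs):
    """Gauss-Jordan elimination with exact Fractions; returns M^{-1} rhs."""
    n = len(M)
    M = [row[:] + [r] for row, r in zip(M, rhs)]   # augmented copy
    for col in range(n):
        piv = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[piv] = M[piv], M[col]            # bring pivot into place
        M[col] = [v / M[col][col] for v in M[col]] # normalize pivot row
        for r in range(n):
            if r != col and M[r][col] != 0:
                M[r] = [a - M[r][col] * b for a, b in zip(M[r], M[col])]
    return [M[r][n] for r in range(n)]

B = [[F(1), F(1)], [F(2), F(0)]]       # an illustrative basis matrix
A_j = [F(1), F(3)]                     # an illustrative entering column
d_B = solve(B, [-a for a in A_j])      # d_B = -B^{-1} A_j, Eq. (3.1)
assert d_B == [F(-3, 2), F(1, 2)]
```

The full direction d is then assembled by placing a 1 in position j, the entries of d_B in the basic positions, and zeros elsewhere.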
The direction vector d that we have just constructed will be referred to as the jth basic direction. We have so far guaranteed that the equality constraints are respected as we move away from x along the basic direction d. How about the nonnegativity constraints? We recall that the variable x_j is increased, and all other nonbasic variables stay at zero level. Thus, we need only worry about the basic variables. We distinguish two cases:
(a) Suppose that x is a nondegenerate basic feasible solution. Then, x_B > 0, from which it follows that x_B + θd_B ≥ 0, and feasibility is maintained, when θ is sufficiently small. In particular, d is a feasible direction.
(b) Suppose now that x is degenerate. Then, d is not always a feasible direction. Indeed, it is possible that a basic variable x_B(i) is zero, while the corresponding component d_B(i) of d_B = −B^{-1}A_j is negative. In that case, if we follow the jth basic direction, the nonnegativity constraint for x_B(i) is immediately violated, and we are led to infeasible solutions; see Figure 3.2.
We now study the effects on the cost function if we move along a basic direction. If d is the jth basic direction, then the rate c'd of cost change along the direction d is given by c_B'd_B + c_j, where c_B = (c_B(1), . . . , c_B(m)). Using Eq. (3.1), this is the same as c_j − c_B'B^{-1}A_j. This quantity is important enough to warrant a definition. For an intuitive interpretation, c_j is the cost per unit increase in the variable x_j, and the term c_B'B^{-1}A_j is the cost of the compensating change in the basic variables necessitated by the constraint Ax = b.
Definition 3.2 Let x be a basic solution, let B be an associated basis matrix, and let c_B be the vector of costs of the basic variables. For each j, we define the reduced cost c̄_j of the variable x_j according to the formula
c̄_j = c_j − c_B'B^{-1}A_j.
Example 3.1 Consider the linear programming problem
minimize c1x1 + c2x2 + c3x3 + c4x4
subject to x1 + x2 + x3 + x4 = 2
2x1 + 3x3 + 4x4 = 2
x1, x2, x3, x4 ≥ 0.
The first two columns of the matrix A are A1 = (1, 2) and A2 = (1, 0). Since they are linearly independent, we can choose x1 and x2 as our basic variables. The corresponding basis matrix is
B = [ 1  1 ]
    [ 2  0 ].
Figure 3.2: Let n = 5, n − m = 2. As discussed in Section 1.4, we can visualize the feasible set by standing on the two-dimensional set defined by the constraint Ax = b, in which case, the edges of the feasible set are associated with the nonnegativity constraints x_i ≥ 0. At the nondegenerate basic feasible solution E, the variables x1 and x3 are at zero level (nonbasic) and x2, x4, x5 are positive basic variables. The first basic direction is obtained by increasing x1, while keeping the other nonbasic variable x3 at zero level. This is the direction corresponding to the edge EF. Consider now the degenerate basic feasible solution F and let x3, x5 be the nonbasic variables. Note that x4 is a basic variable at zero level. A basic direction is obtained by increasing x3, while keeping the other nonbasic variable x5 at zero level. This is the direction corresponding to the line FG and it takes us outside the feasible set. Thus, this basic direction is not a feasible direction.
We set x3 = x4 = 0, and solve for x1, x2, to obtain x1 = 1 and x2 = 1. We have thus obtained a nondegenerate basic feasible solution.
A basic direction corresponding to an increase in the nonbasic variable x3, is constructed as follows. We have d3 = 1 and d4 = 0. The direction of change of the basic variables is obtained using Eq. (3.1):
(d1, d2) = d_B = −B^{-1}A3 = (−3/2, 1/2).
The cost of moving along this basic direction is c'd = −3c1/2 + c2/2 + c3. This is the same as the reduced cost of the variable x3.
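Definition 3.2 can be evaluated mechanically on the data of Example 3.1. The sketch below assumes the cost vector c = (2, 0, 0, 0) that will be used later in Example 3.2, and computes every reduced cost c̄_j = c_j − c_B'B^{-1}A_j exactly.

```python
# Reduced costs for the data of Example 3.1 with basis {x1, x2} and an
# assumed cost vector c = (2, 0, 0, 0).
from fractions import Fraction as F

def solve(M, rhs):
    n = len(M)
    M = [row[:] + [r] for row, r in zip(M, rhs)]
    for col in range(n):
        piv = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[piv] = M[piv], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                M[r] = [a - M[r][col] * b for a, b in zip(M[r], M[col])]
    return [M[r][n] for r in range(n)]

A = [[F(1), F(1), F(1), F(1)],
     [F(2), F(0), F(3), F(4)]]
c = [F(2), F(0), F(0), F(0)]
basic = [0, 1]                                   # x1 and x2 are basic
B = [[A[r][j] for j in basic] for r in range(2)]
c_B = [c[j] for j in basic]

def reduced_cost(j):
    u = solve(B, [A[r][j] for r in range(2)])    # u = B^{-1} A_j
    return c[j] - sum(cb * ui for cb, ui in zip(c_B, u))

# Basic variables have zero reduced cost; cbar_3 = -3 matches the text.
assert [reduced_cost(j) for j in range(4)] == [F(0), F(0), F(-3), F(-4)]
```

Note that the two basic variables come out with reduced cost zero, in agreement with the identity derived next.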
Consider now Definition 3.2 for the case of a basic variable. Since B is the matrix [A_B(1) · · · A_B(m)], we have B^{-1}[A_B(1) · · · A_B(m)] = I, where I is the m × m identity matrix. In particular, B^{-1}A_B(i) is the ith column of the identity matrix, which is the ith unit vector e_i. Therefore, for every basic variable x_B(i), we have
c̄_B(i) = c_B(i) − c_B'B^{-1}A_B(i) = c_B(i) − c_B'e_i = c_B(i) − c_B(i) = 0,
and we see that the reduced cost of every basic variable is zero.
Our next result provides us with optimality conditions. Given our interpretation of the reduced costs as rates of cost change along certain directions, this result is intuitive.
Theorem 3.1 Consider a basic feasible solution x associated with a basis matrix B, and let c̄ be the corresponding vector of reduced costs.
(a) If c̄ ≥ 0, then x is optimal.
(b) If x is optimal and nondegenerate, then c̄ ≥ 0.
Proof.
(a) We assume that c̄ ≥ 0, we let y be an arbitrary feasible solution, and we define d = y − x. Feasibility implies that Ax = Ay = b and, therefore, Ad = 0. The latter equality can be rewritten in the form
B d_B + Σ_{i∈N} A_i d_i = 0,
where N is the set of indices corresponding to the nonbasic variables under the given basis. Since B is invertible, we obtain
d_B = −Σ_{i∈N} B^{-1}A_i d_i,
and
c'd = c_B'd_B + Σ_{i∈N} c_i d_i = Σ_{i∈N} (c_i − c_B'B^{-1}A_i) d_i = Σ_{i∈N} c̄_i d_i.
For any nonbasic index i ∈ N, we must have x_i = 0 and, since y is feasible, y_i ≥ 0. Thus, d_i ≥ 0 and c̄_i d_i ≥ 0, for all i ∈ N. We conclude that c'(y − x) = c'd ≥ 0, and since y was an arbitrary feasible solution, x is optimal.
(b) Suppose that x is a nondegenerate basic feasible solution and that c̄_j < 0 for some j. Since the reduced cost of a basic variable is always zero, x_j must be a nonbasic variable and c̄_j is the rate of cost change along the jth basic direction. Since x is nondegenerate, the jth basic direction is a feasible direction of cost decrease, as discussed earlier. By moving in that direction, we obtain feasible solutions whose cost is less than that of x, and x is not optimal.
Note that Theorem 3.1 allows the possibility that x is a (degenerate) optimal basic feasible solution, but that c̄_j < 0 for some nonbasic index j. There is an analog of Theorem 3.1 that provides conditions under which a basic feasible solution x is a unique optimal solution; see Exercise 3.3. A related view of the optimality conditions is developed in Exercises 3.2 and 3.3.
According to Theorem 3.1, in order to decide whether a nondegenerate basic feasible solution is optimal, we need only check whether all reduced costs are nonnegative, which is the same as examining the n − m basic directions. If x is a degenerate basic feasible solution, an equally simple computational test for determining whether x is optimal is not available (see Exercises 3.7 and 3.8). Fortunately, the simplex method, as developed in subsequent sections, manages to get around this difficulty in an effective manner.
Note that in order to use Theorem 3.1 and assert that a certain basic solution is optimal, we need to satisfy two conditions: feasibility, and nonnegativity of the reduced costs. This leads us to the following definition.
Definition 3.3 A basis matrix B is said to be optimal if:
(a) B^{-1}b ≥ 0, and
(b) c̄' = c' − c_B'B^{-1}A ≥ 0'.
Clearly, if an optimal basis is found, the corresponding basic solution is feasible, satisfies the optimality conditions, and is therefore optimal. On the other hand, in the degenerate case, having an optimal basic feasible solution does not necessarily mean that the reduced costs are nonnegative.
3.2 Development of the simplex method
We will now complete the development of the simplex method. Our main task is to work out the details of how to move to a better basic feasible solution, whenever a profitable basic direction is discovered.
Let us assume that every basic feasible solution is nondegenerate. This assumption will remain in effect until it is explicitly relaxed later in this section. Suppose that we are at a basic feasible solution x and that we have computed the reduced costs c̄_j of the nonbasic variables. If all of them are nonnegative, Theorem 3.1 shows that we have an optimal solution, and we stop. If on the other hand, the reduced cost c̄_j of a nonbasic variable x_j is negative, the jth basic direction d is a feasible direction of cost decrease. [This is the direction obtained by letting d_j = 1, d_i = 0 for i ∉ {B(1), . . . , B(m), j}, and d_B = −B^{-1}A_j.] While moving along this direction d, the nonbasic variable x_j becomes positive and all other nonbasic
variables remain at zero. We describe this situation by saying that x_j (or A_j) enters or is brought into the basis.
Once we start moving away from x along the direction d, we are tracing points of the form x + θd, where θ ≥ 0. Since costs decrease along the direction d, it is desirable to move as far as possible. This takes us to the point x + θ*d, where
θ* = max{θ ≥ 0 | x + θd ∈ P}.
The resulting cost change is θ*c'd, which is the same as θ*c̄_j.
We now derive a formula for θ*. Given that Ad = 0, we have A(x + θd) = Ax = b for all θ, and the equality constraints will never be violated. Thus, x + θd can become infeasible only if one of its components becomes negative. We distinguish two cases:
(a) If d ≥ 0, then x + θd ≥ 0 for all θ ≥ 0, the vector x + θd never becomes infeasible, and we let θ* = ∞.
(b) If d_i < 0 for some i, the constraint x_i + θd_i ≥ 0 becomes θ ≤ −x_i/d_i. This constraint on θ must be satisfied for every i with d_i < 0. Thus, the largest possible value of θ is
θ* = min_{ {i | d_i < 0} } ( −x_i / d_i ).
Recall that if x_i is a nonbasic variable, then either x_i is the entering variable and d_i = 1, or else d_i = 0. In either case, d_i is nonnegative. Thus, we only need to consider the basic variables and we have the equivalent formula
θ* = min_{ {i=1,...,m | d_B(i) < 0} } ( −x_B(i) / d_B(i) ).    (3.2)
Note that θ* > 0, because x_B(i) > 0 for all i, as a consequence of nondegeneracy.
Example 3.2 This is a continuation of Example 3.1 from the previous section, dealing with the linear programming problem
minimize c1x1 + c2x2 + c3x3 + c4x4
subject to x1 + x2 + x3 + x4 = 2
2x1 + 3x3 + 4x4 = 2
x1, x2, x3, x4 ≥ 0.
Let us again consider the basic feasible solution x = (1, 1, 0, 0) and recall that the reduced cost c̄3 of the nonbasic variable x3 was found to be −3c1/2 + c2/2 + c3. Suppose that c = (2, 0, 0, 0), in which case, we have c̄3 = −3. Since c̄3 is negative, we form the corresponding basic direction, which is d = (−3/2, 1/2, 1, 0), and consider vectors of the form x + θd, with θ ≥ 0. As θ increases, the only component of x that decreases is the first one (because d1 < 0). The largest possible value
of θ is given by θ* = −x1/d1 = 2/3. This takes us to the point y = x + 2d/3 = (0, 4/3, 2/3, 0). Note that the columns A2 and A3 corresponding to the nonzero variables at the new vector y are (1, 0) and (1, 3), respectively, and are linearly independent. Therefore, they form a basis and the vector y is a new basic feasible solution. In particular, the variable x3 has entered the basis and the variable x1 has exited the basis.
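The ratio test of Eq. (3.2) and the resulting step can be traced mechanically on the numbers of Example 3.2:

```python
# Ratio test (3.2) on the data of Example 3.2:
# x = (1, 1, 0, 0), d = (-3/2, 1/2, 1, 0).
from fractions import Fraction as F

x = [F(1), F(1), F(0), F(0)]
d = [F(-3, 2), F(1, 2), F(1), F(0)]

# theta* = min over {i | d_i < 0} of -x_i / d_i
ratios = [-xi / di for xi, di in zip(x, d) if di < 0]
theta = min(ratios)
y = [xi + theta * di for xi, di in zip(x, d)]   # the new point x + theta* d

assert theta == F(2, 3)
assert y == [F(0), F(4, 3), F(2, 3), F(0)]      # matches y in the text
```

Only the first component has d_i < 0, so the minimum is attained there, x1 is driven to zero, and x1 is the variable that exits the basis.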
Once θ* is chosen, and assuming it is finite, we move to the new feasible solution y = x + θ*d. Since x_j = 0 and d_j = 1, we have y_j = θ* > 0. Let ℓ be a minimizing index in Eq. (3.2), that is,
−x_B(ℓ)/d_B(ℓ) = min_{ {i | d_B(i) < 0} } ( −x_B(i)/d_B(i) ) = θ*;
in particular,
d_B(ℓ) < 0,
and
x_B(ℓ) + θ*d_B(ℓ) = 0.
We observe that the basic variable x_B(ℓ) has become zero, whereas the nonbasic variable x_j has now become positive, which suggests that x_j should replace x_B(ℓ) in the basis. Accordingly, we take the old basis matrix B and replace A_B(ℓ) with A_j, thus obtaining the matrix
B̄ = [ A_B(1) · · · A_B(ℓ−1)  A_j  A_B(ℓ+1) · · · A_B(m) ].    (3.3)
Equivalently, we are replacing the set {B(1), . . . , B(m)} of basic indices by a new set {B̄(1), . . . , B̄(m)} of indices given by
B̄(i) = B(i), i ≠ ℓ,
B̄(ℓ) = j.    (3.4)
Theorem 3.2
(a) The columns A_B(i), i ≠ ℓ, and A_j are linearly independent and, therefore, B̄ is a basis matrix.
(b) The vector y = x + θ*d is a basic feasible solution associated with the basis matrix B̄.
Proof.
(a) If the vectors A_B̄(i), i = 1, . . . , m, are linearly dependent, then there exist coefficients λ_1, . . . , λ_m, not all of them zero, such that
Σ_{i=1}^{m} λ_i A_B̄(i) = 0,
which implies that
Σ_{i=1}^{m} λ_i B^{-1}A_B̄(i) = 0,
and the vectors B^{-1}A_B̄(i) are also linearly dependent. To show that this is not the case, we will prove that the vectors B^{-1}A_B(i), i ≠ ℓ, and B^{-1}A_j are linearly independent. We have B^{-1}B = I. Since A_B(i) is the ith column of B, it follows that the vectors B^{-1}A_B(i), i ≠ ℓ, are all the unit vectors except for the ℓth unit vector. In particular, they are linearly independent and their ℓth component is zero. On the other hand, B^{-1}A_j is equal to −d_B. Its ℓth entry, −d_B(ℓ), is nonzero by the definition of ℓ. Thus, B^{-1}A_j is linearly independent from the unit vectors B^{-1}A_B(i), i ≠ ℓ.
(b) We have y ≥ 0, Ay = b, and y_i = 0 for i ∉ {B̄(1), . . . , B̄(m)}. Furthermore, the columns A_B̄(1), . . . , A_B̄(m) have just been shown to be linearly independent. It follows that y is a basic feasible solution associated with the basis matrix B̄.
Since θ* is positive, the new basic feasible solution x + θ*d is distinct from x; since d is a direction of cost decrease, the cost of this new basic feasible solution is strictly smaller. We have therefore accomplished our objective of moving to a new basic feasible solution with lower cost. We can now summarize a typical iteration of the simplex method, also known as a pivot (see Section 3.6 for a discussion of the origins of this term). For our purposes, it is convenient to define a vector u = (u_1, . . . , u_m) by letting
u = B^{-1}A_j,
where A_j is the column that enters the basis; in particular, u_i = −d_B(i), for i = 1, . . . , m.
An iteration of the simplex method
1. In a typical iteration, we start with a basis consisting of the basic columns A_B(1), . . . , A_B(m), and an associated basic feasible solution x.
2. Compute the reduced costs c̄_j = c_j − c_B'B^{-1}A_j for all nonbasic indices j. If they are all nonnegative, the current basic feasible solution is optimal, and the algorithm terminates; else, choose some j for which c̄_j < 0.
3. Compute u = B^{-1}A_j. If no component of u is positive, we have θ* = ∞, the optimal cost is −∞, and the algorithm terminates.
4. If some component of u is positive, let
θ* = min_{ {i=1,...,m | u_i > 0} } ( x_B(i) / u_i ).
5. Let ℓ be such that θ* = x_B(ℓ)/u_ℓ. Form a new basis by replacing A_B(ℓ) with A_j. If y is the new basic feasible solution, the values of the new basic variables are y_j = θ* and y_B(i) = x_B(i) − θ*u_i, i ≠ ℓ.
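The five steps above can be sketched as a short program. The code below is an illustration, not the book's implementation: it keeps exact rational arithmetic, selects the entering variable by the smallest subscript among those with negative reduced cost, and is run on the problem of Examples 3.1 and 3.2 starting from the basis {x1, x2}.

```python
# A minimal sketch of the simplex iteration of Section 3.2.
from fractions import Fraction as F

def solve(M, rhs):
    n = len(M)
    M = [row[:] + [r] for row, r in zip(M, rhs)]
    for col in range(n):
        piv = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[piv] = M[piv], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                M[r] = [a - M[r][col] * b for a, b in zip(M[r], M[col])]
    return [M[r][n] for r in range(n)]

def simplex(A, b, c, basic):
    m, n = len(A), len(A[0])
    while True:
        B = [[A[r][j] for j in basic] for r in range(m)]
        x_B = solve(B, b)                                  # step 1
        y = solve([list(col) for col in zip(*B)], [c[j] for j in basic])
        entering = None
        for j in range(n):                                 # step 2
            if j not in basic and c[j] - sum(y[r] * A[r][j] for r in range(m)) < 0:
                entering = j                               # smallest subscript
                break
        if entering is None:
            return sum(c[basic[i]] * x_B[i] for i in range(m))
        u = solve(B, [A[r][entering] for r in range(m)])   # step 3
        if all(ui <= 0 for ui in u):
            return None                                    # optimal cost is -infinity
        ell = min((i for i in range(m) if u[i] > 0),       # steps 4 and 5
                  key=lambda i: x_B[i] / u[i])
        basic[ell] = entering

A = [[F(1), F(1), F(1), F(1)],
     [F(2), F(0), F(3), F(4)]]
b = [F(2), F(2)]
c = [F(2), F(0), F(0), F(0)]
assert simplex(A, b, c, [0, 1]) == 0   # one pivot: x3 enters, x1 exits
```

On this problem a single pivot moves from the basis {x1, x2} (cost 2) to the optimal basis {x3, x2} (cost 0), exactly the pivot traced in Example 3.2.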
The simplex method is initialized with an arbitrary basic feasible solution, which, for feasible standard form problems, is guaranteed to exist. The following theorem states that, in the nondegenerate case, the simplex method works correctly and terminates after a finite number of iterations.
Theorem 3.3 Assume that the feasible set is nonempty and that every basic feasible solution is nondegenerate. Then, the simplex method terminates after a finite number of iterations. At termination, there are the following two possibilities:
(a) We have an optimal basis B and an associated basic feasible solution which is optimal.
(b) We have found a vector d satisfying Ad = 0, d ≥ 0, and c'd < 0, and the optimal cost is −∞.
Proof. If the algorithm terminates due to the stopping criterion in Step 2, then the optimality conditions in Theorem 3.1 have been met, B is an optimal basis, and the current basic feasible solution is optimal.
If the algorithm terminates because the criterion in Step 3 has been met, then we are at a basic feasible solution x and we have discovered a nonbasic variable x_j such that c̄_j < 0 and such that the corresponding basic direction d satisfies Ad = 0 and d ≥ 0. In particular, x + θd ∈ P for all θ > 0. Since c'd = c̄_j < 0, by taking θ arbitrarily large, the cost can be made arbitrarily negative, and the optimal cost is −∞.
At each iteration, the algorithm moves by a positive amount θ* along a direction d that satisfies c'd < 0. Therefore, the cost of every successive basic feasible solution visited by the algorithm is strictly less than the cost of the previous one, and no basic feasible solution can be visited twice. Since there is a finite number of basic feasible solutions, the algorithm must eventually terminate.
Theorem 3.3 provides an independent proof of some of the results of Chapter 2 for nondegenerate standard form problems. In particular, it shows that for feasible and nondegenerate problems, either the optimal cost is −∞, or there exists a basic feasible solution which is optimal (cf. Theorem 2.8 in Section 2.6). While the proof given here might appear more elementary, its extension to the degenerate case is not as simple.
The simplex method for degenerate problems
We have been working so far under the assumption that all basic feasible solutions are nondegenerate. Suppose now that the exact same algorithm is used in the presence of degeneracy. Then, the following new possibilities may be encountered in the course of the algorithm.
(a) If the current basic feasible solution x is degenerate, θ* can be equal to zero, in which case, the new basic feasible solution y is the same as x. This happens if some basic variable x_B(ℓ) is equal to zero and the corresponding component d_B(ℓ) of the direction vector d is negative. Nevertheless, we can still define a new basis B̄, by replacing A_B(ℓ) with A_j [cf. Eqs. (3.3)-(3.4)], and Theorem 3.2 is still valid.
(b) Even if θ* is positive, it may happen that more than one of the original basic variables becomes zero at the new point x + θ*d. Since only one of them exits the basis, the others remain in the basis at zero level, and the new basic feasible solution is degenerate.
Basis changes while staying at the same basic feasible solution are not in vain. As illustrated in Figure 3.3, a sequence of such basis changes may lead to the eventual discovery of a cost reducing feasible direction. On the other hand, a sequence of basis changes might lead back to the initial basis, in which case the algorithm may loop indefinitely. This undesirable phenomenon is called cycling. An example of cycling is given in Section 3.3, after we develop some bookkeeping tools for carrying out the mechanics of the algorithm. It is sometimes maintained that cycling is an exceptionally rare phenomenon. However, for many highly structured linear programming problems, most basic feasible solutions are degenerate, and cycling is a real possibility. Cycling can be avoided by judiciously choosing the variables that will enter or exit the basis (see Section 3.4). We now discuss the freedom available in this respect.
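Possibility (a) above, a pivot with θ* = 0, can be traced on a tiny hypothetical instance (the data below is not from the book): the basis changes while the point itself stays put.

```python
# A degenerate pivot: the basis changes, the step length theta* is zero.
from fractions import Fraction as F

# Standard form data: x1 + x3 = 1, x2 + x3 = 0, x >= 0.
A = [[F(1), F(0), F(1)],
     [F(0), F(1), F(1)]]
b = [F(1), F(0)]

basic = [0, 1]          # basis {x1, x2}; x = (1, 0, 0), with x2 = 0: degenerate
x_B = b                 # here B is the identity, so x_B = B^{-1}b = b
entering = 2            # try to bring x3 into the basis
u = [A[0][2], A[1][2]]  # u = B^{-1}A_3 = A_3 = (1, 1), since B = I

theta = min(x_B[i] / u[i] for i in range(2) if u[i] > 0)
assert theta == 0       # the zero-level basic variable x2 blocks any movement

ell = min((i for i in range(2) if u[i] > 0), key=lambda i: x_B[i] / u[i])
basic[ell] = entering   # new basis {x1, x3}; the point (1, 0, 0) is unchanged
assert basic == [0, 2]
```

The columns A1 = (1, 0) and A3 = (1, 1) are linearly independent, so {x1, x3} is a legitimate new basis associated with the same degenerate basic feasible solution, exactly as Theorem 3.2 guarantees.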
Pivot Selection
The simplex algorithm, as we described it, has certain degrees of freedom: in Step 2, we are free to choose any j whose reduced cost c̄_j is negative; also, in Step 5, there may be several indices ℓ that attain the minimum in the definition of θ*, and we are free to choose any one of them. Rules for making such choices are called pivoting rules.
Regarding the choice of the entering column, the following rules are some natural candidates:
Figure 3.3: We visualize a problem in standard form, with n − m = 2, by standing on the two-dimensional plane defined by the equality constraints Ax = b. The basic feasible solution x is degenerate. If x4 and x5 are the nonbasic variables, then the two corresponding basic directions are the vectors g and f. For either of these two basic directions, we have θ* = 0. However, if we perform a change of basis, with x4 entering the basis and x6 exiting, the new nonbasic variables are x5 and x6, and the two basic directions are h and −g. (The direction −g is the one followed if x5 is increased while x6 is kept at zero.) In particular, we can now follow direction h to reach a new basic feasible solution y with lower cost.
(a) Choose a column A_j, with c̄_j < 0, whose reduced cost is the most
negative. Since the reduced cost is the rate of change of the cost
function, this rule chooses a direction along which costs decrease at
the fastest rate. However, the actual cost decrease depends on how
far we move along the chosen direction. This suggests the next rule.
(b) Choose a column with c̄_j < 0 for which the corresponding cost de-
crease θ*|c̄_j| is largest. This rule offers the possibility of reaching
optimality after a smaller number of iterations. On the other hand,
the computational burden at each iteration is larger, because we need
to compute θ* for each column with c̄_j < 0. The available empirical
evidence suggests that the overall running time does not improve.
For large problems, even the rule that chooses the most negative c̄_j
can be computationally expensive, because it requires the computation of
the reduced cost of every variable. In practice, simpler rules are sometimes
used, such as the smallest subscript rule, which chooses the smallest j for
which c̄_j is negative. Under this rule, once a negative reduced cost is
discovered, there is no reason to compute the remaining reduced costs.
Other criteria that have been found to improve the overall running time
are the Devex rule (Harris, 1973) and the steepest edge rule (Goldfarb and
Reid, 1977). Finally, there are methods based on candidate lists, whereby one
examines the reduced costs of nonbasic variables by picking them one at
a time from a prioritized list. There are different ways of maintaining
such prioritized lists, depending on the rule used for adding, removing, or
reordering elements of the list.
Regarding the choice of the exiting column, the simplest option is
again the smallest subscript rule: out of all variables eligible to exit the
basis, choose one with the smallest subscript. It turns out that by following
the smallest subscript rule for both the entering and the exiting column,
cycling can be avoided (cf. Section 3.4).
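For concreteness, the two subscript-based entering rules can be sketched in a few lines of Python; the function names and the plain-list representation of the reduced costs are ours, for illustration only.

```python
def entering_most_negative(reduced_costs):
    """Dantzig's rule: return the index with the most negative reduced cost,
    or None if all reduced costs are nonnegative (the solution is optimal)."""
    best_j, best = None, 0
    for j, cbar in enumerate(reduced_costs):
        if cbar < best:
            best_j, best = j, cbar
    return best_j

def entering_smallest_subscript(reduced_costs):
    """Smallest subscript (Bland's) rule: return the first index with a
    negative reduced cost, so the scan can stop as soon as one is found."""
    for j, cbar in enumerate(reduced_costs):
        if cbar < 0:
            return j
    return None
```

On the reduced costs [2, -1, -3], the first rule selects index 2 and the second selects index 1; the second rule never needs to examine entries past the first negative one.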
3.3 Implementations of the simplex method
In this section, we discuss some ways of carrying out the mechanics of the
simplex method. It should be clear from the statement of the algorithm
that the vectors B⁻¹A_j play a key role. If these vectors are available,
the reduced costs, the direction of motion, and the stepsize θ* are easily
computed. Thus, the main difference between alternative implementations
lies in the way that the vectors B⁻¹A_j are computed and in the amount
of related information that is carried from one iteration to the next.
When comparing different implementations, it is important to keep
the following facts in mind (cf. Section 1.6). If B is a given m × m matrix
and b ∈ ℝ^m is a given vector, computing the inverse of B or solving a linear
system of the form Bx = b takes O(m³) arithmetic operations. Computing
a matrix-vector product Bb takes O(m²) operations. Finally, computing
an inner product p'b of two m-dimensional vectors takes O(m) arithmetic
operations.
Naive implementation
We start by describing the most straightforward implementation, in which
no auxiliary information is carried from one iteration to the next. At the
beginning of a typical iteration, we have the indices B(1), . . . , B(m) of
the current basic variables. We form the basis matrix B and compute
p' = c_B'B⁻¹ by solving the linear system p'B = c_B' for the unknown vector
p. (This vector p is called the vector of simplex multipliers associated with
the basis B.) The reduced cost c̄_j = c_j − c_B'B⁻¹A_j of any variable x_j is
then obtained according to the formula

    c̄_j = c_j − p'A_j.
Depending on the pivoting rule employed, we may have to compute all of the
reduced costs, or we may compute them one at a time until a variable with
a negative reduced cost is encountered. Once a column A_j is selected to
enter the basis, we solve the linear system Bu = A_j in order to determine
the vector u = B⁻¹A_j. At this point, we can form the direction along
which we will be moving away from the current basic feasible solution. We
finally determine θ* and the variable that will exit the basis, and construct
the new basic feasible solution.
We note that we need O(m³) arithmetic operations to solve the sys-
tems p'B = c_B' and Bu = A_j. In addition, computing the reduced costs of
all variables requires O(mn) arithmetic operations, because we need to form
the inner product of the vector p with each one of the nonbasic columns A_j.
Thus, the total computational effort per iteration is O(m³ + mn). We will
see shortly that alternative implementations require only O(m² + mn) arith-
metic operations. Therefore, the implementation described here is rather
inefficient, in general. On the other hand, for certain problems with a spe-
cial structure, the linear systems p'B = c_B' and Bu = A_j can be solved
very fast, in which case this implementation can be of practical interest.
We will revisit this point in Chapter 7, when we apply the simplex method
to network flow problems.
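A minimal sketch of one iteration of this naive implementation, in Python with exact rational arithmetic (all names are ours; the Gauss-Jordan routine stands in for whatever linear system solver is actually used):

```python
from fractions import Fraction

def solve(M, rhs):
    """Solve M z = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(M)
    A = [[Fraction(M[i][j]) for j in range(n)] + [Fraction(rhs[i])]
         for i in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        for r in range(n):
            if r != col and A[r][col] != 0:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][n] / A[i][i] for i in range(n)]

def naive_iteration(A, b, c, basis):
    """One iteration: solve p'B = c_B' and Bu = A_j from scratch, then do
    the ratio test.  Columns are scanned in index order until a negative
    reduced cost is found.  Returns ('optimal', x_B), ('unbounded', j), or
    ('pivot', entering index j, exiting variable, theta*)."""
    m, n = len(A), len(A[0])
    B = [[A[i][k] for k in basis] for i in range(m)]
    Bt = [[B[i][k] for i in range(m)] for k in range(m)]
    p = solve(Bt, [c[k] for k in basis])      # simplex multipliers
    xB = solve(B, b)                          # values of the basic variables
    for j in range(n):
        if j in basis:
            continue
        cbar = c[j] - sum(p[i] * A[i][j] for i in range(m))
        if cbar < 0:
            u = solve(B, [A[i][j] for i in range(m)])
            ratios = [(xB[i] / u[i], i) for i in range(m) if u[i] > 0]
            if not ratios:
                return ('unbounded', j)
            theta, l = min(ratios)
            return ('pivot', j, basis[l], theta)
    return ('optimal', xB)

# One iteration from an all-slack basis of a small standard form problem:
A = [[1, 2, 2, 1, 0, 0],
     [2, 1, 2, 0, 1, 0],
     [2, 2, 1, 0, 0, 1]]
step = naive_iteration(A, [20, 20, 20], [-10, -12, -12, 0, 0, 0], [3, 4, 5])
# x1 enters (j = 0), variable 4 exits, with theta* = 10
```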
Revised simplex method
Much of the computational burden in the naive implementation is due to
the need for solving two linear systems of equations. In an alternative
implementation, the matrix B⁻¹ is made available at the beginning of each
iteration, and the vectors c_B'B⁻¹ and B⁻¹A_j are computed by a matrix-
vector multiplication. For this approach to be practical, we need an efficient
method for updating the matrix B⁻¹ each time that we effect a change of
basis. This is discussed next.
Let

    B = [ A_B(1) · · · A_B(m) ]

be the basis matrix at the beginning of an iteration and let

    B̄ = [ A_B(1) · · · A_B(ℓ−1)  A_j  A_B(ℓ+1) · · · A_B(m) ]

be the basis matrix at the beginning of the next iteration. These two basis
matrices have the same columns except that the ℓth column A_B(ℓ) (the one
that exits the basis) has been replaced by A_j. It is then reasonable to expect
that B⁻¹ contains information that can be exploited in the computation of
B̄⁻¹. After we develop some needed tools and terminology, we will see that
this is indeed the case. An alternative explanation and line of development
is outlined in Exercise 3.13.
Definition 3.4 Given a matrix, not necessarily square, the operation
of adding a constant multiple of one row to the same or to another row
is called an elementary row operation.

The example that follows indicates that performing an elementary row
operation on a matrix C is equivalent to forming the matrix QC, where Q
is a suitably constructed square matrix.
Example 3.3 Let

    Q = | 1  0  2 |          C = | 1  2 |
        | 0  1  0 |,             | 3  4 |,
        | 0  0  1 |              | 5  6 |

and note that

    QC = | 11  14 |
         |  3   4 |.
         |  5   6 |

In particular, multiplication from the left by the matrix Q has the effect of mul-
tiplying the third row of C by two and adding it to the first row.
Generalizing Example 3.3, we see that multiplying the jth row by β
and adding it to the ith row (for i ≠ j) is the same as left-multiplying by
the matrix Q = I + βD_ij, where D_ij is a matrix with all entries equal to
zero, except for the (i, j)th entry, which is equal to β. The determinant of
such a matrix Q is equal to 1 and, therefore, Q is invertible.
Suppose now that we apply a sequence of K elementary row oper-
ations and that the kth such operation corresponds to left-multiplication
by a certain invertible matrix Q_k. Then, the sequence of these elementary
row operations is the same as left-multiplication by the invertible matrix
Q_K Q_{K−1} · · · Q_2 Q_1. We conclude that performing a sequence of elemen-
tary row operations on a given matrix is equivalent to left-multiplying that
matrix by a certain invertible matrix.
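This is easy to verify numerically; the short sketch below (names ours) builds the matrix Q = I + βD_ij and checks its effect on a small matrix C.

```python
def elementary(m, i, j, beta):
    """Return Q = I + beta * D_ij: left-multiplication by Q adds beta times
    row j to row i of whatever matrix it multiplies."""
    Q = [[1 if r == c else 0 for c in range(m)] for r in range(m)]
    Q[i][j] = beta
    return Q

def matmul(P, C):
    rows, inner, cols = len(P), len(C), len(C[0])
    return [[sum(P[r][k] * C[k][c] for k in range(inner)) for c in range(cols)]
            for r in range(rows)]

C = [[1, 2], [3, 4], [5, 6]]
Q = elementary(3, 0, 2, 2)   # add twice the third row to the first row
assert matmul(Q, C) == [[11, 14], [3, 4], [5, 6]]
```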
Since B⁻¹B = I, we see that B⁻¹A_B(i) is the ith unit vector e_i.
Using this observation, we have

    B⁻¹B̄ = [ e_1 · · · e_{ℓ−1}  u  e_{ℓ+1} · · · e_m ],
where u = B⁻¹A_j. Let us apply a sequence of elementary row operations
that will change the above matrix to the identity matrix. In particular,
consider the following sequence of elementary row operations:
(a) For each i ≠ ℓ, we add the ℓth row times −u_i/u_ℓ to the ith row.
(Recall that u_ℓ > 0.) This replaces u_i by zero.
(b) We divide the ℓth row by u_ℓ. This replaces u_ℓ by one.
In words, we are adding to each row a multiple of the ℓth row to
replace the ℓth column u by the ℓth unit vector e_ℓ. This sequence of ele-
mentary row operations is equivalent to left-multiplying B⁻¹B̄ by a certain
invertible matrix Q. Since the result is the identity, we have QB⁻¹B̄ = I,
which yields QB⁻¹ = B̄⁻¹. The last equation shows that if we apply
the same sequence of row operations to the matrix B⁻¹ (equivalently, left-
multiply by Q), we obtain B̄⁻¹. We conclude that all it takes to generate
B̄⁻¹ is to start with B⁻¹ and apply the sequence of elementary row oper-
ations described above.
Example 3.4 Let

    B⁻¹ = |  1   2   3 |
          | −2   3   1 |,          u = | −2 |
          |  4  −3  −2 |               |  2 |
                                       |  2 |

and suppose that ℓ = 3. Thus, our objective is to transform the vector u to the
unit vector e_3 = (0, 0, 1). We multiply the third row by 1 and add it to the first
row. We subtract the third row from the second row. Finally, we divide the third
row by 2. We obtain

    B̄⁻¹ = |  5   −1    1 |
          | −6    6    3 |.
          |  2  −1.5  −1 |
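The two row operations above translate directly into code. The sketch below (name ours, exact rational arithmetic for clarity) forms the augmented matrix [B⁻¹ | u], performs the operations that turn u into the unit vector e_ℓ, and returns the first m columns, which form B̄⁻¹.

```python
from fractions import Fraction

def update_inverse(Binv, u, l):
    """Given B^{-1}, the pivot column u = B^{-1}A_j, and the pivot row l
    (0-indexed), return the inverse of the new basis matrix."""
    m = len(Binv)
    T = [[Fraction(x) for x in row] + [Fraction(u[i])]
         for i, row in enumerate(Binv)]
    for i in range(m):
        if i != l and T[i][m] != 0:
            f = T[i][m] / T[l][m]              # add -u_i/u_l times row l
            T[i] = [a - f * b for a, b in zip(T[i], T[l])]
    T[l] = [a / T[l][m] for a in T[l]]         # divide row l by u_l
    return [row[:m] for row in T]
```

For instance, starting from B⁻¹ = I with pivot column u = (1, 2, 2) and ℓ = 2 (index 1 here), the result is the inverse of the matrix obtained from I by replacing its second column with (1, 2, 2).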
When the matrix B⁻¹ is updated in the manner we have described, we ob-
tain an implementation of the simplex method known as the revised simplex
method, which we summarize below.

An iteration of the revised simplex method
1. In a typical iteration, we start with a basis consisting of the basic
columns A_B(1), . . . , A_B(m), an associated basic feasible solution
x, and the inverse B⁻¹ of the basis matrix.
2. Compute the row vector p' = c_B'B⁻¹ and then compute the re-
duced costs c̄_j = c_j − p'A_j. If they are all nonnegative, the
current basic feasible solution is optimal, and the algorithm ter-
minates; else, choose some j for which c̄_j < 0.
3. Compute u = B⁻¹A_j. If no component of u is positive, the
optimal cost is −∞, and the algorithm terminates.
4. If some component of u is positive, let

    θ* = min{ x_B(i)/u_i : i = 1, . . . , m, u_i > 0 }.

5. Let ℓ be such that θ* = x_B(ℓ)/u_ℓ. Form a new basis by replacing
A_B(ℓ) with A_j. If y is the new basic feasible solution, the values
of the new basic variables are y_j = θ* and y_B(i) = x_B(i) − θ*u_i,
i ≠ ℓ.
6. Form the m × (m + 1) matrix [B⁻¹ | u]. Add to each one of
its rows a multiple of the ℓth row to make the last column equal
to the unit vector e_ℓ. The first m columns of the result form the
matrix B̄⁻¹.
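The six steps can be prototyped compactly. The sketch below (names ours) assumes the initial basis matrix is the identity, for example an all-slack basis, and uses the smallest subscript rule for both the entering and the exiting choice, so that it also terminates on degenerate problems; a production implementation would differ in many ways.

```python
from fractions import Fraction

def revised_simplex(A, b, c, basis):
    """Revised simplex, assuming the initial basis matrix is the identity
    so that B^{-1} = I to start.  Exact rational arithmetic throughout."""
    m, n = len(A), len(A[0])
    basis = list(basis)
    Binv = [[Fraction(int(i == k)) for k in range(m)] for i in range(m)]
    xB = [Fraction(v) for v in b]
    while True:
        # Step 2: p' = c_B' B^{-1}, then scan the reduced costs.
        p = [sum(c[basis[k]] * Binv[k][i] for k in range(m)) for i in range(m)]
        j = next((j for j in range(n) if j not in basis
                  and c[j] - sum(p[i] * A[i][j] for i in range(m)) < 0), None)
        if j is None:                          # optimal basis found
            x = [Fraction(0)] * n
            for k in range(m):
                x[basis[k]] = xB[k]
            return x, sum(c[i] * x[i] for i in range(n))
        # Step 3: pivot column u = B^{-1} A_j.
        u = [sum(Binv[i][k] * A[k][j] for k in range(m)) for i in range(m)]
        if all(ui <= 0 for ui in u):
            return None, None                  # optimal cost is -infinity
        # Steps 4-5: ratio test, ties broken by smallest subscript.
        theta = min(xB[i] / u[i] for i in range(m) if u[i] > 0)
        l = min((i for i in range(m) if u[i] > 0 and xB[i] / u[i] == theta),
                key=lambda i: basis[i])
        # Step 6 (plus the update of x_B): row operations on B^{-1}.
        for i in range(m):
            if i != l:
                xB[i] -= theta * u[i]
                f = u[i] / u[l]
                Binv[i] = [a - f * bb for a, bb in zip(Binv[i], Binv[l])]
        Binv[l] = [a / u[l] for a in Binv[l]]
        xB[l] = theta
        basis[l] = j

A = [[1, 2, 2, 1, 0, 0],
     [2, 1, 2, 0, 1, 0],
     [2, 2, 1, 0, 0, 1]]
x, cost = revised_simplex(A, [20, 20, 20], [-10, -12, -12, 0, 0, 0], [3, 4, 5])
assert cost == -136 and x[:3] == [4, 4, 4]
```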
The full tableau implementation
We finally describe the implementation of the simplex method in terms of the
so-called full tableau. Here, instead of maintaining and updating the matrix
B⁻¹, we maintain and update the m × (n + 1) matrix

    B⁻¹[ b | A ],

with columns B⁻¹b and B⁻¹A_1, . . . , B⁻¹A_n. This matrix is called the
simplex tableau. Note that the column B⁻¹b, called the zeroth column,
contains the values of the basic variables. The column B⁻¹A_i is called the
ith column of the tableau. The column u = B⁻¹A_j corresponding to the
variable that enters the basis is called the pivot column. If the ℓth basic
variable exits the basis, the ℓth row of the tableau is called the pivot row.
Finally, the element belonging to both the pivot row and the pivot column
is called the pivot element. Note that the pivot element is u_ℓ, and it is always
positive (unless u ≤ 0, in which case the algorithm has met the termination
condition in Step 3).
The information contained in the rows of the tableau admits the fol-
lowing interpretation. The equality constraints are initially given to us
in the form b = Ax. Given the current basis matrix B, these equality
constraints can also be expressed in the equivalent form

    B⁻¹b = B⁻¹Ax,

which is precisely the information in the tableau. In other words, the rows
of the tableau provide us with the coefficients of the equality constraints
B⁻¹b = B⁻¹Ax.
At the end of each iteration, we need to update the tableau B⁻¹[b | A]
and compute B̄⁻¹[b | A]. This can be accomplished by left-multiplying the
simplex tableau with a matrix Q satisfying QB⁻¹ = B̄⁻¹. As explained
earlier, this is the same as performing those elementary row operations that
turn B⁻¹ into B̄⁻¹; that is, we add to each row a multiple of the pivot row to
set all entries of the pivot column to zero, with the exception of the pivot
element, which is set to one.
Regarding the determination of the exiting column A_B(ℓ) and the
stepsize θ*, Steps 4 and 5 in the summary of the simplex method amount
to the following: x_B(i)/u_i is the ratio of the ith entry in the zeroth column
of the tableau to the ith entry in the pivot column of the tableau. We only
consider those i for which u_i is positive. The smallest ratio is equal to θ*
and determines ℓ.
It is customary to augment the simplex tableau by including a top
row, to be referred to as the zeroth row. The entry at the top left corner
contains the value −c_B'x_B, which is the negative of the current cost. (The
reason for the minus sign is that it allows for a simple update rule, as will
be seen shortly.) The rest of the zeroth row is the row vector of reduced
costs, that is, the vector c̄' = c' − c_B'B⁻¹A. Thus, the structure of the
tableau is:
    −c_B'B⁻¹b | c' − c_B'B⁻¹A
    ----------+----------------
       B⁻¹b   |      B⁻¹A

or, in more detail,

    −c_B'x_B | c̄_1  · · ·  c̄_n
    ---------+---------------------
     x_B(1)  |
        ⋮    |  B⁻¹A_1 · · · B⁻¹A_n
     x_B(m)  |
The rule for updating the zeroth row turns out to be identical to the
rule used for the other rows of the tableau: add a multiple of the pivot row
to the zeroth row so as to set the reduced cost of the entering variable to zero.
We will now verify that this update rule produces the correct results for
the zeroth row.
At the beginning of a typical iteration, the zeroth row is of the form

    [0 | c'] − g'[b | A],

where g' = c_B'B⁻¹. Hence, the zeroth row is equal to [0 | c'] plus a linear
combination of the rows of [b | A]. Let column j be the pivot column, and
row ℓ be the pivot row. Note that the pivot row is of the form h'[b | A],
where the vector h' is the ℓth row of B⁻¹. Hence, after a multiple of the
pivot row is added to the zeroth row, that row is again equal to [0 | c'] plus
a (different) linear combination of the rows of [b | A], and is of the form

    [0 | c'] − p'[b | A],

for some vector p. Recall that our update rule is such that the pivot column
entry of the zeroth row becomes zero, that is,

    c_j − p'A_j = 0.

Consider now the B(i)th column for i ≠ ℓ. (This is a column corresponding
to a basic variable that stays in the basis.) The zeroth row entry of that
column is zero before the change of basis, since it is the reduced cost of
a basic variable. Because B⁻¹A_B(i) is the ith unit vector and i ≠ ℓ, the
entry in the pivot row for that column is also equal to zero. Hence, adding
a multiple of the pivot row to the zeroth row of the tableau does not affect
the zeroth row entry of that column, which is left at zero. We conclude
that the vector p satisfies c_B̄(i) − p'A_B̄(i) = 0 for every column A_B̄(i) in
the new basis. This implies that c_B̄' − p'B̄ = 0', and p' = c_B̄'B̄⁻¹. Hence,
with our update rule, the updated zeroth row of the tableau is equal to

    [0 | c'] − c_B̄'B̄⁻¹[b | A],

as desired.
We can now summarize the mechanics of the full tableau implementation.
An iteration of the full tableau implementation
1. A typical iteration starts with the tableau associated with a basis
matrix B and the corresponding basic feasible solution x.
2. Examine the reduced costs in the zeroth row of the tableau. If
they are all nonnegative, the current basic feasible solution is
optimal, and the algorithm terminates; else, choose some j for
which c̄_j < 0.
3. Consider the vector u = B⁻¹A_j, which is the jth column (the
pivot column) of the tableau. If no component of u is positive,
the optimal cost is −∞, and the algorithm terminates.
4. For each i for which u_i is positive, compute the ratio x_B(i)/u_i.
Let ℓ be the index of a row that corresponds to the smallest ratio.
The column A_B(ℓ) exits the basis and the column A_j enters the
basis.
5. Add to each row of the tableau a constant multiple of the ℓth
row (the pivot row) so that u_ℓ (the pivot element) becomes one
and all other entries of the pivot column become zero.
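Step 5 is a single sweep of elementary row operations over the whole tableau, zeroth row included; a minimal sketch in exact arithmetic (names ours):

```python
from fractions import Fraction

def pivot(tableau, l, j):
    """Pivot on row l, column j (row 0 is the zeroth row, column 0 the
    zeroth column): scale row l so the pivot element becomes one, then add
    multiples of it to every other row to zero out the rest of column j."""
    T = [[Fraction(x) for x in row] for row in tableau]
    T[l] = [x / T[l][j] for x in T[l]]
    for i in range(len(T)):
        if i != l and T[i][j] != 0:
            f = T[i][j]
            T[i] = [a - f * b for a, b in zip(T[i], T[l])]
    return T

# A tiny illustration: one pivot on a 2-row tableau.
T = pivot([[0, -2, 0],
           [4,  2, 1]], 1, 1)
assert T == [[4, 0, 1], [2, 1, Fraction(1, 2)]]
```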
Figure 3.4. The feasible set in Example 3.5. Note that we
have five extreme points. These are A = (0, 0, 0) with cost 0,
B = (0, 0, 10) with cost −120, C = (0, 10, 0) with cost −120,
D = (10, 0, 0) with cost −100, and E = (4, 4, 4) with cost −136. In
particular, E is the unique optimal solution.
Example 3.5 Consider the problem

    minimize    −10x1 − 12x2 − 12x3
    subject to     x1 +  2x2 +  2x3  ≤  20
                  2x1 +   x2 +  2x3  ≤  20
                  2x1 +  2x2 +   x3  ≤  20
                   x1, x2, x3 ≥ 0.

The feasible set is shown in Figure 3.4.
After introducing slack variables, we obtain the following standard form
problem:

    minimize    −10x1 − 12x2 − 12x3
    subject to     x1 +  2x2 +  2x3 + x4            =  20
                  2x1 +   x2 +  2x3      + x5       =  20
                  2x1 +  2x2 +   x3           + x6  =  20
                   x1, . . . , x6 ≥ 0.

Note that x = (0, 0, 0, 20, 20, 20) is a basic feasible solution and can be used to
start the algorithm. Let, accordingly, B(1) = 4, B(2) = 5, and B(3) = 6. The
corresponding basis matrix is the identity matrix I. To obtain the zeroth row of
the initial tableau, we note that c_B = 0 and, therefore, c_B'x_B = 0 and c̄ = c.
Hence, we have the following initial tableau:

              x1     x2     x3     x4    x5    x6
        0    −10    −12    −12     0     0     0
    x4 = 20    1      2      2     1     0     0
    x5 = 20    2*     1      2     0     1     0
    x6 = 20    2      2      1     0     0     1
We note a few conventions in the format of the above tableau: the label x_i
on top of the ith column indicates the variable associated with that column. The
labels "x_i =" to the left of the tableau tell us which are the basic variables and in
what order. For example, the first basic variable x_B(1) is x4, and its value is 20.
Similarly, x_B(2) = x5 = 20, and x_B(3) = x6 = 20. Strictly speaking, these labels
are not quite necessary. We know that the column in the tableau associated with
the first basic variable must be the first unit vector. Once we observe that the
column associated with the variable x4 is the first unit vector, it follows that x4
is the first basic variable.
We continue with our example. The reduced cost of x1 is negative and we
let that variable enter the basis. The pivot column is u = (1, 2, 2). We form the
ratios x_B(i)/u_i, i = 1, 2, 3; the smallest ratio corresponds to i = 2 and i = 3. We
break this tie by choosing ℓ = 2. This determines the pivot element, which we
indicate by an asterisk. The second basic variable x_B(2), which is x5, exits the
basis. The new basis is given by B(1) = 4, B(2) = 1, and B(3) = 6. We multiply
the pivot row by 5 and add it to the zeroth row. We multiply the pivot row by
1/2 and subtract it from the first row. We subtract the pivot row from the third
row. Finally, we divide the pivot row by 2. This leads us to the new tableau:

              x1     x2     x3     x4     x5     x6
      100      0     −7     −2     0      5      0
    x4 = 10    0     1.5     1*    1    −0.5     0
    x1 = 10    1     0.5     1     0     0.5     0
    x6 =  0    0      1     −1     0     −1      1
The corresponding basic feasible solution is x = (10, 0, 0, 10, 0, 0). In terms
of the original variables x1, x2, x3, we have moved to point D = (10, 0, 0) in
Figure 3.4. Note that this is a degenerate basic feasible solution, because the
basic variable x6 is equal to zero. This agrees with Figure 3.4, where we observe
that there are four active constraints at point D.
We have mentioned earlier that the rows of the tableau (other than the
zeroth row) amount to a representation of the equality constraints B⁻¹Ax =
B⁻¹b, which are equivalent to the original constraints Ax = b. In our current
example, the tableau indicates that the equality constraints can be written in the
equivalent form:

    10  =         1.5x2 + x3 + x4 − 0.5x5,
    10  =  x1 + 0.5x2 + x3      + 0.5x5,
     0  =           x2 − x3      −    x5 + x6.
We now return to the simplex method. With the current tableau, the
variables x2 and x3 have negative reduced costs. Let us choose x3 to be the one
that enters the basis. The pivot column is u = (1, 1, −1). Since u3 < 0, we only
form the ratios x_B(i)/u_i for i = 1, 2. There is again a tie, which we break by
letting ℓ = 1, and the first basic variable, x4, exits the basis. The pivot element is
again indicated by an asterisk. After carrying out the necessary elementary row
operations, we obtain the following new tableau:

              x1     x2     x3     x4     x5     x6
      120      0     −4      0     2      4      0
    x3 = 10    0     1.5     1     1    −0.5     0
    x1 =  0    1     −1      0    −1      1      0
    x6 = 10    0     2.5*    0     1    −1.5     1

In terms of Figure 3.4, we have moved to point B = (0, 0, 10), and the cost
has been reduced to −120. At this point, x2 is the only variable with negative
reduced cost. We bring x2 into the basis, x6 exits, and the resulting tableau is:
              x1     x2     x3     x4     x5     x6
      136      0      0      0    3.6    1.6    1.6
    x3 =  4    0      0      1    0.4    0.4   −0.6
    x1 =  4    1      0      0   −0.6    0.4    0.4
    x2 =  4    0      1      0    0.4   −0.6    0.4

We have now reached point E in Figure 3.4. Its optimality is confirmed by
observing that all reduced costs are nonnegative.
In this example, the simplex method took three changes of basis to reach
the optimal solution, and it traced the path A-D-B-E in Figure 3.4. With
different pivoting rules, a different path would have been traced. Could the
simplex method have solved the problem by tracing the path A-D-E, which
involves only two edges, with only two iterations? The answer is no. The initial
and final bases differ in three columns, and therefore at least three basis changes
are required. In particular, if the method were to trace the path A-D-E, there
would be a degenerate change of basis at point D (with no edge being traversed),
which would again bring the total to three.
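As a check on the arithmetic of this example, the three pivots can be replayed mechanically; the (row, column) pairs below are the pivots chosen in the text, with column 0 holding the zeroth column (this small script and its names are ours):

```python
from fractions import Fraction

def pivot(T, l, j):
    """One full tableau pivot on row l, column j, in exact arithmetic."""
    T = [[Fraction(x) for x in row] for row in T]
    T[l] = [x / T[l][j] for x in T[l]]
    for i in range(len(T)):
        if i != l and T[i][j] != 0:
            f = T[i][j]
            T[i] = [a - f * b for a, b in zip(T[i], T[l])]
    return T

# Initial tableau: zeroth column first, then columns x1..x6.
T = [[ 0, -10, -12, -12, 0, 0, 0],
     [20,   1,   2,   2, 1, 0, 0],
     [20,   2,   1,   2, 0, 1, 0],
     [20,   2,   2,   1, 0, 0, 1]]

costs = []
for l, j in [(2, 1), (1, 3), (3, 2)]:   # the three pivots chosen in the text
    T = pivot(T, l, j)
    costs.append(-T[0][0])              # top left corner = minus the cost

assert costs == [-100, -120, -136]      # the costs at points D, B, and E
assert [row[0] for row in T[1:]] == [4, 4, 4]
```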
Example 3.6 This example shows that the simplex method can indeed cycle.
We consider a problem described in terms of the following initial tableau.

             x1      x2      x3     x4    x5    x6    x7
       3    −3/4     20    −1/2     6     0     0     0
    x5 = 0   1/4*    −8     −1      9     1     0     0
    x6 = 0   1/2    −12    −1/2     3     0     1     0
    x7 = 1    0       0      1      0     0     0     1

We use the following pivoting rules:
(a) We select a nonbasic variable with the most negative reduced cost c̄_j to be
the one that enters the basis.
(b) Out of all basic variables that are eligible to exit the basis, we select the
one with the smallest subscript.
We then obtain the following sequence of tableaux (the pivot element is indicated
by an asterisk):

             x1      x2      x3     x4     x5    x6    x7
       3      0      −4     −7/2    33     3     0     0
    x1 = 0    1     −32     −4      36     4     0     0
    x6 = 0    0       4*     3/2   −15    −2     1     0
    x7 = 1    0       0      1       0     0     0     1

             x1      x2      x3      x4      x5     x6    x7
       3      0       0     −2       18      1      1     0
    x1 = 0    1       0      8*     −84    −12      8     0
    x2 = 0    0       1      3/8   −15/4   −1/2    1/4    0
    x7 = 1    0       0      1        0      0      0     1

             x1      x2      x3      x4      x5      x6    x7
       3     1/4      0      0      −3      −2       3     0
    x3 = 0   1/8      0      1    −21/2    −3/2      1     0
    x2 = 0  −3/64     1      0      3/16*   1/16   −1/8    0
    x7 = 1  −1/8      0      0     21/2     3/2     −1     1

             x1      x2      x3     x4    x5     x6    x7
       3    −1/2     16      0      0    −1      1     0
    x3 = 0  −5/2     56      1      0     2*    −6     0
    x4 = 0  −1/4    16/3     0      1    1/3   −2/3    0
    x7 = 1   5/2    −56      0      0    −2      6     1

             x1      x2      x3     x4    x5    x6     x7
       3    −7/4     44     1/2     0     0    −2      0
    x5 = 0  −5/4     28     1/2     0     1    −3      0
    x4 = 0   1/6     −4    −1/6     1     0    1/3*    0
    x7 = 1    0       0      1      0     0     0      1

             x1      x2      x3     x4    x5    x6    x7
       3    −3/4     20    −1/2     6     0     0     0
    x5 = 0   1/4     −8     −1      9     1     0     0
    x6 = 0   1/2    −12    −1/2     3     0     1     0
    x7 = 1    0       0      1      0     0     0     1

After six pivots, we have the same basis and the same tableau that we started
with. At each basis change, we had θ* = 0. In particular, for each interme-
diate tableau, we had the same feasible solution and the same cost. The same
sequence of pivots can be repeated over and over, and the simplex method never
terminates.
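The cycle can be reproduced mechanically: the sketch below (names ours) applies the two pivoting rules above to the initial tableau, in exact arithmetic, and confirms that six pivots lead back to the starting basis and tableau.

```python
from fractions import Fraction as F

# Initial tableau of the cycling example: zeroth column first, then x1..x7.
T0 = [[3, F(-3, 4),  20, F(-1, 2), 6, 0, 0, 0],
      [0, F(1, 4),   -8,  -1,      9, 1, 0, 0],
      [0, F(1, 2),  -12, F(-1, 2), 3, 0, 1, 0],
      [1, 0,          0,   1,      0, 0, 0, 1]]

def step(T, basis):
    """One full tableau iteration: the most negative reduced cost enters,
    and ties in the ratio test are broken by the smallest subscript."""
    T = [[F(x) for x in row] for row in T]
    j = min(range(1, 8), key=lambda k: T[0][k])           # entering column
    rows = [i for i in range(1, 4) if T[i][j] > 0]
    theta = min(T[i][0] / T[i][j] for i in rows)
    l = min((i for i in rows if T[i][0] / T[i][j] == theta),
            key=lambda i: basis[i - 1])                   # exit tie-break
    basis[l - 1] = j
    T[l] = [x / T[l][j] for x in T[l]]
    for i in range(4):
        if i != l and T[i][j] != 0:
            f = T[i][j]
            T[i] = [a - f * b for a, b in zip(T[i], T[l])]
    return T, basis

T, basis = T0, [5, 6, 7]
for _ in range(6):
    T, basis = step(T, basis)

# After six pivots we are back at the starting basis and tableau.
assert basis == [5, 6, 7]
assert T == [[F(x) for x in row] for row in T0]
```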
Comparison of the full tableau and the revised simplex
methods
Let us pretend that the problem is changed to

    minimize    c'x + 0'y
    subject to  Ax + Iy = b
                x, y ≥ 0.

We implement the simplex method on this new problem, except that we
never allow any of the components of the vector y to become basic. Then,
the simplex method performs basis changes as if the vector y were entirely
absent. Note also that the vector of reduced costs in the augmented problem
is

    [ c' − c_B'B⁻¹A | 0' − c_B'B⁻¹ ].

Thus, the simplex tableau for the augmented problem takes the form

    −c_B'B⁻¹b | c' − c_B'B⁻¹A | −c_B'B⁻¹
    ----------+----------------+----------
       B⁻¹b   |      B⁻¹A      |    B⁻¹

In particular, by following the mechanics of the full tableau method on the
above tableau, the inverse basis matrix B⁻¹ is made available at each iter-
ation. We can now think of the revised simplex method as being essentially
the same as the full tableau method applied to the above augmented prob-
lem, except that the part of the tableau containing B⁻¹A is never formed
explicitly; instead, once the entering variable x_j is chosen, the pivot column
B⁻¹A_j is computed on the fly. Thus, the revised simplex method is just
a variant of the full tableau method, with more efficient bookkeeping. If
the revised simplex method also updates the zeroth row entries that lie on
top of B⁻¹ (by the usual elementary row operations), the simplex multipliers
p' = c_B'B⁻¹ become available, thus eliminating the need for solving the
linear system p'B = c_B' at each iteration.
We now discuss the relative merits of the two methods. The full
tableau method requires a constant (and small) number of arithmetic op-
erations for updating each entry of the tableau. Thus, the amount of com-
putation per iteration is proportional to the size of the tableau, which is
O(mn). The revised simplex method uses similar computations to update
B⁻¹ and c_B'B⁻¹, and since only O(m²) entries are updated, the compu-
tational requirements per iteration are O(m²). In addition, the reduced
cost of each variable x_j can be computed by forming the inner product
p'A_j, which requires O(m) operations. In the worst case, the reduced cost
of every variable is computed, for a total of O(mn) computations per it-
eration. Since m ≤ n, the worst-case computational effort per iteration is
O(mn + m²) = O(mn), under either implementation. On the other hand, if
we consider a pivoting rule that evaluates one reduced cost at a time, until
a negative reduced cost is found, a typical iteration of the revised simplex
method might require a lot less work. In the best case, if the first reduced
cost computed is negative, and the corresponding variable is chosen to en-
ter the basis, the total computational effort is only O(m²). The conclusion
is that the revised simplex method cannot be slower than the full tableau
method, and could be much faster during most iterations.
Another important element in favor of the revised simplex method
is that memory requirements are reduced from O(mn) to O(m²). As n is
often much larger than m, this effect can be quite significant. It could be
counterargued that the memory requirements of the revised simplex method
are also O(mn) because of the need to store the matrix A. However, in
most large scale problems that arise in applications, the matrix A is very
sparse (has many zero entries) and can be stored compactly. (Note that the
sparsity of A does not usually help in the storage of the full simplex tableau,
because even if A and B are sparse, B⁻¹A is not sparse, in general.)
We summarize this discussion in the following table:

                        Full tableau    Revised simplex
    Memory                 O(mn)            O(m²)
    Worst-case time        O(mn)            O(mn)
    Best-case time         O(mn)            O(m²)

Table 3.1. Comparison of the full tableau method and revised
simplex. The time requirements refer to a single iteration.
Practical performance enhancements
Practical implementations of the simplex method aimed at solving problems
of moderate or large size incorporate a number of additional ideas from
numerical linear algebra, which we briefly mention.
The first idea is related to reinversion. Recall that at each iteration
of the revised simplex method, the inverse basis matrix B⁻¹ is updated
according to certain rules. Each such iteration may introduce roundoff
or truncation errors which accumulate and may eventually lead to highly
inaccurate results. For this reason, it is customary to recompute the matrix
B⁻¹ from scratch once in a while. The efficiency of such reinversions can be
greatly enhanced by using suitable data structures and certain techniques
from computational linear algebra.
Another set of ideas is related to the way that the inverse basis matrix
B⁻¹ is represented. Suppose that a reinversion has just been carried out
and B⁻¹ is available. Subsequent to the current iteration of the revised
simplex method, we have the option of generating explicitly and storing
the new inverse basis matrix B̄⁻¹. An alternative that carries the same
information is to store a matrix Q such that QB⁻¹ = B̄⁻¹. Note that Q
basically prescribes which elementary row operations need to be applied to
B⁻¹ in order to produce B̄⁻¹. It is not a full matrix, and it can be completely
specified in terms of m coefficients: for each row, we need to know what
multiple of the pivot row must be added to it.
Suppose now that we wish to solve the system B̄u = A_j for u, where
A_j is the entering column, as is required by the revised simplex method.
We have u = B̄⁻¹A_j = QB⁻¹A_j, which shows that we can first compute
B⁻¹A_j and then left-multiply by Q (equivalently, apply a sequence of el-
ementary row operations) to produce u. The same idea can also be used
to represent the inverse basis matrix after several simplex iterations, as a
product of the initial inverse basis matrix and several sparse matrices like
Q.
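A sketch of this product-form idea (function names ours): Q is stored as the pivot row index, the pivot element, and one multiplier per row, and applying it to a vector costs only O(m) operations.

```python
from fractions import Fraction

def make_eta(u, l):
    """Represent the update matrix Q implicitly: the pivot row index l, the
    pivot element u_l, and, for every row i, the multiple of the pivot row
    that must be added to it.  Only m numbers are stored, never a full
    matrix."""
    u = [Fraction(x) for x in u]
    return l, u[l], [-u[i] / u[l] for i in range(len(u))]

def apply_eta(eta, v):
    """Compute Q v in O(m) time by replaying the recorded row operations."""
    l, piv, mult = eta
    w = [Fraction(v[i]) + mult[i] * Fraction(v[l]) for i in range(len(v))]
    w[l] = Fraction(v[l]) / piv
    return w

eta = make_eta([1, 2, 2], 1)                     # pivot column u, pivot row 1
assert apply_eta(eta, [1, 2, 2]) == [0, 1, 0]    # Q u = e_l, by construction
assert apply_eta(eta, [3, 4, 5]) == [1, 2, 1]
```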
The last idea we mention is the following. Subsequent to a reinversion,
one does not usually compute B⁻¹ explicitly; instead, B⁻¹ is represented
in terms of sparse triangular matrices with a special structure.
The methods discussed in this subsection are designed to accomplish
two objectives: improve numerical stability (minimize the effect of roundoff
errors) and exploit sparsity in the problem data to improve both running
time and memory requirements. These methods have a critical effect in
practice. Besides having a better chance of producing numerically trust-
worthy results, they can also speed up considerably the running time of
the simplex method. These techniques lie much closer to the subject of
numerical linear algebra, as opposed to optimization, and for this reason
we do not pursue them in any greater depth.
3.4 Anticycling: lexicography and Bland's rule
In this section, we discuss anticycling rules under which the simplex method
is guaranteed to terminate, thus extending Theorem 3.3 to degenerate prob-
lems. As an important corollary, we conclude that if the optimal cost is fi-
nite, then there exists an optimal basis, that is, a basis satisfying B⁻¹b ≥ 0
and c̄' = c' − c_B'B⁻¹A ≥ 0'.
Lexicography
We present here the lexicographic pivoting rule and prove that it prevents
the simplex method from cycling. Historically, this pivoting rule was de-
rived by analyzing the behavior of the simplex method on a nondegenerate
problem obtained by means of a small perturbation of the right-hand side
vector b. This connection is pursued in Exercise 3.15.
We start with a definition.
Definition 3.5 A vector u ∈ ℝⁿ is said to be lexicographically
larger (or smaller) than another vector v ∈ ℝⁿ if u ≠ v and the
first nonzero component of u − v is positive (or negative, respectively).
Symbolically, we write

    u >ᴸ v   or   u <ᴸ v.
For example,

    (0, 2, 3, 0) >ᴸ (0, 2, 1, 4),
    (0, 4, 5, 0) <ᴸ (1, 2, 1, 2).
Lexicographic pivoting rule
1. Choose an entering column A_j arbitrarily, as long as its reduced
cost c̄_j is negative. Let u = B⁻¹A_j be the jth column of the
tableau.
2. For each i with u_i > 0, divide the ith row of the tableau (includ-
ing the entry in the zeroth column) by u_i and choose the lexico-
graphically smallest row. If row ℓ is lexicographically smallest,
then the ℓth basic variable x_B(ℓ) exits the basis.
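The rule is short to implement, because tuple comparison in Python is already lexicographic; the sketch below (names ours) returns the index of the exiting row, given the full rows of the tableau and the pivot column u.

```python
from fractions import Fraction

def lexicographic_exit_row(rows, u):
    """Among the rows with u_i > 0, divide each full row (zeroth column
    included) by u_i and return the index of the lexicographically smallest
    one.  Python compares tuples in exactly the lexicographic order."""
    candidates = [(tuple(Fraction(x) / u[i] for x in rows[i]), i)
                  for i in range(len(rows)) if u[i] > 0]
    return min(candidates)[1]

rows = [[1, 0, 5, 3],
        [2, 4, 6, -1],
        [3, 0, 5, 9]]
u = [3, -1, 9]                                # the pivot column
assert lexicographic_exit_row(rows, u) == 2   # third row wins: 5/9 < 5/3
```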
Example 3.7 Consider the following tableau (the zeroth row is omitted), and
suppose that the pivot column is the third one (j = 3):

    1 | 0  5   3
    2 | 4  6  −1
    3 | 0  5   9

Note that there is a tie in trying to determine the exiting variable, because
x_B(1)/u_1 = 1/3 and x_B(3)/u_3 = 3/9 = 1/3. We divide the first and third rows of
the tableau by u_1 = 3 and u_3 = 9, respectively, to obtain:

    1/3 | 0  5/3  1
     *  | *   *   *
    1/3 | 0  5/9  1

The tie between the first and third rows is resolved by performing a lexicographic
comparison. Since 5/9 < 5/3, the third row is chosen to be the pivot row, and
the variable x_B(3) exits the basis.
We note that the lexicographic pivoting rule always leads to a unique
choice for the exiting variable. Indeed, if this were not the case, two of the
rows in the tableau would have to be proportional. But if two rows of the
matrix B⁻¹A are proportional, the matrix B⁻¹A has rank smaller than m
and, therefore, A also has rank less than m, which contradicts our standing
assumption that A has linearly independent rows.
Chap. 3 The simplex method
Theorem 3.4 Suppose that the simplex algorithm starts with all the
rows in the simplex tableau, other than the zeroth row, lexicographically
positive. Suppose that the lexicographic pivoting rule is followed.
Then:

(a) Every row of the simplex tableau, other than the zeroth row,
remains lexicographically positive throughout the algorithm.

(b) The zeroth row strictly increases lexicographically at each iteration.

(c) The simplex method terminates after a finite number of iterations.
Proof.

(a) Suppose that all rows of the simplex tableau, other than the zeroth
row, are lexicographically positive at the beginning of a simplex iteration.
Suppose that x_j enters the basis and that the pivot row is the
ℓth row. According to the lexicographic pivoting rule, we have u_ℓj > 0
and

    (ℓth row)/u_ℓj  <^L  (ith row)/u_ij,    if i ≠ ℓ and u_ij > 0.

To determine the new tableau, the ℓth row is divided by the positive
pivot element u_ℓj and, therefore, remains lexicographically positive.
Consider the ith row and suppose that u_ij < 0. In order to zero the
(i, j)th entry of the tableau, we need to add a positive multiple of
the pivot row to the ith row. Due to the lexicographic positivity of
both rows, the ith row will remain lexicographically positive after this
addition. Finally, consider the ith row for the case where u_ij > 0 and
i ≠ ℓ. We have

    (new ith row) = (old ith row) − (u_ij / u_ℓj) (old ℓth row).

Because of the lexicographic inequality displayed above, which is satisfied
by the old rows, the new ith row is also lexicographically positive.

(b) At the beginning of an iteration, the reduced cost in the pivot column
is negative. In order to make it zero, we need to add a positive
multiple of the pivot row. Since the latter row is lexicographically
positive, the zeroth row increases lexicographically.

(c) Since the zeroth row increases lexicographically at each iteration, it
never returns to a previous value. Since the zeroth row is determined
completely by the current basis, no basis can be repeated twice, and
the simplex method must terminate after a finite number of iterations.
The lexicographic pivoting rule is straightforward to use if the simplex
method is implemented in terms of the full tableau. It can also be used
in conjunction with the revised simplex method, provided that the inverse
basis matrix B^{-1} is formed explicitly (see Exercise 3.16). On the other
hand, in sophisticated implementations of the revised simplex method, the
matrix B^{-1} is never computed explicitly, and the lexicographic rule is not
really suitable.

We finally note that in order to apply the lexicographic pivoting rule,
an initial tableau with lexicographically positive rows is required. Let us
assume that an initial tableau is available (methods for obtaining an initial
tableau are discussed in the next section). We can then rename the variables
so that the basic variables are the first m ones. This is equivalent
to rearranging the tableau so that the first m columns of B^{-1}A are the m
unit vectors. The resulting tableau has lexicographically positive rows, as
desired.
Bland's rule

The smallest subscript pivoting rule, also known as Bland's rule, is as follows.

Smallest subscript pivoting rule

1. Find the smallest j for which the reduced cost c̄_j is negative and
have the column A_j enter the basis.

2. Out of all basic variables x_i that are tied in the test for choosing an
exiting variable, select the one with the smallest value of i.

This pivoting rule is compatible with an implementation of the revised
simplex method in which the reduced costs of the nonbasic variables
are computed one at a time, in the natural order, until a negative one is
discovered. Under this pivoting rule, it is known that cycling never occurs
and the simplex method is guaranteed to terminate after a finite number
of iterations.
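Both halves of Bland's rule can be sketched directly; the helper functions below are illustrative (not from the text). The entering rule scans the reduced costs in natural order; the exiting rule performs the usual ratio test and breaks ties by the smallest basic-variable index.

```python
from fractions import Fraction

def bland_entering(reduced_costs):
    """Smallest index j whose reduced cost is negative (None if optimal)."""
    for j, cbar in enumerate(reduced_costs):
        if cbar < 0:
            return j
    return None

def bland_exiting(x_B, u, basis):
    """Ratio test over rows with u_i > 0; among the tied rows, pick the
    one whose basic variable has the smallest index."""
    cands = [i for i, ui in enumerate(u) if ui > 0]
    theta = min(x_B[i] / u[i] for i in cands)
    return min((i for i in cands if x_B[i] / u[i] == theta),
               key=lambda i: basis[i])

# The first negative reduced cost is at index 1.
assert bland_entering([Fraction(0), Fraction(-2), Fraction(-1)]) == 1

# Rows 0 and 1 tie with ratio 1/3; row 1 wins because its basic
# variable has the smaller index (2 versus 4).
x_B = [Fraction(1), Fraction(3), Fraction(2)]
u = [Fraction(3), Fraction(9), Fraction(1)]
basis = [4, 2, 0]
assert bland_exiting(x_B, u, basis) == 1
```

Scanning reduced costs in natural order is what makes the rule compatible with computing them one at a time in the revised simplex method.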
3.5 Finding an initial basic feasible solution
In order to start the simplex method, we need to find an initial basic feasible
solution. Sometimes this is straightforward. For example, suppose that we
are dealing with a problem involving constraints of the form Ax ≤ b, where
b ≥ 0. We can then introduce nonnegative slack variables s and rewrite
the constraints in the form Ax + s = b. The vector (x, s) defined by x = 0
and s = b is a basic feasible solution, and the corresponding basis matrix is
the identity. In general, however, finding an initial basic feasible solution
is not easy and requires the solution of an auxiliary linear programming
problem, as will be seen shortly.
Consider the problem

    minimize    c'x
    subject to  Ax = b
                 x ≥ 0.

By possibly multiplying some of the equality constraints by −1, we can
assume, without loss of generality, that b ≥ 0. We now introduce a vector
y = (y_1, . . . , y_m) of artificial variables and use the simplex method to
solve the auxiliary problem

    minimize    y_1 + y_2 + · · · + y_m
    subject to  Ax + y = b
                 x ≥ 0
                 y ≥ 0.

Initialization is easy for the auxiliary problem: by letting x = 0 and
y = b, we have a basic feasible solution, and the corresponding basis matrix
is the identity.

If x is a feasible solution to the original problem, this choice of x,
together with y = 0, yields a zero cost solution to the auxiliary problem.
Therefore, if the optimal cost in the auxiliary problem is nonzero, we conclude
that the original problem is infeasible. If, on the other hand, we obtain
a zero cost solution to the auxiliary problem, it must satisfy y = 0, and x
is a feasible solution to the original problem.
At this point, we have accomplished our objectives only partially. We
have a method that either detects infeasibility or finds a feasible solution to
the original problem. However, in order to initialize the simplex method for
the original problem, we need a basic feasible solution, an associated basis
matrix B, and (depending on the implementation) the corresponding
tableau. All this is straightforward if the simplex method, applied to the
auxiliary problem, terminates with a basis matrix B consisting exclusively
of columns of A. We can simply drop the columns that correspond to the
artificial variables and continue with the simplex method on the original
problem, using B as the starting basis matrix.
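Setting up the auxiliary problem is purely mechanical, as the following sketch shows (the numerical data are hypothetical, chosen only for illustration): flip any row with a negative right-hand side, append an identity block for the artificial variables, and start from (x, y) = (0, b).

```python
import numpy as np

# Data of a standard-form problem: minimize c'x s.t. Ax = b, x >= 0.
# (Hypothetical numbers; any A and b would do.)
A = np.array([[1.0, 2.0, 3.0, 0.0],
              [-1.0, 2.0, 6.0, 0.0]])
b = np.array([3.0, -2.0])

# Step 1: flip the sign of any row with b_i < 0, so that b >= 0.
neg = b < 0
A[neg] *= -1
b[neg] *= -1

# Step 2: append an identity block for the artificial variables y,
# whose auxiliary cost is y_1 + ... + y_m.
m, n = A.shape
A_aux = np.hstack([A, np.eye(m)])
c_aux = np.concatenate([np.zeros(n), np.ones(m)])

# (x, y) = (0, b) is a basic feasible solution; its basis matrix is I.
x_aux = np.concatenate([np.zeros(n), b])
assert np.allclose(A_aux @ x_aux, b)
```

Feeding `A_aux`, `b`, `c_aux` to any simplex routine carries out Phase I: a positive optimal cost certifies infeasibility of the original problem, and a zero optimal cost yields a feasible x.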
Driving artificial variables out of the basis

The situation is more complex if the original problem is feasible, the simplex
method applied to the auxiliary problem terminates with a feasible solution
x* to the original problem, but some of the artificial variables are in the
final basis. (Since the final value of the artificial variables is zero, this
implies that we have a degenerate basic feasible solution to the auxiliary
problem.) Let k be the number of columns of A that belong to the final basis
(k < m) and, without loss of generality, assume that these are the columns
A_B(1), . . . , A_B(k). (In particular, x_B(1), . . . , x_B(k) are the only variables
that can be at nonzero level.) Note that the columns A_B(1), . . . , A_B(k) must
be linearly independent since they are part of a basis. Under our standard
assumption that the matrix A has full rank, the columns of A span R^m,
and we can choose m − k additional columns A_B(k+1), . . . , A_B(m) of A, to
obtain a set of m linearly independent columns, that is, a basis consisting
exclusively of columns of A. With this basis, all nonbasic components of
x* are at zero level, and it follows that x* is the basic feasible solution
associated with this new basis as well. At this point, the artificial variables
and the corresponding columns of the tableau can be dropped.

The procedure we have just described is called driving the artificial
variables out of the basis, and depends crucially on the assumption that the
matrix A has rank m. After all, if A has rank less than m, constructing a
basis for R^m using the columns of A is impossible and there exist redundant
equality constraints that must be eliminated, as described by Theorem 2.5
in Section 2.3. All of the above can be carried out mechanically, in terms
of the simplex tableau, in the following manner.
Suppose that the ℓth basic variable is an artificial variable, which is
in the basis at zero level. We examine the ℓth row of the tableau and find
some j such that the ℓth entry of B^{-1}A_j is nonzero. We claim that A_j
is linearly independent from the columns A_B(1), . . . , A_B(k). To see this,
note that B^{-1}A_B(i) = e_i, i = 1, . . . , k, and since k < ℓ, the ℓth entry of
these vectors is zero. It follows that the ℓth entry of any linear combination
of the vectors B^{-1}A_B(1), . . . , B^{-1}A_B(k) is also equal to zero. Since the
ℓth entry of B^{-1}A_j is nonzero, this vector is not a linear combination
of the vectors B^{-1}A_B(1), . . . , B^{-1}A_B(k). Equivalently, A_j is not a linear
combination of the vectors A_B(1), . . . , A_B(k), which proves our claim. We
now bring A_j into the basis and have the ℓth basic variable exit the basis.
This is accomplished in the usual manner: perform those elementary row
operations that replace B^{-1}A_j by the ℓth unit vector. The only difference
from the usual mechanics of the simplex method is that the pivot element
(the ℓth entry of B^{-1}A_j) could be negative. Because the ℓth basic variable
was zero, adding a multiple of the ℓth row to the other rows does not change
the values of the basic variables. This means that after the change of basis,
we are still at the same basic feasible solution to the auxiliary problem,
but we have reduced the number of basic artificial variables by one. We
repeat this procedure as many times as needed until all artificial variables
are driven out of the basis.
Let us now assume that the ℓth row of B^{-1}A is zero, in which case
the above described procedure fails. Note that the ℓth row of B^{-1}A is
equal to g'A, where g' is the ℓth row of B^{-1}. Hence, g'A = 0' for some
nonzero vector g, and the matrix A has linearly dependent rows. Since we
are dealing with a feasible problem, we must also have g'b = 0. Thus, the
constraint g'Ax = g'b is redundant and can be eliminated (cf. Theorem 2.5
in Section 2.3). Since this constraint is the information provided by the ℓth
row of the tableau, we can eliminate that row and continue from there.
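The two cases just described (pivot the artificial variable out, or delete a redundant row) can be sketched as one function. This is an illustrative sketch on a toy tableau, not the book's code; rows are lists of `Fraction`s with the zeroth column in position 0.

```python
from fractions import Fraction as F

def drive_out_or_delete(T, basis, l, n):
    """If the lth basic variable is an artificial one at zero level, look
    for a nonzero entry among the first n (original) columns of row l.
    If one exists, pivot on it; the pivot element may be negative, which
    is harmless because the lth basic variable equals zero.  Otherwise
    row l encodes a redundant constraint and is removed."""
    row = T[l]
    j = next((k for k in range(1, n + 1) if row[k] != 0), None)
    if j is None:
        del T[l]
        del basis[l - 1]                    # redundant constraint
        return None
    T[l] = [v / row[j] for v in row]        # make the pivot entry 1
    for i in range(len(T)):
        if i != l:
            f = T[i][j]
            T[i] = [a - f * p for a, p in zip(T[i], T[l])]
    basis[l - 1] = j - 1                    # x_j is now the lth basic variable
    return j

# Toy tableau with n = 2 original columns; the artificial variable in
# row 1 (basis index 5 >= n) is at zero level and can be pivoted out.
T = [[F(0), F(0), F(0)],
     [F(0), F(2), F(4)],
     [F(3), F(0), F(1)]]
basis = [5, 1]
assert drive_out_or_delete(T, basis, 1, 2) == 1
assert basis == [0, 1] and T[1] == [F(0), F(1), F(2)]
```

If instead the row were zero in all original columns, the function would delete it, exactly as in the redundant-constraint case above.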
Example 3.8 Consider the linear programming problem

    minimize     x1 +  x2 +  x3
    subject to   x1 + 2x2 + 3x3      = 3
                -x1 + 2x2 + 6x3      = 2
                      4x2 + 9x3      = 5
                            3x3 + x4 = 1
                 x1, . . . , x4 ≥ 0.
In order to find a feasible solution, we form the auxiliary problem

    minimize     x5 + x6 + x7 + x8
    subject to   x1 + 2x2 + 3x3      + x5                = 3
                -x1 + 2x2 + 6x3           + x6           = 2
                      4x2 + 9x3                + x7      = 5
                            3x3 + x4                + x8 = 1
                 x1, . . . , x8 ≥ 0.

A basic feasible solution to the auxiliary problem is obtained by letting
(x5, x6, x7, x8) = b = (3, 2, 5, 1). The corresponding basis matrix is the identity.
Furthermore, we have c_B = (1, 1, 1, 1). We evaluate the reduced cost of each one
of the original variables x_i, which is -c_B'A_i, and form the initial tableau:

              x1    x2    x3    x4    x5    x6    x7    x8
    -11  |     0    -8   -21    -1     0     0     0     0
x5 =  3  |     1     2     3     0     1     0     0     0
x6 =  2  |    -1     2     6     0     0     1     0     0
x7 =  5  |     0     4     9     0     0     0     1     0
x8 =  1  |     0     0     3     1     0     0     0     1

We bring x4 into the basis and have x8 exit the basis. The basis matrix B is still
the identity, and only the zeroth row of the tableau changes. We obtain:

              x1    x2    x3    x4    x5    x6    x7    x8
    -10  |     0    -8   -18     0     0     0     0     1
x5 =  3  |     1     2     3     0     1     0     0     0
x6 =  2  |    -1     2     6     0     0     1     0     0
x7 =  5  |     0     4     9     0     0     0     1     0
x4 =  1  |     0     0     3     1     0     0     0     1
We now bring x3 into the basis and have x4 exit the basis. The new tableau is:

               x1    x2    x3    x4    x5    x6    x7    x8
    -4    |     0    -8     0     6     0     0     0     7
x5 =  2   |     1     2     0    -1     1     0     0    -1
x6 =  0   |    -1     2     0    -2     0     1     0    -2
x7 =  2   |     0     4     0    -3     0     0     1    -3
x3 = 1/3  |     0     0     1   1/3     0     0     0   1/3
We now bring x2 into the basis and x6 exits. Note that this is a degenerate pivot
with θ* = 0. The new tableau is:

               x1    x2    x3    x4    x5    x6    x7    x8
    -4    |    -4     0     0    -2     0     4     0    -1
x5 =  2   |     2     0     0     1     1    -1     0     1
x2 =  0   |   -1/2    1     0    -1     0    1/2    0    -1
x7 =  2   |     2     0     0     1     0    -2     1     1
x3 = 1/3  |     0     0     1   1/3     0     0     0   1/3
We now have x1 enter the basis and x5 exit the basis. We obtain the following
tableau:

               x1    x2    x3    x4    x5    x6    x7    x8
     0    |     0     0     0     0     2     2     0     1
x1 =  1   |     1     0     0   1/2   1/2  -1/2     0   1/2
x2 = 1/2  |     0     1     0  -3/4   1/4   1/4     0  -3/4
x7 =  0   |     0     0     0     0    -1    -1     1     0
x3 = 1/3  |     0     0     1   1/3     0     0     0   1/3
Note that the cost in the auxiliary problem has dropped to zero, indicating that
we have a feasible solution to the original problem. However, the artifcial variable
.is still in the basis, at zero level. In order to obtain a basic feasible solution
to the original problem, we need to drive .out of the basis. Note that .is the
third basic variable and that the third entry of the columns H¯
·
A, ,)

I, ,+,
associated with the original variables, is zero. This indicates that the matrix
A has linearly dependent rows. At this point, we remove the third row of the
tableau, because it corresponds to a redundant constraint, and also remove all of
the artifcial variables. This leaves us with the following initial tableau for the
original problem:

               x1    x2    x3    x4
     *    |     *     *     *     *
x1 =  1   |     1     0     0   1/2
x2 = 1/2  |     0     1     0  -3/4
x3 = 1/3  |     0     0     1   1/3

We may now compute the reduced costs of the original variables, fill in the zeroth
row of the tableau, and start executing the simplex method on the original
problem.

We observe that in this example, the artificial variable x8 was unnecessary.
Instead of starting with x8 = 1, we could have started with x4 = 1, thus eliminating
the need for the first pivot. More generally, whenever there is a variable
that appears in a single constraint and with a positive coefficient (slack variables
being the typical example), we can always let that variable be in the initial basis
and we do not have to associate an artificial variable with that constraint.
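The feasible solution read off the final tableau can be verified numerically; the short check below (an illustrative sketch, not part of the text) also confirms that the third constraint is redundant, being the sum of the first two.

```python
import numpy as np

# Constraint data of the example above.
A = np.array([[1., 2., 3., 0.],
              [-1., 2., 6., 0.],
              [0., 4., 9., 0.],
              [0., 0., 3., 1.]])
b = np.array([3., 2., 5., 1.])

# The basic feasible solution read off the final tableau:
# x1 = 1, x2 = 1/2, x3 = 1/3, x4 = 0.
x = np.array([1., 0.5, 1/3, 0.])
assert np.allclose(A @ x, b) and (x >= 0).all()

# The third row is indeed redundant: it is the sum of the first two,
# for both the coefficients and the right-hand side.
assert np.allclose(A[0] + A[1], A[2]) and b[0] + b[1] == b[2]
```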
The two-phase simplex method

We can now summarize a complete algorithm for linear programming problems
in standard form.

Phase I:

1. By multiplying some of the constraints by −1, change the problem
so that b ≥ 0.

2. Introduce artificial variables y_1, . . . , y_m, if necessary, and apply
the simplex method to the auxiliary problem with cost Σ_{i=1}^m y_i.

3. If the optimal cost in the auxiliary problem is positive, the original
problem is infeasible and the algorithm terminates.

4. If the optimal cost in the auxiliary problem is zero, a feasible
solution to the original problem has been found. If no artificial
variable is in the final basis, the artificial variables and the corresponding
columns are eliminated, and a feasible basis for the
original problem is available.

5. If the ℓth basic variable is an artificial one, examine the ℓth entry
of the columns B^{-1}A_j, j = 1, . . . , n. If all of these entries are
zero, the ℓth row represents a redundant constraint and is eliminated.
Otherwise, if the ℓth entry of the jth column is nonzero,
apply a change of basis (with this entry serving as the pivot
element): the ℓth basic variable exits and x_j enters the basis.
Repeat this operation until all artificial variables are driven out
of the basis.

Phase II:

1. Let the final basis and tableau obtained from Phase I be the
initial basis and tableau for Phase II.

2. Compute the reduced costs of all variables for this initial basis,
using the cost coefficients of the original problem.

3. Apply the simplex method to the original problem.
The above two-phase algorithm is a complete method, in the sense
that it can handle all possible outcomes. As long as cycling is avoided (due
to either nondegeneracy, an anticycling rule, or luck), one of the following
possibilities will materialize:

(a) If the problem is infeasible, this is detected at the end of Phase I.

(b) If the problem is feasible but the rows of A are linearly dependent,
this is detected and corrected at the end of Phase I, by eliminating
redundant equality constraints.

(c) If the optimal cost is equal to −∞, this is detected while running
Phase II.

(d) Else, Phase II terminates with an optimal solution.
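The whole two-phase procedure fits in a compact full-tableau implementation. The sketch below is illustrative rather than the book's code; it uses Bland's rule from Section 3.4 as the anticycling safeguard and exact rational arithmetic. Run on the numerical problem solved earlier in this section, it detects the redundant third constraint and returns the optimal cost 7/4.

```python
from fractions import Fraction

def pivot(T, l, j, basis):
    """Elementary row operations that make column j the lth unit vector."""
    T[l] = [v / T[l][j] for v in T[l]]
    for i in range(len(T)):
        if i != l and T[i][j] != 0:
            f = T[i][j]
            T[i] = [a - f * p for a, p in zip(T[i], T[l])]
    basis[l - 1] = j - 1

def simplex(T, basis):
    """Full-tableau simplex with Bland's rule (so it cannot cycle)."""
    while True:
        j = next((k for k in range(1, len(T[0])) if T[0][k] < 0), None)
        if j is None:
            return                              # optimal
        rows = [i for i in range(1, len(T)) if T[i][j] > 0]
        if not rows:
            raise ValueError("optimal cost is -infinity")
        theta = min(T[i][0] / T[i][j] for i in rows)
        l = min((i for i in rows if T[i][0] / T[i][j] == theta),
                key=lambda i: basis[i - 1])
        pivot(T, l, j, basis)

def two_phase(A, b, c):
    m, n = len(A), len(A[0])
    rows = [[Fraction(x) for x in row] for row in A]
    rhs = [Fraction(x) for x in b]
    for i in range(m):                          # arrange b >= 0
        if rhs[i] < 0:
            rows[i] = [-x for x in rows[i]]
            rhs[i] = -rhs[i]
    # Phase I tableau: zeroth column | x columns | artificial columns.
    T = [[rhs[i]] + rows[i] + [Fraction(int(k == i)) for k in range(m)]
         for i in range(m)]
    zero = [-sum(T[i][k] for i in range(m)) for k in range(n + m + 1)]
    for k in range(n + 1, n + m + 1):
        zero[k] += 1                            # artificial costs are 1
    T = [zero] + T
    basis = list(range(n, n + m))
    simplex(T, basis)
    if T[0][0] != 0:
        return None                             # infeasible
    for l in range(m, 0, -1):                   # drive artificials out
        if l < len(T) and basis[l - 1] >= n:
            j = next((k for k in range(1, n + 1) if T[l][k] != 0), None)
            if j is None:                       # redundant row
                del T[l]
                del basis[l - 1]
            else:
                pivot(T, l, j, basis)
    # Phase II: drop artificial columns, rebuild the zeroth row from c.
    T = [row[:n + 1] for row in T]
    T[0] = [Fraction(0)] + [Fraction(x) for x in c]
    for i, j in enumerate(basis):
        f = T[0][j + 1]
        if f != 0:
            T[0] = [a - f * p for a, p in zip(T[0], T[i + 1])]
    simplex(T, basis)
    x = [Fraction(0)] * n
    for i, j in enumerate(basis):
        x[j] = T[i + 1][0]
    return -T[0][0], x

cost, x = two_phase([[1, 2, 3, 0], [-1, 2, 6, 0], [0, 4, 9, 0], [0, 0, 3, 1]],
                    [3, 2, 5, 1], [1, 1, 1, 0])
assert cost == Fraction(7, 4)
```

The returned solution is x = (1/2, 5/4, 0, 1); since the optimal basis is nondegenerate and the remaining reduced cost is strictly positive, this optimum is unique.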
The big-M method

We close by mentioning an alternative approach, the big-M method, that
combines the two phases into a single one. The idea is to introduce a cost
function of the form

    Σ_{j=1}^n c_j x_j + M Σ_{i=1}^m y_i,

where M is a large positive constant, and where y_i are the same artificial
variables as in Phase I simplex. For a sufficiently large choice of M, if the
original problem is feasible and its optimal cost is finite, all of the artificial
variables are eventually driven to zero (Exercise 3.26), which takes us back
to the minimization of the original cost function. In fact, there is no reason
for fixing a numerical value for M. We can leave M as an undetermined
parameter and let the reduced costs be functions of M. Whenever M is
compared to another number (in order to determine whether a reduced cost
is negative), M will always be treated as being larger.
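Treating M symbolically amounts to storing a reduced cost a + bM as the pair (b, a): since M dominates any fixed number, a + bM < 0 exactly when b < 0, or b = 0 and a < 0, which is the lexicographic (tuple) order. A minimal sketch, with sample pairs of the kind that arise in a big-M tableau:

```python
from fractions import Fraction

# A reduced cost of the form a + bM is stored as the pair (b, a).
# "a + bM < 0" for arbitrarily large M holds exactly when
# (b, a) < (0, 0) in Python's lexicographic tuple order.
def negative(b, a):
    return (b, a) < (0, 0)

assert negative(Fraction(-18), Fraction(1))          # 1 - 18M < 0
assert not negative(Fraction(2), Fraction(-3, 4))    # 2M - 3/4 > 0
assert negative(Fraction(0), Fraction(-1, 12))       # -1/12, no M term
```

Row operations on such costs act componentwise on the pairs, so an entire big-M tableau can be maintained without ever choosing a numerical value for M.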
Example 3.9 We consider the same linear programming problem as in Example 3.8:

    minimize     x1 +  x2 +  x3
    subject to   x1 + 2x2 + 3x3      = 3
                -x1 + 2x2 + 6x3      = 2
                      4x2 + 9x3      = 5
                            3x3 + x4 = 1
                 x1, . . . , x4 ≥ 0.

We use the big-M method in conjunction with the following auxiliary problem,
in which the unnecessary artificial variable x8 is omitted:

    minimize     x1 +  x2 +  x3 + M x5 + M x6 + M x7
    subject to   x1 + 2x2 + 3x3      + x5           = 3
                -x1 + 2x2 + 6x3           + x6      = 2
                      4x2 + 9x3                + x7 = 5
                            3x3 + x4                = 1
                 x1, . . . , x7 ≥ 0.
A basic feasible solution to the auxiliary problem is obtained by letting
(x5, x6, x7, x4) = b = (3, 2, 5, 1). The corresponding basis matrix is the identity.
Furthermore, we have c_B = (M, M, M, 0). We evaluate the reduced cost of each
one of the original variables x_i, which is c_i - c_B'A_i, and form the initial tableau:

              x1      x2       x3      x4    x5    x6    x7
   -10M  |     1    -8M+1   -18M+1     0     0     0     0
x5 =  3  |     1       2        3      0     1     0     0
x6 =  2  |    -1       2        6      0     0     1     0
x7 =  5  |     0       4        9      0     0     0     1
x4 =  1  |     0       0        3      1     0     0     0
The reduced cost of x3 is negative when M is large enough. We therefore bring
x3 into the basis and have x4 exit. Note that in order to set the reduced cost
of x3 to zero, we need to multiply the pivot row by 6M - 1/3 and add it to the
zeroth row. The new tableau is:

                 x1      x2     x3      x4      x5    x6    x7
  -4M-1/3  |      1    -8M+1     0    6M-1/3    0     0     0
x5 =  2    |      1       2      0      -1      1     0     0
x6 =  0    |     -1       2      0      -2      0     1     0
x7 =  2    |      0       4      0      -3      0     0     1
x3 = 1/3   |      0       0      1     1/3      0     0     0
The reduced cost of x2 is negative when M is large enough. We therefore bring
x2 into the basis and x6 exits. Note that this is a degenerate pivot with θ* = 0.
The new tableau is:

                 x1       x2    x3      x4      x5     x6      x7
  -4M-1/3  |  -4M+3/2      0     0   -2M+2/3    0    4M-1/2     0
x5 =  2    |      2        0     0       1      1      -1       0
x2 =  0    |    -1/2       1     0      -1      0      1/2      0
x7 =  2    |      2        0     0       1      0      -2       1
x3 = 1/3   |      0        0     1      1/3     0       0       0
We now have x1 enter and x5 exit the basis. We obtain the following tableau:

                 x1    x2    x3     x4       x5        x6      x7
   -11/6   |      0     0     0   -1/12    2M-3/4    2M+1/4     0
x1 =  1    |      1     0     0    1/2       1/2      -1/2      0
x2 = 1/2   |      0     1     0   -3/4       1/4       1/4      0
x7 =  0    |      0     0     0     0        -1        -1       1
x3 = 1/3   |      0     0     1    1/3        0         0       0
We now bring x4 into the basis and x3 exits. The new tableau is:

                 x1    x2    x3     x4      x5        x6      x7
   -7/4    |      0     0   1/4     0     2M-3/4    2M+1/4     0
x1 = 1/2   |      1     0  -3/2     0       1/2      -1/2      0
x2 = 5/4   |      0     1   9/4     0       1/4       1/4      0
x7 =  0    |      0     0     0     0       -1        -1       1
x4 =  1    |      0     0     3     1        0         0       0

With M large enough, all of the reduced costs are nonnegative and we have
an optimal solution to the auxiliary problem. In addition, all of the artificial
variables have been driven to zero, and we have an optimal solution to the original
problem.
3.6 Column geometry and the simplex method

In this section, we introduce an alternative way of visualizing the workings
of the simplex method. This approach provides some insights into why the
simplex method appears to be efficient in practice.
We consider the problem

    minimize    c'x
    subject to  Ax = b
                e'x = 1                                   (3.6)
                  x ≥ 0,

where A is an m × n matrix and e is the n-dimensional vector with all
components equal to one. Although this might appear to be a special type
of a linear programming problem, it turns out that every problem with a
bounded feasible set can be brought into this form (Exercise 3.25). The
constraint e'x = 1 is called the convexity constraint. We also introduce
an auxiliary variable z defined by z = c'x. If A_1, A_2, . . . , A_n are the n
columns of A, we are dealing with the problem of minimizing z subject to
the nonnegativity constraints x ≥ 0, the convexity constraint Σ_{i=1}^n x_i = 1,
and the constraint

    x_1 (A_1, c_1) + · · · + x_n (A_n, c_n) = (b, z).
In order to capture this problem geometrically, we view the horizontal
plane as an m-dimensional space containing the columns of A, and we
view the vertical axis as the one-dimensional space associated with the cost
components c_i. Then, each point in the resulting three-dimensional space
corresponds to a point (A_i, c_i); see Figure 3.5.

In this geometry, our objective is to construct a vector (b, z), which
is a convex combination of the vectors (A_i, c_i), such that z is as small as
possible. Note that the vectors of the form (b, z) lie on a vertical line, which
we call the requirement line, and which intersects the horizontal plane at
b. If the requirement line does not intersect the convex hull of the points
(A_i, c_i), the problem is infeasible. If it does intersect it, the problem is
feasible and an optimal solution corresponds to the lowest point in the
intersection of the convex hull and the requirement line. For example, in
Figure 3.6, the requirement line intersects the convex hull of the points
(A_i, c_i); the point G corresponds to an optimal solution, and its height is
the optimal cost.
We now need some terminology.

Definition 3.6

(a) A collection of vectors y_1, . . . , y_{k+1} in R^n are said to be affinely
independent if the vectors y_1 - y_{k+1}, y_2 - y_{k+1}, . . . , y_k - y_{k+1}
are linearly independent. (Note that we must have k ≤ n.)

(b) The convex hull of k + 1 affinely independent vectors in R^n is
called a k-dimensional simplex.
Figure 3.5: The column geometry.
Thus, three points are either collinear or they are affinely independent
and determine a two-dimensional simplex (a triangle). Similarly, four points
either lie on the same plane, or they are affinely independent and determine
a three-dimensional simplex (a pyramid).
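The test in Definition 3.6(a) is directly computable: subtract the last point from the others and check the rank of the resulting matrix. A minimal sketch (illustrative, not from the text):

```python
import numpy as np

def affinely_independent(points):
    """y_1, ..., y_{k+1} are affinely independent iff the differences
    y_1 - y_{k+1}, ..., y_k - y_{k+1} are linearly independent."""
    pts = np.asarray(points, dtype=float)
    diffs = pts[:-1] - pts[-1]
    return np.linalg.matrix_rank(diffs) == len(pts) - 1

# Three collinear points fail; three corners of a triangle pass.
assert not affinely_independent([[0, 0], [1, 1], [2, 2]])
assert affinely_independent([[0, 0], [1, 0], [0, 1]])
```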
Let us now give an interpretation of basic feasible solutions to problem
(3.6) in this geometry. Since we have added the convexity constraint,
we have a total of m + 1 equality constraints. Thus, a basic feasible solution
is associated with a collection of m + 1 linearly independent columns (A_i, 1)
of the linear programming problem (3.6). These are in turn associated with
m + 1 of the points (A_i, c_i), which we call basic points; the remaining points
(A_i, c_i) are called the nonbasic points. It is not hard to show that the m + 1
basic points are affinely independent (Exercise 3.24) and, therefore, their
convex hull is an m-dimensional simplex, which we call the basic simplex.
Let the requirement line intersect the m-dimensional basic simplex at some
point (b, z). The vector of weights used in expressing (b, z) as a convex
combination of the basic points is the current basic feasible solution, and z
represents its cost. For example, in Figure 3.6, the shaded triangle CDF is
the basic simplex, and the point H corresponds to a basic feasible solution
associated with the basic points C, D, and F.

Let us now interpret a change of basis geometrically. In a change of
basis, a new point (A_j, c_j) becomes basic, and one of the currently basic
points is to become nonbasic. For example, in Figure 3.6, if C, D, F
are the current basic points, we could make point B basic, replacing F
Figure 3.6: Feasibility and optimality in the column geometry.
(even though this turns out not to be profitable). The new basic simplex
would be the convex hull of B, C, D, and the new basic feasible solution
would correspond to point I. Alternatively, we could make point E basic,
replacing C, and the new basic feasible solution would now correspond to
point G. After a change of basis, the intercept of the requirement line with
the new basic simplex is lower, and hence the cost decreases, if and only
if the new basic point is below the plane that passes through the old basic
points; we refer to the latter plane as the dual plane. For example, point
E is below the dual plane and having it enter the basis is profitable; this is
not the case for point B. In fact, the vertical distance from the dual plane
to a point (A_i, c_i) is equal to the reduced cost of the associated variable x_i
(Exercise 3.30); requiring the new basic point to be below the dual plane
is therefore equivalent to requiring the entering column to have negative
reduced cost.
We discuss next the selection of the basic point that will exit the
basis. Each possible choice of the exiting point leads to a different basic
simplex. These m basic simplices, together with the original basic simplex
(before the change of basis), form the boundary (the faces) of an (m + 1)-
dimensional simplex. The requirement line exits this (m + 1)-dimensional
simplex through its top face and must therefore enter it by crossing some
other face. This determines which one of the potential basic simplices will
be obtained after the change of basis. In reference to Figure 3.6, the basic
points C, D, F, determine a two-dimensional basic simplex. If point E
is to become basic, we obtain a three-dimensional simplex (pyramid) with
vertices C, D, E, F. The requirement line exits the pyramid through its
top face with vertices C, D, F. It enters the pyramid through the face with
vertices D, E, F; this is the new basic simplex.

We can now visualize pivoting through the following physical analogy.
Think of the original basic simplex with vertices C, D, F, as a solid object
anchored at its vertices. Grasp the corner of the basic simplex at the vertex
C leaving the basis, and pull the corner down to the new basic point E.
While so moving, the simplex will hinge, or pivot, on its anchor and stretch
down to the lower position. The somewhat peculiar terms (e.g., "simplex",
"pivot") associated with the simplex method have their roots in this column
geometry.
Example 3.10 Consider the problem illustrated in Figure 3.7, in which m = 1,
and the following pivoting rule: choose a point (A_i, c_i) below the dual plane to
become basic, whose vertical distance from the dual plane is largest. According to
Exercise 3.30, this is identical to the pivoting rule that selects an entering variable
with the most negative reduced cost. Starting from the initial basic simplex
indicated in the figure, each iteration replaces one of the two basic points, and
the optimal basis is reached after only two pivots. This example
indicates why the simplex method may require a rather small number of pivots,
even when the number of underlying variables is large.
Figure 3.7: The simplex method finds the optimal basis after
two iterations. Here, the point indicated by a number i corresponds
to the vector (A_i, c_i).
3.7 Computational efficiency of the simplex method

The computational efficiency of the simplex method is determined by two
factors:

(a) the computational effort at each iteration;

(b) the number of iterations.

The computational requirements of each iteration have already been discussed
in Section 3.3. For example, the full tableau implementation needs
O(mn) arithmetic operations per iteration; the same is true for the revised
simplex method in the worst case. We now turn to a discussion of the
number of iterations.
The number of iterations in the worst case

Although the number of extreme points of the feasible set can increase
exponentially with the number of variables and constraints, it has been
observed in practice that the simplex method typically takes only O(m)
pivots to find an optimal solution. Unfortunately, however, this practical
observation is not true for every linear programming problem. We will
describe shortly a family of problems for which an exponential number of
pivots may be required.

Recall that for nondegenerate problems, the simplex method always
moves from one vertex to an adjacent one, each time improving the value
of the cost function. We will now describe a polyhedron that has an exponential
number of vertices, along with a path that visits all vertices, by
taking steps from one vertex to an adjacent one that has lower cost. Once
such a polyhedron is available, then the simplex method, under a pivoting
rule that traces this path, needs an exponential number of pivots.
Consider the unit cube in R^n, defined by the constraints

    0 ≤ x_i ≤ 1,    i = 1, . . . , n.

The unit cube has 2^n vertices (for each i, we may let either one of the two
constraints 0 ≤ x_i or x_i ≤ 1 become active). Furthermore, there exists a
path that travels along the edges of the cube and which visits each vertex
exactly once; we call such a path a spanning path. It can be constructed
according to the procedure illustrated in Figure 3.8.

Let us now introduce the cost function -x_n. Half of the vertices of
the cube have zero cost and the other half have a cost of -1. Thus, the cost
cannot decrease strictly with each move along the spanning path, and we do
not yet have the desired example. However, if we choose some ε ∈ (0, 1/2)
and consider the perturbation of the unit cube defined by the constraints

    ε ≤ x_1 ≤ 1,                                          (3.7)
    ε x_{i-1} ≤ x_i ≤ 1 - ε x_{i-1},    i = 2, . . . , n,  (3.8)
Figure 3.8: (a) A spanning path p_2 in the two-dimensional cube.
(b) A spanning path p_3 in the three-dimensional cube. Notice
that this path is obtained by splitting the three-dimensional cube
into two two-dimensional cubes, following path p_2 in one of them,
moving to the other cube, and following p_2 in the reverse order.
This construction generalizes and provides a recursive definition of
a spanning path for the general n-dimensional cube.
then it can be verified that the cost function decreases strictly with each
move along a suitably chosen spanning path. If we start the simplex method
at the first vertex on that spanning path and if our pivoting rule is to always
move to the next vertex on that path, then the simplex method will require
2^n - 1 pivots. We summarize this discussion in the following theorem, whose
proof is left as an exercise (Exercise 3.32).
Theorem 3.5 Consider the linear programming problem of minimizing
-x_n subject to the constraints (3.7)-(3.8). Then:

(a) The feasible set has 2^n vertices.

(b) The vertices can be ordered so that each one is adjacent to and
has lower cost than the previous one.

(c) There exists a pivoting rule under which the simplex method
requires 2^n - 1 changes of basis before it terminates.
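Parts (b) and (c) can be checked numerically for small n. The sketch below (illustrative only) builds the spanning path of the perturbed cube recursively, mirroring Figure 3.8: traverse the lower copy of the (n-1)-dimensional path, then the upper copy in reverse. With ε ∈ (0, 1/2), x_n increases strictly along the path, so the cost -x_n decreases at every one of the 2^n - 1 moves.

```python
def spanning_path(n, eps):
    """Vertices of the perturbed cube (3.7)-(3.8), listed along the
    recursively constructed spanning path.  Each vertex is obtained by
    making one of the two constraints on each x_i active."""
    if n == 1:
        return [(eps,), (1.0,)]
    prev = spanning_path(n - 1, eps)
    lower = [v + (eps * v[-1],) for v in prev]          # x_n at its lower bound
    upper = [v + (1 - eps * v[-1],) for v in reversed(prev)]  # upper bound
    return lower + upper

path = spanning_path(3, 0.3)
assert len(path) == 2 ** 3                   # 2^n vertices

costs = [-v[-1] for v in path]               # cost function is -x_n
assert all(c2 < c1 for c1, c2 in zip(costs, costs[1:]))
```

The strict decrease fails on the unperturbed cube (ε = 0), where half the vertices share each cost value; the perturbation is exactly what breaks the ties.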
We observe in Figure 3.8 that the first and the last vertex in the spanning
path are adjacent. This property persists in the perturbed polyhedron
as well. Thus, with a different pivoting rule, the simplex method could
terminate with a single pivot. We are thus led to the following question: is
it true that for every pivoting rule there are examples where the simplex
method takes an exponential number of iterations? For several popular
pivoting rules, such examples have been constructed. However, these examples
cannot exclude the possibility that some other pivoting rule might
fare better. This is one of the most important open problems in the theory
of linear programming. In the next subsection, we address a closely related
issue.
The diameter of polyhedra and the Hirsch conjecture

The preceding discussion leads us to the notion of the diameter of a polyhedron
P, which is defined as follows. Suppose that from any vertex of
the polyhedron, we are only allowed to jump to an adjacent vertex. We
define the distance d(x, y) between two vertices x and y as the minimum
number of such jumps required to reach y starting from x. The diameter
D(P) of the polyhedron P is then defined as the maximum of d(x, y) over
all pairs (x, y) of vertices. Finally, we define D(n, m) as the maximum of
D(P) over all bounded polyhedra in R^n that are represented in terms of m
inequality constraints. The quantity D_u(n, m) is defined similarly, except
that general, possibly unbounded, polyhedra are allowed. For example, we
have

    D(2, m) = ⌊m/2⌋,

and

    D_u(2, m) = m - 2;

see Figure 3.9.
Figure 3.9: Let n = 2 and m = 8. (a) A bounded polyhedron with diameter ⌊m/2⌋ = 4. (b) An unbounded polyhedron with diameter m − 2 = 6.
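The definition of the diameter can be made concrete with a small computation. The sketch below is my own illustration, not from the book: it computes the diameter of the graph formed by a polyhedron's vertices and edges via breadth-first search, applied to a bounded polygon with m = 8 faces, whose vertices form a cycle of length 8, recovering Δ(2, 8) = ⌊8/2⌋ = 4.

```python
from collections import deque

def graph_diameter(adj):
    """Diameter of an undirected graph given as {vertex: set of neighbors}:
    the largest number of edge-jumps needed between any pair of vertices."""
    def eccentricity(src):
        dist = {src: 0}
        queue = deque([src])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        return max(dist.values())
    return max(eccentricity(v) for v in adj)

# The vertices of a bounded polygon with m = 8 faces form a cycle of length 8.
m = 8
cycle = {i: {(i - 1) % m, (i + 1) % m} for i in range(m)}
print(graph_diameter(cycle))  # 4, matching D(2, m) = floor(m/2)
```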
Suppose that the feasible set in a linear programming problem has diameter d and that the distance between vertices x and y is equal to d. If the simplex method (or any other method that proceeds from one vertex to an adjacent vertex) is initialized at x, and if y happens to be the unique optimal solution, then at least d steps will be required. Now, if Δ(n, m) or Δ_u(n, m) increases exponentially with n and m, this implies that there exist examples for which the simplex method takes an exponentially increasing number of steps, no matter which pivoting rule is used. Thus, in order to have any hope of developing pivoting rules under which the simplex method requires a polynomial number of iterations, we must first establish that Δ(n, m) or Δ_u(n, m) grows with n and m at the rate of some polynomial. The practical success of the simplex method has led to the conjecture that indeed Δ(n, m) and Δ_u(n, m) do not grow exponentially fast. In fact, the following, much stronger, conjecture has been advanced:

    Hirsch Conjecture: Δ(n, m) ≤ m − n.
Despite the significance of Δ(n, m) and Δ_u(n, m), we are far from establishing the Hirsch conjecture or even from establishing that these quantities exhibit polynomial growth. It is known (Klee and Walkup, 1967) that the Hirsch conjecture is false for unbounded polyhedra and, in particular, that

    Δ_u(n, m) ≥ m − n + ⌊n/5⌋.
Unfortunately, this is the best lower bound known; even though it disproves the Hirsch conjecture for unbounded polyhedra, it does not provide any insights as to whether the growth of Δ_u(n, m) is polynomial or exponential. Regarding upper bounds, it has been established (Kalai and Kleitman, 1993) that the worst-case diameter grows slower than exponentially, but the available upper bound grows faster than any polynomial. In particular, the following bounds are available:

    Δ(n, m) ≤ Δ_u(n, m) < m^(1+log₂ n) = (2n)^(log₂ m).
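As a quick sanity check on the algebra (this computation is mine, not the book's), the two expressions in the bound above really do coincide, since m^(log₂ n) = 2^(log₂ m · log₂ n) = n^(log₂ m):

```python
import math

# m^(1 + log2 n) and (2n)^(log2 m) agree because
# m^(log2 n) = 2^(log2 m * log2 n) = n^(log2 m).
for n, m in [(3, 7), (8, 20), (10, 50)]:
    lhs = m ** (1 + math.log2(n))
    rhs = (2 * n) ** math.log2(m)
    assert math.isclose(lhs, rhs)
print("both forms of the bound agree")
```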
The average case behavior of the simplex method
Our discussion has been focused on the worst-case behavior of the simplex method, but this is only part of the story. Even if every pivoting rule requires an exponential number of iterations in the worst case, this is not necessarily relevant to the typical behavior of the simplex method. For this reason, there has been a fair amount of research aiming at an understanding of the typical or average behavior of the simplex method, and an explanation of its observed behavior.
The main difficulty in studying the average behavior of any algorithm lies in defining the meaning of the term "average." Basically, one needs to define a probability distribution over the set of all problems of a given size,
and then take the mathematical expectation of the number of iterations
required by the algorithm, when applied to a random problem drawn according to the postulated probability distribution. Unfortunately, there is no natural probability distribution over the set of linear programming problems. Nevertheless, a fair number of positive results have been obtained for a few different types of probability distributions. In one such result, a set of vectors a_1, …, a_m ∈ ℝ^n and scalars b_1, …, b_m is given. For i = 1, …, m, we introduce either the constraint a_i'x ≤ b_i or a_i'x ≥ b_i, with equal probability. We then have 2^m possible linear programming problems, and suppose that L of them are feasible. Haimovich (1983) has established that under a rather special pivoting rule, the simplex method requires no more than n/2 iterations, on the average over those L feasible problems. This linear dependence on the size of the problem agrees with observed behavior; some empirical evidence is discussed in Chapter 12.
3. 8 Summary
This chapter was centered on the development of the simplex method, which is a complete algorithm for solving linear programming problems in standard form. The cornerstones of the simplex method are:
(a) the optimality conditions (nonnegativity of the reduced costs) that allow us to test whether the current basis is optimal;
(b) a systematic method for performing basis changes whenever the optimality conditions are violated.
At a high level, the simplex method simply moves from one extreme point of the feasible set to another, each time reducing the cost, until an optimal solution is reached. However, the lower level details of the simplex method, relating to the organization of the required computations and the associated bookkeeping, play an important role. We have described three different implementations: the naive one, the revised simplex method, and the full tableau implementation. Abstractly, they are all equivalent, but their mechanics are quite different. Practical implementations of the simplex method follow our general description of the revised simplex method, but the details are different, because an explicit computation of the inverse basis matrix is usually avoided.
We have seen that degeneracy can cause substantial difficulties, including the possibility of nonterminating behavior (cycling). This is because in the presence of degeneracy, a change of basis may keep us at the same basic feasible solution, with no cost improvement resulting. Cycling can be avoided if suitable rules for choosing the entering and exiting variables (pivoting rules) are applied (e.g., Bland's rule or the lexicographic pivoting rule).
Starting the simplex method requires an initial basic feasible solution,
and an associated tableau. These are provided by the Phase I simplex
algorithm, which is nothing but the simplex method applied to an auxiliary
problem. We saw that the changeover from Phase I to Phase II involves some delicate steps whenever some artificial variables are in the final basis constructed by the Phase I algorithm.
The simplex method is a rather efficient algorithm and is incorporated
in most of the commercial codes for linear programming. While the number
of pivots can be an exponential function of the number of variables and
constraints in the worst case, its observed behavior is a lot better, hence
the practical usefulness of the method.
3. 9 Exercises
Exercise 3.1 (Local minima of convex functions) Let f : ℝ^n → ℝ be a convex function and let S ⊂ ℝ^n be a convex set. Let x* be an element of S. Suppose that x* is a local optimum for the problem of minimizing f(x) over S; that is, there exists some ε > 0 such that f(x*) ≤ f(x) for all x ∈ S for which ‖x − x*‖ ≤ ε. Prove that x* is globally optimal; that is, f(x*) ≤ f(x) for all x ∈ S.
Exercise 3.2 (Optimality conditions) Consider the problem of minimizing c'x over a polyhedron P. Prove the following:
(a) A feasible solution x is optimal if and only if c'd ≥ 0 for every feasible direction d at x.
(b) A feasible solution x is the unique optimal solution if and only if c'd > 0 for every nonzero feasible direction d at x.
Exercise 3.3 Let x be an element of the standard form polyhedron P = {x ∈ ℝ^n | Ax = b, x ≥ 0}. Prove that a vector d ∈ ℝ^n is a feasible direction at x if and only if Ad = 0 and d_i ≥ 0 for every i such that x_i = 0.
Exercise 3.4 Consider the problem of minimizing c'x over the set P = {x ∈ ℝ^n | Ax = b, Dx ≤ f, Ex ≤ g}. Let x* be an element of P that satisfies Dx* = f, Ex* < g. Show that the set of feasible directions at the point x* is the set

    {d ∈ ℝ^n | Ad = 0, Dd ≤ 0}.
Exercise 3.5 Let P = {x ∈ ℝ³ | x₁ + x₂ + x₃ = 1, x ≥ 0} and consider the vector x = (0, 0, 1). Find the set of feasible directions at x.
Exercise 3.6 (Conditions for a unique optimum) Let x be a basic feasible solution associated with some basis matrix B. Prove the following:
(a) If the reduced cost of every nonbasic variable is positive, then x is the
unique optimal solution.
(b) If x is the unique optimal solution and is nondegenerate, then the reduced
cost of every nonbasic variable is positive.
Exercise 3.7 (Optimality conditions) Consider a feasible solution x to a standard form problem, and let Z = {i | x_i = 0}. Show that x is an optimal solution if and only if the linear programming problem

    minimize   c'd
    subject to Ad = 0,
               d_i ≥ 0,  i ∈ Z,

has an optimal cost of zero. (In this sense, deciding optimality is equivalent to solving a new linear programming problem.)
Exercise 3.8* This exercise deals with the problem of deciding whether a given degenerate basic feasible solution is optimal and shows that this is essentially as hard as solving a general linear programming problem.
Consider the linear programming problem of minimizing c'x over all x ∈ P, where P ⊂ ℝ^n is a given bounded polyhedron. Let

    Q = {(λx, λ) ∈ ℝ^(n+1) | x ∈ P, λ ∈ [0, 1]}.

(a) Show that Q is a polyhedron.
(b) Give an example of P and Q, with n = 2, for which the zero vector (in ℝ^(n+1)) is a degenerate basic feasible solution in Q; show the example in a figure.
(c) Show that the zero vector (in ℝ^(n+1)) minimizes (c, 0)'y over all y ∈ Q if and only if the optimal cost in the original linear programming problem is greater than or equal to zero.
Exercise 3.9 (Necessary and sufficient conditions for a unique optimum) Consider a linear programming problem in standard form and suppose that x* is an optimal basic feasible solution. Consider an optimal basis associated with x*. Let B and N be the sets of basic and nonbasic indices, respectively. Let I be the set of nonbasic indices i for which the corresponding reduced costs c̄_i are zero.
(a) Show that if I is empty, then x* is the only optimal solution.
(b) Show that x* is the unique optimal solution if and only if the following linear programming problem has an optimal value of zero:

    maximize   Σ_{i∈I} x_i
    subject to Ax = b,
               x_i = 0,  i ∈ N \ I,
               x_i ≥ 0,  i ∈ B ∪ I.
Exercise 3.10* Show that if n − m = 2, then the simplex method will not cycle, no matter which pivoting rule is used.
Exercise 3.11* Construct an example with n − m = 3 and a pivoting rule under which the simplex method will cycle.
Exercise 3.12 Consider the problem

    minimize   −2x₁ − x₂
    subject to  x₁ − x₂ ≤ 2
                x₁ + x₂ ≤ 6
                x₁, x₂ ≥ 0.
(a) Convert the problem into standard form and construct a basic feasible solution at which (x₁, x₂) = (0, 0).
(b) Carry out the full tableau implementation of the simplex method, starting with the basic feasible solution of part (a).
(c) Draw a graphical representation of the problem in terms of the original variables (x₁, x₂), and indicate the path taken by the simplex algorithm.
Exercise 3.13 This exercise shows that our efficient procedures for updating a tableau can be derived from a useful fact in numerical linear algebra.
(a) (Matrix inversion lemma) Let C be an m × m invertible matrix and let u, v be vectors in ℝ^m. Show that

    (C + uv')⁻¹ = C⁻¹ − (C⁻¹uv'C⁻¹) / (1 + v'C⁻¹u).

(Note that uv' is an m × m matrix.) Hint: Multiply both sides by (C + uv').
(b) Assuming that C⁻¹ is available, explain how to obtain (C + uv')⁻¹ using only O(m²) arithmetic operations.
(c) Let B and B̄ be basis matrices before and after an iteration of the simplex method. Let A_{B(ℓ)}, A_j be the exiting and entering column, respectively. Show that

    B̄ − B = (A_j − A_{B(ℓ)}) e'_ℓ,

where e_ℓ is the ℓth unit vector.
(d) Note that e'_i B⁻¹ is the ith row of B⁻¹ and e'_ℓ B⁻¹ is the pivot row. Show that

    e'_i B̄⁻¹ = e'_i B⁻¹ + g_i e'_ℓ B⁻¹,   i = 1, …, m,

for suitable scalars g_i. Provide a formula for g_i. Interpret the above equation in terms of the mechanics for pivoting in the revised simplex method.
(e) Multiply both sides of the equation in part (d) by [b | A] and obtain an interpretation of the mechanics for pivoting in the full tableau implementation.
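The identity in part (a) is easy to verify numerically. The sketch below is my own illustration with NumPy, on a small deterministic instance (C = 2I and a rank-one update uv', chosen so that 1 + v'C⁻¹u = 1.5 is safely nonzero):

```python
import numpy as np

# A small deterministic instance: C = 2I, and a rank-one update uv'.
C = 2.0 * np.eye(3)
u = np.array([[1.0], [0.0], [1.0]])
v = np.array([[0.0], [1.0], [1.0]])

Cinv = np.linalg.inv(C)
# Sherman-Morrison: update the inverse using only O(m^2) operations,
# instead of inverting C + uv' from scratch.
denom = 1.0 + float(v.T @ Cinv @ u)          # here 1 + v'C^{-1}u = 1.5
updated = Cinv - (Cinv @ u @ v.T @ Cinv) / denom
print(np.allclose(updated, np.linalg.inv(C + u @ v.T)))  # True
```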
Exercise 3.14 Suppose that a feasible tableau is available. Show how to obtain a tableau with lexicographically positive rows. Hint: Permute the columns.
Exercise 3.15 (Perturbation approach to lexicography) Consider a standard form problem, under the usual assumption that the rows of A are linearly independent. Let ε be a scalar and define

    b(ε) = b + (ε, ε², …, ε^m)'.

For every ε > 0, we define the ε-perturbed problem to be the linear programming problem obtained by replacing b with b(ε).
(a) Given a basis matrix B, show that the corresponding basic solution x_B(ε) in the ε-perturbed problem is equal to

    x_B(ε) = x_B + B⁻¹(ε, ε², …, ε^m)'.

(b) Show that there exists some ε* > 0 such that all basic solutions to the ε-perturbed problem are nondegenerate, for 0 < ε < ε*.
(c) Suppose that all rows of B⁻¹[b | A] are lexicographically positive. Show that x_B(ε) is a basic feasible solution to the ε-perturbed problem for ε positive and sufficiently small.
(d) Consider a feasible basis for the original problem, and assume that all rows of B⁻¹[b | A] are lexicographically positive. Let some nonbasic variable x_j enter the basis, and define u = B⁻¹A_j. Let the exiting variable be determined as follows: for every row i such that u_i is positive, divide the ith row of B⁻¹[b | A] by u_i, compare the results lexicographically, and choose the exiting variable to be the one corresponding to the lexicographically smallest row. Show that this is the same choice of exiting variable as in the original simplex method applied to the ε-perturbed problem, when ε is sufficiently small.
(e) Explain why the revised simplex method, with the lexicographic rule described in part (d), is guaranteed to terminate even in the face of degeneracy.
Exercise 3.16 (Lexicography and the revised simplex method) Suppose that we have a basic feasible solution and an associated basis matrix B such that every row of B⁻¹ is lexicographically positive. Consider a pivoting rule that chooses the entering variable x_j arbitrarily (as long as c̄_j < 0) and the exiting variable as follows. Let u = B⁻¹A_j. For each i with u_i > 0, divide the ith row of [B⁻¹b | B⁻¹] by u_i and choose the row which is lexicographically smallest. If row ℓ was lexicographically smallest, then the ℓth basic variable x_{B(ℓ)} exits the basis. Prove the following:
(a) The row vector (−c'_B B⁻¹b, −c'_B B⁻¹) increases lexicographically at each iteration.
(b) Every row of B⁻¹ is lexicographically positive throughout the algorithm.
(c) The revised simplex method terminates after a finite number of steps.
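The lexicographic comparison in this rule is easy to state in code. The sketch below is my own illustration, not the book's: it picks the exiting row by dividing each eligible row by u_i and taking the lexicographically smallest result. Python's built-in tuple comparison is exactly lexicographic, which is why no explicit comparator is needed.

```python
def lexicographic_exit_row(rows, u):
    """Among rows i with u[i] > 0, divide row i (a tuple of numbers) by u[i]
    and return the index of the lexicographically smallest result.
    Python compares tuples lexicographically, which matches the rule."""
    candidates = [(tuple(x / u[i] for x in rows[i]), i)
                  for i in range(len(rows)) if u[i] > 0]
    return min(candidates)[1]

# Hypothetical data for illustration: three tableau rows and a column u.
rows = [(2.0, 1.0, 0.0), (4.0, 0.0, 1.0), (1.0, 3.0, 2.0)]
u = [1.0, 2.0, 0.0]          # row 2 is not eligible since u[2] = 0
print(lexicographic_exit_row(rows, u))  # 1: (2.0, 0.0, 0.5) < (2.0, 1.0, 0.0)
```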
Exercise 3.17 Solve completely (i.e., both Phase I and Phase II) via the simplex method the following problem:

    minimize   2x₁ − 3x₂ − 3x₃ − x₄ + 2x₅
    subject to  x₁ − 3x₂ − 4x₄ − x₅ = 2
                x₁ − 2x₂ + 3x₄ − x₅ = 2
                x₁ + 4x₂ − 3x₃      = 1
                x₁, …, x₅ ≥ 0.
Exercise 3.18 Consider the simplex method applied to a standard form problem and assume that the rows of the matrix A are linearly independent. For each of the statements that follow, give either a proof or a counterexample.
(a) An iteration of the simplex method may move the feasible solution by a positive distance while leaving the cost unchanged.
(b) A variable that has just left the basis cannot reenter in the very next iteration.
(c) A variable that has just entered the basis cannot leave in the very next iteration.
(d) If there is a nondegenerate optimal basis, then there exists a unique optimal basis.
(e) If x is an optimal solution, no more than m of its components can be positive, where m is the number of equality constraints.
Exercise 3.19 While solving a standard form problem, we arrive at the following tableau, with x₃, x₄, and x₅ being the basic variables:

    10 |  δ   −2    0    0    0
     4 | −1    η    1    0    0
     1 |  α   −4    0    1    0
     β |  γ    3    0    0    1

The entries α, β, γ, δ, η in the tableau are unknown parameters. For each one of the following statements, find some parameter values that will make the statement true.
(a) The current solution is optimal and there are multiple optimal solutions.
(b) The optimal cost is −∞.
(c) The current solution is feasible but not optimal.
Exercise 3.20 Consider a linear programming problem in standard form, described in terms of the following initial tableau:

     0 |  0   0   0   δ   3   γ   ξ
     β |  0   1   0   α   1   0   3
     2 |  0   0   1  −2  −2   η  −1
     3 |  1   0   0   0  −1   2   1

The entries α, β, γ, δ, η, ξ in the tableau are unknown parameters. Furthermore, let B be the basis matrix corresponding to having x₂, x₃, and x₁ (in that order) be the basic variables. For each one of the following statements, find the ranges of values of the various parameters that will make the statement true.
(a) Phase II of the simplex method can be applied using this as an initial tableau.
(b) The first row in the present tableau indicates that the problem is infeasible.
(c) The corresponding basic solution is feasible, but we do not have an optimal basis.
(d) The corresponding basic solution is feasible and the first simplex iteration indicates that the optimal cost is −∞.
(e) The corresponding basic solution is feasible, x₆ is a candidate for entering the basis, and when x₆ is the entering variable, x₃ leaves the basis.
(f) The corresponding basic solution is feasible, x₇ is a candidate for entering the basis, but if it does, the solution and the objective value remain unchanged.
Exercise 3.21 Consider the oil refinery problem in Exercise 1.16.
(a) Use the simplex method to find an optimal solution.
(b) Suppose that the selling price of heating oil is sure to remain fixed over the next month, but the selling price of gasoline may rise. How high can it go without causing the optimal solution to change?
(c) The refinery manager can buy crude oil B on the spot market at $40/barrel, in unlimited quantities. How much should be bought?
Exercise 3.22 Consider the following linear programming problem with a single constraint:

    minimize   Σ_{i=1}^n c_i x_i
    subject to Σ_{i=1}^n a_i x_i = b,
               x_i ≥ 0,  i = 1, …, n.

(a) Derive a simple test for checking the feasibility of this problem.
(b) Assuming that the optimal cost is finite, develop a simple method for obtaining an optimal solution directly.
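A solution sketch for this exercise (my own, under the exercise's assumption that the optimal cost is finite): the problem is feasible exactly when b = 0 or some a_i has the same sign as b, and an optimal solution can then be read off by comparing the costs c_i b / a_i of the one-variable basic feasible solutions x = (b/a_i) e_i.

```python
def solve_single_constraint_lp(c, a, b):
    """Minimize sum(c[i]*x[i]) s.t. sum(a[i]*x[i]) == b, x >= 0,
    assuming the optimal cost is finite.  Each basic feasible solution
    puts all weight on one index i with a[i]*b > 0, giving x[i] = b/a[i]."""
    n = len(c)
    # (a) feasibility test: b == 0 works with x = 0; otherwise some a[i]
    # must have the same sign as b.
    if b == 0:
        return [0.0] * n
    candidates = [i for i in range(n) if a[i] * b > 0]
    if not candidates:
        raise ValueError("infeasible")
    # (b) among the BFSs x = (b/a[i]) e_i, pick the one of smallest cost.
    best = min(candidates, key=lambda i: c[i] * b / a[i])
    x = [0.0] * n
    x[best] = b / a[best]
    return x

print(solve_single_constraint_lp([3.0, 1.0, 2.0], [1.0, 2.0, -1.0], 4.0))
# -> [0.0, 2.0, 0.0]: index 1 has the smallest cost c[1]*b/a[1] = 2
```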
Exercise 3.23 While solving a linear programming problem by the simplex method, the following tableau is obtained at some iteration:

    [tableau illegible in this copy]

Assume that in this tableau we have c̄_j ≥ 0 for j = m + 1, …, n − 1, and c̄_n < 0. In particular, x_n is the only candidate for entering the basis.
(a) Suppose that x_n indeed enters the basis and that this is a nondegenerate pivot (that is, θ* ≠ 0). Prove that x_n will remain basic in all subsequent iterations of the algorithm and that x_n is a basic variable in any optimal basis.
(b) Suppose that x_n indeed enters the basis and that this is a degenerate pivot (that is, θ* = 0). Show that x_n need not be basic in an optimal basic feasible solution.
Exercise 3.24 Show that in Phase I of the simplex method, if an artificial variable becomes nonbasic, it need never again become basic. Thus, when an artificial variable becomes nonbasic, its column can be eliminated from the tableau.
Exercise 3.25 (The simplex method with upper bound constraints) Consider a problem of the form

    minimize   c'x
    subject to Ax = b,
               0 ≤ x ≤ u,

where A has linearly independent rows and dimensions m × n. Assume that u_i > 0 for all i.
(a) Let A_{B(1)}, …, A_{B(m)} be m linearly independent columns of A (the "basic" columns). We partition the set of all indices i ≠ B(1), …, B(m) into two disjoint subsets L and U. We set x_i = 0 for all i ∈ L, and x_i = u_i for all i ∈ U. We then solve the equation Ax = b for the basic variables x_{B(1)}, …, x_{B(m)}. Show that the resulting vector x is a basic solution. Also, show that it is nondegenerate if and only if 0 < x_i < u_i for every basic variable x_i.
(b) For this part and the next, assume that the basic solution constructed in part (a) is feasible. We form the simplex tableau and compute the reduced costs as usual. Let x_j be some nonbasic variable such that x_j = 0 and c̄_j < 0. As in Section 3.2, we increase x_j by θ, and adjust the basic variables from x_B to x_B − θB⁻¹A_j. Given that we wish to preserve feasibility, what is the largest possible value of θ? How are the new basic columns determined?
(c) Let x_j be some nonbasic variable such that x_j = u_j and c̄_j > 0. We decrease x_j by θ, and adjust the basic variables from x_B to x_B + θB⁻¹A_j. Given that we wish to preserve feasibility, what is the largest possible value of θ? How are the new basic columns determined?
(d) Assuming that every basic feasible solution is nondegenerate, show that the cost strictly decreases with each iteration and the method terminates.
Exercise 3.26 (The big-M method) Consider the variant of the big-M method in which M is treated as an undetermined large parameter. Prove the following:
(a) If the simplex method terminates with a solution (x, y) for which y = 0, then x is an optimal solution to the original problem.
(b) If the simplex method terminates with a solution (x, y) for which y ≠ 0, then the original problem is infeasible.
(c) If the simplex method terminates with an indication that the optimal cost in the auxiliary problem is −∞, show that the original problem is either infeasible or its optimal cost is −∞. Hint: When the simplex method terminates, it has discovered a feasible direction d = (d_x, d_y) of cost decrease. Show that d_y = 0.
(d) Provide examples to show that both alternatives in part (c) are possible.
Exercise 3.27*
(a) Suppose that we wish to find a vector x ∈ ℝ^n that satisfies Ax = 0 and x ≥ 0, and such that the number of positive components of x is maximized. Show that this can be accomplished by solving the linear programming problem

    maximize   Σ_{i=1}^n y_i
    subject to A(z + y) = 0,
               y_i ≤ 1,  for all i,
               z, y ≥ 0.

(b) Suppose that we wish to find a vector x ∈ ℝ^n that satisfies Ax = b and x ≥ 0, and such that the number of positive components of x is maximized. Show how this can be accomplished by solving a single linear programming problem.
Exercise 3.28 Consider a linear programming problem in standard form with a bounded feasible set. Furthermore, suppose that we know the value of a scalar U such that any feasible solution satisfies x_i ≤ U, for all i. Show that the problem can be transformed into an equivalent one that contains the constraint Σ_i x_i = 1.
Exercise 3.29 Consider the simplex method, viewed in terms of column geometry. Show that the m + 1 basic points (A_i, c_i), as defined in Section 3.6, are affinely independent.
Exercise 3.30 Consider the simplex method, viewed in terms of column geometry. In the terminology of Section 3.6, show that the vertical distance from the dual plane to a point (A_j, c_j) is equal to the reduced cost of the variable x_j.
Exercise 3.31 Consider the linear programming problem

    minimize   x₁ + 3x₂ + 2x₃ + 2x₄
    subject to 2x₁ + 3x₂ + x₃ + x₄ = β,
               x₁ + 2x₂ + x₃ + 3x₄ = γ,
               x₁ + x₂ + x₃ + x₄ = 1,
               x₁, …, x₄ ≥ 0,

where β and γ are free parameters. Let P(β, γ) be the feasible set. Use the column geometry of linear programming to answer the following questions.
(a) Characterize explicitly (preferably with a picture) the set of all (β, γ) for which P(β, γ) is nonempty.
(b) Characterize explicitly (preferably with a picture) the set of all (β, γ) for which some basic feasible solution is degenerate.
(c) There are four bases in this problem; in the ith basis, all variables except for x_i are basic. For every (β, γ) for which there exists a degenerate basic feasible solution, enumerate all bases that correspond to each degenerate basic feasible solution.
(d) For i = 1, …, 4, let S_i = {(β, γ) | the ith basis is optimal}. Identify, preferably with a picture, the sets S₁, …, S₄.
(e) For which values of (β, γ) is the optimal solution degenerate?
(f) Let β = 9/5 and γ = 7/5. Suppose that we start the simplex method with x₂, x₃, x₄ as the basic variables. Which path will the simplex method follow?
Exercise 3.32* Prove Theorem 3.5.
Exercise 3.33 Consider a polyhedron in standard form, and let x, y be two different basic feasible solutions. If we are allowed to move from any basic feasible solution to an adjacent one in a single step, show that we can go from x to y in a finite number of steps.
3. 10 Notes and sources
3.2. The simplex method was pioneered by Dantzig in 1947, who later wrote a comprehensive text on the subject (Dantzig, 1963).
3.3. For more discussion of practical implementations of the simplex method based on products of sparse matrices, instead of B⁻¹, see the books by Gill, Murray, and Wright (1981), Chvátal (1983), Murty (1983), and Luenberger (1984). An excellent introduction to numerical linear algebra is the text by Golub and Van Loan (1983). Example 3.6, which shows the possibility of cycling, is due to Beale (1955).
If we have upper bounds for all or some of the variables, instead of converting the problem to standard form, we can use a suitable adaptation of the simplex method. This is developed in Exercise 3.25 and in the textbooks that we mentioned earlier.
3.4. The lexicographic anticycling rule is due to Dantzig, Orden, and Wolfe (1955). It can be viewed as an outgrowth of a perturbation method developed by Orden and also by Charnes (1952). For an exposition of the perturbation method, see Chvátal (1983) and Murty (1983), as well as Exercise 3.15. The smallest subscript rule is due to Bland (1977). A proof that Bland's rule avoids cycling can also be found in Papadimitriou and Steiglitz (1982), Chvátal (1983), or Murty (1983).
3.6. The column geometry interpretation of the simplex method is due to Dantzig (1963). For further discussion, see Stone and Tovey (1991).
3.7. The example showing that the simplex method can take an exponential number of iterations is due to Klee and Minty (1972). The Hirsch conjecture was made by Hirsch in 1957. The first results on the average case behavior of the simplex method were obtained by Borgwardt (1982) and Smale (1983). Schrijver (1986) contains an overview of the early research in this area, as well as a proof of the n/2 bound on the number of pivots due to Haimovich (1983).
3.9. The results in Exercises 3.10 and 3.11, which deal with the smallest examples of cycling, are due to Marshall and Suurballe (1969). The matrix inversion lemma [Exercise 3.13(a)] is known as the Sherman-Morrison formula.
Chapter 4
Duality theory
Contents
4.1. Motivation
4.2. The dual problem
4.3. The duality theorem
4.4. Optimal dual variables as marginal costs
4. 5. Standard form problems and the dual simplex method
4.6. Farkas' lemma and linear inequalities
4.7. From separating hyperplanes to duality*
4. 8. Cones and extreme rays
4. 9. Representation of polyhedra
4. 10. General linear programming duality*
4. 1 1 . Summary
4. 12. Exercises
4. 13. Notes and sources
139
140 Chap. 4 Duality theory
In this chapter, we start with a linear programming problem, called the primal, and introduce another linear programming problem, called the dual. Duality theory deals with the relation between these two problems and uncovers the deeper structure of linear programming. It is a powerful theoretical tool that has numerous applications, provides new geometric insights, and leads to another algorithm for linear programming (the dual simplex method).
4. 1 Motivation
Duality theory can be motivated as an outgrowth of the Lagrange multiplier method, often used in calculus to minimize a function subject to equality constraints. For example, in order to solve the problem

    minimize   x² + y²
    subject to x + y = 1,

we introduce a Lagrange multiplier p and form the Lagrangean L(x, y, p) defined by

    L(x, y, p) = x² + y² + p(1 − x − y).

While keeping p fixed, we minimize the Lagrangean over all x and y, subject to no constraints, which can be done by setting ∂L/∂x and ∂L/∂y to zero. The optimal solution to this unconstrained problem is

    x = y = p/2,

and depends on p. The constraint x + y = 1 gives us the additional relation p = 1, and the optimal solution to the original problem is x = y = 1/2.
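The two steps of this example, minimizing the Lagrangean for fixed p and then choosing p so that the constraint is restored, can be traced in a few lines of code (a sketch of the worked example above, nothing beyond it):

```python
def lagrangean_min(p):
    # For fixed p, minimize x^2 + y^2 + p*(1 - x - y) over all (x, y):
    # setting the partial derivatives 2x - p and 2y - p to zero gives
    x = y = p / 2.0
    return x, y

# Choose p so that the unconstrained minimizer satisfies x + y = 1:
# p/2 + p/2 = 1, hence p = 1.
x, y = lagrangean_min(1.0)
print(x, y, x + y)  # 0.5 0.5 1.0
```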
The main idea in the above example is the following. Instead of enforcing the hard constraint x + y = 1, we allow it to be violated and associate a Lagrange multiplier, or price, p, with the amount 1 − x − y by which it is violated. This leads to the unconstrained minimization of x² + y² + p(1 − x − y). When the price is properly chosen (p = 1, in our example), the optimal solution to the constrained problem is also optimal for the unconstrained problem. In particular, under that specific value of p, the presence or absence of the hard constraint does not affect the optimal cost.
The situation in linear programming is similar: we associate a price variable with each constraint and start searching for prices under which the presence or absence of the constraints does not affect the optimal cost.
It turns out that the right prices can be found by solving a new linear programming problem, called the dual of the original. We now motivate
the form of the dual problem.
Consider the standard form problem

    minimize   c'x
    subject to Ax = b,
               x ≥ 0,
which we call the primal problem, and let x* be an optimal solution, assumed to exist. We introduce a relaxed problem in which the constraint Ax = b is replaced by a penalty p'(b − Ax), where p is a price vector of the same dimension as b. We are then faced with the problem

    minimize   c'x + p'(b − Ax)
    subject to x ≥ 0.

Let g(p) be the optimal cost for the relaxed problem, as a function of the price vector p. The relaxed problem allows for more options than those present in the primal problem, and we expect g(p) to be no larger than the optimal primal cost c'x*. Indeed,

    g(p) = min_{x≥0} [c'x + p'(b − Ax)] ≤ c'x* + p'(b − Ax*) = c'x*,
where the last inequality follows from the fact that x* is a feasible solution to the primal problem, and satisfies Ax* = b. Thus, each p leads to a lower bound g(p) for the optimal cost c'x*. The problem

    maximize   g(p)
    subject to no constraints

can then be interpreted as a search for the tightest possible lower bound of this type, and is known as the dual problem. The main result in duality theory asserts that the optimal cost in the dual problem is equal to the optimal cost c'x* in the primal. In other words, when the prices are chosen according to an optimal solution for the dual problem, the option of violating the constraints Ax = b is of no value.
Using the definition of g(p), we have

    g(p) = min_{x≥0} [c'x + p'(b − Ax)]
         = p'b + min_{x≥0} (c' − p'A)x.

Note that

    min_{x≥0} (c' − p'A)x = { 0,    if c' − p'A ≥ 0',
                            { −∞,   otherwise.

In maximizing g(p), we only need to consider those values of p for which g(p) is not equal to −∞. We therefore conclude that the dual problem is the same as the linear programming problem

    maximize   p'b
    subject to p'A ≤ c'.
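This derivation is easy to check on a small instance. The sketch below is my own example with NumPy (the problem data are made up for illustration): it evaluates g(p) = p'b when c' − p'A ≥ 0' and −∞ otherwise, and confirms that every value is a lower bound on the optimal primal cost.

```python
import numpy as np

# Tiny standard-form primal: minimize x1 + 2*x2  s.t.  x1 + x2 = 1, x >= 0.
# Its optimal solution is x* = (1, 0), with cost c'x* = 1.
c = np.array([1.0, 2.0])
A = np.array([[1.0, 1.0]])
b = np.array([1.0])
primal_opt = 1.0

def g(p):
    """g(p) = min_{x>=0} [c'x + p'(b - Ax)] = p'b if c' - p'A >= 0', else -inf."""
    if np.all(c - A.T @ p >= 0):
        return float(p @ b)
    return -np.inf

for p in ([0.0], [0.5], [1.0], [3.0]):
    assert g(np.array(p)) <= primal_opt   # weak duality: g(p) is a lower bound
print(g(np.array([1.0])))  # 1.0: p = 1 attains the optimal primal cost
```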
In the preceding example, we started with the equality constraint Ax = b and we ended up with no constraints on the sign of the price vector p. If the primal problem had instead inequality constraints of the form Ax ≥ b, they could be replaced by Ax − s = b, s ≥ 0. The equality constraint can be written in the form

    [A  −I] [x]  = b,
            [s]

which leads to the dual constraints

    p'[A  −I] ≤ [c'  0'],

or, equivalently,

    p'A ≤ c',    p ≥ 0.

Also, if the vector x is free rather than sign-constrained, we use the fact that

    min_x (c' − p'A)x = { 0,    if c' − p'A = 0',
                        { −∞,   otherwise,

to end up with the constraint p'A = c' in the dual problem. These considerations motivate the general form of the dual problem which we introduce in the next section.
In summary, the construction of the dual of a primal minimization problem can be viewed as follows. We have a vector of parameters (dual variables) p, and for every p we have a method for obtaining a lower bound on the optimal primal cost. The dual problem is a maximization problem that looks for the tightest such lower bound. For some vectors p, the corresponding lower bound is equal to −∞, and does not carry any useful information. Thus, we only need to maximize over those p that lead to nontrivial lower bounds, and this is what gives rise to the dual constraints.
4. 2 The dual problem
Let A be a matrix with rows a_i' and columns A_j. Given a primal problem with the structure shown on the left, its dual is defined to be the maximization problem shown on the right:

    minimize   c'x                      maximize   p'b
    subject to a_i'x ≥ b_i,  i ∈ M₁,    subject to p_i ≥ 0,      i ∈ M₁,
               a_i'x ≤ b_i,  i ∈ M₂,               p_i ≤ 0,      i ∈ M₂,
               a_i'x = b_i,  i ∈ M₃,               p_i free,     i ∈ M₃,
               x_j ≥ 0,      j ∈ N₁,               p'A_j ≤ c_j,  j ∈ N₁,
               x_j ≤ 0,      j ∈ N₂,               p'A_j ≥ c_j,  j ∈ N₂,
               x_j free,     j ∈ N₃,               p'A_j = c_j,  j ∈ N₃.
Notice that for each constraint in the primal (other than the sign con-
straints), we introduce a variable in the dual problem; for each variable in
the primal, we introduce a constraint in the dual. Depending on whether
the primal constraint is an equality or inequality constraint, the corre-
sponding dual variable is either free or sign-constrained, respectively. In
addition, depending on whether a variable in the primal problem is free or
sign-constrained, we have an equality or inequality constraint, respectively,
in the dual problem. We summarize these relations in Table 4.1.
    PRIMAL (minimize)            DUAL (maximize)

    constraints  a_i'x ≥ b_i     p_i ≥ 0      variables
                 a_i'x ≤ b_i     p_i ≤ 0
                 a_i'x = b_i     p_i free

    variables    x_j ≥ 0         p'A_j ≤ c_j  constraints
                 x_j ≤ 0         p'A_j ≥ c_j
                 x_j free        p'A_j = c_j

    Table 4.1: Relation between primal and dual variables and constraints.
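Table 4.1 is mechanical enough to encode directly. The sketch below (with hypothetical helper names) maps each primal constraint type to a dual variable sign and each primal variable sign to a dual constraint type:

```python
# Table 4.1 as two lookup tables (a minimal sketch; names are our own).
CONSTRAINT_TO_VARIABLE = {">=": ">=0", "<=": "<=0", "=": "free"}
VARIABLE_TO_CONSTRAINT = {">=0": "<=", "<=0": ">=", "free": "="}

def dual_signature(constraint_types, variable_signs):
    """Apply Table 4.1: primal constraint types give dual variable signs,
    and primal variable signs give dual constraint types."""
    return ([CONSTRAINT_TO_VARIABLE[t] for t in constraint_types],
            [VARIABLE_TO_CONSTRAINT[s] for s in variable_signs])

# A primal with constraints (=, >=, <=) and variables (>=0, <=0, free):
p_signs, dual_cons = dual_signature(["=", ">=", "<="], [">=0", "<=0", "free"])
assert p_signs == ["free", ">=0", "<=0"]   # p1 free, p2 >= 0, p3 <= 0
assert dual_cons == ["<=", ">=", "="]      # p'A_1 <= c_1, >= c_2, = c_3
```

The test case matches the structure of Example 4.1 that follows.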
If we start with a maximization problem, we can always convert it
into an equivalent minimization problem, and then form its dual according
to the rules we have described. However, to avoid confusion, we will adhere
to the convention that the primal is a minimization problem, and its dual
is a maximization problem. Finally, we will keep referring to the objective
function in the dual problem as a "cost" that is being maximized.
A problem and its dual can be stated more compactly, in matrix
notation, if a particular form is assumed for the primal. We have, for
example, the following pairs of primal and dual problems:
    minimize   c'x          maximize   p'b
    subject to Ax = b       subject to p'A ≤ c',
               x ≥ 0,

and

    minimize   c'x          maximize   p'b
    subject to Ax ≥ b,      subject to p'A = c'
                                       p ≥ 0.
Example 4.1 Consider the primal problem shown on the left and its dual shown
on the right:

    minimize   x1 + 2x2 + 3x3        maximize   5p1 + 6p2 + 4p3
    subject to −x1 + 3x2 = 5         subject to p1 free
               2x1 − x2 + 3x3 ≥ 6               p2 ≥ 0
               x3 ≤ 4                           p3 ≤ 0
               x1 ≥ 0                           −p1 + 2p2 ≤ 1
               x2 ≤ 0                           3p1 − p2 ≥ 2
               x3 free,                         3p2 + p3 = 3.
We transform the dual into an equivalent minimization problem, rename the
variables from p1, p2, p3 to x1, x2, x3, and multiply the three last constraints by
−1. The resulting problem is shown on the left. Then, on the right, we show its
dual:

    minimize   −5x1 − 6x2 − 4x3      maximize   −p1 − 2p2 − 3p3
    subject to x1 free               subject to p1 − 3p2 = −5
               x2 ≥ 0                           −2p1 + p2 − 3p3 ≤ −6
               x3 ≤ 0                           −p3 ≥ −4
               x1 − 2x2 ≥ −1                    p1 ≥ 0
               −3x1 + x2 ≤ −2                   p2 ≤ 0
               −3x2 − x3 = −3,                  p3 free.
We observe that the latter problem is equivalent to the primal problem we started
with. (The first three constraints in the latter problem are the same as the first
three constraints in the original problem, multiplied by −1. Also, if the maxi-
mization in the latter problem is changed to a minimization, by multiplying the
objective function by −1, we obtain the cost function in the original problem.)
The first primal problem considered in Example 4.1 had all of the
ingredients of a general linear programming problem. This suggests that
the conclusion reached at the end of the example should hold in general.
Indeed, we have the following result. Its proof needs nothing more than
the steps followed in Example 4.1, with abstract symbols replacing specific
numbers, and will therefore be omitted.
Theorem 4.1 If we transform the dual into an equivalent minimiza-
tion problem and then form its dual, we obtain a problem equivalent
to the original problem.
A compact statement that is often used to describe Theorem 4.1 is
that "the dual of the dual is the primal."
Any linear programming problem can be manipulated into one of
several equivalent forms, for example, by introducing slack variables or by
using the difference of two nonnegative variables to replace a single free
variable. Each equivalent form leads to a somewhat different form for the
dual problem. Nevertheless, the examples that follow indicate that the
duals of equivalent problems are equivalent.
Example 4.2 Consider the primal problem shown on the left and its dual shown
on the right:

    minimize   c'x          maximize   p'b
    subject to Ax ≥ b       subject to p ≥ 0
               x free,                 p'A = c'.

We transform the primal problem by introducing surplus variables and then ob-
tain its dual:

    minimize   c'x + 0's    maximize   p'b
    subject to Ax − s = b   subject to p free
               x free                  p'A = c'
               s ≥ 0,                  −p ≤ 0'.

Alternatively, if we take the original primal problem and replace x by sign-
constrained variables, we obtain the following pair of problems:

    minimize   c'x⁺ − c'x⁻      maximize   p'b
    subject to Ax⁺ − Ax⁻ ≥ b    subject to p ≥ 0
               x⁺ ≥ 0                      p'A ≤ c'
               x⁻ ≥ 0,                     −p'A ≤ −c'.

Note that we have three equivalent forms of the primal. We observe that the
constraint p ≥ 0 is equivalent to the constraint −p ≤ 0'. Furthermore, the con-
straint p'A = c' is equivalent to the two constraints p'A ≤ c' and −p'A ≤ −c'.
Thus, the duals of the three variants of the primal problem are also equivalent.
The next example is in the same spirit and examines the effect of
removing redundant equality constraints in a standard form problem.
Example 4.3 Consider a standard form problem, assumed feasible, and its
dual:

    minimize   c'x          maximize   p'b
    subject to Ax = b       subject to p'A ≤ c'.
               x ≥ 0,

Let a_1', …, a_m' be the rows of A and suppose that a_m' = Σ_{i=1}^{m−1} γ_i a_i' for some
scalars γ_1, …, γ_{m−1}. In particular, the last equality constraint is redundant and
can be eliminated. By considering an arbitrary feasible solution x, we obtain

    b_m = a_m'x = Σ_{i=1}^{m−1} γ_i a_i'x = Σ_{i=1}^{m−1} γ_i b_i.     (4.1)

Note that the dual constraints are of the form Σ_{i=1}^{m} p_i a_i' ≤ c' and can be rewritten
as

    Σ_{i=1}^{m−1} (p_i + γ_i p_m) a_i' ≤ c'.

Furthermore, using Eq. (4.1), the dual cost Σ_{i=1}^{m} p_i b_i is equal to

    Σ_{i=1}^{m−1} (p_i + γ_i p_m) b_i.

If we now let q_i = p_i + γ_i p_m, we see that the dual problem is equivalent to

    maximize   Σ_{i=1}^{m−1} q_i b_i
    subject to Σ_{i=1}^{m−1} q_i a_i' ≤ c'.

We observe that this is the exact same dual that we would have obtained if we
had eliminated the last (and redundant) constraint in the primal problem, before
forming the dual.
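The conclusion of Example 4.3 can be checked numerically. The sketch below (illustrative data, not from the text) uses `scipy.optimize.linprog` to solve the dual with and without the redundant row and confirms that the optimal dual cost is unchanged:

```python
import numpy as np
from scipy.optimize import linprog

# Feasible standard-form primal whose third row of A is the sum of the
# first two (illustrative data): the last equality constraint is redundant.
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
b = np.array([1.0, 2.0, 3.0])   # b3 = b1 + b2, so the problem stays feasible
c = np.array([1.0, 1.0])

# Duals: maximize p'b s.t. A'p <= c (p free), with and without the last row.
# linprog minimizes, so we minimize -b'p and negate the result.
full = linprog(-b, A_ub=A.T, b_ub=c, bounds=[(None, None)] * 3)
reduced = linprog(-b[:2], A_ub=A[:2].T, b_ub=c, bounds=[(None, None)] * 2)

assert full.status == 0 and reduced.status == 0
assert np.isclose(-full.fun, -reduced.fun)   # same optimal dual cost
assert np.isclose(-full.fun, 3.0)            # equals the primal cost c'x = 3
```

Here the unique primal feasible point is x = (1, 2) with cost 3, and both duals attain that value.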
The conclusions of the preceding two examples are summarized and gener-
alized by the following result.

Theorem 4.2 Suppose that we have transformed a linear program-
ming problem Π1 to another linear programming problem Π2, by a
sequence of transformations of the following types:
(a) Replace a free variable with the difference of two nonnegative
variables.
(b) Replace an inequality constraint by an equality constraint involv-
ing a nonnegative slack variable.
(c) If some row of the matrix A in a feasible standard form problem
is a linear combination of the other rows, eliminate the corre-
sponding equality constraint.
Then, the duals of Π1 and Π2 are equivalent, i.e., they are either both
infeasible, or they have the same optimal cost.

The proof of Theorem 4.2 involves a combination of the various steps
in Examples 4.2 and 4.3, and is left to the reader.
4.3 The duality theorem
We saw in Section 4.1 that for problems in standard form, the cost g(p)
of any dual solution provides a lower bound for the optimal cost. We now
show that this property is true in general.

Theorem 4.3 (Weak duality) If x is a feasible solution to the primal
problem and p is a feasible solution to the dual problem, then

    p'b ≤ c'x.
Proof. For any vectors x and p, we define

    u_i = p_i (a_i'x − b_i),
    v_j = (c_j − p'A_j) x_j.

Suppose that x and p are primal and dual feasible, respectively. The def-
inition of the dual problem requires the sign of p_i to be the same as the
sign of a_i'x − b_i, and the sign of c_j − p'A_j to be the same as the sign of x_j.
Thus, primal and dual feasibility imply that

    u_i ≥ 0,  for all i,
and
    v_j ≥ 0,  for all j.

Notice that

    Σ_i u_i = p'Ax − p'b,
and
    Σ_j v_j = c'x − p'Ax.

We add these two equalities and use the nonnegativity of u_i, v_j, to obtain

    0 ≤ Σ_i u_i + Σ_j v_j = c'x − p'b.   □
The weak duality theorem is not a deep result , yet it does provide
some useful information about the relation between the primal and the
dual. We have, for example, the following corollary.
Corollary 4.1
(a) If the optimal cost in the primal is −∞, then the dual problem
must be infeasible.
(b) If the optimal cost in the dual is +∞, then the primal problem
must be infeasible.

Proof. Suppose that the optimal cost in the primal problem is −∞ and
that the dual problem has a feasible solution p. By weak duality, p satisfies
p'b ≤ c'x for every primal feasible x. Taking the minimum over all primal
feasible x, we conclude that p'b ≤ −∞. This is impossible and shows that
the dual cannot have a feasible solution, thus establishing part (a). Part
(b) follows by a symmetrical argument.   □
Another important corollary of the weak duality theorem is the fol
lowing.
Corollary 4.2 Let x and p be feasible solutions to the primal and
the dual, respectively, and suppose that p'b = c'x. Then, x and p are
optimal solutions to the primal and the dual, respectively.

Proof. Let x and p be as in the statement of the corollary. For every primal
feasible solution y, the weak duality theorem yields c'x = p'b ≤ c'y, which
proves that x is optimal. The proof of optimality of p is similar.   □
The next theorem is the central result on linear programming dual-
ity.

Theorem 4.4 (Strong duality) If a linear programming problem
has an optimal solution, so does its dual, and the respective optimal
costs are equal.
Proof. Consider the standard form problem

    minimize   c'x
    subject to Ax = b
               x ≥ 0.

Let us assume temporarily that the rows of A are linearly independent and
that there exists an optimal solution. Let us apply the simplex method to
this problem. As long as cycling is avoided, e.g., by using the lexicographic
pivoting rule, the simplex method terminates with an optimal solution x
and an optimal basis B. Let x_B = B⁻¹b be the corresponding vector of
basic variables. When the simplex method terminates, the reduced costs
must be nonnegative and we obtain

    c' − c_B'B⁻¹A ≥ 0',

where c_B is the vector with the costs of the basic variables. Let us define
a vector p by letting p' = c_B'B⁻¹. We then have p'A ≤ c', which shows
that p is a feasible solution to the dual problem

    maximize   p'b
    subject to p'A ≤ c'.

In addition,

    p'b = c_B'B⁻¹b = c_B'x_B = c'x.

It follows that p is an optimal solution to the dual (cf. Corollary 4.2), and
the optimal dual cost is equal to the optimal primal cost.
If we are dealing with a general linear programming problem Π1 that
has an optimal solution, we first transform it into an equivalent standard
form problem Π2, with the same optimal cost, and in which the rows of the
matrix A are linearly independent. Let D1 and D2 be the duals of Π1 and
Π2, respectively. By Theorem 4.2, the dual problems D1 and D2 have the
same optimal cost. We have already proved that Π2 and D2 have the same
optimal cost. It follows that Π1 and D1 have the same optimal cost (see
Figure 4.1).   □
Figure 4.1: Proof of the duality theorem for general linear pro-
gramming problems. (Π1 and Π2 are equivalent; duality holds for
the standard form problem Π2; and the duals of equivalent problems
are equivalent.)
The preceding proof shows that an optimal solution to the dual prob-
lem is obtained as a byproduct of the simplex method as applied to a primal
problem in standard form. It is based on the fact that the simplex method
is guaranteed to terminate and this, in turn, depends on the existence of
pivoting rules that prevent cycling. There is an alternative derivation of the
duality theorem, which provides a geometric, algorithm-independent view
of the subject, and which is developed in Section 4.7. At this point, we
provide an illustration that conveys most of the content of the geometric
proof.
Example 4.4 Consider a solid ball constrained to lie in a polyhedron defined
by inequality constraints of the form a_i'x ≥ b_i. If left under the influence of
gravity, this ball reaches equilibrium at the lowest corner x* of the polyhedron;
see Figure 4.2. This corner is an optimal solution to the problem

    minimize   c'x
    subject to a_i'x ≥ b_i,  for all i,

where c is a vertical vector pointing upwards. At equilibrium, gravity is counter-
balanced by the forces exerted on the ball by the "walls" of the polyhedron. The
latter forces are normal to the walls, that is, they are aligned with the vectors a_i.
We conclude that c = Σ_i p_i a_i, for some nonnegative coefficients p_i; in particular,
the vector p is a feasible solution to the dual problem

    maximize   p'b
    subject to p'A = c'
               p ≥ 0.

Given that forces can only be exerted by the walls that touch the ball, we must
have p_i = 0 whenever a_i'x* > b_i. Consequently, p_i(b_i − a_i'x*) = 0 for all i. We
therefore have p'b = Σ_i p_i b_i = Σ_i p_i a_i'x* = c'x*. It follows (Corollary 4.2) that
p is an optimal solution to the dual, and the optimal dual cost is equal to the
optimal primal cost.
Figure 4.2: A mechanical analogy of the duality theorem.
Recall that in a linear programming problem, exactly one of the fol-
lowing three possibilities will occur:
(a) There is an optimal solution.
(b) The problem is "unbounded"; that is, the optimal cost is −∞ (for
minimization problems), or +∞ (for maximization problems).
(c) The problem is infeasible.
This leads to nine possible combinations for the primal and the dual, which
are shown in Table 4.2. By the strong duality theorem, if one problem has
an optimal solution, so does the other. Furthermore, as discussed earlier,
the weak duality theorem implies that if one problem is unbounded, the
other must be infeasible. This allows us to mark some of the entries in
Table 4.2 as "impossible."
                      Finite optimum   Unbounded    Infeasible
    Finite optimum    Possible         Impossible   Impossible
    Unbounded         Impossible       Impossible   Possible
    Infeasible        Impossible       Possible     Possible

    Table 4.2: The different possibilities for the primal and the dual.
The case where both problems are infeasible can indeed occur, as shown by
the following example.

Example 4.5 Consider the infeasible primal

    minimize   x1 + 2x2
    subject to x1 + x2 = 1
               2x1 + 2x2 = 3.

Its dual is

    maximize   p1 + 3p2
    subject to p1 + 2p2 = 1
               p1 + 2p2 = 2,

which is also infeasible.
There is another interesting relation between the primal and the dual
which is known as Clark's theorem (Clark, 1961). It asserts that unless
both problems are infeasible, at least one of them must have an unbounded
feasible set (Exercise 4.21).
Complementary slackness

An important relation between primal and dual optimal solutions is pro-
vided by the complementary slackness conditions, which we present next.

Theorem 4.5 (Complementary slackness) Let x and p be feasible
solutions to the primal and the dual problem, respectively. The vectors
x and p are optimal solutions for the two respective problems if and
only if

    p_i (a_i'x − b_i) = 0,  for all i,
    (c_j − p'A_j) x_j = 0,  for all j.

Proof. In the proof of Theorem 4.3, we defined u_i = p_i(a_i'x − b_i) and
v_j = (c_j − p'A_j)x_j, and noted that for x primal feasible and p dual feasible,
we have u_i ≥ 0 and v_j ≥ 0 for all i and j. In addition, we showed that

    c'x − p'b = Σ_i u_i + Σ_j v_j.

By the strong duality theorem, if x and p are optimal, then c'x = p'b,
which implies that u_i = v_j = 0 for all i, j. Conversely, if u_i = v_j = 0 for all
i, j, then c'x = p'b, and Corollary 4.2 implies that x and p are optimal.
□
The first complementary slackness condition is automatically satis-
fied by every feasible solution to a problem in standard form. If the pri-
mal problem is not in standard form and has a constraint like a_i'x ≥ b_i,
the corresponding complementary slackness condition asserts that the dual
variable p_i is zero unless the constraint is active. An intuitive explanation
is that a constraint which is not active at an optimal solution can be re-
moved from the problem without affecting the optimal cost, and there is no
point in associating a nonzero price with such a constraint. Note also the
analogy with Example 4.4, where "forces" were only exerted by the active
constraints.

If the primal problem is in standard form and a nondegenerate optimal
basic feasible solution is known, the complementary slackness conditions
determine a unique solution to the dual problem. We illustrate this fact in
the next example.
Example 4.6 Consider a problem in standard form and its dual:

    minimize   13x1 + 10x2 + 6x3      maximize   8p1 + 3p2
    subject to 5x1 + x2 + 3x3 = 8     subject to 5p1 + 3p2 ≤ 13
               3x1 + x2 = 3                      p1 + p2 ≤ 10
               x1, x2, x3 ≥ 0,                   3p1 ≤ 6.

As will be verified shortly, the vector x* = (1, 0, 1) is a nondegenerate optimal
solution to the primal problem. Assuming this to be the case, we use the comple-
mentary slackness conditions to construct the optimal solution to the dual. The
condition p_i(a_i'x* − b_i) = 0 is automatically satisfied for each i, since the primal
is in standard form. The condition (c_j − p'A_j)x_j = 0 is clearly satisfied for j = 2,
because x2 = 0. However, since x1 > 0 and x3 > 0, we obtain

    5p1 + 3p2 = 13
and
    3p1 = 6,

which we can solve to obtain p1 = 2 and p2 = 1. Note that this is a dual feasible
solution whose cost is equal to 19, which is the same as the cost of x*. This
verifies that x* is indeed an optimal solution as claimed earlier.
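The construction in Example 4.6 amounts to solving a small linear system: for every j with x_j > 0, complementary slackness forces p'A_j = c_j. A sketch using the example's data:

```python
import numpy as np

# Data of Example 4.6: minimize 13*x1 + 10*x2 + 6*x3
# subject to 5*x1 + x2 + 3*x3 = 8, 3*x1 + x2 = 3, x >= 0.
A = np.array([[5.0, 1.0, 3.0],
              [3.0, 1.0, 0.0]])
c = np.array([13.0, 10.0, 6.0])
b = np.array([8.0, 3.0])

x_star = np.array([1.0, 0.0, 1.0])            # nondegenerate optimal BFS
assert np.allclose(A @ x_star, b)             # primal feasibility

# Complementary slackness: for every j with x_j > 0, force p'A_j = c_j.
basic = x_star > 0                            # columns 1 and 3
p = np.linalg.solve(A[:, basic].T, c[basic])  # solve A_j'p = c_j, j basic
assert np.allclose(p, [2.0, 1.0])             # p1 = 2, p2 = 1 as in the text
assert np.isclose(p @ b, c @ x_star)          # both costs equal 19
```

Because x* is nondegenerate, the basic columns are square and linearly independent, so the system has the unique solution computed above.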
We now generalize the above example. Suppose that x_j is a ba-
sic variable in a nondegenerate optimal basic feasible solution to a primal
problem in standard form. Then, the complementary slackness condition
(c_j − p'A_j)x_j = 0 yields p'A_j = c_j for every such j. Since the basic columns
A_j are linearly independent, we obtain a system of equations for p which
has a unique solution, namely, p' = c_B'B⁻¹. A similar conclusion can also
be drawn for problems not in standard form (Exercise 4.12). On the other
hand, if we are given a degenerate optimal basic feasible solution to the
primal, complementary slackness may be of very little help in determining
an optimal solution to the dual problem (Exercise 4.17).

We finally mention that if the primal constraints are of the form
Ax ≥ b, x ≥ 0, and the primal problem has an optimal solution, then
there exist optimal solutions to the primal and the dual which satisfy strict
complementary slackness, that is, a variable in one problem is nonzero if
and only if the corresponding constraint in the other problem is active
(Exercise 4.20). This result has some interesting applications in discrete
optimization, but these lie outside the scope of this book.
A geometric view

We now develop a geometric view that allows us to visualize pairs of primal
and dual vectors without having to draw the dual feasible set.

We consider the primal problem

    minimize   c'x
    subject to a_i'x ≥ b_i,   i = 1, …, m,

where the dimension of x is equal to n. We assume that the vectors a_i span
ℝⁿ. The corresponding dual problem is

    maximize   p'b
    subject to Σ_{i=1}^{m} p_i a_i = c
               p ≥ 0.

Let I be a subset of {1, …, m} of cardinality n, such that the vectors
a_i, i ∈ I, are linearly independent. The system a_i'x = b_i, i ∈ I, has a unique
solution, denoted by x^I, which is a basic solution to the primal problem
(cf. Definition 2.9 in Section 2.2). We assume that x^I is nondegenerate,
that is, a_i'x ≠ b_i for i ∉ I.
Let p ∈ ℝᵐ be a dual vector (not necessarily dual feasible), and let
us consider what is required for x^I and p to be optimal solutions to the
primal and the dual problem, respectively. We need:
(a) a_i'x^I ≥ b_i, for all i,        (primal feasibility),
(b) p_i = 0, for all i ∉ I,          (complementary slackness),
(c) Σ_{i=1}^{m} p_i a_i = c,         (dual feasibility),
(d) p ≥ 0,                           (dual feasibility).
Figure 4.3: Consider a primal problem with two variables and
five inequality constraints (n = 2, m = 5), and suppose that no
two of the vectors a_i are collinear. Every two-element subset I
of {1, 2, 3, 4, 5} determines basic solutions x^I and p^I of the primal
and the dual, respectively.
If I = {1, 2}, x^I is primal infeasible (point A) and p^I is dual in-
feasible, because c cannot be expressed as a nonnegative linear
combination of the vectors a1 and a2.
If I = {1, 3}, x^I is primal feasible (point B) and p^I is dual infea-
sible.
If I = {1, 4}, x^I is primal feasible (point C) and p^I is dual feasible,
because c can be expressed as a nonnegative linear combination of
the vectors a1 and a4. In particular, x^I and p^I are optimal.
If I = {1, 5}, x^I is primal infeasible (point D) and p^I is dual
feasible.
Given the complementary slackness condition (b), condition (c) becomes

    Σ_{i∈I} p_i a_i = c.

Since the vectors a_i, i ∈ I, are linearly independent, the latter equation
has a unique solution that we denote by p^I. In fact, it is readily seen
that the vectors a_i, i ∈ I, form a basis for the dual problem (which is in
standard form) and p^I is the associated basic solution. For the vector p^I
to be dual feasible, we also need it to be nonnegative. We conclude that
once the complementary slackness condition (b) is enforced, feasibility of
Figure 4.4: The vector x* is a degenerate basic feasible solution
of the primal. If we choose I = {1, 2}, the corresponding dual
basic solution p^I is infeasible, because c is not a nonnegative linear
combination of a1, a2. On the other hand, if we choose I = {1, 3}
or I = {2, 3}, the resulting dual basic solution p^I is feasible and,
therefore, optimal.
the resulting dual vector p^I is equivalent to c being a nonnegative linear
combination of the vectors a_i, i ∈ I, associated with the active primal
constraints. This allows us to visualize dual feasibility without having to
draw the dual feasible set; see Figure 4.3.

If x* is a degenerate basic solution to the primal, there can be several
subsets I such that x^I = x*. Using different choices for I, and by solving
the system Σ_{i∈I} p_i a_i = c, we may obtain several dual basic solutions p^I. It
may then well be the case that some of them are dual feasible and some are
not; see Figure 4.4. Still, if p^I is dual feasible (i.e., all p_i are nonnegative)
and if x* is primal feasible, then they are both optimal, because we have
been enforcing complementary slackness and Theorem 4.5 applies.
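The enumeration suggested by Figures 4.3 and 4.4 can be carried out numerically. The sketch below uses an illustrative problem (not the one in the figures): for each two-element subset I it solves a_i'x = b_i for x^I and Σ_{i∈I} p_i a_i = c for p^I, and checks that whenever both happen to be feasible, the pair is optimal.

```python
import numpy as np
from itertools import combinations

# Illustrative primal: minimize x1 + x2
# subject to x1 >= 0, x2 >= 0, x1 + x2 >= 1  (optimal cost 1).
a = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
b = np.array([0.0, 0.0, 1.0])
c = np.array([1.0, 1.0])

for I in combinations(range(3), 2):
    AI = a[list(I)]
    if abs(np.linalg.det(AI)) < 1e-12:
        continue                              # a_i, i in I, not independent
    x = np.linalg.solve(AI, b[list(I)])       # basic solution: a_i'x = b_i
    pI = np.linalg.solve(AI.T, c)             # sum_{i in I} p_i a_i = c
    primal_ok = np.all(a @ x >= b - 1e-9)
    dual_ok = np.all(pI >= -1e-9)
    if primal_ok and dual_ok:                 # complementary slackness holds
        assert np.isclose(c @ x, 1.0)         # so the pair must be optimal
```

For this data, I = {1, 3} and I = {2, 3} give primal- and dual-feasible pairs (both attain cost 1), while I = {1, 2} gives the primal-infeasible corner x = (0, 0).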
4.4 Optimal dual variables as marginal costs
In this section, we elaborate on the interpretation of the dual variables as
prices. This theme will be revisited, in more depth, in Chapter 5.
Consider the standard form problem

    minimize   c'x
    subject to Ax = b
               x ≥ 0.

We assume that the rows of A are linearly independent and that there
is a nondegenerate basic feasible solution x* which is optimal. Let B be
the corresponding basis matrix and let x_B = B⁻¹b be the vector of basic
variables, which is positive, by nondegeneracy. Let us now replace b by
b + d, where d is a small perturbation vector. Since B⁻¹b > 0, we also
have B⁻¹(b + d) > 0, as long as d is small. This implies that the same
basis leads to a basic feasible solution of the perturbed problem as well.
Perturbing the right-hand side vector b has no effect on the reduced costs
associated with this basis. By the optimality of x* in the original problem,
the vector of reduced costs c' − c_B'B⁻¹A is nonnegative and this establishes
that the same basis is optimal for the perturbed problem as well. Thus,
the optimal cost in the perturbed problem is

    c_B'B⁻¹(b + d) = p'(b + d),

where p' = c_B'B⁻¹ is an optimal solution to the dual problem. Therefore, a
small change of d in the right-hand side vector b results in a change of p'd
in the optimal cost. We conclude that each component p_i of the optimal
dual vector can be interpreted as the marginal cost (or shadow price) per
unit increase of the ith requirement b_i.
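The marginal-cost interpretation is easy to check with a solver. In the sketch below (illustrative data, not from the text) the optimal basis puts x1 = b1, so the dual vector is p = 1, and perturbing b by d = 0.1 changes the optimal cost by exactly p·d:

```python
import numpy as np
from scipy.optimize import linprog

# Illustrative problem: minimize x1 + 2*x2 s.t. x1 + x2 = b1, x >= 0.
c = [1.0, 2.0]
A_eq = [[1.0, 1.0]]

base = linprog(c, A_eq=A_eq, b_eq=[3.0], bounds=[(0, None)] * 2)
pert = linprog(c, A_eq=A_eq, b_eq=[3.1], bounds=[(0, None)] * 2)

# The optimal BFS is x = (3, 0), nondegenerate, with p' = c_B'B^{-1} = 1.
p = 1.0
assert np.isclose(base.fun, 3.0)
assert np.isclose(pert.fun - base.fun, p * 0.1)   # cost changes by p'd
```

With the HiGHS backend, `base.eqlin.marginals` reports the dual variables of the equality constraints directly; the sign conventions are spelled out in the scipy documentation.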
We conclude with yet another interpretation of duality, for standard
form problems. In order to develop some concrete intuition, we phrase
our discussion in terms of the diet problem (Example 1.3 in Section 1.1).
We interpret each vector A_j as the nutritional content of the jth available
food, and view b as the nutritional content of an ideal food that we wish to
synthesize. Let us interpret p_i as the "fair" price per unit of the ith nutrient.
A unit of the jth food has a value of c_j at the food market, but it also has
a value of p'A_j if priced at the nutrient market. Complementary slackness
asserts that every food which is used (at a nonzero level) to synthesize the
ideal food, should be consistently priced at the two markets. Thus, duality
is concerned with two alternative ways of cost accounting. The value of the
ideal food, as computed in the food market, is c'x*, where x* is an optimal
solution to the primal problem; the value of the ideal food, as computed
in the nutrient market, is p'b. The duality relation c'x* = p'b states that
when prices are chosen appropriately, the two accounting methods should
give the same results.
4.5 Standard form problems and the dual simplex method

In this section, we concentrate on the case where the primal problem is in
standard form. We develop the dual simplex method, which is an alternative
to the simplex method of Chapter 3. We also comment on the relation
between the basic feasible solutions to the primal and the dual, including
a discussion of dual degeneracy.
In the proof of the strong duality theorem, we considered the simplex
method applied to a primal problem in standard form and defined a dual
vector p by letting p' = c_B'B⁻¹. We then noted that the primal optimality
condition c' − c_B'B⁻¹A ≥ 0' is the same as the dual feasibility condition
p'A ≤ c'. We can thus think of the simplex method as an algorithm that
maintains primal feasibility and works towards dual feasibility. A method
with this property is generally called a primal algorithm. An alternative is
to start with a dual feasible solution and work towards primal feasibility. A
method of this type is called a dual algorithm. In this section, we present a
dual simplex method, implemented in terms of the full tableau. We argue
that it does indeed solve the dual problem, and we show that it moves from
one basic feasible solution of the dual problem to another. An alternative
implementation that only keeps track of the matrix B⁻¹, instead of the
entire tableau, is called a revised dual simplex method (Exercise 4.23).
The dual simplex method

Let us consider a problem in standard form, under the usual assumption
that the rows of the matrix A are linearly independent. Let B be a basis
matrix, consisting of m linearly independent columns of A, and consider
the corresponding tableau

    −c_B'x_B |   c̄'
    ---------+---------
      B⁻¹b   |  B⁻¹A

or, in more detail,

    −c_B'x_B | c̄_1  ⋯  c̄_n
    ---------+-------------
     x_B(1)  |
       ⋮     |    B⁻¹A
     x_B(m)  |

We do not require B⁻¹b to be nonnegative, which means that we
have a basic, but not necessarily feasible solution to the primal problem.
However, we assume that c̄ ≥ 0; equivalently, the vector p' = c_B'B⁻¹
satisfies p'A ≤ c', and we have a feasible solution to the dual problem.
The cost of this dual feasible solution is p'b = c_B'B⁻¹b = c_B'x_B, which
is the negative of the entry at the upper left corner of the tableau. If
the inequality B⁻¹b ≥ 0 happens to hold, we also have a primal feasible
solution with the same cost, and optimal solutions to both problems have
been found. If the inequality B⁻¹b ≥ 0 fails to hold, we perform a change
of basis in a manner we describe next.
We find some ℓ such that x_B(ℓ) < 0 and consider the ℓth row of the
tableau, called the pivot row; this row is of the form (x_B(ℓ), v_1, …, v_n),
where v_i is the ℓth component of B⁻¹A_i. For each i with v_i < 0 (if such i
exist), we form the ratio c̄_i/|v_i| and let j be an index for which this ratio
is smallest; that is, v_j < 0 and

    c̄_j / |v_j| = min_{ {i | v_i < 0} } c̄_i / |v_i|.     (4.2)

(We call the corresponding entry v_j the pivot element.) Note that x_j
must be a nonbasic variable, since the jth column in the tableau contains the
negative element v_j. We then perform a change of basis: column A_j
enters the basis and column A_B(ℓ) exits. This change of basis (or pivot) is
effected exactly as in the primal simplex method: we add to each row of the
tableau a multiple of the pivot row so that all entries in the pivot column
are set to zero, with the exception of the pivot element which is set to 1. In
particular, in order to set the reduced cost in the pivot column to zero, we
multiply the pivot row by c̄_j/|v_j| and add it to the zeroth row. For every
i, the new value of c̄_i is equal to

    c̄_i + (c̄_j / |v_j|) v_i,

which is nonnegative because of the way that j was selected [cf. Eq. (4.2)].
We conclude that the reduced costs in the new tableau will also be nonneg-
ative and dual feasibility has been maintained.
Example 4.7 Consider the tableau

            x1    x2    x3    x4    x5
      0  |   2     6    10     0     0
      2  |  −2     4     1     1     0
     −1  |   4    −2*   −3     0     1

Since x_B(2) < 0, we choose the second row to be the pivot row. Negative entries
of the pivot row are found in the second and third columns. We compare the
corresponding ratios 6/|−2| and 10/|−3|. The smallest ratio is 6/|−2| and,
therefore, the second column enters the basis. (The pivot element is indicated by
an asterisk.) We multiply the pivot row by 3 and add it to the zeroth row. We
multiply the pivot row by 2 and add it to the first row. We then divide the pivot
row by −2. The new tableau is

            x1    x2    x3    x4    x5
     −3  |  14     0     1     0     3
      0  |   6     0    −5     1     2
    1/2  |  −2     1    3/2    0   −1/2

The cost has increased to 3. Furthermore, we now have B⁻¹b ≥ 0, and an
optimal solution has been found.
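One iteration of the method is a small tableau operation; the sketch below implements it and reproduces the pivot of Example 4.7 (the row-selection rule here picks the most negative x_B(i), one of several valid choices):

```python
import numpy as np

def dual_simplex_pivot(T):
    """One dual simplex iteration on a full tableau T.
    Row 0 holds (-cost, reduced costs); rows 1.. hold (x_B(i), row of B^-1 A)."""
    T = T.astype(float).copy()
    ell = 1 + int(np.argmin(T[1:, 0]))        # a row with x_B(ell) < 0
    if T[ell, 0] >= 0:
        return T                              # already primal feasible
    row = T[ell, 1:]
    neg = row < 0
    if not neg.any():
        raise ValueError("optimal dual cost is +inf; primal infeasible")
    ratios = np.full(row.shape, np.inf)
    ratios[neg] = T[0, 1:][neg] / -row[neg]   # ratios c_i / |v_i|, cf. Eq. (4.2)
    j = 1 + int(np.argmin(ratios))            # entering column
    for i in range(T.shape[0]):               # zero out the pivot column
        if i != ell:
            T[i] -= T[i, j] / T[ell, j] * T[ell]
    T[ell] /= T[ell, j]                       # pivot element becomes 1
    return T

# The tableau of Example 4.7:
T = np.array([[ 0.0,  2.0,  6.0, 10.0, 0.0, 0.0],
              [ 2.0, -2.0,  4.0,  1.0, 1.0, 0.0],
              [-1.0,  4.0, -2.0, -3.0, 0.0, 1.0]])
T1 = dual_simplex_pivot(T)
expected = np.array([[-3.0, 14.0, 0.0,  1.0, 0.0,  3.0],
                     [ 0.0,  6.0, 0.0, -5.0, 1.0,  2.0],
                     [ 0.5, -2.0, 1.0,  1.5, 0.0, -0.5]])
assert np.allclose(T1, expected)              # matches the new tableau above
```

After the pivot, the zeroth column is nonnegative and all reduced costs remain nonnegative, so the iteration terminates with an optimal pair, as in the example.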
Note that the pivot element v_j is always chosen to be negative, where-
as the corresponding reduced cost c̄_j is nonnegative. Let us temporarily
assume that c̄_j is in fact positive. Then, in order to replace c̄_j by zero, we
need to add a positive multiple of the pivot row to the zeroth row. Since
x_B(ℓ) is negative, this has the effect of adding a negative quantity to the
upper left corner. Equivalently, the dual cost increases. Thus, as long as the
reduced cost of every nonbasic variable is nonzero, the dual cost increases
with each basis change, and no basis will ever be repeated in the course of
the algorithm. It follows that the algorithm must eventually terminate and
this can happen in one of two ways:
(a) We have B⁻¹b ≥ 0 and an optimal solution.
(b) All of the entries v_1, …, v_n in the pivot row are nonnegative and we
are therefore unable to locate a pivot element. In full analogy with
the primal simplex method, this implies that the optimal dual cost is
equal to +∞ and the primal problem is infeasible; the proof is left as
an exercise (Exercise 4.22).
We now provide a summary of the algorithm.

An iteration of the dual simplex method
1. A typical iteration starts with the tableau associated with a basis
matrix B and with all reduced costs nonnegative.
2. Examine the components of the vector B⁻¹b in the zeroth col-
umn of the tableau. If they are all nonnegative, we have an op-
timal basic feasible solution and the algorithm terminates; else,
choose some ℓ such that x_B(ℓ) < 0.
3. Consider the ℓth row of the tableau, with elements x_B(ℓ),
v_1, …, v_n (the pivot row). If v_i ≥ 0 for all i, then the optimal dual cost
is +∞ and the algorithm terminates.
4. For each i such that v_i < 0, compute the ratio c̄_i/|v_i| and let j
be the index of a column that corresponds to the smallest ratio.
The column A_B(ℓ) exits the basis and the column A_j takes its
place.
5. Add to each row of the tableau a multiple of the ℓth row (the
pivot row) so that v_j (the pivot element) becomes 1 and all other
entries of the pivot column become 0.
Let us now consider the possibility that the reduced cost c̄_j in the
pivot column is zero. In this case, the zeroth row of the tableau does not
change and the dual cost c_B'B⁻¹b remains the same. The proof of termina-
tion given earlier does not apply and the algorithm can cycle. This can be
avoided by employing a suitable anticycling rule, such as the following.
Lexicographic pivoting rule for the dual simplex method
1. Choose any row ℓ such that x_B(ℓ) < 0, to be the pivot row.
2. Determine the index j of the entering column as follows. For each
column with v_i < 0, divide all entries by |v_i|, and then choose
the lexicographically smallest column. If there is a tie between
several lexicographically smallest columns, choose the one with
the smallest index.
If the dual simplex method is initialized so that every column of the tableau [that is, each vector (c̄_j, B^{-1}A_j)] is lexicographically positive, and if the above lexicographic pivoting rule is used, the method terminates in a finite number of steps. The proof is similar to the proof of the corresponding result for the primal simplex method (Theorem 3.4) and is left as an exercise (Exercise 4.24).
When should we use the dual simplex method?
At this point, it is natural to ask when the dual simplex method should be used. One such case arises when a basic feasible solution of the dual problem is readily available. Suppose, for example, that we already have an optimal basis for some linear programming problem, and that we wish to solve the same problem for a different choice of the right-hand side vector b. The optimal basis for the original problem may be primal infeasible under the new value of b. On the other hand, a change in b does not affect the reduced costs and we still have a dual feasible solution. Thus, instead of solving the new problem from scratch, it may be preferable to apply the dual simplex algorithm starting from the optimal basis for the original problem. This idea will be considered in more detail in Chapter 5.
The geometry of the dual simplex method
Our development of the dual simplex method was based entirely on tableau
manipulations and algebraic arguments. We now present an alternative
viewpoint based on geometric considerations.
We continue assuming that we are dealing with a problem in standard
form and that the matrix A has linearly independent rows. Let B be a basis
matrix with columns A_B(1), ..., A_B(m). This basis matrix determines a basic solution to the primal problem with x_B = B^{-1}b. The same basis can also be used to determine a dual vector p by means of the equations

p'A_B(i) = c_B(i),    i = 1, ..., m.

Sec. 4.5 Standard form problems and the dual simplex method

These are m equations in m unknowns; since the columns A_B(1), ..., A_B(m) are linearly independent, there is a unique solution p. For such a vector p, the number of linearly independent active dual constraints is equal to the dimension of the dual vector, and it follows that we have a basic solution to the dual problem. In matrix notation, the dual basic solution p satisfies p'B = c'_B, or p' = c'_B B^{-1}, which was referred to as the vector of simplex multipliers in Chapter 3. If p is also dual feasible, that is, if p'A ≤ c', then p is a basic feasible solution of the dual problem.
To summarize, a basis matrix B is associated with a basic solution to the primal problem and also with a basic solution to the dual. A basic solution to the primal (respectively, dual) which is primal (respectively, dual) feasible, is a basic feasible solution to the primal (respectively, dual).
We now have a geometric interpretation of the dual simplex method: at every iteration, we have a basic feasible solution to the dual problem. The basic feasible solutions obtained at any two consecutive iterations have m − 1 linearly independent active constraints in common (the reduced costs of the m − 1 variables that are common to both bases are zero); thus, consecutive basic feasible solutions are either adjacent or they coincide.
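These relations are easy to exercise numerically. The sketch below, not from the text, builds both basic solutions from a given basis and tests the two feasibility conditions; the tolerance is an arbitrary choice.

```python
import numpy as np

def basis_solutions(A, b, c, basis):
    """Primal and dual basic solutions determined by one basis (a sketch).

    `basis` lists the indices of the m basic columns of A.  Returns the
    primal basic solution x (with x_B = B^{-1}b), the dual basic solution
    p solving p'A_B(i) = c_B(i), and the two feasibility flags.
    """
    m, n = A.shape
    B = A[:, basis]
    x = np.zeros(n)
    x[basis] = np.linalg.solve(B, b)           # x_B = B^{-1} b
    p = np.linalg.solve(B.T, c[basis])         # p'B = c_B'
    primal_feasible = bool(np.all(x >= -1e-9))
    dual_feasible = bool(np.all(A.T @ p <= c + 1e-9))   # p'A_j <= c_j
    return x, p, primal_feasible, dual_feasible
```

A basis for which both flags are true is optimal, since the associated primal and dual costs coincide.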
Example 4.8 Consider the following standard form problem and its dual:

    minimize    x1 + x2                 maximize    2p1 + p2
    subject to  x1 + 2x2 − x3 = 2       subject to  p1 + p2 ≤ 1
                x1 − x4 = 1                         2p1 ≤ 1
                x1, x2, x3, x4 ≥ 0,                 p1, p2 ≥ 0.

The feasible set of the primal problem is 4-dimensional. If we eliminate the variables x3 and x4, we obtain the equivalent problem

    minimize    x1 + x2
    subject to  x1 + 2x2 ≥ 2
                x1 ≥ 1
                x1, x2 ≥ 0.

The feasible sets of the equivalent primal problem and of the dual are shown in Figures 4.5(a) and 4.5(b), respectively.

There is a total of five different bases in the standard form primal problem and five different basic solutions. These correspond to the points A, B, C, D, and E in Figure 4.5(a). The same five bases also lead to five basic solutions to the dual problem, which are points A, B, C, D, and E in Figure 4.5(b).

For example, if we choose the columns A3 and A4 to be the basic columns, we have the infeasible primal basic solution x = (0, 0, −2, −1) (point A). The corresponding dual basic solution is obtained by letting p'A3 = c3 = 0 and p'A4 = c4 = 0, which yields p = (0, 0). This is a basic feasible solution of the dual problem and can be used to start the dual simplex method. The associated initial tableau is
Figure 4.5: The feasible sets in Example 4.8.

            x1     x2     x3     x4
      0  |   1      1      0      0
     -2  |  -1     -2*     1      0
     -1  |  -1      0      0      1

We carry out two iterations of the dual simplex method to obtain the following two tableaux:

            x1     x2     x3     x4
     -1  |  1/2     0     1/2     0
      1  |  1/2     1    -1/2     0
     -1  |  -1*     0      0      1

            x1     x2     x3     x4
   -3/2  |   0      0     1/2    1/2
    1/2  |   0      1    -1/2    1/2
      1  |   1      0      0     -1
This sequence of tableaux corresponds to the path A → B → C in either figure. In the primal space, the path traces a sequence of infeasible basic solutions until, at optimality, it becomes feasible. In the dual space, the algorithm behaves exactly like the primal simplex method: it moves through a sequence of (dual) basic feasible solutions, while at each step improving the cost function.
Having observed that the dual simplex method moves from one basic feasible solution of the dual to an adjacent one, it may be tempting to say that the dual simplex method is simply the primal simplex method applied to the dual. This is a somewhat ambiguous statement, however, because the dual problem is not in standard form. If we were to convert it to standard form and then apply the primal simplex method, the resulting method is not necessarily identical to the dual simplex method (Exercise 4.25). A more accurate statement is to simply say that the dual simplex method is a variant of the simplex method tailored to problems defined exclusively in terms of linear inequality constraints.
Duality and degeneracy
Let us keep assuming that we are dealing with a standard form problem in which the rows of the matrix A are linearly independent. Any basis matrix B leads to an associated dual basic solution given by p' = c'_B B^{-1}. At this basic solution, the dual constraint p'A_j ≤ c_j is active if and only if c'_B B^{-1}A_j = c_j, that is, if and only if the reduced cost c̄_j is zero. Since p is m-dimensional, dual degeneracy amounts to having more than m reduced costs that are zero. Given that the reduced costs of the m basic variables must be zero, dual degeneracy is obtained whenever there exists a nonbasic variable whose reduced cost is zero.
The example that follows deals with the relation between basic solutions to the primal and the dual in the face of degeneracy.
Example 4.9 Consider the following standard form problem and its dual:

    minimize    3x1 + x2                maximize    2p1
    subject to  x1 + x2 − x3 = 2        subject to  p1 + 2p2 ≤ 3
                2x1 − x2 − x4 = 0                   p1 − p2 ≤ 1
                x1, x2, x3, x4 ≥ 0,                 p1, p2 ≥ 0.

We eliminate x3 and x4 to obtain the equivalent primal problem

    minimize    3x1 + x2
    subject to  x1 + x2 ≥ 2
                2x1 − x2 ≥ 0
                x1, x2 ≥ 0.

The feasible sets of the equivalent primal and of the dual are shown in Figures 4.6(a) and 4.6(b), respectively.

There is a total of six different bases in the standard form primal problem, but only four different basic solutions [points A, B, C, D in Figure 4.6(a)]. In the dual problem, however, the six bases lead to six distinct basic solutions [points A, A', A'', B, C, D in Figure 4.6(b)].
Figure 4.6: The feasible sets in Example 4.9.
For example, if we let columns A3 and A4 be basic, the primal basic solution has x1 = x2 = 0, and the corresponding dual basic solution is (p1, p2) = (0, 0). Note that this is a basic feasible solution of the dual problem. If we let columns A1 and A3 be basic, the primal basic solution again has x1 = x2 = 0. For the dual problem, however, the equations p'A1 = c1 and p'A3 = c3 yield (p1, p2) = (0, 3/2), which is a basic feasible solution of the dual, namely, point A' in Figure 4.6(b). Finally, if we let columns A2 and A3 be basic, we still have the same primal solution. For the dual problem, the equations p'A2 = c2 and p'A3 = c3 yield (p1, p2) = (0, −1), which is an infeasible basic solution to the dual, namely, point A'' in Figure 4.6(b).
Example 4.9 has established that different bases may lead to the same basic solution for the primal problem, but to different basic solutions for the dual. Furthermore, out of the different basic solutions to the dual problem, it may be that some are feasible and some are infeasible.

We conclude with a summary of some properties of bases and basic solutions, for standard form problems, that were discussed in this section.
(a) Every basis determines a basic solution to the primal, but also a corresponding basic solution to the dual, namely, p' = c'_B B^{-1}.
(b) This dual basic solution is feasible if and only if all of the reduced costs are nonnegative.
(c) Under this dual basic solution, the reduced costs that are equal to zero correspond to active constraints in the dual problem.
(d) This dual basic solution is degenerate if and only if some nonbasic variable has zero reduced cost.
4.6 Farkas' lemma and linear inequalities
Suppose that we wish to determine whether a given system of linear inequalities is infeasible. In this section, we approach this question using duality theory, and we show that infeasibility of a given system of linear inequalities is equivalent to the feasibility of another, related, system of linear inequalities. Intuitively, the latter system of linear inequalities can be interpreted as a search for a certificate of infeasibility for the former system.

To be more specific, consider a set of standard form constraints Ax = b and x ≥ 0. Suppose that there exists some vector p such that p'A ≥ 0' and p'b < 0. Then, for any x ≥ 0, we have p'Ax ≥ 0 and, since p'b < 0, it follows that p'Ax ≠ p'b. We conclude that Ax ≠ b, for all x ≥ 0. This argument shows that if we can find a vector p satisfying p'A ≥ 0' and p'b < 0, the standard form constraints cannot have any feasible solution, and such a vector p is a certificate of infeasibility. Farkas' lemma below states that whenever a standard form problem is infeasible, such a certificate of infeasibility p is guaranteed to exist.
Theorem 4.6 (Farkas' lemma) Let A be a matrix of dimensions m × n and let b be a vector in ℝ^m. Then, exactly one of the following two alternatives holds:
(a) There exists some x ≥ 0 such that Ax = b.
(b) There exists some vector p such that p'A ≥ 0' and p'b < 0.
Proof. One direction is easy. If there exists some x ≥ 0 satisfying Ax = b, and if p'A ≥ 0', then p'b = p'Ax ≥ 0, which shows that the second alternative cannot hold.

Let us now assume that there exists no vector x ≥ 0 satisfying Ax = b. Consider the pair of problems

    maximize    0'x            minimize    p'b
    subject to  Ax = b         subject to  p'A ≥ 0',
                x ≥ 0,

and note that the first is the dual of the second. The maximization problem is infeasible, which implies that the minimization problem is either unbounded (the optimal cost is −∞) or infeasible. Since p = 0 is a feasible solution to the minimization problem, it follows that the minimization problem is unbounded. Therefore, there exists some p which is feasible, that is, p'A ≥ 0', and whose cost is negative, that is, p'b < 0.
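In computational practice, such a certificate can be searched for with a linear programming solver. The sketch below is not from the text and assumes SciPy is available; because certificates are invariant under positive scaling, p is confined to a box so that the auxiliary problem always has a finite optimum.

```python
import numpy as np
from scipy.optimize import linprog

def farkas_certificate(A, b):
    """Look for p with p'A >= 0' and p'b < 0 (a sketch using SciPy).

    Minimizes p'b subject to A'p >= 0 and -1 <= p_i <= 1.  A negative
    optimum yields a certificate that Ax = b, x >= 0 is infeasible;
    otherwise the routine returns None.
    """
    m, n = A.shape
    res = linprog(c=b, A_ub=-A.T, b_ub=np.zeros(n),
                  bounds=[(-1.0, 1.0)] * m, method="highs")
    if res.status == 0 and res.fun < -1e-9:
        return res.x
    return None
```

For instance, the system x1 + x2 = −1, x ≥ 0 is certified infeasible by any p > 0, while for a feasible right-hand side the search returns None.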
We now provide a geometric illustration of Farkas' lemma (see Figure 4.7). Let A1, ..., An be the columns of the matrix A and note that Ax = Σ_{i=1}^n A_i x_i. Therefore, the existence of a vector x ≥ 0 satisfying Ax = b is the same as requiring that b lies in the set of all nonnegative linear combinations of the vectors A1, ..., An, which is the shaded region in Figure 4.7. If b does not belong to the shaded region (in which case the first alternative in Farkas' lemma does not hold), we expect intuitively that we can find a vector p and an associated hyperplane {z | p'z = 0} such that b lies on one side of the hyperplane while the shaded region lies on the other side. We then have p'b < 0 and p'A_i ≥ 0 for all i, or, equivalently, p'A ≥ 0', and the second alternative holds.
Farkas' lemma predates the development of linear programming, but duality theory leads to a simple proof. A different proof, based on the geometric argument we just gave, is provided in the next section. Finally, there is an equivalent statement of Farkas' lemma which is sometimes more convenient.

Corollary 4.3 Let A1, ..., An and b be given vectors and suppose that any vector p that satisfies p'A_i ≥ 0, i = 1, ..., n, must also satisfy p'b ≥ 0. Then, b can be expressed as a nonnegative linear combination of the vectors A1, ..., An.
Our next result is of a similar character.
Theorem 4.7 Suppose that the system of linear inequalities Ax ≤ b has at least one solution, and let d be some scalar. Then, the following are equivalent:
(a) Every feasible solution to the system Ax ≤ b satisfies c'x ≤ d.
(b) There exists some p ≥ 0 such that p'A = c' and p'b ≤ d.
Proof. Consider the following pair of problems

    maximize    c'x            minimize    p'b
    subject to  Ax ≤ b,        subject to  p'A = c'
                                           p ≥ 0,

and note that the first is the dual of the second. If the system Ax ≤ b has a feasible solution and if every feasible solution satisfies c'x ≤ d, then the first problem has an optimal solution and the optimal cost is bounded above by d. By the strong duality theorem, the second problem also has an optimal solution p whose cost is bounded above by d. This optimal solution satisfies p'A = c', p ≥ 0, and p'b ≤ d.

Conversely, if some p satisfies p'A = c', p ≥ 0, and p'b ≤ d, then the weak duality theorem asserts that every feasible solution to the first problem must also satisfy c'x ≤ d.
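The easy direction, (b) implies (a), is just weak duality and can be checked by simulation. In the hypothetical instance below (not from the text), p = (0, 0, 1) satisfies p ≥ 0 and p'A = c', so every point with Ax ≤ b must obey c'x ≤ p'b = 1.5.

```python
import numpy as np

# A hypothetical instance of Theorem 4.7: Ax <= b describes an (unbounded)
# polyhedron, and p >= 0 certifies the bound c'x <= d with d = p'b = 1.5.
A = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
b = np.array([1.0, 1.0, 1.5])
c = np.array([1.0, 1.0])
p = np.array([0.0, 0.0, 1.0])

assert np.all(p >= 0) and np.allclose(p @ A, c)   # condition (b)

# Sample random points; every feasible one satisfies c'x <= p'b.
rng = np.random.default_rng(0)
for _ in range(1000):
    x = rng.uniform(-2.0, 2.0, size=2)
    if np.all(A @ x <= b):
        assert c @ x <= p @ b + 1e-9
```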
Results such as Theorems 4.6 and 4.7 are often called theorems of the alternative. There are several more results of this type; see, for example, Exercises 4.26, 4.27, and 4.28.

Figure 4.7: If the vector b does not belong to the set of all nonnegative linear combinations of A1, ..., An, then we can find a hyperplane {z | p'z = 0} that separates it from that set.
Applications of Farkas' lemma to asset pricing
Consider a market that operates for a single period, and in which n different assets are traded. Depending on the events during that single period, there are m possible states of nature at the end of the period. If we invest one dollar in some asset i and the state of nature turns out to be s, we receive a payoff of r_si. Thus, each asset i is described by a payoff vector (r_1i, ..., r_mi). The following m × n payoff matrix gives the payoffs of each of the n assets for each of the m states of nature:

    R = [ r_11  ...  r_1n
           .          .
          r_m1  ...  r_mn ].

Let x_i be the amount held of asset i. A portfolio of assets is then a vector x = (x_1, ..., x_n). The components of a portfolio x can be either positive or negative. A positive value of x_i indicates that one has bought x_i units of asset i and is thus entitled to receive r_si x_i if state s materializes. A negative value of x_i indicates a "short" position in asset i: this amounts to selling |x_i| units of asset i at the beginning of the period, with a promise to buy them back at the end. Hence, one must pay out r_si |x_i| if state s occurs, which is the same as receiving a payoff of r_si x_i.
The wealth in state s that results from a portfolio x is given by

    w_s = Σ_{i=1}^n r_si x_i.

We introduce the vector w = (w_1, ..., w_m), and we obtain

    w = Rx.
Let p_i be the price of asset i at the beginning of the period, and let p = (p_1, ..., p_n) be the vector of asset prices. Then, the cost of acquiring a portfolio x is given by p'x.
The central problem in asset pricing is to determine what the prices p_i should be. In order to address this question, we introduce the absence of arbitrage condition, which underlies much of finance theory: asset prices should always be such that no investor can get a guaranteed nonnegative payoff out of a negative investment. In other words, any portfolio that pays off nonnegative amounts in every state of nature must be valuable to investors, so it must have nonnegative cost. Mathematically, the absence of arbitrage condition can be expressed as follows:

    if Rx ≥ 0, then we must have p'x ≥ 0.
Given a particular set of assets, as described by the payoff matrix R, only certain prices p are consistent with the absence of arbitrage. What characterizes such prices? What restrictions does the assumption of no arbitrage impose on asset prices? The answer is provided by Farkas' lemma.
Theorem 4.8 The absence of arbitrage condition holds if and only if there exists a nonnegative vector q = (q_1, ..., q_m), such that the price of each asset i is given by

    p_i = Σ_{s=1}^m q_s r_si.
Proof. The absence of arbitrage condition states that there exists no vector x such that x'R' ≥ 0' and x'p < 0. This is of the same form as condition (b) in the statement of Farkas' lemma (Theorem 4.6). (Note that here p plays the role of b, and R' plays the role of A.) Therefore, by Farkas' lemma, the absence of arbitrage condition holds if and only if there exists some nonnegative vector q such that R'q = p, which is the same as the condition in the theorem's statement.
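Computationally, checking no-arbitrage thus reduces to a feasibility problem in q. The sketch below is not from the text and assumes SciPy is available; the two-state market used to exercise it is hypothetical.

```python
import numpy as np
from scipy.optimize import linprog

def state_prices(R, p):
    """Search for state prices q >= 0 with R'q = p (a sketch using SciPy).

    By Theorem 4.8, such a q exists if and only if the asset prices p
    admit no arbitrage for the payoff matrix R.  Returns q, or None
    when the system is infeasible (an arbitrage opportunity exists).
    """
    m, n = R.shape
    res = linprog(c=np.zeros(m), A_eq=R.T, b_eq=p,
                  bounds=[(0, None)] * m, method="highs")
    return res.x if res.status == 0 else None
```

In a market of elementary assets (asset i pays one dollar in state i only), the state prices are the asset prices themselves, while any negative price produces an arbitrage and the search fails.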
Theorem 4.8 asserts that whenever the market works efficiently enough to eliminate the possibility of arbitrage, there must exist "state prices" q_s that can be used to value the existing assets. Intuitively, it establishes a nonnegative price q_s for an elementary asset that pays one dollar if the state of nature is s, and nothing otherwise. It then requires that every asset must be consistently priced, its total value being the sum of the values of the elementary assets from which it is composed. There is an alternative interpretation of the variables q_s as (unnormalized) probabilities of the different states s, which, however, we will not pursue. In general, the state price vector q will not be unique, unless the number of assets equals or exceeds the number of states.
The no arbitrage condition is very simple, and yet very powerful. It is the key element behind many important results in financial economics, but these lie beyond the scope of this text. (See, however, Exercise 4.33 for an application in options pricing.)
4.7 From separating hyperplanes to duality*
Let us review the path followed in our development of duality theory. We started from the fact that the simplex method, in conjunction with an anticycling rule, is guaranteed to terminate. We then exploited the termination conditions of the simplex method to derive the strong duality theorem. We finally used the duality theorem to derive Farkas' lemma, which we interpreted in terms of a hyperplane that separates b from the columns of A. In this section, we show that the reverse line of argument is also possible. We start from first principles and prove a general result on separating hyperplanes. We then establish Farkas' lemma, and conclude by showing that the duality theorem follows from Farkas' lemma. This line of argument is more elegant and fundamental because instead of relying on the rather complicated development of the simplex method, it only involves a small number of basic geometric concepts. Furthermore, it can be naturally generalized to nonlinear optimization problems.
Closed sets and Weierstrass' theorem
Before we proceed any further, we need to develop some background material. A set S ⊂ ℝ^n is called closed if it has the following property: if x^1, x^2, ... is a sequence of elements of S that converges to some x ∈ ℝ^n, then x ∈ S. In other words, S contains the limit of any sequence of elements of S. Intuitively, the set S contains its boundary.
Theorem 4.9 Every polyhedron is closed.
Proof. Consider the polyhedron P = {x ∈ ℝ^n | Ax ≥ b}. Suppose that x^1, x^2, ... is a sequence of elements of P that converges to some x*. We have to show that x* ∈ P. For each k, we have x^k ∈ P and, therefore, Ax^k ≥ b. Taking the limit, we obtain Ax* = A(lim_{k→∞} x^k) = lim_{k→∞} (Ax^k) ≥ b, and x* belongs to P.
The following is a fundamental result from real analysis that provides us with conditions for the existence of an optimal solution to an optimization problem. The proof lies beyond the scope of this book and is omitted.
Theorem 4.10 (Weierstrass' theorem) If f : ℝ^n → ℝ is a continuous function, and if S is a nonempty, closed, and bounded subset of ℝ^n, then there exists some x* ∈ S such that f(x*) ≤ f(x) for all x ∈ S. Similarly, there exists some y* ∈ S such that f(y*) ≥ f(x) for all x ∈ S.
Weierstrass' theorem is not valid if the set S is not closed. Consider, for example, the set S = {x ∈ ℝ | x > 0}. This set is not closed because we can form a sequence of elements of S that converges to zero, but x = 0 does not belong to S. We then observe that the cost function f(x) = x is not minimized at any point in S; for every x > 0, there exists another positive number with smaller cost, and no feasible x can be optimal. Ultimately, the reason that S is not closed is that the feasible set was defined by means of strict inequalities. The definition of polyhedra and linear programming problems does not allow for strict inequalities in order to avoid situations of this type.
The separating hyperplane theorem
The result that follows is "geometrically obvious" but nevertheless extremely important in the study of convex sets and functions. It states that if we are given a closed and nonempty convex set S and a point x* ∉ S, then we can find a hyperplane, called a separating hyperplane, such that S and x* lie in different halfspaces (Figure 4.8).
Theorem 4.11 (Separating hyperplane theorem) Let S be a nonempty closed convex subset of ℝ^n and let x* ∈ ℝ^n be a vector that does not belong to S. Then, there exists some vector c ∈ ℝ^n such that c'x* < c'x for all x ∈ S.
Proof. Let ‖·‖ be the Euclidean norm defined by ‖x‖ = (x'x)^{1/2}. Let us fix some element w of S, and let

    B = { x | ‖x − x*‖ ≤ ‖w − x*‖ },

and D = S ∩ B [Figure 4.9(a)]. The set D is nonempty, because w ∈ D.
Figure 4.8: A hyperplane that separates the point x* from the convex set S.
Furthermore, D is the intersection of the closed set S with the closed set B and is also closed. Finally, D is a bounded set because B is bounded. Consider the quantity ‖x − x*‖, where x ranges over the set D. This is a continuous function of x. Since D is nonempty, closed, and bounded, Weierstrass' theorem implies that there exists some y ∈ D such that

    ‖y − x*‖ ≤ ‖x − x*‖,    for all x ∈ D.

For any x ∈ S that does not belong to D, we have ‖x − x*‖ > ‖w − x*‖ ≥ ‖y − x*‖. We conclude that y minimizes ‖x − x*‖ over all x ∈ S.

We have so far established that there exists an element y of S which is closest to x*. We now show that the vector c = y − x* has the desired property [see Figure 4.9(b)].
Let x ∈ S. For any λ satisfying 0 < λ ≤ 1, we have y + λ(x − y) ∈ S, because S is convex. Since y minimizes ‖x − x*‖ over all x ∈ S, we obtain

    ‖y − x*‖² ≤ ‖y + λ(x − y) − x*‖² = ‖y − x*‖² + 2λ(y − x*)'(x − y) + λ²‖x − y‖²,

which yields

    2λ(y − x*)'(x − y) + λ²‖x − y‖² ≥ 0.

We divide by λ and then take the limit as λ decreases to zero. We obtain

    (y − x*)'(x − y) ≥ 0.

[This inequality states that the angle θ in Figure 4.9(b) is no larger than 90 degrees.] Thus,

    (y − x*)'x ≥ (y − x*)'y = (y − x*)'x* + ‖y − x*‖² > (y − x*)'x*.

Setting c = y − x* proves the theorem.

Figure 4.9: Illustration of the proof of the separating hyperplane theorem.
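As a concrete instance of this construction (not from the text), take S to be a box: the closest point y is obtained by coordinatewise clipping, and c = y − x* then separates any outside point from the box.

```python
import numpy as np

def separate_from_box(xstar, lo, hi):
    """Separating vector for the box S = {x : lo <= x <= hi} (a sketch).

    The projection of xstar onto the box is its coordinatewise clipping y;
    following the proof of Theorem 4.11, c = y - xstar satisfies
    c'xstar < c'x for all x in S whenever xstar lies outside the box.
    """
    y = np.clip(xstar, lo, hi)     # closest point of S to xstar
    return y - xstar               # the vector c of the theorem
```

For xstar = (2, 0) and the unit box, this yields c = (−1, 0), and indeed c'xstar = −2 is smaller than c'x at every point of the box.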
Farkas' lemma revisited

We now show that Farkas' lemma is a consequence of the separating hyperplane theorem. We will only be concerned with the difficult half of Farkas' lemma. In particular, we will prove that if the system Ax = b, x ≥ 0, does not have a solution, then there exists a vector p such that p'A ≥ 0' and p'b < 0.
Let

    S = {Ax | x ≥ 0} = {y | there exists x such that y = Ax, x ≥ 0},

and suppose that the vector b does not belong to S. The set S is clearly convex; it is also nonempty because 0 ∈ S. Finally, the set S is closed; this may seem obvious, but is not easy to prove. For one possible proof, note that S is the projection of the polyhedron {(x, y) | y = Ax, x ≥ 0} onto the y-coordinates, is itself a polyhedron (see Section 2.8), and is therefore closed. An alternative proof is outlined in Exercise 4.37.
We now invoke the separating hyperplane theorem to separate b from S and conclude that there exists a vector p such that p'b < p'y for every y ∈ S. Since 0 ∈ S, we must have p'b < 0. Furthermore, for every column A_i of A and every λ > 0, we have λA_i ∈ S and p'b < λp'A_i. We divide both sides of the latter inequality by λ and then take the limit as λ tends to infinity, to conclude that p'A_i ≥ 0. Since this is true for every i, we obtain p'A ≥ 0' and the proof is complete.
The duality theorem revisited
We will now derive the duality theorem as a corollary of Farkas' lemma. We only provide the proof for the case where the primal constraints are of the form Ax ≥ b. The proof for the general case can be constructed along the same lines at the expense of more notation (Exercise 4.38). We also note that the proof given here is very similar to the line of argument used in the heuristic explanation of the duality theorem in Example 4.4.

We consider the following pair of primal and dual problems

    minimize    c'x            maximize    p'b
    subject to  Ax ≥ b,        subject to  p'A = c'
                                           p ≥ 0,
and we assume that the primal has an optimal solution x*. We will show that the dual problem also has a feasible solution with the same cost. Once this is done, the strong duality theorem follows from weak duality (cf. Corollary 4.2).
Let I = {i | a_i'x* = b_i} be the set of indices of the constraints that are active at x*. We will first show that any vector d that satisfies a_i'd ≥ 0 for every i ∈ I, must also satisfy c'd ≥ 0. Consider such a vector d and let ε be a positive scalar. We then have a_i'(x* + εd) ≥ a_i'x* = b_i for all i ∈ I. In addition, if i ∉ I and if ε is sufficiently small, the inequality a_i'x* > b_i implies that a_i'(x* + εd) > b_i. We conclude that when ε is sufficiently small, x* + εd is a feasible solution. By the optimality of x*, we obtain c'd ≥ 0, which establishes our claim. By Farkas' lemma (cf. Corollary 4.3), c can be expressed as a nonnegative linear combination of the vectors a_i, i ∈ I, and there exist nonnegative scalars p_i, i ∈ I, such that

    c = Σ_{i∈I} p_i a_i.    (4.3)

For i ∉ I, we define p_i = 0. We then have p ≥ 0 and Eq. (4.3) shows that the vector p satisfies the dual constraint p'A = c'. In addition,

    p'b = Σ_{i∈I} p_i b_i = Σ_{i∈I} p_i a_i'x* = c'x*,

which shows that the cost of this dual feasible solution p is the same as the optimal primal cost. The duality theorem now follows from Corollary 4.2.
174 Chap. 4 Duality theory
Figure 4.10: Examples of cones.
In conclusion, we have accomplished the goals that were set out in the beginning of this section. We proved the separating hyperplane theorem, which is a very intuitive and seemingly simple result, but with many important ramifications in optimization and other areas in mathematics. We used the separating hyperplane theorem to establish Farkas' lemma, and finally showed that the strong duality theorem is an easy consequence of Farkas' lemma.
4.8 Cones and extreme rays

We have seen in Chapter 2, that if the optimal cost in a linear programming problem is finite, then our search for an optimal solution can be restricted to finitely many points, namely, the basic feasible solutions, assuming one exists. In this section, we wish to develop a similar result for the case where the optimal cost is −∞. In particular, we will show that the optimal cost is −∞ if and only if there exists a cost reducing direction along which we can move without ever leaving the feasible set. Furthermore, our search for such a direction can be restricted to a finite set of suitably defined "extreme rays."
Cones
The first step in our development is to introduce the concept of a cone.

Definition 4.1 A set C ⊂ ℝ^n is a cone if λx ∈ C for all λ ≥ 0 and all x ∈ C.
Notice that if C is a nonempty cone, then 0 ∈ C. To see this, consider an arbitrary element x of C and set λ = 0 in the definition of a cone; see also Figure 4.10. A polyhedron of the form P = {x ∈ ℝ^n | Ax ≥ 0} is easily seen to be a nonempty cone and is called a polyhedral cone.

Let x be a nonzero element of a polyhedral cone C. We then have 3x/2 ∈ C and x/2 ∈ C. Since x is the average of 3x/2 and x/2, it is not an extreme point and, therefore, the only possible extreme point is the zero vector. If the zero vector is indeed an extreme point, we say that the cone is pointed. Whether this will be the case or not is determined by the criteria provided by our next result.
Theorem 4.12 Let C ⊂ ℝ^n be the polyhedral cone defined by the constraints a_i'x ≥ 0, i = 1, ..., m. Then, the following are equivalent:
(a) The zero vector is an extreme point of C.
(b) The cone C does not contain a line.
(c) There exist n vectors out of the family a_1, ..., a_m, which are linearly independent.
Proof. This result is a special case of Theorem 2. 6 in Section 2. 5. .
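Criterion (c) lends itself to a mechanical test: it holds exactly when the m × n matrix A whose rows are the a_i' has rank n. A one-line sketch (not from the text):

```python
import numpy as np

def is_pointed(A):
    """Test Theorem 4.12(c) for the cone {x : Ax >= 0} (a sketch):
    some n of the rows of A are linearly independent iff rank(A) = n."""
    return bool(np.linalg.matrix_rank(A) == A.shape[1])
```

For example, the nonnegative orthant {x | Ix ≥ 0} is pointed, while a halfplane in ℝ² defined by a single inequality is not, since it contains a line.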
Rays and recession cones
Consider a nonempty polyhedron

    P = { x ∈ ℝ^n | Ax ≥ b },

and let us fix some y ∈ P. We define the recession cone at y as the set of all directions d along which we can move indefinitely away from y, without leaving the set P. More formally, the recession cone is defined as the set

    { d ∈ ℝ^n | A(y + λd) ≥ b, for all λ ≥ 0 }.

It is easily seen that this set is the same as

    { d ∈ ℝ^n | Ad ≥ 0 },

and is a polyhedral cone. This shows that the recession cone is independent of the starting point y; see Figure 4.11. The nonzero elements of the recession cone are called the rays of the polyhedron P.

For the case of a nonempty polyhedron P = {x ∈ ℝ^n | Ax = b, x ≥ 0} in standard form, the recession cone is seen to be the set of all vectors d that satisfy

    Ad = 0,    d ≥ 0.
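Membership in this recession cone is straightforward to test numerically; a small sketch (not from the text, with an arbitrary tolerance):

```python
import numpy as np

def is_recession_direction(A, d, tol=1e-9):
    """Check d against the recession cone {d : Ad = 0, d >= 0} of a
    standard form polyhedron (a numerical sketch)."""
    d = np.asarray(d, dtype=float)
    return bool(np.all(np.abs(A @ d) <= tol) and np.all(d >= -tol))
```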
Figure 4.11: The recession cone at different elements of a polyhedron.
Extreme rays
We now define the extreme rays of a polyhedron. Intuitively, these are the directions associated with "edges" of the polyhedron that extend to infinity; see Figure 4.12 for an illustration.
Definition 4.2
(a) A nonzero element x of a polyhedral cone C ⊂ ℝ^n is called an extreme ray if there are n − 1 linearly independent constraints that are active at x.
(b) An extreme ray of the recession cone associated with a nonempty polyhedron P is also called an extreme ray of P.
Note that a positive multiple of an extreme ray is also an extreme ray. We say that two extreme rays are equivalent if one is a positive multiple of the other. Note that for this to happen, they must correspond to the same n − 1 linearly independent active constraints. Any n − 1 linearly independent constraints define a line and can lead to at most two nonequivalent extreme rays (one being the negative of the other). Given that there is a finite number of ways that we can choose n − 1 constraints to become active, and as long as we do not distinguish between equivalent extreme rays, we conclude that the number of extreme rays of a polyhedron is finite. A finite collection of extreme rays will be said to be a complete set of extreme rays if it contains exactly one representative from each equivalence class.
Figure 4.12: Extreme rays of polyhedral cones. (a) The vector y is an extreme ray because n = 2 and one constraint a_i'x ≥ 0 is active at y. (b) A polyhedral cone defined by three linearly independent constraints of the form a_i'x ≥ 0. The vector z is an extreme ray because n = 3 and two linearly independent constraints a_i'x ≥ 0 are active at z.
The definition of extreme rays mimics the definition of basic feasible solutions. An alternative and equivalent definition, resembling the definition of extreme points of polyhedra, is explored in Exercise 4.39.
Characterization of unbounded linear programming problems

We now derive conditions under which the optimal cost in a linear programming problem is equal to −∞, first for the case where the feasible set is a cone, and then for the general case.
Theorem 4.13 Consider the problem of minimizing c'x over a pointed polyhedral cone C = {x ∈ ℝⁿ | aᵢ'x ≥ 0, i = 1, ..., m}. The optimal cost is equal to −∞ if and only if some extreme ray d of C satisfies c'd < 0.
Proof. One direction of the result is trivial because if some extreme ray has negative cost, then the cost becomes arbitrarily negative by moving along this ray.
For the converse, suppose that the optimal cost is −∞. In particular, there exists some x ∈ C whose cost is negative and, by suitably scaling x, we can assume that c'x = −1. In particular, the polyhedron

    P = {x ∈ ℝⁿ | a₁'x ≥ 0, ..., aₘ'x ≥ 0, c'x = −1}

is nonempty. Since C is pointed, the vectors a₁, ..., aₘ span ℝⁿ, and this implies that P has at least one extreme point; let d be one of them. At d, we have n linearly independent active constraints, which means that n − 1 linearly independent constraints of the form aᵢ'x ≥ 0 must be active. It follows that d is an extreme ray of C. ∎
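The criterion of Theorem 4.13 is easy to test once a complete set of extreme rays is in hand. A minimal numeric sketch, on a small pointed cone of our own choosing (an illustration, not an example from the text):

```python
import numpy as np

# Hypothetical pointed cone (ours, not from the text):
#   C = {x in R^2 | x1 >= 0, x2 - x1 >= 0},
# whose two extreme rays (n - 1 = 1 active constraint each) are:
rays = [np.array([0.0, 1.0]), np.array([1.0, 1.0])]

def unbounded_below(c):
    # Theorem 4.13: min c'x over C equals -infinity iff c'd < 0 for some ray d
    return any(c @ d < 0 for d in rays)

print(unbounded_below(np.array([1.0, -1.0])))  # True: cost decreases along (0, 1)
print(unbounded_below(np.array([0.0, 1.0])))   # False: the optimal cost is 0
```

When no extreme ray has negative cost, every point of the cone (a nonnegative combination of the rays, by Corollary 4.5) has nonnegative cost, so the minimum is attained at the origin.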
By exploiting duality, Theorem 4.13 leads to a criterion for unboundedness in general linear programming problems. Interestingly enough, this criterion does not involve the right-hand side vector b.
Theorem 4.14 Consider the problem of minimizing c'x subject to Ax ≥ b, and assume that the feasible set has at least one extreme point. The optimal cost is equal to −∞ if and only if some extreme ray d of the feasible set satisfies c'd < 0.
Proof. One direction of the result is trivial because if an extreme ray has negative cost, then the cost becomes arbitrarily negative by starting at a feasible solution and moving along the direction of this ray.
For the proof of the reverse direction, we consider the dual problem:

    maximize    p'b
    subject to  p'A = c'
                p ≥ 0.

If the primal problem is unbounded, the dual problem is infeasible. Then, the related problem

    maximize    p'0
    subject to  p'A = c'
                p ≥ 0,

is also infeasible. This implies that the associated primal problem

    minimize    c'x
    subject to  Ax ≥ 0,

is either unbounded or infeasible. Since x = 0 is one feasible solution, it must be unbounded. Since the primal feasible set has at least one extreme point, the rows of A span ℝⁿ, where n is the dimension of x. It follows that the recession cone {x | Ax ≥ 0} is pointed and, by Theorem 4.13, there exists an extreme ray d of the recession cone satisfying c'd < 0. By definition, this is an extreme ray of the feasible set. ∎
The unboundedness criterion in the simplex method

We end this section by pointing out that if we have a standard form problem in which the optimal cost is −∞, the simplex method provides us at termination with an extreme ray.

Indeed, consider what happens when the simplex method terminates with an indication that the optimal cost is −∞. At that point, we have a basis matrix B, a nonbasic variable xⱼ with negative reduced cost, and the jth column B⁻¹Aⱼ of the tableau has no positive elements. Consider the jth basic direction d, which is the vector that satisfies d_B = −B⁻¹Aⱼ, dⱼ = 1, and dᵢ = 0 for every nonbasic index i other than j. Then, the vector d satisfies Ad = 0 and d ≥ 0, and belongs to the recession cone. It is also a direction of cost decrease, since the reduced cost c̄ⱼ of the entering variable is negative.

Out of the constraints defining the recession cone, the jth basic direction d satisfies n − 1 linearly independent such constraints with equality: these are the constraints Ad = 0 (m of them) and the constraints dᵢ = 0 for i nonbasic and different than j (n − m − 1 of them). We conclude that d is an extreme ray.
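The construction above can be carried out numerically. The following sketch uses a tiny standard form problem of our own (an assumption, not taken from the text) in which the simplex unboundedness condition holds at the basis {x₁}, and builds the corresponding basic direction:

```python
import numpy as np

# Hypothetical standard form problem (our own toy example):
#   minimize  -x2   subject to   x1 - x2 = 1,   x1, x2 >= 0,
# which is unbounded below along x = (1 + t, t), t >= 0.
A = np.array([[1.0, -1.0]])
b = np.array([1.0])
c = np.array([0.0, -1.0])

basic, j = [0], 1                   # basis {x1}, nonbasic variable x2
B_inv = np.linalg.inv(A[:, basic])
u = B_inv @ A[:, j]                 # jth tableau column, B^{-1} A_j
reduced_cost = c[j] - c[basic] @ u

# Simplex unboundedness condition: negative reduced cost, no positive entry
assert reduced_cost < 0 and np.all(u <= 0)

# jth basic direction: d_B = -B^{-1} A_j, d_j = 1, all other components 0
d = np.zeros(2)
d[basic] = -u
d[j] = 1.0

print(np.allclose(A @ d, 0), bool(np.all(d >= 0)), bool(c @ d < 0))
# True True True
```

The three printed checks confirm exactly the properties claimed in the text: d lies in the recession cone (Ad = 0, d ≥ 0) and is a direction of cost decrease.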
4.9 Representation of polyhedra
In this section, we establish one of the fundamental results of linear programming theory. In particular, we show that any element of a polyhedron that has at least one extreme point can be represented as a convex combination of extreme points plus a nonnegative linear combination of extreme rays. A precise statement is given by our next result. A generalization to the case of general polyhedra is developed in Exercise 4.47.
Theorem 4.15 (Resolution theorem) Let

    P = {x ∈ ℝⁿ | Ax ≥ b}

be a nonempty polyhedron with at least one extreme point. Let x^1, ..., x^k be the extreme points, and let w^1, ..., w^r be a complete set of extreme rays of P. Let

    Q = { Σ_{i=1}^k λ_i x^i + Σ_{j=1}^r θ_j w^j | λ_i ≥ 0, θ_j ≥ 0, Σ_{i=1}^k λ_i = 1 }.

Then, Q = P.
Proof. We first prove that Q ⊂ P. Let

    x = Σ_{i=1}^k λ_i x^i + Σ_{j=1}^r θ_j w^j

be an element of Q, where the coefficients λ_i and θ_j are nonnegative, and Σ_{i=1}^k λ_i = 1. The vector y = Σ_{i=1}^k λ_i x^i is a convex combination of elements of P. It therefore belongs to P and satisfies Ay ≥ b. We also have Aw^j ≥ 0 for every j, which implies that the vector z = Σ_{j=1}^r θ_j w^j satisfies Az ≥ 0. It then follows that the vector x = y + z satisfies Ax ≥ b and belongs to P.
For the reverse inclusion, we assume that P is not a subset of Q and we will derive a contradiction. Let z be an element of P that does not belong to Q. Consider the linear programming problem

    maximize    0
    subject to  Σ_{i=1}^k λ_i x^i + Σ_{j=1}^r θ_j w^j = z,
                Σ_{i=1}^k λ_i = 1,
                λ_i ≥ 0,    i = 1, ..., k,
                θ_j ≥ 0,    j = 1, ..., r,        (4.4)

which is infeasible because z ∉ Q. This problem is the dual of the problem

    minimize    p'z + q
    subject to  p'x^i + q ≥ 0,    i = 1, ..., k,
                p'w^j ≥ 0,        j = 1, ..., r.       (4.5)

Because the latter problem has a feasible solution, namely, p = 0 and q = 0, its optimal cost is −∞, and there exists a feasible solution (p, q) whose cost p'z + q is negative. On the other hand, p'x^i + q ≥ 0 for all i and this implies that p'z < p'x^i for all i. We also have p'w^j ≥ 0 for all j.
Having fixed p as above, we now consider the linear programming problem

    minimize    p'x
    subject to  Ax ≥ b.

If the optimal cost is finite, there exists an extreme point x^i which is optimal. Since z is a feasible solution, we obtain p'x^i ≤ p'z, which is a contradiction. If the optimal cost is −∞, Theorem 4.14 implies that there exists an extreme ray w^j such that p'w^j < 0, which is again a contradiction. ∎

For an intuitive view of this proof, note that the purpose of the preceding paragraph was to construct a hyperplane that separates z from Q.
Example 4.10 Consider the unbounded polyhedron defined by the constraints

    x₁ − x₂ ≥ −2
    x₁ + x₂ ≥ 1
    x₁, x₂ ≥ 0

(see Figure 4.13). This polyhedron has three extreme points, namely, x^1 = (0, 2), x^2 = (0, 1), and x^3 = (1, 0). The recession cone C is described by the inequalities d₁ − d₂ ≥ 0, d₁ + d₂ ≥ 0, and d₁, d₂ ≥ 0. We conclude that C = {(d₁, d₂) | 0 ≤ d₂ ≤ d₁}. This cone has two extreme rays, namely, w^1 = (1, 1) and w^2 = (1, 0). The vector y = (2, 2) is an element of the polyhedron and can be represented as

    y = x^1 + 2w^2.

However, this representation is not unique; for example, we also have

    y = (1/2)x^2 + (1/2)x^3 + (3/2)w^1.
Figure 4.13: The polyhedron of Example 4.10.
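Resolutions of a point into extreme points and extreme rays are straightforward to verify numerically. A quick check, using the extreme points and rays of the polyhedron of Example 4.10 (the particular coefficients below are one valid choice):

```python
import numpy as np

# Extreme points and extreme rays of the polyhedron of Example 4.10
x1, x2, x3 = np.array([0., 2.]), np.array([0., 1.]), np.array([1., 0.])
w1, w2 = np.array([1., 1.]), np.array([1., 0.])
y = np.array([2., 2.])

rep_a = x1 + 2.0 * w2                    # convex weight 1 on x^1
rep_b = 0.5 * x2 + 0.5 * x3 + 1.5 * w1   # convex weights 1/2 + 1/2 = 1

print(np.allclose(rep_a, y), np.allclose(rep_b, y))  # True True
```

Both combinations use convex weights summing to one on the extreme points and nonnegative weights on the rays, exactly as Theorem 4.15 requires.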
We note that the set Q in Theorem 4.15 is the image of the polyhedron

    H = { (λ₁, ..., λ_k, θ₁, ..., θ_r) | Σ_{i=1}^k λ_i = 1, λ_i ≥ 0, θ_j ≥ 0 }

under the linear mapping

    (λ₁, ..., λ_k, θ₁, ..., θ_r) ↦ Σ_{i=1}^k λ_i x^i + Σ_{j=1}^r θ_j w^j.

Thus, one corollary of the resolution theorem is that every polyhedron is the image, under a linear mapping, of a polyhedron H with this particular structure.
We now specialize Theorem 4.15 to the case of bounded polyhedra, to recover a result that was also proved in Section 2.7, using a different line of argument.
Corollary 4.4 A nonempty bounded polyhedron is the convex hull of its extreme points.
Proof. Let P = {x | Ax ≥ b} be a nonempty bounded polyhedron. If w is a nonzero element of the cone C = {x | Ax ≥ 0} and x is an element of P, we have x + λw ∈ P for all λ ≥ 0, contradicting the boundedness of P. We conclude that C consists of only the zero vector and does not have any extreme rays. The result then follows from Theorem 4.15. ∎
There is another corollary of Theorem 4.15 that deals with cones, and which is proved by noting that a cone can have no extreme points other than the zero vector.

Corollary 4.5 Assume that the cone C = {x | Ax ≥ 0} is pointed. Then, every element of C can be expressed as a nonnegative linear combination of the extreme rays of C.
Converse to the resolution theorem

Let us say that a set Q is finitely generated if it is specified in the form

    Q = { Σ_{i=1}^k λ_i x^i + Σ_{j=1}^r θ_j w^j | λ_i ≥ 0, θ_j ≥ 0, Σ_{i=1}^k λ_i = 1 },    (4.6)

where x^1, ..., x^k and w^1, ..., w^r are some given elements of ℝⁿ. The resolution theorem states that a polyhedron with at least one extreme point is a finitely generated set (this is also true for general polyhedra; see Exercise 4.47). We now discuss a converse result, which states that every finitely generated set is a polyhedron.
As observed earlier, a finitely generated set Q can be viewed as the image of the polyhedron

    H = { (λ₁, ..., λ_k, θ₁, ..., θ_r) | Σ_{i=1}^k λ_i = 1, λ_i ≥ 0, θ_j ≥ 0 },

under a certain linear mapping. Thus, the results of Section 2.8 apply and establish that a finitely generated set is indeed a polyhedron. We record this result and also present a proof based on duality.
Theorem 4.16 A finitely generated set is a polyhedron. In particular, the convex hull of finitely many vectors is a (bounded) polyhedron.
Proof. Consider the linear programming problem (4.4) that was used in the proof of Theorem 4.15. A given vector z belongs to a finitely generated set Q of the form (4.6) if and only if the problem (4.4) has a feasible solution. Using duality, this is the case if and only if problem (4.5) has finite optimal cost. We convert problem (4.5) to standard form by introducing nonnegative variables p⁺, p⁻, q⁺, q⁻, such that p = p⁺ − p⁻ and q = q⁺ − q⁻, as well as surplus variables. Since standard form polyhedra contain no lines, Theorem 4.13 shows that the optimal cost in the standard form problem is finite if and only if

    (p⁺ − p⁻)'z + q⁺ − q⁻ ≥ 0

for each one of its finitely many extreme rays. Hence, z ∈ Q if and only if z satisfies a finite collection of linear inequalities. This shows that Q is a polyhedron. ∎
In conclusion, we have two ways of representing a polyhedron:

(a) in terms of a finite set of linear constraints;

(b) as a finitely generated set, in terms of its extreme points and extreme rays.

These two descriptions are mathematically equivalent, but can be quite different from a practical viewpoint. For example, we may be able to describe a polyhedron in terms of a small number of linear constraints. If, on the other hand, this polyhedron has many extreme points, a description as a finitely generated set can be much more complicated. Furthermore, passing from one type of description to the other is, in general, a complicated computational task.
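To illustrate passing from the constraint description to the finitely generated one, the following brute-force sketch enumerates extreme points by solving all n × n subsystems of potentially active constraints. The helper `extreme_points` is our own illustration (not an algorithm from the text), and the approach is exponential in the number of constraints:

```python
import itertools
import numpy as np

# Enumerate the extreme points of P = {x | Ax >= b}: every extreme point
# is determined by n linearly independent active constraints, so we try
# all n-row subsystems and keep the feasible, non-duplicate solutions.
def extreme_points(A, b, tol=1e-9):
    m, n = A.shape
    points = []
    for rows in itertools.combinations(range(m), n):
        sub_A, sub_b = A[list(rows)], b[list(rows)]
        if np.linalg.matrix_rank(sub_A) < n:
            continue
        x = np.linalg.solve(sub_A, sub_b)
        if np.all(A @ x >= b - tol) and not any(np.allclose(x, p) for p in points):
            points.append(x)
    return points

# The unit square {x | x1 >= 0, x2 >= 0, -x1 >= -1, -x2 >= -1}:
A = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]])
b = np.array([0.0, 0.0, -1.0, -1.0])
print(len(extreme_points(A, b)))  # 4
```

Four constraints suffice to describe the square, and the finitely generated description (its four corners) is recovered; for polyhedra with many extreme points, the second description can be exponentially larger than the first.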
4.10 General linear programming duality*
In the definition of the dual problem (Section 4.2), we associated a dual variable p_i with each constraint of the form a_i'x = b_i, a_i'x ≥ b_i, or a_i'x ≤ b_i. However, no dual variables were associated with constraints of the form x_j ≥ 0 or x_j ≤ 0. In the same spirit, and in a more general approach to linear programming duality, we can choose arbitrarily which constraints will be associated with price variables and which ones will not. In this section, we develop a general duality theorem that covers such a situation.

Consider the primal problem

    minimize    c'x
    subject to  Ax ≥ b
                x ∈ P,

where P is the polyhedron

    P = {x | Dx ≥ d}.

We associate a dual vector p with the constraint Ax ≥ b. The constraint x ∈ P is a generalization of constraints of the form x_j ≥ 0 or x_j ≤ 0, and dual variables are not associated with it.
As in Section 4.1, we define the dual objective q(p) by

    q(p) = min_{x∈P} [c'x + p'(b − Ax)].        (4.7)

The dual problem is then defined as

    maximize    q(p)
    subject to  p ≥ 0.
We first provide a generalization of the weak duality theorem.
Theorem 4.17 (Weak duality) If x is primal feasible (Ax ≥ b and x ∈ P), and p is dual feasible (p ≥ 0), then q(p) ≤ c'x.

Proof. If x and p are primal and dual feasible, respectively, then p'(b − Ax) ≤ 0, which implies that

    q(p) = min_{y∈P} [c'y + p'(b − Ay)] ≤ c'x + p'(b − Ax) ≤ c'x.    ∎

We also have the following generalization of the strong duality theorem.

Theorem 4.18 (Strong duality) If the primal problem has an optimal solution, so does the dual, and the respective optimal costs are equal.
Proof. Since P = {x | Dx ≥ d}, the primal problem is of the form

    minimize    c'x
    subject to  Ax ≥ b
                Dx ≥ d,

and we assume that it has an optimal solution. Its dual, which is

    maximize    p'b + q'd
    subject to  p'A + q'D = c'        (4.8)
                p ≥ 0
                q ≥ 0,

must then have the same optimal cost. For any fixed p, the vector q should be chosen optimally in the problem (4.8). Thus, the dual problem (4.8) can also be written as

    maximize    p'b + f(p)
    subject to  p ≥ 0,

where f(p) is the optimal cost in the problem

    maximize    q'd
    subject to  q'D = c' − p'A        (4.9)
                q ≥ 0.

[If the latter problem is infeasible, we set f(p) = −∞.] Using the strong duality theorem for problem (4.9), we obtain

    f(p) = min_{Dx≥d} (c' − p'A)x.

We conclude that the dual problem (4.8) has the same optimal cost as the problem

    maximize    p'b + min_{Dx≥d} (c' − p'A)x
    subject to  p ≥ 0.

By comparing with Eq. (4.7), we see that this is the same as maximizing q(p) over all p ≥ 0. ∎
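As a small illustration of Eq. (4.7), the following sketch evaluates q(p) on a two-variable instance of our own (an assumption, not an example from the text), where P is a box so that the inner minimization is available in closed form, and checks weak duality (Theorem 4.17) against a feasible point:

```python
import numpy as np

# Hypothetical instance: minimize c'x subject to Ax >= b, with the
# "simple" constraints collected in P = {x | 0 <= x <= 3}.
c = np.array([1.0, 1.0])
A = np.array([[1.0, 1.0]])
b = np.array([2.0])

def q(p):
    # Eq. (4.7): q(p) = p'b + min_{x in P} (c' - p'A)x.  Over the box
    # [0, 3]^2 the inner minimum separates coordinate by coordinate:
    # each coordinate contributes 3 * min(g_i, 0) for g = c - A'p.
    g = c - A.T @ p
    return p @ b + np.sum(3.0 * np.minimum(g, 0.0))

x_feas = np.array([1.0, 1.0])              # primal feasible point, cost 2
for p in ([0.0], [0.5], [1.0]):
    assert q(np.array(p)) <= c @ x_feas    # weak duality (Theorem 4.17)

print(q(np.array([1.0])))                  # 2.0 -- matches the optimal cost
```

At p = 1 the lower bound q(p) meets the optimal primal cost, as Theorem 4.18 guarantees must happen for some dual feasible p.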
The idea of selectively assigning dual variables to some of the constraints is often used in order to treat "simpler" constraints differently than more "complex" ones, and has numerous applications in large-scale optimization. (Applications to integer programming are discussed in Section 11.4.) Finally, let us point out that the approach in this section extends to certain nonlinear optimization problems. For example, if we replace the linear cost function c'x by a general convex function c(x), and the polyhedron P by a general convex set, we can again define the dual objective according to the formula

    q(p) = min_{x∈P} [c(x) + p'(b − Ax)].

It turns out that the strong duality theorem remains valid for such nonlinear problems, under suitable technical conditions, but this lies beyond the scope of this book.
4.11 Summary
We summarize here the main ideas that have been developed in this chapter.
Given a (primal) linear programming problem, we can associate with it another (dual) linear programming problem, by following a set of mechanical rules. The definition of the dual problem is consistent, in the sense that the duals of equivalent primal problems are themselves equivalent.
Each dual variable is associated with a particular primal constraint and can be viewed as a penalty for violating that constraint. By replacing the primal constraints with penalty terms, we increase the set of available options, and this allows us to construct solutions whose cost is less than the optimal cost. In particular, every dual feasible vector leads to a lower bound on the optimal cost of the primal problem (this is the essence of the weak duality theorem). The maximization in the dual problem is then a search for the tightest such lower bound. The strong duality theorem asserts that the tightest such lower bound is equal to the optimal primal cost.
An optimal dual variable can also be interpreted as a marginal cost, that is, as the rate of change of the optimal primal cost when we perform a small perturbation of the right-hand side vector b, assuming nondegeneracy.
A useful relation between optimal primal and dual solutions is provided by the complementary slackness conditions. Intuitively, these conditions require that any constraint that is inactive at an optimal solution carries a zero price, which is compatible with the interpretation of prices as marginal costs.
We saw that every basis matrix in a standard form problem determines not only a primal basic solution, but also a basic dual solution. This observation is at the heart of the dual simplex method. This method is similar to the primal simplex method in that it generates a sequence of primal basic solutions, together with an associated sequence of dual basic solutions. It is different, however, in that the dual basic solutions are dual feasible, with ever improving costs, while the primal basic solutions are infeasible (except for the last one). We developed the dual simplex method by simply describing its mechanics and by providing an algebraic justification.
Nevertheless, the dual simplex method also has a geometric interpretation.
It keeps moving from one dual basic feasible solution to an adjacent one
and, in this respect, it is similar to the primal simplex method applied to
the dual problem.
All of duality theory can be developed by exploiting the termination conditions of the simplex method, and this was our initial approach to the subject. We also pursued an alternative line of development that proceeded from first principles and used geometric arguments. This is a more direct and more general approach, but requires more abstract reasoning.
Duality theory provided us with some powerful tools based on which we were able to enhance our geometric understanding of polyhedra. We derived a few theorems of the alternative (like Farkas' lemma), which are surprisingly powerful and have applications in a wide variety of contexts. In fact, Farkas' lemma can be viewed as the core of linear programming duality theory. Another major result that we derived is the resolution theorem, which allows us to express any element of a nonempty polyhedron with at least one extreme point as a convex combination of its extreme points plus a nonnegative linear combination of its extreme rays; in other words, every polyhedron is "finitely generated." The converse is also true, and every finitely generated set is a polyhedron (can be represented in terms of linear inequality constraints). Results of this type play a key role in confirming our intuitive geometric understanding of polyhedra and linear programming. They allow us to develop alternative views of certain situations and lead to deeper understanding. Many such results have an "obvious" geometric content and are often taken for granted. Nevertheless, as we have seen, rigorous proofs can be quite elaborate.
4.12 Exercises
Exercise 4.1 Consider the linear programming problem:

    minimize    x₁ − x₂
    subject to  2x₁ + 3x₂ − x₃ + x₄ ≤ 0
                3x₁ + x₂ + 4x₃ − 2x₄ ≥ 3
                −x₁ − x₂ + 2x₃ + x₄ = 6
                x₁ ≤ 0
                x₂, x₃ ≥ 0.

Write down the corresponding dual problem.

Exercise 4.2 Consider the primal problem

    minimize    c'x
    subject to  Ax ≥ b
                x ≥ 0.
Form the dual problem and convert it into an equivalent minimization problem. Derive a set of conditions on the matrix A and the vectors b, c, under which the dual is identical to the primal, and construct an example in which these conditions are satisfied.
Exercise 4.3 The purpose of this exercise is to show that solving linear programming problems is no harder than solving systems of linear inequalities.

Suppose that we are given a subroutine which, given a system of linear inequality constraints, either produces a solution or decides that no solution exists. Construct a simple algorithm that uses a single call to this subroutine and which finds an optimal solution to any linear programming problem that has an optimal solution.
Exercise 4.4 Let A be a symmetric square matrix. Consider the linear programming problem

    minimize    c'x
    subject to  Ax ≥ c
                x ≥ 0.

Prove that if x* satisfies Ax* = c and x* ≥ 0, then x* is an optimal solution.
Exercise 4.5 Consider a linear programming problem in standard form and assume that the rows of A are linearly independent. For each one of the following statements, provide either a proof or a counterexample.

(a) Let x* be a basic feasible solution. Suppose that for every basis corresponding to x*, the associated basic solution to the dual is infeasible. Then, the optimal cost must be strictly less than c'x*.

(b) The dual of the auxiliary primal problem considered in Phase I of the simplex method is always feasible.

(c) Let p_i be the dual variable associated with the ith equality constraint in the primal. Eliminating the ith primal equality constraint is equivalent to introducing the additional constraint p_i = 0 in the dual problem.

(d) If the unboundedness criterion in the primal simplex algorithm is satisfied, then the dual problem is infeasible.
Exercise 4.6 * (Duality in Chebyshev approximation) Let A be an m × n matrix and let b be a vector in ℝᵐ. We consider the problem of minimizing ‖Ax − b‖_∞ over all x ∈ ℝⁿ. Here ‖·‖_∞ is the vector norm defined by ‖y‖_∞ = max_i |y_i|. Let v be the value of the optimal cost.

(a) Let p be any vector in ℝᵐ that satisfies Σ_{i=1}^m |p_i| = 1 and p'A = 0'. Show that p'b ≤ v.

(b) In order to obtain the best possible lower bound of the form considered in part (a), we form the linear programming problem

    maximize    p'b
    subject to  p'A = 0'
                Σ_{i=1}^m |p_i| = 1.

Show that the optimal cost in this problem is equal to v.
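A tiny instance of our own (an assumption, not part of the exercise) shows how a vector p of the type described in part (a) certifies a lower bound on the Chebyshev approximation error:

```python
import numpy as np

# Hypothetical instance: A has rows a_1 = a_2 = [1], b = (0, 2), so
# min_x ||Ax - b||_inf = min_x max(|x|, |x - 2|) = 1, attained at x = 1.
A = np.array([[1.0], [1.0]])
b = np.array([0.0, 2.0])
v = 1.0

p = np.array([-0.5, 0.5])   # satisfies sum_i |p_i| = 1 and p'A = 0'
print(np.abs(p).sum(), (p @ A).item(), bool(p @ b <= v))  # 1.0 0.0 True
```

Here p'b = 1 = v, so the bound of part (a) is attained, illustrating the claim of part (b) that the best such bound equals the optimal cost.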
Exercise 4.7 (Duality in piecewise linear convex optimization) Consider the problem of minimizing max_{i=1,...,m} (a_i'x − b_i) over all x ∈ ℝⁿ. Let v be the value of the optimal cost, assumed finite. Let A be the matrix with rows a₁, ..., a_m, and let b be the vector with components b₁, ..., b_m.

(a) Consider any vector p ∈ ℝᵐ that satisfies p'A = 0', p ≥ 0, and Σ_{i=1}^m p_i = 1. Show that −p'b ≤ v.

(b) In order to obtain the best possible lower bound of the form considered in part (a), we form the linear programming problem

    maximize    −p'b
    subject to  p'A = 0'
                p'e = 1
                p ≥ 0,

where e is the vector with all components equal to 1. Show that the optimal cost in this problem is equal to v.
Exercise 4.8 Consider the linear programming problem of minimizing c'x subject to Ax = b, x ≥ 0. Let x* be an optimal solution, assumed to exist, and let p* be an optimal solution to the dual.

(a) Let x̃ be an optimal solution to the primal, when c is replaced by some c̃. Show that (c̃ − c)'(x̃ − x*) ≤ 0.

(b) Let the cost vector be fixed at c, but suppose that we now change b to b̃, and let x̃ be a corresponding optimal solution to the primal. Prove that (p*)'(b̃ − b) ≤ c'(x̃ − x*).
Exercise 4.9 (Backpropagation of dual variables in a multiperiod problem) A company makes a product that can be either sold or stored to meet future demand. Let t = 1, ..., T denote the periods of the planning horizon. Let b_t be the production volume during period t, which is assumed to be known in advance. During each period t, a quantity x_t of the product is sold, at a unit price of d_t. Furthermore, a quantity y_t can be sent to long-term storage, at a unit transportation cost of c. Alternatively, a quantity w_t can be retrieved from storage, at zero cost. We assume that when the product is prepared for long-term storage, it is partly damaged, and only a fraction f of the total survives. Demand is assumed to be unlimited. The main question is whether it is profitable to store some of the production, in anticipation of higher prices in the future. This leads us to the following problem, where z_t stands for the amount kept in long-term storage, at the end of period t:

    maximize    Σ_{t=1}^T αᵗ(d_t x_t − c y_t) + α^{T+1} d_{T+1} z_T
    subject to  x_t + y_t − w_t = b_t,              t = 1, ..., T,
                z_t − z_{t−1} − f y_t + w_t = 0,    t = 1, ..., T,
                z₀ = 0,
                x_t, y_t, w_t, z_t ≥ 0.

Here, d_{T+1} is the salvage price for whatever inventory is left at the end of period T. Furthermore, α is a discount factor, with 0 < α < 1, reflecting the fact that future revenues are valued less than current ones.
(a) Let p_t and q_t be dual variables associated with the first and second equality constraint, respectively. Write down the dual problem.

(b) Assume that 0 < f < 1, b_t ≥ 0, and c ≥ 0. Show that the following formulae provide an optimal solution to the dual problem:

    q_T = max{ αᵀ d_T, α^{T+1} d_{T+1} },
    q_t = max{ q_{t+1}, αᵗ d_t },           t = 1, ..., T − 1,
    p_t = max{ αᵗ d_t, f q_t − αᵗ c },      t = 1, ..., T.

(c) Explain how the result in part (b) can be used to compute an optimal solution to the original problem. Primal and dual nondegeneracy can be assumed.
Exercise 4.10 (Saddle points of the Lagrangean) Consider the standard form problem of minimizing c'x subject to Ax = b and x ≥ 0. We define the Lagrangean by

    L(x, p) = c'x + p'(b − Ax).

Consider the following "game": player 1 chooses some x ≥ 0, and player 2 chooses some p; then, player 1 pays to player 2 the amount L(x, p). Player 1 would like to minimize L(x, p), while player 2 would like to maximize it.

A pair (x*, p*), with x* ≥ 0, is called an equilibrium point (or a saddle point, or a Nash equilibrium) if

    L(x*, p) ≤ L(x*, p*) ≤ L(x, p*),    for all x ≥ 0 and all p.

(Thus, we have an equilibrium if no player is able to improve her performance by unilaterally modifying her choice.)

Show that a pair (x*, p*) is an equilibrium if and only if x* and p* are optimal solutions to the standard form problem under consideration and its dual, respectively.
Exercise 4.11 Consider a linear programming problem in standard form which is infeasible, but which becomes feasible and has finite optimal cost when the last equality constraint is omitted. Show that the dual of the original (infeasible) problem is feasible and the optimal cost is infinite.
Exercise 4.12 * (Degeneracy and uniqueness I) Consider a general linear programming problem and suppose that we have a nondegenerate basic feasible solution to the primal. Show that the complementary slackness conditions lead to a system of equations for the dual vector that has a unique solution.
Exercise 4.13 * (Degeneracy and uniqueness II) Consider the following pair of problems that are duals of each other:

    minimize    c'x               maximize    p'b
    subject to  Ax = b            subject to  p'A ≤ c'.
                x ≥ 0,

(a) Prove that if one problem has a nondegenerate and unique optimal solution, so does the other.

(b) Suppose that we have a nondegenerate optimal basis for the primal and that the reduced cost for one of the basic variables is zero. What does the result of part (a) imply? Is it true that there must exist another optimal basis?
Exercise 4.14 (Degeneracy and uniqueness III) Give an example in which the primal problem has a degenerate optimal basic feasible solution, but the dual has a unique optimal solution. (The example need not be in standard form.)
Exercise 4.15 (Degeneracy and uniqueness IV) Consider the problem

    minimize    x₂
    subject to  x₂ = 1
                x₁ ≥ 0
                x₂ ≥ 0.

Write down its dual. For both the primal and the dual problem determine whether they have unique optimal solutions and whether they have nondegenerate optimal solutions. Is this example in agreement with the statement that nondegeneracy of an optimal basic feasible solution in one problem implies uniqueness of optimal solutions for the other? Explain.
Exercise 4.16 Give an example of a pair (primal and dual) of linear programming problems, both of which have multiple optimal solutions.
Exercise 4.17 This exercise is meant to demonstrate that knowledge of a primal optimal solution does not necessarily contain information that can be exploited to determine a dual optimal solution. In particular, determining an optimal solution to the dual is as hard as solving a system of linear inequalities, even if an optimal solution to the primal is available.

Consider the problem of minimizing c'x subject to Ax ≥ 0, and suppose that we are told that the zero vector is optimal. Let the dimensions of A be m × n, and suppose that we have an algorithm that determines a dual optimal solution and whose running time is O((mn)^k), for some constant k. (Note that if x = 0 is not an optimal primal solution, the dual has no feasible solution, and we assume that in this case our algorithm exits with an error message.) Assuming the availability of the above algorithm, construct a new algorithm that takes as input a system of m linear inequalities in n variables, runs for O((mn)^k) time, and either finds a feasible solution or determines that no feasible solution exists.
Exercise 4.18 Consider a problem in standard form. Suppose that the matrix A has dimensions m × n and its rows are linearly independent. Suppose that all basic solutions to the primal and to the dual are nondegenerate. Let x be a feasible solution to the primal and let p be a dual vector (not necessarily feasible), such that the pair (x, p) satisfies complementary slackness.

(a) Show that there exist m columns of A that are linearly independent and such that the corresponding components of x are all positive.

(b) Show that x and p are basic solutions to the primal and the dual, respectively.

(c) Show that the result of part (a) is false if the nondegeneracy assumption is removed.
Exercise 4.19 Let P = {x ∈ ℝⁿ | Ax = b, x ≥ 0} be a nonempty polyhedron, and let m be the dimension of the vector b. We call x_j a null variable if x_j = 0 whenever x ∈ P.

(a) Suppose that there exists some p ∈ ℝᵐ for which p'A ≥ 0', p'b = 0, and such that the jth component of p'A is positive. Prove that x_j is a null variable.

(b) Prove the converse of (a): if x_j is a null variable, then there exists some p ∈ ℝᵐ with the properties stated in part (a).

(c) If x_j is not a null variable, then by definition, there exists some y ∈ P for which y_j > 0. Use the results in parts (a) and (b) to prove that there exist x ∈ P and p ∈ ℝᵐ such that:

    p'A ≥ 0',    p'b = 0,    x + A'p > 0.
Exercise 4.20 * (Strict complementary slackness)

(a) Consider the following linear programming problem and its dual:

    minimize    c'x               maximize    p'b
    subject to  Ax = b            subject to  p'A ≤ c',
                x ≥ 0,

and assume that both problems have an optimal solution. Fix some j. Suppose that every optimal solution to the primal satisfies x_j = 0. Show that there exists an optimal solution p to the dual such that p'A_j < c_j. (Here, A_j is the jth column of A.) Hint: Let d be the optimal cost. Consider the problem of minimizing −x_j subject to Ax = b, x ≥ 0, and c'x ≤ d, and form its dual.

(b) Show that there exist optimal solutions x and p to the primal and to the dual, respectively, such that for every j we have either x_j > 0 or p'A_j < c_j. Hint: Use part (a) for each j, and then take the average of the vectors obtained.

(c) Consider now the following linear programming problem and its dual:

    minimize    c'x               maximize    p'b
    subject to  Ax ≥ b            subject to  p'A ≤ c'
                x ≥ 0,                        p ≥ 0.

Assume that both problems have an optimal solution. Show that there exist optimal solutions to the primal and to the dual, respectively, that satisfy strict complementary slackness, that is:

    (i) For every j, we have either x_j > 0 or p'A_j < c_j.

    (ii) For every i, we have either a_i'x > b_i or p_i > 0. (Here, a_i is the ith row of A.) Hint: Convert the primal to standard form and apply part (b).

(d) Consider the linear programming problem

    minimize    5x₁ + 5x₂
    subject to  x₁ + x₂ ≥ 2
                2x₁ − x₂ ≥ 0
                x₁, x₂ ≥ 0.

Does the optimal primal solution (2/3, 4/3), together with the corresponding dual optimal solution, satisfy strict complementary slackness? Determine all primal and dual optimal solutions and identify the set of all strictly complementary pairs.
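A quick numeric check of the conditions in part (c) can be scripted. Here we take p = (5, 0) as a candidate dual optimum for the data of part (d) (an assumption the reader should verify as part of the exercise) and test two primal optimal solutions:

```python
import numpy as np

# Data of Exercise 4.20(d); p = (5, 0) is our candidate dual optimum.
A = np.array([[1.0, 1.0], [2.0, -1.0]])
b = np.array([2.0, 0.0])
c = np.array([5.0, 5.0])
p = np.array([5.0, 0.0])

def strictly_complementary(x, p, tol=1e-9):
    primal_slack = A @ x - b          # a_i'x - b_i for each row i
    dual_slack = c - A.T @ p          # c_j - p'A_j for each column j
    cond_j = np.all((x > tol) | (dual_slack > tol))            # condition (i)
    cond_i = np.all((primal_slack > tol) | (p > tol))          # condition (ii)
    return bool(cond_j and cond_i)

print(strictly_complementary(np.array([2/3, 4/3]), p))   # False
print(strictly_complementary(np.array([4/3, 2/3]), p))   # True
```

At (2/3, 4/3) the second constraint is active while its price is zero, so that particular pair fails condition (ii), even though a strictly complementary pair exists.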
Exercise 4.21 * (Clark's theorem) Consider the following pair of linear programming problems:

    minimize    c'x               maximize    p'b
    subject to  Ax ≥ b            subject to  p'A ≤ c'
                x ≥ 0,                        p ≥ 0.

Suppose that at least one of these two problems has a feasible solution. Prove that the set of feasible solutions to at least one of the two problems is unbounded. Hint: Interpret boundedness of a set in terms of the finiteness of the optimal cost of some linear programming problem.
Exercise 4.22 Consider the dual simplex method applied to a standard form problem with linearly independent rows. Suppose that we have a basis which is primal infeasible, but dual feasible, and let i be such that x_B(i) < 0. Suppose that all entries in the ith row in the tableau (other than x_B(i)) are nonnegative. Show that the optimal dual cost is +∞.
Exercise 4.23 Describe in detail the mechanics of a revised dual simplex method that works in terms of the inverse basis matrix B⁻¹ instead of the full simplex tableau.
Exercise 4. 24 Consider the lexicographic pivoting rule for the dual simplex
method and suppose that the algorithm is initialized with each column of the
tableau being lexicographically positive. Prove that the dual simplex method
does not cycle.
Exercise 4.25 This exercise shows that if we bring the dual problem into standard form and then apply the primal simplex method, the resulting algorithm is not identical to the dual simplex method.
Consider the following standard form problem and its dual:

minimize x1 + x2
subject to x1 = 1
x2 = 1
x1, x2 ≥ 0,

maximize p1 + p2
subject to p1 ≤ 1
p2 ≤ 1.

Here, there is only one possible basis and the dual simplex method must terminate immediately. Show that if the dual problem is converted into standard form and the primal simplex method is applied to it, one or more changes of basis may be required.
194 Chap. 4 Duality theory
Exercise 4.26 Let A be a given matrix. Show that exactly one of the following alternatives must hold.
(a) There exists some x ≠ 0 such that Ax = 0, x ≥ 0.
(b) There exists some p such that p'A > 0'.
Exercise 4.27 Let A be a given matrix. Show that the following two statements are equivalent.
(a) Every vector x such that Ax ≥ 0 and x ≥ 0 must satisfy x1 = 0.
(b) There exists some p ≥ 0 such that p'A ≤ 0' and p'A1 < 0, where A1 is the first column of A.
Exercise 4.28 Let a and a1,...,am be given vectors in ℝⁿ. Prove that the following two statements are equivalent:
(a) For all x ≥ 0, we have a'x ≤ max_i a_i'x.
(b) There exist nonnegative coefficients λ_i that sum to 1, and such that a ≤ Σ_{i=1}^m λ_i a_i.
Exercise 4.29 (Inconsistent systems of linear inequalities) Let a1,...,am be some vectors in ℝⁿ, with m > n+1. Suppose that the system of inequalities a_i'x ≥ b_i, i = 1,...,m, does not have any solutions. Show that we can choose n+1 of these inequalities, so that the resulting system of inequalities has no solutions.
Exercise 4.30 (Helly's theorem)
(a) Let F be a finite family of polyhedra in ℝⁿ such that every n+1 polyhedra in F have a point in common. Prove that all polyhedra in F have a point in common. Hint: Use the result in Exercise 4.29.
(b) For n = 2, part (a) asserts that polyhedra P1, P2,..., PK (K ≥ 3) in the plane have a point in common if and only if every three of them have a point in common. Is the result still true with "three" replaced by "two"?
Exercise 4.31 (Unit eigenvectors of stochastic matrices) We say that an n×n matrix P, with entries p_ij, is stochastic if all of its entries are nonnegative and

Σ_{j=1}^n p_ij = 1, for all i,

that is, the sum of the entries of each row is equal to 1.
Use duality to show that if P is a stochastic matrix, then the system of equations

p'P = p', p ≥ 0,

has a nonzero solution. (Note that the vector p can be normalized so that its components sum to one. Then, the result in this exercise establishes that every finite-state Markov chain has an invariant probability distribution.)
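As a numerical illustration of the fact asserted in this exercise (not a proof), an invariant vector of a concrete stochastic matrix can be found by repeatedly applying p' ← p'P. The 3×3 matrix below and the iteration count are made-up choices for this sketch; everything is plain Python.

```python
# For a stochastic matrix P (rows sum to 1), find p with p'P = p', p >= 0,
# by power iteration.  Left-multiplication preserves the sum of the entries
# of p, so p stays a probability vector throughout.

def left_fixed_point(P, iters=500):
    n = len(P)
    p = [1.0 / n] * n                      # start from the uniform vector
    for _ in range(iters):
        p = [sum(p[i] * P[i][j] for i in range(n)) for j in range(n)]
    return p

P = [[0.9, 0.1, 0.0],
     [0.2, 0.7, 0.1],
     [0.0, 0.3, 0.7]]                      # each row sums to 1

p = left_fixed_point(P)
pP = [sum(p[i] * P[i][j] for i in range(3)) for j in range(3)]

print([round(x, 6) for x in p])            # → [0.6, 0.3, 0.1]
print(all(abs(pP[j] - p[j]) < 1e-9 for j in range(3)))   # → True
```

For this particular matrix the invariant distribution is (0.6, 0.3, 0.1), which the iteration recovers to high accuracy.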
Exercise 4.32 * (Leontief systems and Samuelson's substitution theorem) A Leontief matrix is an m×n matrix A in which every column has at most one positive element. For an interpretation, each column A_j corresponds to a production process. If a_ij is negative, |a_ij| represents the amount of goods of type i consumed by the process. If a_ij is positive, it represents the amount of goods of type i produced by the process. If x_j is the intensity with which process j is used, then Ax represents the net output of the different goods. The matrix A is called productive if there exists some x ≥ 0 such that Ax > 0.
(a) Let A be a square productive Leontief matrix (m = n). Show that every vector z that satisfies Az ≥ 0 must be nonnegative. Hint: If z satisfies Az ≥ 0 but has a negative component, consider the smallest nonnegative θ such that some component of x + θz becomes zero, and derive a contradiction.
(b) Show that every square productive Leontief matrix is invertible and that all entries of the inverse matrix are nonnegative. Hint: Use the result in part (a).
(c) We now consider the general case where n ≥ m, and we introduce a constraint of the form e'x ≤ 1, where e = (1,...,1). (Such a constraint could capture, for example, a bottleneck due to the finiteness of the labor force.) An "output" vector y ∈ ℝᵐ is said to be achievable if y ≥ 0 and there exists some x ≥ 0 such that Ax = y and e'x ≤ 1. An achievable vector y is said to be efficient if there exists no achievable vector z such that z ≥ y and z ≠ y. (Intuitively, an output vector y which is not efficient can be improved upon and is therefore uninteresting.) Suppose that A is productive. Show that there exists a positive efficient vector y. Hint: Given a positive achievable vector ȳ, consider maximizing Σ_i y_i over all achievable vectors y that are larger than ȳ.
(d) Suppose that A is productive. Show that there exists a set of m production processes that are capable of generating all possible efficient output vectors y. That is, there exist indices B(1),...,B(m), such that every efficient output vector y can be expressed in the form y = Σ_{i=1}^m A_{B(i)} x_{B(i)}, for some nonnegative coefficients x_{B(i)} whose sum is bounded by 1. Hint: Consider the problem of minimizing e'x subject to Ax = y, x ≥ 0, and show that we can use the same optimal basis for all efficient vectors y.
Exercise 4.33 (Options pricing) Consider a market that operates for a single period, and which involves three assets: a stock, a bond, and an option. Let S be the price of the stock, in the beginning of the period. Its price S̄ at the end of the period is random and is assumed to be equal to either Su, with probability β, or Sd, with probability 1 - β. Here, u and d are scalars that satisfy d < 1 < u. Bonds are assumed riskless. Investing one dollar in a bond results in a payoff of r, at the end of the period. (Here, r is a scalar greater than 1.) Finally, the option gives us the right to purchase, at the end of the period, one stock at a fixed price of K. If the realized price S̄ of the stock is greater than K, we exercise the option and then immediately sell the stock in the stock market, for a payoff of S̄ - K. If on the other hand we have S̄ < K, there is no advantage in exercising the option, and we receive zero payoff. Thus, the value of the option at the end of the period is equal to max{0, S̄ - K}. Since the option is itself an asset, it
should have a value in the beginning of the time period. Show that under the absence of arbitrage condition, the value of the option must be equal to

y_u max{0, Su - K} + y_d max{0, Sd - K},

where y_u and y_d are a solution to the following system of linear equations:

u y_u + d y_d = 1,
y_u + y_d = 1/r.

Hint: Write down the payoff matrix and use Theorem 4.8.
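The 2×2 system u·y_u + d·y_d = 1, y_u + y_d = 1/r can be solved in closed form, which makes the resulting pricing rule easy to try out numerically. In the sketch below the closed-form expressions are obtained by eliminating y_d, and the numbers S, K, u, d, r are made up for illustration:

```python
# Price the one-period option of Exercise 4.33: solve the linear system
# u*y_u + d*y_d = 1 and y_u + y_d = 1/r, then combine the two terminal
# payoffs with weights y_u and y_d.

def option_value(S, K, u, d, r):
    assert d < 1 < u and r > 1
    y_u = (r - d) / (r * (u - d))      # closed-form solution of the system
    y_d = (u - r) / (r * (u - d))
    # Sanity checks: the pair solves both equations.
    assert abs(u * y_u + d * y_d - 1) < 1e-9
    assert abs(y_u + y_d - 1 / r) < 1e-9
    return y_u * max(0, S * u - K) + y_d * max(0, S * d - K)

# Example: S = 100, K = 100, the stock moves to 120 or 90, r = 1.05.
print(round(option_value(100, 100, 1.2, 0.9, 1.05), 4))   # → 9.5238
```

Note that the probability β plays no role in the price, which is exactly the point of the no-arbitrage argument.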
Exercise 4.34 (Finding separating hyperplanes) Consider a polyhedron P that has at least one extreme point.
(a) Suppose that we are given the extreme points x^i and a complete set of extreme rays w^i of P. Create a linear programming problem whose solution provides us with a separating hyperplane that separates P from the origin, or allows us to conclude that none exists.
(b) Suppose now that P is given to us in the form P = {x | a_i'x ≥ b_i, i = 1,...,m}. Suppose that 0 ∉ P. Explain how a separating hyperplane can be found.
Exercise 4.35 (Separation of disjoint polyhedra) Consider two nonempty polyhedra P = {x ∈ ℝⁿ | Ax ≤ b} and Q = {x ∈ ℝⁿ | Dx ≤ d}. We are interested in finding out whether the two polyhedra have a point in common.
(a) Devise a linear programming problem such that: if P ∩ Q is nonempty, it returns a point in P ∩ Q; if P ∩ Q is empty, the linear programming problem is infeasible.
(b) Suppose that P ∩ Q is empty. Use the dual of the problem you have constructed in part (a) to show that there exists a vector c such that c'x < c'y for all x ∈ P and y ∈ Q.
Exercise 4.36 (Containment of polyhedra)
(a) Let P and Q be two polyhedra in ℝⁿ described in terms of linear inequality constraints. Devise an algorithm that decides whether P is a subset of Q.
(b) Repeat part (a) if the polyhedra are described in terms of their extreme points and extreme rays.
Exercise 4.37 (Closedness of finitely generated cones) Let A1,...,An be given vectors in ℝᵐ. Consider the cone C = {Σ_{i=1}^n A_i x_i | x_i ≥ 0} and let y^k, k = 1, 2,..., be a sequence of elements of C that converges to some y. Show that y ∈ C (and hence C is closed), using the following argument. With y fixed as above, consider the problem of minimizing ||y - Σ_{i=1}^n A_i x_i||_∞, subject to the constraints x1,...,xn ≥ 0. Here, ||·||_∞ stands for the maximum norm, defined by ||x||_∞ = max_i |x_i|. Explain why the above minimization problem has an optimal solution, find the value of the optimal cost, and prove that y ∈ C.
Exercise 4.38 (From Farkas' lemma to duality) Use Farkas' lemma to prove the duality theorem for a linear programming problem involving constraints of the form a_i'x = b_i, a_i'x ≥ b_i, and nonnegativity constraints for some of the variables x_j. Hint: Start by deriving the form of the set of feasible directions at an optimal solution.
Exercise 4.39 (Extreme rays of cones) Let us define a nonzero element d of a pointed polyhedral cone C to be an extreme ray if it has the following property: if there exist vectors f ∈ C and g ∈ C and some λ ∈ (0, 1) satisfying d = λf + (1-λ)g, then both f and g are scalar multiples of d. Prove that this definition of extreme rays is equivalent to Definition 4.2.
Exercise 4.40 (Extreme rays of a cone are extreme points of its sections) Consider the cone C = {x ∈ ℝⁿ | a_i'x ≥ 0, i = 1,...,m} and assume that the first n constraint vectors a1,...,an are linearly independent. For any nonnegative scalar r, we define the polyhedron P_r by

P_r = {x ∈ ℝⁿ | a_i'x ≥ 0, i = 1,...,m, Σ_{i=1}^n a_i'x = r}.

(a) Show that the polyhedron P_r is bounded for every r ≥ 0.
(b) Let r > 0. Show that a vector x ∈ P_r is an extreme point of P_r if and only if x is an extreme ray of the cone C.
Exercise 4.41 (Caratheodory's theorem) Show that every element x of a bounded polyhedron P ⊂ ℝⁿ can be expressed as a convex combination of at most n+1 extreme points of P. Hint: Consider an extreme point of the set of all possible representations of x.
Exercise 4.42 (Problems with side constraints) Consider the linear programming problem of minimizing c'x over a bounded polyhedron P ⊂ ℝⁿ and subject to additional constraints a_i'x = b_i, i = 1,...,L. Assume that the problem has a feasible solution. Show that there exists an optimal solution which is a convex combination of L+1 extreme points of P. Hint: Use the resolution theorem to represent P.
Exercise 4.43
(a) Consider the minimization of c1x1 + c2x2 subject to the constraints
Find necessary and sufficient conditions on (c1, c2) for the optimal cost to be finite.
(b) For a general feasible linear programming problem, consider the set of all cost vectors for which the optimal cost is finite. Is it a polyhedron? Prove your answer.
Exercise 4.44
(a) Let P = {(x1, x2) | x1 - x2 = 0, x1 + x2 = 0}. What are the extreme points and the extreme rays of P?
(b) Let P = {(x1, x2) | 4x1 + 2x2 ≥ 8, 2x1 + x2 ≤ 8}. What are the extreme points and the extreme rays of P?
(c) For the polyhedron of part (b), is it possible to express each one of its elements as a convex combination of its extreme points plus a nonnegative linear combination of its extreme rays? Is this compatible with the resolution theorem?
Exercise 4. 45 Let P be a polyhedron with at least one extreme point. Is it
possible to express an arbitrary element of P as a convex combination of its
extreme points plus a nonnegative multiple of a single extreme ray?
Exercise 4.46 (Resolution theorem for polyhedral cones) Let C be a nonempty polyhedral cone.
(a) Show that C can be expressed as the union of a finite number C1,...,Ck of pointed polyhedral cones. Hint: Intersect with orthants.
(b) Show that an extreme ray of C must be an extreme ray of one of the cones C1,...,Ck.
(c) Show that there exists a finite number of elements w^1,...,w^r of C such that

C = {Σ_{i=1}^r λ_i w^i | λ_i ≥ 0}.
Exercise 4.47 (Resolution theorem for general polyhedra) Let P be a polyhedron. Show that there exist vectors x^1,...,x^k and w^1,...,w^r such that

P = {Σ_{i=1}^k λ_i x^i + Σ_{j=1}^r θ_j w^j | λ_i ≥ 0, θ_j ≥ 0, Σ_{i=1}^k λ_i = 1}.

Hint: Generalize the steps in the preceding exercise.
Exercise 4.48 * (Polar, finitely generated, and polyhedral cones) For any cone C, we define its polar C⊥ by

C⊥ = {p | p'x ≤ 0, for all x ∈ C}.

(a) Let F be a finitely generated cone, of the form

F = {Σ_{i=1}^r λ_i w^i | λ_i ≥ 0}.

Show that F⊥ = {p | p'w^i ≤ 0, i = 1,...,r}, which is a polyhedral cone.
(b) Show that the polar of F⊥ is F and conclude that the polar of a polyhedral cone is finitely generated. Hint: Use Farkas' lemma.
(c) Show that a finitely generated pointed cone F is a polyhedron. Hint: Consider the polar of the polar.
(d) (Polar cone theorem) Let C be a closed, nonempty, and convex cone. Show that (C⊥)⊥ = C. Hint: Mimic the derivation of Farkas' lemma using the separating hyperplane theorem (Section 4.7).
(e) Is the polar cone theorem true when C is the empty set?
Exercise 4.49 Consider a polyhedron, and let x, y be two basic feasible solutions. If we are only allowed to make moves from any basic feasible solution to an adjacent one, show that we can go from x to y in a finite number of steps. Hint: Generalize the simplex method to nonstandard form problems: starting from a nonoptimal basic feasible solution, move along an extreme ray of the cone of feasible directions.
Exercise 4.50 We are interested in the problem of deciding whether a polyhedron

Q = {x ∈ ℝⁿ | Ax ≤ b, Dx ≥ d, x ≥ 0}

is nonempty. We assume that the polyhedron P = {x ∈ ℝⁿ | Ax ≤ b, x ≥ 0} is nonempty and bounded. For any vector p, of the same dimension as d, we define

g(p) = -p'd + max_{x∈P} p'Dx.

(a) Show that if Q is nonempty, then g(p) ≥ 0 for all p ≥ 0.
(b) Show that if Q is empty, then there exists some p ≥ 0, such that g(p) < 0.
(c) If Q is empty, what is the minimum of g(p) over all p ≥ 0?
4.13 Notes and sources

4.3. The duality theorem is due to von Neumann (1947), and Gale, Kuhn, and Tucker (1951).

4.6. Farkas' lemma is due to Farkas (1894) and Minkowski (1896). See Schrijver (1986) for a comprehensive presentation of related results. The connection between duality theory and arbitrage was developed by Ross (1976, 1978).

4.7. Weierstrass' Theorem and its proof can be found in most texts on real analysis; see, for example, Rudin (1976). While the simplex method is only relevant to linear programming problems with a finite number of variables, the approach based on the separating hyperplane theorem leads to a generalization of duality theory that covers more general convex optimization problems, as well as infinite-dimensional linear programming problems, that is, linear programming problems with infinitely many variables and constraints; see, e.g., Luenberger (1969) and Rockafellar (1970).

4.9. The resolution theorem and its converse are usually attributed to Farkas, Minkowski, and Weyl.
4.10. For extensions of duality theory to problems involving general convex functions and constraint sets, see Rockafellar (1970) and Bertsekas (1995b).

4.12. Exercises 4.6 and 4.7 are adapted from Boyd and Vandenberghe (1995). The result on strict complementary slackness (Exercise 4.20) was proved by Tucker (1956). The result in Exercise 4.21 is due to Clark (1961). The result in Exercise 4.30 is due to Helly (1923). Input-output macroeconomic models of the form considered in Exercise 4.32 have been introduced by Leontief, who was awarded the 1973 Nobel prize in economics. The result in Exercise 4.41 is due to Caratheodory (1907).
Chapter 5

Sensitivity analysis

Contents
5.1. Local sensitivity analysis
5.2. Global dependence on the right-hand side vector
5.3. The set of all dual optimal solutions*
5.4. Global dependence on the cost vector
5.5. Parametric programming
5.6. Summary
5.7. Exercises
5.8. Notes and sources
Consider the standard form problem

minimize c'x
subject to Ax = b
x ≥ 0,

and its dual

maximize p'b
subject to p'A ≤ c'.
In this chapter, we study the dependence of the optimal cost and the optimal solution on the coefficient matrix A, the requirement vector b, and the cost vector c. This is an important issue in practice because we often have incomplete knowledge of the problem data and we may wish to predict the effects of certain parameter changes.
In the first section of this chapter, we develop conditions under which the optimal basis remains the same despite a change in the problem data, and we examine the consequences on the optimal cost. We also discuss how to obtain an optimal solution if we add or delete some constraints. In subsequent sections, we allow larger changes in the problem data, resulting in a new optimal basis, and we develop a global perspective of the dependence of the optimal cost on the vectors b and c. The chapter ends with a brief discussion of parametric programming, which is an extension of the simplex method tailored to the case where there is a single scalar unknown parameter.
Many of the results in this chapter can be extended to cover general linear programming problems. Nevertheless, and in order to simplify the presentation, our standing assumption throughout this chapter will be that we are dealing with a standard form problem and that the rows of the m×n matrix A are linearly independent.
5.1 Local sensitivity analysis

In this section, we develop a methodology for performing sensitivity analysis. We consider a linear programming problem, and we assume that we already have an optimal basis B and the associated optimal solution x*. We then assume that some entry of A, b, or c has been changed, or that a new constraint is added, or that a new variable is added. We first look for conditions under which the current basis is still optimal. If these conditions are violated, we look for an algorithm that finds a new optimal solution without having to solve the new problem from scratch. We will see that the simplex method can be quite useful in this respect.
Having assumed that B is an optimal basis for the original problem, the following two conditions are satisfied:

B⁻¹b ≥ 0,    (feasibility)

c' - c_B'B⁻¹A ≥ 0'.    (optimality)

When the problem is changed, we check to see how these conditions are affected. By insisting that both conditions (feasibility and optimality) hold for the modified problem, we obtain the conditions under which the basis matrix B remains optimal for the modified problem. In what follows, we apply this approach to several examples.
A new variable is added

Suppose that we introduce a new variable x_{n+1}, together with a corresponding column A_{n+1}, and obtain the new problem

minimize c'x + c_{n+1}x_{n+1}
subject to Ax + A_{n+1}x_{n+1} = b
x ≥ 0, x_{n+1} ≥ 0.

We wish to determine whether the current basis B is still optimal.
We note that (x, x_{n+1}) = (x*, 0) is a basic feasible solution to the new problem associated with the basis B, and we only need to examine the optimality conditions. For the basis B to remain optimal, it is necessary and sufficient that the reduced cost of x_{n+1} be nonnegative, that is,

c̄_{n+1} = c_{n+1} - c_B'B⁻¹A_{n+1} ≥ 0.

If this condition is satisfied, (x*, 0) is an optimal solution to the new problem. If, however, c̄_{n+1} < 0, then (x*, 0) is not necessarily optimal. In order to find an optimal solution, we add a column to the simplex tableau, associated with the new variable, and apply the primal simplex algorithm starting from the current basis B. Typically, an optimal solution to the new problem is obtained with a small number of iterations, and this approach is usually much faster than solving the new problem from scratch.
Example 5.1 Consider the problem

minimize -5x1 - x2 + 12x3
subject to 3x1 + 2x2 + x3 = 10
5x1 + 3x2 + x4 = 16
x1,...,x4 ≥ 0.

An optimal solution to this problem is given by x = (2, 2, 0, 0) and the corresponding simplex tableau is given by

            x1    x2    x3    x4
    12  |    0     0     2     7
x1 = 2  |    1     0    -3     2
x2 = 2  |    0     1     5    -3

Note that B⁻¹ is given by the last two columns of the tableau.
Let us now introduce a variable x5 and consider the new problem

minimize -5x1 - x2 + 12x3 - x5
subject to 3x1 + 2x2 + x3 + x5 = 10
5x1 + 3x2 + x4 + x5 = 16
x1,...,x5 ≥ 0.

We have A5 = (1, 1) and

c̄5 = c5 - c_B'B⁻¹A5 = -1 - [-5 -1] [-3 2; 5 -3] [1; 1] = -4.

Since c̄5 is negative, introducing the new variable to the basis can be beneficial. We observe that B⁻¹A5 = (-1, 2) and augment the tableau by introducing a column associated with x5:

            x1    x2    x3    x4    x5
    12  |    0     0     2     7    -4
x1 = 2  |    1     0    -3     2    -1
x2 = 2  |    0     1     5    -3     2

We then bring x5 into the basis; x2 exits and we obtain the following tableau, which happens to be optimal:

            x1    x2    x3    x4    x5
    16  |    0     2    12     1     0
x1 = 3  |    1    0.5  -0.5   0.5    0
x5 = 1  |    0    0.5   2.5  -1.5    1

An optimal solution is given by x = (3, 0, 0, 0, 1).
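The computations in this example are easy to replicate in a few lines of plain Python (the data is hard-coded from the example; no LP library is involved):

```python
# Numerical check of Example 5.1: with basis {A1, A2}, compute B^-1,
# the dual vector p' = c_B' B^-1, and the reduced cost of the new
# column A5 = (1, 1) with cost c5 = -1.

B = [[3.0, 2.0],
     [5.0, 3.0]]                 # basic columns A1, A2
c_B = [-5.0, -1.0]

det = B[0][0] * B[1][1] - B[0][1] * B[1][0]     # = -1
B_inv = [[ B[1][1] / det, -B[0][1] / det],
         [-B[1][0] / det,  B[0][0] / det]]      # = [[-3, 2], [5, -3]]

p = [sum(c_B[i] * B_inv[i][j] for i in range(2)) for j in range(2)]
print(p)                         # → [10.0, -7.0]

A5, c5 = [1.0, 1.0], -1.0
c5_bar = c5 - sum(p[i] * A5[i] for i in range(2))
print(c5_bar)                    # → -4.0, so bringing x5 in is beneficial
```

The two printed quantities match the text: the reduced costs of the nonbasic columns are c̄3 = 12 - p'A3 = 2 and c̄4 = 0 - p'A4 = 7, and c̄5 = -4 < 0.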
A new inequality constraint is added

Let us now introduce a new constraint a_{m+1}'x ≥ b_{m+1}, where a_{m+1} and b_{m+1} are given. If the optimal solution x* to the original problem satisfies this constraint, then x* is an optimal solution to the new problem as well. If the new constraint is violated, we introduce a nonnegative slack variable x_{n+1}, and rewrite the new constraint in the form a_{m+1}'x - x_{n+1} = b_{m+1}. We obtain a problem in standard form, in which the matrix A is replaced by

[ A          0 ]
[ a_{m+1}'  -1 ].
Let B be an optimal basis for the original problem. We form a basis for the new problem by selecting the original basic variables together with x_{n+1}. The new basis matrix B̄ is of the form

B̄ = [ B    0 ]
    [ a'  -1 ],

where the row vector a' contains those components of a_{m+1} associated with the original basic columns. (The determinant of this matrix is the negative of the determinant of B, hence nonzero, and we therefore have a true basis matrix.) The basic solution associated with this basis is (x*, a_{m+1}'x* - b_{m+1}), and it is infeasible, because of our assumption that x* violates the new constraint. Note that the new inverse basis matrix is readily available, because

B̄⁻¹ = [ B⁻¹      0 ]
      [ a'B⁻¹   -1 ].

(To see this, note that the product B̄⁻¹B̄ is equal to the identity matrix.)
Let c_B be the m-dimensional vector with the costs of the basic variables in the original problem. Then, the vector of reduced costs associated with the basis B̄ for the new problem is given by

[c' 0] - [c_B' 0] [ B⁻¹     0 ] [ A          0 ]  =  [c' - c_B'B⁻¹A   0],
                  [ a'B⁻¹  -1 ] [ a_{m+1}'  -1 ]

and is nonnegative due to the optimality of B for the original problem. Hence, B̄ is a dual feasible basis and we are in a position to apply the dual simplex method to the new problem. Note that an initial simplex tableau for the new problem is readily constructed. For example, we have

B̄⁻¹ [ b       ]  =  [ B⁻¹b                ]
    [ b_{m+1} ]     [ a'B⁻¹b - b_{m+1}   ],

where B⁻¹b is available from the final simplex tableau for the original problem.
Example 5.2 Consider again the problem in Example 5.1:

minimize -5x1 - x2 + 12x3
subject to 3x1 + 2x2 + x3 = 10
5x1 + 3x2 + x4 = 16
x1,...,x4 ≥ 0,

and recall the optimal simplex tableau:

            x1    x2    x3    x4
    12  |    0     0     2     7
x1 = 2  |    1     0    -3     2
x2 = 2  |    0     1     5    -3
We introduce the additional constraint x1 + x2 ≥ 5, which is violated by the optimal solution x* = (2, 2, 0, 0). We have a_{m+1} = (1, 1, 0, 0), b_{m+1} = 5, and a_{m+1}'x* < b_{m+1}. We form the standard form problem

minimize -5x1 - x2 + 12x3
subject to 3x1 + 2x2 + x3 = 10
5x1 + 3x2 + x4 = 16
x1 + x2 - x5 = 5
x1,...,x5 ≥ 0.

Let a' consist of the components of a_{m+1} associated with the basic variables. We then have a' = (1, 1), and

a'B⁻¹A - a_{m+1}' = [1 1] [1 0 -3 2; 0 1 5 -3] - [1 1 0 0] = [0 0 2 -1].

The tableau for the new problem is of the form

             x1    x2    x3    x4    x5
    12   |    0     0     2     7     0
x1 = 2   |    1     0    -3     2     0
x2 = 2   |    0     1     5    -3     0
x5 = -1  |    0     0     2    -1     1

We now have all the information necessary to apply the dual simplex method to the new problem.
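The bookkeeping in this example is mechanical enough to check in plain Python: given B⁻¹ from the original optimal tableau and the basic components a = (1, 1) of the new constraint, build the enlarged inverse basis matrix [B⁻¹ 0; a'B⁻¹ -1] and recompute the basic solution. The numbers below are those of this example:

```python
# Build Bbar^-1 = [B^-1 0; a'B^-1 -1] for the slacked constraint
# x1 + x2 - x5 = 5 and recompute the basic variable values.

B_inv = [[-3.0, 2.0],
         [ 5.0, -3.0]]           # from the optimal tableau of Example 5.1
a = [1.0, 1.0]                   # basic components of the new constraint row

aB_inv = [sum(a[i] * B_inv[i][j] for i in range(2)) for j in range(2)]
Bbar_inv = [B_inv[0] + [0.0],
            B_inv[1] + [0.0],
            aB_inv + [-1.0]]

b = [10.0, 16.0, 5.0]            # right-hand side, including b_{m+1} = 5
basic = [sum(Bbar_inv[i][j] * b[j] for j in range(3)) for i in range(3)]
print(basic)                     # → [2.0, 2.0, -1.0]
```

The last component is x5 = -1 < 0, confirming that the new basis is primal infeasible (but dual feasible), so the dual simplex method can take over.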
Our discussion has been focused on the case where an inequality constraint is added to the primal problem. Suppose now that we introduce a new constraint p'A_{n+1} ≤ c_{n+1} in the dual. This is equivalent to introducing a new variable in the primal, and we are back to the case that was considered in the preceding subsection.

A new equality constraint is added

We now consider the case where the new constraint is of the form a_{m+1}'x = b_{m+1}, and we assume that this new constraint is violated by the optimal solution x* to the original problem. The dual of the new problem is

maximize p'b + p_{m+1}b_{m+1}
subject to [p' p_{m+1}] [ A        ]  ≤  c',
                        [ a_{m+1}' ]

where p_{m+1} is a dual variable associated with the new constraint. Let p* be an optimal basic feasible solution to the original dual problem. Then, (p*, 0) is a feasible solution to the new dual problem.
Let m be the dimension of p*, which is the same as the original number of constraints. Since p* is a basic feasible solution to the original dual problem, m of the constraints in p'A ≤ c' are linearly independent and active. However, there is no guarantee that at (p*, 0) we will have m+1 linearly independent active constraints of the new dual problem. In particular, (p*, 0) need not be a basic feasible solution to the new dual problem and may not provide a convenient starting point for the dual simplex method on the new problem. While it may be possible to obtain a dual basic feasible solution by setting p_{m+1} to a suitably chosen nonzero value, we present here an alternative approach.
Let us assume, without loss of generality, that a_{m+1}'x* > b_{m+1}. We introduce the auxiliary primal problem

minimize c'x + Mx_{n+1}
subject to Ax = b
a_{m+1}'x - x_{n+1} = b_{m+1}
x ≥ 0, x_{n+1} ≥ 0,

where M is a large positive constant. A primal feasible basis for the auxiliary problem is obtained by picking the basic variables of the optimal solution to the original problem, together with the variable x_{n+1}. The resulting basis matrix is the same as the matrix B̄ of the preceding subsection. There is a difference, however. In the preceding subsection, B̄ was a dual feasible basis, whereas here B̄ is a primal feasible basis. For this reason, the primal simplex method can now be used to solve the auxiliary problem to optimality.
Suppose that an optimal solution to the auxiliary problem satisfies x_{n+1} = 0; this will be the case if the new problem is feasible and the coefficient M is large enough. Then, the additional constraint a_{m+1}'x = b_{m+1} has been satisfied and we have an optimal solution to the new problem.
Changes in the requirement vector b

Suppose that some component b_i of the requirement vector b is changed to b_i + δ. Equivalently, the vector b is changed to b + δe_i, where e_i is the ith unit vector. We wish to determine the range of values of δ under which the current basis remains optimal. Note that the optimality conditions are not affected by the change in b. We therefore need to examine only the feasibility condition

B⁻¹(b + δe_i) ≥ 0.    (5.1)

Let g = (β_{1i}, β_{2i},..., β_{mi}) be the ith column of B⁻¹. Equation (5.1) becomes

x_B + δg ≥ 0,

or,

x_{B(j)} + δβ_{ji} ≥ 0,    j = 1,...,m.

Equivalently,

max_{{j | β_{ji} > 0}} (-x_{B(j)}/β_{ji})  ≤  δ  ≤  min_{{j | β_{ji} < 0}} (-x_{B(j)}/β_{ji}).

For δ in this range, the optimal cost, as a function of δ, is given by

c_B'B⁻¹(b + δe_i) = p'b + δp_i,

where p' = c_B'B⁻¹ is the (optimal) dual solution associated with the current basis B.
If δ is outside the allowed range, the current solution satisfies the optimality (or dual feasibility) conditions, but is primal infeasible. In that case, we can apply the dual simplex algorithm starting from the current basis.
Example 5.3 Consider the optimal tableau

            x1    x2    x3    x4
    12  |    0     0     2     7
x1 = 2  |    1     0    -3     2
x2 = 2  |    0     1     5    -3

from Example 5.1.
Let us contemplate adding δ to b1. We look at the first column of B⁻¹, which is (-3, 5). The basic variables under the same basis are x1 = 2 - 3δ and x2 = 2 + 5δ. This basis will remain feasible as long as 2 - 3δ ≥ 0 and 2 + 5δ ≥ 0, that is, if -2/5 ≤ δ ≤ 2/3. The rate of change of the optimal cost per unit change of δ is given by c_B'B⁻¹e1 = (-5, -1)·(-3, 5) = 10.
If δ is increased beyond 2/3, then x1 becomes negative. At this point, we can perform an iteration of the dual simplex method to remove x1 from the basis, and x3 enters the basis.
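The interval just computed is an instance of a generic ratio test. The sketch below is not from the book: the helper function implements the max/min formula for the allowed range of δ, and x_B and the column of B⁻¹ are copied from the example:

```python
# Right-hand side ranging: given the basic variable values x_B and the
# relevant column g of B^-1, find the interval of delta over which
# x_B + delta * g stays nonnegative.

def rhs_range(x_B, g):
    lo = max((-x / q for x, q in zip(x_B, g) if q > 0), default=float("-inf"))
    hi = min((-x / q for x, q in zip(x_B, g) if q < 0), default=float("inf"))
    return lo, hi

x_B = [2.0, 2.0]        # x1, x2 in Example 5.3
g = [-3.0, 5.0]         # first column of B^-1

lo, hi = rhs_range(x_B, g)
print(lo, hi)           # → -0.4 0.666..., i.e., -2/5 <= delta <= 2/3
```

Entries of g that are zero impose no bound on δ, which is why they are skipped in both generator expressions.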
Changes in the cost vector c

Suppose now that some cost coefficient c_j becomes c_j + δ. The primal feasibility condition is not affected. We therefore need to focus on the optimality condition

c_B'B⁻¹A ≤ c'.

If c_j is the cost coefficient of a nonbasic variable x_j, then c_B does not change, and the only inequality that is affected is the one for the reduced cost of x_j; we need

c_B'B⁻¹A_j ≤ c_j + δ,

or

δ ≥ -c̄_j.

If this condition holds, the current basis remains optimal; otherwise, we can apply the primal simplex method starting from the current basic feasible solution.
If c_j is the cost coefficient of the ℓth basic variable, that is, if j = B(ℓ), then c_B becomes c_B + δe_ℓ, and all of the optimality conditions will be affected. The optimality conditions for the new problem are

(c_B + δe_ℓ)'B⁻¹A_i ≤ c_i,    for all i ≠ j.

(Since x_j is a basic variable, its reduced cost stays at zero and need not be examined.) Equivalently,

δq_{ℓi} ≤ c̄_i,    for all i ≠ j,

where q_{ℓi} is the ℓth entry of B⁻¹A_i, which can be obtained from the simplex tableau. These inequalities determine the range of δ for which the same basis remains optimal.
Example 5.4 We consider once more the problem in Example 5.1 and determine the range of changes δ_j of c_j, under which the same basis remains optimal.
Since x3 and x4 are nonbasic variables, we obtain the conditions

δ_3 ≥ -c̄_3 = -2,
δ_4 ≥ -c̄_4 = -7.

Consider now adding δ_1 to c_1. From the simplex tableau, we obtain q_{12} = 0, q_{13} = -3, q_{14} = 2, and we are led to the conditions

-2/3 ≤ δ_1 ≤ 7/2.
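The same conditions can be evaluated mechanically. In the sketch below, the tableau row and reduced costs are copied from Example 5.1, and the loop implements the inequalities δq_{ℓi} ≤ c̄_i for the nonbasic indices (the variable names are mine, not the book's):

```python
# Cost ranging for the basic variable x1 (row l = 1 of the tableau):
# the basis stays optimal while delta * q[i] <= c_bar[i] for every
# nonbasic index i.  Entries with q[i] = 0 impose no bound.

q_row = {3: -3.0, 4: 2.0}      # q_{13}, q_{14} from the tableau
c_bar = {3: 2.0, 4: 7.0}       # reduced costs of x3, x4

lo, hi = float("-inf"), float("inf")
for i in q_row:
    if q_row[i] < 0:
        lo = max(lo, c_bar[i] / q_row[i])
    elif q_row[i] > 0:
        hi = min(hi, c_bar[i] / q_row[i])

print(lo, hi)                   # → -0.666... 3.5, i.e., -2/3 <= delta_1 <= 7/2
```

Dividing the inequality δq ≤ c̄ by a negative q flips its direction, which is why negative entries contribute lower bounds and positive entries upper bounds.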
Changes in a nonbasic column of A

Suppose that some entry a_ij in the jth column A_j of the matrix A is changed to a_ij + δ. We wish to determine the range of values of δ for which the old optimal basis remains optimal.
If the column A_j is nonbasic, the basis matrix B does not change, and the primal feasibility condition is unaffected. Furthermore, only the reduced cost of the jth column is affected, leading to the condition

c_j - p'(A_j + δe_i) ≥ 0,

or,

c̄_j - δp_i ≥ 0,

where p' = c_B'B⁻¹. If this condition is violated, the nonbasic column A_j can be brought into the basis, and we can continue with the primal simplex method.
Changes in a basic column of A

If one of the entries of a basic column A_j changes, then both the feasibility and optimality conditions are affected. This case is more complicated and we leave the full development for the exercises. As it turns out, the range of values of δ for which the same basis is optimal is again an interval (Exercise 5.3).
Suppose that the basic column A_j is changed to A_j + δe_i, where e_i is the ith unit vector. Assume that both the original problem and its dual have unique and nondegenerate optimal solutions x* and p, respectively. Let x*(δ) be an optimal solution to the modified problem, as a function of δ. It can be shown (Exercise 5.2) that for small δ we have

c'x*(δ) = c'x* - δp_i x*_j + O(δ²).

For an intuitive interpretation of this equation, let us consider the diet problem and recall that a_ij corresponds to the amount of the ith nutrient in the jth food. Given an optimal solution x* to the original problem, an increase of a_ij by δ means that we are getting "for free" an additional amount δx*_j of the ith nutrient. Since the dual variable p_i is the marginal cost per unit of the ith nutrient, we are getting for free something that is normally worth δp_i x*_j, and this allows us to reduce our costs by that same amount.
Production planning revisited

In Section 1.2, we introduced a production planning problem that DEC had faced in the end of 1988. In this section, we answer some of the questions that we posed. Recall that there were two important choices: whether to use the constrained or the unconstrained mode of production for disk drives, and whether to use alternative memory boards. As discussed in Section 1.2, these four combinations of choices led to four different linear programming problems. We report the solution to these problems, as obtained from a linear programming package, in Table 5.1.
Table 5.1 indicates that revenues can substantially increase by using
alternative memory boards, and the company should definitely do so. The
decision of whether to use the constrained or the unconstrained mode of
production for disk drives is less clear. In the constrained mode, the revenue
is 248 million versus 213 million in the unconstrained mode. However,
customer satisfaction and, therefore, future revenues might be affected,
since in the constrained mode some customers will get a product different
than the desired one. Moreover, these results are obtained assuming that
the numbers of available 256K memory boards and disk drives were 8,000
and 3,000, respectively, which are the lowest values in the ranges that were
estimated. We should therefore examine the sensitivity of the solution as
the number of available 256K memory boards and disk drives increases.
Sec. 5.1 Local sensitivity analysis

Alt. boards   Mode        Revenue   x1      x2      x3    x4    x5
no            constr.     145       0       2.5     0     0.5   2
yes           constr.     248       1.8     2       0     1     2
no            unconstr.   133       0.272   1.304   0.3   0.5   2.7
yes           unconstr.   213       1.8     1.035   0.3   0.5   2.7

Table 5.1: Optimal solutions to the four variants of the production
planning problem. Revenue is in millions of dollars and the
quantities x_i are in thousands.
With most linear programming packages, the output includes the values
of the dual variables, as well as the range of parameter variations under
which local sensitivity analysis is valid. Table 5.2 presents the values of
the dual variables associated with the constraints on available disk drives
and 256K memory boards. In addition, it provides the range of allowed
changes on the number of disk drives and memory boards that would leave
the dual variables unchanged. This information is provided for the two
linear programming problems corresponding to the constrained and unconstrained
modes of production for disk drives, respectively, under the assumption that
alternative memory boards will be used.
Mode                            Constrained    Unconstrained
Revenue                         248            213
Dual variable for 256K boards   15             0
Range for 256K boards           [−1.5, 0.2]    [−1.62, ∞)
Dual variable for disk drives   0              23.52
Range for disk drives           [−0.2, 0.75]   [−0.91, 1.13]

Table 5.2: Dual prices and ranges for the constraints corresponding
to the availability of the number of 256K memory boards and
disk drives.
In the constrained mode, increasing the number of available 256K
boards by 0.2 thousand (the largest number in the allowed range) results
in a revenue increase of 15 × 0.2 = 3 million. In the unconstrained mode,
increasing the number of available 256K boards has no effect on revenues,
because the dual variable is zero and the range extends upwards to infinity.
In the constrained mode, increasing the number of available disk drives by
up to 0.75 thousand (the largest number in the allowed range) has no effect
on revenue. Finally, in the unconstrained mode, increasing the number
of available disk drives by 1.13 thousand results in a revenue increase of
23.52 × 1.13 ≈ 26.57 million.

In conclusion, in the constrained mode of production, it is important
to aim at an increase of the number of available 256K memory boards,
while in the unconstrained mode, increasing the number of disk drives is
more important.
This example demonstrates that even a small linear programming
problem (with five variables, in this case) can have an impact on a company's
planning process. Moreover, the information provided by linear programming
solvers (dual variables, ranges, etc.) can offer significant insights
and can be a very useful aid to decision makers.
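The kind of solver output discussed above can be reproduced with any modern LP package. The sketch below uses scipy's linprog with the HiGHS backend (an assumption: scipy is available) on a small made-up two-resource model, not the actual DEC problem, to show how the reported dual variables predict the effect of a right-hand side change, just as in the 15 × 0.2 = 3 computation above.

```python
import numpy as np
from scipy.optimize import linprog

# A made-up toy model: maximize 5 x1 + 4 x2 subject to two resource limits.
# linprog minimizes, so we negate the objective.
c = [-5.0, -4.0]
A_ub = [[6.0, 4.0],   # resource 1: 6 x1 + 4 x2 <= 24
        [1.0, 2.0]]   # resource 2:   x1 + 2 x2 <= 6
b_ub = [24.0, 6.0]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, method="highs")
# marginals = d(min objective)/d(b); negate to get duals of the max problem
duals = -res.ineqlin.marginals
print(res.x, -res.fun, duals)   # optimum (3, 1.5), value 21, duals (0.75, 0.5)

# Increasing resource 1 by 0.5 should raise the optimal value by 0.75 * 0.5.
res2 = linprog(c, A_ub=A_ub, b_ub=[24.5, 6.0], method="highs")
print(-res2.fun)                # 21.375 = 21 + 0.75 * 0.5
```

The change 24 → 24.5 stays within the range where the same basis is optimal, so the dual price applies exactly.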
5.2 Global dependence on the right-hand side vector
In this section, we take a global view of the dependence of the optimal cost
on the requirement vector b. Let

P(b) = {x | Ax = b, x ≥ 0}

be the feasible set, and note that our notation makes the dependence on b
explicit. Let

S = {b | P(b) is nonempty},

and observe that

S = {Ax | x ≥ 0};

in particular, S is a convex set. For any b ∈ S, we define

F(b) = min_{x ∈ P(b)} c'x,

which is the optimal cost as a function of b.
Throughout this section, we assume that the dual feasible set {p |
p'A ≤ c'} is nonempty. Then, duality theory implies that the optimal
primal cost F(b) is finite for every b ∈ S. Our goal is to understand the
structure of the function F(b) for b ∈ S.
Let us fix a particular element b* of S. Suppose that there exists a
nondegenerate primal optimal basic feasible solution, and let B be the
corresponding optimal basis matrix. The vector x_B of basic variables at that
optimal solution is given by x_B = B⁻¹b*, and is positive by nondegeneracy.
In addition, the vector of reduced costs is nonnegative. If we change b* to b,
and if the difference b − b* is sufficiently small, B⁻¹b remains positive and
we still have a basic feasible solution. The reduced costs are not affected
by the change from b* to b and remain nonnegative. Therefore, B is an
optimal basis for the new problem as well. The optimal cost F(b) for the
new problem is given by

F(b) = c_B'B⁻¹b = p'b,    for b close to b*,

where p' = c_B'B⁻¹ is the optimal solution to the dual problem. This
establishes that, in the vicinity of b*, F(b) is a linear function of b and its
gradient is given by p.
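This local linearity is easy to observe numerically. The following sketch (an arbitrary small standard form problem of my choosing; scipy is assumed available) reads the dual vector p from the solver and checks that a small change in b moves the optimal cost by exactly p'(b − b*).

```python
import numpy as np
from scipy.optimize import linprog

# Standard form: minimize c'x subject to Ax = b, x >= 0.
c = [1.0, 2.0, 3.0]
A_eq = [[1.0, 1.0, 0.0],
        [0.0, 1.0, 1.0]]
b_star = np.array([2.0, 1.0])

res = linprog(c, A_eq=A_eq, b_eq=b_star, method="highs")
p = res.eqlin.marginals            # dual vector = gradient of F at b*
print(res.fun, p)                  # F(b*) = 3, p = (1, 1)

delta = np.array([0.1, -0.05])     # a small change of b
res2 = linprog(c, A_eq=A_eq, b_eq=b_star + delta, method="highs")
# Both quantities below equal p'(b - b*) = 0.05: F is locally linear.
print(res2.fun - res.fun, p @ delta)
```

The perturbation is small enough that the same basis remains optimal, which is exactly the situation described in the text.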
We now turn to the global properties of F(b).

Theorem 5.1 The optimal cost F(b) is a convex function of b on
the set S.
Proof. Let b¹ and b² be two elements of S. For i = 1, 2, let xⁱ be an
optimal solution to the problem of minimizing c'x subject to x ≥ 0 and
Ax = bⁱ. Thus, F(b¹) = c'x¹ and F(b²) = c'x². Fix a scalar λ ∈ [0, 1],
and note that the vector y = λx¹ + (1 − λ)x² is nonnegative and satisfies
Ay = λb¹ + (1 − λ)b². In particular, y is a feasible solution to the linear
programming problem obtained when the requirement vector b is set to
λb¹ + (1 − λ)b². Therefore,

F(λb¹ + (1 − λ)b²) ≤ c'y = λc'x¹ + (1 − λ)c'x² = λF(b¹) + (1 − λ)F(b²),

establishing the convexity of F. ∎
We now corroborate Theorem 5.1 by taking a different approach,
involving the dual problem

maximize p'b
subject to p'A ≤ c',

which has been assumed feasible. For any b ∈ S, F(b) is finite and, by
strong duality, is equal to the optimal value of the dual objective. Let
p¹, . . . , pᴺ be the extreme points of the dual feasible set. (Our standing
assumption is that the matrix A has linearly independent rows; hence, its
columns span ℝᵐ, and Theorem 2.6 in Section 2.5 implies that the dual
feasible set must have at least one extreme point.) Since the optimum of
the dual must be attained at an extreme point, we obtain

F(b) = max_{i=1,...,N} (pⁱ)'b,    b ∈ S.    (5.2)

Figure 5.1: The optimal cost when the vector b is a function
of a scalar parameter. Each linear piece is of the form (pⁱ)'(b* +
θd), where pⁱ is the ith extreme point of the dual feasible set.
In each one of the intervals θ < θ₁, θ₁ < θ < θ₂, and θ > θ₂,
we have different dual optimal solutions, namely, p¹, p², and p³,
respectively. For θ = θ₁ or θ = θ₂, the dual problem has multiple
optimal solutions.

In particular, F is equal to the maximum of a finite collection of linear
functions. It is therefore a piecewise linear convex function, and we have a
new proof of Theorem 5.1. In addition, within a region where F is linear,
we have F(b) = (pⁱ)'b, where pⁱ is a corresponding dual optimal solution,
in agreement with our earlier discussion.
For those values of b for which F is not differentiable, that is, at the
junction of two or more linear pieces, the dual problem does not have a
unique optimal solution, and this implies that every optimal basic feasible
solution to the primal is degenerate. (This is because, as shown earlier in
this section, the existence of a nondegenerate optimal basic feasible solution
to the primal implies that F is locally linear.)
We now restrict attention to changes in b of a particular type, namely,
b = b* + θd, where b* and d are fixed vectors and θ is a scalar. Let
f(θ) = F(b* + θd) be the optimal cost as a function of the scalar parameter
θ. Using Eq. (5.2), we obtain

f(θ) = max_{i=1,...,N} (pⁱ)'(b* + θd),    b* + θd ∈ S.
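The piecewise linear convex shape of this "section" of F can be checked numerically by brute force (a sketch on an arbitrary small example of my choosing; scipy assumed): sample f(θ) on an evenly spaced grid and verify the midpoint inequality for convexity.

```python
import numpy as np
from scipy.optimize import linprog

c = [1.0, 2.0, 3.0]
A_eq = [[1.0, 1.0, 0.0],
        [0.0, 1.0, 1.0]]
b_star = np.array([2.0, 1.0])
d = np.array([1.0, -1.0])

def f(theta):
    # f(theta) = F(b* + theta d), computed by re-solving the primal
    res = linprog(c, A_eq=A_eq, b_eq=b_star + theta * d, method="highs")
    return res.fun   # finite for theta in [-2, 1] in this example

thetas = np.linspace(-2.0, 1.0, 13)
vals = [f(t) for t in thetas]
# Midpoint convexity on the evenly spaced grid:
for i in range(1, len(vals) - 1):
    assert vals[i] <= 0.5 * (vals[i - 1] + vals[i + 1]) + 1e-9
print(vals[0], f(-1.0), f(0.0))   # 9.0, 5.0, 3.0
```

For this data, f has exactly one breakpoint (at θ = −1/2, where the pieces 1 − 4θ and the constant 3 meet), matching the picture in Figure 5.1.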
Figure 5.2: Illustration of subgradients of a function F at a
point b*. A subgradient p is the gradient of a linear function
F(b*) + p'(b − b*) that lies below the function F(b) and agrees
with it for b = b*.
This is essentially a "section" of the function F; it is again a piecewise linear
convex function; see Figure 5.1. Once more, at breakpoints of this function,
every optimal basic feasible solution to the primal must be degenerate.
5.3 The set of all dual optimal solutions*
We have seen that if the function F is defined, finite, and linear in the
vicinity of a certain vector b*, then there is a unique optimal dual solution,
equal to the gradient of F at that point, which leads to the interpretation
of dual optimal solutions as marginal costs. We would like to extend this
interpretation so that it remains valid at the breakpoints of F. This is
indeed possible: we will show shortly that any dual optimal solution can
be viewed as a "generalized gradient" of F. We first need the following
definition, which is illustrated in Figure 5.2.
Definition 5.1 Let F be a convex function defined on a convex set S.
Let b* be an element of S. We say that a vector p is a subgradient
of F at b* if

F(b*) + p'(b − b*) ≤ F(b),    for all b ∈ S.

Note that if b* is a breakpoint of the function F, then there are
several subgradients. On the other hand, if F is linear near b*, there is a
unique subgradient, equal to the gradient of F.
Theorem 5.2 Suppose that the linear programming problem of
minimizing c'x subject to Ax = b* and x ≥ 0 is feasible and that the
optimal cost is finite. Then, a vector p is an optimal solution to the
dual problem if and only if it is a subgradient of the optimal cost
function F at the point b*.
Proof. Recall that the function F is defined on the set S, which is the
set of vectors b for which the set P(b) of feasible solutions to the primal
problem is nonempty. Suppose that p is an optimal solution to the dual
problem. Then, strong duality implies that p'b* = F(b*). Consider now
some arbitrary b ∈ S. For any feasible solution x ∈ P(b), weak duality
yields p'b ≤ c'x. Taking the minimum over all x ∈ P(b), we obtain
p'b ≤ F(b). Hence, p'b − p'b* ≤ F(b) − F(b*), and we conclude that p
is a subgradient of F at b*.

We now prove the converse. Let p be a subgradient of F at b*; that
is,

F(b*) + p'(b − b*) ≤ F(b),    ∀ b ∈ S.    (5.3)

Pick some x ≥ 0, let b = Ax, and note that x ∈ P(b). In particular,
F(b) ≤ c'x. Using Eq. (5.3), we obtain

p'Ax = p'b ≤ F(b) − F(b*) + p'b* ≤ c'x − F(b*) + p'b*.

Since this is true for all x ≥ 0, we must have p'A ≤ c', which shows that p
is a dual feasible solution. Also, by letting x = 0, we obtain F(b*) ≤ p'b*.
Using weak duality, every dual feasible solution q must satisfy q'b* ≤
F(b*) ≤ p'b*, which shows that p is a dual optimal solution. ∎
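Theorem 5.2 can also be checked by sampling (a sketch on an arbitrary small example; scipy assumed available): the dual optimum p obtained at b* should satisfy the subgradient inequality (5.3) at every sampled right-hand side b.

```python
import numpy as np
from scipy.optimize import linprog

c = [1.0, 2.0, 3.0]
A_eq = [[1.0, 1.0, 0.0],
        [0.0, 1.0, 1.0]]
b_star = np.array([2.0, 1.0])

def F(b):
    res = linprog(c, A_eq=A_eq, b_eq=b, method="highs")
    return res.fun

base = F(b_star)                                   # F(b*) = 3
res = linprog(c, A_eq=A_eq, b_eq=b_star, method="highs")
p = res.eqlin.marginals                            # a dual optimal solution

rng = np.random.default_rng(0)
for _ in range(50):
    b = rng.uniform(0.5, 4.0, size=2)              # b > 0 keeps this problem feasible
    # Subgradient inequality (5.3): F(b*) + p'(b - b*) <= F(b)
    assert base + p @ (b - b_star) <= F(b) + 1e-9
print("subgradient inequality holds at all samples")
```

Since b* here admits a nondegenerate optimal basic feasible solution, p is the unique subgradient, i.e., the gradient of F at b*.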
5.4 Global dependence on the cost vector
In the last two sections, we fixed the matrix A and the vector c, and we
considered the effect of changing the vector b. The key to our development
was the fact that the set of dual feasible solutions remains the same as b
varies. In this section, we study the case where A and b are fixed, but the
vector c varies. In this case, the primal feasible set remains unaffected; our
standing assumption will be that it is nonempty.
We define the dual feasible set

Q(c) = {p | p'A ≤ c'},

and let

T = {c | Q(c) is nonempty}.

If c¹ ∈ T and c² ∈ T, then there exist p¹ and p² such that (p¹)'A ≤ (c¹)'
and (p²)'A ≤ (c²)'. For any scalar λ ∈ [0, 1], we have

(λp¹ + (1 − λ)p²)'A ≤ (λc¹ + (1 − λ)c²)',

and this establishes that λc¹ + (1 − λ)c² ∈ T. We have therefore shown
that T is a convex set.
If c ∉ T, the infeasibility of the dual problem implies that the optimal
primal cost is −∞. On the other hand, if c ∈ T, the optimal primal cost
must be finite. Thus, the optimal primal cost, which we will denote by
G(c), is finite if and only if c ∈ T.
Let x¹, x², . . . , xᴺ be the basic feasible solutions in the primal feasible
set; clearly, these do not depend on c. Since an optimal solution to a
standard form problem can always be found at an extreme point, we have

G(c) = min_{i=1,...,N} c'xⁱ.

Thus, G(c) is the minimum of a finite collection of linear functions and is
a piecewise linear concave function. If for some value c* of c, the primal
has a unique optimal solution xⁱ, we have (c*)'xⁱ < (c*)'xʲ for all j ≠ i.
For c very close to c*, the inequalities c'xⁱ < c'xʲ, j ≠ i, continue to hold,
implying that xⁱ is still the unique primal optimal solution, with cost c'xⁱ.
We conclude that, locally, G(c) = c'xⁱ. On the other hand, at those values
of c that lead to multiple primal optimal solutions, the function G has a
breakpoint.
We summarize the main points of the preceding discussion.
Theorem 5.3 Consider a feasible linear programming problem in
standard form.
(a) The set T of all c for which the optimal cost is finite, is convex.
(b) The optimal cost G(c) is a concave function of c on the set T.
(c) If for some value of c, the primal problem has a unique optimal
solution x*, then G is linear in the vicinity of c and its gradient
is equal to x*.
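The finite-minimum formula for G(c) can be made concrete by brute force (a sketch on a tiny standard form problem of my choosing; numpy and scipy assumed): enumerate the basic feasible solutions once, take G(c) as the minimum of c'xⁱ over them, and compare with a direct solve.

```python
import itertools
import numpy as np
from scipy.optimize import linprog

A = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0]])
b = np.array([2.0, 1.0])
m, n = A.shape

# Enumerate basic feasible solutions: each nonsingular m-column basis B
# with B^{-1} b >= 0 yields one.
bfs = []
for cols in itertools.combinations(range(n), m):
    B = A[:, cols]
    if abs(np.linalg.det(B)) < 1e-12:
        continue
    xB = np.linalg.solve(B, b)
    if np.all(xB >= -1e-12):
        x = np.zeros(n)
        x[list(cols)] = xB
        bfs.append(x)

def G(c):
    return min(np.dot(c, x) for x in bfs)   # G(c) = min_i c'x^i

c = np.array([1.0, 2.0, 3.0])
print(len(bfs), G(c))                                    # 2 BFS; G(c) = 3.0
print(linprog(c, A_eq=A, b_eq=b, method="highs").fun)    # agrees: 3.0
```

Because G is a minimum of linear functions of c, checking concavity reduces to checking the midpoint inequality for any pair of cost vectors in T.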
5.5 Parametric programming

Let us fix A, b, c, and a vector d of the same dimension as c. For any
scalar θ, we consider the problem

minimize (c + θd)'x
subject to Ax = b
x ≥ 0,

and let g(θ) be the optimal cost as a function of θ. Naturally, we assume
that the feasible set is nonempty. For those values of θ for which the optimal
cost is finite, we have

g(θ) = min_{i=1,...,N} (c + θd)'xⁱ,

where x¹, . . . , xᴺ are the extreme points of the feasible set; see Figure 5.3.
In particular, g(θ) is a piecewise linear and concave function of the parameter
θ. In this section, we discuss a systematic procedure, based on the
simplex method, for obtaining g(θ) for all values of θ. We start with an
example.
Figure 5.3: The optimal cost g(θ) as a function of θ. On each of
the intervals shown, a different extreme point (x¹, x², x³, or x⁴)
is optimal.
Example 5.5 Consider the problem

minimize (−3 + 2θ)x₁ + (3 − θ)x₂ + x₃
subject to x₁ + 2x₂ − 3x₃ ≤ 5
2x₁ + x₂ − 4x₃ ≤ 7
x₁, x₂, x₃ ≥ 0.

We introduce slack variables in order to bring the problem into standard form,
and then let the slack variables be the basic variables. This determines a basic
feasible solution and leads to the following tableau.

              x₁         x₂       x₃    x₄   x₅
    0     | −3 + 2θ    3 − θ      1     0    0
 x₄ = 5   |    1         2       −3     1    0
 x₅ = 7   |    2         1       −4     0    1

If −3 + 2θ ≥ 0 and 3 − θ ≥ 0, all reduced costs are nonnegative and we
have an optimal basic feasible solution. In particular,

g(θ) = 0,    if 3/2 ≤ θ ≤ 3.
If θ is increased slightly above 3, the reduced cost of x₂ becomes negative
and we no longer have an optimal basic feasible solution. We let x₂ enter the
basis, x₄ exits, and we obtain the new tableau:

                  x₁           x₂      x₃           x₄           x₅
 −7.5 + 2.5θ | −4.5 + 2.5θ     0    5.5 − 1.5θ   −1.5 + 0.5θ     0
  x₂ = 2.5   |     0.5         1      −1.5           0.5         0
  x₅ = 4.5   |     1.5         0      −2.5          −0.5         1

We note that all reduced costs are nonnegative if and only if 3 ≤ θ ≤ 5.5/1.5.
The optimal cost for that range of values of θ is

g(θ) = 7.5 − 2.5θ,    if 3 ≤ θ ≤ 5.5/1.5.

If θ is increased beyond 5.5/1.5, the reduced cost of x₃ becomes negative. If we
attempt to bring x₃ into the basis, we cannot find a positive pivot element in the
third column of the tableau, and the problem is unbounded, with g(θ) = −∞.
Let us now go back to the original tableau and suppose that θ is decreased
to a value slightly below 3/2. Then, the reduced cost of x₁ becomes negative, we
let x₁ enter the basis, and x₅ exits. The new tableau is:

                x₁      x₂         x₃        x₄     x₅
 10.5 − 7θ  |   0    4.5 − 2θ   −5 + 4θ      0    1.5 − θ
  x₄ = 1.5  |   0      1.5        −1         1     −0.5
  x₁ = 3.5  |   1      0.5        −2         0      0.5

We note that all of the reduced costs are nonnegative if and only if 5/4 ≤ θ ≤ 3/2.
For these values of θ, we have an optimal solution, with an optimal cost of

g(θ) = −10.5 + 7θ,    if 5/4 ≤ θ ≤ 3/2.

Finally, for θ < 5/4, the reduced cost of x₃ is negative, but the optimal cost is
equal to −∞, because all entries in the third column of the tableau are negative.
We plot the optimal cost in Figure 5.4.
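The three pieces of g(θ) derived in the example can be cross-checked by re-solving the problem for sample values of θ (a sketch; scipy's linprog with the HiGHS backend is assumed, where status 3 signals an unbounded problem):

```python
import numpy as np
from scipy.optimize import linprog

A_ub = [[1.0, 2.0, -3.0],   #   x1 + 2 x2 - 3 x3 <= 5
        [2.0, 1.0, -4.0]]   # 2 x1 +   x2 - 4 x3 <= 7
b_ub = [5.0, 7.0]

def g(theta):
    c = [-3.0 + 2.0 * theta, 3.0 - theta, 1.0]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, method="highs")
    return res.fun if res.status == 0 else -np.inf   # status 3: unbounded

print(g(2.0))    # 0.0    (piece g = 0 on [3/2, 3])
print(g(3.5))    # -1.25  (piece g = 7.5 - 2.5 theta)
print(g(1.3))    # about -1.4 (piece g = -10.5 + 7 theta)
print(g(1.0))    # -inf   (theta < 5/4: unbounded)
```

Each sampled value lands on the corresponding linear piece, and θ outside [5/4, 11/3] is reported unbounded, matching Figure 5.4.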
We now generalize the steps in the preceding example, in order to
obtain a broader methodology. The key observation is that once a basis
is fixed, the reduced costs are affine (linear plus a constant) functions of θ.
Then, if we require that all reduced costs be nonnegative, we force θ to
belong to some interval. (The interval could be empty, but if it is nonempty,
its endpoints are also included.) We conclude that for any given basis, the
set of θ for which this basis is optimal is a closed interval.
Figure 5.4: The optimal cost g(θ) as a function of θ, in Example
5.5. For θ outside the interval [5/4, 11/3], g(θ) is equal to −∞.
Let us now assume that we have chosen a basic feasible solution and
an associated basis matrix B, and suppose that this basis is optimal for θ
satisfying θ₁ ≤ θ ≤ θ₂. Let x_j be a variable whose reduced cost becomes
negative for θ > θ₂. Since this reduced cost is nonnegative for θ₁ ≤ θ ≤ θ₂,
it must be equal to zero when θ = θ₂. We now attempt to bring x_j into
the basis and consider separately the different cases that may arise.

Suppose that no entry of the jth column B⁻¹A_j of the simplex
tableau is positive. For θ > θ₂, the reduced cost of x_j is negative, and
this implies that the optimal cost is −∞ in that range.

If the jth column of the tableau has at least one positive element, we
carry out a change of basis and obtain a new basis matrix B̄. For θ = θ₂,
the reduced cost of the entering variable is zero and, therefore, the cost
associated with the new basis is the same as the cost associated with the
old basis. Since the old basis was optimal for θ = θ₂, the same must be
true for the new basis. On the other hand, for θ < θ₂, the entering variable
x_j had a positive reduced cost. According to the pivoting mechanics, and
for θ < θ₂, a negative multiple of the pivot row is added to the zeroth row,
and this makes the reduced cost of the exiting variable negative. This
implies that the new basis cannot be optimal for θ < θ₂. We conclude that
the range of values of θ for which the new basis is optimal is of the form
θ₂ ≤ θ ≤ θ₃, for some θ₃. By continuing similarly, we obtain a sequence of
bases, with the ith basis being optimal for θᵢ ≤ θ ≤ θᵢ₊₁.

Note that a basis which is optimal for θ ∈ [θᵢ, θᵢ₊₁] cannot be optimal
for values of θ greater than θᵢ₊₁. Thus, if θᵢ₊₁ > θᵢ for all i, the same basis
cannot be encountered more than once and the entire range of values of θ
will be traced in a finite number of iterations, with each iteration leading
to a new breakpoint of the optimal cost function g(θ). (The number of
breakpoints may increase exponentially with the dimension of the problem.)
The situation is more complicated if for some basis we have θᵢ = θᵢ₊₁.
In this case, it is possible that the algorithm keeps cycling between a finite
number of different bases, all of which are optimal only for θ = θᵢ = θᵢ₊₁.
Such cycling can only happen in the presence of degeneracy in the primal
problem (Exercise 5.17), but can be avoided if an appropriate anticycling
rule is followed. In conclusion, the procedure we have outlined, together
with an anticycling rule, partitions the range of possible values of θ into
consecutive intervals and, for each interval, provides us with an optimal
basis and the optimal cost as a function of θ.

There is another variant of parametric programming that can be used
when c is kept fixed but b is replaced by b + θd, where d is a given vector
and θ is a scalar. In this case, the zeroth column of the tableau depends
on θ. Whenever θ reaches a value at which some basic variable becomes
negative, we apply the dual simplex method in order to recover primal
feasibility.
5.6 Summary

In this chapter, we have studied the dependence of optimal solutions and of
the optimal cost on the problem data, that is, on the entries of A, b, and
c. For many of the cases that we have examined, a common methodology
was used. Subsequent to a change in the problem data, we first examine its
effects on the feasibility and optimality conditions. If we wish the same basis
to remain optimal, this leads us to certain limitations on the magnitude of
the changes in the problem data. For larger changes, we no longer have
an optimal basis and some remedial action (involving the primal or dual
simplex method) is typically needed.

We close with a summary of our main results.

(a) If a new variable is added, we check its reduced cost and if it is
negative, we add a new column to the tableau and proceed from
there.

(b) If a new constraint is added, we check whether it is violated and if
so, we form an auxiliary problem and its tableau, and proceed from
there.

(c) If an entry of b or c is changed by δ, we obtain an interval of values
of δ for which the same basis remains optimal.

(d) If an entry of A is changed by δ, a similar analysis is possible. However,
this case is somewhat complicated if the change affects an entry
of a basic column.

(e) Assuming that the dual problem is feasible, the optimal cost is a
piecewise linear convex function of the vector b (for those b for which
the primal is feasible). Furthermore, subgradients of the optimal cost
function correspond to optimal solutions to the dual problem.

(f) Assuming that the primal problem is feasible, the optimal cost is a
piecewise linear concave function of the vector c (for those c for which
the primal has finite cost).

(g) If the cost vector is an affine function of a scalar parameter θ, there
is a systematic procedure (parametric programming) for solving the
problem for all values of θ. A similar procedure is possible if the
vector b is an affine function of a scalar parameter.
5.7 Exercises
Exercise 5.1 Consider the same problem as in Example 5.1, for which we already
have an optimal basis. Let us introduce the additional constraint x₁ + x₂ =
3. Form the auxiliary problem described in the text, and solve it using the primal
simplex method. Whenever the "large" constant M is compared to another
number, M should be treated as being the larger one.
Exercise 5.2 (Sensitivity with respect to changes in a basic column
of A) In this problem (and the next two), we study the change in the value
of the optimal cost when an entry of the matrix A is perturbed by a small
amount. We consider a linear programming problem in standard form, under the
usual assumption that A has linearly independent rows. Suppose that we have
an optimal basis B that leads to a nondegenerate optimal solution x*, and a
nondegenerate dual optimal solution p. We assume that the first column is basic.
We will now change the first entry of A₁, from a₁₁ to a₁₁ + δ, where δ is a small
scalar. Let E be a matrix of dimensions m × m (where m is the number of rows
of A), whose entries are all zero except for the top left entry e₁₁, which is equal
to 1.

(a) Show that if δ is small enough, B + δE is a basis matrix for the new problem.

(b) Show that under the basis B + δE, the vector x_B of basic variables in the
new problem is equal to (I + δB⁻¹E)⁻¹B⁻¹b.

(c) Show that if δ is sufficiently small, B + δE is an optimal basis for the new
problem.

(d) We use the symbol ≈ to denote equality when second order terms in δ are
ignored. The following approximation is known to be true: (I + δB⁻¹E)⁻¹ ≈
I − δB⁻¹E. Using this approximation, show that

c_B'x_B ≈ c'x* − δp₁x₁*,

where x₁* (respectively, p₁) is the first component of the optimal solution to
the original primal (respectively, dual) problem, and x_B has been defined
in part (b).
Exercise 5.3 (Sensitivity with respect to changes in a basic column
of A) Consider a linear programming problem in standard form under the usual
assumption that the rows of the matrix A are linearly independent. Suppose
that the columns A₁, . . . , A_m form an optimal basis. Let A₀ be some vector and
suppose that we change A₁ to A₁ + δA₀. Consider the matrix B(δ) consisting of
the columns A₁ + δA₀, A₂, . . . , A_m. Let [δ₁, δ₂] be a closed interval of values of
δ that contains zero and in which the determinant of B(δ) is nonzero. Show that
the subset of [δ₁, δ₂] for which B(δ) is an optimal basis is also a closed interval.
Exercise 5.4 Consider the problem in Example 5.1, with a₁₁ changed from
3 to 3 + δ. Let us keep x₁ and x₂ as the basic variables and let B(δ) be the
corresponding basis matrix, as a function of δ.

(a) Compute B(δ)⁻¹b. For which values of δ is B(δ) a feasible basis?

(b) Compute c_B'B(δ)⁻¹. For which values of δ is B(δ) an optimal basis?

(c) Determine the optimal cost, as a function of δ, when δ is restricted to those
values for which B(δ) is an optimal basis matrix.
Exercise 5.5 While solving a standard form linear programming problem using
the simplex method, we arrive at the following tableau:

            x₁   x₂   x₃    x₄   x₅
    0    |  0    0    c̄₃    0    c̄₅
 x₂ = 1  |  0    1    1     0    η₁
 x₄ = 2  |  0    0    2     1    η₂
 x₁ = 3  |  1    0    4     0    0

Suppose also that the last three columns of the matrix A form an identity matrix.

(a) Give necessary and sufficient conditions for the basis described by this
tableau to be optimal (in terms of the coefficients in the tableau).

(b) Assume that this basis is optimal and that c̄₃ = 0. Find an optimal basic
feasible solution, other than the one described by this tableau.

(c) Suppose that η₂ ≥ 0. Show that there exists an optimal basic feasible
solution, regardless of the values of c̄₃ and c̄₅.

(d) Assume that the basis associated with this tableau is optimal. Suppose
also that b₁ in the original problem is replaced by b₁ + ε. Give upper and
lower bounds on ε so that this basis remains optimal.

(e) Assume that the basis associated with this tableau is optimal. Suppose
also that c₁ in the original problem is replaced by c₁ + ε. Give upper and
lower bounds on ε so that this basis remains optimal.
Exercise 5.6 Company A has agreed to supply the following quantities of special
lamps to Company B during the next 4 months:

Month    January   February   March   April
Units    150       160        225     180

Company A can produce a maximum of 160 lamps per month at a cost of $35
per unit. Additional lamps can be purchased from Company C at a cost of $50
per lamp. Company A incurs an inventory holding cost of $5 per month for each
lamp held in inventory.
(a) Formulate the problem that Company A is facing as a linear programming
problem.
(b) Solve the problem using a linear programming package.
(c) Company A is considering some preventive maintenance during one of the
first three months. If maintenance is scheduled for January, the company
can manufacture only 151 units (instead of 160); similarly, the maximum
possible production if maintenance is scheduled for February or March is
153 and 155 units, respectively. What maintenance schedule would you
recommend and why?
(d) Company D has offered to supply up to 50 lamps (total) to Company A
during either January, February or March. Company D charges $45 per
lamp. Should Company A buy lamps from Company D? If yes, when and
how many lamps should Company A purchase, and what is the impact of
this decision on the total cost?

(e) Company C has offered to lower the price of units supplied to Company
A during February. What is the maximum decrease that would make this
offer attractive to Company A?

(f) Because of anticipated increases in interest rates, the holding cost per lamp
is expected to increase to $8 per unit in February. How does this change
affect the total cost and the optimal solution?

(g) Company B has just informed Company A that it requires only 90 units in
January (instead of 150 requested previously). Calculate upper and lower
bounds on the impact of this order on the optimal cost using information
from the optimal solution to the original problem.
Exercise 5.7 A paper company manufactures three basic products: pads of
paper, 5-packs of paper, and 20-packs of paper. The pad of paper consists of a
single pad of 25 sheets of lined paper. The 5-pack consists of 5 pads of paper,
together with a small notebook. The 20-pack of paper consists of 20 pads of
paper, together with a large notebook. The small and large notebooks are not
sold separately.

Production of each pad of paper requires 1 minute of paper-machine time,
1 minute of supervisory time, and $.10 in direct costs. Production of each small
notebook takes 2 minutes of paper-machine time, 45 seconds of supervisory time,
and $.20 in direct costs. Production of each large notebook takes 3 minutes of
paper-machine time, 30 seconds of supervisory time, and $.30 in direct costs. To
package the 5-pack takes 1 minute of packager's time and 1 minute of supervisory
time. To package the 20-pack takes 3 minutes of packager's time and 2 minutes
of supervisory time. The amounts of available paper-machine time, supervisory
time, and packager's time are constants b₁, b₂, b₃, respectively. Any of the three
products can be sold to retailers in any quantity at the prices $.30, $1.60, and
$7.00, respectively.

Provide a linear programming formulation of the problem of determining
an optimal mix of the three products. (You may ignore the constraint that only
integer quantities can be produced.) Try to formulate the problem in such a
way that the following questions can be answered by looking at a single dual
variable or reduced cost in the final tableau. Also, for each question, give a brief
explanation of why it can be answered by looking at just one dual price or reduced
cost.
(a) What is the marginal value of an extra unit of supervisory time?

(b) What is the lowest price at which it is worthwhile to produce single pads
of paper for sale?

(c) Suppose that part-time supervisors can be hired at $8 per hour. Is it
worthwhile to hire any?

(d) Suppose that the direct cost of producing pads of paper increases from $.10
to $.12. What is the profit decrease?
Exercise 5.8 A pottery manufacturer can make four different types of dining
room service sets: JJP English, Currier, Primrose, and Bluetail. Furthermore,
Primrose can be made by two different methods. Each set uses clay, enamel, dry
room time, and kiln time, and results in a profit shown in Table 5.3. (Here, lbs
is the abbreviation for pounds.)

Resources           E    C     P1   P2   B    Total
Clay (lbs)          10   15    10   10   20   130
Enamel (lbs)        1    2     2    1    1    13
Dry room (hours)    3    1     6    6    3    45
Kiln (hours)        2    4     2    5    3    23
Profit              51   102   66   66   89

Table 5.3: The rightmost column in the table gives the manufacturer's
resource availability for the remainder of the week. Notice
that Primrose can be made by two different methods. They both
use the same amount of clay (10 lbs) and dry room time (6 hours).
But the second method uses one pound less of enamel and three
more hours in the kiln.
The manufacturer is currently committed to making the same amount of
Primrose using methods 1 and 2. The formulation of the profit maximization
problem is given below. The decision variables E, C, P₁, P₂, B are the number
of sets of type English, Currier, Primrose Method 1, Primrose Method 2, and
Bluetail, respectively. We assume, for the purposes of this problem, that the
number of sets of each type can be fractional.
maximize 51E + 102C + 66P₁ + 66P₂ + 89B
subject to 10E + 15C + 10P₁ + 10P₂ + 20B ≤ 130
E + 2C + 2P₁ + P₂ + B ≤ 13
3E + C + 6P₁ + 6P₂ + 3B ≤ 45
2E + 4C + 2P₁ + 5P₂ + 3B ≤ 23
P₁ − P₂ = 0
E, C, P₁, P₂, B ≥ 0.

The optimal solution to the primal and the dual, respectively, together with
sensitivity information, is given in Tables 5.4 and 5.5. Use this information to
answer the questions that follow.
       Optimal   Reduced   Objective     Allowable   Allowable
       Value     Cost      Coefficient   Increase    Decrease
E      0         −3.571    51            3.571       ∞
C      2         0         102           16.667      12.5
P1     0         0         66            37.571      ∞
P2     0         −37.571   66            37.571      ∞
B      5         0         89            47          12.5

Table 5.4: The optimal primal solution and its sensitivity with
respect to changes in coefficients of the objective function. The
last two columns describe the allowed changes in these coefficients
for which the same solution remains optimal.
(a) What is the optimal quantity of each service set, and what is the total
profit?
(b) Give an economic (not mathematical) interpretation of the optimal dual
variables appearing in the sensitivity report, for each of the five constraints.
(c) Should the manufacturer buy an additional 20 lbs. of clay at $1.1 per
pound?
(d) Suppose that the number of hours available in the dry room decreases by
30. Give a bound for the decrease in the total profit.
(e) In the current model, the number of Primrose produced using method 1 was
required to be the same as the number of Primrose produced by method 2.
Consider a revision of the model in which this constraint is replaced by the
constraint P1 − P2 ≥ 0. In the reformulated problem, would the amount of
Primrose made by method 1 be positive?
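The numbers in Tables 5.4 and 5.5 can be reproduced with any LP solver. The following sketch (not part of the original exercise; it assumes SciPy is available) solves the pottery LP with scipy.optimize.linprog, negating the objective because linprog minimizes.

```python
# Check of Exercise 5.8 with scipy.optimize.linprog.
# Variable order: E, C, P1, P2, B, as in the formulation above.
import numpy as np
from scipy.optimize import linprog

profit = np.array([51, 102, 66, 66, 89])
A_ub = np.array([[10, 15, 10, 10, 20],   # clay (lbs)
                 [ 1,  2,  2,  1,  1],   # enamel (lbs)
                 [ 3,  1,  6,  6,  3],   # dry room (hours)
                 [ 2,  4,  2,  5,  3]])  # kiln (hours)
b_ub = np.array([130, 13, 45, 23])
A_eq = np.array([[0, 0, 1, -1, 0]])      # P1 = P2
b_eq = np.array([0])

res = linprog(-profit, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
              bounds=[(0, None)] * 5)    # minimize the negated profit
print("sets:", res.x, "total profit:", -res.fun)
```

Consistent with Table 5.4, the solver reports two Currier and five Bluetail sets, and a total profit of 649.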
Exercise 5.9 Using the notation of Section 5.2, show that for any positive
scalar λ and any b ∈ S, we have F(λb) = λF(b). Assume that the dual feasible
set is nonempty, so that F(b) is finite.
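A quick numerical illustration of this identity (the data c, A, b below are made up for the illustration and are not from the text): treat F(b) as the optimal cost of a small standard form problem and compare F(λb) with λF(b).

```python
# Illustrative check that F(lambda * b) = lambda * F(b) for lambda > 0,
# where F(b) is the optimal cost of  min c'x  s.t.  Ax = b, x >= 0.
# The data c, A, b are arbitrary choices, not from the exercise.
from scipy.optimize import linprog

c = [2.0, 3.0, 1.0]
A = [[1.0, 2.0, 1.0],
     [2.0, 0.0, 3.0]]

def F(b):
    res = linprog(c, A_eq=A, b_eq=b, bounds=[(0, None)] * 3)
    return res.fun

b = [4.0, 6.0]
lam = 2.5
print(F([lam * bi for bi in b]), lam * F(b))   # the two numbers should agree
```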
Sec. 5.7    Exercises    227
          Slack    Dual       Constr.   Allowable   Allowable
          Value    Variable   RHS       Increase    Decrease
Clay      130      1.429      130       23.33       43.75
Enamel    9        0          13        ∞           4
Dry Rm.   17       0          45        ∞           28
Kiln      23       20.143     23        5.60        3.50
Prim.     0        11.429     0         3.50        0

Table 5.5. The optimal dual solution and its sensitivity. The
column labeled "slack value" gives us the optimal values of the
slack variables associated with each of the primal constraints. The
third column simply repeats the right-hand side vector b, while the
last two columns describe the allowed changes in the components
of b for which the optimal dual solution remains the same.
Exercise 5.10 Consider the linear programming problem:

minimize    x1 + x2
subject to  x1 + 2x2 = θ,
            x1, x2 ≥ 0.

(a) Find (by inspection) an optimal solution, as a function of θ.
(b) Draw a graph showing the optimal cost as a function of θ.
(c) Use the picture in part (b) to obtain the set of all dual optimal solutions,
for every value of θ.
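As a numerical companion to parts (a) and (b) (an illustrative sketch, not part of the text), the snippet below solves the problem for a few sample values of θ; it reports None when the problem is infeasible.

```python
# Solve  min x1 + x2  s.t.  x1 + 2*x2 = theta, x1, x2 >= 0  for sample theta.
from scipy.optimize import linprog

def optimal_cost(theta):
    res = linprog(c=[1.0, 1.0], A_eq=[[1.0, 2.0]], b_eq=[theta],
                  bounds=[(0, None)] * 2)
    return res.fun if res.success else None   # None: infeasible (theta < 0)

for theta in (-1.0, 0.0, 2.0, 4.0):
    print(theta, optimal_cost(theta))
```

This is only a check; the exercise asks for the answer by inspection and for a graph.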
Exercise 5.11 Consider the function g(θ), as defined in the beginning of Sec-
tion 5.5. Suppose that g(θ) is linear for θ ∈ [θ1, θ2]. Is it true that there exists a
unique optimal solution when θ1 < θ < θ2? Prove or provide a counterexample.
Exercise 5.12 Consider the parametric programming problem discussed in Sec-
tion 5.5.
(a) Suppose that for some value of θ, there are exactly two distinct basic feasible
solutions that are optimal. Show that they must be adjacent.
(b) Let θ* be a breakpoint of the function g(θ). Let x1, x2, x3 be basic feasible
solutions, all of which are optimal for θ = θ*. Suppose that x1 is a unique
optimal solution for θ < θ*, x3 is a unique optimal solution for θ > θ*, and
x1, x2, x3 are the only optimal basic feasible solutions for θ = θ*. Provide
an example to show that x1 and x3 need not be adjacent.
Exercise 5.13 Consider the following linear programming problem:

minimize    4x1 + 5x3
subject to  2x1 + x2 − 5x3 = 1,
            3x1 + 4x3 + x4 = 2,
            x1, x2, x3, x4 ≥ 0.

(a) Write down a simplex tableau and find an optimal solution. Is it unique?
(b) Write down the dual problem and find an optimal solution. Is it unique?
(c) Suppose now that we change the vector b from b = (1, 2) to b = (1 − 2θ,
2 − 3θ), where θ is a scalar parameter. Find an optimal solution and
the value of the optimal cost, as a function of θ. (For all θ, both positive
and negative.)
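Part (c) can be explored numerically before proving a formula. The sketch below (an illustration only; it assumes the reconstruction of the constraints printed above) evaluates the optimal cost for a few values of θ, returning +inf when the problem is infeasible.

```python
# Optimal cost of the LP of Exercise 5.13 with b = (1 - 2*theta, 2 - 3*theta).
from scipy.optimize import linprog

def cost(theta):
    res = linprog(c=[4.0, 0.0, 5.0, 0.0],
                  A_eq=[[2.0, 1.0, -5.0, 0.0],
                        [3.0, 0.0,  4.0, 1.0]],
                  b_eq=[1.0 - 2.0 * theta, 2.0 - 3.0 * theta],
                  bounds=[(0, None)] * 4)
    return res.fun if res.success else float("inf")

for theta in (-1.0, 0.0, 0.5, 0.6, 0.7):
    print(theta, cost(theta))
```

The printed values trace out the piecewise linear optimal cost that part (c) asks for.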
Exercise 5.14 Consider the problem

minimize    (c + θd)'x
subject to  Ax = b + θf,
            x ≥ 0,

where A is an m × n matrix with linearly independent rows. We assume that the
problem is feasible and the optimal cost g(θ) is finite for all values of θ in some
interval [θ1, θ2].
(a) Suppose that a certain basis is optimal for θ = −10 and for θ = 10. Prove
that the same basis is optimal for θ = 5.
(b) Show that g(θ) is a piecewise quadratic function of θ. Give an upper bound
on the number of "pieces."
(c) Let b = 0 and c = 0. Suppose that a certain basis is optimal for θ = 1.
For what other nonnegative values of θ is that same basis optimal?
(d) Is g(θ) convex, concave, or neither?
Exercise 5.15 Consider the problem

minimize    c'x
subject to  Ax = b + θd,
            x ≥ 0,

and let g(θ) be the optimal cost, as a function of θ.
(a) Let X(θ) be the set of all optimal solutions, for a given value of θ. For
any nonnegative scalar t, define X(0, t) to be the union of the sets X(θ),
0 ≤ θ ≤ t. Is X(0, t) a convex set? Provide a proof or a counterexample.
(b) Suppose that we remove the nonnegativity constraints x ≥ 0 from the
problem under consideration. Is X(0, t) a convex set? Provide a proof or
a counterexample.
(c) Suppose that x1 and x2 belong to X(0, t). Show that there is a continuous
path from x1 to x2 that is contained within X(0, t). That is, there exists
a continuous function g(λ) such that g(λ1) = x1, g(λ2) = x2, and g(λ) ∈
X(0, t) for all λ ∈ [λ1, λ2].
Exercise 5.16 Consider the parametric programming problem of Section 5.5.
Suppose that some basic feasible solution is optimal if and only if θ is equal to
some θ*.
(a) Suppose that the feasible set is unbounded. Is it true that there exist at
least three distinct basic feasible solutions that are optimal when θ = θ*?
(b) Answer the question in part (a) for the case where the feasible set is
bounded.
Exercise 5.17 Consider the parametric programming problem. Suppose that
every basic solution encountered by the algorithm is nondegenerate. Prove that
the algorithm does not cycle.
5.8 Notes and sources

The material in this chapter, with the exception of Section 5.3, is standard,
and can be found in any text on linear programming.

5.1. A more detailed discussion of the results of the production planning
case study can be found in Freund and Shannahan (1992).

5.4. The results in this section have beautiful generalizations to the case
of nonlinear convex optimization; see, e.g., Rockafellar (1970).

5.5. Anticycling rules for parametric programming can be found in Murty
(1983).
References
AHUJA, R. K., T. L. MAGNANTI, and J. B. ORLIN. 1993. Network Flows,
Prentice Hall, Englewood Cliffs, NJ.
ANDERSEN, E., J. GONDZIO, C. MESZAROS, and X. XU. 1996. Implementation
of interior point methods for large scale linear programming, in Interior
point methods in mathematical programming, T. Terlaky (ed.), Kluwer
Academic Publishers, Boston, MA.
APPLEGATE, D., and W. COOK. 1991. A computational study of the job shop
scheduling problem, ORSA Journal on Computing, 3, 149-156.
BALAS, E., S. CERIA, G. CORNUEJOLS, and N. NATRAJ. 1995. Gomory cuts
revisited, working paper, Carnegie Mellon University, Pittsburgh, PA.
BALAS, E., S. CERIA, and G. CORNUEJOLS. 1995. Mixed 0-1 programming by
lift-and-project in a branch and cut environment, working paper, Carnegie
Mellon University, Pittsburgh, PA.
BARAHONA, F., and E. TARDOS. 1989. Note on Weintraub's minimum cost
circulation algorithm, SIAM Journal on Computing, 18, 579-583.
BARNES, E. R. 1986. A variation on Karmarkar's algorithm for solving linear
programming problems, Mathematical Programming, 36, 174-182.
BARR, R. S., F. GLOVER, and D. KLINGMAN. 1977. The alternating path basis
algorithm for the assignment problem, Mathematical Programming, 13,
1-13.
BARTHOLDI, J. J., J. B. ORLIN, and H. D. RATLIFF. 1980. Cyclic scheduling via
integer programs with circular ones, Operations Research, 28, 1074-1085.
BAZARAA, M. S., J. J. JARVIS, and H. D. SHERALI. 1990. Linear Programming
and Network Flows, 2nd edition, Wiley, New York, NY.
BEALE, E. M. L. 1955. Cycling in the dual simplex algorithm, Naval Research
Logistics Quarterly, 2, 269-275.
BELLMAN, R. E. 1958. On a routing problem, Quarterly of Applied Mathematics,
16, 87-90.
BENDERS, J. F. 1962. Partitioning procedures for solving mixed-variables pro-
gramming problems, Numerische Mathematik, 4, 238-252.
BERTSEKAS, D. P. 1979. A distributed algorithm for the assignment problem,
working paper, Laboratory for Information and Decision Systems, M.I.T.,
Cambridge, MA.
BERTSEKAS, D. P. 1981. A new algorithm for the assignment problem, Mathe-
matical Programming, 21, 152-171.
BERTSEKAS, D. P. 1991. Linear Network Optimization, M.I.T. Press, Cambridge,
MA.
BERTSEKAS, D. P. 1995a. Dynamic Programming and Optimal Control, Athena
Scientific, Belmont, MA.
BERTSEKAS, D. P. 1995b. Nonlinear Programming, Athena Scientific, Belmont,
MA.
BERTSEKAS, D. P., and J. N. TSITSIKLIS. 1989. Parallel and Distributed Com-
putation: Numerical Methods, Prentice Hall, Englewood Cliffs, NJ.
BERTSIMAS, D., and L. HSU. 1997. A branch and cut algorithm for the job shop
scheduling problem, working paper, Operations Research Center, M.I.T.,
Cambridge, MA.
BERTSIMAS, D., and X. LUO. 1997. On the worst case complexity of potential
reduction algorithms for linear programming, Mathematical Programming,
to appear.
BERTSIMAS, D., and S. STOCK. 1997. The air traffic flow management problem
with enroute capacities, Operations Research, to appear.
BLAND, R. G. 1977. New finite pivoting rules for the simplex method, Mathe-
matics of Operations Research, 2, 103-107.
BLAND, R. G., D. GOLDFARB, and M. J. TODD. 1981. The ellipsoid method: a
survey, Operations Research, 29, 1039-1091.
BORGWARDT, K.-H. 1982. The average number of pivot steps required by the
simplex-method is polynomial, Zeitschrift für Operations Research, 26,
157-177.
BOYD, S., and L. VANDENBERGHE. 1995. Introduction to convex optimization
with engineering applications, lecture notes, Stanford University, Stanford,
CA.
BRADLEY, S. P., A. C. HAX, and T. L. MAGNANTI. 1977. Applied Mathematical
Programming, Addison-Wesley, Reading, MA.
CARATHEODORY, C. 1907. Über den Variabilitätsbereich der Koeffizienten von
Potenzreihen, die gegebene Werte nicht annehmen, Mathematische An-
nalen, 64, 95-115.
CERIA, S., C. CORDIER, H. MARCHAND, and L. A. WOLSEY. 1995. Cutting
planes for integer programs with general integer variables, working paper,
Columbia University, New York, NY.
CHARNES, A. 1952. Optimality and degeneracy in linear programming, Econo-
metrica, 20, 160-170.
CHRISTOFIDES, N. 1975. Worst-case analysis of a new heuristic for the traveling
salesman problem, Report 388, Graduate School of Industrial Administra-
tion, Carnegie Mellon University, Pittsburgh, PA.
CHVATAL, V. 1983. Linear Programming, W. H. Freeman, New York, NY.
CLARK, F. E. 1961. Remark on the constraint sets in linear programming, Amer-
ican Mathematical Monthly, 68, 351-352.
COBHAM, A. 1965. The intrinsic computational difficulty of functions, in Logic,
Methodology and Philosophy of Science, Y. Bar-Hillel (ed.), North-Holland,
Amsterdam, The Netherlands, 24-30.
COOK, S. A. 1971. The complexity of theorem proving procedures, in Proceedings
of the 3rd ACM Symposium on the Theory of Computing, 151-158.
CORMEN, T. H., C. E. LEISERSON, and R. L. RIVEST. 1990. Introduction to
Algorithms, McGraw-Hill, New York, NY.
CUNNINGHAM, W. H. 1976. A network simplex method, Mathematical Program-
ming, 11, 105-116.
DAHLEH, M. A., and I. DIAZ-BOBILLO. 1995. Control of Uncertain Systems: A
Linear Programming Approach, Prentice Hall, Englewood Cliffs, NJ.
DANTZIG, G. B. 1951. Application of the simplex method to a transportation
problem, in Activity Analysis of Production and Allocation, T. C. Koop-
mans (ed.), Wiley, New York, NY, 359-373.
DANTZIG, G. B. 1963. Linear Programming and Extensions, Princeton University
Press, Princeton, NJ.
DANTZIG, G. B. 1992. An ε-precise feasible solution to a linear program with a
convexity constraint in 1/ε² iterations independent of problem size, working
paper, Stanford University, Stanford, CA.
DANTZIG, G. B., A. ORDEN, and P. WOLFE. 1955. The generalized simplex
method for minimizing a linear form under linear inequality constraints,
Pacific Journal of Mathematics, 5, 183-195.
DANTZIG, G. B., and P. WOLFE. 1960. The decomposition principle for linear
programs, Operations Research, 8, 101-111.
DIJKSTRA, E. 1959. A note on two problems in connexion with graphs, Nu-
merische Mathematik, 1, 269-271.
DIKIN, I. I. 1967. Iterative solutions of problems of linear and quadratic pro-
gramming, Soviet Mathematics Doklady, 8, 674-675.
DIKIN, I. I. 1974. On the convergence of an iterative process, Upravlyaemye
Sistemi, 12, 54-60. (In Russian.)
DINES, L. L. 1918. Systems of linear inequalities, Annals of Mathematics, 20,
191-199.
DUDA, R. O., and P. E. HART. 1973. Pattern Classification and Scene Analysis,
Wiley, New York, NY.
EDMONDS, J. 1965a. Paths, trees, and flowers, Canadian Journal of Mathematics,
17, 449-467.
EDMONDS, J. 1965b. Maximum matching and a polyhedron with 0-1 vertices,
Journal of Research of the National Bureau of Standards, 69B, 125-130.
EDMONDS, J. 1971. Matroids and the greedy algorithm, Mathematical Program-
ming, 1, 127-136.
EDMONDS, J., and R. M. KARP. 1972. Theoretical improvements in algorithmic
efficiency for network flow problems, Journal of the ACM, 19, 248-264.
ELIAS, P., A. FEINSTEIN, and C. E. SHANNON. 1956. Note on maximum flow
through a network, IRE Transactions on Information Theory, 2, 117-119.
FARKAS, G. 1894. On the applications of the mechanical principle of Fourier,
Mathematikai és Természettudományi Értesítő, 12, 457-472. (In Hungar-
ian.)
FIACCO, A. V., and G. P. MCCORMICK. 1968. Nonlinear Programming: Sequen-
tial Unconstrained Minimization Techniques, Wiley, New York, NY.
FEDERGRUEN, A., and H. GROENEVELT. 1986. Preemptive scheduling of uniform
machines by ordinary network flow techniques, Management Science, 32,
341-349.
FISHER, H., and G. L. THOMPSON. 1963. Probabilistic learning combinations of
local job shop scheduling rules, in Industrial Scheduling, J. F. Muth and G.
L. Thompson (eds.), Prentice Hall, Englewood Cliffs, NJ, 225-251.
FLOYD, R. W. 1962. Algorithm 97: shortest path, Communications of the ACM,
5, 345.
FORD, L. R. 1956. Network flow theory, report P-923, Rand Corp., Santa Monica,
CA.
FORD, L. R., and D. R. FULKERSON. 1956a. Maximal flow through a network,
Canadian Journal of Mathematics, 8, 399-404.
FORD, L. R., and D. R. FULKERSON. 1956b. Solving the transportation problem,
Management Science, 3, 24-32.
FORD, L. R., and D. R. FULKERSON. 1962. Flows in Networks, Princeton
University Press, Princeton, NJ.
FOURIER, J. B. J. 1827. Analyse des travaux de l'Académie Royale des Sciences,
pendant l'année 1824, Partie mathématique, Histoire de l'Académie Royale
des Sciences de l'Institut de France, 7, xlvii-lv.
FREUND, R. M. 1991. Polynomial-time algorithms for linear programming based
only on primal affine scaling and projected gradients of a potential function,
Mathematical Programming, 51, 203-222.
FREUND, R. M., and B. SHANNAHAN. 1992. Short-run manufacturing problems
at DEC, report, Sloan School of Management, M.I.T., Cambridge, MA.
FRISCH, M. R. 1956. La résolution des problèmes de programme linéaire par la
méthode du potentiel logarithmique, Cahiers du Séminaire d'Économé-
trie, 4, 7-20.
FULKERSON, D. R., and G. B. DANTZIG. 1955. Computation of maximum flow
in networks, Naval Research Logistics Quarterly, 2, 277-283.
GALE, D., H. W. KUHN, and A. W. TUCKER. 1951. Linear programming and
the theory of games, in Activity Analysis of Production and Allocation,
T. C. Koopmans (ed.), Wiley, New York, NY, 317-329.
GALE, D., and L. S. SHAPLEY. 1962. College admissions and the stability of
marriage, American Mathematical Monthly, 69, 9-15.
GAREY, M. R., and D. S. JOHNSON. 1979. Computers and Intractability: a
Guide to the Theory of NP-completeness, W. H. Freeman, New York, NY.
GEMAN, S., and D. GEMAN. 1984. Stochastic relaxation, Gibbs distributions, and
the Bayesian restoration of images, IEEE Transactions on Pattern Analysis
and Machine Intelligence, 6, 721-741.
GILL, P. E., W. MURRAY, and M. H. WRIGHT. 1981. Practical Optimization,
Academic Press, New York, NY.
GILMORE, P. C., and R. E. GOMORY. 1961. A linear programming approach to
the cutting stock problem, Operations Research, 9, 849-859.
GILMORE, P. C., and R. E. GOMORY. 1963. A linear programming approach to
the cutting stock problem - part II, Operations Research, 11, 863-888.
GOEMANS, M., and D. BERTSIMAS. 1993. Survivable networks, LP relaxations
and the parsimonious property, Mathematical Programming, 60, 145-166.
GOEMANS, M., and D. WILLIAMSON. 1993. A new 3/4-approximation algorithm
for MAX SAT, in Proceedings of the 3rd International Conference in Integer
Programming and Combinatorial Optimization, 313-321.
GOLDBERG, A. V., and R. E. TARJAN. 1988. A new approach to the maximum
flow problem, Journal of the ACM, 35, 921-940.
GOLDBERG, A. V., and R. E. TARJAN. 1989. Finding minimum-cost circulations
by cancelling negative cycles, Journal of the ACM, 36, 873-886.
GOLDFARB, D., and J. K. REID. 1977. A practicable steepest-edge simplex
algorithm, Mathematical Programming, 12, 361-371.
GOLUB, G. H., and C. F. VAN LOAN. 1983. Matrix Computations, The Johns
Hopkins University Press, Baltimore, MD.
GOMORY, R. E. 1958. Outline of an algorithm for integer solutions to linear
programs, Bulletin of the American Mathematical Society, 64, 275-278.
GONZAGA, C. 1989. An algorithm for solving linear programming problems in
O(n³L) operations, in Progress in Mathematical Programming, N. Megiddo
(ed.), Springer-Verlag, New York, NY, 1-28.
GONZAGA, C. 1990. Polynomial affine algorithms for linear programming, Math-
ematical Programming, 49, 7-21.
GROTSCHEL, M., L. LOVASZ, and A. SCHRIJVER. 1981. The ellipsoid method and
its consequences in combinatorial optimization, Combinatorica, 1, 169-197.
GROTSCHEL, M., L. LOVASZ, and A. SCHRIJVER. 1988. Geometric Algorithms
and Combinatorial Optimization, Springer-Verlag, New York, NY.
HAIMOVICH, M. 1983. The simplex method is very good! - on the expected
number of pivot steps and related properties of random linear programs,
preprint.
HAJEK, B. 1988. Cooling schedules for optimal annealing, Mathematics of Oper-
ations Research, 13, 311-329.
HALL, L. A., and R. J. VANDERBEI. 1993. Two-thirds is sharp for affine scaling,
Operations Research Letters, 13, 197-201.
HALL, L. A., A. S. SCHULZ, D. B. SHMOYS, and J. WEIN. 1996. Scheduling
to minimize average completion time: off-line and on-line approximation
algorithms, working paper, Johns Hopkins University, Baltimore, MD.
HANE, C. A., C. BARNHART, E. L. JOHNSON, R. E. MARSTEN, G. L. NEMHAUS-
ER, and G. SIGISMONDI. 1995. The fleet assignment problem: solving a
large-scale integer program, Mathematical Programming, 70, 211-232.
HARRIS, P. M. J. 1973. Pivot selection methods of the Devex LP code, Mathe-
matical Programming, 5, 1-28.
HAYKIN, S. 1994. Neural Networks: A Comprehensive Foundation, Macmillan,
New York, NY.
HELD, M., and R. M. KARP. 1962. A dynamic programming approach to se-
quencing problems, SIAM Journal on Applied Mathematics, 10, 196-210.
HELD, M., and R. M. KARP. 1970. The traveling salesman problem and mini-
mum spanning trees, Operations Research, 18, 1138-1162.
HELD, M., and R. M. KARP. 1971. The traveling salesman problem and mini-
mum spanning trees: part II, Mathematical Programming, 1, 6-25.
HELLY, E. 1923. Über Mengen konvexer Körper mit gemeinschaftlichen Punkten,
Jahresbericht der Deutschen Mathematiker-Vereinigung, 32, 175-176.
HOCHBAUM, D. (ed.). 1996. Approximation Algorithms for NP-hard Problems,
Kluwer Academic Publishers, Boston, MA.
HU, T. C. 1969. Integer Programming and Network Flows, Addison-Wesley,
Reading, MA.
IBARRA, O. H., and C. E. KIM. 1975. Fast approximation algorithms for the
knapsack and sum of subset problems, Journal of the ACM, 22, 463-468.
INFANGER, G. 1993. Planning under Uncertainty: Solving Large-Scale Stochastic
Linear Programs, Boyd & Fraser, Danvers, MA.
JOHNSON, D. S., C. ARAGON, L. MCGEOCH, and C. SCHEVON. 1990. Optimiza-
tion by simulated annealing: an experimental evaluation, part I: graph
partitioning, Operations Research, 37, 865-892.
JOHNSON, D. S., C. ARAGON, L. MCGEOCH, and C. SCHEVON. 1992. Optimiza-
tion by simulated annealing: an experimental evaluation, part II: graph
coloring and number partitioning, Operations Research, 39, 378-406.
KALAI, G., and D. KLEITMAN. 1992. A quasi-polynomial bound for the diameter
of graphs of polyhedra, Bulletin of the American Mathematical Society, 26,
315-316.
KALL, P., and S. W. WALLACE. 1994. Stochastic Programming, Wiley, New
York, NY.
KARMARKAR, N. 1984. A new polynomial-time algorithm for linear program-
ming, Combinatorica, 4, 373-395.
KARP, R. M. 1972. Reducibility among combinatorial problems, in Complexity
of Computer Computations, R. E. Miller and J. W. Thatcher (eds.), Plenum
Press, New York, NY, 85-103.
KARP, R. M. 1978. A characterization of the minimum cycle mean in a digraph,
Discrete Mathematics, 23, 309-311.
KARP, R. M., and C. H. PAPADIMITRIOU. 1982. On linear characterizations
of combinatorial optimization problems, SIAM Journal on Computing, 11,
620-632.
KHACHIAN, L. G. 1979. A polynomial algorithm in linear programming, Soviet
Mathematics Doklady, 20, 191-194.
KIRKPATRICK, S., C. D. GELATT, JR., and M. P. VECCHI. 1983. Optimization
by simulated annealing, Science, 220, 671-680.
KLEE, V., and G. J. MINTY. 1972. How good is the simplex algorithm?, in
Inequalities - III, O. Shisha (ed.), Academic Press, New York, NY, 159-
175.
KLEE, V., and D. W. WALKUP. 1967. The d-step conjecture for polyhedra of
dimension d < 6, Acta Mathematica, 117, 53-78.
KLEIN, M. 1967. A primal method for minimal cost flows with application to the
assignment and transportation problems, Management Science, 14, 205-
220.
KOJIMA, M., S. MIZUNO, and A. YOSHISE. 1989. A primal-dual interior point
algorithm for linear programming, in Progress in Mathematical Program-
ming, N. Megiddo (ed.), Springer-Verlag, New York, NY, 29-47.
KUHN, H. W. 1955. The Hungarian method for the assignment problem, Naval
Research Logistics Quarterly, 2, 83-97.
LAWLER, E. L. 1976. Combinatorial Optimization: Networks and Matroids, Holt,
Rinehart, and Winston, New York, NY.
LAWLER, E. L., J. K. LENSTRA, A. H. G. RINNOOY KAN, and D. B. SHMOYS
(eds.). 1985. The Traveling Salesman Problem: a Guided Tour of Combi-
natorial Optimization, Wiley, New York, NY.
LENSTRA, J. K., A. H. G. RINNOOY KAN, and A. SCHRIJVER (eds.). 1991.
History of Mathematical Programming: A Collection of Personal Reminis-
cences, Elsevier, Amsterdam, The Netherlands.
LEVIN, A. Y. 1965. On an algorithm for the minimization of convex functions,
Soviet Mathematics Doklady, 6, 286-290.
LEVIN, L. A. 1973. Universal sorting problems, Problemy Peredachi Informatsii,
9, 265-266. (In Russian.)
LEWIS, H. R., and C. H. PAPADIMITRIOU. 1981. Elements of the Theory of
Computation, Prentice Hall, Englewood Cliffs, NJ.
LUENBERGER, D. G. 1969. Optimization by Vector Space Methods, Wiley, New
York, NY.
LUENBERGER, D. G. 1984. Linear and Nonlinear Programming, 2nd ed., Addison-
Wesley, Reading, MA.
LUSTIG, I., R. E. MARSTEN, and D. SHANNO. 1994. Interior point methods:
computational state of the art, ORSA Journal on Computing, 6, 1-14.
MAGNANTI, T. L., and L. A. WOLSEY. 1995. Optimal trees, in Handbook of
Operations Research and Management Science, Volume 6, Network Models,
M. O. Ball, C. L. Monma, T. L. Magnanti and G. L. Nemhauser (eds.),
North-Holland, Amsterdam, The Netherlands, 503-615.
MARSHALL, K. T., and J. W. SUURBALLE. 1969. A note on cycling in the
simplex method, Naval Research Logistics Quarterly, 16, 121-137.
MARTIN, P., and D. B. SHMOYS. 1996. A new approach to computing opti-
mal schedules for the job-shop scheduling problem, in Proceedings of the
5th International Conference in Integer Programming and Combinatorial
Optimization, 389-403.
MCSHANE, K. A., C. L. MONMA, and D. SHANNO. 1991. An implementation of
a primal-dual interior point method for linear programming, ORSA Journal
on Computing, 1, 70-83.
MEGIDDO, N. 1989. Pathways to the optimal set in linear programming, in
Progress in Mathematical Programming, N. Megiddo (ed.), Springer-Verlag,
New York, NY, 131-158.
MEGIDDO, N., and M. SHUB. 1989. Boundary behavior of interior point algo-
rithms in linear programming, Mathematics of Operations Research, 14,
97-146.
MINKOWSKI, H. 1896. Geometrie der Zahlen, Teubner, Leipzig, Germany.
MIZUNO, S. 1996. Infeasible interior point algorithms, in Interior Point Algo-
rithms in Mathematical Programming, T. Terlaky (ed.), Kluwer Academic
Publishers, Boston, MA.
MONTEIRO, R. D. C., and I. ADLER. 1989a. Interior path following primal-dual
algorithms; part I: linear programming, Mathematical Programming, 44,
27-41.
MONTEIRO, R. D. C., and I. ADLER. 1989b. Interior path following primal-
dual algorithms; part II: convex quadratic programming, Mathematical
Programming, 44, 43-66.
MOTZKIN, T. S. 1936. Beiträge zur Theorie der linearen Ungleichungen (Inau-
gural Dissertation, Basel), Azriel, Jerusalem.
MURTY, K. G. 1983. Linear Programming, Wiley, New York, NY.
NEMHAUSER, G. L., and L. A. WOLSEY. 1988. Integer and Combinatorial Opti-
mization, Wiley, New York, NY.
NESTEROV, Y., and A. NEMIROVSKII. 1994. Interior Point Polynomial Algo-
rithms in Convex Programming, SIAM, Studies in Applied Mathematics,
13, Philadelphia, PA.
VON NEUMANN, J. 1947. Discussion of a maximum problem, unpublished working
paper, Institute for Advanced Studies, Princeton, NJ.
VON NEUMANN, J. 1953. A certain zero-sum two-person game equivalent to the
optimal assignment problem, in Contributions to the Theory of Games, II,
H. W. Kuhn and A. W. Tucker (eds.), Annals of Mathematics Studies, 28,
Princeton University Press, Princeton, NJ, 5-12.
ORDEN, A. 1993. LP from the '40s to the '90s, Interfaces, 23, 2-12.
ORLIN, J. B. 1984. Genuinely polynomial simplex and non-simplex algorithms
for the minimum cost flow problem, technical report 1615-84, Sloan School
of Management, M.I.T., Cambridge, MA.
PADBERG, M. W., and M. R. RAO. 1980. The Russian method and integer
programming, working paper, New York University, New York, NY.
PAPADIMITRIOU, C. H. 1994. Computational Complexity, Addison-Wesley, Read-
ing, MA.
PAPADIMITRIOU, C. H., and K. STEIGLITZ. 1982. Combinatorial Optimization:
Algorithms and Complexity, Prentice Hall, Englewood Cliffs, NJ.
PLOTKIN, S., and E. TARDOS. 1990. Improved dual network simplex, in Proceed-
ings of the First ACM-SIAM Symposium on Discrete Algorithms, 367-376.
POLJAK, B. T. 1987. Introduction to Optimization, Optimization Software Inc.,
New York, NY.
PRIM, R. C. 1957. Shortest connection networks and some generalizations, Bell
System Technical Journal, 36, 1389-1401.
QUEYRANNE, M. 1993. Structure of a simple scheduling polyhedron, Mathemat-
ical Programming, 58, 263-285.
RECSKI, A. 1989. Matroid Theory and its Applications in Electric Network The-
ory and in Statics, Springer-Verlag, New York, NY.
RENEGAR, J. 1988. A polynomial time algorithm based on Newton's method for
linear programming, Mathematical Programming, 40, 59-93.
ROCKAFELLAR, R. T. 1970. Convex Analysis, Princeton University Press, Prince-
ton, NJ.
ROCKAFELLAR, R. T. 1984. Network Flows and Monotropic Optimization, Wiley,
New York, NY.
ROSS, S. 1976. Risk, return, and arbitrage, in Risk and Return in Finance,
I. Friend and J. Bicksler (eds.), Ballinger, Cambridge, England.
ROSS, S. 1978. A simple approach to the valuation of risky streams, Journal of
Business, 51, 453-475.
RUDIN, W. 1976. Real Analysis, McGraw-Hill, New York, NY.
RUSHMEIER, R. A., and S. A. KONTOGIORGIS. 1997. Advances in the optimiza-
tion of airline fleet assignment, Transportation Science, to appear.
SCHRIJVER, A. 1986. Theory of Linear and Integer Programming, Wiley, New
York, NY.
SCHULZ, A. S. 1996. Scheduling to minimize total weighted completion time:
performance guarantees of LP-based heuristics and lower bounds, in Pro-
ceedings of the 5th International Conference in Integer Programming and
Combinatorial Optimization, 301-315.
SHOR, N. Z. 1970. Utilization of the operation of space dilation in the minimiza-
tion of convex functions, Cybernetics, 6, 7-15.
SMALE, S. 1983. On the average number of steps in the simplex method of linear
programming, Mathematical Programming, 27, 241-262.
SMITH, W. E. 1956. Various optimizers for single-stage production, Naval Re-
search Logistics Quarterly, 3, 59-66.
STIGLER, G. 1945. The cost of subsistence, Journal of Farm Economics, 27,
303-314.
STOCK, S. 1996. Allocation of NSF graduate fellowships, report, Sloan School of
Management, M.I.T., Cambridge, MA.
STONE, R. E., and C. A. TOVEY. 1991. The simplex and projective scaling
algorithms as iteratively reweighted least squares, SIAM Review, 33, 220-
237.
STRANG, G. 1988. Linear Algebra and its Applications, 3rd ed., Academic Press,
New York, NY.
TARDOS, E. 1985. A strongly polynomial minimum cost circulation algorithm,
Combinatorica, 5, 247-255.
TEO, C. 1996. Constructing approximation algorithms via linear programming
relaxations: primal dual and randomized rounding techniques, Ph.D. thesis,
Operations Research Center, M.I.T., Cambridge, MA.
TSENG, P. 1989. A simple complexity proof for a polynomial-time linear pro-
gramming algorithm, Operations Research Letters, 8, 155-159.
TSENG, P., and Z.-Q. LUO. 1992. On the convergence of the affine scaling
algorithm, Mathematical Programming, 56, 301-319.
TSUCHIYA, T. 1991. Global convergence of the affine scaling methods for de-
generate linear programming problems, Mathematical Programming, 52,
377-404.
TSUCHIYA, T., and M. MURAMATSU. 1995. Global convergence of a long-step
affine scaling algorithm for degenerate linear programming problems, SIAM
Journal on Optimization, 5, 525-551.
TUCKER, A. W. 1956. Dual systems of homogeneous linear relations, in Linear
Inequalities and Related Systems, H. W. Kuhn and A. W. Tucker (eds.),
Princeton University Press, Princeton, NJ, 3-18.
VANDERBEI, R. J., M. S. MEKETON, and B. A. FREEDMAN. 1986. A mod-
ification of Karmarkar's linear programming algorithm, Algorithmica, 1,
395-407.
VANDERBEI, R. J., and J. C. LAGARIAS. 1990. I. I. Dikin's convergence result for
the affine-scaling algorithm, in Mathematical Developments Arising from
Linear Programming, J. C. Lagarias and M. J. Todd (eds.), American
Mathematical Society, Providence, RI, Contemporary Mathematics, 114,
109-119.
VRANAS, P. 1996. Optimal slot allocation for European air traffic flow manage-
ment, working paper, German Aerospace Research Establishment, Berlin,
Germany.
WAGNER, H. M. 1959. On a class of capacitated transportation problems, Man-
agement Science, 5, 304-318.
WARSHALL, S. 1962. A theorem on Boolean matrices, Journal of the ACM, 9,
11-12.
WEBER, R. 1995. Personal communication.
WEINTRAUB, A. 1974. A primal algorithm to solve network flow problems with
convex costs, Management Science, 21, 87-97.
WILLIAMS, H. P. 1990. Model Building in Mathematical Programming, Wiley,
New York, NY.
WILLIAMSON, D. 1994. On the design of approximation algorithms for a class of
graph problems, Ph.D. thesis, Department of EECS, M.I.T., Cambridge,
MA.
YE, Y. 1991. An O(n³L) potential reduction algorithm for linear programming,
Mathematical Programming, 50, 239-258.
YE, Y., M. J. TODD, and S. MIZUNO. 1994. An O(√nL)-iteration homogeneous
and self-dual linear programming algorithm, Mathematics of Operations
Research, 19, 53-67.
YUDIN, D. B., and A. NEMIROVSKII. 1977. Informational complexity and efficient
methods for the solution of convex extremal problems, Matekon, 13, 25-45.
ZHANG, Y., and R. A. TAPIA. 1993. A superlinearly convergent polynomial
primal-dual interior point algorithm for linear programming, SIAM Journal
on Optimization, 3, 118-133.
Index
A
Absolute values, problems with, 17 19, 35
Active constraint, 48
Adjacent
bases, 56
basic solutions, 53, 56
vertices, 78
Affine
function, 15, 34
independence, 120
subspace, 30-31
transformation, 364
Affine scaling algorithm, 394, 395-409,
440-441, 448, 449
initialization, 403
long-step, 401, 402-403, 440, 441
performance, 403-404
short-step, 401, 404-409, 440
Air traffic flow management, 544-551, 567
Algorithm, 32-34, 40, 361
complexity of, see running time
efficient, 363
polynomial time, 362, 515
Analytic center, 422
Anticycling
in dual simplex, 160
in network simplex, 357
in parametric programming, 229
in primal simplex, 108-111
Approximation algorithms, 480, 507-511,
528-530, 558
Arbitrage, 168, 199
Arc
backward, 269
balanced, 316
directed, 268
endpoint of, 267
forward, 269
in directed graphs, 268
in undirected graphs, 267
incident , 267, 268
incoming, 268
outgoing, 268
Arithmetic model of computation, 362
Artificial variables, 112
elimination of, 112-113
Asset pricing, 167-169
Assignment problem, 274, 320, 323,
325-332
with side constraint, 526-527
Auction algorithm, 270, 325-332, 354, 358
Augmenting path, 304
Average computational complexity,
127-128, 138
B
Ball, 364
Barrier function, 419
Barrier problem, 420, 422, 431
Basic column, 55
Basic direction, 84
Basic feasible solution, 50, 52
existence, 62-65
existence of an optimum, 65-67
finite number of, 52
initial, see initialization
magnitude bounds, 373
to bounded variable LP, 76
to general LP, 50
Basic solution, 50, 52
to network flow problems, 280-284
to standard form LP, 53-54
to dual, 154, 161-164
Basic indices, 55
Basic variable, 55
Basis, 55
adjacent , 56
degenerate, 59
optimal, 87
relation to spanning trees, 280284
Basis matrix, 55, 87
Basis of a subspace, 29, 30
Bellman equation, 332, 336, 354
Bellman-Ford algorithm, 336-339, 354-355,
358
Benders decomposition, 254-260, 263, 264
Big-M method, 117-119, 135-136
Big O notation, 32
Binary search, 372
Binding constraint, 48
Bipartite matching problem, 326, 353, 358
Birkhoff-von Neumann theorem, 353
Bit model of computation, 362
Bland's rule, see smallest subscript rule
Bounded polyhedra, representation, 67-70
Bounded set, 43
Branch and bound, 485-490, 524, 530,
542-544, 560-562
Branch and cut, 489-490, 530
Bring into the basis, 88
C
Candidate list , 94
Capacity
of an arc, 272
of a cut, 309
of a node, 275
Caratheodory's theorem, 76, 197
Cardinality, 26
Caterer problem, 347
Central path, 420, 422, 444
Certificate of infeasibility, 165
Changes in data, see sensitivity analysis
Chebychev approximation, 188
Chebychev center, 36
Cholesky factor, 440, 537
Clark's theorem, 151, 193
Classifier, 14
Closedness of finitely generated cones,
172, 196
Circuits, 315
Circulation, 278
decomposition of, 350
simple, 278
Circulation problem, 275
Clique, 484
Closed set, 169
Column
of a matrix, notation, 27
zeroth, 98
Column generation, 236-238
Column geometry, 119-123, 137
Column space, 30
Column vector, 26
Combination
convex, 44
linear, 29
Communication network, 12-13
Complementary slackness, 151-155, 191
economic interpretation, 329
in assignment problem, 326-327
in network flow problems, 314
strict, 153, 192, 437
Complexity theory, 514-523
Computer manufacturing, 7-10
Concave function, 15
characterization, 503, 525
Cone, 174
containing a line, 175
pointed, 175
polyhedral, 175
Connected graph
directed, 268
undirected, 267
Connectivity, 352
Convex combination, 44
Convex function, 15, 34, 40
Convex hull, 44, 68, 74, 183
of integer solutions, 464
Convex polyhedron, see polyhedron
Convex set, 43
Convexity constraint , 120
Corner point, see extreme point
Cost function, 3
Cramer's rule, 29
Crossover problem, 541-542
Currency conversion, 36
Cut, 309
capacity of, 309
minimum, 310, 390
s-t, 309
Cutset, 467
Cutset formulation
of minimum spanning tree problem, 467
of traveling salesman problem, 470
Cutting plane method
for integer programming, 480-484, 530
for linear programming, 236-239
for mixed integer programming, 524
Cutting stock problem, 234-236, 260, 263
Cycle
cost of, 278
directed, 269
in directed graphs, 269
in undirected graphs, 267
negative cost, 291
unsaturated, 301
Cyclic problems, 40
Cycling, 92
in primal simplex, 104-105, 130, 138
see also anticycling
D
DNA sequencing, 525
Dantzig-Wolfe decomposition, 239-254,
261-263, 264
Data fitting, 19-20
Decision variables, 3
Deep cuts, 380, 388
Degeneracy, 58-62, 536, 541
and interior point methods, 439
and uniqueness, 190-191
in assignment problems, 350
in dual, 163-164
in standard form, 59-60, 62
in transportation problems, 349
Degenerate basic solution, 58
Degree, 267
Delayed column generation, 236-238
Delayed constraint generation, 236, 263
Demand, 272
Determinant, 29
Devex rule, 94, 540
Diameter of a polyhedron, 126
Diet problem, 5, 40, 156, 260-261
Dijkstra's algorithm, 340-342, 343, 358
Dimension, 29, 30
of a polyhedron, 68
Disjunctive constraints, 454, 472-473
Dual algorithm, 157
Dual ascent
approximate, 266
in network flow problems, 266,
316-325, 357
steepest, 354
termination, 320
Dual plane, 122
Dual problem, 141, 142-146
optimal solutions, 215-216
Dual simplex method, 156-164, 536-537,
540-544
for network flow problems, 266,
323-325, 354, 358
geometry, 160
revised, 157
Dual variables
in network flow problems, 285
interpretation, 155-156
Duality for general LP, 183-187
Duality gap, 399
Duality in integer programming, 494-507
Duality in network flow problems, 312-316
Duality theorem, 146-155, 173, 184, 197,
199
Dynamic programming, 490-493, 530
integer knapsack problem, 236
zero-one knapsack problem, 491-493
traveling salesman problem, 490
E
Edge of a polyhedron, 53, 78
Edge of an undirected graph, 267
Efficient algorithm, see algorithm
Electric power, 10-11, 255-256, 564
Elementary direction, 316
Elementary row operation, 96
Ellipsoid, 364, 396
Ellipsoid method, 363-392
complexity, 377
for full-dimensional bounded polyhedra,
371
for optimization, 378-380
practical performance, 380
sliding objective, 379, 389
Enter the basis, 88
Epsilon-relaxation method, 266, 358
Evaluation problem, 517
Exponential number of constraints,
380-387, 465-472, 551-562
Exponential time, 33
Extreme point, 46, 50
see also basic feasible solution
Extreme ray, 67, 176-177, 197, 525
Euclidean norm, 27
F
Facility location problem, 453-454,
462-464, 476, 518, 565
Farkas' lemma, 165, 172, 197, 199
Feasible direction, 83, 129
Feasible set, 3
Feasible solution, 3
Finitely generated
cone, 196, 198
set , 182
Fixed charge network design problem, 476,
566
Fleet assignment problem, 537-544, 567
Flow, 272
feasible, 272
Flow augmentation, 304
Flow conservation, 272
Flow decomposition theorem, 298-300, 351
for circulations, 350
Floyd-Warshall algorithm, 355-356
Forcing constraints, 453
Ford-Fulkerson algorithm, 305-312, 357
Fourier-Motzkin elimination, 70-74, 79
Fractional programming, 36
Free variable, 3
elimination of, 5
Full-dimensional polyhedron, see polyhedron
Full rank, 30, 57
Full tableau, 98
G
Gaussian elimination, 33, 363
Global minimum, 15
Gomory cutting plane algorithm, 482-484
Graph, 267272
connected, 267, 268
directed, 268
undirected, 267
Graph coloring problem, 566-567
Graphical solution, 21-25
Greedy algorithm
for minimum spanning trees, 344, 356
Groundholding, 545
H
Halfspace, 43
Hamilton circuit, 521
Held-Karp lower bound, 502
Helly's theorem, 194
Heuristic algorithms, 480
Hirsch conjecture, 126-127
Hungarian method, 266, 320, 323, 358
Hyperplane, 43
I
Identity matrix, 28
Incidence matrix, 277, 457
truncated, 280
Independent set problem, 484
Initialization
affine scaling algorithm, 403
Dantzig-Wolfe decomposition, 250-251
negative cost cycle algorithm, 294
network flow problems, 352
network simplex algorithm, 286
potential reduction algorithm, 416-418
primal path following algorithm,
429-431
primal-dual path following algorithm,
435-437
primal simplex method, 111-119
Inner product, 27
Instance of a problem, 360-361
size, 361
Integer programming, 12, 452
mixed, 452, 524
zero-one, 452, 517, 518
Interior, 395
Interior point methods, 393-449, 537
computational aspects, 439-440,
536-537, 540-544
Intree, 333
Inverse matrix, 28
Invertible matrix, 28
J
Job shop scheduling problem, 476,
551-563, 565, 567
K
Karush-Kuhn-Tucker conditions, 421
Knapsack problem
approximation algorithms, 507-509,
530
complexity, 518, 522
dynamic programming, 491-493, 530
integer, 236
zero-one, 453
König-Egerváry theorem, 352
L
Label correcting methods, 339-340
Labeling algorithm, 307-309, 357
Lagrange multiplier, 140, 494
Lagrangean, 140, 190
Lagrangean decomposition, 527-528
Lagrangean dual, 495
solution to, 502-507
Lagrangean relaxation, 496, 530
Leaf, 269
Length, of cycle, path, walk, 333
Leontief systems, 195, 200
Lexicographic pivoting rule, 108-111,
131-132, 137
in revised simplex, 132
in dual simplex, 160
Libraries, see optimization libraries
Line, 63
Linear algebra, 26-31, 40, 137
Linear combination, 29
Linear inequalities, 165
inconsistent, 194
Linear programming, 2, 38
examples, 6-14
Linear programming relaxation, 12, 462
Linearly dependent vectors, 28
Linearly independent vectors, 28
Linearly independent constraints, 49
Local minimum, 15, 82, 131
Local search, 511-512, 530
Lot sizing problem, 475, 524
M
Marginal cost, 155-156
Marriage problem, 352
Matching problem, 470-472, 477-478
see also bipartite matching, stable
matching
Matrix, 26
identity, 28
incidence, 277
inversion, 363
inverse, 28
invertible, 28
nonsingular, 28
positive definite, 364
rotation, 368, 388
square, 28
Matrix inversion lemma, 131, 138
Max-flow min-cut theorem, 310-311, 351,
357
Maximization problems, 3
Maximum flow problem, 273, 301-312
Maximum satisfiability, 529-530
Min-cut problem, see cut
Mean cycle cost minimization, 355, 358
Minimum spanning tree problem, 343-345,
356, 358, 466, 477
multicut formulation, 476
Modeling languages, 534-535, 567
Moment problem, 35
Multicommodity flow problem, 13
Multiperiod problems, 10-11, 189
N
NP, 518, 531
NP-complete, 519, 531
NP-hard, 518, 531, 556
NSF fellowships, 459-461, 477
Nash equilibrium, 190
Negative cost cycle algorithm, 291-301, 357
largest improvement rule, 301, 351, 357
mean cost rule, 301, 357
Network, 272
Network flow problem, 13, 551
capacitated, 273, 291
circulation, 275
complementary slackness, 314
dual, 312-313, 357
formulation, 272-278
integrality of optimal solutions,
289-290, 300
sensitivity, 313-314
shortest paths, relation to, 334
single source, 275
uncapacitated, 273, 286
with lower bounds, 276, 277
with piecewise linear convex costs, 347
see also primal-dual method
Network simplex algorithm, 278-291,
356-357, 536
anticycling, 357, 358
dual, 323-325, 354
Newton
direction, 424, 432
method, 432-433, 449
step, 422
Node, 267, 268
labeled, 307
scanned, 307
sink, 272
source, 272
Node-arc incidence matrix, 277
truncated, 280
Nonbasic variable, 55
Nonsingular matrix, 28
Null variable, 192
Nullspace, 30
Nurse scheduling, 11-12, 40
O
Objective function, 3
Onetree, 501
Operation count, 32-34
Optimal control, 20-21, 40
Optimal cost , 3
Optimal solution, 3
to dual, 215-216
Optimality conditions
for LP problems, 82-87, 129, 130
for maximum flow problems, 310
for network flow problems, 298-300
Karush-Kuhn-Tucker, 421
Optimization libraries, 535-537, 567
Optimization problem, 517
Options pricing, 195
Order of magnitude, 32
Orthant, 65
Orthogonal vectors, 27
P
P, 515
Parametric programming, 217-221,
227-229
Path
augmenting, 304
directed, 269
in directed graphs, 269
in undirected graphs, 267
shortest, 333
unsaturated, 307
walk, 333
Path following algorithm, primal, 419-431
complexity, 431
initialization, 429-431
Path following algorithm, primal-dual,
431-438
complexity, 435
infeasible, 435-436
performance, 437-438
quadratic programming, 445-446
self-dual, 436-437
Path following algorithms, 395-396, 449,
542
Pattern classification, 14, 40
Perfect matching, 326, 353
see also matching problem
Perturbation of constraints and degeneracy,
60, 131-132, 541
Piecewise linear convex optimization,
16-17, 189, 347
Piecewise linear function, 15, 455
Pivot, 90, 158
Pivot column, 98
Pivot element , 98, 158
Pivot row, 98, 158
Pivot selection, 92-94
Pivoting rules, 92, 108-111
Polar cone, 198
Polar cone theorem, 198-199
Polyhedron, 42
containing a line, 63
full-dimensional, 365, 370, 375-377,
389
in standard form, 43, 53-58
isomorphic, 76
see also representation
Polynomial time, 33, 362, 515
Potential function, 409, 448
Potential reduction algorithm, 394,
409-419, 445, 448
complexity, 418, 442
initialization, 416-418
performance, 419
with line searches, 419, 443-444
Preemptive scheduling, 302, 357
Preflow-push methods, 266, 358
Preprocessing, 540
Price variable, 140
Primal algorithm, 157, 266
Primal problem, 141, 142
Primal-dual method, 266, 320, 321-323,
353, 357
Primal-dual path following method, see
path following algorithm
Probability consistency problem, 384-386
Problem, 360
Product of matrices, 28
Production and distribution problem, 475
Production planning, 7-10, 35, 40,
210-212, 229
Project management, 335-336
Projections of polyhedra, 70-74
Proper subset , 26
Pushing flow, 278
Q
Quadratic programming, 445-446
R
Rank, 30
Ray, 172
see also extreme ray
Recession cone, 175
Recognition problem, 515, 517
Reduced cost , 84
in network fow problems, 285
Reduction (of a problem to another), 515
Redundant constraints, 57-58
Reinversion, 107
Relaxation, see linear programming
relaxation
Relaxation algorithm, 266, 321, 358
Relaxed dual problem, 237
Representation
of bounded polyhedra, 67
of cones, 182, 198
of polyhedra, 179 183, 198
Requirement line, 122
Residual network, 295-297
Resolution theorem, 179, 198, 199
Restricted problem, 233
Revised dual simplex method, 157
Revised simplex method, 95-98,
105-107
lexicographic rule, 132
Rocket control, 21
Row
space, 30
vector, 26
zeroth, 99
Running time, 32, 362
S
Saddle point of Lagrangean, 190
Samuelson's substitution theorem, 195
Scaling
in auction algorithm, 332
in maximum fow problem, 352
in network fow problems, 358
Scanning a node, 307
Scheduling, 11-12, 302, 357, 551-563, 567
with precedence constraints, 556
Schwartz inequality, 27
Self-arc, 267
Sensitivity analysis, 201-215, 216-217
adding new equality constraint,
206-207
adding new inequality constraint,
204-206
adding new variable, 203-204
changes in a nonbasic column, 209
changes in a basic column, 210, 222-223
changes in b, 207-208, 212-215
changes in c, 208-209, 216-217
in network flow problems, 313-314
Separating hyperplane, 170
between disjoint polyhedra, 196
finding, 196
Separating hyperplane theorem, 170
Separation problem, 237, 382, 392, 555
Sequencing with setup times, 457-459, 518
Set covering problem, 456-457, 518
Set packing problem, 456-457, 518
Set partitioning problem, 456-457, 518
Setup times, 457-459, 518
Shadow price, 156
Shortest path problem, 273, 332-343
all-pairs, 333, 342-343, 355-356, 358
all-to-one, 333
relation to network flow problem, 333
Side constraints, 197, 526-527
Simplex, 120, 137
Simplex method, 90-91
average case behavior, 127-128, 138
column geometry, 119-123
computational efficiency, 124-128
dual, see dual simplex method
for degenerate problems, 92
for networks, see network simplex
full tableau implementation, 98-105,
105-107
history, 38
implementations, 94-108
initialization, 111-119
naive implementation, 94-95
performance, 536-537, 540-541
revised, see revised simplex method
termination, 91, 110
two-phase, 116-117
unbounded problems, 179
with upper bound constraints, 135
Simplex multipliers, 94, 161
Simplex tableau, 98
Simulated annealing, 512-514, 531
Size of an instance, 361
Slack variable, 6, 76
Sliding objective ellipsoid method, 379, 389
Smallest subscript rule, 94, 111, 137
Span, of a set of vectors, 29
Spanning path, 124
Spanning tree, 271-272
see also minimum spanning trees
Sparsity, 107, 108, 440, 536, 537
Square matrix, 28
Stable matching problem, 563, 567
Standard form, 4-5
reduction to, 5-6
visualization, 25
Steepest edge rule, 94, 540-543
Steiner tree problem, 391
Stochastic matrices, 194
Stochastic programming, 254-260, 264,
564
Strong duality, 148, 184
Strong formulations, 461-465
Strongly polynomial, 357
Subdifferential, 503
Subgradient, 215, 503, 504, 526
Subgradient algorithm, 505-506, 530
Submodular function minimization,
391-392
Subspace, 29
Subtour elimination
in the minimum spanning tree problem,
466
in the traveling salesman problem, 470
Supply, 272
Surplus variable, 6
Survivable network design problem, 391,
528-529
T
Theorems of the alternative, 166, 194
Total unimodularity, 357
Tour, 383, 469
Tournament problem, 347
Transformation (of a problem to another),
516
Transportation problem, 273, 274-275, 358
degeneracy, 349
Transpose, 27
Transshipment problem, 266
Traveling salesman problem, directed,
478, 518
branch and bound, 488-489
dynamic programming, 490
integer programming formulation, 477
Traveling salesman problem, undirected,
478, 518, 526, 565
approximation algorithm, 509-510, 528
integer programming formulation,
469-470, 476
local search, 511-512, 530
lower bound, 383-384, 501-502
with triangle inequality, 509-510, 521,
528
Tree, 269
of shortest paths, 333
see also spanning tree
Tree solution, 280
feasible, 280
Typography, 524
U
Unbounded cost, 3
Unbounded problem, 3
characterization, 177-179
Unique solution, 129, 130
to dual, 152, 190-191
Unit vector, 27
Unrestricted variable, see free variable
V
Vector, 26
Vehicle routing problem, 475
Vertex, 47, 50
see also basic feasible solution
Volume, 364
of a simplex, 390
von Neumann algorithm, 446448, 449
Vulnerability, 352
W
Walk
directed, 269
in directed graphs, 268
in undirected graphs, 267
Weak duality, 146, 184, 495
Weierstrass' theorem, 170, 199
Worstcase running time, 362
Z
Zeroth column, 98
Zeroth row, 99
Introduction to Linear Optimization
ATHENA SCIENTIFIC SERIES IN OPTIMIZATION AND NEURAL COMPUTATION
1. Dynamic Programming and Optimal Control, Vols. I and II, by Dimitri P. Bertsekas, 1995.
2. Nonlinear Programming, by Dimitri P. Bertsekas, 1995.
3. Neuro-Dynamic Programming, by Dimitri P. Bertsekas and John N. Tsitsiklis, 1996.
4. Constrained Optimization and Lagrange Multiplier Methods, by Dimitri P. Bertsekas, 1996.
5. Stochastic Optimal Control: The Discrete-Time Case, by Dimitri P. Bertsekas and Steven E. Shreve, 1996.
6. Introduction to Linear Optimization, by Dimitris Bertsimas and John N. Tsitsiklis, 1997.
Introduction to Linear Optimization
Dimitris Bertsimas
John N. Tsitsiklis
Massachusetts Institute of Technology
Athena Scientific, Belmont, Massachusetts
Athena Scientific
Post Office Box 391
Belmont, Mass. 02178-9998
U.S.A.
Email: athenasc@world.std.com
WWW information and orders: http://world.std.com/~athenasc/
Cover Design: Ann Gallager

© 1997 Dimitris Bertsimas and John N. Tsitsiklis

All rights reserved. No part of this book may be reproduced in any form by any electronic or mechanical means (including photocopying, recording, or information storage and retrieval) without permission in writing from the publisher.
Publisher's Cataloging-in-Publication Data

Bertsimas, Dimitris, Tsitsiklis, John N.
Introduction to Linear Optimization
Includes bibliographical references and index
1. Linear programming. 2. Mathematical optimization. 3. Integer programming. I. Title.
T57.74.B465 1997 519.7 96-78786
ISBN 1-886529-19-1
To Georgia, and to George Michael, who left us so early.
To Alexandra and Melina.
. 1. . . 2. and sources 1 1. xi 1. Extreme points. .2. . 1. 3.2. .4. 2. 1. 2. Variants o f the linear programming problem . . . and basic feasible solutions Polyhedra in standard form Degeneracy . . . .11. 3. . . 2. .8. .3. Optimality of extreme points Representation of bounded polyhedra * Projections of polyhedra: FourierMotzkin elimination * Summary .7. Examples o f linear programming problems Piecewise linear convex objective functions Graphical representation and solution Linear algebra background and notation Algorithms and operation counts Exercises . .8. History. Notes and sources . . . . . 1. . 1.1.6. .3. . . . . 1. . . . notes. .2.1.6. . . Exercises .9. 7. 2. . 1. 2. .5. . .4. . . . 2.1. 42 46 53 58 62 65 67 70 75 75 79 81 The simplex method Optimality conditions . 3. . . Development o f the simplex method .Contents Preface . . . .5. . Existence of extreme points . Implementations of the simplex method 3. . . . 2. . . Introduction . . . . . 2. 2 6 15 21 26 32 34 38 41 The geometry of linear programming Polyhedra and convex sets . . 82 87 94 vii . . 2. . . . 2. vertices. . . . 2. . . . 10.3. . .
. . . . . . 6. Notes and sources 4. Notes and sources 108 111 119 124 128 129 137 139 Duality theory . . Representation of polyhedra . 3. 6. . 4. . . . DantzigWolfe decomposition Stochastic programming and Benders decomposition Summary . 4. 5. . .3. .5. Exercises . . 5. Global dependence on the righthand side vector The set of all dual optimal solutions* .3.7.1. 6. 4.5. . . 140 142 146 155 156 165 169 174 179 183 186 187 199 201 Sensitivity analysis Local sensitivity analysis . . 3.8.2. . . 4.6. .8. The dual problem . 4. . . 3. .1.2.9. Anticycling: lexicography and Bland's rule Finding an initial basic feasible solution Column geometry and the simplex method Computational efficiency of the simplex method Summary . . 10. Notes and sources . . 5. Motivation .7.8. 202 212 215 216 217 221 222 229 231 Large scale optimization Delayed column generation The cutting stock problem Cutting plane methods . . 13. 3. . 4. . . 4. .6. . . .3.4. 5. 5. 6. .4. 6. . 6. 4. . . Optimal dual variables as marginal costs Standard form problems and the dual simplex method Farkas' lemma and linear inequalities . . 4. . . 3.viii Contents 3. Notes and sources .6. 10. Exercises .5.12.6. From separating hyperplanes to duality* Cones and extreme rays .11. General linear programming duality* Summary . The duality theorem . 5. . 5. Global dependence on the cost vector Parametric programming Summary .7. Exercises . 5.8.4. 4. 3.1.5.9. Exercises . .7.2. 4. 5.4. 4. 232 234 236 239 254 260 260 263 . 6. . 4. 6. 6. .
11. . . 7. . . . . . The key geometric result behind the ellipsoid method The ellipsoid method for the feasibility problem The ellipsoid method for optimization . 9.3.8. 265 7. Duality in network flow problems Dual ascent methods* . . . . 9. . . 10. Exercises . . . .5. . 9. . .3. . 7. . . 8. . 9. 10. . . . .6. Guidelines for strong formulations Modeling with exponentially many Summary Exercises . 8. 7. .Contents ix 7. .8.2. . . Formulation of the network flow problem The network simplex algorithm .3. . 7. 7. . Notes and sources 9.5.9. . . .2. . .2.1. . . .7.2. . 8.12. 10. . . . .13. . . Summary . . . .4.6.4. 452 461 465 472 472 . . 267 272 278 291 301 312 316 325 332 343 345 347 356 359 Complexity of linear programming and the ellipsoid method . 7. . 1. . Notes and sources 8. 360 363 370 378 380 387 388 392 393 Interior point methods The affine scaling algorithm Convergence of affine scaling* The potential reduction algorithm The primal path following algorithm The primaldual path following algorithm An overview Exercises . . . 7. 8. . . Exercises .5. . . . .6.8. 1. 8. .7. 8. . . The negative cost cycle algorithm The maximum flow problem . 9. . 7. .4.1. constraints . 10. . .5. . Network flow problems Graphs . 10. . . 7. The minimum spanning tree problem . 8. 10. . 7. 9. Problems with exponentially many constraints* Summary .7. . 395 404 409 419 431 438 440 448 451 Integer programming formulations Modeling techniques . . 7. 7. . . . Notes and sources .3. . . . The assignment problem and the auction algorithm . 8. 9. . .4. . 10. The shortest path problem . . . . 9. Efficient algorithms and computational complexity .
6. 480 485 490 494 507 511 512 514 522 523 530 . 11. 12.x Contents 10.8.6.2. 12. . Modeling languages for linear optimization .3.7.5. 1 1. Linear optimization libraries and general observations The fleet assignment problem The air traffic flow management problem The job shop scheduling problem Summary Exercises Notes and sources 534 535 537 544 551 562 563 567 569 579 References Index . .6. 12.11. 11. Notes and sources 477 479 Integer programming methods Cutting plane methods Branch and bound Dynamic programming Integer programming duality Approximation algorithms Local search Simulated annealing Complexity theory Summary Exercises Notes and sources 11.8. . 12. 11. 11. 12.10. The art in linear optimization 533 12.7.4. 12. 11. 12. 11.2.1.4.1. 11.3. 11. 12.9. 11. 11.5.
Preface

The purpose of this book is to provide a unified, insightful, and modern treatment of linear optimization, that is, linear programming, network flow problems, and discrete linear optimization. We discuss both classical topics, as well as the state of the art. We give special attention to theory, but also cover applications and present case studies. Our main objective is to help the reader become a sophisticated practitioner of (linear) optimization, or a researcher. More specifically, we wish to develop the ability to formulate fairly complex optimization problems, provide an appreciation of the main classes of problems that are practically solvable, describe the available solution methods, and build an understanding of the qualitative properties of the solutions they provide.

Our general philosophy is that insight matters most. For the subject matter of this book, this necessarily requires a geometric view. On the other hand, problems are solved by algorithms, and these can only be described algebraically. Hence, our focus is on the beautiful interplay between algebra and geometry. We build understanding using figures and geometric arguments, and then translate ideas into algebraic formulas and algorithms. Given enough time, we expect that the reader will develop the ability to pass from one domain to the other without much effort.

Another of our objectives is to be comprehensive, but economical. We have made an effort to cover and highlight all of the principal ideas in this field. However, we have not tried to be encyclopedic, or to discuss every possible detail relevant to a particular algorithm. Our premise is that once mature understanding of the basic principles is in place, further details can be acquired by the reader with little additional effort.

Our last objective is to bring the reader up to date with respect to the state of the art. This is especially true in our treatment of interior point methods, large scale optimization, and the presentation of case studies that stretch the limits of currently available algorithms and computers. The success of any optimization methodology hinges on its ability to deal with large and important problems. In that sense, the last chapter, on the art of linear optimization, is a critical part of this book. It will, we hope, convince the reader that progress on challenging problems requires both problem specific insight, as well as a deeper understanding of the underlying theory.

In any book dealing with linear programming, there are some important choices to be made regarding the treatment of the simplex method. Traditionally, the simplex method is developed in terms of the full simplex tableau, which tends to become the central topic. We have found that the full simplex tableau is a useful device for working out numerical examples. But other than that, we have tried not to overemphasize its importance. (Of course, when it comes to algorithms, we often have to specialize to the standard form.) In the same spirit, we separate the structural understanding of linear programming from the particulars of the simplex method. For example, we include a derivation of duality theory that does not rely on the simplex method.

Let us also mention another departure from many other textbooks. Introductory treatments often focus on standard form problems, which is sufficient for the purposes of the simplex method. On the other hand, this approach often leaves the reader wondering whether certain properties are generally true, and can hinder the deeper understanding of the subject. We depart from this tradition: we consider the general form of linear programming problems and define key concepts (e.g., extreme points) within this context.

Finally, this book contains a treatment of several important topics that are not commonly covered. These include a discussion of the column geometry and of the insights it provides into the efficiency of the simplex method, a unified view of delayed column generation and cutting plane methods, stochastic programming and Benders decomposition, the auction algorithm for the assignment problem, certain theoretical implications of the ellipsoid algorithm, a thorough treatment of interior point methods, and a whole chapter on the practice of linear optimization. There are also several noteworthy topics that are covered in the exercises, such as Leontief systems, strict complementarity, options pricing, von Neumann's algorithm, the connection between duality and the pricing of financial assets, submodular function minimization, and bounds for a number of integer programming problems.

Here is a chapter by chapter description of the book.

Chapter 1: Introduces the linear programming problem, together with a number of examples, and provides some background material on linear algebra.

Chapter 2: Deals with the basic geometric properties of polyhedra, focusing on the definition and the existence of extreme points, and emphasizing the interplay between the geometric and the algebraic viewpoints.

Chapter 3: Contains more or less the classical material associated with the simplex method, as well as a discussion of the column geometry. It starts with a high-level and geometrically motivated derivation of the simplex method. It then introduces the revised simplex method, and concludes with the simplex tableau. The usual topics of Phase I and anticycling are also covered.

Chapter 4: It is a comprehensive treatment of linear programming duality. The duality theorem is first obtained as a corollary of the simplex method. A more abstract derivation is also provided, based on the separating hyperplane theorem, which is developed from first principles. It ends with a deeper look into the geometry of polyhedra.

Chapter 5: Discusses sensitivity analysis, that is, the dependence of solutions and the optimal cost on the problem data, including parametric programming. It also develops a characterization of dual optimal solutions as subgradients of a suitably defined optimal cost function.

Chapter 6: Presents the complementary ideas of delayed column generation and cutting planes. These methods are first developed at a high level, and are then made concrete by discussing the cutting stock problem, Dantzig-Wolfe decomposition, stochastic programming, and Benders decomposition.

Chapter 7: Provides a comprehensive review of the principal results and methods for the different variants of the network flow problem. It contains representatives from all major types of algorithms: primal descent (the simplex method), dual ascent (the primal-dual method), and approximate dual ascent (the auction algorithm). The focus is on the major algorithmic ideas, rather than on the refinements that can lead to better complexity estimates.

Chapter 8: Includes a discussion of complexity, a development of the ellipsoid method, and a proof of the polynomiality of linear programming. It also discusses the equivalence of separation and optimization, and provides examples where the ellipsoid algorithm can be used to derive polynomial time results for problems involving an exponential number of constraints.

Chapter 9: Contains an overview of all major classes of interior point methods, including affine scaling, potential reduction, and path following (both primal and primal-dual) methods. It includes a discussion of the underlying geometric ideas and computational issues, as well as convergence proofs and complexity analysis.

Chapter 10: Introduces integer programming formulations of discrete optimization problems. It provides a number of examples, as well as some intuition as to what constitutes a "strong" formulation.

Chapter 11: Covers the major classes of integer programming algorithms, including exact methods (branch and bound, cutting planes, dynamic programming), approximation algorithms, and heuristic methods (local search and simulated annealing). It also introduces a duality theory for integer programming.

Chapter 12: Deals with the art in linear optimization, that is, the process of modeling, exploiting problem structure, and fine tuning of optimization algorithms. We discuss the relative performance of interior point methods and different variants of the simplex method, in a realistic large scale setting. We also give some indication of the size of problems that can be currently solved.

An important theme that runs through several chapters is the modeling, complexity, and algorithms for problems with an exponential number of constraints. We discuss modeling in Section 10.3, complexity in Section 8.5, algorithmic approaches in Chapters 6 and 8, and we conclude with a case study in Section 12.5.

There is a fair number of exercises that are given at the end of each chapter. Most of them are intended to deepen the understanding of the subject, or to explore extensions of the theory in the text, as opposed to routine drills. However, several numerical exercises are also included. Starred exercises are supposed to be fairly hard. A solutions manual for qualified instructors can be obtained from the authors.

We have made a special effort to keep the text as modular as possible, allowing the reader to omit certain topics without loss of continuity. For example, much of the material in Chapters 5 and 6 is rarely used in the rest of the book. Furthermore, in Chapter 7 (on network flow problems), a reader who has gone through the problem formulation (Sections 7.1-7.2) can immediately move to any later section in that chapter. Also, the interior point algorithms of Chapter 9 are not used later, with the exception of some of the applications in Chapter 12. Even within the core chapters (Chapters 1-4), there are many sections that can be skipped during a first reading. Some sections have been marked with a star indicating that they contain somewhat more advanced material that is not usually covered in an introductory course.

The book was developed while we took turns teaching a first-year graduate course at M.I.T., for students in engineering and operations research. The only prerequisite is a working knowledge of linear algebra. In fact, it is only a small subset of linear algebra that is needed (e.g., the concepts of subspaces, linear independence, and the rank of a matrix). However, these elementary tools are sometimes used in subtle ways, and some mathematical maturity on the part of the reader can lead to a better appreciation of the subject.

The book can be used to teach several different types of courses, depending on the students' background and the course's objectives. The first two suggestions below are one-semester variants that we have tried at M.I.T., but there are also other meaningful alternatives.

(a) Cover most of Chapters 1-7, and if time permits, cover a small number of topics from Chapters 9-12.

(b) An alternative could be the same as above, except that interior point algorithms (Chapter 9) are fully covered, replacing network flow problems (Chapter 7).

(c) A broad overview course can be constructed by concentrating on the easier material in most of the chapters. The core of such a course could consist of Chapter 1, Sections 2.1, 3.1, 4.1-4.3, 5.1, 7.1, 9.1, 10.1, some of the easier material in Chapter 11, and an application from Chapter 12.

(d) Finally, the book is also suitable for a half-course on integer programming, based on parts of Chapters 1 and 8, as well as Chapters 10-12.

There is a truly large literature on linear optimization, and we make no attempt to provide a comprehensive bibliography. To a great extent, the sources that we cite are either original references of historical interest, or recent texts where additional information can be found. For those topics that touch upon current research, we also provide pointers to recent journal articles.

We would like to express our thanks to a number of individuals. We are grateful to our colleagues Dimitri Bertsekas and Rob Freund, for many discussions on the subjects in this book, as well as for reading parts of the manuscript. Several of our students, colleagues, and friends have contributed by reading parts of the manuscript, providing critical comments, and working on the exercises: Jim Christodouleas, Thalia Chryssikou, Austin Frakt, David Gamarnik, Leon Hsu, Spyros Kontogiorgis, Peter Marbach, Gina Mourtzinou, Yannis Paschalidis, Georgia Perakis, Lakis Polymenakos, Jay Sethuraman, Sarah Stock, Paul Tseng, and Ben Van Roy. But mostly, we are grateful to our families for their patience, love, and support in the course of this long project.

Dimitris Bertsimas
John N. Tsitsiklis
Cambridge, January 1997
Chapter 1
Introduction

Contents
1.1. Variants of the linear programming problem
1.2. Examples of linear programming problems
1.3. Piecewise linear convex objective functions
1.4. Graphical representation and solution
1.5. Linear algebra background and notation
1.6. Algorithms and operation counts
1.7. Exercises
1.8. History, notes, and sources
In this chapter, we introduce linear programming, the problem of minimizing a linear cost function subject to linear equality and inequality constraints. We consider a few equivalent forms and then present a number of examples to illustrate the applicability of linear programming to a wide variety of contexts. We also solve a few simple examples and obtain some basic geometric intuition on the nature of the problem. The chapter ends with a review of linear algebra and of the conventions used in describing the computational requirements (operation count) of algorithms.

1.1 Variants of the linear programming problem

In this section, we pose the linear programming problem, discuss a few special forms that it takes, and establish some standard notation that we will be using. Rather than starting abstractly, we first state a concrete example, which is meant to facilitate understanding of the formal definition that will follow. The example we give is devoid of any interpretation. Later on, in Section 1.2, we will have ample opportunity to develop examples that arise in practical settings.

Example 1.1 The following is a linear programming problem:

    minimize    2x1 - x2 + 4x3
    subject to  x1 + x2 + x4 <= 2
                3x2 - x3      = 5
                x3 + x4      >= 3
                x1           >= 0
                x3           <= 0.

Here x1, x2, x3, and x4 are variables whose values are to be chosen to minimize the linear cost function 2x1 - x2 + 4x3, subject to a set of linear equality and inequality constraints. Some of these constraints, such as x1 >= 0 and x3 <= 0, amount to simple restrictions on the sign of certain variables. The remaining constraints are of the form a'x <= b, a'x = b, or a'x >= b, where a = (a1, a2, a3, a4) is a given vector, x = (x1, x2, x3, x4) is the vector of decision variables, and b is a given scalar. For example, in the first constraint, we have a = (1, 1, 0, 1) and b = 2.

We now generalize. In a general linear programming problem, we are given a cost vector c = (c1, ..., cn), and we seek to minimize a linear cost function c'x = c1x1 + ... + cnxn over all n-dimensional vectors x = (x1, ..., xn), subject to a set of linear equality and inequality constraints.

As discussed further in Section 1.5, all vectors are assumed to be column vectors, and are treated as such in matrix-vector products. Row vectors are indicated as transposes of (column) vectors. In particular, if a and x are two vectors in R^n, a'x is their inner product, equal to a1x1 + ... + anxn. However, whenever we refer to a vector x inside the text, we use the more economical notation x = (x1, ..., xn), even though x is a column vector. The reader who is unfamiliar with our notation may wish to consult Section 1.5 before continuing.
Let M1, M2, and M3 be some finite index sets, and suppose that for every i in any one of these sets, we are given an n-dimensional vector ai and a scalar bi, that will be used to form the ith constraint. Let also N1 and N2 be subsets of {1, ..., n} that indicate which variables xj are constrained to be nonnegative or nonpositive, respectively. We then consider the problem

    minimize    c'x
    subject to  ai'x >= bi,   i in M1,
                ai'x <= bi,   i in M2,                       (1.1)
                ai'x  = bi,   i in M3,
                xj >= 0,      j in N1,
                xj <= 0,      j in N2.

The variables x1, ..., xn are called decision variables, and a vector x satisfying all of the constraints is called a feasible solution or feasible vector. The set of all feasible solutions is called the feasible set or feasible region. If j is in neither N1 nor N2, there are no restrictions on the sign of xj, in which case we say that xj is a free or unrestricted variable. The function c'x is called the objective function or cost function. A feasible solution x* that minimizes the objective function (that is, c'x* <= c'x, for all feasible x) is called an optimal feasible solution or, simply, an optimal solution. The value of c'x* is then called the optimal cost. On the other hand, if for every real number K we can find a feasible solution x whose cost is less than K, we say that the optimal cost is minus infinity or that the cost is unbounded below. (Sometimes, we will abuse terminology and say that the problem is unbounded.) We finally note that there is no need to study maximization problems separately, because maximizing c'x is equivalent to minimizing the linear cost function -c'x.

An equality constraint ai'x = bi is equivalent to the two constraints ai'x <= bi and ai'x >= bi. In addition, any constraint of the form ai'x <= bi can be rewritten as (-ai)'x >= -bi. Finally, constraints of the form xj >= 0 or xj <= 0 are special cases of constraints of the form ai'x >= bi, where ai is a unit vector and bi = 0. We conclude that the feasible set in a general linear programming problem can be expressed exclusively in terms of inequality constraints of the form ai'x >= bi. Suppose that there is a total of m such constraints, indexed by i = 1, ..., m. Let b = (b1, ..., bm), and let A be the m x n matrix whose rows are the row vectors a1', ..., am'. Then, the constraints ai'x >= bi, i = 1, ..., m, can be expressed compactly in the form Ax >= b, and the linear programming problem can be written as

    minimize    c'x
    subject to  Ax >= b.                                     (1.2)
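The componentwise reading of Ax >= b translates directly into code. The following is a minimal plain-Python sketch, with made-up data A, b, and x (not taken from the text), checking that ai'x >= bi holds for every row:

```python
# made-up data: two constraints in two variables
A = [[1.0, 1.0],     # the rows a_i' of the constraint matrix
     [-1.0, 2.0]]
b = [2.0, 0.0]

def satisfies(A, x, b):
    """True if Ax >= b holds componentwise, i.e., a_i'x >= b_i for every i."""
    return all(sum(aij * xj for aij, xj in zip(row, x)) >= bi
               for row, bi in zip(A, b))

x = [1.0, 1.0]
# the row products are 2.0 >= 2.0 and 1.0 >= 0.0, so x is feasible
print(satisfies(A, x, b))   # prints True
```

The same function rejects an infeasible point such as (0, 0), since the first row gives 0 >= 2, which fails.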
Inequalities such as Ax >= b will always be interpreted componentwise; that is, for every i, the ith component of the vector Ax, which is ai'x, is greater than or equal to the ith component bi of the vector b.

Example 1.2 The linear programming problem in Example 1.1 can be rewritten as

    minimize    2x1 - x2 + 4x3
    subject to  -x1 - x2      - x4 >= -2
                      3x2 - x3     >=  5
                     -3x2 + x3     >= -5
                            x3 + x4 >= 3
                 x1                >=  0
                           -x3     >=  0,

which is of the same form as the problem (1.2), with c = (2, -1, 4, 0), b = (-2, 5, -5, 3, 0, 0), and

    A = [ -1  -1   0  -1
           0   3  -1   0
           0  -3   1   0
           0   0   1   1
           1   0   0   0
           0   0  -1   0 ].

Standard form problems

A linear programming problem of the form

    minimize    c'x
    subject to  Ax  = b                                      (1.3)
                 x >= 0,

is said to be in standard form. We provide an interpretation of problems in standard form. Suppose that x has dimension n and let A1, ..., An be the columns of A. Then, the constraint Ax = b can be written in the form

    A1 x1 + A2 x2 + ... + An xn = b.

Intuitively, there are n available resource vectors A1, ..., An, and a target vector b. We wish to "synthesize" the target vector b by using a nonnegative amount xi of each resource vector Ai, while minimizing the cost c1x1 + ... + cnxn, where ci is the unit cost of the ith resource. The following is a more concrete example.

Example 1.3 (The diet problem) Suppose that there are n different foods and m different nutrients, and that we are given the following table with the nutritional content of a unit of each food:

                 food 1   ...   food n
    nutrient 1    a11     ...    a1n
       ...        ...     ...    ...
    nutrient m    am1     ...    amn

Let A be the m x n matrix with entries aij. Note that the jth column Aj of this matrix represents the nutritional content of the jth food. Let b be a vector with the requirements of an ideal diet or, equivalently, a specification of the nutritional contents of an "ideal food." We then interpret the standard form problem as the problem of mixing nonnegative quantities xi of the available foods, to synthesize the ideal food at minimal cost. In a variant of this problem, the vector b specifies the minimal requirements of an adequate diet; in that case, the constraints Ax = b are replaced by Ax >= b, and the problem is not in standard form.

Reduction to standard form

As argued earlier, any linear programming problem, including the standard form problem (1.3), is a special case of the general form (1.1). We now argue that the converse is also true and that a general linear programming problem can be transformed into an equivalent problem in standard form. Here, when we say that the two problems are equivalent, we mean that given a feasible solution to one problem, we can construct a feasible solution to the other, with the same cost. In particular, the two problems have the same optimal cost and given an optimal solution to one problem, we can construct an optimal solution to the other.
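The column interpretation of Ax = b is easy to verify numerically: multiplying out Ax row by row yields the same vector as the combination x1 A1 + ... + xn An. A small plain-Python sketch with made-up data (not from the text):

```python
# made-up 2x3 matrix, stored by rows, and a vector x
A = [[1.0, 0.0, 2.0],
     [0.0, 1.0, 1.0]]
x = [3.0, 1.0, 0.5]

# Ax computed row by row (each entry is a row's inner product with x)
Ax = [sum(aij * xj for aij, xj in zip(row, x)) for row in A]

# the same vector, computed as a combination of the columns A_1, ..., A_n
cols = list(zip(*A))   # cols[j] is the column A_j
comb = [sum(x[j] * cols[j][i] for j in range(len(x))) for i in range(len(A))]

assert Ax == comb == [4.0, 1.5]
print("Ax equals the column combination x1*A1 + ... + xn*An")
```

Both computations give the vector (4, 1.5), illustrating that the standard form constraint asks for b to be synthesized from the columns of A.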
The problem transformation we have in mind involves two steps:

(a) Elimination of free variables: Given an unrestricted variable xj in a problem in general form, we replace it by xj+ - xj-, where xj+ and xj- are new variables on which we impose the sign constraints xj+ >= 0 and xj- >= 0. The underlying idea is that any real number can be written as the difference of two nonnegative numbers.

(b) Elimination of inequality constraints: Given an inequality constraint of the form

    ai1 x1 + ... + ain xn <= bi,

we introduce a new variable si and the standard form constraints

    ai1 x1 + ... + ain xn + si = bi,    si >= 0.

Such a variable si is called a slack variable. Similarly, an inequality constraint ai1 x1 + ... + ain xn >= bi can be put in standard form by introducing a surplus variable si and the constraints ai1 x1 + ... + ain xn - si = bi, si >= 0.

We conclude that a general problem can be brought into standard form and, therefore, we only need to develop methods that are capable of solving standard form problems.

Example 1.4 The problem

    minimize    2x1 + 4x2
    subject to  x1 + x2   >= 3
                3x1 + 2x2  = 14
                x1        >= 0,

is equivalent to the standard form problem

    minimize    2x1 + 4x2+ - 4x2-
    subject to  x1 + x2+ - x2- - x3 = 3
                3x1 + 2x2+ - 2x2-   = 14
                x1, x2+, x2-, x3   >= 0.

For example, given the feasible solution (x1, x2) = (6, -2) to the original problem, we obtain the feasible solution (x1, x2+, x2-, x3) = (6, 0, 2, 1) to the standard form problem, which has the same cost. Conversely, given the feasible solution (x1, x2+, x2-, x3) = (8, 0, 5, 0) to the standard form problem, we obtain the feasible solution (x1, x2) = (8, -5) to the original problem with the same cost.

In the sequel, we will often use the general form Ax >= b to develop the theory of linear programming. However, when it comes to algorithms, and especially the simplex and interior point methods, we will be focusing on the standard form Ax = b, x >= 0, which is computationally more convenient.

1.2 Examples of linear programming problems

In this section, we discuss a number of examples of linear programming problems. One of our purposes is to indicate the vast range of situations to which linear programming can be applied. Another purpose is to develop some familiarity with the art of constructing mathematical formulations of loosely defined optimization problems.
GP3. the marketing department established that the maximum demand for the first quarter of 1989 would be 1 . as revenue.000 units.800 for GP.000 units. Hence.000 units. GP2. 1 Introduction (a) The inhouse supplier of CPUs could provide at most 7.1 systems. In the previous quarters.1 . X4 . should DEC use alternative memory boards for its GP. instead of four 256K memory boards. DEC could address the shortage of 256K memory boards by using two alternative boards. (b) The supply of disk drives was uncertain and was estimated by the manufacturer to be in the range of 3. We refer to this way of configuring the systems as the constrained mode of production. The manufacturing staff would like to concentrate their efforts on either decreasing the shortage of disks or decreasing the shortage of 256K memory boards. The following decisions needed to be made: (a) The production plan for the first quarter of 1989.1 systems? ( d) A final decision that had to be made was related to tradeoffs be tween shortages of disk drives and of 256K memory boards. we introduce variables (in thousands) of GP. It was clear to the manufacturing staff that the problem had become complex. This can be accomplished by truncating each X i after the third decimal point.800 systems for the whole GP family. in order to address the disk drive shortage. and WS2 systems. WS. 500 orders for WS. and customer satisfaction were at risk.1 systems wanted a disk drive) .8 Chap. should DEC continue to manufacture products in the constrained mode. GP3. or should it plan to satisfy cus tomer preferences? (c) Concerning memory boards. and WS2 with no disk drive (although 3 out of 10 customers for GP. WS. respectively. 300 for GP3 systems. in the GP. and GP2. given the size of the demand and the size of the X l .1 . X 2 . due to debugging problems. X3 .000 alternative boards for the next quarter. Strictly speaking. (b) Concerning disk drive usage.1 . since 1000Xi stands for number of units.1 with one disk drive. 
that represent the number . profitability. they would like to know which alternative would have a larger effect on revenue. DEC had produced GP.000 to 16 .000 to 7. On the demand side. DEC could provide 4. and 3. and 400 orders for WS2 that had already been received and had to be fulfilled in the next quarter. to be produced in the next quarter. X5 . In addition. In order to model the problem that DEC faced.1 system.200 systems for the WS family. ( c) The supply of 256K memory boards was also limited in the range of 8. Included in these projections were 500 orders for GP2.1 . 3. it must be an integer.
Strictly speaking, since 1000xi stands for number of units, it must be an integer. This can be accomplished by truncating each xi after the third decimal point; given the size of the demand, this has a negligible effect and the integrality constraint on 1000xi can be ignored.

DEC had to make two distinct decisions: whether to use the constrained mode of production regarding disk drive usage, and whether to use alternative memory boards for the GP1 system. As a result, there are four different combinations of possible choices. We first develop a model for the case where alternative memory boards are not used and the constrained mode of production of disk drives is selected. The problem can be formulated as follows:

    maximize    60x1 + 40x2 + 30x3 + 30x4 + 15x5    (total revenue)

subject to the following constraints:

    x1 + x2 + x3 + x4 + x5 <= 7          (CPU availability)
    4x1 + 2x2 + 2x3 + 2x4 + x5 <= 8      (256K availability)
    x2 + x4 <= 3                         (disk drive availability)
    x1 <= 1.8                            (max demand for GP1)
    x3 <= 0.3                            (max demand for GP3)
    x1 + x2 + x3 <= 3.8                  (max demand for GP)
    x4 + x5 <= 3.2                       (max demand for WS)
    x2 >= 0.5                            (min demand for GP2)
    x4 >= 0.5                            (min demand for WS1)
    x5 >= 0.4                            (min demand for WS2)
    x1, x2, x3, x4, x5 >= 0.

Notice that the objective function is in millions of dollars. In some respects, this is a pessimistic formulation, because the 256K memory and disk drive availability were set to 8 and 3, respectively, which are the lowest values in the ranges that were estimated. It is actually of interest to determine the solution to this problem as the 256K memory availability ranges from 8 to 16, and the disk drive availability ranges from 3 to 7, because this provides valuable information on the sensitivity of the optimal solution on availability. In another respect, the formulation is optimistic because, for example, it assumes that the revenue from GP1 systems is 60x1 for any x1 <= 1.8, even though a demand for 1,800 GP1 systems is not guaranteed.

In order to accommodate the other three choices that DEC had, some of the problem constraints have to be modified, as follows. If we use the unconstrained mode of production for disk drives, the constraint x2 + x4 <= 3 is replaced by

    0.3x1 + 1.7x2 + 1.4x4 <= 3.

Furthermore, if we wish to use alternative memory boards in GP1 systems, we replace the constraint 4x1 + 2x2 + 2x3 + 2x4 + x5 <= 8 by the two constraints

    2x1 <= 4,
    2x2 + 2x3 + 2x4 + x5 <= 8.
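A model like this is easy to audit in code before handing it to a solver. The sketch below (plain Python; the coefficients are those of the constrained-mode formulation as stated above, and the candidate plan is made up, not claimed to be optimal) checks feasibility of a production plan and computes its revenue:

```python
# revenue coefficients, in $1000s per unit; plans x are in thousands of units,
# so revenue(x) comes out in millions of dollars
REVENUE = (60, 40, 30, 30, 15)   # GP1, GP2, GP3, WS1, WS2

def feasible(x):
    """Check a plan x = (x1, ..., x5) against the constrained-mode model."""
    x1, x2, x3, x4, x5 = x
    return (
        x1 + x2 + x3 + x4 + x5 <= 7                  # CPU availability
        and 4*x1 + 2*x2 + 2*x3 + 2*x4 + x5 <= 8      # 256K board availability
        and x2 + x4 <= 3                             # disk drives (constrained mode)
        and x1 <= 1.8 and x3 <= 0.3                  # max demand, GP1 and GP3
        and x1 + x2 + x3 <= 3.8                      # max demand, GP family
        and x4 + x5 <= 3.2                           # max demand, WS family
        and x2 >= 0.5 and x4 >= 0.5 and x5 >= 0.4    # firm orders already received
        and min(x) >= 0
    )

def revenue(x):
    return sum(r * xi for r, xi in zip(REVENUE, x))

plan = (1.2, 0.5, 0.3, 0.5, 0.4)   # a hypothetical plan, not the optimum
assert feasible(plan)
print(revenue(plan))   # 122.0 (million dollars)
```

The same checker can be rerun with the modified disk or memory board constraints to compare the four combinations of choices.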
The four combinations of choices lead to four different linear programming problems, each of which needs to be solved for a variety of parameter values, because, as discussed earlier, the right-hand side of some of the constraints is only known to lie within a certain range. Methods for solving linear programming problems, when certain parameters are allowed to vary, will be studied in Chapter 5, where this case study is revisited.

Multiperiod planning of electric power capacity

A state wants to plan its electricity capacity for the next T years. The state has a forecast of dt megawatts, presumed accurate, of the demand for electricity during year t = 1, ..., T. The existing capacity, which is in oil-fired plants, that will not be retired and will be available during year t, is et. There are two alternatives for expanding electric capacity: coal-fired or nuclear power plants. There is a capital cost of ct per megawatt of coal-fired capacity that becomes operational at the beginning of year t. The corresponding capital cost for nuclear power plants is nt. For various political and safety reasons, it has been decided that no more than 20% of the total capacity should ever be nuclear. Coal plants last for 20 years, while nuclear plants last for 15 years. A least cost capacity expansion plan is desired.

The first step in formulating this problem as a linear programming problem is to define the decision variables. Let xt and yt be the amount of coal (respectively, nuclear) capacity brought on line at the beginning of year t. Let wt and zt be the total coal (respectively, nuclear) capacity available in year t. The cost of a capacity expansion plan is therefore

    sum_{t=1}^{T} (ct xt + nt yt).

Since coal-fired plants last for 20 years, we have

    wt = sum_{s=max(1, t-19)}^{t} xs,    t = 1, ..., T.

Similarly, for nuclear power plants,

    zt = sum_{s=max(1, t-14)}^{t} ys,    t = 1, ..., T.

Since the available capacity must meet the forecasted demand, we require

    wt + zt + et >= dt,    t = 1, ..., T.

Finally, since no more than 20% of the total capacity should ever be nuclear, we have

    zt / (wt + zt + et) <= 0.2,

which can be written as

    0.8 zt - 0.2 wt <= 0.2 et.

Summarizing, the capacity expansion problem is as follows:

    minimize    sum_{t=1}^{T} (ct xt + nt yt)
    subject to  wt - sum_{s=max(1, t-19)}^{t} xs = 0,    t = 1, ..., T,
                zt - sum_{s=max(1, t-14)}^{t} ys = 0,    t = 1, ..., T,
                wt + zt + et >= dt,                      t = 1, ..., T,
                0.8 zt - 0.2 wt <= 0.2 et,               t = 1, ..., T,
                xt, yt >= 0,                             t = 1, ..., T.

We note that this formulation is not entirely realistic, because it disregards certain economies of scale that may favor larger plants. However, it can provide a ballpark estimate of the true cost.

A scheduling problem

In the previous examples, the choice of the decision variables was fairly straightforward. We now discuss an example where this choice is less obvious.

A hospital wants to make a weekly night shift (12pm-8am) schedule for its nurses. The demand for nurses for the night shift on day j is an integer dj, j = 1, ..., 7. Every nurse works 5 days in a row on the night shift. The problem is to find the minimal number of nurses the hospital needs to hire.

One could try using a decision variable yj equal to the number of nurses that work on day j. With this definition, however, we would not be able to capture the constraint that every nurse works 5 days in a row. For this reason, we choose the decision variables differently, and define xj as the number of nurses starting their week on day j. (For example, a nurse whose week starts on day 5 will work days 5, 6, 7, 1, 2.) We then have the following problem formulation:

    minimize    x1 + x2 + x3 + x4 + x5 + x6 + x7
    subject to  x1                + x4 + x5 + x6 + x7 >= d1
                x1 + x2                + x5 + x6 + x7 >= d2
                x1 + x2 + x3                + x6 + x7 >= d3
                x1 + x2 + x3 + x4                + x7 >= d4
                x1 + x2 + x3 + x4 + x5                >= d5
                     x2 + x3 + x4 + x5 + x6           >= d6
                          x3 + x4 + x5 + x6 + x7      >= d7
                xj >= 0,    xj integer.

This would be a linear programming problem, except for the constraint that each xj must be an integer, and we actually have a linear integer programming problem. One way of dealing with this issue is to ignore ("relax") the integrality constraints and obtain the so-called linear programming relaxation of the original problem. Because the linear programming problem has fewer constraints, and therefore more options, the optimal cost will be less than or equal to the optimal cost of the original problem. If the optimal solution to the linear programming relaxation happens to be integer, then it is also an optimal solution to the original problem. If it is not integer, we can round each xj upwards, thus obtaining a feasible, but not necessarily optimal, solution to the original problem. It turns out that for this particular problem, an optimal solution can be found without too much effort. However, this is the exception rather than the rule: finding optimal solutions to general integer programming problems is typically difficult; some methods will be discussed in Chapter 11.

Choosing paths in a communication network

Consider a communication network consisting of n nodes. Nodes are connected by communication links. A link allowing one-way transmission from node i to node j is described by an ordered pair (i, j). Let A be the set of all links. We assume that each link (i, j) in A can carry up to uij bits per second. There is a positive charge cij per bit transmitted along that link. Each node k generates data, at the rate of b^{kl} bits per second, that have to be transmitted to node l, either through a direct link (k, l) or by tracing a sequence of links. The problem is to choose paths along which all data reach their intended destinations, while minimizing the total cost. We allow the data with the same origin and destination to be split and be transmitted along different paths.

In order to formulate this problem as a linear programming problem, we introduce variables x_{ij}^{kl} indicating the amount of data with origin k and destination l that traverse link (i, j). Let

    b_i^{kl} =  b^{kl},    if i = k,
               -b^{kl},    if i = l,
                0,         otherwise.

Thus, b_i^{kl} is the net inflow at node i, from outside the network, of data with origin k and destination l. We then have the following formulation:

    minimize    sum_{(i,j) in A} sum_{k=1}^{n} sum_{l=1}^{n} cij x_{ij}^{kl}
    subject to  sum_{j: (i,j) in A} x_{ij}^{kl} - sum_{j: (j,i) in A} x_{ji}^{kl} = b_i^{kl},   i, k, l = 1, ..., n,
                sum_{k=1}^{n} sum_{l=1}^{n} x_{ij}^{kl} <= uij,                                 (i, j) in A,
                x_{ij}^{kl} >= 0,                                        (i, j) in A,  k, l = 1, ..., n.

The first constraint is a flow conservation constraint at node i for data with origin k and destination l. The expression sum_{j: (i,j) in A} x_{ij}^{kl} represents the amount of data with origin and destination k and l, respectively, that leave node i along some link. The expression sum_{j: (j,i) in A} x_{ji}^{kl} represents the amount of data with the same origin and destination that enter node i through some link. Finally, b_i^{kl} is the net amount of such data that enter node i from outside the network. The second constraint expresses the requirement that the total traffic through a link (i, j) cannot exceed the link's capacity.

This problem is known as the multicommodity flow problem, with the traffic corresponding to each origin-destination pair viewed as a different commodity. A mathematically similar problem arises when we consider a transportation company that wishes to transport several commodities from their origins to their destinations through a network. There is a version of this problem, known as the minimum cost network flow problem, in which we do not distinguish between different commodities. Instead, we are given the amount bi of external supply or demand at each node i, and the objective is to transport material from the supply nodes to the demand nodes, at minimum cost. The network flow problem, which is the subject of Chapter 7, contains as special cases some important problems such as the shortest path problem, the maximum flow problem, and the assignment problem.
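The conservation constraints are easy to sanity-check on a concrete instance. Below is a small plain-Python sketch, with a made-up 3-node network and a single origin-destination pair (from node 1 to node 3), that verifies the flow conservation equation at every node and the capacity constraints on every link:

```python
# made-up instance: links (i, j) with capacities u_ij, one commodity (k=1, l=3)
links = {(1, 2): 1.0, (2, 3): 1.0, (1, 3): 1.0}
rate = 1.0   # b^{13}: data generated at node 1 with destination node 3

# a candidate routing that splits traffic over the paths 1->3 and 1->2->3
x = {(1, 2): 0.4, (2, 3): 0.4, (1, 3): 0.6}   # x_ij for this commodity

def net_outflow(i):
    """Outgoing minus incoming flow at node i (left-hand side of conservation)."""
    out_ = sum(f for (a, b), f in x.items() if a == i)
    in_ = sum(f for (a, b), f in x.items() if b == i)
    return out_ - in_

# b_i^{kl} is +rate at the origin, -rate at the destination, and 0 elsewhere
expected = {1: rate, 2: 0.0, 3: -rate}
for i in (1, 2, 3):
    assert abs(net_outflow(i) - expected[i]) < 1e-9   # flow conservation
for link, f in x.items():
    assert 0.0 <= f <= links[link]                     # capacity constraints
print("routing is feasible")
```

Splitting the data over the two paths is exactly the freedom that the formulation allows; a full instance would carry one such set of variables per origin-destination pair.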
Pattern classification

We are given m examples of objects and for each one, say the ith one, a description of its features in terms of an n-dimensional vector ai. Objects belong to one of two classes, and for each example we are told the class that it belongs to. More concretely, suppose that each object is an image of an apple or an orange (these are our two classes). In this context, we can use a three-dimensional feature vector ai to summarize the contents of the ith image. The three components of ai (the features) could be the ellipticity of the object, the length of its stem, and its color, as measured in some scale. We are interested in designing a classifier which, given a new object (other than the originally available examples), will figure out whether it is an image of an apple or of an orange.

A linear classifier is defined in terms of an n-dimensional vector x and a scalar x_{n+1}, and operates as follows. Given a new object with feature vector a, the classifier declares it to be an object of the first class if a'x >= x_{n+1}, and of the second class if a'x < x_{n+1}. In words, a linear classifier makes decisions on the basis of a linear combination of the different features. Our objective is to use the available examples in order to design a "good" linear classifier. There are many ways of approaching this problem, but a reasonable starting point could be the requirement that the classifier must give the correct answer for each one of the available examples. Let S be the set of examples of the first class. We are then looking for some x and x_{n+1} that satisfy the constraints

    ai'x >= x_{n+1},    i in S,
    ai'x < x_{n+1},     i not in S.

Note that the second set of constraints involves a strict inequality and is not quite of the form arising in linear programming. This issue can be bypassed by observing that if some choice of x and x_{n+1} satisfies all of the above constraints, then there exists some other choice (obtained by multiplying x and x_{n+1} by a suitably large positive scalar) that satisfies

    ai'x >= x_{n+1},        i in S,
    ai'x <= x_{n+1} - 1,    i not in S.

We conclude that the search for a linear classifier consistent with all available examples is a problem of finding a feasible solution to a linear programming problem.
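As an illustration, the sketch below applies a fixed linear classifier to made-up two-dimensional data (neither the data nor the classifier come from the text; in practice x and x_{n+1} would be found by solving the feasibility problem just described) and checks the separation conditions in the strengthened form above:

```python
# made-up examples: feature vectors and their classes (True = first class)
examples = [((2.0, 3.0), True), ((3.0, 2.5), True),
            ((0.5, 1.0), False), ((1.0, 0.2), False)]

# a hypothetical classifier (x, x_{n+1}) that happens to separate the examples
x = (1.0, 1.0)
x_np1 = 3.0

def score(a):
    """The linear combination a'x on which the classifier decides."""
    return sum(ai * xi for ai, xi in zip(a, x))

for a, first_class in examples:
    if first_class:
        assert score(a) >= x_np1           # a'x >= x_{n+1}
    else:
        assert score(a) <= x_np1 - 1.0     # a'x <= x_{n+1} - 1
print("classifier separates all examples")
```

If no such (x, x_{n+1}) exists, the corresponding linear program is infeasible, which is itself useful information: the two classes are not linearly separable.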
1.3 Piecewise linear convex objective functions

All of the examples in the preceding section involved a linear objective function. However, there is an important class of optimization problems with a nonlinear objective function that can be cast as linear programming problems; these are examined next.

We first need some definitions:

Definition 1.1
(a) A function f : R^n -> R is called convex if for every x, y in R^n, and every lambda in [0, 1], we have

    f(lambda x + (1 - lambda)y) <= lambda f(x) + (1 - lambda)f(y).

(b) A function f : R^n -> R is called concave if for every x, y in R^n, and every lambda in [0, 1], we have

    f(lambda x + (1 - lambda)y) >= lambda f(x) + (1 - lambda)f(y).

Note that if x and y are vectors in R^n and if lambda ranges in [0, 1], then points of the form lambda x + (1 - lambda)y belong to the line segment joining x and y. The definition of a convex function refers to the values of f, as its argument traces this segment. If f were linear, the inequality in part (a) of the definition would hold with equality. The inequality therefore means that when we restrict attention to such a segment, the graph of the function lies no higher than the graph of a corresponding linear function; see Figure 1.1(a).

It is easily seen that a function f is convex if and only if the function -f is concave. Note that a function of the form f(x) = a_0 + sum_{i=1}^n a_i x_i, where a_0, ..., a_n are scalars, called an affine function, is both convex and concave. (It turns out that affine functions are the only functions that are both convex and concave.)

Convex (as well as concave) functions play a central role in optimization. We say that a vector x is a local minimum of f if f(x) <= f(y) for all y in the vicinity of x. We also say that x is a global minimum if f(x) <= f(y) for all y. A convex function cannot have local minima that fail to be global minima, and this property is of great help in designing efficient optimization algorithms.
Figure 1.1: (a) Illustration of the definition of a convex function. (b) A concave function. (c) A function that is neither convex nor concave; note that A is a local, but not global, minimum.

We now consider a generalization of linear programming, where the objective function is piecewise linear and convex rather than linear:

    minimize    max_{i=1,...,m} (c_i'x + d_i)
    subject to  Ax >= b.

Let c_1, ..., c_m be vectors in R^n, let d_1, ..., d_m be scalars, and consider the function f : R^n -> R defined by f(x) = max_{i=1,...,m} (c_i'x + d_i). A function of this form is called a piecewise linear convex function. A simple example is the absolute value function defined by f(x) = |x| = max{x, -x}. Such a function is convex, as a consequence of the following result.

Theorem 1.1 Let f_1, ..., f_m : R^n -> R be convex functions. Then, the function f defined by f(x) = max_{i=1,...,m} f_i(x) is also convex.

Proof. Let x, y in R^n and let lambda in [0, 1]. We have

    f(lambda x + (1 - lambda)y) = max_{i=1,...,m} f_i(lambda x + (1 - lambda)y)
                                <= max_{i=1,...,m} (lambda f_i(x) + (1 - lambda)f_i(y))
                                <= max_{i=1,...,m} lambda f_i(x) + max_{i=1,...,m} (1 - lambda)f_i(y)
                                = lambda f(x) + (1 - lambda)f(y).

As illustrated in Figure 1.2(b), a piecewise linear convex function can be used to approximate a general convex function.
Note that max_{i=1,...,m} (c_i'x + d_i) is equal to the smallest number z that satisfies z >= c_i'x + d_i for all i. For this reason, the optimization problem under consideration is equivalent to the linear programming problem

    minimize    z
    subject to  z >= c_i'x + d_i,  i = 1, ..., m,
                Ax >= b,

where the decision variables are z and x.

To summarize, linear programming can be used to solve problems with piecewise linear convex cost functions, and the latter class of functions can be used as an approximation of more general convex cost functions. On the other hand, such a piecewise linear approximation is not always a good idea because it can turn a smooth function into a nonsmooth one (piecewise linear functions have discontinuous derivatives).

We finally note that if we are given a constraint of the form f(x) <= h, where f is the piecewise linear convex function f(x) = max_{i=1,...,m} (f_i'x + g_i), such a constraint can be rewritten as

    f_i'x + g_i <= h,  i = 1, ..., m,

and linear programming is again applicable.

Figure 1.2: (a) A piecewise linear convex function of a single variable. (b) An approximation of a convex function by a piecewise linear convex function.
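As a concrete illustration (a sketch of ours, not from the text; it assumes SciPy's `linprog` is available), the piecewise linear convex function max(x_1 + x_2, 2 - x_1, 2 - x_2) can be minimized over R^2 by introducing the auxiliary variable z as above:

```python
import numpy as np
from scipy.optimize import linprog

# f(x) = max(x1 + x2, 2 - x1, 2 - x2); each bound z >= c_i'x + d_i
# is rewritten as c_i'x - z <= -d_i.
C = np.array([[ 1.0,  1.0],    # c_1, with d_1 = 0
              [-1.0,  0.0],    # c_2, with d_2 = 2
              [ 0.0, -1.0]])   # c_3, with d_3 = 2
d = np.array([0.0, 2.0, 2.0])

A_ub = np.hstack([C, -np.ones((3, 1))])   # variable order: (x1, x2, z)
res = linprog(c=[0.0, 0.0, 1.0], A_ub=A_ub, b_ub=-d,
              bounds=[(None, None)] * 3)
print(res.x, res.fun)   # x1 = x2 = 2/3, optimal cost z = 4/3
```

Summing the three inequalities gives 3z >= 4, so the optimal cost is 4/3, which the solver confirms.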
Problems involving absolute values

Consider a problem of the form

    minimize    sum_{i=1}^n c_i |x_i|
    subject to  Ax >= b,

where x = (x_1, ..., x_n), and where the cost coefficients c_i are assumed to be nonnegative. The cost criterion, being the sum of the piecewise linear convex functions c_i |x_i|, is easily shown to be piecewise linear and convex (Exercise 1.2). However, expressing this cost criterion in the form max_j (c_j'x + d_j) is a bit involved, and a more direct route is preferable. We observe that |x_i| is the smallest number z_i that satisfies x_i <= z_i and -x_i <= z_i, and we obtain the linear programming formulation

    minimize    sum_{i=1}^n c_i z_i
    subject to  Ax >= b,
                x_i <= z_i,   i = 1, ..., n,
                -x_i <= z_i,  i = 1, ..., n.

An alternative method for dealing with absolute values is to introduce new variables x_i^+ and x_i^-, constrained to be nonnegative, and let x_i = x_i^+ - x_i^-. (Our intention is to have x_i = x_i^+ or x_i = -x_i^-, depending on whether x_i is positive or negative.) We then replace every occurrence of |x_i| with x_i^+ + x_i^- and obtain the alternative formulation

    minimize    sum_{i=1}^n c_i (x_i^+ + x_i^-)
    subject to  Ax^+ - Ax^- >= b,
                x^+, x^- >= 0,

where x^+ = (x_1^+, ..., x_n^+) and x^- = (x_1^-, ..., x_n^-).

The relations x_i = x_i^+ - x_i^-, x_i^+ >= 0, x_i^- >= 0, are not enough to guarantee that |x_i| = x_i^+ + x_i^-, and the validity of this reformulation may not be entirely obvious. Let us assume for simplicity that c_i > 0 for all i. At an optimal solution to the reformulated problem, and for each i, we must have either x_i^+ = 0 or x_i^- = 0, because otherwise we could reduce both x_i^+ and x_i^- by the same amount and preserve feasibility, while reducing the cost, in contradiction of optimality. Having guaranteed that either x_i^+ = 0 or x_i^- = 0, the desired relation |x_i| = x_i^+ + x_i^- now follows.

The formal correctness of the two reformulations that have been presented here, and in a somewhat more general setting, is the subject of Exercise 1.5. We also note that the nonnegativity assumption on the cost coefficients c_i is crucial because, otherwise, the cost criterion is nonconvex.
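The second reformulation is easy to try numerically. The sketch below (our own illustration, not from the text; SciPy's `linprog` assumed) minimizes |x_1| + |x_2| subject to x_1 + x_2 >= 2, using the split variables x_i = x_i^+ - x_i^-:

```python
from scipy.optimize import linprog

# minimize |x1| + |x2|  subject to  x1 + x2 >= 2,
# with x_i = x_i^+ - x_i^- and x_i^+, x_i^- >= 0.
# Variable order: (x1^+, x1^-, x2^+, x2^-).
c = [1.0, 1.0, 1.0, 1.0]           # sum of x_i^+ + x_i^-
A_ub = [[-1.0, 1.0, -1.0, 1.0]]    # -(x1 + x2) <= -2
b_ub = [-2.0]
res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None)] * 4)

x1 = res.x[0] - res.x[1]
x2 = res.x[2] - res.x[3]
print(res.fun, x1 + x2)   # optimal cost 2, with x1 + x2 = 2
```

At the optimum, at most one variable of each pair x_i^+, x_i^- is nonzero, exactly as argued above.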
Example 1.5 Consider the problem

    minimize    2|x_1| + x_2
    subject to  x_1 + x_2 >= 4.

Our first reformulation yields

    minimize    2z_1 + x_2
    subject to  x_1 + x_2 >= 4,
                x_1 <= z_1,
                -x_1 <= z_1,

while the second yields

    minimize    2x_1^+ + 2x_1^- + x_2
    subject to  x_1^+ - x_1^- + x_2 >= 4,
                x_1^+ >= 0,
                x_1^- >= 0.

We now continue with some applications involving piecewise linear convex objective functions.

Data fitting

We are given m data points of the form (a_i, b_i), i = 1, ..., m, where a_i in R^n and b_i in R, and wish to build a model that predicts the value of the variable b from knowledge of the vector a. In such a situation, one often uses a linear model of the form b = a'x, where x is a parameter vector to be determined. Given a particular parameter vector x, the residual, or prediction error, at the ith data point is defined as |b_i - a_i'x|. Given a choice between alternative models, one should choose a model that "explains" the available data as best as possible, i.e., a model that results in small residuals.

One possibility is to minimize the largest residual. This is the problem of minimizing

    max_{i=1,...,m} |b_i - a_i'x|

with respect to x, subject to no constraints. Note that we are dealing here with a piecewise linear convex cost criterion. The following is an equivalent linear programming formulation:

    minimize    z
    subject to  b_i - a_i'x <= z,   i = 1, ..., m,
                -b_i + a_i'x <= z,  i = 1, ..., m,

the decision variables being z and x.
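For instance, fitting the one-parameter model b = ax to a few invented data points (a sketch of ours, with made-up data; SciPy assumed), the linear program above returns the parameter that balances the largest residuals:

```python
import numpy as np
from scipy.optimize import linprog

# Data points (a_i, b_i); model b = a * x with a scalar parameter x.
a = np.array([1.0, 2.0, 3.0])
b = np.array([1.0, 2.0, 2.0])
m = len(a)

# Variables (x, z): minimize z subject to
#   b_i - a_i x <= z   and   -b_i + a_i x <= z.
A_ub = np.block([[-a[:, None], -np.ones((m, 1))],
                 [ a[:, None], -np.ones((m, 1))]])
b_ub = np.concatenate([-b, b])
res = linprog(c=[0.0, 1.0], A_ub=A_ub, b_ub=b_ub,
              bounds=[(None, None)] * 2)
x, z = res.x
print(x, z)   # x = 0.8, and the largest residual is z = 0.4
```

At the optimum the second and third residuals are both equal to 0.4 with opposite signs, which is typical of min-max fits.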
In an alternative formulation, we could adopt the cost criterion

    sum_{i=1}^m |b_i - a_i'x|.

Since |b_i - a_i'x| is the smallest number z_i that satisfies b_i - a_i'x <= z_i and -b_i + a_i'x <= z_i, we obtain the formulation

    minimize    z_1 + ... + z_m
    subject to  b_i - a_i'x <= z_i,   i = 1, ..., m,
                -b_i + a_i'x <= z_i,  i = 1, ..., m.

In practice, one may wish to use the quadratic cost criterion sum_{i=1}^m (b_i - a_i'x)^2, in order to obtain a "least squares fit." This is a problem which is easier than linear programming; it can be solved using calculus methods, but its discussion is outside the scope of this book.

Optimal control of linear systems

Consider a dynamical system that evolves according to a model of the form

    x(t + 1) = Ax(t) + Bu(t),
    y(t) = c'x(t).

Here x(t) is the state of the system at time t, y(t) is the system output, assumed scalar, and u(t) is a control vector that we are free to choose subject to linear constraints of the form Du(t) <= d [these might include saturation constraints, i.e., hard bounds on the magnitude of each component of u(t)]. To mention some possible applications, this could be a model of an airplane, an engine, an electrical circuit, a mechanical system, a manufacturing system, or even a model of economic growth. We are also given the initial state x(0). In one possible problem, we are to choose the values of the control variables u(0), ..., u(T - 1) to drive the state x(T) to a target state, assumed for simplicity to be zero. In addition to zeroing the state, it is often desirable to keep the magnitude of the output small at all intermediate times, and we may wish to minimize

    max_{t=1,...,T-1} |y(t)|.

We then obtain the following linear programming problem:

    minimize    z
    subject to  -z <= y(t) <= z,             t = 1, ..., T - 1,
                x(t + 1) = Ax(t) + Bu(t),    t = 0, ..., T - 1,
                y(t) = c'x(t),               t = 1, ..., T - 1,
                Du(t) <= d,                  t = 0, ..., T - 1,
                x(T) = 0,
                x(0) = given.
Additional linear constraints on the state vectors x(t), or a more general piecewise linear convex cost function of the state and the control, can also be incorporated.

Rocket control

Consider a rocket that travels along a straight path. Let x_t, v_t, and a_t be the position, velocity, and acceleration, respectively, of the rocket at time t. By discretizing time and by taking the time increment to be unity, we obtain an approximate discrete-time model of the form

    x_{t+1} = x_t + v_t,
    v_{t+1} = v_t + a_t.

We assume that the acceleration a_t is under our control, as it is determined by the rocket thrust. In a rough model, the magnitude |a_t| of the acceleration can be assumed to be proportional to the rate of fuel consumption at time t.

Suppose that the rocket is initially at rest at the origin, that is, x_0 = 0 and v_0 = 0. We wish the rocket to take off and "land softly" at unit distance from the origin after T time units, that is, x_T = 1 and v_T = 0. Furthermore, we wish to accomplish this in an economical fashion. One possibility is to minimize the total fuel sum_{t=0}^{T-1} |a_t| spent subject to the preceding constraints. Alternatively, we may wish to minimize the maximum thrust required, which is max_t |a_t|. Under either alternative, the problem can be formulated as a linear programming problem (Exercise 1.6).
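The minimum-thrust variant can be written out explicitly. The sketch below is one possible formulation of ours, in the spirit of Exercise 1.6 rather than the book's own solution (SciPy assumed); it eliminates the states by unrolling the dynamics and treats the accelerations and the thrust bound z as the LP variables:

```python
import numpy as np
from scipy.optimize import linprog

T = 10
# With x_0 = v_0 = 0, unrolling x_{t+1} = x_t + v_t and v_{t+1} = v_t + a_t gives
#   v_T = sum_t a_t                 (must equal 0 for a soft landing)
#   x_T = sum_t (T - 1 - t) a_t     (must equal 1)
A_eq = np.vstack([np.ones(T), T - 1 - np.arange(T)])
A_eq = np.hstack([A_eq, np.zeros((2, 1))])   # z does not enter the dynamics
b_eq = [0.0, 1.0]

# |a_t| <= z, written as a_t - z <= 0 and -a_t - z <= 0.
I = np.eye(T)
A_ub = np.block([[ I, -np.ones((T, 1))],
                 [-I, -np.ones((T, 1))]])
b_ub = np.zeros(2 * T)

res = linprog(c=[0.0] * T + [1.0], A_ub=A_ub, b_ub=b_ub,
              A_eq=A_eq, b_eq=b_eq, bounds=[(None, None)] * (T + 1))
acc = res.x[:T]

# Simulate the dynamics to confirm a soft landing at unit distance.
xs, vs = 0.0, 0.0
for t in range(T):
    xs, vs = xs + vs, vs + acc[t]
print(xs, vs, res.fun)   # xs is (numerically) 1, vs is 0
```

The total-fuel variant is analogous, with one bound z_t per time step and objective sum_t z_t.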
1.4 Graphical representation and solution

In this section, we consider a few simple examples that provide useful geometric insights into the nature of linear programming problems. Our first example involves the graphical solution of a linear programming problem with two variables.

Example 1.6 Consider the problem

    minimize    -x_1 - x_2
    subject to  x_1 + 2x_2 <= 3,
                2x_1 + x_2 <= 3,
                x_1, x_2 >= 0.

The feasible set is the shaded region in Figure 1.3. In order to find an optimal solution, we proceed as follows. For any given scalar z, we consider the set of all points whose cost c'x is equal to z; this is the line described by the equation -x_1 - x_2 = z. Note that this line is perpendicular to the vector c = (-1, -1). Different values of z lead to different lines, all of them parallel to each other.
In particular, increasing z corresponds to moving the line z = -x_1 - x_2 along the direction of the vector c. Since we are interested in minimizing z, we would like to move the line as much as possible in the direction of -c, as long as we do not leave the feasible set. The best we can do is z = -2 (see Figure 1.3), and the vector x = (1, 1) is an optimal solution. Note that this is a corner of the feasible set.

Figure 1.3: Graphical solution of the problem in Example 1.6.

For a problem in three dimensions, the same approach can be used except that the set of points with the same value of c'x is a plane, instead of a line. This plane is again perpendicular to the vector c, and the objective is to slide this plane as much as possible in the direction of -c, as long as we do not leave the feasible set.

Example 1.7 Suppose that the feasible set is the unit cube, described by the constraints 0 <= x_i <= 1, i = 1, 2, 3, and that c = (-1, -1, -1). Then, the vector x = (1, 1, 1) is an optimal solution. Once more, the optimal solution happens to be a corner of the feasible set (Figure 1.4). (The concept of a "corner" will be defined formally in Chapter 2.)
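The graphical conclusions of Examples 1.6 and 1.7 are easy to confirm with an LP solver. The following sketch (ours, assuming SciPy is available) solves Example 1.6 and recovers the corner (1, 1):

```python
from scipy.optimize import linprog

# Example 1.6: minimize -x1 - x2 subject to
#   x1 + 2*x2 <= 3,  2*x1 + x2 <= 3,  x1, x2 >= 0.
res = linprog(c=[-1.0, -1.0],
              A_ub=[[1.0, 2.0], [2.0, 1.0]],
              b_ub=[3.0, 3.0],
              bounds=[(0, None), (0, None)])
print(res.x, res.fun)   # the corner (1, 1), with optimal cost -2
```

Simplex-type solvers return a corner of the feasible set, which matches the geometric picture of sliding the cost line in the direction of -c.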
This is not always the case and we have some additional possibilities that are illustrated by the example that follows.

Figure 1.4: The three-dimensional linear programming problem in Example 1.7.

Example 1.8 Consider the feasible set in R^2 defined by the constraints

    -x_1 + x_2 <= 1,
    x_1 >= 0,
    x_2 >= 0,

which is shown in Figure 1.5.

(a) For the cost vector c = (1, 1), it is clear that x = (0, 0) is the unique optimal solution.

(b) For the cost vector c = (1, 0), there are multiple optimal solutions, namely, every vector x of the form x = (0, x_2), with 0 <= x_2 <= 1, is optimal. Note that the set of optimal solutions is bounded.

(c) For the cost vector c = (0, 1), there are multiple optimal solutions, namely, every vector x of the form x = (x_1, 0), with x_1 >= 0, is optimal. In this case, the set of optimal solutions is unbounded (contains vectors of arbitrarily large magnitude).

(d) Consider the cost vector c = (-1, -1). For any feasible solution (x_1, x_2), we can always produce another feasible solution with less cost, by increasing the value of x_1. Therefore, no feasible solution is optimal. Furthermore, by considering vectors (x_1, x_2) with ever increasing values of x_1 and x_2, we can obtain a sequence of feasible solutions whose cost converges to -infinity. We therefore say that the optimal cost is -infinity.

(e) If we impose an additional constraint of the form x_1 + x_2 <= -2, it is evident that no feasible solution exists.
Figure 1.5: The feasible set in Example 1.8. For each choice of c, an optimal solution is obtained by moving as much as possible in the direction of -c.

To summarize the insights obtained from Example 1.8, we have the following possibilities:

(a) There exists a unique optimal solution.

(b) There exist multiple optimal solutions; in this case, the set of optimal solutions can be either bounded or unbounded.

(c) The optimal cost is -infinity, and no feasible solution is optimal.

(d) The feasible set is empty.

In principle, there is an additional possibility: an optimal solution does not exist even though the problem is feasible and the optimal cost is not -infinity; this is the case, for example, in the problem of minimizing 1/x subject to x > 0 (for every feasible solution, there exists another with less cost, but the optimal cost is not -infinity). We will see later in this book that this possibility never arises in linear programming.

In the examples that we have considered, if the problem has at least one optimal solution, then an optimal solution can be found among the corners of the feasible set. In Chapter 2, we will show that this is a general feature of linear programming problems, as long as the feasible set has at least one corner.
Visualizing standard form problems

We now discuss a method that allows us to visualize standard form problems even if the dimension n of the vector x is greater than three. The reason for wishing to do so is that when n <= 3, the feasible set of a standard form problem does not have much variety and does not provide enough insight into the general case. (In contrast, if the feasible set is described by constraints of the form Ax >= b, enough variety is obtained even if x has dimension three.)

Suppose that we have a standard form problem, and that the matrix A has dimensions m x n. In particular, the decision vector x is of dimension n and we have m equality constraints. We assume that m <= n and that the constraints Ax = b force x to lie on an (n - m)-dimensional set. (Intuitively, each constraint removes one of the "degrees of freedom" of x.) If we "stand" on that (n - m)-dimensional set and ignore the m dimensions orthogonal to it, the feasible set is only constrained by the linear inequality constraints x_i >= 0, i = 1, ..., n. In particular, if n - m = 2, the feasible set can be drawn as a two-dimensional set defined by n linear inequality constraints.

To illustrate this approach, consider the feasible set in R^3 defined by the constraints x_1 + x_2 + x_3 = 1 and x_1, x_2, x_3 >= 0 [Figure 1.6(a)], and note that n = 3 and m = 1. If we stand on the plane defined by the constraint x_1 + x_2 + x_3 = 1, then the feasible set has the appearance of a triangle in two-dimensional space. Furthermore, each edge of the triangle corresponds to one of the constraints x_1, x_2, x_3 >= 0; see Figure 1.6(b).

Figure 1.6: (a) An n-dimensional view of the feasible set. (b) An (n - m)-dimensional view of the same set.
1.5 Linear algebra background and notation

This section provides a summary of the main notational conventions that we will be employing. It also contains a brief review of those results from linear algebra that are used in the sequel.

Set theoretic notation

If S is a set and x is an element of S, we write x in S. A set can be specified in the form S = {x | x satisfies P}, as the set of all elements having property P. The cardinality of a finite set S is denoted by |S|. The union of two sets S and T is denoted by S u T, and their intersection by S n T. We use S\T to denote the set of all elements of S that do not belong to T. The notation S c T means that S is a subset of T, i.e., every element of S is also an element of T; in particular, S could be equal to T. If in addition S != T, we say that S is a proper subset of T. We use the symbol 0 (empty set) to denote the set with no elements. The symbols "there exists" and "for all" have their usual logical meanings.

We use R to denote the set of real numbers. For any real numbers a and b, we define the closed and open intervals [a, b] and (a, b), respectively, by

    [a, b] = {x in R | a <= x <= b},

and

    (a, b) = {x in R | a < x < b}.

Vectors and matrices

A matrix of dimensions m x n is an array of real numbers a_ij, with i = 1, ..., m, and j = 1, ..., n. Matrices will be always denoted by upper case boldface characters. If A is a matrix, we use a_ij or [A]_ij to refer to its (i, j)th entry. A row vector is a matrix with m = 1 and a column vector is a matrix with n = 1. The word vector will always mean column vector unless the contrary is explicitly stated. Vectors will be usually denoted by lower case boldface characters. We use R^n to indicate the set of all n-dimensional vectors. For any vector x in R^n, we use x_1, x_2, ..., x_n to indicate its components. The more economical notation x = (x_1, x_2, ..., x_n) will also be used even if we are referring to column vectors. We use 0 to denote the vector with all components equal to zero. The ith unit vector e_i is the vector with all components equal to zero except for the ith component which is equal to one.

The transpose A' of an m x n matrix A is the n x m matrix defined by

    [A']_ij = [A]_ji.

Similarly, if x is a vector in R^n, its transpose x' is the row vector with the same entries.

If x and y are two vectors in R^n, then

    x'y = y'x = sum_{i=1}^n x_i y_i.

This quantity is called the inner product of x and y. Two vectors are called orthogonal if their inner product is zero. Note that x'x >= 0 for every vector x, with equality holding if and only if x = 0. The expression sqrt(x'x) is the Euclidean norm of x and is denoted by ||x||. The Schwartz inequality asserts that for any two vectors of the same dimension, we have

    |x'y| <= ||x|| ||y||,

with equality holding if and only if one of the two vectors is a scalar multiple of the other.

If A is an m x n matrix, we use A_j to denote its jth column, that is, A_j = (a_1j, a_2j, ..., a_mj). (This is our only exception to the rule of using lower case characters to represent vectors.) We also use a_i to denote the vector formed by the entries of the ith row, that is, a_i = (a_i1, a_i2, ..., a_in).
otherwise. Such a matrix B. Let A be an m x n matrix with columns A i .1 = B . [ 1 �x Ax i s provided by where a� . . Matrix inversion Let A be a square matrix. that is. . a K . The identity matrix satisfies IA = A and BI = B for any matrices A. the notation x � 0 and x > 0 means that every component of x is nonnegative (respectively. B of dimensions compatible with those of I. . ) . We also have (AB ) ' = B' A' . it is not commutative. not all of them zero. AB is a matrix of dimensions m x k whose entries are given [AB ] ij n = I )A] ie [B ] £j = a� Bj . If x is a vector. . . they are called linearly = independent. x K E lRn .1 ) ' . . and B j is the jth column of B. B of dimensions m x n and n x k . . . the equality AB = BA is not always true.1 . We note that (A. If A is a matrix. called the inverse of A. a� are the rows of A. A matrix is called square if the number m of its rows is equal to the number n of its columns. which is a square matrix whose diagonal entries are equal to one and its offdiagonal entries are equal to zero. If there exists a square matrix B of the same dimensions satisfying AB = BA = I. i. 1 Introduction product Given two matrices A. Any vector x E lRn can be written in the form x = Z::: �= 1 Xi e i . £=1 where a� is the ith row of A. if A and B are invertible matrices of the same dimensions.1 A 1 Given a finite collection of vectors Xl . . .1 = (A . An equivalent definition of linear independence requires that . (AB ) C = A(BC) .28 their by Chap. Matrix multiplication is associative. . in general. is unique and is de noted by A . but . . then AB is also invertible and (AB ) . . . . We then have Ae i = A i . we say that A is invertible or non singular.e. respectively. the inequalities A � 0 and A > 0 have a similar meaning. which leads to n n eiXi = L Aeixi Ax = A L i= 1 i= 1 = n L Ai Xi · i= 1 A different representation o f the matrixvector product the formula a� x a� x Ax = . We use I to denote the identity matrix. 
such that z::: J: 1 a k xk = 0. Also. we say that they are linearly dependent if there exist real numbers al . positive) .
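The two representations of the product Ax given above, as a combination of the columns of A and as a stack of row inner products, can be checked numerically; a small NumPy sketch of ours:

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])
x = np.array([1.0, -1.0, 2.0])

# Column representation: Ax = sum_i x_i A_i.
col_form = sum(x[i] * A[:, i] for i in range(A.shape[1]))

# Row representation: the ith entry of Ax is a_i'x.
row_form = np.array([A[i, :] @ x for i in range(A.shape[0])])

print(col_form, row_form, A @ x)   # all three agree: [5. 11.]
```

The column form is the view used throughout linear programming, where Ax = b is read as expressing b in terms of the columns of A.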
We have the following result.

Theorem 1.2 Let A be a square matrix. Then, the following statements are equivalent:
(a) The matrix A is invertible.
(b) The matrix A' is invertible.
(c) The determinant of A is nonzero.
(d) The rows of A are linearly independent.
(e) The columns of A are linearly independent.
(f) For every vector b, the linear system Ax = b has a unique solution.
(g) There exists some vector b such that the linear system Ax = b has a unique solution.

Assuming that A is an invertible square matrix, an explicit formula for the solution x = A^{-1}b of the system Ax = b is given by Cramer's rule. Specifically, the jth component of x is given by

    x_j = det(A_j) / det(A),

where A_j is the same matrix as A, except that its jth column is replaced by b. Here, as well as later, the notation det(A) is used to denote the determinant of a square matrix A.

Subspaces and bases

A nonempty subset S of R^n is called a subspace of R^n if ax + by in S for every x, y in S and every a, b in R. Note that every subspace must contain the zero vector.

The span of a finite number of vectors x^1, ..., x^K in R^n is the subspace of R^n defined as the set of all vectors y of the form y = sum_{k=1}^K a_k x^k, where each a_k is a real number. Any such vector y is called a linear combination of x^1, ..., x^K.

Given a subspace S of R^n, with S != {0}, a basis of S is a collection of vectors that are linearly independent and whose span is equal to S. Every basis of a given subspace has the same number of vectors and this number is called the dimension of the subspace. In particular, the set {0} is a subspace and its dimension is defined to be zero. Note that one-dimensional subspaces are lines through the origin, and two-dimensional subspaces are planes through the origin. Finally, the dimension of R^n is equal to n and every proper subspace of R^n has dimension smaller than n. If, in addition, S != R^n, we say that S is a proper subspace.
If S is a proper subspace of R^n, then there exists a nonzero vector a which is orthogonal to S, that is, a'x = 0 for every x in S. More generally, if S has dimension m < n, there exist n - m linearly independent vectors that are orthogonal to S.

The result that follows provides some important facts regarding bases and linear independence.

Theorem 1.3 Suppose that the span S of the vectors x^1, ..., x^K has dimension m. Then:
(a) There exists a basis of S consisting of m of the vectors x^1, ..., x^K.
(b) If k <= m and x^1, ..., x^k are linearly independent, then we can form a basis of S by starting with x^1, ..., x^k, and choosing m - k of the vectors x^{k+1}, ..., x^K.

Proof. We only prove part (b), because (a) is the special case of part (b) with k = 0. If every vector x^{k+1}, ..., x^K can be expressed as a linear combination of x^1, ..., x^k, then every vector in the span of x^1, ..., x^K is also a linear combination of x^1, ..., x^k, and the latter vectors form a basis. (In particular, m = k.) Otherwise, at least one of the vectors x^{k+1}, ..., x^K is linearly independent from x^1, ..., x^k. By picking one such vector, we now have k + 1 of the vectors x^1, ..., x^K that are linearly independent. By repeating this process m - k times, we end up with the desired basis of S.

Let A be a matrix of dimensions m x n. The column space of A is the subspace of R^m spanned by the columns of A. The row space of A is the subspace of R^n spanned by the rows of A. The dimension of the column space is always equal to the dimension of the row space, and this number is called the rank of A. Clearly, rank(A) <= min{m, n}. The matrix A is said to have full rank if rank(A) = min{m, n}. Finally, the set {x in R^n | Ax = 0} is called the nullspace of A; it is a subspace of R^n and its dimension is equal to n - rank(A).

Affine subspaces

Let S_0 be a subspace of R^n and let x^0 be some vector. If we add x^0 to every element of S_0, this amounts to translating S_0 by x^0. The resulting set S can be defined formally by

    S = S_0 + x^0 = {x + x^0 | x in S_0}.

In general, S is not a subspace, because it does not necessarily contain the zero vector, and it is called an affine subspace. The dimension of S is defined to be equal to the dimension of the underlying subspace S_0.
.m . 5 Linear algebra background and notation 31 A s an example. and the affine subspace 8 also has dimension k. The vector a is perpendicular to S. and 8 is an affine subspace of �n . x E 8 if and only if x . For a second example. Hence. x k . . In particular. If the vectors x l . Hence. and 8 is an affine subspace. We conclude that 8 = {y + xO l y E 80 } . .XO ) o. x k b e some vectors i n �n . which we assume t o be nonempty. . each one of the constraints a�x = bi removes one degree of freedom from x.mj see Figure 1. or A(x . . xk are linearly independent . and we consider the set 8 = {x E �n I Ax = b } . = Consider a set S in � defined by a single equality constraint a/ x = b. X l . . For this case. Let us fix some xO such that Axo = b. then every x E S is of the form x = XO + A 1 X l + A 2 X2 . . If A has m linearly independent rows. A k are arbitrary scalars. thus reducing the dimension from n to n . . . Intuitively. the affine subspace 8 also has dimension n . their span has dimension k. . An arbitrary vector x belongs to 8 if and only if Ax = b = Axo . . we are given an m x n matrix A and a vector b E �m . . if a� are the rows of A. . let XO . 7: . 7 for an illustration. Figure 1 . .m . .Sec. S is a twodimensional affine subspace. and consider the set 8 of all vectors of the form where >'1 . 80 can be identified with the span of the vectors X l .xO belongs to the subspace 80 = {y I Ay = O } . 1 . If X l and x2 are linearly independent vectors that are orthogonal to a. . Let xO be an element of S. its nullspace 80 has dimension n .
1.6 Algorithms and operation counts

Optimization problems such as linear programming and, more generally, all computational problems are solved by algorithms. Loosely speaking, an algorithm is a finite set of instructions of the type used in common programming languages (arithmetic operations, conditional statements, read and write statements, etc.). Although the running time of an algorithm may depend substantially on clever programming or on the computer hardware available, we are interested in comparing algorithms without having to examine the details of a particular implementation. As a first approximation, this can be accomplished by counting the number of arithmetic operations (additions, multiplications, divisions, comparisons) required by an algorithm. This approach is often adequate even though it ignores the fact that adding or multiplying large integers or high-precision floating point numbers is more demanding than adding or multiplying single-digit integers. A more refined approach will be discussed briefly in Chapter 8.

Example 1.9
(a) Let a and b be vectors in R^n. The natural algorithm for computing the inner product a'b requires n multiplications and n - 1 additions, for a total of 2n - 1 arithmetic operations.
(b) Let A and B be matrices of dimensions n x n. The traditional way of computing AB forms the inner product of a row of A and a column of B to obtain an entry of AB. Since there are n^2 entries to be evaluated, a total of (2n - 1)n^2 arithmetic operations are involved.

In Example 1.9, an exact operation count was possible. However, for more complicated problems and algorithms, an exact count is usually very difficult. For this reason, we will settle for an estimate of the rate of growth of the number of arithmetic operations, as a function of the problem parameters. Thus, in Example 1.9, we might be content to say that the number of operations in the computation of an inner product increases linearly with n, and the number of operations in matrix multiplication increases cubically with n. This leads us to the order of magnitude notation that we define next.

Definition 1.2 Let f and g be functions that map positive numbers to positive numbers.
(a) We write f(n) = O(g(n)) if there exist positive numbers n_0 and c such that f(n) <= cg(n) for all n >= n_0.
(b) We write f(n) = Omega(g(n)) if there exist positive numbers n_0 and c such that f(n) >= cg(n) for all n >= n_0.
(c) We write f(n) = Theta(g(n)) if both f(n) = O(g(n)) and f(n) = Omega(g(n)) hold.
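The exact counts in Example 1.9 can be reproduced by instrumenting the naive algorithms. The following sketch is our own illustration; the counters tally one operation per addition or multiplication:

```python
def inner_product(a, b):
    """Naive a'b with an explicit operation count."""
    ops = 0
    total = a[0] * b[0]
    ops += 1
    for i in range(1, len(a)):
        total += a[i] * b[i]   # one multiplication and one addition
        ops += 2
    return total, ops

def mat_mul(A, B):
    """Naive n x n matrix product, counting operations via inner_product."""
    n = len(A)
    ops = 0
    C = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            col = [B[k][j] for k in range(n)]
            C[i][j], k_ops = inner_product(A[i], col)
            ops += k_ops
    return C, ops

n = 5
a = list(range(1, n + 1))
_, ops = inner_product(a, a)
print(ops)              # 2n - 1 = 9 operations

A = [[1.0] * n for _ in range(n)]
_, ops = mat_mul(A, A)
print(ops)              # (2n - 1) n^2 = 225 operations
```

Doubling n roughly doubles the inner product count and multiplies the matrix product count by eight, the linear versus cubic growth described above.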
Such algorithms are said to run in polynomial time. whereas the equation . problems with truly large values of n will always be difficult to handle.9 is predictable. the average running time is much more difficult to estimate. A similar argument applies to algorithms whose running time is O ( nk ) for some positive integer k. the "average" running time of an algorithm might be more relevant . For very small n. However. Example 1 .g. it is customary to estimate the running time for the worst possible input data of a given "size. It is then reasonable to expect that no matter how much technology improves. and for this reason. we have 3n3 + n 2 + 10 = 8 ( n3 ) . for n = 3. Example 1 . n log n = O ( n 2 ) . 6 Algorithms an d operation counts 33 For example. the worstcase approach is widely used. The classical method that eliminates one variable at a time ( Gaussian elimination ) is known to require O(n 3 ) arithmetic operations in order to either compute a solution or to decide that no solution exists. 10 (Operation count of linear system solvers and matrix inversion) Consider the problem of solving a system of n linear equations in n Is the O ( n3 ) running time of Gaussian elimination good or bad? Some perspective into this question is provided by the following observation: each time that technological advances lead to computer hardware that is faster by a factor of 8 ( presumably every few years ) . In such cases. we can increase the value of n that we can handle only by 1. or even to define. and n log n = O (n) . we can solve problems of twice the size than earlier possible. 1 . the exponential time algorithm is preferable. we might be interested in estimating its worstcase running time over all problems with a given number of variables and constraints. in practice. these are said to take at least exponential time." For example. To gain some perspective as to what happens for larger n. each time that computer hardware becomes faster by a factor of 2 . . 
Algorithms also exist whose running time is O (2cn ) . The equation IO n /100 = 10 7 X 1000 yields n = 12.Sec. The running time of the first is IO n /100 ( exponential ) and the running time of the second is lOn 3 ( polynomial ) . where n is a parameter representing problem size and c is a constant . While the running time of the algorithms considered in Example 1. instead of trying to estimate the running time for each possible choice of the input . This emphasis on the worst case is somewhat conservative and. Practical methods for matrix inversion also require O (n 3 ) arithmetic operations . suppose that we have access to a workstation that can execute 10 7 arithmetic operations per second and that we are willing to let it run for 1000 seconds. if we have an algorithm for linear programming. Let us figure out what size problems can each algorithm handle within this time frame. the running time of more complicated algorithms often depends on the numerical values of the input data. unknowns . For such algorithms and if c = 1. e. These facts will be of use later on. 1 1 Suppose that we have a choice of two algorithms.
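The arithmetic of Example 1.11 is easy to reproduce. The short script below (ours, not part of the text; the helper name `largest_n` is invented) finds the largest problem size each algorithm can handle within the stated budget of 10^7 operations per second for 1000 seconds:

```python
# Budget of Example 1.11: 10^7 arithmetic operations per second for 1000 seconds.
BUDGET = 10**7 * 1000  # 10^10 operations

def largest_n(running_time, budget):
    """Largest n such that running_time(n) <= budget (assumes monotone growth)."""
    n = 0
    while running_time(n + 1) <= budget:
        n += 1
    return n

def exponential(n):
    return 10**n / 100   # running time of the first algorithm

def polynomial(n):
    return 10 * n**3     # running time of the second algorithm

print(largest_n(exponential, BUDGET))  # 12
print(largest_n(polynomial, BUDGET))   # 1000
```

The polynomial time algorithm handles problems nearly two orders of magnitude larger, in agreement with the discussion in the text.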
The point of view emerging from the above discussion is that, as a first cut, it is useful to juxtapose polynomial and exponential time algorithms, the former being viewed as relatively fast and efficient, and the latter as relatively slow. This point of view is justified in many, but not all, contexts and we will be returning to it later in this book.

1.7 Exercises

Exercise 1.1 Suppose that a function f : ℝⁿ → ℝ is both concave and convex. Prove that f is an affine function.

Exercise 1.2 Suppose that f1, ..., fm are convex functions from ℝⁿ into ℝ and let f(x) = Σ_{i=1}^m fi(x).
(a) Show that if each fi is convex, so is f.
(b) Show that if each fi is piecewise linear and convex, so is f.

Exercise 1.3 Consider the problem of minimizing a cost function of the form c'x + f(d'x), subject to the linear constraints Ax ≥ b. Here, d is a given vector and the function f : ℝ → ℝ is as specified in Figure 1.8. Provide a linear programming formulation of this problem.

[Figure 1.8: The function f of Exercise 1.3.]

Exercise 1.4 Consider the problem

minimize 2x1 + 3|x2 − 10|
subject to |x1 + 2| + |x2| ≤ 5,

and reformulate it as a linear programming problem.
Exercise 1.5 Consider a linear optimization problem with absolute values, of the following form:

minimize  c'x + d'y
subject to  Ax + By ≤ b
            yi = |xi|,  for all i.

Assume that all entries of B and d are nonnegative.
(a) Provide two different linear programming formulations, along the lines discussed in Section 1.3.
(b) Show that the original problem and the two reformulations are equivalent in the sense that either all three are infeasible, or all three have the same optimal cost.
(c) Provide an example to show that if B has negative entries, the problem may have a local minimum that is not a global minimum. (It will be seen in Chapter 2 that this is never the case in linear programming problems. Hence, in the presence of such negative entries, a linear programming reformulation is implausible.)

Exercise 1.6 Provide linear programming formulations of the two variants of the rocket control problem discussed at the end of Section 1.3.

Exercise 1.7 Consider a school district with I neighborhoods, J schools, and G grades at each school. Each school j has a capacity of Cjg for grade g. In each neighborhood i, the student population of grade g is Sig. Finally, the distance of school j from neighborhood i is dij. Formulate a linear programming problem whose objective is to assign all students to schools, while minimizing the total distance traveled by all students. (You may ignore the fact that numbers of students must be integer.)

Exercise 1.8 (Road lighting) Consider a road divided into n segments that is illuminated by m lamps. Let pj be the power of the jth lamp. The illumination Ii of the ith segment is assumed to be Σ_{j=1}^m aij pj, where aij are known coefficients. Let Ii* be the desired illumination of segment i. We are interested in choosing the lamp powers pj so that the illuminations Ii are close to the desired illuminations Ii*. Provide a reasonable linear programming formulation of this problem. Note that the wording of the problem is loose and there is more than one possible formulation.

Exercise 1.9 (The moment problem) Suppose that Z is a random variable taking values in the set 0, 1, ..., K, with probabilities p0, p1, ..., pK, respectively. We are given the values of the first two moments E[Z] = Σ_{k=0}^K k pk and E[Z²] = Σ_{k=0}^K k² pk of Z, and we would like to obtain upper and lower bounds on the value of the fourth moment E[Z⁴] = Σ_{k=0}^K k⁴ pk of Z. Show how linear programming can be used to approach this problem.

Exercise 1.10 (Production and inventory planning) A company must deliver di units of its product at the end of the ith month. Material produced during a month can be delivered either at the end of the same month or can be stored as inventory and delivered at the end of a subsequent month; however, there is a storage cost of c1 dollars per month for each unit of product held in inventory. The year begins with zero inventory. If the company produces xi units in month i and xi+1 units in month i+1, it incurs a cost of c2|xi+1 − xi| dollars, reflecting the cost of switching to a new production level. Formulate a linear programming problem whose objective is to minimize the total cost of the production and inventory schedule over a period of twelve months. Assume that inventory left at the end of the year has no value and does not incur any storage costs.
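Several of the exercises above (1.4, 1.5, 1.8, and 1.10) lean on the same device: each term |t| in the objective is replaced by a new variable z constrained by z ≥ t and z ≥ −t, which minimization drives down to exactly |t|. A brute-force check of this equivalence on Exercise 1.4, over the integer points of its feasible set (the script and its names are ours, not from the text):

```python
# Exercise 1.4: minimize 2*x1 + 3*|x2 - 10| subject to |x1 + 2| + |x2| <= 5.
def original(x1, x2):
    return 2 * x1 + 3 * abs(x2 - 10)

def feasible(x1, x2):
    return abs(x1 + 2) + abs(x2) <= 5

# integer grid covering the feasible region
points = [(x1, x2) for x1 in range(-8, 9) for x2 in range(-6, 7)
          if feasible(x1, x2)]

best_original = min(original(x1, x2) for x1, x2 in points)

# reformulated objective: 2*x1 + 3*z with z >= x2 - 10 and z >= 10 - x2;
# for fixed (x1, x2) the smallest admissible z is max(x2 - 10, 10 - x2)
best_reformulated = min(2 * x1 + 3 * max(x2 - 10, 10 - x2)
                        for x1, x2 in points)

print(best_original, best_reformulated)  # 11 11, attained at (-2, 5)
```

The two objectives agree at every point, so the reformulated problem has the same optimal cost; the same substitution, together with extra linear constraints for the absolute values appearing in the constraints, yields a linear program.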
Exercise 1.11 (Optimal currency conversion) Suppose that there are N available currencies, and assume that one unit of currency i can be exchanged for rij units of currency j. (Naturally, we assume that rij > 0.) There also exist certain regulations that impose a limit ui on the total amount of currency i that can be exchanged on any given day. Suppose that we start with B units of currency 1 and that we would like to maximize the number of units of currency N that we end up with at the end of the day, through a sequence of currency transactions. Provide a linear programming formulation of this problem. Assume that for any sequence i1, ..., ik of currencies, we have

r_{i1 i2} r_{i2 i3} ··· r_{i_{k−1} i_k} r_{i_k i_1} ≤ 1,

which means that wealth cannot be multiplied by going through a cycle of currencies.

Exercise 1.12 (Chebychev center) Consider a set P described by linear inequality constraints, that is, P = {x ∈ ℝⁿ | ai'x ≤ bi, i = 1, ..., m}. A ball with center y and radius r is defined as the set of all points within (Euclidean) distance r from y. We are interested in finding a ball with the largest possible radius, which is entirely contained within the set P. (The center of such a ball is called the Chebychev center of P.) Provide a linear programming formulation of this problem.

Exercise 1.13 (Linear fractional programming) Consider the problem

minimize  (c'x + d)/(f'x + g)
subject to  Ax ≤ b
            f'x + g > 0.

Suppose that we have some prior knowledge that the optimal cost belongs to an interval [K, L]. Provide a procedure, that uses linear programming as a subroutine, and that allows us to compute the optimal cost within any desired accuracy. Hint: Consider the problem of deciding whether the optimal cost is less than or equal to a certain number.

Exercise 1.14 A company produces and sells two different products. The demand for each product is unlimited, but the company is constrained by cash availability and machine capacity. Each unit of the first and second product requires 3 and 4 machine hours, respectively. There are 20,000 machine hours available in the current production period. The production costs are $3 and $2 per unit of the first and second product, respectively. The selling prices of the first and second product are $6 and $5, respectively. The available cash is $4,000; furthermore, 45% of the sales revenues from the first product and 30% of the sales revenues from the second product will be made available to finance operations during the current period.
(a) Formulate a linear programming problem that aims at maximizing net income subject to the cash availability and machine capacity limitations.
(b) Solve the problem graphically to obtain an optimal solution.
(c) Suppose that the company could increase its available machine hours by 2,000, after spending $400 for certain repairs. Should the investment be made?
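For Exercise 1.12, one standard route (not spelled out in the exercise itself) is to observe that the ball of center y and radius r lies inside the halfspace {x | a'x ≤ b} exactly when a'y + r·||a|| ≤ b, which is linear in (y, r). A minimal numerical check of this condition on the unit square (our sketch; the function name is invented):

```python
import math

def ball_fits(A, b, y, r):
    """True if the ball of center y, radius r lies in {x | a_i'x <= b_i for all i}."""
    return all(
        sum(ai * yi for ai, yi in zip(a, y))
        + r * math.sqrt(sum(ai**2 for ai in a)) <= b_i + 1e-9
        for a, b_i in zip(A, b)
    )

# the unit square 0 <= x1, x2 <= 1, written as a_i'x <= b_i
A = [(-1, 0), (1, 0), (0, -1), (0, 1)]
b = [0, 1, 0, 1]

print(ball_fits(A, b, (0.5, 0.5), 0.5))   # True: the largest inscribed ball
print(ball_fits(A, b, (0.5, 0.5), 0.6))   # False: too large
```

Maximizing r subject to these constraints is then an ordinary linear program in the variables (y, r).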
Exercise 1.15 A company produces two kinds of products. A product of the first type requires 1/4 hours of assembly labor, 1/8 hours of testing, and $1.2 worth of raw materials. A product of the second type requires 1/3 hours of assembly, 1/3 hours of testing, and $0.9 worth of raw materials. Products of the first and second type have a market value of $9 and $8, respectively. Each day, there can be at most 90 hours of assembly labor and 80 hours of testing.
(a) Formulate a linear programming problem that can be used to maximize the daily profit of the company.
(b) Consider the following two modifications to the original problem:
(i) Suppose that up to 50 hours of overtime assembly labor can be scheduled, at a cost of $7 per hour.
(ii) Suppose that the raw material supplier provides a 10% discount if the daily bill is above $300.
Which of the above two elements can be easily incorporated into the linear programming formulation and how? If one or both are not easy to incorporate, indicate how you might nevertheless solve the problem.

Exercise 1.16 A manager of an oil refinery has 8 million barrels of crude oil A and 5 million barrels of crude oil B allocated for production during the coming month. These resources can be used to make either gasoline, which sells for $38 per barrel, or home heating oil, which sells for $33 per barrel. There are three production processes with the following characteristics:

                     Process 1   Process 2   Process 3
Input crude A            3           1           5
Input crude B            5           1           3
Output gasoline          4           1           3
Output heating oil       3           1           4
Cost                   $51         $11         $40

All quantities are in barrels. For example, with the first process, 3 barrels of crude A and 5 barrels of crude B are used to produce 4 barrels of gasoline and 3 barrels of heating oil. The costs in this table refer to variable and allocated overhead costs, and there are no separate cost items for the cost of the crudes. Formulate a linear programming problem that would help the manager maximize net revenue over the next month.
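In one natural formulation of Exercise 1.16, the decision variable x_p is the number of times process p is run, and the objective coefficient of x_p is the net revenue per run. These coefficients, and the crude-availability constraint rows, can be computed directly from the table (a sketch of ours; the variable names are invented):

```python
GAS_PRICE, OIL_PRICE = 38, 33

#            crude A in, crude B in, gasoline out, heating oil out, cost
processes = [(3, 5, 4, 3, 51),
             (1, 1, 1, 1, 11),
             (5, 3, 3, 4, 40)]

# net revenue per run: outputs valued at selling prices, minus process cost
profit_per_run = [GAS_PRICE * gas + OIL_PRICE * oil - cost
                  for _, _, gas, oil, cost in processes]
print(profit_per_run)  # [200, 60, 206]

# crude availability constraints:
#   3*x1 + 1*x2 + 5*x3 <= 8_000_000   (crude A)
#   5*x1 + 1*x2 + 3*x3 <= 5_000_000   (crude B)
crude_A_row = [p[0] for p in processes]   # [3, 1, 5]
crude_B_row = [p[1] for p in processes]   # [5, 1, 3]
```

Maximizing the resulting linear objective subject to the two crude constraints and x ≥ 0 is the linear program the exercise asks for.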
Exercise 1.17 (Investment under taxation) An investor has a portfolio of n different stocks. He has bought si shares of stock i at price pi, i = 1, ..., n. The current price of one share of stock i is qi. The investor expects that the price of one share of stock i in one year will be ri. If he sells shares, the investor pays taxes at the rate of 30% on capital gains. In addition, the investor pays transaction costs at the rate of 1% of the amount transacted. For example, suppose that the investor sells 1,000 shares of a stock at $50 per share. He has bought these shares at $30 per share. He receives $50,000. However, he owes 0.30 × (50,000 − 30,000) = $6,000 on capital gain taxes and 0.01 × 50,000 = $500 on transaction costs. So, by selling 1,000 shares of this stock he nets 50,000 − 6,000 − 500 = $43,500. Formulate the problem of selecting how many shares the investor needs to sell in order to raise an amount of money K, net of capital gains and transaction costs, while maximizing the expected value of his portfolio next year.

Exercise 1.18 Show that the vectors in a given finite collection are linearly independent if and only if none of the vectors can be expressed as a linear combination of the others.

Exercise 1.19 Suppose that we are given a set of vectors in ℝⁿ that form a basis, and let y be an arbitrary vector in ℝⁿ. We wish to express y as a linear combination of the basis vectors. How can this be accomplished?

Exercise 1.20
(a) Let S = {Ax | x ∈ ℝⁿ}, where A is a given matrix. Show that S is a subspace of ℝⁿ.
(b) Assume that S is a proper subspace of ℝⁿ. Show that there exists a matrix B such that S = {y ∈ ℝⁿ | By = 0}. Hint: Use vectors that are orthogonal to S to form the matrix B.
(c) Suppose that V is an m-dimensional affine subspace of ℝⁿ, with m < n. Show that there exist linearly independent vectors a1, ..., a_{n−m}, and scalars b1, ..., b_{n−m}, such that V = {y | ai'y = bi, i = 1, ..., n−m}.
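The worked numbers in Exercise 1.17 can be checked directly (a sketch of ours; the function name is invented):

```python
def net_proceeds(n, p, q):
    """Cash netted by selling n shares bought at price p, now trading at q:
    sale amount, minus 30% tax on the capital gain, minus 1% transaction cost."""
    capital_gain = n * max(q - p, 0)
    return n * q - 0.30 * capital_gain - 0.01 * n * q

print(round(net_proceeds(1000, 30, 50), 2))  # 43500.0, the figure from the exercise
```

In the linear programming formulation, this net-proceeds expression (which is linear in the number of shares sold, for fixed prices) appears in the constraint that at least K dollars must be raised.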
1.8 History, notes, and sources

The word "programming" has been used traditionally by planners to describe the process of operations planning and resource allocation. In the 1940s, it was realized that this process could often be aided by solving optimization problems involving linear constraints and linear objectives. The term "linear programming" then emerged. The initial impetus came in the aftermath of World War II, within the context of military planning problems. In 1947, Dantzig proposed an algorithm, the simplex method, which made the solution of linear programming problems practical. There followed a period of intense activity during which many important problems in transportation, scheduling, military operations, etc., were cast in this framework. Since then, computer technology has advanced rapidly, the range of applications has expanded, new powerful methods have been discovered, and the underlying mathematical understanding has become deeper and more comprehensive. Today, linear programming is a routinely used tool that can be found in some spreadsheet software packages.

Dantzig's development of the simplex method has been a defining moment in the history of the field, because it came at a time of growing practical needs and of advances in computing technology. But, as is the case with most "scientific revolutions," the history of the field is much richer. Early work goes back to Fourier, who in 1824 developed an algorithm for solving systems of linear inequalities. Fourier's method is far less efficient than the simplex method, but this issue was not relevant at the time. In 1910, de la Vallée Poussin developed a method, similar to the simplex method, for minimizing max_i |bi − ai'x|, a problem that we discussed in Section 1.3.

In the late 1930s, the Soviet mathematician Kantorovich became interested in problems of optimal resource allocation in a centrally planned economy, for which he gave linear programming formulations. He also provided a solution method, but his work did not become widely known at the time. Around the same time, several models arising in classical, Walrasian, economics were studied and refined, and led to formulations closely related to linear programming. Koopmans, an economist, played an important role and eventually (in 1975) shared the Nobel Prize in economic science with Kantorovich. On the theoretical front, the mathematical structures that underlie linear programming were independently studied, in the period 1870-1930, by many prominent mathematicians, such as Farkas, Minkowski, Carathéodory, and others. Also, in 1928, von Neumann developed an important result in game theory that would later prove to have strong connections with the deeper structure of linear programming.

Subsequent to Dantzig's work, there has been much and important research in areas such as large scale optimization, network optimization, interior point methods, integer programming, and complexity theory. We defer the discussion of this research to the notes and sources sections of later chapters. For a more detailed account of the history of linear programming, the reader is referred to Schrijver (1986), Orden (1993), and the volume edited by Lenstra, Rinnooy Kan, and Schrijver (1991) (see especially the article by Dantzig in that volume).

There are several texts that cover the general subject of linear programming, starting with a comprehensive one by Dantzig (1963). Some more recent texts are Papadimitriou and Steiglitz (1982), Chvátal (1983), Murty (1983), Luenberger (1984), and Bazaraa, Jarvis, and Sherali (1990). Finally, Schrijver (1986) is a comprehensive, but more advanced, reference on the subject.
1.2. The formulation of the diet problem is due to Stigler (1945). The case study on DEC's production planning was developed by Freund and Shannahan (1992). Methods for dealing with the nurse scheduling and other cyclic problems are studied by Bartholdi, Orlin, and Ratliff (1980). More information on pattern classification can be found in Duda and Hart (1973), or Haykin (1994). Linear programming arises in control problems, in ways that are more sophisticated than what is described here; see, e.g., Dahleh and Diaz-Bobillo (1995).

1.3. A deep and comprehensive treatment of convex functions and their properties is provided by Rockafellar (1970).

1.5. For an introduction to linear algebra, see Strang (1988).

1.6. For a more detailed treatment of algorithms and their computational requirements, see Lewis and Papadimitriou (1981), Papadimitriou and Steiglitz (1982), or Cormen, Leiserson, and Rivest (1990).

1.7. Exercise 1.8 is adapted from Boyd and Vandenberghe (1995). Exercises 1.9 and 1.14 are adapted from Bradley, Hax, and Magnanti (1977). Exercise 1.11 is adapted from Ahuja, Magnanti, and Orlin (1993).
Chapter 2

The geometry of linear programming
Contents
2.1. Polyhedra and convex sets
2.2. Extreme points, vertices, and basic feasible solutions
2.3. Polyhedra in standard form
2.4. Degeneracy
2.5. Existence of extreme points
2.6. Optimality of extreme points
2.7. Representation of bounded polyhedra*
2.8. Projections of polyhedra: Fourier-Motzkin elimination*
2.9. Summary
2.10. Exercises
2.11. Notes and sources
In this chapter, we define a polyhedron as a set described by a finite number of linear equality and inequality constraints. In particular, the feasible set in a linear programming problem is a polyhedron. We study the basic geometric properties of polyhedra in some detail, with emphasis on their "corner points" (vertices). As it turns out, common geometric intuition derived from the familiar three-dimensional polyhedra is essentially correct when applied to higher-dimensional polyhedra. Another interesting aspect of the development in this chapter is that certain concepts (e.g., the concept of a vertex) can be defined either geometrically or algebraically. While the geometric view may be more natural, the algebraic approach is essential for carrying out computations. Much of the richness of the subject lies in the interplay between the geometric and the algebraic points of view.

Our development starts with a characterization of the corner points of feasible sets in the general form {x | Ax ≥ b}. Later on, we focus on the case where the feasible set is in the standard form {x | Ax = b, x ≥ 0}, and we derive a simple algebraic characterization of the corner points. The latter characterization will play a central role in the development of the simplex method in Chapter 3.

The main results of this chapter state that a nonempty polyhedron has at least one corner point if and only if it does not contain a line, and if this is the case, the search for optimal solutions to linear programming problems can be restricted to corner points. These results are proved for the most general case of linear programming problems using geometric arguments. The same results will also be proved in the next chapter, for the case of standard form problems, as a corollary of our development of the simplex method. Thus, the reader who wishes to focus on standard form problems may skip the proofs in Sections 2.5 and 2.6. Finally, Sections 2.7 and 2.8 can also be skipped during a first reading; any results in these sections that are needed later on will be rederived in Chapter 4, using different techniques.

2.1 Polyhedra and convex sets
In this section, we introduce some important concepts that will be used to study the geometry of linear programming, including a discussion of convexity.
Hyperplanes, halfspaces, and polyhedra
We start with the formal definition of a polyhedron.
Definition 2.1 A polyhedron is a set that can be described in the form {x ∈ ℝⁿ | Ax ≥ b}, where A is an m × n matrix and b is a vector in ℝᵐ.
As discussed in Section 1.1, the feasible set of any linear programming problem can be described by inequality constraints of the form Ax ≥ b, and is therefore a polyhedron. In particular, a set of the form {x ∈ ℝⁿ | Ax = b, x ≥ 0} is also a polyhedron and will be referred to as a polyhedron in standard form.
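As a concrete reading of these two representations, here are minimal membership checks (our sketch, not from the text; the function names are invented):

```python
# {x | Ax >= b}: every inequality must hold
def in_general_form(A, b, x):
    return all(sum(a * xj for a, xj in zip(row, x)) >= bi
               for row, bi in zip(A, b))

# {x | Ax = b, x >= 0}: equalities within tolerance, plus nonnegativity
def in_standard_form(A, b, x, tol=1e-9):
    equalities_hold = all(abs(sum(a * xj for a, xj in zip(row, x)) - bi) <= tol
                          for row, bi in zip(A, b))
    return equalities_hold and all(xj >= -tol for xj in x)

# a standard form polyhedron: x1 + x2 + x3 = 1, x >= 0 (the unit simplex)
A, b = [(1, 1, 1)], [1]
print(in_standard_form(A, b, (0.2, 0.3, 0.5)))   # True
print(in_standard_form(A, b, (0.5, 0.5, 0.5)))   # False: the sum is 1.5
```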
A polyhedron can either "extend to infinity," or can be confined in a finite region. The definition that follows refers to this distinction.
Definition 2.2 A set S ⊂ ℝⁿ is bounded if there exists a constant K such that the absolute value of every component of every element of S is less than or equal to K.

The next definition deals with polyhedra determined by a single linear constraint.
Definition 2.3 Let a be a nonzero vector in ℝⁿ and let b be a scalar.
(a) The set {x ∈ ℝⁿ | a'x = b} is called a hyperplane.
(b) The set {x ∈ ℝⁿ | a'x ≥ b} is called a halfspace.

Note that a hyperplane is the boundary of a corresponding halfspace. In addition, the vector a in the definition of the hyperplane is perpendicular to the hyperplane itself. [To see this, note that if x and y belong to the same hyperplane, then a'x = a'y. Hence, a'(x − y) = 0, and therefore a is orthogonal to any direction vector confined to the hyperplane.] Finally, note that a polyhedron is equal to the intersection of a finite number of halfspaces; see Figure 2.1.
Convex Sets
We now define the important notion of a convex set.

Definition 2.4 A set S ⊂ ℝⁿ is convex if for any x, y ∈ S, and any λ ∈ [0, 1], we have λx + (1 − λ)y ∈ S.
Note that if λ ∈ [0, 1], then λx + (1 − λ)y is a weighted average of the vectors x, y, and therefore belongs to the line segment joining x and y. Thus, a set is convex if the segment joining any two of its elements is contained in the set; see Figure 2.2. Our next definition refers to weighted averages of a finite number of vectors; see Figure 2.3.
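Definition 2.4 can be illustrated numerically: sample pairs of points from a halfspace and check that every convex combination stays inside it (an illustration of ours, not from the text; the particular halfspace is arbitrary):

```python
import random

# the halfspace S = {x in R^2 | x1 + 2*x2 >= 1}
random.seed(0)

def in_S(p):
    return p[0] + 2 * p[1] >= 1

def random_point_in_S():
    # rejection sampling from the box [-5, 5]^2
    while True:
        p = (random.uniform(-5, 5), random.uniform(-5, 5))
        if in_S(p):
            return p

for _ in range(1000):
    x, y = random_point_in_S(), random_point_in_S()
    lam = random.random()
    z = (lam * x[0] + (1 - lam) * y[0], lam * x[1] + (1 - lam) * y[1])
    # small tolerance guards against floating point rounding
    assert z[0] + 2 * z[1] >= 1 - 1e-9
print("all 1000 convex combinations stayed in S")
```

Of course, such sampling only illustrates convexity; the one-line algebraic argument in the proof of Theorem 2.1(b) below settles it for every halfspace.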
Figure 2.1: (a) A hyperplane and two halfspaces. (b) The polyhedron {x | ai'x ≥ bi, i = 1, ..., 5} is the intersection of five halfspaces. Note that each vector ai is perpendicular to the hyperplane {x | ai'x = bi}.
Definition 2.5 Let x¹, ..., x^k be vectors in ℝⁿ and let λ1, ..., λk be nonnegative scalars whose sum is unity.
(a) The vector Σ_{i=1}^k λi x^i is said to be a convex combination of the vectors x¹, ..., x^k.
(b) The convex hull of the vectors x¹, ..., x^k is the set of all convex combinations of these vectors.
The result that follows establishes some important facts related to convexity.
Theorem 2.1
(a) The intersection of convex sets is convex.
(b) Every polyhedron is a convex set.
(c) A convex combination of a finite number of elements of a convex set also belongs to that set.
(d) The convex hull of a finite number of vectors is a convex set.
Figure 2.2: The set S is convex, but the set Q is not, because the segment joining x and y is not contained in Q.

Figure 2.3: The convex hull of seven points in ℝ².

Proof.
(a) Let Si, i ∈ I, be convex sets, where I is some index set, and suppose that x and y belong to the intersection ∩_{i∈I} Si. Let λ ∈ [0, 1]. Since each Si is convex and contains x, y, we have λx + (1 − λ)y ∈ Si, which proves that λx + (1 − λ)y also belongs to the intersection of the sets Si. Therefore, ∩_{i∈I} Si is convex.

(b) Let a be a vector and let b be a scalar. Suppose that x and y satisfy a'x ≥ b and a'y ≥ b, respectively, and therefore belong to the same halfspace. Let λ ∈ [0, 1]. Then,

a'(λx + (1 − λ)y) ≥ λb + (1 − λ)b = b,

which proves that λx + (1 − λ)y also belongs to the same halfspace. Therefore a halfspace is convex. Since a polyhedron is the intersection of a finite number of halfspaces, the result follows from part (a).

(c) A convex combination of two elements of a convex set lies in that set, by the definition of convexity. Let us assume, as an induction hypothesis, that a convex combination of k elements of a convex set belongs to that set. Consider k + 1 elements x¹, ..., x^{k+1} of a convex set S and let λ1, ..., λ_{k+1} be nonnegative scalars that sum to 1. We assume, without loss of generality, that λ_{k+1} ≠ 1. We then have

Σ_{i=1}^{k+1} λi x^i = λ_{k+1} x^{k+1} + (1 − λ_{k+1}) Σ_{i=1}^{k} (λi / (1 − λ_{k+1})) x^i.    (2.1)

The coefficients λi/(1 − λ_{k+1}), i = 1, ..., k, are nonnegative and sum to unity; using the induction hypothesis, Σ_{i=1}^{k} (λi/(1 − λ_{k+1})) x^i ∈ S. Then, the fact that S is convex and Eq. (2.1) imply that Σ_{i=1}^{k+1} λi x^i ∈ S, and the induction step is complete.

(d) Let S be the convex hull of the vectors x¹, ..., x^k and let y = Σ_{i=1}^k ζi x^i, z = Σ_{i=1}^k θi x^i be two elements of S, where ζi ≥ 0, θi ≥ 0, and Σ_{i=1}^k ζi = Σ_{i=1}^k θi = 1. Let λ ∈ [0, 1]. Then,

λy + (1 − λ)z = λ Σ_{i=1}^k ζi x^i + (1 − λ) Σ_{i=1}^k θi x^i = Σ_{i=1}^k (λζi + (1 − λ)θi) x^i.

We note that the coefficients λζi + (1 − λ)θi, i = 1, ..., k, are nonnegative and sum to unity. This shows that λy + (1 − λ)z is a convex combination of x¹, ..., x^k and, therefore, belongs to S. This establishes the convexity of S. □

2.2 Extreme points, vertices, and basic feasible solutions

We observed in Section 1.4 that an optimal solution to a linear programming problem tends to occur at a "corner" of the polyhedron over which we are optimizing. In this section, we suggest three different ways of defining the concept of a "corner" and then show that all three definitions are equivalent.

Our first definition defines an extreme point of a polyhedron as a point that cannot be expressed as a convex combination of two other elements of the polyhedron, and is illustrated in Figure 2.4. Notice that this definition is entirely geometric and does not refer to a specific representation of a polyhedron in terms of linear constraints.

Definition 2.6 Let P be a polyhedron. A vector x ∈ P is an extreme point of P if we cannot find two vectors y, z ∈ P, both different from x, and a scalar λ ∈ [0, 1], such that x = λy + (1 − λ)z.
Figure 2.4: The vector w is not an extreme point because it is a convex combination of v and u. The vector x is an extreme point: if x = λy + (1 − λ)z and λ ∈ [0, 1], then either y ∉ P, or z ∉ P, or x = y, or x = z.

An alternative geometric definition defines a vertex of a polyhedron P as the unique optimal solution to some linear programming problem with feasible set P.

Definition 2.7 Let P be a polyhedron. A vector x ∈ P is a vertex of P if there exists some c such that c'x < c'y for all y satisfying y ∈ P and y ≠ x.

In other words, x is a vertex of P if and only if P is on one side of a hyperplane (the hyperplane {y | c'y = c'x}) which meets P only at the point x; see Figure 2.5.

Figure 2.5: The line at the bottom touches P at a single point and x is a vertex. On the other hand, w is not a vertex because there is no hyperplane that meets P only at w.

The two geometric definitions that we have given so far are not easy to work with from an algorithmic point of view. We would like to have a definition that relies on a representation of a polyhedron in terms of linear constraints and which reduces to an algebraic test. In order to provide such a definition, we need some more terminology. Consider a polyhedron P ⊂ ℝⁿ defined in terms of the linear equality and inequality constraints

ai'x ≥ bi,  i ∈ M1,
ai'x ≤ bi,  i ∈ M2,
ai'x = bi,  i ∈ M3,

where M1, M2, and M3 are finite index sets, each ai is a vector in ℝⁿ, and each bi is a scalar.

Definition 2.8 If a vector x* satisfies ai'x* = bi for some i in M1, M2, or M3, we say that the corresponding constraint is active or binding at x*.

The definition of an active constraint is illustrated in Figure 2.6. If there are n constraints that are active at a vector x*, then x* satisfies a certain system of n linear equations in n unknowns. This system has a unique solution if and only if these n equations are "linearly independent." The result that follows gives a precise meaning to this statement, together with a slight generalization.

Figure 2.6: Let P = {(x1, x2, x3) | x1 + x2 + x3 = 1, x1, x2, x3 ≥ 0}. There are three constraints that are active at each one of the points A, B, C, and D. There are only two constraints that are active at point E, namely x1 + x2 + x3 = 1 and x2 = 0.

Theorem 2.2 Let x* be an element of ℝⁿ and let I = {i | ai'x* = bi} be the set of indices of constraints that are active at x*. Then, the following are equivalent:
(a) There exist n vectors in the set {ai | i ∈ I} which are linearly independent.
(b) The span of the vectors ai, i ∈ I, is all of ℝⁿ, that is, every element of ℝⁿ can be expressed as a linear combination of the vectors ai, i ∈ I.
(c) The system of equations ai'x = bi, i ∈ I, has a unique solution.

Proof. Suppose that the vectors ai, i ∈ I, span ℝⁿ. Then, the span of these vectors has dimension n. By Theorem 1.3(a) in Section 1.5, n of these vectors form a basis of ℝⁿ, and are therefore linearly independent. Conversely, suppose that n of the vectors ai, i ∈ I, are linearly independent. Then, the subspace spanned by these n vectors is n-dimensional and must be equal to ℝⁿ. Hence, every element of ℝⁿ is a linear combination of the vectors ai, i ∈ I. This establishes the equivalence of (a) and (b).

If the system of equations ai'x = bi, i ∈ I, has multiple solutions, say x¹ and x², then the nonzero vector d = x¹ − x² satisfies ai'd = 0 for all i ∈ I. Since d is orthogonal to every vector ai, i ∈ I, d is not a linear combination of these vectors, and it follows that the vectors ai, i ∈ I, do not span ℝⁿ. Conversely, if the vectors ai, i ∈ I, do not span ℝⁿ, choose a nonzero vector d which is orthogonal to the subspace spanned by these vectors. If x satisfies ai'x = bi for all i ∈ I, we also have ai'(x + d) = bi for all i ∈ I, thus obtaining multiple solutions. We have therefore established that (b) and (c) are equivalent. □

With a slight abuse of language, we will often say that certain constraints are linearly independent, meaning that the corresponding vectors ai are linearly independent. With this terminology, statement (a) in Theorem 2.2 requires that there exist n linearly independent constraints that are active at x*.

We are now ready to provide an algebraic definition of a corner point, as a feasible solution at which there are n linearly independent active constraints. Note that since we are interested in a feasible solution, all equality constraints must be active.
This suggests the following way of looking for corner points: first impose the equality constraints and then require that enough additional constraints be active, so that we get a total of n linearly independent active constraints. Once we have n linearly independent active constraints, a unique vector x* is determined (Theorem 2.2). However, this procedure has no guarantee of leading to a feasible vector x*, because some of the inactive constraints could be violated; in the latter case we say that we have a basic (but not basic feasible) solution.

Definition 2.9 Consider a polyhedron P defined by linear equality and inequality constraints, and let x* be an element of ℝⁿ.
(a) The vector x* is a basic solution if:
(i) All equality constraints are active;
(ii) Out of the constraints that are active at x*, there are n of them that are linearly independent.
(b) If x* is a basic solution that satisfies all of the constraints, we say that it is a basic feasible solution.

In reference to Figure 2.6, we note that points A, B, and C are basic feasible solutions. Point D is not a basic solution because it fails to satisfy the equality constraint. Point E is feasible, but not basic. If the equality constraint x1 + x2 + x3 = 1 were to be replaced by the constraints x1 + x2 + x3 ≤ 1 and x1 + x2 + x3 ≥ 1, then D would be a basic solution, according to our definition. This shows that whether a point is a basic solution or not may depend on the way that a polyhedron is represented. Definition 2.9 is also illustrated in Figure 2.7.

Note that if the number m of constraints used to define a polyhedron P ⊂ ℝⁿ is less than n, the number of active constraints at any given point must also be less than n, and there are no basic or basic feasible solutions.

We have given so far three different definitions that are meant to capture the same concept; two of them are geometric (extreme point, vertex) and the third is algebraic (basic feasible solution). Fortunately, all three definitions are equivalent as we prove next and, for this reason, the three terms can be used interchangeably.

Theorem 2.3 Let P be a nonempty polyhedron and let x* ∈ P. Then, the following are equivalent:
(a) x* is a vertex;
(b) x* is an extreme point;
(c) x* is a basic feasible solution.
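Definition 2.9 can be made concrete on the polyhedron of Figure 2.6, P = {x ∈ ℝ³ | x1 + x2 + x3 = 1, x ≥ 0}. The sketch below (ours, not from the text; the helper names are invented) collects the active constraint vectors at a point and tests three of them for linear independence via a 3 × 3 determinant:

```python
def det3(m):
    # determinant of a 3x3 matrix given as three rows
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def active_rows(x, tol=1e-9):
    rows = [(1, 1, 1)]              # the equality constraint is always active on P
    for j in range(3):              # x_j >= 0 is active exactly when x_j = 0
        if abs(x[j]) <= tol:
            row = [0, 0, 0]
            row[j] = 1
            rows.append(tuple(row))
    return rows

vertex = (1.0, 0.0, 0.0)            # a corner of the simplex
edge_point = (0.5, 0.0, 0.5)        # interior point of an edge

rows = active_rows(vertex)
print(len(rows), det3(rows[:3]))    # 3 active constraints, nonzero determinant
print(len(active_rows(edge_point))) # only 2: feasible but not basic
```

At the vertex there are three linearly independent active constraints, so it is a basic feasible solution; at the edge point only two constraints are active, matching the situation of point E in Figure 2.6.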
If the equality constraint X l + X 2 + X3 1 were to be replaced by the constraints X l + X 2 + X3 � 1 and X l + X 2 + X3 � 1 .7.2) .9 Consider a polyhedron P defined by linear equality and inequality constraints. Definition 2. In reference to Figure 2. vertex ) and the third is algebraic ( basic feasible solution ) . but not basic. Note that if the number m of constraints used to define a polyhedron P C 3?n is less than n. This shows that whether a point is a basic solution or not may depend on the way that a polyhedron is represented.
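The algebraic test in Definition 2.9 can be carried out mechanically: collect the constraints active at a candidate point and check whether n of them are linearly independent. The following sketch is our own illustration, not part of the book's text; it uses exact rational arithmetic via Python's fractions module and applies the test to the polyhedron x1 + x2 + x3 = 1, x ≥ 0 discussed above.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a list of row vectors, by exact Gaussian elimination."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0
    for col in range(len(m[0]) if m else 0):
        piv = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if piv is None:
            continue
        m[r], m[piv] = m[piv], m[r]
        for i in range(len(m)):
            if i != r and m[i][col] != 0:
                f = m[i][col] / m[r][col]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def is_basic_feasible(x, eq, ineq):
    """eq: list of (a, b) with a'x = b; ineq: list of (a, b) with a'x >= b."""
    n = len(x)
    dot = lambda a: sum(ai * xi for ai, xi in zip(a, x))
    if any(dot(a) != b for a, b in eq):
        return False                       # an equality constraint is violated
    if any(dot(a) < b for a, b in ineq):
        return False                       # infeasible
    active = [a for a, b in eq] + [a for a, b in ineq if dot(a) == b]
    return rank(active) == n               # n lin. indep. active constraints?

# P = {(x1, x2, x3) | x1 + x2 + x3 = 1, x >= 0}
eq = [((1, 1, 1), 1)]
ineq = [((1, 0, 0), 0), ((0, 1, 0), 0), ((0, 0, 1), 0)]
print(is_basic_feasible((1, 0, 0), eq, ineq))    # a corner: prints True
# a point on an edge, with only two active constraints:
print(is_basic_feasible((Fraction(1, 2), 0, Fraction(1, 2)), eq, ineq))  # prints False
```

The corner (1, 0, 0) has three linearly independent active constraints, while the edge point has only two, matching the discussion of points A through E above.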
Figure 2.7: The points A, B, C, D, E, F are all basic solutions because, at each one of them, there are two linearly independent constraints that are active. Points C, D, E, F are basic feasible solutions.

Proof. For the purposes of this proof and without loss of generality, we assume that P is represented in terms of constraints of the form a_i'x ≥ b_i and a_i'x = b_i.

Vertex ⇒ Extreme point
Suppose that x* ∈ P is a vertex. Then, by Definition 2.7, there exists some c ∈ R^n such that c'x* < c'y for every y satisfying y ∈ P and y ≠ x*. If y ∈ P, z ∈ P, y ≠ x*, z ≠ x*, and 0 ≤ λ ≤ 1, then c'x* < c'y and c'x* < c'z, which implies that c'x* < c'(λy + (1 − λ)z) and, therefore, x* ≠ λy + (1 − λ)z. Thus, x* cannot be expressed as a convex combination of two other elements of P and is, therefore, an extreme point (cf. Definition 2.6).

Extreme point ⇒ Basic feasible solution
Suppose that x* ∈ P is not a basic feasible solution; we will show that x* is not an extreme point of P. Let I = {i | a_i'x* = b_i}. Since x* is not a basic feasible solution, there do not exist n linearly independent vectors in the family a_i, i ∈ I. Thus, the vectors a_i, i ∈ I, lie in a proper subspace of R^n, and there exists some nonzero vector d ∈ R^n such that a_i'd = 0 for all i ∈ I. Let ε be a small positive number and consider the vectors y = x* + εd and z = x* − εd. Notice that a_i'y = a_i'x* = b_i for i ∈ I. Furthermore, for i ∉ I, we have a_i'x* > b_i and, provided that ε is small, we will also have a_i'y > b_i. (It suffices to choose ε so that ε|a_i'd| < a_i'x* − b_i for all i ∉ I.) Thus, when ε is small enough, y ∈ P and, by a similar argument, z ∈ P. We finally notice that x* = (y + z)/2, which implies that x* is not an extreme point.
Basic feasible solution ⇒ Vertex
Let x* be a basic feasible solution and let I = {i | a_i'x* = b_i}. Let c = Σ_{i∈I} a_i. We then have

c'x* = Σ_{i∈I} a_i'x* = Σ_{i∈I} b_i.

Furthermore, for any x ∈ P and any i, we have a_i'x ≥ b_i, and

c'x = Σ_{i∈I} a_i'x ≥ Σ_{i∈I} b_i.    (2.2)

This shows that x* is an optimal solution to the problem of minimizing c'x over the set P. Furthermore, equality holds in (2.2) if and only if a_i'x = b_i for all i ∈ I. Since x* is a basic feasible solution, there are n linearly independent constraints that are active at x*, and x* is the unique solution to the system of equations a_i'x = b_i, i ∈ I (Theorem 2.2). It follows that x* is the unique minimizer of c'x over the set P and, therefore, x* is a vertex of P. □

Since a vector is a basic feasible solution if and only if it is an extreme point, and since the definition of an extreme point does not refer to any particular representation of a polyhedron, we conclude that the property of being a basic feasible solution is also independent of the representation used. (This is in contrast to the definition of a basic solution, which is representation dependent, as pointed out in the discussion that followed Definition 2.9.)

We finally note the following important fact.

Corollary 2.1 Given a finite number of linear inequality constraints, there can only be a finite number of basic or basic feasible solutions.

Proof. Consider a system of m linear inequality constraints imposed on a vector x ∈ R^n. At any basic solution, there are n linearly independent active constraints. Since any n linearly independent active constraints define a unique point, it follows that different basic solutions correspond to different sets of n linearly independent active constraints. Therefore, the number of basic solutions is bounded above by the number of ways that we can choose n constraints out of a total of m. □

Although the number of basic and, therefore, basic feasible solutions is guaranteed to be finite, it can be very large. For example, the unit cube {x ∈ R^n | 0 ≤ x_i ≤ 1, i = 1, ..., n} is defined in terms of 2n constraints, but has 2^n basic feasible solutions.
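The counting argument behind Corollary 2.1 can be made concrete for the unit cube. The small enumeration below is our own sketch (not the book's): for the cube, a choice of n constraints out of the 2n is linearly independent exactly when it involves each coordinate once, and every such choice determines a feasible point.

```python
from itertools import combinations
from math import comb

# Unit cube in R^n: constraints x_i >= 0 and x_i <= 1 (2n constraints total).
# A basic solution needs n linearly independent active constraints; here a set
# of n constraints is linearly independent iff it touches each coordinate once.
n = 3
constraints = [(i, bound) for i in range(n) for bound in (0, 1)]

basic_feasible = set()
for choice in combinations(constraints, n):
    coords = [i for i, _ in choice]
    if len(set(coords)) == n:          # one constraint per coordinate
        x = [0] * n
        for i, bound in choice:
            x[i] = bound               # the unique point they determine
        basic_feasible.add(tuple(x))   # always feasible for the cube

print(comb(2 * n, n))       # upper bound from Corollary 2.1: prints 20
print(len(basic_feasible))  # actual basic feasible solutions, 2^3: prints 8
```

The bound C(2n, n) from the corollary is loose here: 20 candidate constraint sets, but only the 8 cube vertices survive the independence test.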
Adjacent basic solutions
Two distinct basic solutions to a set of linear constraints in R^n are said to be adjacent if we can find n − 1 linearly independent constraints that are active at both of them. In reference to Figure 2.7, D and E are adjacent to B; also, A and C are adjacent to D. If two adjacent basic solutions are also feasible, then the line segment that joins them is called an edge of the feasible set (see also Exercise 2.15).

2.3 Polyhedra in standard form

The definition of a basic solution (Definition 2.9) refers to general polyhedra. We will now specialize to polyhedra in standard form. The definitions and the results in this section are central to the development of the simplex method in the next chapter.

Let P = {x ∈ R^n | Ax = b, x ≥ 0} be a polyhedron in standard form, and let the dimensions of A be m × n, where m is the number of equality constraints. In most of our discussion of standard form problems, we will make the assumption that the m rows of the matrix A are linearly independent. (Since the rows are n-dimensional, this requires that m ≤ n.) At the end of this section, we show that when P is nonempty, linearly dependent rows of A correspond to redundant constraints that can be discarded; our linear independence assumption can therefore be made without loss of generality.

Recall that at any basic solution, there must be n linearly independent constraints that are active. Furthermore, every basic solution must satisfy the equality constraints Ax = b, which provides us with m active constraints; these are linearly independent because of our assumption on the rows of A. In order to obtain a total of n active constraints, we need to choose n − m of the variables x_i and set them to zero, which makes the corresponding nonnegativity constraints x_i ≥ 0 active. However, for the resulting set of n active constraints to be linearly independent, the choice of these n − m variables is not entirely arbitrary, as shown by the following result.

Theorem 2.4 Consider the constraints Ax = b and x ≥ 0 and assume that the m × n matrix A has linearly independent rows. A vector x ∈ R^n is a basic solution if and only if we have Ax = b, and there exist indices B(1), ..., B(m) such that:
(a) The columns A_B(1), ..., A_B(m) are linearly independent;
(b) If i ≠ B(1), ..., B(m), then x_i = 0.
Proof. Consider some x ∈ R^n and suppose that there are indices B(1), ..., B(m) that satisfy (a) and (b) in the statement of the theorem. The active constraints x_i = 0, i ≠ B(1), ..., B(m), and Ax = b imply that

Σ_{i=1}^m A_B(i) x_B(i) = Σ_{i=1}^n A_i x_i = Ax = b.

Since the columns A_B(i), i = 1, ..., m, are linearly independent, x_B(1), ..., x_B(m) are uniquely determined. Thus, the system of equations formed by the active constraints has a unique solution. By Theorem 2.2, there are n linearly independent active constraints, and this implies that x is a basic solution.

For the converse, we assume that x is a basic solution and we will show that conditions (a) and (b) in the statement of the theorem are satisfied. Let x_B(1), ..., x_B(k) be the components of x that are nonzero. Since x is a basic solution, the system of equations formed by the active constraints Σ_{i=1}^n A_i x_i = b and x_i = 0, i ≠ B(1), ..., B(k), has a unique solution (cf. Theorem 2.2); equivalently, the equation Σ_{i=1}^k A_B(i) x_B(i) = b has a unique solution. It follows that the columns A_B(1), ..., A_B(k) are linearly independent. [If they were not, we could find scalars λ_1, ..., λ_k, not all of them zero, such that Σ_{i=1}^k A_B(i) λ_i = 0. This would imply that Σ_{i=1}^k A_B(i) (x_B(i) + λ_i) = b, contradicting the uniqueness of the solution.] We have shown that the columns A_B(1), ..., A_B(k) are linearly independent, and this implies that k ≤ m. Since A has m linearly independent rows, it also has m linearly independent columns, which span R^m. It follows [cf. Theorem 1.3(b) in Section 1.5] that we can find m − k additional columns A_B(k+1), ..., A_B(m) so that the columns A_B(i), i = 1, ..., m, are linearly independent. In addition, if i ≠ B(1), ..., B(m), then i ≠ B(1), ..., B(k) (because k ≤ m), and x_i = 0. Therefore, both conditions (a) and (b) in the statement of the theorem are satisfied. □

In view of Theorem 2.4, all basic solutions to a standard form polyhedron can be constructed according to the following procedure.

Procedure for constructing basic solutions
1. Choose m linearly independent columns A_B(1), ..., A_B(m).
2. Let x_i = 0 for all i ≠ B(1), ..., B(m).
3. Solve the system of m equations Ax = b for the unknowns x_B(1), ..., x_B(m).

If a basic solution constructed according to this procedure is nonnegative, then it is feasible, and it is a basic feasible solution. Conversely, since every basic feasible solution is a basic solution, it can be obtained from this procedure.
The variables x_B(1), ..., x_B(m) are called basic variables; the remaining variables are called nonbasic. The columns A_B(1), ..., A_B(m) are called the basic columns and, since they are linearly independent, they form a basis of R^m. We will sometimes talk about two bases being distinct or different; our convention is that distinct bases involve different sets {B(1), ..., B(m)} of basic indices; if two bases involve the same set of indices in a different order, they will be viewed as one and the same basis.

By arranging the m basic columns next to each other, we obtain an m × m matrix B, called a basis matrix. (Note that this matrix is invertible, because the basic columns are required to be linearly independent.) We can similarly define a vector x_B with the values of the basic variables:

B = [A_B(1) A_B(2) ... A_B(m)],   x_B = (x_B(1), ..., x_B(m)).

The basic variables are determined by solving the equation B x_B = b, whose unique solution is given by x_B = B^{-1} b.

Example 2.1 Let the constraint Ax = b be of the form

[1 1 2 1 0 0 0
 0 1 6 0 1 0 0
 1 0 0 0 0 1 0
 0 1 0 0 0 0 1] x = (8, 12, 4, 6).

Let us choose A4, A5, A6, A7 as our basic columns. Note that they are linearly independent and the corresponding basis matrix is the identity. We then obtain the basic solution x = (0, 0, 0, 8, 12, 4, 6), which is nonnegative and, therefore, is a basic feasible solution. Another basis is obtained by choosing the columns A3, A5, A6, A7 (note that they are linearly independent). The corresponding basic solution is x = (0, 0, 4, 0, −12, 4, 6), which is not feasible because x5 = −12 < 0.

Suppose now that there was an eighth column A8, identical to A7. Then, the two sets of columns {A3, A5, A6, A7} and {A3, A5, A6, A8} coincide. On the other hand, the corresponding sets of basic indices, which are {3, 5, 6, 7} and {3, 5, 6, 8}, are different, and we have two different bases.

For an intuitive view of basic solutions, recall our interpretation of the constraint Ax = b, or Σ_{i=1}^n A_i x_i = b, as a requirement to synthesize the vector b ∈ R^m using the resource vectors A_i (Section 1.1). In a basic solution, we use only m of the resource vectors, those associated with the basic variables. Furthermore, in a basic feasible solution, this is accomplished using a nonnegative amount of each basic vector; see Figure 2.8 for an illustration.
Figure 2.8: Consider a standard form problem with n = 4 and m = 2, and let the vectors b, A1, ..., A4 be as shown. The vectors A1, A2 form a basis; the corresponding basic solution is infeasible because a negative value of x2 is needed to synthesize b from A1 and A2. The vectors A1, A3 form another basis; the corresponding basic solution is feasible. Finally, the vectors A1, A4 do not form a basis because they are linearly dependent.

Correspondence of bases and basic solutions
We now elaborate on the correspondence between basic solutions and bases. Different basic solutions must correspond to different bases, because a basis uniquely determines a basic solution. However, two different bases may lead to the same basic solution. (For an extreme example, if we have b = 0, then every basis matrix leads to the same basic solution, namely, the zero vector.) This phenomenon has some important algorithmic implications, and is closely related to degeneracy, which is the subject of the next section.

Adjacent basic solutions and adjacent bases
Recall that two distinct basic solutions are said to be adjacent if there are n − 1 linearly independent constraints that are active at both of them. For standard form problems, we also say that two bases are adjacent if they share all but one basic column. Then, it is not hard to check that adjacent basic solutions can always be obtained from two adjacent bases. Conversely, if two adjacent bases lead to distinct basic solutions, then the latter are adjacent.

Example 2.2 In reference to Example 2.1, the bases {A4, A5, A6, A7} and {A3, A5, A6, A7} are adjacent because all but one columns are the same. The corresponding basic solutions x = (0, 0, 0, 8, 12, 4, 6) and x = (0, 0, 4, 0, −12, 4, 6) are adjacent: we have n = 7, and a total of six common linearly independent active constraints, namely x1 = 0, x2 = 0, and the four equality constraints.

The full row rank assumption on A
We close this section by showing that the full row rank assumption on the matrix A results in no loss of generality.

Theorem 2.5 Let P = {x | Ax = b, x ≥ 0} be a nonempty polyhedron, where A is a matrix of dimensions m × n, with rows a_1', ..., a_m'. Suppose that rank(A) = k < m and that the rows a_{i_1}', ..., a_{i_k}' are linearly independent. Consider the polyhedron

Q = {x | a_{i_1}'x = b_{i_1}, ..., a_{i_k}'x = b_{i_k}, x ≥ 0}.

Then Q = P.

Proof. We provide the proof for the case where i_1 = 1, ..., i_k = k, that is, the first k rows of A are linearly independent; the general case can be reduced to this one by rearranging the rows of A. Clearly P ⊂ Q, since any element of P automatically satisfies the constraints defining Q. We will now show that Q ⊂ P. Since rank(A) = k, the row space of A has dimension k and the rows a_1', ..., a_k' form a basis of the row space. Therefore, every row a_i' of A can be expressed in the form a_i' = Σ_{j=1}^k λ_{ij} a_j', for some scalars λ_{ij}. Let x be an element of P and note that, for any i = 1, ..., m,

b_i = a_i'x = Σ_{j=1}^k λ_{ij} a_j'x = Σ_{j=1}^k λ_{ij} b_j.

Consider now an element y of Q; we will show that it belongs to P. Indeed, for any i,

a_i'y = Σ_{j=1}^k λ_{ij} a_j'y = Σ_{j=1}^k λ_{ij} b_j = b_i,

which establishes that y ∈ P and Q ⊂ P. □

Notice that the polyhedron Q in Theorem 2.5 is in standard form: Q = {x | Dx = f, x ≥ 0}, where D is a k × n submatrix of A, with rank equal to k, and f is a k-dimensional subvector of b. We conclude that as long as the feasible set is nonempty, a linear programming problem in standard form can be reduced to an equivalent standard form problem (with the same feasible set) in which the equality constraints are linearly independent.
Example 2.3 Consider the (nonempty) polyhedron defined by the constraints

2x1 + x2 + x3 = 2
x1 + x2 = 1
x1 + x3 = 1
x1, x2, x3 ≥ 0.

The corresponding matrix A has rank two: the last two rows, (1, 1, 0) and (1, 0, 1), are linearly independent, but the first row is equal to the sum of the other two. Thus, the first constraint is redundant and, after it is eliminated, we still have the same polyhedron.

2.4 Degeneracy

According to our definition, at a basic solution we must have n linearly independent active constraints. This allows for the possibility that the number of active constraints is greater than n. (Of course, in n dimensions, no more than n of them can be linearly independent.) In this case, we say that we have a degenerate basic solution. In other words, at a degenerate basic solution, the number of active constraints is greater than the minimum necessary.

Definition 2.10 A basic solution x ∈ R^n is said to be degenerate if more than n of the constraints are active at x.

In two dimensions, a degenerate basic solution is at the intersection of three or more lines; in three dimensions, a degenerate basic solution is at the intersection of four or more planes; see Figure 2.9 for an illustration. It turns out that the presence of degeneracy can strongly affect the behavior of linear programming algorithms and, for this reason, we will now develop some more intuition.

Example 2.4 Consider the polyhedron P defined by the constraints

x1 + x2 + 2x3 ≤ 8
x2 + 6x3 ≤ 12
x1 ≤ 4
x2 ≤ 6
x1, x2, x3 ≥ 0.

The vector x = (2, 6, 0) is a nondegenerate basic feasible solution, because there are exactly three active and linearly independent constraints, namely, x1 + x2 + 2x3 ≤ 8, x2 ≤ 6, and x3 ≥ 0. The vector x = (4, 0, 2) is a degenerate basic feasible solution, because there are four active constraints, three of them linearly independent, namely, x1 + x2 + 2x3 ≤ 8, x2 + 6x3 ≤ 12, x1 ≤ 4, and x2 ≥ 0.
Figure 2.9: The points A and C are degenerate basic feasible solutions. The points B and E are nondegenerate basic feasible solutions. The point D is a degenerate basic solution.

Degeneracy in standard form polyhedra
At a basic solution of a polyhedron in standard form, the m equality constraints are always active. Therefore, having more than n active constraints is the same as having more than n − m variables at zero level. This leads us to the next definition, which is a special case of Definition 2.10.

Definition 2.11 Consider the standard form polyhedron P = {x ∈ R^n | Ax = b, x ≥ 0} and let x be a basic solution. Let m be the number of rows of A. The vector x is a degenerate basic solution if more than n − m of the components of x are zero.

Example 2.5 Consider once more the polyhedron of Example 2.4. By introducing the slack variables x4, ..., x7, we can transform it into the standard form P = {x = (x1, ..., x7) | Ax = b, x ≥ 0}, where

A = [1 1 2 1 0 0 0
     0 1 6 0 1 0 0
     1 0 0 0 0 1 0
     0 1 0 0 0 0 1],   b = (8, 12, 4, 6).

Consider the basis consisting of the linearly independent columns A1, A2, A3, A7. To calculate the corresponding basic solution, we first set the nonbasic variables x4, x5, and x6 to zero, and then solve the system Ax = b for the remaining variables, to obtain x = (4, 0, 2, 0, 0, 0, 6). This is a degenerate basic feasible solution, because we have a total of four variables that are zero, whereas n − m = 7 − 4 = 3.
Consider now the basis consisting of the linearly independent columns A1, A3, A4, and A7. The corresponding basic feasible solution is again x = (4, 0, 2, 0, 0, 0, 6): while we initially set only the three nonbasic variables to zero, the solution to the system Ax = b turned out to satisfy one more of the constraints (namely, the constraint x2 ≥ 0) with equality.

The preceding example suggests that we can think of degeneracy in the following terms. We pick a basic solution by picking n linearly independent constraints to be satisfied with equality, and we realize that certain other constraints are also satisfied with equality. If the entries of A or b were chosen at random, this would almost never happen. Also, Figure 2.10 illustrates that if the coefficients of the active constraints are slightly perturbed, degeneracy can disappear (cf. Exercise 2.18). In practical problems, however, the entries of A and b often have a special (nonrandom) structure, and degeneracy is more common than the preceding argument would seem to suggest.

Figure 2.10: Small changes in the constraining inequalities can remove degeneracy.

In order to visualize degeneracy in standard form polyhedra, we assume that n − m = 2 and we draw the feasible set as a subset of the two-dimensional set defined by the equality constraints Ax = b; see Figure 2.11. At a nondegenerate basic solution, exactly n − m of the constraints x_i ≥ 0 are active; the corresponding variables are nonbasic. At a degenerate basic solution, more than n − m of the constraints x_i ≥ 0 are active, and there are usually several ways of choosing which n − m variables to call nonbasic; in that case, there are several bases corresponding to the same basic solution. (This discussion refers to the typical case. However, there are examples of degenerate basic solutions to which there corresponds only one basis.)

Degeneracy is not a purely geometric property
We close this section by pointing out that degeneracy of basic feasible solutions is not, in general, a geometric (representation independent) property, but rather it may depend on the particular representation of a polyhedron.
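The degeneracy in Example 2.5 can be verified mechanically: x = (4, 0, 2, 0, 0, 0, 6) satisfies Ax = b, has four zero components (more than n − m = 3), and two different bases are valid for it. The helper `rank` and the exact arithmetic below are our own scaffolding, not from the book, and columns are 0-indexed (so {A1, A2, A3, A7} becomes [0, 1, 2, 6]).

```python
from fractions import Fraction

def rank(rows):
    """Rank of a list of row vectors, by exact Gaussian elimination."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0
    for c in range(len(m[0]) if m else 0):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                f = m[i][c] / m[r][c]
                m[i] = [a - f * v for a, v in zip(m[i], m[r])]
        r += 1
    return r

A = [[1, 1, 2, 1, 0, 0, 0],
     [0, 1, 6, 0, 1, 0, 0],
     [1, 0, 0, 0, 0, 1, 0],
     [0, 1, 0, 0, 0, 0, 1]]
b = [8, 12, 4, 6]
m, n = 4, 7
x = [4, 0, 2, 0, 0, 0, 6]

# x is feasible for Ax = b, x >= 0 ...
assert all(sum(A[i][j] * x[j] for j in range(n)) == b[i] for i in range(m))
# ... and degenerate: more than n - m = 3 of its components are zero.
print(x.count(0))   # prints 4

# Two different bases yield this same basic feasible solution:
for basis in ([0, 1, 2, 6], [0, 2, 3, 6]):
    cols = [[A[i][j] for j in basis] for i in range(m)]
    assert rank(cols) == m                            # columns form a basis
    assert all(x[j] == 0 for j in range(n) if j not in basis)
```

This reproduces the point of the example: the same degenerate basic feasible solution arises from the basic index sets {1, 2, 3, 7} and {1, 3, 4, 7}.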
Figure 2.11: An (n − m)-dimensional illustration of degeneracy. Here, n = 6 and m = 4. The basic feasible solution A is nondegenerate and the basic variables are x1, x2, x3, x6. The basic feasible solution B is degenerate. We can choose x1, x5 as the nonbasic variables; other possibilities are to choose x1, x6, or to choose x5, x6. Thus, there are three possible bases for the same basic feasible solution B.

To illustrate this point, consider the standard form polyhedron (cf. Figure 2.12)

P = {(x1, x2, x3) | x1 − x2 = 0, x1 + x2 + 2x3 = 2, x1, x2, x3 ≥ 0}.

We have n = 3, m = 2, and n − m = 1. The vector (1, 1, 0) is a nondegenerate basic feasible solution, because only one variable is zero. The vector (0, 0, 1) is a degenerate basic feasible solution, because two variables are zero.

Figure 2.12: An example of degeneracy in a standard form problem.
However, the same polyhedron can also be described in the (nonstandard) form

P = {(x1, x2, x3) | x1 − x2 = 0, x1 + x2 + 2x3 = 2, x1 ≥ 0, x3 ≥ 0}.

The vector (0, 0, 1) is now a nondegenerate basic feasible solution, because there are only three active constraints. We have established that a degenerate basic feasible solution under one representation could be nondegenerate under another representation. Still, it can be shown that if a basic feasible solution is degenerate under one particular standard form representation, then it is degenerate under every standard form representation of the same polyhedron (Exercise 2.19).

For another example, consider a nondegenerate basic feasible solution x* of a standard form polyhedron P = {x | Ax = b, x ≥ 0}, where A is of dimensions m × n. Then, exactly n − m of the variables x_i are equal to zero. Let us now represent P in the form P = {x | Ax ≥ b, −Ax ≥ −b, x ≥ 0}. Then, at the basic feasible solution x*, we have n − m variables set to zero and an additional 2m inequality constraints are satisfied with equality. We therefore have n + m active constraints and x* is degenerate. Hence, under the second representation, every basic feasible solution is degenerate.

2.5 Existence of extreme points

We obtain in this section necessary and sufficient conditions for a polyhedron to have at least one extreme point. We first observe that not every polyhedron has this property. For example, if n > 1, a halfspace in R^n is a polyhedron without extreme points. Also, as argued in Section 2.2 (cf. the discussion after Definition 2.9), if the matrix A has fewer than n rows, then the polyhedron {x ∈ R^n | Ax ≥ b} does not have a basic feasible solution.

Figure 2.13: The polyhedron P contains a line and does not have an extreme point, while Q does not contain a line and has extreme points.
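The representation dependence just described is easy to check by counting active constraints at (0, 0, 1) under the two descriptions of the same polyhedron. This is a minimal sketch of our own, not the book's:

```python
def active_count(x, eqs, ineqs):
    """Number of active constraints at x: every equality that holds, plus
    every inequality a'x >= b that is satisfied with equality."""
    dot = lambda a: sum(ai * xi for ai, xi in zip(a, x))
    return (sum(1 for a, b in eqs if dot(a) == b)
            + sum(1 for a, b in ineqs if dot(a) == b))

x = (0, 0, 1)
eqs = [((1, -1, 0), 0), ((1, 1, 2), 2)]   # x1 - x2 = 0, x1 + x2 + 2x3 = 2

# Standard form: x1, x2, x3 >= 0 gives four active constraints at x,
# which is degenerate since n = 3.
std_ineqs = [((1, 0, 0), 0), ((0, 1, 0), 0), ((0, 0, 1), 0)]
print(active_count(x, eqs, std_ineqs))   # prints 4

# Nonstandard form: only x1 >= 0 and x3 >= 0 gives three active
# constraints, so the same point is nondegenerate.
alt_ineqs = [((1, 0, 0), 0), ((0, 0, 1), 0)]
print(active_count(x, eqs, alt_ineqs))   # prints 3
```

The same geometric point is degenerate under one description and nondegenerate under the other, exactly as the text asserts.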
It turns out that the existence of an extreme point depends on whether a polyhedron contains an infinite line or not; see Figure 2.13. We need the following definition.

Definition 2.12 A polyhedron P ⊂ R^n contains a line if there exists a vector x ∈ P and a nonzero vector d ∈ R^n such that x + λd ∈ P for all scalars λ.

We then have the following result.

Theorem 2.6 Suppose that the polyhedron P = {x ∈ R^n | a_i'x ≥ b_i, i = 1, ..., m} is nonempty. Then, the following are equivalent:
(a) The polyhedron P has at least one extreme point.
(b) The polyhedron P does not contain a line.
(c) There exist n vectors out of the family a_1, ..., a_m, which are linearly independent.

Proof.
(b) ⇒ (a) We first prove that if P does not contain a line, then it has a basic feasible solution and, therefore, an extreme point. Let x be an element of P and let I = {i | a_i'x = b_i}. If n of the vectors a_i, i ∈ I, corresponding to the active constraints are linearly independent, then x is, by definition, a basic feasible solution and, therefore, an extreme point. If this is not the case, then all of the vectors a_i, i ∈ I, lie in a proper subspace of R^n, and there exists a nonzero vector d ∈ R^n such that a_i'd = 0 for every i ∈ I. Let us consider the line consisting of all points of the form y = x + λd, where λ is an arbitrary scalar. For every i ∈ I, we have a_i'y = a_i'x + λ a_i'd = a_i'x = b_i. Thus, those constraints that were active at x remain active at all points on the line. However, since the polyhedron is assumed to contain no lines, it follows that as we vary λ, some constraint will be eventually violated. At the point where some constraint is about to be violated, a new constraint must become active, and we conclude that there exists some λ* and some j ∉ I such that a_j'(x + λ*d) = b_j.

We claim that a_j is not a linear combination of the vectors a_i, i ∈ I. Indeed, we have a_j'x ≠ b_j (because j ∉ I) and a_j'(x + λ*d) = b_j (by the definition of λ*); this implies that a_j'd ≠ 0. On the other hand, d is orthogonal to any linear combination of the vectors a_i, i ∈ I, since a_i'd = 0 for every i ∈ I. Since d is not orthogonal to a_j, we conclude that a_j is not a linear combination of the vectors a_i, i ∈ I. Thus, by moving from x to x + λ*d, the number of linearly independent active constraints has been increased by at least one. By repeating the same argument, as many times as needed, we eventually end up with a point at which there are n linearly independent active constraints. Such a point is, by definition, a basic solution; it is also feasible since we have stayed within the feasible set. A geometric interpretation of this argument is provided in Figure 2.14.

(a) ⇒ (c) If P has an extreme point x, then x is also a basic feasible solution (cf. Theorem 2.3), and there exist n constraints that are active at x, with the corresponding vectors a_i being linearly independent.

(c) ⇒ (b) Suppose that n of the vectors a_i are linearly independent and, without loss of generality, let us assume that a_1, ..., a_n are linearly independent. Suppose that P contains a line x + λd, where d is a nonzero vector. We then have a_i'(x + λd) ≥ b_i for all i and all λ. We conclude that a_i'd = 0 for all i. (If a_i'd < 0, we can violate the constraint by picking λ very large; a symmetric argument applies if a_i'd > 0.) Since the vectors a_i, i = 1, ..., n, are linearly independent, this implies that d = 0. This is a contradiction and establishes that P does not contain a line. □

Figure 2.14: Starting from an arbitrary point of a polyhedron, we choose a direction along which all currently active constraints remain active. We then move along that direction until a new constraint is about to be violated. At that point, the number of linearly independent active constraints has increased by at least one. We repeat this procedure until we end up with n linearly independent active constraints, at which point we have a basic feasible solution.

Notice that a bounded polyhedron does not contain a line.
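The constructive half of the proof, (b) ⇒ (a), is effectively an algorithm: while fewer than n linearly independent constraints are active, pick a direction d orthogonal to the active rows and move along ±d until a new constraint becomes active. The sketch below is our own naming and code (not the book's), uses exact rational arithmetic, and raises an error if it detects a line.

```python
from fractions import Fraction

def null_vector(rows, n):
    """Return a nonzero d with row . d = 0 for every row, or None if the
    rows already contain n linearly independent vectors."""
    M = [[Fraction(v) for v in r] for r in rows]
    pivots = {}                          # column -> row index in reduced form
    r = 0
    for c in range(n):
        p = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if p is None:
            continue
        M[r], M[p] = M[p], M[r]
        M[r] = [v / M[r][c] for v in M[r]]
        for i in range(len(M)):
            if i != r and M[i][c] != 0:
                f = M[i][c]
                M[i] = [a - f * v for a, v in zip(M[i], M[r])]
        pivots[c] = r
        r += 1
    free = [c for c in range(n) if c not in pivots]
    if not free:
        return None
    d = [Fraction(0)] * n
    d[free[0]] = Fraction(1)
    for c, row in pivots.items():
        d[c] = -M[row][free[0]]
    return d

def find_basic_feasible(A, b, x):
    """Move from a feasible x of P = {x | Ax >= b} to a basic feasible
    solution, following the argument of (b) => (a) in Theorem 2.6."""
    n = len(x)
    x = [Fraction(v) for v in x]
    dot = lambda a, y: sum(ai * yi for ai, yi in zip(a, y))
    while True:
        active = [A[i] for i in range(len(A)) if dot(A[i], x) == b[i]]
        d = null_vector(active, n)
        if d is None:
            return x                     # n lin. indep. active constraints
        for dd in (d, [-v for v in d]):
            # step lengths at which an inactive constraint becomes active
            steps = [(dot(A[j], x) - b[j]) / -dot(A[j], dd)
                     for j in range(len(A)) if dot(A[j], dd) < 0]
            if steps:
                lam = min(steps)
                x = [xi + lam * di for xi, di in zip(x, dd)]
                break
        else:
            raise ValueError("polyhedron contains a line")

# P = {(x1, x2) | x1 + x2 >= 1, x1 >= 0, x2 >= 0}, starting from (2, 2).
A = [(1, 1), (1, 0), (0, 1)]
b = [1, 0, 0]
print(find_basic_feasible(A, b, (2, 2)))   # an extreme point of P
```

Starting from the interior point (2, 2), two moves of the procedure land on an extreme point of this unbounded (but line-free) polyhedron, mirroring Figure 2.14.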
Similarly, the positive orthant {x | x ≥ 0} does not contain a line and, since a polyhedron in standard form is contained in the positive orthant, it does not contain a line either. These observations establish the following important corollary of Theorem 2.6.

Corollary 2.2 Every nonempty bounded polyhedron and every nonempty polyhedron in standard form has at least one basic feasible solution.

2.6 Optimality of extreme points

Having established the conditions for the existence of extreme points, we will now confirm the intuition developed in Chapter 1: as long as a linear programming problem has an optimal solution and as long as the feasible set has at least one extreme point, we can always find an optimal solution within the set of extreme points of the feasible set. Later in this section, we prove a somewhat stronger result, at the expense of a more complicated proof.

Theorem 2.7 Consider the linear programming problem of minimizing c'x over a polyhedron P. Suppose that P has at least one extreme point and that there exists an optimal solution. Then, there exists an optimal solution which is an extreme point of P.

Proof. (See Figure 2.15 for an illustration.) Let Q be the set of all optimal solutions, which we have assumed to be nonempty. Let P be of the form P = {x ∈ R^n | Ax ≥ b} and let v be the optimal value of the cost c'x. Then, Q = {x ∈ R^n | Ax ≥ b, c'x = v}, which is also a polyhedron. Since Q ⊂ P, and since P contains no lines (cf. Theorem 2.6), Q contains no lines either. Therefore, Q has an extreme point.

Figure 2.15: Illustration of the proof of Theorem 2.7. Here, Q is the set of optimal solutions and an extreme point x* of Q is also an extreme point of P.

Let x* be an extreme point of Q. We will show that x* is also an extreme point of P. Suppose, in order to derive a contradiction, that x* is not an extreme point of P. Then, there exist y ∈ P, z ∈ P, such that y ≠ x*, z ≠ x*, and some λ ∈ [0, 1] such that x* = λy + (1 − λ)z. It follows that v = c'x* = λc'y + (1 − λ)c'z. Furthermore, since v is the optimal cost, c'y ≥ v and c'z ≥ v. This implies that c'y = c'z = v, and therefore z ∈ Q and y ∈ Q. But this contradicts the fact that x* is an extreme point of Q. The contradiction establishes that x* is an extreme point of P. In addition, since x* belongs to Q, it is optimal. □

The above theorem applies to polyhedra in standard form, as well as to bounded polyhedra, since they do not contain a line.

Our next result is stronger than Theorem 2.7. It shows that the existence of an optimal solution can be taken for granted, as long as the optimal cost is finite.

Theorem 2.8 Consider the linear programming problem of minimizing c'x over a polyhedron P. Suppose that P has at least one extreme point. Then, either the optimal cost is equal to −∞, or there exists an extreme point which is optimal.

Proof. The proof is essentially a repetition of the proof of Theorem 2.6. The difference is that as we move towards a basic feasible solution, we will also make sure that the costs do not increase. We will use the following terminology: an element x of P has rank k if we can find k, but not more than k, linearly independent constraints that are active at x.

Let us assume that the optimal cost is finite. Let P = {x ∈ R^n | Ax ≥ b} and consider some x ∈ P of rank k < n. We will show that there exists some y ∈ P which has greater rank and satisfies c'y ≤ c'x. Let I = {i | a_i'x = b_i}, where a_i' is the i-th row of A. Since k < n, the vectors a_i, i ∈ I, lie in a proper subspace of R^n, and we can choose some nonzero d ∈ R^n orthogonal to every a_i, i ∈ I. Furthermore, by possibly taking the negative of d, we can assume that c'd ≤ 0.

Suppose that c'd < 0. Let us consider the half-line y = x + λd, where λ is a positive scalar. As in the proof of Theorem 2.6, all points on this half-line satisfy the relations a_i'y = b_i, i ∈ I. If the entire half-line were contained in P, the optimal cost would be −∞, which we have assumed not to be the case. Therefore, the half-line eventually exits P. When this is about to happen, we have some λ* > 0 and j ∉ I such that a_j'(x + λ*d) = b_j. We let y = x + λ*d and note that c'y < c'x. As in the proof of Theorem 2.6, a_j is linearly independent from the vectors a_i, i ∈ I, and the rank of y is at least k + 1.
w is a basic feasible solution) such that e ' w ::. A more elegant argument. the optimal cost is not . e ' x for all x E P. and its main idea is summarized in Figure 2. There is a similar repre sentation of unbounded polyhedra involving extreme points and "extreme rays" (edges that extend to infinity) . On the other hand. This establishes the following corollary. Then. 7 Representation of bounded polyhedra* 67 Suppose now that e ' d = o . 16. . we provide an alternative. wT be the basic feasible solutions in P and let w* be a basic feasible solution such that e' w* ::. based on duality theory. In either case. For a general linear programming problem. 2. The proof that we give here is elementary and constructive. we have e' y = e ' x.3 should be contrasted with what may hap pen in optimization problems with a nonlinear cost function. . either the optimal cost is equal to .8 does not apply directly. By repeating this process as many times as needed. D Corollary 2. Let W I . if the feasible set has no extreme points. .00 or there exists an optimal solution. . 2. We have already shown that for every x there exists some i such that e ' w i ::. by showing that a bounded polyhedron can also be represented as the convex hull of its ex treme points. The result in Corollary 2. and the basic feasible solution w* is optimal. where A is an arbitrary scalar.9 below. in the problem of minimizing l /x subject to x ::::: 1 . since e' d = 0.Sec. we end up with a vector w of rank n (thus. e ' x. any linear programming problem can be transformed into an equivalent problem in standard form to which Theorem 2.7 Represent at ion of b ounded polyhedra* So far. For example. Since P contains no lines. We consider the line y = x + Ad. e ' w i for all i. we are again at a vector y of rank greater than that of x. In this section.3 Consider the linear programming problem of minimiz ing e'x over a nonempty polyhedron. we have found a new point y such that e ' y ::. 
but an optimal solution does not exist . we have been representing polyhedra in terms of their defining in equalities. will be presented in Section 4.9 and will also result in an alternative proof of Theorem 2. Furthermore. e ' x. then Theorem 2. This representation can be developed using the tools that we already have. It follows that e ' w* ::. and whose rank is greater than that of x.8 does apply. . e ' x.00 . at the expense of a more complicated proof. the line must eventually exit P and when that is about to happen.
i = 1 . ( Recall from Section 1 . . since polyhedra are convex sets. . . If P is zerodimensional. 2 The geometry of linear programming z. These are also extreme points of P. . . that a kdimensional affine subspace is a translation of a kdimensional subspace. Let P = {x E lRn I a�x ?: bi . Let f1 ' . Let 9i = fl xo . . . it consists of a single point . Every convex combination of extreme points is an element of the polyhedron.k linearly independent vectors that are orthogonal to x l . fn . Proof.16: Given the vector Theorem 2. We define the dimension of a polyhedron P C lRn as the smallest integer k such that P is contained in some kdimensional affine subspace of lRn . Using induction on di mension. Then. Let us assume that the result is true for all polyhedra of dimension less than k. we can express the vector u as a convex combination of extreme points of Q. which can be assumed to be of the form where X l . x k . we express it as a convex com bination of y and u.68 Chap. ) Our proof proceeds by induction on the dimension of the polyhedron P. x k are some vectors in lRn . . we only need to prove the converse result and show that every element of a bounded polyhedron can be represented as a convex combination of extreme points.k be n . . . m} be a nonempty bounded kdimensional polyhedron. 5 . P is contained in a kdimensional affine subspace S of lRn . . The vector u belongs to the polyhedron Q whose dimension is lower than that of P.9 A nonempty and bounded polyhedron is the convex hull of its extreme points. Figure 2. This point is an extreme point of P and the result is true. . . . . for . Thus.
.k . 7 Representation of bounded polyhedra* 69 i = 1 . Hence. . the vectors ri . let us choose an arbitrary extreme point y of P and form the halfline consisting of all points of the form z + A ( Z . On the other hand. where Ai are nonnegative scalars that sum to one. n . B y applying the induction hypothesis t o Q and u .k. The set on the right is defined by n . we also have U + A* Y Z = 1 + A* . (2. y must also be an extreme point of P. Note that at an extreme point y of Q. . . Let Z be an element of P.1 (see the discussion at the end of Section 1. .y) .1 . . . . Q has dimension at most k . y E P. n . . . Q since Eq.z = gi = r. therefore. (2.3) Since P e S.k. we have shown that a� * (z . every element x of S satisfies i = 1 . . . Then. Therefore. we have r.5). we find some A * � 0 and U E P. we must have a�y = bi for n linearly independent vectors � . . . which implies that the vector ai* is not a linear combination of. m. such that U = Z + A* (Z .y) < O. Note that Q c {x E �n l a�* x = bi ' r{ X = gi .y is orthogonal to each vector ri .k } . Let Q be the polyhedron defined by {x E P I a�* x = bi* } {X E �n l a�x � bi ' i = l . n . then z is a trivial convex combination of the extreme points of P and there is nothing more to be proved. it is an affine subspace of dimension k . . By considering what happens when this constraint is j ust about to be violated.k + 1 linearly independent equality constraints.Y) . 2. . . . . this halfline must eventually exit P and violate one of the constraints. and is therefore linearly independent from. i = l . the same must be true for every element of P. n . it follows that a� * (z . If z is not an extreme point of P. .Sec. for i = 1 . where A is a nonnegative scalar.y which shows that z . we see that U can be expressed as a convex combination U = L Aiyi of the extreme points y i of Q. a�* x = bi * } ' Since Z .3) holds for every element of P. Since P is bounded. say the constraint a� * x � bi * .y) < 0. . 
and Since the constraint a� * x � bi* is violated if A grows beyond A* . If z is an extreme point of P. Using the definition of A * .
Therefore,

   z = u/(1 + lambda*) + (lambda*/(1 + lambda*)) y = sum_i (lambda_i/(1 + lambda*)) y^i + (lambda*/(1 + lambda*)) y,

which shows that z is a convex combination of the extreme points of P.

Example 2.6 Consider the polyhedron

   P = {(x1, x2, x3) | x1 + x2 + x3 <= 1, x1, x2, x3 >= 0}.

It has four extreme points, namely, x^1 = (1, 0, 0), x^2 = (0, 1, 0), x^3 = (0, 0, 1), and x^4 = (0, 0, 0). The vector x = (1/3, 1/3, 1/4) belongs to P. It can be represented as

   x = (1/3) x^1 + (1/3) x^2 + (1/4) x^3 + (1/12) x^4.

There is a converse to Theorem 2.9, asserting that the convex hull of a finite number of points is a polyhedron. This result is proved in the next section, and again in Section 4.9.

2.8 Projections of polyhedra: Fourier-Motzkin elimination*

In this section, we present perhaps the oldest method for solving linear programming problems. This method is not practical, because it requires a very large number of steps, but it has some interesting theoretical corollaries. The key to this method is the concept of a projection, defined as follows: if x = (x1, ..., xn) is a vector in R^n and k <= n, the projection mapping pi_k : R^n -> R^k projects x onto its first k coordinates:

   pi_k(x) = pi_k(x1, ..., xn) = (x1, ..., xk).

We also define the projection Pi_k(S) of a set S contained in R^n by letting

   Pi_k(S) = {pi_k(x) | x in S};

see Figure 2.17 for an illustration. An equivalent definition is

   Pi_k(S) = {(x1, ..., xk) | there exist x_{k+1}, ..., xn such that (x1, ..., xn) in S}.

Note that S is nonempty if and only if Pi_k(S) is nonempty. Suppose now that we wish to decide whether a given polyhedron P contained in R^n is nonempty. If we can somehow eliminate the variable xn and construct the set Pi_{n-1}(P) contained in R^{n-1}, we can instead consider the presumably easier problem of deciding whether Pi_{n-1}(P) is nonempty. If we keep eliminating variables one by one, we eventually arrive at the set Pi_1(P).
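The convex-combination representation in Example 2.6, and the coordinate-projection map pi_k defined just above, can both be checked numerically. The sketch below uses the vertex labels and weights from the example:

```python
import numpy as np

# Extreme points of P = {x >= 0 : x1 + x2 + x3 <= 1} from Example 2.6:
V = np.array([[1, 0, 0],
              [0, 1, 0],
              [0, 0, 1],
              [0, 0, 0]], dtype=float)
lam = np.array([1/3, 1/3, 1/4, 1/12])       # nonnegative, sums to one

x = lam @ V                                  # convex combination of the vertices
assert lam.min() >= 0 and abs(lam.sum() - 1) < 1e-12
assert np.allclose(x, [1/3, 1/3, 1/4])       # the vector from the example

# The projection map pi_k simply keeps the first k coordinates:
def pi(k, x):
    return x[:k]

assert np.allclose(pi(2, x), [1/3, 1/3])
```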
m. involves a single variable. i = 1 . . .17: The projections Il I (S) and Il2 (S) of a rotated threedimensional cube. 8 Projections of polyhedra: FourierMotzkin elimination* 71 Figure 2. 2. .20 deals with a family of examples in which the number of constraints increases exponentially with the problem dimension. We now describe the elimination method. We are given a polyhedron P in terms of linear inequality constraints of the form L aijXj j= l n 2': bi . a large number of constraints is usually added. The main disadvantage of this method is that while each step reduces the dimension by one. . Exercise 2. .Sec. We wish to eliminate X n and construct the projection II n .1 (P) . and whose emptiness is easy to check.
Elimination algorithm

1. Rewrite each constraint sum_{j=1}^n a_ij xj >= b_i in the form

      a_in xn >= - sum_{j=1}^{n-1} a_ij xj + b_i,    i = 1, ..., m;

   if a_in != 0, divide both sides by a_in. By letting xbar = (x1, ..., x_{n-1}), we obtain an equivalent representation of P involving the following constraints:

      xn >= d_i + f_i' xbar,      if a_in > 0,    (2.4)
      d_j + f_j' xbar >= xn,      if a_jn < 0,    (2.5)
      0 >= d_k + f_k' xbar,       if a_kn = 0.    (2.6)

   Here, each d_i, d_j, d_k is a scalar, and each f_i, f_j, f_k is a vector in R^{n-1}.

2. Let Q be the polyhedron in R^{n-1} defined by the constraints

      d_j + f_j' xbar >= d_i + f_i' xbar,    if a_in > 0 and a_jn < 0,    (2.7)
      0 >= d_k + f_k' xbar,                  if a_kn = 0.                 (2.8)

Example 2.7 Consider the polyhedron defined by the constraints

   x1 + x2 >= 1
   x1 + x2 + 2x3 >= 2
   2x1 + 3x3 >= 3
   x1 - 4x3 >= 4
   -2x1 + x2 - x3 >= 5.

We rewrite these constraints in the form

   0 >= 1 - x1 - x2
   x3 >= 1 - (x1/2) - (x2/2)
   x3 >= 1 - (2x1/3)
   -1 + (x1/4) >= x3
   -5 - 2x1 + x2 >= x3.

Then, the set Q is defined by the constraints

   0 >= 1 - x1 - x2
   -1 + (x1/4) >= 1 - (x1/2) - (x2/2)
   -1 + (x1/4) >= 1 - (2x1/3)
   -5 - 2x1 + x2 >= 1 - (x1/2) - (x2/2)
   -5 - 2x1 + x2 >= 1 - (2x1/3).
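The two steps of the elimination algorithm translate directly into code. The sketch below is an illustrative implementation (the `(a, b)` pair encoding of a constraint a'x >= b, and the function name, are my own choices), applied to the constraints of Example 2.7:

```python
import itertools

def fm_eliminate_last(constraints):
    """One Fourier-Motzkin step: eliminate the last variable from a list of
    (a, b) pairs encoding a[0]*x1 + ... + a[-1]*xn >= b, and return constraints
    describing the projection onto (x1, ..., x_{n-1})."""
    lower, upper, rest = [], [], []
    for a, b in constraints:
        coef = a[-1]
        if coef == 0:
            rest.append((a[:-1], b))       # constraint of type (2.6): keep as is
            continue
        f = [ai / coef for ai in a[:-1]]
        d = b / coef
        # coef > 0 gives a lower bound xn >= d - f'xbar, as in (2.4);
        # coef < 0 gives an upper bound xn <= d - f'xbar, as in (2.5).
        (lower if coef > 0 else upper).append((f, d))
    out = list(rest)
    # Some xn exists iff every lower bound lies below every upper bound (2.7):
    for (fl, dl), (fu, du) in itertools.product(lower, upper):
        out.append(([l - u for l, u in zip(fl, fu)], dl - du))
    return out

# The five constraints of Example 2.7; eliminate x3:
P = [([1, 1, 0], 1), ([1, 1, 2], 2), ([2, 0, 3], 3),
     ([1, 0, -4], 4), ([-2, 1, -1], 5)]
Q = fm_eliminate_last(P)
assert len(Q) == 5     # one x3-free constraint plus 2 x 2 lower/upper combinations
```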
Sec. Notice that for any vector x = (X l . we see that if we apply the elimination al gorithm k times. we also have By generalizing this observation.4)(2. but a proof simpler than the one we gave is not apparent . leading to a polyhedron II 1 ( P ) described by a very large number of constraints. almost all of these constraints will be redundant .l times. . the vector x = (x. . we must . since III ( P ) is onedimensional. and x E Q .(2X l /3) 5 . in general. x n ) satisfies Eqs. but this is of no help: in order to decide which ones are redundant.6) . In particular.I (P ) .1 (P ) of P. (2. x n ) satisfies Eqs.8 Projections of polyhedra: FourierMotzkin elimination* . We will now prove that Q C II n .I ( P) C Q . from which it follows immediately that x satisfies Eqs. it follows that a projection II k (P ) of a polyhedron is also a polyhedron. It follows from Eq. we end up with the set II n .4)(2.2Xl + X2 2 1 .2Xl + X2 > 1 . This fact might be considered obvious. Unfortunately. x n ) E P. if we apply it n . there exists some Xn such that (x. belongs to D the polyhedron P. enumerate them. This shows that IIn . each application of the elimination algorithm can increase the number of constraints substantially. We now restate it in somewhat different language. . Let x E Q.1 ( P ) . therefore.5 . If x E II n . Of course.7)(2. Proof. we end up with III ( P ) .(2Xl /3) .k (P ) . (2. and since the elimination algorithm always produces a polyhedron.7) that Let X n be any number between the two sides of the above inequality.1 + xl /4 2 1 . xn ) . It then follows that (x. . The elimination algorithm has an important theoretical consequence: since the projection Il k ( P ) can be generated by repeated application of the elimination algorithm. . we have Accordingly.6) and. for any polyhedron P.(X2 /2) . 73 Theorem 2. (2.8) .10 The polyhedron Q constructed by the elimination al gorithm is equal to the projection II n . 2. (2.(xl /2) .
Therefore. x E Pl. We have Q = {y E lRm I there exists x E lRn such that Ax = y. X n . Then. . The convex hull of a finite number of vectors X l .74 Chap. be a polyhedron and let A be an matrix. x E P} onto the y coordinates. .4 states that the image of a polyhedron under a linear mapping is also a polyhedron.4 Let P C lRn +k be a polyhedron. . . there D We finally indicate how the elimination algorithm can be used to solve linear programming problems.21) . a polyhedron. Then. and the optimal cost is equal to the smallest element of Q. y) E P} is also a polyhedron. we are left with the set Q= {xo I there exists x E P such that Xo = c'x } . A variation of Corollary 2. . Corollary 2. the set { xE lRn I there exists y E lRk such that (x. 2 The geometry of linear programming Corollary 2. We define a new variable X o and introduce the constraint X o = c'x. Q is the projection of the polyhedron { (x. . . An optimal solution x can be recovered by backtracking (Exercise 2. Proof. the set Q = {Ax I x E P} is also a polyhedron. . Consider the problem of minimizing c ' x subject to x belonging to a polyhedron P. y) E D lRn + m I Ax = y. x k is the image of the polyhedron under the linear mapping that maps fore. . (>'1 . .6 The convex hull of a finite number of vectors is a poly hedron. Corollary 2. . . A k ) to E� l A i Xi and is.5 Let P C lRn m x n Proof. . If we use the elimination algorithm n times to eliminate the variables X l .
2.9 Summary

We summarize our main conclusions so far regarding the solutions to linear programming problems.

(a) If the feasible set is nonempty and bounded, there exists an optimal solution. Furthermore, there exists an optimal solution which is an extreme point.

(b) If the feasible set is unbounded, there are the following possibilities:
    (i) There exists an optimal solution which is an extreme point.
    (ii) There exists an optimal solution, but no optimal solution is an extreme point. (This can only happen if the feasible set has no extreme points; it never happens when the problem is in standard form.)
    (iii) The optimal cost is -infinity.

Suppose now that the optimal cost is finite and that the feasible set contains at least one extreme point. Since there are only finitely many extreme points, the problem can be solved in a finite number of steps, by enumerating all extreme points and evaluating the cost of each one. This is hardly a practical algorithm, because the number of extreme points can increase exponentially with the number of variables and constraints. In the next chapter, we will exploit the geometry of the feasible set and develop the simplex method, a systematic procedure that moves from one extreme point to another, without having to enumerate all extreme points.

An interesting aspect of the material in this chapter is the distinction between geometric (representation independent) properties of a polyhedron and those properties that depend on a particular representation. In that respect, we have established the following:

(a) Whether or not a point is an extreme point (equivalently, vertex, or basic feasible solution) is a geometric property.

(b) Whether or not a point is a basic solution may depend on the way that a polyhedron is represented.

(c) Whether or not a basic or basic feasible solution is degenerate may depend on the way that a polyhedron is represented.

2.10 Exercises

Exercise 2.1 For each one of the following sets, determine whether it is a polyhedron.

(a) The set of all (x, y) in R^2 satisfying the constraints

      x cos(theta) + y sin(theta) <= 1,   for all theta in [0, pi/2],
      x >= 0,  y >= 0.
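As an aside, the finite (but impractical) procedure described in the summary, enumerating every basic feasible solution of a standard form problem and keeping the cheapest, can be sketched as follows; the small problem at the end is made up for illustration:

```python
import itertools
import numpy as np

def solve_by_enumeration(A, b, c):
    """Brute-force illustration: enumerate every basic solution of
    Ax = b, x >= 0, keep the feasible ones, and return the cheapest.
    There are exponentially many bases, so this is purely pedagogical."""
    m, n = A.shape
    best_x, best_cost = None, np.inf
    for cols in itertools.combinations(range(n), m):
        B = A[:, cols]
        if abs(np.linalg.det(B)) < 1e-9:
            continue                        # columns do not form a basis
        x_B = np.linalg.solve(B, b)
        if (x_B < -1e-9).any():
            continue                        # basic solution, but infeasible
        x = np.zeros(n)
        x[list(cols)] = x_B
        if c @ x < best_cost:
            best_x, best_cost = x, c @ x
    return best_x, best_cost

# Made-up example: minimize -x1 - x2 subject to x1 + x2 + x3 = 1, x >= 0.
A = np.array([[1.0, 1.0, 1.0]])
b = np.array([1.0])
c = np.array([-1.0, -1.0, 0.0])
x, v = solve_by_enumeration(A, b, c)
assert v == -1.0    # the optimum is attained at an extreme point
```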
. Let P and Q be polyhedra in iRn and iRm .76 Chap.. vectors in iRm .3 for constructing basic solutions. An o} .. 2 The geometry of linear programming (b) The set of all x E iR satisfying the constraint x 2 . Ai AiAi ... if f and 9 are as above.3 (Basic feasible solutions in standard fOrIn polyhedra with upper bounds) Consider a polyhedron defined by the constraints Ax = b and o :S x :S u.8x + 15 :S O. Exercise 2.. .. Ex plain what is wrong with this argument. where is a matrix and b is a vector. respectively. and with at most m of the coefficients being nonzero.. Show that the set S = {x E iR I f (x ) :S c} is convex. where is a matrix of dimensions k x n. Let Q = { ( x ..2 Let f iRn .z = b. Q and 9 : Q . Hint: Consider the polyhedron Ai c= A AI.. (a) Let Exercise 2.) (a) If P and Q are isomorphic. We are then tempted to conclude that every nonempty polyhedron has at least one extreme point. P such that g ( J (x» ) = x for all x E P.. show that x is an extreme point of P if and only if f ( x ) is an extreme point of Q.. . A 2: be a collection of (b) Let P be the convex hull of the vectors Ai : . (b) (Introducing slack variables leads to an isoInorphic polyhedron) Let P = {x E iRn I Ax 2: b. .. and f (g ( y» ) = y for all y E Q.. In particular. Exercise 2.. Exercise 2. iR be a convex function and let c be some constant. An {�AiAi I AI. z ) E iRn H I x . and prove an analog of Theorem 2...4 We know that every linear programming problem can be con Exercise 2.4. x 2: 0 .. . We say that P and Q are isomorphic if there exist affine mappings f : P ..5 (ExtreIne points of isoInorphic polyhedra) A mapping f is called affine if it is of the form f( x ) = Ax + b. Show that P A and Q are isomorphic. isomorphic polyhedra have the same shape.6 (Caratheodory's theoreIn) Let Show that any element of C can be expressed in the form L�=l with 2: 0.. (Intuitively.. show that there exists a onetoone correspon dence between their extreme points. 
We also know that nonempty polyhedra in standard form have at least one extreme point. (c) The empty set.. : n A verted to an equivalent problem in standard form. Provide a procedure analogous to the one in Section 2. z 2: o } . x 2: O } . and assume that the matrix has linearly independent rows.
Show that a basis is associated with the basic solution x if and only if every column Ai . 1 0 Exercises 77 Show that any element of P can be expressed in the form L:� l Ai Ai . it must have an optimal solution which is an extreme point of P. (d) If there is more than one optimal solution. . 2. . x ::::: n and that its rows are linearly independent. (b) The set of all optimal solutions is bounded. Exercise 2.7 Suppose that g�x ::::: hi . Is it true that it corresponds to two or more distinct bases? Prove or give a counterexample. state whether it is true or false. . i E J. Suppose that the matrix A has dimensions Exercise 2. d ' x} over the set P.Sec. . . i = 1. . Is it true that P has at least one basic feasible solution? Exercise 2. provide a proof. (f) Consider the problem of minimizing max { e ' x. where L:� l Ai = 1 and Ai ::::: 0 for all i. provide a counterexample. . then P has at most two basic feasible solutions.8 Consider the standard form polyhedron {x I Ax = b. with k > n. Exercise 2. {x E �n I a�x ::::: bi . If true.12 Consider a nonempty polyhedron P and suppose that for each . . else. Suppose that at a particular basic feasible solution. Exercise 2 . no more than m variables can be positive. . . For each one of the following statements. ( e ) If there are several optimal solutions. Show that the basic solution is degenerate. m x O } . . . ( c ) Suppose that a basic solution is degenerate. x ::::: O } . . is in the basis. with at most m + 1 of the coefficients Ai being nonzero. and assume that the rows o f the matrix A are linearly independent. Let x be a basic solution. i = and assume that the rows of the matrix A are linearly independent . g k . Is it true that there exist exactly (�) bases that lead to this basic feasible solution? Here (�) = k! / ( n! (k . m } and {x E �n I 1 . variable Xi we have either the constraint Xi ::::: 0 or the constraint Xi ::. Exercise 2. (a) If n = m + 1 . and let J = { i I Xi =1= O} . 
Is it true that there exists an adjacent basic solution which is degenerate? Prove or give a counterexam ple. (a) Suppose that two different bases lead to the same basic solution. (e) At every optimal solution. then there exist at least two basic feasible solutions that are optimal. Suppose that the vectors a I . am span �n . . 1 1 Let P = {x E �n I Ax ::::: b } . there are k active constraints. . O. k} are two representations of the same nonempty polyhedron. If this problem has an optimal solution. x ::::: O } .9 Consider the standard form polyhedron {x I Ax = b.n) ! ) is the number of ways that we can choose n out of k given items.10 Consider the standard form polyhedron P = {x I Ax = b. then there are uncountably many optimal solutions. (b) Consider a degenerate basic solution. Show that the same must be true for the vectors g l . .
78 Chap. . compare the number of equality constraints in two representations of P under which x is degenerate and nondegenerate. x 2: 0.l are linearly independent. (c) Suppose that x is a degenerate basic feasible solution. We define Q = {x E P l a'x = b} .17 Consider the polyhedron {x E )Rn I Ax ::. 0 ::. z ) I Ax + z = b. Show that every extreme point of Q is either an extreme point of P or a convex combination of two adjacent extreme points of P .A)V I ° ::. respectively. Could this b e the feasible set o f a problem i n standard form? I}. .18 Consider a polyhedron P = {x I Ax 2: b } . and let b be some scalar.16 Consider the set { x E )Rn I X l = . . and that the vectors al . . Prove that L = { z E P l a� z = bi .14 Let P be a bounded polyhedron in )Rn . 2 The geometry of linear programming Exercise 2. (Edges joining adjacent vertices) Consider the polyhedron a�x 2: bi . .  Exercise 2. A ::. Given any E > 0. m } . I } be the segment that joins u and v. . ( In particular. z 2: o} in standard form. . . Show that x is degen erate under every standard form representation of the same polyhedron ( in the same space )Rn ) . i = 1. Show that the dimension of P is equal to n . Its dimension is defined as the smallest integer k such that P is contained in some kdimensional affine subspace of )Rn . . Exercise 2. of dimensions m x n. x 2: o} and a nondegenerate basic feasible solution x* .13 Consider the standard form polyhedron P = {x I Ax = b. for the new polyhedron. has linearly independent rows. . count active constraints.l = 0. an . . show that there exists some b with the following two properties: (a) The absolute value of every component of b b is bounded by E . Suppose that the matrix removed. X n ::. (b) Show that the result of part ( a) is false if the nondegeneracy assumption is o } . . i = 1 . .1 } . We introduce slack variables z and construct a corresponding polyhedron { (x. 
(a) Explain why the dimension of P is at most n m.Ax* ) is a nondegenerate basic feasible solution Exercise 2. b . n . . Then.  . = X n . (b) Suppose that P has a nondegenerate basic feasible solution.m .19* Let P c )Rn be a polyhedron in standard form whose definition involves m linearly independent equality constraints. Show that (x* . n . Exercise 2. u and v are adjacent. x 2: A. b. Let x be an element of P that has exactly m positive components. and that all basic feasible solutions are nondegenerate. (a) Show that x is a basic feasible solution. Exercise 2.15 P = {x E )Rn I Exercise 2. Suppose that u and v are distinct basic feasible solutions that satisfy a� u = a�v = bi . ) Let L = {Au + (1 . (b) Every basic feasible solution in the polyhedron P = {x I Ax 2: b} is nondegenerate.1. . . Hint: Using parts ( a) and ( b ) . . . let a be a vector in )Rn . i = 1 .
( Note that this number increases exponentially with n . but no more than that.8. but was limited to two and threedimensional spaces. (b) Show that every extreme point of P + Q is the sum of an extreme point of P and an extreme point of Q .2. Exercise 2. ( a) Suppose that the number m of constraints defining a polyhedron P is even.2 con straints. 1 1 Notes and sources E Q}. Dines (1918) . where all possible combinations are present. 2. it allows for a unified treatment . (c) Let n = 2P + p + 2 . 2.8 to find the optimal cost in a linear programming problem. in terms of the number of linearly independent active con straints. Show that after p eliminations. . where p is a nonnegative integer. 1 1 Notes and sources 79 Exercise 2. 2.21 Suppose that FourierMotzkin elimination is used in the manner Exercise 2. is not common. ) described at the end of Section 2. 2. because it provides the main bridge between the algebraic and geometric viewpoint . we consider it to be quite central. Consider a polyhedron in Rn defined by the 8 ( �) constraints 1 :s: i < j < k :s: n. Show. Nevertheless. FourierMotzkin elimination is due to Fourier ( 1827) . and shows that there is not much that is special about standard form problems. Let P + Q P. by means of an example. Show how this approach can be augmented to obtain an optimal solution as well. we have at least 2 2P + 2 constraints. y = {x + y I x E ( a) Show that P + Q i s a polyhedron. that the elimination algorithm may produce a description of the polyhedron IIn .22 Let P and Q be polyhedra in Rn . The insight that the same relation goes through in higher dimensions only came in the middle of the nineteenth century. (b) Show that the elimination algorithm produces a description of the one dimensional polyhedron III (P) involving no more than m2 n1 / 2 2 n .Sec. The relation between algebra and geometry goes far back in the history of mathematics. 
and Motzkin (1936) .l (P) involving as many as m 2 / 4 linear constraints.20 * Consider the FourierMotzkin elimination algorithm. Our algebraic definition of basic ( feasible ) solutions for general poly hedra.
Chapter 3

The simplex method

Contents
3.1. Optimality conditions
3.2. Development of the simplex method
3.3. Implementations of the simplex method
3.4. Anticycling: lexicography and Bland's rule
3.5. Finding an initial basic feasible solution
3.6. Column geometry and the simplex method
3.7. Computational efficiency of the simplex method
3.8. Summary
3.9. Exercises
3.10. Notes and sources
We saw in Chapter 2 that if a linear programming problem in standard form has an optimal solution, then there exists a basic feasible solution that is optimal. The simplex method is based on this fact and searches for an optimal solution by moving from one basic feasible solution to another, along the edges of the feasible set, always in a cost reducing direction. Eventually, a basic feasible solution is reached at which none of the available edges leads to a cost reduction; such a basic feasible solution is optimal, and the algorithm terminates.

In this chapter, we provide a detailed development of the simplex method and discuss a few different implementations, including the simplex tableau and the revised simplex method. We also address some difficulties that may arise in the presence of degeneracy. We provide an interpretation of the simplex method in terms of column geometry, and we conclude with a discussion of its running time, as a function of the dimension of the problem being solved.

3.1 Optimality conditions

Many optimization algorithms are structured as follows: given a feasible solution, we search its neighborhood to find a nearby feasible solution with lower cost. If no nearby feasible solution leads to a cost improvement, the algorithm terminates and we have a locally optimal solution. For general optimization problems, a locally optimal solution need not be (globally) optimal. Fortunately, in linear programming, local optimality implies global optimality; this is because we are minimizing a convex function over a convex set (cf. Exercise 3.1). In this section, we concentrate on the problem of searching for a direction of cost decrease in a neighborhood of a given basic feasible solution, and on the associated optimality conditions.

Throughout this chapter, we consider the standard form problem

   minimize    c'x
   subject to  Ax = b
               x >= 0.

We assume that the dimensions of the matrix A are m x n and that its rows are linearly independent, and we let P be the corresponding feasible set. We continue using our previous notation: Ai is the ith column of the matrix A, and ai' is its ith row.

Suppose that we are at a point x in P and that we contemplate moving away from x, in the direction of a vector d in R^n. Clearly, we should only consider those choices of d that do not immediately take us outside the feasible set. This leads to the following definition, illustrated in Figure 3.1.
Definition 3.1 Let x be an element of a polyhedron P. A vector d in R^n is said to be a feasible direction at x, if there exists a positive scalar theta for which x + theta*d belongs to P.

Figure 3.1: Feasible directions at different points of a polyhedron.

Let x be a basic feasible solution to the standard form problem, let B(1), ..., B(m) be the indices of the basic variables, and let B = [A_B(1) ... A_B(m)] be the corresponding basis matrix. In particular, we have xi = 0 for every nonbasic variable, while the vector x_B = (x_B(1), ..., x_B(m)) of basic variables is given by x_B = B^{-1}b.

We consider the possibility of moving away from x, to a new vector x + theta*d, by selecting a nonbasic variable xj (which is initially at zero level), and increasing it to a positive value theta, while keeping the remaining nonbasic variables at zero. Algebraically, dj = 1, and di = 0 for every nonbasic index i other than j. At the same time, the vector x_B of basic variables changes to x_B + theta*d_B, where d_B = (d_B(1), ..., d_B(m)) is the vector with those components of d that correspond to the basic variables.

Given that we are only interested in feasible solutions, we require A(x + theta*d) = b, and since x is feasible, we also have Ax = b. Thus, for the equality constraints to be satisfied for theta > 0, we need Ad = 0. Recall now that dj = 1, and that di = 0 for all other nonbasic indices i. Then,

   0 = Ad = sum_{i=1}^n Ai di = sum_{i=1}^m A_B(i) d_B(i) + Aj = B d_B + Aj.    (3.1)

Since the basis matrix B is invertible, we obtain d_B = -B^{-1}Aj.
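This construction of the jth basic direction (dj = 1, d_B = -B^{-1}Aj, and every other component zero) is easy to express in code. The sketch below applies it to a small 2 x 4 constraint matrix, with x1, x2 as the basic variables:

```python
import numpy as np

def basic_direction(A, basis, j):
    """jth basic direction at the basic solution associated with `basis`:
    d_j = 1, d_B = -B^{-1} A_j, and every other component equal to zero."""
    m, n = A.shape
    d = np.zeros(n)
    d[j] = 1.0
    d[basis] = -np.linalg.solve(A[:, basis], A[:, j])
    return d

# Constraint matrix with basic variables x1, x2:
A = np.array([[1.0, 1.0, 1.0, 1.0],
              [2.0, 0.0, 3.0, 4.0]])
d = basic_direction(A, basis=[0, 1], j=2)     # increase the nonbasic x3
assert np.allclose(d, [-1.5, 0.5, 1.0, 0.0])  # d_B = (-3/2, 1/2)
assert np.allclose(A @ d, 0)                  # Ad = 0, so Ax = b is preserved
```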
The direction vector d that we have just constructed will be referred to as the jth basic direction. We have so far guaranteed that the equality constraints are respected as we move away from x along the basic direction d. How about the nonnegativity constraints? We recall that the variable xj is increased, and all other nonbasic variables stay at zero level. Thus, we need only worry about the basic variables. We distinguish two cases:

(a) Suppose that x is a nondegenerate basic feasible solution. Then, x_B > 0, from which it follows that x_B + theta*d_B >= 0 when theta is sufficiently small. In particular, d is a feasible direction.

(b) Suppose now that x is degenerate. Then, d is not always a feasible direction. Indeed, it is possible that a basic variable x_B(i) is zero, while the corresponding component d_B(i) of d_B = -B^{-1}Aj is negative. In that case, if we follow the jth basic direction, the nonnegativity constraint for x_B(i) is immediately violated, and we are led to infeasible solutions; see Figure 3.2.

We now study the effects on the cost function if we move along a basic direction. If d is the jth basic direction, then the rate c'd of cost change along the direction d is given by c_B'd_B + cj, where c_B = (c_B(1), ..., c_B(m)). Using Eq. (3.1), this is the same as cj - c_B'B^{-1}Aj. For an intuitive interpretation, cj is the cost per unit increase in the variable xj, and the term -c_B'B^{-1}Aj is the cost of the compensating change in the basic variables necessitated by the constraint Ax = b. This quantity is important enough to warrant a definition.

Definition 3.2 Let x be a basic solution, let B be an associated basis matrix, and let c_B be the vector of costs of the basic variables. For each j, we define the reduced cost cbar_j of the variable xj according to the formula

   cbar_j = cj - c_B'B^{-1}Aj.

Example 3.1 Consider the linear programming problem

   minimize    c1x1 + c2x2 + c3x3 + c4x4
   subject to  x1 + x2 + x3 + x4 = 2
               2x1 + 3x3 + 4x4 = 2
               x1, x2, x3, x4 >= 0.

The first two columns of the matrix A are A1 = (1, 2) and A2 = (1, 0). Since they are linearly independent, we can choose x1 and x2 as our basic variables. The corresponding basis matrix is
Cj is the cost per unit increase in the variable Xj . XB > 0.1 Aj is negative. For each j . X2 . " " CB (m ) ) ' Using Eq. 0 ) . X3 . from which it follows that XB + OdB ?: 0. How about the nonnegativity constraints? We recall that the variable Xj is increased.
We set x₃ = x₄ = 0, and solve for x₁, x₂, to obtain x₁ = 1 and x₂ = 1. We have thus obtained a nondegenerate basic feasible solution.

A basic direction corresponding to an increase in the nonbasic variable x₃ is constructed as follows. We have d₃ = 1 and d₄ = 0. The direction of change of the basic variables is obtained using Eq. (3.1):

[ d₁ ]  =  -B^{-1}A₃  =  -[ 0    1/2 ] [ 1 ]  =  [ -3/2 ]
[ d₂ ]                    [ 1   -1/2 ] [ 3 ]     [  1/2 ].

The cost of moving along this basic direction is c'd = -3c₁/2 + c₂/2 + c₃. This is the same as the reduced cost of the variable x₃.

Figure 3.2: Let n = 5 and n - m = 2. As discussed in Section 1.4, we can visualize the feasible set by standing on the two-dimensional set defined by the constraint Ax = b; the edges of the feasible set are associated with the nonnegativity constraints x_i ≥ 0. At the nondegenerate basic feasible solution E, the variables x₁ and x₃ are at zero level (nonbasic), and x₂, x₄, x₅ are positive basic variables. The first basic direction is obtained by increasing x₁, while keeping the other nonbasic variable x₃ at zero level. This is the direction corresponding to the edge EF. Consider now the degenerate basic feasible solution F, and let x₃, x₅ be the nonbasic variables. Note that x₄ is a basic variable at zero level. A basic direction is obtained by increasing x₃, while keeping the other nonbasic variable x₅ at zero level. This is the direction corresponding to the line FG, and it takes us outside the feasible set. Thus, this basic direction is not a feasible direction.

Consider now Definition 3.2 for the case of a basic variable. Since B is the matrix [A_B(1) · · · A_B(m)], we have B^{-1}[A_B(1) · · · A_B(m)] = I, where I is the m × m identity matrix.
In particular, B^{-1}A_B(i) is the ith column of the identity matrix, which is the ith unit vector e_i. Therefore, for every basic variable x_B(i), we have

c̄_B(i) = c_B(i) - c_B'B^{-1}A_B(i) = c_B(i) - c_B'e_i = c_B(i) - c_B(i) = 0,

and we see that the reduced cost of every basic variable is zero.

Our next result provides us with optimality conditions. Given our interpretation of the reduced costs as rates of cost change along certain directions, this result is intuitive.

Theorem 3.1 Consider a basic feasible solution x associated with a basis matrix B, and let c̄ be the corresponding vector of reduced costs.
(a) If c̄ ≥ 0, then x is optimal.
(b) If x is optimal and nondegenerate, then c̄ ≥ 0.

Proof. (a) We assume that c̄ ≥ 0, we let y be an arbitrary feasible solution, and we define d = y - x. Feasibility implies that Ax = Ay = b and, therefore, Ad = 0. The latter equality can be rewritten in the form

B d_B + Σ_{i∈N} A_i d_i = 0,

where N is the set of indices corresponding to the nonbasic variables under the given basis. Since B is invertible, we obtain

d_B = -Σ_{i∈N} B^{-1}A_i d_i,

and

c'd = c_B'd_B + Σ_{i∈N} c_i d_i = Σ_{i∈N} (c_i - c_B'B^{-1}A_i) d_i = Σ_{i∈N} c̄_i d_i.

For any nonbasic index i ∈ N, we must have x_i = 0 and, since y is feasible, y_i ≥ 0. Thus, d_i ≥ 0 and c̄_i d_i ≥ 0, for all i ∈ N. We conclude that c'(y - x) = c'd ≥ 0, and since y was an arbitrary feasible solution, x is optimal.

(b) Suppose that x is a nondegenerate basic feasible solution and that c̄_j < 0 for some j. Since the reduced cost of a basic variable is always zero, x_j must be a nonbasic variable and c̄_j is the rate of cost change along the jth basic direction. Since x is nondegenerate, the jth basic direction is a feasible direction of cost decrease, as discussed earlier. By moving in that direction, we obtain feasible solutions whose cost is less than that of x, and x is not optimal. □
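Definition 3.2 and the optimality test of Theorem 3.1(a) translate directly into code. The sketch below (our own naming, exact Fraction arithmetic) computes all reduced costs by first forming p' = c_B'B^{-1}, then checks whether they are all nonnegative; the data is that of Example 3.1 with the cost vector c = (2, 0, 0, 0).

```python
from fractions import Fraction as F

def reduced_costs(c, A_cols, B_inv, basic_indices):
    """cbar_j = c_j - c_B' B^{-1} A_j (Definition 3.2)."""
    m, n = len(B_inv), len(c)
    c_B = [c[bi] for bi in basic_indices]
    # p' = c_B' B^{-1}, so each reduced cost needs only one inner product
    p = [sum(c_B[i] * B_inv[i][k] for i in range(m)) for k in range(m)]
    return [c[j] - sum(p[k] * A_cols[j][k] for k in range(m))
            for j in range(n)]

# Example 3.1 data with c = (2, 0, 0, 0); basis {x1, x2}.
A_cols = [[F(1), F(2)], [F(1), F(0)], [F(1), F(3)], [F(1), F(4)]]
B_inv = [[F(0), F(1, 2)], [F(1), F(-1, 2)]]
cbar = reduced_costs([F(2), F(0), F(0), F(0)], A_cols, B_inv, [0, 1])
print(cbar)                       # values 0, 0, -3, -4: basic entries are 0
print(all(r >= 0 for r in cbar))  # Theorem 3.1(a) test fails: x not optimal
```

Note that the reduced costs of the two basic variables come out exactly zero, as shown above for the general case.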
Note that Theorem 3.1 allows the possibility that x is a (degenerate) optimal basic feasible solution, but that c̄_j < 0 for some nonbasic index j. There is an analog of Theorem 3.1 that provides conditions under which a basic feasible solution x is a unique optimal solution; see Exercise 3.6. A related view of the optimality conditions is developed in Exercises 3.2 and 3.3.

According to Theorem 3.1, in order to decide whether a nondegenerate basic feasible solution is optimal, we need only check whether all reduced costs are nonnegative, which is the same as examining the n - m basic directions. If x is a degenerate basic feasible solution, an equally simple computational test for determining whether x is optimal is not available (see Exercises 3.7 and 3.8). Fortunately, the simplex method, as developed in subsequent sections, manages to get around this difficulty in an effective manner.

Note that in order to use Theorem 3.1 and assert that a certain basic solution is optimal, we need to satisfy two conditions: feasibility, and nonnegativity of the reduced costs. This leads us to the following definition.

Definition 3.3 A basis matrix B is said to be optimal if:
(a) B^{-1}b ≥ 0, and
(b) c̄' = c' - c_B'B^{-1}A ≥ 0'.

Clearly, if an optimal basis is found, the corresponding basic solution is feasible, satisfies the optimality conditions, and is therefore optimal. On the other hand, in the degenerate case, having an optimal basic feasible solution does not necessarily mean that the reduced costs are nonnegative.

3.2 Development of the simplex method

We will now complete the development of the simplex method. Our main task is to work out the details of how to move to a better basic feasible solution, whenever a profitable basic direction is discovered.

Let us assume that every basic feasible solution is nondegenerate. This assumption will remain in effect until it is explicitly relaxed later in this section.

Suppose that we are at a basic feasible solution x and that we have computed the reduced costs c̄_j of the nonbasic variables. If all of them are nonnegative, Theorem 3.1 shows that we have an optimal solution, and we stop. If on the other hand, the reduced cost c̄_j of a nonbasic variable x_j is negative, the jth basic direction d is a feasible direction of cost decrease. [This is the direction obtained by letting d_j = 1, d_i = 0 for i ≠ B(1), ..., B(m), j, and d_B = -B^{-1}A_j.]
While moving along this direction d, the nonbasic variable x_j becomes positive and all other nonbasic variables remain at zero. We describe this situation by saying that x_j (or A_j) enters or is brought into the basis.

Once we start moving away from x along the direction d, we are tracing points of the form x + θd, where θ ≥ 0. Since costs decrease along the direction d, it is desirable to move as far as possible. This takes us to the point x + θ*d, where

θ* = max{ θ ≥ 0 | x + θd ∈ P }.

The resulting cost change is θ*c'd, which is the same as θ*c̄_j.

Given that Ad = 0, we have A(x + θd) = Ax = b for all θ, and the equality constraints will never be violated. Thus, x + θd can become infeasible only if one of its components becomes negative. We distinguish two cases:

(a) If d ≥ 0, then x + θd ≥ 0 for all θ ≥ 0, the vector x + θd never becomes infeasible, and we let θ* = ∞.

(b) If d_i < 0 for some i, the constraint x_i + θd_i ≥ 0 becomes θ ≤ -x_i/d_i. This constraint on θ must be satisfied for every i with d_i < 0. Thus, the largest possible value of θ is

θ* = min_{ {i | d_i<0} } ( -x_i/d_i ).

Recall that if x_i is a nonbasic variable, then either x_i is the entering variable and d_i = 1, or else d_i = 0. In either case, d_i is nonnegative; thus, we only need to consider the basic variables, and we have the equivalent formula

θ* = min_{ {i=1,...,m | d_B(i)<0} } ( -x_B(i)/d_B(i) ).    (3.2)

Note that θ* > 0, because x_B(i) > 0 for all i, as a consequence of nondegeneracy.

Example 3.2 This is a continuation of Example 3.1 from the previous section, dealing with the linear programming problem

minimize    c₁x₁ + c₂x₂ + c₃x₃ + c₄x₄
subject to   x₁ +  x₂ +  x₃ +  x₄ = 2
            2x₁        + 3x₃ + 4x₄ = 2
             x₁, x₂, x₃, x₄ ≥ 0.

Let us again consider the basic feasible solution x = (1, 1, 0, 0), and recall that the reduced cost c̄₃ of the nonbasic variable x₃ was found to be -3c₁/2 + c₂/2 + c₃. Suppose that c = (2, 0, 0, 0), in which case, we have c̄₃ = -3. Since c̄₃ is negative, we form the corresponding basic direction, which is d = (-3/2, 1/2, 1, 0), and consider vectors of the form x + θd, with θ ≥ 0. As θ increases, the only component of x that decreases is the first one (because d₁ < 0). The largest possible value of θ is given by θ* = -(x₁/d₁) = 2/3.
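The ratio test of Eq. (3.2) can be sketched as follows; the function and its calling convention are ours, and the sample data reproduces the computation of Example 3.2.

```python
from fractions import Fraction as F

def step_length(x_B, d_B):
    """theta* = min over {i : d_B(i) < 0} of -x_B(i)/d_B(i)  (Eq. 3.2).
    Returns (theta*, l) with l a minimizing index, or (None, None)
    when d_B >= 0, signaling theta* = +infinity (case (a))."""
    ratios = [(-x_B[i] / d_B[i], i) for i in range(len(x_B)) if d_B[i] < 0]
    if not ratios:
        return None, None            # unbounded direction of decrease
    return min(ratios)

# Example 3.2 data: x_B = (x1, x2) = (1, 1), d_B = (-3/2, 1/2).
theta, l = step_length([F(1), F(1)], [F(-3, 2), F(1, 2)])
print(theta, l)   # theta* = 2/3, attained at the first basic variable
```

Only the basic variables appear, exactly as the derivation above allows.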
This takes us to the point

y = x + (2/3)d = (0, 4/3, 2/3, 0).

Note that the columns A₂ and A₃ corresponding to the nonzero variables at the new vector y are (1, 0) and (1, 3), respectively, and are linearly independent. Therefore, they form a basis and the vector y is a new basic feasible solution. In particular, the variable x₃ has entered the basis and the variable x₁ has exited the basis.

Once θ* is chosen, and assuming it is finite, we move to the new feasible solution y = x + θ*d. Since x_j = 0 and d_j = 1, we have y_j = θ* > 0. Let ℓ be a minimizing index in Eq. (3.2), that is,

-x_B(ℓ)/d_B(ℓ) = min_{ {i=1,...,m | d_B(i)<0} } ( -x_B(i)/d_B(i) ) = θ*;

in particular, d_B(ℓ) < 0, and x_B(ℓ) + θ*d_B(ℓ) = 0.

We observe that the basic variable x_B(ℓ) has become zero, whereas the nonbasic variable x_j has now become positive, which suggests that x_j should replace x_B(ℓ) in the basis. Accordingly, we take the old basis matrix B and replace A_B(ℓ) with A_j, thus obtaining the matrix

B̄ = [ A_B(1) · · · A_B(ℓ-1)  A_j  A_B(ℓ+1) · · · A_B(m) ].    (3.3)

Equivalently, we are replacing the set {B(1), ..., B(m)} of basic indices by a new set {B̄(1), ..., B̄(m)} of indices given by

B̄(i) = B(i),  i ≠ ℓ,
B̄(ℓ) = j.    (3.4)

Theorem 3.2
(a) The columns A_B(i), i ≠ ℓ, and A_j are linearly independent and, therefore, B̄ is a basis matrix.
(b) The vector y = x + θ*d is a basic feasible solution associated with the basis matrix B̄.

Proof. (a) If the vectors A_B̄(i), i = 1, ..., m, are linearly dependent, then there exist coefficients λ₁, ..., λ_m, not all of them zero, such that

Σ_{i=1}^{m} λ_i A_B̄(i) = 0,

which implies that

Σ_{i=1}^{m} λ_i B^{-1}A_B̄(i) = 0,

and the vectors B^{-1}A_B̄(i) are also linearly dependent. To show that this is not the case, we will prove that the vectors B^{-1}A_B(i), i ≠ ℓ, and B^{-1}A_j are linearly independent. Since B^{-1}B = I, and A_B(i) is the ith column of B, the vectors B^{-1}A_B(i), i ≠ ℓ, are all the unit vectors except for the ℓth unit vector. In particular, they are linearly independent and their ℓth component is zero. On the other hand, B^{-1}A_j is equal to -d_B. Its ℓth entry, -d_B(ℓ), is nonzero by the definition of ℓ. Thus, B^{-1}A_j is linearly independent from the unit vectors B^{-1}A_B(i), i ≠ ℓ, and it follows that the vectors B^{-1}A_B(i), i ≠ ℓ, together with B^{-1}A_j, are linearly independent.

(b) We have y ≥ 0, Ay = b, and y_i = 0 for i ≠ B̄(1), ..., B̄(m). Furthermore, the columns A_B̄(1), ..., A_B̄(m) have just been shown to be linearly independent. It follows that y is a basic feasible solution associated with the basis matrix B̄. □

Since θ* is positive, the new basic feasible solution x + θ*d is distinct from x; since d is a direction of cost decrease, the cost of this new basic feasible solution is strictly smaller. We have therefore accomplished our objective of moving to a new basic feasible solution with lower cost.

For our purposes, it is convenient to define a vector u = (u₁, ..., u_m) by letting

u = B^{-1}A_j,

where A_j is the column that enters the basis. Since u = -d_B, the formula (3.2) for θ* becomes θ* = min over {i | u_i > 0} of x_B(i)/u_i.

We can now summarize a typical iteration of the simplex method.

An iteration of the simplex method
1. In a typical iteration, we start with a basis consisting of the basic columns A_B(1), ..., A_B(m), and an associated basic feasible solution x.
2. Compute the reduced costs c̄_j = c_j - c_B'B^{-1}A_j for all nonbasic indices j. If they are all nonnegative, the current basic feasible solution is optimal, and the algorithm terminates; else, choose some j for which c̄_j < 0.
3. Compute u = B^{-1}A_j. If no component of u is positive, we have θ* = ∞, the optimal cost is -∞, and the algorithm terminates.
4. If some component of u is positive, let

θ* = min_{ {i=1,...,m | u_i>0} } ( x_B(i)/u_i ).

5. Let ℓ be such that θ* = x_B(ℓ)/u_ℓ. Form a new basis by replacing A_B(ℓ) with A_j. If y is the new basic feasible solution, the values of the new basic variables are y_j = θ* and y_B(i) = x_B(i) - θ*u_i, i ≠ ℓ.

The simplex method is initialized with an arbitrary basic feasible solution, which, for feasible standard form problems, is guaranteed to exist. The following theorem states that, in the nondegenerate case, the simplex method works correctly and terminates after a finite number of iterations.

Theorem 3.3 Assume that the feasible set is nonempty and that every basic feasible solution is nondegenerate. Then, the simplex method terminates after a finite number of iterations. At termination, there are the following two possibilities:
(a) We have an optimal basis B and an associated basic feasible solution which is optimal.
(b) We have found a vector d satisfying Ad = 0, d ≥ 0, and c'd < 0, and the optimal cost is -∞.

Proof. If the algorithm terminates due to the stopping criterion in Step 2, then the optimality conditions in Theorem 3.1 have been met, B is an optimal basis, and the current basic feasible solution is optimal. If the algorithm terminates because the criterion in Step 3 has been met, then we are at a basic feasible solution x and we have discovered a nonbasic variable x_j such that c̄_j < 0 and such that the corresponding basic direction d satisfies Ad = 0 and d ≥ 0. In particular, x + θd ∈ P for all θ > 0. Since c'd = c̄_j < 0, by taking θ arbitrarily large, the cost can be made arbitrarily negative, and the optimal cost is -∞.

At each iteration, the algorithm moves by a positive amount θ* along a direction d that satisfies c'd < 0. Therefore, the cost of every successive basic feasible solution visited by the algorithm is strictly less than the cost of the previous one, and no basic feasible solution can be visited twice. Since there is a finite number of basic feasible solutions, the algorithm must eventually terminate. □

Theorem 3.3 provides an independent proof of some of the results of Chapter 2 for nondegenerate standard form problems. In particular, it shows that for feasible and nondegenerate problems, either the optimal cost is -∞, or there exists a basic feasible solution which is optimal (cf. Theorem 2.8 in Section 2.6).
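The five steps above can be combined into a complete, if naive, simplex routine. The following sketch is our own design, not the book's: it recomputes every B^{-1}-related quantity from scratch each iteration by exact Gauss-Jordan elimination, and it assumes a known nondegenerate starting basis. It is run here on the problem of Example 3.2.

```python
from fractions import Fraction as F

def solve_system(M, rhs):
    """Solve M z = rhs exactly by Gauss-Jordan elimination."""
    m = len(M)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(M)]
    for col in range(m):
        piv = next(r for r in range(col, m) if aug[r][col] != 0)
        aug[col], aug[piv] = aug[piv], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(m):
            if r != col and aug[r][col] != 0:
                aug[r] = [a - aug[r][col] * b
                          for a, b in zip(aug[r], aug[col])]
    return [aug[i][m] for i in range(m)]

def simplex(c, A_cols, b, basic):
    """Naive simplex for min c'x s.t. Ax = b, x >= 0, started from the
    basic feasible solution with index set `basic` (nondegeneracy assumed).
    Returns ('optimal', x) or ('unbounded', None)."""
    m, n = len(b), len(c)
    while True:
        B = [[A_cols[basic[k]][i] for k in range(m)] for i in range(m)]
        x_B = solve_system(B, b)                      # x_B = B^{-1} b
        Bt = [[B[r][i] for r in range(m)] for i in range(m)]
        p = solve_system(Bt, [c[k] for k in basic])   # p' B = c_B'
        # Step 2: find a nonbasic variable with negative reduced cost.
        j = next((j for j in range(n) if j not in basic and
                  c[j] - sum(p[i] * A_cols[j][i] for i in range(m)) < 0),
                 None)
        if j is None:                                 # all cbar_j >= 0
            x = [F(0)] * n
            for k, bk in enumerate(basic):
                x[bk] = x_B[k]
            return 'optimal', x
        u = solve_system(B, A_cols[j])                # Step 3: u = B^{-1} A_j
        if all(ui <= 0 for ui in u):
            return 'unbounded', None
        # Steps 4-5: ratio test, then basis change.
        theta, l = min((x_B[i] / u[i], i) for i in range(m) if u[i] > 0)
        basic[l] = j

# Example 3.2: min 2*x1 s.t. x1+x2+x3+x4 = 2, 2*x1+3*x3+4*x4 = 2, x >= 0.
A_cols = [[F(1), F(2)], [F(1), F(0)], [F(1), F(3)], [F(1), F(4)]]
status, x = simplex([F(2), F(0), F(0), F(0)], A_cols, [F(2), F(2)], [0, 1])
print(status, x)   # optimal at (0, 4/3, 2/3, 0), the point y found above
```

Starting from the basis {x₁, x₂}, the routine performs exactly the basis change of Example 3.2 (x₃ enters, x₁ exits) and then stops.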
While the proof given here might appear more elementary, its extension to the degenerate case is not as simple.

The simplex method for degenerate problems

We have been working so far under the assumption that all basic feasible solutions are nondegenerate. Suppose now that the exact same algorithm is used in the presence of degeneracy. Then, the following new possibilities may be encountered in the course of the algorithm.

(a) If the current basic feasible solution x is degenerate, θ* can be equal to zero, in which case, the new basic feasible solution y is the same as x. This happens if some basic variable x_B(ℓ) is equal to zero and the corresponding component d_B(ℓ) of the direction vector d is negative. Nevertheless, we can still define a new basis B̄, by replacing A_B(ℓ) with A_j [cf. Eqs. (3.3)-(3.4)], and Theorem 3.2 is still valid.

(b) Even if θ* is positive, it may happen that more than one of the original basic variables becomes zero at the new point x + θ*d. Since only one of them exits the basis, the others remain in the basis at zero level, and the new basic feasible solution is degenerate.

Basis changes while staying at the same basic feasible solution are not in vain. As illustrated in Figure 3.3, a sequence of such basis changes may lead to the eventual discovery of a cost reducing feasible direction. On the other hand, a sequence of basis changes might lead back to the initial basis, in which case the algorithm may loop indefinitely. This undesirable phenomenon is called cycling. An example of cycling is given in Section 3.3, after we develop some bookkeeping tools for carrying out the mechanics of the algorithm. It is sometimes maintained that cycling is an exceptionally rare phenomenon. However, for many highly structured linear programming problems, most basic feasible solutions are degenerate, and cycling is a real possibility. Cycling can be avoided by judiciously choosing the variables that will enter or exit the basis (see Section 3.4).

Pivot selection

The simplex algorithm, as we described it, has certain degrees of freedom: in Step 2, we are free to choose any j whose reduced cost c̄_j is negative; also, in Step 5, there may be several indices ℓ that attain the minimum in the definition of θ*, and we are free to choose any one of them. Rules for making such choices are called pivoting rules. We now discuss the freedom available in this respect. Regarding the choice of the entering column, the following rules are some natural candidates:
(a) Choose a column A_j, with c̄_j < 0, whose reduced cost is the most negative. Since the reduced cost is the rate of change of the cost function, this rule chooses a direction along which costs decrease at the fastest rate. However, the actual cost decrease depends on how far we move along the chosen direction. This suggests the next rule.

(b) Choose a column with c̄_j < 0 for which the corresponding cost decrease θ*|c̄_j| is largest. This rule offers the possibility of reaching optimality after a smaller number of iterations. On the other hand, the computational burden at each iteration is larger, because we need to compute θ* for each column with c̄_j < 0. The available empirical evidence suggests that the overall running time does not improve.

For large problems, even the rule that chooses the most negative c̄_j can be computationally expensive, because it requires the computation of the reduced cost of every variable. In practice, simpler rules are sometimes used.

Figure 3.3: We visualize a problem in standard form, with n - m = 2, by standing on the two-dimensional plane defined by the equality constraints Ax = b. The basic feasible solution x is degenerate. If x₄ and x₅ are the nonbasic variables, then the two corresponding basic directions are the vectors g and f. For either of these two basic directions, we have θ* = 0. However, if we perform a change of basis, with x₄ entering the basis and x₆ exiting, the new nonbasic variables are x₅ and x₆, and the two basic directions are h and -g. (The direction -g is the one followed if x₆ is increased while x₅ is kept at zero.) In particular, we can now follow direction h to reach a new basic feasible solution y with lower cost.
One such rule is the smallest subscript rule, that chooses the smallest j for which c̄_j is negative. Under this rule, once a negative reduced cost is discovered, there is no reason to compute the remaining reduced costs. Other criteria that have been found to improve the overall running time are the Devex rule (Harris, 1973) and the steepest edge rule (Goldfarb and Reid, 1977). Finally, there are methods based on candidate lists, whereby one examines the reduced costs of nonbasic variables by picking them one at a time from a prioritized list. There are different ways of maintaining such prioritized lists, depending on the rule used for adding, removing, or reordering elements of the list.

Regarding the choice of the exiting column, the simplest option is again the smallest subscript rule: out of all variables eligible to exit the basis, choose one with the smallest subscript. It turns out that by following the smallest subscript rule for both the entering and the exiting column, cycling can be avoided (cf. Section 3.4).

3.3 Implementations of the simplex method

In this section, we discuss some ways of carrying out the mechanics of the simplex method. It should be clear from the statement of the algorithm that the vectors B^{-1}A_j play a key role. If these vectors are available, the reduced costs, the direction of motion, and the stepsize θ* are easily computed. Thus, the main difference between alternative implementations lies in the way that the vectors B^{-1}A_j are computed, and on the amount of related information that is carried from one iteration to the next.

When comparing different implementations, it is important to keep the following facts in mind (cf. Section 1.6). If B is a given m × m matrix and b ∈ ℝ^m is a given vector, computing the inverse of B or solving a linear system of the form Bx = b takes O(m³) arithmetic operations. Computing a matrix-vector product Bb takes O(m²) operations. Finally, computing an inner product p'b of two m-dimensional vectors takes O(m) arithmetic operations.

Naive implementation

We start by describing the most straightforward implementation in which no auxiliary information is carried from one iteration to the next. At the beginning of a typical iteration, we have the indices B(1), ..., B(m) of the current basic variables. We form the basis matrix B and compute p' = c_B'B^{-1}, by solving the linear system p'B = c_B' for the unknown vector p. (This vector p is called the vector of simplex multipliers associated with the basis B.) The reduced cost c̄_j = c_j - c_B'B^{-1}A_j of any variable x_j is then obtained according to the formula

c̄_j = c_j - p'A_j.
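The smallest subscript (Bland) rule for both choices can be sketched as a pair of selection functions; the function names and sample data below are ours, for illustration.

```python
from fractions import Fraction as F

def entering_smallest_subscript(cbar, basic):
    """Step 2 under Bland's rule: the first nonbasic j with cbar_j < 0.
    Scanning stops at the first hit, so later reduced costs need not
    even be computed."""
    return next((j for j in range(len(cbar))
                 if j not in basic and cbar[j] < 0), None)

def exiting_smallest_subscript(x_B, u, basic):
    """Step 5 under Bland's rule: among rows attaining the minimum ratio
    x_B(i)/u_i, pick the one whose basic variable has the smallest
    subscript."""
    theta = min(x_B[i] / u[i] for i in range(len(u)) if u[i] > 0)
    return min((i for i in range(len(u))
                if u[i] > 0 and x_B[i] / u[i] == theta),
               key=lambda i: basic[i])

# Hypothetical data: cbar = (0, 1, -2, -5, 0), basis {x1, x5} (0-based 0, 4).
print(entering_smallest_subscript([F(0), F(1), F(-2), F(-5), F(0)], [0, 4]))
# chooses index 2, even though index 3 has the more negative reduced cost
```

Using both functions together gives the anti-cycling guarantee mentioned above.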
Depending on the pivoting rule employed, we may have to compute all of the reduced costs, or we may compute them one at a time until a variable with a negative reduced cost is encountered. Once a column A_j is selected to enter the basis, we solve the linear system Bu = A_j in order to determine the vector u = B^{-1}A_j. At this point, we can form the direction along which we will be moving away from the current basic feasible solution. We finally determine θ* and the variable that will exit the basis, and construct the new basic feasible solution.

We note that we need O(m³) arithmetic operations to solve the systems p'B = c_B' and Bu = A_j. In addition, computing the reduced costs of all variables requires O(mn) arithmetic operations, because we need to form the inner product of the vector p with each one of the nonbasic columns A_j. Thus, the total computational effort per iteration is O(m³ + mn). We will see shortly that alternative implementations require only O(m² + mn) arithmetic operations. Therefore, the implementation described here is rather inefficient, in general. On the other hand, for certain problems with a special structure, the linear systems p'B = c_B' and Bu = A_j can be solved very fast, in which case this implementation can be of practical interest. We will revisit this point in Chapter 7, when we apply the simplex method to network flow problems.

Revised simplex method

Much of the computational burden in the naive implementation is due to the need for solving two linear systems of equations. In an alternative implementation, the matrix B^{-1} is made available at the beginning of each iteration, and the vectors c_B'B^{-1} and B^{-1}A_j are computed by a matrix-vector multiplication. For this approach to be practical, we need an efficient method for updating the matrix B^{-1} each time that we effect a change of basis. This is discussed next.

Let

B = [ A_B(1) · · · A_B(ℓ-1)  A_B(ℓ)  A_B(ℓ+1) · · · A_B(m) ]

be the basis matrix at the beginning of an iteration, and let

B̄ = [ A_B(1) · · · A_B(ℓ-1)  A_j  A_B(ℓ+1) · · · A_B(m) ]

be the basis matrix at the beginning of the next iteration. These two basis matrices have the same columns, except that the ℓth column A_B(ℓ) (the one that exits the basis) has been replaced by A_j. It is then reasonable to expect that B^{-1} contains information that can be exploited in the computation of B̄^{-1}. After we develop some needed tools and terminology, we will see that this is indeed the case. An alternative explanation and line of development is outlined in Exercise 3.13.
Definition 3.4 Given a matrix, not necessarily square, the operation of adding a constant multiple of one row to the same or to another row is called an elementary row operation.

The example that follows indicates that performing an elementary row operation on a matrix C is equivalent to forming the matrix QC, where Q is a suitably constructed square matrix.

Example 3.3 Let

Q = [ 1  0  2 ]        C = [ 1  2 ]
    [ 0  1  0 ],           [ 3  4 ],
    [ 0  0  1 ]            [ 5  6 ]

and note that

QC = [ 11  14 ]
     [  3   4 ].
     [  5   6 ]

In particular, multiplication from the left by the matrix Q has the effect of multiplying the third row of C by two and adding it to the first row.

Generalizing Example 3.3, we see that multiplying the jth row by β and adding it to the ith row (for i ≠ j) is the same as left-multiplying by the matrix Q = I + D_ij, where D_ij is a matrix with all entries equal to zero, except for the (i, j)th entry, which is equal to β. The determinant of such a matrix Q is equal to 1 and, therefore, Q is invertible.

Suppose now that we apply a sequence of K elementary row operations, and that the kth such operation corresponds to left-multiplication by a certain invertible matrix Q_k. Then, the sequence of these elementary row operations is the same as left-multiplication by the invertible matrix Q_K Q_{K-1} · · · Q₂Q₁. We conclude that performing a sequence of elementary row operations on a given matrix is equivalent to left-multiplying that matrix by a certain invertible matrix.

Since B^{-1}B = I, we see that B^{-1}A_B(i) is the ith unit vector e_i. Therefore,

B^{-1}B̄ = [ e₁ · · · e_{ℓ-1}  u  e_{ℓ+1} · · · e_m ],

where u = B^{-1}A_j; that is, B^{-1}B̄ is the identity matrix, except that its ℓth column has been replaced by u:

B^{-1}B̄ = [ 1        u₁             ]
          [    ·     ·              ]
          [       1  u_{ℓ-1}        ]
          [          u_ℓ            ]
          [          u_{ℓ+1}  1     ]
          [          ·           ·  ]
          [          u_m           1 ].
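This observation can be checked in code. The sketch below (our own naming) builds Q = I + D_ij and verifies, on a hypothetical 3 × 2 matrix C, that left-multiplication by Q adds β times row j to row i.

```python
from fractions import Fraction as F

def row_op_matrix(m, i, j, beta):
    """Q = I + D_ij: D_ij has beta in entry (i, j) and zeros elsewhere.
    Left-multiplying by Q adds beta times row j to row i; det(Q) = 1."""
    Q = [[F(1) if r == c else F(0) for c in range(m)] for r in range(m)]
    Q[i][j] += beta
    return Q

def matmul(P, C):
    """Plain matrix product of an m x m matrix P and an m x k matrix C."""
    return [[sum(P[r][k] * C[k][c] for k in range(len(C)))
             for c in range(len(C[0]))] for r in range(len(P))]

# Add 2 times the third row to the first row of a hypothetical 3 x 2 matrix.
Q = row_op_matrix(3, 0, 2, F(2))
C = [[F(1), F(2)], [F(3), F(4)], [F(5), F(6)]]
print(matmul(Q, C))   # first row becomes (11, 14); the others are unchanged
```

Composing several such Q matrices reproduces the Q_K · · · Q₁ product described above.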
Our objective is to change this matrix to the identity matrix. Let us apply a sequence of elementary row operations that accomplishes this:

(a) For each i ≠ ℓ, we add the ℓth row times -u_i/u_ℓ to the ith row. (Recall that u_ℓ > 0.) This replaces u_i by zero.

(b) We divide the ℓth row by u_ℓ. This replaces u_ℓ by one.

In words, we are adding to each row a multiple of the ℓth row to replace the ℓth column u by the ℓth unit vector e_ℓ. This sequence of elementary row operations is equivalent to left-multiplying B^{-1}B̄ by a certain invertible matrix Q, which yields QB^{-1}B̄ = I. Since the result is the identity, we have QB^{-1} = B̄^{-1}. The last equation shows that if we apply the same sequence of row operations to the matrix B^{-1} (equivalently, left-multiply by Q), we obtain B̄^{-1}. We conclude that all it takes to generate B̄^{-1}, is to start with B^{-1} and apply the sequence of elementary row operations described above.

Example 3.4 Suppose that ℓ = 3 and that u = B^{-1}A_j = (-4, 2, 2), so that our objective is to transform the vector u to the unit vector e₃ = (0, 0, 1). To this effect, consider the following sequence of elementary row operations. We multiply the third row by 2 and add it to the first row; this replaces the first component of u by zero. We subtract the third row from the second row; this replaces the second component of u by zero. Finally, we divide the third row by 2; this replaces the third component of u by one. Applying this same sequence of elementary row operations to B^{-1} yields B̄^{-1}.

When the matrix B^{-1} is updated in the manner we have described, we obtain an implementation of the simplex method known as the revised simplex method, which we summarize below.

An iteration of the revised simplex method
1. In a typical iteration, we start with a basis consisting of the basic columns A_B(1), ..., A_B(m), an associated basic feasible solution x, and the inverse B^{-1} of the basis matrix.
2. Compute the row vector p' = c_B'B^{-1}, and then compute the reduced costs c̄_j = c_j - p'A_j. If they are all nonnegative, the current basic feasible solution is optimal, and the algorithm terminates; else, choose some j for which c̄_j < 0.
3. Compute u = B^{-1}A_j. If no component of u is positive, the optimal cost is -∞, and the algorithm terminates.
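The update of B^{-1} described in steps (a) and (b) can be sketched as follows. The function name is ours; as a built-in sanity check, applying the same row operations to u itself (viewed as a column) must produce the unit vector e_ℓ, which the example verifies for u = (-4, 2, 2) and ℓ = 3.

```python
from fractions import Fraction as F

def update_inverse(B_inv, u, l):
    """Produce Bbar^{-1} from B^{-1} by the elementary row operations of
    the text: for i != l, add -(u_i/u_l) times row l to row i; then
    divide row l by u_l. Requires u_l > 0 (the pivot element)."""
    m = len(B_inv)
    new = [row[:] for row in B_inv]
    for i in range(m):
        if i != l:
            new[i] = [a - (u[i] / u[l]) * b for a, b in zip(new[i], new[l])]
    new[l] = [b / u[l] for b in new[l]]
    return new

# The same operations must turn u into e_l: here l = 3 (0-based index 2).
u = [F(-4), F(2), F(2)]
e = update_inverse([[ui] for ui in u], u, 2)
print(e)   # rows are now 0, 0, 1: u has become e_3
```

One update costs O(m²) operations, which is where the revised method saves over the O(m³) linear solves of the naive implementation.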
4. If some component of u is positive, let

θ* = min_{ {i=1,...,m | u_i>0} } ( x_B(i)/u_i ).

5. Let ℓ be such that θ* = x_B(ℓ)/u_ℓ. Form a new basis by replacing A_B(ℓ) with A_j. If y is the new basic feasible solution, the values of the new basic variables are y_j = θ* and y_B(i) = x_B(i) - θ*u_i, i ≠ ℓ.

6. Form the m × (m+1) matrix [B^{-1} | u]. Add to each one of its rows a multiple of the ℓth row to make the last column equal to the unit vector e_ℓ. The first m columns of the result is the matrix B̄^{-1}.

The full tableau implementation

We finally describe the implementation of the simplex method in terms of the so-called full tableau. Here, instead of maintaining and updating the matrix B^{-1}, we maintain and update the m × (n+1) matrix

B^{-1}[b | A]

with columns B^{-1}b and B^{-1}A₁, ..., B^{-1}A_n. This matrix is called the simplex tableau. Note that the column B^{-1}b, called the zeroth column, contains the values of the basic variables. The column B^{-1}A_i is called the ith column of the tableau. The column u = B^{-1}A_j corresponding to the variable that enters the basis is called the pivot column. If the ℓth basic variable exits the basis, the ℓth row of the tableau is called the pivot row. Finally, the element belonging to both the pivot row and the pivot column is called the pivot element. Note that the pivot element is u_ℓ and is always positive (unless u ≤ 0, in which case the algorithm has met the termination condition in Step 3).

The information contained in the rows of the tableau admits the following interpretation. The equality constraints are initially given to us in the form b = Ax. Given the current basis matrix B, these equality constraints can also be expressed in the equivalent form

B^{-1}b = B^{-1}Ax,

which is precisely the information in the tableau. In other words, the rows of the tableau provide us with the coefficients of the equality constraints B^{-1}b = B^{-1}Ax.

At the end of each iteration, we need to update the tableau B^{-1}[b | A] and compute B̄^{-1}[b | A]. This can be accomplished by left-multiplying the simplex tableau with a matrix Q satisfying QB^{-1} = B̄^{-1}.
As explained earlier, this is the same as performing those elementary row operations that turn B^{-1} to B̄^{-1}; that is, we add to each row a multiple of the pivot row, to set all entries of the pivot column to zero, with the exception of the pivot element, which is set to one.

Regarding the determination of the exiting column A_B(ℓ) and the stepsize θ*, Steps 4 and 5 in the summary of the simplex method amount to the following: x_B(i)/u_i is the ratio of the ith entry in the zeroth column of the tableau to the ith entry in the pivot column of the tableau. We only consider those i for which u_i is positive. The smallest ratio is equal to θ* and determines ℓ.

It is customary to augment the simplex tableau by including a top row, to be referred to as the zeroth row. The entry at the top left corner contains the value -c_B'x_B, which is the negative of the current cost. (The reason for the minus sign is that it allows for a simple update rule, as will be seen shortly.) The rest of the zeroth row is the row vector of reduced costs, that is, the vector c̄' = c' - c_B'B^{-1}A. Thus, the structure of the tableau is:

[ -c_B'B^{-1}b | c' - c_B'B^{-1}A ]
[    B^{-1}b   |      B^{-1}A    ]

or, in more detail,

[ -c_B'x_B | c̄₁ · · · c̄_n              ]
[  x_B(1)  |                          ]
[    ·     | B^{-1}A₁ · · · B^{-1}A_n ]
[  x_B(m)  |                          ].

The rule for updating the zeroth row turns out to be identical to the rule used for the other rows of the tableau: add a multiple of the pivot row to the zeroth row, to set the reduced cost of the entering variable to zero. We will now verify that this update rule produces the correct results for the zeroth row.

At the beginning of a typical iteration, the zeroth row is of the form

[0 | c'] - g'[b | A],

where g' = c_B'B^{-1}. Hence, the zeroth row is equal to [0 | c'] plus a linear combination of the rows of [b | A]. Let column j be the pivot column, and row ℓ be the pivot row. Note that the pivot row is of the form h'[b | A], where the vector h' is the ℓth row of B^{-1}. Hence, after a multiple of the pivot row is added to the zeroth row, that row is again equal to [0 | c'] plus a (different) linear combination of the rows of [b | A].
Recall that our update rule is such that the pivot column entry of the zeroth row becomes zero, that is, c_j - p'A_j = 0. Consider now the B(i)th column for i ≠ ℓ. (This is a column corresponding to a basic variable that stays in the basis.) The zeroth row entry of that column is zero before the change of basis, since it is the reduced cost of a basic variable. Because B^{-1}A_B(i) is the ith unit vector and i ≠ ℓ, the entry in the pivot row for that column is also equal to zero. Hence, adding a multiple of the pivot row to the zeroth row of the tableau does not affect the zeroth row entry of that column, which is left at zero. We conclude that the vector p satisfies c_B(i) - p'A_B(i) = 0 for every column A_B(i) in the new basis. This implies that p'B̄ = c_B̄', so that p' = c_B̄'B̄^{-1}, and hence, with our update rule, the updated zeroth row of the tableau is equal to

    [0 | c'] - c_B̄'B̄^{-1}[b | A],

as desired.

We can now summarize the mechanics of the full tableau implementation.

An iteration of the full tableau implementation

1. A typical iteration starts with the tableau associated with a basis matrix B and the corresponding basic feasible solution x.
2. Examine the reduced costs in the zeroth row of the tableau. If they are all nonnegative, the current basic feasible solution is optimal, and the algorithm terminates; else, choose some j for which c̄_j < 0.
3. Consider the vector u = B^{-1}A_j, which is the jth column (the pivot column) of the tableau. If no component of u is positive, the optimal cost is -∞, and the algorithm terminates.
4. For each i for which u_i is positive, compute the ratio x_B(i)/u_i. Let ℓ be the index of a row that corresponds to the smallest ratio. The column A_B(ℓ) exits the basis and the column A_j enters the basis.
5. Add to each row of the tableau a constant multiple of the ℓth row (the pivot row) so that u_ℓ (the pivot element) becomes one and all other entries of the pivot column become zero.
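In code, Step 5 (together with the identical zeroth-row update) is a handful of elementary row operations. The following is a minimal illustrative sketch in Python (ours, not the book's); the tableau is stored densely, with the zeroth row and zeroth column in position 0.

```python
import numpy as np

def pivot(tableau, l, j):
    """Perform the elementary row operations of one simplex iteration.

    tableau: (m+1) x (n+1) array; row 0 is the zeroth row (reduced costs,
             with -c_B'x_B in the corner) and column 0 is the zeroth column
             (values of the basic variables).
    l:       index of the pivot row (a row of B^{-1}[b | A], so l >= 1).
    j:       index of the pivot column (the entering variable, j >= 1).
    """
    T = tableau
    T[l, :] /= T[l, j]                 # make the pivot element equal to one
    for i in range(T.shape[0]):        # note: the zeroth row obeys the same rule
        if i != l:
            T[i, :] -= T[i, j] * T[l, :]   # zero out the rest of the pivot column
    return T
```

Applied to the initial tableau of Example 3.5 below with pivot row 2 and pivot column 1, this reproduces the second tableau of that example.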
Example 3.5 Consider the problem

    minimize   -10x1 - 12x2 - 12x3
    subject to   x1 + 2x2 + 2x3 <= 20
                2x1 +  x2 + 2x3 <= 20
                2x1 + 2x2 +  x3 <= 20
                 x1, x2, x3 >= 0.

After introducing slack variables, we obtain the following standard form problem:

    minimize   -10x1 - 12x2 - 12x3
    subject to   x1 + 2x2 + 2x3 + x4           = 20
                2x1 +  x2 + 2x3      + x5      = 20
                2x1 + 2x2 +  x3           + x6 = 20
                 x1, ..., x6 >= 0.

Note that x = (0, 0, 0, 20, 20, 20) is a basic feasible solution and can be used to start the algorithm. Let accordingly B(1) = 4, B(2) = 5, and B(3) = 6. The corresponding basis matrix is the identity matrix I. The feasible set is shown in Figure 3.4. [Figure 3.4: the feasible set in Example 3.5; note that there are four active constraints at point D.] Note that we have five extreme points. These are A = (0, 0, 0) with cost 0, B = (0, 0, 10) with cost -120, C = (0, 10, 0) with cost -120, D = (10, 0, 0) with cost -100, and E = (4, 4, 4) with cost -136. In particular, E is the unique optimal solution.

To obtain the zeroth row of the initial tableau, we note that c_B = 0 and, therefore, c_B'x_B = 0 and c̄ = c. Hence, we have the following initial tableau:

            x1    x2    x3    x4    x5    x6
      0  | -10   -12   -12    0     0     0
x4 = 20  |  1     2     2     1     0     0
x5 = 20  |  2*    1     2     0     1     0
x6 = 20  |  2     2     1     0     0     1

We note a few conventions in the format of the above tableau: the label xi on top of the ith column indicates the variable associated with that column. The labels "xi =" to the left of the tableau tell us which are the basic variables and in what order. For example, the first basic variable x_B(1) is x4, and its value is 20. Similarly, x_B(2) = x5 = 20, and x_B(3) = x6 = 20. Strictly speaking, these labels are not quite necessary, because the column in the tableau associated with the first basic variable must be the first unit vector. Once we observe that the column associated with the variable x4 is the first unit vector, it follows that x4 is the first basic variable.

The reduced cost of x1 is negative and we let that variable enter the basis. The pivot column is u = (1, 2, 2). We form the ratios x_B(i)/u_i, i = 1, 2, 3; the smallest ratio corresponds to i = 2 and i = 3. We break this tie by choosing ℓ = 2. The second basic variable x_B(2), which is x5, exits the basis. This determines the pivot element, which we indicate by an asterisk. To update the tableau, we divide the pivot row by 2; we multiply the pivot row by 5 and add it to the zeroth row; we multiply the pivot row by 1/2 and subtract it from the first row; and we subtract the pivot row from the third row. This leads us to the new tableau:

            x1    x2    x3    x4    x5    x6
    100  |  0    -7    -2     0     5     0
x4 = 10  |  0    1.5    1*    1   -0.5    0
x1 = 10  |  1    0.5    1     0    0.5    0
x6 =  0  |  0     1    -1     0    -1     1

The new basis is given by B(1) = 4, B(2) = 1, and B(3) = 6. The corresponding basic feasible solution is x = (10, 0, 0, 10, 0, 0). In terms of the original variables x1, x2, x3, we have moved to point D = (10, 0, 0) in Figure 3.4, and the cost has been reduced to -100. Note that this is a degenerate basic feasible solution, because the basic variable x6 is equal to zero. We have mentioned earlier that the rows of the tableau (other than the zeroth row) amount to a representation of the equality constraints B^{-1}Ax = B^{-1}b, which are equivalent to the original constraints Ax = b. In our current example, the tableau indicates that the equality constraints can be written in the equivalent form:

          1.5x2 + x3 + x4 - 0.5x5      = 10
    x1 + 0.5x2 + x3      + 0.5x5      = 10
            x2 - x3      -    x5 + x6 =  0.

At this point, the variables x2 and x3 have negative reduced costs. Let us choose x3 to be the one that enters the basis. The pivot column is u = (1, 1, -1). Since u_3 < 0, we only form the ratios x_B(i)/u_i, for i = 1, 2. There is again a tie, which we break by letting ℓ = 1, and the first basic variable, x4, exits the basis. The pivot element is again indicated by an asterisk. After carrying out the necessary elementary row operations, we obtain the following new tableau:

            x1    x2    x3    x4    x5    x6
    120  |  0    -4     0     2     4     0
x3 = 10  |  0    1.5    1     1   -0.5    0
x1 =  0  |  1    -1     0    -1     1     0
x6 = 10  |  0    2.5*   0     1   -1.5    1

In terms of Figure 3.4, we have moved to point B = (0, 0, 10), and the cost has been reduced to -120. With the current tableau, x2 is the only variable with negative reduced cost. We bring x2 into the basis; forming the ratios x_B(i)/u_i for i = 1, 3, the smallest corresponds to i = 3, so x6 exits, and the resulting tableau is:

            x1    x2    x3    x4    x5    x6
    136  |  0     0     0    3.6   1.6   1.6
x3 =  4  |  0     0     1    0.4   0.4  -0.6
x1 =  4  |  1     0     0   -0.6   0.4   0.4
x2 =  4  |  0     1     0    0.4  -0.6   0.4

We have now reached point E = (4, 4, 4) in Figure 3.4. Its optimality is confirmed by observing that all reduced costs are nonnegative.

In this example, the simplex method took three changes of basis to reach the optimal solution, and it traced the path A-D-B-E in Figure 3.4. With different pivoting rules, a different path would have been traced. Could the simplex method have solved the problem by tracing the path A-D-E, which involves only two edges, with only two iterations? The answer is no. The initial and final bases differ in three columns, and therefore at least three basis changes are required. In particular, if the method were to trace the path A-D-E, there would be a degenerate change of basis at point D (with no edge being traversed), which would again bring the total to three.
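Since the standard form of Example 3.5 has only n = 6 variables and m = 3 constraints, the claim that E is the unique optimal extreme point can be checked by brute force: enumerate every basis, compute the associated basic solution, and keep the feasible ones. A small illustrative sketch (ours, not the book's):

```python
from itertools import combinations
import numpy as np

# Standard form data of Example 3.5 (slack variables x4, x5, x6).
A = np.array([[1., 2, 2, 1, 0, 0],
              [2., 1, 2, 0, 1, 0],
              [2., 2, 1, 0, 0, 1]])
b = np.array([20., 20, 20])
c = np.array([-10., -12, -12, 0, 0, 0])

best_cost, best_x = np.inf, None
for basis in combinations(range(6), 3):
    B = A[:, basis]
    if abs(np.linalg.det(B)) < 1e-9:
        continue                          # the chosen columns do not form a basis
    x = np.zeros(6)
    x[list(basis)] = np.linalg.solve(B, b)
    if np.all(x >= -1e-9):                # keep basic *feasible* solutions only
        cost = c @ x
        if cost < best_cost:
            best_cost, best_x = cost, x

print(best_cost, best_x[:3])   # the minimum over all BFS: -136 at (4, 4, 4)
```

Exhaustive enumeration is of course hopeless for large problems (there can be up to C(n, m) bases), which is exactly why the simplex method visits only a path of adjacent basic feasible solutions.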
Example 3.6 This example shows that the simplex method can indeed cycle. We consider a problem described in terms of the following initial tableau.

           x1     x2     x3     x4    x5    x6    x7
     0  | -3/4    20    -1/2     6     0     0     0
x5 = 0  |  1/4*   -8     -1      9     1     0     0
x6 = 0  |  1/2   -12    -1/2     3     0     1     0
x7 = 1  |   0      0      1      0     0     0     1

We use the following pivoting rules:
(a) We select a nonbasic variable with the most negative reduced cost c̄_j to be the one that enters the basis.
(b) Out of all basic variables that are eligible to exit the basis, we select the one with the smallest subscript.

We then obtain the following sequence of tableaux (the pivot element is indicated by an asterisk):

           x1     x2     x3     x4    x5    x6    x7
     0  |   0     -4    -7/2    33     3     0     0
x1 = 0  |   1    -32     -4     36     4     0     0
x6 = 0  |   0      4*    3/2   -15    -2     1     0
x7 = 1  |   0      0      1      0     0     0     1

           x1     x2     x3     x4    x5    x6    x7
     0  |   0      0     -2     18     1     1     0
x1 = 0  |   1      0      8*   -84   -12     8     0
x2 = 0  |   0      1     3/8  -15/4  -1/2   1/4    0
x7 = 1  |   0      0      1      0     0     0     1

           x1     x2     x3     x4    x5    x6    x7
     0  |  1/4     0      0     -3    -2     3     0
x3 = 0  |  1/8     0      1   -21/2  -3/2    1     0
x2 = 0  | -3/64    1      0    3/16*  1/16  -1/8   0
x7 = 1  | -1/8     0      0    21/2   3/2   -1     1

           x1     x2     x3     x4    x5    x6    x7
     0  | -1/2    16      0      0    -1     1     0
x3 = 0  | -5/2    56      1      0     2*   -6     0
x4 = 0  | -1/4   16/3     0      1    1/3  -2/3    0
x7 = 1  |  5/2   -56      0      0    -2     6     1

           x1     x2     x3     x4    x5    x6    x7
     0  | -7/4    44     1/2     0     0    -2     0
x5 = 0  | -5/4    28     1/2     0     1    -3     0
x4 = 0  |  1/6    -4    -1/6     1     0    1/3*   0
x7 = 1  |   0      0      1      0     0     0     1

After six pivots, we have the same basis and the same tableau that we started with. At each basis change, we had θ* = 0. In particular, for each intermediate tableau, we had the same feasible solution and the same cost. The same sequence of pivots can be repeated over and over, and the simplex method never terminates.

Comparison of the full tableau and the revised simplex methods

Let us pretend that the problem is changed to

    minimize    c'x + 0'y
    subject to  Ax + Iy = b
                x, y >= 0.

We implement the simplex method on this new problem, except that we never allow any of the components of the vector y to become basic. Then, the simplex method performs basis changes as if the vector y were entirely absent.
Note also that the vector of reduced costs in the augmented problem is

    [c' - c_B'B^{-1}A | 0' - c_B'B^{-1}I] = [c̄' | -c_B'B^{-1}].

Thus, the simplex tableau for the augmented problem takes the form

    -c_B'B^{-1}b | c' - c_B'B^{-1}A | -c_B'B^{-1}
    -------------+-----------------+-------------
       B^{-1}b   |     B^{-1}A     |    B^{-1}

In particular, by following the mechanics of the full tableau method on the above tableau, the inverse basis matrix B^{-1} is made available at each iteration. We can now think of the revised simplex method as being essentially the same as the full tableau method applied to the above augmented problem, except that the part of the tableau containing B^{-1}A is never formed explicitly; instead, once the entering variable x_j is chosen, the pivot column B^{-1}A_j is computed on the fly. Thus, the revised simplex method is just a variant of the full tableau method, with more efficient bookkeeping. If the revised simplex method also updates the zeroth row entries that lie on top of B^{-1} (by the usual elementary operations), the simplex multipliers p' = c_B'B^{-1} become available, thus eliminating the need for solving the linear system p'B = c_B' at each iteration.

We now discuss the relative merits of the two methods. The full tableau method requires a constant (and small) number of arithmetic operations for updating each entry of the tableau. Thus, the amount of computation per iteration is proportional to the size of the tableau, which is O(mn). The revised simplex method uses similar computations to update B^{-1} and c_B'B^{-1}, and since only O(m^2) entries are updated, the computational requirements per iteration are O(m^2). In addition, the reduced cost of each variable x_j can be computed by forming the inner product p'A_j, which requires O(m) operations. In the worst case, the reduced cost of every variable is computed, for a total of O(mn) computations per iteration. Since m <= n, the worst-case computational effort per iteration is O(mn + m^2) = O(mn), under either implementation. On the other hand, if we consider a pivoting rule that evaluates one reduced cost at a time, until a negative reduced cost is found, a typical iteration of the revised simplex method might require a lot less work. In the best case, if the first reduced cost computed is negative, and the corresponding variable is chosen to enter the basis, the total computational effort is only O(m^2). The conclusion is that the revised simplex method cannot be slower than the full tableau method, and could be much faster during most iterations.

Another important element in favor of the revised simplex method is that memory requirements are reduced from O(mn) to O(m^2). As n is often much larger than m, this effect can be quite significant. It could be counterargued that the memory requirements of the revised simplex method are also O(mn) because of the need to store the matrix A. However, in most large scale problems that arise in applications, the matrix A is very sparse (has many zero entries) and can be stored compactly. (Note that the sparsity of A does not usually help in the storage of the full simplex tableau, because even if A and B are sparse, B^{-1}A is not, in general, sparse.) We summarize this discussion in the following table:

    Table 3.1: Comparison of the full tableau method and the revised
    simplex method. The time requirements refer to a single iteration.

                        Full tableau    Revised simplex
    Memory                O(mn)            O(m^2)
    Worst-case time       O(mn)            O(mn)
    Best-case time        O(mn)            O(m^2)

Practical performance enhancements

Practical implementations of the simplex method aimed at solving problems of moderate or large size incorporate a number of additional ideas from numerical linear algebra, which we briefly mention.

The first idea is related to reinversion. Recall that at each iteration of the revised simplex method, the inverse basis matrix B^{-1} is updated according to certain rules. Each such iteration may introduce roundoff or truncation errors which accumulate and may eventually lead to highly inaccurate results. For this reason, it is customary to recompute the matrix B^{-1} from scratch once in a while. The efficiency of such reinversions can be greatly enhanced by using suitable data structures and certain techniques from computational linear algebra.

Another set of ideas is related to the way that the inverse basis matrix B^{-1} is represented. Suppose that a reinversion has just been carried out and B^{-1} is available. Subsequent to the current iteration of the revised simplex method, we have the option of generating explicitly and storing the new inverse basis matrix B̄^{-1}. An alternative that carries the same information is to store a matrix Q satisfying QB^{-1} = B̄^{-1}. Note that Q basically prescribes which elementary row operations need to be applied to B^{-1} in order to produce B̄^{-1}. It is not a full matrix, and can be completely specified in terms of m coefficients: for each row, we need to know what multiple of the pivot row must be added to it. Suppose now that we wish to solve the system Bu = A_j for u, where A_j is the entering column, as is required by the revised simplex method. We have u = B̄^{-1}A_j = QB^{-1}A_j, which shows that we can first compute B^{-1}A_j and then left-multiply by Q (equivalently, apply a sequence of elementary row operations) to produce u.
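The pricing scheme just described can be made concrete with a short sketch (illustrative Python, ours, not the book's): keep only B^{-1}, form the multipliers p' = c_B'B^{-1}, scan reduced costs one at a time, and build the pivot column B^{-1}A_j only for the variable that actually enters.

```python
import numpy as np

def price_and_pivot_column(A, c, basis, B_inv):
    """One pricing pass of the revised simplex method.

    Computes the simplex multipliers p' = c_B' B^{-1} (O(m^2) work), then
    scans reduced costs one at a time, at O(m) per column, and returns the
    first entering index j with negative reduced cost, together with the
    pivot column u = B^{-1} A_j computed on the fly.  Returns (None, None)
    if all reduced costs are nonnegative (the current basis is optimal).
    """
    p = c[basis] @ B_inv                      # simplex multipliers
    for j in range(A.shape[1]):
        if j in basis:
            continue                          # basic variables have zero reduced cost
        reduced_cost = c[j] - p @ A[:, j]     # inner product p'A_j, O(m) each
        if reduced_cost < -1e-9:
            return j, B_inv @ A[:, j]         # pivot column only for the entering j
    return None, None
```

In the best case described above, the loop stops at the first column it prices, so the O(mn) tableau is never touched; only B_inv (O(m^2) storage) and the sparse columns of A are needed.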
The same idea can also be used to represent the inverse basis matrix after several simplex iterations, as a product of the initial inverse basis matrix and several sparse matrices like Q.

The last idea we mention is the following. Subsequent to a "reinversion," one does not usually compute B^{-1} explicitly; instead, B^{-1} is represented in terms of sparse triangular matrices with a special structure.

The methods discussed in this subsection are designed to accomplish two objectives: improve numerical stability (minimize the effect of roundoff errors) and exploit sparsity in the problem data to improve both running time and memory requirements. These methods have a critical effect in practice. Besides having a better chance of producing numerically trustworthy results, they can also speed up considerably the running time of the simplex method. These techniques lie much closer to the subject of numerical linear algebra, as opposed to optimization, and for this reason we do not pursue them in any greater depth.

3.4 Anticycling: lexicography and Bland's rule

In this section, we discuss anticycling rules under which the simplex method is guaranteed to terminate, thus extending Theorem 3.3 to degenerate problems. As an important corollary, we conclude that if the optimal cost is finite, then there exists an optimal basis, that is, a basis satisfying B^{-1}b >= 0 and c̄' = c' - c_B'B^{-1}A >= 0'.

Lexicography

We present here the lexicographic pivoting rule and prove that it prevents the simplex method from cycling. Historically, this pivoting rule was derived by analyzing the behavior of the simplex method on a nondegenerate problem obtained by means of a small perturbation of the right-hand side vector b. This connection is pursued in Exercise 3.15.

We start with a definition.

Definition 3.5 A vector u in R^n is said to be lexicographically larger (or smaller) than another vector v in R^n if u ≠ v and the first nonzero component of u - v is positive (or negative, respectively). Symbolically, we write u >_L v or u <_L v.

For example, (0, 2, 3, 0) >_L (0, 2, 1, 4), and (0, 2, 3, 0) <_L (1, 2, 1, 2).
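Definition 3.5 translates directly into code; here is a minimal sketch (ours, not the book's), using the fact that the comparison only depends on the first index at which the two vectors differ.

```python
def lex_larger(u, v):
    """Return True if u is lexicographically larger than v (Definition 3.5):
    u != v and the first nonzero component of u - v is positive."""
    for ui, vi in zip(u, v):
        if ui != vi:
            return ui > vi
    return False   # u == v: neither lexicographically larger nor smaller
```

This is exactly how Python already compares tuples and lists, a fact exploited in the sketches below.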
Lexicographic pivoting rule

1. Choose an entering column A_j arbitrarily, as long as its reduced cost c̄_j is negative. Let u = B^{-1}A_j be the jth column of the tableau.
2. For each i with u_i > 0, divide the ith row of the tableau (including the entry in the zeroth column) by u_i, and choose the lexicographically smallest row. If row ℓ is lexicographically smallest, then the ℓth basic variable x_B(ℓ) exits the basis.

We note that the lexicographic pivoting rule always leads to a unique choice for the exiting variable. Indeed, if this were not the case, two of the rows in the tableau would have to be proportional. But if two rows of the matrix B^{-1}A are proportional, the matrix B^{-1}A has rank smaller than m and, therefore, A also has rank less than m, which contradicts our standing assumption that A has linearly independent rows.

Example 3.7 Consider the following tableau (the zeroth row is omitted), and suppose that the pivot column is the third one (j = 3).

    1 |  0   5   3
    2 |  4   6   1
    3 |  0   7   9

Note that there is a tie in trying to determine the exiting variable, because x_B(1)/u_1 = 1/3 and x_B(3)/u_3 = 3/9 = 1/3. We divide the first and third rows of the tableau by u_1 = 3 and u_3 = 9, respectively, to obtain:

    1/3 |  0   5/3   1
    1/3 |  0   7/9   1

The tie between the first and third rows is resolved by performing a lexicographic comparison. Since 7/9 < 5/3, the third row is chosen to be the pivot row, and the variable x_B(3) exits the basis.
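The exit test of the lexicographic pivoting rule can be sketched as follows (illustrative Python, ours, not the book's); Python's built-in list comparison is already lexicographic, which keeps the code short.

```python
def lexicographic_exit_row(rows, u):
    """Choose the pivot row under the lexicographic pivoting rule.

    rows: list of tableau rows, each starting with its zeroth-column entry.
    u:    the pivot column B^{-1}A_j, one entry per row.
    Among the indices i with u[i] > 0, return the one whose row divided by
    u[i] is lexicographically smallest.
    """
    best_i, best_row = None, None
    for i, ui in enumerate(u):
        if ui <= 0:
            continue                        # only rows with u_i > 0 are eligible
        scaled = [x / ui for x in rows[i]]  # divide the row, zeroth column included
        if best_row is None or scaled < best_row:   # list '<' is lexicographic
            best_i, best_row = i, scaled
    return best_i
```

On the data of Example 3.7 this picks the third row, since 7/9 < 5/3 breaks the tie in the second position after the common ratio 1/3.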
Theorem 3.4 Suppose that the simplex algorithm starts with all the rows in the simplex tableau, other than the zeroth row, lexicographically positive. Suppose that the lexicographic pivoting rule is followed. Then:
(a) Every row of the simplex tableau, other than the zeroth row, remains lexicographically positive throughout the algorithm.
(b) The zeroth row strictly increases lexicographically at each iteration.
(c) The simplex method terminates after a finite number of iterations.

Proof. (a) Suppose that all rows of the simplex tableau, other than the zeroth row, are lexicographically positive at the beginning of a simplex iteration. Suppose that x_j enters the basis and that the pivot row is the ℓth row. According to the lexicographic pivoting rule, we have u_ℓ > 0 and

    (ℓth row)/u_ℓ  <_L  (ith row)/u_i,    if i ≠ ℓ and u_i > 0.        (3.5)

To determine the new tableau, the ℓth row is divided by the positive pivot element u_ℓ and, therefore, remains lexicographically positive. Consider the ith row and suppose that u_i < 0. In order to zero the (i, j)th entry of the tableau, we need to add a positive multiple of the pivot row to the ith row. Due to the lexicographic positivity of both rows, the new ith row is also lexicographically positive. Finally, consider the ith row for the case where u_i > 0 and i ≠ ℓ. We have

    (new ith row) = (old ith row) - (u_i/u_ℓ)(old ℓth row).

Because of the lexicographic inequality (3.5), which is satisfied by the old rows, the new ith row is lexicographically positive.

(b) At the beginning of an iteration, the reduced cost in the pivot column is negative. In order to make it zero, we need to add a positive multiple of the pivot row. Since the latter row is lexicographically positive, the zeroth row increases lexicographically.

(c) Since the zeroth row increases lexicographically at each iteration, it never returns to a previous value. Since the zeroth row is determined completely by the current basis, no basis can be repeated twice, and the simplex method must terminate after a finite number of iterations.

The lexicographic pivoting rule is straightforward to use if the simplex method is implemented in terms of the full tableau. It can also be used in conjunction with the revised simplex method, provided that the inverse basis matrix B^{-1} is formed explicitly (see Exercise 3.16). On the other hand, in sophisticated implementations of the revised simplex method, the matrix B^{-1} is never computed explicitly, and the lexicographic rule is not really suitable.

We finally note that in order to apply the lexicographic pivoting rule, an initial tableau with lexicographically positive rows is required. Let us assume that an initial tableau is available (methods for obtaining an initial tableau are discussed in the next section). We can then rename the variables so that the basic variables are the first m ones. This is equivalent to rearranging the tableau so that the first m columns of B^{-1}A are the m unit vectors. The resulting tableau has lexicographically positive rows, as desired.

Bland's rule

The smallest subscript pivoting rule, also known as Bland's rule, is as follows.

Smallest subscript pivoting rule

1. Find the smallest j for which the reduced cost c̄_j is negative, and have the column A_j enter the basis.
2. Out of all variables x_i that are tied in the test for choosing an exiting variable, select the one with the smallest value of i.

Under this pivoting rule, it is known that cycling never occurs and the simplex method is guaranteed to terminate after a finite number of iterations. This pivoting rule is compatible with an implementation of the revised simplex method in which the reduced costs of the nonbasic variables are computed one at a time, in the natural order, until a negative one is discovered.
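Both steps of Bland's rule are straightforward to code; a small illustrative sketch (ours, not the book's), taking the reduced costs and the ratio-test data as inputs:

```python
def bland_entering(reduced_costs):
    """Smallest index j with negative reduced cost (step 1 of Bland's rule)."""
    for j, cj in enumerate(reduced_costs):
        if cj < 0:
            return j
    return None    # all reduced costs nonnegative: current solution optimal

def bland_exiting(basis, x_B, u):
    """Among rows attaining the minimum ratio x_B(i)/u_i over u_i > 0, pick
    the one whose *basic variable index* B(i) is smallest (step 2)."""
    ratios = [(x_B[i] / u[i], i) for i in range(len(u)) if u[i] > 0]
    if not ratios:
        return None                     # unbounded: optimal cost is -infinity
    theta = min(r for r, _ in ratios)
    tied = [i for r, i in ratios if r == theta]
    return min(tied, key=lambda i: basis[i])
```

Note that the tie-break in `bland_exiting` is on the subscript of the basic variable, not on the row number; this distinction is what makes the rule anticycling.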
3.5 Finding an initial basic feasible solution

In order to start the simplex method, we need to find an initial basic feasible solution. Sometimes this is straightforward. For example, suppose that we are dealing with a problem involving constraints of the form Ax <= b, where b >= 0. We can then introduce nonnegative slack variables s and rewrite the constraints in the form Ax + s = b. The vector (x, s) defined by x = 0 and s = b is a basic feasible solution, and the corresponding basis matrix is the identity. In general, however, finding an initial basic feasible solution is not easy and requires the solution of an auxiliary linear programming problem, as will be seen shortly.

Consider the problem

    minimize    c'x
    subject to  Ax = b
                x >= 0.

By possibly multiplying some of the equality constraints by -1, we can assume, without loss of generality, that b >= 0. We now introduce a vector y in R^m of artificial variables and use the simplex method to solve the auxiliary problem

    minimize    y_1 + y_2 + ... + y_m
    subject to  Ax + y = b
                x >= 0, y >= 0.

Initialization is easy for the auxiliary problem: by letting x = 0 and y = b, we have a basic feasible solution, and the corresponding basis matrix is the identity.

If x is a feasible solution to the original problem, this choice of x, together with y = 0, yields a zero cost solution to the auxiliary problem. Therefore, if the optimal cost in the auxiliary problem is nonzero, we conclude that the original problem is infeasible. If, on the other hand, we obtain a zero cost solution to the auxiliary problem, it must satisfy y = 0, and x is a feasible solution to the original problem.

At this point, we have accomplished our objectives only partially. We have a method that either detects infeasibility or finds a feasible solution to the original problem. However, in order to initialize the simplex method for the original problem, we need a basic feasible solution, an associated basis matrix B, and, depending on the implementation, the corresponding tableau. All this is straightforward if the simplex method, applied to the auxiliary problem, terminates with a basis matrix B consisting exclusively of columns of A. We can simply drop the columns that correspond to the artificial variables and continue with the simplex method on the original problem, using B as the starting basis matrix.
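Setting up the auxiliary problem is mechanical; the following sketch (illustrative Python, ours, not the book's) flips rows with negative b_i and appends the identity block of artificial columns:

```python
import numpy as np

def auxiliary_problem(A, b):
    """Build the Phase I auxiliary problem  min 1'y  s.t.  Ax + Iy = b', x, y >= 0.

    Rows with b_i < 0 are first multiplied by -1 so that b' >= 0; then an
    identity block for the artificial variables y is appended.  The starting
    basic feasible solution is x = 0, y = b', with the identity basis matrix.
    """
    A = A.copy().astype(float)
    b = b.copy().astype(float)
    flip = b < 0
    A[flip] *= -1
    b[flip] *= -1
    m, n = A.shape
    A_aux = np.hstack([A, np.eye(m)])          # artificial columns
    c_aux = np.r_[np.zeros(n), np.ones(m)]     # cost = y_1 + ... + y_m
    basis = list(range(n, n + m))              # the y's form the starting basis
    return A_aux, b, c_aux, basis
```

Note that the returned starting point is feasible by construction, so the simplex method can be launched on the auxiliary problem immediately.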
Driving artificial variables out of the basis

The situation is more complex if the original problem is feasible, the simplex method applied to the auxiliary problem terminates with a feasible solution x* to the original problem, but some of the artificial variables are in the final basis. (Since the final value of the artificial variables is zero, this implies that we have a degenerate basic feasible solution to the auxiliary problem.) Let k be the number of columns of A that belong to the final basis (k < m) and, without loss of generality, assume that these are the columns A_B(1), ..., A_B(k). (In particular, x_B(1), ..., x_B(k) are the only variables of the original problem that can be at nonzero level.) Note that the columns A_B(1), ..., A_B(k) must be linearly independent, since they are part of a basis. Under our standard assumption that the matrix A has full rank, the columns of A span R^m, and we can choose m - k additional columns A_B(k+1), ..., A_B(m) of A to obtain a set of m linearly independent columns, that is, a basis consisting exclusively of columns of A. With this basis, all nonbasic components of x* are at zero level, and it follows that x* is the basic feasible solution associated with this new basis as well. At this point, the artificial variables and the corresponding columns of the tableau can be dropped.

The procedure we have just described is called driving the artificial variables out of the basis, and depends crucially on the assumption that the matrix A has rank m. After all, if A has rank less than m, constructing a basis for R^m using the columns of A is impossible, and there exist redundant equality constraints that must be eliminated, as described by Theorem 2.5 in Section 2.3.

All of the above can be carried out mechanically, in terms of the simplex tableau, in the following manner. Suppose that the ℓth basic variable is an artificial variable, which is in the basis at zero level. We examine the ℓth row of the tableau and find some j such that the ℓth entry of B^{-1}A_j is nonzero. We claim that A_j is linearly independent from the columns A_B(1), ..., A_B(k). To see this, note that B^{-1}A_B(i) = e_i, for i = 1, ..., k, and since k < ℓ, the ℓth entry of these vectors is zero. It follows that the ℓth entry of any linear combination of the vectors B^{-1}A_B(1), ..., B^{-1}A_B(k) is also equal to zero. Since the ℓth entry of B^{-1}A_j is nonzero, this vector is not a linear combination of the vectors B^{-1}A_B(1), ..., B^{-1}A_B(k). Equivalently, A_j is not a linear combination of the vectors A_B(1), ..., A_B(k), which proves our claim. We now bring A_j into the basis and have the ℓth basic variable exit the basis. This is accomplished in the usual manner: perform those elementary row operations that replace B^{-1}A_j by the ℓth unit vector. The only difference from the usual mechanics of the simplex method is that the pivot element (the ℓth entry of B^{-1}A_j) could be negative. Because the ℓth basic variable was zero, adding a multiple of the ℓth row to the other rows does not change the values of the basic variables. This means that after the change of basis, we are still at the same basic feasible solution to the auxiliary problem, but we have reduced the number of basic artificial variables by one. We repeat this procedure as many times as needed until all artificial variables are driven out of the basis.

Let us now assume that the ℓth row of B^{-1}A is zero, in which case the above described procedure fails. Note that the ℓth row of B^{-1}A is equal to g'A, where g' is the ℓth row of B^{-1}. Hence, g'A = 0' for some nonzero vector g, and the matrix A has linearly dependent rows. Since we are dealing with a feasible problem, we must also have g'b = 0. Thus, the constraint g'Ax = g'b is redundant and can be eliminated (cf. Theorem 2.5 in Section 2.3). Since this constraint is the information provided by the ℓth row of the tableau, we can eliminate that row and continue from there.
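The tableau mechanics of driving one artificial variable out can be sketched as follows (illustrative Python, ours, not the book's); note that, exactly as in the text, the pivot entry is allowed to be negative, and a row that is zero over the original columns signals a redundant constraint:

```python
import numpy as np

def drive_artificial_out(T, l, n_original):
    """Try to drive the artificial variable in tableau row l out of the basis.

    T:          (m+1) x (n+1) tableau; row 0 is the zeroth row, column 0 is
                the zeroth column; columns 1..n_original hold B^{-1}A_j for
                the original variables.
    Returns the entering column index j, or None if row l of B^{-1}A is zero,
    in which case row l corresponds to a redundant constraint and should be
    dropped.  The pivot element may be negative; since the exiting artificial
    variable is at zero level, the basic solution does not change.
    """
    for j in range(1, n_original + 1):
        if abs(T[l, j]) > 1e-9:
            T[l, :] /= T[l, j]                 # pivot; its sign may be negative
            for i in range(T.shape[0]):
                if i != l:
                    T[i, :] -= T[i, j] * T[l, :]
            return j
    return None                                # redundant constraint
```

Calling this repeatedly for each row whose basic variable is artificial implements the loop described above.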
which is . . 1 ) . . We evaluate the reduced cost of each one of the original variables Xi . X7. Xs � o.8 Consider the linear programming problem: minimize subject to Xl + X 2 Xl + 2X 2 Xl + 2X 2 4X 2 X3 3X3 6X3 9X3 3X3 + X4 Xl .18 3 X4 X5 X 6 X7 X s 0 0 0 0 1 0 1 1 0 1 0 0 1 0 0 0 1 1 0 0 0 1 0 0 4 0 6 9 3* 0 0 0 0 0 0 . X4 � o. . . The basis matrix B is still the identity and only the zeroth row of the tableau changes.1 14 Chap.C � A i . . . X 6 . The corresponding basis matrix is the identity. 2 . 1) . . + + + + 3 2 5 1 In order to find a feasible solution. Furthermore. 3 2 5 1 A basic feasible solution to the auxiliary problem is obtained by letting ( X5 .10 3 2 5 1 X2 8 2 2 X3 . 5. we have C B = ( 1 . We obtain: Xl . 3 The simplex method Example 3. 1 . . xs ) = b = (3. we form the auxiliary problem minimize subject to X5 + X 6 + X7 + Xs X l + 2X 2 + 3X3 + X5 X l + 2X 2 + 6X3 + X6 4X 2 + 9X3 + X7 3 X3 + X4 + Xs X l . and form the initial tableau: Xl 11 3 2 X2 8 2 2 X3 21 3 X4 1 X5 X 6 X7 Xs 0 1 0 1 1 0 0 1 0 0 0 1 0 0 0 0 1 0 0 0 1* X7 = Xs = 5 1 0 0 4 0 6 9 3 0 0 0 0 0 0 We bring X4 into the basis and have Xs exit the basis. 1 .
3. . . Note that this is a degenerate pivot with (J* = O. We obtain the following Xl X 2 X3 X4 Xs 2 1/2 1/4 1 0 1 1/2 0 1 0 0 1 0 1/3 0 0 0 0 0 0 0 0 0 1 0 1/2 3/4 X 6 X7 2 . 4. at zero level.5 Finding an initial basic feasible solution 115 We now bring X3 into the basis and have X4 exit the basis. is zero. because it corresponds to a redundant constraint. and also remove all of the artificial variables. we need to drive X7 out of the basis.1/2 1/4 1 Xs 1 1/2 3/4 0 1/3 0 0 0 1 0 1/3 0 0 0 Note that the cost in the auxiliary problem has dropped to zero. However. indicating that we have a feasible solution to the original problem. The new tableau is: Xl 4 0 8 1 1 2 2* 4 X2 X3 Xs = 2 0 2 1/3 0 0 0 0 0 0 1 X4 Xs X 6 X7 Xs 6 0 1 1 2 3 1/3 0 0 1 0 0 0 0 0 0 0 1 7 1 2 3 1/3 0 0 0 We now bring X 2 into the basis and X 6 exits. .1 Aj . This leaves us with the following initial tableau for the . In order to obtain a basic feasible solution to the original problem. j = 1. At this point. Note that X7 is the third basic variable and that the third entry of the columns B . we remove the third row of the tableau. The new tableau is: Xl 4 4 2* 1/2 2 X 2 X3 Xs = X2 = X7 = We now have tableau: 2 0 0 1 0 2 1/3 0 0 0 0 0 0 0 1 X4 X s 2 1 1 1 1/3 0 1 X 6 X7 4 1 1/2 2 Xs 1 1 1 1 1/3 0 0 0 0 0 0 1 0 0 X l enter the basis and X s exit the basis. the artificial variable X7 is still in the basis. . associated with the original variables. This indicates that the matrix A has linearly dependent rows.Sec.
if the eth entry of the jth column is nonzero. . Otherwise.116 original problem: Chap. . the artificial variables and the cor responding columns are eliminated. .::: O . and start executing the simplex method on the original problem. j = 1 . If all of these entries are zero. fill in the ze roth row of the tableau. if necessary. apply a change of basis ( with this entry serving as the pivot . a feasible solution to the original problem has been found. . If the eth basic variable is an artificial one. If the optimal cost in the auxiliary problem is zero. Phase I: 1. . We observe that in this example. the artificial variable Xg was unnecessary. we can always let that variable be in the initial basis and we do not have to associate an artificial variable with that constraint . n. 4. Instead of starting with Xg = 1 . 3 The simplex method X l X 2 X3 * * * * X4 * Xl = 1 1/2 1 0 0 0 1 0 0 0 1 1/2 3/4 1/3 X3 = 1/3 We may now compute the reduced costs of the original variables. 2. . and apply the simplex method to the auxiliary problem with cost L:: Yi . More generally. The twophase simplex method We can now summarize a complete algorithm for linear programming prob lems in standard form.1 . Introduce artificial variables Y1 . we could have started with X4 = 1 thus elimi nating the need for the first pivot . examine the eth entry of the columns B . the eth row represents a redundant constraint and is elimi nated. . Ym . the orig inal problem is infeasible and the algorithm terminates. 5. 1 3. If no artificial variable is in the final basis. change the prob lem so that b . . If the optimal cost in the auxiliary problem is positive. whenever there is a variable that appears in a single constraint and with a positive coefficient ( slack variables being the typical example ) .1 Aj . and a feasible basis for the original problem is available. By multiplying some of the constraints by .
element): the ℓth basic variable exits and xj enters the basis. Repeat this operation until all artificial variables are driven out of the basis.

Phase II:
1. Let the final basis and tableau obtained from Phase I be the initial basis and tableau for Phase II.
2. Compute the reduced costs of all variables for this initial basis, using the cost coefficients of the original problem.
3. Apply the simplex method to the original problem.

The above two-phase algorithm is a complete method, in the sense that it can handle all possible outcomes. As long as cycling is avoided (due to either nondegeneracy, an anticycling rule, or luck), one of the following possibilities will materialize:
(a) If the problem is infeasible, this is detected at the end of Phase I.
(b) If the problem is feasible but the rows of A are linearly dependent, this is detected and corrected at the end of Phase I, by eliminating redundant equality constraints.
(c) If the optimal cost is equal to -∞, this is detected while running Phase II.
(d) Else, Phase II terminates with an optimal solution.

The big-M method

We close by mentioning an alternative approach, the big-M method, that combines the two phases into a single one. The idea is to introduce a cost function of the form

    Σ_{j=1}^{n} cj xj + M Σ_{i=1}^{m} yi,

where M is a large positive constant, and where yi are the same artificial variables as in Phase I. For a sufficiently large choice of M, if the original problem is feasible and its optimal cost is finite, all of the artificial variables are eventually driven to zero (Exercise 3.26), which takes us back to the minimization of the original cost function. In fact, there is no reason for fixing a numerical value for M. We can leave M as an undetermined parameter and let the reduced costs be functions of M. Whenever M is compared to another number (in order to determine whether a reduced cost is negative), M will be always treated as being larger.
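Because M is left as an undetermined parameter, a reduced cost of the form aM + b can be represented in code as the pair (a, b) and compared exactly as described above: the coefficient of M decides the sign, and the constant term matters only in a tie. A minimal sketch of this comparison (the function name and sample values are our own, not from the text):

```python
from fractions import Fraction

def big_m_is_negative(coef_M, const):
    """Decide whether coef_M * M + const < 0 when M is treated as larger
    than any fixed number: the M-coefficient decides the sign, and the
    constant term breaks ties when the M-coefficient is zero."""
    if coef_M != 0:
        return coef_M < 0
    return const < 0

# Reduced cost -8M + 1 is negative for large M, while 2M - 3/4 is not.
print(big_m_is_negative(-8, 1))                  # True
print(big_m_is_negative(2, Fraction(-3, 4)))     # False
print(big_m_is_negative(0, -1))                  # True
```

Full big-M tableau arithmetic (adding a multiple of the pivot row to the zeroth row, for instance) works the same way, componentwise on such pairs.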
Example 3.9 We consider the same linear programming problem as in Example 3.8:

    minimize     x1 +  x2 +  x3
    subject to   x1 + 2x2 + 3x3           =  3
                -x1 + 2x2 + 6x3           =  2
                      4x2 + 9x3           =  5
                            3x3 + x4      =  1
                 x1, ..., x4 ≥ 0.

We use the big-M method in conjunction with the following auxiliary problem, in which the unnecessary artificial variable x8 is omitted:

    minimize     x1 + x2 + x3 + Mx5 + Mx6 + Mx7
    subject to   x1 + 2x2 + 3x3 + x5           =  3
                -x1 + 2x2 + 6x3      + x6      =  2
                      4x2 + 9x3           + x7 =  5
                            3x3 + x4           =  1
                 x1, ..., x7 ≥ 0.

A basic feasible solution to the auxiliary problem is obtained by letting (x5, x6, x7, x4) = b = (3, 2, 5, 1). The corresponding basis matrix is the identity. Furthermore, we have cB = (M, M, M, 0). We evaluate the reduced cost of each one of the original variables xi, which is ci - c′B B⁻¹Ai, and form the initial tableau:

              x1      x2        x3      x4      x5    x6    x7
  -10M    |    1    -8M+1    -18M+1     0       0     0     0
x5 =  3   |    1      2         3       0       1     0     0
x6 =  2   |   -1      2         6       0       0     1     0
x7 =  5   |    0      4         9       0       0     0     1
x4 =  1   |    0      0         3*      1       0     0     0

The reduced cost of x3 is negative when M is large enough. We therefore bring x3 into the basis and have x4 exit. Note that in order to set the reduced cost of x3 to zero, we need to multiply the pivot row by 6M - 1/3 and add it to the zeroth row. The new tableau is:

                x1      x2      x3      x4        x5    x6    x7
  -4M-1/3   |    1    -8M+1      0    6M-1/3      0     0     0
x5 =  2     |    1      2        0      -1        1     0     0
x6 =  0     |   -1      2*       0      -2        0     1     0
x7 =  2     |    0      4        0      -3        0     0     1
x3 = 1/3    |    0      0        1      1/3       0     0     0

The reduced cost of x2 is negative when M is large enough. We therefore bring x2 into the basis and x6 exits. Note that this is a degenerate pivot with θ* = 0. The new tableau is:

                 x1       x2    x3      x4        x5      x6     x7
  -4M-1/3   |  -4M+3/2     0     0    -2M+2/3      0    4M-1/2    0
x5 =  2     |     2*       0     0       1         1      -1      0
x2 =  0     |   -1/2       1     0      -1         0      1/2     0
x7 =  2     |     2        0     0       1         0      -2      1
x3 = 1/3    |     0        0     1      1/3        0       0      0

We now have x1 enter and x5 exit the basis. We obtain the following tableau:

                x1    x2    x3      x4        x5       x6      x7
  -11/6     |    0     0     0    -1/12     2M-3/4   2M+1/4     0
x1 =  1     |    1     0     0     1/2       1/2      -1/2      0
x2 = 1/2    |    0     1     0    -3/4       1/4       1/4      0
x7 =  0     |    0     0     0      0        -1        -1       1
x3 = 1/3    |    0     0     1     1/3*       0         0       0

We now bring x4 into the basis and x3 exits. The new tableau is:

                x1    x2     x3     x4      x5       x6      x7
   -7/4     |    0     0    1/4      0    2M-3/4   2M+1/4     0
x1 = 1/2    |    1     0   -3/2      0     1/2      -1/2      0
x2 = 5/4    |    0     1    9/4      0     1/4       1/4      0
x7 =  0     |    0     0     0       0     -1        -1       1
x4 =  1     |    0     0     3       1      0         0       0
With M large enough, all of the reduced costs are nonnegative and we have an optimal solution to the auxiliary problem. In addition, all of the artificial variables have been driven to zero, and we have an optimal solution to the original problem.
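The example can also be checked numerically. The sketch below (our own code, not from the text) runs a minimal full-tableau simplex with Bland's anticycling rule, replacing the symbolic M by the large number 10^6 and using exact rational arithmetic; since the problem has a unique optimum, it recovers x = (1/2, 5/4, 0, 1) with cost 7/4 and all artificial variables at zero, even though the pivot sequence differs from the one shown above.

```python
from fractions import Fraction as F

def simplex(tab, basis):
    """Minimal full-tableau simplex using Bland's rule (anti-cycling).
    tab[0] = [-(current cost)] + reduced costs; tab[i] = [x_B(i)] + ith
    row of B^-1 [A].  Pivots in place; returns the final basis."""
    m, n = len(tab) - 1, len(tab[0]) - 1
    while True:
        # entering variable: smallest index with a negative reduced cost
        enter = next((j for j in range(n) if tab[0][j + 1] < 0), None)
        if enter is None:
            return basis                       # all reduced costs >= 0
        col = enter + 1
        # exiting variable: min ratio, ties broken by smallest variable index
        cand = [i for i in range(1, m + 1) if tab[i][col] > 0]
        piv = min(cand, key=lambda i: (tab[i][0] / tab[i][col], basis[i - 1]))
        basis[piv - 1] = enter
        p = tab[piv][col]
        tab[piv] = [v / p for v in tab[piv]]
        for i in range(m + 1):
            if i != piv and tab[i][col] != 0:
                f = tab[i][col]
                tab[i] = [a - f * b for a, b in zip(tab[i], tab[piv])]

M = F(10 ** 6)                                  # numeric stand-in for the symbolic M
c = [F(1), F(1), F(1), F(0), M, M, M]
A = [[1, 2, 3, 0, 1, 0, 0],
     [-1, 2, 6, 0, 0, 1, 0],
     [0, 4, 9, 0, 0, 0, 1],
     [0, 0, 3, 1, 0, 0, 0]]
b = [3, 2, 5, 1]
basis = [4, 5, 6, 3]                            # x5, x6, x7, x4
rows = [[F(b[i])] + [F(v) for v in A[i]] for i in range(4)]
# initial basis matrix is the identity, so B^-1 A = A in the reduced costs
zeroth = [-sum(c[basis[i]] * rows[i][0] for i in range(4))]
zeroth += [c[j] - sum(c[basis[i]] * rows[i][j + 1] for i in range(4))
           for j in range(7)]
tab = [zeroth] + rows
basis = simplex(tab, basis)
x = [F(0)] * 7
for i, j in enumerate(basis):
    x[j] = tab[i + 1][0]
print([str(v) for v in x[:4]])                  # ['1/2', '5/4', '0', '1']
print(str(-tab[0][0]))                          # '7/4'
```

Bland's rule guarantees termination, at the price of visiting different intermediate tableaux than the most-negative-reduced-cost rule used in the example.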
3.6 Column geometry and the simplex method
In this section, we introduce an alternative way of visualizing the workings of the simplex method. This approach provides some insights into why the
simplex method appears to be efficient in practice.
We consider the problem

    minimize     c′x
    subject to   Ax  = b
                 e′x = 1
                 x ≥ 0,                                  (3.6)
where A is an m × n matrix and e is the n-dimensional vector with all components equal to one. Although this might appear to be a special type of a linear programming problem, it turns out that every problem with a bounded feasible set can be brought into this form (Exercise 3.28). The constraint e′x = 1 is called the convexity constraint. We also introduce an auxiliary variable z defined by z = c′x. If A1, A2, ..., An are the n columns of A, we are dealing with the problem of minimizing z subject to the nonnegativity constraints x ≥ 0, the convexity constraint Σ_{i=1}^{n} xi = 1, and the constraint

    x1 (A1, c1) + x2 (A2, c2) + ··· + xn (An, cn) = (b, z).
In order to capture this problem geometrically, we view the horizontal plane as an m-dimensional space containing the columns of A, and we view the vertical axis as the one-dimensional space associated with the cost components ci. Then, each point in the resulting three-dimensional space corresponds to a point (Ai, ci); see Figure 3.5. In this geometry, our objective is to construct a vector (b, z), which is a convex combination of the vectors (Ai, ci), such that z is as small as possible. Note that the vectors of the form (b, z) lie on a vertical line, which we call the requirement line, and which intersects the horizontal plane at b. If the requirement line does not intersect the convex hull of the points (Ai, ci), the problem is infeasible. If it does intersect it, the problem is feasible and an optimal solution corresponds to the lowest point in the intersection of the convex hull and the requirement line. For example, in Figure 3.6, the requirement line intersects the convex hull of the points (Ai, ci); the point G corresponds to an optimal solution, and its height is the optimal cost. We now need some terminology.
Definition 3.6
(a) A collection of vectors y¹, ..., y^{k+1} in Rn are said to be affinely independent if the vectors y¹ - y^{k+1}, y² - y^{k+1}, ..., y^k - y^{k+1} are linearly independent. (Note that we must have k ≤ n.)
(b) The convex hull of k + 1 affinely independent vectors in Rn is called a k-dimensional simplex.
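Checking affine independence of y¹, ..., y^{k+1} amounts to computing the rank of the k difference vectors, which a short Gaussian elimination can do. A sketch (our own helper, not from the text):

```python
def affinely_independent(points):
    """Check affine independence of k+1 points in R^n by row-reducing the
    difference vectors y^1 - y^{k+1}, ..., y^k - y^{k+1} and testing
    whether the rank equals k (full row rank)."""
    *rest, last = points
    rows = [[a - b for a, b in zip(y, last)] for y in rest]
    rank, ncols = 0, len(last)
    for col in range(ncols):
        # find a pivot at or below the current rank position
        piv = next((r for r in range(rank, len(rows))
                    if abs(rows[r][col]) > 1e-9), None)
        if piv is None:
            continue
        rows[rank], rows[piv] = rows[piv], rows[rank]
        for r in range(len(rows)):
            if r != rank:
                f = rows[r][col] / rows[rank][col]
                rows[r] = [a - f * b for a, b in zip(rows[r], rows[rank])]
        rank += 1
    return rank == len(rows)

# three collinear points in R^2 are not affinely independent...
print(affinely_independent([[0, 0], [1, 1], [2, 2]]))   # False
# ...while the vertices of a triangle are
print(affinely_independent([[0, 0], [1, 0], [0, 1]]))   # True
```

The collinear triple spans only a line, a one-dimensional object, while the triangle's three vertices determine a two-dimensional simplex.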
Figure 3.5: The column geometry.
Thus, three points are either collinear or they are affinely independent and determine a two-dimensional simplex (a triangle). Similarly, four points either lie on the same plane, or they are affinely independent and determine a three-dimensional simplex (a pyramid).

Let us now give an interpretation of basic feasible solutions to problem (3.6) in this geometry. Since we have added the convexity constraint, we have a total of m + 1 equality constraints. Thus, a basic feasible solution is associated with a collection of m + 1 linearly independent columns (Ai, 1) of the linear programming problem (3.6). These are in turn associated with m + 1 of the points (Ai, ci), which we call basic points; the remaining points (Ai, ci) are called the nonbasic points. It is not hard to show that the m + 1 basic points are affinely independent (Exercise 3.29) and, therefore, their convex hull is an m-dimensional simplex, which we call the basic simplex. Let the requirement line intersect the m-dimensional basic simplex at some point (b, z). The vector of weights xi used in expressing (b, z) as a convex combination of the basic points is the current basic feasible solution, and z represents its cost. For example, in Figure 3.6, the shaded triangle CDF is the basic simplex, and the point H corresponds to a basic feasible solution associated with the basic points C, D, and F.

Let us now interpret a change of basis geometrically. In a change of basis, a new point (Aj, cj) becomes basic, and one of the currently basic points is to become nonbasic. For example, in Figure 3.6, if C, D, F, are the current basic points, we could make point B basic, replacing F
Figure 3.6: Feasibility and optimality in the column geometry.
(even though this turns out not to be profitable). The new basic simplex would be the convex hull of B, C, D, and the new basic feasible solution would correspond to point I. Alternatively, we could make point E basic, replacing C, and the new basic feasible solution would now correspond to point G.

After a change of basis, the intercept of the requirement line with the new basic simplex is lower, and hence the cost decreases, if and only if the new basic point is below the plane that passes through the old basic points; we refer to the latter plane as the dual plane. For example, point E is below the dual plane and having it enter the basis is profitable; this is not the case for point B. In fact, the vertical distance from the dual plane to a point (Aj, cj) is equal to the reduced cost of the associated variable xj (Exercise 3.30); requiring the new basic point to be below the dual plane is therefore equivalent to requiring the entering column to have negative reduced cost.

We discuss next the selection of the basic point that will exit the basis. Each possible choice of the exiting point leads to a different basic simplex. These m basic simplices, together with the original basic simplex (before the change of basis), form the boundary (the faces) of an (m + 1)-dimensional simplex. The requirement line exits this (m + 1)-dimensional simplex through its top face and must therefore enter it by crossing some other face. This determines which one of the potential basic simplices will be obtained after the change of basis. In reference to Figure 3.6, the basic
points C, D, F, determine a two-dimensional basic simplex. If point E is to become basic, we obtain a three-dimensional simplex (pyramid) with vertices C, D, E, F. The requirement line exits the pyramid through its top face with vertices C, D, F. It enters the pyramid through the face with vertices D, E, F; this is the new basic simplex.

We can now visualize pivoting through the following physical analogy. Think of the original basic simplex with vertices C, D, F, as a solid object anchored at its vertices. Grasp the corner of the basic simplex at the vertex C leaving the basis, and pull the corner down to the new basic point E. While so moving, the simplex will hinge, or pivot, on its anchor and stretch down to the lower position. The somewhat peculiar terms (e.g., "simplex," "pivot") associated with the simplex method have their roots in this column geometry.
Example 3.10 Consider the problem illustrated in Figure 3.7, in which m = 1, and the following pivoting rule: choose a point (Ai, ci) below the dual plane to become basic, whose vertical distance from the dual plane is largest. According to Exercise 3.30, this is identical to the pivoting rule that selects an entering variable with the most negative reduced cost. Starting from the initial basic simplex consisting of the points (A3, c3), (A6, c6), the next basic simplex is determined by the points (A3, c3), (A5, c5), and the next one by the points (A5, c5), (A8, c8). In particular, the simplex method only takes two pivots in this case. This example indicates why the simplex method may require a rather small number of pivots, even when the number of underlying variables is large.
Figure 3.7: The simplex method finds the optimal basis after two iterations. Here, the point indicated by a number i corresponds to the vector (Ai, ci).
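For m = 1, as in Example 3.10, every basic simplex is a segment [Ai, Aj], and the optimal cost is the lowest height at which the requirement line meets such a segment. A brute-force sketch over all segments (the points, costs, and b below are made-up illustration data, not those of Figure 3.7):

```python
# For m = 1 each basic simplex is a segment [A_i, A_j]; the optimal cost at b
# is the lowest interpolated cost over segments that contain b.
A = [0.0, 1.0, 2.0, 4.0]      # columns A_i (scalars, since m = 1)
c = [2.0, 0.5, 1.5, 1.0]      # cost components c_i
b = 3.0                       # where the requirement line meets the plane

best = None
for i in range(len(A)):
    for j in range(len(A)):
        lo, hi = A[i], A[j]
        if lo <= b <= hi and hi > lo:
            t = (b - lo) / (hi - lo)           # convex-combination weight
            z = (1 - t) * c[i] + t * c[j]      # height where the line meets the segment
            best = z if best is None else min(best, z)
print(best)                                    # about 0.8333 (= 5/6)
```

This enumerates all basic feasible solutions; the simplex method reaches the same lowest point while examining only a few of these segments.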
3.7 Computational efficiency of the simplex method

The computational efficiency of the simplex method is determined by two factors:
(a) the computational effort at each iteration;
(b) the number of iterations.
The computational requirements of each iteration have already been discussed in Section 3.3. For example, the full tableau implementation needs O(mn) arithmetic operations per iteration; the same is true for the revised simplex method in the worst case. We now turn to a discussion of the number of iterations.

The number of iterations in the worst case

Although the number of extreme points of the feasible set can increase exponentially with the number of variables and constraints, it has been observed in practice that the simplex method typically takes only O(m) pivots to find an optimal solution, by taking steps from one vertex to an adjacent one that has lower cost, each time improving the value of the cost function. Unfortunately, however, this practical observation is not true for every linear programming problem. Recall that for nondegenerate problems, the simplex method always moves from one vertex to an adjacent one. We will now describe a polyhedron that has an exponential number of vertices, along with a path that travels along its edges and visits each vertex exactly once; we call such a path a spanning path.

Consider the unit cube in Rn, defined by the constraints

    0 ≤ xi ≤ 1,        i = 1, ..., n.                    (3.7)

The unit cube has 2^n vertices (for each i, we may let either one of the two constraints 0 ≤ xi and xi ≤ 1 become active). Furthermore, there exists a path that travels along the edges of the cube and which visits each vertex exactly once. It can be constructed according to the procedure illustrated in Figure 3.8.

Let us now introduce the cost function -xn. Half of the vertices of the cube have zero cost and the other half have a cost of -1. Thus, the cost cannot decrease strictly with each move along the spanning path, and we do not yet have the desired example. However, if we choose some ε ∈ (0, 1/2) and consider the perturbation of the unit cube defined by the constraints

    ε ≤ x1 ≤ 1,
    εx_{i-1} ≤ xi ≤ 1 - εx_{i-1},        i = 2, ..., n,    (3.8)

then the simplex method, under a pivoting rule that traces this path, needs an exponential number of pivots.
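For the perturbed cube above, one can check computationally that the cost -xn decreases strictly along the recursive spanning path. A sketch (our own code, with ε = 0.3):

```python
def spanning_path(n, eps):
    """Vertices of the perturbed cube (3.8) along the recursive spanning
    path: traverse the (n-1)-dimensional path with x_n at its lower
    bound, then flip x_n to its upper bound and traverse in reverse."""
    if n == 0:
        return [[]]
    prev = spanning_path(n - 1, eps)
    def bound(v, lower):
        if not v:                              # x_1's bounds are eps and 1
            return eps if lower else 1.0
        return eps * v[-1] if lower else 1.0 - eps * v[-1]
    low = [v + [bound(v, True)] for v in prev]
    high = [v + [bound(v, False)] for v in reversed(prev)]
    return low + high

eps = 0.3
for n in range(1, 7):
    path = spanning_path(n, eps)
    costs = [-v[-1] for v in path]             # cost function  -x_n
    assert len(path) == 2 ** n
    assert all(a > b for a, b in zip(costs, costs[1:]))   # strictly decreasing
print("cost decreases strictly along all 2^n vertices, up to n = 6")
```

A pivoting rule that always moves to the next vertex on this path therefore performs 2^n - 1 cost-improving pivots.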
It can then be verified that the cost function decreases strictly with each move along a suitably chosen spanning path. If we start the simplex method at the first vertex on that spanning path and if our pivoting rule is to always move to the next vertex on that path, then the simplex method will require 2^n - 1 pivots.

Figure 3.8: (a) A spanning path p2 in the two-dimensional cube. (b) A spanning path p3 in the three-dimensional cube. Notice that this path is obtained by splitting the three-dimensional cube into two two-dimensional cubes, following path p2 in one of them, moving to the other cube, and following p2 in the reverse order. This construction generalizes and provides a recursive definition of a spanning path for the general n-dimensional cube.

We summarize this discussion in the following theorem, whose proof is left as an exercise (Exercise 3.32).

Theorem 3.5 Consider the linear programming problem of minimizing -xn subject to the constraints (3.8). Then:
(a) The feasible set has 2^n vertices.
(b) The vertices can be ordered so that each one is adjacent to and has lower cost than the previous one.
(c) There exists a pivoting rule under which the simplex method requires 2^n - 1 changes of basis before it terminates.

We observe in Figure 3.8 that the first and the last vertex in the spanning path are adjacent. This property persists in the perturbed polyhedron as well. Thus, with a different pivoting rule, the simplex method could terminate with a single pivot. We are thus led to the following question: is it true that for every pivoting rule there are examples where the simplex
method takes an exponential number of iterations? For several popular pivoting rules, such examples have been constructed. However, these examples cannot exclude the possibility that some other pivoting rule might fare better. This is one of the most important open problems in the theory of linear programming. In the next subsection, we address a closely related issue.

The diameter of polyhedra and the Hirsch conjecture

The preceding discussion leads us to the notion of the diameter of a polyhedron P, which is defined as follows. Suppose that from any vertex of the polyhedron, we are only allowed to jump to an adjacent vertex. We define the distance d(x, y) between two vertices x and y as the minimum number of such jumps required to reach y starting from x. The diameter D(P) of the polyhedron P is then defined as the maximum of d(x, y) over all pairs (x, y) of vertices. We define Δ(n, m) as the maximum of D(P) over all bounded polyhedra in Rn that are represented in terms of m inequality constraints. The quantity Δu(n, m) is defined similarly, except that general, possibly unbounded, polyhedra are allowed. For example, we have Δ(2, m) = ⌊m/2⌋ and Δu(2, m) = m - 2; see Figure 3.9.

Figure 3.9: Let n = 2 and m = 8. (a) A bounded polyhedron with diameter ⌊m/2⌋ = 4. (b) An unbounded polyhedron with diameter m - 2 = 6.

Now, suppose that the feasible set in a linear programming problem has diameter d and that the distance between vertices x and y is equal to d. If the simplex method (or any other method that proceeds from one vertex to an adjacent vertex) is initialized at x, and if y happens to be the unique optimal solution, then at least d steps will be required. Thus, in order to have any hope of developing pivoting rules under which the simplex method requires a polynomial number of iterations, we must first establish that Δ(n, m) and Δu(n, m) do not grow exponentially fast. The practical success of the simplex method has led to the conjecture that indeed Δ(n, m) and Δu(n, m) grow with n and m at the rate of some polynomial. In fact, the following, much stronger, conjecture has been advanced:

Hirsch Conjecture: Δ(n, m) ≤ m - n.

Despite the significance of Δ(n, m) and Δu(n, m), we are far from establishing the Hirsch conjecture or even from establishing that these quantities exhibit polynomial growth. It is known (Klee and Walkup, 1967) that the Hirsch conjecture is false for unbounded polyhedra and, in particular, that

    Δu(n, m) ≥ m - n + ⌊n/5⌋.

In fact, this is the best lower bound known; even though it disproves the Hirsch conjecture for unbounded polyhedra, it does not establish exponential growth. Note that if Δu(n, m) increases exponentially with n and m, this implies that there exist examples for which the simplex method takes an exponentially increasing number of steps, no matter which pivoting rule is used. Regarding upper bounds, it has been established (Kalai and Kleitman, 1993) that the worst-case diameter grows slower than exponentially; in particular,

    Δu(n, m) < m^{1+log2 n} = (2n)^{log2 m}.

However, the available upper bound grows faster than any polynomial, and it does not provide any insights as to whether the growth of Δ(n, m) or Δu(n, m) is polynomial or exponential.

The average case behavior of the simplex method

Our discussion has been focused on the worst-case behavior of the simplex method, but this is only part of the story. Even if every pivoting rule requires an exponential number of iterations in the worst case, this is not necessarily relevant to the typical behavior of the simplex method. For this reason, there has been a fair amount of research aiming at an understanding of the typical or average behavior of the simplex method, and an explanation of its observed behavior.

The main difficulty in studying the average behavior of any algorithm lies in defining the meaning of the term "average." Basically, one needs to define a probability distribution over the set of all problems of a given size, and then take the mathematical expectation of the number of iterations required by the algorithm, when applied to a random problem drawn according to the postulated probability distribution. Unfortunately, there is no natural probability distribution over the set of linear programming problems. Nevertheless, a fair number of positive results have been obtained for a few different types of probability distributions. In one such result, a set of vectors c, a1, ..., am ∈ Rn and scalars b1, ..., bm is given. For i = 1, ..., m, we introduce either constraint ai′x ≤ bi or ai′x ≥ bi, with equal probability. We then have 2^m possible linear programming problems, and suppose that L of them are feasible. Haimovich (1983) has established that under a rather special pivoting rule, the simplex method requires no more than n/2 iterations, on the average over those L feasible problems. This linear dependence on the size of the problem agrees with observed behavior; some empirical evidence is discussed in Chapter 12.

3.8 Summary

This chapter was centered on the development of the simplex method, which is a complete algorithm for solving linear programming problems in standard form. The cornerstones of the simplex method are:
(a) the optimality conditions (nonnegativity of the reduced costs) that allow us to test whether the current basis is optimal;
(b) a systematic method for performing basis changes whenever the optimality conditions are violated.
At a high level, the simplex method simply moves from one extreme point of the feasible set to another, each time reducing the cost, until an optimal solution is reached. However, the lower level details of the simplex method, relating to the organization of the required computations and the associated bookkeeping, play an important role. We have described three different implementations: the naive one, the revised simplex method, and the full tableau implementation. Abstractly, they are all equivalent, but their mechanics are quite different. Practical implementations of the simplex method follow our general description of the revised simplex method, but the details are different, because an explicit computation of the inverse basis matrix is usually avoided.

We have seen that degeneracy can cause substantial difficulties, including the possibility of nonterminating behavior (cycling). This is because in the presence of degeneracy, a change of basis may keep us at the same basic feasible solution, with no cost improvement resulting. Cycling can be avoided if suitable rules for choosing the entering and exiting variables (pivoting rules) are applied (e.g., Bland's rule or the lexicographic pivoting rule).

Starting the simplex method requires an initial basic feasible solution, and an associated tableau. These are provided by the Phase I simplex algorithm, which is nothing but the simplex method applied to an auxiliary
problem. We saw that the changeover from Phase I to Phase II involves some delicate steps whenever some artificial variables are in the final basis constructed by the Phase I algorithm.

The simplex method is a rather efficient algorithm and is incorporated in most of the commercial codes for linear programming. While the number of pivots can be an exponential function of the number of variables and constraints in the worst case, its observed behavior is a lot better, hence the practical usefulness of the method.

3.9 Exercises

Exercise 3.1 (Local minima of convex functions) Let f : Rn → R be a convex function and let S ⊂ Rn be a convex set. Let x* be an element of S. Suppose that x* is a local optimum for the problem of minimizing f(x) over S; that is, there exists some ε > 0 such that f(x*) ≤ f(x) for all x ∈ S for which ‖x - x*‖ ≤ ε. Prove that x* is globally optimal; that is, f(x*) ≤ f(x) for all x ∈ S.

Exercise 3.2 (Optimality conditions) Consider the problem of minimizing c′x over a polyhedron P. Prove the following:
(a) A feasible solution x is optimal if and only if c′d ≥ 0 for every feasible direction d at x.
(b) A feasible solution x is the unique optimal solution if and only if c′d > 0 for every nonzero feasible direction d at x.

Exercise 3.3 Let x be an element of the standard form polyhedron P = {x ∈ Rn | Ax = b, x ≥ 0}. Prove that a vector d ∈ Rn is a feasible direction at x if and only if Ad = 0 and di ≥ 0 for every i such that xi = 0.

Exercise 3.4 Consider the problem of minimizing c′x over the set P = {x ∈ Rn | Ax = b, Dx ≤ f, Ex ≤ g}. Let x* be an element of P that satisfies Dx* = f, Ex* < g. Show that the set of feasible directions at the point x* is the set

    {d ∈ Rn | Ad = 0, Dd ≤ 0}.

Exercise 3.5 Let P = {x ∈ R3 | x1 + x2 + x3 = 1, x ≥ 0} and consider the vector x = (0, 0, 1). Find the set of feasible directions at x.

Exercise 3.6 (Conditions for a unique optimum) Let x be a basic feasible solution associated with some basis matrix B. Prove the following:
(a) If the reduced cost of every nonbasic variable is positive, then x is the unique optimal solution.
(b) If x is the unique optimal solution and is nondegenerate, then the reduced cost of every nonbasic variable is positive.

Exercise 3.7 (Optimality conditions) Consider a feasible solution x to a standard form problem, and let Z = {i | xi = 0}. Show that x is an optimal solution if and only if the linear programming problem

    minimize     c′d
    subject to   Ad = 0,
                 di ≥ 0,   i ∈ Z,

has an optimal cost of zero. (In this sense, deciding optimality is equivalent to solving a new linear programming problem.)

Exercise 3.8* This exercise deals with the problem of deciding whether a given degenerate basic feasible solution is optimal, and shows that this is essentially as hard as solving a general linear programming problem. Consider the linear programming problem of minimizing c′x over all x ∈ P, where P ⊂ Rn is a given bounded polyhedron. Let

    Q = {(tx, t) ∈ Rn+1 | x ∈ P, t ∈ [0, 1]}.

(a) Show that Q is a polyhedron.
(b) Give an example of P and Q, with n = 2, for which the zero vector (in Rn+1) is a degenerate basic feasible solution in Q; show the example in a figure.
(c) Show that the zero vector (in Rn+1) minimizes (c, 0)′y over all y ∈ Q if and only if the optimal cost in the original linear programming problem is greater than or equal to zero.

Exercise 3.9 (Necessary and sufficient conditions for a unique optimum) Consider a linear programming problem in standard form and suppose that x* is an optimal basic feasible solution. Consider an optimal basis associated with x*. Let B and N be the set of basic and nonbasic indices, respectively. Let I be the set of nonbasic indices i for which the corresponding reduced costs are zero.
(a) Show that if I is empty, then x* is the only optimal solution.
(b) Show that x* is the unique optimal solution if and only if the following linear programming problem has an optimal value of zero:

    maximize     Σ_{i∈I} xi
    subject to   Ax = b,
                 xi = 0,   i ∈ N \ I,
                 xi ≥ 0,   i ∈ B ∪ I.

Exercise 3.10* Show that if n - m = 1, then the simplex method will not cycle, no matter which pivoting rule is used.

Exercise 3.11* Construct an example with n - m = 2 and a pivoting rule under which the simplex method will cycle.
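The characterization of Exercise 3.3 is easy to test numerically. A sketch (the helper is our own), applied to the polyhedron of Exercise 3.5:

```python
def is_feasible_direction(A, x, d, tol=1e-9):
    """Exercise 3.3's test for standard form: d is a feasible direction at
    x if and only if Ad = 0 and d_i >= 0 whenever x_i = 0."""
    Ad = [sum(row[j] * d[j] for j in range(len(d))) for row in A]
    return (all(abs(v) <= tol for v in Ad)
            and all(d[i] >= -tol for i in range(len(d)) if abs(x[i]) <= tol))

# x = (0, 0, 1) in P = {x in R^3 : x1 + x2 + x3 = 1, x >= 0} (Exercise 3.5)
A = [[1, 1, 1]]
x = [0, 0, 1]
print(is_feasible_direction(A, x, [1, 0, -1]))   # True:  stays in the hyperplane
print(is_feasible_direction(A, x, [-1, 0, 1]))   # False: x1 would go negative
print(is_feasible_direction(A, x, [1, 0, 0]))    # False: leaves the hyperplane
```

This is only a numerical spot check, not a substitute for the proofs the exercises ask for.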
Exercise 3.12 Consider the problem

    minimize     -2x1 - x2
    subject to    x1 - x2 ≤ 2,
                  x1 + x2 ≤ 6,
                  x1, x2 ≥ 0.

(a) Convert the problem into standard form and construct a basic feasible solution at which (x1, x2) = (0, 0).
(b) Carry out the full tableau implementation of the simplex method, starting with the basic feasible solution of part (a).
(c) Draw a graphical representation of the problem in terms of the original variables x1, x2, and indicate the path taken by the simplex algorithm.

Exercise 3.13 This exercise shows that our efficient procedures for updating a tableau can be derived from a useful fact in numerical linear algebra.
(a) (Matrix inversion lemma) Let C be an m × m invertible matrix and let w, v be vectors in Rm. Show that

    (C + wv′)⁻¹ = C⁻¹ - (C⁻¹wv′C⁻¹) / (1 + v′C⁻¹w).

(Note that wv′ is an m × m matrix.) Hint: Multiply both sides by (C + wv′).
(b) Assuming that C⁻¹ is available, explain how to obtain (C + wv′)⁻¹ using only O(m²) arithmetic operations.
(c) Let B and B̄ be basis matrices before and after an iteration of the simplex method. Let AB(ℓ), AB̄(ℓ) be the exiting and entering column, respectively. Show that

    B̄ - B = (AB̄(ℓ) - AB(ℓ)) eℓ′,

where eℓ is the ℓth unit vector.
(d) Note that eℓ′B⁻¹ is the ℓth row of B⁻¹. Using the matrix inversion lemma, show that the rows of B̄⁻¹ are obtained by adding to each row of B⁻¹ a multiple gi of the ℓth row of B⁻¹, i = 1, ..., m, for suitable scalars gi. Provide a formula for gi. Interpret the above equation in terms of the mechanics for pivoting in the revised simplex method.
(e) Multiply both sides of the equation in part (d) by [b | A] and obtain an interpretation of the mechanics for pivoting in the full tableau implementation.

Exercise 3.14 Suppose that a feasible tableau is available. Show how to obtain a tableau with lexicographically positive rows. Hint: Permute the columns.

Exercise 3.15 (Perturbation approach to lexicography) Consider a standard form problem, under the usual assumption that the rows of A are linearly independent. Let ε be a scalar and define

    b(ε) = b + (ε, ε², ..., εᵐ)′.
compare the results lexicographically.132 Chap. both Phase I and Phase II) via the sim plex method the following problem: minimize subject to 2X1 + 3X2 + 3X3 Xl + 3X2 Xl + 2X2 Xl 4X2 + 3X3 Xl . divide the ith row of [B1b I B1] by Ui and choose the row which is lexicographically smallest . C�B1) increases lexicographically at each iteration. then the I!th basic variable X B (£) exits the basis. If row I! was lexicographically smallest . we define the Eperturbed problem to be the linear programming problem obtained by replacing b with b(E) . and assume that all rows of B1 [b I I] are lexicographically positive. Let u = B1 Aj • For each i with Ui > 0. . Show that this is the same choice of exiting variable as in the original simplex method applied to the Eperturbed problem. (c) The revised simplex method terminates after a finite number of steps. X5 2: o. 3 The simplex method For every E > 0. Prove the following: (a) The row vector (C�B1b. (a) Given a basis matrix B. with the lexicographic rule de scribed in part (d) . + X4 2X5 + 4X4 + X5 3X4 + X5 2 2 1 . (b) Exercise 3. For every row i such that Ui is positive. show that the corresponding basic solution XB (E) in the Eperturbed problem is equal to Show that there exists some E* > 0 such that all basic solutions to the Eperturbed problem are nondegenerate. is guaranteed to terminate even in the face of degener acy. when E is sufficiently small. (b) Every row of B1 is lexicographically positive throughout the algorithm. Consider a pivoting rule that chooses the entering variable Xj arbitrarily (as long as Cj < 0) and the exiting variable as follows. (c) Suppose that all rows of B1 [b I I] are lexicographically positive. Show that XB (E) is a basic feasible solution to the Eperturbed problem for E positive and sufficiently small. (Lexicography and the revised shnplex method) Suppose that we have a basic feasible solution and an associated basis matrix B such that every row of B1 is lexicographically positive. . 
Let the exiting variable be determined as follows.17 Solve completely (Le.. and define u = B1 Aj . .16 Exercise 3. Let some nonbasic variable Xj enter the basis. for 0 < E < E* . and choose the exiting variable to be the one corresponding to the lexicographically smallest row. divide the ith row of B1 [b I I] by Ui . (d) Consider a feasible basis for the original problem. . (e) Explain why the revised simplex method.
Exercise 3.18 Consider the simplex method applied to a standard form problem, and assume that the rows of the matrix A are linearly independent. For each of the statements that follow, give either a proof or a counterexample.

(a) An iteration of the simplex method may move the feasible solution by a positive distance while leaving the cost unchanged.

(b) A variable that has just left the basis cannot reenter in the very next iteration.

(c) A variable that has just entered the basis cannot leave in the very next iteration.

(d) If there is a nondegenerate optimal basis, then there exists a unique optimal basis.

(e) If x is an optimal solution, no more than m of its components can be positive, where m is the number of equality constraints.

Exercise 3.19 While solving a standard form problem, we arrive at the following tableau, with x3, x4, and x5 being the basic variables:

      |  x1   x2   x3   x4   x5
 −10  |   δ   −2    0    0    0
   4  |  −1    η    1    0    0
   1  |   α   −4    0    1    0
   β  |   γ    3    0    0    1

The entries α, β, γ, δ, η in the tableau are unknown parameters. For each one of the following statements, find the ranges of values of the various parameters that will make the statement true.

(a) The current solution is optimal and there are multiple optimal solutions.

(b) The optimal cost is −∞.

(c) The current solution is feasible but not optimal.

Exercise 3.20 Consider a linear programming problem in standard form, described in terms of the following initial tableau:

     |  x1   x2   x3   x4   x5   x6   x7
  0  |   β    0    0    0    1    0    1
  0  |   0    0    0    1    0    δ    α
  3  |   1    γ    0    η    θ    3    1
  1  |   2    3    2    0    2    1    2

The entries α, β, γ, δ, η, θ in the tableau are unknown parameters. Furthermore, let B be the basis matrix corresponding to having x2, x3, and x1 (in that order) be the basic variables. For each one of the following statements, find some parameter values that will make the statement true.

(a) Phase II of the simplex method can be applied using this as an initial tableau.

(b) The first row in the present tableau indicates that the problem is infeasible.

(c) The corresponding basic solution is feasible, but we do not have an optimal basis.

(d) The corresponding basic solution is feasible and the first simplex iteration indicates that the optimal cost is −∞.

(e) The corresponding basic solution is feasible, x6 is a candidate for entering the basis, and when x6 is the entering variable, x3 leaves the basis.

(f) The corresponding basic solution is feasible, x7 is a candidate for entering the basis, but if it does, the solution and the objective value remain unchanged.

Exercise 3.21 Consider the oil refinery problem in Exercise 1.16.

(a) Use the simplex method to find an optimal solution.

(b) Suppose that the selling price of heating oil is sure to remain fixed over the next month, but the selling price of gasoline may rise. How high can it go without causing the optimal solution to change?

(c) The refinery manager can buy crude oil B on the spot market at $40/barrel, in unlimited quantities. How much should be bought?

Exercise 3.22 Consider the following linear programming problem with a single constraint:

minimize Σ_{i=1}^n c_i x_i
subject to Σ_{i=1}^n a_i x_i = b
           x_i ≥ 0,  i = 1, ..., n.

(a) Derive a simple test for checking the feasibility of this problem.

(b) Assuming that the optimal cost is finite, develop a simple method for obtaining an optimal solution directly.

Exercise 3.23 While solving a linear programming problem by the simplex method, the following tableau is obtained at some iteration (the columns of the basic variables form an identity matrix). Assume that in this tableau the reduced costs satisfy c̄_j ≥ 0 for j = m + 1, ..., n − 1, and c̄_n < 0. In particular, x_n is the only candidate for entering the basis.

(a) Suppose that x_n indeed enters the basis and that this is a nondegenerate pivot (that is, θ* ≠ 0). Prove that x_n will remain basic in all subsequent iterations of the algorithm and that x_n is a basic variable in any optimal basis.
(b) Suppose that x_n indeed enters the basis and that this is a degenerate pivot (that is, θ* = 0). Show that x_n need not be basic in an optimal basic feasible solution.

Exercise 3.24 Show that in Phase I of the simplex method, if an artificial variable becomes nonbasic, it need never again become basic. Thus, when an artificial variable becomes nonbasic, its column can be eliminated from the tableau.

Exercise 3.25 (The simplex method with upper bound constraints) Consider a problem of the form

minimize c'x
subject to Ax = b
           0 ≤ x ≤ u,

where A has linearly independent rows and dimensions m × n. Assume that u_i > 0 for all i.

(a) Let A_B(1), ..., A_B(m) be m linearly independent columns of A (the "basic" columns). We partition the set of all i ≠ B(1), ..., B(m) into two disjoint subsets L and U. We set x_i = 0 for all i ∈ L, and x_i = u_i for all i ∈ U. We then solve the equation Ax = b for the basic variables x_B(1), ..., x_B(m). Show that the resulting vector x is a basic solution. Also, show that it is nondegenerate if and only if 0 < x_i < u_i for every basic variable x_i.

(b) For this part and the next, assume that the basic solution constructed in part (a) is feasible. We form the simplex tableau and compute the reduced costs as usual. Let x_j be some nonbasic variable such that x_j = 0 and c̄_j < 0. As in Section 3.2, we increase x_j by θ, and adjust the basic variables from x_B to x_B − θB^{-1}A_j. Given that we wish to preserve feasibility, what is the largest possible value of θ? How are the new basic columns determined?

(c) Let x_j be some nonbasic variable such that x_j = u_j and c̄_j > 0. We decrease x_j by θ, and adjust the basic variables from x_B to x_B + θB^{-1}A_j. Given that we wish to preserve feasibility, what is the largest possible value of θ? How are the new basic columns determined?

(d) Assuming that every basic feasible solution is nondegenerate, show that the cost strictly decreases with each iteration and the method terminates.

Exercise 3.26 (The big-M method) Consider the variant of the big-M method in which M is treated as an undetermined large parameter. Prove the following.

(a) If the simplex method terminates with a solution (x, y) for which y = 0, then x is an optimal solution to the original problem.

(b) If the simplex method terminates with a solution (x, y) for which y ≠ 0, then the original problem is infeasible.

(c) If the simplex method terminates with an indication that the optimal cost in the auxiliary problem is −∞, show that the original problem is either infeasible or its optimal cost is −∞.

(d) Provide examples to show that both alternatives in part (c) are possible.
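The step-size question in parts (b) and (c) of Exercise 3.25 is a ratio test against both bounds. A sketch for the increase case of part (b), with our own function name and arguments (x_b and u_b are the current values and upper bounds of the basic variables, d = B^{-1}A_j):

```python
def max_step_increase(x_b, u_b, d, u_j):
    """Largest theta such that increasing the nonbasic x_j (currently 0)
    by theta keeps 0 <= x_B - theta*d <= u_B and theta <= u_j."""
    theta = u_j
    for xi, ui, di in zip(x_b, u_b, d):
        if di > 0:
            theta = min(theta, xi / di)          # a basic variable hits 0
        elif di < 0:
            theta = min(theta, (ui - xi) / -di)  # a basic variable hits its bound
    return theta
```

Three things can stop the increase: a basic variable reaches its lower bound 0, a basic variable reaches its upper bound, or x_j itself reaches u_j. Part (c) is symmetric, with the roles of the two bounds exchanged.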
Exercise 3.27 *
(a) Suppose that we wish to find a vector x ∈ R^n that satisfies Ax = 0 and x ≥ 0, and such that the number of positive components of x is maximized. Show that this can be accomplished by solving the linear programming problem

maximize Σ_{i=1}^n y_i
subject to A(y + z) = 0
           y_i ≤ 1, for all i,
           y ≥ 0, z ≥ 0.

Hint: When the simplex method terminates, it has discovered a feasible direction d = (d_x, d_y) of cost decrease. Show that d_y = 0.

(b) Suppose that we wish to find a vector x ∈ R^n that satisfies Ax = b and x ≥ 0, and such that the number of positive components of x is maximized. Show how this can be accomplished by solving a single linear programming problem.

Exercise 3.28 Consider a linear programming problem in standard form with a bounded feasible set. Furthermore, suppose that we know the value of a scalar U such that any feasible solution satisfies x_i ≤ U, for all i. Show that the problem can be transformed into an equivalent one that contains the constraint Σ_i x_i = 1.

Exercise 3.29 Consider the simplex method, viewed in terms of column geometry. Show that the m + 1 basic points (A_i, c_i) are affinely independent.

Exercise 3.30 Consider the simplex method, viewed in terms of column geometry. In the terminology of Section 3.6, show that the vertical distance from the dual plane to a point (A_j, c_j) is equal to the reduced cost of the variable x_j.

Exercise 3.31 Consider the linear programming problem

minimize x1 + 3x2 + 2x3 + 2x4
subject to 2x1 + 3x2 + x3 + x4 = b1
           x1 + 2x2 + x3 + 3x4 = b2
           x1 + x2 + x3 + x4 = 1
           x1, ..., x4 ≥ 0,

where b1, b2 are free parameters. Let P(b1, b2) be the feasible set. Use the column geometry of linear programming to answer the following questions.

(a) Characterize explicitly (preferably with a picture) the set of all (b1, b2) for which P(b1, b2) is nonempty.
(b) Characterize explicitly (preferably with a picture) the set of all (b1, b2) for which some basic feasible solution is degenerate.

(c) There are four bases in this problem; in the ith basis, all variables except for x_i are basic. For i = 1, ..., 4, let S_i = {(b1, b2) | the ith basis is optimal}. Identify, preferably with a picture, the sets S_1, ..., S_4.

(d) For which values of (b1, b2) is the optimal solution degenerate?

(e) Let b1 = 9/5 and b2 = 7/5. Enumerate all bases that correspond to each degenerate basic feasible solution.

(f) Suppose that we start the simplex method with x2, x3, x4 as the basic variables. Which path will the simplex method follow?

Exercise 3.32 * Prove Theorem 3.2.

Exercise 3.33 Consider a polyhedron in standard form, and let x, y be two different basic feasible solutions. If we are allowed to move from any basic feasible solution to an adjacent one in a single step, show that we can go from x to y in a finite number of steps.

3.10 Notes and sources

3.2. The simplex method was pioneered by Dantzig in 1947, who later wrote a comprehensive text on the subject (Dantzig, 1963).

3.3. For more discussion of practical implementations of the simplex method based on products of sparse matrices, see the books by Gill, Murray, and Wright (1981), Chvátal (1983), Murty (1983), and Luenberger (1984). An excellent introduction to numerical linear algebra is the text by Golub and Van Loan (1983). If we have upper bounds for all or some of the variables, we can use a suitable adaptation of the simplex method, instead of converting the problem to standard form. This is developed in Exercise 3.25 and in the textbooks that we mentioned earlier.

3.4. The lexicographic anticycling rule is due to Dantzig, Orden, and Wolfe (1955). It can be viewed as an outgrowth of a perturbation method developed by Orden and also by Charnes (1952); for an exposition of the perturbation method, see Chvátal (1983) and Murty (1983), as well as Exercise 3.15. The smallest subscript rule is due to Bland (1977). A proof that Bland's rule avoids cycling can also be found in Papadimitriou and Steiglitz (1982), Chvátal (1983), or Murty (1983). Example 3.6, which shows the possibility of cycling, is due to Beale (1955).

3.6. The column geometry interpretation of the simplex method is due to Dantzig (1963). For further discussion, see Stone and Tovey (1991).

3.7. The example showing that the simplex method can take an exponential number of iterations is due to Klee and Minty (1972). The Hirsch conjecture was made by Hirsch in 1957. Schrijver (1986) contains an overview of the early research in this area, as well as a proof of the n/2 bound on the number of pivots due to Haimovich (1983). The first results on the average case behavior of the simplex method were obtained by Borgwardt (1982) and Smale (1983).

3.9. The results in Exercises 3.10 and 3.11, which deal with the smallest examples of cycling, are due to Marshall and Suurballe (1969). The matrix inversion lemma [Exercise 3.13(a)] is known as the Sherman-Morrison formula.
Chapter 4
Duality theory

Contents
4.1. Motivation
4.2. The dual problem
4.3. The duality theorem
4.4. Optimal dual variables as marginal costs
4.5. Standard form problems and the dual simplex method
4.6. Farkas' lemma and linear inequalities
4.7. From separating hyperplanes to duality*
4.8. Cones and extreme rays
4.9. Representation of polyhedra
4.10. General linear programming duality*
4.11. Summary
4.12. Exercises
4.13. Notes and sources
In this chapter, we start with a linear programming problem, called the primal, and introduce another linear programming problem, called the dual. Duality theory deals with the relation between these two problems and uncovers the deeper structure of linear programming. It is a powerful theoretical tool that has numerous applications, provides new geometric insights, and leads to another algorithm for linear programming (the dual simplex method).

4.1 Motivation

Duality theory can be motivated as an outgrowth of the Lagrange multiplier method, often used in calculus to minimize a function subject to equality constraints. For example, in order to solve the problem

minimize x^2 + y^2
subject to x + y = 1,

we introduce a Lagrange multiplier p and form the Lagrangean L(x, y, p) defined by

L(x, y, p) = x^2 + y^2 + p(1 − x − y).

While keeping p fixed, we minimize the Lagrangean over all x and y, subject to no constraints, which can be done by setting ∂L/∂x and ∂L/∂y to zero. The optimal solution to this unconstrained problem is x = y = p/2, and depends on p. The constraint x + y = 1 gives us the additional relation p = 1, and the optimal solution to the original problem is x = y = 1/2.

The main idea in the above example is the following. Instead of enforcing the hard constraint x + y = 1, we allow it to be violated and associate a Lagrange multiplier, or price, p with the amount 1 − x − y by which it is violated. When the price is properly chosen (p = 1, in our example), the optimal solution to the constrained problem is also optimal for the unconstrained problem. In particular, under that specific value of p, the presence or absence of the hard constraint does not affect the optimal cost.

The situation in linear programming is similar: we associate a price variable with each constraint and start searching for prices under which the presence or absence of the constraints does not affect the optimal cost. It turns out that the right prices can be found by solving a new linear programming problem, called the dual of the original. We now motivate the form of the dual problem.
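The two-step computation in the example above (minimize the Lagrangean for a fixed price p, then pick p so that the constraint is restored) can be checked directly; a short sketch, assuming nothing beyond the example's data:

```python
def minimize_lagrangean(p):
    """For L(x, y, p) = x^2 + y^2 + p*(1 - x - y), setting dL/dx = 2x - p = 0
    and dL/dy = 2y - p = 0 gives the unconstrained minimizer x = y = p/2."""
    return p / 2, p / 2

# The constraint x + y = 1 then forces p/2 + p/2 = 1, that is, p = 1:
x, y = minimize_lagrangean(1.0)
```

With p = 1 we recover x = y = 1/2, the optimal solution of the constrained problem.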
Consider the standard form problem

minimize c'x
subject to Ax = b
           x ≥ 0,

which we call the primal problem, and let x* be an optimal solution, assumed to exist. We introduce a relaxed problem in which the constraint Ax = b is replaced by a penalty p'(b − Ax), where p is a price vector of the same dimension as b. We are then faced with the problem

minimize c'x + p'(b − Ax)
subject to x ≥ 0.

Let g(p) be the optimal cost for the relaxed problem, as a function of the price vector p. The relaxed problem allows for more options than those present in the primal problem, and we expect g(p) to be no larger than the optimal primal cost c'x*. Indeed,

g(p) = min_{x≥0} [c'x + p'(b − Ax)] ≤ c'x* + p'(b − Ax*) = c'x*,

where the last inequality follows from the fact that x* is a feasible solution to the primal problem, and satisfies Ax* = b. Thus, each p leads to a lower bound g(p) for the optimal cost c'x*. The problem

maximize g(p)
subject to no constraints

can be then interpreted as a search for the tightest possible lower bound of this type, and is known as the dual problem. The main result in duality theory asserts that the optimal cost in the dual problem is equal to the optimal cost c'x* in the primal. In other words, when the prices are chosen according to an optimal solution for the dual problem, the option of violating the constraints Ax = b is of no value.

Using the definition of g(p), we have

g(p) = min_{x≥0} [c'x + p'(b − Ax)] = p'b + min_{x≥0} (c' − p'A)x.

Note that

min_{x≥0} (c' − p'A)x = 0, if c' − p'A ≥ 0',
                        −∞, otherwise.

In maximizing g(p), we only need to consider those values of p for which g(p) is not equal to −∞. We therefore conclude that the dual problem is the same as the linear programming problem

maximize p'b
subject to p'A ≤ c'.
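The case analysis behind g(p) is easy to compute directly. A small pure-Python sketch (the function name is ours, and the problem data in the example are made up for illustration):

```python
def g(p, c, A, b):
    """g(p) = p'b + min_{x>=0} (c' - p'A)x for the standard form primal:
    equal to p'b when every entry of c - A'p is nonnegative, else -inf."""
    m, n = len(A), len(c)
    reduced = [c[j] - sum(p[i] * A[i][j] for i in range(m)) for j in range(n)]
    if all(r >= 0 for r in reduced):
        return sum(p[i] * b[i] for i in range(m))
    return float('-inf')

# minimize x1 + 2*x2 subject to x1 + x2 = 1, x >= 0; the optimum is x = (1, 0)
# with cost 1, and any p with nonnegative reduced entries bounds it from below.
bound = g([0.5], [1, 2], [[1, 1]], [1])
```

Here g(0.5) = 0.5 ≤ 1, while a price that drives some reduced entry negative, such as p = 2, yields g(p) = −∞ and carries no information.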
In the preceding example, we started with the equality constraint Ax = b and we ended up with no constraints on the sign of the price vector p. If the primal problem had instead inequality constraints of the form Ax ≥ b, they could be replaced by Ax − s = b, s ≥ 0. The equality constraint can be written in the form

[A | −I] (x; s) = b,

which leads to the dual constraints

p'[A | −I] ≤ [c' | 0'],

or, equivalently,

p'A ≤ c',
p ≥ 0.

Also, if the vector x is free rather than sign-constrained, we use the fact

min_x (c' − p'A)x = 0, if c' − p'A = 0',
                    −∞, otherwise,

to end up with the constraint p'A = c' in the dual problem. These considerations motivate the general form of the dual problem which we introduce in the next section.

In summary, the construction of the dual of a primal minimization problem can be viewed as follows. We have a vector of parameters (dual variables) p, and for every p we have a method for obtaining a lower bound on the optimal primal cost. For some vectors p, the corresponding lower bound is equal to −∞, and does not carry any useful information. Thus, we only need to maximize over those p that lead to nontrivial lower bounds, and this is what gives rise to the dual constraints. The dual problem is a maximization problem that looks for the tightest such lower bound.

4.2 The dual problem

Let A be a matrix with rows a_i' and columns A_j. Given a primal problem with the structure shown on the left, its dual is defined to be the maximization problem shown on the right:

minimize c'x                         maximize p'b
subject to a_i'x ≥ b_i,  i ∈ M1,     subject to p_i ≥ 0,      i ∈ M1,
           a_i'x ≤ b_i,  i ∈ M2,                p_i ≤ 0,      i ∈ M2,
           a_i'x = b_i,  i ∈ M3,                p_i free,     i ∈ M3,
           x_j ≥ 0,      j ∈ N1,                p'A_j ≤ c_j,  j ∈ N1,
           x_j ≤ 0,      j ∈ N2,                p'A_j ≥ c_j,  j ∈ N2,
           x_j free,     j ∈ N3,                p'A_j = c_j,  j ∈ N3.
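The correspondence just stated is purely mechanical, so it can be written down as a table lookup. A sketch of such a dual builder, with our own function and data conventions (constraint types are the strings '>=', '<=', '='; variable types are '>=0', '<=0', 'free'):

```python
def dual_of(c, A, b, con_types, var_types):
    """Form the dual of: minimize c'x s.t. a_i'x (con_types[i]) b_i, with
    x_j of sign var_types[j].  Returns the dual objective coefficients b,
    the dual variable signs, and the dual constraints (column A_j, relation, c_j)."""
    p_sign = {'>=': '>=0', '<=': '<=0', '=': 'free'}
    d_rel = {'>=0': '<=', '<=0': '>=', 'free': '='}
    m, n = len(b), len(c)
    columns = [[A[i][j] for i in range(m)] for j in range(n)]
    dual_vars = [p_sign[t] for t in con_types]
    dual_cons = [(columns[j], d_rel[var_types[j]], c[j]) for j in range(n)]
    return b, dual_vars, dual_cons
```

For instance, an equality constraint yields a free dual variable, and a free primal variable yields a dual equality constraint, matching the corresponding rows of the display above.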
Notice that for each constraint in the primal (other than the sign constraints), we introduce a variable in the dual problem; depending on whether the primal constraint is an equality or inequality constraint, the corresponding dual variable is either free or sign-constrained, respectively. In addition, for each variable in the primal, we introduce a constraint in the dual; depending on whether a variable in the primal problem is free or sign-constrained, we have an equality or inequality constraint, respectively, in the dual. We summarize these relations in Table 4.1.

Table 4.1: Relation between primal and dual variables and constraints.

 PRIMAL minimize          DUAL maximize
 constraints  ≥ b_i       variables    ≥ 0
              ≤ b_i                    ≤ 0
              = b_i                    free
 variables    ≥ 0         constraints  ≤ c_j
              ≤ 0                      ≥ c_j
              free                     = c_j

If we start with a maximization problem, we can always convert it into an equivalent minimization problem, and then form its dual according to the rules we have described. However, to avoid confusion, we will adhere to the convention that the primal is a minimization problem, and its dual is a maximization problem. Finally, we will keep referring to the objective function in the dual problem as a "cost" that is being maximized.

A problem and its dual can be stated more compactly, in matrix notation, if a particular form is assumed for the primal. We have, for example, the following pairs of primal and dual problems:

minimize c'x              maximize p'b
subject to Ax = b         subject to p'A ≤ c',
           x ≥ 0,

and

minimize c'x              maximize p'b
subject to Ax ≥ b,        subject to p ≥ 0
                                     p'A = c'.

Example 4.1 Consider the primal problem shown on the left and its dual shown on the right:
minimize x1 + 2x2 + 3x3              maximize 5p1 + 6p2 + 4p3
subject to −x1 + 3x2 = 5             subject to p1 free
           2x1 − x2 + 3x3 ≥ 6                   p2 ≥ 0
           x3 ≤ 4                               p3 ≤ 0
           x1 ≥ 0                               −p1 + 2p2 ≤ 1
           x2 ≤ 0                               3p1 − p2 ≥ 2
           x3 free,                             3p2 + p3 = 3.

We transform the dual into an equivalent minimization problem, rename the variables from p1, p2, p3 to x1, x2, x3, and multiply the three last constraints by −1. The resulting problem is shown on the left. Then, on the right, we show its dual:

minimize −5x1 − 6x2 − 4x3            maximize −p1 − 2p2 − 3p3
subject to x1 free                   subject to p1 ≥ 0
           x2 ≥ 0                               p2 ≤ 0
           x3 ≤ 0                               p3 free
           x1 − 2x2 ≥ −1                        p1 − 3p2 = −5
           −3x1 + x2 ≤ −2                       −2p1 + p2 − 3p3 ≤ −6
           −3x2 − x3 = −3,                      −p3 ≥ −4.

We observe that the latter problem is equivalent to the primal problem we started with. (The first three constraints in the latter problem are the same as the first three constraints in the original problem, multiplied by −1. Also, if the maximization in the latter problem is changed to a minimization, by multiplying the objective function by −1, we obtain the cost function in the original problem.)

The first primal problem considered in Example 4.1 had all of the ingredients of a general linear programming problem. This suggests that the conclusion reached at the end of the example should hold in general. Indeed, we have the following result.

Theorem 4.1 If we transform the dual into an equivalent minimization problem and then form its dual, we obtain a problem equivalent to the original problem.

A compact statement that is often used to describe Theorem 4.1 is that "the dual of the dual is the primal." The proof of Theorem 4.1 needs nothing more than the steps followed in Example 4.1, with abstract symbols replacing specific numbers, and will therefore be omitted.

Any linear programming problem can be manipulated into one of several equivalent forms, for example, by introducing slack variables or by using the difference of two nonnegative variables to replace a single free variable. Each equivalent form leads to a somewhat different form for the dual problem. Nevertheless, the examples that follow indicate that the duals of equivalent problems are equivalent.
Example 4.2 Consider the primal problem shown on the left and its dual shown on the right:

minimize c'x              maximize p'b
subject to Ax ≥ b         subject to p ≥ 0
           x free,                   p'A = c'.

We transform the primal problem by introducing surplus variables and then obtain its dual:

minimize c'x + 0's        maximize p'b
subject to Ax − s = b     subject to p free
           x free                    p'A = c'
           s ≥ 0,                    p ≥ 0.

Alternatively, if we take the original primal problem and replace x by sign-constrained variables, we obtain the following pair of problems:

minimize c'x+ − c'x−           maximize p'b
subject to Ax+ − Ax− ≥ b       subject to p ≥ 0
           x+ ≥ 0                         p'A ≤ c'
           x− ≥ 0,                        −p'A ≤ −c'.

Note that we have three equivalent forms of the primal. We observe that the constraint p'A = c' is equivalent to the two constraints p'A ≤ c' and p'A ≥ c'. Thus, the duals of the three variants of the primal problem are also equivalent.

Example 4.3 Consider a standard form problem, assumed feasible:

minimize c'x
subject to Ax = b
           x ≥ 0,

and its dual:

maximize p'b
subject to p'A ≤ c'.

Let a_1', ..., a_m' be the rows of A and suppose that a_m' = Σ_{i=1}^{m−1} γ_i a_i' for some scalars γ_1, ..., γ_{m−1}. In particular, the last equality constraint is redundant and can be eliminated. By considering an arbitrary feasible solution x, we obtain

b_m = a_m'x = Σ_{i=1}^{m−1} γ_i a_i'x = Σ_{i=1}^{m−1} γ_i b_i.    (4.1)

Note that the dual constraints are of the form Σ_{i=1}^m p_i a_i' ≤ c' and can be rewritten as

Σ_{i=1}^{m−1} (p_i + γ_i p_m) a_i' ≤ c'.

Furthermore, using Eq. (4.1), the dual cost Σ_{i=1}^m p_i b_i is equal to

Σ_{i=1}^{m−1} (p_i + γ_i p_m) b_i.
If we now let q_i = p_i + γ_i p_m, we see that the dual problem is equivalent to

maximize Σ_{i=1}^{m−1} q_i b_i
subject to Σ_{i=1}^{m−1} q_i a_i' ≤ c'.

We observe that this is the exact same dual that we would have obtained if we had eliminated the last (and redundant) constraint in the primal problem, before forming the dual.

The conclusions of the preceding two examples are summarized and generalized by the following result.

Theorem 4.2 Suppose that we have transformed a linear programming problem Π1 to another linear programming problem Π2, by a sequence of transformations of the following types:
(a) Replace a free variable with the difference of two nonnegative variables.
(b) Replace an inequality constraint by an equality constraint involving a nonnegative slack variable.
(c) If some row of the matrix A in a feasible standard form problem is a linear combination of the other rows, eliminate the corresponding equality constraint.
Then, the duals of Π1 and Π2 are equivalent, i.e., they are either both infeasible, or they have the same optimal cost.

The proof of Theorem 4.2 involves a combination of the various steps in Examples 4.2 and 4.3, and is left to the reader.

4.3 The duality theorem

We saw in Section 4.1 that for problems in standard form, the cost g(p) of any dual solution provides a lower bound for the optimal cost. We now show that this property is true in general.

Theorem 4.3 (Weak duality) If x is a feasible solution to the primal problem and p is a feasible solution to the dual problem, then p'b ≤ c'x.
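Before turning to the proof, the inequality is easy to test on concrete data. A small pure-Python check for the standard form case (the helper name and the example data are our own):

```python
def duality_gap(c, A, b, x, p, tol=1e-9):
    """c'x - p'b for the standard form pair; the function first checks that
    x is primal feasible (Ax = b, x >= 0) and p is dual feasible (p'A <= c'),
    then returns the gap, which weak duality guarantees is nonnegative."""
    m, n = len(b), len(c)
    assert all(xj >= -tol for xj in x)
    assert all(abs(sum(A[i][j] * x[j] for j in range(n)) - b[i]) <= tol for i in range(m))
    assert all(sum(p[i] * A[i][j] for i in range(m)) <= c[j] + tol for j in range(n))
    return sum(c[j] * x[j] for j in range(n)) - sum(p[i] * b[i] for i in range(m))

# minimize x1 + 2*x2 subject to x1 + x2 = 1, x >= 0; x = (1, 0) is feasible,
# and p = (0.5,) is dual feasible since (0.5, 0.5) <= (1, 2) componentwise.
gap = duality_gap([1, 2], [[1, 1]], [1], [1, 0], [0.5])
```

The gap is 0.5 ≥ 0 here; when the gap is exactly zero, Corollary 4.2 below certifies that both solutions are optimal.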
Proof. For any vectors x and p, we define

u_i = p_i(a_i'x − b_i),     v_j = (c_j − p'A_j)x_j.

Suppose that x and p are primal and dual feasible, respectively. The definition of the dual problem requires the sign of p_i to be the same as the sign of a_i'x − b_i, and the sign of c_j − p'A_j to be the same as the sign of x_j. Thus, primal and dual feasibility imply that

u_i ≥ 0, for all i,     and     v_j ≥ 0, for all j.

Notice that

Σ_i u_i = p'Ax − p'b,

and

Σ_j v_j = c'x − p'Ax.

We add these two equalities and use the nonnegativity of u_i, v_j, to obtain

0 ≤ Σ_i u_i + Σ_j v_j = c'x − p'b.   □

The weak duality theorem is not a deep result, yet it does provide some useful information about the relation between the primal and the dual. We have, for example, the following corollary.

Corollary 4.1
(a) If the optimal cost in the primal is −∞, then the dual problem must be infeasible.
(b) If the optimal cost in the dual is +∞, then the primal problem must be infeasible.

Proof. Suppose that the optimal cost in the primal problem is −∞ and that the dual problem has a feasible solution p. By weak duality, p satisfies p'b ≤ c'x for every primal feasible x. Taking the minimum over all primal feasible x, we conclude that p'b ≤ −∞. This is impossible and shows that the dual cannot have a feasible solution, thus establishing part (a). Part (b) follows by a symmetrical argument. □

Another important corollary of the weak duality theorem is the following.
Corollary 4.2 Let x and p be feasible solutions to the primal and the dual, respectively, and suppose that p'b = c'x. Then, x and p are optimal solutions to the primal and the dual, respectively.

Proof. Let x and p be as in the statement of the corollary. For every primal feasible solution y, the weak duality theorem yields c'x = p'b ≤ c'y, which proves that x is optimal. The proof of optimality of p is similar. □

The next theorem is the central result on linear programming duality.

Theorem 4.4 (Strong duality) If a linear programming problem has an optimal solution, so does its dual, and the respective optimal costs are equal.

Proof. Consider the standard form problem

minimize c'x
subject to Ax = b
           x ≥ 0.

Let us assume temporarily that the rows of A are linearly independent and that there exists an optimal solution. Let us apply the simplex method to this problem. As long as cycling is avoided, e.g., by using the lexicographic pivoting rule, the simplex method terminates with an optimal solution x and an optimal basis B. Let x_B = B^{-1}b be the corresponding vector of basic variables. When the simplex method terminates, the reduced costs must be nonnegative and we obtain

c' − c_B'B^{-1}A ≥ 0',

where c_B is the vector with the costs of the basic variables. Let us define a vector p by letting p' = c_B'B^{-1}. We then have p'A ≤ c', which shows that p is a feasible solution to the dual problem

maximize p'b
subject to p'A ≤ c'.

In addition,

p'b = c_B'B^{-1}b = c_B'x_B = c'x.

It follows that p is an optimal solution to the dual (cf. Corollary 4.2), and the optimal dual cost is equal to the optimal primal cost.

If we are dealing with a general linear programming problem Π1 that has an optimal solution, we first transform it into an equivalent standard form problem Π2, with the same optimal cost, and in which the rows of the matrix A are linearly independent.
Let D1 and D2 be the duals of Π1 and Π2, respectively. By Theorem 4.2, the dual problems D1 and D2 have the same optimal cost. We have already proved that Π2 and D2 have the same optimal cost. It follows that Π1 and D1 have the same optimal cost (see Figure 4.1). □

Figure 4.1: Proof of the duality theorem for general linear programming problems. (The figure relates Π1 to Π2 as equivalent problems, invokes duality for the standard form problem Π2, and relates D2 to D1 as equivalent duals of equivalent problems.)

The preceding proof shows that an optimal solution to the dual problem is obtained as a byproduct of the simplex method as applied to a primal problem in standard form. It is based on the fact that the simplex method is guaranteed to terminate and this, in turn, depends on the existence of pivoting rules that prevent cycling. There is an alternative derivation of the duality theorem, which provides a geometric, algorithm-independent view of the subject, and which is developed in Section 4.7. At this point, we provide an illustration that conveys most of the content of the geometric proof.

Example 4.4 Consider a solid ball constrained to lie in a polyhedron defined by inequality constraints of the form a_i'x ≥ b_i. If left under the influence of gravity, this ball reaches equilibrium at the lowest corner x* of the polyhedron. This corner is an optimal solution to the problem

minimize c'x
subject to a_i'x ≥ b_i, for all i,

where c is a vertical vector pointing upwards. At equilibrium, gravity is counterbalanced by the forces exerted on the ball by the "walls" of the polyhedron. The latter forces are normal to the walls, that is, they are aligned with the vectors a_i. We conclude that c = Σ_i p_i a_i, for some nonnegative coefficients p_i;
in particular, the vector p is a feasible solution to the dual problem

maximize p'b
subject to p'A = c'
           p ≥ 0.

Given that forces can only be exerted by the walls that touch the ball, we must have p_i = 0 whenever a_i'x* > b_i; that is, p_i(b_i − a_i'x*) = 0 for all i. We therefore have

p'b = Σ_i p_i b_i = Σ_i p_i a_i'x* = c'x*.

It follows (Corollary 4.2) that p is an optimal solution to the dual, and the optimal dual cost is equal to the optimal primal cost.

Figure 4.2: A mechanical analogy of the duality theorem.

Recall that in a linear programming problem, exactly one of the following three possibilities will occur:
(a) There is an optimal solution.
(b) The problem is "unbounded"; that is, the optimal cost is −∞ (for minimization problems), or +∞ (for maximization problems).
(c) The problem is infeasible.
This leads to nine possible combinations for the primal and the dual, which are shown in Table 4.2. By the strong duality theorem, if one problem has an optimal solution, so does the other. Furthermore, as discussed earlier, the weak duality theorem implies that if one problem is unbounded, the other must be infeasible. This allows us to mark some of the entries in Table 4.2 as "impossible."
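The proof of Theorem 4.4 read off a dual optimal solution as p' = c_B'B^{-1} from the final basis. A small exact-arithmetic sketch of that computation (pure Python; the 2×2 inverse is hard-coded to keep the sketch dependency-free, and the data are a small standard form instance whose numbers reappear in Example 4.6 below):

```python
from fractions import Fraction

def dual_from_basis(c, A, basis):
    """p' = c_B' B^{-1} for a standard form problem with two constraints;
    `basis` holds the indices of the two basic columns."""
    i, j = basis
    B = [[Fraction(A[0][i]), Fraction(A[0][j])],
         [Fraction(A[1][i]), Fraction(A[1][j])]]
    det = B[0][0] * B[1][1] - B[0][1] * B[1][0]
    Binv = [[ B[1][1] / det, -B[0][1] / det],
            [-B[1][0] / det,  B[0][0] / det]]
    cB = [Fraction(c[i]), Fraction(c[j])]
    return [cB[0] * Binv[0][k] + cB[1] * Binv[1][k] for k in range(2)]

# minimize 13*x1 + 10*x2 + 6*x3 subject to 5*x1 + x2 + 3*x3 = 8,
# 3*x1 + x2 = 3, x >= 0; with x1 and x3 basic, the basic solution is x = (1, 0, 1).
p = dual_from_basis([13, 10, 6], [[5, 1, 3], [3, 1, 0]], (0, 2))
```

Here p = (2, 1): it satisfies p'A ≤ c', and p'b = 2·8 + 1·3 = 19 = c'x, so by Corollary 4.2 both solutions are optimal.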
                   Finite optimum   Unbounded    Infeasible
    Finite optimum   Possible       Impossible   Impossible
    Unbounded        Impossible     Impossible   Possible
    Infeasible       Impossible     Possible     Possible

Table 4.2: The different possibilities for the primal and the dual.

The case where both problems are infeasible can indeed occur, as shown by the following example.

Example 4.5 Consider the infeasible primal

    minimize    x1 + 2x2
    subject to  x1 + x2 = 1
                2x1 + 2x2 = 3.

Its dual is

    maximize    p1 + 3p2
    subject to  p1 + 2p2 = 1
                p1 + 2p2 = 2,

which is also infeasible.

There is another interesting relation between the primal and the dual which is known as Clark's theorem (Clark, 1961). It asserts that unless both problems are infeasible, at least one of them must have an unbounded feasible set (Exercise 4.21).

Complementary slackness

An important relation between primal and dual optimal solutions is provided by the complementary slackness conditions, which we present next.

Theorem 4.5 (Complementary slackness) Let x and p be feasible solutions to the primal and the dual problem, respectively. The vectors x and p are optimal solutions for the two respective problems if and only if:

    p_i (a_i'x − b_i) = 0,    for all i,
    (c_j − p'A_j) x_j = 0,    for all j.

Proof. In the proof of Theorem 4.3, we defined u_i = p_i (a_i'x − b_i) and v_j = (c_j − p'A_j) x_j, and noted that for x primal feasible and p dual feasible, we have u_i >= 0 and v_j >= 0 for all i and j. In addition, we showed that

    c'x − p'b = Σ_i u_i + Σ_j v_j.

By the strong duality theorem, if x and p are optimal, then c'x = p'b, which implies that u_i = v_j = 0 for all i, j. Conversely, if u_i = v_j = 0 for all i, j, then c'x = p'b, and Corollary 4.2 implies that x and p are optimal.

The first complementary slackness condition is automatically satisfied by every feasible solution to a problem in standard form. If the primal problem is not in standard form and has a constraint like a_i'x >= b_i, the corresponding complementary slackness condition asserts that the dual variable p_i is zero unless the constraint is active. An intuitive explanation is that a constraint which is not active at an optimal solution can be removed from the problem without affecting the optimal cost, and there is no point in associating a nonzero price with such a constraint. Note also the analogy with Example 4.4, where "forces" were only exerted by the active constraints.

If the primal problem is in standard form and a nondegenerate optimal basic feasible solution is known, the complementary slackness conditions determine a unique solution to the dual problem. We illustrate this fact in the next example.

Example 4.6 Consider a problem in standard form and its dual:

    minimize    13x1 + 10x2 + 6x3        maximize    8p1 + 3p2
    subject to  5x1 + x2 + 3x3 = 8       subject to  5p1 + 3p2 <= 13
                3x1 + x2       = 3                    p1 + p2   <= 10
                x1, x2, x3 >= 0                       3p1       <= 6.

As will be verified shortly, the vector x* = (1, 0, 1) is a nondegenerate optimal solution to the primal problem. Assuming this to be the case, we use the complementary slackness conditions to construct the optimal solution to the dual. The condition p_i (a_i'x* − b_i) = 0 is automatically satisfied for each i, since the primal is in standard form. The condition (c_j − p'A_j) x_j = 0 is clearly satisfied for j = 2, because x2 = 0. However, since x1 > 0 and x3 > 0, we obtain

    5p1 + 3p2 = 13    and    3p1 = 6,

which we can solve to obtain p1 = 2 and p2 = 1. Note that this is a dual feasible solution whose cost is equal to 19, which is the same as the cost of x*. This verifies that x* is indeed an optimal solution as claimed earlier.

We now generalize the above example. Suppose that x_j is a basic variable in a nondegenerate optimal basic feasible solution to a primal problem in standard form. Then, the complementary slackness condition (c_j − p'A_j) x_j = 0 yields p'A_j = c_j for every such j. Since the basic columns A_j are linearly independent, we obtain a system of equations for p which has a unique solution, namely p' = c_B'B^{-1}. A similar conclusion can also be drawn for problems not in standard form (Exercise 4.12). On the other hand, if we are given a degenerate optimal basic feasible solution to the primal, complementary slackness may be of very little help in determining an optimal solution to the dual problem (Exercise 4.17).

We finally mention that if the primal constraints are of the form Ax >= b, x >= 0, and the primal problem has an optimal solution, then there exist optimal solutions to the primal and the dual which satisfy strict complementary slackness; that is, a variable in one problem is nonzero if and only if the corresponding constraint in the other problem is active (Exercise 4.20). This result has some interesting applications in discrete optimization, but these lie outside the scope of this book.

A geometric view

We now develop a geometric view that allows us to visualize pairs of primal and dual vectors without having to draw the dual feasible set. We consider the primal problem

    minimize    c'x
    subject to  a_i'x >= b_i,    i = 1, ..., m,

where the dimension of x is equal to n. We assume that the vectors a_i span R^n. The corresponding dual problem is

    maximize    p'b
    subject to  Σ_{i=1}^m p_i a_i = c,
                p >= 0.

Let I be a subset of {1, ..., m} of cardinality n, such that the vectors a_i, i in I, are linearly independent. The system a_i'x = b_i, i in I, has a unique solution, denoted by x^I, which is a basic solution to the primal problem (cf. Definition 2.9 in Section 2.2). We assume that x^I is nondegenerate, that is, a_i'x^I is different from b_i for i not in I.

Let p be a vector in R^m (not necessarily dual feasible), and let us consider what is required for x^I and p to be optimal solutions to the primal and the dual problem, respectively. We need:
(a) a_i'x^I >= b_i, for all i,        (primal feasibility),
(b) p_i = 0, for all i not in I,      (complementary slackness),
(c) Σ_{i=1}^m p_i a_i = c,            (dual feasibility),
(d) p >= 0,                           (dual feasibility).

Given the complementary slackness condition (b), condition (c) becomes Σ_{i in I} p_i a_i = c. Since the vectors a_i, i in I, are linearly independent, the latter equation has a unique solution that we denote by p^I. In fact, it is readily seen that the vectors a_i, i in I, form a basis for the dual problem (which is in standard form) and p^I is the associated basic solution. For the vector p^I to be dual feasible, we also need it to be nonnegative. We conclude that once the complementary slackness condition (b) is enforced, feasibility of the resulting dual vector p^I is equivalent to c being a nonnegative linear combination of the vectors a_i, i in I.

Figure 4.3: Consider a primal problem with two variables and five inequality constraints (n = 2, m = 5), and suppose that no two of the vectors a_i are collinear. Every two-element subset I of {1, ..., 5} determines basic solutions x^I and p^I of the primal and the dual, respectively. If I = {1, 2}, x^I is primal infeasible (point A) and p^I is dual infeasible, because c cannot be expressed as a nonnegative linear combination of the vectors a1 and a2. If I = {1, 3}, x^I is primal feasible (point B) and p^I is dual infeasible. If I = {1, 4}, x^I is primal feasible (point C) and p^I is dual feasible; in particular, x^I and p^I are optimal, because c can be expressed as a nonnegative linear combination of the vectors a1 and a4. If I = {1, 5}, x^I is primal infeasible (point D) and p^I is dual feasible.
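For a standard form problem, the complementary slackness construction amounts to solving a square linear system for p. The sketch below (our own code, using the data of Example 4.6 and assuming NumPy is available) recovers the dual optimum from the nondegenerate primal optimum x* = (1, 0, 1):

```python
import numpy as np

# Data of Example 4.6: minimize c'x subject to Ax = b, x >= 0.
A = np.array([[5.0, 1.0, 3.0],
              [3.0, 1.0, 0.0]])
b = np.array([8.0, 3.0])
c = np.array([13.0, 10.0, 6.0])
x_star = np.array([1.0, 0.0, 1.0])          # nondegenerate optimal solution

# x1 > 0 and x3 > 0 force dual constraints 1 and 3 to be active:
# 5*p1 + 3*p2 = 13 and 3*p1 = 6.
basic = [0, 2]
p = np.linalg.solve(A[:, basic].T, c[basic])

print(p)                                    # the dual optimum (2, 1)
print(c @ x_star, p @ b)                    # both costs equal 19
assert (A.T @ p <= c + 1e-9).all()          # p is indeed dual feasible
```

The final check confirms dual feasibility, so equality of the two costs certifies optimality of both x* and p.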
This allows us to visualize dual feasibility without having to draw the dual feasible set. Still, if p^I is dual feasible (i.e., all p_i are nonnegative) and if x^I is primal feasible, then they are both optimal, because we have been enforcing complementary slackness and Theorem 4.5 applies.

If x* is a degenerate basic solution to the primal, there can be several subsets I such that x^I = x*. Using different choices for I, we may obtain several dual basic solutions p^I. It may then well be the case that some of them are dual feasible and some are not; see Figure 4.4. This theme will be revisited, in more depth, in Chapter 5.

Figure 4.4: The vector x* is a degenerate basic feasible solution of the primal. If we choose I = {1, 2}, the resulting dual basic solution p^I is infeasible, because c is not a nonnegative linear combination of a1, a2. On the other hand, if we choose I = {1, 3} or I = {2, 3}, the resulting dual basic solution p^I is feasible and, therefore, optimal.

4.4 Optimal dual variables as marginal costs

In this section, we elaborate on the interpretation of the dual variables as prices. Consider the standard form problem

    minimize    c'x
    subject to  Ax = b,
                x >= 0.

We assume that the rows of A are linearly independent and that there is a nondegenerate basic feasible solution x* which is optimal. Let B be the corresponding basis matrix and let x_B = B^{-1}b be the vector of basic variables, which is positive, by nondegeneracy.

Let us now replace b by b + d, where d is a small perturbation vector. Since B^{-1}b > 0, we also have B^{-1}(b + d) > 0, as long as d is small. This implies that the same basis leads to a basic feasible solution of the perturbed problem as well. Perturbing the right-hand side vector b has no effect on the reduced costs associated with this basis. By the optimality of x* in the original problem, the vector of reduced costs c' − c_B'B^{-1}A is nonnegative, and this establishes that the same basis is optimal for the perturbed problem as well. Thus, the optimal cost in the perturbed problem is

    c_B'B^{-1}(b + d) = p'(b + d),

where p' = c_B'B^{-1} is an optimal solution to the dual problem. Therefore, a small change of d in the right-hand side vector b results in a change of p'd in the optimal cost. We conclude that each component p_i of the optimal dual vector can be interpreted as the marginal cost (or shadow price) per unit increase of the ith requirement b_i.

We conclude with yet another interpretation of duality. In order to develop some concrete intuition, we phrase our discussion in terms of the diet problem (Example 1.3 in Section 1.1). We interpret each vector A_j as the nutritional content of the jth available food, and view b as the nutritional content of an ideal food that we wish to synthesize. Let us interpret p_i as the "fair" price per unit of the ith nutrient. A unit of the jth food has a value of c_j at the food market, but it also has a value of p'A_j if priced at the nutrient market. Thus, duality is concerned with two alternative ways of cost accounting. The value of the ideal food, as computed in the food market, is c'x*, where x* is an optimal solution to the primal problem; the value of the ideal food, as computed in the nutrient market, is p'b. The duality relation c'x* = p'b states that when prices are chosen appropriately, the two accounting methods should give the same results. Complementary slackness asserts that every food which is used (at a nonzero level) to synthesize the ideal food should be consistently priced at the two markets.

4.5 Standard form problems and the dual simplex method

In this section, we concentrate on the case where the primal problem is in standard form. We develop the dual simplex method, which is an alternative to the simplex method of Chapter 3. We also comment on the relation between the basic feasible solutions to the primal and the dual, including a discussion of dual degeneracy.
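The marginal cost interpretation can be sanity-checked numerically: a small change d in b should change the optimal cost by p'd, as long as the optimal basis does not change. The following sketch (our own illustration, assuming SciPy; it reuses the data of Example 4.6, whose optimal dual vector is p = (2, 1)) perturbs b1 by 0.1 and observes the predicted cost change of 0.2:

```python
import numpy as np
from scipy.optimize import linprog

# Example 4.6 again: minimize c'x subject to Ax = b, x >= 0.
A = np.array([[5.0, 1.0, 3.0], [3.0, 1.0, 0.0]])
b = np.array([8.0, 3.0])
c = np.array([13.0, 10.0, 6.0])
p = np.array([2.0, 1.0])                    # optimal dual vector

base = linprog(c, A_eq=A, b_eq=b, bounds=(0, None))
delta = 0.1                                 # small increase of b1
perturbed = linprog(c, A_eq=A, b_eq=b + np.array([delta, 0.0]),
                    bounds=(0, None))

# The optimal cost moves from 19 to 19.2, i.e., by p1 * delta, as predicted.
print(base.fun, perturbed.fun, p[0] * delta)
```

For a perturbation large enough to force a basis change, the shadow-price prediction would only be a one-sided bound, which is why nondegeneracy and a small d are assumed above.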
In the proof of the strong duality theorem, we considered the simplex method applied to a primal problem in standard form, and defined a dual vector p by letting p' = c_B'B^{-1}. We then noted that the primal optimality condition c' − c_B'B^{-1}A >= 0' is the same as the dual feasibility condition p'A <= c'. We can thus think of the simplex method as an algorithm that maintains primal feasibility and works towards dual feasibility. A method with this property is generally called a primal algorithm. An alternative is to start with a dual feasible solution and work towards primal feasibility. A method of this type is called a dual algorithm.

In this section, we present a dual simplex method, implemented in terms of the full tableau. We argue that it does indeed solve the dual problem, and we show that it moves from one basic feasible solution of the dual problem to another. An alternative implementation that only keeps track of the matrix B^{-1}, instead of the entire tableau, is called a revised dual simplex method (Exercise 4.23).

The dual simplex method

Let us consider a problem in standard form, under the usual assumption that the rows of the matrix A are linearly independent. Let B be a basis matrix, consisting of m linearly independent columns of A, and consider the corresponding tableau

    −c_B'x_B |   c̄_1   ...   c̄_n
    ---------+--------------------------
     x_B(1)  |
       ...   |  B^{-1}A_1 ... B^{-1}A_n
     x_B(m)  |

We do not require B^{-1}b to be nonnegative, which means that we have a basic, but not necessarily feasible, solution to the primal problem. However, we assume that c̄ >= 0; equivalently, the vector p' = c_B'B^{-1} satisfies p'A <= c', and we have a feasible solution to the dual problem. The cost of this dual feasible solution is p'b = c_B'B^{-1}b = c_B'x_B, which is the negative of the entry at the upper left corner of the tableau. If the inequality B^{-1}b >= 0 happens to hold, we also have a primal feasible solution with the same cost, and optimal solutions to both problems have been found. If the inequality B^{-1}b >= 0 fails to hold, we perform a change of basis in a manner we describe next.

We find some ℓ such that x_B(ℓ) < 0 and consider the ℓth row of the tableau, called the pivot row; this row is of the form (x_B(ℓ), v_1, ..., v_n), where v_i is the ℓth component of B^{-1}A_i. For each i with v_i < 0 (if such i exist), we form the ratio c̄_i / |v_i| and let j be an index for which this ratio is smallest; that is, v_j < 0 and

    c̄_j / |v_j| = min_{i : v_i < 0} c̄_i / |v_i|.        (4.2)

We call the corresponding entry v_j the pivot element. (Note that x_j must be a nonbasic variable, since the jth column in the tableau contains the negative element v_j.) We then perform a change of basis: column A_j enters the basis and column A_B(ℓ) exits. This change of basis (or pivot) is effected exactly as in the primal simplex method: we add to each row of the tableau a multiple of the pivot row so that all entries in the pivot column are set to zero, with the exception of the pivot element, which is set to 1. In particular, in order to set the reduced cost in the pivot column to zero, we multiply the pivot row by c̄_j / |v_j| and add it to the zeroth row. For every i, the new value of c̄_i is equal to

    c̄_i + (c̄_j / |v_j|) v_i,

which is nonnegative because of the way that j was selected [cf. Eq. (4.2)]. We conclude that the reduced costs in the new tableau will also be nonnegative, and dual feasibility has been maintained.

Example 4.7 Consider the tableau

           x1    x2    x3    x4    x5
      0  |  2     6    10     0     0
      2  | -2     4     1     1     0
     -1  |  4    -2*   -3     0     1

Since x_B(2) < 0, we choose the second row to be the pivot row. Negative entries of the pivot row are found in the second and third columns. We compare the corresponding ratios 6/|−2| and 10/|−3|. The smallest ratio is 6/|−2| and, therefore, the second column enters the basis. (The pivot element is indicated by an asterisk.) We multiply the pivot row by 3 and add it to the zeroth row. We multiply the pivot row by 2 and add it to the first row. We then divide the pivot row by −2. The new tableau is

           x1    x2    x3    x4    x5
     -3  | 14     0     1     0     3
      0  |  6     0    -5     1     2
     1/2 | -2     1    3/2    0   -1/2
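A single dual simplex pivot is mechanical enough to script. The code below is our own sketch (assuming NumPy) of the pivot rules just described, applied to the full tableau of Example 4.7; choosing the row with the most negative x_B entry is our tie-breaking convention, since the method only requires some x_B(ℓ) < 0.

```python
import numpy as np

# Full tableau of Example 4.7: zeroth row on top, zeroth column on the left.
T = np.array([
    [ 0.0,  2.0,  6.0, 10.0, 0.0, 0.0],   # -cost  | reduced costs
    [ 2.0, -2.0,  4.0,  1.0, 1.0, 0.0],   # x_B(1) | row of B^{-1}A
    [-1.0,  4.0, -2.0, -3.0, 0.0, 1.0],   # x_B(2) | row of B^{-1}A
])

def dual_simplex_pivot(T):
    """Perform one dual simplex pivot on a full tableau, in place."""
    ell = 1 + int(np.argmin(T[1:, 0]))      # pivot row: most negative x_B
    v = T[ell, 1:]
    ratios = np.full(v.shape, np.inf)
    neg = v < 0
    ratios[neg] = T[0, 1:][neg] / (-v[neg]) # cbar_i / |v_i| for v_i < 0
    j = 1 + int(np.argmin(ratios))          # entering column: smallest ratio
    T[ell] = T[ell] / T[ell, j]             # pivot element becomes 1
    for i in range(T.shape[0]):
        if i != ell:
            T[i] = T[i] - T[i, j] * T[ell]  # clear the rest of the pivot column
    return T

dual_simplex_pivot(T)
print(T)     # reproduces the post-pivot tableau; the corner entry is now -3
```

Running this reproduces the second tableau of Example 4.7, with the corner entry at −3, i.e., a dual cost of 3.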
The cost has increased to 3.

Note that the pivot element v_j is always chosen to be negative, whereas the corresponding reduced cost c̄_j is nonnegative. Let us temporarily assume that c̄_j is in fact positive. Then, in order to replace c̄_j by zero, we need to add a positive multiple of the pivot row to the zeroth row. Since x_B(ℓ) is negative, this has the effect of adding a negative quantity to the upper left corner; equivalently, the dual cost increases. Thus, as long as the reduced cost of every nonbasic variable is nonzero, the dual cost increases with each basis change, and no basis will ever be repeated in the course of the algorithm. It follows that the algorithm must eventually terminate, and this can happen in one of two ways:

(a) We have B^{-1}b >= 0 and an optimal solution has been found.
(b) All of the entries v_1, ..., v_n in the pivot row are nonnegative and we are therefore unable to locate a pivot element. In full analogy with the primal simplex method, this implies that the optimal dual cost is equal to +∞ and the primal problem is infeasible; the proof is left as an exercise (Exercise 4.22).

We now provide a summary of the algorithm.

An iteration of the dual simplex method

1. A typical iteration starts with the tableau associated with a basis matrix B and with all reduced costs nonnegative.
2. Examine the components of the vector B^{-1}b in the zeroth column of the tableau. If they are all nonnegative, we have an optimal basic feasible solution and the algorithm terminates; else, choose some ℓ such that x_B(ℓ) < 0.
3. Consider the ℓth row of the tableau, with elements x_B(ℓ), v_1, ..., v_n (the pivot row). If v_i >= 0 for all i, then the optimal dual cost is +∞ and the algorithm terminates.
4. For each i such that v_i < 0, compute the ratio c̄_i / |v_i| and let j be the index of a column that corresponds to the smallest ratio. The column A_B(ℓ) exits the basis and the column A_j takes its place.
5. Add to each row of the tableau a multiple of the ℓth row (the pivot row) so that v_j (the pivot element) becomes 1 and all other entries of the pivot column become 0.

Let us now consider the possibility that the reduced cost c̄_j in the pivot column is zero. In this case, the zeroth row of the tableau does not change and the dual cost c_B'B^{-1}b remains the same. The proof of termination given earlier does not apply, and the algorithm can cycle. This can be avoided by employing a suitable anticycling rule, such as the following.

Lexicographic pivoting rule for the dual simplex method

1. Choose any row ℓ such that x_B(ℓ) < 0, to be the pivot row.
2. Determine the index j of the entering column as follows. For each column with v_i < 0, divide all entries by |v_i|, and then choose the lexicographically smallest column. If there is a tie between several lexicographically smallest columns, choose the one with the smallest index.

If the dual simplex method is initialized so that every column of the tableau [that is, each vector (c̄_j, B^{-1}A_j)] is lexicographically positive, and if the above lexicographic pivoting rule is used, the method terminates in a finite number of steps. The proof is similar to the proof of the corresponding result for the primal simplex method (Theorem 3.4) and is left as an exercise (Exercise 4.24).

When should we use the dual simplex method

At this point, it is natural to ask when the dual simplex method should be used. One such case arises when a basic feasible solution of the dual problem is readily available. Suppose, for example, that we already have an optimal basis for some linear programming problem, and that we wish to solve the same problem for a different choice of the right-hand side vector b. The optimal basis for the original problem may be primal infeasible under the new value of b. On the other hand, a change in b does not affect the reduced costs and we still have a dual feasible solution. Thus, instead of solving the new problem from scratch, it may be preferable to apply the dual simplex algorithm starting from the optimal basis for the original problem. This idea will be considered in more detail in Chapter 5.

The geometry of the dual simplex method

Our development of the dual simplex method was based entirely on tableau manipulations and algebraic arguments. We now present an alternative viewpoint based on geometric considerations. We continue assuming that we are dealing with a problem in standard form and that the matrix A has linearly independent rows. Let B be a basis matrix with columns A_B(1), ..., A_B(m). This basis matrix determines a basic solution to the primal problem with x_B = B^{-1}b. The same basis can also be used to determine a dual vector p by means of the equations

    p'A_B(i) = c_B(i),    i = 1, ..., m.
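The iteration summarized above can also be condensed into a small "revised"-style implementation that recomputes B^{-1} at every step. This is our own teaching sketch (assuming NumPy), without the anticycling safeguards a production code would need; it is run on the standard form instance of Example 4.8 below, starting from the dual feasible basis (x3, x4):

```python
import numpy as np

def dual_simplex(A, b, c, basis, tol=1e-9, max_iter=100):
    """Dual simplex sketch for min c'x s.t. Ax = b, x >= 0.

    `basis` must be dual feasible (all reduced costs nonnegative).
    Returns an optimal (x, cost); raises if the primal is infeasible.
    """
    A, b, c = (np.asarray(M, dtype=float) for M in (A, b, c))
    n = A.shape[1]
    basis = list(basis)
    for _ in range(max_iter):
        B_inv = np.linalg.inv(A[:, basis])
        x_B = B_inv @ b
        if (x_B >= -tol).all():              # primal feasible: optimal
            x = np.zeros(n)
            x[basis] = x_B
            return x, float(c @ x)
        ell = int(np.argmin(x_B))            # leaving row, x_B(ell) < 0
        v = (B_inv @ A)[ell]                 # pivot row
        if (v >= -tol).all():
            raise ValueError("optimal dual cost is +inf: primal infeasible")
        cbar = c - (c[basis] @ B_inv) @ A    # reduced costs (stay >= 0)
        ratios = np.full(n, np.inf)
        neg = v < -tol
        ratios[neg] = cbar[neg] / (-v[neg])
        basis[ell] = int(np.argmin(ratios))  # entering column
    raise RuntimeError("iteration limit exceeded")

# Standard form instance of Example 4.8, started from the basis (x3, x4).
A = [[1.0, 2.0, -1.0, 0.0],
     [1.0, 0.0, 0.0, -1.0]]
x, cost = dual_simplex(A, b=[2.0, 1.0], c=[1.0, 1.0, 0.0, 0.0], basis=[2, 3])
print(x, cost)    # optimal solution (1, 1/2, 0, 0) with cost 3/2
```

Recomputing B^{-1} from scratch is wasteful; a real revised implementation would update it with elementary row operations at each pivot (cf. Exercise 4.23).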
These are m equations in m unknowns; since the columns A_B(1), ..., A_B(m) are linearly independent, there is a unique solution p. For such a vector p, the number of linearly independent active dual constraints is equal to the dimension of the dual vector, and it follows that we have a basic solution to the dual problem. In matrix notation, the dual basic solution p satisfies p'B = c_B', or p' = c_B'B^{-1}, which was referred to as the vector of simplex multipliers in Chapter 3. If p is also dual feasible, that is, if p'A <= c', then p is a basic feasible solution of the dual problem.

To summarize, a basis matrix B is associated with a basic solution to the primal problem and also with a basic solution to the dual. A basic solution to the primal (respectively, dual) which is primal (respectively, dual) feasible, is a basic feasible solution to the primal (respectively, dual).

We now have a geometric interpretation of the dual simplex method: at every iteration, we have a basic feasible solution to the dual problem. The basic feasible solutions obtained at any two consecutive iterations have m − 1 linearly independent active constraints in common (the reduced costs of the m − 1 variables that are common to both bases are zero); thus, consecutive basic feasible solutions are either adjacent or they coincide.

Example 4.8 Consider the following standard form problem and its dual:

    minimize    x1 + x2                  maximize    2p1 + p2
    subject to  x1 + 2x2 − x3 = 2        subject to  p1 + p2 <= 1
                x1 − x4 = 1                           2p1 <= 1
                x1, x2, x3, x4 >= 0                   p1, p2 >= 0.

If we eliminate the variables x3 and x4, we obtain the equivalent problem

    minimize    x1 + x2
    subject to  x1 + 2x2 >= 2
                x1 >= 1
                x1, x2 >= 0.

The feasible sets of the equivalent primal problem and of the dual are shown in Figures 4.5(a) and 4.5(b).

Figure 4.5: The feasible sets in Example 4.8.

There is a total of five different bases in the standard form primal problem and five different basic solutions. These correspond to the points A, B, C, D, and E in Figure 4.5(a). The same five bases also lead to five basic solutions to the dual problem, which are points A, B, C, D, and E in Figure 4.5(b). For example, if we choose the columns A3 and A4 to be the basic columns, we have the infeasible primal basic solution x = (0, 0, −2, −1) (point A). The corresponding dual basic solution is obtained by letting p'A3 = c3 = 0 and p'A4 = c4 = 0, which yields p = (0, 0). This is a basic feasible solution of the dual problem and can be used to start the dual simplex method. The associated initial tableau is

           x1    x2    x3    x4
      0  |  1     1     0     0
     -2  | -1    -2*    1     0
     -1  | -1     0     0     1

We carry out two iterations of the dual simplex method to obtain the following two tableaux:

           x1    x2    x3    x4               x1    x2    x3    x4
     -1  | 1/2    0    1/2    0        -3/2 |  0     0    1/2   1/2
      1  | 1/2    1   -1/2    0         1/2 |  0     1   -1/2   1/2
     -1  | -1*    0     0     1          1  |  1     0     0    -1

This sequence of tableaux corresponds to the path A-B-C in either figure. In the primal space, the path traces a sequence of infeasible basic solutions until, at optimality, it becomes feasible. In the dual space, the algorithm behaves exactly like the primal simplex method: it moves through a sequence of (dual) basic feasible solutions, while at each step improving the cost function.

Having observed that the dual simplex method moves from one basic feasible solution of the dual to an adjacent one, it may be tempting to say that the dual simplex method is simply the primal simplex method applied to the dual. This is a somewhat ambiguous statement, however, because the dual problem is not in standard form. If we were to convert it to standard form and then apply the primal simplex method, the resulting method is not necessarily identical to the dual simplex method (Exercise 4.25). A more accurate statement is to simply say that the dual simplex method is a variant of the simplex method tailored to problems defined exclusively in terms of linear inequality constraints.

Duality and degeneracy

Let us keep assuming that we are dealing with a standard form problem in which the rows of the matrix A are linearly independent. Any basis matrix B leads to an associated dual basic solution given by p' = c_B'B^{-1}. At this basic solution, the dual constraint p'A_j = c_j is active if and only if c_B'B^{-1}A_j = c_j, that is, if and only if the reduced cost c̄_j is zero. Since p is m-dimensional, dual degeneracy amounts to having more than m reduced costs that are zero. Given that the reduced costs of the m basic variables must be zero, dual degeneracy is obtained whenever there exists a nonbasic variable whose reduced cost is zero.

The example that follows deals with the relation between basic solutions to the primal and the dual in the face of degeneracy.

Example 4.9 Consider the following standard form problem and its dual:

    minimize    3x1 + x2                maximize    2p1
    subject to  x1 + x2 − x3 = 2        subject to  p1 + 2p2 <= 3
                2x1 − x2 − x4 = 0                    p1 − p2 <= 1
                x1, x2, x3, x4 >= 0                  p1, p2 >= 0.

We eliminate x3 and x4 to obtain the equivalent primal problem

    minimize    3x1 + x2
    subject to  x1 + x2 >= 2
                2x1 − x2 >= 0
                x1, x2 >= 0.

The feasible set of the equivalent primal and of the dual is shown in Figures 4.6(a) and 4.6(b).

Figure 4.6: The feasible sets in Example 4.9.

There is a total of six different bases in the standard form primal problem, but only four different basic solutions [points A, B, C, D in Figure 4.6(a)]. In the dual problem, however, the six bases lead to six distinct basic solutions [points A, A', A'', B, C, D in Figure 4.6(b)].

For example, if we let columns A3 and A4 be basic, the primal basic solution has x1 = x2 = 0 and the corresponding dual basic solution is (p1, p2) = (0, 0), which is a basic feasible solution of the dual problem, namely, point A in Figure 4.6(b). If we let columns A1 and A3 be basic, the primal basic solution again has x1 = x2 = 0; the equations p'A1 = c1 and p'A3 = c3 yield (p1, p2) = (0, 3/2), which is a basic feasible solution of the dual, namely, point A' in Figure 4.6(b). Finally, if we let columns A2 and A3 be basic, we still have the same primal solution; the equations p'A2 = c2 and p'A3 = c3 yield (p1, p2) = (0, −1), which is an infeasible basic solution to the dual, namely, point A'' in Figure 4.6(b).

Example 4.9 has established that different bases may lead to the same basic solution for the primal problem, but to different basic solutions for the dual. Furthermore, out of the different basic solutions to the dual problem, it may be that some are feasible and some are infeasible.

We conclude with a summary of some properties of bases and basic solutions, for standard form problems, that were discussed in this section.
(a) Every basis determines a basic solution to the primal, but also a corresponding basic solution to the dual, namely, p' = c_B'B^{-1}.
(b) This dual basic solution is feasible if and only if all of the reduced costs are nonnegative.
(c) Under this dual basic solution, the reduced costs that are equal to zero correspond to active constraints in the dual problem.
(d) This dual basic solution is degenerate if and only if some nonbasic variable has zero reduced cost.
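The bookkeeping in Example 4.9 is easy to reproduce: for each basis, the dual basic solution solves B'p = c_B, and dual feasibility is the check p'A <= c' (which here also encodes p >= 0, through the slack columns). A sketch with the Example 4.9 data, assuming NumPy:

```python
import numpy as np

# Standard form data of Example 4.9: min 3*x1 + x2 subject to
# x1 + x2 - x3 = 2,  2*x1 - x2 - x4 = 0,  x >= 0.
A = np.array([[1.0,  1.0, -1.0,  0.0],
              [2.0, -1.0,  0.0, -1.0]])
c = np.array([3.0, 1.0, 0.0, 0.0])

def dual_basic_solution(cols):
    B = A[:, list(cols)]
    return np.linalg.solve(B.T, c[list(cols)])   # solves p'B = c_B'

# Three bases that all give the primal basic solution x1 = x2 = 0.
for cols in [(2, 3), (0, 2), (1, 2)]:
    p = dual_basic_solution(cols)
    feasible = bool((A.T @ p <= c + 1e-9).all())
    print(cols, p, feasible)
```

The three dual vectors (0, 0), (0, 3/2), and (0, −1) match points A, A', and A'' of the example; only the last one is infeasible.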
4.6 Farkas' lemma and linear inequalities

Suppose that we wish to determine whether a given system of linear inequalities is infeasible. In this section, we approach this question using duality theory, and we show that infeasibility of a given system of linear inequalities is equivalent to the feasibility of another, related, system of linear inequalities. Intuitively, the latter system of linear inequalities can be interpreted as a search for a certificate of infeasibility for the former system.

To be more specific, consider a set of standard form constraints Ax = b and x >= 0. Suppose that there exists some vector p such that p'A >= 0' and p'b < 0. Then, for any x >= 0, we have p'Ax >= 0, and since p'b < 0, it follows that p'Ax is different from p'b. We conclude that Ax is different from b for all x >= 0, that is, the standard form constraints cannot have any feasible solution, and such a vector p is a certificate of infeasibility. Farkas' lemma below states that whenever a standard form problem is infeasible, such a certificate of infeasibility p is guaranteed to exist.

Theorem 4.6 (Farkas' lemma) Let A be a matrix of dimensions m x n and let b be a vector in R^m. Then, exactly one of the following two alternatives holds:
(a) There exists some x >= 0 such that Ax = b.
(b) There exists some vector p such that p'A >= 0' and p'b < 0.

Proof. One direction is easy. If there exists some x >= 0 satisfying Ax = b, and if p'A >= 0', then p'b = p'Ax >= 0, which shows that the second alternative cannot hold.

Let us now assume that there exists no vector x >= 0 satisfying Ax = b. Consider the pair of problems

    maximize    0'x            minimize    p'b
    subject to  Ax = b         subject to  p'A >= 0',
                x >= 0,

and note that the first is the dual of the second. The maximization problem is infeasible, which implies that the minimization problem is either unbounded (the optimal cost is −∞) or infeasible. Since p = 0 is a feasible solution to the minimization problem, it follows that the minimization problem is unbounded. Therefore, there exists some p which is feasible, that is, p'A >= 0', and whose cost is negative, that is, p'b < 0.

We now provide a geometric illustration of Farkas' lemma (see Figure 4.7). Let A1, ..., An be the columns of the matrix A and note that Ax = Σ_{i=1}^n A_i x_i. Therefore, the existence of a vector x >= 0 satisfying Ax = b is the same as requiring that b lie in the set of all nonnegative linear combinations of the vectors A1, ..., An, which is the shaded region in Figure 4.7. If b does not belong to the shaded region (in which case the first alternative in Farkas' lemma does not hold), we expect intuitively that we can find a vector p and an associated hyperplane {z | p'z = 0} such that b lies on one side of the hyperplane while the shaded region lies on the other side. We then have p'b < 0 and p'A_i >= 0 for all i, and the second alternative holds.

Finally, there is an equivalent statement of Farkas' lemma which is sometimes more convenient.

Corollary 4.3 Let A1, ..., An and b be given vectors and suppose that any vector p that satisfies p'A_i >= 0, i = 1, ..., n, must also satisfy p'b >= 0. Then, b can be expressed as a nonnegative linear combination of the vectors A1, ..., An.

Our next result is of a similar character.

Theorem 4.7 Suppose that the system of linear inequalities Ax <= b has at least one solution, and let d be some scalar. Then, the following are equivalent:
(a) Every feasible solution to the system Ax <= b satisfies c'x <= d.
(b) There exists some p >= 0 such that p'A = c' and p'b <= d.

Proof. Consider the following pair of problems

    maximize    c'x           minimize    p'b
    subject to  Ax <= b,      subject to  p'A = c',
                                          p >= 0,

and note that the first is the dual of the second. If the system Ax <= b has a feasible solution and if every feasible solution satisfies c'x <= d, then the first problem has an optimal solution and the optimal cost is bounded above by d. By the strong duality theorem, the second problem also has an optimal solution p whose cost is bounded above by d. This optimal solution satisfies p'A = c', p >= 0, and p'b <= d.

Conversely, if some p satisfies p'A = c', p >= 0, and p'b <= d, then the weak duality theorem asserts that every feasible solution to the first problem must also satisfy c'x <= d.

Results such as Theorems 4.6 and 4.7 are often called theorems of the alternative. Farkas' lemma predates the development of linear programming, but duality theory leads to a simple proof. A different proof, based on the geometric argument we just gave, is provided in the next section.
There are several more results of this type; see, for example, Exercises 4.26, 4.27, and 4.28.

Figure 4.7: If the vector b does not belong to the set of all nonnegative linear combinations of A₁, …, Aₙ, then we can find a hyperplane {z | p'z = 0} that separates it from that set.

Applications of Farkas' lemma to asset pricing

Consider a market that operates for a single period, and in which n different assets are traded. Depending on the events during that single period, there are m possible states of nature at the end of the period. If we invest one dollar in some asset i and the state of nature turns out to be s, we receive a payoff of r_si. Thus, each asset i is described by a payoff vector (r_1i, …, r_mi). The following m × n payoff matrix gives the payoffs of each of the n assets for each of the m states of nature:

R = [ r_11 … r_1n ; … ; r_m1 … r_mn ].

Let x_i be the amount held of asset i. A portfolio of assets is then a vector x = (x_1, …, x_n). The components of a portfolio can be either positive or negative. A positive value of x_i indicates that one has bought x_i units of asset i and is thus entitled to receive r_si x_i if state s materializes. A negative value of x_i indicates a "short" position in asset i: this amounts to selling |x_i| units of asset i at the beginning of the period, with a promise to buy them back at the end. Hence, one must pay out r_si |x_i| if state s occurs, which is the same as receiving a payoff of r_si x_i.

The wealth in state s that results from a portfolio x is given by

w_s = Σ_{i=1}^n r_si x_i.

We introduce the vector w = (w_1, …, w_m), and we obtain w = Rx. Let p_i be the price of asset i in the beginning of the period, and let p = (p_1, …, p_n) be the vector of asset prices. Then, the cost of acquiring a portfolio x is given by p'x.

The central problem in asset pricing is to determine what the prices p_i should be. Given a particular set of assets, as described by the payoff matrix R, only certain prices p are consistent with the absence of arbitrage. In order to characterize such prices, we introduce the absence of arbitrage condition, which underlies much of finance theory: asset prices should always be such that no investor can get a guaranteed nonnegative payoff out of a negative investment. In other words, any portfolio that pays off nonnegative amounts in every state of nature must be valuable to investors, so it must have nonnegative cost. Mathematically, the absence of arbitrage condition can be expressed as follows: if Rx ≥ 0, then we must have p'x ≥ 0. What restrictions does the assumption of no arbitrage impose on asset prices? The answer is provided by Farkas' lemma.

Theorem 4.8 The absence of arbitrage condition holds if and only if there exists a nonnegative vector q = (q_1, …, q_m), such that the price of each asset i is given by

p_i = Σ_{s=1}^m q_s r_si.

Proof. The absence of arbitrage condition states that there exists no vector x such that x'R' ≥ 0' and x'p < 0. This is of the same form as condition (b) in the statement of Farkas' lemma (Theorem 4.6). (Note that here p plays the role of b, and R' plays the role of A.) Therefore, by Farkas' lemma, the absence of arbitrage condition holds if and only if there exists some nonnegative vector q such that R'q = p, which is the same as the condition in the theorem's statement. □

Theorem 4.8 asserts that whenever the market works efficiently enough to eliminate the possibility of arbitrage, there must exist "state prices" q_s
that can be used to value the existing assets. Intuitively, it establishes a nonnegative price q_s for an elementary asset that pays one dollar if the state of nature is s, and nothing otherwise. It then requires that every asset must be consistently priced, its total value being the sum of the values of the elementary assets from which it is composed. There is an alternative interpretation of the variables q_s as being (unnormalized) probabilities of the different states s, which, however, we will not pursue. In general, the state price vector q will not be unique, unless the number of assets equals or exceeds the number of states. The no arbitrage condition is very simple, and yet very powerful. It is the key element behind many important results in financial economics, but these lie beyond the scope of this text. (See, however, Exercise 4.33 for an application in options pricing.)

4.7 From separating hyperplanes to duality*
Let us review the path followed in our development of duality theory. We started from the fact that the simplex method, in conjunction with an anticycling rule, is guaranteed to terminate. We then exploited the termination conditions of the simplex method to derive the strong duality theorem. We finally used the duality theorem to derive Farkas' lemma, which we interpreted in terms of a hyperplane that separates b from the columns of A. In this section, we show that the reverse line of argument is also possible. We start from first principles and prove a general result on separating hyperplanes. We then establish Farkas' lemma, and conclude by showing that the duality theorem follows from Farkas' lemma. This line of argument is more elegant and fundamental, because instead of relying on the rather complicated development of the simplex method, it only involves a small number of basic geometric concepts. Furthermore, it can be naturally generalized to nonlinear optimization problems.
Closed sets and Weierstrass' theorem
Before we proceed any further, we need to develop some background material. A set S ⊂ ℝⁿ is called closed if it has the following property: if x¹, x², … is a sequence of elements of S that converges to some x ∈ ℝⁿ, then x ∈ S. In other words, S contains the limit of any sequence of elements of S. Intuitively, the set S contains its boundary.
Theorem 4.9 Every polyhedron is closed.

Proof. Consider the polyhedron P = {x ∈ ℝⁿ | Ax ≥ b}. Suppose that x¹, x², … is a sequence of elements of P that converges to some x*. We have
to show that x* ∈ P. For each k, we have xᵏ ∈ P and, therefore, Axᵏ ≥ b. Taking the limit, we obtain Ax* = A(lim_{k→∞} xᵏ) = lim_{k→∞} (Axᵏ) ≥ b, and x* belongs to P. □

The following is a fundamental result from real analysis that provides us with conditions for the existence of an optimal solution to an optimization problem. The proof lies beyond the scope of this book and is omitted.
Theorem 4.10 (Weierstrass' theorem) If f : ℝⁿ → ℝ is a continuous function, and if S is a nonempty, closed, and bounded subset of ℝⁿ, then there exists some x* ∈ S such that f(x*) ≤ f(x) for all x ∈ S. Similarly, there exists some y* ∈ S such that f(y*) ≥ f(x) for all x ∈ S.
Weierstrass' theorem is not valid if the set S is not closed. Consider, for example, the set S = {x ∈ ℝ | x > 0}. This set is not closed because we can form a sequence of elements of S that converges to zero, but x = 0 does not belong to S. We then observe that the cost function f(x) = x is not minimized at any point in S; for every x > 0, there exists another positive number with smaller cost, and no feasible x can be optimal. Ultimately, the reason that S is not closed is that the feasible set was defined by means of strict inequalities. The definition of polyhedra and linear programming problems does not allow for strict inequalities in order to avoid situations of this type.
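The failure of attainment on a non-closed set is easy to see numerically. A tiny sketch (illustrative only, not from the text):

```python
# Minimize f(x) = x over S = {x in R | x > 0}: every candidate in S is
# beaten by a smaller positive number, so the infimum 0 is never attained.
def f(x):
    return x

candidates = [10.0 ** (-k) for k in range(12)]   # points of S approaching 0
best = min(candidates, key=f)
assert best > 0.0                # the limit point 0 lies outside S
assert f(best / 2) < f(best)     # so best is still not a minimizer over S
```

However far the sequence is extended, the same halving argument applies, which is exactly why Weierstrass' theorem requires S to be closed.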
The separating hyperplane theorem
The result that follows is "geometrically obvious" but nevertheless extremely important in the study of convex sets and functions. It states that if we are given a closed and nonempty convex set S and a point x* ∉ S, then we can find a hyperplane, called a separating hyperplane, such that S and x* lie in different halfspaces (Figure 4.8).
Theorem 4.11 (Separating hyperplane theorem) Let S be a nonempty closed convex subset of ℝⁿ and let x* ∈ ℝⁿ be a vector that does not belong to S. Then, there exists some vector c ∈ ℝⁿ such that c'x* < c'x for all x ∈ S.
Proof. Let ‖·‖ be the Euclidean norm defined by ‖x‖ = (x'x)^(1/2). Let us fix some element w of S, and let

B = {x | ‖x − x*‖ ≤ ‖w − x*‖},

and D = S ∩ B [Figure 4.9(a)]. The set D is nonempty, because w ∈ D.
Figure 4.8: A hyperplane that separates the point x* from the convex set S.
Furthermore, D is the intersection of the closed set S with the closed set B and is also closed. Finally, D is a bounded set because B is bounded. Consider the quantity ‖x − x*‖, where x ranges over the set D. This is a continuous function of x. Since D is nonempty, closed, and bounded, Weierstrass' theorem implies that there exists some y ∈ D such that

‖y − x*‖ ≤ ‖x − x*‖,  for all x ∈ D.

For any x ∈ S that does not belong to D, we have ‖x − x*‖ > ‖w − x*‖ ≥ ‖y − x*‖. We conclude that y minimizes ‖x − x*‖ over all x ∈ S.

We have so far established that there exists an element y of S which is closest to x*. We now show that the vector y − x* has the desired property [see Figure 4.9(b)]. Let x ∈ S. For any λ satisfying 0 < λ ≤ 1, we have y + λ(x − y) ∈ S, because S is convex. Since y minimizes ‖x − x*‖ over all x ∈ S, we obtain

‖y − x*‖² ≤ ‖y + λ(x − y) − x*‖² = ‖y − x*‖² + 2λ(y − x*)'(x − y) + λ²‖x − y‖²,

which yields

2λ(y − x*)'(x − y) + λ²‖x − y‖² ≥ 0.

We divide by λ and then take the limit as λ decreases to zero. We obtain

(y − x*)'(x − y) ≥ 0.

[This inequality states that the angle θ in Figure 4.9(b) is no larger than 90 degrees.] Thus,

(y − x*)'x ≥ (y − x*)'y = (y − x*)'x* + (y − x*)'(y − x*) > (y − x*)'x*.

Setting c = y − x* proves the theorem. □

Figure 4.9: Illustration of the proof of the separating hyperplane theorem.
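The construction in the proof is easy to carry out numerically whenever the closest point y can be computed in closed form. A sketch (ours, not from the text) for the box S = [0, 1]², whose closest point to x* is the coordinatewise clip of x*:

```python
import numpy as np

x_star = np.array([2.0, 2.0])          # a point outside S = [0, 1]^2
y = np.clip(x_star, 0.0, 1.0)          # closest point of S to x*
c = y - x_star                         # hyperplane normal, as in the proof
# Verify the theorem's conclusion c'x* < c'x on a grid of points of S.
grid = [np.array([a, b]) for a in np.linspace(0.0, 1.0, 11)
                         for b in np.linspace(0.0, 1.0, 11)]
assert all(c @ x_star < c @ x for x in grid)
```

Here y = (1, 1) and c = (−1, −1), so c'x* = −4 while c'x ≥ −2 on all of S, exhibiting the strict separation.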
Farkas' lemma revisited
We now show that Farkas' lemma is a consequence of the separating hyperplane theorem. We will only be concerned with the difficult half of Farkas' lemma. In particular, we will prove that if the system Ax = b, x ≥ 0, does not have a solution, then there exists a vector p such that p'A ≥ 0' and p'b < 0. Let

S = {Ax | x ≥ 0} = {y | there exists x such that y = Ax, x ≥ 0},

and suppose that the vector b does not belong to S. The set S is clearly convex; it is also nonempty because 0 ∈ S. Finally, the set S is closed; this may seem obvious, but is not easy to prove. For one possible proof, note that S is the projection of the polyhedron {(x, y) | y = Ax, x ≥ 0} onto the y coordinates, is itself a polyhedron (see Section 2.8), and is therefore closed. An alternative proof is outlined in Exercise 4.37.

We now invoke the separating hyperplane theorem to separate b from S and conclude that there exists a vector p such that p'b < p'y for every
y ∈ S. Since 0 ∈ S, we must have p'b < 0. Furthermore, for every column Aᵢ of A and every λ > 0, we have λAᵢ ∈ S and p'b < λp'Aᵢ. We divide both sides of the latter inequality by λ and then take the limit as λ tends to infinity, to conclude that p'Aᵢ ≥ 0. Since this is true for every i, we obtain p'A ≥ 0' and the proof is complete.
The duality theorem revisited
We will now derive the duality theorem as a corollary of Farkas' lemma. We only provide the proof for the case where the primal constraints are of the form Ax ≥ b. The proof for the general case can be constructed along the same lines at the expense of more notation (Exercise 4.38). We also note that the proof given here is very similar to the line of argument used in the heuristic explanation of the duality theorem in Example 4.4. We consider the following pair of primal and dual problems

minimize c'x subject to Ax ≥ b,

maximize p'b subject to p'A = c', p ≥ 0,

and we assume that the primal has an optimal solution x*. We will show that the dual problem also has a feasible solution with the same cost. Once this is done, the strong duality theorem follows from weak duality (cf. Corollary 4.2).

Let I = {i | aᵢ'x* = bᵢ} be the set of indices of the constraints that are active at x*. We will first show that any vector d that satisfies aᵢ'd ≥ 0 for every i ∈ I, must also satisfy c'd ≥ 0. Consider such a vector d and let ε be a positive scalar. We then have aᵢ'(x* + εd) ≥ aᵢ'x* = bᵢ for all i ∈ I. In addition, if i ∉ I and if ε is sufficiently small, the inequality aᵢ'x* > bᵢ implies that aᵢ'(x* + εd) > bᵢ. We conclude that when ε is sufficiently small, x* + εd is a feasible solution. By the optimality of x*, we obtain c'(x* + εd) ≥ c'x*, hence c'd ≥ 0, which establishes our claim. By Farkas' lemma (cf. Corollary 4.3), c can be expressed as a nonnegative linear combination of the vectors aᵢ, i ∈ I, and there exist nonnegative scalars pᵢ, i ∈ I, such that

c = Σ_{i∈I} pᵢ aᵢ.   (4.3)

For i ∉ I, we define pᵢ = 0. We then have p ≥ 0 and Eq. (4.3) shows that the vector p satisfies the dual constraint p'A = c'. In addition,
p ' b = " Pi bi = " Pi a.; X * = C , X * , � , �
i EI
iEI
which shows that the cost of this dual feasible solution p is the same as the optimal primal cost. The duality theorem now follows from Corollary 4.2. □
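The proof's construction of p from the active constraints can be reproduced on a small instance. The example below is hypothetical (the data and names are ours, not the book's):

```python
import numpy as np

# minimize c'x subject to Ax >= b, with known optimal solution x* = (1, 1).
c = np.array([1.0, 1.0])
A = np.array([[1.0, 0.0], [0.0, 1.0]])
b = np.array([1.0, 1.0])
x_star = np.array([1.0, 1.0])

active = np.isclose(A @ x_star, b)     # the index set I of active constraints
# Solve c = sum_{i in I} p_i a_i, as in Eq. (4.3), and pad with zeros.
p = np.zeros(len(b))
p[active] = np.linalg.solve(A[active].T, c)

assert np.all(p >= 0)                  # dual feasibility: p >= 0
assert np.allclose(p @ A, c)           # dual constraint: p'A = c'
assert np.isclose(p @ b, c @ x_star)   # equal costs, as strong duality asserts
```

In this instance both constraints are active at x*, the multipliers come out as p = (1, 1), and p'b = c'x* = 2, matching the displayed chain of equalities.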
In conclusion, we have accomplished the goals that were set out in the beginning of this section. We proved the separating hyperplane theorem, which is a very intuitive and seemingly simple result, but with many important ramifications in optimization and other areas in mathematics. We used the separating hyperplane theorem to establish Farkas' lemma, and finally showed that the strong duality theorem is an easy consequence of Farkas' lemma.

4.8 Cones and extreme rays

We have seen in Chapter 2 that if the optimal cost in a linear programming problem is finite, then our search for an optimal solution can be restricted to finitely many points, namely, the basic feasible solutions. In this section, we wish to develop a similar result for the case where the optimal cost is −∞. In particular, we will show that the optimal cost is −∞ if and only if there exists a cost reducing direction along which we can move without ever leaving the feasible set. Furthermore, our search for such a direction can be restricted to a finite set of suitably defined "extreme rays."

Cones

The first step in our development is to introduce the concept of a cone.

Definition 4.1 A set C ⊂ ℝⁿ is a cone if λx ∈ C for all λ ≥ 0 and all x ∈ C.

Figure 4.10: Examples of cones.
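The defining property in Definition 4.1 is easy to check numerically for a polyhedral cone. A quick illustrative instance (ours, not from the text):

```python
import numpy as np

A = np.array([[1.0, 0.0], [1.0, 1.0]])   # C = {x in R^2 | Ax >= 0}

def in_cone(x):
    """Membership test for C, with a small tolerance for rounding."""
    return bool(np.all(A @ x >= -1e-12))

x = np.array([2.0, -1.0])                # A @ x = (2, 1) >= 0, so x is in C
assert in_cone(x)
# Every nonnegative scaling of x remains in C, as the definition requires.
assert all(in_cone(lam * x) for lam in [0.0, 0.5, 3.0, 100.0])
```

Since Ax ≥ 0 implies A(λx) = λ(Ax) ≥ 0 for every λ ≥ 0, the scaling check must succeed for any member of C, not just this particular x.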
Notice that if C is a nonempty cone, then 0 ∈ C. To see this, consider an arbitrary element x of C and set λ = 0 in the definition of a cone. A set of the form {x ∈ ℝⁿ | Ax ≥ 0} is easily seen to be a nonempty cone and is called a polyhedral cone.

Let x be a nonzero element of a polyhedral cone C. Since x/2 and 3x/2 are distinct elements of C and x is their average, x cannot be an extreme point, and the only possible extreme point is the zero vector. If the zero vector is an extreme point, we say that the cone is pointed. Whether this will be the case or not is determined by the criteria provided by our next result.

Theorem 4.12 Let C ⊂ ℝⁿ be the polyhedral cone defined by the constraints aᵢ'x ≥ 0, i = 1, …, m. Then, the following are equivalent:

(a) The zero vector is an extreme point of C.

(b) The cone C does not contain a line.

(c) There exist n vectors out of the family a₁, …, a_m, which are linearly independent.

Consider a nonempty polyhedron P = {x ∈ ℝⁿ | Ax ≥ b}, and fix some y ∈ P. The recession cone at y is the set of all directions d along which we can move indefinitely away from y, without leaving the set P. More formally, the recession cone is defined as the set

{d ∈ ℝⁿ | A(y + λd) ≥ b, for all λ ≥ 0}.

It is easily seen that this set is the same as {d ∈ ℝⁿ | Ad ≥ 0}, and is a polyhedral cone. This shows that the recession cone is independent of the starting point y. The nonzero elements of the recession cone are called the rays of the polyhedron P. For the case of a nonempty polyhedron P = {x ∈ ℝⁿ | Ax = b, x ≥ 0} in standard form, the recessi