
Ökonometrie und Unternehmensforschung

Econometrics
and Operations Research

XXI

Herausgegeben von Edited by


M. Beckmann, München/Providence   R. Henn, Karlsruhe
A. Jaeger, Bochum   W. Krelle, Bonn   H. P. Künzi, Zürich
K. Wenke, Zürich   Ph. Wolfe, New York

Geschäftsführende Herausgeber   Managing Editors


W. Krelle   H. P. Künzi
Peter Kall

Stochastic
Linear Programming

Springer-Verlag Berlin Heidelberg New York 1976


Peter Kall
Institute for Operations Research
and Mathematical Methods in Economics, University of Zurich

AMS Subject Classifications (1970): 28A20, 60E05, 90-02, 90C05, 90C15, 90C20, 90C25

ISBN-13: 978-3-642-66254-6 e-ISBN-13: 978-3-642-66252-2


DOI: 10.1007/978-3-642-66252-2

Library of Congress Cataloging in Publication Data


Kall, Peter. Stochastic linear programming. (Ökonometrie und Unternehmensforschung; 21). Bibliography: p. Includes index. 1. Linear programming. 2. Stochastic processes. I. Title. II. Series.
HB143.K35 519.7'2 75-30602

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically those of translation, reprinting, re-use of illustrations, broadcasting, reproduction by photocopying machine or similar means, and storage in data banks. Under § 54 of the German Copyright Law where copies are made for other than private use a fee is payable to the publisher, the amount of fee to be determined by agreement with the publisher.

© by Springer-Verlag Berlin Heidelberg 1976.


Softcover reprint of the hardcover 1st edition 1976
Preface
Today many economists, engineers and mathematicians are familiar with linear
programming and are able to apply it. This is owing to the following facts:
during the last 25 years efficient methods have been developed; at the same time
sufficient computer capacity became available; finally, in many different fields,
linear programs have turned out to be appropriate models for solving practical
problems.
However, to apply the theory and the methods of linear programming, it is
required that the data determining a linear program be fixed known numbers.
This condition is not fulfilled in many practical situations, e. g. when the data
are demands, technological coefficients, available capacities, cost rates and so on.
It may happen that such data are random variables. In this case, it seems to be
common practice to replace these random variables by their mean values and
solve the resulting linear program. By 1960 various authors had already recog-
nized that this approach is unsound: between 1955 and 1960 there were such
papers as "Linear Programming under Uncertainty", "Stochastic Linear Pro-
gramming with Applications to Agricultural Economics", "Chance Constrained
Programming", "Inequalities for Stochastic Linear Programming Problems" and
"An Approach to Linear Programming under Uncertainty".
The aim of this book is to give some insight into this challenging field which
has to be understood as a special subject of planning under uncertainty. A com-
plete collection of results obtained so far did not seem entirely appropriate, and
my preference led me to choose those topics and results which can be handled
more or less systematically within a certain theoretical framework. This does
not imply a value judgement on topics and results which are not reported. In the
bibliography I have cited only those papers which were really used in the writing
of the text. A fairly comprehensive bibliography on stochastic linear programming
can be obtained by taking the union of the bibliographies of the books cited and
all references found by starting with the papers on stochastic programming listed
at the end.
It is assumed that the reader is familiar with elementary real analysis and
linear algebra. It would be helpful if he were also acquainted with optimization
theory as well as basic measure theory and probability theory. With regard to the
latter requirements, and also to avoid terminological confusions, I have included
a collection of the most important definitions and results (Chapter 0.), to which
I refer later on. Beyond these prerequisites, every assertion is proved which, in my
opinion, leads to a better understanding of the results, the difficulties and the
unsolved problems.
I am indebted to dipl. math. B. Finkbeiner, Dr. K. Hässig, Dr. M. Kohler,
Dr. K. Marti, Dr. R. J. Riepl and especially to Prof. Dr. W. Vogel and Prof. Dr.
R. Wets for their helpful comments and suggestions. Nevertheless I am responsible
for every mistake left, and I shall appreciate every constructive criticism. I also
owe thanks to Mrs. E. Roth for typing the manuscript and giving linguistic sup-
port. Finally I have to acknowledge the extraordinary patience of the editors and
Springer-Verlag.
Contents

Chapter 0. Prerequisites
1. Linear Programming
2. Nonlinear Programming
3. Measure Theory and Probability Theory

Chapter I. Introduction

Chapter II. Distribution Problems
1. The General Case
2. Special Problems

Chapter III. Two Stage Problems
1. The General Case
2. The Fixed Recourse Case
3. Complete Fixed Recourse
4. Simple Recourse
5. Computational Remarks
6. Another Approach to Two Stage Programming

Chapter IV. Chance Constrained Programming
1. Convexity Statements
2. Relationship between Chance Constrained Programs and Two Stage Problems

References

Subject Index
Chapter 0. Prerequisites

1. Linear Programming

Linear programs are special mathematical programming problems. We understand by a mathematical programming problem in the Euclidean space ℝⁿ the optimization of a given real-valued function - the objective function - on a given subset of ℝⁿ, the so-called feasible set. A mathematical programming problem is called a linear program if its objective function is a linear functional on ℝⁿ and if the feasible set can be described as the intersection of finitely many halfspaces and at most finitely many hyperplanes in ℝⁿ. Hence the feasible set of a linear program may be represented as the solution set of a system of finitely many linear inequalities and, at most, finitely many linear equalities - the so-called (linear) constraints - in a finite number of variables.
For f ∈ ℝⁿ, α ∈ ℝ it is evident that

{x | x ∈ ℝⁿ, f'x ≤ α} = {x | x ∈ ℝⁿ, τ ∈ ℝ, f'x + τ = α, τ ≥ 0},

where f' is the transpose of f and f'x is the usual scalar product of f and x. From this relation, and from the fact that every real number may be represented as the difference of two nonnegative real numbers, it follows that every linear program may be written in the standard formulation

(1)   min {c'x | Ax = b, x ≥ 0},

where c ∈ ℝⁿ, b ∈ ℝᵐ are constant and A is a constant real (m × n)-matrix, x ∈ ℝⁿ is the vector of decision variables and x ≥ 0 stands for xᵢ ≥ 0, i = 1, ..., n. We understand by a solution of a linear program a feasible x̄ such that c'x̄ ≤ c'x for all x ∈ {x | Ax = b, x ≥ 0}.
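The standard formulation (1) can be handed directly to any LP solver. The following minimal sketch uses SciPy's linprog; the data c, A, b below are illustrative assumptions, not an example taken from the text.

```python
# Solve a linear program in the standard form (1): min {c'x | Ax = b, x >= 0}.
# All numerical data here are made up for illustration.
import numpy as np
from scipy.optimize import linprog

c = np.array([1.0, 2.0, 0.0])             # objective coefficients
A = np.array([[1.0, 1.0, 1.0],
              [1.0, -1.0, 0.0]])          # equality constraint matrix
b = np.array([4.0, 1.0])                  # right-hand side

res = linprog(c, A_eq=A, b_eq=b, bounds=[(0, None)] * 3, method="highs")
print(res.x, res.fun)                     # optimal x and optimal value c'x
```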
The question, under which conditions the feasible set of a linear program is nonempty, is answered by

Theorem 1 (Farkas' Lemma). {x | Ax = b, x ≥ 0} ≠ ∅ if and only if {u | A'u ≥ 0} ⊂ {u | b'u ≥ 0}.

Here the prime at A and b again indicates transposition.
An immediate consequence of Farkas' Lemma is

Theorem 2. If there is a real constant γ such that c'x ≥ γ for all x ∈ {x | Ax = b, x ≥ 0} ≠ ∅, then the linear program min {c'x | Ax = b, x ≥ 0} has a solution.
An important concept in linear programming is that of feasible basic solutions. We call x a feasible basic solution of (1) if x ∈ {x | Ax = b, x ≥ 0} and if the columns of A corresponding to the nonzero components of x are linearly independent. Obviously the set 𝔄 = {x | x is a feasible basic solution of (1)} is finite.

If we define the convex polyhedron generated by 𝔄 as

𝔓 = {x | x = Σᵢ₌₁ʳ λᵢzⁱ, zⁱ ∈ 𝔄, λᵢ ≥ 0, Σᵢ₌₁ʳ λᵢ = 1}

and observe that 𝔎 = {y | Ay = 0, y ≥ 0} is a convex polyhedral cone, i.e. if y¹ ∈ 𝔎 and y² ∈ 𝔎, then λ₁y¹ + λ₂y² ∈ 𝔎 for all λ₁ ≥ 0, λ₂ ≥ 0, and 𝔎 is generated by a finite set, then we may state

Theorem 3. {x | Ax = b, x ≥ 0} = 𝔓 + 𝔎 = {x | x = z + y, z ∈ 𝔓, y ∈ 𝔎}.

In other words: The feasible set of a linear program is the direct (or algebraic) sum of a convex polyhedron and a convex polyhedral cone. Sets of this type are usually called convex polyhedral sets. From the representation of feasible solutions given by Th. 3 follows immediately

Theorem 4. The linear program (1) has a solution if and only if {x | Ax = b, x ≥ 0} ≠ ∅ and c'y ≥ 0 for all y ∈ {y | Ay = 0, y ≥ 0}.
Furthermore, we may conclude
Theorem 5. If the linear program (1) has a solution, then at least one of the feasible
basic solutions is also a solution.
Therefore, if we want to determine a solution of a linear program, we may restrict ourselves to the investigation of a finite number of points, namely the feasible basic solutions. This is done in the well-known Simplex method. To describe the essential parts of this method, we assume without loss of generality that in (1) the matrix A has rank m ≤ n. Then A contains subsets of columns {A_{i1}, ..., A_{im}}, which are linearly independent and hence constitute bases of ℝᵐ. Such a basis, written as a matrix B = (A_{i1}, ..., A_{im}), is called a feasible basis if B⁻¹b ≥ 0.
If D is the matrix of the n−m columns which are not contained in B, and if x̄ = (x_{i1}, ..., x_{im})' and y is the (n−m)-tuple of x-variables corresponding to the columns of D, then Ax = b is equivalent to Bx̄ + Dy = b. Hence, the vector of basic variables x̄ depends on the vector of nonbasic variables y as follows:

(2)   x̄ = B⁻¹b − B⁻¹Dy.

If B is a feasible basis and if we choose y = 0, then we have a feasible basic solution, where the basic variables have the values of the components of B⁻¹b.
If we reorder the components of c into an m-tuple c̄ and an (n−m)-tuple d corresponding to the reordering of the components of x into x̄ and y, then, from (2), we get for the objective function

(3)   c'x = c̄'x̄ + d'y.

Starting with a feasible basis B and the corresponding feasible basic solution
given by y=O, x=B-1b, the only feasible change, given the constraints x~O in
(1), is to increase some component(s) of y, while keeping X=B-lb-B-lDy~O.
Hence, it is obvious that the Simplex criterion d' -C'B-ID~O is sufficient for
the optimality of that feasible basic solution. If the feasible basic solution is
1. Linear Programming 3

nondegenerated, i. e. B- 1b> 0, then this condition is obviously necessary for


optimality too. But also if degenerated feasible basic solutions occur (i. e. some
basic variables become equal to zero), one can prove
Theorem 6. The linear program (1) has a solution if and only if there is a feasible
basis which satisfies the Simplex criterion.
Now the Simplex method works as follows: Start with a feasible basis B. If the Simplex criterion is satisfied, we have an optimal feasible basic solution with the optimal value c̄'B⁻¹b. Otherwise, increase a yⱼ for which (d' − c̄'B⁻¹D)ⱼ < 0, until some x̄ᵢ vanishes. If x̄ᵢ is the first to vanish, exchange the i-th column of B and the j-th column of D, i.e. let yⱼ become a basic variable and x̄ᵢ a nonbasic variable. After rearranging c̄ and d correspondingly, restart from the beginning. The arithmetic operation - necessary for this exchange of basic and nonbasic variables - is called pivoting.
As long as only nondegenerate feasible basic solutions occur, it is obvious that in (3) the value of the objective function is strictly decreased at each step. Thus, the procedure must then be finite, since there are only finitely many feasible basic solutions. There are additional rules to avoid cycling, in case degeneracy occurs.
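As a small illustration of the Simplex criterion (the data and the chosen basis below are assumptions made for the sake of the example, not taken from the text), one can compute the reduced costs d' − c̄'B⁻¹D for a given feasible basis and read off whether it is optimal.

```python
# Check the Simplex criterion d' - c'B^{-1}D >= 0 for an assumed basis.
import numpy as np

A = np.array([[1.0, 1.0, 1.0, 0.0],
              [2.0, 1.0, 0.0, 1.0]])
b = np.array([4.0, 6.0])
c = np.array([-3.0, -2.0, 0.0, 0.0])

basis = [2, 3]                              # columns forming B (slack basis)
nonbasis = [0, 1]
B, D = A[:, basis], A[:, nonbasis]
c_B, d = c[basis], c[nonbasis]

x_B = np.linalg.solve(B, b)                 # basic variables B^{-1}b (feasible if >= 0)
reduced = d - c_B @ np.linalg.solve(B, D)   # Simplex criterion d' - c'B^{-1}D
print(x_B, reduced)                         # negative entries mark columns to enter
```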
Some special methods have been developed making use of special data structures in the matrix A, one of which is the so-called decomposition structure. Here A has a block-angular form: a linking part with μ rows, followed by ν subblocks arranged along the diagonal, where all elements outside these blocks are equal to zero. Decomposition methods take advantage of this structure - especially for large-scale problems - by pivoting in intermediate steps only on the subblocks of A, whereas in the main pivot steps one is concerned with a matrix of μ + ν rows (corresponding to the so-called master program). Without going into details we can say that this procedure is essentially based on Th. 3.
In certain practical situations it is of interest to know what happens to the solution or the optimal value of a linear program if some of the elements of A, b, c vary. This kind of problem leads to parametric linear programming. For later purposes we might mention the following special result:

Theorem 7. Let T ⊂ ℝᵐ be a convex (polyhedral) set. Suppose that {x | Ax = t, x ≥ 0} ≠ ∅ for all t ∈ T and that γ(t) = min {c'x | Ax = t, x ≥ 0} is finite for at least one t ∈ T. Then γ(t) is finite for all t ∈ T and γ(t) is a piecewise linear, convex function on T (i.e. for arbitrary t¹ ∈ T, t² ∈ T and λ ∈ (0, 1) we have γ(λt¹ + (1−λ)t²) ≤ λγ(t¹) + (1−λ)γ(t²)).

This assertion can be proved by Th. 4 and Th. 6.
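A quick numerical illustration of Th. 7 (with an arbitrarily assumed one-row matrix A and cost vector c) evaluates γ(t) on a grid and checks midpoint convexity; this is only a plausibility check, not part of the proof.

```python
# Evaluate gamma(t) = min {c'x | Ax = t, x >= 0} and check midpoint convexity.
import numpy as np
from scipy.optimize import linprog

A = np.array([[1.0, -1.0, 2.0]])
c = np.array([1.0, 2.0, 3.0])

def gamma(t):
    # optimal value for a one-row equality constraint Ax = t
    res = linprog(c, A_eq=A, b_eq=[t], bounds=[(0, None)] * 3, method="highs")
    return res.fun

ts = np.linspace(-3.0, 3.0, 7)
print([gamma(t) for t in ts])               # piecewise linear values
print(all(gamma((t1 + t2) / 2) <= (gamma(t1) + gamma(t2)) / 2 + 1e-9
          for t1 in ts for t2 in ts))       # midpoint convexity holds
```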


Finally, we have to mention the duality in linear programming. To every so-called primal linear program

(4)   min {c'x | x ∈ ℝⁿ, Ax = b, x ≥ 0},

we may associate its so-called dual program

(5)   max {b'u | u ∈ ℝᵐ, A'u ≤ c}.

Theorem 8. Suppose that (4) and (5) have feasible solutions x̄ and ū, respectively. Then
a) b'ū ≤ c'x̄, and
b) (4) and (5) have a solution.

Assertion a) follows from the feasibility of x̄ and ū, and b) follows from Th. 2.

Theorem 9 (Duality Theorem). (4) has a solution x* if and only if (5) has a solution u*. Then c'x* = b'u*.
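The Duality Theorem is easy to check numerically. In the sketch below (with assumed data), the primal (4) and the dual (5) are solved separately and the optimal values are compared.

```python
# Solve the primal (4) and its dual (5) and verify c'x* = b'u*.
import numpy as np
from scipy.optimize import linprog

A = np.array([[1.0, 2.0, 1.0],
              [3.0, 1.0, 0.0]])
b = np.array([4.0, 5.0])
c = np.array([2.0, 3.0, 1.0])

primal = linprog(c, A_eq=A, b_eq=b, bounds=[(0, None)] * 3, method="highs")
# dual (5): max b'u s.t. A'u <= c  <=>  min -b'u s.t. A'u <= c, u free
dual = linprog(-b, A_ub=A.T, b_ub=c, bounds=[(None, None)] * 2, method="highs")
print(primal.fun, -dual.fun)                # the two optimal values coincide
```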

For further details see


George B. Dantzig: Linear Programming and Extensions. Princeton University Press, Princeton.

2. Nonlinear Programming

A nonlinear program is a mathematical programming problem which is not a linear program, i.e. a nonlinear program is of the form

(6)   min {f(x) | x ∈ 𝔅},

where the objective function f(x) is some real valued function defined on the feasible set 𝔅 ⊂ ℝⁿ. However, in this generality problem (6) cannot be handled. One of the reasons is the fact that in general a local minimum of (6) is not a global one. We may overcome this difficulty by assuming that (6) is a convex program. Problem (6) is a convex program if the feasible set is convex, i.e. if x ∈ 𝔅 and y ∈ 𝔅, then λx + (1−λ)y ∈ 𝔅 for all λ ∈ (0, 1), and if f is a convex function, i.e. for x ∈ 𝔅 and y ∈ 𝔅 we have f(λx + (1−λ)y) ≤ λf(x) + (1−λ)f(y) for all λ ∈ (0, 1). From the convexity of a function f we may conclude immediately the following statements:
the following statements:
Theorem 10. Let 𝔅 ⊂ ℝⁿ be a convex set and f: 𝔅 → ℝ be a convex function. Then f is continuous in the interior of 𝔅.

Theorem 11. Let f: 𝔅 → ℝ be convex and differentiable at x¹ ∈ 𝔅. Then f(x²) − f(x¹) ≥ (x² − x¹)' grad f(x¹) for all x² ∈ 𝔅.

Theorem 12. Let f: 𝔅 → ℝ be convex and twice differentiable. Then the matrix (∂²f/∂xᵢ∂xⱼ) is positive semidefinite on 𝔅.

We have to mention that a function f is called concave if −f is convex.


Usually the feasible set 𝔅 of a convex program is determined by constraints, i.e. 𝔅 = {x | gᵢ(x) ≤ 0, i = 1, ..., k}, where gᵢ(x), i = 1, ..., k, are convex functions. For the following, it is meaningful to distinguish between linear and nonlinear constraints, so that 𝔅 = {x | gᵢ(x) ≤ 0, i = 1, ..., k; hᵢ(x) = 0, i = 1, ..., m}, where hᵢ(x) = Σⱼ₌₁ⁿ aᵢⱼxⱼ − bᵢ. Hence, we rewrite (6) in the following form:

(7)   min f(x)
      subject to g(x) ≤ 0
                 h(x) = 0,

where g and h are vector-valued functions g: ℝⁿ → ℝᵏ and h: ℝⁿ → ℝᵐ, whose components have the above-mentioned properties. To get an existence theorem, we require a regularity condition, for example the following:

(8)   ∃ x̄ ∈ ℝⁿ: g(x̄) < 0, h(x̄) = 0.

Then we may state


Theorem 13 (Kuhn-Tucker Theorem). Given the regularity condition (8), the convex program (7) has a solution if and only if there exist x̄ ∈ ℝⁿ, ū ∈ ℝᵏ, ū ≥ 0, and v̄ ∈ ℝᵐ such that

f(x̄) + u'g(x̄) + v'h(x̄) ≤ f(x̄) + ū'g(x̄) + v̄'h(x̄) ≤ f(x) + ū'g(x) + v̄'h(x)

for all x ∈ ℝⁿ, all v ∈ ℝᵐ and all u ∈ ℝᵏ such that u ≥ 0. Then x̄ is a solution of (7).
If we start with another standard formulation of convex programs, namely

(9)   min f(x)
      subject to g(x) ≤ 0
                 x ≥ 0,

then - given the corresponding regularity condition - we have, according to Th. 13, to prove the existence of a saddlepoint (x̄', ū')' ≥ 0 of Φ(x, u) = f(x) + u'g(x) on x ≥ 0, u ≥ 0, i.e.

Φ(x̄, u) ≤ Φ(x̄, ū) ≤ Φ(x, ū) for all x ≥ 0 and all u ≥ 0.

Local conditions for such a saddlepoint are given by

Theorem 14. Let φ(x, u) be convex in x and concave in u and continuously differentiable. Then (x̄', ū')' is a saddlepoint of φ, i.e. x̄ ≥ 0, ū ≥ 0 and φ(x̄, u) ≤ φ(x̄, ū) ≤ φ(x, ū) for all x ≥ 0, u ≥ 0, if and only if for i = 1, ..., n and j = 1, ..., k

∂φ(x̄, ū)/∂xᵢ ≥ 0          ∂φ(x̄, ū)/∂uⱼ ≤ 0
x̄ᵢ · ∂φ(x̄, ū)/∂xᵢ = 0      ūⱼ · ∂φ(x̄, ū)/∂uⱼ = 0.

The application of this theorem to problem (9) with φ(x, u) = f(x) + u'g(x) yields the local Kuhn-Tucker conditions for a solution of (9):

(10)   grad f(x̄) + (ū'G(x̄))' ≥ 0          g(x̄) ≤ 0
       x̄'[grad f(x̄) + (ū'G(x̄))'] = 0      ū'g(x̄) = 0
       x̄ ≥ 0                               ū ≥ 0,

where G(x) is the matrix with the elements Gᵢⱼ(x) = ∂gᵢ(x)/∂xⱼ.
In particular, for a so-called quadratic program

(11)   min [c'x + ½ x'Qx]
       subject to Ax ≥ b
                  x ≥ 0,

where Q is a real symmetric positive semidefinite (n × n)-matrix (see Th. 12), the local Kuhn-Tucker conditions are:

(12)   c + Qx̄ − A'ū ≥ 0          b − Ax̄ ≤ 0
       x̄'(c + Qx̄ − A'ū) = 0      ū'(b − Ax̄) = 0
       x̄ ≥ 0                      ū ≥ 0.
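The conditions (12) are easy to verify mechanically once a candidate pair (x̄, ū) is at hand. The sketch below uses a small assumed quadratic program whose solution and multiplier were computed by hand; it only checks (12), it is not a solution method.

```python
# Verify the local Kuhn-Tucker conditions (12) for an assumed quadratic program
#   min c'x + 0.5 x'Qx   s.t.  Ax >= b, x >= 0
# at a hand-computed candidate pair (x_bar, u_bar).
import numpy as np

Q = np.eye(2)
c = np.array([-1.0, -1.0])
A = np.array([[1.0, 1.0]])
b = np.array([3.0])

x_bar = np.array([1.5, 1.5])                # candidate solution (assumption)
u_bar = np.array([0.5])                     # candidate multiplier (assumption)

g = c + Q @ x_bar - A.T @ u_bar             # must be >= 0
print(np.all(g >= -1e-9), np.all(b - A @ x_bar <= 1e-9),
      abs(x_bar @ g) < 1e-9, abs(u_bar @ (b - A @ x_bar)) < 1e-9)
```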

There are different approaches for solving convex programs. Besides linearization
methods, which approximate the original program by linear programs, there are
gradient methods, applied to the original problem, and so-called complementarity
methods, applied to the Kuhn-Tucker conditions. In the special case of quadratic
programming, gradient methods as well as complementarity methods are available,
which simply require pivoting - as in the simplex method - under additional
rules for choosing the pivot elements.

For further details see


G. Hadley: Nonlinear and Dynamic Programming, Addison-Wesley Publ. Company, Inc., Reading - Palo Alto - London.

3. Measure Theory and Probability Theory

One of the basic concepts in measure theory is that of a measurable space (R, 𝔄), where R is some nonempty set (called space) and 𝔄 is a σ-algebra on R. 𝔄 is a σ-algebra, if it is a nonempty class of subsets of R which is closed under the formation of complements and countable unions, i.e. if A ∈ 𝔄, then Ā = R − A ∈ 𝔄, and if Aᵢ ∈ 𝔄, i = 1, 2, 3, ..., then ⋃ᵢ₌₁^∞ Aᵢ ∈ 𝔄.
From this definition follows immediately

Theorem 15. Let (R, 𝔄) be a measurable space. If Aᵢ ∈ 𝔄, i = 1, 2, 3, ..., then ⋂ᵢ₌₁^∞ Aᵢ ∈ 𝔄; furthermore ∅ ∈ 𝔄 and R ∈ 𝔄.
Sets belonging to 𝔄 are called measurable.
If in R some nonempty class ℭ of subsets is given, we may define in a unique way a "smallest" σ-algebra containing ℭ; more precisely:

Theorem 16. If ℭ is any nonempty class of subsets of R, then there exists a unique σ-algebra 𝔄 such that ℭ ⊂ 𝔄 and such that if 𝔈 is any other σ-algebra containing ℭ then 𝔄 ⊂ 𝔈.


One of the most important a-algebras in applications is the Borel algebra ~ in
IR n, which is the a-algebra generated by (f={A,IAt={xlxEIRn,x~t, tET}},
where T is the set of all n-tuples of rationals. It is obvious, that for a EIR n, bE IRn
sets of the type {xlx~b}, {xlx<b}, {xlxza}, {xlx>a}, {xla<x~b} etc. are
Borel sets. From the properties of the natural topology in IRn follows
Theorem 17. Every open set in IRn is a Borel set.
By the expression extended real we indicate that either a real value or ±∞ may occur. Should extended real components be allowed in ℝⁿ, we indicate this by the symbol ℝ̄ⁿ. The Borel algebra 𝔅 is then extended accordingly.
Let (R, 𝔄) and (S, 𝔈) be two measurable spaces.
A mapping T: R → S is called a measurable transformation if T⁻¹[C] = {r | r ∈ R, Tr ∈ C} ∈ 𝔄 for all C ∈ 𝔈. If S = ℝ̄ and 𝔈 is the Borel algebra on ℝ̄, T is called a measurable function. If, in particular, R = ℝᵐ, S = ℝⁿ and 𝔄 and 𝔈 are the corresponding Borel algebras, we use the term Borel measurable transformation. If moreover S = ℝ̄ and 𝔈 is the Borel algebra on ℝ̄, we speak of a Borel measurable function.
Theorem 18. Let (R, 𝔄), (S, 𝔈), (U, 𝔇) be measurable spaces and T₁: R → S, T₂: S → U be measurable transformations. Then T₂ ∘ T₁: R → U, defined by T₂ ∘ T₁(r) = T₂(T₁(r)) for r ∈ R, is a measurable transformation.


Theorem 19. Let (R, 𝔄) be a measurable space and fᵢ: R → ℝ̄, i = 1, 2, ..., be extended real valued measurable functions. Then |f₁|, f₁ + f₂, f₁·f₂, infᵢ fᵢ and supᵢ fᵢ are measurable functions.

From Th. 17 follows

Theorem 20. If T: ℝᵐ → ℝⁿ is continuous, then T is Borel measurable.
A measure on a σ-algebra 𝔄 is a function μ: 𝔄 → ℝ̄ with the properties: μ(∅) = 0, μ(A) ≥ 0 for all A ∈ 𝔄 and

μ(⋃ᵢ₌₁^∞ Aᵢ) = Σᵢ₌₁^∞ μ(Aᵢ)

for every countable class of disjoint sets Aᵢ ∈ 𝔄. If 𝔄 is a σ-algebra on the space R and μ is a measure on 𝔄, then (R, 𝔄, μ) is called a measure space. A measure μ is called σ-finite, if there is a countable class of sets Aᵢ ∈ 𝔄 such that μ(Aᵢ) < ∞ for i = 1, 2, ..., and ⋃ᵢ₌₁^∞ Aᵢ = R. A measure μ is finite, if μ(R) < ∞. An important

example of a σ-finite measure is the Lebesgue measure in ℝⁿ, which is uniquely determined on the Borel algebra by requiring

μ({x | a < x ≤ b}) = ∏ⱼ₌₁ⁿ (bⱼ − aⱼ) for all a ∈ ℝⁿ, b ∈ ℝⁿ such that a < b.
With respect to a measure space (R, 𝔄, μ), a proposition is said to be true almost everywhere (a.e.) if the proposition is true for every element of R except at most a measurable set N of elements with μ(N) = 0. Hence, a sequence {fₙ} of measurable functions defined on a measure space (R, 𝔄, μ) converges to f a.e. if there is an N ∈ 𝔄 such that lim_{n→∞} fₙ(r) = f(r) for all r ∈ R − N and μ(N) = 0. The sequence {fₙ} converges in measure to f, if lim_{n→∞} μ({r | |fₙ(r) − f(r)| ≥ ε}) = 0 for every ε > 0.
To introduce integration we need simple functions. A simple function on a measure space (R, 𝔄, μ) is a measurable function f which attains a finite number of different real values yᵢ, i = 1, ..., k. If for Aᵢ = {r | f(r) = yᵢ}, i = 1, ..., k, it is true that μ(Aᵢ) < ∞ whenever yᵢ ≠ 0, then f is called an integrable simple function, and the integral is defined as

∫ f dμ = Σᵢ₌₁ᵏ yᵢ μ(Aᵢ),

where for y_{i₀} = 0 and μ(A_{i₀}) = ∞ the product y_{i₀} · μ(A_{i₀}) = 0 by definition. A sequence {fₙ} of integrable simple functions is called mean fundamental, if ∫ |fₙ − fₘ| dμ tends to zero whenever n and m tend to infinity. Now a measurable function f on a measure space (R, 𝔄, μ) is called integrable if there is a sequence {fₙ} of integrable simple functions which is mean fundamental and converges in measure to f. Then the integral is defined as ∫ f dμ = lim_{n→∞} ∫ fₙ dμ.

Theorem 21. a) A measurable function f on (R, 𝔄, μ) is integrable if and only if its absolute value is integrable; and |∫ f dμ| ≤ ∫ |f| dμ.
b) Let f, g be integrable functions on (R, 𝔄, μ) and α, β be real constants. Then αf + βg is integrable and ∫ (αf + βg) dμ = α ∫ f dμ + β ∫ g dμ.

Theorem 22 (Lebesgue). If {fₙ} is a sequence of integrable functions converging in measure (or a.e.) to f, and if g is an integrable function such that

|fₙ(r)| ≤ |g(r)| a.e., n = 1, 2, ...,

then f is integrable and

lim_{n→∞} ∫ |f − fₙ| dμ = 0.

Theorem 23 (Hölder's Inequality). Let p and q be real numbers greater than 1 such that 1/p + 1/q = 1 and assume that |f|ᵖ and |g|^q are integrable functions on (R, 𝔄, μ). Then the product fg is integrable and

∫ |fg| dμ ≤ (∫ |f|ᵖ dμ)^{1/p} (∫ |g|^q dμ)^{1/q}.

For p = q = 2 Hölder's inequality is called Schwarz's Inequality.
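On a finite (counting) measure space Hölder's inequality reduces to an inequality between finite sums, which can be illustrated numerically; the data below are random and purely illustrative.

```python
# Hölder's inequality on a discrete (counting) measure space:
# sum|fg| <= (sum|f|^p)^(1/p) * (sum|g|^q)^(1/q) with 1/p + 1/q = 1.
import numpy as np

rng = np.random.default_rng(3)
f = rng.normal(size=1000)
g = rng.normal(size=1000)
p, q = 3.0, 1.5                             # 1/3 + 2/3 = 1
lhs = np.sum(np.abs(f * g))
rhs = np.sum(np.abs(f) ** p) ** (1 / p) * np.sum(np.abs(g) ** q) ** (1 / q)
print(lhs <= rhs)                           # True
```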
Let (R, 𝔄) be a measurable space and ν, μ be two measures defined on 𝔄. The measure ν is called absolutely continuous with respect to μ, ν ≪ μ, if μ(A) = 0 implies ν(A) = 0.

Theorem 24 (Radon-Nikodym). If ν and μ are σ-finite measures, then ν ≪ μ if and only if there is an integrable function f such that ν(A) = ∫_A f dμ for every measurable set A. f is a.e. uniquely determined.
Here ∫_A f dμ = ∫ χ_A · f dμ, where χ_A(r) = 1 if r ∈ A and χ_A(r) = 0 if r ∉ A.
Finally we have to mention product spaces. Let (R, 𝔄, μ) and (S, 𝔈, ν) be σ-finite measure spaces. Then the product space is denoted as (R × S, 𝔄 × 𝔈, μ × ν), where R × S is the Cartesian product of R and S, 𝔄 × 𝔈 is the σ-algebra generated by the class of all Cartesian products A × C, where A ∈ 𝔄 and C ∈ 𝔈, and μ × ν is the product measure on 𝔄 × 𝔈 uniquely determined by (μ × ν)(A × C) = μ(A)·ν(C) for all A ∈ 𝔄 and C ∈ 𝔈. If for example 𝔅ᵏ is the Borel algebra on ℝᵏ and μₖ is the corresponding Lebesgue measure, then (ℝ^{m+n}, 𝔅^{m+n}, μ_{m+n}) = (ℝᵐ × ℝⁿ, 𝔅ᵐ × 𝔅ⁿ, μₘ × μₙ).
If D ⊂ R × S, then a section of D determined by r ∈ R, or an r-section of D, is the set D_r = {s | (r, s) ∈ D}.

Theorem 25. In the product space (R × S, 𝔄 × 𝔈, μ × ν), a measurable set D ⊂ R × S has measure zero if and only if almost every r-section (or almost every s-section) of D has measure zero.
With respect to integration we need the important

Theorem 26 (Fubini's Theorem). Let f be an integrable function on the product space (R × S, 𝔄 × 𝔈, μ × ν). Then

∫ f d(μ × ν) = ∫ (∫ f dμ) dν = ∫ (∫ f dν) dμ.
Probability theory may be understood as a special area of measure theory. A probability space is a finite measure space (Ω, 𝔉, P) for which P(Ω) = 1. The measurable sets (i.e. the elements of 𝔉) are called events and P is called a probability measure. Instead of a.e. we use the phrase almost sure. A measurable transformation x: Ω → ℝⁿ (where the σ-algebra on ℝⁿ is always the Borel algebra) is called an (n-dimensional) random vector. A one-dimensional random vector is a random variable. Observe that every component of a random vector is itself a random variable. A random vector x defines a probability measure Pₓ on ℝⁿ in a natural way by Pₓ(B) = P(x⁻¹[B]) for all Borel sets B. Pₓ is uniquely determined on 𝔅ⁿ by the distribution function of x: Fₓ(t) = Pₓ({ξ | ξ ∈ ℝⁿ, ξ ≤ t}) for all t ∈ ℝⁿ. If Pₓ is absolutely continuous with respect to the Lebesgue measure μₙ, by the Radon-Nikodym theorem there is a probability density function fₓ(τ) defined on ℝⁿ such that Pₓ(B) = ∫_B fₓ(τ) dμₙ for all B ∈ 𝔅ⁿ.
The expectation Ex of a random vector x is the vector of the integrals of the components of x. For simplicity we write

Ex = (∫_Ω x₁ dP, ∫_Ω x₂ dP, ..., ∫_Ω xₙ dP)' = ∫_Ω x dP.

Hence we have

Ex = ∫_Ω x dP = ∫_{ℝⁿ} ξ dPₓ = ∫_{ℝⁿ} ξ dFₓ(ξ),

where the last expression is the so-called Lebesgue-Stieltjes integral. If x has a probability density function, we may also write Ex = ∫_{ℝⁿ} ξ fₓ(ξ) dμₙ = ∫_{ℝⁿ} ξ fₓ(ξ) dξ, where dμₙ and dξ refer to the Lebesgue measure on 𝔅ⁿ. We call k random variables xᵢ, i = 1, ..., k stochastically independent, if

P(⋂ᵢ₌₁ᵏ {ω | xᵢ(ω) ∈ Bᵢ}) = ∏ᵢ₌₁ᵏ P({ω | xᵢ(ω) ∈ Bᵢ})

for all Borel sets Bᵢ in ℝ. There is an obvious connection between stochastic independence and product measures. Let Pᵢ be the probability measure on ℝ corresponding to the random variable xᵢ, i = 1, ..., k, and let Pₓ be the probability measure on ℝᵏ corresponding to x = (x₁, x₂, ..., xₖ)'; then stochastic independence of the random variables x₁, x₂, ..., xₖ is equivalent to Pₓ = P₁ × P₂ × ... × Pₖ,

Fₓ(t) = F_{x₁}(t₁) · F_{x₂}(t₂) · ... · F_{xₖ}(tₖ)

and, if the densities exist,

fₓ(τ) = f_{x₁}(τ₁) · f_{x₂}(τ₂) · ... · f_{xₖ}(τₖ).
From Fubini's Theorem follows

Theorem 27. Let x₁ and x₂ be stochastically independent and assume that Ex₁, Ex₂ and Ex₁x₂ exist. Then Ex₁x₂ = (Ex₁)(Ex₂).
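Theorem 27 can be illustrated by sampling two independent random variables and comparing the sample mean of their product with the product of their sample means; the distributions below are arbitrary choices.

```python
# Sampling check of Th. 27: for independent x1, x2, E[x1*x2] = E[x1]*E[x2].
import numpy as np

rng = np.random.default_rng(2)
x1 = rng.uniform(0.0, 1.0, 1_000_000)
x2 = rng.exponential(2.0, 1_000_000)        # drawn independently of x1
print(np.mean(x1 * x2), np.mean(x1) * np.mean(x2))   # nearly equal
```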

For further details see


Paul R. Halmos: Measure Theory, D. Van Nostrand Company, Inc., Princeton - Toronto - London - New York.
M. Loève: Probability Theory, D. Van Nostrand Company, Inc., Princeton - Toronto - London - New York.
Chapter I. Introduction

In view of the fact that sometimes there seems to be a terminological confusion, it


might be useful to try to explain what stochastic linear programming is. There
are many practical situations for which - at first glance - linear programs are
appropriate models. This is especially the case in production problems with
(piecewise) linear production functions and (piecewise) linear cost functions, diet
problems with (piecewise) linear cost functions, all other optimal mix problems
such as oil-refining, distillation of spirits etc., general network flow problems with
(piecewise) linear cost functions, critical path scheduling problems, Hitchcock-
type transport problems and so on. It is obvious that these and many similar
problems are of great practical importance. For this reason the development of
linear programming has been explosive during the last 25 years. Since at the same
time there has been an equally remarkable development of the computer technol-
ogy, linear programming can now be looked on as a standard tool for solving
problems as mentioned above.
Let us see under what conditions the application of linear programming can
be justified. If we solve one of the above problems by solving a linear program in
one of its standard forms
(1)   min c'x
      Ax = b
      x ≥ 0,

where x ∈ ℝⁿ, c ∈ ℝⁿ, b ∈ ℝᵐ, we must make sure that our problem not only has the
linear structure indicated in the linear program, but also that the coefficients in A,
b, c are, at least throughout the planning period, fixed known data. But everybody
will agree that this is not true in most practical problems. For example, if the
linear program represents a production problem, b is the demand vector, A is
the matrix of technological coefficients, c is the vector of costs per unit and x is
the vector of factors of production, i. e. x is the input into the production process
and shall be determined optimally. It is evident that in many practical situations
neither the demand vector nor the technological coefficients nor the cost vector
are fixed known data. Then there are three possibilities: Either these data are
stochastic variables with known (joint) probability distributions or they are
stochastic variables with unknown probability distributions or they are not
stochastic variables but simply variables. In all these cases a linear programming model does not make sense. At this point we can explain what the subject of this book is:
Stochastic linear programming (SLP) is concerned with problems arising when
some or all coefficients of a linear program are stochastic variables with known
(joint) probability distribution.

In this respect many users of linear programming have already been involved
in a special procedure of stochastic linear programming, namely by replacing the
random variables in a linear program by their expectation values, or fairly good
estimates of them, and solving the resulting linear program. The following
numerical example shows that this procedure is not feasible in all practical
situations. Suppose that the problem is
min x₁ + x₂
ax₁ + x₂ ≥ 7
bx₁ + x₂ ≥ 4
x₁ ≥ 0, x₂ ≥ 0,

where (a, b) is a uniformly distributed random vector within the rectangle {1 ≤ α ≤ 4, 1/3 ≤ β ≤ 1}. Then E(a, b) = (5/2, 2/3), so that the linear program would be

min x₁ + x₂
(5/2)x₁ + x₂ ≥ 7
(2/3)x₁ + x₂ ≥ 4
x₁ ≥ 0, x₂ ≥ 0,

which yields the unique optimal solution x̂ = (x̂₁, x̂₂) = (18/11, 32/11).
If we ask for the probability of the event, that this solution is feasible with respect to the original problem, we get

P{(a, b) | ax̂₁ + x̂₂ ≥ 7; bx̂₁ + x̂₂ ≥ 4} = P{(a, b) | a ≥ 5/2, b ≥ 2/3} = 1/4.

So this solution is infeasible with probability 0.75. If we associate with this simple


example any practical problems such as diet problems in hospitals or oil refining
problems involving such high quality restrictions as for aircraft gasoline, we must
agree that in many practical situations the approach chosen above cannot be
allowed. And even in cases where human safety is not involved, it seems to be
worthwhile to consider the loss of other goods, which may correspond to
infeasible solutions. Therefore, one should be careful when using the above
procedure in view of the possible practical consequences of infeasibility.
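The infeasibility probability of the mean-value solution in the example above can also be confirmed by simulation. The following sketch samples (a, b) from the given rectangle and checks the solution x̂ = (18/11, 32/11) against the original constraints; the sample frequency of infeasibility is close to 0.75.

```python
# Monte Carlo check of the infeasibility probability of the mean-value solution.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
a = rng.uniform(1.0, 4.0, n)
b = rng.uniform(1.0 / 3.0, 1.0, n)
x1, x2 = 18.0 / 11.0, 32.0 / 11.0

feasible = (a * x1 + x2 >= 7.0) & (b * x1 + x2 >= 4.0)
print(1.0 - feasible.mean())                # close to 0.75
```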
There are essentially two different types of models in SLP situations, namely the
so-called "wait and see" and the "here and now" models. "Wait and see" problems
are based on the assumption, that the decision maker is able to wait for the
realisation of the random variables and to make his decision with complete
information on this realization, i.e. if (Ā, b̄, c̄) is a realization of the random vector (A, b, c), he has to solve the linear program

(2)   min c̄'x
      Āx = b̄
      x ≥ 0.

Typical questions in this case are: What is the expectation of the optimal value
of (2), or what are the expectation and the variance of this optimal value and so
on. More generally the question is: What is the probability distribution of the
optimal value of (2)? A possible interpretation of this distribution problem is the
following: Suppose that a special production program (with linear structure)
may be adapted for any short period to actual realizations of random factor prices,
random technological coefficients and random demands. Planning the budget
for a long term - i.e. for many short periods - the board of the firm wants to
know the amount of cash needed for this production program "in the mean"
or "for 95% of the time". More precisely, the board wants to know the expectation,
or the 95% percentile, of the probability distribution of this special production
program's costs per (short) period.
"Here and now" models are based on the following assumption: A decision on
x - or on a "strategy" for x - has to be made in advance or at least without
knowledge of the realization of the random variables.
By a "strategy" for x we understand the game theoretical concept of "mixed
strategy" within a feasible set X of pure strategies x; or, equivalently, a
"strategy" for x is a probability measure P x on a Borel set Xc IRn. If we restrict
ourselves to probability distributions Px such that there exists an X EX with
P x ({ x} ) = 1, we are restricted to pure strategies, i.e. to decisions on x instead of
mixed strategies of x's.
The practical interpretation of a strategy is as usual the assumption that the
decision maker plays his game very often with - possibly different - x's resulting
from a Monte-Carlo simulation of the chosen probability distribution P x •
To understand the philosophy of "here and now" models, it seems to be necessary to start at the very beginning. Our first observation is that in a linear program some or all coefficients are random variables with a joint probability distribution. This implies - by definition of random vectors - the existence of a probability space (Ω, 𝔉, P_ω) such that {A(ω), b(ω), c(ω)} is a measurable transformation of Ω into ℝ^{mn+m+n}. Our general assumption for SLP is that we know P_ω. A further very important assumption is that a decision on x - or on a mixed strategy Pₓ - does not influence P_ω. More precisely, the events in Ω - i.e. the elements of 𝔉 - and the events in X - i.e. the Borel sets in X - are stochastically independent; or equivalently, the probability measure of the product space X × Ω is the product measure Pₓ × P_ω. It should be pointed out very clearly, that this assumption is not at all trivial from the practical point of view. If for example a producer with a large share in the factor market takes very extreme decisions on inputs, it seems very unlikely that these decisions do not influence input prices or quality, which would alter the technological coefficients. On the other hand there are certainly many cases where the assumption of stochastic independence is quite realistic. Therefore, in most practical cases we must check very seriously whether the influence of the producer's decision on the probability distribution P_ω may be neglected before applying one of the "here and now" models handled in this book.

A decision maker who does not want to choose his strategy at random out of
a certain feasible set must have a criterion telling him whether a certain strategy
is the "best" one or not. As is well-known, in decision theory there are different
concepts of what the "best" may be. One of them is that there is a partial ordering
on the set of feasible strategies Pₓ: then a "best" strategy is not necessarily "better"
(with respect to the partial ordering) than, or equivalent to, every other strategy,
but there is no other "comparable" and "better" strategy. Since, under a partial
ordering, not every pair of strategies need be comparable, it follows that there
may be a strategy which is "best" in virtue of not being comparable to any other
feasible strategy. This concept has important applications in multi-goal pro-
gramming.
However, we shall be concerned with a stronger concept of "best" strategy. Let us
assume that any two feasible strategies are comparable and that the result of the
comparison says that either one strategy is "better" than the other or both
strategies are "equivalent". In other words, either we prefer one strategy to the
other or we are indifferent. Furthermore, we suppose that the decision maker is
consistent in the following sense. When he has preferred a strategy Pₓ⁽¹⁾ to Pₓ⁽²⁾ and also has preferred Pₓ⁽²⁾ to Pₓ⁽³⁾, then he will prefer Pₓ⁽¹⁾ to Pₓ⁽³⁾. When he is indifferent with respect to Pₓ⁽¹⁾ and Pₓ⁽²⁾, then he also thinks of Pₓ⁽²⁾ as equivalent to Pₓ⁽¹⁾. And when he believes Pₓ⁽¹⁾ to be equivalent to Pₓ⁽²⁾ and Pₓ⁽²⁾ to be equivalent to Pₓ⁽³⁾, then he will also be indifferent with respect to Pₓ⁽¹⁾ and Pₓ⁽³⁾. To obtain such a preferential scheme we need, on 𝔓 × 𝔓, a strict linear ordering ≺ (the "better" relation) and an equivalence relation ~ (the "indifferent" relation) so that for any Pₓ⁽ⁱ⁾ ∈ 𝔓, i = 1, 2, 3, the following statements hold:
1) One and only one of the relations Pₓ⁽¹⁾ ≺ Pₓ⁽²⁾, Pₓ⁽²⁾ ≺ Pₓ⁽¹⁾, Pₓ⁽¹⁾ ~ Pₓ⁽²⁾ holds.
2) If Pₓ⁽¹⁾ ≺ Pₓ⁽²⁾ and Pₓ⁽²⁾ ≺ Pₓ⁽³⁾, then Pₓ⁽¹⁾ ≺ Pₓ⁽³⁾.
3) Pₓ⁽¹⁾ ~ Pₓ⁽¹⁾; and if Pₓ⁽¹⁾ ~ Pₓ⁽²⁾, then Pₓ⁽²⁾ ~ Pₓ⁽¹⁾.
4) If Pₓ⁽¹⁾ ~ Pₓ⁽²⁾ and Pₓ⁽²⁾ ~ Pₓ⁽³⁾, then Pₓ⁽¹⁾ ~ Pₓ⁽³⁾.

In the economic as well as in the mathematical literature the question on the existence of a real-valued criterion function f: 𝔓 → ℝ inducing a given preferential scheme is discussed in great detail. This is not our aim at this point, because in SLP-situations a decision maker does not first have a preferential scheme on the feasible strategies and then look for an order preserving real-valued function, but, conversely, it seems to be more likely that he has a certain criterion function f: 𝔓 → ℝ which yields in a very natural way a preferential scheme on 𝔓 × 𝔓:
a) f(Pₓ⁽¹⁾) < f(Pₓ⁽²⁾) ⟺ Pₓ⁽¹⁾ ≺ Pₓ⁽²⁾
b) f(Pₓ⁽¹⁾) = f(Pₓ⁽²⁾) ⟺ Pₓ⁽¹⁾ ~ Pₓ⁽²⁾

One can easily verify that the relations "≺" and "~" defined by a) and b) fulfil the above conditions 1) to 4). We shall discuss now how to construct a criterion function on 𝔓 in an SLP situation. As we have seen, we shall then have also solved the problem - at least implicitly - of how to get a preferential scheme on 𝔓 × 𝔓.

The following construction has turned out to be appropriate to SLP-problems.
Given a realization of (A, b, c), i.e. for a certain ω, and given a certain decision x which, as outlined above, may also be understood as a realization of a strategy Pₓ, we are interested in the result e(ω, x) = (c'(ω)x, A(ω)x − b(ω)), which represents the actual value of the objective function and the vector of "deviations" in the constraints. First we define a loss function L on the set E of possible results e(ω, x), i.e. L: E → ℝ. Obviously L(e(ω, x)) represents the value of loss we assign to both the actual value of c'(ω)x (for example the actual costs) and the vector of deviations in the constraints. It should be noted that L depends on the actual strategy x and the realisation ω. However, what we want to have is a real-valued function on the set 𝔓 of feasible strategies. Therefore we need a transformation Φ which assigns a real-valued function on 𝔓 to any loss function L, i.e. Φ: 𝔏 → ℜ, where 𝔏 = {L | L: E → ℝ} and ℜ = {F | F: 𝔓 → ℝ}.
Now to define a special "here and now" problem we have to define the set 𝔓 of feasible strategies, to choose a special loss function L: E → ℝ and a special transformation Φ: 𝔏 → ℜ. The set 𝔓 may, or may not, depend on the probability distribution P_ω. Additionally, 𝔓 may be restricted to certain types of probability distributions on X, where, in most practical cases, X is a convex polyhedral set. If 𝔓 = {Pₓ | ∃ x̄ ∈ X: Pₓ({x̄}) = 1}, we shall write for simplicity 𝔓 = X. As we have seen, L depends on (c'(ω)x, A(ω)x − b(ω)). Therefore, to define L we - or the decision maker - must know which value or penalty costs have to be assigned to an actual deviation A(ω)x − b(ω), and how to take them into account, together with c'(ω)x, which may represent the actual costs. Consequently, Φ represents the influence of all possible losses - with respect to X and Ω - and of P_ω and the feasible Pₓ on the criterion function. We summarize this construction of the criterion function as follows:
Given a probability space (Ω, 𝔉, P_ω), a measurable transformation (A(ω), b(ω), c(ω)): Ω → ℝ^{mn+m+n} and a Borel set X ⊂ ℝⁿ of feasible pure strategies, we define
(i) E = {(c'(ω)x, A(ω)x − b(ω)) | ω ∈ Ω, x ∈ X}
(ii) 𝔓 = {Pₓ | Pₓ is a probability measure on X; possibly further conditions}
(iii) L: E → ℝ
(iv) Φ such that ΦL = F: 𝔓 → ℝ.
The problem then is to minimize F on 𝔓.
To show that this construction is not so artificial as it might seem, we shall conclude this section with some examples.
Suppose that
1) 𝔓 = X ∩ {x | P_ω(A(ω)x − b(ω) ≥ 0) ≥ α}, α ∈ [0, 1]
   L(e(ω, x)) = c'x
   ΦL = E_ω L(e(ω, x)).
The problem resulting from these definitions is

min E_ω L(e(ω, x)) on 𝔓,

or equivalently

min E_ω c'x
with respect to P_ω(A(ω)x − b(ω) ≥ 0) ≥ α
               x ∈ X.

This is a so-called chance constrained programming problem, which means that the expectation of the original objective function c'x shall be minimized with respect to the constraints that x ∈ X (for example x ≥ 0) and that the decision - or pure strategy - x must be feasible in Ax ≥ b with probability at least α.
2) We have an entirely different type of problem, if
   𝔓 = {Pₓ | Pₓ(X) = 1}
   L(e(ω, x)) = c'(ω)x + min {q'y | Wy = b(ω) − A(ω)x, y ≥ 0},
where W, q may also be deterministic or stochastic, in the sense that they depend on ω ∈ Ω, and ΦL is some function F of the moments E_{Pₓ×P_ω}{Lⁱ(e(ω, x))}, i = 1, ..., r.
In particular, the case when F = E_{Pₓ×P_ω}{L(e(ω, x))} + λ · σ_{Pₓ×P_ω}{L(e(ω, x))} with λ ≥ 0 seems to be of practical importance.
This type of problem is called a two stage problem of SLP or an SLP problem with recourse. The practical meaning of problems with recourse is this:
When we have determined x (or x has been determined as a realization of a strategy Pₓ) and when we observe a realization of the random vector (A(ω), b(ω), c(ω)), then there may be a deviation from the original constraints, i.e. A(ω)x = b(ω) may not be satisfied. Such a deviation causes penalty costs arising from a second stage linear program min {q'y | Wy = b(ω) − A(ω)x, y ≥ 0}, which may be understood as an "emergency program", yielding the last possibility of compensating for the deviation from the original constraints. The total costs observed in this situation are the sum of the original costs c'(ω)x and the penalty costs, and - since they depend on ω and x - these total costs are random. The objective is to determine x or Pₓ such that some function F of certain moments of the total costs becomes minimal, for example the expected total costs or - if risk aversion is involved - a weighted sum of the expectation and the standard deviation of the total costs.
3) A further possibility is:
   𝔓 = X ∩ {x | P_ω(c'(ω)x ≤ γ) ≥ α}
   L(e(ω, x)) = 1 if A(ω)x ≥ b(ω), and L(e(ω, x)) = 0 otherwise
   ΦL = −E_ω L.
The problem of minimizing ΦL on 𝔓 can be written as

max_{x∈X} P_ω(A(ω)x ≥ b(ω))
with respect to P_ω(c'(ω)x ≤ γ) ≥ α;

this means that the costs c'(ω)x should not exceed a certain prescribed value γ with probability α and that with respect to this constraint the probability of feasibility, which may be a measure of reliability of the production process, shall be as high as possible. Therefore we call this problem reliability optimization in SLP. Though


these problems have not yet been investigated satisfactorily, they seem to be of
great practical importance in many production processes in which a high quality
rather than minimal costs should be achieved. Since the theoretical questions
with respect to this problem are essentially the same as in chance constrained
programming, we shall not handle them separately.
4) Problems as known in two-person-zero-sum games arise, for example, in the following form:

L(e(ω, x)) some real valued function
ΦL = ess sup_{ω∈Ω} L(e(ω, x)).

Then we have to determine

min_{x∈X} ess sup_{ω∈Ω} L(e(ω, x)).

However, we shall not discuss these problems here because their general theory is contained in the literature on decision theory and they do not really make use of the information on P_ω, i.e. they just take into account some sets of measure zero with respect to P_ω (in determining ess sup_{ω∈Ω} L).

We shall discuss the distribution problem as well as the problem with


recourse and the chance constrained problem. From this discussion the reader
may conclude how many problems are still unsolved and to which of them he
might direct his efforts.
Whereas the interpretation of the distribution problem should be clear from the
description given above, it might be helpful to have an example for chance con-
strained and two stage programming.
Assume that for producing iron you have two ores with different concentrations a₁ and a₂, and that the iron production should meet essentially the demand b (e.g. per month) of one customer. The costs per unit of the ore may be c₁ and c₂, respectively. According to your production capacity you may not smelt more than d units of ore (per month). If all the data were deterministic, we should have the linear program

min (c₁x + c₂y)
subject to a₁x + a₂y ≥ b
           x + y ≤ d
           x ≥ 0, y ≥ 0,

where x and y are the quantities of ores smelted.
Suppose now that only the demand b is random and, for simplicity, uniformly distributed on the interval 1.2 ≤ b ≤ 1.6, whereas a₁ = 0.5; a₂ = 0.3; c₁ = 2; c₂ = 1 and d = 4 are deterministic. Assume further that you do not have a capacity to store iron and that you must order the monthly quantities of ore x and y before knowing the actual iron demand b. However to maintain the good will of your customer,

you should be able to deliver the quantity b of iron required with a high probability, e.g. with probability 0.9. This yields the chance constrained program

min [2x + y]
subject to P(0.5x + 0.3y ≥ b) ≥ 0.9
           x + y ≤ 4
           x ≥ 0, y ≥ 0.

In this particular case we get an equivalent linear program, since P(0.5x + 0.3y ≥ b) = F_b(0.5x + 0.3y), where F_b is the distribution function of the random variable b, and the constraint F_b(0.5x + 0.3y) ≥ 0.9 is equivalent to 0.5x + 0.3y ≥ F_b⁻¹(0.9) = 1.56 according to our assumptions on the probability distribution of b. As we shall see in Ch. IV, chance constrained programs are not equivalent to linear or convex programs in general. In our simple example the assumption of stochastic concentrations a₁ and a₂ would already raise difficulties.
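For this example the equivalent linear program can be written down and solved directly; the sketch below (using SciPy, which is of course not assumed in the text) computes the quantile F_b⁻¹(0.9) = 1.56 and solves the resulting LP.

```python
# Equivalent LP of the chance constrained iron example.
import numpy as np
from scipy.optimize import linprog

q = 1.2 + 0.9 * (1.6 - 1.2)                 # F_b^{-1}(0.9) = 1.56
c = np.array([2.0, 1.0])                    # costs of the two ores
A_ub = np.array([[-0.5, -0.3],              # 0.5x + 0.3y >= 1.56
                 [ 1.0,  1.0]])             # x + y <= 4
b_ub = np.array([-q, 4.0])

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None)] * 2, method="highs")
print(res.x, res.fun)
```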
Suppose now a somewhat different situation. You have a contract with your customer to meet exactly his monthly random demand b during the next two years. But you still have to order the ores in advance. If you produce a surplus of iron, you may sell it to other customers at the (low) price q₁ = 2; if you, on the other hand, produce less than b, you must buy the difference yourself at the price q₂ = 4 to fulfil your contract. Hence after knowing the demand b of a particular month, you will have additional costs Q(x, y, b), following from the recourse program

Q(x, y, b) = min (−2z₁ + 4z₂)
subject to z₂ − z₁ = b − 0.5x − 0.3y
           z₁ ≥ 0, z₂ ≥ 0.

Minimizing the total costs (per month) in the mean, i.e. the expected total costs, yields the two stage program

min [2x + y + E_b Q(x, y, b)]
subject to x + y ≤ 4
           x ≥ 0, y ≥ 0,

where E_b Q(x, y, b) is the expectation of Q(x, y, b) with respect to the distribution of b.
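Since the recourse program has an obvious optimal second stage decision (sell the surplus or buy the shortage), E_b Q(x, y, b) can be evaluated by a one-dimensional integral and the two stage program minimized numerically. The sketch below does this with general-purpose SciPy routines; it illustrates the model and is not the computational method discussed later for two stage problems.

```python
# Two stage iron example: minimize 2x + y + E_b Q(x, y, b) subject to x + y <= 4.
import numpy as np
from scipy.integrate import quad
from scipy.optimize import minimize

def expected_recourse(t, lo=1.2, hi=1.6):
    # Q = 4*max(b - t, 0) - 2*max(t - b, 0) with t = 0.5x + 0.3y, b uniform
    integrand = lambda b: (4.0 * max(b - t, 0.0) - 2.0 * max(t - b, 0.0)) / (hi - lo)
    return quad(integrand, lo, hi)[0]

def total_cost(v):
    x, y = v
    return 2.0 * x + y + expected_recourse(0.5 * x + 0.3 * y)

cons = [{"type": "ineq", "fun": lambda v: 4.0 - v[0] - v[1]}]   # x + y <= 4
res = minimize(total_cost, x0=[1.0, 2.0], bounds=[(0, None)] * 2,
               constraints=cons, method="SLSQP")
print(res.x, res.fun)
```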
Whether to choose the chance constrained or the recourse model cannot be
determined theoretically. This choice depends on the real situation and on the
aims of the decision maker.
Chapter II. Distribution Problems

1. The General Case

Let us consider the linear program

(1)   determine γ = inf c'x
      subject to Ax = b
                 x ≥ 0,

where x ∈ ℝⁿ, A is a real (m × n)-matrix, and b and c are real m- and n-vectors respectively. Without loss of generality we assume that m ≤ n.
If we define γ = +∞ whenever (1) has no feasible solution, then γ = γ(A, b, c) is an extended real-valued function, defined on ℝ^{mn+m+n}.
Obviously γ represents the optimal value of (1), if this linear program has a solution.
As we have seen in Ch. I, in an SLP situation there is a known probability space (Ω, 𝔉, P_ω) and a measurable transformation (A(ω), b(ω), c(ω)): Ω → ℝ^{mn+m+n} such that (A, b, c) = (A(ω), b(ω), c(ω)) is a random vector. Therefore γ(A, b, c) becomes γ(ω) = γ(A(ω), b(ω), c(ω)). Now we may state the general

Distribution problem: Given P_ω, what is the probability distribution of γ(ω)?
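Before turning to the measurability question, the distribution problem can be pictured by simulation: sample realizations of the random data, solve (1) for each, and inspect the empirical distribution of γ(ω). The sketch below does this for an assumed toy problem in which only b(ω) is random.

```python
# Monte Carlo sketch of the distribution problem for an assumed toy LP.
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(1)
A = np.array([[1.0, 1.0, 1.0]])             # deterministic technology, for simplicity
c = np.array([2.0, 1.0, 3.0])

values = []
for _ in range(2000):
    b = rng.uniform(1.0, 3.0, size=1)       # only b(w) random in this toy example
    res = linprog(c, A_eq=A, b_eq=b, bounds=[(0, None)] * 3, method="highs")
    values.append(res.fun)

values = np.array(values)
print(values.mean(), np.quantile(values, 0.95))   # e.g. mean and 95% percentile
```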
First of all we have to assure that this problem is mathematically meaningful, i.e. that γ(ω) is a random variable or, equivalently, that γ(ω): Ω → ℝ̄ is measurable. This may be concluded by Th. 0.18 and the following

Theorem 1. γ(A, b, c): ℝ^{mn+m+n} → ℝ̄ is a Borel measurable extended real-valued function.
Proof: We shall prove this statement by constructing a countable set of Borel measurable functions γᵢₙ: ℝ^{mn+m+n} → ℝ̄ and showing that γ = supₙ infᵢ γᵢₙ. Applying Th. 0.19 we get the desired result.
Let 𝒟 be a countable set of vectors xⁱ ≥ 0 which is dense in {x | x ∈ ℝⁿ, x ≥ 0}, for example the set of all non-negative n-tuples of rationals.
Define, using the Euclidean norm as vector-norm,

𝔄ᵢₙ = {(A, b, c) | ‖Axⁱ − b‖ ≤ 1/n}

and

γᵢₙ = c'xⁱ if (A, b, c) ∈ 𝔄ᵢₙ, and γᵢₙ = +∞ otherwise.

Obviously γᵢₙ: ℝ^{mn+m+n} → ℝ̄ is Borel measurable, since the Euclidean norm defining the natural topology is continuous and therefore Borel measurable (see Th. 0.20), which implies that the 𝔄ᵢₙ are Borel measurable sets. Furthermore γᵢₙ is continuous in c on 𝔄ᵢₙ.
Let us show that γ = supₙ infᵢ γᵢₙ in two steps:
A) infᵢ γᵢₙ ≤ γ for n = 1, 2, ...
We have to distinguish three possible cases:
A1) γ = +∞.
In this case infᵢ γᵢₙ ≤ γ is trivial.
A2) −∞ < γ < ∞.
Then there exists a vector x⁰ ≥ 0 such that Ax⁰ = b and γ = c'x⁰. For an arbitrarily chosen natural number n and any real positive ε let δ(ε, n) = min [1/(n‖A‖); ε/‖c‖].
Since 𝒟 is dense in {x | x ≥ 0}, there exists an xⁱ ∈ 𝒟 such that ‖x⁰ − xⁱ‖ ≤ δ(ε, n). For this xⁱ we have

‖Axⁱ − b‖ = ‖Axⁱ − Ax⁰‖ ≤ ‖A‖ ‖xⁱ − x⁰‖ ≤ ‖A‖ · δ(ε, n) ≤ 1/n

and therefore (A, b, c) ∈ 𝔄ᵢₙ and γᵢₙ = c'xⁱ ≤ c'x⁰ + ε = γ + ε.
Since we can repeat this procedure for any ε > 0 and for all natural n, we have proved that infᵢ γᵢₙ ≤ γ for n = 1, 2, ...
A3) γ = −∞.
In this case there exist vectors x^ν, ν = 1, 2, 3, ..., such that x^ν ≥ 0, Ax^ν = b, c'x^ν ≤ −ν.
Again let

δ(ε, n) = min [1/(n‖A‖); ε/‖c‖].

Then to each x^ν there exists an x^{i_ν} ∈ 𝒟 which satisfies ‖x^ν − x^{i_ν}‖ ≤ δ(ε, n).
Again we have

‖Ax^{i_ν} − b‖ ≤ 1/n

and

|γ_{i_ν n} − c'x^ν| ≤ ε

and thereby γ_{i_ν n} ≤ −ν + ε, which yields infᵢ γᵢₙ = −∞ for n = 1, 2, 3, ...

B) supₙ infᵢ γᵢₙ = γ.
B1) γ = −∞.
From A3) we know that infᵢ γᵢₙ = −∞ for n = 1, 2, 3, ....
Therefore the statement supₙ infᵢ γᵢₙ = γ is trivial.

B2) γ = +∞.
In this case {x | Ax = b, x ≥ 0} = ∅ and therefore ϱ = inf_{x≥0} ‖Ax − b‖ > 0. This implies

(A, b, c) ∉ 𝔄ᵢₙ for all n > 1/ϱ and for all i with xⁱ ∈ 𝒟.

For all these i and n we have γᵢₙ = ∞ and consequently supₙ infᵢ γᵢₙ = ∞ = γ.
B3) -00< Y< 00
In this situation there exists an optimal basic solution. Without loss of generality,
we may assume that A has rank m. Let B be the optimal basis, i.e. an (m x m)-sub-
matrix of A, and AN the matrix of non-basic columns of A. Let c be the vector of
the components of c belonging to the basic variables and CN the vector of the
x
remaining components of c. Finally, let be the vector of basic variables and Y
the vector of non-basic variables. Then the following relations must hold:
B-1b~0 (feasibility)
y=c'B- 1b
c;' -c' B- 1AN~O (optimality).
If sUPninfiYin < Y -e for some e> 0, then for every n there exists a in such that
Yi.n < Y -~. For simplicity we suppose now that for every n there exists Xi. Efi}
such that II AiXin -bll ~kand Yinn=C'Xi n< y. (Otherwise there would exist an n such
that Yin~Y for all i, and by A2) we should conclude that sUPninfiYin=Y). Then
Xi.~O and AXi.=b+dm where Ildnll ~~.
Consider the linear program
"Cn=infc'x
(2) subject to Ax=b+dn
x~O.

Obviously "Cn~ Yi.n < Y, since Xi. is feasible in (2). On the other hand "Cn must be
finite, because "Cn= -00 would imply the existence of a vector w~O, Aw=O,
c'w< 0 (see Th. 0.4), which contradicts our assumption that (1) had a finite
optimal value y. With respect to the basis B the basic and non-basic variables
x and y of any optimal solution of (2) must satisfy the equation
x=B- 1b+B- 1dn _B- 1ANY'
Since

we get
"Cn=c' B- 1b+c' B- 1dn+(c;'-c' B- 1A N)y
~y+c' B- 1dn
and therefore
0< Y-Yi•n~Y-"Cn~1 C'_ B-1dnl~ II cll'll B- 1 11· 1
n ,
and this implies again that sUPninfiYin =y.
{γᵢₙ} is a countable class of Borel measurable functions. By Th. 0.19, {infᵢ γᵢₙ} is a countable class of Borel measurable functions and, again by Th. 0.19, γ = supₙ infᵢ γᵢₙ is therefore Borel measurable. □
If we consider the optimal solution instead of the optimal value of (1), the conjecture that this optimal solution is also a Borel measurable transformation must be false, since the optimal solution is in general not uniquely determined. However, we may prove the following

Theorem 2. There is a Borel measurable transformation x̂: ℝ^{mn+m+n} → ℝ̄ⁿ which coincides with an optimal solution of (1), whenever (1) has a solution.

Proof. Let ℛ = ℝ^{mn+m+n},
𝔐 = {(A, b, c) | (1) has no feasible solution},
ℨ = {(A, b, c) | (1) has feasible solutions and rank(A) = m},
𝔑 = {(A, b, c) | (1) has feasible solutions and rank(A) < m}.
Obviously 𝔐, 𝔑, ℨ are disjoint sets and

ℛ = 𝔐 ∪ 𝔑 ∪ ℨ.

First we show that 𝔐, 𝔑, ℨ are Borel measurable.
a) 𝔐 is a Borel measurable subset of ℛ.
As we know from the proof of Th. 1, 𝔐 = {(A, b, c) | inf_{x≥0} ‖Ax − b‖ > 0}.
Let 𝒟 = {xⁱ | i = 1, 2, ...} be a countable dense subset of {x | x ∈ ℝⁿ, x ≥ 0}.
Then

inf_{x≥0} ‖Ax − b‖ = inf_{xⁱ∈𝒟} ‖Axⁱ − b‖ for all (A, b).

This follows immediately from the fact that

inf_{x≥0} ‖Ax − b‖ ≤ inf_{xⁱ∈𝒟} ‖Axⁱ − b‖ for all (A, b)

and that simultaneously

inf_{x≥0} ‖Ax − b‖ ≥ inf_{xⁱ∈𝒟} ‖Axⁱ − b‖,

which we explain as follows:
For some (A, b) let {x^ν} be a sequence, x^ν ≥ 0, such that ‖Ax^ν − b‖ tends to inf_{x≥0} ‖Ax − b‖.
Since 𝒟 is dense in {x | x ≥ 0}, there exist x^{i_ν} ∈ 𝒟 with the property that ‖x^ν − x^{i_ν}‖ ≤ 1/ν, ν = 1, 2, ....
Hence

‖Ax^{i_ν} − b‖ = ‖Ax^{i_ν} − Ax^ν + Ax^ν − b‖
             ≤ ‖A‖ · ‖x^{i_ν} − x^ν‖ + ‖Ax^ν − b‖
             ≤ ‖A‖ · 1/ν + ‖Ax^ν − b‖

and, therefore,

‖Ax^{i_ν} − b‖ → inf_{x≥0} ‖Ax − b‖ as ν → ∞,

which implies the desired inequality.

Now for any xⁱ ∈ 𝒟, ‖Axⁱ − b‖ is a Borel measurable function on ℛ, since it is continuous. Therefore, by Th. 0.19,

inf_{xⁱ∈𝒟} ‖Axⁱ − b‖

is also a Borel measurable function and consequently the set

𝔐 = {(A, b, c) | inf_{xⁱ∈𝒟} ‖Axⁱ − b‖ > 0}

is Borel measurable.
b) 𝔑 is a Borel measurable subset of ℛ.
Let B₁, ..., B_r be all (m × m)-submatrices of A, i.e. r = (n choose m).
Then

𝔑 = ⋂ᵢ₌₁ʳ {(A, b, c) | det Bᵢ = 0} − 𝔐.

From this relation the measurability of 𝔑 follows immediately, since det Bᵢ is a continuous and therefore Borel measurable function.
c) ℨ is a Borel measurable subset of ℛ.
This statement is now trivial, since

ℨ = ℛ − 𝔐 − 𝔑.

To define x̂, we have to determine a further disjoint partition of ℨ into measurable sets:

ℭᵢ = {(A, b, c) | det Bᵢ ≠ 0, Bᵢ⁻¹b ≥ 0} − ⋃ₖ₌₁^{i−1} ℭₖ,  i = 1, ..., r,

𝔇ᵢⱼ = {(A, b, c) | (A, b, c) ∈ ℭᵢ, cⱼ − c'_{Bᵢ}Bᵢ⁻¹Aⱼ < 0, Bᵢ⁻¹Aⱼ ≤ 0} − ⋃ₖ₌₁^{j−1} 𝔇ᵢₖ,  j = 1, ..., n, i = 1, ..., r,

where cⱼ is the j-th component of c, c_{Bᵢ} is the vector of components of c belonging to the basis Bᵢ, and Aⱼ is the j-th column of A; finally,

𝔈ᵢ = {(A, b, c) | (A, b, c) ∈ ℭᵢ, c' − c'_{Bᵢ}Bᵢ⁻¹A ≥ 0},  i = 1, ..., r.

By definition we have a finite number of disjoint sets 𝔈ᵢ and 𝔇ᵢⱼ. To show that

ℨ = ⋃ᵢ₌₁ʳ (𝔈ᵢ ∪ ⋃ⱼ₌₁ⁿ 𝔇ᵢⱼ),

suppose (A, b, c) to be in ℨ. Then the linear program (1) has either a finite optimal solution and therefore also an optimal feasible basic solution, implying (A, b, c) ∈ 𝔈ᵢ for some i, or the objective is not bounded from below, implying that there must be a feasible basic solution - i.e. Bᵢ⁻¹b ≥ 0 - such that some non-basic variable may be augmented arbitrarily without violating nonnegativity - i.e. Bᵢ⁻¹Aⱼ ≤ 0 - and thereby decreasing the objective arbitrarily - i.e. cⱼ − c'_{Bᵢ}Bᵢ⁻¹Aⱼ < 0. Hence (A, b, c) ∈ 𝔇ᵢⱼ for some (i, j). Conversely it is clear that

⋃ᵢ₌₁ʳ (𝔈ᵢ ∪ ⋃ⱼ₌₁ⁿ 𝔇ᵢⱼ) ⊂ ℨ.
24 Chapter II. Distribution Problems
n
To define x on 3 we shall represent this vector on (fj U U 'nij by its basic and non-
basic parts xj , / respectively. Then j~ 1

and
.I{oo
Y~=
0 for all v, if (A, b, c) E(fj
for v=j.
o for v#j, If (A,b,c)E'nij,
which defines x on 3 as a Borel measurable transformation yielding a solution
to (1), whenever (1) has a solution and A has full rank. Furthermore we define
x on Wl by
, {+oo
Xj= . Cj~O}, If
if . (A,b,c)EWl.
-00 If Cj<O

On 91 we may define x in the same way as on 3 taking into account the fact that, for
(A, b, c) E 91, a certain subset of (linearly dependent) constraints of (1) can be deleted
yielding a new linear program with ml < m linear independent constraints. D
In a certain sense Th. 2 is stronger than Th. 1.
Precisely:
Theorem 3. Let x be the measurable transformation of Th. 2 and y = c' X. Then y is
a Borel measurable extended real valued function on IRmxn+m+n and y=Y almost
everywhere with respect to the Lebesgue measure.
Proof. Every component Xj of x is a Borel measurable function on IRmxn+m+n as
well as every component Cj of c. Therefore y=c'x as a sum of products of
measurable functions is measurable (see Th. 0.19).
Ymay differ from y just on the set
~ = {(A,b, c)1 (A,b,c) E9Jl, c=O},

because there y=O (if we use the definition 0'00=0) and y=oo.
But the Lebesgue measure of ~ is equal to zero. D
Although these results indicate that, from a mathematical point of view, the
distribution problem is meaningful, we may still get distribution functions of the
optimal value with defect, i.e. it may happen that the probability
P",({wl -00< y< oo})< 1.

In this case the moments of the random variable y certainly do not exist.
Since, however, in many practical cases decision makers are interested in the
mean value and the variance, we shall try to characterize those problems whose
optimal value has a distribution function without defect. Mter that we shall
investigate some special types of problems which are of practical importance as
well as mathematically "well behaved". The following results are due to Bereanu
[4 ].
1. The General Case 25

Theorem 4. Pro({wl- oo < y< oo} )= 1 if and only if the implications:


a) for all u, if u' A ~ 0 then b' u ~ 0
b) for all w~O, if Aw=O then c'w~O
hold with probability 1.
Proof P",({wl- oo < y< oo})= 1 if and only if
a) (1) has feasible solutions with probability 1,
b) the objective c'x of (1) is unbounded from below on the feasible set with
probability o.
If we define
~ = {(A, b, c) I Ax = b, x ~ 0 has feasible solutions}
= Bl-IDl in the terminology of Th. 2
and
U= {(A,b,c)1 Ax=b, x~O, has feasible solutions such that c'x is unbounded from
below}
={(A,b,c)I(A,b,c)E~ and there exist w~O such that Aw=O and c'w<O},

then we must have


a) P",(~)=1
b) P",(U) =0.
By Farkas' lemma (see Th. 0.1)
(A,b,c)E~ if and only if u'A~O implies b'U~O(UElR.m),

and therefore
P",(~)=P", ({(A,b,c) Iu'A~O implies b'u~O})= 1.

Finally, we need P",(U) = 0 or equivalently P",(Bl- U) = 1.


Since
U=~ II {(A,b,c)13w~0,Aw=0 implies c'w< OJ,
we get
Bl-U=(Bl-~)u{(A,b,c)1 w~O, Aw=O implies c'w~O},

and hence
f=P",(Bl-U)~P",(Bl- ~)+P",({(A,b,c)1 w~O, Aw=O implies c'w~O}).
The fact, that P",(~) = 1 and consequently P",(Bl-~) = 0, completes the proof. 0
Let us now consider two special types of linear programs, which satisfy the
conditions of Th. 4 and hence yield an optimal value, whose distribution function
has no defect
y=infc'x
(3) Ax~b
x~O
where n
L aij> 0, i= 1, .. . ,m,
j= 1
m
L aij>O, j=I, ... ,n and
i= 1

Cj~O, j=l, ... ,n, withprobability1.


26 Chapter II. Distribution Problems

We call this type of linear program a positive linear program. The assumptions
made here are quite realistic if we imagine that (3) represents a production
program, where A is the technological matrix, x represents the input, Ax the out-
put, b the demand vector and Cj the cost per unit of the j-th factor of production.

Theorem 5. For any positive linear program (3),

Proof Introducing slack variables we get

y=infd'z
Bz=b
z;;::::O

where d'=(c',O')
B=(A, -E)
z'=(x',y)

To verify condition a) of Th. 4, let UEIRm be such that u'B;;::::O, i.e. u'A;;::::O and
u'::;;O.
Since (3) is positive, it follows that u=o and therefore that b'u=O. Condition b)
of Th. 4 follows from the fact that w;;:::: 0 already implies d' w;;:::: 0.0
Another special type is the stochastic transportation problem, where the unit
transportation costs, the supplies and the demands are assumed to be random
variables with positive range and such that the total demand almost surely does
not exceed the total supply.
m n

y=inf L L CijXij
i= 1 j= 1

n
(4) L Xij::;;ai, i=l, .. . ,m
j= 1

m
L Xij;;::::b j, j=l, ... ,n
i= 1

n m
where cij;;::::O, ai;;::::O, bj;;::::O and L bj::;; L ai with probability 1.
j= 1 i= 1

Theorem 6. The distribution function F/~) of the stochastic transportation problem


(4) has no defect, i.e.
1. The General Case 27

Proof. Introducing slack variables, we get the matrix

d' I
I
d' I
, I
\ 1m I
A=
\ I
'd' I II
---1--1--
In I
I
I -In
I

where d' =(1,1, ... ,1) and Ir is the (r x r)-identity matrix.


~

n times

Let uEIRm+n, i.e.


u=(::) with u1EIRm, u2 EIRn.

Fromu'A~Oitfollows thatu 1~O, u2~Oand uf +uJ~O, for i= 1, .. . ,m,j= 1, .. . ,n,


and hence, with probability 1,
m n m n

L aiuf+ j=1
i=1
L bjuJ~ 1_._m L ai+ 1_1_n
~~!l uf· i=1 L bj
~~~ uJ- j=1
n
~ (min uf + min uf)'
1:'Oi:'Om 1:'Oj:'On
L bj~O,
j=1

which coincides with condition a) of Th. 4.


On the other hand, condition b) of Th. 4 is trivially satisfied, since for
W'=(W1',W2')~O, where w 1 EIRm·n and w2EIRm+n,c'w1~O.O

It often happens, as in problems (3) and (4), that only a certain subset of the
coefficients of a linear program are random variables. We may express this fact
by a reformulation of the general problem due to Bereanu [3], which allows
statements of all possible kinds of SLP distribution problems.
Let Tc IRr be the range of a random vector 1= (I h' .. , Ir)' and

y(/)=inf c'(t)·x
(5) subject to A(/)x=b(/)
x~O
where
A(/)=AO+A 1'/ 1+A 2'12+" '+Ar/r
b(/)=bo+b1·/1+b2·12+···+br·/r
C(/) = CO + c 1 . 11 + C2 . 12 + ... +C' Ir

with deterministic real matrices Ai, bi, ci.


By Th. 1 it is obvious, that y(/), if it exists, is Borel measurable in I, since
A(/), b(/), c(/) are continuous in I.
28 Chapter II. Distribution Problems

Theorem 7. Let B(t) be an (m x m)-submatrix of A(t) and Jlr the Lebesgue measure
on JR'. Then either detB(t)=O for all t E T or Il,({t IdetB(t) =O}) =0.
Proof Suppose it is not the case that detB(t) =0 for all t E T, i.e. there exists a
t* E T such that detB(t*) =1= o.
Obviously lp(t)=detB(t) is an algebraic function in t, i.e.

L" rt.d"d
lp(t)=
.= t:"
2'•••

where irv~O are integers. Now for any algebraic function lp(t) either lp(t)=O or
Il,({t I lp(t) =O}) =0, as we shall see by induction to r.
For r=1 the fundamental theorem of algebra asserts that lp(t)=O or lp(t) has a
finite number of roots, i.e. 1l1({t Ilp(t) =O}) =0.
Assume that the statement is true for any algebraic function of at most r-1
variables t h •.. , t,_ 1. Consider now

lp(t)= L" rt.d" d 2 ' •• • ,t:" .


•= 1
Then either lp(t)=O or
A
lp(t)= L lpl'(t h ·· .,t,-I)·t/,
1'=0

where lpith . .. , t,-I) are algebraic functions of at most r -1 variables t 1, •.. , t,-1
such that there exists at least onello with lpl'o(t 1,· .. , t,_ 1) ¥=O.

{t Ilp(t) =O} = [(t Ilpl'o(t 1, ... , t,_ 1) =O}n {t Ilp(t) =O} ]u


u [{tl lpl'o(t l , ••• ,t,_I)=I=O}n{tl lp(t)=O}]
=<i:u3)

Now for any t,-section <i:/, of <i: we have, by assumption,

which implies Il,(<i:) =0 by Th. 0.25. And for any (th .. . ,t,_I)-section 3)/'-1) of
3) we get
A

1l1(3)/'-I)\ =1lt{{t,1 L lpl'(t 1,·· .,t,-IH/=O; lpl'o(t h · . . ,t'-I)=I=O})


1'=0
=0,

again by the fundamental theorem of algebra, implying /lr(3) =0 by Th. 0.25. D


Motivated by this theorem, we call any submatrix B(t) of A(t) almost nonsin-
gular, if it is nonsingular for at least one t E T. To give a general formula for the dis-
tribution function of y(t), let us assume, that
A1 a) the probability measure Pr on T is absolutely continuous with respect
to the Lebesgue measure Il" i.e. there exists a probability density functionJ,(-r);
At b) in (5), A(t), b(t) and c(t) satisfy the conditions a) and b) of Th. 4;
At c) there exists a t* E T such that A(t*) has rank m.
1. The General Case 29

Let {Bi(t) I i = 1, ... , q} be the class of all almost nonsingular (m x m)-submatrices


of A(t). By assumption A1 c) this class is not empty.
For simplicity let us define arbitrarily B i- 1 (t)=O for all tET with detBi(t)=O,
i=1, .. . ,q.
By Th. 7, this definition is valid on a t-set of Lebesgue measure zero and hence,
by assumption A1 a), of probability zero. Furthermore we define

where CBi(t) consists of the basic components of c(t) with respect to the basis
Bi(t). Moreover let
i-I q q
mi=mi-Umj ; i -=P k, and umi=umi.
j= 1 i= 1 i= 1
It is now easy to prove

Theorem 8. Assume AI. Then p{~ m) = itl Pr(mi) = 1.

Let Yi(t):mi--+IR be defined as Yi(t)=c~i(t)Bi-l(t)b(t) and (£i(~)={tltEmi


and Yi(t):::; ~}. Then the distribution Fy(t)(~) of y(t) is determined by
q
Fy(t)(~)= L S!r(T)dT.
i= 1 [i(~)
Proof Let
'!)={tltET; -=<y(t)<=}
G:={tllET; rank (A(t»)=m}; then
Pr ('!» = 1 and Pr (G:) = 1 by Alb) and c) respectively and Th. 7.
q q
For lE'!)nG: there exists a basis Bi(t) so that lEmi ; hence '!)nG:cumi=um i.
i= 1 i= 1
Therefore
1~ Pr(U mi)~
i=1
Pr('!)nG:) =Pr('!) -('!) -G:»)
= Pr('!») - Pr('!) - G:)
~ Pt('!» - P,(T -G:)
=Pt('!» - Pt(1) + Pt(G:) = 1

By construction from minm k = 0, i -=P k, it follows that

We are interested in the distribution function Fy(t)(~) ofy(t), i.e. in the probability
of the set
'9'(~)={lltET; -=<y(t):::;~}; ~EIR.
q
Let m = U mi. Then obviously Pt ('9'(~»)~ Pt ('9'(~)nm ).
i= 1
30 Chapter II. Distribution Problems

On the other hand


Pr(<§(~)nm )=Pr(<§(~) - (<§m -m))
=Pr (<§(~)) - Pt (<§(~) -m)
~ Pt (<§(~)) - Pt(T -m)
=Pr(<§(~)), since Prcm) = 1,
and hence
q
Pr(<§(~) )=Pt(<§(~)nm)= I Pt(<§(~)nmd-
i= 1
Let m\O)= {t! detBi(t) =0 and t EmJ, i= 1, .. . ,q.
Since Bi(t) is almost nonsingular, we have

and in cause of the absolute continuity of Pr to J1,.,


Prcm\O»)=o and
Pr (<§C~)nm;)=Pt (<§C~)n(mi -mjO»)), i = 1, .. . ,q.
But, for t Em i -mID), we have Bi(t) as an optimal feasible basis, i.e.

Therefore
q
Fy(t)C~) =Pr (<§c~))= I Pr (<§(~)n(mi -m\O»))
q i= 1
= I Pr({t!tEmi-m\O) and Yi(t)::;;~})
i= 1
q
=I Pr({t!tEm i and Yi(t)::;;O), since pt(m\O») =0,
i= 1
q q
= I Pr((£:i(~))= I J!r(T)dT.D
i=1 i=1(!:i(~)

The following proposition is an immediate consequence of Th. 8:


Theorem 9. With the assumptions and notations of Th. 8 the v-th moment of y (t)
is determined, provided that the integrals exist, by
q
EyV(t)= I Jyi(T)!r(T)dT.

Proof As we have seen in Th. 8,

p{~ m)= it1 ptcm i) = 1,


yet) =Yi(t) for t Em i -mID) and Jlr(m\O») =pt(m\O») =0.
Hence,
y(t)=Yi(t) almost everywhere on m i with respect to Jlr and Pr.
This completes the proof. 0
1. The General Case 31

Sometimes one can find the conjecture in the literature, that under the
assumptions Ai, or even stronger ones like positivity assumptions on the linear
program (5), the sets m: i could be taken as "decision regions" instead of the sets ~i.
The following very simple and well-behaved example shows, that this conjecture
cannot be true in general because
Pr(m:in m:k ) = 0, i =1= k,

is not true in general, in spite of such assumptions.


Consider
Y(t)=inf(2+~t)Xl +3X2
subject to (5+~t)Xl
4
+~x2=5+!t
2 3
Xl;:::O, x 2;:::0
and let t have a probability density function on
T={tI0::=;;t::=;;4}cIR.
Then

and 3

and hence,

Therefore one should be very careful in replacing ~i by m: i in Th. 8 and 9.


However, Pr(m:inm:k)=O, i=l=k, is a sufficient condition for replacing ~i by m: i
to be allowed.
Theorem 10. Assume AI. Then Jlr(m:inm:k)=Pr(m:inm:k)=O, i=l=k, i,kE{l, ... ,q},
if either
Jlr ({ tit ET, c~i(t)Bil(t)b(t) -c~k(t)B;; l(t)b(t) =O} ) = 0
or Jlr{{tltET,Bi1(t)b(t);:::0 and B;;l(t)b(t);:::O} )=0.
Proof By definition
m: i = {t It ET, Bi l(t)b(t);::: 0, Cl(t) -c~i(t)Bi-l(t)A(t);::: O},
and hence t Em:inm:k if only if
32 Chapter II. Distribution Problems

t E Tn{t I Bi-l(t)b(t)~ 0, Bi: l(t)b(t) ~ O}n


n {t I C'(t) -c~i(t)Bi l(t)A(t) ~ 0; C'(t) -c~k(t)Bi: l(t)A(t)~ O}.

Therefore, for t E~in~k we have in particular,


c~Jt) -C~,ct)Bi- let) Bk(t) ~ 0 and Bi: l(t)b(t) ~ 0
and

which implies
c~i(t)Bi let) bet) -C~k (t)Bi: I (t)b(t) = O.
Hence,
~in~k c {t It E T, C~i(t) Bi let) bet) -c~Jt) Bi: l(t)b(t) = O}
and obviously,
~in~kc {tl t ET, Bi l(t)b(t)~O and Bi: l(t)b(t)~O},
which proves the theorem. D
From the counter example above as well as from the last proof we might
suggest that there is some connection between the fact that t E~in~k and
primal or dual degeneracy. Let Bi(t) again be some almost nonsingular (m x m)-
submatrix of A(t). We say that Bi(t) is primal degenerated with respect to the
linear program (5), if at least one component of B i- l(t)b(t) vanishes. We call
Bi(t) dual degenerated, if at least one component of cRet) - R;(t)B'- let) cBet) is
equal to zero, where Ri(t) is the matrix consisting of 'all columns of A(t) not
belonging to Bi(t) and cR(t) is the vector of those components of c(t) belonging
to R/,J). Certain stochasti~ linear programs of type (5) satisfy the assumption
A2. For every almost nonsingular (m x m)-submatrix Bi(t) of A(t) there exists a
t(i) E T such that Bi(t(i») is nonsingular and neither primal nor dual degenerated
with respect to (5).
Theorem 11. Given Al and A2, we have Pt(~in~k)=O, i#k, i.e. the sets ~i may
be taken as "decision regions".
Proof ~in~k = {t I Bi-l(t)b(t)~ 0, Bi: l(t)b(t)~ 0,
c'(t) -c~(t)Bi l(t)A (t) ~ 0,
c'(t) -c~~(t)Bi: l(t)A(t)~O, t E T}
c {t IB i- l(t)b(t) ~ 0, Bi: l(t)b(t) ~ 0,
C~i(t) -C~k (t)Bi: I (t) Bi(t) ~ 0,
c~i(t)Bi-l(t)b(t) -c~.ct)Bi: l(t)b(t) =0, t E T} = 'D

as we know from the proof ofTh. 10. Obviously


'D= [{tldetBk(t)=O}n'D]u [{tldetBk(t)#O}n'D]
and
2. Special Problems 33

since Bk(t) is almost nonsingular and, for the same reason,


14(1) = f.lr (1)n {t IdetBk(t)"# O}n {t IdetBi(t)"# O} ).
For det Bi(t)"#O define

i.e. the elements of the matrix Di(t) are algebraic functions in t.

For t E 1)n {t I detBk(t)"# O}n {t I detBi(t)"# O} = (£:


c~i(t)Bi-l(t)b(t) -c~k(t)Bk l(t)b(t)=O

if and only if at least one of any two corresponding components of the vectors

C~i(t) -c~k(t)Bk l(t)B/t) and


Bi- 1 (t)b(t)= det1 i(t) . Di(t)b(t)

vanishes. By assumption A2 any component of Di(t)b(t) is an algebraic function


in t, not vanishing in t(i) and therefore vanishing only on a t-set of Lebesgue
measure zero.
In (£:, a component of

vanishes if and only if the same component of

vanishes. Since Bi(t) and Bit) are different submatrices of A(t), there is at least
one component of d(t), say dvCt), which is an algebraic function in t and, by A2,
dvCt(k)"# 0 and therefore
f.lr ({ t IdvC t) = O} ) = O.
This yields the desired result

and hence

2. Special Problems

From Th. 8 and Th. 9 we may conclude that in general it is not at all trivial to
determine the distribution function Fy(t)(~) or the moments EyV(t), since its
computation involves numerical integration over the sets (£:i(O or functions
y7(t) which are difficult to handle. One of the major reasons for these
difficulties is the fact that, in general, y(t) is not continuous in t.
34 Chapter II. Distribution Problems

The following example due to Bereanu [4] shows that this discontinuity may
appear even if y(t) is finite for all t ETas soon as the technological matrix varies
with t.
Define, for t ElR,
yet) =inf{xl XElR, YElR, x+ ty~ 1, x~O, y~O}.
Then
1 for t:::;;O
y(t)= { 0
for t> 0,

hence, yet) exists everywhere in t, but it is not continuous at t =0.


However, there are special situations where the continuity of yet) can be estab-
lished. Consider the parametric program

y(t) = inf c'(t)x


(6) subject to Ax=b(t)
x~O
where, as in (5), c(t) = CO + c 1 . t 1 + c 2 . t 2 + ... + Cr. tr
b(t)=bO +b 1. t1 +b 2 t2 + ... +br tr
but A is constant. Then we may prove
Theorem 12. Let T c lRr be a closed interval. Assume that
{xl Ax=b(t), x~O} #0 and
{YIY~O, Ay=0,c'(t)y<0}=0 foralltET.

Then yet) is continuous on T.


Proof Without loss of generality we may assume that A has rank m. If
{B i li=l, ... ,q} is the class of all nonsingular (mxm)-submatrices of A, then
~i={tltET, Bi1b(t)~0, C'(t)-CBi(t)Bi-1A~0}, i=l, ... , q, are closed convex
polyhedral sets, which according to our assumptions cover T, i. e.
q
T=U~i'

For tE~i we have

i.e. y(t)=Yi(t) is an algebraic function in (tb" .,tr) and therefore continuous on


~i' The continuity of yet) on T now follows from the fact that Yi(t)=y/t) for
t E~i n ~i' since y( t) is uniquely determined. D

If in addition to the assumptions of Th.12 c = c(t) == co, i.e. c 1 = c2 = ... = c' =0,
we see from this proof that y(t) is piecewise linear (see Th. 0.7). If moreover
T={t I rx:::;; t :::;;/3, t ElR} we may proceed as follows:
(i) For to =rx determine y(t o) =min{ c'xl Ax =b(t o), x~ O} with the simplex method
yielding an optimal feasible basis B o such that y(to)=cBoBr;1b(to). With
k = 0 go to the next step.
2. Special Problems 35

(ii) Define -r=sup{t IB';- 1 b(t)~O}.


If-r~/3, set tK=tk+l =/3 and stop.
If -r < /3, set tk+ 1 = -r and determine by the dual simplex method a new optimal
feasible basis Bk+ 1 so that
B';-+\b(t)~O for tE{tltk+l~t~tk+l+e} forsome e>O.
Then replace k by k + 1 and repeat step (ii).
From our assumptions it is evident that this procedure terminates after a finite
number of steps: Then we have, for tk ~ t ~ tk+ 1 and k = 0, 1 ... K -1,

Given a probability distribution on T for example by a continuous density func-


tionfr(-r), we may now - according to Th. 8 and Th. 9 - determine Fy(t)(~) or
EyV(t) by numerical quadrature. Now, it is clear that we have to define the
decision regions in this case as

~k={tltk-l~t<td for k=1, ... ,K-1 and


~K= {t ItK- 1 ~ t~tK}.

Now let us consider the general problem

y(t) = inf c'(t)x


(7) subject to A(t)x=b(t)
x~O

as stated in (5), and assume that t varies over a compact interval T c JRr. For
y(t) to be continuous on T it is necessary that (7) is solvable for all t E T, or equi-
valently (see Th. 0.1 and Th. 0.4) the conditions

wE1Rn, w~O, A(t)w=O implies c'(t)w~O for all tET


(8)
uE1Rm, A'(t)u~O implies b'(t)u~O for all tE T

are necessary. However, these conditions are not sufficient for the continuity
of y(t) on T, as we know from the example given above.
To motivate a sufficient condition given by Bereanu [4], let us assume that for
(m x m)-submatrices Bi(t), ;=1, .. . ,q, of A(t) the sets

(where dB,(t) and DB,(t) correspond to the nonbasic parts of c(t) and A(t), respec-
tively) cover T, i.e.

(10)

Then obviously y(t) is continuous on T, since for an arbitrary t E T we know from


36 Chapter II. Distribution Problems

(10) that there is a !)i such that tE!)i and y(t)=cB.(t)Bi l(t)b(t) is continuous
at t because !)i is an open set in 1R'. •
Now assume that for an arbitrary tE!)i there are vectors wE1Rn, UE1Rmsuch that
w#O, w~O, A(t)w=O and u#O, A'(t)u::;;O. Rearranging w into its basic part
wB, and its non basic part wN , we have

A(t)w=Bi(t)WB, +DB,(t)WN,=O
and therefore

Obviously w#O, w~O implies WN,#O, WN,~O and therefore

c'(t)W = CB.(t )wB.+ aB.(t )WN.


= (dB,(t) ":'cB,(t)Bi l(t)DB,(t) )WN, > 0
since tE!)i' Furthermore A'(t)u::;;O is equivalent to

B;(t)u::;;O
DB,(t)u::;;O,

where u#O implies B;(t)u#O, since det Bi(t)#O.


Hence,
b'(t)u=b'(t)B;-I(t)· B;(t)u< 0 for tE!)i'

So we have shown that for (10) the following conditions are necessary, which are
a strengthening of (8):

wE1Rn,w#O,w~O,A(t)w=O implies c'(t)w>O forall tET


(11)
uE1Rm ,u#O,A'(t)u::;;O implies b'(t)u<O forall tET.
We cannot expect that (11) implies (10), but it can be proved, that (11) implies
the continuity of y(t) on a compact interval T.
Lemma 13. Let Tc 1R' be a compact interval. Given (11), the set
m= {(x, u)13 t E T: A(t)x=b(t), x~ 0, c'(t)x -b'(t)u::;;O, A'(t) u::;; c(t)}
is a bounded subset of1Rn x 1Rm.
Proof Assume that mis not bounded. Then there is a sequence {(xv, uv)} in m
such that the Euclidean norm II (x", uv ) II ~ v, v = 1,2,3, .... Since T is compact,
this sequence may be chosen so that the corresponding tv E T converge to some
t* E T. From II (xv, uv) II::;; II XV II + II UV II it follows that at least one of the se-
quences {xv} and {u v} is unbounded.
If the sequence {xv} is unbounded, we may take a subsequence {xv.} such that
II xV, II ~ i for all Vi' i= 1,2, ... , and such that
~i=~ converges to ~*.
IIxVi11
2. Special Problems 37

Obviously ~* 2: 0 and II ~* II = 1.

Since A(t),b(t),c(t) are, due to (5), continuous in t, we have

Then from (11) follows that


lim c'(tv) ~i = c'(t*) ~* > 0
i-oo 1

and therefore
(12)

If the sequence {u v } is unbounded, we choose a subsequence {u,,) such that


u
Ilu"jll2:j for all Kj,j=1,2, ... , and at the same time I'/j=llu:'11 converges to 1'/*.
Obviously 111'/* II = 1 and J

I'1m A ' (
A '( t *) 1'/* = J-+ t") I'/j ~ I'1m -II
c(t,,) = 0
U II
_J
00 J r"-'t
00 Xj

which yields, according to (11),


lim b'(t,,)l'/j=b'(t*)I'/* < 0
j-oo J

and therefore
(13) b'(t")u,,
J J
= II u"J II b'(t,,)l'/j'-
J
- =.
But (12) as well as (13) is a contradiction to
(14) c'(tv)xv-b'(tv)uv~O, v=1,2,3, ... ,

which has to be satisfied for (xv, u v) Em. Hence, our assumption, that mis not
bounded, is contradictory. 0
Theorem 14. If Tc IRr is a compact interval and if (11) is valid on T, then yet) is
continuous on T.
Proof For any t E T the vectors x* ElRn and u* ElR m are solutions of (7) and its
dual program if and only if x* 2: 0 and, for all u E IRm and all x E IRn such that x 2: 0,

(15) c'(t)x* + u' (b(t) - A(t)x* ) ~ c'(t)x* + u*' (b(t) - A(t)x* )~


~ c'(t)x + u*' (b(t) - A(t)x)

(see Th. 0.13 (Kuhn-Tucker)), where bet) -A(t)x* =0.


Hence, if i E T and lET and X, uand x, u are primal and dual solutions correspond-
ing to i and I respectively, we conclude from (15) that

(16) I(y(i) -y(7))1 = I c'(i)x -c'(l)xl ~max {I Llc'x+ u'(Llb -LlAx);


ILlc'x+ u'(Llb -LlAx) I}
38 Chapter II. Distribution Problems

where
Lle = e(l) - e(l)
Llb = b(i) - b(l)
LlA = A(i) - A(l).

Since according to the duality theorem (see Th. O.9)(x, u) and (i, u) are elements
of the bounded set m defined in Lemma 13, the continuity of y(t) on t follows
from (16).0
This result suggests the application of numerical quadrature, if we are
interested in determining Fy(t)(~) or Ef(t) and t is a random vector with range in
the compact interval T and with a continuous density function fr(')' However,
this leads in general to a tremendeous amount of work. The question, whether
numerical quadrature or Monte-Carlo simulation is to be preferred, has not yet
been answered.
Chapter III. Two Stage Problems

1. The General Case

In Ch. I we introduced the general two-stage problem of SLP as

(1) MinF(Ep xp. I!(e(w,x)); i= 1, .. . ,r), where


Px E ~ x w

L (e(w,x) )=c'(w)x+min {q'yl Wy=b(w) -A(w)x, y~ O}.

As stated there, q and W may also be stochastic in the sense that they are
random vectors on the probability space (Q,ty:,Pw) too. First of all we shall check
whether we really need mixed strategies or not. In many practical cases, the
problem will be stated somewhat different from (1), namely

(2) MinF(Ep xp. i! (e(w,x)); i = 1, .. . ,r), where


PxE~ x w

£-i(ew,x
( ))={L(e(W,X)),
. if i=1
1£(e(w,x))I, if i> 1.

The question, whether we can restrict ourselves to pure strategies without loss,
was first answered by Wessels [18] for the case when

F(EPxxp",i! (e(w, x)); i= 1, .. . ,r)


= J1Epxx p",L (e(w, x))+ AO'pxx p",L (e(w,x)),
where A~ 0 and 0' means the standard deviation

This has been extended by Marti [11] to the case when

F(EPxxp",i! (e(w,x)); i= 1, .. . ,r)


r .
L AiVEpxxp",D(e(w,x)),
~_~,--,--_---,-

= Ai~O.
i= 1

The basis for these statements is the following


Theorem 1. Suppose that

XxQ
J D(e(w,x))d(PxxPw ) exists.
Then 1 1
inf {JD(e(w,x) )dProF S {
XEX Q
J Li(e(w,x) )d(Px x Pro)}'
XxQ
40 Chapter III. Two Stage Problems

Proof First let i = 1; then


inf
XEX Q
JL (e(w,x) )dPa,::; JL (e(w,x) )dPa,
Q
Vx EX and

hence, for any Px with PiX) = 1,

x
J
Q x
J J
inf L (e(w,x) )dPa, = {inf L (e(w,x) )dPa,} dPx ::;
X Q

::; J JL(e(w,x) )dPa, dPx


XQ

J L(e(w,x) )d(Pa, x Px),


XXQ
by Fubini's theorem.
Now let i> 1 and define 1

G(X)={J i! (e(w,x) )dPro} , .

Then by Holder's inequality and Fubini's theorem

J
infX G(x) = infxG(x)dPx ::; G(x)dPx J

r;
x x

: ; {1 y. {1
1 1

r
1 1
Gi(x)dPx dPx -+-=1
i j
1

= {1 Ai! (e(w,x) ) dPro dPx

={ J
1

i!(e(w,x»)d(Pro x Px)}T,
XxQ

which completes the proof. 0


r 1
Corollary 2. IjF (Epxx P", D(e(w,x»); i= 1, .. . ,r)= L ,qEPxxp",i! (e(w,x) )F, where
Ai~O for i=1, .. . ,r, and all integrals involved arJ~~pposed to exist, then,jor any
Px with Px(X) = 1,
r 1 r 1
inf L Ai{Ep",i! (e(w,x»)}T ::; L Ai {Ep", x Pxi! (e(w,x) )}T.

r
XEX i= 1 i= 1

Proof Obviously 1

!~1 itl Ai {Ep", i! (e(w,x) )}'t ::; itl Ai l{J D(e(w,x) )dP ro dPx'

Applying Fubini's theorem and Holder's inequality in the same way as in the
proof of Th. 1 yields the desired result. 0
For problem (1) a similar result may easily be established:
Theorem 3. Provided the integrals involved exist
r r

i~f.L Ai {Ep",Li (e(w,x»)}::;.L Ai {E p", xpxLi(e(w,x»)}


,= 1 ,= 1
jor any Px with PiX) = 1.
1. The General Case 41

S Sf LA.i£(e(w,x») dPwdPx
XQ i~ I
r

= I
i~ I
A.i J £ (e(w,x) )d(Pw
Xx Q
X PJ. D

The three statements Th. 1, Cor. 2 and Th. 3 cover more than we shall
treat in this book. However, the type of function - F in (1) as well as in (2) -
with which we may restrict ourselves without loss to pure strategies, does not,
in general, seem to be answered.
Furthermore, in the following we shall reduce problem (1), with the help of
our previous results, to

(3) minEp {c'(w)x + miny{ q'(w )yl W(w)y = b(w) - A (w)x, y~ O}}.
XEX W

First we have to discuss what we call the feasible set of decisions. Here we are
especially concerned with the so-called recourse program

(4) Q(x w) = {inf {q'(~)YI W(w)! = b(w) -:- A (w)x, y~ O}, iffeasible y exist;
, + =, If no feasIble y eXIst.

Since we are dealing with programming with recourse, it would be meaningless


to allow "here and now" decisions x for which in the second stage problem (4)
recourse is impossible, i.e. Q(x,w) = + =, with a positive probability. Hence we
define
(5) K ={xlxEIRn;Q(x,w)< += with probability 1}.
Obviously x E K does not yet establish the existence of the integral involved in
(3); it is only a necessary condition. Furthermore we still require XEX, where
Xc IR n is some predetermined set, usually given by simultanuous linear con-
straints. Therefore we shall assume throughout that X is a convex polyhedral set.
Concerning K, Walkup and Wets [15] found the following result:
Theorem 4. K is a closed convex set.
Proof First we show the convexity of K.
LetxIEK and xzEK.
Define
Q I = {wi Q(Xb W)< +=}
Qz={wIQ(xz,w)< +=}.
Then
Pw(QI) = Pw(Qz) = 1, which implies
PW(QlnQZ)=PW(QI-(QI-QZ»)
=Pw(QI) -Pw(QI -Qz)
~Pw(QI)-PW(Q-Qz)
=Pw(QI) -[Pw(Q) -Pw(Qz)] = 1
42 Chapter III. Two Stage Problems

and hence

For any weD l nD 2 there exist Yl and Y2 such that

W(W)Yl =b(w) -A (W)Xh Yl ~O


W(w)Y2=b(w)-A(w)X2, Y2~O.

Therefore, for any Ae(O,1)


W(w) [AYl +(1-A)J21=b(w)-A(w) [Axl +(1-A)x21
and AYI +(1-A)Y2~O.
Hence,
D.. = {wi Q(Axl +(1-A)X2;W)< +oo}::J Dl n D 2,
which means that

This establishes the convexity of K.


Now we have to show that K is closed.
Let {Xj; i= 1,2, ... } be a sequence in K converging to x. Then for

Dj={wIQ(Xj,w)< +oo}
we have

Let

Obviously we have

and hence
00 00

lim A k=(\ Ak=(\Dj and


k .... oo k=l j=l

by induction from the first part of the proof

and hence

Suppose now xrt K.


1. The General Case 43

Then there is an

such that, for the Euclidean norm,


II W(w)y-b(w)-A(w)xll :e:: e> 0 \fy:e::O.
Then

Ux={xi Ii W(w)y-b(w) -A(w)xll :e:: ~ \fy:e::O}

is a neighbourhood of x; since {Xi} converges to x, there must be an XkEUx.


For this Xk 00

Q(Xk,W)< +CXl, since wEnQi c Q b


i= I

which yields the existence of a yk:e:: 0 such that


W(w)Yk=b(w) -A (W)Xk
in contradiction to Xk E U x.
From this we must conclude that xEK, which means that K is closed. D
Now we shall investigate the properties of the objectives involved in (3). First
we state
Theorem 5. Q(x,w) is, almost surely on Q, a convex function on K.

Proof Let XIEK and X2EK and Qi={wIQ(Xi,W)< +CXl}, i=1,2.


As we know from the proof of Th. 4
P",(QI n (2) = 1.
Hence it suffices to show that

Here

(6) Q(xi,w)=infq'(w)y
subject to W(w)y=b(w)-A(w)Xi; i=l,2,
y:e::O
and
(7) Q (Ax I +(1-A)X2 ; w) =infq'(w)y
subject to W(w)y = b(w) -A(w) (AXI +(1-A)X 2 )
y:e::O.

Let Yi, i= 1, 2, be feasible in (6). Then obviously AYI +(l-A)Y2 is feasible in (7)
for any AE(O,1).
If Q(Xi,W), i=1,2, are both finite, then we can choose Yi to be the solutions of
(6) respectively. Hence,
44 Chapter III. Two Stage Problems

If Q(x b W) = - 00, we know that there exists a y* 20 such that

W(w)y* =0 and q'(w)y* < 0

which implies Q(Axl +(1-A)X2;W)= -00, since

A11 + (1- A)Y2 + J.ly* is feasible in (7) for any J.l> 0,

and q'(W){AYl+(1-A)Y2+J.lY*} tendsto -00 as J.l---++oo.


Therefore, we may in this case also state

After this theorem we may discuss


Q(x) = JQ(x,w)dP",.
Q

We may rewrite this integral as


(8) J
Q(x) = Q+(x,w)dP", -
Q Q
JQ-(x,w)dP",=Q+(x) -Q-(x)
where
Q+(x W)={Q(X,W) whene~er Q(x,w» 0
, 0 otherwIse
and
Q-(x W)={-Q(X,W) whene~er Q(x,w)< 0
, 0 otherWIse.
In spite of the fact, that the restriction x E K implies

it is still possible that Q + (x) = + 00. This means that either Q(x) would be
undefined, ifQ- (x) = + 00, or Q(x) = + 00. Since both these situations are meaning-
less in practice, because one does not want decisions with either an undefined
outcome or infinitely high costs, it seems natural to restrict x to

Obviously K c K. Since Q- (x) = + 00 is not meaningful in the practical interpreta-


tion of the recourse program, which was introduced to compensate for some
inconvenient situation caused by the original decision on x and the realization of
w usually implying additional (positive) costs, we need not consider it in the follow-
ing
Theorem 6. Q(x) is a convex function on K, and hence K is convex.
2. The Fixed Recourse Case 45

Proof Since xEK, Q(X) < + <Xl. Let Xl EK and X2 EK, implying XiEK, i= 1,2. Due
toTh.5 for any AE(O, 1) Q(Ax l +(1-A)X2;W):::;;AQ(X 1 ,W)+(1-A)Q(X2,W) almost
surely.
Hence
Q(Axl +(1-A)X2)= JQ(Axl +(1-A)x2,w)dPu,:::;;
Q

: :; J{AQ(X 1 w)+(1-A)Q(x2,w)}dPu,
Q

=AQ(X 1 )+(1-A)Q(X2)'

This inequality also establishes the convexity of K. 0


Corollary 7. Either Q(x» - <Xl on K or Q(x) = -<Xl on the relative interior of K.
Proof Follows immediately from the convexity of Q(x) on K. 0
However, as was pointed out by Walkup and ~ets [15], K need not be closed
in general, and Q(x) may be discontinuous on K, i.e. at the relative boundary,
even if Q(x) is finite on k..

2. The Fixed Recourse Case

We are now considering two-stage problems where

W(W)=W

i.e. W is a fixed nonstochastic matrix. From the previous section we know that in
general K c K. Under suitable integrability assumptions on the original random
variables we may strengthen this statement in the fixed recourse case. The fol-
lowing four statements are due to Walkup and Wets [15].
Theor~m 8. Let the random variables in q(w), b(w), A(w) be square integrable.
Then K =K.
Proof: We have to show that K c K. Let xEK be arbitrarily chosen. For
Q(x) = {wIQ(x,w)< +<Xl} we know by the definition of K, that Pw (Q(x»)=1. If
we define Q 1 = {wi WEQ(X), I Q(X,w) I < <Xl}, then obviously Q+(x):::;; S I Q(x,w)ldPw '
Q,
Since Q(x,w) is finite on Q b we may represent Q(x,w) in terms of basic solutions
for any wEQ b i.e. Q(x,w)=q'(w)B- 1 [b(w) -A(w)x], where B is an optimal
feasible basis out of W with respect to wand q( w) is the vector of the components
of q(w) corresponding to B. By the assumed square integrability of the elements
of q(w), b(w) and A(w) and Schwarz's inequality it is now obvious that IQ(x,w)1
is integrable on Ql'
Hence

Also Cor. 7 may be strengthened to


46 Chapter III. Two Stage Problems

Corollary 9. Under the assumption of Th. 8 either


Q(x» -00 on K or
Q(x) = -00 on K.

Proof We have to show, that Q(x) = - 0 0 for some XEK implies Q(x)= -00
for every other point xEK.
Let x be a point of K so that Q(x)= -00.
From our integrability assumption it follows that P'" ({wi Q(x,w) = -oo} }=IX> O.
Let .Qoo={wI3y~0: Wy=O,q'(w)y<O} (seeTh.O.4).
Then
P",(.Qoo) = IX.
If x is any other point of K, then for
.Q(x) = {wi Q(x,W) < +oo}
we know that
P'" (.0 (x) ) = 1.
Hence,
P'" (.0 00 n .Q(x»)= P",(.Qoo) - P",(.Qoo n (.0 -.Q(x»))
=IX
since P",(.Q - .Q(x) ) = O.
For WE.Q oo n.Q(x) there exists a feasible solution to the recourse program -
since WE.Q(X) - and a direction y~O such that Wy=O and q'(w)y<O - since
WE.Q oo - which implies

Q(x,W) = -00.

Hence p",({wl Q(x,W) = - oo} }~IX> 0 and therefore

Q(x)= -00.0

Theorem 10. Let,the random variables in q(w), b(w), A(w) be square integrable
and Q(x» - 00 on K. Then Q(x) is Lipschitz continuous.
Proof Let XiEK, i=1,2 and .Qi={wIQ(Xi'W)< +oo}, hence P",(.Q;) = 1, i=1,2.
From Q(x» - 00 on K it follows, for
.0 00 ={wI3y~0: Wy=O, q'(w)y< O},
that Pa,(.Q oo ) = O.
Therefore Pa, (.0 1n.Q2) -.0 00 )= 1.
Forany wE(.Q 1 n.Q 2)-.Q oo wehave -oo<Q(Xi,W)<+oo
and, obviously, - 0 0 < Q(Ax1 +(1-A)X 2,W)< +00.
Representing the optimal value via basic solutions shows that Q(x,w) is piece-
wise linear on
{XIX=Ax1 +(1-A)X 2, AE [0,1]) and, due to Th. 5, convex.
2. The Fixed Recourse Case 47

Hence
Q(Xi'W) =IXv/W) +d~i(W)Xi'
(see Th. 0.6), where IXv/W) is a weighted sum of products of elements of q(w) and
b(w) and dv,(w) consists of components, which are weighted sums of elements of
q(w) and A(w). Hence, according to our square integrability assumption and
Schwarz's inequality, IXv/W) and dv/w) are integrable. If O::;;Q(X2>w) -Q(xl,w),
then from the convexity of Q(x,w) in x we have (see Th. 0.11)
Q(X2,W) + (x I -X2)' d V2 (w)::;; Q(x I,W),
implying

Hence,
I Q(X2,W) -Q(xl,w) 1= ct v2 (w) -ct v1 (w) +d~2(W)X2 -d~,(W)XI
::;;d~,(W)(X2 -Xl)'

If

we get in an analogous way


IQ(Xz,w) -Q(xl,w)1 ::;;d~,(W)(XI -X2)'
Hence we have in general
IQ(Xz,w) -Q(xI,w)l::;; Maxi
i
d~(W)(X2 -xdl.

Since for any two vectorsj,gElRn

Il'gl ::;;Vnlfloo ·llg II,


where 1'100 is the maximum norm and 11·11 indicates the Euclidean norm, we have

Since obviously Maxldvi(w)loo is integrable on(Q l nQ 2 )-Q oo ' and the same is
true for !

Max Idiw) I00' where v varies over the number of all possible bases, we get
v

IQ(X2) -Q(x1)1::;; S IQ(x 2,w) -Q(xj,w)1 dPw


(Q, rdJ 2)-Qoo

::;;Vn {SMaxl diw) IdPw} ·11 X2 -xIII·


Q v

Since SMaxl diw)1 dPw is independent of X I ,X2, this is the desired Lipschitz
Q v

condition. D
Corollary 11. If either
a) q(w)=q and the random elements ofb(w) and A(w) are integrable or
48 Chapter III. Two Stage Problems

b) b(w)=b, A(w)=A and the components ofq(w) are integrable or


c) the ranges of A(w), b(w), q(w) are bounded,
=
then K =K and either Q(x) - 0 0 on K or Q(x» - 0 0 on K, and in the latter case,
Q(x) is Lipschitz continuous on K.
Proof Under assumption a) or b), the proofs of Th. 8, Cor. 9 and Th. 10 may be
reproduced, since integrability of Q(x,w) on K, as required there, follows
immediately. Under assumption c), the foregoing propositions may also be
proved in the same way, observing that for any XEK Q+(x,w) is either equal to
zero or of the type ij'(w)B- l [b(w)-A(w)x], and hence bounded above on Q,
which implies Q+(x)< 00 on K. Since, in Th. 10, any dv(w) is of the form
ij'(w)B- l A (w), Maxi dv(w) I00 is bounded on Q; hence we may use in this case
v

as Lipschitz constant. D
Theorem 12. Suppose one of the integrability assumptions of Th. 10 or Corr. 11 and
Q(x» - 0 0 on K. Suppose further, that the probability measure Pon the Euclidean
space JR.I spanned by the elements of A, b, q is absolutely continuous with respect to
the Lebesgue measure J11 on JR.I - i.e. P has a density function -, then Q(x) has a
continuous gradient on K.
Remark. We still assume (A,b,q) = (A(w),b(w),q(w») to be random in spite of the
fact that we have omitted the wfor simplicity. The following proof consists of
two parts. First we observe that Q(x,w), xEK, has a gradient a.e. and that its
partial difference quotients are bounded by an integrable function. From
Lebesgue's theorem then follows the differentiability of Q(x) = JQ(x,w)dPa,. In
the second part we demonstrate continuity by using an explicit presentation for
the gradient VQ(x).
Proof According to our assumptions Q(x,w) is finite with probability 1 for any
x E K, and may, therefore, be represented via basic solutions as

Q(x,w)=ijiBi l [b-Ax],

where Bi is an optimal feasible basis of Wand iji is the vector ofthose components
of q belonging to B i•
Hence Q(x, w) has a gradient - with respect to x - of the form

for all WE Q such that A, b, q do not belong to one of a finite number of sets of
the type
Sij= {A,b,ql ijiBi l A #ijjBjl A; ijiBi- l [b -Ax] =ijjBjl [b -Ax]}; i#j.

F or every A, q define
2. The Fixed Recourse Case 49

Now either
iii B i- 1A = iijBj-1A, implying EA~q = 0 and
hence the Lebesgue measure Il(EA~q)=O
or

Then
bEEi,j
A,q
if

this means that El'.~ is a hyperplane in the space spanned by the elements of b
and hence the Lebesgue measure Il(EA:q) =0.
From Il(EA~q)=O for every A,q it follows, by Th. 0.25, that IlI(Si) =0 and, since
P is absolutely continuous with respect to Ill,
P(Si) =0.
Therefore Q(x,w) has for every XEK a gradient of the form
7 x Q(x,w) = -(iiiBi-1AY
for all w except a set of Pro-measure zero.
From the proofs of Th. 10 and Cor. 11 we know that, for

I Q(Xb W ) -Q(x2,w)1 <hew)


Ilx 1 - X 211
almost surely, where hew) is integrable. In particular, this inequality is valid for
all partial difference quotients. Hence (as a consequence of Th. 0.22) Q(x) has a
gradient 7Q(x), and
7Q(x) = J7x Q(x,w)dPro.
The continuity of 7Q(x) follows from the following observation:

7Q(x)=I J -(iiiBi1A)dP,
i !!li(X)
where
i- 1
~i(X) = ~i(X) - U ~k(X)
k~ 1

and ~i(X) is the "optimality set" of the basis B i, i.e.

Obviously we have to show that the symmetric difference of ~i(x+L1x) and ~i(X)
tends to a set of P-measure zero as Ax----+O. Looking at this symmetric difference
50 Chapter III. Two Stage Problems

i-I i-I

= [(~i(x+Ax)- U~k(X+Llx))- (~i(X)-U ~k(X))]


k= 1 k= 1
i- 1 i- 1
U [(~i(X) - U ~k(X) ) - (~i(X + Llx) - U ~k(X + Llx) )]
k= 1 k= 1
i- 1

k=l
i-1 i- 1
U [(~;(X + Llx) - U ~k(X + Llx) )n U ~k(X)]
k= 1 k= 1
i- 1

k= 1
i-1 i-1
U [(~i(X) - U ~k(X))n U~k(x+Llx)]
k= 1 k= 1
C [~i(x+Ax)-~i(X)]

u[~ (~k(X)-~k(X+LlX))J
u [~i(X)-~i(X+Llx)]

U[Q (~k(x+LlX) -~k(X))]


shows that it suffices to prove for every index i, that ~;(x+Llx) tends to a set
differing from ~i(X) by a set of measure zero, as Llx tends to 0, i.e.

From
(A,b, q) E~i(X+ Llx) -~i(X) it follows that
Bi- 1(b -Ax)~Bi-1 ALlx
and that

Hence

for at least one component and, at the same time,

It is now obvious, that this element (A,b,q) is not in


3. Complete Fixed Recourse 51

which shows that


lim (2Ii(X + ~x) - 2Ii(X) ) = 0.
.dx--+O

If A, b, q E 2Ii(X) - 2I i(x + ~x), then


Bi- 1(b -Ax)~O
and

implying for at least one component

[Bi-1A~lv> [Bi-l(b-Ax)lv~O; hence (A,b,q)E lim (2Ii(X)-2Ii(X+~X))


.1x ....... O

only if
[B i- 1(b-Ax)lv=0, i.e. only if

(A, b) is an element of one of finitely many hyperplanes in the (A, b)-space. Now it
is obvious, that

Since P is absolutely continuous with respect to fJ.1, we have

p[ lim {(2II(x+~x) -2Ii(X))U (2I i(x) -2Ii(X+~X))}] =0


Llx~O

which completes the proof. 0

3. Complete Fixed Recourse

Again throughout this section we suppose W( w) == W In section 111.1 we were


concerned with the feasibility sets K and K. This means that we have also taken
into consideration such cases in which a first decision x E X may not yield a
feasible solution of the recourse problem with some positive probability.
As we have seen in Ch. I, the practical meaning of the recourse problem was the
possibility to compensate for deviations from the original constraints, caused by
the a priori decision on XEX and the a posteriori realization of the random
elements in A(w), b(w). Hence, from a practical point of view it often seems
meaningful to require that the recourse program has feasible solutions for every
XEX and almost surely with respect to Po,. From this requirement arises the
following
Definition. W is a complete recourse matrix, if
{YI Wy=z,y~0}#0 for every zElRm.

Following immediately from this definition every complete recourse matrix W


has rank r(W) = m. Furthermore it is obvious, that W must have more than In
m
columns W;, since in the contrary case for i = - L W;
i= 1

{YI Wy=i,y~O} would be empty


52 Chapter III. Two Stage Problems

caused by the fact that from r(W) = m the uniqueness of the solution of
Wy=i,y=( -1, ... , -1)' follows.
In that which follows we try to characterize complete recourse matrices.
Lemma 13. If W is a complete recourse matrix with m + 1 columns, then every m
columns of Ware linearly independent.
Proof Since r(W) = m, let WI>' .. , Wm be linearly independent.
Suppose now that WI" .. Wm- I> Wm+ 1 are linearly dependent; then there exist ai
such that
m-l
Wm+1 = L aiW;,
i= 1

Since - Wm E IRm and W is a complete recourse matrix, there exist Pi 2 0 such that
m+l
- Wm= L PiW;
i= 1
m m-l
= LPiW;+Pm+l LaiW;
i= 1 i= 1
m-l
= L (Pi+Pm+l a;) W;+PmWm
i= 1
and hence
m-l
L (Pi+Pm+ 1 a;) W;+(1 + Pm) Wm=O,
i= 1
contradicting the linear independence of WI>' .. , Wm , since Pm 2 0 and therefore
at least 1 + Pm> O. 0
If we assume the linear independence of WI>' .. , Wm , which is justified by
r(W)=m, we may state
Theorem 14. Let W have m+n columns (n21). W is a complete recourse matrix
if and only if
:D= {y\ Wy=0,Y20;Yi> O,i= 1, ... ,m}#0.
Proof The necessity of the condition may be shown as follows:
Let m

Z= L PiW;, where Pi< 0, i= 1, ... ,m.


i= 1

Since W is a complete recourse matrix, there exist Ji such that

m+1i
Z= L JiW;, where J i 20, i= 1, ... ,m+n.
i= 1

Therefore
m m+n
L PiW;= L JiW;,
i= 1 i= 1
3. Complete Fixed Recourse 53

implying m m+n
~::<c5i-Pi)W;+ L c5iW;=O,
i= I i=m+ I
where
c5 i -Pi>0, i=1, ... ,m; but c5 i 2:: 0, i=m+1, ... ,m+n;
consequently 1) # 0.
Suppose now that 1) # 0. Then there exist numbers

c5 i 2:: 0, i=m+ 1, ... ,m+n


CXi<O, i=1, ... ,m
m+n
such that for Wm + ii + 1 = L c5 i W;
i=m+ 1
m+n m
Wm + n+ 1 = L c5iW;= LCXiW;.
i=m+ 1 i= 1
For any ZEJRm there is a unique solution of
m

Z= LPiW;,
i= I

since WI' ... ' Wm are linear independent. If Pi2::0,i=1, ... ,m, we have no further
problem. Suppose, therefore, that, for at least one index i, Pi< 0. Without loss of
generality we may assume that
~= max Pi
CXm ISismCXi

which has to be strictly positive, since CXi< 0, i= 1•...• m. and Pi< for at least one
index i.
°
From the linear independence of WI •...• Wm and CXi < 0, i = 1•...• m, follows the linear
independence of WI.· ..• Wm - 1. Wm + ii + 1·
Hence. there is also a unique solution of
m-I
Z= L YiW;+Ym+ii+ 1 Wm+ii + 1·
i= I
Using
m

Wm + ii + 1 = L CXiW;.
i= 1
this implies
m-l
Z= L (Yi+Ym+ii+ 1· CX;)W;+Ym+ii+ICXmWm
i= I
m

= LPiW;
i= 1
54 Chapter III. Two Stage Problems

and hence, because of the linear independence of WI"'" Wm ,

Yi+Ym+ii+llXi =/3i i=1,oo.,m-1


Ym+n+ IlXm =/3m·
From this system of equations we obtain

Ym+ii+I=~m>o
m

and

since
0< ~= max A~.i!.L for every j = 1, ... , m, and
IXm l:5i:5m lXi IXj
IXj< 0, j= 1, .. . ,m.
Hence
rn-l m+n

Z= L YiW;+Ym+ii+ I L
i=l i=m+l
biW;

where Yi~O; i=1,oo.,m-1; Ym+ii+I>O and bi~O, i=m+1,oo.,m+n. Since ZEIRm


was arbitrary, this yields the completeness of W. 0
From Cor. 9 we know that the expected value Q(x) of the second stage pro-
gram's optimal value is either - 0 0 or finite for every x EK, which equals IRn in
the complete recourse case. In practical applications it seems to be meaningful
to assume that Q(x) is finite on IRn. A simple condition for this property yields
Theorem 15. Given complete recourse and one of the integrability conditions of
Th. 10 or Cor. 11 (for example square integrability of the elements of q(w), b(w),
A(wn then Q(x) is finite if and only if
{zi W'z~q(w)}#0 with probability 1.
Proof For an arbitrarily chosen XElRn the second stage program

Q(x,w) =infq'(w)y
Wy=b(w)-A(w)x
y~O

has feasible solutions for every wEQ by the completeness of W.


Following the lines of the proofs of Th. 8 and Cor. 9, Q(x) is finite if and only if
Q(x,w) is finite with probability 1, hence, by the duality theorem, if and only if

{zi W'z~q(w)}#0 with probability 1 0


Corollary 16. Given complete recourse, q(w)=q (constant) and A(w),b(w) inte-
grable, then Q(x) is finite if and only if

{zl W'z~q}#0.
3. Complete Fixed Recourse 55

Proof Follows immediately from Th. 15.


As we know from Th. 14 for a complete recourse matrix W there exist constants
IXj<O, i=1, ... ,m, and Dj~O, i=m+1, ... ,m+n
such that
m m+n
~>jWj= I
DjWj, where WI,···, Wm
j=1 j=m+1
are supposed to be linearly independent since r(W) = m.
With these constants IXj, Dj we may state
Corollary 17. Given complete recourse and one of the integrability conditions
assumed in Th. 15, for Q(x) to be finite it is necessary that
m m+n
L>jq/W)S L Djqj(W) with probability 1.
j=1 j=m+1
lfn = 1 this condition is also sufficient.
Proof From Th. 15 we know that Q(x) is finite only if {zl W'zsq(w)}#0 with
probability 1, and hence, by Farkas' lemma, onlyifVu~O, Wu=Oimpliesq'(w)u~O
with probability 1. In particular, for u* =( -IX I , ... , -IXm,Dm+ 1, ... ,Dm+n)'~O,
Wu*=O.
Hence Q(x) is finite only if
m m+n
L IXjqj(W)S I Djqj(w) with probability 1.
i= 1 i=m+ 1

Suppose now that n= 1 and hence


m
Dm+ I Wm+I = L IXjWj
j= I
where IXj< 0, i= 1, ... ,m and Dm+ I ~O, implying Dm+ I > 0 by the linear independence
of WI"'" Wm, and suppose further that

p{ {wi j~1 IXjqj(w)SDm+Iqm+ I(W)} J= 1.

For almost every WEQ there exists a unique z(w) such that
Wj'z(w)=qj(w), i=1, ... ,m, which implies
m m
Dm+IW~+IZ(W)= LIXjWj'Z(W)= LIXjqj(W)SDm+lqm+l(w),
j= I j= I
Hence, for almost every WEQ,Z(W) is a feasible solution of W'zsq(w), since
Dm+ I > O. Now the desired result follows from Th. 15. D
However, the condition given in Cor. 17 is not sufficient in general for the
finiteness of Q(x) if W has more than m+ 1 columns, as is shown by the following
example:
-1
2
-1)
-2 .
56 Chapter III. Two Stage Problems

W is a complete recourse matrix, since W1 and W2 are linearly independent and

W3 + W4 = -W1 -W2 ,
and hence 0(1=0(2=-1
153=154 =1.
Let q(w)=q, given by q1 =q2=q3=1, q4= -2, which satisfies
0(1q1 +0(2q2= -2::;J 3q3+ J4q4=-1.
Here W'z::;q
is equivalent to
Z1-Z2 ::;1
Z1+Z2 ::;1
-Z1 +2z 2 ::;1
-Z1 -2Z2::; -2.
Summing up the last two inequalities yields

if we add twice the second inequality and the fourth inequality, we get
Z1 ::;0.
Hence {zl W'z::;q} =0, which implies, by Cor. 16, that Q(x) = -exl.

4. Simple Recourse

Simple recourse is a special case of complete fixed recourse in the following


sense:
Defmition. W=(I, -1), where 1 is the (m x m) identity matrix, is called the
simple recourse matrix.
This definition says that in the simple recourse model the violations ofthe original
constraints, which may occur after having chosen a decision XEX and obseryed
the realization of A (w), b(w), are simply weighed by qj(w). For the simple
recourse model it is convenient to write the second stage program as follows:

Q(x,w)=inf[q+'(w)y+ +q-'(w)y-]
subject to y+ -y- =b(w)-A(w)x
y+~O
y- ~O; y+, y- EIRm.

Corollary 18. Given simple recourse and one of the integrability conditions of
Th. 15, then Q(x) isfinite if and only if q+(w)+q-(w)~O with probability 1.
4. Simple Recourse 57

Proof. By Th. 15 Q(x) is finite if and only if {zl W'zS;q(w)} #0 with probability 1,
i.e. if and only if {zl-q-(w)S;zS;q+(w)}#0 with probability 1. This yields the
proposition of the Cor. D .
The simple recourse model has been studied for various applications all of
which have in common that they can be understood as production or allocation
problems where only the demand is stochastic. In this case it turns out that we
get Q(x), or some equivalent, in a rather explicit form which allows more insight
into the structure of the problem than convexity and differentiability do.
Hence we assume that
q+(w)=q+, q-(w)=q- and A(w)=A;
i.e. only b(w) is random. According to Cor. 18 we assume that

q=q+ +q- ~O.

The following results are due to R. Wets [16]:


Theorem 19. Q(x,w) may be represented as a separable function in
m
x=Ax, i.e. Q(x,w)= L Q;(Xi,W).
i= 1

Proof. Q(x, W) =min{q+'y+ +q-'y-Iy+ -y- =b(w) -Ax,y+ ~O,y- ~O}.


By the duality theorem Q(x,w)=max{ (b(w) -Ax )'ul -q- s;us;q+}.
For this program we can immediately find an optimal solution u* (w) with the com-
ponents
if (b(w) - Ax )i> 0
if (b(w)-Ax)iS;O
if Xi<b;(w)
if Xi~bi(W).
If we define

we get the theorem. D


Theorem 20. Provided b(w) is integrable, Q(x) may be represented as a convex
separable function in X= Ax, i.e.
m
Q(x) = L Qi(Xi), where Qi(Xi) is convex, i = 1, ... , m.
i= 1

Proof. According to Th. 19 the separability in X=Ax follows from


m
Q(x) = JQ(x,w)dP",= L JQi(Xi,w)dP",
i= 1
i.e.
58 Chapter III. Two Stage Problems

Using the definitions q=q+ +q- and Oi= Sbi(w)dP"" Th. 19 yields
Q;(xi)=qt J (b;(w)-X;}dP",-qi- S (bi(w)-X;)dP",
b i (",) > Xi b i (",)!> Xi

=qiOi-qiXi-qi S (bi(w)-X;)dP",.
b i(",) !>Xi

To prove the convexity of Qi(Xi) it suffices to show that


qi J
b i (",)!> Xi
(Xi -bi(w) )dP",
is convex in Xi.
Since qi ~ 0, we have only to investigate the integral.
Suppose ti).at xl < xr, 0< A< 1 and xf =Axl +(1-A)Xf.
Then
J (Xf-bi(w))dP",=A S (xl- bi(w))dPc,,+(1-A) S (xr-bi(W))dP",
b i(",)!> x7 bi("')!> x7 b i(",)!> x7
=A S (xl- bi(w))dP",+(1-A) S (xr-bi(W))dP",
b i (",)!> x/ b i(",)!> xf
+}, S (xf-b i(w))dP",-(1-A) S (xr-bi(w))dP",
xt < bj(ro) S xi xt < bi(ro):$; xf
~A S (xl- bi(w))dP",+(1-A) S (xr-bi(W))dP""
b i (",)!> xl b i (",)!> xf
since obviously
S (xl -bi(w) )dP",~O
xl < bi (",)!> x~
and
S (xr -bi(w) )dP",~O.
x~ < bi (",)!> i;
Hence
S (Xi -b;(w) )dP",
bi("')!>Xi

is convex in Xi. D
Suppose now, that there exist IX;, Pi such that
lXi~bi(W)~Pi for all wED.
Then from
Q;(xi)=qioi-qixi-qi S (bi(w)-Xi)dP",
bi("')!>Xi

we know that

and

i.e. only on (lXi, Pi) the function Qi(Xi) may be nonlinear.


4. Simple Recourse 59

Thus, it seems desirable to separate the nonlinear and linear terms by constructing
a new objective function which yields the same solution set as
m

I
i= 1
Qi(XJ

This may be done by introducing the variables Xit,Xi2,Xi3 and the following con-
straints:
-Xit +Xi2+Xi3=Xi- Vi
Xit ~ Vi -rxi
Xi2 S Pi -rxi
Xi2~O
Xi3~O
(Xit ~ 0 follows from Vi ~ rxJ
Let ~i(Xi) be the set of all feasible (Xit, Xi2, Xi3).
If
,nxJ= J (Xi-bi(w»)dPro
and bi(ro):SXi

IP/Xit,Xi2,Xi3) = Xi3 + J
bi(ro):S cti + Xi2
(Xi2 +rxi -bi(w) )dPro,

then we can state the following


Theorem 21. t/ti(xJ=min lP i(Xil,Xi2,Xi3).
~i(Xi)

Proof Let (Xil,Xi2,Xi3)E~i(XJ be arbitrarily chosen.


t/Ji(XJ= J
bi(ro):SXi
(Xi-bi(w»)dPro

J
bi(ro) :SXi
(hi-Xil +Xi2+Xi3- bi(w»)dPro

= Xi3· Pro (bi(w) S Xi)+ (Vi -Xil -rxi)Pro (bi(w)SXi)+


+ J (Xi2 + rxi - bi(w) )dPro.
bi(ro):Sxi

t/J/xi)SXi3+ J
bi(ro) :SXi
(Xi2+ rxi- bi(w»)dPro

S Xi3 + J (Xi2 + rxi - bi(w) )dPro


bi(ro) ~ Xi2 + lXi

The last inequality is obvious, if Xi2+rxi~Xi' and if Xi2+rxi<Xi, it follows from


Xi2+rxi-bi(W)<O for bi(w) such that Xi2+rxi<bi(w)SXi.
Now it suffices to show that there exists (Xfl,xf2,xf3)E~/Xi) such that

lPi(xfl, Xf2, Xf3) = t/J i(X;).


60 Chapter III. Two Stage Problems

This may be achieved by determinirig (XfbXf2,xf3)E~;(xi)' as follows:


For Xi:::;IXi:xfl=bi-Xi, Xf2=xf3=0

=>cP;(Xfl' Xf2' Xf3) = f (Xi -bi(w) )dP", = t/li(X;).


bi (w):5 Xi

For Xi>f3i:xfl=bi-lXi, Xf2=f3i-IXi' Xf3=Xi-f3i


=>cPi(xfl, Xf2' Xf3) = Xi - f3i + f3i -bi

Where bi(w) is integrable and bounded below by lXi' but not essentially
bounded above we have
Corollary 22. Let

f
and

Then t/I ;(xi) = min cPi(Xil, Xi2)'


\!li(Xi)

Proof The proof follows immediately from that of Th. 21 by setting Xi3 =0 and
f3i= +=.0
Now the problem
Min{Q(x)+c'x},
xeX

where X is usually some convex polyhedral set, may be rewritten as

Min{tl Qi(Xi) + C'x}

subject to X-Ax=O
XEX

which is, by the proof of Th. 20, the same as

subject to
X -Ax=O
XEX.

Since, by assumption, iii = qi + qi ;: : : 0, i = 1, ... , m, it follows from Th. 21 that this


problem has the same solution set with respect to x as the following one:
4. Simple Recourse 61

subject to bi - Xil + Xi2 + Xi3 - Ai X =0 where Ai is the i-th row of A


Xil2 bi -(Xi
Xi2 :::;, Pi - (Xi
Xi2 20
Xi3 20
xEX.

In case that Pi =
Xi2:::;' Pi -(Xi·
+ 00, we set, as in Cor. 22, Xi3 = °and omit the constraint
It seems worthwhile mentioning that this representation of the problem implies
that contrary to the general complete recourse case, for the simple recourse model,
where q+, q- and A are constant, only the probability distribution of every
blw) has to be known, but not their joint distribution. This also means that it does
not matter whether the random variables bi(w) are stochastically independent
or not.
To illustrate the above result let us give some examples. First suppose that the
random variables bi(w) have finite discrete probability distributions, i.e.

·bi(w) = b il with probability Pib 1= 1, ... ,Ki.

L Pil = 1, bil = (Xi' bi"i = Pi·


"i

where bil < bil +1 and Pil> 0,


1= 1

Then, if b iv :::;' (Xi + Xi2:::;' b iv +1 for some v, 1:::;, v:::;, Kb

J (Xi2+(Xi- b l w ))dPw= L (Xi2+(Xi- bil)Pil


bi(w)"Xi2 + <Xi {lib" "Xi2 + <X;)
= L (biv-bil)Pil+ {1:b,,"Xi2+<Xi}
{/:bil "Xi2+<X;)
L (Xi2+(Xi- biv)Pil
v-1
= L (biv -bil)Pil+ iJ2· F iv
1= 1

where

and
v

Fiv= LPil·
1= 1

Hence the objective function is linear in Xi2 on every interval

If we choose
il2=b il +1 -bi/, 1:::;,I:::;,v-1,

il2 =0, I> v


62 Chapter III. Two Stage Problems

then
bj (ro):5:
J+
Xi2 aj
(Xi2 +OCi -bi(w) )dP=
v- 1 v- 1

= L L (bik+ 1 -bik)Pi/ + ihFiv


1= 1 k=1
v- 1 v- 1

= L L i~2Pi/ + ff2 Fiv


1= 1 k=1
V xi

= L i12 Fi/ = L il2Fil


1= 1 1= 1
and

Since F il <Fi/+l, 1=1, .. "/(i-1, it is obvious that


Xi Xi

LiI2 Fi/=min Lxl2Fil


1= 1 1= 1
";
subject to L xl2 = Xi2
1= 1
O:$;xb:$;bi/+ 1 -bi/.
Regarding qi ~ 0, therefore we may solve the linear program (provided that X
is convex polyhedral):

Min{ i {qt Xii -qt Xi2 +qiXi3 +qi f xI2 Fi/}+C'X}


i= 1 1= 1

subject to b i -Xil + Xi2 + Xi3 -Aix=O


Xil ~bi-OCi
O:$; Xi2:$; f3i -OCi
Xi3~0
";
Lxl2 -Xi2=0
1= 1
O:$; xl2 b
:$; il + 1 - bil
XEX.

Next, we suppose that the random variables bi(w) are uniformly distributed on
[OCi, f3i]' i.e. the distribution is determined by the density function

I:()
Jir= 11
f3i- oc/
O
OC(5:;'r~f3i (OCi< f3i)

otherwise.
Then we get, since
0< X·2<f3·-OC·
- r I - l'
4. Simple Recourse 63

Hence - as was first pointed out by Beale [2] - we have to solve the convex
quadratic program

subject to bi -Xil +XiZ +Xi3 -Aix=O


T p. -IX'
Xit2:Ui-lXi=~

O~XiZ~Pi-lXi
Xi32:0
xeX.
Finally, let us assume that the random variables bi(w) are exponentially
distributed with density functions
r2:0
otherwise
where A.i>O, i.e.lXi=O, Pi= +00.
Then we have
J J(XiZ -r)e-).itdr
Xi2

(Xi2 +lXi -bi(w) )dPro =A.i


bi(ro):5 ~i+ )(i2 0

= Xi2 +1.(e-.t i )(i2 -1).


,
Hence we get the convex program

· {~
M 10 { + - ij; -).)(2 qi} -,}
.~ qi Xit +qi XiZ +-:e i ' - - : +C x
,-1 A., A,
subject to bi -Xil +Xil -Aix=O
- 1
·I>b·=-
X, -, X
XiZ2:0 '
xeX.
Using the taylor series

the objective function of this program may be written as


m { + + - A. i 2
qi - <Xl v • ,-I
/'i XiZ
V }
_, }
Q(X.l,X.z,X)=.L
{ qiXil-qiXiz+-Xn+qiL(-1) ,. +CX
,=1 2 v=3 V.

which may be approximated by its.first and second order terms, i.e. by

, { m { + + qiA.i
Q(X.bX.Z,X)= i~1 qi Xil-qi XiZ+-2-Xil +CX .
2} _,}
64 Chapter III. Two Stage Problems

It is certainly more convenient to solve the approximating quadratic program


instead of the more complicated convex program. But then one should have, at
least a posteriori, some information on the accuracy of the approximating
solution, i.e. one needs at least an a posteriori error bound.
Rewriting the objective functions

Q(X.bX.2,X)={t{qtXil-qtXi2+ijiXi2+ ~: (e-,,;x;2_1)}+C'x}

we have

= L ~hiZ)'
i= 1
From
Ai(O) =0,
Ai(XiZ)=ij[1-e~";X;2 -AiXi2]=>Ai(0) =0, and
A;'(XiZ)= iji [Aie -";X;2 -A;]~O for Xi2 ~ O(Ai> 0, iji~ 0),

°
it follows that Ai(X;z) ~ and hence
A(X.2) ~ °
for X.2 ~ 0, i. e.
Q(X.bX.2'X)~ Q(X.1,X.2,X).
On the other hand, it follows from

q̄_i [x_{i2} + (1/λ_i)(e^{−λ_i x_{i2}} − 1)] ≥ 0  for x_{i2} ≥ 0

that

Q(x_{.1}, x_{.2}, x) ≥ Σ_{i=1}^{m} {q_i^+ x_{i1} − q_i^+ x_{i2}} + c'x = −q^{+}'Ax + q^{+}'b̄ + c'x = L(x).

Therefore, if x*, x** and x̄ are minimal feasible points with respect to Q, Q̃ and
L, we know from Th. 21 that

L(x̄) ≤ L(x*) ≤ Q(x*_{.1}, x*_{.2}, x*) ≤ Q(x**_{.1}, x**_{.2}, x**)
   = Σ_{i=1}^{m} {q_i^+ b̄_i − q_i^+ A_i x**} + Σ_{i: A_i x** > 0} q̄_i [A_i x** + (1/λ_i)(e^{−λ_i A_i x**} − 1)] + c'x**
   ≤ Q̃(x**_{.1}, x**_{.2}, x**).

It is obvious that the bounds Q̃(x**_{.1}, x**_{.2}, x**) and L(x̄), which are determined
by solving a quadratic and a linear program, depend essentially on the data q^+,
q^−, b, A and the feasible set X.

5. Computational Remarks

From the theory developed so far it seems rather difficult to get a numerical
solution of a general two-stage program with some arbitrarily given joint probability
distribution. Take for example a complete fixed recourse problem, the distribution
of which is given by a density function. In this case we have to minimize a con-
tinuously differentiable convex objective function Q(x) subject to x ∈ X. If X is a
bounded convex polyhedral set, this problem can theoretically be solved by the
following special method of feasible directions:
Given x^k ∈ X, solve the linear program
   Min x'∇Q(x^k)  subject to x ∈ X.
If x^k solves this linear program, then x^k solves the original problem Min Q(x)
subject to x ∈ X. Otherwise let y^k be a solution of the linear program. Then solve
the one-dimensional problem
   Min Q(λx^k + (1 − λ)y^k)  subject to 0 ≤ λ ≤ 1,
yielding λ_k. Now restart the procedure with
   x^{k+1} = λ_k x^k + (1 − λ_k) y^k.
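For illustration, a schematic sketch of this iteration (not from the text) is given below; it assumes X is supplied as a bounded polyhedron {x : A_ub x ≤ b_ub} and that Q and ∇Q are available as callables, and uses scipy for the linear subproblem and the line search:

# Feasible-direction (Frank-Wolfe type) iteration, as described above.
import numpy as np
from scipy.optimize import linprog, minimize_scalar

def feasible_directions(Q, gradQ, A_ub, b_ub, x0, iters=50, tol=1e-8):
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        g = gradQ(x)
        # linear subproblem:  min  y' grad Q(x_k)  over  y in X
        y = linprog(c=g, A_ub=A_ub, b_ub=b_ub,
                    bounds=[(None, None)] * len(x), method="highs").x
        if g @ (x - y) < tol:          # x_k is (near-)optimal
            return x
        # one-dimensional problem:  min_{0<=lam<=1}  Q(lam x_k + (1-lam) y_k)
        lam = minimize_scalar(lambda t: Q(t * x + (1 - t) * y),
                              bounds=(0.0, 1.0), method="bounded").x
        x = lam * x + (1 - lam) * y
    return x

The practical obstacle noted in the following paragraph is of course that each call of Q and gradQ requires the multiple integrals over the polyhedral sets 𝔅_i(x).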

It is well known that this method converges to a solution of the original problem
Min Q(x) subject to x ∈ X. However, this procedure involves the repeated
evaluation of Q(x) and ∇Q(x) which, as we know from the proof of Th. 12, are
given by sums of multiple integrals over sets 𝔅_i(x), which are polyhedral and
depend on x. This type of numerical integration seems not to be completely
investigated in numerical analysis; one can only be sure that the amount of work
evaluating these integrals is tremendous. Therefore, it does not seem to be
reasonable to apply the above procedure. For an alternative approach we may
get hints from the examples in section 4. There we have seen that in the simple
recourse case, where only b is random, a finite discrete distribution of b leads to a
linear program and a uniform distribution of b/s yields a quadratic program.
Finally we gave an a posteriori error estimate for approximating the nonlinear
program resulting from exponential distributions by a special quadratic program.
From these examples it seems obvious to try the following approach: approximate
the given two-stage problem by a special optimization problem which may be
handled more easily, e.g. by a linear or quadratic program. Then the only problem
consists in finding reasonable error estimates.
Suppose for example that the given two-stage problem is of the simple recourse
type where only b is random and the distribution of b_i is given by the
distribution function F_i(τ) (F_i(α_i) = 0, F_i(β_i) = 1). According to the last section the
objective function of the problem is

Q(x_{.1}, x_{.2}, x_{.3}, x) =
   Σ_{i=1}^{m} { q_i^+ x_{i1} − q_i^+ x_{i2} + q_i^− x_{i3} + q̄_i ∫_{α_i}^{α_i + x_{i2}} (x_{i2} + α_i − τ) dF_i(τ) } + c'x

where 0 ≤ x_{i2} ≤ β_i − α_i. Replacing F_i(τ) by the discrete distribution concentrated on the points

τ_ν = α_i + (ν/n)(β_i − α_i),  ν = 0, 1, ..., n,  where n is an arbitrary positive integer,

yields a new objective Q̃(x_{i1}, x_{i2}, x_{i3}, x) which is piecewise linear in x_{i2} and, as we know, may
be replaced by a linear objective function with 2m + m·n instead of 3m x-variables.
To get an error estimate for the optimal value of this approximating linear
program, we need a bound for

| Q̃(x_{i1}, x_{i2}, x_{i3}, x) − Q(x_{i1}, x_{i2}, x_{i3}, x) |.
From the definition of Riemann-Stieltjes integrals we know that

s_n = Σ_{ν=0}^{K} (x_{i2} + α_i − τ_{ν+1}) [F_i(τ_{ν+1}) − F_i(τ_ν)]
    ≤ ∫_{α_i}^{α_i + x_{i2}} (x_{i2} + α_i − τ) dF_i(τ)
    ≤ Σ_{ν=0}^{K} (x_{i2} + α_i − τ_ν) [F_i(τ_{ν+1}) − F_i(τ_ν)] = S_n,

where K ≤ n is the greatest integer such that τ_K ≤ α_i + x_{i2}.
At the same time

S_n = ∫_{α_i}^{α_i + x_{i2}} (α_i + x_{i2} − τ) dF̃_i(τ),

where F̃_i denotes the discrete approximation. Therefore, from

| ∫_{α_i}^{α_i + x_{i2}} (α_i + x_{i2} − τ) [dF̃_i(τ) − dF_i(τ)] |
   ≤ | S_n − s_n | = | Σ_{ν=0}^{K} (τ_{ν+1} − τ_ν) [F_i(τ_{ν+1}) − F_i(τ_ν)] | ≤ (1/n)(β_i − α_i)

it follows that

| Q̃(x_{.1}, x_{.2}, x_{.3}, x) − Q(x_{.1}, x_{.2}, x_{.3}, x) | ≤ (1/n) Σ_{i=1}^{m} q̄_i (β_i − α_i);

this is the desired error estimate which obviously also remains valid for the optimal
values of Q̃ and Q.
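A small numerical illustration of this bracketing (not from the text; the distribution is hypothetical and F_i is taken as an empirical distribution function) may look as follows:

# Upper and lower Riemann-Stieltjes sums on the grid tau_nu bracket the
# expected shortage, and their gap is at most (beta_i - alpha_i)/n.
import numpy as np

rng = np.random.default_rng(0)
alpha_i, beta_i = 1.0, 5.0
sample = alpha_i + (beta_i - alpha_i) * rng.beta(2.0, 3.0, size=200_000)  # stands in for b_i(omega)

def F(t):                       # empirical distribution function of b_i
    return np.mean(sample <= t)

x_i2 = 2.5                      # some 0 <= x_{i2} <= beta_i - alpha_i
exact = np.mean(np.maximum(x_i2 + alpha_i - sample, 0.0))   # the Stieltjes integral

for n in (5, 20, 80):
    tau = alpha_i + np.arange(n + 1) / n * (beta_i - alpha_i)
    K = int(np.searchsorted(tau, alpha_i + x_i2, side="right")) - 1
    dF = np.diff([F(t) for t in tau[:K + 2]])
    lower = np.sum((x_i2 + alpha_i - tau[1:K + 2]) * dF)     # s_n
    upper = np.sum((x_i2 + alpha_i - tau[:K + 1]) * dF)      # S_n
    print(n, lower <= exact <= upper, upper - lower <= (beta_i - alpha_i) / n + 1e-9)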
If, in the same simple recourse model, F_i(τ) has a continuous density f_i(τ), we
may try another approximation by replacing f_i(τ) by a piecewise constant density
function f̃_i(τ) such that

| f̃_i(τ) − f_i(τ) | ≤ ε  ∀ τ ∈ [α_i, β_i].

Then

| ∫_{α_i}^{α_i + x_{i2}} (x_{i2} + α_i − τ) [f̃_i(τ) − f_i(τ)] dτ | ≤ ε x_{i2}²/2 ≤ ε (β_i − α_i)²/2

and hence

| Q − Q̃ | ≤ ε Σ_{i=1}^{m} q̄_i (β_i − α_i)²/2.

From the last section we know that for constant densities Ji( or) we get quadratic
programs. It is now obvious that piecewise constant densities again yield
quadratic programs.
It is also evident for the general two-stage problem that a finite discrete joint
probability distribution yields a linear program. Suppose that we have the
general two-stage problem

min_{x∈X} {c̄'x + Q(x)}

where

Q(x) = ∫ Q(x,ω) dP_ω,   c̄ = ∫ c(ω) dP_ω,

and

Q(x,ω) = min {q'(ω)y | W(ω)y = b(ω) − A(ω)x, y ≥ 0}.

Suppose furthermore that P_ω is a finite discrete probability distribution, where
the elements ω_i ∈ Ω, i = 1, ..., r, have the probability p_i (p_i ≥ 0, Σ_{i=1}^{r} p_i = 1). Then it
is easily seen that the two-stage problem min_{x∈X} {c̄'x + Q(x)} may be rewritten as

min {c̄'x + Σ_{i=1}^{r} p_i q'(ω_i) y^i}

subject to

A(ω_i)x + W(ω_i)y^i = b(ω_i),  i = 1, ..., r,

x ∈ X,  y^i ≥ 0,

which is a linear program if X is convex polyhedral. This linear program has


(dual) decomposition structure, where the blocks W(Wi) remain unchanged in
case of fixed recourse. Therefore, it seems reasonable - from the computational
point of view - to approximate any probability distribution by a discrete one.
We may conclude from the stability theorems of Kosmol [19] that, under
appropriate assumptions on the choice of the discrete probability measures, the
optimal values of the resulting linear programs converge to the optimal value of
the original problem - at least for compact X and complete fixed recourse.
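The block structure of this deterministic equivalent can be sketched as follows (not from the text; X is assumed here to be of the simple form {x ≥ 0 : Tx = d}, and all data in the small instance are hypothetical):

# Deterministic equivalent LP for a finite discrete distribution:
# variables (x, y^1, ..., y^r), one recourse block per realization omega_i.
import numpy as np
from scipy.optimize import linprog

def two_stage_lp(c, T, d, W, scenarios):
    """scenarios: list of (p_i, q_i, A_i, b_i)."""
    n = len(c); m2, n2 = W.shape; r = len(scenarios)
    cost = np.concatenate([c] + [p * q for p, q, _, _ in scenarios])
    A_eq = np.zeros((T.shape[0] + r * m2, n + r * n2))
    b_eq = np.concatenate([d] + [b for _, _, _, b in scenarios])
    A_eq[:T.shape[0], :n] = T
    for i, (_, _, A_i, _) in enumerate(scenarios):
        rows = slice(T.shape[0] + i * m2, T.shape[0] + (i + 1) * m2)
        A_eq[rows, :n] = A_i                                   # A(omega_i) x
        A_eq[rows, n + i * n2: n + (i + 1) * n2] = W           # + W y^i = b(omega_i)
    res = linprog(cost, A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0, None)] * (n + r * n2), method="highs")
    return res.x[:n], res.fun

# tiny hypothetical instance with simple recourse W = (1, -1), q+ = 1, q- = 0
c = np.array([2.0]); T = np.zeros((1, 1)); d = np.zeros(1)
W = np.array([[1.0, -1.0]])
scenarios = [(0.5, np.array([1.0, 0.0]), np.array([[1.0]]), np.array([2.0])),
             (0.5, np.array([1.0, 0.0]), np.array([[3.0]]), np.array([12.0]))]
print(two_stage_lp(c, T, d, W, scenarios))   # optimal value 7 for these data

With fixed recourse the blocks W are identical, which is precisely the (dual) decomposition structure mentioned above.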
To get error estimates, let us state the assumptions
A1) {z | ∃ y ∈ ℝ^n: y ≥ 0, Wy = z} = ℝ^m.
A2) ∀ ω ∈ Ω: {u | u ∈ ℝ^m, W'u ≤ q(ω)} ≠ ∅.
A3) The elements of A(ω), b(ω), q(ω) are square integrable with respect to P_ω.

Hence we require complete fixed recourse so that Q(x,ω) is finite on Ω and
integrable for every x ∈ X (bounded convex polyhedral). If we define the convex
polyhedral cone K_W by

K_W = {q | ∃ u ∈ ℝ^m: W'u − q ≤ 0},

then A2) requires that q(ω) ∈ K_W ∀ ω ∈ Ω.

Let A_ν(ω), b_ν(ω), q_ν(ω) be arrays of the same dimension as A(ω), b(ω), q(ω),
but with simple functions as elements. The corresponding objective functions let
be Q_ν(x,ω) and Q_ν(x) = ∫ Q_ν(x,ω) dP_ω. Obviously the determination of the simple
functions defines a discrete distribution on Ω. We must require that A2) is also
satisfied for q_ν(ω) (at least almost surely). For this purpose, we have to be careful.
If for example

W = (1, −1)

and q(ω) has the range R(q) = {(ξ,η) | ξ ≥ −2.5, η ≥ 2.5}, then A1) and A2) are
satisfied for the original problem. Now let M = {(ξ,η) | −4 < ξ ≤ −2, 1 ≤ η < 3} be
an interval of some partition. Then q^{-1}[M] ≠ ∅, such that M could have a
positive probability. Choosing on M the norm minimal vertex v = (−2, 1) as value
of q_ν(ω) does not satisfy A2), since W'u ≤ q_ν(ω) yields −1 ≤ u ≤ −2. But if we
choose the norm minimal element of the intersection of M and

K_W = {(ξ,η) | ∃ u: −η ≤ u ≤ ξ} = {(ξ,η) | ξ + η ≥ 0},

i.e. q_ν(ω) = (−2, +2), then A2) is satisfied. In general, the analogous way
(choosing the norm minimal element of the intersection of every interval and
K_W) yields a sequence satisfying A2) too.
Let therefore (A_ν(ω), b_ν(ω), q_ν(ω)) be an integrable simple function such that
A2) is satisfied. We want to have an error estimate for the objective function and
hence for the optimal value of the approximating problem min {c̄'x + Q_ν(x) | x ∈ X},
which depends on the approximation of (A(ω), b(ω), q(ω)) by (A_ν(ω), b_ν(ω),
q_ν(ω)), measured by the (generalized) L_2-norm. For any vector-valued function
g: Ω → ℝ^k we define

ρ(g) = ( ∫_Ω ||g(ω)||² dP_ω )^{1/2},

where ||·|| is the Euclidean norm on ℝ^k. In this connection ρ(A) means that the
matrix A(ω) is handled as an (m·n)-vector.

General error. There are constants α, γ, δ_ν such that

| Q(x) − Q_ν(x) | ≤ [α + γ ||x||] ρ(q − q_ν) + δ_ν [ρ(b − b_ν) + ||x|| ρ(A − A_ν)].

This may be seen as follows:
For every convex or concave function φ: ℝ^l → ℝ^1 we have

| φ(x) − φ(y) | ≤ Max [ |(x − y)'∇φ(x)|; |(x − y)'∇φ(y)| ]
              ≤ Max [ ||∇φ(x)|| · ||x − y||; ||∇φ(y)|| · ||x − y|| ],

where ∇φ is the gradient (or some subgradient) of φ. Using basic solutions, from
the former results we get

| ΔQ_ν(x,ω) | = | Q(x, A(ω), b(ω), q(ω)) − Q(x, A_ν(ω), b_ν(ω), q_ν(ω)) |

   ≤ | Q(x, A(ω), b(ω), q(ω)) − Q(x, A(ω), b(ω), q_ν(ω)) |
     + | Q(x, A(ω), b(ω), q_ν(ω)) − Q(x, A_ν(ω), b_ν(ω), q_ν(ω)) |

   = | q̄_i'(ω) B_i^{-1} [b(ω) − A(ω)x] − q̄_{νj}'(ω) B_j^{-1} [b(ω) − A(ω)x] |
     + | q̄_{νj}'(ω) B_j^{-1} [b(ω) − A(ω)x] − q̄_{νk}'(ω) B_k^{-1} [b_ν(ω) − A_ν(ω)x] |

   ≤ Max_{i∈J_1} || B_i^{-1} [b(ω) − A(ω)x] || · || q(ω) − q_ν(ω) ||
     + Max_{i∈J_2} || B_i^{-1}' q̄_{νi}(ω) || · || b(ω) − b_ν(ω) − [A(ω) − A_ν(ω)]x ||,

where {B_i | i∈J_1} and {B_i | i∈J_2} are those bases out of W which, for some x ∈ X
and ω ∈ Ω, are primal feasible and dual feasible respectively.
From Schwarz's inequality it follows, with

z(x,ω) = 0                                    if b(ω) − A(ω)x = 0,
z(x,ω) = (b(ω) − A(ω)x) / || b(ω) − A(ω)x ||  else,

and

r_i(ω) = 0                           if q_ν(ω) = 0,
r_i(ω) = q̄_{νi}(ω) / || q_ν(ω) ||     else,

that

| Q(x) − Q_ν(x) | ≤ ∫_Ω | ΔQ_ν(x,ω) | dP

   ≤ Max_{i∈J_1, ω∈Ω} || B_i^{-1} z(x,ω) || · ρ(b − Ax) · ρ(q − q_ν)
     + Max_{i∈J_2, ω∈Ω} || B_i^{-1}' r_i(ω) || · ρ(q_ν) · ρ(b − b_ν + (A_ν − A)x).

Hence, since ρ(g + h) ≤ ρ(g) + ρ(h), for

α   = Max_{i∈J_1, ω∈Ω} || B_i^{-1} z(x,ω) || ρ(b),
γ   = Max_{i∈J_1, ω∈Ω} || B_i^{-1} z(x,ω) || ρ(A),
δ_ν = Max_{i∈J_2, ω∈Ω} || B_i^{-1}' r_i(ω) || ρ(q_ν)

it follows that

| Q(x) − Q_ν(x) | ≤ [α + γ ||x||] ρ(q − q_ν) + δ_ν [ρ(b − b_ν) + ||x|| ρ(A − A_ν)],

which was the hypothesis.
It must be mentioned that determining the constants α and γ leads in general
to a considerable amount of work, since it implies more or less the inversion of
all nonsingular (m × m)-submatrices of W. This difficulty diminishes rapidly
for certain special cases. Determining δ_ν (or at least an upper bound) is not dif-
ficult, since B_i^{-1}' r_i(ω) is a feasible u-part of the set {(u,q) | W'u ≤ q, ||q|| ≤ 1}, which
is bounded according to A1).
Assume simple recourse, i.e. W = (I, −I). Then

| Q(x) − Q_ν(x) | ≤ [ρ(b) + ||x|| ρ(A)] ρ(q − q_ν) + ρ(q_ν) [ρ(b − b_ν) + ||x|| ρ(A − A_ν)].

Namely, W = (I, −I) implies that for every basis B_i out of W, || B_i^{-1} z(x,ω) || =
|| z(x,ω) || and, by definition, || z(x,ω) || ≤ 1. And for every i∈J_2, B_i^{-1}' r_i(ω)
is a feasible u-part of {(u,q) | W'u ≤ q, ||q|| ≤ 1}, which, writing q' = (q^{+}', q^{−}'),
may be written as

−q^{−} ≤ u ≤ q^{+}.

Obviously every feasible u satisfies ||u|| ≤ 1. Using these observations, we get for
the constants defined in the proof of the general error formula

α ≤ ρ(b),
γ ≤ ρ(A),
δ_ν ≤ ρ(q_ν).
For the general complete recourse problem we get out of the difficulties of
determining α and γ if q(ω) is constant, as is readily seen above. Moreover we can
weaken assumption A3) to integrability instead of square integrability.
Defining a generalized L_1-norm for vector-valued functions as

μ(g) = ∫_Ω || g(ω) || dP,

we get the error estimate:

For constant q(ω) = q there is a constant δ such that

| Q(x) − Q_ν(x) | ≤ δ [μ(b − b_ν) + ||x|| μ(A − A_ν)].

From | ΔQ_ν(x,ω) | above we get

| ΔQ_ν(x,ω) | ≤ Max_{i∈J_2} || B_i^{-1}' q̄_i || · || b(ω) − b_ν(ω) − [A(ω) − A_ν(ω)]x ||

and therefore the assertion follows by integration.

For i∈J_2, B_i^{-1}' q̄_i is feasible in {u | W'u ≤ q}, which is bounded.



Every one of the error estimates above becomes independent of x if A(ω) = A
and q(ω) = q are constant, i.e. in this case we get uniform convergence of
{Q_ν(x)} on X, if the simple functions chosen converge to the remaining random
variables with respect to the appropriate norm.
One might object that this type of approximation is not practicable, since the size
of the approximating problems becomes very large. If for example in (A,b,q)
50 random variables with a joint probability distribution are involved, and if we
discretize in such a way that for every random variable 10 realizations occur,
then we should have m × 10^50 constraints, which cannot be handled. However,
in most of the practical problems there is a small number of random variables
t_1, t_2, ..., t_r, where often r ≤ 5, and (A, b, q) depend on t_1, ..., t_r. If this dependence
looks like
A(t)=Ao+A1t1 + ... +Artr
b(t)=bo +b 1t1 + ... +brtr
q(t)=qO+q1 t 1+ ... +qrt"

then the discretization can be carried through with respect to the random
vector t yielding problems of a size which can be handled today.
It is obvious, then, that all above mentioned error estimates can be expressed in terms of
ρ(t − t^{(ν)}) or μ(t − t^{(ν)}), where t^{(ν)} is a simple function.

6. Another Approach to Two-Stage Programming

In a more general framework of stochastic input-output systems Marti [11]
found some interesting results. The generalization consists essentially in studying
stochastic linear programs on arbitrary topological linear spaces instead of
Euclidean spaces. Then it turns out to be useful to investigate more thoroughly
the optimal value of the second stage problem - which we called Q(x,ω). Restricted
to Euclidean spaces, Marti is first concerned with the complete fixed
recourse objective function, i.e. his second stage program is

Q(x,ω) = min {q'y | Wy = b(ω) − A(ω)x, y ≥ 0}

where W is a complete recourse matrix and q ≥ 0.

The investigation of the optimal value of the recourse program
m(z) = min {q'y | Wy = z, y ≥ 0} for z ∈ ℝ^m yields the following statements:

Lemma 23. Under the above-mentioned assumption:

a) 0 ≤ m(z) < ∞
b) m(0) = 0
c) m(λz) = λ m(z)  ∀ z ∈ ℝ^m, ∀ λ ≥ 0.
d) m(z_1 + z_2) ≤ m(z_1) + m(z_2)  ∀ z_1, z_2 ∈ ℝ^m.

Proof. a) Follows immediately from q ≥ 0 and the complete recourse assumption.
b) y = 0 yields a solution.

c) Suppose λ > 0 and let ȳ be a solution corresponding to z; then m(z) = q'ȳ.

λȳ is feasible with respect to λz, hence m(λz) ≤ q'(λȳ) = λ m(z).
If m(λz) < λ m(z), let ŷ be a solution with respect to λz, i.e. m(λz) = q'ŷ. Then
(1/λ) ŷ is feasible with respect to z and therefore m(z) ≤ (1/λ) q'ŷ = (1/λ) m(λz) < m(z).
From this contradiction it follows that m(λz) = λ m(z).
d) Follows from the convexity theorem (Th. 0.7)

m(λz_1 + (1−λ)z_2) ≤ λ m(z_1) + (1−λ) m(z_2)  and c) for λ = 1/2. □

Definition. Let K ⊂ ℝ^m be convex and such that 0 ∈ ℝ^m is an interior point of K.
Then ρ_K(z) = inf {λ > 0 | z/λ ∈ K} is called the Minkowski functional of K (on ℝ^m).

Theorem 24. Under the assumptions of Lemma 23, K = {z | m(z) ≤ 1} is a convex
polyhedral set, which has 0 ∈ ℝ^m as an interior point.

Proof. From the convexity theorem (Th. 0.7) we know that
1) m(z) is piecewise linear and convex on ℝ^m and hence
2) continuous on ℝ^m (Th. II.12).
Therefore K is convex polyhedral and {z | m(z) < 1} ⊂ K is open and contains
0 ∈ ℝ^m due to Lemma 23 b). □

Theorem 25. m(z) is the Minkowski functional of the set K defined in Th. 24.

Proof. The Minkowski functional of K is defined as

ρ_K(z) = inf {λ > 0 | z/λ ∈ K}
       = inf {λ > 0 | m(z/λ) ≤ 1}   due to Th. 24
       = inf {λ > 0 | m(z) ≤ λ}     due to Lemma 23 c)
       = m(z). □
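As a small numerical illustration (not from the text; the recourse matrix and costs below are hypothetical), m(z) can be evaluated by an LP and then exhibits the homogeneity and subadditivity stated in Lemma 23:

# m(z) = min{q'y | Wy = z, y >= 0} as a Minkowski-type functional.
import numpy as np
from scipy.optimize import linprog

W = np.array([[1.0, -1.0,  0.0, 2.0],
              [0.0,  1.0, -1.0, 1.0]])      # columns positively span R^2 -> complete recourse
q = np.array([1.0, 2.0, 1.0, 3.0])          # q >= 0

def m(z):
    return linprog(q, A_eq=W, b_eq=z,
                   bounds=[(0, None)] * W.shape[1], method="highs").fun

z1, z2 = np.array([3.0, -1.0]), np.array([-2.0, 4.0])
print(m(2.5 * z1), 2.5 * m(z1))                 # homogeneity: the two values coincide
print(m(z1 + z2), m(z1) + m(z2))                # subadditivity: left <= right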

On the other hand we have the converse statement.

Theorem 26. For every Minkowski functional ρ_K(z) of a convex polyhedral set
K ⊂ ℝ^m (which contains by definition 0 ∈ ℝ^m as an interior point) there exists a
complete recourse matrix W and a vector q ≥ 0 such that ρ_K(z) = m(z) = min {q'y |
Wy = z, y ≥ 0}.

Proof. The convex polyhedral set K is the direct sum of a convex polyhedron and a
convex polyhedral cone, i.e. there are vectors p_i ∈ ℝ^m, i = 1, ..., r, and k_j ∈ ℝ^m,
j = 1, ..., s, so that

K = { v | v = Σ_{i=1}^{r} α_i p_i + Σ_{j=1}^{s} β_j k_j;  α_i ≥ 0, β_j ≥ 0, Σ_{i=1}^{r} α_i = 1 }.

Since 0 ∈ ℝ^m is an interior point of K,
for every z ∈ ℝ^m there exist v ∈ K and λ > 0 such that λv = z, where

v = Σ_{i=1}^{r} α_i p_i + Σ_{j=1}^{s} β_j k_j,  α_i ≥ 0, β_j ≥ 0, Σ_{i=1}^{r} α_i = 1.
i= 1

Hence for the matrix

W = (p_1, ..., p_r, k_1, ..., k_s),   the vector   q' = (1, ..., 1, 0, ..., 0)   (with r ones)

and

y = λ (α_1, ..., α_r, β_1, ..., β_s)'

we have

Wy = z,

which implies that W is a complete recourse matrix.
Furthermore we have

Σ_{i=1}^{r} y_i = λ Σ_{i=1}^{r} α_i = λ.

Hence

ρ_K(z) = inf {λ > 0 | z/λ ∈ K}
       = inf {λ > 0 | λv = z, v ∈ K}
       = inf {λ > 0 | Wy = z, y ≥ 0, λ = Σ_{i=1}^{r} y_i}
       = inf {Σ_{i=1}^{r} y_i | Wy = z, y ≥ 0} = min {q'y | Wy = z, y ≥ 0},

which proves the theorem. □


According to this correspondence between optimal values of linear programs
and Minkowski functionals one can take advantage of the knowledge on Minkows-
ki functionals, which have been investigated in functional analysis, to get con-
tinuity and differentiability statements analogous to theorems 10 and 12 in the
complete fixed recourse case with q ≥ 0.
Recently Marti [12] continued this approach, requiring instead of q ≥ 0 the more
general condition of Cor. 16, which guarantees the existence of a solution of the
second stage program in the complete recourse case. Then the functional

ρ(z) = min {q'y | Wy = z, y ≥ 0}

is, in general, not a Minkowski functional, since ρ(z) may be negative. But ρ(z)
still has the following properties:

Lemma 27. a) ρ(λz) = λ ρ(z)  ∀ z ∈ ℝ^m, ∀ λ > 0
b) ρ(z_1 + z_2) ≤ ρ(z_1) + ρ(z_2)
c) ρ(z) is continuous.

Proof. a), b) are proved in the same way as in Lemma 23;
c) follows from the complete recourse and finiteness assumptions and the
convexity of ρ(z). □

Lemma 28. There exist vectors g_i ∈ ℝ^m, i = 1, ..., r,
such that ρ(z) = max_i {g_i'z}.

Proof. According to our assumptions - complete recourse and existence of solu-
tion - for every z ∈ ℝ^m we have a feasible optimal basis B in W, i.e. B is an m × m
nonsingular submatrix of W such that

B^{-1} z ≥ 0

and

ρ(z) = q̄' B^{-1} z,

where q̄ consists of the m components of q belonging to B. Optimality of B means,
according to the simplex criterion,

q' − q̄' B^{-1} W ≥ 0.

Let B_1, ..., B_r be all "optimal" bases in W, i.e. all nonsingular m × m submatrices
of W fulfilling

q' − q̄_i' B_i^{-1} W ≥ 0.

Due to the duality theorem

ρ(z) = max {z'g | W'g ≤ q}.

Since g_i = (q̄_i' B_i^{-1})', i = 1, ..., r, is feasible in this dual program, we have

z'g_i ≤ ρ(z),  i = 1, ..., r,  where equality holds for at least one g_i. □

According to Lemma 28 we may rewrite Q(x) as

Q(x) = ∫ ρ(b(ω) − A(ω)x) dP_ω = ∫ max_{1≤i≤r} {g_i'(b(ω) − A(ω)x)} dP_ω.
From this representation we may conclude an error estimate for the discretization
mentioned in Section 5, at least for the case when the distribution of (A, b) is
concentrated on a bounded interval and X is bounded.
Suppose that 𝔐 is an interval in the (A, b)-space with P(𝔐) = 1, and is partitioned
into intervals 𝔐_j - i.e. 𝔐_i ∩ 𝔐_j = ∅ for i ≠ j and ∪_{j=1}^{s} 𝔐_j = 𝔐 - such that, for d_1, d_2 ∈ 𝔐_j,

|| d_1 − d_2 || ≤ δ.

Let p_j = P_ω ({ω | (A(ω), b(ω)) ∈ 𝔐_j}).
For some x ∈ X let (A_{jk}, b_{jk}) ∈ 𝔐_j, k = 1, 2, be such that

ρ(b_{j1} − A_{j1}x) ≤ ρ(b − Ax) ≤ ρ(b_{j2} − A_{j2}x),  ∀ (A, b) ∈ 𝔐_j.

Then

Σ_{j=1}^{s} p_j ρ(b_{j1} − A_{j1}x) ≤ Q(x) = ∫ ρ(b(ω) − A(ω)x) dP_ω ≤ Σ_{j=1}^{s} p_j ρ(b_{j2} − A_{j2}x).

If we choose some (A_j, b_j) ∈ 𝔐_j we can approximate Q(x) by

Q̃(x) = Σ_{j=1}^{s} p_j ρ(b_j − A_j x),

which yields a linear program, as shown in sec. 5; we then get the error estimate
| Q(x) − Q̃(x) | ≤ Σ_{j=1}^{s} p_j [ρ(b_{j2} − A_{j2}x) − ρ(b_{j1} − A_{j1}x)]

   ≤ Σ_{j=1}^{s} p_j [ max_i ||g_i|| {δ + δ ||x||} ]   (using the convexity of ρ, see Th. 0.11)

   = max_i ||g_i|| {δ + δ ||x||}

   ≤ C·δ,  since ||x|| is bounded.

This justifies the discretization, because if x̄ and x̃ are solutions of
min_{x∈X} {c'x + Q(x)} and min_{x∈X} {c'x + Q̃(x)} respectively,
and, without loss of generality, c'x̄ + Q(x̄) < c'x̃ + Q̃(x̃), then

0 < c'x̃ + Q̃(x̃) − c'x̄ − Q(x̄) ≤ Q̃(x̄) − Q(x̄) ≤ C·δ.


But it must be pointed out that there are still some difficulties in calculating this
error bound, since determining

max_i || g_i || = max_i || q̄_i' B_i^{-1} || ≤ || q || · max_i || B_i^{-1} ||

is not at all trivial, if one does not want to determine all inverses of bases in W.
For the special case where q > 0 and c'x ≥ 0 on X, the boundedness assumption
on X is not restrictive, as is indicated by the following theorem, because there
then exists T > 0 such that ρ(z) ≥ T ||z||, due to Lemma 28.

Theorem 29. If c'x ≥ 0 for x ∈ X and there is a real T > 0 such that ρ(z) ≥ T ||z||, and
if P_ω ({ω | A(ω)x = 0}) < 1 for every x ≠ 0, then there is a compact set 𝔅 such that

inf_{x∈X} {c'x + Q(x)} = inf_{x∈X∩𝔅} {c'x + Q(x)}.

Proof. Let

φ(x) = ∫ || A(ω)x || dP_ω.

Then, from the assumption, it follows that

φ(x) > 0 for x ≠ 0  and  φ(0) = 0,
φ(λx) = |λ| φ(x),
φ(x + y) ≤ φ(x) + φ(y).

Hence φ(x) is a norm on ℝ^n and there exists a real K > 0 such that

φ(x) ≥ K ||x||.

Now for x ∈ X

c'x + Q(x) ≥ Q(x) = ∫ ρ(b(ω) − A(ω)x) dP_ω ≥ T ∫ || b(ω) − A(ω)x || dP_ω

   ≥ T ∫ | ||b(ω)|| − ||A(ω)x|| | dP_ω ≥ T | ∫ ||b(ω)|| dP_ω − ∫ ||A(ω)x|| dP_ω |

   ≥ T ( φ(x) − ∫ ||b(ω)|| dP_ω )

   ≥ T K ||x|| − T E_{P_ω} ||b||.

For an arbitrary x̂ ∈ X and every x such that

||x|| > (1/K) E_{P_ω} ||b|| + (1/(TK)) (Q(x̂) + c'x̂) = r(x̂),

we have

c'x + Q(x) ≥ T K ||x|| − T E_{P_ω} ||b|| > Q(x̂) + c'x̂.

Hence

inf {c'x + Q(x) | x ∈ X} = inf {c'x + Q(x) | x ∈ X, ||x|| ≤ r(x̂)}. □

In the complete recourse case - q ≥ 0 and X bounded are not required - there
are upper and lower bounds of inf_{x∈X} {c'x + Q(x)} given by inequalities first proved
by Madansky [9], which are obtained by solving the linear program (provided
that X is a convex polyhedral set) resulting from replacing the random variables
A(ω), b(ω) in the two-stage program by their expectations Ā, b̄.

Theorem 30. Suppose that x̄ is a solution of the problem

min_{x∈X} {c'x + ρ(b̄ − Āx)}.

Then

c'x̄ + ρ(b̄ − Āx̄) ≤ inf_{x∈X} {c'x + ∫ ρ(b(ω) − A(ω)x) dP_ω} ≤ c'x̄ + ∫ ρ(b(ω) − A(ω)x̄) dP_ω.

Proof. By Lemma 28

ρ(z) = max_i g_i'z.

For given x ∈ X choose g_{i_0} so that

g_{i_0}'(b̄ − Āx) = ρ(b̄ − Āx).

Then

ρ(b̄ − Āx) = ∫ g_{i_0}'(b(ω) − A(ω)x) dP_ω
           ≤ ∫ max_i g_i'(b(ω) − A(ω)x) dP_ω
           = ∫ ρ(b(ω) − A(ω)x) dP_ω   ∀ x ∈ X

and, therefore,

c'x̄ + ρ(b̄ − Āx̄) ≤ c'x + ρ(b̄ − Āx) ≤ c'x + ∫ ρ(b(ω) − A(ω)x) dP_ω   ∀ x ∈ X,

which yields the first inequality. The second one is trivial. □



In the special case, where only b(ω) is random, we get the following
inequalities derived from A. Madansky [9]:

Theorem 31. Under the assumptions of Th. 30 and for deterministic A and
c the following inequalities hold:

c'x̄ + ρ(b̄ − Ax̄) ≤ ∫ inf_{x∈X} {c'x + ρ(b(ω) − Ax)} dP_ω
                ≤ inf_{x∈X} {c'x + ∫ ρ(b(ω) − Ax) dP_ω}
                ≤ c'x̄ + ∫ ρ(b(ω) − Ax̄) dP_ω.

Proof. Observe that under our assumptions

φ(b) = inf_{x∈X} {c'x + ρ(b − Ax)}  is a convex function of b,

because

φ(b) = inf {c'x + q'y | Ax + Wy = b, y ≥ 0, x ∈ X}.

Therefore

φ(b) ≥ φ(b̄) + g'(b − b̄),

where g is the gradient of φ in b̄ (or some subgradient).
Integration of this inequality with respect to P_ω yields

∫ φ(b(ω)) dP_ω ≥ φ(b̄),

which is the first inequality of the theorem, whereas the other ones are trivial. □

As we know from Ch. I,

α = ∫ inf_{x∈X} {c'x + ρ(b(ω) − Ax)} dP_ω

is the expectation of the optimal value in the "wait and see" case, and

β = inf_{x∈X} {c'x + ∫ ρ(b(ω) − Ax) dP_ω}

is the optimal value obtained by the two-stage model in the "here and now"
situation. M. Avriel and A.C. Williams [1] call the difference β − α the expected
value of perfect information (EVPI). From Th. 31 we get bounds for the EVPI:

0 ≤ β − α ≤ ∫ ρ(b(ω) − Ax̄) dP_ω − ρ(b̄ − Ax̄),

where x̄ can be determined by solving a linear program. Unfortunately these
bounds for the EVPI are not valid if also A(ω) and c(ω) are random, because the
first inequality of Th. 31 does not hold in general, as the following example shows:
Example. Let W = (1, −1), q^+ = 1, q^- = 0,

X = {x | x ∈ ℝ, x ≥ 0}  and

P_ω ((A, b, c) = (1, 2, 2)) = P_ω ((A, b, c) = (3, 12, 2)) = 1/2.

Then

Ā = 2,  b̄ = 7,  c̄ = 2  and

c̄x̄ + ρ(b̄ − Āx̄) = Min_{x∈X} {c̄x + q^+ y^+ | Āx + y^+ − y^- = b̄, y^+ ≥ 0, y^- ≥ 0}
               = 7,  where x̄ = 7/2.

But

α = ∫ inf_{x∈X} {cx + ρ(b(ω) − A(ω)x)} dP_ω =

  = (1/2) Min_{x∈X} {2x + y_1^+ | x + y_1^+ − y_1^- = 2, y_1^+ ≥ 0, y_1^- ≥ 0}
  + (1/2) Min_{x∈X} {2x + y_2^+ | 3x + y_2^+ − y_2^- = 12, y_2^+ ≥ 0, y_2^- ≥ 0}

  = (1/2)·2 + (1/2)·8 = 5 < c̄x̄ + ρ(b̄ − Āx̄).

Hence the first inequality of Th. 31 does not hold in this case.
Further

β = inf_{x∈X} {c̄x + ∫ ρ(b(ω) − A(ω)x) dP_ω}

  = Min {2x + (1/2)y_1^+ + (1/2)y_2^+ | x + y_1^+ − y_1^- = 2;
         3x + y_2^+ − y_2^- = 12;  x ≥ 0, y_i^+ ≥ 0, y_i^- ≥ 0}

  = 7  (choosing x = 2, y_2^+ = 6)  and

c̄x̄ + ∫ ρ(b(ω) − A(ω)x̄) dP_ω = 7 + 3/4 = 7.75.

Hence,

∫ ρ(b(ω) − A(ω)x̄) dP_ω − ρ(b̄ − Āx̄) = 0.75 < β − α = 2,

which shows that the bound given above for the EVPI is not valid in the more
general case of random A(ω) and c(ω).
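The numbers of this example are easy to verify numerically; a minimal check (not from the book, using a crude bounded one-dimensional minimization) could look as follows:

# Verifying alpha = 5, beta = 7 and the (here invalid) EVPI bound 0.75 < 2.
from scipy.optimize import minimize_scalar

scen = [(1.0, 2.0), (3.0, 12.0)]                 # (A_i, b_i); c = 2 in both scenarios
rho = lambda t: max(t, 0.0)                      # recourse value for W = (1,-1), q+ = 1, q- = 0

def minimize_over_x(f, hi=20.0):                 # min over 0 <= x <= hi (x >= 0 suffices here)
    return minimize_scalar(f, bounds=(0.0, hi), method="bounded").fun

wait_and_see = 0.5 * sum(minimize_over_x(lambda x, A=A, b=b: 2*x + rho(b - A*x))
                         for A, b in scen)                                       # alpha = 5
here_and_now = minimize_over_x(lambda x: 2*x + 0.5*sum(rho(b - A*x) for A, b in scen))  # beta = 7
x_bar = 3.5                                       # solution of the mean value problem (Abar=2, bbar=7)
bound = 0.5 * sum(rho(b - A*x_bar) for A, b in scen) - rho(7.0 - 2.0*x_bar)      # = 0.75
print(wait_and_see, here_and_now, bound, here_and_now - wait_and_see)           # ~5, 7, 0.75, 2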
IV. Chance Constrained Programming

1. Convexity Statements

Whereas two-stage problems, as we have seen in the last chapter, are rather well-
behaved from the viewpoint of optimization theory as far as convexity, continuity
and differentiability are concerned, this is in general not true for chance con-
strained programming problems. There are essentially two different versions of
chance constrained programs, namely either

min φ(x)
(1)  subject to  P_ω ({ω | A(ω)x ≥ b(ω)}) ≥ α
     x ∈ X

or

min φ(x)
(2)  subject to  P_ω ({ω | A_i(ω)x ≥ b_i(ω)}) ≥ α_i,  i = 1, ..., m
     x ∈ X

where A_i(ω) indicates the i-th row of A(ω) and b_i(ω) is the i-th component of b(ω).
Given that φ(x) is a convex function and X is a convex set, the main question is
whether the sets

X(α) = {x | P_ω ({ω | A(ω)x ≥ b(ω)}) ≥ α}

and

X_i(α_i) = {x | P_ω ({ω | A_i(ω)x ≥ b_i(ω)}) ≥ α_i}  are convex.

The following example shows that the convexity of these sets cannot be
guaranteed in general.

Example 1. Let (a, b)' be a two-dimensional random variable with the discrete
distribution

P ((a, b) = (−3, −1)) = 1/3,   P ((a, b) = (3, 2)) = 2/3.

To get

P ({(a, b) | ax ≥ b})

for a certain value of x ∈ ℝ, we have to check which of the two constraints

−3x ≥ −1  and  3x ≥ 2



is satisfied. Obviously

x ≤ 1/3        yields  −3x ≥ −1 and 3x ≱ 2,
1/3 < x < 2/3  yields  −3x ≱ −1 and 3x ≱ 2,
and 2/3 ≤ x    yields  −3x ≱ −1 and 3x ≥ 2.

Hence

P ({(a, b) | ax ≥ b}) =  1/3  for x ≤ 1/3,
                         0    for 1/3 < x < 2/3,
                         2/3  for x ≥ 2/3,

which implies that X(α) is

disconnected and hence not convex  for 0 < α ≤ 1/3,
convex                             for 1/3 < α ≤ 2/3,
empty                              for 2/3 < α.

The following theorem is the only general convexity statement on X(α) that can
be made disregarding the probability distribution P_ω.

Theorem 1. X(0) and X(1) (resp. X_i(0) and X_i(1)) are convex.

Proof. a) X(0) = ℝ^n.
b) Suppose that X(1) ≠ ∅. For x_i ∈ X(1), i = 1, 2, define

Ω_i = {ω | A(ω)x_i ≥ b(ω)},  i = 1, 2.

Then

P_ω(Ω_i) = 1, i = 1, 2,  and
P_ω(Ω_1 ∩ Ω_2) = 1  (see proof of Th. III.4),

and for ω ∈ Ω_1 ∩ Ω_2

A(ω)x_i ≥ b(ω)

and therefore

A(ω)(λx_1 + (1−λ)x_2) ≥ b(ω)  for 0 ≤ λ ≤ 1.

Hence for λ ∈ [0, 1]

Ω_λ = {ω | A(ω)(λx_1 + (1−λ)x_2) ≥ b(ω)} ⊇ Ω_1 ∩ Ω_2,

implying

λx_1 + (1−λ)x_2 ∈ X(1)  for λ ∈ [0, 1]. □

Th. 1 may obviously be restated in the following way:



Corollary 2. Given P_ω there are real numbers α^0 and α_i^0 such that X(α) and X_i(α_i)
are convex sets for α ≥ α^0 and α_i ≥ α_i^0.

According to Example 1 and Cor. 2 one has to determine α^0 resp. α_i^0 for each
particular probability distribution P_ω. Among others who were concerned with
these problems, Marti [10] found the results Th. 3 and Th. 5.

Theorem 3. Let P_ω be a finite discrete probability distribution, i.e.

p_i = P_ω(ω_i) > 0,  i = 1, ..., r  and  Σ_{i=1}^{r} p_i = 1.

For α^0 = max_{1≤i≤r} (1 − p_i) and α_i^0 = max_{1≤i≤r} (1 − p_i)

the sets X(α) and X_i(α_i) are convex for α > α^0 resp. α_i > α_i^0.

Proof. For N = {1, ..., r} and I ⊂ N, I ≠ N,

Σ_{i∈I} p_i ≤ 1 − p_j  ∀ j ∈ N − I
            ≤ max_{j∈N−I} (1 − p_j) ≤ max_{j∈N} (1 − p_j).

Hence for I ⊆ N

Σ_{i∈I} p_i > max_{j∈N} (1 − p_j)

implies I = N.

This yields for x ∈ X(α), where α > max_{1≤i≤r} (1 − p_i),

P_ω ({ω | A(ω)x ≥ b(ω)}) = Σ_{i: A(ω_i)x ≥ b(ω_i)} p_i ≥ α

and hence

P_ω ({ω | A(ω)x ≥ b(ω)}) = 1;

and this implies X(α) = X(1), which is convex by Th. 1. □

For finite discrete distributions the condition α > max_{1≤i≤r} (1 − p_i) is, by Th. 3,
sufficient for convexity but not necessary, as may be seen in Example 1, where
X(α) is convex for α > 1/3, but max_i (1 − p_i) = 2/3. However, the condition cannot be
weakened in general, as the following example demonstrates.
Example 2. Let P_ω be a discrete distribution so that

p_1 = P_ω(ω_1) = 1/4;  p_2 = P_ω(ω_2) = 1/2;  p_3 = P_ω(ω_3) = 1/4.

Let

A(ω_1) = ( 1  −1 )     A(ω_2) = (  1  −1 )     A(ω_3) = ( −1  −1 )
         ( 0   1 ),             ( −2  −3 ),             ( −1   3 )

and

b(ω_1) = ( −2 )     b(ω_2) = (   0 )     b(ω_3) = ( −8 )
         (  3 ),             ( −25 ),             (  0 ).

If

K(ω_1) = {(ξ, η) ∈ ℝ² | ξ − η ≥ −2;  η ≥ 3}

K(ω_2) = {(ξ, η) ∈ ℝ² | ξ − η ≥ 0;  −2ξ − 3η ≥ −25}

K(ω_3) = {(ξ, η) ∈ ℝ² | −ξ − η ≥ −8;  −ξ + 3η ≥ 0},

then

P_ω ({ω | A(ω)x ≥ b(ω)}) = Σ_{i∈I(x)} p_i,

where

I(x) = {i | x ∈ K(ω_i)}.

Since max_{1≤i≤3} (1 − p_i) = 3/4, we know that, for

α > 3/4,  X(α) = X(1) is convex.

Here X(1) = K(ω_1) ∩ K(ω_2) ∩ K(ω_3) is the triangle with the vertices (3,3), (5,3),
(4,4).
But for α = 3/4 we get

X(3/4) = [K(ω_1) ∩ K(ω_2)] ∪ [K(ω_2) ∩ K(ω_3)],

which is not convex, because

x = (6, 2) ∈ K(ω_2) ∩ K(ω_3)
y = (6, 4) ∈ K(ω_1) ∩ K(ω_2)

and therefore

x ∈ X(3/4)  and  y ∈ X(3/4).

But for

z = (3/4)x + (1/4)y = (6, 5/2)  we have  I(z) = {2}

and hence

z ∉ X(3/4).

In this example we have made use of the fact that max_{1≤i≤r} (1 − p_i) is not uniquely
attained. If we have a discrete distribution such that min_i p_i is uniquely determined, we may
decrease the lower bound of the probability level given in Th. 3.


Theorem 4. Let P_ω be a finite discrete probability distribution, i.e.

p_i = P_ω(ω_i) > 0,  i ∈ N = {1, ..., r},  and  Σ_{i=1}^{r} p_i = 1,  so that

min_{i∈N} p_i = p_{i_0}

is uniquely determined. Then the sets X(α) and X_i(α) are convex for every α > 1 − p_{i_1},
where

p_{i_1} = min_{i∈N−{i_0}} p_i.

Proof. For I ⊆ N we have

Σ_{i∈I} p_i  = 1            if I = N
            ≤ 1 − p_{i_0}   if i_0 ∉ I
            ≤ 1 − p_{i_1}   if j ∉ I, j ≠ i_0.

Hence

Σ_{i∈I} p_i > 1 − p_{i_1}  implies  I ⊇ N − {i_0}.

With K(ω_i) = {x | A(ω_i)x ≥ b(ω_i)} it follows immediately that

X(α) = ∩_{i∈N−{i_0}} K(ω_i)  for 1 − p_{i_1} < α ≤ 1 − p_{i_0}  and

X(α) = ∩_{i∈N} K(ω_i)        for α > 1 − p_{i_0},

which yields the theorem, since every K(ω_i) is a convex polyhedral set. □

The situation described in this theorem can be observed in Example 1, where
p_{i_0} = 1/3 and p_{i_1} = 2/3, and where X(α) is in fact convex for α > 1 − p_{i_1} = 1/3.


Besides these convexity statements on X(α) in the discrete distribution case, the
convexity of X_i(α_i) only seems to be investigated for some special distributions as
long as A(ω) is random.

Theorem 5. Suppose that the random variables a_{i1}(ω), a_{i2}(ω), ..., a_{in}(ω), b_i(ω) have
a joint (n+1)-dimensional normal distribution. Then X_i(α_i) is convex for α_i ≥ 1/2.

Proof. If d and f are (n+1)-dimensional random vectors with probability density
functions φ(ξ) and ψ(ζ) respectively, and if f = Td, where T is a nonsingular
(n+1) × (n+1) matrix, then ψ(ζ) = φ(T^{-1}ζ)/|det T|.

d has a normal distribution if

φ(ξ) = γ e^{−(1/2)(ξ − m)' S (ξ − m)},

where γ is a constant such that

∫_{ℝ^{n+1}} φ(ξ) dξ = 1,

S and hence S^{-1} are symmetric and strictly positive definite, and

μ_i = m_i = E(d_i)
σ_i² = (S^{-1})_{ii} = E(d_i − μ_i)².

Suppose now that

d = (a_{i1}, ..., a_{in}, b_i)'

and, for some x ∈ ℝ^n, T(x) is the (n+1) × (n+1) matrix which coincides with the
identity except for its last row (−x_1, −x_2, ..., −x_n, 1).

If f = T(x)d then

f_j = d_j,  j = 1, ..., n,  and  f_{n+1} = d_{n+1} − Σ_{j=1}^{n} x_j d_j = b_i − Σ_{j=1}^{n} a_{ij} x_j,

and f has the density

ψ(ζ) = φ(T^{-1}(x)ζ),  since det T(x) = 1,

     = γ e^{−(1/2)(T^{-1}(x)ζ − m)' S (T^{-1}(x)ζ − m)}

     = γ e^{−(1/2)(ζ − r)' Q(x) (ζ − r)},

where

r = T(x)m
Q(x) = T^{-1}(x)' S T^{-1}(x)
Q^{-1}(x) = T(x) S^{-1} T(x)'.

Hence f has a normal distribution and especially

f_{n+1} = b_i − Σ_{j=1}^{n} a_{ij} x_j

has mean value

μ_{n+1}(x) = r_{n+1} = m_{n+1} − Σ_{i=1}^{n} m_i x_i

and variance

σ²_{n+1}(x) = (Q^{-1}(x))_{n+1,n+1} = (T(x) S^{-1} T(x)')_{n+1,n+1}.

Since S^{-1} is positive definite, it is easily shown that σ_{n+1}(x) is convex in x (σ_{n+1}(x)
may be regarded as a norm of the vector (−x_1, −x_2, ..., −x_n, 1)').
Obviously

(f_{n+1} − μ_{n+1}(x)) / σ_{n+1}(x)

has the standard normal distribution with mean value 0 and variance 1, whose
distribution function shall be called Φ(τ). Now it is evident that

x ∈ X_i(α_i),  i.e.  P_ω (b_i − Σ_{j=1}^{n} a_{ij} x_j ≤ 0) ≥ α_i,

if and only if

Φ^{-1}(α_i) σ_{n+1}(x) + μ_{n+1}(x) ≤ 0.

Since μ_{n+1}(x) and σ_{n+1}(x) are convex in x, this inequality describes a convex set
as long as Φ^{-1}(α_i) ≥ 0, which is true for α_i ≥ 1/2. □

The following example shows that the convexity of X_i(α_i) is in general not
maintained for α_i < 1/2.

Example 3. Suppose that a and b are independent random variables with normal
distributions such that

m_1 = E(a) = 1;  σ_1² = E(a − m_1)² = 3
m_2 = E(b) = 2;  σ_2² = E(b − m_2)² = 1.

Then

P(b − xa ≤ 0) = Φ( −(2 − x)(1 + 3x²)^{−1/2} ) =  Φ(−6/7)  for x = −4
                                                Φ(−2)    for x = 0
                                                Φ(2/7)   for x = 4.

For α = Φ(−6/7) < 1/2 we get

P(b − xa ≤ 0) ≥ α  for x = +4 and x = −4, but not for x = 0.
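These probabilities are easy to evaluate numerically; the following sketch (not from the book) uses scipy's standard normal distribution function Φ and exhibits the non-convexity of the feasible set containing x = −4 and x = 4 but not x = 0:

# P(b - x a <= 0) for the normal data of Example 3.
from math import sqrt
from scipy.stats import norm

def prob(x, m1=1.0, v1=3.0, m2=2.0, v2=1.0):
    # b - x*a is normal with mean m2 - x*m1 and variance v2 + x*x*v1
    return norm.cdf(-(m2 - x * m1) / sqrt(v2 + x * x * v1))

alpha = norm.cdf(-6.0 / 7.0)
for x in (-4.0, 0.0, 4.0):
    print(x, prob(x), prob(x) >= alpha)    # True, False, True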

As a matter of fact, for x ∈ ℝ the results X_i(α_i) = ℝ and X_i(α_i) ≠ ℝ with X_i(α_i)
convex are also possible:

Example 4. Assume, as in Example 3, that a and b are independently normally
distributed such that m_1 = m_2 = 1,

σ_1² = σ_2² = 1  and  α = Φ(−1) < 1/2.

Then

Φ^{-1}(α) σ̄_2(x) + μ̄_2(x) = −√(x² + 1) + 1 − x  { > 0  for x < 0
                                                 { ≤ 0  for x ≥ 0

and hence X(α) = {x | x ≥ 0} ≠ ℝ.

Example 5. If, under the same assumption as in Example 4, σ_1² = 2 and σ_2² = 4,
then

Φ^{-1}(α) σ̄_2(x) + μ̄_2(x) = −√(4 + 2x²) + 1 − x < 0  for all x ∈ ℝ

and hence X(α) = ℝ.
It turns out that a result as in Example 4 is only possible in ℝ.

Theorem 6. Suppose that the random variables
a_{i1}(ω), a_{i2}(ω), ..., a_{in}(ω), b_i(ω), where n > 1, have a joint (n+1)-dimensional normal
distribution. If 0 < α_i < 1/2, then either X_i(α_i) = ℝ^n or X_i(α_i) is a nonempty nonconvex set.

Proof. From Th. 5 we know that x ∈ X_i(α_i) if and only if

Φ^{-1}(α_i) σ_{n+1}(x) + μ_{n+1}(x) ≤ 0,

where

σ²_{n+1}(x) = x̃' S^{-1} x̃,  if  x̃ = (−x', 1)' ∈ ℝ^{n+1},

and μ_{n+1}(x) = m_{n+1} − m'x, if m' = (m_1, ..., m_n). Suppose that X_i(α_i) ≠ ℝ^n and
x ∉ X_i(α_i). Since n > 1 there exists a y ∈ ℝ^n such that y ≠ 0 and m'y = 0. If we

define ỹ = (−y', 0)' then, for λ ∈ ℝ,

μ_{n+1}(x + λy) = μ_{n+1}(x)

and

σ²_{n+1}(x + λy) = x̃' S^{-1} x̃ + 2λ x̃' S^{-1} ỹ + λ² ỹ' S^{-1} ỹ.

Since ỹ ≠ 0, Φ^{-1}(α_i) < 0 and S^{-1} is positive definite, there exists a λ_0 ∈ ℝ, λ_0 ≠ 0,
such that

Φ^{-1}(α_i) σ_{n+1}(x + λ_0 y) + μ_{n+1}(x + λ_0 y) ≤ 0

and

Φ^{-1}(α_i) σ_{n+1}(x − λ_0 y) + μ_{n+1}(x − λ_0 y) ≤ 0.

Hence

x + λ_0 y ∈ X_i(α_i)  and  x − λ_0 y ∈ X_i(α_i),

whereas

x = (1/2)(x + λ_0 y) + (1/2)(x − λ_0 y) ∉ X_i(α_i),

and therefore X_i(α_i) is nonempty and nonconvex. □



A result similar to Th. 5 was obtained by K. Marti [10] for a joint Cauchy
distribution.
Except in the case of a finite discrete distribution we have presented convexity
statements only for X_i(α_i), but not for X(α). This corresponds to the state of
research in the field, as long as the matrix A is supposed to be random.
However, for fixed matrices A - i.e. only the right hand side b(ω) is random -
X_i(α_i) is always convex, and the convexity of X(α) has recently been proved by
A. Prekopa [14] for a rather broad class of probability distributions, including
the normal distribution.

Theorem 7. Suppose that A is fixed and b(ω) is random.
Then

X_i(α_i) = {x | P_ω ({ω | A_i x ≥ b_i(ω)}) ≥ α_i}

is convex for every probability distribution of b(ω).

Proof. Let F_i(τ) be the distribution function of b_i(ω).
Then x_j ∈ X_i(α_i), j = 1, 2, if and only if F_i(A_i x_j) ≥ α_i, j = 1, 2. For x = λx_1
+ (1−λ)x_2, λ ∈ (0, 1), we get

F_i(A_i x) ≥ F_i( min_{j=1,2} A_i x_j )
          = min_{j=1,2} {F_i(A_i x_j)} ≥ α_i,

by the monotonicity of distribution functions.
Hence X_i(α_i) is convex. □

It is much more difficult to get convexity statements for X(α). If F(z) is the
distribution function of b(ω), then

X(α) = {x | F(Ax) ≥ α}.

Therefore, if A has full rank, X(α) is convex for every α ∈ [0, 1] if and only if F(z)
is quasi-concave, i.e.

F(λz_1 + (1−λ)z_2) ≥ min {F(z_1), F(z_2)}  for all z_1, z_2 and all λ ∈ (0, 1).

Although every distribution function of a one-dimensional random variable is
quasi-concave because of the monotonicity of distribution functions, this is in
general not true for multivariate distribution functions, as the following example
shows:
Example 6. Let b(ω) be a two-dimensional random variable whose discrete
distribution assigns probability 1/2 to each of two componentwise incomparable
points b¹ and b².
Then for F(z) = P(b ≤ z) we have

F(b¹) = F(b²) = 1/2,  but  F((1/2)(b¹ + b²)) = 0,

which shows that F(z) is not quasi-concave.



If the probability measure P defined on ℝ^n has a density function f(x) (with respect
to the Lebesgue measure μ on ℝ^n), the problem arises under which conditions on
f(x) the measure P is quasi-concave. (P is quasi-concave if, with

λ𝔄 + (1−λ)𝔅 = {z | z = λx + (1−λ)y, x ∈ 𝔄, y ∈ 𝔅},

P(λ𝔄 + (1−λ)𝔅) ≥ min {P(𝔄), P(𝔅)}

for all convex subsets 𝔄 and 𝔅 of ℝ^n and all λ ∈ (0, 1).)
Obviously the distribution function of a quasi-concave probability measure is quasi-
concave.
We call a density function f(x) almost quasi-concave if, for every a ∈ ℝ^n and
b ∈ ℝ^n such that a = −γb and γ > 0, f(x) ≥ min {f(x+a), f(x+b)} almost everywhere
with respect to μ. Then we may state

Theorem 8. Let P be a quasi-concave probability measure on ℝ^n with the continuous
density function f(x). Then f(x) is almost quasi-concave.

Proof. Suppose f(x) were not almost quasi-concave. Then there exist
a ∈ ℝ^n, b ∈ ℝ^n, γ > 0 with a = −γb such that μ({x | f(x) < min [f(x+a), f(x+b)]}) > 0.
Thus, there exists a convex Borel measurable set ℜ (for example a sphere) such
that

ℜ ⊂ {x | f(x) < min [f(x+a), f(x+b)]}  and  μ(ℜ) > 0.

Then

P(ℜ) = ∫_ℜ f(z) dμ(z) < ∫_ℜ min [f(z+a), f(z+b)] dμ(z)
     ≤ min [ ∫_ℜ f(z+a) dz; ∫_ℜ f(z+b) dz ]
     = min [P(ℜ+a); P(ℜ+b)]

in contradiction to the quasi-concavity of P, since

ℜ = λ(ℜ+a) + (1−λ)(ℜ+b)  with  λ = 1/(1+γ) ∈ (0, 1). □

However, almost quasi-concavity of a density function does not in general
imply quasi-concavity of the related probability measure.

Example 7. Let f(x) = (1/124) φ(x) be a density function on ℝ², where

φ(x) = 21  if x ∈ 𝔐 = {x | 0 ≤ x_1 ≤ 1, −2 ≤ x_2 ≤ 0}
        1  if x ∈ 𝔑 = {x | −11 ≤ x_1 ≤ 1, −6 ≤ x_2 ≤ 1, x ∉ 𝔐}
        0  else.

Obviously f(x) is quasi-concave (and hence almost quasi-concave). Take

z¹ = (−1, 1),  z² = (1, −1)  and  z = (1/2)z¹ + (1/2)z² = (0, 0).

Then we have for the distribution function

F(z¹) = ∫_{−∞}^{z¹} f(x) dx = 70/124,
F(z²) = ∫_{−∞}^{z²} f(x) dx = 80/124,

F(z)  = ∫_{−∞}^{z} f(x) dx = 66/124 < F(z¹) < F(z²).

Hence the distribution function, and consequently the probability measure, is not
quasi-concave.
With respect to sufficient conditions, the strongest results known so far are due
to A. Prekopa [14]. He was concerned with logarithmic concave measures,
which due to their definition satisfy the inequality

P(λ𝔄 + (1−λ)𝔅) ≥ P^λ(𝔄) · P^{1−λ}(𝔅)

for all convex subsets 𝔄 and 𝔅 of ℝ^n and all λ ∈ (0, 1).
Obviously a logarithmic concave probability measure is also quasi-concave. The
main result is based on

Theorem 9 (Prekopa's inequality). Let f and g be nonnegative Borel measurable
functions defined on ℝ^n and let

r(t) = sup_{λx + (1−λ)y = t} f(x) · g(y),  t ∈ ℝ^n,

where λ is a constant, 0 < λ < 1.
Then r(t) is Borel measurable and the following inequality holds:

∫ r(t) dt ≥ ( ∫ f^{1/λ}(x) dx )^λ · ( ∫ g^{1/(1−λ)}(y) dy )^{1−λ}.

The proof of this theorem can be found in A. Prekopa [13], [14] and L. Leindler
[8].
Theorem 10. Let f(x) be a probability density function defined on ℝ^n, with the
representation f(x) = γ · e^{−Q(x)}, where Q(x) is a convex function. Then the cor-
responding probability measure P is logarithmic concave.

Proof. Let 𝔄 and 𝔅 be arbitrary convex subsets of ℝ^n and let λ ∈ (0, 1). Define
f_i(x), i = 1, 2, 3, as follows:

f_1(x) = f(x)  if x ∈ 𝔄,                0 otherwise
f_2(x) = f(x)  if x ∈ 𝔅,                0 otherwise
f_3(x) = f(x)  if x ∈ λ𝔄 + (1−λ)𝔅,      0 otherwise.

For every x ∈ λ𝔄 + (1−λ)𝔅 and every y ∈ 𝔄 and z ∈ 𝔅 such that λy + (1−λ)z = x,
in view of the convexity of Q(x), we have

f(x) = γ e^{−Q(x)} ≥ γ e^{−λQ(y) − (1−λ)Q(z)} = f^λ(y) f^{1−λ}(z),

implying immediately

f_3(x) ≥ sup_{λy + (1−λ)z = x} f_1^λ(y) f_2^{1−λ}(z).

Now Th. 9 yields

P(λ𝔄 + (1−λ)𝔅) = ∫_{ℝ^n} f_3(x) dx ≥ ∫_{ℝ^n} { sup_{λy + (1−λ)z = x} f_1^λ(y) f_2^{1−λ}(z) } dx
   ≥ ( ∫_{ℝ^n} f_1(y) dy )^λ ( ∫_{ℝ^n} f_2(z) dz )^{1−λ} = P^λ(𝔄) · P^{1−λ}(𝔅),

which establishes the logarithmic concavity of P. □


The convexity statements and examples given in this section show quite
clearly that in general it is not yet known under which conditions a chance
constrained program is a convex program. Moreover - even if convexity can be
asserted - there are still considerable computational difficulties.

2. Relationship between Chance Constrained Programs and Two-Stage Problems

We cannot in general expect that chance constrained programs and two-stage


problems replace each other, because in practical situations it sometimes seems
appropriate to require a probability level of feasibility but impossible to specify
penalty costs for infeasibility and vice versa. And from the theory developed so
far, we may suspect that chance constrained programs and two-stage problems
are in general not equivalent, since, in general, chance constrained programs may
be nonconvex whereas two-stage problems are always convex. Nevertheless there
are relations between these two types of problems, which may at least help us to
get more insight.
On the one hand, under the assumptions of Th. III.12, any fixed recourse problem
with deterministic penalty costs q is equivalent to finding feasible points of
generalized chance constraints. Here generalized chance constraints are con-
straints involving functions of the type g(x) = P(Ax ≥ b). If we start with the two-
stage problem

min c'x + Q(x)
subject to Tx = d
           x ≥ 0

where T, d are deterministic and Q(x) = E_{P_ω} Q(x,ω), and

Q(x,ω) = min q'y
subject to Wy = b(ω) − A(ω)x
           y ≥ 0,

then, according to the Kuhn-Tucker theorem, we have to find a feasible solution
of

c + ∇Q(x) − T'u ≥ 0
x'(c + ∇Q(x) − T'u) = 0
Tx = d
x ≥ 0.

From Th. III.12 we know that

∇Q(x) = − Σ_{i=1}^{r} ∫_{𝔅_i(x)} (q̄_i' B_i^{-1} A(ω))' dP_ω

where B_i, i = 1, ..., r, are optimal bases of W (i.e. fulfil the simplex optimality con-
dition) and

𝔅_i(x) = {ω | B_i^{-1}(b(ω) − A(ω)x) ≥ 0} − ∪_{j=1}^{i−1} 𝔅_j(x).

Rewriting the integrals in ∇Q(x), using the conditional probability of b given
A, yields generalized chance constraints. This fact becomes still more evident if
A(ω) = A is deterministic and the optimal feasible basis of the recourse program
is determined uniquely almost everywhere (as for example in the simple recourse
case). Then

∇Q(x) = − Σ_{i=1}^{r} (q̄_i' B_i^{-1} A)' P(B_i^{-1}(b(ω) − Ax) ≥ 0).

On the other hand we may sometimes use simple recourse problems to find
feasible solutions of chance constraints. Consider the special simple recourse
problem (q_i^+ = ρ > 0, q_i^- = 0)

ψ(ρ) = Min_{x∈X} { c'x + Σ_{i=1}^{m} ρ ∫_{(b(ω)−A(ω)x)_i > 0} (b(ω) − A(ω)x)_i dP_ω }

and let, for every ρ > 0, x(ρ) be a solution.

Theorem 11. Let X be compact. Then

𝔅 = X ∩ {x | P_ω (A(ω)x ≥ b(ω)) = 1} ≠ ∅

if and only if

lim_{ρ→∞} ψ(ρ) < ∞.

Proof. a) The condition is necessary, since obviously ψ(ρ) is monotone increasing,
and if x ∈ 𝔅, ψ(ρ) ≤ c'x, because for x ∈ 𝔅, P_ω ((b(ω) − A(ω)x)_i > 0) = 0 for i = 1, ..., m.
b) The condition is also sufficient. Consider the functions

φ_{iρ}(ω) = 0                          if (b(ω) − A(ω)x(ρ))_i ≤ 0,
φ_{iρ}(ω) = (b(ω) − A(ω)x(ρ))_i         otherwise.

We have to show that

lim_{ρ→∞} P_ω ({ω | φ_{iρ}(ω) ≥ ε}) = 0  for all ε > 0 and i = 1, ..., m,

i.e. that the functions φ_{iρ}, i = 1, ..., m, converge in measure to zero.



Suppose, on the other hand, that for some i there exist ε > 0 and δ > 0 such that
for every ρ > 0 there is Δ > 0 for which

P_ω ({ω | φ_{i,ρ+Δ}(ω) ≥ ε}) ≥ δ.

Then

ψ(ρ + Δ) ≥ a + (ρ + Δ) ε δ,

where a = min_{x∈X} c'x. This inequality contradicts the assumption that lim_{ρ→∞} ψ(ρ) < ∞.

Now, if {ρ} is some sequence increasing to ∞, there is a subsequence {ρ_ν} such
that φ_{iρ_ν}, i = 1, ..., m, converge to zero almost surely (i.e. almost everywhere with
respect to P_ω) and lim_{ν→∞} x(ρ_ν) = x* ∈ X.
Therefore

P_ω ({ω | lim_{ν→∞} φ_{iρ_ν}(ω) > 0}) = P_ω ({ω | (b(ω) − A(ω)x*)_i > 0}) = 0,  i = 1, ..., m,

yielding

P_ω ({ω | A(ω)x* ≱ b(ω)}) = P_ω ( ∪_{i=1}^{m} {ω | (b(ω) − A(ω)x*)_i > 0} )

   ≤ Σ_{i=1}^{m} P_ω ({ω | (b(ω) − A(ω)x*)_i > 0}) = 0,

i.e. P_ω ({ω | A(ω)x* ≥ b(ω)}) = 1. □

According to this theorem one may try to get feasible solutions of chance
constraints by solving the parametric simple recourse problem mentioned above.
However, one should be aware of the fact that the theorem could only be proved
under the assumption that probability 1 could be attained.
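The parametric approach suggested by Th. 11 can be sketched as follows (not from the book; the distribution is a hypothetical finite set of equally likely realizations, and X is a box, so each penalized problem is an LP):

# Solve the simple recourse problem for increasing penalties rho and watch the
# empirical violation frequency of A(omega) x >= b(omega).
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(1)
A = rng.uniform(0.5, 1.5, size=(50, 1, 2))       # 50 equally likely realizations A(omega)
b = rng.uniform(0.0, 2.0, size=(50, 1))          # and b(omega); X = {0 <= x <= 5}
c = np.array([1.0, 1.0])

def solve(rho):
    # min c'x + (rho/50) * sum_w s_w   s.t.  A(w) x + s_w >= b(w),  s_w >= 0
    n, r = 2, 50
    cost = np.concatenate([c, np.full(r, rho / r)])
    A_ub = np.zeros((r, n + r)); b_ub = np.zeros(r)
    for w in range(r):
        A_ub[w, :n] = -A[w, 0]; A_ub[w, n + w] = -1.0; b_ub[w] = -b[w, 0]
    res = linprog(cost, A_ub=A_ub, b_ub=b_ub,
                  bounds=[(0, 5)] * n + [(0, None)] * r, method="highs")
    return res.x[:n], res.fun

for rho in (1.0, 10.0, 100.0, 1000.0):
    x, psi = solve(rho)
    violated = np.mean([A[w, 0] @ x < b[w, 0] - 1e-9 for w in range(50)])
    print(rho, psi, violated)     # psi stays bounded and the violation frequency should drop to 0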
References

[1] Avriel, M. and A.C. Williams: The Value of Information and Stochastic Programming.
Operations Research 18,947-954 (1970).
[2] Beale, E. M.: The Use of Quadratic Programming in Stochastic Linear Programming. RAND
P-2404, August 1961.
[3] Bereanu, B.: On Stochastic Linear Programming Distribution Problems, Stochastic Technology
Matrix. Z. Wahrscheinlichkeitstheorie u. verw. Geb. 8, 148-152 (1967).
[4] Bereanu, B.: The Distribution Problem in Stochastic Linear Programming. The Cartesian
Integration Method. Reprint No. 7103, Center of Mathematical Statistics of the Academy of
the Socialist Republic of Romania, Bucharest (1971).
[5] Kall, P.: Qualitative Aussagen zu einigen Problemen der stochastischen Programmierung.
Z. Wahrscheinlichkeitstheorie u. verw. Geb. 6, 246-272 (1966).
[6] Kall. P.: Das zweistufige Problem der stochastischen linearen Programmierung. Z. Wahrschein-
lichkeitstheorie u. verw. Geb. 8, 101-112 (1967).
[7] Kall, P.: Some Remarks on the Distribution Problem of Stochastic Linear Programming.
Methods of Operations Research, Meisenheim, 16, 189-196 (1973).
[8] Leindler, L.: On a Certain Converse of Holder's Inequality. Acta Scientiarum Mathematicarum,
Szeged, 33, 217-223 (1972).
[9] Madansky, A.: Inequalities for Stochastic Linear Programming Problems. Management Sci. 6,
197-204 (1960).
[10] Marti, K.: Konvexitätsaussagen zum linearen stochastischen Optimierungsproblem. Z. Wahr-
scheinlichkeitstheorie u. verw. Geb. 18, 159-166 (1971).
[11] Marti, K.: Entscheidungsprobleme mit linearem Aktionen- und Ergebnisraum. Z. Wahrschein-
lichkeitstheorie u. verw. Geb. 23, 133-147 (1972).
[12] Marti, K.: Über ein Verfahren zur Lösung einer Klasse linearer Entscheidungsprobleme.
Z. Angew. Math. u. Mech., to appear.
[13] Prekopa, A.: Logarithmic Concave Measures with Applications to Stochastic Programming.
Acta Scientiarum Mathematicarum, Szeged, 32, 301-316 (1971).
[14] Prekopa, A.: On Logarithmic Concave Measures and Functions. Acta Scientiarum Mathema-
ticarum, Szeged, 34, 335-343 (1973).
[15] Walkup, D. W. and R. J. B. Wets: Stochastic Programs with Recourse. SIAM J. Appl. Math. 15,
1299-1314 (1967).
[16] Wets, R.: Programming under Uncertainty: The Complete Problem. Z. Wahrscheinlichkeits-
theorie 4,316-339 (1966).
[17] Wets, R.: Characterization Theorems for Stochastic Programs. Mathematical Programming 2,
166-175 (1972).
[18] Wessels, J.: Stochastic Programming. Statistica Neerlandica 21, 39-53 (1967).
[19] Kosmol, P.: Algorithmen zur konvexen Optimierung. OR-Verfahren, Band XVIII, 176-186
(1974).
[20] Kall, P.: Approximations to Stochastic Programs with Complete Fixed Recourse. Numer.
Math. 22, 333-339 (1974).

Books on Stochastic Programming

Faber, M. M.: Stochastisches Programmieren. Physica-Verlag, Würzburg, Wien, 1970.


Sengupta, J. K.: Stochastic Programming - Methods and Applications. North-Holland Publishing
Company, Amsterdam, American Elsevier Publishing Company, Inc. New York, 1972.
Vajda, S.: Probabilistic Programming. Academic Press, New York and London, 1972.