Ökonometrie und Unternehmensforschung
Econometrics and Operations Research
XXI
Stochastic Linear Programming
AMS Subject Classifications (1970): 28A20, 60E05, 90-02, 90C05, 90C15, 90C20, 90C25
This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically those of translation, reprinting, re-use of illustrations, broadcasting, reproduction by photocopying machine or similar means, and storage in data banks. Under § 54 of the German Copyright Law, where copies are made for other than private use, a fee is payable to the publisher, the amount of the fee to be determined by agreement with the publisher.
Chapter 0. Prerequisites
1. Linear Programming . . . 1
2. Nonlinear Programming . . . 4
3. Measure Theory and Probability Theory . . . 6
Chapter I. Introduction . . . 11
References . . . 93
Subject Index . . . 94
Chapter 0. Prerequisites
1. Linear Programming
where f' is the transpose of f and f'x is the usual scalar product of f and x. From this relation, and from the fact that every real number may be represented as the difference of two nonnegative real numbers, it follows that every linear program may be written in the standard formulation

(1) min {c'x | Ax = b, x ≥ 0},

where c ∈ ℝ^n, b ∈ ℝ^m are constant and A is a constant real (m × n)-matrix, x ∈ ℝ^n is the vector of decision variables, and x ≥ 0 stands for xᵢ ≥ 0, i = 1, ..., n. By a solution of a linear program we understand a feasible x̄ such that c'x̄ ≤ c'x for all x ∈ {x | Ax = b, x ≥ 0}.
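The reduction to the standard formulation (1) is purely mechanical: each inequality constraint receives a slack variable and each free variable is split into a difference of two nonnegative ones. A minimal sketch (the example data are invented for illustration, not taken from the text):

```python
def to_standard_form(A, b, c):
    """Equality form of min{c'x | Ax <= b, x free}: split x = x+ - x-, add one slack per row."""
    m = len(A)
    A_std = []
    for i in range(m):
        # columns: x+ (n), then x- (n), then slack variables (m)
        row = A[i] + [-a for a in A[i]] + [1.0 if k == i else 0.0 for k in range(m)]
        A_std.append(row)
    c_std = c + [-cj for cj in c] + [0.0] * m   # slacks cost nothing
    return A_std, list(b), c_std

# Example: min x1 + x2 subject to x1 + 2*x2 <= 4 with x1, x2 free
A_std, b_std, c_std = to_standard_form([[1.0, 2.0]], [4.0], [1.0, 1.0])
```

The transformed program has 2n + m variables, all constrained to be nonnegative, and only equality constraints, exactly as required by (1).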
The question under which conditions the feasible set of a linear program is nonempty is answered by

Theorem 1 (Farkas' Lemma). {x | Ax = b, x ≥ 0} ≠ ∅ if and only if {u | A'u ≥ 0} ⊂ {u | b'u ≥ 0}.
Here the prime at A and b again indicates transposition.
An immediate consequence of Farkas' Lemma is
Theorem 2. If there is a real constant γ such that c'x ≥ γ for all x ∈ {x | Ax = b, x ≥ 0} ≠ ∅, then the linear program min {c'x | Ax = b, x ≥ 0} has a solution.
An important concept in linear programming is that of feasible basic solutions.
We call x a feasible basic solution of (1) if x ∈ {x | Ax = b, x ≥ 0} and if the columns of A corresponding to the nonzero components of x are linearly independent. Obviously the set 𝔅 = {x | x is a feasible basic solution of (1)} is finite.
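Because the set of feasible basic solutions is finite, very small instances of (1) can be solved by sheer enumeration of all bases, which illustrates the concept (this brute force is our own illustration, not a practical method; the example data are invented). Exact rational arithmetic avoids rounding questions:

```python
from fractions import Fraction
from itertools import combinations

def solve_square(M, rhs):
    """Gaussian elimination; solution of M z = rhs, or None if M is singular."""
    n = len(M)
    M = [row[:] + [rhs[i]] for i, row in enumerate(M)]
    for col in range(n):
        piv = next((r for r in range(col, n) if M[r][col] != 0), None)
        if piv is None:
            return None
        M[col], M[piv] = M[piv], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                f = M[r][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return [M[r][n] for r in range(n)]

def lp_by_basic_solutions(A, b, c):
    """min c'x s.t. Ax = b, x >= 0 by enumerating all basic solutions (tiny LPs only)."""
    A = [[Fraction(v) for v in row] for row in A]
    b = [Fraction(v) for v in b]
    c = [Fraction(v) for v in c]
    m, n = len(A), len(A[0])
    best = None
    for cols in combinations(range(n), m):
        B = [[A[i][j] for j in cols] for i in range(m)]
        xB = solve_square(B, b)
        if xB is None or any(v < 0 for v in xB):
            continue                      # singular basis, or basic solution infeasible
        x = [Fraction(0)] * n
        for j, v in zip(cols, xB):
            x[j] = v
        val = sum(cj * xj for cj, xj in zip(c, x))
        if best is None or val < best[0]:
            best = (val, x)
    return best                           # None if there is no feasible basic solution

# Example: min x1 + 2*x2  s.t.  x1 + x2 + x3 = 4, x2 + 2*x3 = 2, x >= 0
best_value, best_x = lp_by_basic_solutions([[1, 1, 1], [0, 1, 2]], [4, 2], [1, 2, 0])
```

The simplex method discussed next visits such basic solutions selectively instead of enumerating all of them.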
(2) x̄ = B⁻¹b − B⁻¹Dy.

If B is a feasible basis and if we choose y = 0, then we have a feasible basic solution, where the basic variables have the values of the components of B⁻¹b.
If we reorder the components of c into an m-tuple c̄ and an (n − m)-tuple d corresponding to the reordering of the components of x into x̄ and y, then, from (2), we get for the objective function

(3) c'x = c̄'x̄ + d'y.
Starting with a feasible basis B and the corresponding feasible basic solution given by y = 0, x̄ = B⁻¹b, the only feasible change, given the constraints x ≥ 0 in (1), is to increase some component(s) of y, while keeping x̄ = B⁻¹b − B⁻¹Dy ≥ 0. Hence it is obvious that the simplex criterion d' − c̄'B⁻¹D ≥ 0 is sufficient for the optimality of that feasible basic solution. If the feasible basic solution is
where all elements outside the rectangles are equal to zero. Decomposition methods take advantage of this structure - especially for large-scale problems - by pivoting in intermediate steps only on the subblocks of A, whereas in the main pivot steps one is concerned with a matrix of μ + ν rows (corresponding to the so-called master program). Without going into details we can say that this procedure is essentially based on Th. 3.
In certain practical situations it is of interest to know what happens to the
solution or the optimal value of a linear program if some of the elements of
A, b, c vary. This kind of problem leads to parametric linear programming. For
later purposes we might mention the following special result:
Theorem 7. Let T ⊂ ℝ^m be a convex (polyhedral) set. Suppose that {x | Ax = t, x ≥ 0} ≠ ∅ for all t ∈ T and that γ(t) = min {c'x | Ax = t, x ≥ 0} is finite for at least one t ∈ T. Then γ(t) is finite for all t ∈ T and γ(t) is a piecewise linear, convex function on T (i.e. for arbitrary t₁ ∈ T, t₂ ∈ T and λ ∈ (0, 1) we have γ(λt₁ + (1 − λ)t₂) ≤ λγ(t₁) + (1 − λ)γ(t₂)).
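Theorem 7 can be watched at work on an invented one-constraint family min {x₁ + x₂ | x₁ + 2x₂ = t, x ≥ 0}: its two basic solutions (t, 0) and (0, t/2) give γ(t) = min(t, t/2) = t/2 for t ≥ 0, which is piecewise linear (here even linear) and convex. A small numerical check of the convexity inequality of the theorem:

```python
# gamma(t) = min{x1 + x2 : x1 + 2*x2 = t, x >= 0}; evaluating the two basic
# solutions (t, 0) and (0, t/2) gives gamma(t) = min(t, t/2).

def gamma(t):
    assert t >= 0, "the feasible set is empty for t < 0"
    return min(t, t / 2)

def is_convex_on(f, pts, lams):
    """Check f(l*t1 + (1-l)*t2) <= l*f(t1) + (1-l)*f(t2) on a finite grid."""
    for t1 in pts:
        for t2 in pts:
            for l in lams:
                lhs = f(l * t1 + (1 - l) * t2)
                rhs = l * f(t1) + (1 - l) * f(t2)
                if lhs > rhs + 1e-12:
                    return False
    return True

ok = is_convex_on(gamma, [0.0, 1.0, 2.5, 7.0], [0.25, 0.5, 0.75])
```

Such value functions of the right-hand side reappear in Chapter II, where t becomes a random vector.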
2. Nonlinear Programming
min f(x)
(7) subject to g(x) ≤ 0
    h(x) = 0,
The application of this theorem to problem (9) with φ(x, u) = f(x) + u'g(x) yields the local Kuhn-Tucker conditions for a solution of (9):
where Q is a real symmetric positive semidefinite (n × n)-matrix (see Th. 12), the local Kuhn-Tucker conditions are:

(12) c + Qx − A'u ≥ 0,   b − Ax ≤ 0,
     x'(c + Qx − A'u) = 0,   u'(b − Ax) = 0,
     x ≥ 0,   u ≥ 0.
There are different approaches for solving convex programs. Besides linearization
methods, which approximate the original program by linear programs, there are
gradient methods, applied to the original problem, and so-called complementarity
methods, applied to the Kuhn-Tucker conditions. In the special case of quadratic
programming, gradient methods as well as complementarity methods are available,
which simply require pivoting - as in the simplex method - under additional
rules for choosing the pivot elements.
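Whatever method produces a candidate pair (x, u), the Kuhn-Tucker conditions (12) can be checked mechanically. A sketch for the quadratic program min c'x + (1/2)x'Qx subject to Ax ≥ b, x ≥ 0 (our reading of the problem class above; the toy data are invented):

```python
def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def transpose(M):
    return [list(col) for col in zip(*M)]

def kkt_holds(Q, c, A, b, x, u, tol=1e-9):
    """Verify the conditions (12) for a candidate primal-dual pair (x, u)."""
    s = [ci + qi - ai for ci, qi, ai in zip(c, matvec(Q, x), matvec(transpose(A), u))]
    r = [bi - axi for bi, axi in zip(b, matvec(A, x))]
    return (all(si >= -tol for si in s)                      # c + Qx - A'u >= 0
            and all(ri <= tol for ri in r)                   # b - Ax <= 0
            and abs(sum(xi * si for xi, si in zip(x, s))) <= tol   # complementarity
            and abs(sum(ui * ri for ui, ri in zip(u, r))) <= tol
            and all(xi >= -tol for xi in x)
            and all(ui >= -tol for ui in u))

# min -4x + x**2 s.t. x >= 1, x >= 0: unconstrained minimum x = 2 is feasible,
# so (x, u) = (2, 0) should satisfy (12).
ok = kkt_holds([[2.0]], [-4.0], [[1.0]], [1.0], [2.0], [0.0])
```

Complementarity pivoting methods search for exactly such a pair by pivoting on the linear system underlying (12).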
One of the basic concepts in measure theory is that of a measurable space (R, 𝔄), where R is some nonempty set (called space) and 𝔄 is a σ-algebra on R. 𝔄 is a σ-algebra if it is a nonempty class of subsets of R which is closed under the formation of complements and countable unions, i.e. if A ∈ 𝔄, then Ā = R − A ∈ 𝔄, and if Aⱼ ∈ 𝔄, j = 1, 2, ..., then ⋃_{j=1}^{∞} Aⱼ ∈ 𝔄.
Theorem 16. If 𝔈 is any nonempty class of subsets of R, then there exists a unique σ-algebra 𝔄 such that 𝔈 ⊂ 𝔄 and such that if 𝔈' is any other σ-algebra containing 𝔈 then 𝔄 ⊂ 𝔈'.
for every countable class of disjoint sets Aⱼ ∈ 𝔄. If 𝔄 is a σ-algebra on the space R and μ is a measure on 𝔄, then (R, 𝔄, μ) is called a measure space. A measure μ is called σ-finite if there is a countable class of sets Aⱼ ∈ 𝔄 such that μ(Aⱼ) < ∞, j = 1, 2, ..., and R = ⋃_{j=1}^{∞} Aⱼ.
where for yᵢ₀ = 0 and μ(Aᵢ₀) = ∞ the product yᵢ₀ · μ(Aᵢ₀) = 0 by definition. A sequence {fₙ} of integrable simple functions is called mean fundamental if ∫|fₙ − fₘ| dμ tends to zero whenever n and m tend to infinity. Now a measurable function f on a measure space (R, 𝔄, μ) is called integrable if there is a sequence {fₙ} of integrable simple functions which is mean fundamental and converges in measure to f. Then the integral is defined as ∫f dμ = lim_{n→∞} ∫fₙ dμ.
Theorem 21. a) A measurable function f on (R, 𝔄, μ) is integrable if and only if its absolute value is integrable; and |∫f dμ| ≤ ∫|f| dμ.
b) Let f, g be integrable functions on (R, 𝔄, μ) and α, β be real constants. Then αf + βg is integrable and ∫(αf + βg) dμ = α∫f dμ + β∫g dμ.
Theorem 22 (Lebesgue). If {fₙ} is a sequence of integrable functions converging in measure (or a.e.) to f, and if g is an integrable function such that

|fₙ(r)| ≤ |g(r)| a.e., n = 1, 2, ...,

then f is integrable and

lim_{n→∞} ∫|f − fₙ| dμ = 0.
by the class of all Cartesian products A × C, where A ∈ 𝔄 and C ∈ ℭ, and μ × ν is the product measure on 𝔄 × ℭ uniquely determined by (μ × ν)(A × C) = μ(A)·ν(C) for all A ∈ 𝔄 and C ∈ ℭ. If for example 𝔅ₖ is the Borel algebra on ℝ^k and μₖ is the corresponding Lebesgue measure, then (ℝ^{m+n}, 𝔅_{m+n}, μ_{m+n}) = (ℝ^m × ℝ^n, 𝔅ₘ × 𝔅ₙ, μₘ × μₙ).
If D ⊂ R × S, then a section of D determined by r ∈ R, or an r-section of D, is the set D_r = {s | (r, s) ∈ D}.
Theorem 25. In the product space (R × S, 𝔄 × ℭ, μ × ν), a measurable set D ⊂ R × S has measure zero if and only if almost every r-section (or almost every s-section) of D has measure zero.
With respect to integration we need the important
Theorem 26 (Fubini's Theorem). Let f be an integrable function on the product space (R × S, 𝔄 × ℭ, μ × ν). Then

∫f d(μ × ν) = ∫(∫f dμ) dν = ∫(∫f dν) dμ.
Probability theory may be understood as a special area of measure theory. A probability space is a finite measure space (Ω, 𝔉, P) for which P(Ω) = 1. The measurable sets (i.e. the elements of 𝔉) are called events and P is called a probability measure. Instead of a.e. we use the phrase almost surely. A measurable transformation x: Ω → ℝ^n (where the σ-algebra on ℝ^n is always the Borel algebra) is called an (n-dimensional) random vector. A one-dimensional random vector is a random variable. Observe that every component of a random vector is itself a random variable. A random vector x defines a probability measure P_x on ℝ^n in a natural way by P_x(B) = P(x⁻¹[B]) for all Borel sets B. P_x is uniquely determined on 𝔅ₙ by the distribution function of x: F_x(t) = P_x({ξ | ξ ∈ ℝ^n, ξ ≤ t}) for all t ∈ ℝ^n. If P_x is absolutely continuous with respect to the Lebesgue measure μₙ, then by the Radon-Nikodym theorem there is a probability density function f_x(τ) defined on ℝ^n such that P_x(B) = ∫_B f_x(τ) dμₙ for all B ∈ 𝔅ₙ.
The expectation Ex of a random vector x is the vector of the integrals of the components of x. For simplicity we write

Ex = (∫_Ω x₁ dP, ∫_Ω x₂ dP, ..., ∫_Ω xₙ dP)' = ∫_Ω x dP.

Hence we have

Ex = ∫_Ω x dP = ∫_{ℝ^n} ξ dP_x = ∫_{ℝ^n} ξ dF_x(ξ)
for all Borel sets Bᵢ in ℝ. There is an obvious connection between stochastic independence and product measures. Let Pᵢ be the probability measure on ℝ corresponding to the random variable xᵢ, i = 1, ..., k, and let P_x be the probability measure on ℝ^k corresponding to x = (x₁, x₂, ..., xₖ)'. Then stochastic independence of the random variables x₁, x₂, ..., xₖ is equivalent to P_x = P₁ × P₂ × ... × Pₖ,

F_x(t) = F_{x₁}(t₁) · F_{x₂}(t₂) · ... · F_{xₖ}(tₖ)

and, if the densities exist,

f_x(τ) = f_{x₁}(τ₁) · f_{x₂}(τ₂) · ... · f_{xₖ}(τₖ).
From Fubini's Theorem follows

Theorem 27. Let x₁ and x₂ be stochastically independent and assume that Ex₁, Ex₂ and Ex₁x₂ exist. Then Ex₁x₂ = (Ex₁)(Ex₂).
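Theorem 27 is easy to observe empirically: for independently sampled x₁ and x₂, the sample mean of the product agrees with the product of the sample means up to sampling error. A Monte Carlo sketch (distributions and sample size are our own choices):

```python
import random

# Independent draws: x1 uniform on [0, 1] (mean 0.5), x2 exponential with
# rate 2 (mean 0.5); Th. 27 predicts E[x1*x2] = 0.25.
random.seed(0)
N = 200_000
x1 = [random.uniform(0.0, 1.0) for _ in range(N)]
x2 = [random.expovariate(2.0) for _ in range(N)]

mean = lambda v: sum(v) / len(v)
lhs = mean([a * b for a, b in zip(x1, x2)])   # sample estimate of E[x1*x2]
rhs = mean(x1) * mean(x2)                     # sample estimate of E[x1]*E[x2]
```

For dependent variables the two quantities would generally differ; the factorization is exactly what stochastic independence buys.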
In this respect many users of linear programming have already been involved in a special procedure of stochastic linear programming, namely by replacing the random variables in a linear program by their expectation values, or fairly good estimates of them, and solving the resulting linear program. The following numerical example shows that this procedure is not feasible in all practical situations. Suppose that the problem is

min x₁ + x₂
subject to ax₁ + x₂ ≥ 7
           bx₁ + x₂ ≥ 4
           x₁ ≥ 0, x₂ ≥ 0.
If we ask for the probability of the event that this solution is feasible with respect to the original problem, we get
min c'x
(2) Ax = b
    x ≥ 0.
Typical questions in this case are: What is the expectation of the optimal value
of (2), or what are the expectation and the variance of this optimal value and so
on. More generally the question is: What is the probability distribution of the
optimal value of (2)? A possible interpretation of this distribution problem is the
following: Suppose that a special production program (with linear structure)
may be adapted for any short period to actual realizations of random factor prices,
random technological coefficients and random demands. Planning the budget
for a long term - i.e. for many short periods - the board of the firm wants to
know the amount of cash needed for this production program "in the mean"
or "for 95% of the time". More precisely, the board wants to know the expectation,
or the 95% percentile, of the probability distribution of this special production
program's costs per (short) period.
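The budget question, "expected cost" and "cost covered for 95% of the periods", can be read off directly from samples of the optimal value. A sketch using an invented toy program whose value function is known in closed form (γ(t) = t/2 from min {x₁ + x₂ | x₁ + 2x₂ = t, x ≥ 0}, with a random demand t; the demand distribution is assumed, not from the text):

```python
import random

# gamma(t) = min(t, t/2) = t/2 for t >= 0 is the optimal value of the toy
# program min{x1 + x2 | x1 + 2*x2 = t, x >= 0}; the demand t is random.
random.seed(1)
gamma = lambda t: min(t, t / 2)
samples = sorted(gamma(random.uniform(2.0, 10.0)) for _ in range(100_000))

expected_cost = sum(samples) / len(samples)   # budget needed "in the mean"
q95 = samples[int(0.95 * len(samples))]       # budget sufficient for 95% of periods
```

For t uniform on [2, 10] the exact answers are E γ = 3 and the 95% percentile 4.8, so the board of the example firm would budget about 4.8 per short period to be covered 95% of the time.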
"Here and now" models are based on the following assumption: A decision on
x - or on a "strategy" for x - has to be made in advance or at least without
knowledge of the realization of the random variables.
By a "strategy" for x we understand the game-theoretical concept of "mixed strategy" within a feasible set X of pure strategies x; or, equivalently, a "strategy" for x is a probability measure P_x on a Borel set X ⊂ ℝ^n. If we restrict ourselves to probability distributions P_x such that there exists an x ∈ X with P_x({x}) = 1, we are restricted to pure strategies, i.e. to decisions on x instead of mixed strategies of x's.
The practical interpretation of a strategy is as usual the assumption that the decision maker plays his game very often with - possibly different - x's resulting from a Monte Carlo simulation of the chosen probability distribution P_x.
To understand the philosophy of "here and now" models, it seems to be necessary to start at the very beginning. Our first observation is that in a linear program some or all coefficients are random variables with a joint probability distribution. This implies - by definition of random vectors - the existence of a probability space (Ω, 𝔉, P_ω) such that {A(ω), b(ω), c(ω)} is a measurable transformation of Ω into ℝ^{mn+m+n}. Our general assumption for SLP is that we know P_ω. A further very important assumption is that a decision on x - or on a mixed strategy P_x - does not influence P_ω. More precisely, the events in Ω - i.e. the elements of 𝔉 - and the events in X - i.e. the Borel sets in X - are stochastically independent; or, equivalently, the probability measure of the product space X × Ω is the product measure P_x × P_ω. It should be pointed out very clearly that this assumption is not at all trivial from the practical point of view. If for example a producer with a large share in the factor market takes very extreme decisions on inputs, it seems very unlikely that these decisions do not influence input prices or quality, which would alter the technological coefficients. On the other hand there are certainly many cases where the assumption of stochastic independence is quite realistic. Therefore, in most practical cases we must check very seriously whether the influence of the producer's decision on the probability distribution P_ω may be neglected before applying one of the "here and now" models handled in this book.
A decision maker who does not want to choose his strategy at random out of
a certain feasible set must have a criterion telling him whether a certain strategy
is the "best" one or not. As is well-known, in decision theory there are different
concepts of what the "best" may be. One of them is that there is a partial ordering
on the set of feasible strategies P_x; then a "best" strategy is not necessarily "better"
(with respect to the partial ordering) than, or equivalent to, every other strategy,
but there is no other "comparable" and "better" strategy. Since, under a partial
ordering, not every pair of strategies need be comparable, it follows that there
may be a strategy which is "best" in virtue of not being comparable to any other
feasible strategy. This concept has important applications in multi-goal pro-
gramming.
However, we shall be concerned with a stronger concept of "best" strategy. Let us assume that any two feasible strategies are comparable and that the result of the comparison says that either one strategy is "better" than the other or both strategies are "equivalent". In other words, either we prefer one strategy to the other or we are indifferent. Furthermore, we suppose that the decision maker is consistent in the following sense. When he has preferred a strategy P_x⁽¹⁾ to P_x⁽²⁾ and also has preferred P_x⁽²⁾ to P_x⁽³⁾, then he will prefer P_x⁽¹⁾ to P_x⁽³⁾. When he is indifferent with respect to P_x⁽¹⁾ and P_x⁽²⁾, then he also thinks of P_x⁽²⁾ as equivalent to P_x⁽¹⁾. And when he believes P_x⁽¹⁾ to be equivalent to P_x⁽²⁾ and P_x⁽²⁾ to be equivalent to P_x⁽³⁾, then he will also be indifferent with respect to P_x⁽¹⁾ and P_x⁽³⁾. To obtain such a preferential scheme we need, on 𝔓 × 𝔓, a strict linear ordering ≺ (the "better" relation) and an equivalence relation ∼ (the "indifferent" relation) so that for any P_x⁽ⁱ⁾ ∈ 𝔓, i = 1, 2, 3, the following statements hold:
1) One and only one of the relations P_x⁽¹⁾ ≺ P_x⁽²⁾, P_x⁽²⁾ ≺ P_x⁽¹⁾, P_x⁽¹⁾ ∼ P_x⁽²⁾ holds.
2) If P_x⁽¹⁾ ≺ P_x⁽²⁾ and P_x⁽²⁾ ≺ P_x⁽³⁾, then P_x⁽¹⁾ ≺ P_x⁽³⁾.
3) P_x⁽¹⁾ ∼ P_x⁽¹⁾; and if P_x⁽¹⁾ ∼ P_x⁽²⁾, then P_x⁽²⁾ ∼ P_x⁽¹⁾.
4) If P_x⁽¹⁾ ∼ P_x⁽²⁾ and P_x⁽²⁾ ∼ P_x⁽³⁾, then P_x⁽¹⁾ ∼ P_x⁽³⁾.
One can easily verify that the relations "≺" and "∼" defined by a) and b) fulfil the above conditions 1) to 4). We shall now discuss how to construct a criterion function on 𝔓 in an SLP situation. As we have seen, we shall then also have solved the problem - at least implicitly - of how to get a preferential scheme on 𝔓 × 𝔓.
where W, q may also be deterministic or stochastic, in the sense that they depend on ω ∈ Ω (i = 1, ..., r).
In particular, the case

F = μ_{P_x × P_ω}{L(e(ω, x))} + λ · σ_{P_x × P_ω}{L(e(ω, x))}

with λ ≥ 0 seems to be of practical importance.
This type of problem is called a two-stage problem of SLP, or an SLP problem with recourse. The practical meaning of problems with recourse is this: when we have determined x (or x has been determined as a realization of a strategy P_x) and when we observe a realization of the random vector (A(ω), b(ω), c(ω)), then there may be a deviation from the original constraints, i.e. A(ω)x = b(ω) may not be satisfied. Such a deviation causes penalty costs arising from a second-stage linear program min {q'y | Wy = b(ω) − A(ω)x, y ≥ 0}, which may be understood as an "emergency program" yielding the last possibility of compensating for the deviation from the original constraints. The total costs observed in this situation are the sum of the original costs c'(ω)x and the penalty costs, and - since they depend on ω and x - these total costs are random. The objective is to determine x or P_x such that some function F of certain moments of the total costs becomes minimal, for example the expected total costs or - if risk aversion is involved - a weighted sum of the expectation and the standard deviation of the total costs.
3) A further possibility is:

𝔛 = X ∩ {x | P_ω(c'(ω)x ≤ γ) ≥ α}

L(e(ω, x)) = { 1 if A(ω)x ≥ b, 0 otherwise,

Φ_L = −E_ω L.
However, we shall not discuss these problems here, because their general theory is contained in the literature on decision theory and they do not really make use of the information on P_ω, i.e. they just take into account some sets of measure zero with respect to P_ω (in determining ess sup_{ω∈Ω} L).
you should be able to deliver the quantity b of iron required with a high probability, e.g. with probability 0.9. This yields the chance-constrained program

min [2x + y]
subject to P(0.5x + 0.3y ≥ b) ≥ 0.9
           x + y ≥ 4
           x ≥ 0, y ≥ 0.
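When the distribution of b is known, the chance constraint P(0.5x + 0.3y ≥ b) ≥ 0.9 collapses to the deterministic constraint 0.5x + 0.3y ≥ b₀.₉, the 90% quantile of b. A sketch solving the resulting two-variable linear program by vertex enumeration (the uniform distribution of b is our own assumption, since the text does not specify one):

```python
from itertools import combinations

# Assume b uniform on [1, 2]; then the 90% quantile is q = 1.9 and the chance
# constraint becomes the ordinary constraint 0.5x + 0.3y >= q.
q = 1.0 + 0.9 * (2.0 - 1.0)

# constraints g1*x + g2*y >= h, including x >= 0, y >= 0
cons = [(0.5, 0.3, q), (1.0, 1.0, 4.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
cost = lambda x, y: 2 * x + y

def feasible(x, y, tol=1e-9):
    return all(g1 * x + g2 * y >= h - tol for g1, g2, h in cons)

best = None
for (a1, a2, r1), (b1, b2, r2) in combinations(cons, 2):
    det = a1 * b2 - a2 * b1
    if abs(det) < 1e-12:
        continue                        # parallel boundary lines
    x = (r1 * b2 - r2 * a2) / det       # intersection of the two boundaries
    y = (a1 * r2 - b1 * r1) / det       # (Cramer's rule)
    if feasible(x, y) and (best is None or cost(x, y) < best[0]):
        best = (cost(x, y), x, y)
```

Under this assumed distribution the optimum is (x, y) = (0, 19/3) with cost 19/3; the optimum for a bounded objective over a polyhedron is always attained at such a vertex.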
Minimizing the total costs (per month) in the mean, i.e. the expected total costs, yields the two-stage program

min [2x + y + E_b Q(x, y, b)]
subject to x + y ≥ 4
           x ≥ 0, y ≥ 0,
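The two-stage objective can be evaluated directly when b takes finitely many values: for fixed (x, y) the recourse term Q(x, y, b) is the cheapest emergency action covering the shortfall. The sketch below invents both a concrete recourse (buying missing iron at a penalty price per unit) and a discrete distribution for b, neither of which is specified in the text:

```python
# Assumed recourse: a shortfall max(0, b - 0.5x - 0.3y) is bought at penalty
# price q_pen per unit, so Q(x, y, b) = q_pen * max(0, b - 0.5x - 0.3y).
scenarios = [(1.2, 0.3), (1.6, 0.4), (2.0, 0.3)]   # (demand b, probability p)
q_pen = 5.0                                        # assumed penalty price

def total_cost(x, y):
    """First-stage cost 2x + y plus the expected recourse cost E_b Q(x, y, b)."""
    shortfall = lambda b: max(0.0, b - 0.5 * x - 0.3 * y)
    return 2 * x + y + sum(p * q_pen * shortfall(b) for b, p in scenarios)

def solve_by_grid(step=0.1, hi=10.0):
    """Crude grid search over the feasible set x + y >= 4, x >= 0, y >= 0."""
    best = None
    n = int(round(hi / step))
    for i in range(n + 1):
        for j in range(n + 1):
            x, y = i * step, j * step
            if x + y >= 4.0:
                c = total_cost(x, y)
                if best is None or c < best[0]:
                    best = (c, x, y)
    return best

best_cost, x_opt, y_opt = solve_by_grid()
```

With these invented data the grid optimum buys only y (the cheaper delivery per unit of iron) and overshoots the deterministic constraint deliberately, trading first-stage cost against expected penalties, which is precisely the "here and now" trade-off of the two-stage model.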
where x ∈ ℝ^n, A is a real (m × n)-matrix, and b and c are real m- and n-vectors respectively. Without loss of generality we assume that m ≤ n.
If we define γ = +∞ whenever (1) has no feasible solution, then γ = γ(A, b, c) is an extended real-valued function defined on ℝ^{mn+m+n}.
Obviously γ represents the optimal value of (1) if this linear program has a solution.
As we have seen in Ch. I, in an SLP situation there is a known probability space (Ω, 𝔉, P_ω) and a measurable transformation (A(ω), b(ω), c(ω)): Ω → ℝ^{mn+m+n} such that (A, b, c) = (A(ω), b(ω), c(ω)) is a random vector. Therefore γ(A, b, c) becomes γ(ω) = γ(A(ω), b(ω), c(ω)). Now we may state the general

Distribution problem: Given P_ω, what is the probability distribution of γ(ω)?
First of all we have to ensure that this problem is mathematically meaningful, i.e. that γ(ω) is a random variable or, equivalently, that γ(ω): Ω → ℝ̄ is measurable. This may be concluded from Th. 0.18 and the following

Theorem 1. γ(A, b, c): ℝ^{mn+m+n} → ℝ̄ is a Borel measurable extended real-valued function.

Proof: We shall prove this statement by constructing a countable set of Borel measurable functions γᵢₙ: ℝ^{mn+m+n} → ℝ̄ and showing that γ = supₙ infᵢ γᵢₙ. Applying Th. 0.19 we get the desired result.
Let 𝔇 be a countable set of vectors xⁱ ≥ 0 which is dense in {x | x ∈ ℝ^n, x ≥ 0}, for example the set of all nonnegative n-tuples of rationals.
Using the Euclidean norm as vector norm, define

𝔇ᵢₙ = {(A, b, c) | ‖Axⁱ − b‖ ≤ 1/n}

and

γᵢₙ = { c'xⁱ if (A, b, c) ∈ 𝔇ᵢₙ, ∞ otherwise }

(see Th. 0.20), which implies that the 𝔇ᵢₙ are Borel measurable sets. Furthermore γᵢₙ is continuous in c on 𝔇ᵢₙ.
Let us show that γ = supₙ infᵢ γᵢₙ in two steps:
A) infᵢ γᵢₙ ≤ γ for n = 1, 2, ...
We have to distinguish three possible cases:
A1) γ = +∞.
In this case infᵢ γᵢₙ ≤ γ is trivial.
A2) −∞ < γ < ∞.
Then there exists a vector x⁰ ≥ 0 such that Ax⁰ = b and γ = c'x⁰. For an arbitrarily chosen natural number n and any real positive ε let

δ(ε, n) = min [1/(n‖A‖); ε/‖c‖].

Since 𝔇 is dense in {x | x ≥ 0}, there exists an xⁱ ∈ 𝔇 such that ‖x⁰ − xⁱ‖ ≤ δ(ε, n). For this xⁱ we have ‖Axⁱ − b‖ ≤ 1/n and γᵢₙ = c'xⁱ ≤ γ + ε. Since we can repeat this procedure for any ε > 0 and for all natural n, we have proved that infᵢ γᵢₙ ≤ γ for n = 1, 2, ...
A3) γ = −∞.
In this case there exist vectors xᵛ, ν = 1, 2, 3, ..., such that xᵛ ≥ 0, Axᵛ = b and c'xᵛ ≤ −ν.
Again let

δ(ε, n) = min [1/(n‖A‖); ε/‖c‖].

Then to each xᵛ there exists an x^{i_ν} ∈ 𝔇 which satisfies ‖xᵛ − x^{i_ν}‖ ≤ δ(ε, n). Again we have

‖Ax^{i_ν} − b‖ ≤ 1/n

and

c'x^{i_ν} ≤ −ν + ε.
B2) γ = ∞.
In this case {x | Ax = b, x ≥ 0} = ∅ and therefore

ρ = inf_{x ≥ 0} ‖Ax − b‖ > 0.

This implies γᵢₙ = ∞ for all i whenever 1/n < ρ, and hence supₙ infᵢ γᵢₙ = ∞ = γ.
Obviously τₙ ≤ γ_{i*n} < γ, since x^{i*} is feasible in (2). On the other hand τₙ must be finite, because τₙ = −∞ would imply the existence of a vector w ≥ 0, Aw = 0, c'w < 0 (see Th. 0.4), which contradicts our assumption that (1) has a finite optimal value γ. With respect to the basis B, the basic and nonbasic variables x̄ and y of any optimal solution of (2) must satisfy the equation

x̄ = B⁻¹b + B⁻¹dₙ − B⁻¹A_N y.

Since τₙ = c̄'x̄ + c'_N y, we get

τₙ = c̄'B⁻¹b + c̄'B⁻¹dₙ + (c'_N − c̄'B⁻¹A_N)y
   ≥ γ + c̄'B⁻¹dₙ

and therefore

0 < γ − γ_{i*n} ≤ γ − τₙ ≤ |c̄'B⁻¹dₙ| ≤ ‖c‖·‖B⁻¹‖·(1/n),

and this implies again that supₙ infᵢ γᵢₙ = γ.
inf_{x ≥ 0} ‖Ax − b‖ ≥ inf_{xⁱ ∈ 𝔇} ‖Axⁱ − b‖,

and, therefore,

‖Ax^{i_ν} − b‖ → inf_{x ≥ 0} ‖Ax − b‖ as ν → ∞.
Now for any xⁱ ∈ 𝔇, ‖Axⁱ − b‖ is a Borel measurable function on ℝ^{mn+m+n}, since it is continuous. Therefore, by Th. 0.19,

inf_{xⁱ ∈ 𝔇} ‖Axⁱ − b‖

is Borel measurable.
b) 𝔑 is a Borel measurable subset of ℝ^{mn+m+n}.
Let B₁, ..., B_r be all (m × m)-submatrices of A, i.e. r = (n choose m). Then

𝔑 = ⋂_{i=1}^{r} {(A, b, c) | det Bᵢ = 0} − 𝔐.
By definition we have a finite number of disjoint sets ℭᵢ and 𝔇ᵢⱼ. To show that

ℨ = ⋃ᵢ (ℭᵢ ∪ ⋃ⱼ 𝔇ᵢⱼ),

suppose (A, b, c) to be in ℨ. Then the linear program (1) either has a finite optimal solution, and therefore also an optimal feasible basic solution, implying (A, b, c) ∈ ℭᵢ for some i; or the objective is not bounded from below, implying that there must be a feasible basic solution - i.e. Bᵢ⁻¹b ≥ 0 - such that some nonbasic variable may be augmented arbitrarily without violating nonnegativity - i.e. Bᵢ⁻¹Aⱼ ≤ 0 - and thereby decreasing the objective arbitrarily - i.e. cⱼ − c'_{Bᵢ}Bᵢ⁻¹Aⱼ < 0. Hence (A, b, c) ∈ 𝔇ᵢⱼ for some (i, j). Conversely it is clear that

⋃ᵢ (ℭᵢ ∪ ⋃ⱼ 𝔇ᵢⱼ) ⊂ ℨ.
To define x̄ on ℨ we shall represent this vector on ℭᵢ ∪ ⋃ⱼ 𝔇ᵢⱼ by its basic and nonbasic parts x̄, ȳ respectively. Then

x̄ = Bᵢ⁻¹b

and

ȳ_ν = { 0 for all ν, if (A, b, c) ∈ ℭᵢ;
        ∞ for ν = j and 0 for ν ≠ j, if (A, b, c) ∈ 𝔇ᵢⱼ,

which defines x̄ on ℨ as a Borel measurable transformation yielding a solution to (1) whenever (1) has a solution and A has full rank. Furthermore we define x̄ on 𝔐 by

x̄ⱼ = { +∞ if cⱼ ≥ 0, −∞ if cⱼ < 0 }, if (A, b, c) ∈ 𝔐.

On 𝔑 we may define x̄ in the same way as on ℨ, taking into account the fact that, for (A, b, c) ∈ 𝔑, a certain subset of (linearly dependent) constraints of (1) can be deleted, yielding a new linear program with m₁ < m linearly independent constraints. □
In a certain sense Th. 2 is stronger than Th. 1. Precisely:

Theorem 3. Let x̄ be the measurable transformation of Th. 2 and γ̄ = c'x̄. Then γ̄ is a Borel measurable extended real-valued function on ℝ^{mn+m+n} and γ̄ = γ almost everywhere with respect to the Lebesgue measure.

Proof. Every component x̄ⱼ of x̄ is a Borel measurable function on ℝ^{mn+m+n}, as is every component cⱼ of c. Therefore γ̄ = c'x̄, as a sum of products of measurable functions, is measurable (see Th. 0.19).
γ̄ may differ from γ just on the set

𝔎 = {(A, b, c) | (A, b, c) ∈ 𝔐, c = 0},

because there γ̄ = 0 (if we use the definition 0·∞ = 0) and γ = ∞.
But the Lebesgue measure of 𝔎 is equal to zero. □
Although these results indicate that, from a mathematical point of view, the distribution problem is meaningful, we may still get distribution functions of the optimal value with defect, i.e. it may happen that the probability

P_ω({ω | −∞ < γ < ∞}) < 1.

In this case the moments of the random variable γ certainly do not exist. Since, however, in many practical cases decision makers are interested in the mean value and the variance, we shall try to characterize those problems whose optimal value has a distribution function without defect. After that we shall investigate some special types of problems which are of practical importance as well as mathematically "well behaved". The following results are due to Bereanu [4].
and therefore

P_ω(𝔄) = P_ω({(A, b, c) | u'A ≥ 0 implies b'u ≥ 0}) = 1,

and hence

1 = P_ω(𝔅 − 𝔘) ≤ P_ω(𝔅 − 𝔄) + P_ω({(A, b, c) | w ≥ 0, Aw = 0 implies c'w ≥ 0}).

The fact that P_ω(𝔄) = 1, and consequently P_ω(𝔅 − 𝔄) = 0, completes the proof. □

Let us now consider two special types of linear programs which satisfy the conditions of Th. 4 and hence yield an optimal value whose distribution function has no defect.
γ = inf c'x
(3) Ax ≥ b
    x ≥ 0

where

Σ_{j=1}^{n} aᵢⱼ > 0, i = 1, ..., m,

Σ_{i=1}^{m} aᵢⱼ > 0, j = 1, ..., n, and

We call this type of linear program a positive linear program. The assumptions made here are quite realistic if we imagine that (3) represents a production program, where A is the technological matrix, x represents the input, Ax the output, b the demand vector and cⱼ the cost per unit of the j-th factor of production.
γ = inf d'z
    Bz = b
    z ≥ 0

where

d' = (c', 0'), B = (A, −E), z' = (x', y').

To verify condition a) of Th. 4, let u ∈ ℝ^m be such that u'B ≥ 0, i.e. u'A ≥ 0 and u ≤ 0.
Since (3) is positive, it follows that u = 0 and therefore that b'u = 0. Condition b) of Th. 4 follows from the fact that w ≥ 0 already implies d'w ≥ 0. □
Another special type is the stochastic transportation problem, where the unit transportation costs, the supplies and the demands are assumed to be random variables with positive range and such that the total demand almost surely does not exceed the total supply:

γ = inf Σ_{i=1}^{m} Σ_{j=1}^{n} cᵢⱼxᵢⱼ

(4) Σ_{j=1}^{n} xᵢⱼ ≤ aᵢ, i = 1, ..., m

    Σ_{i=1}^{m} xᵢⱼ ≥ bⱼ, j = 1, ..., n,

where cᵢⱼ ≥ 0, aᵢ ≥ 0, bⱼ ≥ 0 and Σ_{j=1}^{n} bⱼ ≤ Σ_{i=1}^{m} aᵢ with probability 1.
Written out in full, the constraint matrix of (4) has the familiar block structure of the transportation problem: the i-th supply row contains the row vector e' = (1, ..., 1) (n ones) in the i-th block and zeros elsewhere, and the demand rows consist of the (n × n) identity matrix Iₙ repeated m times; the slack variables enter with I_m and −Iₙ respectively. For the dual variables u¹, u² of (4) one obtains the estimate

Σ_{i=1}^{m} aᵢu¹ᵢ + Σ_{j=1}^{n} bⱼu²ⱼ ≥ (min_{1≤i≤m} u¹ᵢ) Σ_{i=1}^{m} aᵢ + (min_{1≤j≤n} u²ⱼ) Σ_{j=1}^{n} bⱼ
  ≥ (min_{1≤i≤m} u¹ᵢ + min_{1≤j≤n} u²ⱼ) Σ_{j=1}^{n} bⱼ ≥ 0.
It often happens, as in problems (3) and (4), that only a certain subset of the coefficients of a linear program are random variables. We may express this fact by a reformulation of the general problem due to Bereanu [3], which allows statements of all possible kinds of SLP distribution problems.
Let T ⊂ ℝ^r be the range of a random vector t = (t₁, ..., t_r)' and

γ(t) = inf c'(t)·x
(5) subject to A(t)x = b(t)
    x ≥ 0

where

A(t) = A⁰ + A¹t₁ + A²t₂ + ... + A^r t_r
b(t) = b⁰ + b¹t₁ + b²t₂ + ... + b^r t_r
c(t) = c⁰ + c¹t₁ + c²t₂ + ... + c^r t_r.
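The affine coefficient families above are easy to realize concretely, and sampling them also gives a numerical preview of Th. 7 below: det B(t) is a polynomial in t, so it either vanishes identically or vanishes only on a Lebesgue null set. A sketch with invented data for r = 1:

```python
import random

# A(t) = A0 + A1*t for r = 1 (data invented for illustration)
A0, A1 = [[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]]

def A(t):
    return [[a0 + a1 * t for a0, a1 in zip(r0, r1)] for r0, r1 in zip(A0, A1)]

def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

random.seed(3)
# det A(t) = 1 - t**2 vanishes only at t = +/-1, a Lebesgue null set, so
# random draws of t essentially never hit a singular matrix.
zeros = sum(1 for _ in range(10_000) if abs(det2(A(random.uniform(-2, 2)))) < 1e-12)
```

Here `zeros` comes out 0 with overwhelming probability, while det A(t) does vanish at the isolated points t = ±1, exactly the dichotomy the theorem asserts.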
Theorem 7. Let B(t) be an (m × m)-submatrix of A(t) and μ_r the Lebesgue measure on ℝ^r. Then either det B(t) = 0 for all t ∈ T or μ_r({t | det B(t) = 0}) = 0.

Proof. Suppose it is not the case that det B(t) = 0 for all t ∈ T, i.e. there exists a t* ∈ T such that det B(t*) ≠ 0.
Obviously φ(t) = det B(t) is an algebraic function in t, i.e.

φ(t) = Σ_ν α_ν t₁^{i_{1ν}} t₂^{i_{2ν}} ··· t_r^{i_{rν}},

where the i_{ρν} ≥ 0 are integers. Now for any algebraic function φ(t) either φ(t) ≡ 0 or μ_r({t | φ(t) = 0}) = 0, as we shall see by induction on r.
For r = 1 the fundamental theorem of algebra asserts that φ(t) ≡ 0 or φ(t) has a finite number of roots, i.e. μ₁({t | φ(t) = 0}) = 0.
Assume that the statement is true for any algebraic function of at most r − 1 variables t₁, ..., t_{r−1}. Consider now

φ(t) = Σ_μ φ_μ(t₁, ..., t_{r−1}) · t_r^μ,

where the φ_μ(t₁, ..., t_{r−1}) are algebraic functions of at most r − 1 variables t₁, ..., t_{r−1} such that there exists at least one μ₀ with φ_{μ₀}(t₁, ..., t_{r−1}) ≢ 0.
which implies μ_r(ℭ) = 0 by Th. 0.25. And for any (t₁, ..., t_{r−1})-section 𝔇^{(r−1)} of 𝔇 we get
where c_{Bᵢ}(t) consists of the basic components of c(t) with respect to the basis Bᵢ(t). Moreover let

𝔐̄₁ = 𝔐₁ and 𝔐̄ᵢ = 𝔐ᵢ − ⋃_{j=1}^{i−1} 𝔐ⱼ, i = 2, ..., q; then 𝔐̄ᵢ ∩ 𝔐̄ₖ = ∅ for i ≠ k, and ⋃_{i=1}^{q} 𝔐̄ᵢ = ⋃_{i=1}^{q} 𝔐ᵢ.
It is now easy to prove
We are interested in the distribution function F_{γ(t)}(ξ) of γ(t), i.e. in the probability of the set

𝔊(ξ) = {t | t ∈ T; −∞ < γ(t) ≤ ξ}, ξ ∈ ℝ.

Let 𝔐 = ⋃_{i=1}^{q} 𝔐ᵢ. Then obviously P_t(𝔊(ξ)) = P_t(𝔊(ξ) ∩ 𝔐).
Therefore

F_{γ(t)}(ξ) = P_t(𝔊(ξ)) = Σ_{i=1}^{q} P_t(𝔊(ξ) ∩ (𝔐ᵢ − 𝔐⁽⁰⁾))
  = Σ_{i=1}^{q} P_t({t | t ∈ 𝔐ᵢ − 𝔐⁽⁰⁾ and γᵢ(t) ≤ ξ})
  = Σ_{i=1}^{q} P_t({t | t ∈ 𝔐ᵢ and γᵢ(t) ≤ ξ}), since P_t(𝔐⁽⁰⁾) = 0,
  = Σ_{i=1}^{q} P_t(ℭᵢ(ξ)) = Σ_{i=1}^{q} ∫_{ℭᵢ(ξ)} f_t(τ) dτ. □
Sometimes one can find in the literature the conjecture that, under the assumptions A1, or even stronger ones like positivity assumptions on the linear program (5), the sets 𝔄ᵢ could be taken as "decision regions" instead of the sets 𝔅ᵢ. The following very simple and well-behaved example shows that this conjecture cannot be true in general because

P_t(𝔄ᵢ ∩ 𝔄ₖ) = 0, i ≠ k,

and
and hence, which implies

c'_{Bᵢ}(t)Bᵢ⁻¹(t)b(t) − c'_{Bₖ}(t)Bₖ⁻¹(t)b(t) = 0.

Hence,

𝔅ᵢ ∩ 𝔅ₖ ⊂ {t | t ∈ T, c'_{Bᵢ}(t)Bᵢ⁻¹(t)b(t) − c'_{Bₖ}(t)Bₖ⁻¹(t)b(t) = 0}

and obviously,

𝔅ᵢ ∩ 𝔅ₖ ⊂ {t | t ∈ T, Bᵢ⁻¹(t)b(t) ≥ 0 and Bₖ⁻¹(t)b(t) ≥ 0},

which proves the theorem. □
From the counterexample above, as well as from the last proof, we might suspect that there is some connection between the fact that t ∈ 𝔅ᵢ ∩ 𝔅ₖ and primal or dual degeneracy. Let Bᵢ(t) again be some almost nonsingular (m × m)-submatrix of A(t). We say that Bᵢ(t) is primal degenerate with respect to the linear program (5) if at least one component of Bᵢ⁻¹(t)b(t) vanishes. We call Bᵢ(t) dual degenerate if at least one component of c_{Rᵢ}(t) − R'ᵢ(t)B'ᵢ⁻¹(t)c_{Bᵢ}(t) is equal to zero, where Rᵢ(t) is the matrix consisting of all columns of A(t) not belonging to Bᵢ(t) and c_{Rᵢ}(t) is the vector of those components of c(t) belonging to Rᵢ(t). Certain stochastic linear programs of type (5) satisfy the assumption

A2. For every almost nonsingular (m × m)-submatrix Bᵢ(t) of A(t) there exists a t⁽ⁱ⁾ ∈ T such that Bᵢ(t⁽ⁱ⁾) is nonsingular and neither primal nor dual degenerate with respect to (5).
Theorem 11. Given A1 and A2, we have P_t(𝔅ᵢ ∩ 𝔅ₖ) = 0, i ≠ k, i.e. the sets 𝔅ᵢ may be taken as "decision regions".

Proof. 𝔅ᵢ ∩ 𝔅ₖ = {t | Bᵢ⁻¹(t)b(t) ≥ 0, Bₖ⁻¹(t)b(t) ≥ 0,
    c'(t) − c'_{Bᵢ}(t)Bᵢ⁻¹(t)A(t) ≥ 0,
    c'(t) − c'_{Bₖ}(t)Bₖ⁻¹(t)A(t) ≥ 0, t ∈ T}
  ⊂ {t | Bᵢ⁻¹(t)b(t) ≥ 0, Bₖ⁻¹(t)b(t) ≥ 0,
    c'_{Bᵢ}(t) − c'_{Bₖ}(t)Bₖ⁻¹(t)Bᵢ(t) ≥ 0,
    c'_{Bᵢ}(t)Bᵢ⁻¹(t)b(t) − c'_{Bₖ}(t)Bₖ⁻¹(t)b(t) = 0, t ∈ T} = 𝔇

if and only if at least one of any two corresponding components of the vectors vanishes. Since Bᵢ(t) and Bₖ(t) are different submatrices of A(t), there is at least one component of d(t), say d_ν(t), which is an algebraic function in t and, by A2, d_ν(t⁽ᵏ⁾) ≠ 0, and therefore

μ_r({t | d_ν(t) = 0}) = 0.

This yields the desired result.
2. Special Problems
From Th. 8 and Th. 9 we may conclude that in general it is not at all trivial to determine the distribution function F_{γ(t)}(ξ) or the moments Eγ^ν(t), since their computation involves numerical integration over the sets ℭᵢ(ξ) or functions γᵢ^ν(t) which are difficult to handle. One of the major reasons for these difficulties is the fact that, in general, γ(t) is not continuous in t.
The following example due to Bereanu [4] shows that this discontinuity may appear, even if γ(t) is finite for all t ∈ T, as soon as the technological matrix varies with t.
Define, for t ∈ ℝ,

γ(t) = inf {x | x ∈ ℝ, y ∈ ℝ, x + ty ≥ 1, x ≥ 0, y ≥ 0}.

Then

γ(t) = { 1 for t ≤ 0,
         0 for t > 0,
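The jump at t = 0 can be reproduced numerically: for t ≤ 0 every feasible point needs x ≥ 1, while for any t > 0 a sufficiently large y allows x = 0. A brute-force sketch scanning a grid of y-values (grid parameters are our own choices):

```python
def gamma_approx(t, y_max=1e6, steps=10_000):
    """Approximate inf{x | x + t*y >= 1, x >= 0, y >= 0} over a grid of y-values."""
    best = float("inf")
    for k in range(steps + 1):
        y = y_max * k / steps
        x = max(0.0, 1.0 - t * y)     # smallest feasible x for this fixed y
        best = min(best, x)
    return best

left  = gamma_approx(-0.5)    # t <= 0: the infimum is 1
mid   = gamma_approx(0.0)     # t  = 0: still 1
right = gamma_approx(1e-3)    # t  > 0: drops to 0 once y is large enough
```

The one-sided limit as t ↓ 0 is 0 while γ(0) = 1, so γ(t) is discontinuous at t = 0 even though it is finite everywhere.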
If in addition to the assumptions of Th.12 c = c(t) == co, i.e. c 1 = c2 = ... = c' =0,
we see from this proof that y(t) is piecewise linear (see Th. 0.7). If moreover
T={t I rx:::;; t :::;;/3, t ElR} we may proceed as follows:
(i) For t₀ = α determine y(t₀) = min{c'x | Ax = b(t₀), x ≥ 0} with the simplex method,
yielding an optimal feasible basis B₀ such that y(t₀) = c_{B₀}B₀^{-1}b(t₀). With
k = 0 go to the next step.
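A crude grid variant of this parametric idea can be sketched as follows. The affine right-hand side b(t) = b⁰ + t b¹ and the use of `scipy.optimize.linprog` are illustrative assumptions; the book's procedure advances by parametric basis exchanges instead of re-solving from scratch.

```python
# Sketch: solve min{c'x | Ax = b(t), x >= 0} on a grid of t in [alpha, beta]
# and record the optimal value, which is piecewise linear in t.
import numpy as np
from scipy.optimize import linprog

c = np.array([1.0, 2.0])
A = np.array([[1.0, 1.0]])

def y_of_t(t):
    b = np.array([1.0 + t])            # b(t) = b0 + t*b1 with b0 = 1, b1 = 1 (made up)
    return linprog(c=c, A_eq=A, b_eq=b, bounds=[(0, None)] * 2).fun

vals = [y_of_t(t) for t in np.linspace(0.0, 1.0, 5)]
print(vals)   # here y(t) = 1 + t: a single linear piece, so one basis suffices
```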
as stated in (5), and assume that t varies over a compact interval T ⊂ ℝ^r. For
y(t) to be continuous on T it is necessary that (7) is solvable for all t ∈ T, or
equivalently (see Th. 0.1 and Th. 0.4) the conditions
are necessary. However, these conditions are not sufficient for the continuity
of y(t) on T, as we know from the example given above.
To motivate a sufficient condition given by Bereanu [4], let us assume that for
(m × m)-submatrices B_i(t), i = 1,...,q, of A(t) the sets
(where d_{B_i}(t) and D_{B_i}(t) correspond to the nonbasic parts of c(t) and A(t), respectively)
cover T, i.e.
(10)
(10) that there is a 𝔇_i such that t ∈ 𝔇_i and y(t) = c_{B_i}(t)B_i^{-1}(t)b(t) is continuous
at t because 𝔇_i is an open set in ℝ^r. ∎
Now assume that for an arbitrary t ∈ 𝔇_i there are vectors w ∈ ℝⁿ, u ∈ ℝᵐ such that
w ≠ 0, w ≥ 0, A(t)w = 0 and u ≠ 0, A'(t)u ≤ 0. Rearranging w into its basic part
w_{B_i} and its nonbasic part w_{N_i}, we have
A(t)w = B_i(t)w_{B_i} + D_{B_i}(t)w_{N_i} = 0
and therefore
B'_i(t)u ≤ 0,
D'_{B_i}(t)u ≤ 0.
So we have shown that for (10) the following conditions are necessary, which are
a strengthening of (8):
Obviously ξ* ≥ 0 and ||ξ*|| = 1,
and therefore
(12) A'(t*)η* = lim_{j→∞} A'(t_{ν_j}) η_{ν_j} ≤ lim_{j→∞} c(t_{ν_j}) / ||u_{ν_j}|| = 0
and therefore
(13) b'(t_{ν_j}) u_{ν_j} = ||u_{ν_j}|| b'(t_{ν_j}) η_{ν_j}.
But (12) as well as (13) contradicts
(14) c'(t_ν)x_ν − b'(t_ν)u_ν ≥ 0, ν = 1, 2, 3, ...,
which has to be satisfied for (x_ν, u_ν) ∈ 𝔐. Hence our assumption that 𝔐 is not
bounded is contradictory. □
Theorem 14. If T ⊂ ℝ^r is a compact interval and if (11) is valid on T, then y(t) is
continuous on T.
Proof. For any t ∈ T the vectors x* ∈ ℝⁿ and u* ∈ ℝᵐ are solutions of (7) and its
dual program if and only if x* ≥ 0 and, for all u ∈ ℝᵐ and all x ∈ ℝⁿ such that x ≥ 0,
where
Δc = c(t̄) − c(t),
Δb = b(t̄) − b(t),
ΔA = A(t̄) − A(t).
Since, according to the duality theorem (see Th. 0.9), (x, u) and (x̄, ū) are elements
of the bounded set 𝔐 defined in Lemma 13, the continuity of y(t) on T follows
from (16). □
This result suggests the application of numerical quadrature if we are
interested in determining F_{y(t)}(ξ) or E y(t), when t is a random vector with range in
the compact interval T and with a continuous density function f_T(·). However,
this leads in general to a tremendous amount of work. The question whether
numerical quadrature or Monte-Carlo simulation is to be preferred has not yet
been answered.
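For the example above, where y(t) is known in closed form, the Monte-Carlo alternative is easy to sketch. The code below is illustrative only; taking t uniform on [-1, 1] is our assumption, not the text's.

```python
# Sketch: Monte-Carlo estimation of E y(t) for the example
# y(t) = 1 (t <= 0), y(t) = 0 (t > 0), with t uniform on [-1, 1].
# The exact value is P(t <= 0) * 1 = 0.5; the sample mean converges
# at the usual O(n^{-1/2}) Monte-Carlo rate.
import random

random.seed(42)
n = 100_000
samples = (random.uniform(-1.0, 1.0) for _ in range(n))
estimate = sum(1.0 if t <= 0 else 0.0 for t in samples) / n
print(estimate)
```

In realistic problems each sample would require one LP solve instead of an indicator evaluation, which is where the cost comparison with quadrature becomes nontrivial.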
Chapter III. Two Stage Problems
As stated there, q and W may also be stochastic in the sense that they are
random vectors on the probability space (Ω, 𝔉, P_ω) too. First of all we shall check
whether we really need mixed strategies or not. In many practical cases the
problem will be stated somewhat differently from (1), namely
L_i(e(ω,x)) = L(e(ω,x)) if i = 1,
L_i(e(ω,x)) = |L(e(ω,x))| if i > 1.
The question, whether we can restrict ourselves to pure strategies without loss,
was first answered by Wessels [18] for the case when
F(e(ω,x)) = Σ_{i=1}^r λ_i L_i(e(ω,x)), λ_i ≥ 0,
and
∫_{X×Ω} L_i(e(ω,x)) d(P_X × P_ω) exists.
Then
inf_{x∈X} ∫_Ω L_i(e(ω,x)) dP_ω ≤ ∫_{X×Ω} L_i(e(ω,x)) d(P_X × P_ω).
Since, with G(x) = ∫_Ω L(e(ω,x)) dP_ω,
inf_{x∈X} G(x) = ∫_X inf_{x∈X} G(x) dP_X ≤ ∫_X G(x) dP_X = ∫_{X×Ω} L(e(ω,x)) d(P_ω × P_X),
the assertion follows.
Corollary 2. Provided the integrals involved exist,
inf_{x∈X} Σ_{i=1}^r λ_i {E_{P_ω} L_i(e(ω,x))}^{τ_i} ≤ Σ_{i=1}^r λ_i {∫_{X×Ω} L_i(e(ω,x)) d(P_ω × P_X)}^{τ_i}.
Proof. Obviously
Σ_{i=1}^r λ_i {E_{P_ω} L_i(e(ω,x))}^{τ_i} ≤ Σ_{i=1}^r λ_i {∫_{X×Ω} L_i(e(ω,x)) d(P_ω × P_X)}^{τ_i}.
Applying Fubini's theorem and Hölder's inequality in the same way as in the
proof of Th. 1 yields the desired result. □
For problem (1) a similar result may easily be established:
Theorem 3. Provided the integrals involved exist,
∫_X ∫_Ω Σ_{i=1}^r λ_i L_i(e(ω,x)) dP_ω dP_X = Σ_{i=1}^r λ_i ∫_{X×Ω} L_i(e(ω,x)) d(P_ω × P_X). □
The three statements Th. 1, Cor. 2 and Th. 3 cover more than we shall
treat in this book. However, the question for which types of function (F in (1)
as well as in (2)) we may restrict ourselves without loss to pure strategies does
not, in general, seem to be answered.
Furthermore, in the following we shall reduce problem (1), with the help of
our previous results, to
(3) min_{x∈X} E_{P_ω} {c'(ω)x + min_y {q'(ω)y | W(ω)y = b(ω) − A(ω)x, y ≥ 0}}.
First we have to discuss what we call the feasible set of decisions. Here we are
especially concerned with the so-called recourse program
(4) Q(x,ω) = inf{q'(ω)y | W(ω)y = b(ω) − A(ω)x, y ≥ 0}, if feasible y exist;
Q(x,ω) = +∞, if no feasible y exist.
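A direct way to evaluate (4) for a single realization ω is to solve the second-stage linear program and map infeasibility to +∞ (and unboundedness to −∞). The data below and the use of `scipy.optimize.linprog` are illustrative assumptions, not an example from the text.

```python
# Sketch: evaluating the recourse function (4) for one realization omega:
#   Q(x, omega) = inf{ q'(omega) y | W(omega) y = b(omega) - A(omega) x, y >= 0 },
# with Q = +inf when no feasible y exists.
import numpy as np
from scipy.optimize import linprog

def recourse_value(q, W, b, A, x):
    rhs = b - A @ x
    res = linprog(c=q, A_eq=W, b_eq=rhs,
                  bounds=[(0, None)] * W.shape[1])
    if res.status == 2:          # second stage infeasible
        return float("inf")
    if res.status == 3:          # second stage unbounded below
        return float("-inf")
    return res.fun

W = np.array([[1.0, -1.0]])      # one-row simple recourse: y+ - y- = b - A x
q = np.array([2.0, 1.0])         # q+ = 2, q- = 1
A = np.array([[1.0]])
b = np.array([3.0])
print(recourse_value(q, W, b, A, np.array([1.0])))   # a shortage of 2 is priced at q+
```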
and hence
D_j = {ω | Q(x_j, ω) < +∞}.
Here
(6) Q(x_i, ω) = inf q'(ω)y
subject to W(ω)y = b(ω) − A(ω)x_i, i = 1, 2,
y ≥ 0
and
(7) Q(λx₁ + (1−λ)x₂; ω) = inf q'(ω)y
subject to W(ω)y = b(ω) − A(ω)(λx₁ + (1−λ)x₂),
y ≥ 0.
Let y_i, i = 1, 2, be feasible in (6). Then obviously λy₁ + (1−λ)y₂ is feasible in (7)
for any λ ∈ (0,1).
If Q(x_i, ω), i = 1, 2, are both finite, then we can choose y_i to be the solutions of
(6) respectively. Hence,
it is still possible that Q⁺(x) = +∞. This means that either Q(x) would be
undefined, if Q⁻(x) = +∞, or Q(x) = +∞. Since both these situations are
meaningless in practice, because one does not want decisions with either an undefined
outcome or infinitely high costs, it seems natural to restrict x to
Proof. Since x ∈ K̄, Q(x) < +∞. Let x₁ ∈ K̄ and x₂ ∈ K̄, implying x_i ∈ K, i = 1, 2. Due
to Th. 5, for any λ ∈ (0,1), Q(λx₁ + (1−λ)x₂; ω) ≤ λQ(x₁,ω) + (1−λ)Q(x₂,ω) almost
surely.
Hence
Q(λx₁ + (1−λ)x₂) = ∫_Ω Q(λx₁ + (1−λ)x₂, ω) dP_ω
≤ ∫_Ω {λQ(x₁,ω) + (1−λ)Q(x₂,ω)} dP_ω
= λQ(x₁) + (1−λ)Q(x₂).
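The convexity just proved is easy to spot-check numerically. The sketch below uses an expected simple-recourse cost with made-up data, chosen only because it can be evaluated in closed form.

```python
# Sketch: checking Q(lambda x1 + (1-lambda) x2) <= lambda Q(x1) + (1-lambda) Q(x2)
# for an expected simple-recourse cost with a two-point distribution of b.
def expected_Q(x, q_plus=2.0, q_minus=1.0, scenarios=((0.5, 1.0), (0.5, 4.0))):
    # E_omega [ q+ * max(0, b - x) + q- * max(0, x - b) ]
    return sum(p * (q_plus * max(0.0, b - x) + q_minus * max(0.0, x - b))
               for p, b in scenarios)

x1, x2 = 0.0, 5.0
for lam in (0.25, 0.5, 0.75):
    xm = lam * x1 + (1 - lam) * x2
    assert expected_Q(xm) <= lam * expected_Q(x1) + (1 - lam) * expected_Q(x2) + 1e-12
print("convexity inequality holds at the sampled points")
```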
W(ω) = W,
i.e. W is a fixed nonstochastic matrix. From the previous section we know that in
general K ⊆ K̄. Under suitable integrability assumptions on the original random
variables we may strengthen this statement in the fixed recourse case. The
following four statements are due to Walkup and Wets [15].
Theorem 8. Let the random variables in q(ω), b(ω), A(ω) be square integrable.
Then K̄ = K.
Proof. We have to show that K̄ ⊆ K. Let x ∈ K̄ be arbitrarily chosen. For
Ω(x) = {ω | Q(x,ω) < +∞} we know, by the definition of K̄, that P_ω(Ω(x)) = 1. If
we define Ω₁ = {ω | ω ∈ Ω(x), |Q(x,ω)| < ∞}, then obviously Q⁺(x) ≤ ∫_{Ω₁} |Q(x,ω)| dP_ω.
Since Q(x,ω) is finite on Ω₁, we may represent Q(x,ω) in terms of basic solutions
for any ω ∈ Ω₁, i.e. Q(x,ω) = q̃'(ω)B^{-1}[b(ω) − A(ω)x], where B is an optimal
feasible basis out of W with respect to ω and q̃(ω) is the vector of the components
of q(ω) corresponding to B. By the assumed square integrability of the elements
of q(ω), b(ω) and A(ω) and Schwarz's inequality it is now obvious that |Q(x,ω)|
is integrable on Ω₁.
Hence
Proof. We have to show that Q(x) = −∞ for some x ∈ K implies Q(x̄) = −∞
for every other point x̄ ∈ K.
Let x be a point of K such that Q(x) = −∞.
From our integrability assumption it follows that P_ω({ω | Q(x,ω) = −∞}) = α > 0.
Let Ω_∞ = {ω | ∃y ≥ 0: Wy = 0, q'(ω)y < 0} (see Th. 0.4).
Then
P_ω(Ω_∞) = α.
If x̄ is any other point of K, then for
Ω(x̄) = {ω | Q(x̄,ω) < +∞}
we know that
P_ω(Ω(x̄)) = 1.
Hence,
P_ω(Ω_∞ ∩ Ω(x̄)) = P_ω(Ω_∞) − P_ω(Ω_∞ ∩ (Ω − Ω(x̄)))
= α
since P_ω(Ω − Ω(x̄)) = 0.
For ω ∈ Ω_∞ ∩ Ω(x̄) there exists a feasible solution to the recourse program
(since ω ∈ Ω(x̄)) and a direction y ≥ 0 such that Wy = 0 and q'(ω)y < 0 (since
ω ∈ Ω_∞), which implies
Q(x̄,ω) = −∞ and hence
Q(x̄) = −∞. □
Theorem 10. Let the random variables in q(ω), b(ω), A(ω) be square integrable
and Q(x) > −∞ on K. Then Q(x) is Lipschitz continuous.
Proof. Let x_i ∈ K, i = 1, 2, and Ω_i = {ω | Q(x_i,ω) < +∞}, hence P_ω(Ω_i) = 1, i = 1, 2.
From Q(x) > −∞ on K it follows, for
Ω_∞ = {ω | ∃y ≥ 0: Wy = 0, q'(ω)y < 0},
that P_ω(Ω_∞) = 0.
Therefore P_ω((Ω₁ ∩ Ω₂) − Ω_∞) = 1.
For any ω ∈ (Ω₁ ∩ Ω₂) − Ω_∞ we have −∞ < Q(x_i,ω) < +∞
and, obviously, −∞ < Q(λx₁ + (1−λ)x₂, ω) < +∞.
Representing the optimal value via basic solutions shows that Q(x,ω) is piecewise
linear on
{x | x = λx₁ + (1−λ)x₂, λ ∈ [0,1]} and, due to Th. 5, convex.
2. The Fixed Recourse Case 47
Hence
Q(x_i,ω) = α_{v_i}(ω) + d'_{v_i}(ω)x_i
(see Th. 0.6), where α_{v_i}(ω) is a weighted sum of products of elements of q(ω) and
b(ω) and d_{v_i}(ω) consists of components which are weighted sums of elements of
q(ω) and A(ω). Hence, according to our square integrability assumption and
Schwarz's inequality, α_{v_i}(ω) and d_{v_i}(ω) are integrable. If 0 ≤ Q(x₂,ω) − Q(x₁,ω),
then from the convexity of Q(x,ω) in x we have (see Th. 0.11)
Q(x₂,ω) + (x₁ − x₂)' d_{v₂}(ω) ≤ Q(x₁,ω),
implying
|Q(x₂,ω) − Q(x₁,ω)| = α_{v₂}(ω) − α_{v₁}(ω) + d'_{v₂}(ω)x₂ − d'_{v₁}(ω)x₁
≤ d'_{v₂}(ω)(x₂ − x₁).
If Q(x₂,ω) − Q(x₁,ω) < 0, the analogous argument applies with the roles of x₁
and x₂ interchanged.
Since obviously Max_i |d_{v_i}(ω)|_∞ is integrable on (Ω₁ ∩ Ω₂) − Ω_∞, and the same is
true for
Max_v |d_v(ω)|_∞, where v varies over the number of all possible bases, we get
a bound proportional to ||x₂ − x₁||.
Since ∫_Ω Max_v |d_v(ω)|_∞ dP_ω is independent of x₁, x₂, this is the desired Lipschitz
condition. □
Corollary 11. If either
a) q(w)=q and the random elements ofb(w) and A(w) are integrable or
as Lipschitz constant. □
Theorem 12. Suppose one of the integrability assumptions of Th. 10 or Cor. 11 holds,
and Q(x) > −∞ on K. Suppose further that the probability measure P on the Euclidean
space ℝ^l spanned by the elements of A, b, q is absolutely continuous with respect to
the Lebesgue measure μ_l on ℝ^l (i.e. P has a density function); then Q(x) has a
continuous gradient on K.
Remark. We still assume (A,b,q) = (A(ω),b(ω),q(ω)) to be random in spite of the
fact that we have omitted the ω for simplicity. The following proof consists of
two parts. First we observe that Q(x,ω), x ∈ K, has a gradient a.e. and that its
partial difference quotients are bounded by an integrable function. From
Lebesgue's theorem the differentiability of Q(x) = ∫ Q(x,ω) dP_ω then follows. In
the second part we demonstrate continuity by using an explicit representation for
the gradient ∇Q(x).
Proof. According to our assumptions Q(x,ω) is finite with probability 1 for any
x ∈ K, and may therefore be represented via basic solutions as
Q(x,ω) = q̃'_i B_i^{-1} [b − Ax],
where B_i is an optimal feasible basis of W and q̃_i is the vector of those components
of q belonging to B_i.
Hence Q(x,ω) has a gradient (with respect to x) of the form
∇_x Q(x,ω) = −(q̃'_i B_i^{-1} A)'
for all ω ∈ Ω such that A, b, q do not belong to one of a finite number of sets of
the type
S_{ij} = {A,b,q | q̃'_i B_i^{-1}A ≠ q̃'_j B_j^{-1}A; q̃'_i B_i^{-1}[b − Ax] = q̃'_j B_j^{-1}[b − Ax]}; i ≠ j.
For every A, q define
Now either
q̃'_i B_i^{-1}A = q̃'_j B_j^{-1}A, implying E^{ij}_{A,q} = ∅ and
hence the Lebesgue measure μ(E^{ij}_{A,q}) = 0,
or q̃'_i B_i^{-1}A ≠ q̃'_j B_j^{-1}A.
Then
b ∈ E^{ij}_{A,q}
if
q̃'_i B_i^{-1}[b − Ax] = q̃'_j B_j^{-1}[b − Ax];
this means that E^{ij}_{A,q} is a hyperplane in the space spanned by the elements of b,
and hence the Lebesgue measure μ(E^{ij}_{A,q}) = 0.
From μ(E^{ij}_{A,q}) = 0 for every A, q it follows, by Th. 0.25, that μ_l(S_{ij}) = 0 and, since
P is absolutely continuous with respect to μ_l,
P(S_{ij}) = 0.
Therefore Q(x,ω) has for every x ∈ K a gradient of the form
∇_x Q(x,ω) = −(q̃'_i B_i^{-1} A)'
for all ω except a set of P_ω-measure zero.
From the proofs of Th. 10 and Cor. 11 we know that
∇Q(x) = Σ_i ∫_{𝔅_i(x)} −(q̃'_i B_i^{-1} A)' dP,
where
𝔅_i(x) = 𝔅̄_i(x) − ∪_{k=1}^{i−1} 𝔅̄_k(x).
Obviously we have to show that the symmetric difference of 𝔅_i(x+Δx) and 𝔅_i(x)
tends to a set of P-measure zero as Δx → 0. Looking at this symmetric difference,
we find
[(𝔅̄_i(x+Δx) − ∪_{k=1}^{i−1} 𝔅̄_k(x+Δx)) ∩ ∪_{k=1}^{i−1} 𝔅̄_k(x)]
∪ [(𝔅̄_i(x) − ∪_{k=1}^{i−1} 𝔅̄_k(x)) ∩ ∪_{k=1}^{i−1} 𝔅̄_k(x+Δx)]
⊂ [𝔅̄_i(x+Δx) − 𝔅̄_i(x)]
∪ [∪_{k=1}^{i−1} (𝔅̄_k(x) − 𝔅̄_k(x+Δx))]
∪ [𝔅̄_i(x) − 𝔅̄_i(x+Δx)].
From
(A,b,q) ∈ 𝔅̄_i(x+Δx) − 𝔅̄_i(x) it follows that
B_i^{-1}(b − A(x+Δx)) ≥ 0 whereas B_i^{-1}(b − Ax) ≱ 0,
so that some component of B_i^{-1}(b − Ax) is negative but bounded in modulus by
the corresponding component of B_i^{-1}AΔx.
Hence, as Δx → 0, a point can remain in these difference sets only if
[B_i^{-1}(b − Ax)]_v = 0 for some v, i.e. only if
(A, b) is an element of one of finitely many hyperplanes in the (A, b)-space. Now it
is obvious that
caused by the fact that from r(W) = m the uniqueness of the solution of
Wy=i,y=( -1, ... , -1)' follows.
In what follows we try to characterize complete recourse matrices.
Lemma 13. If W is a complete recourse matrix with m+1 columns, then every m
columns of W are linearly independent.
Proof. Since r(W) = m, let W₁,...,W_m be linearly independent.
Suppose now that W₁,...,W_{m−1}, W_{m+1} are linearly dependent; then there exist α_i
such that
W_{m+1} = Σ_{i=1}^{m−1} α_i W_i.
Since −W_m ∈ ℝᵐ and W is a complete recourse matrix, there exist β_i ≥ 0 such that
−W_m = Σ_{i=1}^{m+1} β_i W_i
= Σ_{i=1}^{m} β_i W_i + β_{m+1} Σ_{i=1}^{m−1} α_i W_i
= Σ_{i=1}^{m−1} (β_i + β_{m+1} α_i) W_i + β_m W_m
and hence
Σ_{i=1}^{m−1} (β_i + β_{m+1} α_i) W_i + (1 + β_m) W_m = 0,
contradicting the linear independence of W₁,...,W_m, since β_m ≥ 0 and therefore
at least 1 + β_m > 0. □
If we assume the linear independence of W₁,...,W_m, which is justified by
r(W) = m, we may state
Theorem 14. Let W have m+n columns (n ≥ 1). W is a complete recourse matrix
if and only if
𝔇 = {y | Wy = 0, y ≥ 0; y_i > 0, i = 1,...,m} ≠ ∅.
Proof. The necessity of the condition may be shown as follows:
Let
z = Σ_{i=1}^m β_i W_i with β_i < 0, i = 1,...,m.
Since W is a complete recourse matrix, there exist δ_i such that
z = Σ_{i=1}^{m+n} δ_i W_i, where δ_i ≥ 0, i = 1,...,m+n.
Therefore
Σ_{i=1}^m β_i W_i = Σ_{i=1}^{m+n} δ_i W_i,
3. Complete Fixed Recourse 53
implying
Σ_{i=1}^m (δ_i − β_i) W_i + Σ_{i=m+1}^{m+n} δ_i W_i = 0,
where
δ_i − β_i > 0, i = 1,...,m; but δ_i ≥ 0, i = m+1,...,m+n;
consequently 𝔇 ≠ ∅.
Suppose now that 𝔇 ≠ ∅. Then there exists a ȳ with Wȳ = 0, ȳ ≥ 0 and ȳ_i > 0,
i = 1,...,m. For an arbitrary z ∈ ℝᵐ there is a unique representation
z = Σ_{i=1}^m β_i W_i,
since W₁,...,W_m are linearly independent. If β_i ≥ 0, i = 1,...,m, we have no further
problem. Suppose, therefore, that, for at least one index i, β_i < 0. Without loss of
generality we may assume that
β_m/α_m = max_{1≤i≤m} β_i/α_i,
which has to be strictly positive, since α_i < 0, i = 1,...,m, and β_i < 0 for at least one
index i.
From the linear independence of W₁,...,W_m and α_i < 0, i = 1,...,m, follows the linear
independence of W₁,...,W_{m−1}, W_{m+ī+1}.
Hence there is also a unique solution of
z = Σ_{i=1}^{m−1} γ_i W_i + γ_{m+ī+1} W_{m+ī+1}.
Using
W_{m+ī+1} = Σ_{i=1}^m α_i W_i,
this implies
z = Σ_{i=1}^{m−1} (γ_i + γ_{m+ī+1} α_i) W_i + γ_{m+ī+1} α_m W_m
= Σ_{i=1}^m β_i W_i
54 Chapter III. Two Stage Problems
and hence
γ_{m+ī+1} = β_m/α_m > 0
and
γ_i ≥ 0, i = 1,...,m−1,
since
0 < β_m/α_m = max_{1≤i≤m} β_i/α_i ≥ β_j/α_j for every j = 1,...,m, and
α_j < 0, j = 1,...,m.
Hence
z = Σ_{i=1}^{m−1} γ_i W_i + γ_{m+ī+1} Σ_{i=m+1}^{m+n} δ_i W_i,
a representation of z with nonnegative coefficients. □
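Theorem 14 turns the completeness question into an LP feasibility test: by positive homogeneity, the condition y_i > 0 for i = 1,...,m may be replaced by y_i ≥ 1. The sketch below assumes, as in the theorem, that the first m columns are the linearly independent ones; data and solver are illustrative.

```python
# Sketch of the criterion in Th. 14: W is a complete recourse matrix iff
# there is a y with W y = 0, y >= 0 and y_i > 0 for the first m columns.
import numpy as np
from scipy.optimize import linprog

def is_complete_recourse(W):
    m, cols = W.shape
    lo = [1.0] * m + [0.0] * (cols - m)   # y_i >= 1 for i < m, y_i >= 0 otherwise
    res = linprog(c=[0.0] * cols, A_eq=W, b_eq=np.zeros(m),
                  bounds=[(l, None) for l in lo])
    return res.status == 0                 # 0 = feasible point found

W_simple = np.hstack([np.eye(2), -np.eye(2)])   # simple recourse W = (I, -I)
W_bad = np.array([[1.0, 0.0, 1.0],
                  [0.0, 1.0, 1.0]])             # columns span only a quadrant
print(is_complete_recourse(W_simple), is_complete_recourse(W_bad))
```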
Q(x,ω) = inf{q'(ω)y | Wy = b(ω) − A(ω)x, y ≥ 0}
{z | W'z ≤ q} ≠ ∅.
For almost every ω ∈ Ω there exists a unique z(ω) such that
W'_i z(ω) = q_i(ω), i = 1,...,m, which implies
δ_{m+1} W'_{m+1} z(ω) = Σ_{j=1}^m α_j W'_j z(ω) = Σ_{j=1}^m α_j q_j(ω) ≤ δ_{m+1} q_{m+1}(ω).
Hence, for almost every ω ∈ Ω, z(ω) is a feasible solution of W'z ≤ q(ω), since
δ_{m+1} > 0. Now the desired result follows from Th. 15. □
However, the condition given in Cor. 17 is not sufficient in general for the
finiteness of Q(x) if W has more than m+1 columns, as is shown by the following
example:
W = ( 1   1  −1  −1 )
    (−1   1   2  −2 ).
56 Chapter III. Two Stage Problems
W₃ + W₄ = −W₁ − W₂,
and hence α₁ = α₂ = −1,
δ₃ = δ₄ = 1.
Let q(ω) = q, given by q₁ = q₂ = q₃ = 1, q₄ = −2, which satisfies
α₁q₁ + α₂q₂ = −2 ≤ δ₃q₃ + δ₄q₄ = −1.
Here W'z ≤ q
is equivalent to
z₁ − z₂ ≤ 1
z₁ + z₂ ≤ 1
−z₁ + 2z₂ ≤ 1
−z₁ − 2z₂ ≤ −2.
Summing up the last two inequalities yields
z₁ ≥ 1/2;
if we add twice the second inequality and the fourth inequality, we get
z₁ ≤ 0.
Hence {z | W'z ≤ q} = ∅, which implies, by Cor. 16, that Q(x) = −∞.
4. Simple Recourse
Q(x,ω) = inf [q⁺'(ω)y⁺ + q⁻'(ω)y⁻]
subject to y⁺ − y⁻ = b(ω) − A(ω)x,
y⁺ ≥ 0,
y⁻ ≥ 0; y⁺, y⁻ ∈ ℝᵐ.
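For this simple recourse structure the second stage decouples coordinatewise: with d = b(ω) − A(ω)x and q⁺ + q⁻ ≥ 0, the optimal choice is y_i⁺ = max(0, d_i), y_i⁻ = max(0, −d_i). The sketch below checks this closed form against a direct LP solution on made-up data.

```python
# Sketch: Q(x, omega) = sum_i [ q_i+ max(0, d_i) + q_i- max(0, -d_i) ]
# for simple recourse, cross-checked against the LP with W = (I, -I).
import numpy as np
from scipy.optimize import linprog

def q_closed_form(q_plus, q_minus, d):
    return float(np.sum(q_plus * np.maximum(d, 0) + q_minus * np.maximum(-d, 0)))

def q_by_lp(q_plus, q_minus, d):
    m = len(d)
    W = np.hstack([np.eye(m), -np.eye(m)])          # y+ - y- = d
    res = linprog(c=np.concatenate([q_plus, q_minus]),
                  A_eq=W, b_eq=d, bounds=[(0, None)] * (2 * m))
    return res.fun

q_plus = np.array([2.0, 1.0]); q_minus = np.array([0.5, 3.0])   # q+ + q- >= 0 holds
d = np.array([1.5, -2.0])     # d = b(omega) - A(omega) x, made up
print(q_closed_form(q_plus, q_minus, d), q_by_lp(q_plus, q_minus, d))
```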
Corollary 18. Given simple recourse and one of the integrability conditions of
Th. 15, Q(x) is finite if and only if q⁺(ω) + q⁻(ω) ≥ 0 with probability 1.
Proof. By Th. 15, Q(x) is finite if and only if {z | W'z ≤ q(ω)} ≠ ∅ with probability 1,
i.e. if and only if {z | −q⁻(ω) ≤ z ≤ q⁺(ω)} ≠ ∅ with probability 1. This yields the
proposition of the corollary. □
The simple recourse model has been studied for various applications all of
which have in common that they can be understood as production or allocation
problems where only the demand is stochastic. In this case it turns out that we
get Q(x), or some equivalent, in a rather explicit form which allows more insight
into the structure of the problem than convexity and differentiability do.
Hence we assume that
q⁺(ω) = q⁺, q⁻(ω) = q⁻ and A(ω) = A;
i.e. only b(ω) is random. According to Cor. 18 we assume that
Using the definitions q̂ = q⁺ + q⁻ and b̄_i = ∫ b_i(ω) dP_ω, Th. 19 yields
Q_i(χ_i) = q_i⁺ ∫_{b_i(ω)>χ_i} (b_i(ω) − χ_i) dP_ω − q_i⁻ ∫_{b_i(ω)≤χ_i} (b_i(ω) − χ_i) dP_ω
= q_i⁺ b̄_i − q_i⁺ χ_i − q̂_i ∫_{b_i(ω)≤χ_i} (b_i(ω) − χ_i) dP_ω,
which is convex in χ_i. □
Suppose now that there exist α_i, β_i such that
α_i ≤ b_i(ω) ≤ β_i for all ω ∈ Ω.
Then from
Q_i(χ_i) = q_i⁺ b̄_i − q_i⁺ χ_i − q̂_i ∫_{b_i(ω)≤χ_i} (b_i(ω) − χ_i) dP_ω
we know that
Q_i(χ_i) = q_i⁺ (b̄_i − χ_i) for χ_i ≤ α_i
and
Q_i(χ_i) = q_i⁻ (χ_i − b̄_i) for χ_i ≥ β_i.
Thus it seems desirable to separate the nonlinear and linear terms by constructing
a new objective function which yields the same solution set as
Σ_{i=1}^m Q_i(χ_i).
This may be done by introducing the variables χ_{i1}, χ_{i2}, χ_{i3} and the following
constraints:
−χ_{i1} + χ_{i2} + χ_{i3} = χ_i − b̄_i
χ_{i1} ≤ b̄_i − α_i
χ_{i2} ≤ β_i − α_i
χ_{i2} ≥ 0
χ_{i3} ≥ 0
(χ_{i1} ≥ 0 follows from b̄_i ≥ α_i).
Let 𝔛_i(χ_i) be the set of all feasible (χ_{i1}, χ_{i2}, χ_{i3}).
If
ψ_i(χ_i) = ∫_{b_i(ω)≤χ_i} (χ_i − b_i(ω)) dP_ω
and
φ_i(χ_{i1}, χ_{i2}, χ_{i3}) = χ_{i3} + ∫_{b_i(ω)≤α_i+χ_{i2}} (χ_{i2} + α_i − b_i(ω)) dP_ω,
then, since χ_i = b̄_i − χ_{i1} + χ_{i2} + χ_{i3},
ψ_i(χ_i) = ∫_{b_i(ω)≤χ_i} (b̄_i − χ_{i1} + χ_{i2} + χ_{i3} − b_i(ω)) dP_ω
and
ψ_i(χ_i) ≤ χ_{i3} + ∫_{b_i(ω)≤χ_i} (χ_{i2} + α_i − b_i(ω)) dP_ω.
Where b_i(ω) is integrable and bounded below by α_i, but not essentially
bounded above, we have
Corollary 22. ...
Proof. The proof follows immediately from that of Th. 21 by setting χ_{i3} = 0 and
β_i = +∞. □
Now the problem
Min_{x∈X} {Q(x) + c'x}
may be restated with the separated objective as a problem
subject to χ − Ax = 0,
x ∈ X.
In case that β_i = +∞, we set, as in Cor. 22, χ_{i3} = 0 and omit the constraint
χ_{i2} ≤ β_i − α_i.
It seems worthwhile mentioning that this representation of the problem implies
that, contrary to the general complete recourse case, for the simple recourse model,
where q⁺, q⁻ and A are constant, only the probability distribution of every
b_i(ω) has to be known, but not their joint distribution. This also means that it does
not matter whether the random variables b_i(ω) are stochastically independent
or not.
To illustrate the above result let us give some examples. First suppose that the
random variables b_i(ω) have finite discrete probability distributions, i.e.
P_ω({ω | b_i(ω) = b_{il}}) = p_{il}, l = 1,...,v,
where
b_{i1} < b_{i2} < ... < b_{iv}
and
F_{iv} = Σ_{l=1}^v p_{il}.
If we choose
Δ_{il} = b_{i,l+1} − b_{il}, 1 ≤ l ≤ v−1,
then
∫_{b_i(ω)≤α_i+χ_{i2}} (χ_{i2} + α_i − b_i(ω)) dP_ω
becomes a piecewise linear function of χ_{i2}, with breakpoints at the b_{il} and slopes
given by the F_{il}, so that the separated problem is a linear program.
Next, we suppose that the random variables b_i(ω) are uniformly distributed on
[α_i, β_i], i.e. the distribution is determined by the density function
f_i(τ) = 1/(β_i − α_i) for α_i ≤ τ ≤ β_i (α_i < β_i),
f_i(τ) = 0 otherwise.
Then we get, since
0 ≤ χ_{i2} ≤ β_i − α_i,
Hence, as was first pointed out by Beale [2], we have to solve a convex
quadratic program with the constraints
0 ≤ χ_{i2} ≤ β_i − α_i,
χ_{i3} ≥ 0,
x ∈ X.
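Beale's observation rests on the fact that for b_i uniform on [α, β] the expected shortage and surplus are quadratic: E max(0, b − χ) = (β − χ)²/(2(β − α)) and E max(0, χ − b) = (χ − α)²/(2(β − α)) for α ≤ χ ≤ β. A Monte-Carlo spot check with illustrative parameters:

```python
# Sketch: cross-check the quadratic closed forms for b ~ Uniform(alpha, beta)
# against sample averages.
import random

random.seed(1)
alpha, beta, chi = 1.0, 3.0, 1.8
n = 200_000
bs = [random.uniform(alpha, beta) for _ in range(n)]
mc_short = sum(max(0.0, b - chi) for b in bs) / n
mc_surplus = sum(max(0.0, chi - b) for b in bs) / n
ex_short = (beta - chi) ** 2 / (2 * (beta - alpha))
ex_surplus = (chi - alpha) ** 2 / (2 * (beta - alpha))
print(mc_short, ex_short, mc_surplus, ex_surplus)
```

Since both expectations are quadratic in χ, the separated objective is quadratic and the whole problem a QP.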
Finally, let us assume that the random variables b_i(ω) are exponentially
distributed with density functions
f_i(τ) = λ_i e^{−λ_i τ} for τ ≥ 0,
f_i(τ) = 0 otherwise,
where λ_i > 0, i.e. α_i = 0, β_i = +∞.
Then we have
∫_0^{χ_{i2}} (χ_{i2} − τ) λ_i e^{−λ_i τ} dτ = χ_{i2} − (1 − e^{−λ_i χ_{i2}})/λ_i
and hence the problem
Min { Σ_{i=1}^m { q_i⁺ χ_{i1} + q_i⁻ χ_{i2} + (q̂_i/λ_i) e^{−λ_i χ_{i2}} − q̂_i/λ_i } + c'x }
subject to b̄_i − χ_{i1} + χ_{i2} − A_i x = 0,
b̄_i = 1/λ_i,
χ_{i2} ≥ 0,
x ∈ X.
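The exponential integral used above has the closed form E max(0, χ − b) = χ − (1 − e^{−λχ})/λ for χ ≥ 0, which is the source of the e^{−λχ} term in the objective. A Monte-Carlo cross-check with illustrative parameters:

```python
# Sketch: verify chi - (1 - exp(-lambda*chi))/lambda = E max(0, chi - b)
# for b ~ Exp(lambda), by sampling.
import math
import random

random.seed(7)
lam, chi = 0.5, 1.2
n = 200_000
mc = sum(max(0.0, chi - random.expovariate(lam)) for _ in range(n)) / n
exact = chi - (1 - math.exp(-lam * chi)) / lam
print(mc, exact)
```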
Using the Taylor series expansion of e^{−λ_i χ_{i2}} up to the quadratic term, we obtain
the approximation
Q̃(χ.₁, χ.₂, x) = Σ_{i=1}^m { q_i⁺ χ_{i1} − q_i⁺ χ_{i2} + (q̂_i λ_i/2) χ_{i2}² } + c'x.
With
Q(χ.₁, χ.₂, x) = Σ_{i=1}^m { q_i⁺ χ_{i1} − q_i⁺ χ_{i2} + q̂_i χ_{i2} + (q̂_i/λ_i)(e^{−λ_i χ_{i2}} − 1) } + c'x
we have
Q(χ.₁, χ.₂, x) − Q̃(χ.₁, χ.₂, x) = Σ_{i=1}^m Δ_i(χ_{i2}).
From
Δ_i(0) = 0,
Δ_i'(χ_{i2}) = q̂_i [1 − e^{−λ_i χ_{i2}} − λ_i χ_{i2}] ⟹ Δ_i'(0) = 0, and
Δ_i''(χ_{i2}) = q̂_i [λ_i e^{−λ_i χ_{i2}} − λ_i] ≤ 0 for χ_{i2} ≥ 0 (λ_i > 0, q̂_i ≥ 0),
it follows that Δ_i(χ_{i2}) ≤ 0 and hence
Δ(χ.₂) ≤ 0 for χ.₂ ≥ 0, i.e.
Q(χ.₁, χ.₂, x) ≤ Q̃(χ.₁, χ.₂, x).
On the other hand, it follows from
q̂_i [χ_{i2} + (1/λ_i)(e^{−λ_i χ_{i2}} − 1)] ≥ 0 for χ_{i2} ≥ 0
that
Q(χ.₁, χ.₂, x) ≥ Σ_{i=1}^m {q_i⁺ χ_{i1} − q_i⁺ χ_{i2}} + c'x = −q⁺'Ax + q⁺'b̄ + c'x = L(x).
Therefore, if x*, x** and x̄ are minimal feasible points with respect to Q, Q̃ and
L, we know from Th. 21 that
Q(χ.₁*, χ.₂*, x*) ≤ Q(χ.₁**, χ.₂**, x**)
= Σ_{i=1}^m {q_i⁺ b̄_i − q_i⁺ A_i x**} +
Σ_{i: A_i x** > 0} q̂_i [A_i x** + (1/λ_i)(e^{−λ_i A_i x**} − 1)] + c'x**
≤ Q̃(χ.₁**, χ.₂**, x**).
It is obvious that the bounds Q̃(χ.₁**, χ.₂**, x**) and L(x̄), which are determined
by solving a quadratic and a linear program, depend essentially on the data q⁺,
q⁻, b̄, A and the feasible set X.
5. Computational Remarks
From the theory developed so far it seems rather difficult to get a numerical
solution of a general two-stage program with some arbitrarily given joint probability
distribution. Take for example a complete fixed recourse problem, the distribution
of which is given by a density function. In this case we have to minimize a
continuously differentiable convex objective function Q(x) subject to x ∈ X. If X is a
bounded convex polyhedral set, this problem can theoretically be solved by the
following special method of feasible directions:
Given x_k ∈ X, solve the linear program
Min x'∇Q(x_k) subject to x ∈ X.
If x_k solves this linear program, then x_k solves the original problem Min Q(x)
subject to x ∈ X. Otherwise let y^k be a solution of the linear program. Then solve
the one-dimensional problem
Min Q(λx_k + (1−λ)y^k) subject to 0 ≤ λ ≤ 1,
yielding λ_k. Now restart the procedure with
x_{k+1} = λ_k x_k + (1−λ_k) y^k.
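The scheme just described can be sketched on a cheap stand-in objective. The quadratic f below replaces the expensive Q(x), and the box X, grid line search and LP solver are illustrative assumptions, not the book's prescriptions.

```python
# Sketch of the feasible-direction scheme: linearize, solve an LP over X,
# line-search on the segment, repeat.
import numpy as np
from scipy.optimize import linprog

def grad(x):                       # stand-in objective f(x) = ||x - (1, 2)||^2
    return 2.0 * (x - np.array([1.0, 2.0]))

def f(x):
    return float(np.sum((x - np.array([1.0, 2.0])) ** 2))

bounds = [(0.0, 1.5), (0.0, 1.5)]  # X: a box; the minimizer over X is (1, 1.5)
x = np.zeros(2)
for _ in range(200):
    y = linprog(c=grad(x), bounds=bounds).x    # min x' grad over X
    lams = np.linspace(0.0, 1.0, 201)          # crude line search on [0, 1]
    vals = [f(l * x + (1 - l) * y) for l in lams]
    lam = lams[int(np.argmin(vals))]
    x = lam * x + (1 - lam) * y
print(x)
```

With the true Q(x), every `grad` call would cost the multiple integrals discussed next, which is exactly the objection raised below.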
It is well known that this method converges to a solution of the original problem
Min Q(x) subject to x ∈ X. However, this procedure involves the repeated
evaluation of Q(x) and ∇Q(x), which, as we know from the proof of Th. 12, are
given by sums of multiple integrals over sets 𝔅_i(x), which are polyhedral and
depend on x. This type of numerical integration seems not to be completely
investigated in numerical analysis; one can only be sure that the amount of work
for evaluating these integrals is tremendous. Therefore, it does not seem to be
reasonable to apply the above procedure. For an alternative approach we may
get hints from the examples in Section 4. There we have seen that in the simple
recourse case, where only b is random, a finite discrete distribution of b leads to a
linear program and a uniform distribution of the b_i's yields a quadratic program.
Finally we gave an a posteriori error estimate for approximating the nonlinear
program resulting from exponential distributions by a special quadratic program.
From these examples it seems natural to try the following approach: approximate
the given two-stage problem by a special optimization problem which may be
handled more easily, e.g. by a linear or quadratic program. Then the only problem
consists in finding reasonable error estimates.
Suppose for example that the given two-stage problem is of the simple recourse
type, where only b is random and the distribution of b_i is given by the
distribution function F_i(τ) (F_i(α_i) = 0, F_i(β_i) = 1). According to the last section the
objective function of the problem is
Q(χ.₁, χ.₂, χ.₃, x) =
Σ_{i=1}^m {q_i⁺ χ_{i1} − q_i⁺ χ_{i2} + q̂_i χ_{i3} + q̂_i ∫_{α_i}^{α_i+χ_{i2}} (χ_{i2} + α_i − τ) dF_i(τ)} + c'x.
this is the desired error estimate, which obviously also remains valid for the
optimal values of Q̃ and Q.
If, in the same simple recourse model, F_i(τ) has a continuous density f_i(τ), we
may try another approximation by replacing f_i(τ) by a piecewise constant density
function f̃_i(τ) such that
|f_i(τ) − f̃_i(τ)| ≤ ε for all τ ∈ [α_i, β_i].
Then
and hence
|Q − Q̃| ≤ ε Σ_{i=1}^m q̂_i (β_i − α_i)²/2.
From the last section we know that for constant densities f_i(τ) we get quadratic
programs. It is now obvious that piecewise constant densities again yield
quadratic programs.
It is also evident for the general two-stage problem that a finite discrete joint
probability distribution yields a linear program. Suppose that we have the
general two-stage problem
min_{x∈X} {c'x + Q(x)},
where
Q(x) = ∫ Q(x,ω) dP_ω, c = ∫ c(ω) dP_ω
and
Q(x,ω) = min{q'(ω)y | W(ω)y = b(ω) − A(ω)x, y ≥ 0}.
If the elements ω_i ∈ Ω, i = 1,...,r, have the probabilities p_i (p_i ≥ 0, Σ_{i=1}^r p_i = 1),
then it is easily seen that the two-stage problem min_{x∈X} {c'x + Q(x)} may be
rewritten as the linear program
min {c'x + Σ_{i=1}^r p_i q'(ω_i) y_i}
subject to A(ω_i)x + W(ω_i)y_i = b(ω_i), y_i ≥ 0, i = 1,...,r, x ∈ X.
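For a tiny instance this deterministic equivalent can be written down and solved directly. Below: one first-stage variable, two equally likely scenarios, simple recourse W = (1, −1); all numbers are made up for illustration.

```python
# Sketch: deterministic equivalent LP of a two-stage problem with a finite
# discrete distribution:
#   min_x c*x + sum_i p_i * q' y_i   s.t.  x + y_i+ - y_i- = b_i,  y_i >= 0, x >= 0.
from scipy.optimize import linprog

p = [0.5, 0.5]
b = [1.0, 3.0]                 # b(omega_1), b(omega_2)
c, q_plus, q_minus = 1.0, 2.0, 0.1
# variables: [x, y1+, y1-, y2+, y2-]
cost = [c, p[0] * q_plus, p[0] * q_minus, p[1] * q_plus, p[1] * q_minus]
A_eq = [[1.0, 1.0, -1.0, 0.0, 0.0],    # x + y1+ - y1- = b1
        [1.0, 0.0, 0.0, 1.0, -1.0]]    # x + y2+ - y2- = b2
res = linprog(c=cost, A_eq=A_eq, b_eq=b, bounds=[(0, None)] * 5)
print(res.x[0], res.fun)
```

The size grows linearly in the number of scenarios r, which is why coarse discretizations plus error estimates, as discussed next, are attractive.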
Let A_ν(ω), b_ν(ω), q_ν(ω) be arrays of the same dimension as A(ω), b(ω), q(ω),
but with simple functions as elements, and let the corresponding objective
functions be Q_ν(x,ω) and Q_ν(x) = ∫ Q_ν(x,ω) dP_ω. Obviously the determination of the
simple functions defines a discrete distribution on Ω. We must require that A.2) is
also satisfied for q_ν(ω) (at least almost surely). For this purpose we have to be
careful. If for example
W = (1, −1)
and q(ω) has the range R(q) = {(ξ,η) | ξ ≥ −2.5, η ≥ 2.5}, then A.1) and A.2) are
satisfied for the original problem. Now let M = {(ξ,η) | −4 < ξ ≤ −2, 1 ≤ η < 3} be
an interval of some partition. Then q⁻¹[M] ≠ ∅, so that M could have
positive probability. Choosing on M the norm-minimal vertex v = (−2, 1)
as value of q_ν(ω) does not satisfy A.2), since W'u ≤ q_ν(ω) yields −1 ≤ u ≤ −2,
which is impossible.
But if we choose the norm-minimal element of the intersection of M and
K_W = {(ξ, η) | ∃u: −η ≤ u ≤ ξ}
= {(ξ, η) | ξ + η ≥ 0},
i.e. q_ν(ω) = (−2, +2), then A.2) is satisfied. In general, the analogous procedure
(choosing the norm-minimal element of the intersection of every interval and
K_W) yields a sequence satisfying A.2) too.
Let therefore (A_ν(ω), b_ν(ω), q_ν(ω)) be an integrable simple function such that
A.2) is satisfied. We want an error estimate for the objective function, and
hence for the optimal value, of the approximating problem min {c'x + Q_ν(x) | x ∈ X},
which depends on the approximation of (A(ω), b(ω), q(ω)) by (A_ν(ω), b_ν(ω),
q_ν(ω)), measured by the (generalized) L₂-norm. For any vector-valued function g
we define
ρ(g) = {∫_Ω ||g(ω)||² dP_ω}^{1/2},
where ||·|| is the Euclidean norm on ℝᵏ. In this connection ρ(A) means that the
matrix A(ω) is handled as an (m·n)-vector.
General error estimate. There are constants α, γ, δ_ν such that
|Q(x) − Q_ν(x)| ≤ [α + γ ||x||] ρ(q − q_ν) + δ_ν [ρ(b − b_ν) + ||x|| ρ(A − A_ν)].
where ∇φ is the gradient (or some subgradient) of φ. Using basic solutions, from
the former results we get bounds in terms of
{B_i | i ∈ J₁} and {B_i | i ∈ J₂}, those bases out of W which for some x ∈ X
and ω ∈ Ω are primal feasible and dual feasible, respectively.
With
z(x,ω) = 0 if b(ω) − A(ω)x = 0, z(x,ω) = (b(ω) − A(ω)x)/||b(ω) − A(ω)x|| else,
and
r_i(ω) = 0 if q_ν(ω) = 0, r_i(ω) = q_ν(ω)/||q_ν(ω)|| else,
Schwarz's inequality yields bounds of the form
≤ Max_{i∈J₁, ω∈Ω} ||B_i^{-1} z(x,ω)|| · ρ(b − Ax) · ρ(q − q_ν),
and it follows that
|Q(x) − Q_ν(x)| ≤ [α + γ ||x||] ρ(q − q_ν) + δ_ν [ρ(b − b_ν) + ||x|| ρ(A − A_ν)],
which was to be shown.
It must be mentioned that determining the constants α and γ leads in general
to a considerable amount of work, since it implies more or less the inversion of
all nonsingular (m × m)-submatrices of W. This difficulty diminishes rapidly
for certain special cases. Determining δ_ν (or at least an upper bound) is not
difficult, since B_i^{-1}'r_i(ω) is a feasible u-part of the set {(u,q) | W'u ≤ q, ||q|| ≤ 1},
which is bounded according to A.1).
Assume simple recourse, i.e. W = (I, −I). Then
|Q(x) − Q_ν(x)| ≤ [ρ(b) + ||x|| ρ(A)] ρ(q − q_ν) + ρ(q_ν) [ρ(b − b_ν) + ||x|| ρ(A − A_ν)].
Namely, W = (I, −I) implies that for every basis B_i out of W, ||B_i^{-1} z(x,ω)|| =
||z(x,ω)|| and, by definition, ||z(x,ω)|| ≤ 1. And for every i ∈ J₂, B_i^{-1}'r_i(ω)
is a feasible u-part of {(u,q) | W'u ≤ q, ||q|| ≤ 1}.
Obviously every feasible u satisfies ||u|| ≤ 1. Using these observations, we get for
the constants defined in the proof of the general error formula
α ≤ ρ(b),
γ ≤ ρ(A),
δ_ν ≤ ρ(q_ν).
For the general complete recourse problem we get out of the difficulties of
determining α and γ if q(ω) is constant, as is readily seen above. Moreover we can
weaken assumption A.3) to integrability instead of square integrability.
Defining a generalized L₁-norm for vector-valued functions as
μ(g) = ∫_Ω ||g(ω)|| dP_ω,
we have
|Q(x,ω) − Q_ν(x,ω)| ≤ Max_{i∈J₁} ||B_i^{-1}'q̃_i|| · ||b(ω) − b_ν(ω) − [A(ω) − A_ν(ω)]x||
and therefore ...
then the discretization can be carried through with respect to the random
vector t, yielding problems of a size which can be handled today.
It is obvious, then, that all the above-mentioned error estimates can be expressed
in terms of ρ(t − t^{(ν)}) or μ(t − t^{(ν)}), where t^{(ν)} is a simple function.
Q(x,ω) = min{q'y | Wy = b(ω) − A(ω)x, y ≥ 0}
Definition. Let K ⊂ ℝᵐ be convex and such that 0 ∈ ℝᵐ is an interior point of K.
Then ρ_K(z) = inf{λ > 0 | z/λ ∈ K} is called the Minkowski functional of K (on ℝᵐ).
Theorem 24. Under the assumptions of Lemma 23, K = {z | m(z) ≤ 1} is a convex
polyhedral set which has 0 ∈ ℝᵐ as an interior point.
Proof. From the convexity theorem (Th. 0.7) we know that
1) m(z) is piecewise linear and convex on ℝᵐ and hence
2) continuous on ℝᵐ (Th. II.12).
Therefore K is convex polyhedral, and {z | m(z) < 1} ⊂ K is open and contains
0 ∈ ℝᵐ due to Lemma 23 b). □
Theorem 25. m(z) is the Minkowski functional of the set K defined in Th. 24.
Proof. The Minkowski functional of K is defined as
ρ_K(z) = inf{λ > 0 | z/λ ∈ K}
= inf{λ > 0 | m(z/λ) ≤ 1} due to Th. 24
= inf{λ > 0 | m(z) ≤ λ} due to Lemma 23 c)
= m(z). □
Σ_{i=1}^r α_i = 1.
6. Another Approach to Two Stage Programming 73
and
we have
Wy = z,
which implies that W is a complete recourse matrix.
Furthermore we have
Σ_{i=1}^r y_i = λ Σ_{i=1}^r α_i = λ.
Hence
ρ_K(z) = inf{λ > 0 | z/λ ∈ K}
= inf{λ > 0 | λv = z, v ∈ K}
= inf{Σ_{i=1}^r y_i | Wy = z, y ≥ 0}.
The functional
ρ(z) = inf{q'y | Wy = z, y ≥ 0}
is, in general, not a Minkowski functional, since ρ(z) may be negative. But ρ(z)
has still the following properties:
Lemma 27. a) ρ(λz) = λρ(z) for all z ∈ ℝᵐ, λ > 0;
b) ρ(z₁ + z₂) ≤ ρ(z₁) + ρ(z₂);
c) ρ(z) is continuous.
Proof. a), b) are proved in the same way as in Lemma 23;
c) follows from the complete recourse and finiteness assumptions and the
convexity of ρ(z). □
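The functional ρ(z) is computable by a single LP per point. The sketch below evaluates it for the complete recourse matrix W = (I, −I) with made-up costs and spot-checks properties a) and b) of Lemma 27.

```python
# Sketch: rho(z) = inf{ q'y | W y = z, y >= 0 } for W = (I, -I), plus a
# numerical check of positive homogeneity and subadditivity.
import numpy as np
from scipy.optimize import linprog

W = np.hstack([np.eye(2), -np.eye(2)])
q = np.array([1.0, 2.0, 0.5, 0.25])        # (q+, q-) with q+ + q- >= 0

def rho(z):
    res = linprog(c=q, A_eq=W, b_eq=z, bounds=[(0, None)] * 4)
    return res.fun

z1, z2 = np.array([1.0, -1.0]), np.array([-2.0, 3.0])
assert abs(rho(2 * z1) - 2 * rho(z1)) < 1e-6          # a) positive homogeneity
assert rho(z1 + z2) <= rho(z1) + rho(z2) + 1e-6       # b) subadditivity
print(rho(z1), rho(z2))
```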
Lemma 28. There exist vectors g_i ∈ ℝᵐ, i = 1,...,r, such that
ρ(z) = max_i {g_i'z}
and
z'g_i ≤ ρ(z), i = 1,...,r, where equality holds for at least one g_i. □
According to Lemma 28 we may rewrite Q(x) as
Q(x) = ∫ ρ(b(ω) − A(ω)x) dP_ω = ∫ max_{1≤i≤r} {g_i'(b(ω) − A(ω)x)} dP_ω.
From this representation we may conclude an error estimate for the discretization
mentioned in Section 5, at least for the case when (A,b) has a finite probability
distribution and X is bounded.
Suppose that 𝔄 is an interval in the (A,b)-space with P(𝔄) = 1, and is partitioned
into intervals 𝔄_j (i.e. 𝔄_i ∩ 𝔄_j = ∅ for i ≠ j and ∪_{j=1}^s 𝔄_j = 𝔄) such that, for
d₁, d₂ ∈ 𝔄_j,
||d₁ − d₂|| ≤ δ.
Let p_j = P_ω({ω | (A(ω), b(ω)) ∈ 𝔄_j}).
For some x ∈ X let (A_{jk}, b_{jk}) ∈ 𝔄_j, k = 1, 2, be such that
ρ(b_{j1} − A_{j1}x) ≤ ρ(b − Ax) ≤ ρ(b_{j2} − A_{j2}x) for all (A,b) ∈ 𝔄_j.
Then
Σ_{j=1}^s p_j ρ(b_{j1} − A_{j1}x) ≤ Q(x) = ∫ ρ(b(ω) − A(ω)x) dP_ω ≤ Σ_{j=1}^s p_j ρ(b_{j2} − A_{j2}x).
which yields a linear program, as shown in Sec. 5; we then get the error estimate
|Q(x) − Q̃(x)| ≤ Σ_{j=1}^s p_j [ρ(b_{j2} − A_{j2}x) − ρ(b_{j1} − A_{j1}x)]
≤ Σ_{j=1}^s p_j [max_i ||g_i|| {δ + δ||x||}],
is not at all trivial if one does not want to determine all inverses of bases in W.
For the special case where q > 0 and c'x ≥ 0 on X, the boundedness assumption
on X is not restrictive, as is indicated by the following theorem, because there
then exists τ > 0 such that ρ(z) ≥ τ||z||, due to Lemma 28.
Theorem 29. If c'x ≥ 0 for x ∈ X, if there is a real τ > 0 such that ρ(z) ≥ τ||z||, and
if
P_ω({ω | A(ω)x = 0}) < 1 for every x ≠ 0, then there is a compact set 𝔅 such that
inf_{x∈X} {c'x + Q(x)} = inf_{x∈X∩𝔅} {c'x + Q(x)}.
Proof. Let
φ(x) = ∫ ||A(ω)x|| dP_ω.
Then, from the assumption,
it follows that
φ(x) > 0 for x ≠ 0.
Hence φ(x) is a norm on ℝⁿ and there exists a real κ > 0 such that φ(x) ≥ κ||x||.
Since ρ(b(ω) − A(ω)x) ≥ τ||b(ω) − A(ω)x|| ≥ τ(||A(ω)x|| − ||b(ω)||), we get
c'x + Q(x) ≥ τφ(x) − τE_{P_ω}||b|| ≥ τκ||x|| − τE_{P_ω}||b||.
For an arbitrary x̄ ∈ X and every x such that c'x + Q(x) ≤ c'x̄ + Q(x̄), it follows
that ||x|| is bounded by some r(x̄).
Hence
inf{c'x + Q(x) | x ∈ X} = inf{c'x + Q(x) | x ∈ X, ||x|| ≤ r(x̄)}. □
c'x + ρ(b̄ − Āx) ≤ c'x + ∫ ρ(b(ω) − A(ω)x) dP_ω.
Proof. By Lemma 28,
ρ(z) = max_i g_i'z.
Choose g_{i₀} so that
g_{i₀}'(b̄ − Āx) = ρ(b̄ − Āx).
Then
ρ(b̄ − Āx) = ∫ g_{i₀}'(b(ω) − A(ω)x) dP_ω
≤ ∫ max_i g_i'(b(ω) − A(ω)x) dP_ω
= ∫ ρ(b(ω) − A(ω)x) dP_ω for all x ∈ X
and, therefore,
c'x + ρ(b̄ − Āx) ≤ c'x + ∫ ρ(b(ω) − A(ω)x) dP_ω for all x ∈ X.
In the special case where only b(ω) is random, we get the following
inequalities, derived from A. Madansky [9]:
Theorem 31. Under the assumptions of Th. 30 and for deterministic matrices A and
c the following inequalities hold:
... ≤ c'x + ∫ ρ(b(ω) − Ax) dP_ω.
By Jensen's inequality
∫ φ(b(ω)) dP_ω ≥ φ(b̄),
which is the first inequality of the theorem, whereas the other ones are trivial. □
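The first inequality is just Jensen's inequality for the convex, piecewise linear ρ: ρ(E b − Ax) ≤ E ρ(b − Ax). A minimal numerical illustration with ρ(z) = max(0, z) (one row, q⁺ = 1, q⁻ = 0) and a two-point distribution of b; all data are made up.

```python
# Sketch: Jensen's inequality rho(E b - A x) <= E rho(b - A x)
# for rho(z) = max(0, z) and a two-point distribution of b.
bs = [(0.5, 1.0), (0.5, 5.0)]      # (probability, b-value)

def rho(z):
    return max(0.0, z)

Ax = 2.0
mean_b = sum(p * b for p, b in bs)
lhs = rho(mean_b - Ax)             # rho at the expected right-hand side
rhs = sum(p * rho(b - Ax) for p, b in bs)
print(lhs, rhs)
```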
As we know from Ch. I,
α is the expectation of the optimal value in the "wait and see" case, and
β is the optimal value obtained by the two-stage model in the "here and now"
situation. M. Avriel and A. C. Williams [1] call the difference β − α the expected
value of perfect information (EVPI). From Th. 31 we get bounds for the EVPI:
Then
Ā = 2, b̄ = 7, c = 2 and
c x̄ + ρ(b̄ − Ā x̄) = Min_{x∈X} {cx + q⁺y⁺ | Ā x + y⁺ − y⁻ = b̄, y⁺ ≥ 0, y⁻ ≥ 0},
whereas
α = ½ Min_{x∈X} {2x + y₁⁺ | x + y₁⁺ − y₁⁻ = 2, y₁⁺ ≥ 0, y₁⁻ ≥ 0}
+ ½ Min_{x∈X} {2x + y₂⁺ | 3x + y₂⁺ − y₂⁻ = 12, y₂⁺ ≥ 0, y₂⁻ ≥ 0} =
= ½·2 + ½·8 = 5 < c x̄ + ρ(b̄ − Ā x̄).
Hence the first inequality of Th. 31 does not hold in this case.
Further
β = inf_{x∈X} {c'x + ∫ Q(b(ω) − A(ω)x) dP_ω}
  = Min_{x∈X} {2x + ½y₁⁺ + ½y₂⁺ | x + y₁⁺ − y₁⁻ = 2; 3x + y₂⁺ − y₂⁻ = 12; y₁⁺, y₁⁻, y₂⁺, y₂⁻ ≥ 0}.
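The wait-and-see/here-and-now comparison and the resulting EVPI can be illustrated numerically. The following sketch uses hypothetical data (first-stage cost c = 1, recourse penalty 2 per unit shortfall, demand b ∈ {2, 8} with probability ½ each) and a crude grid search in place of a linear programming solver:

```python
# Wait-and-see value (alpha), here-and-now value (beta), and EVPI = beta - alpha
# for a toy one-dimensional two-stage problem.  All data are hypothetical.

probs = {2.0: 0.5, 8.0: 0.5}   # two equally likely demand scenarios

def cost(x, b):
    # first-stage cost plus simple recourse penalty for the shortfall
    return x + 2.0 * max(b - x, 0.0)

grid = [i / 100.0 for i in range(0, 1201)]   # crude search over x in [0, 12]

# Here-and-now: x is chosen before b is observed, minimizing expected cost.
beta = min(sum(p * cost(x, b) for b, p in probs.items()) for x in grid)

# Wait-and-see: b is observed first, then x is optimized; take the expectation.
alpha = sum(p * min(cost(x, b) for x in grid) for b, p in probs.items())

evpi = beta - alpha
assert evpi >= 0.0   # perfect information can never hurt
```

For these data the wait-and-see value is 5 and the here-and-now value is 8, so the EVPI equals 3.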
Chapter IV. Chance Constrained Programming

1. Convexity Statements
Whereas two-stage problems, as we have seen in the last chapter, are rather well-behaved from the viewpoint of optimization theory as far as convexity, continuity and differentiability are concerned, this is in general not true for chance constrained programming problems. There are essentially two different versions of chance constrained programs, namely either

(1)  min φ(x) subject to P_ω({ω | A(ω)x ≥ b(ω)}) ≥ α, x ∈ X,

or

(2)  min φ(x) subject to P_ω({ω | A_i(ω)x ≥ b_i(ω)}) ≥ α_i, i = 1, …, m, x ∈ X,

where A_i(ω) indicates the i-th row of A(ω) and b_i(ω) is the i-th component of b(ω).
Given that φ(x) is a convex function and X is a convex set, the main question is whether the sets
X(α) = {x | P_ω({ω | A(ω)x ≥ b(ω)}) ≥ α}
and
X_i(α_i) = {x | P_ω({ω | A_i(ω)x ≥ b_i(ω)}) ≥ α_i}
are convex.
The following example shows that the convexity of these sets cannot be
guaranteed in general.
Example 1. Let (a, b) have a finite discrete distribution.
To get
P({(a, b) | ax ≥ b})
for a certain value of x ∈ ℝ, we have to check which of the two constraints
is satisfied. Obviously
x < −1/3 yields −3x ≥ −1 and 3x ≱ 2,
X(α) is convex
empty
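The failure of convexity for discrete distributions is easy to check directly. The sketch below uses hypothetical two-scenario data (not the data of Example 1, which specify a different instance): a single scalar constraint a(ω)x ≥ b(ω) whose two realizations carve out two disjoint half-lines.

```python
# Nonconvexity of X(alpha) for a finite discrete distribution.
# One scalar constraint a(w) * x >= b(w) with two equally likely scenarios
# (hypothetical data):
#   w1: a = -1, b = 0   (feasible iff x <= 0)
#   w2: a =  1, b = 1   (feasible iff x >= 1)
scenarios = [(-1.0, 0.0, 0.5), (1.0, 1.0, 0.5)]

def prob_feasible(x):
    # P({w | a(w) * x >= b(w)}) under the discrete distribution above
    return sum(p for a, b, p in scenarios if a * x >= b)

alpha = 0.5
x1, x2 = 0.0, 1.0        # both points lie in X(alpha) ...
mid = 0.5 * (x1 + x2)    # ... but their midpoint does not

assert prob_feasible(x1) >= alpha
assert prob_feasible(x2) >= alpha
assert prob_feasible(mid) < alpha   # X(0.5) = (-inf, 0] u [1, inf) is nonconvex
```

Here X(½) is the union of two disjoint half-lines, while X(1) is empty, in line with Theorem 1 below, which guarantees convexity only for the extreme levels α = 0 and α = 1.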
The following theorem is the only general convexity statement on X(α) that can
be made disregarding the probability distribution P_ω.
Theorem 1. X(0) and X(1) (resp. X_i(0) and X_i(1)) are convex.
Proof. a) X(0) = ℝⁿ.
b) Suppose that X(1) ≠ ∅. For x_i ∈ X(1), i = 1, 2, define
Ω_i = {ω | A(ω)x_i ≥ b(ω)}, i = 1, 2.
Then
P_ω(Ω_i) = 1, i = 1, 2, and
P_ω(Ω₁ ∩ Ω₂) = 1 (see proof of Th. III.4),
and for ω ∈ Ω₁ ∩ Ω₂
A(ω)x_i ≥ b(ω)
and therefore
A(ω)(λx₁ + (1−λ)x₂) ≥ b(ω) for 0 ≤ λ ≤ 1.
Hence for λ ∈ [0, 1]
Ω_λ = {ω | A(ω)(λx₁ + (1−λ)x₂) ≥ b(ω)} ⊇ Ω₁ ∩ Ω₂,
implying
λx₁ + (1−λ)x₂ ∈ X(1) for λ ∈ [0, 1]. □
Corollary 2. Given P_ω, there are real numbers α₀ and α_i⁰ such that X(α) and X_i(α_i)
are convex sets for α ≥ α₀ and α_i ≥ α_i⁰.
According to Example 1 and Cor. 2, one has to determine α₀ resp. α_i⁰ for each
particular probability distribution P_ω. Among others who were concerned with
these problems, Marti [10] found the results Th. 3 and Th. 5.
Theorem 3. Let P_ω be a finite discrete probability distribution, i.e.
P_ω({ω_i}) = p_i, i = 1, …, r, Σ_{i=1}^{r} p_i = 1.
Then, with α⁰ = max_{1≤i≤r} (1 − p_i), the sets X(α) and X_i(α_i) are convex for α > α⁰ resp. α_i > α_i⁰.
Proof. For N = {1, …, r} and I ⊆ N, I ≠ N,
Σ_{i∈I} p_i ≤ 1 − p_j  ∀ j ∈ N − I
           ≤ max_{j∈N−I} (1 − p_j) ≤ max_{j∈N} (1 − p_j).
Hence for I ⊆ N
Σ_{i∈I} p_i > max_{j∈N} (1 − p_j)
implies I = N. Thus, for α > max_{j∈N} (1 − p_j),
P_ω({ω | A(ω)x ≥ b(ω)}) = Σ_{i: A(ω_i)x ≥ b(ω_i)} p_i ≥ α
implies
P_ω({ω | A(ω)x ≥ b(ω)}) = 1;
and this implies X(α) = X(1), which is convex by Th. 1. □
For finite discrete distributions, the condition α > max_{1≤i≤r} (1 − p_i) is, by Th. 3,
sufficient for convexity but not necessary, as may be seen in Example 1, where
X(α) is already convex for values of α below max_i (1 − p_i). However, the condition cannot be
weakened in general, as the following example demonstrates.
Example 2. Let P_ω be a discrete distribution so that

A(ω₁) = (1 −1 ; 0 1),  A(ω₂) = (1 −1 ; −2 −3),  A(ω₃) = (−1 −1 ; −1 3)

and the right hand sides b(ω_i) are such that, with K(ω_i) = {x | A(ω_i)x ≥ b(ω_i)},

K(ω₁) = {(ξ, η) ∈ ℝ² | ξ − η ≥ −2; η ≥ 3}.

Then
P_ω({ω | A(ω)x ≥ b(ω)}) = Σ_{i∈I(x)} p_i,
where
I(x) = {i | x ∈ K(ω_i)}.
Here X(1) = K(ω₁) ∩ K(ω₂) ∩ K(ω₃) is the triangle with the vertices (3, 3), (5, 3),
(4, 4). But for α = ¾ we get
X(¾) = [K(ω₁) ∩ K(ω₂)] ∪ [K(ω₂) ∩ K(ω₃)]
and hence X(¾) is not convex.
In this example we have made use of the fact that max_{1≤i≤r} (1 − p_i) is not unique. If
we have a discrete distribution such that min_i p_i is uniquely determined, we may state

Theorem 4. Let P_ω be a finite discrete probability distribution for which p_{i₀} = min_{i∈N} p_i is
uniquely determined. Then the sets X(α) and X_i(α_i) are convex for every α > 1 − p_{i₁},
where
p_{i₁} = min_{i∈N−{i₀}} p_i.
Proof. For I ⊆ N we have
Σ_{i∈I} p_i = 1            if I = N,
Σ_{i∈I} p_i ≤ 1 − p_{i₀}   if i₀ ∉ I,
Σ_{i∈I} p_i ≤ 1 − p_j ≤ 1 − p_{i₁}   if j ∉ I, j ≠ i₀.
Hence
Σ_{i∈I} p_i > 1 − p_{i₁} implies I ⊇ N − {i₀}.
With K(ω_i) = {x | A(ω_i)x ≥ b(ω_i)} it follows immediately that
S and hence S⁻¹ are symmetric and strictly positive definite, and
μ_i = m_i = E(d_i),
σ_i² = (S⁻¹)_{ii} = E(d_i − μ_i)².
Suppose now that
μ_{n+1}(x) = r̄_{n+1} = m_{n+1} − Σ_{i=1}^{n} m_i x_i
and variance
Since S⁻¹ is positive definite, it is easily shown that σ_{n+1}(x) is convex in x (σ_{n+1}(x)
may be regarded as a norm of the vector (−x₁, −x₂, …, −x_n, 1)').
Obviously
(r_{n+1} − μ_{n+1}(x)) / σ_{n+1}(x)
has the standard normal distribution with mean value 0 and variance 1, whose
distribution function shall be called Φ(r). Now it is evident that
and
if and only if
Since μ_{n+1}(x) and σ_{n+1}(x) are convex in x, this inequality describes a convex set
as long as Φ⁻¹(α_i) ≥ 0, which is true for α_i ≥ ½. □
The following example shows that the convexity of X_i(α_i) is in general not
maintained for α_i < ½.
Example 3. Suppose that a and b are independent random variables with normal
distributions such that
m₁ = E(a) = 1;  σ₁² = E(a − m₁)² = 3,
m₂ = E(b) = 2;  σ₂² = E(b − m₂)² = 1.
Then
P(b − xa ≤ 0) = Φ(−(2 − x)(1 + 3x²)^(−1/2)) =
  Φ(−6/7)  for x = −4,
  Φ(−2)    for x = 0,
  Φ(2/7)   for x = 4.
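The probabilities in Example 3 can be checked numerically via the error function, and they exhibit exactly the nonconvexity predicted for levels below ½; the threshold α_i = 0.1 below is our own choice for illustration.

```python
from math import erf, sqrt

# Example 3: a, b independent normal, E[a] = 1, Var[a] = 3, E[b] = 2, Var[b] = 1,
# so P(b - x*a <= 0) = Phi(-(2 - x) / sqrt(1 + 3 x^2)).

def Phi(z):
    # standard normal distribution function via the error function
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def p(x):
    return Phi(-(2.0 - x) / sqrt(1.0 + 3.0 * x * x))

vals = {x: p(x) for x in (-4.0, 0.0, 4.0)}   # Phi(-6/7), Phi(-2), Phi(2/7)

# For alpha_i = 0.1 < 1/2 the set {x | p(x) >= alpha_i} is nonconvex:
alpha_i = 0.1
assert p(-4.0) >= alpha_i and p(4.0) >= alpha_i and p(0.0) < alpha_i
```

So x = −4 and x = 4 are feasible at level 0.1 while the intermediate point x = 0 is not, confirming that X_i(α_i) fails to be convex for this α_i < ½.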
As a matter of fact, for x ∈ ℝ the results X_i(α_i) = ℝ and X_i(α_i) ≠ ℝ with X_i(α_i)
convex are also possible:
Example 4. Assume, as in Example 3, that a and b are independently normally
distributed such that m₁ = m₂ = 1, σ₁² = σ₂² = 1, and let α be such that Φ⁻¹(α) = −1.
Then
Φ⁻¹(α)σ₂(x) + μ₂(x) = −√(x² + 1) + 1 − x  { > 0 for x < 0;  ≤ 0 for x ≥ 0 }
and hence X(α) = {x | x ≥ 0} ≠ ℝ.
Example 5. If, under the same assumptions as in Example 4, instead σ₁² = 2 and σ₂² = 4,
then
Φ⁻¹(α)σ₂(x) + μ₂(x) = −√(4 + 2x²) + 1 − x < 0  for all x ∈ ℝ
and hence X(α) = ℝ.
It turns out that a result as in Example 4 is only possible in ℝ.
Theorem 6. Suppose that the random variables
a_{i1}(ω), a_{i2}(ω), …, a_{in}(ω), b_i(ω), where n > 1, have a joint (n+1)-dimensional normal
distribution. If 0 < α_i < ½, then either X_i(α_i) = ℝⁿ or X_i(α_i) is a nonempty nonconvex set.
and μ_{n+1}(x) = m_{n+1} − m'x, if m' = (m₁, …, m_n). Suppose that X_i(α_i) ≠ ℝⁿ and
x ∉ X_i(α_i). Since n > 1, there exists a y ∈ ℝⁿ such that y ≠ 0 and m'y = 0. If we
and
Hence
whereas
A result similar to Th. 5 was obtained by K. Marti [10] for a joint Cauchy
distribution.
Except in the case of a finite discrete distribution, we have presented convexity
statements only for X_i(α_i), but not for X(α). This corresponds to the state of
research in the field, as long as the matrix A is supposed to be random.
However, for fixed matrices A, i.e. when only the right hand side b(ω) is random,
X_i(α_i) is always convex, and the convexity of X(α) has recently been proved by
A. Prékopa [14] for a rather broad class of probability distributions, including
the normal distribution.
Theorem 7. Suppose that A is fixed and b(ω) is random.
Then
X_i(α_i) = {x | P_ω({ω | A_i x ≥ b_i(ω)}) ≥ α_i}
is convex for every probability distribution of b(ω).
Proof. Let F_i(r) be the distribution function of b_i(ω).
Then x^j ∈ X_i(α_i), j = 1, 2, if and only if F_i(A_i x^j) ≥ α_i, j = 1, 2. For x̄ = λx¹
+ (1−λ)x², λ ∈ (0, 1), we get, since F_i is nondecreasing and A_i x̄ ≥ min_{j=1,2} A_i x^j,
F_i(A_i x̄) ≥ F_i(min_{j=1,2} A_i x^j) ≥ α_i,
i.e. x̄ ∈ X_i(α_i). □
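In fact the proof shows more: with A fixed, X_i(α_i) is simply the half-space {x | A_i x ≥ t}, where t is an α_i-quantile of b_i(ω). A sketch with hypothetical data (A_i = (1, 2), b_i exponentially distributed with rate 1, α_i = 0.9):

```python
from math import exp, log

# With A fixed and only b_i(w) random, X_i(alpha_i) = {x | F_i(A_i x) >= alpha_i}
# is the half-space {x | A_i x >= t}, t the alpha_i-quantile of b_i(w).
# Hypothetical data: b_i exponential with rate 1, A_i = (1, 2), alpha_i = 0.9.

def F(r):
    # distribution function of b_i: exponential with rate 1
    return 1.0 - exp(-r) if r >= 0 else 0.0

A_i = (1.0, 2.0)
alpha_i = 0.9
t = -log(1.0 - alpha_i)   # quantile: F(t) = alpha_i (t = ln 10 ~ 2.303)

def feasible(x):
    return F(A_i[0] * x[0] + A_i[1] * x[1]) >= alpha_i

x1, x2 = (3.0, 0.0), (0.0, 2.0)   # both satisfy A_i x >= t ...
mid = (0.5 * (x1[0] + x2[0]), 0.5 * (x1[1] + x2[1]))

assert feasible(x1) and feasible(x2) and feasible(mid)   # ... and so does the midpoint
assert feasible(x1) == (A_i[0] * x1[0] + A_i[1] * x1[1] >= t)
```

Convexity here needs nothing but the monotonicity of F_i, exactly as in the proof above.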
If the probability measure P defined on ℝⁿ has a density function f(x) (with respect
to the Lebesgue measure μ on ℝⁿ), the problem arises under which conditions on
f(x) the measure P is quasi-concave. (P is quasi-concave if, with
λ𝔄 + (1−λ)𝔅 = {z | z = λx + (1−λ)y, x ∈ 𝔄, y ∈ 𝔅},
we have
P(λ𝔄 + (1−λ)𝔅) ≥ min{P(𝔄), P(𝔅)}
for all convex subsets 𝔄 and 𝔅 of ℝⁿ and all λ ∈ (0, 1).)
Obviously the distribution function of a quasi-concave probability measure is quasi-concave.
We call a density function f(x) almost quasi-concave if, for every a ∈ ℝⁿ and
b ∈ ℝⁿ such that a = −γb with γ > 0, f(x) ≥ min{f(x+a), f(x+b)} almost everywhere
with respect to μ. Then we may state
Theorem 8. Let P be a quasi-concave probability measure on ℝⁿ with the continuous
density function f(x). Then f(x) is almost quasi-concave.
Proof. Suppose f(x) were not almost quasi-concave. Then there exist
a ∈ ℝⁿ, b ∈ ℝⁿ, γ > 0 with a = −γb such that μ({x | f(x) < min[f(x+a), f(x+b)]}) > 0.
Thus, there exists a convex Borel measurable set R (for example a sphere) such
that
R ⊆ {x | f(x) < min[f(x+a), f(x+b)]} and μ(R) > 0.
Then one obtains points z₁, z₂ and z̄ = λz₁ + (1−λ)z₂ with
F(z₁) = ∫_{−∞}^{z₁} f(x) dx,  F(z₂) = ∫_{−∞}^{z₂} f(x) dx,  F(z̄) = ∫_{−∞}^{z̄} f(x) dx
such that
F(z̄) < F(z₁) < F(z₂).
Hence the distribution function, and consequently the probability measure, is not
quasi-concave. □
With respect to sufficient conditions, the strongest results known so far are due
to A. Prékopa [14]. He was concerned with logarithmic concave measures,
which due to their definition satisfy the inequality
P(λ𝔄 + (1−λ)𝔅) ≥ P^λ(𝔄) · P^{1−λ}(𝔅)
for all convex subsets 𝔄 and 𝔅 of ℝⁿ and all λ ∈ (0, 1).
Obviously a logarithmic concave probability measure is also quasi-concave. The
main result is based on
Theorem 9 (Prékopa's inequality). Let f and g be nonnegative Borel measurable
functions defined on ℝⁿ and let
r(t) = sup_{λx+(1−λ)y=t} f^λ(x) · g^{1−λ}(y), t ∈ ℝⁿ,
where λ is a constant, 0 < λ < 1.
Then r(t) is Borel measurable and the following inequality holds:
∫_{ℝⁿ} r(t) dt ≥ (∫_{ℝⁿ} f(x) dx)^λ · (∫_{ℝⁿ} g(y) dy)^{1−λ}.
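Theorem 9 can be sanity-checked numerically. The sketch below discretizes the one-dimensional case with f and g the indicators of the intervals [0, 1] and [0, 3] (a hypothetical choice); the left side then approximates the length of λ[0,1] + (1−λ)[0,3] = [0, 2], and the inequality reduces to 2 ≥ √3.

```python
# Crude 1-D numerical check of Prekopa's inequality with indicator functions.
lam = 0.5
h = 0.01
xs = [i * h for i in range(-100, 500)]   # grid covering [-1, 5)

def f(x): return 1.0 if 0.0 <= x <= 1.0 else 0.0   # indicator of [0, 1]
def g(y): return 1.0 if 0.0 <= y <= 3.0 else 0.0   # indicator of [0, 3]

def r(t):
    # brute-force sup over grid points x with lam*x + (1-lam)*y = t
    return max(f(x) ** lam * g((t - lam * x) / (1.0 - lam)) ** (1.0 - lam)
               for x in xs)

lhs = sum(r(t) for t in xs) * h   # ~ length of lam*[0,1] + (1-lam)*[0,3] = 2
rhs = (sum(f(x) for x in xs) * h) ** lam * (sum(g(y) for y in xs) * h) ** (1.0 - lam)

assert lhs >= rhs   # here approximately 2 >= sqrt(3)
```

For indicators of convex sets the inequality is exactly the one-dimensional Brunn–Minkowski inequality, which is why log-concavity of measures follows from it.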
f₂(x) = { f(x) if x ∈ 𝔅;  0 otherwise. }
For every x ∈ λ𝔄 + (1−λ)𝔅 and every y ∈ 𝔄 and z ∈ 𝔅 such that λy + (1−λ)z = x,
in view of the convexity of Q(x), we have
f(x) = γe^{−Q(x)} ≥ γe^{−λQ(y) − (1−λ)Q(z)}
     = f^λ(y) · f^{1−λ}(z),
implying immediately
where B_i, i = 1, …, r, are optimal bases of W (i.e. fulfil the simplex optimality condition) and
𝔐_i(x) = {ω | B_i^{−1}(b(ω) − A(ω)x) ≥ 0} − ∪_{j=1}^{i−1} 𝔐_j(x).
On the other hand, we may sometimes use simple recourse problems to find
feasible solutions of chance constraints. Consider the special simple recourse
problem (q_i⁺ = ρ > 0, q_i⁻ = 0)
ψ(ρ) = Min_{x∈X} {c'x + Σ_{i=1}^{m} ρ ∫_{(b(ω)−A(ω)x)_i > 0} (b(ω) − A(ω)x)_i dP_ω}.
Suppose, on the other hand, that for some i there exist ε > 0 and δ > 0 such that
for every ρ > 0 there is Δ > 0 for which
P_ω({ω | φ_{i,ρ+Δ}(ω) ≥ ε}) ≥ δ.
Then
where a = min_{x∈X} c'x. This inequality contradicts the assumption that lim_{ρ→∞} ψ(ρ) < ∞.
Now, if {ρ} is some sequence increasing to ∞, there is a subsequence {ρ_ν} such
that φ_{iρ_ν}, i = 1, …, m, converge to zero almost surely (i.e. almost everywhere with
respect to P_ω) and lim_{ν→∞} x(ρ_ν) = x* ∈ X.
Therefore
P_ω({ω | lim_{ν→∞} φ_{iρ_ν}(ω) > 0}) = P_ω({ω | (b(ω) − A(ω)x*)_i > 0}) = 0, i = 1, …, m,
yielding
P_ω({ω | A(ω)x* ≱ b(ω)}) = P_ω(∪_{i=1}^{m} {ω | (b(ω) − A(ω)x*)_i > 0})
 ≤ Σ_{i=1}^{m} P_ω({ω | (b(ω) − A(ω)x*)_i > 0}) = 0.
According to this theorem, one may try to get feasible solutions of chance
constraints by solving the parametric simple recourse problem mentioned above.
However, one should be aware of the fact that the theorem could only be proved
under the assumption that probability 1 can be attained.
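The mechanism behind this theorem, namely that a growing penalty ρ pushes the minimizer of the simple recourse problem toward points satisfying the constraints with probability 1, can be illustrated on a toy instance (hypothetical data: c = 1, A = 1, b uniform on {1, 2, 3}, X = [0, 5], grid search instead of an LP solver):

```python
# Parametric simple recourse problem  min_{x in X} c*x + rho * E[(b(w) - x)^+].
# As rho grows, the minimizer moves toward chance-constraint feasibility.
# All data are hypothetical illustration values.

bs = [1.0, 2.0, 3.0]                       # equally likely realizations of b(w)
grid = [i / 100.0 for i in range(0, 501)]  # crude search over X = [0, 5]

def objective(x, rho):
    shortfall = sum(max(b - x, 0.0) for b in bs) / len(bs)   # E[(b - x)^+]
    return x + rho * shortfall

def minimizer(rho):
    return min(grid, key=lambda x: objective(x, rho))

def prob_feasible(x):
    return sum(1.0 for b in bs if x >= b) / len(bs)   # P({w | x >= b(w)})

x_small, x_large = minimizer(0.5), minimizer(100.0)
assert prob_feasible(x_small) < 1.0    # weak penalty: constraints may fail
assert prob_feasible(x_large) == 1.0   # strong penalty: feasible w.p. 1
```

In line with the caveat above, this only works because probability 1 is attainable here (x = 3 covers every realization of b); with an unbounded b the penalty could grow without ever reaching a feasible point.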
References
[1] Avriel, M. and A.C. Williams: The Value of Information and Stochastic Programming.
Operations Research 18,947-954 (1970).
[2] Beale, E. M.: The Use of Quadratic Programming in Stochastic Linear Programming. RAND
P-2404, August 1961.
[3] Bereanu, B.: On Stochastic Linear Programming Distribution Problems, Stochastic Technology
Matrix. Z. Wahrscheinlichkeitstheorie u. verw. Geb. 8, 148-152 (1967).
[4] Bereanu, B.: The Distribution Problem in Stochastic Linear Programming. The Cartesian
Integration Method. Reprint No. 7103, Center of Mathematical Statistics of the Academy of
the Socialist Republic of Romania, Bucharest (1971).
[5] Kall, P.: Qualitative Aussagen zu einigen Problemen der stochastischen Programmierung.
Z. Wahrscheinlichkeitstheorie u. verw. Geb. 6, 246-272 (1966).
[6] Kall, P.: Das zweistufige Problem der stochastischen linearen Programmierung. Z. Wahrschein-
lichkeitstheorie u. verw. Geb. 8, 101-112 (1967).
[7] Kall, P.: Some Remarks on the Distribution Problem of Stochastic Linear Programming.
Methods of Operations Research, Meisenheim, 16, 189-196 (1973).
[8] Leindler, L.: On a Certain Converse of Hölder's Inequality. Acta Scientiarum Mathematicarum,
Szeged, 33, 217-223 (1972).
[9] Madansky, A.: Inequalities for Stochastic Linear Programming Problems. Management Sci. 6,
197-204 (1960).
[10] Marti, K.: Konvexitätsaussagen zum linearen stochastischen Optimierungsproblem. Z. Wahr-
scheinlichkeitstheorie u. verw. Geb. 18, 159-166 (1971).
[11] Marti, K.: Entscheidungsprobleme mit linearem Aktionen- und Ergebnisraum. Z. Wahrschein-
lichkeitstheorie u. verw. Geb. 23, 133-147 (1972).
[12] Marti, K.: Über ein Verfahren zur Lösung einer Klasse linearer Entscheidungsprobleme.
Z. Angew. Math. u. Mech., to appear.
[13] Prékopa, A.: Logarithmic Concave Measures with Applications to Stochastic Programming.
Acta Scientiarum Mathematicarum, Szeged, 32, 301-316 (1971).
[14] Prékopa, A.: On Logarithmic Concave Measures and Functions. Acta Scientiarum Mathema-
ticarum, Szeged, 34, 335-343 (1973).
[15] Walkup, D. W. and R. J. B. Wets: Stochastic Programs with Recourse. SIAM J. Appl. Math. 15,
1299-1314 (1967).
[16] Wets, R.: Programming under Uncertainty: The Complete Problem. Z. Wahrscheinlichkeits-
theorie u. verw. Geb. 4, 316-339 (1966).
[17] Wets, R.: Characterization Theorems for Stochastic Programs. Mathematical Programming 2,
166-175 (1972).
[18] Wessels, J.: Stochastic Programming. Statistica Neerlandica 21, 39-53 (1967).
[19] Kosmol, P.: Algorithmen zur konvexen Optimierung. OR-Verfahren, Band XVIII, 176-186
(1974).
[20] Kall, P.: Approximations to Stochastic Programs with Complete Fixed Recourse. Numer.
Math. 22, 333-339 (1974).