A proof is given of Farkas's lemma based on a new theorem pertaining to orthogonal matrices. It is claimed that this theorem is slightly more general than Tucker's theorem, which concerns skew-symmetric matrices and which may itself be derived simply from the new theorem. Farkas's lemma and other theorems of the alternative then follow trivially from Tucker's theorem. KEY WORDS: Orthogonal matrices, Cayley transforms, linear programming, duality
1 INTRODUCTION

Farkas's Lemma is one of the principal foundations of the theory of linear inequalities. It may be stated thus:

Theorem 1.1 (Farkas's lemma). Let $A \in \mathbb{R}^{m \times n}$ and $b \in \mathbb{R}^m$ both be arbitrary. Then either (a) $\exists\, x \ge 0$ such that $Ax = b$; or (b) $\exists\, z$ such that $A^T z \le 0$ and $b^T z > 0$.

Its importance stems from the fact that it is a typical theorem of the alternative that represents a large class of such theorems, theorems that constitute constructive optimality conditions for several important optimization problems (for more background information on duality, theorems of the alternative and other relevant matters see e.g. Gale [19] or Mangasarian [21]). The first correct proof was published (in Hungarian) in 1898 although two previous incomplete proofs had appeared (also in Hungarian) in 1894 and 1896. Essentially the same three papers were subsequently published in German (in 1899, 1895 and 1899 respectively, [14], [13] and [15]) but Farkas's best-known exposition of his famous lemma appeared (in German) in 1902 [16]. Farkas's motivation came not from mathematical economics nor yet from pure mathematics but from physics; as a Professor of Theoretical Physics he was interested in the problems of mechanical equilibrium and it was these that gave rise to the
C. G. BROYDEN
need for linear inequalities. In this he was continuing the classical work of Fourier and Gauss, though Farkas claims [16] to be the first to appreciate the importance of homogeneous linear inequalities to these problems. This claim, together with much other background and historical material, may be found in a paper by Prékopa [24] which gives a fascinating and readable account of the development of optimization theory, and from which the above brief outline was condensed.

As befits such an important result, many different methods of proof have been attempted. Farkas himself based his proof on an argument that would be recognised today as being similar to an intellectual implementation of the dual simplex method, the incompleteness of his earlier proofs being due to his overlooking the possibility of cycling (see [24]). Such algorithmic proofs are still popular today. If both the primal and dual linear programming problems have feasible solutions then the optimal objective function values of both are equal (the fundamental duality theorem, see e.g. [22], pp. 159-161). One way of proving this theorem is by showing that both primal and dual problems can always be solved by the simplex method provided that the appropriate steps are taken to overcome degeneracy. Farkas's lemma may then be deduced from the fundamental duality theorem [22]. An outline proof of the lemma in the same style is also given by Prékopa [24], who uses the lexicographical dual method to guarantee the integrity of his proof, and a very similar approach is adopted by Bazaraa, Jarvis and Sherali [2]. Other examples of algorithmic proofs are due to Bland [3], who avoids cycling by his least-index method, and Dax [11], who obtains a simple proof of the lemma by applying an active-set strategy to a bounded least-squares algorithm, while proofs based on Fourier-Motzkin elimination are to be found in Chvátal [8] and Ziegler [27].

Another way of proving the lemma is based on some ideas from geometry.
If $S$ is a convex set in $\mathbb{R}^n$ and $p$ a point, also in $\mathbb{R}^n$, then either $p$ is in $S$ or it is not, and if not then there exists a hyperplane $H$ with $p$ on one side of $H$ and $S$ on the other. This separation theorem gives a simple geometric interpretation of the lemma and proofs based on this idea appear in [18] and [1] (an earlier edition of [2]). Separation theorems can also be expressed as theorems of the alternative: either $p$ is in $S$ or there exists a separating hyperplane $H$ etc. These proofs tend to be rather simple and seem to sidestep the problems of degeneracy which afflict the other types of proof discussed here, but may do so by sacrificing rigour. It is necessary in this kind of proof to establish that the set $S = \{Ax \mid x \ge 0\}$ is closed (see e.g. Osborne [23] or Dax [9]) but this, the most difficult part of a geometric proof, is often taken for granted or glossed over (the author is grateful to a referee for drawing his attention to this weakness of geometric proofs). The same referee writes:
"The main point about the geometric approach is that it relates theorems of the alternative to duality. The minimum norm duality theorem says that the minimum distance from a point b to a convex set S is equal to the maximum of the distance from b to the hyperplanes that separate b and S. This way each theorem of the alternative is related to a pair of dual problems: A primal steepest-descent problem and a dual least-norm problem. This elegant feature is fundamental for applications.
It is this property that enables us to derive constructive optimality conditions under degeneracy. See, for example, Dax [9], Dax [10] or Dax and Sreedharan [12]".

A third type of proof is purely algebraic and such proofs are due to, among others, D. Gale (unpublished), R. A. Good [20], A. W. Tucker [25] and S. Vajda [26]. Vajda's proof is based on the so-called Key Theorem:

Theorem 1.2 (The Key Theorem). Let $A \in \mathbb{R}^{m \times n}$ be arbitrary. Then there exist a vector $x \ge 0$ and a vector $z$ (not sign-restricted) such that $Ax = 0$, $A^T z \ge 0$ and $x + A^T z > 0$.

His proof is inductive, showing that if the theorem is true for an $m \times n$ matrix then it is also true for an $(m + 1) \times n$ matrix, and is far less formidable than it appears, most of the complication being due to the notation. It is by no means obvious, for example, that the proof of this theorem in [26] is based on simple oblique projections, but this fact leaps off the page if the same proof is expressed in the language of vectors and matrices. An alternative, non-inductive proof of this theorem is also given by Mangasarian [21]. Some of these proofs, too, are incomplete, the possibility of degeneracy again being overlooked just as in the original proofs of Farkas.

The Key Theorem is used by Tucker [25] in the proof of his eponymous theorem, but conversely the Key Theorem may be deduced simply as a special case of this theorem, as is done in Section 3 (below). Moreover Tucker's theorem may be derived from Farkas's Lemma but may also be obtained directly, as was done by Broyden [4] whose proof, somewhat long and not particularly elementary, was based on a function resembling an irregular descending staircase. Finally Farkas's lemma may be derived simply from Tucker's theorem as is done in Section 3 (below). Thus in a very real sense these three theorems are equivalent. They resemble three cities situated on a high plateau. Travel between them is not too difficult; the hard part is the initial ascent from the plains below.
The proof of our main theorem is similar in spirit to the proofs of Gale and Vajda, being both algebraic and inductive. The theorem may be stated thus:

Definition 1. A sign matrix is a diagonal matrix whose diagonal elements are equal to either plus one or minus one.

Theorem 1.3 (The Main Theorem). Let $Q$ be an orthogonal matrix. Then there exist a vector $x > 0$ and a unique sign matrix $S$ such that $Qx = Sx$.

It may be equivalently stated as: For any orthogonal matrix $Q$ there exists a unique sign matrix $S$ such that $SQ$ has an eigenvalue unity with a corresponding strictly positive eigenvector.

From this it is simple to prove Tucker's theorem [25] and thence Farkas's lemma. Our main theorem is slightly more general than Tucker's theorem in the sense discussed below but despite this (or possibly because of this) is perhaps simpler to prove. Again, much of the proof is taken up with the difficulties caused by degeneracy, but degeneracy seems to be an inherent part of the problem and is
difficult to avoid in theory even if in practice the presence of rounding errors makes its appearance unlikely. Since, at first sight, there is no obvious relationship between our main theorem and Farkas's lemma we devote the next section to establishing this connection. The main theorem is proved in Section 3 together with some of its consequences while in the final section the possible implications of the theorem are discussed. In this final section we also discuss why, in the last analysis, the theorem is somewhat of a disappointment.

2 THE CAYLEY CONNECTION

In this section we trace the connection between our principal theorem and Farkas's lemma and outline briefly the underlying motivation for the approach taken. This motivation stems from the desire to derive better algorithms for solving the general linear programming problem, which may be expressed as follows. Let $A \in \mathbb{R}^{m \times n}$, $b \in \mathbb{R}^m$ and $c \in \mathbb{R}^n$ all be arbitrary. Then the primal LP problem is: minimise $c^T x$ subject to $Ax \ge b$ and $x \ge 0$. The dual becomes: maximise $b^T y$ subject to $A^T y \le c$ and $y \ge 0$. Simple manipulation of these relationships then shows that if both $x$ and $y$ are feasible, i.e. satisfy the above inequalities, then
$$c^T x \ge b^T y. \tag{1}$$
Thus if feasible vectors $x$ and $y$ can be found that satisfy $c^T x = b^T y$ then $x$ and $y$ must be optimal solutions of the primal and dual problems respectively. All this is standard and may be found in the chapter on duality in any textbook of linear programming.

Now one way of defining these optimal solutions is to ignore the optimisation aspects completely but to add to the existing system of inequalities the inequality $c^T x \le b^T y$, which can only be satisfied concurrently with inequality (1) if $c^T x = b^T y$. Thus if we can find a vector $u \ge 0$ such that $Bu \ge 0$, where
$$B = \begin{bmatrix} O & -A^T & c \\ A & O & -b \\ -c^T & b^T & 0 \end{bmatrix}, \qquad u = \begin{bmatrix} x \\ y \\ t \end{bmatrix}$$
and $t > 0$, $u$ may be scaled so that $t = 1$ and not only will the inequalities $Ax \ge b$, $x \ge 0$, $A^T y \le c$ and $y \ge 0$ have been satisfied but any feasible solution of this problem will be an optimal solution of the original problem. Again, all this is reasonably well-known, this approach being first pursued by Tucker [25]. Thus if we can be sure that for any skew-symmetric matrix $B$ we can find a non-negative vector $u$ such that $Bu$ is also non-negative then this vector will give the solution to the general LP problem, and the existence of such a vector is precisely what is guaranteed by Tucker's theorem which adds, for good measure, that $u + Bu$ is strictly positive. However knowing that a vector $u$ exists and determining it by practical computation are quite different matters, and the present author could see no way of constructing an algorithm to determine $u$ that did not resemble the standard methods of solving the general LP problem. He therefore looked for some way to involve the superb numerical properties of orthogonal matrices, and these are related to skew-symmetric matrices by the Cayley transform.

The Cayley transform result states that if $B$ is skew-symmetric then $Q = (I + B)^{-1}(I - B)$ is orthogonal. This transformation is always possible since the eigenvalues of $I + B$ have the form $1 + i\beta_j$ where $\beta_j$ is real so that $I + B$ is always nonsingular. Demonstrating that such a $Q$ is orthogonal is sometimes set as an exercise in textbooks of matrix algebra but a simple proof follows from the observation that since skew-symmetric matrices have eigenvalues $i\beta_j$ and their normalised eigenvectors form the columns of a unitary matrix, $Q$ will have eigenvalues equal to $(1 - i\beta_j)/(1 + i\beta_j)$ with the same eigenvectors. Its eigenvalues therefore lie on the unit circle in the complex plane and, since $Q$ is real and is diagonalised by a unitary matrix, this is sufficient to guarantee its orthogonality. An elementary discussion of the Cayley transform is given in [17].

The Cayley transform may be used as a step in solving the general LP problem by observing that Tucker's theorem for skew-symmetric matrices has an analogue, via the transform, that applies to orthogonal matrices and this analogue is a weak form of our main theorem. In the analogue the sign matrices are determined by the complementary pattern of zeroes and non-zeroes in the vectors $u$ and $Bu$ of Tucker's theorem, and a proof based on this theorem is to be found in [5] or [6]. The version of our main theorem so derived is weak because whereas to every skew-symmetric matrix $B$ there corresponds an orthogonal matrix $Q$ the converse is not true. There are matrices $Q$ that are not Cayley transforms of any skew-symmetric matrix. This may be demonstrated by the following simple example.
The most general skew-symmetric matrix of order 2 is
$$\begin{bmatrix} 0 & \beta \\ -\beta & 0 \end{bmatrix},$$
where $\beta$ is an arbitrary real scalar, and for which the Cayley transform yields
$$Q = \frac{1}{1 + \beta^2} \begin{bmatrix} 1 - \beta^2 & -2\beta \\ 2\beta & 1 - \beta^2 \end{bmatrix}. \tag{2}$$
Since the diagonal elements of $Q$ must have the same sign regardless of the choice of $\beta$ it follows that no $2 \times 2$ orthogonal matrix whose diagonal elements have different signs can be expressed as a Cayley transform. For this reason it was thought desirable to find, if possible, an elementary proof of the full form of the main theorem and this is what we do in the following section. Moreover, since our main theorem is valid for all orthogonal matrices, not merely those that may be expressed as a Cayley transform, we claim that in this respect it is more general than Tucker's theorem.

3 THE THEOREMS
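As a quick check of what the Main Theorem asserts, the following sketch applies it to $2 \times 2$ matrices, including the Cayley transform of the general skew-symmetric matrix of order 2. It is only a brute-force enumeration over the four possible sign matrices, not a method taken from the paper, and it is restricted to order 2. The function names `cayley_2x2`, `is_orthogonal` and `positive_fixed_vectors`, and the parameter name `beta` for the free scalar in the skew-symmetric matrix, are assumptions made for this illustration. Exact rational arithmetic is used so that the singularity test is not affected by rounding.

```python
from fractions import Fraction
from itertools import product


def cayley_2x2(beta):
    """Cayley transform (I + B)^{-1} (I - B) of B = [[0, beta], [-beta, 0]]."""
    beta = Fraction(beta)
    d = 1 + beta * beta
    return [[(1 - beta * beta) / d, -2 * beta / d],
            [2 * beta / d, (1 - beta * beta) / d]]


def is_orthogonal(Q):
    """Check Q^T Q = I exactly (entries are rationals or integers)."""
    n = len(Q)
    return all(
        sum(Q[k][i] * Q[k][j] for k in range(n)) == (1 if i == j else 0)
        for i in range(n) for j in range(n)
    )


def positive_fixed_vectors(Q):
    """Return every (s, x) with x > 0 and Q x = diag(s) x, for 2x2 Q only."""
    found = []
    for s in product((1, -1), repeat=2):
        # M = diag(s) Q - I; a strictly positive null vector of M is needed.
        M = [[s[i] * Q[i][j] - (1 if i == j else 0) for j in range(2)]
             for i in range(2)]
        if M[0][0] * M[1][1] - M[0][1] * M[1][0] != 0:
            continue  # M nonsingular, so only x = 0 satisfies M x = 0
        row = M[0] if any(M[0]) else M[1]
        if not any(row):
            # M = 0: every x > 0 works (the degenerate case); report one.
            found.append((s, (Fraction(1), Fraction(1))))
            continue
        x = (-row[1], row[0])  # spans the null space of the rank-one M
        if x[0] < 0 or x[1] < 0:
            x = (-x[0], -x[1])
        if x[0] > 0 and x[1] > 0:
            found.append((s, x))
    return found


Q = cayley_2x2(2)
print(is_orthogonal(Q), positive_fixed_vectors(Q))
# An orthogonal matrix that is not a Cayley transform still has a sign matrix:
print(positive_fixed_vectors([[1, 0], [0, -1]]))
```

In each of these cases the search returns exactly one sign pattern, in line with the uniqueness claim of the theorem. For larger matrices this enumeration grows as $2^m$ and a general null-space computation would be needed, so the sketch should not be read as a practical algorithm.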
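Proof of the Main Theorem. This is by induction. We show that if the theorem is true for an $m$-th order orthogonal matrix then it is true for an orthogonal matrix of order $m + 1$. Let $Q$ be such a matrix, partitioned as
$$Q = \begin{bmatrix} P & r \\ q^T & \alpha \end{bmatrix} \tag{3}$$
where $P \in \mathbb{R}^{m \times m}$. We may assume that $|\alpha| < 1$ since if not, $r = q = 0$ and the induction step becomes trivial. Since $Q^T Q = I$, $P^T P + qq^T = I$, $P^T r + \alpha q = 0$ and $r^T r + \alpha^2 = 1$. It follows from these relations that the matrices
$$Q_1 = P - \frac{rq^T}{\alpha - 1} \quad \text{and} \quad Q_2 = P - \frac{rq^T}{\alpha + 1} \tag{4}$$
are both orthogonal and that
$$Q_2^T Q_1 = I - \frac{2}{1 - \alpha^2}\, qq^T. \tag{5}$$
Now from the induction hypothesis there exist positive vectors $x_1$ and $x_2$ and sign matrices $S_1$ and $S_2$ such that $Q_1 x_1 = S_1 x_1$ and $Q_2 x_2 = S_2 x_2$ so that, from equation (5),
$$x_2^T S_2 S_1 x_1 = x_2^T Q_2^T Q_1 x_1 = x_2^T x_1 - \frac{2}{1 - \alpha^2}\,(q^T x_1)(q^T x_2). \tag{6}$$
There are now two cases to consider.

Case 1, $S_1 \ne S_2$. In this case, since $x_1$ and $x_2$ are strictly positive, $x_2^T S_2 S_1 x_1 < x_2^T x_1$ so that, from equation (6), neither $q^T x_1$ nor $q^T x_2$ can be zero and both must have the same sign. Define now $\eta_1$ and $\eta_2$ by
$$\eta_1 = -\frac{q^T x_1}{\alpha - 1} \quad \text{and} \quad \eta_2 = -\frac{q^T x_2}{\alpha + 1}.$$
If
$$z_1 = \begin{bmatrix} x_1 \\ \eta_1 \end{bmatrix} \quad \text{and} \quad z_2 = \begin{bmatrix} x_2 \\ \eta_2 \end{bmatrix}$$
then
$$Qz_1 = \begin{bmatrix} S_1 x_1 \\ \eta_1 \end{bmatrix} \quad \text{and} \quad Qz_2 = \begin{bmatrix} S_2 x_2 \\ -\eta_2 \end{bmatrix}.$$
Now since $|\alpha| < 1$, $\alpha - 1 < 0$ and $\alpha + 1 > 0$. Hence, since $q^T x_1$ and $q^T x_2$ both have the same sign, one of $\eta_1$ and $\eta_2$ is positive and thus one of $z_1$ and $z_2$ is the required vector, the corresponding sign matrix being $\mathrm{diag}(S_1, 1)$ or $\mathrm{diag}(S_2, -1)$ respectively. This establishes the induction if $S_1 \ne S_2$.

Case 2, $S_1 = S_2$. In this case, $x_2^T S_2 S_1 x_1 = x_2^T x_1$ so that, from equation (6), at least one of $q^T x_1$ and $q^T x_2$ is zero. We may assume without loss of generality that $q^T x_1 = 0$ so that from equation (4) $Px_1 = S_1 x_1$ and hence
$$\begin{bmatrix} P & r \\ q^T & \alpha \end{bmatrix} \begin{bmatrix} x_1 \\ 0 \end{bmatrix} = \begin{bmatrix} S_1 x_1 \\ 0 \end{bmatrix}. \tag{7}$$
If $z_1 = \begin{bmatrix} x_1 \\ 0 \end{bmatrix}$ equation (7) may, from equation (3), be written $Qz_1 = \hat{S}_1 z_1$ where $\hat{S}_1 = \mathrm{diag}(S_1, \sigma_1)$ and $\sigma_1 = \pm 1$ is undetermined since the last elements of $z_1$ and $Qz_1$ are both zero. Now re-partition $Q$ so that
$$Q = \begin{bmatrix} \alpha_1 & q_1^T \\ r_1 & P_1 \end{bmatrix}$$
where $P_1 \in \mathbb{R}^{m \times m}$ and repeat the previous argument. If Case 1 applies the induction is established; if Case 2 then by analogy with equation (7) there exist a positive vector $x_2$ (say) and sign matrix $S_2$ such that
$$\begin{bmatrix} \alpha_1 & q_1^T \\ r_1 & P_1 \end{bmatrix} \begin{bmatrix} 0 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0 \\ S_2 x_2 \end{bmatrix}$$
and if we define $z_2$ by $z_2 = \begin{bmatrix} 0 \\ x_2 \end{bmatrix}$ and $\hat{S}_2 = \mathrm{diag}(\sigma_2, S_2)$ then $Qz_2 = \hat{S}_2 z_2$ where $\sigma_2 = \pm 1$ and again is not determined since in this case the first elements of $z_2$ and $Qz_2$ are both zero. Adding this last equation to $Qz_1 = \hat{S}_1 z_1$ then gives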
$$Q(z_1 + z_2) = \hat{S}_1 z_1 + \hat{S}_2 z_2. \tag{8}$$
Now for $2 \le j \le m$, the $j$-th elements of both $z_1$ and $z_2$ are strictly positive so that if, for any of these elements, the corresponding diagonal elements of $\hat{S}_1$ and $\hat{S}_2$ are different, $\|\hat{S}_1 z_1 + \hat{S}_2 z_2\| < \|z_1 + z_2\|$. This leads, via equation (8) and the orthogonality of $Q$, to a contradiction so we infer that these elements must be the same. Since $\sigma_1$ and $\sigma_2$ are undetermined they may be chosen so that $\hat{S}_1 = \hat{S}_2$. Equation (8) becomes $Q(z_1 + z_2) = \hat{S}_1 (z_1 + z_2)$ and since $z_1 + z_2$ is strictly positive and $\hat{S}_1$ is a sign matrix, this establishes the induction where $S = \hat{S}_1$. For $m = 1$ the theorem is trivially true ($Q$ and $S$ are both equal to either $+1$ or $-1$) so by the induction argument it is true for all $m \ge 1$, completing the proof of the existence of the vector $x$ and the sign matrix $S$.

To show that $S$ is unique, assume that there exist two positive vectors $x_1$ and $x_2$ and sign matrices $S_1$ and $S_2$, where $S_1 \ne S_2$, such that $Qx_1 = S_1 x_1$ and $Qx_2 = S_2 x_2$. Then $x_1^T x_2 = x_1^T Q^T Q x_2 = x_1^T S_1 S_2 x_2$, but if $S_1 \ne S_2$ we have $x_1^T S_1 S_2 x_2 < x_1^T x_2$, giving an immediate contradiction. $S_1$ therefore is equal to $S_2$ and the sign matrix corresponding to a particular orthogonal matrix is unique, completing the proof.

Corollary 3.1. If $P$ is diagonally similar to some orthogonal matrix $Q$ then $\exists\, x > 0$ and sign matrix $S$ such that $Px = Sx$.

Proof. Let $P = D^{-1} Q D$ where $D$ is diagonal, let $S_1$ be a sign matrix such that $D_1 = S_1 D$ is non-negative and let $Q_1 = S_1 Q S_1$. Since $Q_1$ is orthogonal we have, from the theorem, $Q_1 x_1 = S x_1$ for some $x_1$ and $S$ so that $(D_1^{-1} Q_1 D_1) D_1^{-1} x_1 = S D_1^{-1} x_1$. Now $P = D_1^{-1} Q_1 D_1$ and $D_1^{-1} x_1 > 0$, and the result follows.

It will be noted that in Case 2 of the proof of the theorem, any strictly convex combination of $z_1$ and $z_2$ may be substituted for $z_1 + z_2$ without invalidating the proof so that the strictly positive vector pertaining to $Q$ is not unique. In this case at least two non-negative vectors $z$ (e.g. the $z_1$ and $z_2$ of equation (8)) exist such that $Qz = Sz$ where the sign matrix is not fully determined. The linear programming equivalent of this case is a degenerate solution.

As an example of the application of our main theorem, if $\beta > 0$ in equation (2) then the $x$ and $S$ corresponding to the $Q$ thus obtained are $x^T = [\beta, 1]$ and $S = \mathrm{diag}(-1, 1)$. The eigenvalues of $Q$ are $(1 - \beta^2 \pm 2\beta i)/(1 + \beta^2)$ while those of $SQ$ are $\pm 1$, with the eigenvector corresponding to $+1$ being the above strictly positive value of $x$. If $\beta = 0$ we have the degenerate case and any strictly convex linear combination of the columns of the unit matrix will serve as $x$ with $S$ being equal to the identity. If $\beta < 0$ then $x^T = [1, |\beta|]$ and $S = \mathrm{diag}(1, -1)$.

Our next two theorems play no part in establishing Tucker's theorem and are included because of their strong connections with the previous theorem.

Theorem 3.2. Let $V \in \mathbb{R}^{m \times n}$ where $m \ge n$ and let $V^T V = I$. Then there exist a permutation matrix $P$ and a partitioning $PV = \begin{bmatrix} V_1 \\ V_2 \end{bmatrix}$, and strictly positive vectors $x_1$ and $x_2$, such that (a) $V_2^T x_2 = 0$, (b) $V_1 V_1^T x_1 = x_1$, and (c) $V_2 V_1^T x_1 = 0$, where either $V_1$ or $V_2$ may be vacuous, i.e. have no rows whatsoever.

Proof. Since $2VV^T - I$ is orthogonal there exist, from the Main Theorem, a positive vector $x$ and sign matrix $S$ such that
$$(2VV^T - I)x = Sx. \tag{9}$$
Premultiplying equation (9) by $V^T$ and using $V^T V = I$ gives
$$V^T x = V^T S x. \tag{10}$$
Let now $P$ be a permutation matrix so that $PSx = \begin{bmatrix} x_1 \\ -x_2 \end{bmatrix}$ where $x_1$ and $x_2$ are strictly positive. Equation (10) may then be written $V^T P^T P x = V^T P^T P S x$ or
$$V_1^T x_1 + V_2^T x_2 = V_1^T x_1 - V_2^T x_2$$
from which we obtain (a). Writing out equation (9) in partitioned form and substituting (a) then yields (b) and (c), completing the proof.

Theorem 3.3. Let $P$ be unitary. Then there exist positive vectors $x$ and $y$ and unique sign matrices $S$ and $T$ such that $P(x + iy) = Sx + iTy$.

Proof. Let $P = A + iB$, where $A$ and $B$ are real. Then, as may straightforwardly be shown, $P$ is unitary if and only if $\begin{bmatrix} A & -B \\ B & A \end{bmatrix}$ is orthogonal so that if $P$ is unitary the Main Theorem yields
$$\begin{bmatrix} A & -B \\ B & A \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} Sx \\ Ty \end{bmatrix} \tag{11}$$
for some $x$, $y$, $S$ and $T$. Expanding $(A + iB)(x + iy)$, equating real and imaginary parts and making the appropriate substitutions from equation (11) then yields the theorem.

We now prove Tucker's theorem, the Key Theorem and Farkas's lemma.

Theorem 3.4 (Tucker's theorem). Let $A$ be an arbitrary skew-symmetric matrix. Then $\exists\, u \ge 0$ such that $Au \ge 0$ and $u + Au > 0$.

Proof. Since $A$ is skew-symmetric then $(I + A)^{-1}(I - A)$ is orthogonal so that, from the Main Theorem, there exist an $x = [x_j] > 0$ and unique sign matrix $S$ such that
$$(I + A)^{-1}(I - A)x = Sx.$$
Premultiplying this equation by $I + A$ and re-arranging gives $Au = v$, where $u = [u_j] = x + Sx$ and $v = x - Sx$. Now $u_j$ is equal either to $2x_j$ or zero so that $u \ge 0$. Similarly $v \ge 0$. But $u + v = 2x > 0$ and the theorem follows.
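The construction in the proof of Tucker's theorem can be checked on the skew-symmetric matrix of order 2 used in the earlier example. The sketch below assumes that matrix is $[[0, \beta], [-\beta, 0]]$ with $\beta = 2$, and takes the positive vector $x^T = [\beta, 1]$ and sign matrix $S = \mathrm{diag}(-1, 1)$ quoted in the text for $\beta > 0$ as given rather than computing them. The names `tucker_vectors` and `check_tucker` are hypothetical, introduced only for this illustration.

```python
from fractions import Fraction


def tucker_vectors(x, s):
    """Form u = x + Sx and v = x - Sx, where S = diag(s), as in the proof."""
    u = [xi + si * xi for xi, si in zip(x, s)]
    v = [xi - si * xi for xi, si in zip(x, s)]
    return u, v


def check_tucker(A, u, v):
    """Check Au = v, u >= 0, v >= 0 and u + v > 0 (Tucker's conditions)."""
    n = len(A)
    Au = [sum(A[i][j] * u[j] for j in range(n)) for i in range(n)]
    return (Au == v
            and all(ui >= 0 for ui in u)
            and all(vi >= 0 for vi in v)
            and all(ui + vi > 0 for ui, vi in zip(u, v)))


beta = Fraction(2)
A = [[0, beta], [-beta, 0]]           # skew-symmetric matrix (assumed form)
x, s = [beta, Fraction(1)], [-1, 1]   # x and S taken from the text's example
u, v = tucker_vectors(x, s)
print(u, v, check_tucker(A, u, v))
```

Each component of $u$ is either $2x_j$ or zero, and $v$ is non-zero exactly where $u$ vanishes, which is the complementary pattern that the proof relies on.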
Outline proof of Farkas's lemma. Apply Tucker to
$$\begin{bmatrix} O & O & A & -b \\ O & O & -A & b \\ -A^T & A^T & O & 0 \\ b^T & -b^T & 0^T & 0 \end{bmatrix} \begin{bmatrix} z_1 \\ z_2 \\ x \\ t \end{bmatrix}$$
and consider the two cases $t > 0$ and $t = 0$ with again $z = z_1 - z_2$. If $t > 0$ then the vector in the above system may be normalised so that $t = 1$ from which we obtain $Ax = b$, giving (a). If $t = 0$ then $A^T z \le 0$ but, since $t = 0$, $b^T z > 0$ which yields (b).

Other theorems of the alternative may also be derived simply from Tucker's theorem. For Gale's theorem we consider the system
$$\begin{bmatrix} O & A & -A & -b \\ -A^T & O & O & 0 \\ A^T & O & O & 0 \\ b^T & 0^T & 0^T & 0 \end{bmatrix} \begin{bmatrix} z \\ x_1 \\ x_2 \\ t \end{bmatrix}$$
and Gordan's theorem is a special case of Gale's theorem with $b$ being equal to the vector of ones. Less simply, Motzkin's theorem is a special case of Tucker's theorem applied to a $7 \times 7$ block skew-symmetric matrix while Dax's recent theorem of the alternative is related to a special case of Tucker where the skew-symmetric matrix is of block-order six. All these theorems may be found in e.g. [10]. Examination of the skew-symmetric matrices for these theorems of the alternative shows that they may all be expressed as $\begin{bmatrix} O & M \\ -M^T & O \end{bmatrix}$ for some matrix $M$. It may therefore be possible to use this to simplify the existing theory. Moreover, of all the theorems quoted above, none is based on a $5 \times 5$ block skew-symmetric matrix. Clearly there are further examples of this genre waiting out there to be discovered.

4 CONCLUSIONS

In this paper we have given an alternative proof of Farkas's lemma, a proof that is based on a theorem, the main theorem, that relates to the eigenvectors of certain orthogonal matrices. This theorem is believed to be new, and the author is not aware of any similar theorem concerning orthogonal matrices although he recently proved the weak form of the theorem using Tucker's theorem (see [5]). His proof of the theorem is "completely elementary" (a referee) and requires little more than a knowledge of matrix algebra for its understanding. Once the theorem is established, Tucker's theorem (via the Cayley transform), Farkas's lemma and many other theorems of the alternative follow trivially. Thus the paper establishes a connection between the eigenvectors of orthogonal matrices, duality in linear programming and theorems of the alternative that is not generally appreciated, and this may be of some theoretical interest.
From other points of view, though, the theorem is a little disappointing although these disappointments stem from the nature of the problem that the theorem attempts to illuminate. The first disappointment is that, due to the possibility of degeneracy (Case 2 of the proof), the proof is longer and more untidy than the author would have hoped. However this possibility seems to be an intrinsic part of the problem and has to be taken into account despite the resulting aesthetic shortcomings of the proof. Secondly, and more importantly, the proof gives us virtually no help in constructing an algorithm to actually determine the sign matrix $S$, but this too is inherent in the nature of the problem. Although in Case 1 of the proof the sign matrix of $Q$ is shown to be virtually the same as that pertaining to one of the two smaller orthogonal matrices $Q_1$ and $Q_2$, the proof does not tell us which one to take. These two sign matrices can be quite different and it is generally not possible to obtain a simple derivation of the one from the other. It is thus not possible to base a constructive algorithm for determining $S$ on the proof of the main theorem, which must be regarded as a pure existence proof. This, though, should not have been entirely unexpected. The LP problem is, after all, quite difficult to solve even though polynomial algorithms (the interior-point methods) now exist for its solution (see e.g. [2]), and the ambiguity inherent in the proof is just an embodiment of this very difficulty.

We finally consider briefly the possibility of obtaining an algorithm for determining the sign matrix of a given orthogonal matrix based on the theorem itself rather than on its proof. Some such algorithms of an iterative nature were proposed recently by Broyden and Spaletta [6] but the convergence of these was slow and erratic and they fell a long way short of being competitive. Moreover it was difficult to obtain convergence proofs for them although a partial one was provided by Burdakov [7]. However it may yet be possible to construct such algorithms and the author suspects that if this is the case then any successful example would have more than a passing resemblance to the interior-point algorithms, but only the passage of time will resolve this conjecture.

Acknowledgements. The author thanks Professor Aurel Galantai of the University of Miskolc for his suggested improvements to early drafts of this paper and for providing some of the references. He also appreciates the positive and helpful comments of the two referees, from one of whose reports he has taken the liberty of quoting a paragraph verbatim.
REFERENCES

1. M. S. Bazaraa and J. J. Jarvis. Linear Programming and Network Flows. John Wiley and Sons, first edition, 1977.
2. M. S. Bazaraa, J. J. Jarvis, and H. D. Sherali. Linear Programming and Network Flows. John Wiley and Sons, 2nd edition, 1990.
3. R. G. Bland. A combinatorial abstraction of linear programming. J. Comb. Theory (B), 23:33-57, 1977.
4. C. G. Broyden. Skew-symmetric matrices, staircase functions and theorems of the alternative. In A. Prékopa, J. Szelezsan, and B. Strazicky, editors, System Modelling and Optimisation, Lecture Notes in Control and Information Sciences, pages 133-140, Berlin, 1986. Springer-Verlag.
5. C. G. Broyden. Some LP algorithms. Technical Report 1/175, Consiglio Nazionale di Ricerca, June 1993. Progetto Finalizzato Sistemi Informatici e Calcolo Parallelo, Sottoprogetto 1: Calcolo Scientifico per Grandi Sistemi.
6. C. G. Broyden and G. Spaletta. Some LP algorithms using orthogonal matrices. Calcolo, 32(1-2):51-67, January-June 1995.
7. O. Burdakov. Private communication.
8. V. Chvátal. Linear Programming. Freeman and Co., 1983.
9. A. Dax. The relationship between theorems of the alternative, least norm problems, steepest descent directions and degeneracy: A review. Annals of Operations Research, 46:11-60, 1993.
10. A. Dax. A further look at theorems of the alternative. Technical report, Hydrological Service, Jerusalem, Israel, 1994.
11. A. Dax. An elementary proof of Farkas' lemma. SIAM Review, 39(3):503-507, September 1997.
12. A. Dax and V. P. Sreedharan. On theorems of the alternative and duality. JOTA, 94(3):561-590, September 1997.
13. J. Farkas. Über die Anwendungen des mechanischen Princips von Fourier. Mathematische und Naturwissenschaftliche Berichte aus Ungarn, 12:263-281, 1895.
14. J. Farkas. Die algebraische Grundlage der Anwendungen des mechanischen Princips von Fourier. Mathematische und Naturwissenschaftliche Berichte aus Ungarn, 16:154-157, 1899.
15. J. Farkas. Die algebraischen Grundlagen der Anwendungen des Fourierschen Princips in der Mechanik. Mathematische und Naturwissenschaftliche Berichte aus Ungarn, 15:25-40, 1899.
16. J. Farkas. Über die Theorie der einfachen Ungleichungen. Journal für die reine und angewandte Mathematik, 124:1-24, 1902.
17. A. Fekete. Real Linear Algebra. Marcel Dekker, 1985.
18. R. Fletcher. Practical Methods of Optimization, volume 2. John Wiley and Sons, first edition, 1981.
19. D. Gale. The Theory of Linear Economic Models. McGraw-Hill, New York, 1960.
20. R. A. Good. Systems of linear relations. Review of the Society for Industrial and Applied Mathematics, pages 1-31, 1959.
21. O. L. Mangasarian. Nonlinear Programming. McGraw-Hill, New York, 1969.
22. K. G. Murty. Linear and Combinatorial Programming. Robert E. Krieger Publishing Company, Malabar, Florida, 1985.
23. M. R. Osborne. Finite Algorithms in Optimization and Data Analysis. John Wiley and Sons, Chichester, 1985.
24. A. Prékopa. On the development of optimization theory. Amer. Math. Monthly, 87:527-542, Aug-Sep. 1980.
25. A. W. Tucker. Dual systems of homogeneous linear relations. In H. W. Kuhn and A. W. Tucker, editors, Linear Inequalities and Related Systems, pages 3-17, Princeton, New Jersey, 1956. Princeton University Press.
26. S. Vajda. Mathematical Programming. Addison-Wesley Publishing Company, 1961.
27. G. Ziegler. Lectures on Polytopes. Springer-Verlag, 1995.