
Solution of a Satisficing Model for Random Payoff Games
Author(s): R. G. Cassidy, C. A. Field, M. J. L. Kirby
Source: Management Science, Vol. 19, No. 3, Theory Series (Nov., 1972), pp. 266-271
Published by: INFORMS

MANAGEMENT SCIENCE
Vol. 19, No. 3, November, 1972
Printed in U.S.A.

SOLUTION OF A SATISFICING MODEL FOR RANDOM PAYOFF GAMES*†

R. G. CASSIDY,‡ C. A. FIELD§ AND M. J. L. KIRBY§

* Received January 1971; revised October 1971.
† The research underlying this report was partly supported by ONR Contract N00014-67-A-0126-0009, NRC grant A 4024 and DRB grant 9701-18.
‡ Carnegie-Mellon University.
§ Dalhousie University.

In this paper, we consider a "satisficing" criterion to solve two-person zero-sum games with random payoffs. In particular, a player wants to maximize the payoff level he can achieve with a specified confidence. The problem reduces to solving a nonconvex mathematical programming problem. The main result shows that solving this problem is equivalent to finding the root of an equation whose values are determined by solving a linear problem. This linear problem results from maximizing the confidence with fixed payoff level.

I. Introduction

In this paper we consider a two-person, zero-sum game with $m \times n$ payoff matrix $A = \{a_{ij}\}$, where $A$ is a random matrix with known distribution function $F$. The random variable $a_{ij}$ represents the payoff from player II to player I when player I plays row $i$ and player II plays column $j$. The actual payoff will be $a_{ij}(\omega)$, where $\omega$ is selected from the domain of $a_{ij}$ according to the known marginal probability distribution of the random variable $a_{ij}$.

Given such a payoff matrix $A$, the question arises as to what is meant by playing the game in an optimal way. Because the actual payoff on any play of the game depends not only on the row $i$ selected by player I and the column $j$ selected by player II but also on the sample point $a_{ij}(\omega)$, the players cannot guarantee themselves a certain payoff level. They are in effect forced to gamble. The question of how one gambles in an optimal way is one which is open to considerable discussion and interpretation. Hence the definitions of optimality which we use below are, at least partly, subjective definitions.

One of the most obvious methods of handling this type of game is to replace $a_{ij}$ by its expected value and then solve the resulting deterministic game. This technique can be justified in the context of utility theory as developed by von Neumann and Morgenstern. In this theory, the payoff is expressed in terms of the utility attached to the possible outcomes, and it can be shown [8] that a player's optimal strategy is the one which maximizes the expected value of this utility. Our approach, in this paper, is to consider alternate optimality criteria for situations in which the payoffs are not necessarily given in terms of a utility function.

A model was developed in [2] based on a satisficing criterion of optimality in which a player maximized the probability of his winning a specified amount no matter what strategy his opponent used. Mathematically, this can be expressed as solving

\[
\max_X \min_Y P(Z(X, Y) \ge \beta), \tag{1.1}
\]

where $Z(X, Y)$ is the observed payoff to player I when he uses mixed strategy $X = (x_1, \ldots, x_m)$ and player II uses mixed strategy $Y = (y_1, \ldots, y_n)$. By noting that $Z(X, Y)$ is the random variable $a_{ij}$ with probability $x_i y_j$, it follows that (1.1) can be
rewritten as the linear programming problem:

\[
\begin{aligned}
\max_X \quad & \alpha \\
\text{s.t.} \quad & \sum_{i=1}^{m} x_i P(a_{ij} \ge \beta) \ge \alpha \quad \forall j, \\
& \sum_{i=1}^{m} x_i = 1, \qquad x_i \ge 0 \quad \forall i,
\end{aligned}
\tag{1.2}
\]

where $X = (x_1, x_2, \ldots, x_m)$ is a mixed strategy for player I and $P(a_{ij} \ge \beta)$ is the probability that the random variable $a_{ij}$ is greater than or equal to a prescribed payoff level $\beta$. It is worth noting that (1.2) can be derived directly by means of utility theory if we assume a utility function of the form $u(K) = 1$ if $K \ge \beta$, $u(K) = 0$ if $K < \beta$.

In §II of this paper, we develop an alternative to the satisficing criterion used in (1.2) by considering the case in which a player wants to maximize his payoff level subject to the constraint that he receive that payoff with at least a specified level of confidence. Mathematically the problem can be expressed as

\[
\begin{aligned}
\max \quad & \beta \\
\text{s.t.} \quad & \min_Y P(Z(X, Y) \ge \beta) \ge \alpha,
\end{aligned}
\tag{1.3}
\]

where $\alpha$ is specified by the player. (1.3) is a nonlinear programming problem, and an algorithm for its solution is given in the following section. The result, given in Theorem 2.1, that the model (1.3) is the "inverse" of the model (1.2) forms the basis of the algorithm. In §III, a numerical example is studied.

Before proceeding, we mention briefly some related work. A satisficing objective has been used by Charnes, Kirby and Raike in [3] where they study the problem

\[
\begin{aligned}
\max \quad & \beta \\
\text{s.t.} \quad & P\Big(\sum_{i=1}^{m} x_i a_{ij} \ge \beta\Big) \ge \alpha \quad \forall j, \\
& \sum_{i=1}^{m} x_i = 1, \qquad x_i \ge 0 \quad \forall i.
\end{aligned}
\]

The results we obtain for solving (1.3) do not require independence of the random variables $a_{ij}$ as was required in [3]. Thomas and David in [10] consider the distribution of the value of a random payoff game in terms of the distribution of the payoff matrix $A$. In [6], Hou considers a random payoff game from the point of view of the distribution of the observations as we have done. However, he studies the case where the payoffs are vector valued and the objective is to maximize the long-run expected payoff. Harsanyi [5] deals with problems in which the players have incomplete information about the game. Finally, we mention stochastic games as introduced by Shapley. In these games, the randomness appears in the transition from one stage to the next and the solutions maximize the expected payoff. For recent work and references, the reader is referred to Pollatschek and Avi-Itzhak [9].

II. A Model for Maximizing a Player's Payoff Level

Suppose that player I wants to maximize his payoff level $\beta$, subject to the constraint that he achieve the payoff level $\beta$ with at least a prescribed probability $\alpha$, no matter what strategy his opponent uses. We suppose in this section that $\alpha > 0$, since otherwise the problem is unbounded. This problem can be expressed mathematically as

\[
\begin{aligned}
\max \quad & \beta \\
\text{s.t.} \quad & \min_Y P(Z(X, Y) \ge \beta) \ge \alpha,
\end{aligned}
\tag{2.1}
\]

where $\alpha$ is a given constant, $0 < \alpha \le 1$, and where $Z(X, Y)$ is the observed payoff to player I when he uses strategy $X$ and player II uses strategy $Y$. Since $Z(X, Y)$ is the random variable $a_{ij}$ with probability $x_i y_j$, (2.1) is equivalent to

\[
\begin{aligned}
\max \quad & \beta \\
\text{s.t.} \quad & \sum_{i=1}^{m} x_i P(a_{ij} \ge \beta) \ge \alpha \quad \forall j, \\
& \sum_{i=1}^{m} x_i = 1, \qquad x_i \ge 0 \quad \forall i.
\end{aligned}
\tag{2.2}
\]

In order to examine the relationship that model (2.2) has to (1.2) we first define the following:¹ Let $v(\beta)$ be the optimal value of $\alpha$ in problem (1.2) with fixed payoff level $\beta$. Similarly let $v_1(\alpha)$ be the optimal value of $\beta$ in problem (2.2) with fixed confidence level $\alpha$ for $0 < \alpha \le 1$. Let

\[
X(\alpha, \beta) = \Big\{ x \;\Big|\; \sum_{i=1}^{m} x_i f_{ij}(\beta) \ge \alpha \ \forall j, \quad x_i \ge 0 \ \forall i, \quad \sum_{i=1}^{m} x_i = 1 \Big\},
\]

where $f_{ij}(\beta) = P(a_{ij} \ge \beta)$. Then

\[
v(\beta) = \sup\{\alpha \mid X(\alpha, \beta) \ne \emptyset\}, \qquad v_1(\alpha) = \sup\{\beta \mid X(\alpha, \beta) \ne \emptyset\},
\]

where $\emptyset$ denotes the empty set. Since $f_{ij}(\cdot)$ is nonincreasing, $X(\alpha, \cdot)$ is nonincreasing for each $\alpha$, i.e., $\beta_1 > \beta_2$ implies $X(\alpha, \beta_1) \subset X(\alpha, \beta_2)$. Also, $X(\cdot, \beta)$ is nonincreasing for each $\beta$: $\alpha_1 > \alpha_2$ implies $X(\alpha_1, \beta) \subset X(\alpha_2, \beta)$.

¹ We would like to thank the referee for pointing out a simpler proof of Theorem 2.1.

THEOREM 2.1. Let $v(\beta)$ and $v_1(\alpha)$ be defined as above. Then

\[
v_1(v(\beta)) \ge \beta, \tag{1}
\]
\[
v(v_1(\alpha)) = \alpha. \tag{2}
\]

PROOF. For any specific value of $\beta$, our definitions give $X(v(\beta), \beta') \ne \emptyset$ if $\beta' < \beta$. Hence, $v_1(v(\beta)) \ge \beta$, which is (1).

For (2), for any particular value of $\alpha$, the definition of $v_1(\cdot)$ and the nonincreasing character of $X(\cdot, v_1(\alpha))$ imply $X(\alpha', v_1(\alpha)) \ne \emptyset$ if $\alpha' < \alpha$. Hence

\[
v(v_1(\alpha)) \ge \alpha. \tag{2a}
\]

On the other hand, for $x \in X(\alpha, v_1(\alpha))$ there is a $j$, say $j_0$, with $\sum_{i=1}^{m} x_i f_{ij_0}(v_1(\alpha)) = \alpha$. Therefore, $\alpha' > \alpha$ implies $\sum_{i=1}^{m} x_i f_{ij_0}(v_1(\alpha)) < \alpha'$, so $X(\alpha', v_1(\alpha)) = \emptyset$ if $\alpha' > \alpha$ and

\[
v(v_1(\alpha)) \le \alpha. \tag{2b}
\]

Then (2a) and (2b) yield (2).

Suppose that $f_{ij}$ is strictly monotone for every $i$ and $j$. Then a proof similar to that of (2b) shows

COROLLARY 2.1. If the distribution functions of the $a_{ij}$'s are strictly right monotone at $\beta$, then $v_1(v(\beta)) = \beta$.

This result implies that when the conditions of Corollary 2.1 are satisfied, if we solve
(1.2) with a fixed payoff level $\beta_0$ and obtain a corresponding optimal solution $\alpha_0$, and then solve (2.2) with this confidence level $\alpha_0$, we obtain $\beta_0$ as the optimal value of $\beta$. This fact will be used in the solution algorithm given below.

In (2.2), the problem is not only nonlinear in the variables $x_i$ $(i = 1, \ldots, m)$ and $\beta$, but also the region of feasible solutions may not even form a convex set. The following example exhibits this pathology:

Example 2.1. In (2.2) let $\alpha = 0.6$ and $m = n = 2$. We assume that $a_{11}$ is $N(0, 1)$, i.e., normally distributed with mean 0 and variance 1, $a_{12}$ is $N(1, 1)$, $a_{21}$ is $N(2, 1)$ and $a_{22}$ is $N(0, 1)$. The feasible region consists of the points $(x_1, x_2, \beta)$ which satisfy

\[
x_1 P(a_{1j} \ge \beta) + x_2 P(a_{2j} \ge \beta) \ge 0.6 \quad \text{for } j = 1, 2, \qquad x_1 + x_2 = 1, \qquad x_1, x_2 \ge 0.
\]

It is straightforward to verify that the points $a = (0, 1, -0.255)$ and $b = (0.4, 0.6, 0.315)$ are feasible. Consider the point $c = \tfrac{1}{2}a + \tfrac{1}{2}b = (0.2, 0.8, 0.03)$ which is a convex combination of the feasible points $a$ and $b$. But $0.2\,P(a_{12} \ge 0.03) + 0.8\,P(a_{22} \ge 0.03) = 0.5855 < 0.6$ and hence $c$ is not feasible, so that we have a nonconvex feasible region.

Because of the form of the constraints of (2.2), we note that if $(x, \beta_0)$ is feasible for (2.2), then $(x, \beta)$ is also feasible for all $\beta \le \beta_0$. From this we conclude that every local optimum of (2.2) is also a global optimum.

In order to develop algorithms for solving (2.2) we first show that the optimal $\beta$ must lie in an interval determined by the $(1 - \alpha)$th fractile points of the set of $a_{ij}$. The interval in which an optimal $\beta$ must lie is given by the following lemma:

LEMMA 2.1. The optimal solution $\beta^*$ to (2.2) with given confidence level $\alpha$ lies in the interval $[\max_i \min_j C(i, j), \ \min_j \max_i C(i, j)]$, where $C(i, j) = \sup\{y : P(a_{ij} \ge y) \ge \alpha\}$. That is, $C(i, j)$ is the $(1 - \alpha)$th fractile point of $a_{ij}$. In our notation we have suppressed the dependence of $C(i, j)$ on $\alpha$.

PROOF. This result follows by noting that $\max_i \min_j C(i, j)$ represents the payoff level player I can guarantee if he is restricted to pure strategies and $\min_j \max_i C(i, j)$ is the level which player II can prevent player I exceeding by using pure strategies.

The following theorem establishes the key result for the solution of (2.2).

THEOREM 2.2. If the random variables $a_{ij}$ have continuous, strictly monotone distribution functions in the interval $[\max_i \min_j C(i, j), \ \min_j \max_i C(i, j)]$, then $\beta^*$ is a solution level of (2.2) with confidence $\alpha$ iff $\beta^*$ is a root of the equation $v(\beta) - \alpha = 0$. In addition, if $x^*$ is an optimal strategy for (1.2) with $\beta = \beta^*$, then it is an optimal strategy for (2.2) with confidence level $\alpha$.

PROOF. By the definition of $v_1$, $\beta^*$ is optimal for (2.2) with confidence level $\alpha$ if and only if $\beta^* = v_1(\alpha)$.

($\Rightarrow$) We have that $\beta^* = v_1(\alpha)$. Using part (2) of Theorem 2.1, it follows that $v(\beta^*) = v(v_1(\alpha)) = \alpha$.

($\Leftarrow$) $\alpha = v(\beta^*)$ implies that $v_1(v(\beta^*)) = v_1(\alpha)$. But by Corollary 2.1, $v_1(v(\beta^*)) = \beta^*$. Hence $v_1(\alpha) = \beta^*$ as required.

The last part of the theorem follows by noting that $x$ appears in both (1.2) and (2.2) only in the constraint set and the constraints are identical for both problems.
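Lemma 2.1 and Theorem 2.2 can be illustrated with a minimal Python sketch for a 2 x 2 game whose payoffs are $N(\mu_{ij}, 1)$ with the means of Example 2.1. The helper names `survival`, `game_value` and `bracket` are hypothetical and do not come from the paper, and the closed-form solution of (1.2) used here is a shortcut that is valid only for two rows and two columns; it is assumed for this sketch in place of a general linear programming solver.

```python
import math
from statistics import NormalDist

# Means of the N(mean, 1) payoffs a_ij (rows i, columns j), as in Example 2.1.
MEANS = [[0.0, 1.0],
         [2.0, 0.0]]


def survival(mean, beta):
    """f_ij(beta) = P(a_ij >= beta) for a_ij ~ N(mean, 1)."""
    return 0.5 * math.erfc((beta - mean) / math.sqrt(2.0))


def game_value(beta):
    """Return (v(beta), x1): the optimal alpha of (1.2) and player I's weight on row 1.

    For two columns the objective min_j(...) is concave and piecewise linear
    in x1, so its maximum is at x1 = 0, x1 = 1, or where the two column
    constraints cross.
    """
    p = [[survival(m, beta) for m in row] for row in MEANS]

    def worst_case(x1):
        return min(x1 * p[0][j] + (1.0 - x1) * p[1][j] for j in range(2))

    candidates = [0.0, 1.0]
    denom = (p[0][1] - p[1][1]) - (p[0][0] - p[1][0])
    if denom != 0.0:
        x1 = (p[1][0] - p[1][1]) / denom  # crossing point of the two columns
        if 0.0 <= x1 <= 1.0:
            candidates.append(x1)
    best = max(candidates, key=worst_case)
    return worst_case(best), best


def bracket(alpha):
    """Interval of Lemma 2.1 from C(i, j) = sup{y : P(a_ij >= y) >= alpha}."""
    shift = NormalDist().inv_cdf(1.0 - alpha)  # (1 - alpha) fractile of N(0, 1)
    c = [[m + shift for m in row] for row in MEANS]
    lower = max(min(row) for row in c)  # max_i min_j C(i, j)
    upper = min(max(c[i][j] for i in range(2)) for j in range(2))  # min_j max_i
    return lower, upper


alpha = 0.6
lower, upper = bracket(alpha)
v_lower, _ = game_value(lower)
v_upper, _ = game_value(upper)
print(f"bracket = [{lower:.4f}, {upper:.4f}]")
print(f"v(lower) = {v_lower:.4f}, v(upper) = {v_upper:.4f}")
```

Under these assumptions $v(\beta) - \alpha$ is nonnegative at the lower end of the bracket and nonpositive at the upper end, which is the sign change that any bracketing root finder needs in order to locate the root characterized by Theorem 2.2.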

The implications of this theorem are that, in order to solve (2.2), it suffices to find a root of the function $v(\beta) - \alpha$ where the values of the function are determined by solving a linear programme. We note that, under the conditions of Theorem 2.2, $v(\beta) - \alpha$ is a strictly monotone decreasing continuous function of $\beta$ on $[\max_i \min_j C(i, j), \ \min_j \max_i C(i, j)]$. This follows by observing that $v(\beta)$ is the value of the game with payoff matrix $\{P(a_{ij} \ge \beta)\}$ which is monotone decreasing and continuous as a function of $\beta$. (See [1, p. 56].) Therefore in order to specify an algorithm, it suffices to give a technique for finding the root of a continuous, monotone decreasing function. Since there are a number of techniques available in the numerical analysis literature, we do not specify any details here. For the calculations in the example of §III we use the method of "regula falsi" (see Collatz [4, §18.1-2]) to determine the root of $v(\beta) - \alpha$ numerically.

We now consider briefly the case in which the distribution functions of the $a_{ij}$ are simply continuous. The following result is now true:

THEOREM 2.3. If the random variables $a_{ij}$ have continuous distribution functions over the interval $[\max_i \min_j C(i, j), \ \min_j \max_i C(i, j)]$, then $\beta^*$ is a solution of (2.2) with confidence level $\alpha$ iff $\beta^* = \max\{\beta \mid v(\beta) = \alpha\}$.

PROOF. ($\Rightarrow$) $\beta^*$ optimal implies that $v(\beta^*) = \alpha$ as in Theorem 2.2. It is then clear that $\beta^*$ must be the maximal root of $v(\beta) - \alpha$, since (2.2) is a maximization problem in $\beta$.

($\Leftarrow$) The definition of $\beta^*$ implies that $v(\beta^* + \epsilon) < v(\beta^*)$ for all $\epsilon > 0$. From the proof of Corollary 2.1, we see that this is the result we need to guarantee that $v_1(v(\beta^*)) = \beta^*$. The result follows as in Theorem 2.2.

An algorithm for this case would require not only that we find a root of $v(\beta) - \alpha$, but that we find the maximal root. A first approach would be to determine whether any of the distribution functions occurring in the basis are flat. If so, then it may be possible to increase the value of the root parametrically until the value of the linear programme is less than $\alpha$. Since the values of $v(\beta)$ are determined by solving a linear programme, the techniques of parametric programming apply to determine the maximum $\beta$ such that $v(\beta) = \alpha$. Using known results from linear programming, the variation can be characterized in terms of discrete jumps corresponding to basis changes.

III. Examples

We consider a situation in which two competing firms are marketing the same product. We assume that the demand for the product is fixed and that a consumer in this market will buy the product from one of the two firms. Since the consumers gained by one firm can be thought of as lost to its competitor, we are in the situation of a constant sum game. The goal of each firm is to choose from among its possible courses of action that one which will attract as many customers as possible. Each course of action is identified as a type of marketing expenditure, for example, visual advertising, free samples etc. Thus the mixed strategy $X$ can be used by firm I to determine how to divide its budget among its various marketing alternatives. However, since the response of the consumer to a given course of action by each of the firms is not predictable with certainty, the game is not deterministic. Thus the number of consumers gained by firm I when
it uses its $i$th alternative and its competitor, firm II, uses its $j$th alternative is a random variable $a_{ij}$. For simplicity we assume that each firm has only two marketing alternatives, so that for firm I, $i = 1, 2$ and for firm II, $j = 1, 2$. We suppose the $a_{ij}$ are expressed in units of hundred thousands. From past statistical evidence the two firms determine that the $a_{ij}$ are distributed as follows: $a_{11}$ is $N(0, 1)$, $a_{12}$ is $N(1, 1)$, $a_{21}$ is $N(2, 1)$ and $a_{22}$ is $N(0, 1)$, where $N(\mu, \sigma^2)$ refers to the normal distribution with mean $\mu$ and variance $\sigma^2$. We assume that firm I wants to attract as many customers as possible with a confidence level of 60%, i.e., $\alpha = 0.6$.

To proceed, we note that by Lemma 2.1, the optimal $\beta$ lies between $\max_i \min_j C(i, j) = -0.25$ and $\min_j \max_i C(i, j) = 0.75$, where $C(i, j)$ is the upper 60th percentile of $a_{ij}$, i.e., $P(a_{ij} \ge C(i, j)) = 0.6$. In order to solve the problem numerically, we use the method of "regula falsi" to determine the root of $v(\beta) - 0.6$ (see [4, §18.1-2]). In addition, we specify that $\beta$ is optimal when it determines a value $v(\beta)$ which is within $\epsilon = 0.01$ of the prespecified $\alpha_0 = 0.6$.

Starting with $\beta_1 = 0$, and solving (1.2), the result is $v(\beta_1) = 0.698$. Since $0.698 > 0.6$, this indicates that the optimal $\beta$ is greater than $\beta_1$. An iteration of the method of "regula falsi" yields a value $\beta_2 = 0.315$, and solving (1.2) with $\beta_2 = 0.315$, we obtain $X = (0.6, 0.4)$ and $v(\beta_2) = 0.6045$ which is within the specified tolerance region. Hence the optimal strategy for firm I is $X^* = (0.6, 0.4)$ yielding a payoff level of $315,000 with 60% confidence.

References

1. BOHNENBLUST, H. F. AND KARLIN, S., "Solutions of Discrete Two Person Games," Contributions to the Theory of Games, Vol. I, Ann. of Math. Studies 24, Princeton Univ. Press, Princeton, 1950, pp. 51-72.
2. CASSIDY, R. G., "On Random Payoff Games," Ph.D. dissertation, Dalhousie University, Halifax, Nova Scotia, Canada, 1969.
3. CHARNES, A., KIRBY, M. J. L. AND RAIKE, W. M., "Zero Zero Chance Constrained Games," Theory of Prob. Appl., Vol. 13 (1968), pp. 663-681.
4. COLLATZ, L., Functional Analysis and Numerical Mathematics, Academic Press, New York, 1966.
5. HARSANYI, J. C., "Games with Incomplete Information Played by 'Bayesian' Players, I-III. Part I. The Basic Model," Management Science, Vol. 14 (November 1967), pp. 159-182.
6. HOU, T. F., "Weak Approachability in a Two Person Game," Ann. of Math. Stat., Vol. 40 (1969), pp. 789-813.
7. KUHN, H. W. AND TUCKER, A. W., Contributions to the Theory of Games, Vol. I, Ann. of Math. Studies 24, Princeton University Press, Princeton, New Jersey, 1950.
8. OWEN, G., Game Theory, W. B. Saunders Co., Philadelphia, 1968.
9. POLLATSCHEK, M. A. AND AVI-ITZHAK, B., "Algorithms for Stochastic Games with Geometrical Interpretation," Management Science, Vol. 15, No. 7 (March 1969), pp. 399-415.
10. THOMAS, D. R. AND DAVID, H. T., "Game Value Distribution," Ann. of Math. Stat., Vol. 38 (1967), pp. 242-250.
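The calculation in the numerical example of §III can be retraced with a short Python sketch of regula falsi applied to $v(\beta) - \alpha$. Several things here are assumptions rather than details given in the paper: the helper names `survival`, `game_value` and `regula_falsi` are hypothetical, the bracket $[0, 0.75]$ combines the starting point $\beta_1 = 0$ with the upper bound from Lemma 2.1 (the paper does not state which second point was used), and (1.2) is solved in closed form, which works only for a 2 x 2 game.

```python
import math

# a_ij ~ N(MEANS[i][j], 1), the distributions assumed in section III.
MEANS = [[0.0, 1.0],
         [2.0, 0.0]]


def survival(mean, beta):
    """P(a_ij >= beta) for a_ij ~ N(mean, 1)."""
    return 0.5 * math.erfc((beta - mean) / math.sqrt(2.0))


def game_value(beta):
    """Return (v(beta), x1) for the 2 x 2 version of problem (1.2)."""
    p = [[survival(m, beta) for m in row] for row in MEANS]

    def worst_case(x1):
        return min(x1 * p[0][j] + (1.0 - x1) * p[1][j] for j in range(2))

    candidates = [0.0, 1.0]
    denom = (p[0][1] - p[1][1]) - (p[0][0] - p[1][0])
    if denom != 0.0:
        x1 = (p[1][0] - p[1][1]) / denom  # where the two column constraints cross
        if 0.0 <= x1 <= 1.0:
            candidates.append(x1)
    best = max(candidates, key=worst_case)
    return worst_case(best), best


def regula_falsi(alpha, lo, hi, tol=0.01, max_iter=50):
    """Find beta with |v(beta) - alpha| <= tol, given v(lo) >= alpha >= v(hi)."""
    g_lo = game_value(lo)[0] - alpha
    g_hi = game_value(hi)[0] - alpha
    if g_lo < 0.0 or g_hi > 0.0:
        raise ValueError("v(beta) - alpha must change sign on [lo, hi]")
    for _ in range(max_iter):
        # Zero of the chord through (lo, g_lo) and (hi, g_hi).
        beta = lo - g_lo * (hi - lo) / (g_hi - g_lo)
        value, x1 = game_value(beta)
        g = value - alpha
        if abs(g) <= tol:
            return beta, value, x1
        if g > 0.0:
            lo, g_lo = beta, g  # root lies to the right of beta
        else:
            hi, g_hi = beta, g  # root lies to the left of beta
    raise RuntimeError("regula falsi did not reach the tolerance")


beta, value, x1 = regula_falsi(alpha=0.6, lo=0.0, hi=0.75)
print(f"beta = {beta:.3f}, v(beta) = {value:.4f}, X = ({x1:.2f}, {1.0 - x1:.2f})")
```

With this bracket the first chord step already satisfies the $\epsilon = 0.01$ tolerance, at a payoff level close to the $\beta_2 = 0.315$ reported in §III and with a mixed strategy close to $(0.6, 0.4)$.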