
Systems Research and Behavioral Science
Syst. Res. 19, 281–290 (2002)
DOI: 10.1002/sres.435

Notes and Insights

Non-Trivial Solutions to the N-Person Prisoner's Dilemma


Miklos N. Szilagyi* and Zoltan C. Szilagyi
Department of Electrical and Computer Engineering, University of Arizona, Tucson, AZ 85721, USA

*Correspondence to: Miklos N. Szilagyi, Department of Electrical and Computer Engineering, University of Arizona, Tucson, AZ 85721, USA.

Received 1 September 2000; Accepted 13 March 2001

We have developed a new agent-based simulation tool to model social dilemmas for the case of a large number of not necessarily rational decision-makers (Szilagyi and Szilagyi, 2000). The combination of various personalities with stochastic learning makes it possible to simulate the multi-person Prisoner's Dilemma game for realistic situations. A variety of personality profiles and their arbitrary combinations can be represented, including agents whose probability of cooperation changes by an amount proportional to their reward from the environment. For the case of such agents, the game has non-trivial but remarkably regular solutions. We discuss a method and present an algorithm for making accurate advance predictions of these solutions. We also propose our model as a viable approach for the study of populations of cells, organisms, groups, organizations, communities, and societies. It may lead to a better understanding of the evolution of cooperation in living organisms, international alliances, sports teams, and large organizations. Copyright © 2002 John Wiley & Sons, Ltd.

INTRODUCTION

The Prisoner's Dilemma is usually defined between two players (Axelrod, 1984) and within game theory, which assumes that the players act rationally. Realistic investigations of collective behavior, however, require a multi-person model of the game (Schelling, 1973). In such a game, each rational player will choose defection and, as a result, cooperation will not be learned unless special efforts are applied. The game can be played in two ways: as a round robin, where each player is paired with every other in successive iterations so that interaction between the pairs is repeated (Rapoport et al., 1995), or as a group game, where players remain unidentified (Franzen, 1995; Marinoff, 1999). Various aspects of the multi-person Prisoner's Dilemma game have been investigated in the literature (Bixenstine et al., 1966; Weil, 1966; Kelley and Grzelak, 1972; Hamburger, 1973; Anderson, 1974; Bonacich et al., 1976; Goehring and Kahan, 1976; Dawes, 1980; Heckathorn, 1988; Liebrand et al., 1992; Huberman and Glance, 1993; Komorita and Parks, 1994; Schulz et al., 1994; Schroeder, 1995; Szilagyi, 2000). Formal models have been proposed to simulate collective phenomena (Oliver, 1993). Some of the models include computer simulation.
The vast majority of published results, however, are devoted to two-agent games only, especially to the two-agent iterated Prisoner's Dilemma game. Papers on computer simulation of larger societies have started to appear only recently. Feinberg and Johnson (1990) simulated the effects of alternative strategies on achieving consensus for action. A stochastic learning model was developed by Macy (1991) to explain critical states where threshold effects may cause shifting of the system from a defective equilibrium to a cooperative one. A computer simulation of temporary gatherings was presented by McPhail et al. (1992). Glance and Huberman (1993) used a thermodynamical model to investigate outbreaks of cooperation in a social system. Nowak and May (1992) and Lloyd (1995) wrote simple computer programs that demonstrate the dynamics of deterministic social behavior based on pair-wise interactions between the participants. Epstein and Axtell (1996) demonstrated that it is possible to build complex artificial societies based on simple participating agents.

THE MODEL

We have developed an agent-based model for the investigation of social dilemmas with a large number of decision-makers operating in a stochastic environment (Szilagyi and Szilagyi, 2000). Our model has three distinctive new features:

(1) It is a general framework for inquiry in which the properties of the environment as well as those of the agents are user-defined parameters and the number of agents is theoretically unlimited.
(2) The agents have various distinct, user-defined personalities.
(3) The participating agents are described as stochastic learning cellular automata, i.e., as combinations of cellular automata (Wolfram, 1994) and stochastic learning automata (Narendra and Thathachar, 1989).

Biological objects and even human beings are rarely rational. It seems to us that human behavior is best described as stochastic, but influenced by personality characteristics. In view of this hypothesis, it becomes crucially important to investigate the role of personalities in the Prisoner's Dilemma. The psychological literature on the impact of personalities in social dilemmas is summarized in Komorita and Parks (1994). It is possible, but not easy, to quantify personality profiles in the traditional psychological sense. As a first step in this direction, we will use the term personality in the sense of decision heuristics in this work. This is a rather primitive approach, but it is still much better than the unjustified assumption of universal rationality.

The cellular automaton format describes the environment in which the agents interact. A cellular automaton is a discrete dynamic unit whose behavior is specified in a simple way in terms of its local relation to the behavior of its neighbors; i.e., the behavior of each unit depends on its own and its neighbors' states. Experiments with cellular automaton models confirmed that even trivial, deterministic rules can produce extremely complicated and unforeseeable collective behavior (Nowak and May, 1992). Stochastic learning rules provide more powerful and realistic results than the deterministic rules usually used in cellular automata. Stochastic learning means that behavior is not determined but only shaped by its consequences; i.e., an action of the agent will be more probable, but still not certain, after a favorable response from the environment. Stochastic learning was introduced into the investigation of the two-person Prisoner's Dilemma by Rapoport and Chammah (1965) and has been used by several investigators since (Kraines and Kraines, 1989, 1993, 1995; Macy, 1991). The combination of various personalities with stochastic learning makes it possible to simulate human-like behavior that is generally unpredictable, although some predictions are possible if one knows the agents' personalities.

A stochastic learning automaton is a unit characterized by probability distributions over a number of possible actions. The units are connected to the stochastic outside environment. A stochastic reward/penalty is the only input that the units receive from the environment. The probabilities of the agents' actions are updated by the reward/penalty received from the environment, based on their own and the other agents' behavior. Actions are taken according to these probabilities. The outputs of the stochastic environment are influenced by the actions of all participating units, whose existence may not even be known to the other units. Behavior is learned by adjusting the action probabilities to the responses of the environment. The learning capability alters the agents' behavior as they make repeated decisions. The aggregate behavior of the society of agents usually converges to a stable or oscillating state.

In our model, the participating agents are combinations of the two types of automata described above, with distinctly defined personalities (in the sense discussed above). They are stochastic learning automata directly connected to their neighbors. A realistic simulation model of a multi-person game must include a number of parameters, such as the size and shape of the simulation environment, the pay-off functions, updating schemes for subsequent actions, and the definition of neighborhood.

Our simulation environment is a two-dimensional array of the participating agents. The size and shape of the simulation environment are user-defined variables. Its size is limited only by the computer's virtual memory. The behavior of a few million interacting agents can easily be observed on the computer's screen. A linear environment that consists of at least two participants is a special case that makes the investigation of two-agent games available.

The size of the neighborhood is also a user-defined variable. It may be just the immediate neighborhood or any number of layers of agents around the given agent. In the limiting case, all agents are considered neighbors, and they collectively form the environment for each participating agent; the neighborhood then extends to the entire array of agents. The number of neighborhood layers around each agent and the agent's location determine the number of its neighbors. The depth of agent A's neighborhood is defined as the maximum distance, in the horizontal and vertical directions, that agent B can be from agent A and still be in its neighborhood. We do not wrap around the boundaries; therefore, an agent in the corner of the array has fewer neighbors than one in the middle.
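As an illustration of this boundary effect, here is a minimal C++ sketch (our illustration, not the Dilemma source code; the function name is ours):

```cpp
#include <algorithm>
#include <cstdio>

// Number of neighbors of the agent at (row, col) in a rows x cols array
// for a given neighborhood depth (maximum horizontal/vertical distance).
int neighborCount(int row, int col, int rows, int cols, int depth) {
    // Clip the depth-d square window to the array boundaries (no wrap-around).
    const int top    = std::max(0, row - depth);
    const int bottom = std::min(rows - 1, row + depth);
    const int left   = std::max(0, col - depth);
    const int right  = std::min(cols - 1, col + depth);
    // Every agent in the clipped window is a neighbor, except the agent itself.
    return (bottom - top + 1) * (right - left + 1) - 1;
}

int main() {
    // On a 30 x 30 array with depth 1, an interior agent has 8 neighbors,
    // while a corner agent has only 3.
    std::printf("interior: %d\n", neighborCount(15, 15, 30, 30, 1));
    std::printf("corner:   %d\n", neighborCount(0, 0, 30, 30, 1));
    return 0;
}
```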

The pay-off (reward/penalty) functions are given as two curves: one (C) for a cooperator and another (D) for a defector. The D curve is always above the C curve, but C(1) is higher than D(0); i.e., it is always advantageous to defect, but if everyone cooperates, everyone is better off than if they all defect. The pay-off to each agent depends on its choice, on the distribution of the other players among cooperators and defectors, and also on the properties of the environment. The pay-off curves are functions of the ratio of cooperators to the total number of neighbors (Figure 1). A stochastic factor that is multiplied by a random number can be specified to simulate stochastic responses from the environment. Zero stochastic factors mean a deterministic environment. The freedom of using arbitrary functions for the determination of the reward/penalty system makes it possible to simulate a wide range of social situations, including those where the two curves intersect each other (Schelling, 1973).

[Figure 1. Reward/penalty functions for defectors (D) and cooperators (C). The x-axis represents the ratio of cooperators to the total number of neighbors; the y-axis is the reward/penalty provided by the environment. The thickness of each curve is determined by the stochastic factor.]
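A minimal sketch of such a reward/penalty function, assuming the linear curves used later with Figures 2 and 3 and uniform noise for the stochastic factor (the function name and the noise model are our choices, not the paper's):

```cpp
#include <random>

// The environment's response: two pay-off curves over the cooperation
// ratio, plus an optional stochastic factor multiplied by a random number.
double environmentPayoff(bool cooperated, double ratio,
                         double stochasticFactor, std::mt19937& rng) {
    // Deterministic part: D lies above C everywhere, yet C(1) > D(0),
    // which is what makes the game a Prisoner's Dilemma.
    const double base = cooperated ? (2.0 * ratio - 1.0)   // C curve
                                   : (2.0 * ratio - 0.5);  // D curve
    // Stochastic part: a zero factor means a deterministic environment.
    std::uniform_real_distribution<double> noise(-1.0, 1.0);
    return base + stochasticFactor * noise(rng);
}
```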

The agents take actions according to probabilities updated on the basis of the reward/penalty received for their previous actions and of their personalities. The updating scheme may be different for different agents. This means that agents with completely different personalities can be allowed to interact with each other in the same experiment. Agents with various personalities and various initial states and actions can be placed anywhere in the array. The response of the environment is influenced by the actions of all participating agents.

The specific probability updating schemes depend on the agents' personalities. The probability update curve is user-defined. It specifies the change in the probability of choosing the previously chosen action based on a number of factors. Such factors include the reward/penalty received for that action, the history of rewards/penalties received for all actions, the agents' neighbors' actions, etc. A variety of rational and irrational personality profiles and their arbitrary combinations can be represented. The model in its present form allows for the following personalities:
(1) The Pavlovian agent, whose probability of cooperation p changes by an amount proportional to its reward from the environment (the coefficient of proportionality is the learning rate, a user-defined parameter).
(2) The stochastically predictable agent, who cooperates with a given probability p. Special cases: (a) the benevolent, who always cooperates (p = 1); (b) the angry, who always defects (p = 0); and (c) the unpredictable, who acts randomly (p = 0.5).
(3) The accountant, whose probability of cooperation depends on the average reward for its previous actions.
(4) The conformist, who always imitates the action of the majority.
(5) The greedy, who always imitates the neighbor with the highest reward.

All these personality types represent interesting and more or less realistic models of actual behavior.

In the present study, we will use just one of them, which we call the Pavlovian agent. Most biological objects modify their behavior in accordance with Pavlov's experimental studies, as formulated in Thorndike's (1911) law of conditioning: if an action is followed by a satisfactory state of affairs, then the tendency to produce that particular action is reinforced. We conclude that our Pavlovian agents are the most realistic candidates for the investigation of the evolution of cooperation. These agents are primitive enough not to know anything about their rational choices. However, they have enough intelligence to follow Thorndike's law. We will show below that it is possible to accurately predict the solutions of the multi-person Prisoner's Dilemma for such agents.
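A minimal sketch of this update rule follows (our reconstruction: the paper states only the proportionality, so the clamping of p to [0, 1] and the sign convention for the defecting case follow from Thorndike's law as we read it):

```cpp
#include <algorithm>

struct PavlovianAgent {
    double p;            // current probability of cooperation
    double learningRate; // user-defined coefficient of proportionality

    // Thorndike's law: a reward after an action reinforces the tendency
    // to repeat that action. A reward after cooperating therefore raises
    // p; a reward after defecting lowers it. Penalties (negative payoffs)
    // act in the opposite direction.
    void update(bool cooperated, double payoff) {
        const double delta = learningRate * payoff;
        p = std::clamp(p + (cooperated ? delta : -delta), 0.0, 1.0);
    }
};
```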
It is also very important to properly describe the environment. In our model, even in the almost trivial case when both pay-off curves are straight lines and the stochastic factors are both zero, four parameters specify the environment. Attempts to describe it with a single variable (Komorita, 1976; Nowak and May, 1992) are certainly too simplistic.
Our model is implemented in the simulation tool called Dilemma. It allows the user to draw each agent as a rectangle of a given size. If the experimenter is working with relatively few agents, he may wish to have more than one pixel on the screen represent each agent, so that the agents are more clearly visible.

The state and action of each agent in the array change with time. One unit of time is called an iteration. The experimenter sets up the array at iteration zero. Thereafter, the experimenter can view and record the evolution of the society iteration by iteration. For convenience, Dilemma can be told to run for a given number of iterations. There are two actions available to each agent, and each agent must choose exactly one action: cooperation or defection. The updated probabilities lead to new decisions by the agents, which are rewarded/penalized by the environment.
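Putting these pieces together, one iteration of such a simulation might look like the following sketch (not the Dilemma source; it assumes Pavlovian agents, the limiting all-neighbors case, and the pay-off curves of Figures 2 and 3):

```cpp
#include <algorithm>
#include <cstddef>
#include <random>
#include <vector>

struct Agent {
    double p;        // probability of cooperation
    bool cooperated; // most recent action
};

void iterate(std::vector<Agent>& agents, std::mt19937& rng, double learningRate) {
    std::uniform_real_distribution<double> u(0.0, 1.0);

    // 1. Each agent chooses an action according to its current probability.
    for (auto& a : agents) a.cooperated = (u(rng) < a.p);

    // 2. The environment responds: the pay-off depends on the agent's own
    //    choice and on the ratio of cooperators among its neighbors (here
    //    the whole array). Curves as in Figures 2 and 3: C(r) = 2r - 1 and
    //    D(r) = 2r - 0.5, where r = x/N.
    std::size_t x = 0;
    for (const auto& a : agents)
        if (a.cooperated) ++x;
    const double r = static_cast<double>(x) / static_cast<double>(agents.size());

    // 3. Pavlovian update: each agent's probability of cooperation changes
    //    by an amount proportional to its reward/penalty.
    for (auto& a : agents) {
        const double payoff = a.cooperated ? (2.0 * r - 1.0) : (2.0 * r - 0.5);
        const double delta  = learningRate * payoff;
        a.p = std::clamp(a.p + (a.cooperated ? delta : -delta), 0.0, 1.0);
    }
}
```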
With each iteration, the software tool draws the array of agents in a window on the computer's screen, with each agent colored according to its most recent action, so that the experimenter can view and record the evolution of the society as it changes in time. Dilemma allows the experimenter to use any neighborhood depth. In the current version, each agent has the same neighborhood depth. All of the personality types described in the model are supported by Dilemma. Agents with various personalities and various initial states and actions, as well as various parameters for the state updates associated with their respective personalities, can be placed anywhere in the array. This means that different agents with completely different personalities can be allowed to interact with each other in the same experiment, resulting in observations of how the various personality types respond to one another. Dilemma's design even provides for the addition of entirely new personality types with minimal programming.

When Dilemma is run, it displays two windows, labeled graphics output and status display. The first window contains blocks that represent the agents in the simulation and their most recent actions. The user can ask for a four-color display, which shows the current and previous action for each agent, or a black-and-white display, which shows only the current action for each agent. The second window provides more detailed textual information about individual agents. The experimenter selects the agent to be examined in detail by pointing at it with the mouse. This information includes the agent's coordinates in the array, the agent's two most recent actions, the agent's personality type, the last reward or punishment that the agent received, and the agent's current probability of cooperation. As the mouse pointer moves and as iterations are run, both windows are continuously updated. When the experimenter stops the simulation, the history of aggregate cooperation proportions for each iteration is presented as a list of numerical values as well as an automatically generated plot.

Dilemma is implemented in object-oriented C++. It is portable and currently runs on both Unix (with the X Window System) and Win32-based operating systems (Microsoft Windows 9x and Windows NT). Dilemma is Copyright 1998 Arizona Board of Regents on behalf of The University of Arizona. All rights reserved. It is available from the authors by emailing mns@ece.arizona.edu. Use is subject to certain restrictions, terms of which are provided with the software.

SOLUTIONS TO PRISONER'S DILEMMA

The simulation of the Prisoner's Dilemma is an iterative process. Aggregate cooperation proportions change in time, i.e., over subsequent iterations. As we have concluded above that Pavlovian agents are the most realistic candidates for the investigation of the evolution of cooperation, let us assume that in a society of N Pavlovian agents there are x cooperators and (N − x) defectors distributed randomly over the lattice at a certain time. (The neighborhood extends to the entire array of agents.) Then xC + (N − x)D is the total pay-off received by the entire society, and [xC + (N − x)D]/N is the average pay-off to a single agent, where C and D are the reward/penalty functions as defined earlier.
This latter quantity is the so-called production function for the collective action of the society (Szilagyi, 2000). When the average pay-off is zero, it is easy to think that nothing will happen and an equilibrium is reached. This is, however, not true. Indeed, this situation can only happen if either C = D = 0 or C and D have opposite signs. The first case means the two curves are crossing, which contradicts the definition of the Prisoner's Dilemma. In the second case, D is evidently positive and C negative; therefore, the defectors are rewarded and the cooperators are punished. As a result, the number of cooperators will decrease, and we do not have an equilibrium.

Let us investigate what happens when the cooperators receive the same total pay-off as the defectors, i.e., xC = (N − x)D. This may happen if C and D are both negative or both positive. In the first case, a small number of cooperators are punished severely and a large number of defectors are punished mildly. This leads to a stable equilibrium at this point. In the second case, a large number of cooperators are rewarded slightly and a small number of defectors are rewarded greatly. This point corresponds to an unstable equilibrium.

If C and D are both linear functions of x, then xC = (N − x)D is a quadratic equation; if C and D are quadratic functions, then it is a cubic equation, etc. The equation generally has up to two real solutions. If both solutions are in the interval 0 < x/N < 1, then both equilibria are present. We will call these equilibrium solutions x1/N = p1 and x2/N = p2, so that 0 < p1 < p2 < 1.
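For concreteness, here is the equilibrium condition worked out (our arithmetic) for the linear pay-off curves used later with Figures 2 and 3, writing r = x/N for the cooperation ratio:

\[
C(r) = 2r - 1, \qquad D(r) = 2r - \tfrac{1}{2},
\]
\[
rC(r) = (1 - r)D(r) \;\Longrightarrow\; 2r^{2} - r = (1 - r)\left(2r - \tfrac{1}{2}\right) \;\Longrightarrow\; 8r^{2} - 7r + 1 = 0,
\]
\[
r = \frac{7 \pm \sqrt{17}}{16} \approx 0.180 \;\text{and}\; 0.695.
\]

At the smaller root both pay-offs are negative (mutual punishment), so it is the stable equilibrium p1; at the larger root both are positive (mutual reward), so it is the unstable equilibrium p2. These are exactly the values p1 = 0.18 and p2 = 0.695 quoted with Figures 2 and 3 below.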
The initial cooperation probability (which is set as a constant and uniform across all the agents) is p0. Let us consider the pay-off curves shown in Figure 1 and explained in the previous section. Suppose first that p0 < p1. Then there are few agents cooperating and many agents defecting. Those agents that happened to cooperate will be heavily punished, and their probability of cooperation will consequently go down substantially. As a result, some of the cooperators will become defectors. The agents that happened to defect will be punished somewhat, and their probability of cooperation will consequently go up.
Because there are so many more defectors than cooperators, the aggregate effect is that the overall cooperation probability goes up toward p1.

If p1 < p0 < p2, three regions must be distinguished. Near the lower limit, the defectors and the cooperators are still both punished, but there are more cooperators now who become defectors; therefore, the aggregate effect is that the overall cooperation probability goes down toward p1. If the value of p0 is chosen higher, we are in the region where the cooperators are punished but the defectors are rewarded. As a result, more will defect, and the aggregate probability of cooperation again goes down toward p1. When the value of p0 is even higher, the defectors and the cooperators are both rewarded, but the defectors are rewarded more than the cooperators, so the proportion of cooperators will decrease and an equilibrium will be reached in this region or, if the aggregate probability reaches the region of mutual punishment, the equilibrium will occur at p1 again.

The two cases above work together to keep the long-term aggregate cooperation proportion stable at p1. However, since none of the agents are rewarded for always taking the same action (always cooperating or always defecting), the probability of cooperation for an individual agent varies according to the agent's own history of actions (and hence rewards). Over the long term, every single agent acquires a distinct cooperation probability depending on its own history of random actions. The amplitude of the aggregate oscillation depends on the size of the population: the larger the population, the more effectively the oscillation of each agent's actions is compensated for by the oscillations of all the other agents' actions.

When p2 < p0, there are many agents cooperating and few agents defecting. The agents that cooperated are rewarded for cooperating; at each iteration their cooperation probability tends toward 1. Since their cooperation probability is high, most of the cooperators continue to cooperate. After a few iterations their cooperation probability reaches 1, and they continue to be rewarded, so they can never again defect.
The few agents that happened to defect are also heavily rewarded; this encourages them to defect. The defectors still have a fairly high probability of cooperation, so at each iteration several of the defectors start to cooperate. (Note that there are defectors with a high probability of cooperation and vice versa. What we cannot have is a defector with a probability of cooperation consistently 1, or a cooperator with a probability of cooperation consistently 0.) The few defectors that still continue to defect will be rewarded greatly for their defection; they eventually reach a probability of cooperation of 0, after which they will never cooperate for the duration of the simulation, because they are continuously being rewarded for their defection. After a while, the net result is that most of the agents are cooperating with probability 1 and are being continuously rewarded for doing so, while a few of the agents are always defecting, have cooperation probability 0, and are being continuously rewarded for doing so. Thus, a steady state is reached.

The two solutions are different from each other in three important ways:

(1) The solution at p1 is a stable equilibrium (attractor) with respect to the aggregate cooperation proportion, while the solution at p2 is an unstable equilibrium (repulsor).
(2) The solution converges toward p1 as an oscillation, while it stabilizes exactly in the p2 < p0 case. This is because around p1 the agents are punished no matter what they do and tend to change their cooperation probabilities over time; therefore, these probabilities do not converge to 0 or 1 for any individual agent. In the latter case, each agent in the steady state has a probability of cooperation of 0 or 1, and it is just the proportion of agents cooperating that determines the final aggregate cooperation proportion.
(3) Initial aggregate cooperation proportions of p0 > p2 do not result in the aggregate cooperation proportion converging to 1, as one would expect if one thinks of p2 as an unstable equilibrium. This is because, for an individual agent that started off as a defector, there is always some likelihood that the agent will continue to defect. This likelihood is initially small but continues to increase as the agent is rewarded for defecting. If the number of agents is sufficiently large and p0 is not too close to 1, then there will be some agents that continue to defect until their cooperation probability reaches 0 due to the successive rewards they have received, and these agents will defect forever. The exception is when the initial aggregate cooperation proportion equals 1: then no agent starts as a defector, and there is no chance of any of them defecting in the steady state.

The solutions can be predicted in a similar way for any situation. We have developed an algorithm that accurately predicts the final aggregate outcome for any combination of Pavlovian agents and any pay-off functions. Let us define the aggregate cooperation proportion x(t) for iteration t as the ratio of the number of agents cooperating to the total number of agents. The algorithm computes x(t) for any value of t when the array consists of a large number of agents and each agent is every other agent's neighbor. The initial value x(0) is given. If there are N agents, then Nx(0) agents are initially cooperating and N[1 − x(0)] agents are initially defecting.

First, we take all of the agents in a given iteration of the simulation and distribute them into a set of groups called rows, where each row represents agents that have exactly the same state. Two agents have the same state if and only if they have the same probability of cooperation and the same current action. We define a row as an ordered triple indicating the proportion of the agents described, the probability of cooperation for these agents, and a Boolean value for the action of cooperation (1) or defection (0). Then we define a table as an array containing all of the rows for a certain iteration; Table(t) returns the table that describes iteration t. The sum of the proportions from each row of a table of course always equals 1, so that each agent is described exactly once. A table is essentially a complete description of the state of all the agents in an iteration, with the locations of the agents neglected.
To compute x(t) for any t, first compute Table(t). From Table(t), we can compute x(t) by summing the proportions of agents in each row that describes cooperating agents. We can compute Table(t + 1) from Table(t). This will give us Table(1) based on Table(0), Table(2) based on Table(1), and so on; then we compute x(t) as described above.

Table(0) is as follows, based on the given information:

Row   Proportion of agents (PA)   Probability of cooperation (PC)   Action
1     x(0)                        x(0)                              1
2     1 − x(0)                    x(0)                              0

For each row in the old Table(t), construct two rows in the new Table(t + 1). Denote the ith row in Table(t) as Row[i](t). The two new rows in Table(t + 1) will then be Row[2i](t + 1) and Row[2i + 1](t + 1). To create them, we first compute the probability of cooperation PC(t + 1) for both new rows from that of the old row by using the given update function. Denote the proportion of agents in Row[i](t) as PA[i](t). The two new rows of Table(t + 1) will then look like this:

Row                  Proportion of agents (PA)     Probability of cooperation (PC)   Action
Row[2i](t + 1)       PA[i](t) × PC(t + 1)          PC(t + 1)                          1
Row[2i + 1](t + 1)   PA[i](t) × [1 − PC(t + 1)]    PC(t + 1)                          0

Repeat this procedure for all values of i from 1 to the number of rows in Table(t), and we obtain Table(t + 1). As noted above, creating the series of tables for an arbitrary number of iterations is sufficient to find the aggregate cooperation proportion x(t).
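The procedure above translates directly into code. The following is our C++ reconstruction, not the authors' program; the learning rate and the clamping of PC to [0, 1] are assumptions, and the pay-off curves are the ones used with Figures 2 and 3:

```cpp
#include <algorithm>
#include <cstdio>
#include <vector>

struct Row {
    double pa; // proportion of agents described by this row
    double pc; // their probability of cooperation
    bool coop; // their current action: true = cooperate
};

// Pay-off curves as in Figures 2 and 3: C(r) = 2r - 1, D(r) = 2r - 0.5.
double payoff(bool coop, double r) {
    return coop ? (2.0 * r - 1.0) : (2.0 * r - 0.5);
}

// x(t): sum the proportions of all cooperating rows.
double cooperation(const std::vector<Row>& table) {
    double x = 0.0;
    for (const auto& row : table)
        if (row.coop) x += row.pa;
    return x;
}

// Build Table(t + 1) from Table(t): every row splits into two.
std::vector<Row> step(const std::vector<Row>& table, double learningRate) {
    const double r = cooperation(table); // everyone is everyone's neighbor
    std::vector<Row> next;
    next.reserve(2 * table.size());
    for (const auto& row : table) {
        // Pavlovian update of PC for both new rows.
        const double delta = learningRate * payoff(row.coop, r);
        const double pc =
            std::clamp(row.pc + (row.coop ? delta : -delta), 0.0, 1.0);
        // Row[2i](t + 1): the fraction that cooperates next;
        // Row[2i + 1](t + 1): the fraction that defects next.
        next.push_back({row.pa * pc, pc, true});
        next.push_back({row.pa * (1.0 - pc), pc, false});
    }
    return next;
}

int main() {
    const double x0 = 0.6;           // initial cooperation proportion
    const double learningRate = 0.1; // illustrative value only
    std::vector<Row> table = {{x0, x0, true}, {1.0 - x0, x0, false}};
    for (int t = 1; t <= 20; ++t) { // Table(20) already has 2^21 rows
        table = step(table, learningRate);
        std::printf("x(%d) = %.4f\n", t, cooperation(table));
    }
    return 0;
}
```

Representing each row as a proportion rather than a count is what makes the prediction independent of N; the price is the doubling of the number of rows noted below.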

As Table(0) has 2 rows, and the number of rows doubles whenever t is incremented, Table(t) has 2^(t+1) rows. Therefore, this is an exponential algorithm, but we were able to compute the value of x(t) for t = 20 iterations in a couple of minutes. The predictions are exact for an infinite number of agents, but the experimental results of the simulation approximate the predictions very closely even for a few hundred agents, and they are in complete agreement with the above qualitative explanation.

Figure 2 shows the evolution of the Prisoner's Dilemma for 900 agents when the reward/penalty functions are given as D = 2x/N − 0.5 and C = 2x/N − 1, and the initial aggregate cooperation probability is p0 = 0.6. The agents are distributed randomly over the lattice. The graph shows the proportion of cooperating agents as a function of time (number of iterations). The solution oscillates around p1 = 0.18.

[Figure 2. Evolution of the Prisoner's Dilemma for the case when D = 2x/N − 0.5, C = 2x/N − 1, and p0 = 0.6. The graph shows the proportion of cooperating agents as a function of the number of iterations. The solution oscillates around p1 = 0.18.]

Figure 3 shows a completely different picture, although the only difference is that the initial aggregate cooperation probability is p0 = 0.8. The solution converges to a constant proportion of cooperating agents, p_final = 0.93 > p2 = 0.695.

[Figure 3. Evolution of the Prisoner's Dilemma for the case when D = 2x/N − 0.5, C = 2x/N − 1, and p0 = 0.8. The graph shows the proportion of cooperating agents as a function of the number of iterations. The solution converges to p_final = 0.93 > p2 = 0.695.]
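(For reference, the iteration sketch given in THE MODEL section, run with 900 agents and these same curves, should qualitatively reproduce Figures 2 and 3: oscillation near p1 for p0 = 0.6 and convergence above p2 for p0 = 0.8. The figures do not state the learning rate, so it would have to be chosen by experiment.)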

CONCLUSION

As our Pavlovian agents are not rational, their behavior cannot be explained on the basis of game theory. The solutions presented in this paper are non-trivial but remarkably regular and predictable. Universal defection is possible, but it occurs only when p1 = 0 and p0 < p2, or when p2 is the only solution and p0 ≪ p2.

Our model may constitute a viable approach for the study of populations of cells, organisms, groups, organizations, communities, and societies. It may lead to a better understanding of the evolution of cooperation in living organisms, international alliances, sports teams, and large organizations.

REFERENCES

Anderson JM. 1974. A model for "The Tragedy of the Commons." IEEE Transactions on Systems, Man, and Cybernetics: 103–105.
Axelrod R. 1984. The Evolution of Cooperation. Basic Books: New York.
Bixenstine VE, Levitt CA, Wilson KV. 1966. Collaboration among six persons in a Prisoner's Dilemma game. Journal of Conflict Resolution 10(4): 488–496.
Bonacich P, Shure GH, Kahan JP, Meeker RJ. 1976. Cooperation and group size in the N-person Prisoner's Dilemma. Journal of Conflict Resolution 20(4): 687–706.
Dawes RM. 1980. Social dilemmas. Annual Review of Psychology 31: 169–193.
Epstein JM, Axtell R. 1996. Growing Artificial Societies. Brookings Institution Press/MIT Press: Washington/Cambridge/London.
Feinberg WE, Johnson NR. 1990. Radical leaders, moderate followers: effects of alternative strategies on achieving consensus for action in simulated crowds. Journal of Mathematical Sociology 15: 91–115.
Franzen A. 1995. Group size and one-shot collective action. Rationality and Society 7(2): 183–200.
Glance NS, Huberman BA. 1993. The outbreak of cooperation. Journal of Mathematical Sociology 17(4): 281–302.
Goehring DJ, Kahan JP. 1976. The uniform N-person Prisoner's Dilemma game. Journal of Conflict Resolution 20(1): 111–128.
Hamburger H. 1973. N-person Prisoner's Dilemma. Journal of Mathematical Sociology 3: 27–48.
Heckathorn DD. 1988. Collective sanctions and the creation of Prisoner's Dilemma norms. American Journal of Sociology 94(3): 535–562.
Huberman BA, Glance NS. 1993. Evolutionary games and computer simulations. Proceedings of the National Academy of Sciences USA 90: 7716–7718.
Kelley HH, Grzelak J. 1972. Conflict between individual and common interest in an N-person relationship. Journal of Personality and Social Psychology 21(2): 190–197.
Komorita SS. 1976. A model of the n-person dilemma-type game. Journal of Experimental Social Psychology 12: 357–373.
Komorita SS, Parks CD. 1994. Social Dilemmas. Brown and Benchmark: Madison, WI.
Kraines D, Kraines V. 1989. Pavlov and the Prisoner's Dilemma. Theory and Decision 26: 47–79.
Kraines D, Kraines V. 1993. Learning to cooperate with Pavlov: an adaptive strategy for the iterated Prisoner's Dilemma with noise. Theory and Decision 35: 107–150.
Kraines D, Kraines V. 1995. Evolution of learning among Pavlov strategies in a competitive environment with noise. Journal of Conflict Resolution 39(3): 439–466.
Liebrand WBG, Messick DM, Wilke HAM (eds). 1992. Social Dilemmas: Theoretical Issues and Research Findings. Pergamon Press: Oxford.
Lloyd AL. 1995. Computing bouts of the Prisoner's Dilemma. Scientific American 272(6): 110–115.
Macy MW. 1991. Chains of cooperation: threshold effects in collective action. American Sociological Review 56: 730–747.
Marinoff L. 1999. The tragedy of the coffeehouse. Journal of Conflict Resolution 43(4): 434–450.
McPhail C, Powers WT, Tucker CW. 1992. Simulating individual and collective action in temporary gatherings. Social Science Computer Review 10: 1–28.
Narendra KS, Thathachar MAL. 1989. Learning Automata: An Introduction. Prentice-Hall: Englewood Cliffs, NJ.
Nowak MA, May RM. 1992. Evolutionary games and spatial chaos. Nature 359: 826–829.
Oliver PE. 1993. Formal models of collective action. Annual Review of Sociology 19: 271–300.
Rapoport A, Chammah AM. 1965. Prisoner's Dilemma. University of Michigan Press: Ann Arbor, MI.
Rapoport A, Diekmann A, Franzen A. 1995. Experiments with social traps IV. Rationality and Society 7(4): 431–441.
Schelling TC. 1973. Hockey helmets, concealed weapons, and daylight saving. Journal of Conflict Resolution 17(3): 381–428.
Schroeder DA (ed.). 1995. Social Dilemmas: Perspectives on Individuals and Groups. Praeger: Westport, CT.
Schulz U, Albers W, Mueller U (eds). 1994. Social Dilemmas and Cooperation. Springer: Berlin.
Szilagyi MN. 2000. Quantitative relationships between collective action and Prisoner's Dilemma. Systems Research and Behavioral Science 17: 65–72.
Szilagyi MN, Szilagyi ZC. 2000. A tool for simulated social experiments. Simulation 74: 4–10.
Thorndike EL. 1911. Animal Intelligence. Hafner: Darien, CT.
Weil RL. 1966. The N-person Prisoner's Dilemma: some theory and a computer-oriented approach. Behavioral Science 11: 227–234.
Wolfram S. 1994. Cellular Automata and Complexity. Addison-Wesley: Reading, MA.
