A Dissertation Presented to the Faculty of the Graduate School of Cornell University in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy
by Manabu “Mark” Kimura May 2002
TABLE OF CONTENTS
CHAPTER ONE: INTRODUCTION
    Use of Computers in Regional Science
    Why Simulation?
    Role of Randomness
    Agent-Based Models
    Spatial Issues
CHAPTER TWO: BACKGROUND
CHAPTER THREE: INTRODUCTION TO AGENT-BASED MODELS
    A Highly Simplified Model
    A More Realistic Simplified Model
CHAPTER FOUR: GENERALIZED SPATIAL AGGLOMERATION MODEL
    Properties of Location: Q-Vector
    Decision Making Process in Migration
CHAPTER FIVE: APPLICATIONS OF THE GENERALIZED SPATIAL AGGLOMERATION MODEL
CHAPTER SIX: APPLICATION #1: A SIMPLE MODEL
    Specifications
    Simulation Results
    Discussion
CHAPTER SEVEN: APPLICATION #2: RESOURCE DEPLETION AND POLLUTION ACCUMULATION
    Specifications
    Simulation Results and Discussions
CHAPTER EIGHT: APPLICATION #3: PROPAGATION
    Specifications
    Simulation Results
CHAPTER NINE: APPLICATION #4: GEOGRAPHY
    Specifications
    Simulation Results
CHAPTER TEN: CONCLUSIONS
BIBLIOGRAPHY
LIST OF TABLES
Table 6.1: Perfectly Rank-Sized Distribution
Table 6.2: Not Rank-Sized Hierarchical Distribution
LIST OF FIGURES
Figure 3.1: An example of Moore neighborhood.
Figure 3.2: A snap shot of the space.
Figure 3.3: Initial distribution of agents.
Figure 3.4: First village of 2.
Figure 3.5: First village of 3.
Figure 3.6: Emerged clusters at t = 0.06.
Figure 3.7: Three-dimensional representation of urban clusters at t = 0.06.
Figure 3.8: Three-dimensional representation of urban clusters at t = 0.1.
Figure 3.9: Migrants’ net payoff with negative externalities (π = 0.9, d = 1).
Figure 3.10: Emergence of cities with 98 family units, π = 0.9, c = 0.01, t = 0.2.
Figure 3.11: Further development of cities with 98 family units, π = 0.9, c = 0.01, t = 0.5.
Figure 4.1: Distribution of attractive forces around cell i where p = 1 and G_j = 1 for all j.
Figure 4.2: Distribution of attractive forces around cell i where p = 1 and G_i = 1 for all j except for Γ(Q_j) = 70 at d_ij = 15.
Figure 6.1: Spatial patterns in equilibrium under various sets of parameters.
Figure 6.2: Evolution of spatial patterns at c = 0.3.
Figure 6.3: Effect of p and c on number of clusters.
Figure 6.4: Effect of p and c on average cluster size.
Figure 6.5: Variance of cluster sizes on the p − c space.
Figure 6.6: Spatial patterns with travel cost uniformly distributed from 0 to 1.
Figure 6.7: RSR Index vs. p under c = 0.2.
Figure 6.8: Set of p and c that offer realistic population distributions added to figure 6.5.
Figure 7.1: Initial distribution of resources for Case 1.
Figure 7.2: Formation of clusters when resources exist with low transport cost.
Figure 7.3: Formation of clusters when resources exist with high transport cost.
Figure 7.4: Formation of clusters and behavior of mega-cluster with both resources and pollution (low transport cost).
Figure 7.5: Formation of clusters under larger influence of pollution.
Figure 7.6: Initial distribution of resources for Case 5.
Figure 7.7: Behavior of agents with depletable resources concentrated in the center of the space.
Figure 7.8: Behavior of agents with renewable resources concentrated in the center of the space and with accumulation of pollutants.
Figure 7.9: Behavior of agents with renewable resources concentrated in the center of the space and with pollutant purification.
Figure 8.1: Propagation of agents.
Figure 8.2: Propagation of agents in a larger space.
Figure 9.1: Initial condition for q_{i,4} for all i (North American Map).
Figure 9.2: Formation of urban clusters of North America simulated under various transport cost ranges.
Figure 9.3: Snapshots of simulations at t = 100 under 0 < p < 10.
CHAPTER ONE: INTRODUCTION
The role of science is to expand our knowledge about the universe and supply the findings to the fields that make use of them. For example, mathematics arms physicists with rigorous logical reasoning; physics allows astronomers to understand how objects interact with each other; astronomy helps engineers construct a guidance system for a space shuttle; and, as a result, we go to outer space. Conversely, the progress of science has also been helped by engineering: in the late 20th century especially, the advance of computer capability changed the way most scientific work is done. Likewise, regional science has exploited the computer capabilities of recent years in order to expand our knowledge about our economy in terms of space and offer the findings to policy makers.
Use of Computers in Regional Science
In regional science, computers are used in three ways: (1) massive computation of simple mathematical problems, which involves data processing and numerical solution of systems of nonlinear equations; (2) Geographic Information Systems (GIS); and (3) simulation.
(1) Massive computation of simple mathematical problems includes implementations of traditional methods such as statistical analyses (e.g. econometrics¹), Input-Output (I-O) analysis and computable general equilibrium (CGE) models. Regression analyses and I-O analyses with real-life data require a series of calculations on large matrices, which are impractical to do by hand. A CGE model, since it normally includes a number of non-linear equations, cannot be solved analytically; thus, we need to resort to approximations by numerical calculations. These methods have been extensively developed over the last several decades.
(2) Geographic Information Systems, which is an analysis and visualization tool rather than a method, is used in regional science when one applies his/her models to real-world geography. GIS has been actively used for less than a decade and will be extensively used with existing and new methods.
(3) Simulation is a common technique in natural sciences and engineering such as Operations Research, but its use has been rare in regional science. It is the first objective of this study to introduce this new method to the field of regional science.

¹ Guy Orcutt (1990), who is known as the pioneer of microsimulation, designed the first multiple regression analyzer during World War II.
Why Simulation?
Simulation has advantages over analytical approaches in some situations. First, it is easy to deal with dynamic phenomena; in fact, it is often inevitable for a model to be dynamic. Second, one can model systems that exhibit complex spatial patterns and/or non-linear behaviors. Third, randomness (more precisely, stochasticity) can explicitly be added to a model.
Role of Randomness
Traditionally, when economists deal with random phenomena they use probabilistic approaches such as sunspot equilibria, but with simulation it is possible to leave randomness truly random. In other words, every time we run a simulation with the same environment we would observe a different result. It is the second objective of this study to show how models with stochastic factors work and how randomness affects the results.
Agent-Based Models
An agent-based model can be defined as a system of numerous autonomous agents who interact with each other according to each agent’s rules, which are often local. Also, it is known that even very simple rules can cause extremely complex behaviors. The use of agent-based models is an ideal approach to make use of all the advantages mentioned above; namely, the model can be dynamic and simultaneously capable of simulating complex and random phenomena.
One of the implications of using simulation is that we could “experiment” with new policies before applying them to the real world. With agent-based models, in particular, we could also experiment on an individual’s preferences and responses to the environment. Indeed, with agent-based approaches, it is even possible to model adaptive behaviors of economic agents without relying on traditional economic assumptions such as rational expectations.
The difficulty in, and significance of, modeling the complex behaviors of the real-world economy were pointed out by Isard (1956) in the preface of “Location and Space-Economy”:
“…A presentation of conditions of equilibrium in a theoretical system may seem to imply a tendency toward the attainment of a state of equilibrium in the real world. But in a full historic sense, actual economic life never does realize a state of equilibrium. There are always changes impinging upon the economy. The process of adjustment is constantly in operation. Witness, for example, the adaptation of population to environment. There has never been a complete adjustment which might be said to characterize an optimum or equilibrium spatial distribution of population.”
When this was written in 1956, analytical tools were the only ones available to most researchers; in fact, the methods introduced in the book attack complex problems within the framework of analytical approaches. With agent-based models, however, it is not a demanding task to create a model that reproduces the kind of complex behaviors described in the citation above. The third objective of this research is to demonstrate this.
In general, agent-based models are uniquely capable of handling the following:
(1) Interaction between agents. Unlike traditional microeconomic theories (not to mention macroeconomics), which consider only a representative agent interacting with no other agents or with a few at most, agent-based approaches can explicitly model the interaction between each agent and an arbitrary number of other agents.
(2) Asymmetrical space. Agent-based models allow us to assess complex-shaped spaces. For example, the shape of the space could be as complex as a map of the United States.
(3) Heterogeneous agents. Since the concept of agent-based models is object-oriented, it is relatively easy to model heterogeneity of agents.
Note that we integrate all of these into one set of simple mathematical expressions.
Spatial Issues
As implied in the last part of the citation, one of the dominant factors that complicate the real-life economy is the existence of space, which is, of course, the motivation of regional science. Traditional methods in regional science such as those in “Methods of Interregional and Regional Analysis” (1998) do answer some economic questions with regard to location, but by themselves they have the limitation that the geography must be highly simplified so that the questions can be solved analytically. In other words, if we are to apply an existing method to the real geography, there is no way for the method to recognize the complex shapes and dimensions of political and economic units.
It is an advantage of agent-based models that the physical complexity of the space does not matter, because an agent’s local rules are concerned only with its neighborhood, which has a simple shape². For this reason, existing methods with a simple scalar dimension, such as the gravity model, could be soundly incorporated into an agent-based model and used for vectorial analyses. The final objective of this study is to integrate a few existing regional science methods into agent-based models and construct a general spatial agglomeration model that can be used in other methods in regional science.

² Moreover, as will be shown in later chapters, the intricate spatial distribution of population emerges rather than being exogenously given. This concept of emergence, which is often seen in complexity science, is consistent with historical evidence.
CHAPTER TWO: BACKGROUND
Krugman (1996) constructed an elaborate economic model using the idea of “monopolistic competition” and developed it into a migration model. His model, however, does not allow a region to have “size”; in other words, each region is a point in the space. Furthermore, the scopes of both trade and economic agents’ ability to obtain information about other regions are global. As a result, depending on the values of the parameters, centrifugal and centripetal forces cannot coexist in the whole system, and this prevents the model from producing realistic spatial patterns.
Page (1999), on the other hand, applied simple rules to agents’ migration and obtained results both analytically and by simulation. He assumed a lattice as the space with each border not connected to the other side; this border condition is a strong assumption, which dominates some of the results presented in his study. Although it is an interesting case to consider, it lacks generality and often hides the effects of other important parameters. The scope of agents was either strictly global or strictly local.
Schweitzer and Steinbrink (1997) used a stochastic model to simulate urban agglomeration based on empirical facts such as the Rank-Size rule and developed a kinetic model, which generates realistic spatial distributions. Although the model is elaborate in terms of physics and the result is consistent with Berlin’s spatial distribution, it lacks economic rationale; the realistic results seem coincidental. Therefore, the model does not appear to be applicable to general urban problems.
Axtell and Epstein (1996) developed an agent-based model on a closed lattice space and demonstrated extensive experiments, which included birth, death, gender, culture, conflict, disease and so on. In their model, however, the agglomeration of agents is governed by the initial distribution of “sugar”, which is exogenously given, so the resulting spatial patterns do not arise by emergence.
CHAPTER THREE: INTRODUCTION TO AGENT-BASED MODELS
In this chapter, a spatial agglomeration model is presented as an example of agent-based models. The model is kept just simple enough to illustrate how one could construct an agent-based model for social phenomena and show its outcome. In general, the steps involved in developing an agent-based model would be:
1. Define a world where agents perform their activities.
2. Define behavioral rules of each agent. This often includes
   a. How it behaves by itself,
   b. How it interacts with other agents, and
   c. How it responds to the changes in the world.
3. Define how the world responds to the actions taken by agents.
4. Build an algorithm.
5. Write a computer program for simulation.
Furthermore, in order to utilize the model for analysis, one may also:
6. Construct hypotheses.
7. Determine simulations that would test these hypotheses.
8. Run simulations.
9. Analyze the results.
In the following sections, these steps (except for 2.c and 3) are followed to build the model.
A Highly Simplified Model
Specifications of Space
The space in which economic agents live is set to be a lattice of cells, each of which can contain multiple agents. Furthermore, we create a space without edges by connecting one horizontal edge to the other and one vertical edge to the other³, making a torus, in order to avoid potential border effects, which are undesirable for the purpose of this chapter⁴. The distance between any two cells is defined by its Moore neighborhood. Figure 3.1 shows an example of a Moore neighborhood⁵ where the distance from the central cell is marked in each cell. We choose this type of distance (L∞ distance) rather than Euclidean (L2) distance because the space is a lattice and for ease of coding. It is also known that the choice of distance does not affect the qualitative results (Durrett and Levin, 1994).

³ This is a setting commonly seen in spatial interaction models.
⁴ For example, if we are to construct a model that includes trades between locations, all agents might simply gather at the central cell of the lattice to minimize the transport cost. In this case, there could be no place for randomness.
⁵ An alternative neighborhood is:

            4
          4 3 4
        4 3 2 3 4
      4 3 2 1 2 3 4
    4 3 2 1 0 1 2 3 4
      4 3 2 1 2 3 4
        4 3 2 3 4
          4 3 4
            4

which is characterized by L1 distance, or Manhattan distance, which might be, as its name suggests, appropriate for certain problems in regional science.

    4 4 4 4 4 4 4 4 4
    4 3 3 3 3 3 3 3 4
    4 3 2 2 2 2 2 3 4
    4 3 2 1 1 1 2 3 4
    4 3 2 1 0 1 2 3 4
    4 3 2 1 1 1 2 3 4
    4 3 2 2 2 2 2 3 4
    4 3 3 3 3 3 3 3 4
    4 4 4 4 4 4 4 4 4
Figure 3.1: An example of Moore neighborhood.
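The two distance measures discussed above can be sketched in a few lines. This is a minimal illustration only, not the dissertation's C++ code: the function names (`torus_delta`, `moore_distance`, `manhattan_distance`) and the coordinate convention (cells as `(x, y)` tuples on a 50 × 50 torus) are assumptions made for the example.

```python
def torus_delta(a, b, size):
    """Shortest separation between coordinates a and b on a ring of `size` cells."""
    d = abs(a - b) % size
    return min(d, size - d)


def moore_distance(p, q, size=50):
    """L-infinity distance between cells p and q on a size x size torus."""
    dx = torus_delta(p[0], q[0], size)
    dy = torus_delta(p[1], q[1], size)
    return max(dx, dy)


def manhattan_distance(p, q, size=50):
    """L1 (Manhattan) distance between cells p and q on the same torus."""
    dx = torus_delta(p[0], q[0], size)
    dy = torus_delta(p[1], q[1], size)
    return dx + dy


if __name__ == "__main__":
    # Print the 9 x 9 table of Moore distances around a central cell.
    centre = (25, 25)
    for y in range(21, 30):
        print(" ".join(str(moore_distance(centre, (x, y))) for x in range(21, 30)))
```

Because the edges are joined, opposite corners of the lattice are neighbors: `moore_distance((0, 0), (49, 49))` is 1 in this sketch.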
Figure 3.2 is a snapshot of a lattice of 50 × 50 (=2500) cells with most cells occupied by one family unit (small white dot), some with 2-9 family units (numbered) and others empty (black). In addition, notice that there is a non-inhabitable area represented by 23 white cells in the center of the lattice. Let us consider this area as a lake, swamp, desert or other form of infertile area. Also, we assume no natural resources such as oil, coal, etc., so, in other words, there are only human resources.
Figure 3.2: A snap shot of the space.
Productivity of Agents
Consider each agent as a family unit and assume that there are N homogeneous agents. The basic principle of each family’s activity is, as in conventional microeconomic problems⁶, that it attempts to maximize its productivity. To obtain the productivity of family α, consider first the gross regional product of location i in which α resides. We conceive agglomeration to originate in economies of scale (e.g. specialization, localization, spatial juxtaposition, etc.) and, as stated above, labor is the only input of production; thus, if we let $n_i$ denote the population in location i, we set up the production function,

$$F_i(n_i) = n_i^2, \tag{1}$$

which indeed exhibits increasing returns to scale. Therefore, since all $n_i$ agents are homogeneous, the average product of i, or the real wage of family α, is

$$f_i(n_i) = \frac{n_i^2}{n_i} = n_i. \tag{2}$$

This suggests that in order for family α to gain more real wage, it needs to increase the number of other family units around it, which it can do by migrating to another location that has a larger population⁷.

⁶ Of course, we may introduce other factors, in fact as many factors as we want, including non-economic ones. However, the main concern of this chapter is a smooth transition from conventional economics; thus, it is appropriate to stick to the traditional microeconomic way. The generalized model in the next chapter allows us to include other factors systematically.
Law of Motions
Family α (which is in location i) relocates to location j if such a move gives α the best net gain among all possible moves, including no relocation. A net gain is obtained by subtracting the transport cost of migration from the real wage at the destination. Therefore, if α relocates from i to j, the net gain of family α is

$$G_\alpha(n_j) = n_j - \pi d_{ij} \tag{3}$$

where π is the transport rate per unit distance and $d_{ij}$ is the distance between i and j. Thus, α chooses the j that maximizes $G_\alpha$.

⁷ The family could also propagate and increase the population in the same location. The idea of birth and death will be added to the models introduced in chapter 8.
Note that family α is assumed to perceive the gross regional product of every location or have perfectly global scope. This is a somewhat unrealistic assumption (and not common in agent-based models). Family α , however, tends to choose j that is in the neighborhood of i due to the existence of transport cost and behaves as if the family ignored distant locations. The models presented in the following chapters have a more realistic approach toward agents’ scope.
Once we set up the characteristics of the space and the behaviors of agents, we need to build a flow of simulation, or an algorithm. In this model, first, parameters are initialized as follows:
• Dimension of Space: 50 × 50 = 2500 cells with 23 non-inhabitable cells
• Number of Family Units: 2477
• Initial Distribution of Family Units: Uniform, or one family unit per cell⁸
• Transport Rate Per Unit Distance (π): 0.9
The initial distribution of agents is shown in Figure 3.3. Second, the first family unit a is randomly chosen to consider relocation. (Let the location of a be denoted by $i_a$.) Note that this is the main part where randomness enters into this model. Third, family unit a evaluates all other locations by equation (3) and finds the location, $j_{max}$, that gives the highest gain. If more than one location gives the highest gain, then the family unit chooses one of them randomly; this is another part where randomness might take effect. Fourth, if the gain from relocating to location $j_{max}$ is larger than that from staying in location $i_a$, the family moves to $j_{max}$; otherwise, it stays in location $i_a$. It also stays if those two gains are the same. Last, the algorithm goes back to the random selection of a family unit and repeats this process. Note that when the next family is randomly chosen, every family has an equal chance of being selected, so the same family could be chosen again with probability 1/2477.

⁸ The assumption of uniform initial distribution of family units is based on extensive historical facts. See Flick (1934, p. 164).
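The selection and relocation steps described above can be sketched as a single update function. This is an illustrative reading rather than the author's implementation: the names `step`, `pop` and `chebyshev_torus` are hypothetical, the population is held in a dictionary keyed by habitable cell, and $n_j$ is taken to be the destination's population after the family arrives, an assumption that matches the sample calculations in this chapter.

```python
import random


def chebyshev_torus(p, q, size):
    """Moore (L-infinity) distance on a size x size torus."""
    dx = abs(p[0] - q[0]) % size
    dy = abs(p[1] - q[1]) % size
    return max(min(dx, size - dx), min(dy, size - dy))


def step(pop, pi, size, rng):
    """One asynchronous update: one randomly chosen family considers relocating.

    pop maps each habitable cell (x, y) to its number of family units and is
    modified in place. Returns (origin, destination); they are equal if the
    family stays.
    """
    # Every family is equally likely to be chosen, so a cell is drawn with
    # probability proportional to its population.
    cells = [c for c, n in pop.items() if n > 0]
    origin = rng.choices(cells, weights=[pop[c] for c in cells])[0]

    stay_gain = pop[origin]  # n_i - pi * 0
    best_gain, best_cells = float("-inf"), []
    for dest in pop:
        if dest == origin:
            continue
        # Assumption: n_j counts the family itself after it has moved in.
        gain = (pop[dest] + 1) - pi * chebyshev_torus(origin, dest, size)
        if gain > best_gain:
            best_gain, best_cells = gain, [dest]
        elif gain == best_gain:
            best_cells.append(dest)

    # Move only on a strict improvement; ties between destinations are random.
    if best_cells and best_gain > stay_gain:
        target = rng.choice(best_cells)
        pop[origin] -= 1
        pop[target] += 1
        return origin, target
    return origin, origin
```

A run would call `step` repeatedly on the same dictionary; non-inhabitable cells are simply left out of `pop`.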
The unit of time in this model is assumed to be the average length between the time when a family unit is chosen and the time when the same family unit is chosen again, so in every unit of time each agent is selected once on average. Therefore, in this model’s settings, one unit of time is equivalent to 2477 selections of family units, since there are 2477 family units in total. This definition of time allows us to change the number of family units yet keep the normalized time consistent. This way of updating the system, letting only one agent move at a time, is called asynchronous update, and allows us to have continuous time. The alternative is synchronous update, where all agents make their moves at once (usually with each agent not knowing how the other agents move), which is equivalent to discrete time. Synchronous update causes additional (and sometimes undesirable) effects (Durrett and Levin, 1994), but in certain economic cases it could be more appropriate.⁹

⁹ For instance, economic agents (consumers, investors, governments, etc.) often make their decisions based on information that becomes available after a certain period of time or only periodically, such as census data. In this case, time could be considered discrete.
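As a small worked illustration of this definition of time, the conversion between a count of random selections and normalized time can be written as below. The helper names `normalized_time` and `selections_for` are hypothetical; the default of 2477 family units is the value used in this chapter.

```python
def normalized_time(selections, n_families=2477):
    """Normalized time t after a given number of random selections."""
    return selections / n_families


def selections_for(t, n_families=2477):
    """Number of random selections (rounded) that corresponds to time t."""
    return round(t * n_families)


if __name__ == "__main__":
    for k in (148, 495, 2477):
        print(k, "selections -> t =", round(normalized_time(k), 2))
```

Under this definition, 148 selections correspond to t ≈ 0.06 and 495 selections to t ≈ 0.2.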
Figure 3.3: Initial distribution of agents.
Implementation of a Sample Simulation
The simulations presented in this research were implemented in C++ and run on a personal computer. When the program is started, it first shows the initial distribution of family units as shown in Figure 3.3. Then it chooses the first family unit $\alpha_1$ in location $i_1$ to consider relocation. Among all locations except for $i_1$, the adjacent ones ($j_{d=1}$) give the highest payoff; namely,

$$G_{\alpha_1}(n_{j_{d=1}}) = n_{j_{d=1}} - \pi d_{ij} = 2 - 0.9 \times 1 = 1.1$$

Also, the payoff from staying in $i_1$ is

$$G_{\alpha_1}(n_{i_1}) = n_{i_1} - \pi d_{ij} = 1 - 0.9 \times 0 = 1,$$

which is smaller than the payoff from relocation. Therefore, the family relocates to one of its adjacent locations (recall that when a family unit is indifferent between destinations, it chooses one of them randomly), making the first “village” that consists of two family units. The state at this point in time is shown in Figure 3.4, where there is one cell marked “2” next to a black cell without a white dot.
Figure 3.4: First village of 2.
It is conceivable that the very first family, and therefore any family, does not move at all and the initial distribution of families remains unchanged if π is larger than a certain value. In fact, such a critical value exists and it can be algebraically derived; that is, the first family moves only if

$$G_{\alpha_1}(n_{j_{d=1}}) = n_{j_{d=1}} - \pi d_{ij} = 2 - \pi \times 1 > 1,$$

that is, only if π < 1. In an anthropological sense, this could partially explain why, when and/or where the cost of migration was too high, ancient family movements did not form groups that eventually led to a civilization, and further why civilizations emerged on plains.
After some random selections of families making villages of “2”, one family eventually joins a village of “2” and makes the first village with three family units (see Figure 3.5). Notice that the third family came from a cell two units of distance away. Indeed, this is possible since

$$G_\alpha(n_{j_{d=2}}) = 3 - 0.9 \times 2 = 1.2 > 1,$$

although not from a cell three units of distance away because

$$G_\alpha(n_{j_{d=3}}) = 3 - 0.9 \times 3 = 0.3 < 1.$$
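The same comparison can be checked for any village size. The sketch below is a hedged illustration with a hypothetical helper, `max_join_distance`; it assumes the migrating family currently lives alone (so staying pays 1) and that the village size is counted after the family arrives, as in the calculations above.

```python
def max_join_distance(n_after, pi, stay_gain=1.0):
    """Largest whole distance d for which n_after - pi * d > stay_gain.

    n_after   : village size once the migrating family has arrived
    pi        : transport rate per unit distance (must be positive)
    stay_gain : payoff from not moving (1 for a family living alone)
    """
    if pi <= 0:
        raise ValueError("pi must be positive")
    d = 0
    # Step outward one cell at a time while the move still pays strictly more.
    while n_after - pi * (d + 1) > stay_gain:
        d += 1
    return d


if __name__ == "__main__":
    for n in (2, 3, 4, 5):
        print("village of", n, "-> reachable from distance", max_join_distance(n, 0.9))
```

With π = 0.9 this reproduces the two cases in the text: a village of 2 attracts only adjacent families, and a village of 3 attracts families from up to two cells away.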
Figure 3.5: First village of 3.
Similarly, families keep forming villages of 4, 5, 6 and so on. After 148 random selections (or t = 0.06 ), three clusters that range from 10 to 20 can be observed as shown in figure 3.6. In this figure, a color bar is added in order to estimate the numbers of families in the cells with more than nine family units.
Figure 3.6: Emerged clusters at t = 0.06 .
To provide different perspectives on the emerging spatial pattern of clusters, let us use a three-dimensional representation of the space. The three-dimensional graph in figure 3.7 corresponds to the two-dimensional picture in figure 3.6, and figure 3.8 shows the spatial pattern after 248 random selections, or at t = 0.1. In these figures, the vertical axis is log-scaled in order to depict larger concentrations. Note how the spatial pattern changes and starts to form a clear hierarchical structure in spite of the simplicity of the model.
[Plot: horizontal axes x and y (0 to 50); vertical axis log10 POP (0 to 4)]
Figure 3.7: Three-dimensional representation of urban clusters at t = 0.06 .
[Plot: horizontal axes x and y (0 to 50); vertical axis log10 POP (0 to 4)]
Figure 3.8: Three-dimensional representation of urban clusters at t = 0.1 .
This hierarchical structure, however, does not last long in the highly simplified model because all families eventually go to a single location (most likely the one with the highest peak in figure 3.8). This is always the case because the model assumes infinite capacity, which allows an urbanized area to simply keep attracting more families. Obviously, it is unrealistic. In the next section, we tweak the model and attempt to solve this problem.
A More Realistic Simplified Model
Extension of the Model: Diseconomies and Negative Externalities
We improve the highly simplified model of the last section by introducing diseconomies and other negative externalities such as high rent, crime, environmental pollution, etc. This can be done by adding a new term, $-c_j n_j^2$, to equation (3); namely,

$$G_\alpha(n_j) = n_j - \pi d_{ij} - c_j n_j^2,$$

where $c_j$ is the parameter that captures the magnitude of negative externalities. We keep $c_j$ small enough to allow negative externalities to take effect only when $n_j$ becomes reasonably large. In fact, because $c_j \ll 1$ and the term $-c_j n_j^2$ is quadratic, the term is negligible when n is small and grows rapidly in magnitude as n increases. Figure 3.9 illustrates this tendency under π = 0.9 and d = 1. We can observe that $G_\alpha(n)$ indeed starts to decline after reaching a certain number of families (to be precise, at n = 1/(2c), where the derivative 1 − 2cn vanishes) and eventually becomes incapable of attracting even a lone family in the nearest neighborhood.
[Plot: G(n) for c = 0.01, c = 0.02 and c = 0.05]
Figure 3.9: Migrants’ net payoff with negative externalities ( π = 0.9 , d = 1 ).
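The shape of these payoff curves can be reproduced numerically. The following is a minimal sketch, with hypothetical function names `net_gain` and `peak_population`, assuming the same parameters as Figure 3.9 (π = 0.9, d = 1) and the three values of c shown there.

```python
def net_gain(n, d, pi=0.9, c=0.01):
    """Net payoff n - pi*d - c*n^2 of joining a location that then holds n families."""
    return n - pi * d - c * n * n


def peak_population(c):
    """Population at which the payoff is largest: dG/dn = 1 - 2*c*n = 0."""
    return 1.0 / (2.0 * c)


if __name__ == "__main__":
    for c in (0.01, 0.02, 0.05):
        n_star = peak_population(c)
        print("c =", c, "peak at n =", n_star, "payoff =", net_gain(n_star, 1, c=c))
```

For c = 0.01 the payoff peaks at n = 50, and it falls off symmetrically on either side of that point.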
Implementation of a Sample Simulation
Based on the modification above, let us run a simulation with c = 0.01 and with all other parameters kept the same as in the previous section. To observe the effect of the new term, $-c_j n_j^2$, more clearly, we start the simulation at t = 0.1 with the population distribution identical to that in figure 3.8. Figure 3.10 shows the result after 495 random selections, or t = 0.2. The primary node clearly seen in figure 3.8 has not developed; instead, other cities of equal size have emerged. In fact, any city stops growing when its population reaches 98 family units, for the following reason: if an agent is alone in location i and stays there, its net gain is 1 − 0.01 × 1² = 0.99. Similarly, if the family moves to the next cell¹⁰ to make a village (or city) of 98 families, it gains 98 − 0.9 − 0.01 × 98² = 1.06, which is larger than 0.99. But if the family made a village of 99 families, it would gain only 99 − 0.9 − 0.01 × 99² = 0.09, which is, of course, smaller than 0.99. Thus, any village that already has 98 families cannot accommodate another family.
Let us move the simulation forward. Figure 3.11 is a snapshot of the same simulation at t = 0.5. We observe the further development of cities of 98 families. Notice also that some smaller villages have disappeared; the families there have chosen to migrate to make new cities of 98 families each. In equilibrium, with c = 0.01, the families most often make 25 cities of 98 and the residual families make one or more smaller villages. In general, given a value for c, all families attempt to make a certain number of cities of equal size. Unfortunately, this is still an unrealistic result. In the next chapter, we construct an even more realistic and general model for spatial agglomeration, which is capable of capturing arbitrarily many factors, in order to simulate the real world better.

¹⁰ We do not need to consider the cases where the family moves to a location more than one unit of distance away, because it is obviously more costly to make a village of 98 families by migrating from a location two units of distance away or farther.
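The threshold of 98 can be recovered by scanning city sizes. This is a rough sketch under the assumptions stated in the text (a lone family one cell away, π = 0.9, c = 0.01); the names `net_gain` and `max_city_size` are introduced here for illustration only.

```python
def net_gain(n, d, pi=0.9, c=0.01):
    """Net payoff n - pi*d - c*n^2 for a family in a location holding n families."""
    return n - pi * d - c * n * n


def max_city_size(pi=0.9, c=0.01):
    """Largest city size n that a lone family one cell away would still join."""
    stay = net_gain(1, 0, pi, c)      # payoff from staying alone
    largest = 1
    # Beyond n = 1/c the payoff is negative, so the scan can stop there.
    for n in range(2, int(1 / c) + 2):
        if net_gain(n, 1, pi, c) > stay:
            largest = n
    return largest


if __name__ == "__main__":
    size = max_city_size()
    cities, residual = divmod(2477, size)
    print("maximum city size:", size)
    print("full cities:", cities, "families left over:", residual)
```

With 2477 family units this gives at most 25 full cities of 98 and 27 families left over, in line with the equilibrium described above.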
[Plot: horizontal axes x and y (0 to 50); vertical axis log10 POP (0 to 4)]
Figure 3.10: Emergence of cities with 98 family units, π = 0.9 , c = 0.01 , t = 0.2 .
[Plot: horizontal axes x and y (0 to 50); vertical axis log10 POP (0 to 4)]
Figure 3.11: Further development of cities with 98 family units, π = 0.9, c = 0.01, t = 0.5.
CHAPTER FOUR: GENERALIZED SPATIAL AGGLOMERATION MODEL
This chapter involves two major tasks for the development of a generalized agent-based model for spatial agglomeration. First, we mathematically formalize the relationship between properties of a location (e.g. population, resources, environmental pollution, etc.) and the perception of each agent in the location. This will enable us to systematically add any quantifiable factors involved in the urban agglomeration/diffusion phenomenon. Second, we examine the way individuals in the real world make their decisions when they choose a place to live, and incorporate the procedure into the algorithm. This will result in the formation of highly realistic spatial patterns.
Properties of Location—Q-Vector
In this section, we establish a formal expression of quantifiable properties (e.g. population, resources, pollution level, etc.) that belong to a cell so that we can treat those properties uniformly and do not have to change the model itself when we add more properties. We first introduce the Q-vector of cell i. Write:

$$Q_i = \begin{pmatrix} q_{i,1} \\ q_{i,2} \\ q_{i,3} \\ \vdots \\ q_{i,M-1} \\ q_{i,M} \end{pmatrix}$$

where each element $q_{i,m}$ is a quantity or level of a property and M is the number of properties we choose. For convenience, we reserve the first element, $q_{i,1}$, for the domestic population of agents throughout this study. For example, if one wishes to take resources and pollution into consideration, the Q-vector is written as:

$$Q_i = \begin{pmatrix} q_{i,1} \\ q_{i,2} \\ q_{i,3} \end{pmatrix} = \begin{pmatrix} \text{population of agents} \\ \text{amount of resources at cell } i \\ \text{pollution level at cell } i \end{pmatrix} \tag{12}$$
Properties such as amount of resources and pollution level are dynamic: they change over time depending on human activities, other properties, and sometimes themselves. The dynamics of properties is explicitly defined by an ordinary differential equation (ODE) of Q-vector:
& Q = f (Q ) Qt =0 = Qo
where the second line of equation (13) is the initial condition. For instance, if we are to define the dynamics of each property in (12) by the following rules:
• Population grows at a constant rate, $r_1$.
• Each agent consumes resources at a constant rate $r_2$.
• Each agent generates pollutants at a constant rate $r_3$.
then the ODE of $Q_i$ is written as: [11]
\[
\dot{Q}_i = \begin{pmatrix} \dot{q}_{i,1} \\ \dot{q}_{i,2} \\ \dot{q}_{i,3} \end{pmatrix}
          = \begin{pmatrix} r_1 \\ -r_2\, q_{i,1} \\ r_3\, q_{i,1} \end{pmatrix} \tag{14}
\]
[11] One may point out that there is inflow or outflow of agents, so the ODE for $q_1$ should be adjusted accordingly. We do not have to include inflow or outflow of agents here because equation (13) defines the changes in $Q_i$ during the time between one migration and another. Recall that we simulate continuous time by asynchronous update.
Note that this single ODE indeed embraces the fact that both the amount of resources and the pollution level are influenced by the presence of agents. Once the amount of resources reaches zero, equation (14) needs to be replaced by:
\[
\dot{Q}_i = \begin{pmatrix} \dot{q}_{i,1} \\ \dot{q}_{i,2} \\ \dot{q}_{i,3} \end{pmatrix}
          = \begin{pmatrix} r_1 \\ 0 \\ r_3\, q_{i,1} \end{pmatrix} \tag{15}
\]
Equation (14) is a system of first-order linear equations, so it can be written as:
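\[
\dot{Q}_i = A Q_i + B \tag{16}
\]
where
\[
A = \begin{pmatrix} 0 & 0 & 0 \\ -r_2 & 0 & 0 \\ r_3 & 0 & 0 \end{pmatrix}
\quad \text{and} \quad
B = \begin{pmatrix} r_1 \\ 0 \\ 0 \end{pmatrix} \tag{17}
\]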
Therefore, the stability of equation (14) can also be examined analytically by using the eigenvalues of $A$.
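The dynamics of equation (14), together with the replacement that applies once resources reach zero, can be sketched numerically. The following is a minimal Python sketch, assuming a forward-Euler scheme with a hypothetical step size `dt`; the function names and the numerical values are illustrative assumptions and are not taken from the program used in this study.

```python
def step_q(q, r1, r2, r3, dt):
    """Advance Q_i = [population, resources, pollution] by one forward-Euler step."""
    n, resources, pollution = q
    d_n = r1                                      # population grows at rate r1
    d_res = -r2 * n if resources > 0.0 else 0.0   # no consumption once resources are gone
    d_pol = r3 * n                                # each agent pollutes at rate r3
    return [
        n + d_n * dt,
        max(resources + d_res * dt, 0.0),         # resources cannot become negative
        pollution + d_pol * dt,
    ]


def simulate(q0, r1, r2, r3, dt, steps):
    """Apply `steps` Euler steps starting from the initial condition q0."""
    q = list(q0)
    for _ in range(steps):
        q = step_q(q, r1, r2, r3, dt)
    return q


if __name__ == "__main__":
    # Illustrative values only: 10 agents, 1 unit of resources, no pollution.
    print(simulate([10.0, 1.0, 0.0], r1=0.0, r2=0.1, r3=0.2, dt=0.5, steps=5))
```

Because the right-hand side of (14) is linear in $Q_i$, a closed-form solution between migrations would serve equally well; the Euler step is used here only to keep the switch at zero resources explicit.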
Decision Making Process in Migration
Our strategy for modeling migration is to simply make each agent “mimic” a real person’s behaviors without making the model too complicated. Thus, we divide the whole decision making process into two parts:
1. First, an agent decides which city [12] she wants to live in.
2. Then, she finds the location she actually settles in.

The concept of this two-stage process is quite intuitive. When we consider relocation, we often think of a city first (e.g. “I want to live in the New York Metropolitan Area!”) without detailed information about every community or block in the city. We also tend to assess a city by its economic center, or its cultural center, depending on the criterion. Then we start looking into various neighborhoods using more detailed criteria such as rent, security, and access to downtown (e.g. “The rent in Manhattan is too high although I want to visit there sometimes”), and finally find the most reasonable place to live (e.g. “I decided to live in the Bronx”).

The decision making process stated above is a search problem: one searches for one place to move to among other possibilities, given a certain amount of information. The first stage is a global search, where an agent searches throughout the whole space; the second stage is a local search, where the agent has only a limited scope. Thus, we model each of the two stages as a separate search problem.
For the first stage, an agent in cell $i$ evaluates each cell $j$ ($= 1, 2, 3, \ldots, i, \ldots$) in the whole space by the following utility function:
\[
G_j = \Gamma(Q_j)\, D(d_{ij}) \tag{18}
\]
where $Q_j$ is the Q-vector of cell $j$, $\Gamma(Q_j)$ is a utility function ($\Gamma : \mathbb{R}^M \to \mathbb{R}$) which is to be defined appropriately, and $D(d_{ij})$ is a function of the distance between cells $i$ and $j$. As stated in chapter 1, one of our goals is to show how existing regional science models can be incorporated into agent-based models; so, we take a gravity model type approach [13] for $D(d_{ij})$; namely,
\[
G_j = \frac{\Gamma(Q_j)}{d_{ij}^{\,p}} \tag{19}
\]
The parameter $p$ controls the contribution of distance, $d_{ij}$ [14]; in an economic sense, it could be considered as a function of transport cost. The agent then chooses the cell $j_{\max}$ that gives the highest value of $G_j$ [15]. We consider $j_{\max}$ as the economic center mentioned above. Although the agent evaluates all cells in the space for the sake of the algorithm's consistency, this gives the same $j_{\max}$ as the case where she recognizes clusters and evaluates the economic centers of the clusters. Note that in terms of gravity models, $G_j$ can be viewed as the attractive force of $j$ on $i$. Figure 4.1 illustrates the distribution of attractive forces for $p = 1$. In this figure, $\Gamma(Q_j)$ is set to unity for all $j$ to see the pure effect of $d_{ij}$ on attractive force. Observe how distance discounts $\Gamma(Q_j)$ for $d_{ij} > 1$; this implies that although the search is still global, the agent is strongly biased toward her local cells [16] and tends to dismiss distant cells [17].

[12] From this chapter on, when we use the word “city”, imagine a cluster that consists of more than a single cell rather than a dimensionless cluster as in the previous chapter.
[13] Another possible form for $D(d_{ij})$ would be $D(d_{ij}) = \sqrt{p/2\pi}\, \exp(-p\, d_{ij}^{2}/2)$, which is equivalent to the p.d.f. of Brownian motion. This form is appropriate if we assume that the agent has to search for a city by random walk within a limited length of time.
[14] It should be noted that when $i = j$, the distance $d_{ij}$ is set to 0.25, which is the average of arbitrarily many distances from the center of the cell to points within the cell.
[15] If more than one cell has the highest $G_j$, the agent randomly chooses one.
Figure 4.1: Distribution of attractive forces around cell $i$ where $p = 1$ and $\Gamma(Q_j) = 1$ for all $j$ (horizontal axis: distance $d$; vertical axis: attractive force $G$).
The agent, however, will not dismiss a cell in a distant location if $\Gamma(Q_j)$ is very large. One such case is presented in figure 4.2. In this example, $\Gamma(Q_j)$ has been increased to 70 only at $d_{ij} = 15$, keeping everything else unchanged from the settings in figure 4.1. In this case, the agent will indeed choose the cell at $d_{ij} = 15$ because her net welfare there ($70/15 \approx 4.67$) is higher than that in the original location ($1/0.25 = 4$), although she is far from the cell. As shown in this example, the gravity type approach enables the agent to “jump” to prominent cells in a distant location instead of dismissing them, just as a filmmaking student in Syracuse would consider Los Angeles as well as New York City.

[16] However, note that if $p = 0$, the scope of the search is purely global.
[17] This is analogous to the case in astronomy where a rocket experiences zero “gravity” when it is far enough from the earth.
Figure 4.2: Distribution of attractive forces around cell $i$ where $p = 1$ and $\Gamma(Q_j) = 1$ for all $j$ except for $\Gamma(Q_j) = 70$ at $d_{ij} = 15$ (horizontal axis: distance $d$; vertical axis: attractive force $G$).
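The global search and the two situations of figures 4.1 and 4.2 can be reproduced with a few lines of code. The following is a minimal Python sketch, assuming that cells are described simply by a list of $\Gamma$ values and a list of distances from the agent, and that a distance of zero marks the agent's own cell; these conventions and the function names are illustrative assumptions.

```python
import random

SELF_DISTANCE = 0.25  # distance used when i == j, as stated in the text


def attractive_force(gamma_j, d_ij, p):
    """Gravity-type utility G_j = Gamma(Q_j) / d_ij**p."""
    return gamma_j / d_ij ** p


def global_search(gammas, distances, p, rng=random):
    """Return (j_max, forces); ties for the highest G_j are broken at random."""
    forces = [
        attractive_force(g, d if d > 0 else SELF_DISTANCE, p)
        for g, d in zip(gammas, distances)
    ]
    best = max(forces)
    candidates = [j for j, f in enumerate(forces) if f == best]
    return rng.choice(candidates), forces


if __name__ == "__main__":
    distances = list(range(21))                       # index 0 is the agent's own cell
    flat = [1.0] * 21                                 # setting of figure 4.1
    peak = [70.0 if d == 15 else 1.0 for d in distances]  # setting of figure 4.2
    print(global_search(flat, distances, p=1.0)[0])
    print(global_search(peak, distances, p=1.0)[0])
```

With the flat setting the agent's own cell wins (force 4 against at most 1 elsewhere), while in the second setting the cell at distance 15 wins because 70/15 exceeds 4.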
After choosing a city (more precisely, the economic center of the city), the agent begins to look for a place to settle in by performing a local search starting from $j_{\max}$. First, the agent evaluates the eight adjacent cells $h$ ($= 1, 2, \ldots, 8$) in her neighborhood by the following utility function:
\[
L_h = \Lambda(Q_h) + \sum_{k \,\in\, h\text{'s neighborhood}} \Lambda(V \otimes Q_k) \tag{20}
\]
and chooses $h_{1\max}$ such that $L_h$ is the largest [18]. The function $\Lambda(Q_h)$ ($\Lambda : \mathbb{R}^M \to \mathbb{R}$) is a utility function that uses a larger information set (or simply, more detailed information) than that of $\Gamma(Q_j)$ [19]. The second term of the equation is the contribution of the agent's neighborhood: it captures the assumption that she gains or suffers from adjacent cells. The existence of this term encourages the agent to stay with the cluster, ensuring its growth outwards. $V$ is a vector with $M$ non-negative elements, each of which corresponds to an element of $Q_k$, and $V \otimes Q_k$ is their element-wise product. $V$ controls, for each property, how much she gains or suffers from cells adjacent to her.

The agent then moves her focus to $h_{1\max}$ and evaluates the eight adjacent cells of $h_{1\max}$ to find a better cell. If she indeed finds a better cell ($h_{2\max}$), she moves her focus to it again. She repeats this process until she reaches a cell $h_{\max}$ in whose neighborhood she cannot find a better cell. Finally, the agent moves to $h_{\max}$.

[18] As in the global search, if more than one possibility exists, she randomly chooses one.
[19] This specification is important because otherwise the agent would find her final destination just by the global search, which is against the idea of the model.
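The local search described above is a hill-climbing procedure, and can be sketched as follows. This is a minimal Python sketch under stated assumptions: the space is stored as a nested list `Q[row][col]` of Q-vectors, the neighborhood is the eight surrounding cells with wraparound at the edges (the torus used in this study), and `lam` is any callable playing the role of $\Lambda$. The function names are illustrative and not taken from the original program.

```python
import random


def moore_neighbors(cell, rows, cols):
    """Eight cells surrounding `cell`, with wraparound at the edges."""
    r, c = cell
    return [((r + dr) % rows, (c + dc) % cols)
            for dr in (-1, 0, 1) for dc in (-1, 0, 1)
            if (dr, dc) != (0, 0)]


def local_utility(cell, Q, lam, V):
    """L_h = Lambda(Q_h) + sum over neighbors k of Lambda(V (x) Q_k)."""
    rows, cols = len(Q), len(Q[0])
    r, c = cell
    total = lam(Q[r][c])
    for kr, kc in moore_neighbors(cell, rows, cols):
        weighted = [v * q for v, q in zip(V, Q[kr][kc])]  # element-wise product
        total += lam(weighted)
    return total


def local_search(start, Q, lam, V, rng=random):
    """Hill-climb from j_max; the first move to a neighbor is unconditional."""
    rows, cols = len(Q), len(Q[0])
    focus, focus_value = start, None
    while True:
        scored = [(local_utility(h, Q, lam, V), h)
                  for h in moore_neighbors(focus, rows, cols)]
        best_value = max(value for value, _ in scored)
        if focus_value is not None and best_value <= focus_value:
            return focus                                  # no better neighbor: h_max
        focus = rng.choice([h for value, h in scored if value == best_value])
        focus_value = best_value
```

The first step always moves to one of the eight neighbors of $j_{\max}$, which reflects the remark that the final destination should not be determined by the global search alone; every later step requires a strict improvement, so the loop terminates on a finite space.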
CHAPTER FIVE: APPLICATIONS OF THE GENERALIZED SPATIAL AGGLOMERATION MODEL

In the following chapters, four applications of the generalized spatial agglomeration model are shown. For each application, specifications of the model involve the following steps:
• Determine the Q-vector.
• Construct an ODE for the Q-vector and its initial condition(s).
• Define utility functions.
To determine the Q-vector, we simply have to list the quantifiable properties that belong to a cell, including the population of agents. To construct ordinary differential equations for the Q-vector's elements, we examine the relationships between each property and the rest of the properties and mathematically formulate those relationships. To define utility functions, we need to define $\Gamma(Q_j)$ in equation (18), and $\Lambda(Q_h)$ and the vector $V$ in equation (20).
Unless otherwise mentioned, other settings are the same as in the highly simplified model in chapter 3, except that we no longer assume the 23 non-inhabitable cells. That is, (1) the size of the space is 50×50; (2) the shape of the space is a torus; and (3) the number of agents is 2500 for the models without birth or death.
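Because the space is a torus, the distance between two cells has to take the wraparound into account. The following is a minimal Python sketch of such a distance on the 50×50 space; the use of the Euclidean metric is an assumption made here for illustration, since the metric is not restated in this chapter, and the function name is hypothetical.

```python
import math

SIZE = 50  # the space is 50 x 50


def torus_distance(a, b, size=SIZE):
    """Euclidean distance between cells a and b with wraparound on both axes."""
    dr = abs(a[0] - b[0])
    dc = abs(a[1] - b[1])
    dr = min(dr, size - dr)  # going around the edge may be shorter
    dc = min(dc, size - dc)
    return math.hypot(dr, dc)
```

Under this assumption no two cells are farther apart than the distance between (0, 0) and (25, 25).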
CHAPTER SIX: APPLICATION #1: A SIMPLE MODEL

The purpose of presenting this application is to show the common behaviors of the generalized spatial agglomeration model by using a simple Q-vector, ODE and utility functions. By referring to the results of this simple model, we can observe how the results change when more advanced models are introduced.

Specifications
In this application, we have two properties. As stated earlier, $q_1$ is the population of agents. The other property, $q_2$, is the negative externality and diseconomies as in the more realistic simplified model in chapter 3; namely, $q_2$ comprises the term $-cn^2$, where $c$ is the parameter that captures the magnitude of diseconomy. We do not assume exogenous factors such as resources and pollution. Thus, the Q-vector is:
\[
Q_i = \begin{pmatrix} q_{i,1} \\ q_{i,2} \end{pmatrix} = \begin{pmatrix} n_i \\ -c\, n_i^{2} \end{pmatrix} \tag{21}
\]
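There is no birth or death for agents, or endogenous change in population; accordingly, there is no change in $q_1$ or $q_2$. Therefore, the ODE of the Q-vector is:
\[
\dot{Q}_i = \begin{pmatrix} \dot{q}_{i,1} \\ \dot{q}_{i,2} \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix} \tag{22}
\]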
The initial distribution of agents is random, so the initial condition of this ODE is:
\[
Q_i(0) = \begin{pmatrix} q_{i,1}(0) \\ -c\, q_{i,1}^{2}(0) \end{pmatrix} \tag{23}
\]
where $q_{i,1}(0)$ is zero or a random positive integer ($\le 2500$) such that:
\[
\sum_i q_{i,1}(0) = 2500 \tag{24}
\]
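One way to generate an initial condition that satisfies (23) and (24) is sketched below. This is a minimal Python sketch, assuming that each of the 2500 agents is dropped independently into a uniformly random cell; that placement rule is an assumption made here for illustration, since the text only states that the distribution is random.

```python
import random


def random_initial_state(n_cells, n_agents, c, rng=random):
    """Return a list of Q-vectors [n_i, -c * n_i**2], one per cell."""
    counts = [0] * n_cells
    for _ in range(n_agents):
        counts[rng.randrange(n_cells)] += 1  # each agent picks a cell at random
    return [[n, -c * n * n] for n in counts]


if __name__ == "__main__":
    state = random_initial_state(n_cells=50 * 50, n_agents=2500, c=0.3)
    print(sum(q[0] for q in state))  # total population, as required by (24)
```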
For the global search, we assume that each agent is only concerned with positive externalities. So,
\[
\Gamma(Q_i) = q_{i,1} = n_i \tag{25}
\]
Therefore,
\[
G(Q_i) = \frac{\Gamma(Q_i)}{d_{ij}^{\,p}} = \frac{n_i}{d_{ij}^{\,p}} \tag{26}
\]
For the local search, diseconomy enters:
\[
\Lambda(Q_h) = q_{h,1} + q_{h,2} = n_h - c\, n_h^{2} \tag{27}
\]
Furthermore, we assume:
\[
V = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \tag{28}
\]
This means each agent benefits from the economies in her surrounding cells without loss and will not be distressed by the diseconomies emanating from those cells. Thus, the utility function for the local search is:
\[
L_h = \Lambda(Q_h) + \sum_{k \,\in\, h\text{'s neighborhood}} \Lambda(V \otimes Q_k)
    = n_h - c\, n_h^{2} + \sum_{k \,\in\, h\text{'s neighborhood}} n_k
\]
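The utility functions of this application are simple enough to evaluate by hand. The following is a minimal Python sketch, assuming the populations are stored in a nested list `pop[row][col]` on a torus; the function names and the sample grid are illustrative assumptions.

```python
def lam(n, c):
    """Local utility of a single cell, equation (27): n - c * n**2."""
    return n - c * n * n


def local_utility(pop, h, c):
    """L_h: own-cell utility plus the populations of the eight neighbors."""
    rows, cols = len(pop), len(pop[0])
    r, col = h
    neighbors = 0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if (dr, dc) != (0, 0):
                neighbors += pop[(r + dr) % rows][(col + dc) % cols]
    return lam(pop[r][col], c) + neighbors


if __name__ == "__main__":
    pop = [[1, 0, 2],
           [0, 3, 1],
           [0, 0, 1]]
    print(local_utility(pop, (1, 1), c=0.3))  # 3 - 0.3 * 9 + 5
```

Since $\Lambda(n) = n - cn^2$ is maximized at $n = 1/(2c)$, the own-cell term starts to fall once a cell holds more than about $1/(2c)$ agents, whereas neighbors always contribute positively. This is consistent with clusters spreading over more cells as $c$ increases.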
Simulation Results

The simulations in this study were performed using a program written in C++ and run on a personal computer. In this section, we first observe the development of spatial patterns from various angles, followed by the analyses.
Spatial Patterns in Equilibrium
Figure 6.1 shows spatial patterns in equilibrium under various sets of parameters p and c. The parameter p increases toward the bottom of the table and c toward the rightmost column. Different colors indicate population densities, which correspond to the color bar at the bottom of the figure. Roughly speaking, we can observe that an increase in c causes a cluster to expand horizontally; conversely, a decrease in c causes it to increase in density. This result is consistent with the predictions of the model itself, since a larger c penalizes high density within a cell. Similarly, the larger p is, the smaller the distances between clusters. This is also a natural consequence, given that p is the parameter that represents the relocation cost. Notice that for each c there seems to be a transitional state between the state where there is one mega-cluster and that where there are arbitrarily many small clusters. In fact, outside that transitional state, the effect of p appears to be very small. This may be a phase transition of the kind commonly seen in the natural sciences. Furthermore, in the transitional state, the system shows clear hierarchical structures. Since this is a significant topic in regional science, it is examined in detail later in this chapter.
Figure 6.1: Spatial patterns in equilibrium under various sets of parameters.
Evolution of Spatial Patterns
Another concern in spatial agglomeration is its dynamics, or how spatial patterns emerge over time. Figure 6.2 shows the evolution of spatial patterns toward their equilibria for p = 0.1, 0.3, 0.4, 0.5, 0.7 with c = 0.3. At p = 0.1, we can see that once a single dense cluster is formed, all agents who have chosen to relocate move directly to that cluster to create a mega-cluster. As p increases, however, this tendency changes: agents first create small clusters all over the space, and some of the small clusters start to grow further. This can be most prominently observed at p = 0.5 in the figure. At t = 0.5 there are only small clusters of equal size, but at t = 1 a few larger clusters have emerged, which further attract other agents who may or may not belong to other clusters, thus creating a hierarchical structure in equilibrium.

The case where p = 0.4 shows behavior somewhere between the two examples above. Agents do make small clusters first; but, after a mega-cluster has been created, some who are already in the “small” clusters also move to the mega-cluster, as do those who do not belong to any emerged cluster, resulting in a hierarchical structure. This implies that agents are repeating long-distance relocations to larger clusters. At p = 0.7, agents choose to travel only short distances to form relatively small clusters locally and do not create larger clusters.

The migration behaviors stated above are consistent with the classification of migrants by Ravenstein (1885); namely, (1) the long-journey migrant, (2) the migrant-by-stages, (3) the short-journey migrant and (4) the local migrant. In figure 6.2, the long-journey migrants are observed at p = 0.1 to 0.3 and in late stages (t ≥ 1) at p = 0.4, the migrants-by-stages at p = 0.4 and 0.5, and the short-journey migrants and local migrants in early stages for p = 0.4 to 0.7.
One could claim a condition for hierarchical structures of urban clusters from the fact that this model covers all types of migration patterns introduced by Ravenstein. Note for example, that we observe realistic spatial patterns around p = 0.4 − 0.5 under c = 0.3 (or the “transitional” area mentioned earlier in this section) where all or most classes of migrants exist. This might tell us that hierarchical structures take place when all types of migrants co-exist in the process of pattern formation.
Figure 6.2: Evolution of spatial patterns at c = 0.3 .
In order to examine the effects of the parameters p and c on the spatial patterns that emerge, a series of simulations was performed. Because the model is stochastic, analysis is often possible only by repeating the simulation many times under the same settings. For the analyses in this section, simulations were run 20 times for each set of p and c. Figure 6.3 shows the effect of p and c on the number of clusters in equilibrium. For any c, only one mega-cluster emerges when p is close to zero, and the number of clusters increases as p increases. This is, of course, because the higher the transport cost, the harder it is for agents to travel long distances; therefore, they choose to form clusters near their original locations. For small c, however, the number of clusters stays at one for relatively large p (e.g. for c = 0.1, it takes p = 0.5 for the number of clusters to exceed one). This is because when c, the magnitude of diseconomy, is small, the attractive force of the mega-cluster is large enough to compensate for higher travel costs. Conversely, when c is large, agents are more sensitive to transport cost. As p increases further, the increase in the number of clusters diminishes. This is presumably mainly because the space is discrete. Notice that when c is smaller, fewer cells are required to hold the same number of agents, making it possible for the space to contain more distinct clusters. In fact, this is the reason that, in equilibrium, smaller c yields more clusters.
Figure 6.3: Effect of p and c on number of clusters (horizontal axis: p from 0 to 1; vertical axis: number of clusters; one curve for each of c = 0.1, 0.3, 0.5, 0.7, 0.9).
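Counting clusters requires an operational definition of a cluster. The following is a minimal Python sketch, assuming that a cluster is a set of occupied cells connected through the eight-cell neighborhood on the torus; this definition and the function name are assumptions made for illustration and may differ from the rule used to produce figure 6.3.

```python
def cluster_sizes(pop):
    """Return the number of agents in each cluster, largest first."""
    rows, cols = len(pop), len(pop[0])
    seen = set()
    sizes = []
    for r in range(rows):
        for c in range(cols):
            if pop[r][c] == 0 or (r, c) in seen:
                continue
            # Flood fill over occupied cells that touch each other.
            stack = [(r, c)]
            seen.add((r, c))
            agents = 0
            while stack:
                cr, cc = stack.pop()
                agents += pop[cr][cc]
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        nb = ((cr + dr) % rows, (cc + dc) % cols)
                        if pop[nb[0]][nb[1]] > 0 and nb not in seen:
                            seen.add(nb)
                            stack.append(nb)
            sizes.append(agents)
    return sorted(sizes, reverse=True)
```

The number of clusters is then `len(cluster_sizes(pop))`, and the average cluster size is the total population divided by that number.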
Let us consider the role of the two parameters in terms of the size of clusters. Figure 6.4 shows the average size of clusters, measured by the number of agents that belong to a cluster, for the same sets of p and c as in figure 6.3. Since the total number of agents is 2500, each point in figure 6.4 is equal to 2500 divided by the number of clusters in figure 6.3. For small p and c, the size of clusters is 2500 for the same reason that the number of clusters is one, as explained above. Similarly, it decreases as p increases for the same reason stated above. For large values of p, the differences between values of c appear marginal compared to those for small values of p (although differences do exist, as seen in figure 6.3).
Figure 6.4: Effect of p and c on average cluster size (vertical axis: average size of clusters in number of agents, up to 2500; one curve for each of c = 0.1, 0.3, 0.5, 0.7, 0.9).
Recall that in figure 6.1, we observed that certain combinations of p and c apparently result in hierarchical structures and other combinations make either one mega-cluster or many small clusters of approximately equal size. It would be of interest to quantitatively classify p and c by those three types of spatial patterns. One way to assess the classifications above is to compute the variance of cluster sizes. Notice that (1) if there is only one cluster in the space, the variance should be equal to zero; (2) if there are only small clusters of approximately equal size, the variance should still be small; and (3) if the sizes of clusters describe a hierarchy, the variance of cluster sizes should be larger than that of the two other spatial patterns. Figure 6.5 shows the variance of cluster sizes on the p-c space. There clearly exists an area with large variance, so we now know that this model will generate a hierarchical structure as long as p and c are within that area. Below that area, where the variance is zero in the figure (the flat deep blue area with low p and low c), is the domain where we observe a mega-cluster. Above the area (high p and high c) is the domain where small clusters of approximately equal size emerge. For the simulations performed for this analysis, the variances in this latter domain ranged approximately from 500 to 50,000.
Figure 6.5: Variance of cluster sizes on the p − c space.
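The three cases distinguished above can be checked on small hand-made examples. The following is a minimal Python sketch, assuming the population variance (so that a single cluster gives exactly zero); the choice of estimator and the sample size lists are illustrative assumptions.

```python
import statistics


def size_variance(cluster_sizes):
    """Population variance of cluster sizes; zero when there is one cluster."""
    return statistics.pvariance(cluster_sizes)


if __name__ == "__main__":
    print(size_variance([2500]))                  # (1) one mega-cluster
    print(size_variance([100] * 25))              # (2) equal-sized small clusters
    print(size_variance([1200, 600, 400, 300]))   # (3) hierarchical sizes
```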
Discussion

The purpose of this chapter is to offer an example of the generalized spatial agglomeration model developed in chapter 4, and to illustrate the behaviors of the generalized model with simple settings. It has been shown that the desired characteristics of the space and the preferences of agents can be incorporated into the generalized model using the Q-vector and utility functions for global and local searches. It has also been shown that, in this simple version of the generalized model, two parameters, those for travel cost and diseconomy, determine the resulting spatial patterns, which include (1) one mega-cluster, (2) many small clusters of approximately equal size and (3) hierarchical structure. Recall that, in the highly simplified model presented in chapter 3, hierarchical spatial patterns cannot be sustained over time. In fact, the only possible spatial patterns in equilibrium there are the single mega-cluster or multiple clusters of exactly equal size. Note that the utility functions of the model in this chapter are analogous to those in the highly simplified model in chapter 3; yet the model in this chapter shows sustainable hierarchical structures, and thus more realism. Even though the generalized model in this chapter does produce relatively realistic spatial patterns, the conditions under which those spatial patterns emerge are restricted to a small range of parameters, as seen in figure 6.5. In other words, a hierarchical structure in this model would be inherently unstable, collapsing with very little change in either travel cost or the coefficient of diseconomy.
An Experiment—Heterogeneous Distribution of Transport Cost
Recall that, earlier in this chapter, it was pointed out that hierarchical structures emerge when all kinds of migrants (as introduced by Ravenstein) are present. The instability of the hierarchical structure arises from the fact that only a small range of both p and c allows multiple migrant types to exist. One way to address this problem is to assign different travel costs to the agents. Figure 6.6 shows sample spatial patterns that emerge under such a circumstance. Each agent was assigned a travel cost p drawn from a uniform distribution between 0 and 1. This specification of travel cost clearly offers stable hierarchical structures with a primate city, except for c = 0.1. The reason that it does not when c is small can be explained using figure 6.5: when c is very small, all agents form a single mega-cluster for every p less than 1. In this case, the short-journey migrants do not exist; thus, there is no hierarchy. Similarly, one could also impose different distributions of transport costs and/or heterogeneous diseconomy coefficients if desired.
Figure 6.6: Spatial patterns with travel cost uniformly distributed from 0 to 1 (panels: c = 0.1, c = 0.3, c = 0.5, c = 0.7, c = 0.9).
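The effect of heterogeneous travel costs on an individual choice can be illustrated with a small example. The following is a minimal Python sketch, assuming each agent draws her own p uniformly from [0, 1] and then applies the gravity-type utility of the global search; the function names and the two candidate cells (a nearby small cluster and a distant large one) are hypothetical.

```python
import random


def assign_travel_costs(n_agents, rng=random):
    """Draw one travel-cost parameter p per agent, uniform on [0, 1]."""
    return [rng.uniform(0.0, 1.0) for _ in range(n_agents)]


def preferred_cell(gammas, distances, p):
    """Index of the cell with the highest Gamma / d**p for this agent's p."""
    forces = [g / d ** p for g, d in zip(gammas, distances)]
    return forces.index(max(forces))


if __name__ == "__main__":
    gammas = [20.0, 200.0]     # nearby small cluster, distant large cluster
    distances = [2.0, 40.0]
    print(preferred_cell(gammas, distances, p=0.1))  # low travel cost
    print(preferred_cell(gammas, distances, p=0.9))  # high travel cost
```

In this example an agent with p = 0.1 prefers the distant large cluster while an agent with p = 0.9 prefers the nearby small one, so a population with mixed p contains both long-journey and short-journey migrants at the same time.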
The Rank-Size Rule
Although we have charted, in figure 6.5, the area of p and c that generates hierarchy, let us further examine the model's consistency with the real world. The Rank-Size Rule is a commonly used rule, originally proposed by Zipf (1949). According to this rule, the second largest city should contain 1/2 as many people as the largest, the third 1/3, the fourth 1/4, and so on; in other words, a city's population multiplied by its rank is expected to be roughly constant. Using this empirical rule, one can evaluate the performance of an urban agglomeration model. One way to quantify how consistent a population distribution is with the rank-size rule is to use the variance of the product of each cluster's size and its rank. If a population distribution is perfectly consistent with the rank-size rule, the variance should be equal to zero. As it becomes less consistent with the rank-size rule, the variance should increase. In addition, in order for this index to be independent of the total population size, divide the variance by the square of the total population. The resulting value should be able to measure the closeness to the perfectly rank-sized hierarchical structure. Let us call it the RSR index.

For example, consider the perfectly rank-sized distribution where the largest cluster contains 1200 agents; the second largest contains 600 agents; the third contains 400 agents; and the fourth contains 300 agents, as shown in table 6.1. In this case, of course, the RSR index is zero. In table 6.2, the population distribution is a similar hierarchical structure, but it is slightly closer to the uniform distribution than the previous case. In this case, the RSR index is 0.0036.

Rank                                Size (total = 2500)    Weighted Size
1                                   1200                   1200
2                                   600                    1200
3                                   400                    1200
4                                   300                    1200
Variance                                                   0
RSR index (normalized by 2500^2)                           0

Table 6.1: Perfectly Rank-Sized Distribution

Rank                                Size (total = 2500)    Weighted Size
1                                   1000                   1200
2                                   700                    1400
3                                   500                    1500
4                                   300                    1200
Variance                                                   22500
RSR index (normalized by 2500^2)                           0.0036

Table 6.2: Not Rank-Sized Hierarchical Distribution
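The RSR index can be computed in a few lines. The following is a minimal Python sketch, assuming the sample variance (denominator n - 1), which is the choice that reproduces the variance of 22500 reported in table 6.2; the function names are illustrative. The last call uses the Weighted Size column exactly as printed in table 6.2 rather than recomputing it from the Size column.

```python
def sample_variance(values):
    """Variance with denominator n - 1 (an assumption, see the lead-in)."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)


def rank_weighted(sizes):
    """Multiply each cluster size by its rank (1 = largest)."""
    ordered = sorted(sizes, reverse=True)
    return [rank * size for rank, size in enumerate(ordered, start=1)]


def rsr_index(weighted_sizes, total_population):
    """Variance of rank-weighted sizes divided by the squared total population."""
    return sample_variance(weighted_sizes) / total_population ** 2


if __name__ == "__main__":
    print(rsr_index(rank_weighted([1200, 600, 400, 300]), 2500))  # table 6.1
    print(rsr_index([1200, 1400, 1500, 1200], 2500))              # table 6.2, as printed
```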
Figure 6.7 illustrates how the RSR index changes over the values of p under c = 0.2, where each point shows the average of the RSR indices obtained from 50 simulations. The RSR index appears to have its minimum level, or 0.006, at p = 0.59. According to data from the US Census 1996, the RSR index of the United States ranges from approximately 0.001 to 0.006 [20], so the model gives realistic spatial patterns as far as population distribution over cities is concerned. Similarly, in figure 6.8, the values of p that give the minimum RSR index for other values of c are overlaid on figure 6.5. It is shown that the combinations of p and c that generate the most realistic distributions reside on the upper edge of the strip with high variances in the figure.
Figure 6.7: RSR Index vs. p under c = 0.2 .
[20] This value depends on how many cities are included in the calculation. I used the largest 10 to 50
Figure 6.8: Set of p and c that offer realistic population distributions added to figure 6.5.
CHAPTER SEVEN: APPLICATION #2: RESOURCE DEPLETION AND POLLUTION ACCUMULATION

In the previous application, land is homogeneous and its qualities do not change over time. In other words, all elements in the Q-vector are functions of population only. Hence, since there is no population growth, no substantial dynamics are defined in the ODE of the Q-vector, as shown in equations (22) and (23). In this chapter, resources and pollution are introduced as properties of the space to demonstrate how the properties of cells are used in the generalized spatial agglomeration model, and to observe how they affect the resulting spatial patterns the agents create.

Specifications
In addition to the two properties, qi ,1 and qi , 2 , in the previous application, there are two new properties to cell i . Define qi ,3 as the amount of resources that are consumed by agents in i and qi , 4 as the level of the pollution from which agents in i would suffer. Thus, the Q-vector is:
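\[
Q_i = \begin{pmatrix} q_{i,1} \\ q_{i,2} \\ q_{i,3} \\ q_{i,4} \end{pmatrix}
    = \begin{pmatrix} n_i \\ -c\, n_i^{2} \\ c_r \times \text{amount of resources at } i \\ -c_p \times \text{pollution level at } i \end{pmatrix}
\]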
where cr and c p respectively represent each agent’s sensitivities to resources and pollution.
As in the previous application, there is no birth-death process, so $q_{i,1}$ and $q_{i,2}$ do not change endogenously. Resources are consumed by each agent in $i$ at a constant rate $r_\phi$, so the amount $q_{i,3}$ depletes by $r_\phi\, q_{i,1}$ per unit of time, since $q_{i,1}$ is the population in cell $i$. Similarly, the pollution level is raised by each agent in $i$ at a rate $r_\tau$, so $q_{i,4}$ increases by $r_\tau\, q_{i,1}$ per unit of time. Therefore, the ODE of the Q-vector is written as:
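\[
\dot{Q}_i = \begin{pmatrix} \dot{q}_{i,1} \\ \dot{q}_{i,2} \\ \dot{q}_{i,3} \\ \dot{q}_{i,4} \end{pmatrix}
          = \begin{pmatrix} 0 \\ 0 \\ -r_\phi\, q_{i,1} \\ r_\tau\, q_{i,1} \end{pmatrix}
\]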
with qi ,3 ≥ 0 (Note that the amount of resources cannot be negative, so once qi ,3 reaches zero, it stays zero). The initial condition for the population is the same as before; qi ,1 (0) is either zero or a random positive integer (≤ 2500) that satisfies equation (24). As for the initial condition of resources, we will try different distributions. Also, we assume pollution does not exist at the beginning of time, so qi , 4 (0) = 0 for all i . Thus, write:
\[
Q_i(0) = \begin{pmatrix} q_{i,1}(0) \\ -c\, q_{i,1}^{2}(0) \\ q_{i,3}(0) \\ 0 \end{pmatrix}
\]
For the global search, simply add q3 and q4 to equation (25) and take the linear sum; namely,
\[
\Gamma(Q_i) = q_{i,1} + q_{i,3} + q_{i,4}
\]
Hence, the utility function for the global search is:
\[
G(Q_i) = \frac{\Gamma(Q_i)}{d_{ij}^{\,p}} = \frac{q_{i,1} + q_{i,3} + q_{i,4}}{d_{ij}^{\,p}}
\]
For the local search, take the linear sum of all elements in the Q-vector.
\[
\Lambda(Q_h) = q_{h,1} + q_{h,2} + q_{h,3} + q_{h,4}
\]
Moreover, we assume that each agent cannot use the resources in her neighborhood except for that in her own cell, but does suffer from the pollution produced in her neighborhood. Therefore,
\[
V = \begin{pmatrix} 1 \\ 0 \\ 0 \\ 1 \end{pmatrix}
\]
Thus, the utility function for the local search is:
\[
L_h = \Lambda(Q_h) + \sum_{k \,\in\, h\text{'s neighborhood}} \Lambda(V \otimes Q_k)
    = q_{h,1} + q_{h,2} + q_{h,3} + q_{h,4} + \sum_{k \,\in\, h\text{'s neighborhood}} \left( q_{k,1} + q_{k,4} \right)
\]
Simulation Results and Discussions

Case 1: Random endowment of resources, No pollution, cr=50, rφ=0.2 and p=0.1.
For our first example, let us focus on the effect of limited resources distributed over the space. The initial distribution of resources is random; $q_{i,3}(0)$ is a random value in [0, 1] (see figure 7.1). The coefficient for resources, $c_r$, is 50, which allows a cell with enough resources to be more attractive than the productivity gained simply by agglomeration. The depletion rate of resources is 0.2; therefore, a single agent takes at most five units of time to deplete all the resources in a cell. Transport cost is low enough (p = 0.1) that agents are able to travel long distances. The parameter c is 0.3. Finally, we assume that agents' decisions are not influenced by pollution in this example; thus $c_p = 0$. All other conditions are the same as before.
Figure 7.1: Initial distribution of resources for Case 1.
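The depletion time quoted above follows directly from the ODE for resources. The following is a minimal Python sketch, assuming that the raw amount of resources in a cell (a value in [0, 1]) falls at the rate $r_\phi$ times the number of agents in the cell and that the population stays constant in the meantime; the function name is illustrative.

```python
def time_to_depletion(resources, r_phi, n_agents):
    """Time until a cell's resources reach zero at a constant consumption rate."""
    if n_agents == 0 or r_phi == 0:
        return float("inf")          # nobody consumes, or resources are renewable
    return resources / (r_phi * n_agents)


if __name__ == "__main__":
    print(time_to_depletion(1.0, 0.2, 1))    # fullest possible cell, one agent
    print(time_to_depletion(1.0, 0.2, 10))   # same cell shared by ten agents
```

A cell with the maximum endowment of 1 lasts five units of time for a single agent, and proportionally less as more agents gather in it, which is why resource-rich cells lose their attraction soon after clusters form on them.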
Figure 7.2 shows an example of simulations under these conditions. The agents first move to nearby cells that are initially endowed with more resources (t = 1). As a result, those cells attract more agents due to their resources and agglomeration economies (t = 2 to 3). As the simulation advances, however, the resources in those clusters are depleted and the agents start to move out of the clusters (t = 4). Then, the agents create new clusters where there are still resources (t = 5 to 7). When most of the resources in the whole space have been consumed, agents begin to decide more by economies of scale than by the amount of resources, making larger clusters (t = 8 to 10). Eventually, only one mega-cluster remains in the entire space (t = 11). Recall that when p = 0.1 and c = 0.3 in the simpler version presented in the previous chapter, agents form a mega-cluster from the beginning of the simulation (see figure 6.2). There, it takes only about two units of time until most agents create the final mega-cluster, whereas in the presence of resources it takes about 10 units of time. Notice also that before the simulation reaches its equilibrium, the spatial pattern exhibits a hierarchical structure (t = 8). Although this is not a sustainable hierarchical structure, it should be noted that there are factors other than the specific range of transport costs that could cause hierarchical structures to emerge.
Figure 7.2: Formation of clusters when resources exist with low transport cost (color bar ticks: 4, 16, 64, 256, 1024, 4096).
Case 2: Random endowment of resources, No pollution, cr=50, rφ=0.2 and p=0.5.
This is the case where p = 0.5 with all other parameters unchanged from the previous example. A sample result is shown in figure 7.3. Unlike the case with p = 0.1, the agents do not create “temporary” clusters that are fated to disappear before the equilibrium. Instead, they stay in their original locations or make only short trips to consume local resources first (t = 1 to 2). Only after most of the resources are depleted do the agents start forming clusters (t = 3), and eventually the system reaches its equilibrium (t = 7).

The presence of resources delays the formation of clusters. Compare the path for the same set of p and c in figure 6.2: the spatial pattern at t = 1 in figure 6.2 is close to that at t = 4 in figure 7.3. The spatial pattern at equilibrium appears to be unaffected by the resources.
Figure 7.3: Formation of clusters when resources exist with high transport cost (color bar ticks: 4, 16, 64, 256, 1024, 4096).
Case 3: Random endowment of resources, cp=0.1, rτ=0.1, cr=50, rφ=0.2 and p=0.1.
We now add pollution to Case 1. The agents' sensitivity to pollution, $c_p$, is 0.1 instead of zero and the accumulation rate, $r_\tau$, is 0.1. All other parameters are the same as in Case 1. Figure 7.4 is a sample result under these conditions. We observe that the result shows a behavior similar to Case 1 up to the point when the agents form a mega-cluster (t = 0 to 12). In Case 1, this would be the equilibrium and the cluster would stay there forever. With pollution, however, the cluster stays only for a while (t = 14 to 19) while agents accumulate pollutants in the cluster. When the pollution reaches a certain level, the agents start to “escape” from the cells with pollutants to other cells nearby (t = 20 to 22). After agents finish this group migration, they stay in the new cluster for a while (t = 24 to 26) and move again (t = 28 to 30), repeating the process described above.
Figure 7.4: Formation of clusters and behavior of mega-cluster with both resources and pollution (low transport cost) (color bar ticks: 4, 16, 64, 256, 1024, 4096).
Case 4: Random endowment of resources, cp=50, rτ=1, cr=50, rφ=0.2 and p=0.1.
In this example, $c_p$ and $r_\tau$ have been increased to 50 and 1 respectively, with other parameters held unchanged from Case 3. In other words, this is the case where pollution is dominant. The distinctive spatial pattern under this condition is shown in figure 7.5. We clearly observe the formation of “orphaned” cells instead of clusters that consist of cells adjacent to each other. This happens because each agent is sensitive to the pollution in her neighborhood, not only in her own cell; thus, she avoids having neighbors to minimize the negative effect of pollution.
Figure 7.5: Formation of clusters under larger influence of pollution.
Case 5: Skewed endowment of resources, No pollution, cr=50, rφ=0.2 and p=0.5.
This case is the same as Case 2 except that the initial distribution of resources is not random. The resources are initially distributed with a Gaussian weighting whose peak is in the center of the space, as shown in figure 7.6. This is an example where a specific part of the space has an absolute advantage that does not last (e.g. mining).
Figure 7.6: Initial distribution of resources for Case 5.
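A skewed endowment of this kind can be sketched as follows. This is only an illustration under stated assumptions: the function name `skewed_resources`, the lattice size and the Gaussian width `sigma` are hypothetical, and we assume that each cell's uniform random amount in [0, 1] is simply multiplied by the Gaussian weight. The source does not report the exact weighting used.

```python
import math
import random


def skewed_resources(size=50, sigma=8.0, seed=0):
    """Return a size x size grid of resource amounts weighted by a Gaussian.

    Each cell draws a uniform random amount in [0, 1] and scales it by a
    Gaussian weight that peaks at the center of the lattice. The width
    `sigma` is a hypothetical parameter; the source does not report it.
    """
    rng = random.Random(seed)
    center = (size - 1) / 2.0
    grid = []
    for row in range(size):
        line = []
        for col in range(size):
            dist_sq = (row - center) ** 2 + (col - center) ** 2
            weight = math.exp(-dist_sq / (2.0 * sigma ** 2))
            line.append(rng.random() * weight)
        grid.append(line)
    return grid
```

Cells near the center keep most of their random endowment, while cells far from the center receive almost nothing, which gives the central area its temporary advantage.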
Figure 7.7 shows a result under this situation. First, due to the high concentration of resources, agents create the first cluster in the center (t = 0.2) and it grows (t = 0.4 − 0.8) . However, as the central part of the space loses its resources, it also starts to lose inhabitants (t = 1) . Eventually, it completely loses its attractive force and creates a sprawl (t = 1.8 − 2.0) . Agents then develop satellite cities around the center of the space (t = 3 − 7) .
This example demonstrates the model’s potential to recreate complex phenomena in the real world. In the next example, we will see a somewhat different approach to simulating urban sprawl.
Figure 7.7: Behavior of agents with depletable resources concentrated in the center of the space.
Case 6: Skewed endowment of resources, cp=1, rτ=0.1, cr=50, rφ=0 (recoverable resources) and p=0.5.
In the previous case, agents leave the center of the cluster because of the depletion of resources. In this example, resources are never depleted, or rφ = 0. (Imagine the resources to be wood or other kinds of plants that can be naturally replenished.) However, let us assume again that agents pollute indiscriminately, eventually expelling themselves (cp = 1, rτ = 0.1). A result is shown in figure 7.8. As in the previous example, agents start to create a mega-cluster in the center of the space (t = 0.5) and the cluster simply grows (t = 1 − 3) while they continue to pollute. Since the center of the cluster received immigrants first and has held agents longer than other areas, its pollution level is the highest. Thus, the center loses its inhabitants first (t = 4), creating a hole in the cluster (t = 4.5 − 6). The hole continues to grow (t = 6.5 − 7). Eventually, the “ring” breaks into several smaller clusters that continue to move around through the space (t = 13). Notice that although the environment set for this example is quite different from the previous case, it forms a similar spatial pattern in the early stages.
Figure 7.8: Behavior of agents with renewable resources concentrated in the center of the space and with accumulation of pollutants.
Case 7: Skewed endowment of resources, cp=1, rτ=0.1, cr=50, rφ=0 (recoverable resources), pollutant purification, and p=0.5.
In this example, the pollutants produced by agents diminish over time, as is sometimes the case in Nature. This can be represented by simply altering the fourth element of equation (31):
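$$
\dot{Q}_i =
\begin{pmatrix} \dot{q}_{i,1} \\ \dot{q}_{i,2} \\ \dot{q}_{i,3} \\ \dot{q}_{i,4} \end{pmatrix}
=
\begin{pmatrix} 0 \\ 0 \\ -r_\varphi q_1 \\ r_\tau q_1 - r_p \end{pmatrix}
$$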
where rp is the rate of purification. Note that this rate does not depend on the population of agents since the purification is assumed to be accomplished by Nature.
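To make the dynamics of resources and pollution concrete, the following is a minimal sketch of one explicit Euler step for a single cell. The function name `step_cell`, the step size `dt` and the clamping of both quantities at zero are assumptions made for illustration; the source states only the rates of change.

```python
def step_cell(population, resources, pollution,
              r_phi, r_tau, r_p, dt=0.1):
    """Advance one cell's resources and pollution by one Euler step.

    resources' = -r_phi * population
    pollution' =  r_tau * population - r_p
    Clamping at zero is an assumption made here, not taken from the source.
    """
    resources = max(0.0, resources - r_phi * population * dt)
    pollution = max(0.0, pollution + (r_tau * population - r_p) * dt)
    return resources, pollution
```

With rφ = 0 the resources stay constant, as in Cases 6 and 7. In a cell that agents have abandoned, the pollution term reduces to −rp, so the pollutants are purged at a rate that does not depend on population.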
Figure 7.9 shows a sample simulation under rp = 1 . As in the previous case, the agents form a ring-like cluster (t = 0 − 8) . Then, while agents are gone from the central part of the lattice, the pollutants are purged (t = 10 − 12) . Since this area is still attractive because of its high concentration of (recoverable) resources, some agents move back to the central area, which they once abandoned (t = 14 − 18) . They, however, produce pollutants again and evacuate that area (t = 20 − 30) . The agents repeat this process and the system exhibits periodic behavior.
Figure 7.9: Behavior of agents with renewable resources concentrated in the center of the space and with pollutant purification.
CHAPTER EIGHT: APPLICATION 3: PROPAGATION

In the real world, population growth plays a significant role in the formation of spatial patterns. One of the benefits of using agent-based models is that population growth is straightforward to handle: we simply add more agents. The addition of new agents should follow certain rules. For example, it is reasonable to place the new agents in the cell in which other agents (their parents) live, or in its neighborhood. Also, the birth rate could be a constant or a function of the Q-vector. In this application, newborn agents are initially placed in the same cell as their parents, and each agent gives birth to a new agent stochastically with a constant probability.
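The birth rule described above can be sketched as follows. This is a hedged illustration rather than the code used in this study: the function name `births` and the dictionary representation of the lattice are hypothetical, and we assume that the probability applies once per agent per time step.

```python
import random


def births(population, r_n, rng):
    """Add newborns to each cell; every agent reproduces with probability r_n.

    `population` maps a cell (row, col) to its number of agents. Newborns are
    placed in the parent's cell, as in the text.
    """
    updated = dict(population)
    for cell, count in population.items():
        newborns = sum(1 for _ in range(count) if rng.random() < r_n)
        updated[cell] = count + newborns
    return updated
```

For example, `births({(25, 25): 1}, 0.1, random.Random(0))` returns the population after one step starting from a single seed agent. Because each birth is a random draw, the timing of early growth differs from run to run.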
The Q-vector of this version is the same as the one in the last chapter—or equation (30)—except that we do not consider pollution (for the sake of simplicity). Thus,
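$$
Q_i =
\begin{pmatrix} q_{i,1} \\ q_{i,2} \\ q_{i,3} \end{pmatrix}
=
\begin{pmatrix} n_i \\ -c\,n_i^{2} \\ c_r \times \text{amount of resources at } i \end{pmatrix}
$$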
The main difference from previous applications is that $\dot{q}_{i,1}$ is no longer always equal to zero. In fact, if we have a constant probability of birth, $\dot{q}_{i,1}$ can be any value in [0, 1] and it changes at every moment. However, on the average or when $q_{i,1}$ is large enough, $\dot{q}_{i,1}$ becomes equal to the constant probability. Thus,

$$
\dot{Q}_i =
\begin{pmatrix} \dot{q}_{i,1} \\ \dot{q}_{i,2} \\ \dot{q}_{i,3} \end{pmatrix}
=
\begin{pmatrix} r_n \\ c\,r_n^{2} \\ -r_\varphi q_1 \end{pmatrix}
$$
where rn is the probability of birth. Note that we assume that a single agent can give birth to a new agent without a mate, although this is not the case for human beings. To resolve this, consider the agents as family units rather than individuals. In future studies, this can be improved by introducing gender and by imposing stricter conditions on propagation. At t = 0, only one agent is placed in one cell (footnote 21), so the initial condition for equation (40) is:
$$
Q_i(0) =
\begin{pmatrix} q_{i,1}(0) \\ -c\,q_{i,1}^{2}(0) \\ q_{i,3}(0) \end{pmatrix}
$$
where $q_{i,1}(0) = 0$ for $i = 1, 2, \ldots, m-1, m+1, \ldots, N$ and $q_{m,1}(0) = 1$. The initial distribution of resources, or $q_{i,3}(0)$, is the same as Case 1 in the previous chapter; $q_{i,3}(0)$ takes a random value in [0, 1] for all i (see figure 7.1).
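The initial condition above can be assembled as in the following sketch. The function name `initial_state`, the default value of c and the choice of the central cell as the seed cell m are placeholders for illustration only.

```python
import random


def initial_state(size=50, c=0.01, seed=0):
    """Build Q_i(0) for every cell of a size x size lattice.

    Returns a dict mapping (row, col) to [q1, q2, q3], where q1 is the
    population, q2 = -c * q1**2 and q3 is a random resource level in [0, 1].
    The value of c and the lattice size are placeholders.
    """
    rng = random.Random(seed)
    seed_cell = (size // 2, size // 2)  # cell m, chosen here for illustration
    state = {}
    for row in range(size):
        for col in range(size):
            q1 = 1 if (row, col) == seed_cell else 0
            state[(row, col)] = [q1, -c * q1 ** 2, rng.random()]
    return state
```

Every cell except the seed cell starts with zero population, so its second element is also zero, while the third element is drawn independently for each cell.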
Footnote 21: Although the first agent is located in the center of the lattice in the figures in this chapter, the location of the “center” does not affect the result since the space is a torus.
Figure 8.1 shows sample results for three different transport costs (p = 0.1, 0.5 and 0.9). The probability of birth, rn, was set to 0.1. For each level of transport cost, the graph in the top row shows the population growth (dashed lines) and the change in the number of clusters (solid lines). Note that although the population explosions take place at different times in the three examples (e.g., at t = 0 for p = 0.1 and around t = 20 for p = 0.5), this is not because of the different transport costs. Under the same probability of birth, the explosions occur stochastically, especially at the early stages when there are only a few agents.

Initially, for each level of transport cost, the agents have not formed clusters, so the population and the number of clusters are the same and the dashed and solid lines overlap. During this period, individual agents appear to walk randomly; in fact, they repeat short journeys seeking new resources in their neighborhoods, since resources are a stronger attractive force than agglomeration. Eventually, resources in the whole space run short and agglomeration becomes more attractive to the agents. Consequently, the number of clusters diverges from the population trajectory (see the top graphs in figure 8.1) and its rate of growth slows down. Finally, it starts to decline due to further agglomeration.

Compare p = 0.1 and p = 0.5. For p = 0.1, the number of clusters begins to decline, that is, the agents start to agglomerate, when the population is around 60, whereas for p = 0.5 this happens when the population is around 100. This is because when transport cost is high, the population needs to be dense in order for an agent to find another agent in her vicinity with whom to form a new cluster.
[Figure 8.1 plots, one per transport cost. Legend: Agents, Clusters. Vertical axis: Number of Clusters/Agents. Horizontal axis: Time (0 to 100).]
Figure 8.1: Propagation of agents.
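The number of clusters plotted in figure 8.1 can be obtained in several ways. The sketch below shows one plausible counting rule, assuming that a cluster is a connected group of occupied cells under Moore adjacency and that the lattice wraps around as a torus. The function name `count_clusters` and this definition of a cluster are assumptions for illustration; the counting procedure actually used is not restated in this chapter.

```python
def count_clusters(occupied, size):
    """Count connected groups of occupied cells on a size x size torus.

    `occupied` is a set of (row, col) cells holding at least one agent. Two
    cells belong to the same cluster when they are Moore neighbors, with
    row and column indices wrapping around the lattice edges.
    """
    unvisited = set(occupied)
    clusters = 0
    while unvisited:
        clusters += 1
        stack = [unvisited.pop()]
        while stack:
            row, col = stack.pop()
            for d_row in (-1, 0, 1):
                for d_col in (-1, 0, 1):
                    neighbor = ((row + d_row) % size, (col + d_col) % size)
                    if neighbor in unvisited:
                        unvisited.remove(neighbor)
                        stack.append(neighbor)
    return clusters
```

Under this rule, isolated agents each count as one cluster, which is consistent with the two curves overlapping before agglomeration begins.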
Recall that in the previous example, as agents diffuse, some of them reach the lattice’s boundaries, which are connected to each other. Therefore, the resulting spatial patterns are equivalent to those from an infinite space with a “seed” agent deployed every 50 cells. By expanding the unit size of the space, we should be able to observe spatial patterns in which the boundary effect is reduced. Figure 8.2 presents a case of p = 0.5. The only difference from the previous example (see figure 8.1) is that the size of the space is now 100×100 cells instead of 50×50. Initially (up to t = 30), agents simply disperse, forming a “cloud” of agents as in the previous case for t = 0 to 30. Then, at t = 40, they form the first clusters in the center of the cloud due to the denser population in that area. This continues as the cloud expands while the initial clusters grow further (t = 50). At t = 60, we observe a hierarchical spatial pattern, in which the larger clusters are located in the center and the smaller ones are located in the outer tier.
Figure 8.2: Propagation of agents in a larger space.
CHAPTER NINE: APPLICATION 4: GEOGRAPHY

In order for the model to be applied to specific regions, it is essential to include real geography in the space. The geographical features might include complex coastlines, political borders, lakes, rivers, mountains, etc., each of which has unique effects on agents’ preferences. In this chapter, an example is presented to show that those geographical features can be incorporated into the Generalized Spatial Agglomeration Model simply by adding a new property to the Q-vector.
The first three elements of the Q-vector ($q_1$, $q_2$ and $q_3$) are the same as in equation (39) in the previous chapter. In order to include geography in the model, we add another element to the vector; that is,
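$$
Q_i =
\begin{pmatrix} q_{i,1} \\ q_{i,2} \\ q_{i,3} \\ q_{i,4} \end{pmatrix}
=
\begin{pmatrix} n_i \\ -c\,n_i^{2} \\ c_r \times \text{amount of resources at } i \\ g_i \end{pmatrix}
$$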
where $g_i$ is an index that represents the “habitability” and the attraction level of location i. For example, if location i is in the ocean or a lake, $g_i$ is 0, making it impossible for agents to live in i; if location i is habitable but not very attractive land (e.g. infertile or mountainous areas), then $g_i$ could be 1; if it is land facing the ocean (and thus extremely attractive), $g_i$ may be 10.
Unlike the model in the previous chapter, the birth rate here is not constant. Instead, we assume that the more urbanized the location is, the fewer children people decide to have; thus, the birth rate is now a function of population. One way to realize this is to divide the birth rate by the population. The ODEs for $q_{i,2}$ and $q_{i,3}$ are inherited from equation (40), and $q_{i,4}$ is assumed to be constant (footnote 22). So, the ODE is written as:
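$$
\dot{Q}_i =
\begin{pmatrix} \dot{q}_{i,1} \\ \dot{q}_{i,2} \\ \dot{q}_{i,3} \\ \dot{q}_{i,4} \end{pmatrix}
=
\begin{pmatrix} r_n / q_1 \\ c\,r_n^{2} \\ -r_\varphi q_1 \\ 0 \end{pmatrix}
$$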
The initial condition also inherits elements from the previous chapter with an additional element for geographic features; that is,
$$
Q_i(0) =
\begin{pmatrix} q_{i,1}(0) \\ -c\,q_{i,1}^{2}(0) \\ q_{i,3}(0) \\ q_{i,4}(0) \end{pmatrix}
$$
where $q_{i,1}(0) = 0$ for $i = 1, 2, \ldots, m-1, m+1, \ldots, N$ and $q_{m,1}(0) = 1$. As in equation (41), $q_{i,3}(0)$ takes a random value in [0, 1] for all i. The initial condition for $q_{i,4}$ is given externally for each cell, as shown in figure 9.1. As suggested earlier, cells facing the ocean or lakes are given high values (those in California are given even higher values because of the attraction of gold early in the history of the United States). However, coasts further north than the Great Lakes receive a handicap due to their cold climate.
Footnote 22: This assumption should be relaxed if we consider the time scale of natural history.
Figure 9.1: Initial condition for $q_{i,4}$ for all i (North American map).
Unlike the utility functions used in previous chapters, we do not simply take the linear sum of all elements of the Q-vector. Instead, we treat $q_{j,4}$ in a special manner since it affects how each agent perceives the other elements. For example, if location j is uninhabitable ($q_{j,4} = 0$) because it is in the ocean, then that fact should invalidate the other elements. Hence, an effective form for $\Gamma(Q_j)$ might be:
$$
\Gamma(Q_j) = q_{j,4}\,(q_{j,1} + q_{j,2} + q_{j,3})
$$
Thus, the utility function for the global search is:
$$
G_j = \frac{q_{j,4}\,(q_{j,1} + q_{j,2} + q_{j,3})}{d_{ij}^{\,p_a}}
$$
The parameter p has the subscript a because we distribute various transport costs over the agents, as in Chapter 6. Similarly, for the local search, the utility level solely from location h is:
$$
\Lambda(Q_h) = q_{h,4}\,(q_{h,1} + q_{h,2} + q_{h,3})
$$
Let us make the same assumptions for the neighborhood effect as before for population and resources. In addition, the geographic features of location h’s neighborhood by themselves do not affect one’s preferences over location h. Hence,

$$
V = \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix}
$$
Therefore, the total utility level for the local search is:
$$
L_h = \Lambda(Q_h) + \sum_{k \,\in\, h\text{'s neighborhood}} \Lambda(V \otimes Q_k)
    = q_{h,4}\,(q_{h,1} + q_{h,2} + q_{h,3}) + \sum_{k \,\in\, h\text{'s neighborhood}} q_{k,1} \cdot q_{k,4}
$$
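The utility functions for the global and local searches can be evaluated as in the sketch below. The function names are hypothetical, a Q-vector is represented as a plain list [q1, q2, q3, q4], and the distance between the origin i and the destination j is assumed here to be Euclidean with i and j distinct. The distance measure itself is defined elsewhere in the study and is not restated in this chapter.

```python
import math


def gamma(q):
    """Location utility: q4 * (q1 + q2 + q3) for a Q-vector [q1, q2, q3, q4]."""
    return q[3] * (q[0] + q[1] + q[2])


def global_utility(q_dest, origin, dest, p_a):
    """G_j = gamma(Q_j) / d_ij ** p_a, with Euclidean d_ij (an assumption)."""
    distance = math.dist(origin, dest)
    return gamma(q_dest) / distance ** p_a


def local_utility(q_here, q_neighbors):
    """L_h = gamma(Q_h) + sum over neighbors k of q_k1 * q_k4."""
    return gamma(q_here) + sum(q[0] * q[3] for q in q_neighbors)
```

Because the fourth element multiplies the other terms, a cell with a geography index of zero has zero utility in both searches regardless of its population or resources.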
Figure 9.2 shows simulation samples under three ranges of transport costs, which are randomly distributed over agents. Every time a new agent is born, it is assigned a random transport cost within the specified range. The first agent is located in Philadelphia, as the snapshots in the top row indicate.

The leftmost column shows the behavior of agents with a low transport cost range (0 < p < 1). Since the agents have higher mobility, the early agents quickly leave the east coast and find the attractive lands on the west coast (t = 10). Then the offspring of those agents explore other areas all over the continent and find other attractive areas such as the Great Lakes and Florida (t = 30 − 40). Those in California and around the Great Lakes develop urban clusters (t > 50). Since most agents tend to belong to a cluster due to their low transport costs, the whole population does not grow fast. (Recall that the birth rate is inversely proportional to population density.)

The rightmost column shows a case of agents with high transport costs. Since p can be up to 100, most of the agents are “sluggish”. In fact, in the simulation shown in the figure, the early agents stay on the east coast for a while (t < 30) and start to shape the first clusters there. Then, instead of jumping to California, the agents first make short trips to the nearer side of the Great Lakes (t = 40). Sometime between t = 40 and 50, a few agents with low transport costs are born and they move to the west coast while some other agents stay around the Great Lakes (t = 50 − 60). The agents with high transport costs slowly expand their territories and create clusters in the eastern half of the continent, whereas those in California form their own cluster by reproducing and by attracting newborn agents with low transport costs (t > 60).

The middle column shows a case between those explained above. The early agents do form clusters around the east coast first (t < 50), but as soon as more mobile agents are born they move to California and create new urban clusters (t = 50 − 70). Eventually, agents on both the east and west sides of the continent develop equally large clusters.
[Figure 9.2 panel columns, left to right: 0 < p < 1, 0 < p < 10, 0 < p < 100.]
Figure 9.2: Formation of urban clusters of North America simulated under various transport cost ranges.
As far as these three cases are concerned, the one with the medium transport cost range (0 < p < 10) appears more realistic than the others. Figure 9.3 shows various results (t = 100) of simulations under this transport cost range. Naturally, the resulting spatial patterns differ from each other because the model is stochastic. In spite of the variety of the results, the east and the west coasts tend to have roughly equal populations.
Figure 9.3: Snapshots of simulations at t = 100 under 0 < p < 10.
CHAPTER TEN: CONCLUSIONS

This study serves four major purposes: (1) to introduce agent-based models to regional science, (2) to construct a generic spatial agglomeration model, (3) to show that the model offers hierarchical structures, and (4) to explore the model’s potential. This chapter summarizes the results from previous chapters with regard to these purposes.

In chapter 3, a very simple agent-based model for spatial agglomeration is introduced. Although the model is highly simplified and rests on unrealistic assumptions, it offers general ideas as to how agent-based models work.

In chapter 4, a generalized spatial agglomeration model is presented. The model is designed so that the user can “customize” it not only for urban agglomeration but also for a wide variety of agglomeration phenomena in general, which is realized by combining global and local search for agents’ migration behaviors. When using this model, the user can specify (1) the attributes of location, (2) their dynamics and (3) each agent’s preferences in a consistent way. Chapter 5 explains the common procedures for the model’s applications demonstrated in the following chapters.

The first application is shown in chapter 6. The factors involved in each agent’s decision-making process are as simple as those of the model in chapter 3, but the model is rewritten in terms of the generalized model and now includes the “global-local hybrid” search as well. The effects of the two key factors, transport cost and carrying capacity, are examined. In particular, it is shown that there is a set of transport costs and carrying capacities that generates a realistic spatial distribution, which is validated using a newly introduced index that measures how well the spatial distribution matches the Rank-Size Rule. The model thus offers realistic spatial patterns, and the results provide a reasonable answer as to how the Rank-Size Rule can be valid in the real world. Recall that, as pointed out in chapter 2, Krugman’s model produces only uniform spatial patterns with no concept of area. Schweitzer and Steinbrink’s model does produce realistic spatial patterns, but the behavioral rules of its agents lack economic sense (as the authors themselves point out, theirs is a “physical” model); therefore, one cannot expect to use that model for policy analyses.

In chapter 7, resources and pollution are added in order to show how the dynamics of spatial or land attributes affect agents’ behaviors. We observe that additional parameters such as the depletion rate of resources, the accumulation rate of pollution and the initial distribution of resources dramatically alter the resulting spatial patterns and the existence of equilibria. It is also shown that this model can simulate urban sprawl.

In chapter 8, the model is shown to be capable of simulating the propagation of agents.

In chapter 9, a real North American map is incorporated into the model. This is done by simply treating the geographic features of each location in the same way as other attributes (such as population and resources) without modifying the model itself. The results show that the model could potentially be applied to real-life problems within the framework of the generalized spatial agglomeration model.

Axtell and Epstein also incorporated additional features such as resources, pollution, propagation, gender, etc. Their model, however, is not constructed in a highly generalized way, so it is not easy to “use” the model for specific purposes. With the Generalized Spatial Agglomeration Model presented in this study, on the other hand, it has been shown that one can include any quantifiable attribute of the space in the model in a consistent way, by introducing a new element to the Q-vector and defining its ODE.

This study is the first to vigorously utilize agent-based models in the field of Regional Science. It has shown that agent-based models are not only useful in the field (especially when combined with existing methods), but could also offer good answers to complex questions that traditional models cannot solve. It is expected that agent-based models will be explored further in the coming decades and will eventually become a norm for the types of analyses where realistic modeling is required.
BIBLIOGRAPHY

Durrett, R. and Levin, S. A. “Stochastic Spatial Models: A User's Guide to Ecological Applications.” Philosophical Transactions of the Royal Society of London B 343, 1994. 329-350.

Epstein, J. M. and Axtell, R. L. Growing Artificial Societies. Washington, D.C.: Brookings Institution Press, 1996.

Flick, A. C. (ed.). History of the State of New York, Vol. V. New York: Columbia University Press, 1934.

Isard, W. Location and Space Economy: A General Theory Relating to Industrial Location, Market Areas, Land Use, Trade and Urban Structures. Cambridge: MIT Press, 1956.

Isard, W. et al. Methods of Interregional and Regional Analysis. Burlington: Ashgate, 1998.

Krugman, P. The Self-Organizing Economy. Malden: Blackwell, 1996. 88-92.

Orcutt, G. “From Engineering to Microsimulation.” Journal of Economic Behavior and Organization 14, 1990. 9-10.

Page, S. “On the Emergence of Cities.” Journal of Urban Economics 45, 1999. 184-208.

Ravenstein, E. G. “The Laws of Migration.” Journal of the Royal Statistical Society 48 (2), 1885. 167-235.

Schweitzer, F. and Steinbrink, J. “Urban Cluster Growth: Analysis and Computer Simulation of Urban Aggregations.” Self-Organization of Complex Structures: From Individual to Collective Dynamics. Ed. F. Schweitzer. London: Gordon and Breach, 1997. 501-518.

Zipf, G. Human Behavior and the Principle of Least Effort. Cambridge: Addison-Wesley Press, 1949.