
The Review of Economic Studies Ltd.

Social Interactions, Local Spillovers and Unemployment Author(s): Giorgio Topa Source: The Review of Economic Studies, Vol. 68, No. 2 (Apr., 2001), pp. 261-295 Published by: The Review of Economic Studies Ltd. Stable URL:



Review of Economic Studies (2001) 68, 261-295

© 2001 The Review of Economic Studies Limited

Social Interactions, Local Spillovers and Unemployment



GIORGIO TOPA New York University

First version received May 1997; final version accepted July 2000 (Eds.)

I analyse a model that explicitly incorporates local interactions and allows agents to exchange information about job openings within their social networks. Agents are more likely to be employed if their social contacts are also employed. The model generates a stationary distribution of unemployment that exhibits positive spatial correlations. I estimate the model via an indirect inference procedure, using Census Tract data for Chicago. I find a significantly positive amount of social interactions across neighbouring tracts. The local spillovers are stronger for areas with less educated workers and higher fractions of minorities. Furthermore, they are shaped by ethnic dividing lines and neighbourhood boundaries.


Over the past decade economists have increasingly recognized the importance of non-market interactions in a variety of contexts, such as joblessness, crime and other social pathologies, peer influences in education, social learning and the diffusion of innovations, localization choices by households and firms, growth and income inequality. One common feature in these studies is the assumption that agents' choices and payoffs are affected by other agents' actions not just indirectly through markets, but also directly through imitation, learning, social pressure, information sharing, or other non-market externalities. It is also assumed that agents interact locally, with a set of neighbours defined by an economic or social distance metric.

Benabou (1993, 1996) and Durlauf (1996a, b) incorporate local interactions in the human capital accumulation process into endogenous growth models that exhibit neighbourhood stratification and persistent and widening income inequality. In the field of economic geography, Audretsch and Feldman (1996) and Rauch (1993) argue that local knowledge spillovers produce agglomeration economies and hence affect the location decisions by firms. There exists also a rich theoretical literature that stresses the role of interactions in models of herds and information cascades (see Banerjee (1992) and Bikhchandani et al. (1992)), or in models of social learning (Bala and Goyal (1998), Gale and Rosenthal (1999), Morris (1997)).

On the empirical side, a vast and growing literature has attempted to provide a statistical estimate of the magnitude of local interactions and neighbourhood effects.1 Glaeser et al. (1996) explain the very high variance of crime rates across U.S. cities through a model in which agents' propensity to engage in crime is influenced by neighbours' choices: in so doing, they provide estimates for the range of social interactions. Case and Katz (1991) explore the impact of neighbourhood effects on several behavioural outcomes, such as criminal activity, drug and alcohol use, childbearing out of wedlock, schooling, church attendance. Ludwig et al. (1999) use the Moving To Opportunity programme as a natural experiment to evaluate the magnitude of neighbourhood effects. Bertrand et al. (1999) find that local social networks have a significant impact on individual welfare participation. In studies concerning education, there is a long tradition starting with the Coleman report (Coleman et al. (1966)) of studying possible peer influences on educational outcomes: Hanushek et al. (2000) use very detailed data on Texan schools to estimate peer effects in student achievement, whereas Zax and Rees (1999) use a Wisconsin Longitudinal Study to estimate the impact of peer influences during school years on subsequent earnings.

1. Jencks and Mayer (1990), Ioannides and Datcher (1999) and Brock and Durlauf (1999) give excellent surveys of the empirical literature.

Sociologists have also argued, long before economists, that "one's neighbours matter" in defining one's opportunities and constraints (see Burt (1992)). Individuals are not considered as isolated entities but rather as being part of networks of friends, relatives, neighbours, colleagues, that jointly provide cultural norms, economic opportunities, information flows, social sanctions and so on. Wilson (1987) argues that adults in a community influence young people by providing role models in terms of the value of education, steady employment and stable families. Coleman (1988) considers social networks as a source of "social capital" since they provide valuable information, lower transaction costs, allow monitoring and enforcement of socially optimal outcomes.

The main objective of this paper is to formulate and estimate a model of local interactions in the labour market. In particular, I assume that agents exchange information with their social contacts about job opportunities and that hiring may occur through informal channels. The model generates a stationary spatial distribution of unemployment that can be compared to the empirical unemployment distribution over a set of contiguous locations in order to estimate the model parameters. I can then test for the existence of local information spillovers and provide an empirical measure of their magnitude. The main innovation with respect to the existing empirical literature is to provide a more structural approach to the estimation of local interaction effects.

The importance of informal channels in finding jobs has been documented, among others, by Corcoran, Datcher and Duncan (1980) and Granovetter (1995). Both studies report that more than 50% of all new jobs are found through friends, relatives, neighbours or occupational contacts rather than through formal means. This is especially true for low-skill jobs, for less educated workers and for black workers. From a theoretical standpoint, informal hiring channels may coexist in equilibrium with a formal labour market because of information asymmetries: Montgomery (1991) analyses a model in which employers cannot perfectly observe the quality of prospective employees and solve the adverse selection problem by relying on referrals from their high-ability workers. The basic assumption is that there exists assortative matching in agents' social networks, so that high-ability workers are more likely to refer individuals like themselves.

In my model, agents reside in locations that are linked through an explicit network structure. Each agent, while employed, can transmit information about job openings to her unemployed contacts; the same agent, while unemployed, can receive useful tips about jobs from her employed contacts. Risk-averse individuals find it in their best interest to engage in such information exchanges in order to (partially) insure themselves against possible future unemployment shocks. These mutual insurance arrangements can be sustained even in a limited commitment environment.

Unemployment in the model evolves according to a Markov process over the set of locations. At any point in time, employed individuals may become unemployed with some exogenous probability, whereas unemployed individuals may find a job with probability




increasing in the number of neighbours currently employed. This process generates positive spatial covariances of unemployment between nearby locations.

I use data on the spatial distribution of unemployment in Chicago to estimate the structural parameters of the model. One of the most striking features of unemployment in Chicago in recent years is its geographic concentration in a few areas, mainly in the South and the West Side: both in 1980 and in 1990, Census tracts with high levels of unemployment tended to be clustered together in geographically contiguous areas, rather than being spread around in a random fashion (see Figures 1 and 2). The change in unemployment rates between 1980 and 1990 was also spatially correlated (Figure 3). This geographic "lumping" is consistent with the presence of local interactions and information spillovers.

[Figure 1: Map of unemployment rate, 1980]



[Figure 2: Map of unemployment rate, 1990]

The unit of observation in the structural model is a Census tract, and the basic assumption of the model is that residents of one tract exchange information locally with residents of the adjacent tracts. This local interaction implies that the employment rate in one tract is affected positively by the employment rate in the neighbouring tracts. The transitions into and out of unemployment also depend on a set of observable tract characteristics. Allowing for tract heterogeneity is important in order to address the issue of positive sorting of individuals across locations. In fact, agents may sort into different neighbourhoods on the basis of their neighbours' characteristics or because they have similar preferences over different consumption bundles (see, for example, Becker and [...]).

Such sorting may induce positive spatial correlation of unemployment even in the absence of any local information spillovers. I try to distinguish the social interaction




[Figure 3: Map of unemployment rate, 1990-80]

channel from other possible inter-tract influences by using additional information on the specific dimensions along which agents' social networks are constructed, and on the geographic boundaries of local communities identified by residents. For example, since there exists a strong degree of ethnic homogeneity in social networks, one would expect social interactions to be weaker between tracts that have very different ethnic compositions.

I estimate the model via the indirect inference methodology of Gourieroux, Monfort and Renault (1993), since it is not possible to characterize analytically the invariant distribution of the contact process described above. The structural parameters are estimated indirectly, by minimizing the distance between the actual data and simulations of the structural model for different parameter values. In particular, one uses the parameters of an auxiliary statistical model (more readily estimable than the structural one) to define a



chi-square criterion for the indirect estimation.2 In the auxiliary estimation, I also control for unobserved tract-specific fixed effects that do not vary over time.

The empirical results support the model with built-in local interactions and reject the hypothesis that the observed spatial patterns in unemployment can be explained by observed tract characteristics alone. The estimated local spillovers are significantly positive: on average, an increase in the employment rate of neighbouring tracts by one standard deviation brings about an increase in expected employment of 0.6 and 1.3 percentage points in 1980 and 1990, respectively.3 Interestingly, spillovers are stronger in tracts with lower education levels and with higher fractions of non-whites. This is consistent with other direct evidence on informal hiring.4

Furthermore, the local interaction channel is found to be weaker across adjacent tracts that have very different ethnic compositions, as well as across local community boundaries that have been identified by residents. This result lends support to the hypothesis that the observed spatial patterns in unemployment are indeed generated by social interactions as opposed to other possible inter-tract linkages. Finally, an important caveat is represented by the finding that the introduction of spatially correlated shocks, to mimic the possible presence of correlated unobservables, greatly reduces the size of the information spillover. However, the model with correlated shocks does not fit the data as well as the baseline model.

The theoretical and empirical framework developed in this paper may be applied more generally, to any of the settings described above in which local interaction effects may be present. The wider goal of this paper then is to provide some tools to test for the existence of interaction effects, to estimate their magnitude, and to simulate the effects of proposed policy experiments in their presence.
The remainder of the paper is organized as follows. Section 2 presents the structural model and its general properties. Section 3 describes the indirect inference methodology and the auxiliary model, and discusses important identification issues. The data used in the estimation and the simulations are described in Section 4. Section 5 reports the empirical results of the indirect inference estimation of the structural model, and provides some numerical estimates of the magnitude of the spillover effects. Section 6 concludes.



The starting point of the analysis is the idea that agents are embedded in social networks: in particular, they can exchange information about job opportunities with their social contacts, who are more likely to transmit useful information if they themselves are employed. Thus the probability of finding a job is greater, the higher the employment rate in one's social network.

The information exchange activity among agents can be seen as the result of an efficient insurance arrangement with limited commitment. Each agent has an incentive to transmit job information to her contacts when she is employed, in the expectation that they will reciprocate when bad times hit. With risk averse agents, such implicit contracts are sustainable even in the absence of formal enforcement mechanisms by means of direct penalties (e.g. social stigma) or the threat of exclusion from the insurance arrangement if a violation occurs (such contracts have been analysed by Thomas and Worrall (1988) in

2. A very good introduction to indirect inference methods can be found in Tauchen (1996).

3. See Table 4. A one standard deviation change in the employment rate corresponds to 8 and 12 percentage points in 1980 and 1990, respectively.




the context of long-term wage contracts, and by Ligon, Thomas and Worrall (1999) in the context of informal credit in developing economies).

The specific way in which I model the information interaction is through a discrete-time Markov process on a set of locations, that is similar to a contact process (this was first studied by Harris (1974) in the context of interacting particle systems). In the first part of this section I present the model and describe its basic properties. The individual units in this setup are Census tracts, since the empirical analysis will be conducted at this level of aggregation. In the second part, I discuss the use of physical distance to determine who is "close" to whom, and I present some evidence that justifies carrying out the analysis at the tract level. In addition, I discuss possible micro-foundations of the model at the level of individual agents, and show that this disaggregated framework is observationally similar to the former one.

2.1. A model of inter-tract local interactions

The building blocks of the model are a set of locations, a state variable and a set of neighbours for each location, and conditional transition probabilities for the state of each location that depend on the state of the neighbours in the previous period.

Let $S$ be a finite set of locations. A location in this framework is taken to represent a Census tract. Each location $i$ is indexed by a vector of characteristics $X_i$ that are constant over time. These characteristics may affect the probability of gaining or losing jobs in each location. Time flows discretely from 0 to $\infty$ in the model. The state of each tract every period, $y_{it}$, is the employment rate within each area. For computational reasons, the state variable can only take a finite number of values in the interval [0, 1]. In particular, $y_{it} \in \{e_1, \ldots, e_K\}$, where $e_1 = 0$, $e_K = 1$, and $e_k - e_{k-1} = 0.1$, $k = 2, \ldots, K$. Therefore, the state of the system at any point in time is a configuration of employment rates $y_t \in \mathcal{Y} \equiv \{e_1, \ldots, e_K\}^S$.

An "adjacent tract" distance metric is defined over the set of locations: tracts $i$ and $j$ are defined to be at distance 1 if they share an edge on the physical map of the city; tracts $i$ and $k$ are at distance 2 if $k$ is adjacent to a tract that is at distance 1 from $i$, and so on. In general, the distance $d$ between any two tracts $i$ and $j$ is defined as the number of tract boundaries that one needs to cross to travel from a point inside tract $i$ to a point inside tract $j$.5
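The adjacent-tract metric defined above can be illustrated with a breadth-first search over an adjacency map. This is only a sketch under stated assumptions: the function names, the dictionary input format and the toy 3 x 3 grid are hypothetical and do not come from the paper, which works with the actual Chicago tract map.

```python
from collections import deque


def grid_adjacency(rows, cols):
    """Build edge-sharing adjacency for a toy rows x cols grid of tracts (hypothetical map)."""
    adjacency = {}
    for r in range(rows):
        for c in range(cols):
            adjacency[(r, c)] = {
                (r + dr, c + dc)
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1))
                if 0 <= r + dr < rows and 0 <= c + dc < cols
            }
    return adjacency


def tract_distances(adjacency, source):
    """Breadth-first search: number of tract boundaries crossed starting from `source`."""
    distance = {source: 0}
    queue = deque([source])
    while queue:
        current = queue.popleft()
        for neighbour in adjacency[current]:
            if neighbour not in distance:
                distance[neighbour] = distance[current] + 1
                queue.append(neighbour)
    return distance


def neighbours_within(adjacency, source, max_distance=1):
    """Tracts at distance 1..max_distance from `source`; max_distance=1 gives the adjacent tracts."""
    distances = tract_distances(adjacency, source)
    return {tract for tract, k in distances.items() if 1 <= k <= max_distance}


toy_map = grid_adjacency(3, 3)
print(tract_distances(toy_map, (0, 0))[(2, 2)])    # distance between opposite corners
print(sorted(neighbours_within(toy_map, (1, 1))))  # the edge-sharing neighbours of the centre
```

On a rectangular grid this distance coincides with the lattice metric mentioned in footnote 5; on an irregular tract map only the adjacency dictionary would change.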

Using this metric, a finite set of neighbours Ni is defined for each location i, as the set of tracts located at distance 1 from i.6 Therefore, I assume here that residents of one tract interact only with residents of the areas physically nearest to them. This definition could be easily modified to include all neighbours at distance up to d from i, for some



The evolution of the system is governed by the following conditional transition probabilities, that spell out how the state of any one site may change in the next period, given the present configuration. If tract $i$ is at full employment (i.e. $y_{it} = 1$), then it may drop to the next lower employment rate with a probability that depends on its own characteristics

$$\Pr(y_{i,t+1} = 0.9 \mid y_{it} = 1; X_i) = \gamma(X_i) \equiv p_d, \tag{1}$$


5. In the case of sites arranged on a two-dimensional integer lattice and indexed by a pair of integers $(i, j)$, this is equivalent to defining a distance metric $d$ s.t. $d((i_1, j_1), (i_2, j_2)) = |i_1 - i_2| + |j_1 - j_2|$.

6. On average, a Chicago tract has 4.2 adjacent neighbours. Typically, tracts are roughly rectangular areas with four neighbours, one for each edge.



and will stay at the present state with probability $(1 - p_d)$. On the other hand, the probability of going from zero employment to the next higher employment rate depends on the tract's characteristics, but also on the information $I_{it}$ that residents of tract $i$ may receive from their employed social contacts in the neighbouring areas

$$\Pr(y_{i,t+1} = 0.1 \mid y_{it} = 0; y_t, X_i) = \alpha(X_i) + \lambda(X_i) I_{it} \equiv p_u. \tag{2}$$

The flow of information received by tract $i$ is in general an increasing function of the employment rate in the neighbouring tracts

$$I_{it} = f(\bar{y}_{N_i t}), \tag{3}$$

where $\bar{y}_{N_i t} \equiv \frac{1}{N_i} \sum_{j \in N_i} y_{jt}$.7 Again, tract $i$ may stay at a zero employment rate next period with probability $(1 - p_u)$.

In the estimation that follows $f(\cdot)$ is simply the identity function, so the information variable $I_{it}$ is equal to the average employment rate of the neighbours $\bar{y}_{N_i t}$. As for the functions $\lambda(\cdot)$, $\alpha(\cdot)$, and $\gamma(\cdot)$, they are assumed to be linear in their arguments: e.g. $\lambda(X) = \lambda_0 + \sum_{l} \lambda_l X_l$.

Finally, if the state of tract $i$ at time $t$ is in the interior of the unit interval, then it switches to either an upward mode or a downward mode with probability 0.5, and in each case the same transition probability as in equations (1) or (2) applies. Therefore, we can write:

$$\Pr(y_{i,t+1} = e_{k-1} \mid y_{it} = e_k; X_i) = \tfrac{1}{2}\,\gamma(X_i), \tag{4}$$

$$\Pr(y_{i,t+1} = e_{k+1} \mid y_{it} = e_k; y_t, X_i) = \tfrac{1}{2}\,[\alpha(X_i) + \lambda(X_i) I_{it}]. \tag{5}$$

Again, the state of tract $i$ may remain unchanged during the next period with probability $1 - \tfrac{1}{2}(p_d + p_u)$.
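The transition rules (1)-(5) can be collected into a single function that returns the distribution of next period's state for one tract. This is a minimal sketch, assuming that $f(\cdot)$ is the identity, that the employment levels are indexed $0, \ldots, 10$ (so the rate is the index divided by 10), and that $\gamma$, $\alpha$ and $\lambda$ have already been evaluated at the tract's characteristics. The function names and all numerical values are hypothetical.

```python
def linear_index(intercept, slopes, x):
    """Evaluate a linear function such as lambda(X) = lambda_0 + sum_l lambda_l * X_l."""
    return intercept + sum(b * v for b, v in zip(slopes, x))


def transition_probabilities(k, neighbour_mean, gamma, alpha, lam, n_levels=11):
    """Return {next_level_index: probability} for one tract.

    k is the index of the current level (employment rate = k / (n_levels - 1));
    neighbour_mean is the average employment rate of the adjacent tracts.
    """
    top = n_levels - 1
    p_down = gamma                        # as in equation (1)
    p_up = alpha + lam * neighbour_mean   # as in equation (2), with f the identity
    if not (0.0 <= p_down <= 1.0 and 0.0 <= p_up <= 1.0):
        raise ValueError("parameters imply a probability outside [0, 1]")
    if k == top:                          # full employment: can only drop
        moves = {k - 1: p_down}
    elif k == 0:                          # zero employment: can only rise
        moves = {k + 1: p_up}
    else:                                 # interior: each mode is chosen with probability 0.5
        moves = {k - 1: 0.5 * p_down, k + 1: 0.5 * p_up}
    moves[k] = 1.0 - sum(moves.values())  # remaining mass: the state is unchanged
    return moves


# Hypothetical covariates and coefficients, for illustration only.
x_i = [0.4, 0.2]
lam_i = linear_index(0.1, [0.5, -0.25], x_i)
interior = transition_probabilities(5, neighbour_mean=0.8, gamma=0.2, alpha=0.1, lam=lam_i)
full = transition_probabilities(10, neighbour_mean=0.8, gamma=0.2, alpha=0.1, lam=lam_i)
print(interior)
print(full)
```

Working with integer level indices rather than the rates themselves avoids floating-point drift when a tract moves up and down the grid repeatedly.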



The transition probabilities (2) and (5) capture the information exchange process. The employment rate in any one tract may rise through two separate channels: one is a function of tract characteristics that may reflect labour supply or demand conditions in the area, and is independent of any interaction. The second factor is the information about job opportunities received by tract residents from social contacts in the neighbouring tracts. The term $\lambda(\cdot)$ captures the "contagion" effect as in the standard contact process. It is worth noting that the strength of the information exchange channel may also be affected by the tract residents' characteristics. This allows me to estimate local spillovers in unemployment for different "types" of tracts, in terms for example of the education levels of its residents or its race composition.

Finally, the information exchange process is assumed to affect the probability of an increase in employment, but not the probability of a drop in employment in the tract (equations (1) and (4)). In other words, I assume that information interactions may affect employment opportunities, but do not play a role in the transitions out of employment.

Allowing for tract heterogeneity with respect to certain observable characteristics is important in order to be able to discuss the implications of positive sorting across locations. Agents may sort into different areas on the basis of some characteristics, which may also be related to employment outcomes. Therefore, one could observe positive spatial correlations of unemployment that are not related to any information spillovers but are simply due to the spatial correlation patterns of the covariates along which people

7. With a slight abuse of notation, I use Ni to indicate both the set of neighbours of tract i, and the number of elements in this set.




sort.8 The absence of a time subscript on the $X$ covariates amounts to assuming that the process through which agents lose and find jobs takes place at a higher frequency than the process that rules individuals' locational choices and hence the spatial distribution of these characteristics across tracts.

One final note concerns the nature of the random shocks that are used to simulate the model. The system is started off at some initial configuration $y_0 \in \mathcal{Y}$. Every period, a vector $\omega_t$ of shocks is drawn from a uniform distribution on [0, 1]. Then the state of each location in the model is updated according to the realization of the shock $\omega_{it}$, using the transition probabilities detailed above.9 These exogenous shocks to employment can be i.i.d. in space over the map of the city, or they can be generated according to some autocorrelation structure. The latter option is exploited in the estimation to allow for the possibility of correlated unobservables driving the spatial correlation patterns observed in the data.
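The simulation scheme just described can be sketched as follows, under several simplifying assumptions that are mine rather than the paper's: tracts are homogeneous (scalar `gamma`, `alpha`, `lam`), shocks are i.i.d. in space, the map is a hypothetical ring of eight tracts, and the system starts at full employment. The way a single uniform shock is mapped into a down, up or stay outcome in the interior of the state space is also an assumption; the source only spells out the boundary case in footnote 9.

```python
import random


def simulate(adjacency, gamma, alpha, lam, periods, n_levels=11, seed=0):
    """Simulate the tract-level process and return the final employment rates."""
    rng = random.Random(seed)
    top = n_levels - 1
    state = {i: top for i in adjacency}  # arbitrary initial configuration y_0
    for _ in range(periods):
        updated = {}
        for i, neighbours in adjacency.items():
            # Average employment rate of the adjacent tracts in the previous period.
            neighbour_mean = sum(state[j] for j in neighbours) / (len(neighbours) * top)
            p_down, p_up = gamma, alpha + lam * neighbour_mean
            k = state[i]
            if k == top:
                p_up = 0.0            # cannot rise above full employment
            elif k == 0:
                p_down = 0.0          # cannot fall below zero employment
            else:
                p_down, p_up = 0.5 * p_down, 0.5 * p_up
            shock = rng.random()      # omega_it drawn from U[0, 1]
            if shock < p_down:
                updated[i] = k - 1
            elif shock < p_down + p_up:
                updated[i] = k + 1
            else:
                updated[i] = k
        state = updated               # all tracts are updated simultaneously
    return {i: k / top for i, k in state.items()}


# Hypothetical ring of 8 tracts, each adjacent to its two immediate neighbours.
ring = {i: [(i - 1) % 8, (i + 1) % 8] for i in range(8)}
rates = simulate(ring, gamma=0.2, alpha=0.05, lam=0.3, periods=500)
print(rates)
```

Replacing the i.i.d. draws with spatially autocorrelated shocks, as the text mentions, would only change how `shock` is generated.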

2.1.1. Model properties. In Section 3 the structural model is estimated by comparing simulated realizations of the cross-sectional distribution of unemployment, drawn from the stationary distribution of the model, with the empirical cross-sectional distribution of unemployment at a given point in time. Therefore, I focus on the cross-sectional properties of the stationary distribution rather than on the dynamic properties of the model.

The model described above generates a first-order Markov process on the map of locations. The state space $\mathcal{Y}$ contains all possible configurations of employment rates over the set of locations. Since $\mathcal{Y}$ is finite, I can index each state by $w = 1, \ldots, W$, where $W$ is the total number of possible states for the whole system. Let $\mathcal{P}$ be the set of all probability measures on $\mathcal{Y}$. Then a probability measure $\mu \in \mathcal{P}$ on $\mathcal{Y}$ is simply a finite vector of probabilities $\mu_w$, $w = 1, \ldots, W$. In particular, the evolution of the system is governed by the following rule

$$\mu_{t+1} = Q \mu_t,$$

where $Q$ is the $(W \times W)$ transition matrix, whose entries $q_{rs}$ denote the transition probabilities from state $r$ to state $s$. These transition probabilities can in principle be calculated from the conditional transition probabilities (1)-(5). A stationary distribution of the process is a vector $v$ such that $v = Q v$. It is straightforward to show that a unique stationary distribution $v(X)$ exists, for any given choice of tract characteristics $X$.10

It is very difficult to prove analytical results on the properties of the stationary distribution. However, if one considers the case where locations are homogeneous in terms of their characteristics, the structural model is very similar to a discrete time contact process defined over a finite integer lattice. The behaviour of the contact process has been extensively studied in the literature on interacting particle systems.11 Two results are particularly interesting from my perspective. According to the first, the states of any two sites on

8. One good example is education. Highly educated people may locate in an area because they enjoy the company of other educated people, or because they attach great importance to school quality so they move to tracts with good schools. On the other hand, agents' education levels positively affect their employment

9. For example, suppose $y_{it} = 1$. The shock $\omega_{it}$ is drawn from a U[0, 1]. If $\omega_{it} < p_d$, then $y_{i,t+1} = 0.9$; otherwise, $y_{i,t+1} = 1$.

10. Since the $X$ characteristics are fixed, the transition probabilities are given and define a Markov chain with transition matrix $Q$ over the finite state-space $\mathcal{Y}$. This chain is irreducible and aperiodic, so standard results on Markov chains imply that a stationary distribution exists and is unique.



the map exhibit non-negative spatial correlations. The second result states that this correlation is bounded above by a quantity that decays exponentially with the distance between locations.

Simulations of the structural model show that indeed as one lets the system evolve for several periods, high or low employment clusters appear on the map of locations. If tracts are homogeneous, these clusters are equally likely to take place anywhere on the map. Tract heterogeneity implies that low-employment clusters are more likely to occur in certain areas of the map, since tract characteristics affect the transition probabilities between different employment levels.

In my simulation and estimation exercise I take as given the spatial distribution of the $X$ covariates (determined by the sorting process), and I run the local interaction process conditional on that, until it converges to the stationary distribution. The structural parameters of the $\gamma(\cdot)$, $\alpha(\cdot)$, and $\lambda(\cdot)$ functions are estimated off the stationary distribution. The case in which all the spatial correlation is driven by sorting rather than by local interactions corresponds to the case where $\lambda(\cdot)$ is identically zero for all values of the $X$ characteristics, so the model delivers a very straightforward way to distinguish the two effects.
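For a system small enough to enumerate, the stationary distribution discussed in this subsection can be computed exactly rather than simulated. The sketch below is a toy illustration under assumptions that are not in the paper: two homogeneous tracts that are each other's only neighbour, three employment levels (0, 0.5, 1) instead of eleven, shocks independent across tracts, and hypothetical parameter values. The matrix is stored with rows as origin states, so the stationary vector is found by iterating $v \leftarrow vQ$.

```python
from itertools import product


def build_transition_matrix(gamma, alpha, lam, n_levels=3):
    """Joint transition matrix for two neighbouring tracts (row = origin state)."""
    top = n_levels - 1
    states = list(product(range(n_levels), repeat=2))
    index = {s: n for n, s in enumerate(states)}
    Q = [[0.0] * len(states) for _ in states]

    def moves(k, other):
        # Same rules as equations (1)-(5); `other` is the neighbour's level index.
        p_down, p_up = gamma, alpha + lam * other / top
        if k == top:
            out = {k - 1: p_down}
        elif k == 0:
            out = {k + 1: p_up}
        else:
            out = {k - 1: 0.5 * p_down, k + 1: 0.5 * p_up}
        out[k] = 1.0 - sum(out.values())
        return out

    for a, b in states:
        for a_next, pa in moves(a, b).items():
            for b_next, pb in moves(b, a).items():
                Q[index[(a, b)]][index[(a_next, b_next)]] += pa * pb
    return states, Q


def stationary_distribution(Q, iterations=5000):
    """Power iteration on a row-stochastic matrix, starting from the uniform vector."""
    n = len(Q)
    v = [1.0 / n] * n
    for _ in range(iterations):
        v = [sum(v[r] * Q[r][s] for r in range(n)) for s in range(n)]
    return v


def employment_covariance(states, v, n_levels=3):
    """Covariance of the two tracts' employment rates under the distribution v."""
    top = n_levels - 1
    mean_a = sum(p * a / top for (a, b), p in zip(states, v))
    mean_b = sum(p * b / top for (a, b), p in zip(states, v))
    mean_ab = sum(p * (a / top) * (b / top) for (a, b), p in zip(states, v))
    return mean_ab - mean_a * mean_b


states, Q = build_transition_matrix(gamma=0.3, alpha=0.05, lam=0.5)
cov_spillover = employment_covariance(states, stationary_distribution(Q))

states0, Q0 = build_transition_matrix(gamma=0.3, alpha=0.05, lam=0.0)
cov_no_spillover = employment_covariance(states0, stationary_distribution(Q0))

print(cov_spillover, cov_no_spillover)
```

With the contagion term switched off the two tracts evolve independently, so the covariance vanishes up to rounding error. With a positive contagion term it is positive, in line with the non-negative correlation result cited from the contact process literature. The full model with hundreds of tracts has far too many states for this enumeration, which is why the paper relies on simulation.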

2.2. Census tracts, physical distance and aggregation

In the structural model, the set of neighbours for each site is defined by physical distance, in that the neighbours of tract $i$ are the tracts immediately adjacent to it. Ideally, one would like to have individual data on social networks for a large but well delimited set of agents, with information on the sequence of social ties that connect each agent to any other agent in the set. Then one could cast the model in terms of individual agents and define neighbours as the set of alters to whom each agent is directly connected. Such a model would predict the emergence of unemployment clusters in the abstract space generated by the map of social ties between agents, which could then be matched to the data.

In the absence of such direct data on networks, one has to rely on economic and sociological considerations to make assumptions about the likely dimensions along which social networks are constructed. In general, one can think of several distance metrics as potentially good candidates to trace out the spatial structure of social networks.12 I want to argue that physical distance is an important determinant of the way networks are constructed. The underlying idea is that establishing and maintaining social ties is costly, and these interaction costs increase with physical distance. For example, transportation costs have both a monetary and a time component that make it harder to maintain active contacts with persons living far away or simply reduce the frequency of exchanges with such persons. In addition, local institutions (such as neighbourhood clubs or associations, churches, PTAs in local schools, local businesses) play an important role in fostering local social ties and facilitating information exchanges at the local neighbourhood level.

There has been considerable debate in the sociological literature on whether the notion of a local community has lost its meaning in an era of increasing mobility and an expanding array of communication devices. However, there is evidence that an important fraction of social contacts takes place at the local level. In a study of Toronto residents in the 1980's, Wellman (1996) finds that about 38% of yearly active contacts in all networks take place

12. Conley and Topa (2000) define four different metrics, physical distance, travel time, ethnic, and occupational distance, and examine the spatial patterns of unemployment with respect to each metric.




between pairs of agents who live less than 1 mile apart; this percentage rises to 64% for contacts between agents less than 5 miles apart. In a Detroit study that used a 1975 survey of about 1,200 residents, Connerly (1985) states that 41% of respondents had at least one third of their Detroit friends residing within one mile. More relevant to this paper, Hunter (1974) reports that out of roughly 800 Chicago residents interviewed during 1967-68, about 49% said that the majority of their friends resided in the same local community. Therefore, identifying social contacts with agents who live physically nearby seems like a reasonable approximation for one of the dimensions of social networks.

A second question concerns the appropriate scale of analysis. Perhaps the most relevant social interactions occur within Census tracts, thus making the assumption that residents of one tract exchange information with residents of neighbouring tracts not very meaningful. However, in the city of Chicago, Census tracts represent fairly small units of analysis. Pairwise distances between adjacent tract centroids are less than one mile for all but a handful of tracts. Typically, tracts contain five or six street blocks. The average population (16 years and older) in a Chicago Census tract was 2700 in 1980 and 2500 in 1990.


More importantly, Census tracts are a sub-division of larger units called Community Areas. There are 75 such areas in Chicago, and each area contains on average 12 tracts. These areas, such as Hyde Park, Woodlawn, Lincoln Park, Englewood, have names that Chicago residents readily associate with neighbourhoods. In fact, Community Area boundaries were drawn in the 1920's by a group of Chicago sociologists, such as Ernest Burgess and Robert Park, to represent communities with a distinctive identity. The main criteria used to establish these Areas were: "(1) the settlement, growth, and history of the area; (2) local identification with the area; (3) the local trade area; (4) distribution of membership of local institutions; and (5) natural and artificial barriers" such as rivers, railroad lines, large roads, parks.13

Even though neighbourhood boundaries change over time, these Community Areas still represent, in many cases, meaningful and cohesive neighbourhoods. Hunter (1974), in his survey of Chicago residents, asked respondents to name the boundaries of their neighbourhoods. He identified roughly 200 neighbourhoods, which in all cases represent units larger than Census tracts. In many cases, the boundaries of these neighbourhoods coincide with the original Community Area boundaries.14 Therefore, it makes sense to assume that there exist meaningful inter-tract social interactions, and to take Census tracts as the unit of analysis.

Finally, I would like to focus on the issue of aggregation. The structural model could be set up at the individual level, where sites represent individual agents, and the set of neighbours is defined as the set of contacts within a neighbourhood of radius $r$. The transition probabilities would then be defined at the single agent's level, determining the transitions into and out of work. Then the model could be aggregated to the tract level for estimation purposes.

It is hard to come up with an individual level model that, when aggregated, generates transition rules as in (1)-(5). However, one can compare the behaviour of such a model, in terms of its spatial properties, with the structural model presented here. Conley and

13. From the Local Community Fact Book, Erbe et al. (1984). A certain geographic area was considered a Community Area if it had "a history of its own as a community, a name, an awareness on the part of its inhabitants of common interests, and a set of local businesses and organizations oriented to the local

14. In the estimation that follows, I exploit these boundaries to try to distinguish spillovers due to social interactions from other possible sources of spatial correlation.



Topa (2001) analyse a model in which artificial agents are placed at the centroid of each tract, and interact both with agents in their same tract, and with agents in tracts whose centroids are less than one mile away, on the physical map.15 The transition probabilities are defined at the agent's level, and are very similar to the ones described here. The contagion term $\lambda(\cdot)$ is a declining function of physical distance, so the interaction is stronger between residents of the same tract than between residents of two separate tracts. The spatial properties of these two alternative models can be compared by looking at the Auto-Correlation Functions (ACFs) and the spectral densities generated by the spatial distribution of employment in model simulations, following a procedure described

[Figure 4. ACF comparison: tract level vs. individual level model]

in Section 3.2. Figures 4 and 5 show that one can find parameter values such that the ACFs and the spectral densities are quite similar in the two models, thus implying that the models can be made to deliver similar spatial correlation patterns.16 Therefore, I consider the structural model presented in this section as a viable approximation to a more disaggregated model in which local interactions are defined at the level of individual agents.
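The ACF comparison can be illustrated with a small sketch. The function below is not the procedure of Section 3.2; it is a minimal example, assuming a precomputed matrix `dist` of integer distances between sites (a hypothetical input) and a toy six-site map, of how a spatial autocorrelation at each distance can be computed from one cross-section.

```python
import numpy as np

def spatial_acf(y, dist, max_d):
    """Spatial autocorrelation of y at integer distances 1..max_d.

    `dist` is an (n x n) matrix of integer distances between sites
    (an assumed input, e.g. contiguity steps between tracts).
    """
    y = np.asarray(y, dtype=float)
    dist = np.asarray(dist)
    dev = y - y.mean()                 # deviations from the cross-sectional mean
    var = np.mean(dev ** 2)            # cross-sectional variance
    outer = np.outer(dev, dev)         # all pairwise products dev_i * dev_j
    acf = []
    for d in range(1, max_d + 1):
        mask = dist == d               # ordered pairs of sites exactly d apart
        acf.append(outer[mask].mean() / var)
    return np.array(acf)

# Toy map: six sites on a line, so the distance between i and j is |i - j|.
n = 6
idx = np.arange(n)
dist = np.abs(idx[:, None] - idx[None, :])
y = np.array([0, 0, 0, 1, 1, 1])       # two clusters of like outcomes
acf = spatial_acf(y, dist, max_d=2)
```

In this toy example the autocorrelation is positive at distance 1 and falls to zero at distance 2, which is the qualitative pattern (positive and declining in distance) that the two models are compared on.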


The objective of the estimation strategy is to estimate the structural parameters of the model presented in Section 2.1, in order to test for the existence of spillovers generated

15. The number of agents in each tract is proportional to the actual population in Chicago in 1980. One artificial agent corresponds to about 20 people in real life.

16. However, the parameter estimates and the spillover magnitudes may be different in the two setups. Conley and Topa (2001) examine this issue in more detail.









[Figure 5. Spectrum comparison: tract level vs. individual level model]

by the local interaction process. The cross-sectional data used in the estimation are assumed to be a realization of the unique stationary distribution of the model. I focus on the stationary distribution in order to be able to disregard the potential impact of different initial conditions on the spatial moments used in the estimation.

In principle, one can write the likelihood function for the stationary distribution of the model starting from the set of conditional probabilities of the state at one location, given the contemporaneous state at neighbouring locations (the key result here is the factorization theorem discussed in Cressie (1993, p. 412)). However, it is extremely hard to write the likelihood in this case, because the construction involves pinning down a normalizing constant (itself a function of the model parameters) which is not available in closed form. Therefore, following an indirect inference strategy, I look for an auxiliary statistical model that can fit the unemployment data well, but is much simpler to estimate than the structural model. Such a model need not necessarily nest the structural model and may in fact even be misspecified. In particular I look for a statistical model that best approximates the spatial properties of the stationary distribution out of the local interaction model. Choosing an auxiliary model is similar to selecting moments in Simulated Method of Moments to be matched between the data and simulations of the structural model, in order to deliver estimates of the structural parameters.

The idea behind indirect inference is quite intuitive. Let $\theta \in \Theta \subset \mathbb{R}^p$ be a vector of $p$ parameters that fully characterize the structural model.17 Let $\theta_0$ denote the true value, assuming that the structural model is correctly specified. Likewise, let $\rho \in A \subset \mathbb{R}^q$ be the

17. These are the parameters of the $\gamma(\cdot)$, $\alpha(\cdot)$, and $\lambda(\cdot)$ functions in the conditional transition probabilities




vector of $q$ parameters of the auxiliary model, with $q \geq p$. First, one estimates the auxiliary model using the actual data. These parameter estimates $\hat{\rho}$ depend on the true value $\theta_0$. Then one simulates the structural model for different values of $\theta$ in the parameter space $\Theta$. For each choice of $\theta$, one can estimate the auxiliary model using the outcome of the simulations (in this case, a simulated unemployment realization over the map of the city). This estimation yields parameter estimates $\tilde{\rho}(\theta)$ that depend on the specific value of $\theta$ used for that simulation. The parameters of interest $\theta$ are estimated by minimizing the distance between $\hat{\rho}(\theta_0)$ and $\tilde{\rho}(\theta)$, according to some metric to be specified.

In the remainder of this section, I first of all present the indirect inference procedure and describe the auxiliary model; secondly, I discuss identification issues.
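The logic of indirect inference can be illustrated on a deliberately simple example. The sketch below does not use the local interaction model: it assumes a toy structural model (an MA(1) process with parameter `theta`) and a misspecified auxiliary model (an AR(1) regression), both chosen only for illustration. All function names are hypothetical.

```python
import numpy as np

def simulate_structural(theta, n, seed):
    """Toy structural model (an assumption, not the paper's): y_t = e_t + theta * e_{t-1}."""
    rng = np.random.default_rng(seed)
    e = rng.standard_normal(n + 1)
    return e[1:] + theta * e[:-1]

def auxiliary_estimate(y):
    """Auxiliary model: OLS slope of y_t on y_{t-1} (an AR(1) fit, misspecified for MA(1) data)."""
    y = y - y.mean()
    return float(y[1:] @ y[:-1] / (y[:-1] @ y[:-1]))

def indirect_inference(y_obs, grid, H=5):
    """Pick the theta whose simulated auxiliary estimate is closest to the one from the data."""
    rho_hat = auxiliary_estimate(y_obs)          # auxiliary estimate on the actual data
    n = len(y_obs)
    best_theta, best_dist = None, np.inf
    for theta in grid:
        # The same seeds are reused for every theta, so that the criterion
        # changes with theta only and not with the simulation draws.
        rho_sim = np.mean([auxiliary_estimate(simulate_structural(theta, n, seed=h))
                           for h in range(H)])
        dist = (rho_hat - rho_sim) ** 2          # scalar case: squared distance
        if dist < best_dist:
            best_theta, best_dist = theta, dist
    return best_theta

theta_true = 0.5
y_obs = simulate_structural(theta_true, n=5000, seed=12345)   # plays the role of the data
grid = np.linspace(0.0, 0.9, 46)
theta_hat = indirect_inference(y_obs, grid)
```

The auxiliary AR(1) slope is not an estimate of `theta` itself, yet matching it between data and simulations recovers a value close to the true parameter, which is the sense in which the auxiliary model may be misspecified.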

3.1. Indirect inference methodology

The data $(y_n, x_n)$ used in the estimation are cross-sectional data on unemployment and tract characteristics over the city of Chicago. The outcome variable $y_n$ is the $(n \times 1)$ vector of unemployment rates for all tracts $i = 1, \ldots, n$. The covariates $x_n$ are a $(n \times M)$ matrix of exogenous variables. The auxiliary model is a statistical approximation to the structural model. Let $J_n(y_n, x_n, \rho)$ be the GMM criterion used in the auxiliary estimation; when performed on the actual data, this yields the following parameter estimates

$$\hat{\rho}_n = \arg\min_{\rho} J_n(y_n, x_n, \rho). \tag{6}$$


The auxiliary model is possibly misspecified, since I assume that the structural model is the true model. One can show that $\hat{\rho}_n \to \bar{\rho}$ as $n \to \infty$, where $\bar{\rho} = r(\theta_0)$. The function $r(\cdot)$ is defined as follows:

$$r(\theta) = \arg\min_{\rho} J_\infty(G, \theta, \rho), \tag{7}$$


where $J_\infty(G, \theta, \rho) = \lim_{n \to \infty} J_n(y_n, x_n, \rho)$, and $G$ is the distribution of the random shocks that determine the stochastic process of $y$. The limit criterion $J_\infty(G, \theta_0, \rho)$, evaluated at the true value $\theta_0$, is assumed to have a unique minimum at $\bar{\rho} = r(\theta_0)$.

Turning now to the simulations, for each value of $\theta \in \Theta$ I draw $H$ simulated realizations of $y$ out of the stationary distribution of the structural model, $\tilde{y}_n^h(x_n, \theta)$, $h = 1, \ldots, H$. I then perform the auxiliary estimation on the simulated outcomes, for each value of $\theta$ and for given $x_n$: this yields

$$\tilde{\rho}_n^h(\theta) = \arg\min_{\rho} J_n(\tilde{y}_n^h(x_n, \theta), x_n, \rho). \tag{8}$$

Again, for the simulated estimator of $\rho$, one has $\tilde{\rho}_n^h(\theta) \to r(\theta)$, $\forall \theta \in \Theta$; in particular for the true value $\theta_0$, $\tilde{\rho}_n^h(\theta_0) \to \bar{\rho}$.

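The indirect inference method simply evaluates the distance $m_n(\theta)$ between $\hat{\rho}_n$ and $\frac{1}{H}\sum_{h=1}^{H} \tilde{\rho}_n^h(\theta)$ over the parameter space $\Theta$, and picks the value $\theta^*$ that minimizes this distance. The indirect inference estimator $\hat{\theta}_n^H$ is then the solution to the following minimum chi-square problem

$$\hat{\theta}_n^H = \arg\min_{\theta \in \Theta} \left[\hat{\rho}_n - \frac{1}{H}\sum_{h=1}^{H} \tilde{\rho}_n^h(\theta)\right]' \Omega_n \left[\hat{\rho}_n - \frac{1}{H}\sum_{h=1}^{H} \tilde{\rho}_n^h(\theta)\right]. \tag{9}$$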



As is the case for a standard GMM procedure, the optimal weighting matrix $\Omega_n$ in the quadratic criterion (9) is $\Omega_n = V_n^{-1}$, where $V_n$ is the covariance matrix of the auxiliary parameters $\rho$. The indirect inference estimator is consistent and asymptotically normal. A standard chi-square specification test for the structural model is also available: the test statistic is equal to the minimized value of (9), scaled by a function of $H$, and is distributed as a $\chi^2(q - p)$, where $q = \dim \rho$ and $p = \dim \theta$.

Finally, Gourieroux, Monfort and Renault (1993) provide a test of hypotheses on the structural parameters $\theta$. Let $\theta$ be partitioned into $\theta = [\theta_1', \theta_2']'$, and consider the null hypothesis $H_0: \theta_1 = 0$. Define the constrained indirect estimator $\hat{\theta}_n^{H,0}$ as the estimate that results from minimizing (9), subject to the constraint $\theta_1 = 0$. The test statistic $\kappa_c$ is defined as the difference between the constrained and the unconstrained optimal value of (9), and is distributed as a $\chi^2(\dim \theta_1)$.
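As a small numerical illustration of the quadratic criterion (9), the sketch below evaluates it for made-up auxiliary estimates. The numbers, the diagonal covariance matrix `V` and the function name are assumptions chosen for the example; the scaling by a function of $H$ used in the specification test is left out.

```python
import numpy as np

def min_chi2_criterion(rho_hat, rho_sims, omega):
    """Quadratic form in the gap between the data-based auxiliary estimate
    and the average of the H simulation-based estimates."""
    gap = np.asarray(rho_hat) - np.mean(np.asarray(rho_sims), axis=0)
    return float(gap @ omega @ gap)

# Hypothetical values: q = 2 auxiliary parameters, H = 2 simulations.
rho_hat = np.array([0.40, 1.10])
rho_sims = np.array([[0.30, 1.00],
                     [0.50, 1.00]])
V = np.array([[0.04, 0.00],
              [0.00, 0.01]])           # assumed covariance of the auxiliary estimates
omega = np.linalg.inv(V)               # weighting matrix set to the inverse covariance
value = min_chi2_criterion(rho_hat, rho_sims, omega)
```

Here the first auxiliary parameter is matched exactly by the simulation average, so the whole criterion value comes from the second one, weighted by the inverse of its variance.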

3.2. The auxiliary model

Since I am particularly interested in the spatial properties of the stationary distribution generated by the structural model, I need to consider an auxiliary statistical model that mimics these spatial characteristics well.18 An obvious choice is a Spatially Auto-Regressive model (SAR), since theoretical results and simulations on the contact process suggest that the stationary distribution exhibits positive spatial correlations, decreasing in the distance between locations. A SAR of order D can be defined as follows





where the superscript $N_d$ refers to the average level of unemployment in the tracts at distance $d$ from tract $i$.19 I use a criterion, based on spectral decompositions, to compare the spatial properties of a contact process and of a SAR model, and to choose the order of the SAR that best fits the contact process. Let the contact process be the "true" model, and let the SAR be an approximating model to the truth. For a given order D of the Spatial Auto-Regression, I estimate via maximum likelihood the parameters of the SAR that best fit the true model.20 I can then compare the fit across different possible approximating models (i.e. SAR models of different orders). The result is that the goodness of fit improves as one adds higher order terms to the SAR: Figure 6 reports the spectral density of a typical cross-sectional distribution generated by the contact process and compares it to that of several SARs. In most cases, a SAR of order 6 or higher fits the spectra generated by a contact process remarkably well: therefore, I use a SAR(6) as the basis of my auxiliary model.

The statistical model in (10) needs to incorporate a set of $M$ covariates $x_i$ in order to control for tract characteristics that may both affect the probability of finding and losing jobs, and be dimensions along which agents sort into locations. The structural model of Section 2.1 also lets the local interaction term $\lambda(\cdot)$ be itself a function of local characteristics. To capture this feature in the auxiliary model, I add interaction terms between $y_i^{N_1}$

18. Consistency of the indirect inference estimator does not depend on the specific choice of the auxiliary model. However, the closeness of fit between the structural and the auxiliary models affects efficiency: see


19. This is defined as $y_i^{N_d} = W_i(d)\, y$, where $W_i(d)$ is the $i$-th row of a weighting matrix $W(d)$ constructed as follows. $W_i(d)$ gives positive equal weights to all tracts that are at distance $d$ from $i$. If a tract shares an edge with $i$ on the physical map, $d = 1$; if a tract is adjacent to one of these immediate neighbours of $i$ (but is not adjacent to $i$), $d = 2$, and so on. The weights add up to unity.
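The construction of the weighting matrices described in footnote 19 can be sketched as follows. This is a minimal illustration under stated assumptions: the contiguity map is given as an adjacency list `adj` (a hypothetical five-tract map here), distances are counted in steps on that contiguity graph, and the function names are not from the paper.

```python
import numpy as np
from collections import deque

def hop_distances(adj):
    """All-pairs contiguity distance, by breadth-first search from each tract."""
    n = len(adj)
    dist = np.full((n, n), -1, dtype=int)     # -1 marks "not reachable"
    for s in range(n):
        dist[s, s] = 0
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if dist[s, v] < 0:            # first visit gives the shortest distance
                    dist[s, v] = dist[s, u] + 1
                    queue.append(v)
    return dist

def ring_weights(dist, d):
    """W(d): equal positive weights on tracts at distance d, each row summing to one."""
    W = (dist == d).astype(float)
    row_sums = W.sum(axis=1, keepdims=True)
    # Rows with no tract at distance d are left as zeros.
    np.divide(W, row_sums, out=W, where=row_sums > 0)
    return W

# Hypothetical map: five tracts in a row, 0-1-2-3-4.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
y = np.array([0.10, 0.20, 0.30, 0.40, 0.50])   # made-up unemployment rates
dist = hop_distances(adj)
W1 = ring_weights(dist, 1)
W2 = ring_weights(dist, 2)
y_N1 = W1 @ y      # average unemployment among immediate neighbours
y_N2 = W2 @ y      # average unemployment among tracts two steps away
```

Each element of `y_N1` and `y_N2` is a regressor of the kind that enters a SAR of order 2; higher orders follow by calling `ring_weights` with larger `d`.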


20. See Hansen and Sargent (1993). The details of the construction are available from the author upon





[Figure 6. Spectrum comparison: Contact Process (solid) vs. SARs. Panels include CP vs. SAR(1), CP vs. SAR(5) and CP vs. SAR(8).]

and a subset of the covariates. Let $\tilde{x}_i$ be a $J \times 1$ vector of regressors, with $J < M$. Let

Equation (10) can be rewritten as







= (.i)

+ (xi)







The following assumptions are made concerning the covariance structure of the error terms: $E(\varepsilon_i) = 0$, $E(\varepsilon_i^2) = \sigma_i^2$, $E(\varepsilon_i \varepsilon_j) = 0$, $\forall i \neq j$. In other words, I assume no cross-correlation between different units, but I allow for heteroskedasticity.

Another observation concerning the error terms is in order. The data used in the estimation are taken from two Census years, 1980 and 1990, thus creating a short panel. Therefore, all the variables in (11) should be indexed by a time subscript. However, the short panel feature allows me to incorporate an unobserved, time-invariant component into the error term: $\varepsilon_{it} = \mu_i + u_{it}$. The term $\mu_i$ attempts to capture features of a specific location (unobservable to the econometrician), that may attract or turn away agents with certain characteristics that may be in turn correlated with the ability to find jobs. Insofar as these unobserved characteristics are really time-invariant over the decade under consideration, I can eliminate them by first-differencing the data. I follow this approach in the estimation, so the auxiliary model (11) is to be taken in its first-difference specification from now on, unless otherwise noted.21 Hence, all time subscripts are suppressed.

In equation (11), the $y_i^{N_d}$ covariates are correlated with the error term $\varepsilon_i$, since they are endogenous. Therefore, I use instruments for these variables: one obvious choice is to use the exogenous variables in the neighbouring tracts to $i$. In particular, I use observations in neighbours up to a distance 3 from tract $i$: so the instruments are

21. Therefore, $y_i$ really denotes $y_{it} - y_{i,t-1}$; $x_i$ denotes $x_{it} - x_{i,t-1}$; and the error term actually denotes the difference in the time-varying part of the original error term, $u_{it} - u_{i,t-1}$.





















These variables are assumed to be uncorrelated with the error terms:

$$E(z_i \varepsilon_i) = 0. \tag{12}$$
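A moment condition of this kind can be turned into an estimator in the usual way. The sketch below is a generic illustration on simulated data, not the auxiliary model itself: it assumes a single endogenous regressor `x`, two instruments `z` standing in for neighbours' exogenous characteristics, and a two-stage least squares weighting matrix. All names and numbers are made up.

```python
import numpy as np

def two_stage_least_squares(y, X, Z):
    """Linear GMM based on E(z * error) = 0 with weighting matrix (Z'Z)^{-1}, i.e. 2SLS."""
    ZtX = Z.T @ X
    Zty = Z.T @ y
    ZtZ_inv = np.linalg.inv(Z.T @ Z)
    A = ZtX.T @ ZtZ_inv @ ZtX
    b = ZtX.T @ ZtZ_inv @ Zty
    return np.linalg.solve(A, b)

rng = np.random.default_rng(0)
n = 20000
z = rng.standard_normal((n, 2))        # instruments: uncorrelated with the error
u = rng.standard_normal(n)             # error term
# The regressor depends on the error, so it is endogenous by construction.
x = z @ np.array([1.0, 0.5]) + 0.8 * u + 0.5 * rng.standard_normal(n)
y = 0.5 * x + u                        # true coefficient is 0.5

X = x[:, None]
beta_ols = np.linalg.lstsq(X, y, rcond=None)[0][0]   # biased by the endogeneity
beta_iv = two_stage_least_squares(y, X, z)[0]        # uses only variation explained by z
```

In this simulated example ordinary least squares overstates the coefficient, while the instrumented estimate is close to the true value of 0.5.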



Equation (12) provides moment conditions that I can use to estimate the auxiliary model of equation (11) via a GMM procedure. The use of IV is not strictly necessary here. The auxiliary model is only a statistical model that summarizes certain properties of interest of the structural model, and can be misspecified. Thus, an equally valid procedure would be to estimate (11) via ordinary least squares. This procedure would yield biased estimates for $\phi$ and $\beta$, but it would not affect the consistency of $\hat{\theta}_n^H$. However, using IV can be interpreted as adding several spatial cross-correlations of $y$ and $x$ to the auxiliary model (as it is, the model only includes autocovariances of $y$). This is roughly equivalent, in a time-series context, to adding cross-moments involving leads and lags of $y$ and $x$.

Finally, I augment the auxiliary model by a certain number of raw spatial moments of the unemployment variable. The rationale for doing so is to add flexibility to the auxiliary model, in order to be better able to detect signs of misspecification of the structural model. As Tauchen (1996) shows, the $\chi^2$ test of the over-identifying restrictions is more likely to fail to detect misspecification if the auxiliary model is not flexible enough. In addition to the mean and variance, I consider the spatial covariances up to distance $D$, since the structural model delivers implications in terms of the spatial distribution of unemployment: these are the pairwise autocovariances for pairs of tracts at given distance $d$, whereas the $\phi$ parameters in the SAR measure the autocovariance between unemployment in tract $i$ and the average unemployment in the neighbouring tracts. These additional auxiliary parameters can be estimated together with those of the SAR using the same GMM framework. Let $\nu$ be the vector of moments to be estimated




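$$\nu = [\, m,\ \sigma^2,\ C_1,\ \ldots,\ C_D \,]' \tag{13}$$

where $m$ and $\sigma^2$ are the mean and variance of unemployment, and $C_d = \mathrm{Cov}(y_i, y_i^d)$, $d = 1, \ldots, D$. Here $y_i^d$ indicates the unemployment rate in tracts that are at distance $d$ from tract $i$. So the complete vector of auxiliary parameters is $\rho = [\phi', \beta', \nu']'$. I also need to define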

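the vector of moment functions $g(\psi_i, \rho)$, which stacks the instrument moment conditions $z_i \varepsilon_i$ from (12) with the raw spatial moments in $\nu$, where $\psi_i$ collects the data for tract $i$ ($y_i$, $x_i$ and the corresponding neighbourhood variables). The GMM estimator $\hat{\rho}_n^{GMM}$ is obtained via minimization of a quadratic criterion $J_n(\rho)$ based on $g(\psi_i, \rho)$, with the standard properties, and the optimal choice of a weighting matrix is $S^{-1}$, where $S$ is the covariance matrix of the limiting distribution of $\frac{1}{\sqrt{n}} \sum_{i=1}^{n} g(\psi_i, \rho)$.22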

22. Conley (1999) provides a robust estimator for the covariance matrix $S$ that does not rely on any specific assumptions on the structure of the error terms, but only uses information on economic distances between locations. Use of this robust estimator does not lead to significantly different results than more standard estimators. Therefore, in what follows, I use $S_n = \frac{1}{n} \sum_{i=1}^{n} g(\psi_i, \hat{\rho}_n)\, g(\psi_i, \hat{\rho}_n)'$.






3.3. Identification

Identification issues are crucial in this empirical exercise. This paper argues that local spillovers, generated by local social interactions, may explain part of the spatial patterns of urban unemployment observed in the data. The estimation methodology basically works by matching empirical spatial moments with simulated moments out of the local interaction model, to estimate the structural parameters of the model and thus give a measure of the magnitude, if any, of the local spillovers. However, there may be other factors that give rise to the same patterns of positive spatial correlation in unemployment, and that may be indistinguishable from the social interaction channel that I focus on.

Formally, this discussion can be cast in terms of Manski (1993)'s analysis of identification issues in the context of endogenous social effects, such as peer influences, neighbourhood effects, social interactions, contagion, and the like. Restricting oneself for simplicity to linear models, one can posit the following population relationship

$$y = \alpha + \phi\, E(y \mid A) + E(x \mid A)'\gamma + x'\beta + \varepsilon, \tag{14}$$


where $y$ is a scalar outcome variable, $x$ and $\varepsilon$ are attributes that directly affect the unit's outcome (observed and unobserved, respectively), and $A$ are attributes that define the individual unit's "reference group". In the present context, $y$ is unemployment, $x$ are tract characteristics, and $A$ is the distance metric that specifies which tracts are neighbours of each tract $i$. Each term in (14) represents a separate effect: in addition to the direct effect $\beta$ of observable characteristics on outcome, $\phi$ expresses an "endogenous social effect" that the mean outcome in the reference group has on the individual unit's outcome, whereas $\gamma$ represents an "exogenous or contextual effect" of the mean observable attributes in the neighbouring tracts on tract $i$'s unemployment rate. Furthermore, the error term $\varepsilon$ may depend in part on unobservable characteristics that are themselves spatially correlated.

In the context of the linear model (14), Manski (1993) shows that the endogenous social effects $\phi$ can be identified if one assumes that there are neither exogenous effects ($\gamma = 0$) nor correlated effects ($E(\varepsilon_i \varepsilon_j) = 0$). I make these assumptions both in the structural and in the auxiliary model. Specifically, in the structural model the conditional transition probabilities (1)-(5) only depend on tract $i$'s own characteristics and on the neighbours' employment rate in the previous period, while the neighbours' characteristics do not play a role. In addition, the evolution of the system is ruled by random draws $\omega$ from a vector of uniform distributions that are i.i.d. in time and in space by construction.

In the special case where exogenous and correlated effects are absent, $\phi$ and $\beta$ are identified if the conditional expectation $E(x \mid A)$ varies non-linearly with $A$, and $\mathrm{Var}(x \mid A) > 0$ (Proposition 2 in Manski (1993)). In my model, $x$ are tract characteristics, whereas $A$ is the physical distance metric. Therefore, Manski's identification conditions are satisfied. In practice, I estimate $E(x \mid A_1)$ non-parametrically as a local average $\frac{1}{|N_i|} \sum_{j \in N_i} x_j$, where $N_i$ is the set of neighbours at distance 1 from $i$.

Furthermore, it is worth noting that even if neighbours' characteristics did play a role in the evolution of the system, identification would still be ensured by the non-linearities implied by the stationary distribution of the structural model, as well as by the asymmetric way in which tract characteristics and information enter those conditional transition probabilities (this point has been made by Brock and Durlauf (1999, p. 31-33)).

Identification conditions for the structural parameters in $\lambda(\cdot)$, $\alpha(\cdot)$, and $\gamma(\cdot)$ can also be expressed in terms of the indirect inference methodology. Intuitively, one wants to rule out the possibility that the chi-square criterion (9) may be minimized by more than one set of parameter values. Formally, this requirement amounts to assuming that the limit




criterion $J_\infty(G, \theta_0, \rho)$ has a unique minimum at $\bar{\rho} = r(\theta_0)$, that the binding function $r(\cdot)$ in (7) is one-to-one and that $\partial r / \partial \theta$ is of full column rank. The rank condition can be locally tested and is in fact satisfied at the point estimates reported in Section 5.23

Finally, with regard to the identification of the auxiliary model, I assume that the error terms are uncorrelated in the cross-section, $E(\varepsilon_i \varepsilon_j) = 0$, $\forall i \neq j$, and are uncorrelated with the neighbouring tracts' characteristics: $E(x_i^{N} \varepsilon_i) = 0$. The latter condition concerns the validity of the instruments used in the auxiliary estimation. Both conditions can be easily tested.

The assumptions made here, regarding the absence of exogenous or correlated effects, may be considered too strong. For example, suppose criminal activity in one location spills over to neighbouring locations, because of physical contiguity. An increase in crime may induce certain types of residents to move out, who are more likely to be employed than people who stay. Alternatively, the rise in crime may have an adverse effect on local businesses and employers, forcing them to leave. Thus the crime level in the neighbouring tracts would impact unemployment within those tracts, and have a contextual effect on the unemployment rate in tract $i$ ($\gamma \neq 0$), generating positive spatial correlation in unemployment that is not due to social interactions. Another example is the location of schools. Suppose high school quality in tract $j$ induces agents with high ability or motivation to locate in tract $j$ as well as in neighbouring tract $i$. This may affect attributes such as graduation rates or school drop-out rates in both tracts, while at the same time reducing unemployment in both tracts. Again one would observe a non-zero exogenous effect ($\gamma \neq 0$) in that attributes of neighbour $j$ would be associated with outcomes of tract $i$.
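Why a non-zero contextual effect matters can be seen from the reduced form of the linear model (14). The derivation below is the standard argument in Manski (1993), restated here as a sketch under the additional assumptions that $E(\varepsilon \mid A, x) = 0$ and $\phi \neq 1$; it is not part of the estimation procedure.

```latex
% Take expectations of (14) conditional on A and solve for E(y | A):
\begin{align*}
E(y \mid A) &= \alpha + \phi\, E(y \mid A) + E(x \mid A)'\gamma + E(x \mid A)'\beta \\
            &= \frac{\alpha}{1-\phi} + E(x \mid A)'\,\frac{\gamma + \beta}{1-\phi}.
\end{align*}
% Substitute back into (14) to obtain the reduced form:
\begin{equation*}
y = \frac{\alpha}{1-\phi} + E(x \mid A)'\,\frac{\gamma + \phi\beta}{1-\phi} + x'\beta + \varepsilon .
\end{equation*}
% The coefficient on x identifies beta. The coefficient on E(x | A),
%   c = (gamma + phi * beta) / (1 - phi),
% mixes gamma and phi and cannot be separated in general.
% With gamma = 0 it reduces to c = phi * beta / (1 - phi), so that for any
% element k with beta_k + c_k \neq 0:
\begin{equation*}
\phi = \frac{c_k}{\beta_k + c_k}.
\end{equation*}
```

The reduced form also shows why $E(x \mid A)$ must not be a linear function of the other regressors: otherwise its coefficient could not be estimated separately from the constant and from $\beta$.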
Finally, as an example of correlated effects, one can think of positive sorting inducing positive correlation of certain unobserved attributes across neighbouring areas, that may directly affect employment outcomes.

These are potentially serious violations of the identification assumptions. Therefore, I follow three alternative strategies in the estimation of the structural model, to try to distinguish spillovers that are due to social interactions across neighbouring tracts from other possible inter-tract influences. The first two approaches exploit indirect information that may be available on the dimensions along which social networks are constructed, while the third addresses the issue of generic correlated effects across tracts.

There is considerable evidence in sociology on the extent of assortative matching in agents' social networks. In particular Marsden (1987, 1988), using data from the 1985 General Social Survey, shows that network homogeneity with respect to race and ethnicity is very high.24 Quite simply put, individuals are more likely to interact with people of the same race or ethnicity than with members of different groups. The idea then is to divide the set $N_i$ of tracts physically adjacent to $i$ into two subsets: those that are ethnically "close" to tract $i$, and those that have a very different ethnic composition. Let me call the latter subset $EDN_i$.25 If social networks follow racial and ethnic lines, then one can expect

23. This is done by running a very long simulation of the structural model at the estimated value $\hat{\theta}^H$, and numerically evaluating the matrix of partial derivatives $\partial \rho / \partial \theta$ at the optimum $\hat{\theta}^H$. I can then test whether this matrix has full rank.

24. For example, the likelihood of observing a social contact between two black persons is 4.2 times higher than that generated by pure random matching, given the relative proportions of the different ethnic and racial categories in the population.

25. Operationally, I use the same procedure as in Conley and Topa (2000) to calculate pairwise ethnic

distances between tracts. I consi<