Published by Oxford University Press on behalf of the International Epidemiological Association © The Author 2009; all rights reserved

Advance Access publication 27 January 2009

International Journal of Epidemiology 2009;38:361–368 doi:10.1093/ije/dyn356

Commentary: Individual, ecological and multilevel fallacies
J Michael Oakes

Accepted 6 November 2008

The new paper by Drs Subramanian, Jones, Kaddour and Krieger (hereinafter Authors) contains many important and subtle insights about the fallacies of single-level research, be it at the individual or ecological level.1 The Authors urge epidemiologists to consider contexts and multilevel phenomena when investigating and explaining population health. They also criticize the late William S. Robinson and his classic 1950 paper, and methodological individualism (MI) as a research paradigm.2 Support comes from historical anecdotes, theory and a re-analysis of Robinson’s data. Assuming I understood it properly, I am in full agreement with the primary aim of the new paper. Epidemiologists, especially those interested in the effect of social forces on health, should consider contexts and multilevel phenomena. And as a general proposition, I also agree that critical examination of a scientist’s culture, history and personal motivation can be enlightening. The Authors’ scholarship on these matters merits careful study. On the other hand, I find the Authors’ critique of Robinson and his paper unhelpful and their critique of MI misguided. More importantly, the Authors conflate multilevel thinking with the so-called multilevel regression model and in so doing offer readers questionable advice. My goal here is to explain these disagreements in hopes of stimulating more critical thinking about multilevel phenomena and research for the improvement of population health.

Division of Epidemiology, University of Minnesota, USA. E-mail: oakes007@umn.edu

Robinson’s paper in context
According to Robinson, he was motivated to write his paper because of the ‘impressive number of quantitative ecological studies’ that relied on ecological correlations to make inference about individual behaviour.2 He cites research on tuberculosis, voting, crime and fertility, and states that in each case the authors were not interested in ecological correlations but rather in discovering something about the behaviour of individuals. Robinson thus aimed to determine whether ecological correlations could be validly substituted for individual correlations. He showed through some simple algebra and a worked example that they cannot. His example demonstrated that a correlation between illiteracy and race differed quantitatively by level of aggregation and that a correlation between nativity and illiteracy differed qualitatively (reversed signs) by level of aggregation. Robinson expressed concern that his finding would have serious consequences. But he wanted to prevent future mistakes and set researchers on a more fruitful path.

I believe Robinson’s paper is so widely cited (by those who have read it, anyway) because of its elegant simplicity in answering one important question. It remains a delight to read, especially in light of the work it spawned on Simpson’s Paradox, aggregation bias, the modifiable areal unit problem, confounder control and so forth.

I cannot help but associate Robinson’s classic paper and the Authors’ work with two new papers by Gelman and colleagues and, separately, Hernán and colleagues.3,4 Gelman and colleagues illuminate the fallacy that citizens of wealthier American states tend to vote for Democratic candidates while those residing in less wealthy states vote for Republican candidates. This paper uncovers a striking example of Simpson’s Paradox and presents a remarkably clear and accurate explanation of it. Recall that Simpson’s Paradox is a situation where the relationship between two variables (e.g. income and political party) is reversed when a third variable (e.g. state) is considered.5,6 Figure 1 is a caricature of the paradox addressed by Gelman and colleagues. The idea is that the within-state probability of an individual voting for a Republican party candidate increases with individual income, while the between-state mean probability of voting Republican declines with increasing mean state income.
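The reversal described above can be sketched with a few made-up numbers. This is only an illustration, not Gelman and colleagues’ data or model: the three states, the income scores and the voting probabilities are all hypothetical, and `slope` is a small helper written for this example.

```python
from statistics import mean


def slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


# Hypothetical data: (individual income score, probability of voting Republican).
# Poorer states sit higher on the vertical axis, as in the Figure 1 caricature.
states = {
    "State 1": ([1, 2, 3], [0.60, 0.65, 0.70]),
    "State 2": ([3, 4, 5], [0.45, 0.50, 0.55]),
    "State 3": ([5, 6, 7], [0.30, 0.35, 0.40]),
}

# Within-state slopes: one regression per state.
within = {name: slope(xs, ys) for name, (xs, ys) in states.items()}

# Between-state slope: regression of state mean outcome on state mean income.
between = slope(
    [mean(xs) for xs, _ in states.values()],
    [mean(ys) for _, ys in states.values()],
)

# Pooled slope: all persons together, ignoring state membership.
pooled = slope(
    [x for xs, _ in states.values() for x in xs],
    [y for _, ys in states.values() for y in ys],
)

for name, b in within.items():
    print(f"{name}: within-state slope = {b:+.3f}")
print(f"between-state slope of means = {between:+.3f}")
print(f"pooled slope ignoring state  = {pooled:+.3f}")
```

With these invented values every within-state slope is positive while the between-state and pooled slopes are negative, which is the sign reversal that makes substituting one level’s association for another’s unsafe.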
Gelman and colleagues fit a sophisticated multilevel model that nicely describes their paradoxical data. Hernán and colleagues address the divergent effect estimates of hormone replacement therapy (HRT) on cardiovascular disease (CVD) in menopausal females

Downloaded from http://ije.oxfordjournals.org/ at Pennsylvania State University on November 7, 2012

between a large cohort study and a randomized clinical trial (RCT). Recall that inferences from several epidemiologic cohort studies suggested HRT was protective for CVD while a more recent RCT showed HRT deadly with respect to it. Hernán and colleagues showed the presumably mistaken inference from the observational data was not due to the observational design (e.g. confounding) or the data themselves, but to the improper conceptualization of desired effects and the inappropriate use of regression adjustment for time-dependent confounding in the observational study. By analysing cohort data in a way that mimics an experimental trial, Hernán and colleagues estimated effects nearly identical to those in the RCT. Together, these two papers highlight the contemporary relevance of Robinson’s problem and the vital importance of clear, causally informed thinking in epidemiology.

Figure 1 Caricature of Simpson’s paradox found by Gelman et al.3 Horizontal axis: individual income (low to high). Vertical axis: Prob(Vote Repub) (low to high). Key: within-state slope; between-state slope; person-level data point; state-level mean data point. States 1, 2 and 3 are shown.

Robinson’s mistake
The Authors take umbrage with Robinson’s statements about the meaning of individual-level correlations. Yet the only time Robinson uses the word ‘meaningful’ is in the last sentence of his conclusion, where he wrote:

The purpose of this paper will have been accomplished, however, if it prevents the future computation of meaningless correlations and stimulates the study of similar problems with the use of meaningful correlations between the properties of individuals.2 (p. 357)

Unlike the Authors, I do not read this as Robinson saying researchers should only consider individual correlations or that only individual correlations are universally meaningful. That is, I read Robinson’s concluding sentence much more narrowly than do the Authors. In light of his stated aims, I interpret his concluding sentence as an admonition to readers of the American Sociological Review who mistakenly think ecological correlations may be substituted for individual ones and those who may believe that one correlation is as good as any other.

To this end, one must appreciate that Robinson was one of the first sociological methodologists and served on the council of the American Sociological Association’s methodology group from 1961 to 1965. He was interested in causal inference and contributed to debates about measurement error, induction, falsificationism, modelling and even counterfactual thinking.7–14 Consequently, Robinson was ‘speaking to’ sociologists who misunderstood the meaning of ecological correlation and confounding. In none of his other papers (I tried to read everything he wrote) does he argue against consideration of contexts or anything even related. Since its founding by Comte and Quetelet, sociology’s raison d’être has been the impact of social context and social forces on behaviour. It is difficult for me to accept the claim that a sociologist could be uninterested in how social context, including despicable laws, affects individual behaviour.

What about the impact of Robinson’s paper on social and health science research? For those willing to heed Robinson’s warning about improper substitution, there can be no question that his paper would have eliminated any enthusiasm for doing so. The trouble is that data on a ‘squelching effect’ are difficult to come by: papers not published cannot easily be counted. And it is true that the paper has been widely cited. But it is not clear how responsible Robinson’s paper is for individual-level research in epidemiology and/or public health. While not diminishing the paper’s import, it seems there is more myth than fact here.

In his paper on the impact of sociological methodology on statistics, Clogg does not mention Robinson’s work or the ecological fallacy.15 The same goes for Raftery’s review of the impact of statistics on sociology, a paper that carefully addresses multilevel modelling.16 Further, my analysis of citations in articles published in the American Journal of Epidemiology during the period 1981–2002 showed Robinson’s paper was cited just six times in actual research papers, a tiny fraction of even the social science citations of the period.17 Parallel but unpublished data on the citations in American Journal of Public Health (AJPH) articles show Robinson’s paper was cited just 10 times over the same time period. By comparison, Susser’s 1994 paper on the fallacy of ecological fallacy was cited 63 times in AJPH alone.18 Finally, in an online supplement to his 2004 article about citations to papers published

in the American Sociological Review, Jacobs shows that (ISI Thomson database) citations to Robinson’s paper were relatively meager for a decade or so.19 I abstract the central result in Table 1 below. For comparison, also tabulated are citations to a paper by Leo Srole which addressed social integration from a Durkheimian (i.e. contextual) perspective.20 While social isolation remains central to social epidemiology, I doubt many today know Srole’s work. If Srole’s more popular paper is not responsible for contextual research how can Robinson’s paper be responsible for individual-level research?

Table 1 Citations to Robinson’s 1950 paper and Srole’s 1956 paper as of 2004

Period     Robinson 1950   Srole 1956
1965–69          69             83
1966–75         208            278
1976–85         291            206
1986–95         301             93
Total          1261            699

The Authors suggest a link between Robinson’s paper and the 1954 legal case of Brown v. Board of Education (wherein the US Supreme Court ruled that the racial segregation of public schools was unconstitutional). Accordingly, I have tried to examine the major works on the case. Among other literature, I reviewed the actual Court decision, the Appellants’ Brief signed by thirty-two social/behavioural scientists (including Prof. Robert K. Merton of Columbia University), subsequent law journal debates about the validity and propriety of social science in the related cases, and what is widely considered the most definitive history of the case, paying particular attention to the social scientific research used to support or refute the plaintiff’s claims.21–29 Neither Robinson, his paper, nor the ‘ecological fallacy’ is mentioned or cited, not even once, in the reviewed literature. From my perspective, the only social science research that mattered in the case was a few (rather crude) psychological experiments with toy dolls and children.

The Authors also tie Robinson’s work to the Cold War and all the negatives that accompanied it. Evidence for this claim is also lacking. For purposes here, suffice it to say that only some American sociology circa 1950 espoused the American ideology of freedom, choice and opportunity. But few would associate Robinson’s work, or that of the ‘Columbia School’, to any such paradigm. In any case, it is illogical to associate the ideology of freedom with the oppressive fascism the late Senator Joseph McCarthy espoused.

Among the alternative hypotheses about the hidden motivation, one merits special consideration. This one revolves around another major decision by the US Supreme Court: Korematsu v. United States of 1944.24,27,28 The issue here was the constitutionality of the forced internment (in barbed-wire camps) of Japanese Americans at the start of World War II. Despite there being no evidence of individual treason or sabotage, and in the face of many individual Japanese-Americans trying to join the military to fight for the USA, the Court accepted claims about the ‘threatening and sneaky nature’ of the Japanese as a group. Based on this ‘sociological evidence’, in 1944 the Court upheld the internments. Robinson does not mention this case either but it seems entirely possible that the problem of relying on ecological correlations to detain innocent individuals was on Robinson’s mind. While I have found no hard evidence to support the claim, circumstantial evidence leads me to believe that, if he had one, Robinson’s hidden motivation for his paper (also) included concerns about the use of such correlations as ‘evidence’ for political/legal activity. First, he and every other American sociologist must have been worried about xenophobic stereotyping. Second, it seems the idea for the ‘ecological fallacy’ paper came not from Robinson but his colleague/teacher, the late Paul F. Lazarsfeld of Columbia University.30 It is reported that a few years before Robinson’s publication, Lazarsfeld was analysing census data at the tract level and became worried about the validity of his correlations. In fact, for reasons beyond the scope here, Robinson ended up working out the math and publishing the result. Lazarsfeld’s role as initiator is important because Lazarsfeld was not only a leading sociological methodologist, but a Jewish political socialist who fled Vienna in the 1930s and author of the first comprehensive study on the impact of McCarthy-era intimidation on academic freedom, a text widely viewed as one of the first multilevel contextual analyses.31–33

Absent new data about some hidden motivation, if any, behind Robinson’s paper, it seems to me Robinson’s only mistake was in concluding his paper with a vague statement. Here I am referring to the second half of his concluding sentence. But I am less concerned with his obtuse comments on the meaning of individual correlations than I am with his comments on the meaning and utility of correlations more generally. That is, Robinson should have articulated the meaninglessness of correlations, at any level, as compared with causal effects. By 1950 sociologists such as Stuart F. Chapin and Samuel A. Stouffer had already clarified the distinction.34 And statistician Ronald A. Fisher’s masterpiece, The Design of Experiments, was available in 1935.35 In linking the ecological fallacy to correlation, Robinson contributed to the Pearsonian myth that correlation is superior to causation.6 Regardless, I worry that readers will misunderstand things. While I agree that in hindsight Robinson should have done more to clarify the narrowness of his paper, were

he able to defend himself, I imagine Robinson would forcefully reject the Authors’ indictment.

MI and multilevel thinking
As part of their effort to promote consideration of multilevel contexts the Authors decry MI. But the Authors’ conception of MI is a ‘straw man’ that few contemporary MI scholars, even those on the political left,36–40 would accept. It is critical to understand that for methodological individualists like Coleman, among many others, societal or group-level change does not just happen mysteriously without the involvement of actual persons. Changes in smoking rates can therefore only be explained by understanding the actions of individual smokers and non-smokers, and their interrelationships in the context of laws and social norms. Institutions and other social phenomena play a key role in analyses, but these phenomena must be grounded to the activity of individuals. Group-level phenomena are never simple aggregations but rather complex dynamic and multilevel phenomena.

Consider the work of the late sociologist and methodological individualist James S. Coleman.41 A student of Paul F. Lazarsfeld’s, Coleman tried to better formalize how social change occurs by drawing a trapezoidal figure which I have affectionately called the ‘Coleman bathtub.’ The simple idea is that society is made up of individuals (indivisible objects in sociology) and that individuals make up society. Figure 2 is my basic interpretation of Coleman’s bathtub. For purposes here, the thicker (near) vertical arrows to the left and right are most important. The downward pointing arrow to the left, from the larger circle to the smaller, represents the impact or influence of society/contexts on individuals. This is the macro-to-micro transition and includes various aspects of socialization and resource constraints. The arrow to the right, from the smaller to the larger circle, represents the impact of individuals on society. This is the micro-to-macro transition and incorporates, among other things, collective action, social choice and social movements. Together, these ‘micro–macro’ transitions represent the most important but most difficult challenge for multilevel thinking in sociology and epidemiology alike. Notice that Coleman’s is a multilevel theory.

Figure 2 Multilevel framework. Macro-level: Population: time 1 to Population: time 2. Micro-level: Individual: time 1 to Individual: time 2.

Coleman’s approach is important here for yet another reason. It was his landmark 1966 report on the equality of educational opportunity (in America) that gave birth to contemporary multilevel theory and modelling.42 On the heels of the aforementioned 1954 Brown decision, the Civil Rights Act of 1964 required the US Department of Education to conduct a study of the equality of educational opportunities in American schools. Coleman aimed to estimate the independent effect of school funding and social contexts on student academic achievement. The result was one of the largest social science studies in American history: a cross-sectional survey that included some 645 000 school children (grades 3, 6, 9 and 12) nested in 60 000 teachers nested in 4000 public schools. That is, Coleman aimed to conduct a multilevel study before the multilevel model was even recognized. His work initiated the enormous and still thriving academic industry that addresses the effect of schools, states, contexts and policies on educational attainment, and subsequent multilevel modeling research. But in any case, the fact is a methodological individualist laid the foundation for Bryk and Raudenbush’s famous text, Hierarchical Linear Models.43

Multilevel theory without MI seems indefensible. To see this, appreciate that the Authors’ preferred ecosocial theory treats contexts and institutions as given. Absent some notion of individual agency and interaction, how can this theory explain the emergence, maintenance or change of social contexts? With respect to the Authors’ own typology of studies (their Figure 5), where does X come from? Is it not some complicated function of y and x? The fact is that MI is not only consistent with multilevel thinking, it is the foundation of it. Much could and should be learned from this body of work.44

Multilevel models
In an effort to demonstrate the importance of the multilevel perspective, the Authors re-analyse Robinson’s data with a sophisticated multilevel statistical model (MLM). They claim that only by incorporation of contexts (e.g. state law and state educational resources) can one properly understand the relationship between race and illiteracy. Further, the Authors believe their MLM reveals the pitfalls of Robinson’s argument against ecological correlations. I am not so sure. What do their MLM results really mean? Even assuming the models yield unbiased parameter estimates, what scientific and/or political use is a finding that the odds ratio for black illiteracy changes when conditioned on state? And the variation of odds ratios across states means what, exactly? Setting aside interpretative disagreements, I fully agree that contextual factors are often necessary to understand a given phenomenon. But with respect

to the Authors’ other point, I do not think the MLM is always necessary. It is critical to draw a distinction between multilevel thinking and MLMs because there is nothing especially multilevel about MLMs. MLMs do not model cross-level processes if processes are properly understood to be the micro–macro transitions described above. My take on the social epidemiologic literature employing MLMs to observational data is that MLM analyses have increased confusion and distracted researchers. Why MLM advocates continue to ignore the vast literature on causal inference and the limitations of regression-based inference escapes me.45–51 In terms of causal inference, the principal benefit of MLMs lies in their ability to relax the ‘no-clustering within groups’ assumption and to borrow strength from relationships found in data-rich contexts so as to improve those in data-poor contexts. Yet I know of no methodologist who believes MLMs give special purchase to causal inference. Gelman, for example, is abundantly clear that his models should not be interpreted causally.3,47 And to yield statistically valid results, MLMs must meet a number of strong if not heroic assumptions that few appear interested in examining.52,53 And as early as 1995, the distinguished statistician, David Draper, expressed concern about the misuse and abuse of the model.54 Some of my own work has revealed the tendency for MLMs to yield inferences dependent not on data but model assumptions and extrapolations.55,56

An empirical illustration would seem helpful. I examine not tabular data,57 as Robinson and the Authors did, but person-level 1930 Census data. Such data are publicly available from the Integrated Public Use Microdata Series (IPUMS) project at the Minnesota Population Center (www.ipums.umn.edu). For the 1930 census, IPUMS offers a 1% random sample of the actual Census Bureau person-level records. Like Robinson and the Authors, I limit my data to persons aged 10 years or older living in the 48 continental states. Unfortunately, only a handful of coarse measures are available, most of which are self-explanatory (e.g. black, native born). But the data do include two proxy measures of socioeconomic status (SES): having a radio and owning a house.

As a first step I replicated Robinson’s and most of the Authors’ analyses, especially their multilevel logistic regression models and their state-wise correlations. There was no difficulty here: my results differ only slightly from theirs and such differences are surely due to slightly different data and estimation algorithms. It is worth noting that my results support the Authors’ methods for analysis of tabular (census) data.

My next step is to address a more fruitful multilevel question: what is the effect of Jim Crow laws on African American illiteracy? This question is meaningful since it suggests an (absurdly obvious) decision: repeal Jim Crow or not? The question conforms to the counterfactual framework because it asks: Absent Jim Crow, what would the 1930 illiteracy rate of African Americans have been? The question is multilevel because individual persons are nested within state law conditions.

The ideal data for answering the question include outcomes measured under counterfactual conditions, which are by definition unobservable. The second best data set would contain measures from a group randomized trial, wherein some randomly selected states (i.e. groups) would adopt Jim Crow laws while others did not. Group randomized trials are useful because, so long as sample sizes are large, they yield multilevel observations that may substitute for the desired but unobservable counterfactuals. But since it is obvious that such an experiment cannot be conducted we are left to look for a natural experiment or rely on a purely observational design. Nevertheless, the analytic goal remains the same: use observable data to mimic the ideal (unobservable) conditions.

Table 2 presents the percent persons illiterate by race and state Jim Crow status, for all 48 states. It is easy to see that, compared with other states, blacks in Jim Crow states have over four times the illiteracy rate as blacks in non-Jim Crow states. But before concluding that Jim Crow elevates illiteracy risk in blacks we must consider that, compared with other states, whites in Jim Crow states have over five times the illiteracy as whites in non-Jim Crow states. Why would Jim Crow inhibit the literacy of whites? Simply put, it did not. Instead, the enactment of Jim Crow laws emerged in the first place through some function of the illiterate and otherwise intellectually retarded views of whites in the US south. Unlike experiments, exposures in observational contextual effect studies are typically some function of group members. That is, exposures are endogenous. Many epidemiologists would instinctively worry about confounding by SES.

Table 2 Percent illiterate by race and state-level Jim Crow status, US 1930 Census. Columns: Jim Crow No; Jim Crow Yes; RD; RR. Rows: Black and Native White, within each of three panels (Forty-eight states; Border states; Anomalous states). RD, risk difference (across row); RR, risk ratio (across row).
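The RD and RR columns of Table 2 are simple within-row contrasts. A minimal sketch follows, assuming RD is the Jim Crow ‘Yes’ percentage minus the ‘No’ percentage and RR is their ratio; that direction is an assumption chosen to agree with the ‘over four times’ and ‘over five times’ statements in the text. The function name, the row labels and the percentages are hypothetical and are not the Table 2 values.

```python
def risk_contrasts(pct_no, pct_yes):
    """Return (RD, RR) for one table row.

    pct_no  -- percent illiterate in non-Jim Crow states (reference)
    pct_yes -- percent illiterate in Jim Crow states
    RD is in percentage points; RR is a unitless ratio.
    """
    if pct_no <= 0:
        raise ValueError("reference percentage must be positive")
    risk_difference = pct_yes - pct_no
    risk_ratio = pct_yes / pct_no
    return risk_difference, risk_ratio


# Hypothetical rows for illustration only (NOT the values reported in Table 2).
rows = {
    "Group A": (4.0, 18.0),
    "Group B": (1.0, 5.5),
}

for label, (pct_no, pct_yes) in rows.items():
    rd, rr = risk_contrasts(pct_no, pct_yes)
    print(f"{label}: RD = {rd:.2f} percentage points, RR = {rr:.2f}")
```

The two measures answer different questions: RD conveys the absolute excess, RR the relative one, so a group with a small baseline can show a large RR alongside a modest RD.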

But confounder control in a multilevel study is anything but straightforward. A cross-tabulation tells the story, and illuminates the inferential limitations of it. Of special note is that only 3% of blacks in Jim Crow states owned a radio (i.e. were higher SES). In other words, if we adjust for SES in a blacks-only data set we ignore the fact that the meaningful variation in SES is not within blacks but between blacks and whites. Do we really want to employ a procedure (i.e. regression adjustment) that fictitiously removes this disparity from consideration? Yet including whites in order to adjust for between-race SES confuses things. Further, since SES is most certainly a function of literacy the rationale for statistical adjustment of SES is dubious.

The way forward is to recall that the goal is to compare illiteracy in persons with otherwise identical characteristics residing in Jim Crow and non-Jim Crow states. Toward this end, it may be useful to restrict analyses to states that lie at the border of Jim Crow states (see Authors’ map). Presumably, these states have more exchangeable populations, leaving the key difference between them Jim Crow laws. For purposes here, I hastily defined Jim Crow border states as: Arizona, Kansas, Kentucky, Missouri, New Mexico, Oklahoma and West Virginia. And non-Jim Crow border states as: Colorado, Illinois, Indiana, Iowa, Nebraska, Ohio and Utah. The middle panel of Table 2 presents the corresponding results. In this case, the effect of Jim Crow seems to double black illiteracy but now there are negligible differences in white illiteracy and relatively little difference in radio ownership (not shown). Another problem (not shown) is that the disparity in radio ownership (i.e. SES) between Jim Crow and non-Jim Crow states in this border region remains large. All of the border states were racially charged with many incidents of horrific racist acts in Jim Crow and non-Jim Crow states alike. Historians would not be surprised.

Yet another approach would be to seek out data akin to a natural experiment. I seek states that were not part of the confederation of states that attempted to secede from the American Union during its racially motivated Civil War (circa 1861–65) but later ended up enacting Jim Crow laws. Such states would presumably have a diminished culture of racial animus that is confounding the desired effect. Again, hasty research suggests there were two such states: Kansas and Wyoming. Restricting analysis to a comparison between these states and other similarly situated states that did not enact Jim Crow (e.g. Colorado and Nebraska) would seem to better identify the effect of Jim Crow, now better disentangled from the background cultures so troubling in the 48-state and border state analyses. The third panel in Table 2 presents these results. It is readily seen that the impact of Jim Crow on black illiteracy declines to 2.29, but there remain wide disparities in white illiteracy. As with the border state analysis, the upshot is that teasing out the independent effect of Jim Crow on black illiteracy is quite complicated and MLMs appear unable to help. This analysis would be considerably strengthened by evaluating data on Confederate states that did not enact Jim Crow, but I am unaware of any with these characteristics. It follows that the identification of Jim Crow effects rests on a careful analysis of anomalous but real cases, not the imputed fiction generated by a sophisticated regression model.

A final word on statistical inference seems necessary. With the tens of thousands of subjects analysed here there is no reason to present confidence intervals or P-values: all point estimates are extremely precise and most any difference is large enough to reject a null hypothesis of no difference. But several issues merit attention if one were interested in testing hypotheses from a frequentist perspective in a typical multilevel data set. Among these are Type I error rates. If contexts are thought to be the driving force theoretically, they should be evaluated as such statistically. Critically, if my last analysis is considered to mimic a conventional group randomized trial about the impact of Jim Crow laws on illiteracy, the primary test statistic has only two degrees of freedom! This is because in group trials, the number of degrees of freedom for main effects equals the number of experimental arms (e.g. Jim Crow or not) times one less than the number of groups in each arm.58 The number of persons in total, per arm, or per group is not part of the calculation. Neither is the intraclass correlation coefficient. Because most MLM programmes default to evaluating test statistics against the asymptotic Z-distribution, the Type I error rates from analyses that include only a small number of groups are much larger than programme output implies. For this and many other reasons, P-values from MLMs may be artificially too small.

Conclusion
I am in full agreement with the primary goal of the Authors’ new paper: epidemiologists should consider contexts and multilevel phenomena. I also agree with them that critical examination of a scientist’s culture, history and personal motivation can be enlightening. Finally, we all agree that Robinson’s analysis is technically correct and has withstood the test of time. But our agreement seems to end here. First, existing data do not support the Authors’ claims about Robinson’s motivation and/or impact. Second, abundant data show that the Authors’ claims about MI and multilevel research are wrong. Third, I do not see how estimated associations from a sophisticated MLM are an improvement over Robinson’s simple correlations. There is a place for MLMs in social epidemiology but in too many cases researchers who employ them end up making

a Faustian bargain (I am not against such transactions so long as researchers give informed consent and disclose accordingly). For proof of this, see the vast body of research on school effects. An experimental (thinking) approach, as originally advanced by Fisher, appears more promising.

To sum up, the two new papers by Gelman and Hernán demonstrate yet again that clarity in thought and a deep substantive understanding of the phenomena under investigation are the keys to scientific advancement.3,4 Whether our research is at the individual, ecological or multilevel, our analytic goal must not be reduced to a mere demonstration of model-fitting.

Finally, I sincerely commend the Authors for writing a provocative paper. As this journal’s editors know, discussion and debate are the life blood of scientific advancement. The purpose of this paper will have been accomplished if, however, it prevents the future computation of meaningless variance components and stimulates the study of multilevel causal inference in social epidemiology.

References
Subramanian SV, Jones K, Kaddour A, Krieger N. Revisiting Robinson: the perils of individualistic and ecological fallacy. Int J Epidemiol 2009;38:342–60.
Robinson WS. Ecological correlations and the behavior of individuals. Am Sociol Rev 1950;15:351–57. Reprinted Int J Epidemiol 2009;38:337–41.
Gelman A, Shor B, Bafumi J, Park D. Rich state, poor state, red state, blue state: what’s the matter with Connecticut? Quart J Polit Sci 2007;2:345–67.
Hernan MA, Alonso A, Logan R et al. Observational studies analyzed like randomized experiments: an application to postmenopausal hormone therapy and coronary heart disease. Epidemiol 2008;19:766–79 (with discussion).
Simpson EH. The interpretation of interaction in contingency tables. J Royal Stat Soc B 1951;13:238–41.
Pearl J. Causality: Models, Reasoning, and Inference. New York: Cambridge University Press, 2000.
Robinson WS. Some properties of the trichotomy ‘Like, No Opinion, Dislike’ and their psychological interpretation. Sociometry 1940;3:151–78.
Robinson WS. Bias, probability, and trial by jury. Am Sociol Rev 1950;15:73–78.
Robinson WS. The logical structure of analytic induction. Am Sociol Rev 1951;16:812–18.
Robinson WS. A method for chronologically ordering archaeological deposits. Am Antiq 1951;16:293–301.
Robinson WS. The motivational structure of political participation. Am Sociol Rev 1952;17:151–56.
Robinson WS. The statistical measurement of agreement. Am Sociol Rev 1957;22:17–25.
Robinson WS. The geometric interpretation of agreement. Am Sociol Rev 1959;24:338–45.
Robinson WS. Asymmetric causal models: comments on Polk and Blalock. Am Sociol Rev 1962;27:545–48.
Clogg CC. The impact of sociological methodology on statistical methodology. Stat Sci 1992;7:183–207.
Raftery AE. Statistics in sociology, 1950-2000: a selective review. Sociol Method 2001;31:1–45.
Oakes JM. An analysis of AJE citations with special reference to statistics and social science. Am J Epidemiol 2005;161:494–500.
Susser M. The logic in ecological: I. The logic of analysis. Am J Public Health 1994;84:825–29.
Jacobs JA. ASR: yesterday, today, tomorrow. Am Sociol Rev 2004;69:1–3.
Srole L. Social integration and certain corollaries: an exploratory study. Am Sociol Rev 1956;21:709–16.
Hovenkamp H. Social science and segregation before Brown. Duke Law J 1985;3–4:624–72.
Irons P. A People’s History of the Supreme Court. New York: Viking, 1999.
Rosen PL. The Supreme Court and Social Science. Urbana, IL: University of Illinois Press, 1972.
Erickson RJ, Simon RJ. The Use of Social Science Data in the Supreme Court. Urbana, IL: University of Illinois Press, 1998.
Clark KB. The desegregation cases: criticism of the social scientist’s role. Villanova Law Rev 1960;5:224–46.
Van Den Haag E. Social science testimony in the desegregation cases - a reply to Professor Kenneth Clark. Villanova Law Rev 1961;6:69–79.
Appellant’s Brief. The effects of segregation and the consequences of desegregation: a social science statement. Minnesota Law Rev 1953;37:427–93.
DeGroot MH, Fienberg SE, Kadane JB (eds). Statistics and the Law. New York: John Wiley & Sons, 1986.
Lucas SR, Paret M. Law, race and education in the United States. Ann Rev Law Social Sci 2005;1:203–31.
Selvin HC. On following in someone’s footsteps: two examples of Lazarsfeldian methodology. In: Merton RK, Coleman JS, Rossi PH (eds). Qualitative and Quantitative Social Research: Papers in Honor of Paul F Lazarsfeld. New York: Free Press, 1979, pp. 232–44.
Jerabek H. Paul Lazarsfeld - the founder of modern empirical sociology: a research biography. Int J Pub Opin Res 2001;13:229–44.
Sills DL. Paul F. Lazarsfeld: 1901–1976. Washington, DC: National Academy of Sciences, 1987.
Morgan SL, Winship C. Counterfactuals and Causal Inference. New York: Cambridge, 2007.
Fisher RA. The Design of Experiments. 1st edn. Edinburgh: Oliver and Boyd, 1935.
Bowles S, Gintis H. The revenge of homo-economicus: contested exchange and the revival of political economy. J Econ Perspect 1993;7:83–102.
Bowles S. Microeconomics: Behavior, Institutions, and Evolution. Princeton, NJ: Princeton University Press, 2004.
Roemer J (ed). Analytical Marxism: Studies in Marxism and Social Theory. New York: Cambridge, 1986.
Przeworski A. Contextual models of political behavior. Polit Method 1974;1:27–61.
Coleman JS. Social theory, social research, and a theory of action. Am J Sociol 1986;91:1309–35.
Durlauf SN. A framework for the study of individual behavior and social interactions (with discussion). Sociol Method 2001;31:47–87.

Published by Oxford University Press on behalf of the International Epidemiological Association © The Author 2009; all rights reserved.

Advance Access publication 28 January 2009

International Journal of Epidemiology 2009;38:368–370 doi:10.1093/ije/dyn355

Commentary: ‘Is the Social World Flat? W.S. Robinson and the Ecologic Fallacy’
Glenn Firebaugh

Department of Sociology, Pennsylvania State University, USA. E-mail: firebaug@pop.psu.edu

Accepted 9 September 2008

If the social world were ‘flat’ in the sense that it did not matter where you lived or with whom you associated (no place effects, no context effects, no contagion effects), then single-level analysis would do. The message of Subramanian, Jones, Kaddour and Krieger1 is that the social world usually is not flat, so multilevel analysis usually is called for.

The idea that single-level analysis is problematic when there are multilevel effects is quite consistent with Robinson’s classic warning about the ecologic fallacy.2 In fact, I suspect that Robinson himself would have embraced multilevel analysis had it existed in his day.

Because Robinson has had a multiplicity of interpreters, it is sometimes difficult to separate what Robinson actually said from a caricature of what he said (his article is only seven pages long, and I recommend that readers examine it for themselves). In some instances interpreters have ‘out-Robinsoned’ Robinson. To cite one example: after defining terms, Robinson lists, in his third paragraph, more than a dozen pre-1950 articles that used ecological correlations to estimate individual-level correlations. In the next paragraph, referring to ‘these studies’ (sic), Robinson writes that ‘Ecological correlations are used simply because the properties of individuals are not available’ (p. 352). Although that statement refers specifically to the aims of the studies Robinson has just listed, it is sometimes quoted out of context as axiomatic for all social research.

To be true to Robinson, let us begin where he did, with the issue of whether individual-level relationships can be reliably inferred from aggregate-