Linked references are available on JSTOR for this article:
https://www.jstor.org/stable/270884?seq=1&cid=pdf-reference#references_tab_contents
Wiley and American Sociological Association are collaborating with JSTOR to digitize, preserve
and extend access to Sociological Methodology
Desmond S. Cartwright
UNIVERSITY OF COLORADO
system tradition. There are also methodological divisions, into those who
use factor analytic techniques and those who do not, for example.
The sights of the present paper are set upon sociological method-
ology. However, the work of other disciplines should be examined also
in attempting to obtain an overview of the scope of methodological
issues. Urban ecology, for example, has not so far been concerned with
evaluating population density in terms of three-dimensional layered
space; the suggestion that it should be so concerned arises from con-
sideration of altitude as a variable in the ecological studies of geographers
and biologists.
This paper will have four main parts. In the first two parts, the
scope of the subject matters and methodological issues will be surveyed,
covering human ecology in sociology and four other disciplines, and
according to any of the different meanings of "ecological variables" that
given scientists implicitly adopt. Then the diversity of methodological
issues surrounding ecological variables will be organized into a compre-
hensive schema, selected parts of which will be given more detailed
treatment. In the last part of the paper attention will be focused on the
logical problems of inference from group correlations to individual cor-
relations. Some previous solutions will be reviewed and a new one
described.
SCOPE: SOCIOLOGY
Early Studies
ence that relations found between measures taken over groups of one
size apply also to the same measures if they would be taken over groups
of a different size. The problem is especially acute in (though not at all
restricted to) ecological research, where the size of the areal unit needs
to vary with the problem and with the available data. The group size
of counties is typically larger than that of cities, smaller than that of
states.
Levin and Lindesmith (1937) report on the work of Joseph
Fletcher, who in 1850 published a Summary of Moral Statistics of
England and Wales. Most interesting from the present point of view is
his use of maps and larger ecologic areas. His first map was one effec-
tively of "natural areas" based on the prevailing industry type, agri-
culture, mining, manufacturing, and so on, drawn ". . . with as much
accuracy as was permitted by the large and varying size of the counties;
the civil divisions which . . ." were the integral ones for his data.
Neither the counties nor the areas were equal in size. The counties
for Fletcher were what Duncan et al. (1961) call the "basic set of areal
units" into which the entire "universe of territory" is subdivided, such
that the units are exhaustive of the universe and also nonoverlapping.
Basic units may be combined into larger ones, as Fletcher made "natural
areas."
In reality the investigator is often confronted with the choice of
taking the data in the units available from government surveys or making
up new units and collecting all his own data. Most commonly, the latter
alternative would be prohibitively costly of time and money. The given
units might be recombined in some fashion if that can be shown to be more
desirable; however, recombination is usually done not to equate units
for size but rather, as Fletcher did, to make units homogeneous
with respect to one or more variables.
The problems of differing size of unit affect the measures themselves,
not only the inferences. Obviously, size of population will vary
directly with areal size of county, given equal average density.
Construction of Indices. Variations in size can be partially ac-
counted for by construction of rates and indices. Fletcher constructed
maps showing distributions by county and ecologic division for various
"indices of moral influences and results," including "dispersion of
population," "ignorance, as measured by the percentage of signatures
by marks in the marriage registers," "Crime, as indicated by criminal
commitments of males (allowance made for the age of the population),"
Modern Era
Duncan. First, linear distance as the crow flies is probably quite poor
as a measure of actual distance to be traveled or of the difficulty of
traveling from residence to place of work. Streets turn corners, run
around parks or industrial areas, and sometimes do and sometimes do
not have public transportation along them. Sometimes they run through
neighborhoods inimical to selected segments of the population of
potential workmen. Second, since "workplace potential" was ". . .
interpreted as a measure of the accessibility of the site to workplaces
in the . . ." territory, it is appropriate to consider other factors involved
in "accessibility." While the place of the workplace might be reachable,
its work potential in terms of actually providing a job for a potential
workman might be quite remote, depending on numerous aspects both
of the workplace and of the potential workman. From an ecologic point
of view, the notion of accessibility would be poorly represented by
geographic linear distance.
Advances in Multivariate Analysis: Methodological. Multivariate
techniques include multiple regression, factor analysis, cluster analysis,
analysis of dispersion, and several others. Duncan et al. (1961) provide
extensive discussion of uses of multiple regression, much of which
depends upon prior formulation of a model of the relationships to be
studied, selection of an appropriate dependent variable, and regression
upon several independent variables pertinent to a test of the model or
alternative models. The use of such models for estimating the individual
correlation from group data will be discussed in the last section of this
paper.
One series of uses of regression analysis described by Duncan et al.
(1961: 128-160) has to do with assessment of contiguity and regional
classification. Let Y be the dependent variable, X1, X2, X3, X4, the
independent variables, Y* the predicted estimate of Y. Two areal units
may be similar on Y, on some or all of the X, and/or on Y*. The authors
give examples from State Economic Areas, with Y = percentage of land
in farms, X1 = population potential, X2 = distance to metropolitan
area, X3 = index of urbanization, X4= index of soil quality, Y* =
24.33 + 0.590X1 + 0.629X2 - 0.284X3 + 0.472X4, R = .72. The fol-
lowing pair of areas shows similarity in all ways:
Duncan et al. describe the latter pair of units as being ". . . alike for
different reasons." (p. 155); whereas Nebraska 6 and S. Dakota 4a were
"alike for the same reasons."
The basic problem with such analysis is the causal assumption. If
variables are manipulated it is appropriate to call them "independent,"
and those whose consequential variation is observed "dependent."
Without manipulation the implied causal direction, if any, is in doubt
until further consideration demonstrates otherwise. For example, it
might be agreed that where soil quality is higher it is reasonable to
suppose that humans would be more likely to put more acres into farms
on that very account; but it is difficult to imagine that humans would do
their best to put as much distance as possible between their agricultural
and their metropolitan areas, so the distance as such cannot be causal.
Similarly, it is hard to see that smallness of urbanization could produce
or otherwise be causally efficient in relation to percentage of land in
farms, although it is known that if a town or city occupies a place, no
part of such place can be classified as a farm in the census; also that it is
not easy to extend the city limits over a farm if the owner objects.
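The comparison of areal units on Y* can be made concrete with a minimal sketch that evaluates the fitted equation quoted above from Duncan et al. The coefficients are those given in the text; the predictor values for the two units are hypothetical, invented only to show how two units can reach nearly the same Y* through different weighted contributions. They are not the State Economic Area values from the original table, which is not reproduced here.

```python
# Coefficients of the fitted equation as quoted in the text.
INTERCEPT = 24.33
WEIGHTS = {"X1": 0.590, "X2": 0.629, "X3": -0.284, "X4": 0.472}


def predict(unit):
    """Return Y*, the predicted percentage of land in farms."""
    return INTERCEPT + sum(WEIGHTS[name] * unit[name] for name in WEIGHTS)


def contributions(unit):
    """Return each predictor's weighted contribution to Y*."""
    return {name: WEIGHTS[name] * unit[name] for name in WEIGHTS}


# Hypothetical predictor values (assumptions, not data from the source).
unit_a = {"X1": 20.0, "X2": 40.0, "X3": 30.0, "X4": 50.0}
unit_b = {"X1": 60.0, "X2": 10.0, "X3": 40.0, "X4": 46.0}

for label, unit in (("unit_a", unit_a), ("unit_b", unit_b)):
    print(label, round(predict(unit), 2), contributions(unit))
```

The two hypothetical units have almost identical predicted values, yet unit_a owes most of its Y* to X2 and X4 while unit_b owes most of it to X1. This is the sense in which two units may be alike on Y* for different reasons.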
Multiple regression procedures minimize the residuals Y - Y* the
best way they can for the given data, and it is well known that the
weights multiplying the predictor variables to produce that minimization
will change with additional independent variables or with subtractions.
For a given underlying factor, A, measures sharing the variance of
Factor A will share also the weights; remove one measure from that set
and the weights on the remainder increase in absolute size; remove a
measure with variance in an orthogonal factor set and no change occurs
in the weights of Factor A measures. (See Gordon (1967), for example.)
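The behavior of the weights can be checked with a small sketch that works directly from correlations rather than raw data. The correlation values are hypothetical: x1 and x2 are assumed to share Factor A (correlating .80 with each other), and x3 is assumed orthogonal to both. Standardized weights are obtained by solving the normal equations for whichever predictors are retained.

```python
def solve(matrix, rhs):
    """Solve a small linear system by Gauss-Jordan elimination with pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(n):
            if row != col:
                factor = aug[row][col] / aug[col][col]
                aug[row] = [a - factor * p for a, p in zip(aug[row], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def std_weights(r_xx, r_xy, keep):
    """Standardized regression weights using only the predictors in `keep`."""
    sub_xx = [[r_xx[i][j] for j in keep] for i in keep]
    sub_xy = [r_xy[i] for i in keep]
    return solve(sub_xx, sub_xy)


# Hypothetical correlations among predictors (assumption, not source data).
R_XX = [[1.0, 0.8, 0.0],
        [0.8, 1.0, 0.0],
        [0.0, 0.0, 1.0]]
# Hypothetical correlations of each predictor with the dependent variable.
R_XY = [0.5, 0.5, 0.3]

full = std_weights(R_XX, R_XY, [0, 1, 2])   # all three predictors
drop_x2 = std_weights(R_XX, R_XY, [0, 2])   # remove a Factor A measure
drop_x3 = std_weights(R_XX, R_XY, [0, 1])   # remove the orthogonal measure
print(full, drop_x2, drop_x3)
```

Under these assumed correlations, removing x2 raises the weight on x1 from about .28 to .50, while removing x3 leaves the weights on x1 and x2 unchanged.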
Thus the procedure has internal restraints that are independent of
choice of predictor variables in a substantive sense. It is these restraints
that are imposed upon the linear combination Y* and allow it, for
rotation will ensure that the smallest number of original measures has
a nonvanishing correlation with the hypothetical factor; and at the
same time, it will ensure that each original measure has nonvanishing
correlations with the smallest number of the m factors. The second task
is that of estimating the factor scores from the obtained measurements
of the original measures. A thorough review of available procedures is
given by Horn (1965).
If it can be shown that some one of the original r measures is very
highly correlated with a particular factor, then that one measure may
henceforth be used in place of all the measures having substantial
correlations with the factor: whatever the variable is, it is measured
very well by that one measure, and the remainder are redundant.
Cluster analysis is in some forms a shortcut to factor analysis
with rotation. Using Tryon's (1955) procedures of key-cluster analysis
yields an excellent approximation to a principal axis factor analysis
followed by rotation to an independent cluster solution in the Harris-
Kaiser sense.
Several workers have argued strongly for the use of factor analytic
techniques in ecological research. The noted ecologist Schmid writes:
"Factor analysis possesses two special advantages. The first is parsimony:
it can reduce a large number of interrelated variables to a relatively
small number of independent factors. The second advantage is in provid-
ing a means for discovering underlying unities. It affords a technique for
determining the patterns, regularities, and basic structure of a large
number of variables" (1960a: 535). "The factors obtained from this
analysis are, like any scientific concept, abstract statistical artifacts . . .
it may or may not be possible to demonstrate a direct relationship
between the factors and basic sociological processes. Although one may
be strongly tempted to infer causality, the results of factor analysis
merely measure the degree of concomitance among community structures
and characteristics. The elements that are revealed may be purely
coincidental and do not necessarily possess direct relevance to the
etiology of crime" (1960a: 542).
It is apparent that Schmid here reflects an ambivalence or am-
biguity that is common throughout all work with factor analysis. On
one hand, a factor is supposed to represent an underlying unity, some-
thing which serves to bring together all the measures loading the factor.
On the other hand, a factor is thought of as a reducer of the number of
variables, as a statistical artifact merely measuring the degree of con-
In the work of Tryon (1955, 1967) may be seen the full use of
cluster analysis for the purpose of multivariate typing. First, measures
are clustered into as many independent clusters as are demanded by the
correlational data; then scores for each cluster are produced and each
object is scored on all clusters; then the objects are clustered on the
basis of homogeneity of pattern across the cluster scores. The object
clusters are called O-types. Then O-type prediction is used on criterion
measures of various kinds, the search being made for homogeneity in the
criterion scores of members of a given O-type.
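The third step, grouping objects by similarity of profile across cluster scores, can be sketched as follows. This is not Tryon's algorithm; it is a simple greedy grouping by Euclidean distance, offered only to show what "homogeneity of pattern" means operationally. The tract names, scores, and the radius of 0.5 are all hypothetical.

```python
from math import dist


def group_by_profile(cluster_scores, radius):
    """Greedy grouping: an object joins the first type whose seed profile
    lies within `radius`; otherwise it seeds a new type."""
    types = []  # each entry: (seed_profile, [member names])
    for name, profile in cluster_scores.items():
        for seed, members in types:
            if dist(seed, profile) <= radius:
                members.append(name)
                break
        else:
            types.append((profile, [name]))
    return [members for _, members in types]


# Hypothetical standardized scores of five tracts on three measure clusters.
scores = {
    "tract_1": (1.2, -0.4, 0.9),
    "tract_2": (1.0, -0.5, 1.1),
    "tract_3": (-0.8, 1.3, -0.2),
    "tract_4": (-1.0, 1.1, 0.0),
    "tract_5": (0.1, 0.0, -1.5),
}

object_types = group_by_profile(scores, radius=0.5)
print(object_types)
```

With these assumed scores, tracts 1 and 2 form one object type, tracts 3 and 4 a second, and tract 5 stands alone. A prediction step would then ask whether members of the same type are also alike on some outside criterion.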
Finally, brief mention must be made of a class of multivariate
techniques depending on the analysis of dispersion. Dispersion is the
multivariate equivalent of variance in the univariate case. It includes the
variances of the several measures and also their covariances. The
simplest question asked is: do two or more groups differ in the total set
of means for n measures? Several expositions of the mathematical basis
may be found (for example, Rao, 1962), and of recent developments in
analysis for more complex designs (chapters by Jones and Bock in
Cattell, 1966). So far little use has been made of these procedures in
ecological research, but they have much to offer.
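Since dispersion is described here as the variances of the several measures together with their covariances, a minimal sketch of its computation may help. The data are hypothetical, and the divisor n - 1 (the usual sample estimate) is an assumption, since the text does not specify one.

```python
def dispersion_matrix(rows):
    """Sample variance-covariance matrix of `rows`
    (one row per unit, one column per measure)."""
    n = len(rows)
    k = len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(k)]
    return [
        [
            sum((row[i] - means[i]) * (row[j] - means[j]) for row in rows) / (n - 1)
            for j in range(k)
        ]
        for i in range(k)
    ]


# Hypothetical scores of four areal units on two measures.
data = [(2.0, 1.0), (4.0, 3.0), (6.0, 5.0), (8.0, 11.0)]
S = dispersion_matrix(data)

# Diagonal entries are variances; off-diagonal entries are covariances.
for row in S:
    print([round(value, 3) for value in row])
```

The matrix is symmetric, so the covariance of measure 1 with measure 2 appears twice. Multivariate tests of group differences compare such matrices between and within groups, just as univariate tests compare variances.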
Advances in Multivariate Analysis: Substantive. The main part of
this section will be concerned with results in factor and cluster analysis;
the survey is not intended to be complete. As will be seen, most of the
work has been in urban settings.
Schmid (1950) and Schmid, MacCannell, and Van Arsdol (1958)
studied correlations between a dozen census tract measures for two score
American cities in 1940 and 1950. They concluded that a major dimen-
sion of socioeconomic status of the population, as determined by such
measures as education, income, and occupation, underlies the social
structure of American cities.
Working independently, using procedures of cluster analysis,
Tryon (1955) identified three major clusters of measures on the census
tracts of San Francisco, 1940. The defining measures for the Family Life
cluster were owner-occupied dwelling units, large families, percentage of
females not working (housewives), number of young children. Defining
measures for the second cluster, Assimilation, were skilled males,
native-born whites, females, foreign stock from Protestant Europe, and
white-collar females. The third cluster, Socioeconomic Independence,
was measured by percentage of managerial and professional males,
own-account males and percentage college-educated. The third cluster
         ED     R   ASH   NOC    ND    00     W     F
ED             89    76    71    51    39    41   -12
R                    73    68    53    47    34   -13
ASH                        86    69    67    58   -07
NOC                              73    72    69    01
ND                                     80    70    16
00                                           76    12
W                                                  32
and Borgatta offer a profile of the cities, by states, each city having
twelve measures, one of population in thousands and eleven in approxi-
mate deciles. Thus a complex, multivariate typology of cities is presented
(pp. 76-100) in quantitative form, waiting to be put to work. Hadden
and Borgatta comment: ". . . about the use of the profile as a source
for comparative research . . . . It is important to emphasize that one
value of this kind of study is that it permits, indeed compels, the re-
search to progress beyond the limits of 'traditional' variables, because
the relationships among these variables have already been summarized"
(p. 75). It is evident that the profiles will permit valuable nonrandomized
contrast designs (Keyfitz, 1964) and quasi-experimental designs (Camp-
bell and Stanley, 1963) in the study of impacts of different urban ecolo-
gies upon diverse aspects of social organization.
As the unit area is increased, the homogeneity decreases, in
general. Evidently tracts will be more homogeneous than community
areas, the latter more homogeneous than cities, and the latter more
homogeneous than counties. In their appendix (pp. 185ff.), Hadden and
Borgatta discuss the observation that results obtained in ecological
research depend, to a considerable extent, on the way the ecological unit
is defined. The concern is especially important for varying definitions of
a given type of unit such as a city. Changes in the census definition of
urban units from 1940 to 1950 led Bogue (reported in Hadden and
Borgatta, p. 187) to assert that it meant abandoning forty years of
thinking and research done on metropolitan districts and breaking the
chain of continuity in urban population statistics. However, no empirical
evidence on the effects of changes had been produced. Hadden and
Borgatta then report a study of Urbanized Areas, Standard Metro-
politan Statistical Areas, and the principal cities of the latter areas.
Using a multitrait-multimethod matrix (Campbell and Fiske, 1959),
they show that the fifteen factors extracted for each type of area are
highly comparable, with most validity coefficients in the .9 range and
only one below .6, namely that for density in Urbanized Area with
density in Standard Metropolitan Statistical Areas which was .24.
Considering the types of areas at issue, it would have been strange
indeed to find a high correlation between density values. Thus the
factor scores (for factors highly similar to those in the American cities
data reported above) show a robustness that overrides variations in
areal definition of comparable units.
But do factors describing tracts also describe cities? It seems that
Anthropology
Biology
live births; for "unlikely" areas it was 12.9. Areas with outcrops of
igneous rocks had the highest malformation rate, 17.5. Another ex-
tremely careful study by Grahn and Kratchman (1966) shows that
neonatal mortality rates are higher in most mountain states; and that
there is a regular increase in rate with altitude, which appears related not
to geologic environment, but to associated increases in cosmic ray
intensity and decreases in oxygen partial pressure. The latter appears to
function by reducing fetal overall growth rates in the last ten weeks of
pregnancy, thereby reducing birth weight, with consequent lowering of
resistance potentials.
Geography
Psychology
knowledge of how often people say things proudly and with definiteness,
for example; and whether such occurs more frequently in the behavior
setting of dressing in the morning or in the setting of the drugstore at
noon.
Sells (1963) attacked the problem somewhat differently. He
argued that focus on phenomenological data was liable to obscure the
relation between situation and behavior, and that attempts should be
made to dimensionalize characteristics of situations in as objective and
complete a way as personality and ability dimensions of individual
behavior have been established. He proposed that such a program be
started with an outline of "Basic Aspects of the Total Stimulus Situation"
(p. 9), for which measures should be gathered or created, and among
which the major interdependencies should be examined through multivariate
techniques. His primary heads and major subheads were: (1) Natural
aspects of the environment: (a) gravity, (b) weather, (c) terrain, (d)
natural resources; (2) Man-made aspects of the environment: (a) social
organization, (b) social institutions, (c) transitory social norms; (3)
Description of task-problem, situation, and setting: (a) factors of the
focal task situation (for example, hazards and risks involved, permitted
procedures, skill required), (b) factors of the individual's relation to
the situation (for example, status hierarchy), (c) factors defined by
other persons in the situation (for example, new or previous
acquaintances), and (d) factors of the setting (for example, physical
restraints, habitability); (4) External reference characteristics of the
individual: (a) biological, (b) social (for example, group memberships);
(5) Individuals performing relative to others: (a) togetherness (primary
groups), (b) group situation (formal structure, control, and so on, and
intergroup considerations), (c) collective situations. The work
has been pursued; dimensions of ordinary social groupings have been
obtained, for example (Sells, 1965). Sells (1966) has advocated that
psychologists generally pay much greater attention to behavior in
natural settings and focus upon the "ecological niche" as a source of
substantial contributions to behavior variance.
It is apparent that psychologists ultimately have in mind the
study of individual behavior, whether they study distal perceptual
achievement (like Brunswik), ongoing individual behavior in behavioral
settings (like Barker and Wright), or the dimensions of situations (like
Sells).
ANALYTIC SCHEMA
The Entities
TABLE 1
A Schema for Analytic Consideration of Studies in Ecology
1. The Entities:
   How Many?
   What?: 1 2 ..... N
   Species:
   Groups:
   Individuals:
2. The Setting:
   Dimension: 2__; 3__; 4__
   Duration: Short__; Decades__; Centuries__; Millennia__
   Size: Micro__; Meso__; Macro__
3. Entities as Components of the Setting:
   No entities considered components
   Some entities considered as components of the setting:
      How Many?
      What?: 1 2 ..... N-1
      Species:
      Groups:
      Individuals:
   What entity considered as target entity?
4. Informational Content of Variables:
   Descriptive__; Inferential__
   Denotative__; Abstract__
5. Measurement Procedures:
The Setting
change in areal unit means plus interaction between the changes in areal
distribution and areal unit means.
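The opening of the sentence above is missing from this excerpt. Assuming it states the usual three-part identity, in which the change in an overall mean equals a term for change in areal distribution, a term for change in areal unit means, and their interaction, the arithmetic can be sketched as follows. The population shares and unit means are hypothetical.

```python
def decompose_change(w1, m1, w2, m2):
    """Split the change in an overall weighted mean into three parts.

    w1, w2: shares of the population in each areal unit at times 1 and 2.
    m1, m2: areal unit means at times 1 and 2.
    """
    distribution = sum((b - a) * m for a, b, m in zip(w1, w2, m1))
    unit_means = sum(w * (b - a) for w, a, b in zip(w1, m1, m2))
    interaction = sum(
        (wb - wa) * (mb - ma) for wa, wb, ma, mb in zip(w1, w2, m1, m2)
    )
    return distribution, unit_means, interaction


# Hypothetical shares and means for three areal units at two dates.
shares_t1 = [0.5, 0.3, 0.2]
shares_t2 = [0.4, 0.3, 0.3]
means_t1 = [10.0, 20.0, 30.0]
means_t2 = [12.0, 20.0, 33.0]

parts = decompose_change(shares_t1, means_t1, shares_t2, means_t2)
overall_t1 = sum(w * m for w, m in zip(shares_t1, means_t1))
overall_t2 = sum(w * m for w, m in zip(shares_t2, means_t2))
print(parts, overall_t2 - overall_t1)
```

In this invented case the overall mean rises by 3.7, of which 2.0 is due to redistribution of population among units, 1.6 to change in the unit means, and 0.1 to their interaction.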
Numerous limits are placed on measurement procedures if
changes are to be studied; for example, the correlation between initial
values and changes must be taken into account (see Lord, 1956, for
example); measures with ceilings pose the special problem of limiting
change effects in the ceiling (or floor) direction for values initially close
to it. In fact, the topic is so complex that an investigator without
previous experience in measuring changes should probably consult some
sources devoted to the problem: Duncan et al. (1961: 62ff., 84-90, 160ff.),
Harris (1962).
Size. It might be useful to distinguish between macroecological
settings (for example, Dorf (1966) on terrestrial climate in the northern
hemisphere); microecological settings, like a city street; and meso-
ecological settings, which would include tracts and cities and counties
and states, but not continents, which would be macro. One implication
of size differences is that presumably the homogeneity of units is more
difficult to establish for larger units.
niceness of clothes the immigrant entities might see, for example) as an
entity (such as the richness of the occupants). If a demographic measure
is used (such as median age), it has the same double interpretative
possibility.
Now a measure of distance from the central business district, or
distance from workplaces, or age and condition of housing, seems to
reflect more evidently a variable pertaining to the setting only; while
measures of psychopathology incidence or prevalence seem to have a
greater applicability to the incumbent population only. And yet such
measures may be used by the researcher to reflect the character of areas
to which newborns or young children must adapt. Any subgroups might
be treated as the target entity, faced with the characteristics (such as
psychosis rate) of the ambient population just as much as with those of
the surrounding buildings and spaces. Therefore we must conceptualize
the setting as at times including certain entities; the target entity under
focus must adapt to its entire setting, including other entities within it.
It is a matter of figure and ground, and of degree of entitativity.
One difficult problem with complex settings arises from the
movements of entities in migration, transiency, and mobility. In using
entities to characterize areas, migrants from other areas may provide
false characterization (cf. Fletcher's attempts to adjust for migration).
Influence of setting upon entity may also be misunderstood if selective
migration to or from the setting is not examined closely.
Ross (1933) comments especially on the problem of transiency,
and notes that in an area where transient persons reside for an average
of three to four months only, it makes no sense to use Average Daily
Population (ADP) as a base for an Annual Commitment Rate since the
population at risk is several times larger than the ADP. In an area with
little transiency the base figures will be correct and the Commitment
Rate not inflated. Comparisons between the two areas on that rate will
obviously be affected more by the base than by the real proportions of
persons committed. A similar condition attends resort areas. In Aspen,
Colorado, for instance, the resident population is 1,500, according to the
1960 census, but it is estimated that 25,000 persons are there during the
ski season, staying for anywhere from one day to the whole season.
Arrest figures for juveniles would clearly have to take account of these
differences in population figures before appropriate rates could be
computed.
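Ross's point about the base of a rate can be illustrated with a rough sketch. It assumes steady turnover, so that the number of distinct persons at risk in a year is approximately the Average Daily Population multiplied by 12 divided by the mean length of stay in months. That approximation, the function names, and all figures are assumptions made for illustration, not Ross's procedure or data.

```python
def persons_at_risk(average_daily_population, mean_stay_months):
    """Approximate number of distinct persons present during one year,
    assuming steady turnover and a common length of stay."""
    turnover = max(1.0, 12.0 / mean_stay_months)
    return average_daily_population * turnover


def rate_per_1000(commitments, base):
    """Annual commitments per 1,000 persons in the chosen base."""
    return 1000.0 * commitments / base


# Two hypothetical areas with the same ADP and the same annual commitments.
adp = 2000
commitments = 30

stable_base = persons_at_risk(adp, mean_stay_months=12.0)
transient_base = persons_at_risk(adp, mean_stay_months=3.0)

print("rate on ADP base:       ", rate_per_1000(commitments, adp))
print("stable area, adjusted:  ", rate_per_1000(commitments, stable_base))
print("transient area, adjusted:", rate_per_1000(commitments, transient_base))
```

On the ADP base both areas show 15 commitments per 1,000. With a three-month mean stay the transient area's population at risk is four times its ADP, so its adjusted rate falls to 3.75 per 1,000 while the stable area's rate is unchanged.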
Mobility provides problems of very many kinds for ecological
of the labor force. Hadden and Borgatta interpreted the factor as "whole-
sale concentration," and it appears to be a straightforward descriptively
summarizing variable.
By contrast, Schmid's (1960) interpretation of his Factor I as Low
Social Cohesion-Low Family Status is mainly inferential. Some of the
salient measures are: Per cent families in labor force (.94), Fertility ratio
(-.93), Per cent married (-.91), Per cent Housing units built prior to
1920 (.89), Per cent population sixty years old and over (.87). Measures
with much lower loadings are: Per cent male, Median school grade
completed, Per cent professional workers, Per cent proprietors, manag-
ers, etc., Per cent laborers, all with factor loadings between - .35 and
+.24. The interpretation of Low Social Cohesion seems entirely in-
ferential; the interpretation of Low Family Status is probably inferential
too, unless it means mainly Low Married Status. Indeed, under the
circumstances of the given collection of measures that the factor analysis
has brought together, it would be difficult to assign any one clearly and
purely descriptive summarizing variable name to cover all the high
loading measures.
Denotative or Abstract. The interpretation of a factor as "residen-
tial mobility" refers to some definitely observable behaviors and numbers
of persons engaging in them. The variable of "dominance" refers to one
species or group winning out in competition with others. These types of
variables are denotative. By contrast, the variables of altitude or
latitude do not denote any particular events or processes or entities;
rather they refer simply to certain dimensions of an arbitrary framework
of reference axes: they are abstract. As defined, abstract variables can
be neither causal nor caused.
Measurement Procedures
Existing Solutions
         I     II    Sum
A                     95
B                    782
Sum    638    239    877
         I     II
A        p    1-p
B        r    1-r
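When only the marginal totals are known, as in the first table above, the margins alone still limit what the cell entries can be. The sketch below computes those limits, a line of reasoning usually associated with the method of bounds. The surviving text of this excerpt does not name that method, so its use here is an illustration rather than a restatement of the paper's argument. It assumes that p and r denote the proportions of rows A and B falling in column I.

```python
def cell_bounds(row_total, col_total, grand_total):
    """Smallest and largest counts a cell can hold given its margins."""
    low = max(0, row_total + col_total - grand_total)
    high = min(row_total, col_total)
    return low, high


# Marginal totals as printed in the first table.
GRAND = 877
ROWS = {"A": 95, "B": 782}
COL_I = 638

for label, row_total in ROWS.items():
    low, high = cell_bounds(row_total, COL_I, GRAND)
    print(
        label, "count in column I:", low, "to", high,
        "| proportion:", round(low / row_total, 3), "to", round(high / row_total, 3),
    )
```

For row A the margins permit any count from 0 to 95, so p is not constrained at all. For row B the count must lie between 543 and 638, which confines r to roughly .694 to .816.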
E{u} = np (1)
and
E{v} = mr (2)
E{u + v} = np + mr (3)
Since (m/(n + m)) + (n/(n + m)) = 1.00, and letting X = n/(n + m),
we have
E{Y} = a + bX (7)
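Equations (4) through (6) are not reproduced in this excerpt. One reading consistent with (1) to (3) and (7) is that Y is the proportion (u + v)/(n + m) of an area's members falling in column I, so that E{Y} = pX + r(1 - X) = r + (p - r)X, giving a = r and b = p - r. The sketch below works under that assumed reading and under the further assumption that p and r are the same in every area. The area sizes and the values of p and r are hypothetical, and expected values are used in place of sampled counts so that the recovery is exact.

```python
def fit_line(xs, ys):
    """Ordinary least-squares intercept and slope for one predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical individual-level proportions, assumed constant across areas.
P_TRUE, R_TRUE = 0.30, 0.70

# Hypothetical areas given as (n, m): sizes of rows A and B.
areas = [(20, 80), (40, 60), (50, 50), (70, 30), (90, 10)]

xs = [n / (n + m) for n, m in areas]                            # X = n/(n + m)
ys = [(n * P_TRUE + m * R_TRUE) / (n + m) for n, m in areas]    # E{Y}

a, b = fit_line(xs, ys)
r_hat = a          # intercept: expected Y where X = 0 (all members in row B)
p_hat = a + b      # fitted value where X = 1 (all members in row A)
print(round(p_hat, 3), round(r_hat, 3))
```

With real data the area proportions scatter about the line, and the estimates are only as good as the assumption that p and r do not vary with X.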
       I              II                 Sum
A      p*(Sum A)      (1-p*)(Sum A)      Sum A
B      r*(Sum B)      (1-r*)(Sum B)      Sum B
Sum    Sum I          Sum II
A New Solution
REFERENCES
ALLPORT, G. W.
ASCHMANN, H.
1966 "The Impact of Man on the Gippsland Lakes, Australia." Pp. 55-73 in
S. R. Eyre and G. R. J. Jones (Eds.), Geography as Human Ecology:
Methodology by Example. New York: St. Martin's Press.
BIRDSELL, J. B.
1902 Life and Labour of the People in London. Final Volume. London:
Macmillan.
BORGATTA, E. F., AND HADDEN, J. K.
1965 "An analysis of tract data by regions." Social Behavior Research Center,
University of Wisconsin. (Mimeo).
BRESLER, J. B. (ED.)
1967 "Ecologic analysis of the State of Colorado." Paper read at the annual
meetings of the Society of Multivariate Experimental Psychology,
Berkeley, California, November 1967. Available in dittoed form from the
authors.
CATTELL, R. B. (ED.)
1966 "Climatic changes of the past and present." Pp. 5-27 in J. B. Bresler
(Ed.), Human Ecology. Reading, Mass.: Addison-Wesley.
DUNCAN, B.
1966 "Variation in neonatal death rate and birth weight in the United States
and possible relations to environmental radiation, geology and altitude."
Pp. 251-276 in J. B. Bresler (Ed.), Human Ecology. Reading, Mass.:
Addison-Wesley.
GORDON, R. A.
HANKINS, F. H.
1908 "Adolphe Quetelet as statistician." In Studies in History, Economics, and
Public Law 31: 443-576. Published by Columbia University Press.
HARMAN, H.
1963 Local Community Fact Book: Chicago Metropolitan Area, 1960. Chicago
Community Inventory, University of Chicago.
KOCH, S. (ED.)
1861 London Labour and the London Poor, Vol. 11, London: Griffin.
NEWMAN, M. T.
1966 "Landforms, drainage, and settlement in the Vale of York." Pp. 91-121
in S. R. Eyre and G. R. J. Jones (Eds.), Geography as Human Ecology:
Methodology by Example. New York: St. Martin's Press.
PARK, R. E.
1915 "The city: Suggestions for the investigation of human behavior in the
urban environment." American Journal of Sociology 20 (March): 577-612.
1925 "Human behavior in the urban environment." In R. E. Park, E. W.
Burgess, and R. D. McKenzie (Eds.), The City. Chicago: University of
Chicago Press.
1926 "The urban community as a spatial pattern and a moral order." In
E. W. Burgess (Ed.), The Urban Community. Chicago: University of
Chicago Press.
1936 "Human Ecology." American Journal of Sociology 42 (July): 1-15.
RAO, C. R.
1960b "Urban crime areas: part II." American Sociological Review 25 (October):
655-678.
SCHMID, C. F., MACCANNELL, E. H., AND VAN ARSDOL, M. D.
1958 "The ecology of the American city: further comparison and validation of
generalizations." American Sociological Review 23 (August): 392-401.
SELLS, S. B.
1962 U.S. Censuses of Population and Housing: 1960. Census Tracts. Final
Report PHC(I)-26. U.S. Government Printing Office. Washington, D.C.
VAN ARSDOL, M. D., JR.
1967 "On American society and the immediate future of demography and
human ecology." Et Al. 1 (Fall): 5 and 10.
WIRTH, L.
1945 "Human ecology." American Journal of Sociology 50 (May): 483-488.