Fifty shades of grue: indeterminate categories and induction

in and out of the language sciences

Matthew Spike

If you’ve seen one electron, you’ve seen them all.

Steven Weinberg

The ultimate conclusions of the population thinker and of the typologist are precisely the opposite. For the typologist, the type (eidos) is real and the variation an illusion, while for the populationist the type (average) is an abstraction and only the variation is real. No two ways of looking at nature could be more different.

Ernst Mayr

Abstract

It is hard to define structural categories of language (e.g. noun, verb, adjective) in a way
which accounts for linguistic variation. This leads Haspelmath (2018) to make the following
claims: i) unlike in biology and chemistry, there are no natural kinds in language; ii) there
is a fundamental distinction between descriptive and comparative linguistic categories; and
iii) generalisations based on comparisons between languages can in principle tell us nothing
about specific languages. The implication is that cross-linguistic categories cannot support
scientific induction. I disagree: generalisations on the basis of linguistic comparison should
inform the language sciences. Haspelmath is not alone in identifying a connection between
the nature of the categories we use and the kind of inferences we can make (e.g. the ‘new
riddle of induction’: Goodman, 1965), but he is both overly pessimistic about categories in
language and overly optimistic about categories in other sciences: biology and even chem-
istry work with categories which are indeterminate to some degree. Linguistic categories
are clusters of co-occurring properties with variable instantiations, but this does not mean
that we should dispense with them: if linguistic generalisations reliably lead to predictions
about individual languages, and if we can integrate them into more sophisticated causal ex-
planations, then there is no a priori requirement for a fundamental descriptive/comparative
distinction. Instead, we should appreciate linguistic variation as a key component of our
explanations rather than a problem to be dealt with.

1 Categorical status anxiety

Scientists make all sorts of generalisations. Electrons carry a negative charge, water is H₂O,
giraffes and dolphins are related species, all languages have nouns, and happy people make
better workers. But some generalisations are better than others. Sometimes this comes down
to the evidence: maybe it turns out that happy workers aren’t so great after all. Sometimes,
however, we run into problems with the things we make generalisations about. Take nouns,
for example: what are they? The class of words that refer to objects, perhaps, but of course
nouns can refer to many other kinds of thing (e.g. death, taxes); some have fixed referents
(e.g. Elvis) and others don’t (e.g. Prime Minister), and many languages cut finer distinctions
between nouns than in English (e.g. gender, animacy, alienable vs. inalienable, etc.). As ‘noun’
is a grammatical distinction, maybe we should concentrate on the structural roles nouns take
on within linguistic systems. But, even in English, nouns are heterogeneous and occasionally
unpredictable: they can be countable and uncountable (e.g. dollar vs. cash); they trigger
verbal agreement for number (e.g. money talks), except when they don’t (e.g. ten dollars is
enough); they combine freely and productively through compounding — except when they don’t
— and so on. And nouns in other languages often do none of these things, but instead do all
sorts of other things which don’t happen in English at all: nouns in languages like Lardil are
marked for tense (e.g. ‘I cook-will fish-will’), incorporated nouns in polysynthetic languages
productively compound with verbal predicates (something like ‘I fish-ruined this morning’),
and Samoan nouns are marked for specificity (i.e. the difference between ‘we all burn a fish
sometimes’ and ‘Tua burned a particular fish’). Indeed, in some languages the existence of a
meaningful distinction between nouns and verbs is up for debate. So much for nouns, then. But
this kind of messy situation is typical in linguistics, where the establishment of a noun category
is relatively uncontroversial compared to that of other categories like adjectives and adverbs,
let alone less familiar classes such as ideophones and classifiers, and where even ‘word’ is a
notoriously tricky concept to nail down (see Haspelmath, 2011). The temptation here, perhaps,
is to accept that — unlike in physics, chemistry and biology — there are simply no well-defined
categories in human language. And, if this is indeed the case, and comparing nouns in English
with nouns in Ojibwe is just comparing apples and tennis balls, we probably need to have a
serious re-think about the basis of any cross-linguistic comparison.
Haspelmath (2007, 2010a, 2010b, 2018) is a proponent of the view above. Haspelmath’s
motivations are clear and well-founded: there don’t appear to be any linguistic categories with
exactly the same properties in any two languages. But he draws a radical conclusion, arguing
for a principled distinction between the description of individual languages on one hand, and
comparison of languages on the other. Linguistic description should assume a structuralist
stance in the Boasian tradition: every language is a unique entity, to be described on its own
terms, using a special metalanguage consisting of descriptive categories, with each of these
categories particular to the language in question. Linguistic comparison, on the other hand,
requires a different metalanguage which employs comparative concepts. The two categorical
systems should be fit for purpose but — crucially — are fundamentally different. Although it is
likely that some terms (e.g. ‘noun’) will be used both descriptively and comparatively, they do
not refer to the same thing, or even the same kind of thing. Accordingly, we can refer to ‘the
Nahuatl noun’, or ‘the comparative concept noun’, but the former is not a token of the latter,
and the latter is not the type of the former. The Nahuatl noun might play an important role in
the description of that language, and the comparative noun allows us to say, for example, that
(like most languages) German and Maori both have something which we might, comparatively,
call a noun. But the things we call nouns in English and Nahuatl and Maori are not variations
on a theme, i.e. the comparative noun: they are incommensurable.
There are major implications here. The first is methodological: although field linguists
are free to continue describing languages, and linguistic typologists in turn to proceed with
their comparisons, great care should be taken in borrowing terminology from one practice into
another. This is eminently sensible, and echoes similar calls in the linguistic literature (see
Huddleston & Pullum, 2002, p.31 for the distinction between language-particular and general
definitions; also Bybee, 1989; Lazard, 2006). The second implication, however, is an epistemological one, with potentially profound consequences: if no basis of meaningful comparison
exists, we need to seriously constrain — or perhaps even abandon — our attempts to make
scientific inferences on the basis of linguistic comparisons.

2 Haspelmath’s categorical imperatives

What kind of science concerns itself with linguistic diversity? According to the following claims,
a rather restricted one:

1. “Ontological difference: Comparative concepts are a different kind of entity than descrip-
tive categories.”

2. “General category fallacy: We do not learn anything about particular languages merely
by observing that category A in language 1 is similar to category B in language 2 or by
putting both into the same general category C.”

(Haspelmath, 2018, p.2)

The ontological claim has already been fleshed out above, i.e. that no meaningful or useful
type/token relation pertains between descriptive and comparative categories: they are not re-
lated as determinates and determinables (e.g. the relationship between ‘square’ and ‘shape’).
And although this might be seen as a kind of methodological prescription in line with Boasian
and structuralist thought — individual languages and speaker communities should be treated
objectively and on their own terms as much as possible — it is not: with his second, epistemo-
logical claim, Haspelmath denies a role, or even the prospect of a role, for inductive practice
based on linguistic comparison.
Haspelmath (2018, pp.89-91) distinguishes three types of ‘scientific category’: natural kinds,
comparative concepts, and social categories. Chemical elements and biological species such as
gold and the red fox are taken to be good examples of natural kinds: they are things which
“form a group regardless of any observers.” For natural kinds, “we need detailed descriptions
and agreement on a label but not a definition,” as they can be recognized by their various
“symptoms”, which need not be necessary and jointly sufficient, and where there must be a
clear difference between different natural kinds. Mountains and streams, on the other hand, are
comparative concepts “created by observers”, which “must be defined rigorously and delimited
from similar phenomena (e.g., mountains versus hills, streams versus rivers)”, whose “delimi-
tations are somewhat arbitrary”. Finally, social categories are “neither natural kinds (in the
sense that they recur across continents, independently of individual cultures) nor observer-made
concepts but that are recognized by every member of the society”: these include things like
boyfriends, office towers, and rabbis. Social categories require exhaustive definitions, but only
within their particular milieu: what exactly do French boyfriends do, for example, and how
are they embedded within French culture? Further, social categories and natural kinds contrast
with comparative concepts in that they are ‘independently existing’, as opposed to ‘observer
created’, but natural kinds and comparative concepts are ‘universally applicable’, while social
categories are ‘culture-specific’.
As might be expected, this novel ontology has potentially deep ramifications. As both
social categories and natural kinds are objectively real, knowing that something belongs to
one of these categories provides you with information about that thing: “once we realize that
an animal is a red fox (Vulpes vulpes), we can predict much about it and, if an investor is
told that a developer wants to build an office tower, they have clear expectations... Realizing
that something is subsumed under a natural kind or social category is a finding that gives
us additional information and we can establish a causal link between the phenomena and the
categories” (pp.97-98, my emphasis). This is not, however, the case for comparative concepts:
“If a geographer calls a landscape form on a newly discovered island a mountain, this does not
add any information and it does not establish a causal link” (p.98, my emphasis again).
Further to this, only natural kinds can support meaningful comparison: I can compare one
red fox with another. Comparative concepts (unsurprisingly) also support comparison, but on
account of being artificially constructed and arbitrarily delimited from other similar categories,
they are not ultimately meaningful. Social categories such as boyfriend can be compared within
a culture, but it would not be meaningful to compare French boyfriends with, for example,
Quechua boyfriends.
Linguistic categories, Haspelmath insists, are never natural kinds. Categories defined for
specific languages, such as we might find in a grammar, are social categories; the categories
familiar from typological work are (or should be) comparative concepts. The reason that lin-
guistic categories are not natural kinds is that the latter require only symptoms and diagnostic
tests (p.98, 102): we can recognise them via some subset of their properties, where these prop-
erties are not required to be necessary or jointly sufficient (p.98). Linguistic categories, on the
other hand, do require definitions. But categories within a specific language can only be defined
with respect to that language, and as such are social categories. Cross-linguistic categories
cannot — by their very nature — be defined within the context of specific languages, so they
are comparative concepts: “since the categories are not defined by their meanings, their nature
is different and they are incommensurable” (p.97).
Given the above, we can understand how Haspelmath concludes that there are no natural
kinds in language, that meaningful cross-linguistic comparison is impossible, and that any scien-
tific inference on the basis of such comparison is invalid. If all this is true, however, it is terrible
news for the disciplines guilty of trafficking in radical notions like noun, verb, and adjective: it’s
back to the drawing board for much of theoretical linguistics, as well as linguistically-oriented
branches of psychology, neuroscience, philosophy, and further afield.
I am more optimistic. I believe that Haspelmath is absolutely correct in drawing attention to
the fact that even basic linguistic categories show tremendous variation. However, Haspelmath
has drawn an unrealistic distinction between two types of scientific practice, i.e. those which do
and do not have access to natural kinds, where only the former can make meaningful comparisons
which support scientific inference. For all his scepticism about the ontological status of cross-
linguistic categories, Haspelmath is happy to assume that natural kinds in other sciences, for
example species in biology and even chemical compounds in chemistry, are uncontroversial: they
are not. Scientific categories are almost always fuzzy to some degree, yet scientific progress
continues in spite of — and often as a consequence of — this being explicitly understood.
Linguistic categories may well be exceptionally fuzzy in this regard, but this fact alone does not
preclude them from a role in meaningful comparison, and ultimately in scientific explanation
and prediction.
I will proceed as follows: first, I will sketch out some relevant work in the philosophical
literature which focuses on categories, naturalness, and their role in induction, with a focus
on the new riddle of induction (Goodman, 1965), and the role natural kinds play in scientific
explanation (Quine, 1969; Godfrey-Smith, 2003). Next, with particular reference to the role
of species in biology, I will show how a very similar set of conceptual problems relating to
categorisation have cropped up in other scientific domains. In these cases, a less restrictive
notion of naturalness seems plausible, while still allowing for useful activities like comparison
and prediction. I then assess Haspelmath’s claims in light of the above: there is nothing a
priori wrong with cross-linguistic categories which are defined in terms of clusters of features
in individual languages: we can compare languages on such terms, and — provided we use
them with care — they can play an important role in the language sciences. What is more, we
should see variation as a feature not a bug, and an essential component in any explanation of
the nature and origin of linguistic categories.

3 Kind of grue

Explanations and predictions in linguistics (and more generally), whether they concern historical
or structural relationships between languages, the factors behind linguistic variation, or an
individual’s knowledge and use of language, require evidential support. That is to say, a good
explanation or prediction should be in line with the data we have. Good explanations need
more than this, of course, one reason being that almost all explanations are underspecified by
the data: there are often many possible ways to group our data, which means we can construct
multiple explanations which are all consistent with that data. For example, I might observe
that Subject-Object-Verb order occurs in Mongolian and Marathi, and Verb-Object-Subject
order in Malagasy and Tzotzil, and make the generalisation that SOV languages are Northern
Hemispheric, and VOS languages are Southern Hemispheric. This would be an awful inference,
of course: for a start, we have plenty of evidence to the contrary, and know that it will make
bad predictions. Moreover, my categories of Northern and Southern Hemispheric languages
are extremely dubious. It is not a distinction which appears in the literature, and for good
reason: firstly, locating the nearest geographic pole doesn’t tell me much about the features of a
given language; secondly, there doesn’t appear to be any reasonable basis for the categorisation
beyond my tiny initial sample. However, this kind of problem can still occur with much larger
samples, and much less clearly suspect categories.
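A rough back-of-the-envelope calculation makes the weakness of this evidence vivid (a sketch only, assuming purely for illustration that each language’s hemisphere is assigned independently with probability 1/2): with two SOV and two VOS languages, the probability of a perfect hemispheric split arising entirely by chance is

2 × (1/2)⁴ = 1/8,

so even a flawless pattern in a sample of four lends almost no inductive support, quite apart from the dubiousness of the categories involved.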
Despite such problems, it seems reasonable to insist that good explanations and predictions
tend to rely on previous observations. So far so good: this is the stuff of scientific inference
and the backbone of science. Given the importance of this kind of reasoning, it would be nice
to feel secure in the validity of induction as a general and universally applicable procedure.
Unfortunately, this turns out to be much harder than expected. This is the ‘Problem of Induc-
tion’: Hume (1740) showed that when we extend from the known to the unknown, we assume
something along the lines of a ‘uniformity principle’: that observations made in the past will
resemble observations made in the future. However, there doesn’t seem to be a way to estab-
lish that principle as true a priori — that it is necessarily the case that past observations will
resemble future ones. And if this is the case, establishing the validity of the principle itself
must also rely on a form of induction, and we are left with circular reasoning: induction can
only be justified via induction. Hume’s solution was to abandon hope for a deductive proof
of induction’s validity, and instead conceive of it as a ‘habit of mind’: we continue to make
scientific inferences because they have been successful in the past, and we presume (and hope)
that they will continue to be so in the future.
So why won’t anyone publish my magnum opus, ‘Culinary aesthetics of the relative clause:
a minimalist account’ ? Imagine that I could establish some statistical relationship between the
distribution of Feature X and right-branching relative clauses, where Feature X is whether I
enjoy the regional cuisine associated with a language. Most linguists would be quick to dismiss
my work, and they would probably be right, but why? Something has gone (very) awry with
my inferences, but it is not because of their overall form, i.e. generalising from known data.
The problem is Feature X: not only is it highly subjective and ill-defined, it doesn’t crop up
anywhere else in the linguistic literature. In fact, it seems very reasonable to reject any linguistic
theory which is propped up to any degree on Feature X.
The key insight here, the importance of the things I employ in scientific inference, was first
made by Goodman (1965) with his ‘New Riddle of Induction’. Goodman’s illustrative example
is notoriously counter-intuitive, and involves a property called grue. Maybe, Goodman suggests,
emeralds are not green, but grue. What is grue? It is not the grue familiar to linguists, i.e. a
colour category which straddles parts of what English separates into green and blue, for example
the Japanese term ‘ao’. Goodman’s grue is much stranger: objects that are grue are green if
first observed before some point t in the future, e.g. next Tuesday, but blue if observed at any
time after that. Note that grue things don’t change colour: once observed as being green, they
will always be green, and the same goes for things observed as blue. Also, grue is falsifiable: if
we observe an emerald after next Tuesday and it turns out to be green, then it is not grue. And
so — strange as it may be — by being both observable and falsifiable, grue has some desirable
scientific qualities.
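To make the construction fully explicit, Goodman’s predicate can be written out as a schematic definition (a sketch of the informal gloss above, with t standing for the arbitrary future cut-off, e.g. next Tuesday):

Grue(x) =def (x is first observed before t and x is green) or (x is not observed before t and x is blue).

In this form grue is, logically speaking, as well-behaved as any ordinary predicate, which is exactly what makes the riddle uncomfortable.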
Goodman’s point is that, to date, every observation of emeralds is equally in line with the
generalisation that all emeralds are green, and that all emeralds are grue. Our intuitions about
the second generalisation are that it is unlikely to be well-received in peer review. Why? Who
cares, you may think: grue is a silly concept and has helped to re-confirm your opinion on
philosophers. But reserve judgement for a moment: grue is a silly concept, but it is hard to pin
down why it is silly, and the reason behind this turns out to be very important for the question
at hand, i.e. the nature and utility of linguistic categories.
Firstly, there’s nothing wrong with the form of ‘all emeralds are grue’ or ‘right-branching
relative clauses predict good meals’ as inferences: they have the same structure as any number of
reasonable generalisations. So, given that we should be happy with statements like ‘all emeralds
are green’, it looks like the problem lies with grue. This is unlikely to surprise the reader. Grue
sounds very much like the kind of thing only philosophers would talk about, and there seems to
be something decidedly strange about its construction. But, again, what? A standard response
is the weird time-dependence of grue: we don’t normally define things in terms of an arbitrary
cut-off in time, or other apparently arbitrary conditions. Goodman anticipated this, and showed
that when we appeal to the relative simplicity of definitions for terms like ‘green’ in comparison
to seemingly contrived ones like ‘grue’, this simplicity is language-specific: some other language
might take ‘grue’ as fundamental, and green as contrived.¹

¹ Goodman’s example goes as follows: we can create a new term, ‘bleen’, which is a counterpart to grue: blue if first seen before next Tuesday, or green thereafter. Once we have done this, we can provide a new definition of green: grue if observed before next Tuesday, or bleen thereafter. Again, this is silly, but if we claim to only be worried about the fact that grue has a time-dependent definition, this shows that blue and green are equally susceptible to these reservations. As such, given that definitions are language dependent anyway, I would need to justify that the specific language I am using is better than all others, which returns us to the same problem again. A possible rejoinder here is that grue requires a time-dependent definition, where green does not, but if my only definition of green is then something like ‘the colour that emeralds are’, I’m stuck with the same problem again: I could equally insist that emeralds are grue. Another rejoinder is that of definitional parsimony: grue has a more tortured definition than green — but, again, only in the language we are used to: as seen above, we can consider a language in which grue and bleen are primitive, and green and blue have the tortured definitions.

So what is wrong with grue? Goodman’s answer relates to how we use categories. Green,
as a category, has a solid pedigree: it has played a role in an enormous number of successful
past inferences; grue (as far as I know) has not. Goodman designated successful categories like
green as projectible predicates. Projectible predicates are categories which have a history of
supporting successful inferences, and are established over time.
The notion of projectible predicates was further developed by Quine (1969), who considered
the naturalness of kinds. He felt that both successful generalisations and the things we use to
make them tend to reflect the structure of the world. At the same time, he was quite liberal
in what he would accept as a respectable predicate. Quine saw projectibility in the context of
scientific progress: for him, there are any number of “gradations” (Quine, 1969, p.15) between
an “intuitive” notion of similarity and a “scientifically sophisticated one”. In this way, less
clear-cut predicates are essential steps in the process of developing more fleshed-out theories.
Ideally, a mature theory will eventually be subsumed into a more general body of science, i.e.
it should be commensurable, but this does not mean we should abandon less rigorous theories
because of their immaturity, or even that they lack utility once we have a more sophisticated
theory. Simpler ideas can be retained for use in some or even most contexts: “we all still say
that a marsupial mouse is more like an ordinary mouse than a kangaroo, except when we are
concerned with genetic matters” (p.15). If we take this perspective, the naturalness of some
category can depend on context, can be difficult to integrate with other scientific knowledge,
and can rest on unsettled notions of similarity, all as long as it continues to serve up useful
inferences. So we can say that neither grue nor Feature X look very natural at all, but this
is not because of their properties, i.e. that their definitions are weird, or badly delineated, or
subjective: they are un-natural because they appear to play no role in induction.
This is not, of course, the only take on natural kinds and their role in inference. However,
we have already reached some important conclusions: one, there are very good reasons to think
that (unlike class) naturalness isn’t an ‘either you have it, or you don’t’ kind of property. Two,
the best way to see the naturalness (or otherwise) of a kind is probably not as the thing which
determines what you can hope to do with that kind. Instead, naturalness can be something
conferred by what you have been able to do with that kind: a consequence, rather than a
restriction.
This seems a good juncture to focus on (slightly) more practical concerns, and think about
the kind of inferences scientists actually make: Godfrey-Smith (2003) draws out a further
connection between naturalness and induction which is relevant to our discussion here. Godfrey-
Smith notes a difference between two modes² of scientific inquiry, one statistical, and the other
causal/explanatory. We can see the distinction in the difference between questions like ‘how
many dogs bark’ on the one hand, and ‘how do dogs bark’ on the other. It turns out that
naturalness is largely irrelevant for statistical questions of the first type, but critical for causal
questions of the second type.

² Note that this is not intended to be exhaustive.

What matters for statistical-type questions is whether I can feel secure in my sampling
practices. So something like Feature X makes — in principle — an acceptable predicate, as
long as I can sample randomly from representative cultures. If I sample 30 cultures and enjoy
15 of their cuisines, I have at least some degree of justification in inferring that I like about 50%
of global cuisines. The details of how (and indeed what it would mean) to take independent
samples from cultures are of course important, but the naturalness of Feature X is largely
irrelevant to such concerns. Grue, on the other hand, is unsuitable for even this more restricted
kind of inference. It has semantic properties which intervene in our ability to randomly sample:
there is no way to collect data from the future. Intervening semantic properties do not have
to reference time: for example, defining Feature X as ‘cuisines I enjoy but which dragons and
aliens would not’ introduces its own problems for random sampling, connected of course to the
difficulties of surveying those populations, and it would end up having similar problems to grue.
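To see how little the naturalness of Feature X matters here, consider a minimal sketch of the estimate involved (assuming, purely for illustration, that the 30 cuisines really were sampled independently and at random): with 15 ‘successes’ out of 30, the point estimate is 15/30 = 0.5, and a standard normal-approximation 95% interval is roughly

0.5 ± 1.96 × √(0.5 × 0.5 / 30) ≈ 0.5 ± 0.18.

The arithmetic goes through however gerrymandered the predicate is; everything rides on the sampling assumptions, not on the respectability of Feature X.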
The second scientific mode Godfrey-Smith considers does rely on the naturalness of a kind.
As an example, let’s say you want to explain what happens in a cloud prior to a lightning
discharge. Understanding the way electric charge builds up is probably an important part
of any relevant explanation, and knowing that this is mediated by electrons is even better.
Electrons have a number of invariant properties: they carry a negative charge, they have a
specific rest mass, and so on. Because of this, I can assume that an electron in my laboratory
is the same as an electron in a cloud overhead, or in a cloud that Genghis Khan looked at, or
in the atmosphere of Venus. And, if that is true, I don’t need to resort to random sampling:
I can create artificial lightning in my laboratory, and can infer that the conditions required
for lightning in my laboratory are very similar to the conditions overhead or on Venus. If
electrons had less reliable properties, on the other hand, I would need to be much more careful
with my inferences. This kind of inference is where both grue and Feature X would encounter
big problems: as Godfrey-Smith (2011, p.43) points out, believing in the naturalness of grue
would imply “a large collection of very strange beliefs about chemistry, history, and light”,
and any inference about the role of Feature X in language would require a rather unorthodox
understanding of how the causal mechanisms involved in my taste in food interact with the
linguistic norms of a large number of far-flung communities.
So where does this leave us? Categories are important, but it is also important to know what
we want to do with them. Some categories, such as grue, are so pathological that they appear
to be entirely ill-suited for scientific use, but it is not clear how often they turn up outside of
contrived philosophical thought-experiments. Many other categories, on the other hand, are
still useful in the absence of particularly rigorous definitions, never mind an unshakeable faith
in their objective, concrete existence.
This is for two main reasons: firstly, if a category appears to be doing a reasonable job in
supporting successful inferences, and doing this in the face of some degree of uncertainty over
its naturalness, then that is a good reason to keep it around — for the time being at least.
Agronomists can tell you what to plant, geologists have a good idea of where to look for oil,
and climate scientists can make detailed predictions about the consequences of over-exploiting
natural resources, and all this despite the fact that no two instances of soil, rock or weather are
ever identical.
Secondly, as long as their definitions aren’t weird enough to intervene in random sampling
(and very few are anywhere near this weird, evidenced by our intuitions regarding definitions
which reference things like the future, dragons, or aliens), then almost any category will suffice
for this less demanding form of inference: if all I want to do is guess whether I’ll enjoy the next
song on the radio, it doesn’t really matter that classifying music into genres which I do and do
not like is a hopelessly inconsistent and subjective act.
Finally, however, there are categories which we want to integrate into causal or mechanistic
explanations: for these, we should require some degree of faith in their naturalness, i.e. that the
category entails some set of invariant properties which allow us to extend generalisations even
from extremely limited data. And presumably we do want to integrate linguistic categories into
causal explanations of language and its use.
Returning to linguistic categories, the question is now less ‘do cross-linguistic categories
exist’, and more ‘what do I want to do with cross-linguistic categories’. This looks close to
Haspelmath’s proposal: descriptive categories for individual languages, comparative concepts
for cross-linguistic study. But this misses the point: if comparative concepts as construed by
Haspelmath are like Godfrey-Smith’s statistical predicates — good for estimating global counts
of features but incompatible with causal explanations — then the language sciences are still in
deep trouble. And, as Haspelmath describes them, cross-linguistic comparative concepts are
not even fit for that purpose, as they do not support any predictions at all. The crux of the issue
remains the same, i.e. the naturalness of cross-linguistic categories, which admittedly don’t give
the impression of being very electron-like. At this point, however, it is worth turning to natural
kinds outside of fundamental physics, and looking at some cases in biology and chemistry.

4 Natural kinds in biology and chemistry

The idea of species in biology has underwritten an enormous amount of scientific work, so it
might come as a surprise to many linguists that the best way to define the species concept
is controversial even now, and that despite Haspelmath’s assertion that species constitute a
paradigmatic example of a natural kind, this too is highly contentious.
The first thing to note here is that biology is doing pretty well despite these apparently shaky
ontological foundations. The discipline is relatively mature and has an extraordinary record of
success: anyone who cited metaphysical reasons for changing the way molecular geneticists do
their work, or zoologists theirs, would almost certainly be ignored (or worse). While the exact
relationship between different branches and fields of biology may be extremely complicated,
nobody seriously doubts that they form parts of a larger, integrated whole. Another factor
may be the perceived concreteness of the subject matter of biology: it is hard to dispute the
existence of lions and fruit-flies, and DNA and the genetic code have been established beyond
all possible doubt as the primary physical basis for inheritance and evolution. But, despite all
this, a satisfactory and all-encompassing definition of species in biology has proved elusive.
So what are species then, and why aren’t they a good candidate for natural kinds as tra-
ditionally construed? The problem here lies with variation. If variation were not important,
then membership of a species might simply entail the existence of a species-specific essence,
possessed by all members of that species and not by any other species, and which would provide
us with something akin to a single infallible diagnostic test for species membership. If this were
so, then establishing a scientific definition of a lion, for example, would simply require us to
stake out a set of lion-specific features. Anything with one or more of these features would be
a lion, and anything without them would not.
Darwin (1859) dispelled such notions by placing variation at the centre of his explanation of
the origin of species. For evolution and speciation to proceed in a living population, variation
must always be present. And in a large enough population, we might expect some degree of
variation in any feature which distinguishes, for example, lions from other animals, whether this
feature is a lion-specific sequence of DNA or some physiological or behavioural trait particular
to lions. In practical terms, of course, it is very likely that — given the enormous length
of the genetic code — there are multiple sequences of DNA which are found in all lions and
no other creatures, but this raises further problems. First, we should still be open to the
potential for variation: a point mutation in an individual would not suddenly disqualify it from
species-membership. Second, although these shared sequences are the consequence of shared
evolutionary history, they will almost certainly be very difficult to associate with any important
(or even interesting) aspects of being a lion. To identify a potentially inconsequential stretch of
genetic code as the essential factor conferring lionhood seems to miss the point entirely, in the
same way that the essential thing about you is not your fingerprint.
Moreover, many features of lions are shared by other species: it is important to have a heart
to be a lion, but it is important for lots of species to have a heart, so we can’t say that heart-
ownership is an essential property of being a lion any more than it is of being a vertebrate or a
squid. This is even more pertinent when we consider closely-related species, such as lions and
leopards, and more so again in populations which are undergoing speciation; the boundaries
between species are very often indistinct, and deciding on a boundary at all can depend on
which criteria are applied. Ultimately, given that the properties associated with any species
are both subject to variation and potentially shared with other species, it is hard to escape the
conclusion that this kind of essentialism is not really a helpful way to think about species.
Nevertheless, biologists have made numerous attempts to establish the scientific credentials
of a species concept in some other way which takes variation into account. This includes propos-
als such as biological species (individuals are able to interbreed), ecological species (individuals
share an adaptive environment), phylogenetic species (shared evolutionary history), phenetic
species (morphological, behavioural, or genetic similarity) and more (see De Queiroz, 2007 for
one of many surveys). These all come with their own set of problems (see Ereshefsky, 1992;
Sterelny, 1994), but there is no space for a thorough treatment here, and settling on the best
definition of species is not our primary concern in any case. Rather, we want to know whether
species are something like natural kinds, and — if not — how they have managed to be so sci-
entifically successful. Again, the literature on this topic is divided: many contend that species
are better construed as lineage-bound individuals than kinds, others as sets, or as a re-purposed
version of natural kinds with historical rather than property-related essences. Each of these
(especially those which stress the importance of lineage and history) has some parallel with
natural language, but arguably the most applicable framework is homeostatic property cluster
theory (Boyd, 1999), to which we will now turn.
As we’ve seen above, an evolutionary understanding of species is one in which natural varia-
tion is a requirement of any good explanation, not a problem to be overcome. The homeostatic
property cluster view on species — that a given species is defined by a fuzzy, and possibly
overlapping set of properties which are nonetheless homeostatic, i.e. which remain coherent
over time and space — captures this feature, with the additional proviso that we should pay
attention to the mechanisms, both synchronic and over evolutionary time, internal and external
to the organism, which serve to maintain these clusters of properties. Whether this constitutes
a bona fide natural kind is (of course) open to debate, but it does permit for inference: “the
similarities among the members of a kind must be stable enough to allow better than chance
prediction about various properties of a kind”, and “species are kinds and kinds are ultimately
similarity-based classes that play a role in induction” (Ereshefsky, 2017). And what it also gives
us is a distinction between appearance — the observable features we treat as symptomatic of
species membership — and the underlying causal mechanisms which explain those symptoms.
Before moving on, it is worth noting that these kinds of difficulties are by no means restricted
to the status of species in biology. In chemistry, for example, even the seemingly innocuous
statement ‘water is H₂O’ is problematic (VandeWall, 2007). Different isotopes of both hydrogen
and oxygen, atoms of which have the same number of protons and electrons but vary in their
count of neutrons, nevertheless have very similar chemical properties: they can and do combine
in nature to make different kinds of water molecules; water found in nature always consists of a
mixture of molecules of these different types; water in nature always contains other impurities. If
we take water to be defined by its micro-structural properties, then, we have to admit variation:
water is any one of many selections of isotopes of hydrogen and oxygen combined in the correct
configuration. Similarly, taking water to be the stuff we interact with on a daily basis also needs
to account for the fact that it is a mixture of isotopes and impurities. Perhaps we could require
that water as a natural kind refers only to H₂O which consists only of the most common isotopes
of hydrogen and oxygen, or some completely impurity-free distillation of water in a laboratory
somewhere, but the former would exclude vast amounts of water in the universe, and the latter
would entail that all water found in nature — the substance studied by biologists measuring
water loss via transpiration in leaves, or water intake by camels at different temperatures — is
somehow not the natural kind water. The same holds, of course, for gold too, albeit to a lesser
degree: gold has many isotopes (although only one is stable), and is never completely free of
impurities. Variation in a kind is never good evidence that it is not a natural kind.

5 Response to Haspelmath

We can return to the central claim made by Haspelmath (2018): that natural kinds do not
exist in language, and that we should instead make a principled distinction between ‘social
categories’ (i.e. language-specific descriptive terms which allow for induction within a single
language but not between languages), and ‘comparative concepts’ (which allow for comparison
across languages but not induction).
Haspelmath states that natural kinds can be recognised (pp.83, 90, 98), without needing to be
defined in any “necessary or jointly sufficient” way. This is hard to interpret out of context: how
can we reliably recognise something unless it has some necessary or essential properties? For this
reason, it is helpful to look at the source of Haspelmath’s understanding of natural kinds. This
is not found in the philosophical literature, but in work from the generative linguistic tradition,
and more specifically in Haspelmath’s reaction to that work. Generative linguists, Haspelmath
argues, think that linguistic categories exist independently of observation: they think they
are natural kinds. For Haspelmath, however, the variation and indeterminate boundaries of
linguistic categories mean that they cannot be natural kinds. Leaving this point aside for a
moment, we will first take a look at why Haspelmath places such an emphasis on recognition
rather than definition: interestingly, Haspelmath appears to have inherited his ideas about the
appropriate way to treat bona fide natural kinds from the way he thinks generative linguists
erroneously treat linguistic categories.
Haspelmath cites Zwicky (1985), who points out the difference between linguistic definitions,
i.e. exhaustive lists of properties, and linguistic tests, i.e. diagnostic tools used to judge category
membership. Zwicky observes that linguists often confuse definitions with tests, but that “in
the case of terms (like ‘word’ and ‘clitic’) which function as theoretical primitives, only lists of
symptoms can be provided” (Zwicky, 1985, p.285). Lists of symptoms are necessary because
‘theoretical primitives’ cannot be defined in terms of their relationship with other parts of
the theory (so we can’t define clitics as things which optionally affix words), and can’t be
decomposed into smaller theoretical entities (so we can’t define the syntactic word in terms
of smaller syntactic parts). But whether primitives really exist in some irreducible form (as
part of our minds, perhaps, or up in Platonic heaven), or if the point is more that they are
irreducible within the context of a particular theory, Zwicky makes his argument because of
the unpredictable behaviour of words and clitics cross-linguistically: we can’t use definitions
(again, lists of properties) because their properties are different in each language, but we also
can’t define them in reductive terms because they are primitives. This being the case, and
assuming of course that such primitives really do exist, our best bet at identifying something as
a word or a clitic is to resort to diagnostic criteria, which Zwicky (1977, p.95) himself states is
not meant “to suggest that linguistic litmus tests are guaranteed to decide any question that is
put to them.”
There are two problems here. The first is that when Haspelmath equates irreducible theo-
retical primitives with natural kinds — whose naturalness is not conferred by irreducibility in
any case — his conclusion is that this kind of “diagnostic fishing”(p.102), while inappropriate
for linguistic categories, really is the appropriate way to recognise natural kinds. But this runs
counter to Zwicky, who argues that diagnostic criteria are all we have in such cases. The sec-
ond problem is that Haspelmath wants to treat genuine natural kinds as something linguistic
categories are not, rather than something they are.
Whatever the provenance of his ideas, it is clear that, for Haspelmath, natural kinds are
both distinct and invariable. If something does not ‘recur... with identical properties’ (p.109),
then it is not a natural kind. This shines a light on Haspelmath’s insistence that natural kinds
only require “detailed descriptions and agreement on a label but not a definition” (Haspelmath,
2018, p.90): because they are distinct and invariable, perhaps there is no need to list all of their
properties, just enough to pick them out in the world. The philosophical position this most
closely resembles is ‘scientific essentialism’ (see Kripke, 1972; Putnam, 1979), which requires
that natural kinds share a collection of necessary (but not sufficient) properties with all and
only other members of that kind, where these properties might be micro-structural, intrinsic,
or modally necessary (Khalidi, 2013).
But as we have seen, Haspelmath’s biological exemplars of natural kinds, such as red fox
and tuberculosis, are very hard to reconcile with an essentialist view. It is simply a fact that
individuals drawn from any biological species show, or have the potential to show, variation in
any of their properties; that species membership is often unclear; and that settling on a list of
essential descriptive properties is fraught with difficulties. This state of affairs is not restricted
to biology: similar issues are found across the sciences, for example with geological categories
like ‘quartz’ and ‘metamorphic rock’, or ‘cumulonimbus’ and ‘cirrostratus’ in meteorology, all
of which have played a part in innumerable explanations and predictions. And if we insist on a
natural kind role for species, or rocks, or clouds, then we should see these too as cluster-kinds,
and focus on trying to understand what the clustered properties are, and which mechanisms
sustain them. To sum up in the interim: Haspelmath’s picture of natural kinds is inappropriate
for species (as are most flavours of essentialist interpretation) and, as we have seen, not even
gold or water would be unproblematic.
On the other hand, if we really are meant to interpret Haspelmath’s natural kinds in the
way he claims they are employed in generative linguistics, i.e. requiring only flexibly applied
diagnostics, then it is hard to find any parallels in the scientific or philosophical literature,
except perhaps for Justice Potter Stewart’s threshold test for obscenity: “I know it when I see
it” (Jacobellis v. Ohio, 378 U.S. 184, 1964).
This leaves us with two alternatives. Haspelmath could retreat to a strict essentialist take
on natural kinds which only exist in the domain of physics and (maybe) chemistry: his claims
regarding language do not rest on the status of species, after all. But, if this is how we choose
to slice up the world, this begins to place some serious limitations on the domain of inferential
science, i.e. physics and perhaps a smattering of basic chemistry. And if biologists and chemists
and agronomists are able to operate in the absence of natural kinds, this seems much less of a
crushing blow to the language sciences. Another tack would be to accept something along the
lines of homeostatic property cluster theory for species, but not for linguistic categories, but
this would require solid evidence of the absence of any clustered properties across languages:
more on this later.
Next, we can turn to the distinction between natural and social kinds. This is a topic
which has been discussed in the philosophical literature (see Khalidi, 2013), typically in relation
to the social sciences, where categories such as ‘criminal’, ‘narcissist’, and ‘president’ would
fail an essentialist test of naturalness: such terms can be (and are) variably interpreted, and
with significant variation over time and space even within a specific culture. This cuts against
Haspelmath’s view of social kinds as being a culture-specific version of natural kinds, and
objectively real in a way which comparative concepts are not. More pressing than this, the
subsuming cultural context is also hostage to such problems: if mountains are regarded as
hopelessly ill-defined, what then of something like ‘French culture’, which is surely even more
negotiable and a matter of agreement? However, to reiterate, broader notions of naturalness are
possible, both as suggested by Quine and as we have seen in biology, so there is no reason to
assume that this cannot be the case in the social sphere too.
In anthropology, the validity of cross-cultural social categories remains highly controversial
to this day, but even questionable cultural designations such as ‘clan’ appear to be sufficiently
well-defined to inform some debates, but not well enough for others,³ and there is no requirement
for categories to be useful across all contexts. And, natural or not, the utility of some kind
does not require clear-cut, exceptionless definitions, but rather a track record of being used
in successful inferences. In fact, when a category consistently plays some role in a number of
apparently fruitful predictions or causal explanations despite being somewhat rough around the
edges, this seems like a very good reason to investigate it further, rather than just dismiss it
out of hand.

³ Thanks to Ian Keen for this observation.

What, then, of Haspelmath’s comparative concepts? These have the odd feature of being
grue-by-design, or even ‘the best grue money can buy’: they are avowedly and intentionally bad
for inference, are “not psychologically real, and they cannot be right or wrong” (Haspelmath,
2010a). The main issue for Haspelmath, in this case, is what he sees as the arbitrariness of the
way they are “defined rigorously and delimited from similar phenomena” (Haspelmath, 2018),
for example in the distinction between a mountain and a hill. Another worry is that they
may genuinely belong to quite distinct phenomena, such as the wings of a bat and the wings of a
magpie.
There are two things to be said here. Firstly, despite Haspelmath’s pronouncement that a
geologist gains no information upon learning that some landform is a mountain, there is a great
deal of knowledge this could impart: mountains are formed via a limited number of processes
of plate tectonics and subsequent erosion, they have predictable effects on the climate and
environment, they foster similar patterns of ecology, and certain types of rocks and minerals are
more likely to be brought to the surface depending on the context of their formation and region.
There is a whole science dedicated to orogeny, the process of mountain formation. The label
‘mountain’ allows us to make any number of reasonable inferences about the history, makeup,
and interactions of a specific mountain.
Similarly, despite the very different lineages of bat and magpie wings, we can make inferences
about their shared functions and mechanisms, and the similar adaptive environments they may
have faced. There are further complications: bat wings and magpie wings are different as wings,
but not as tetrapod forelimbs: they are homologous (i.e. the product of shared ancestry) in some
ways, but not in others.⁴ Finally, perhaps the problem here is one of determinacy: designating
something as a ‘mountain’ never allows me to logically deduce, with 100% certainty, the existence
of a specific feature possessed by it and everything else in the universe which one might call a
mountain, but not by any non-mountains. But neither is this the case for a biological species,
or even a sample of water or gold, for all the reasons discussed above.

⁴ Thanks to Kim Sterelny for pointing this out; see also Bromham, 2019, this issue.

Haspelmath argues that linguistic categories “are not natural kinds, because they do not
recur across languages with identical properties” (Haspelmath, 2018, p.109, my emphasis). We
have seen, however, that the presence of variation is not a deciding factor in establishing the
naturalness of a kind. As Evans and Levinson (2009, p.446) remark, languages tend to be
“characterized by clusters around alternative architectural solutions” and, as Dahl (2016) points
out, linguistic categories which are variable and require rigorous definitions and knowledge of
similar phenomena are entirely compatible with a cluster-kind analysis in the very same way
that species are. This being the case, what matters then is i) showing that such clusters exist,
ii) showing that they are persistent, and eventually iii) explaining why they are persistent. I
will discuss this in the final section.
But perhaps it doesn’t matter: if Haspelmath’s comparative concepts are sufficiently well-
designed for what Godfrey-Smith (2011) calls statistical inference of the form ‘how many X are
Y’, e.g. ‘how many languages are verb-initial’, they can be almost anything. The problem is that
Haspelmath sees a fundamental, rather than methodological, distinction between comparative
concepts and natural and cultural kinds: his message is that even the weaker form of inference
described above is more than we can hope for. Cross-linguistic comparison, and indeed any
form of induction involving comparative concepts, is incompatible with meaningful induction,
and nothing you observe about the categories in one language will tell you anything about the
categories in another. But although Haspelmath’s position is motivated by empirical observa-
tion, it is argued on the basis of theory, and the theory is unrealistically restrictive. Essentialist
natural kinds are few and far between: a great deal of science is done using more complex
categories with variable properties, where these categories blend into each other. Haspelmath
wants to fence off social categories, but in practice these are used to make all sorts of meaningful
comparisons; in the same way, categories which he relegates to the status of comparative con-
cepts play a productive part in inferences, explanations, and predictions across the sciences. Of
course, it bears repeating that none of these categories or kinds are guaranteed to be good ones,
in the sense of being accurate descriptions of nature. But that is an empirical matter, not
a judgement to be made in advance by carving the world up into things which do or do not
meet some set of criteria.
Haspelmath is not against the idea of, for example, theoretical linguists working with cross-
linguistic comparative concepts, but it’s hard to understand why anyone would want to. As
Haspelmath remarks, “different languages represent historical accidents and (unless they influ-
enced each other via language contact or derive from a common ancestor) the categories of one
language have no causal connection to the categories of another language,” and “there may
not be anything special to learn about such historically accidental phenomena anyway, beyond
their exhaustive description” (Haspelmath, 2018, pp.85,94). The historical point here is both
valid and important, especially by analogy with biological evolution. As Evans (2016) argues,
an evolutionary perspective should play an important role in linguistic typology, which means
that many similarities between languages can and should be understood as the result of shared
ancestry.
Adopting an evolutionary perspective, however, has further benefits: we can also learn a
great deal from similarities between unrelated languages, or between related languages when
historical explanations are found wanting. Languages are a cultural phenomenon, but they are
used by humans embedded in the practicalities of the world, and with somewhat convergent
needs and capacities. If we see this cultural phenomenon as one which evolves under the
constraints of a shared cognitive, physical, and social environment, we can certainly resist the
idea that all linguistic features are no more than historical accidents. By this light, clusters are
meaningful, both in terms of the mechanisms which maintain the cohesion of similar properties
across time and space, as well as those which drive variation.

6 A way forward

I am arguing against the principled distinction made above, i.e. that the world is cleanly divided
between natural kinds, descriptive categories, and comparative concepts. These distinctions,
similarly to the natural language categories which motivate Haspelmath, are themselves in-
determinate: both cultural constructs and cross-cultural generalisations can, under the right
conditions, stand in for natural kinds and allow us to make inferences, predictions, and gener-
alisations. However, this does not mean that anything goes, category-wise: far from it. Good
categories earn their status: “in induction nothing succeeds like success” (Quine, 1969). Estab-
lishing the respectability of cross-linguistic categories is an empirical matter. As we have seen
above, there are a number of ways to go about this:
1) Do cross-linguistic categories have a history of being used in successful inferences? The
jury is out, but the apparent presence of statistical linguistic universals (in the sense of Greenberg,
1974), together with a wealth of work in linguistics, psycholinguistics, and the cognitive sciences,
seems to point in this direction; a toy sketch of this style of inference is given below.
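To give the flavour of such an inference, here is a minimal sketch (in Python) of counting a Greenberg-style word-order correlation over a sample of languages. The language names, features, and codings are all invented for illustration; a real test would of course require real typological data and a properly balanced sample.

```python
# A minimal sketch of the kind of statistical inference at issue: given a
# (hypothetical) table of languages coded for two features, how often does
# verb-object order co-occur with prepositions rather than postpositions?
# The data below are invented for illustration, not real typological codings.
from collections import Counter

toy_sample = [
    # (language, verb_object_order, adposition_type) -- all values hypothetical
    ("Lang A", "VO", "preposition"),
    ("Lang B", "VO", "preposition"),
    ("Lang C", "OV", "postposition"),
    ("Lang D", "OV", "postposition"),
    ("Lang E", "VO", "postposition"),
    ("Lang F", "OV", "postposition"),
]

counts = Counter((order, adp) for _, order, adp in toy_sample)
vo_total = sum(n for (order, _), n in counts.items() if order == "VO")
vo_prep = counts[("VO", "preposition")]

print(f"VO languages with prepositions: {vo_prep}/{vo_total}")
```

The point is not the arithmetic, which is trivial, but that inferences of this form only make sense if the comparative concepts 'verb-object order' and 'adposition' are allowed to do some inductive work in the first place.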
2) Do linguistic categories have clustered properties? Much hangs here on what constitutes
a property, and whether these can avoid similar accusations of being post-hoc contrivances. Ide-
ally, we could arrive at some set of theory-neutral, universally accepted criteria, but this seems
uncharacteristic of linguistics as a field. Haspelmath appears to be content for semantics to play
the role of a universally applicable framework for comparison, and has even worked on semantic
role clustering at the level of individual verbs (Hartmann, Haspelmath, & Cysouw, 2014; see
also Croft & Poole, 2008; Zwarts, 2008). These results do provide evidence of clustering, but
the variation across languages is taken as evidence against linguistic categories: we have seen
that this is not necessarily so.
However, when Haspelmath states that “it is universally recognized that, ultimately, linguistic
categories must be defined in structural terms” (p. 109), the implication is that giving a partially
semantic (as opposed to purely grammatical) basis for linguistic categories somehow counts
against their independent existence. But there is no particular reason to believe that gram-
matical systems must be completely autonomous, and defined wholly in terms of themselves.
Partially reductive analyses of linguistic categories do not disqualify them from being com-
parable cross-linguistically, and in fact both Canonical Typology (see Brown, Chumakina, &
Corbett, 2012; Round & Corbett, 2019, this issue) and Distributional Typology (see Bickel,
2013) do use a combination of semantic and grammatical definitions, where in the latter case
the door is left open for psychological, genetic, and anthropological ones too (p.21).
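To make the clustering question a little more concrete, here is a minimal sketch of one way such a check might look: compute how often candidate properties co-occur across a set of languages, and ask whether some subsets of properties hang together. The property names, languages, and codings below are all hypothetical, and a serious analysis would look more like the multidimensional scaling of Croft and Poole (2008) or the microrole coexpression study cited above.

```python
# A minimal sketch of checking whether putative noun-like properties cluster
# together across languages. The feature matrix is invented for illustration;
# property names and codings are hypothetical, not drawn from any real survey.
from itertools import combinations

# rows: languages; columns: does the candidate word class show this property?
properties = ["takes_plural", "takes_case", "heads_referring_phrase", "takes_tense"]
toy_matrix = {
    "Lang A": [1, 1, 1, 0],
    "Lang B": [1, 0, 1, 0],
    "Lang C": [0, 1, 1, 0],
    "Lang D": [1, 1, 1, 1],
    "Lang E": [0, 0, 1, 0],
}

def jaccard(i, j):
    """Proportion of languages showing both properties, among those showing either."""
    both = sum(1 for row in toy_matrix.values() if row[i] and row[j])
    either = sum(1 for row in toy_matrix.values() if row[i] or row[j])
    return both / either if either else 0.0

for i, j in combinations(range(len(properties)), 2):
    print(f"{properties[i]} ~ {properties[j]}: {jaccard(i, j):.2f}")
```

Pairwise co-occurrence is of course the crudest possible measure; the point is only that 'clustered properties' is an empirically checkable claim rather than a matter of stipulation.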
3) Do clusters persist? This is a tricky question. There are no historical records for most
contemporary languages, and any data gleaned from reconstructions of ancestral languages
is open to reinterpretation. Today’s languages of the world can give us a snapshot of what
is possible, but due care should be taken to control for genetic relatedness and the effects of
borrowing and mixing, and it is impossible to know whether today’s languages are representative
of all the languages that have existed, will exist, or could exist (enter Hume’s ghost). But
evidence of clustering across unrelated and spatially separated languages would certainly count
as evidence for persistence.
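As a minimal illustration of the kind of control just mentioned, the sketch below counts a feature's frequency only after sampling one language per family, so that a single large family cannot inflate the tally. The families, languages, and codings are invented for illustration; real work would also need to address areal effects and sample balance.

```python
# A minimal sketch of one standard precaution: sample a single language per
# family before counting, so that large families do not inflate the apparent
# frequency of a feature. Families and feature values here are hypothetical.
import random

toy_languages = [
    # (language, family, has_feature) -- invented for illustration
    ("Lang A", "Family 1", True),
    ("Lang B", "Family 1", True),
    ("Lang C", "Family 1", True),
    ("Lang D", "Family 2", False),
    ("Lang E", "Family 3", True),
    ("Lang F", "Family 3", False),
]

random.seed(0)
by_family = {}
for lang in toy_languages:
    by_family.setdefault(lang[1], []).append(lang)

sample = [random.choice(langs) for langs in by_family.values()]
proportion = sum(1 for _, _, has in sample if has) / len(sample)
print(f"Feature frequency in the family-stratified sample: {proportion:.2f}")
```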
4) Why do clusters persist? If we find ourselves in the happy position of having positively
answered all the questions above, there is plenty of work left to do: we need to identify the
homeostatic mechanisms which keep clusters together. This would require an explanation in the
spirit of Tinbergen (1963), entertaining the possibility of both biological and cultural evolution
under cognitive, social, and environmental constraints, with reference to mechanisms and with
a role for development.
On the other hand, every cross-linguistic category may ultimately reduce to its underlying
semantic, biological, or anthropological properties; there may genuinely be no clustering, or any
clustering may be ephemeral or the product of some more scientifically satisfying mechanism.
But, as Quine (1969, p.22) remarks, “we can take it as a very special mark of the maturity of a
branch of science that it no longer needs an irreducible notion of similarity and kind”: linking
into a larger and more general body of knowledge need not invalidate the original theory, in the
same way that our present knowledge of genetics does not deny the existence of species, but helps
us to better understand them. Finally, however these matters are eventually resolved, we should
take advantage of the parallels with biology: the ubiquity of variation and the elusiveness of
neatly defined linguistic categories are not marks of inadequacy, but hints at the fundamental
nature of language, and at the utility of an evolutionary perspective on it.

7 Acknowledgements

Thanks to Nick Evans and Kim Sterelny for setting the ball rolling on this project, as well as
their guidance, commentary and feedback throughout; thanks also to Lindell Bromham and Ian
Keen for their insightful reviews, and to Bruno Ippedico, Hedvig Skirgård, and Kevin Stadler for
their comments on an earlier version of this paper; thanks to Ron Planer and Fausto Carcassi
for their invaluable philosophical input. This project was supported by a Postdoctoral Research
Fellowship with the ARC Centre of Excellence for the Dynamics of Language.

References

Bickel, B. (2013). Distributional Typology: statistical inquiries into the dynamics of linguistic
diversity. In The Oxford Handbook of Linguistic Analysis (pp. 901–923).
Boyd, R. (1999). Homeostasis, Species, and Higher Taxa. In Species (pp. 141–185). MIT Press.
Bromham, L. (2019). Comparability in evolutionary biology: the case of Darwin’s barnacles.
Linguistic Typology.
Brown, D., Chumakina, M., & Corbett, G. G. (2012). Canonical Morphology and Syntax. Oxford University Press.
Bybee, J. L. (1989). The creation of tense and aspect systems in the languages of the world. Studies in Language, 13 (1), 51–103.
Croft, W., & Poole, K. T. (2008). Inferring universals from grammatical variation: Multidi-
mensional scaling for typological analysis. Theoretical Linguistics, 34 (1), 1–37.
Dahl, Ö. (2016). Thoughts on language-specific and crosslinguistic entities. Linguistic Typology,
20 (2), 427–437.
Darwin, C. (1859). On the Origin of Species.
De Queiroz, K. (2007). Species Concepts and Species Delimitation. Systematic Biology, 56 (6),
879–886.
Ereshefsky, M. (1992). The units of evolution: Essays on the nature of species. MIT Press.
Ereshefsky, M. (2017). Species. In The Stanford encyclopedia of philosophy.
Evans, N. (2016). Typology and coevolutionary linguistics. Linguistic Typology, 20 (3), 505–
520.
Evans, N., & Levinson, S. C. (2009). The myth of language universals: language diversity and
its importance for cognitive science. Behavioral and Brain Sciences, 32 (5), 429–494.
Godfrey-Smith, P. (2003). Goodman’s problem and scientific methodology. Journal of Philos-
ophy, 100 (11), 573–590.
Godfrey-Smith, P. (2011). Induction, Samples, and Kinds. Carving Nature at its Joints: Topics
in Contemporary Philosophy, Volume 8, 33–52.
Goodman, N. (1965). Fact, Fiction, and Forecast (2nd ed.). Bobbs-Merrill.
Greenberg, J. (1974). Language typology: a historical and analytic overview.
Hartmann, I., Haspelmath, M., & Cysouw, M. (2014). Identifying semantic role clusters and
alignment types via microrole coexpression tendencies. Studies in Language, 38 (3), 463–484.
Haspelmath, M. (2007). Pre-established categories don’t exist: Consequences for language
description and typology. Linguistic Typology, 11 (1), 119–132.
Haspelmath, M. (2010a). Comparative concepts and descriptive categories in crosslinguistic
studies. Language, 86 (3), 663–687.
Haspelmath, M. (2010b). The interplay between comparative concepts and descriptive cate-
gories (Reply to Newmeyer). Language, 86 (3), 696–699.
Haspelmath, M. (2011). The indeterminacy of word segmentation and the nature of morphology
and syntax. Folia Linguistica, 45 (1), 31–80.
Haspelmath, M. (2018). How comparative concepts and descriptive linguistic categories are
different. In D. Van Olmen, T. Mortelmans, & F. Brisard (Eds.), Aspects of linguistic
variation. De Gruyter.
Huddleston, R., & Pullum, G. K. (2002). The Cambridge Grammar of the English Language.
Cambridge University Press.
Hume, D. (1740). A Treatise of Human Nature.
Khalidi, M. A. (2013). Kinds (Natural Kinds vs. Human Kinds). Sage.
Kripke, S. (1972). Naming and Necessity. In Semantics of natural language (pp. 253–355).
Dordrecht: Springer.
Lazard, G. (2006). More on counterfactuality, and on categories in general. Linguistic Typology,
10 (1), 61–66.
Mayr, E. (1994). Typological versus population thinking. In Conceptual issues in evolutionary
biology (pp. 157–160).
Putnam, H. (1979). The meaning of “meaning”. In Philosophical papers 2. Cambridge University
Press.
Quine, W. V. O. (1969). Natural Kinds. In Essays in honour of Carl Hempel (pp. 5–23).
Round, E., & Corbett, G. G. (2019). Comparability and measurement in typological science:
the bright future for linguistics. Linguistic Typology.
Sterelny, K. (1994). The nature of species. Philosophical Books, 35 (1), 9–20.
Tinbergen, N. (1963). On aims and methods of Ethology. Zeitschrift für Tierpsychologie, 20 ,
410–433.
VandeWall, H. (2007). Why Water Is Not H2O, and Other Critiques of Essentialist Ontology
from the Philosophy of Chemistry. Philosophy of Science, 74 (5), 906–919.
Weinberg, S. (1998). The Revolution That Didn’t Happen. New York Review of Books, 45 (15),
48–52.
Zwarts, J. (2008). Commentary on Croft and Poole, Inferring universals from grammatical vari-
ation: Multidimensional scaling for typological analysis. Theoretical Linguistics, 34 (1),
67–73.
Zwicky, A. M. (1977). Litmus tests, the Bloomfieldian counterrevolution, and the correspondence
fallacy. In Second annual linguistic metatheory conference. Michigan State
University.
Zwicky, A. M. (1985). Clitics and Particles. Language, 61 (2), 283.
