
Language Acquisition, Computational Phylogenetics, Creole Languages and Family values


Michel DeGraff, Robert Berwick & Trevor Bass

Table of contents:
ABSTRACT
1. FAMILY VALUES?
2. TERMINOLOGICAL AND CONCEPTUAL BACKGROUND
    2.1. A BRIEF HISTORY OF “SIMPLIFICATION” AND “BREAK OF TRANSMISSION” IN CREOLE STUDIES
    2.2. SIMPLIFICATION: CREOLES AS CONTEMPORARY “FOSSILS OF LANGUAGE” WITH EXCEPTIONAL “RIGOR AND SIMPLICITY”?
    2.3. BREAK OF TRANSMISSION: CREOLES AS LANGUAGES OUTSIDE STAMMBAUMTHEORIE?
3. CREOLES ARE NOT EXTRAORDINARILY SIMPLIFIED “CONVENTIONALIZED INTERLANGUAGES OF AN EARLY STAGE”
    3.1. EMPIRICAL AND THEORETICAL EVIDENCE AGAINST THE EARLY-INTERLANGUAGE HYPOTHESIS (EIH)
    3.2. ARCHIVAL AND SUBSTRATAL EVIDENCE AGAINST THE EIH
    3.3. CREOLES DEFINITELY DO NOT, AND COULD NOT, STAND OUT AS “CONVENTIONALIZED INTERLANGUAGES OF AN EARLY STAGE”
4. WHEN COMPUTATIONAL PHYLOGENETICS IS NOT ABOUT EVOLUTIONARY HISTORY
    4.1. METHODOLOGICAL PROBLEMS IN THE CREOLE TYPOLOGY HYPOTHESIS: PROBLEMATIC ASSUMPTIONS AND CIRCULARITY OF DEFINITIONS
    4.2. WHAT IS COMPUTATIONAL PHYLOGENETICS? BASIC CONCEPTS, SCOPE AND LIMITATIONS
    4.3. METHODOLOGICAL PROBLEMS IN BAKKER ET AL 2011: BIASES IN FEATURE AND LANGUAGE SELECTION
        4.3.1. Fundamental issues in statistics as regards (un)biased samples
        4.3.2. Feature- and language-related biases in Bakker et al 2011 and Daval-Markussen & Bakker 2012
        4.3.3. Biases inherited from Parkvall’s (2008) features and languages
    4.4. EMPIRICAL PROBLEMS IN CREOLE TYPOLOGY HYPOTHESIS: CLUSTERS BASED ON INTERDEPENDENT FEATURES AND ERRONEOUS FEATURE VALUES, INCLUDING EMPIRICALLY PROBLEMATIC VALUES AND LOGICALLY CONTRADICTORY VALUES
        4.4.1. Logically contradictory feature values
        4.4.2. The statistics of Parkvall’s (2008) interdependent features
        4.4.3. Empirically problematic feature values and their problematic treatment in B&al’s clustering calculations
            4.4.3.1. The problem of missing data
    4.5. THE LATEST VERSION OF THE CREOLE TYPOLOGY HYPOTHESIS: A TYPOLOGY BASED ON 9, THEN 6, THEN 4 FEATURES (DAVAL-MARKUSSEN & BAKKER 2012)
    4.6. AGAINST QUANTITATIVE EXCEPTIONALISM IN CREOLE STUDIES
    4.7. OTHER CONCEPTUAL ISSUES IN DETERMINING TYPOLOGICAL VS. PHYLOGENETIC RELATEDNESS: STRUCTURAL VS. LEXICAL FEATURES? STABLE VS. UNSTABLE FEATURES?
5. BACK TO FAMILY VALUES: CREOLE LANGUAGES IN THE COMPARATIVE METHOD
6. ENVOI
APPENDIX A: INCOMPATIBLE FEATURE VALUES IN BAKKER ET AL 2011
APPENDIX B: EXCEPTIONALITY OR DISTINGUISHABILITY?
    NUMBER OF SHARED FEATURES
        Maximum number of languages
APPENDIX C: STATISTICS FOR BAKKER ET AL’S FEATURE VALUES FROM PARKVALL 2008
REFERENCES

Abstract

As a handbook chapter, this contribution investigates two sorts of applications of
language-acquisition results and computational phylogenetics in one important domain of
historical linguistics, namely Creole formation. In effect, then, this chapter is made up of three
related papers in one: one paper is on the relationship among language acquisition, Creole
formation and the oft-quoted claim that Creoles are extraordinarily simple (Section 3); another
paper is on the scope and limitations of computational phylogenetics in resolving questions about
the genealogy and typology of Creole languages (Section 4); and yet another paper (Section 5)
argues that Creole languages duly belong to the family of “normal” human languages and, as
such, fall under the scope of the Comparative Method. Sections 1 and 2 introduce the necessary
socio-historical, epistemological and terminological background. The overarching goal of the
chapter is to establish a stronger foundation for improving the use of interdisciplinary methods
in Creole studies and beyond.

Creole languages have traditionally been excluded from the scope of the Comparative
Method, defined as the comparison of cognate sets across related languages in order to establish
systematic correspondences and, thus, genetic relationships of the sort displayed in
phylogenetic trees where a (hypothetical) ancestor language is at the root of the tree (Meillet
1958, Rankin 2003, Ringe & Eska 2013). It’s often assumed that methods that construct
phylogenetic trees of related languages are not applicable in the case of Creoles (Taylor 1956,
Thomason & Kaufman 1988, Ringe et al 2002, Nakhleh et al 2005b, Labov 2007, etc.). Recent
studies continue to claim that Creoles are special both in their genealogy and their typology: they
lie outside well-established language families and they constitute an exceptional typology as the
least complex human languages (Parkvall 2008, McWhorter 2011). In a related vein, Bakker et
al 2011 (“B&al”) and Daval-Markussen & Bakker 2012 (“D-M&B”) enlist novel phylogenetic
tools (i.e., SplitsTree from Huson & Bryant 2006; cf. Dunn et al 2008) for their argument that
“Creoles are typologically distinct from non-Creoles.” Similarly, recent studies have argued that
Creole languages show various degrees of (local) simplicity, due to their origins in
“conventionalized interlanguages of an early stage” (Plag 2008a,b, 2009a,b).

Here we show that these claims are all unfounded, and then we argue that Creoles fall within
the scope of the Comparative Method and are not typologically exceptional. Our data and
analyses demonstrate that Creole languages cannot be taken to reflect early interlanguages. As
for computational phylogenetic (or, more precisely, cladistic) methods such as NeighborNets,
they do not reconstruct evolutionary history regarding Creole formation. More importantly, we
show that B&al and D-M&B’s results are undermined by several conceptual, methodological
and empirical flaws.

Given historical linguists’ increasing use of both computational phylogenetics and
language-acquisition findings, we conclude that such methods must be reviewed with great care,
as we do here, before they can be applied in the analysis of language-creation and language-
change phenomena. Once the Comparative Method is adequately applied, Creole formation
appears indistinguishable from language change. For example, Caribbean Creoles descend from
their European “lexifier” languages, with Niger-Congo substrate influence, which creates quasi-
Sprachbund effects among Caribbean Creoles.

1. Family values?

In the 1950s, Douglas Taylor wrote that Creoles, because they originated in Pidgins, were to be
considered “genetically ‘orphans’ [with] two ‘foster-parents’: one that provides the basic
morphological and/or syntactical pattern, and another from which the fundamental vocabulary is
taken” (1956:413). This argument was in opposition to the view that Creoles were to be
genealogically related to their European superstrate languages (e.g., Hall 1958). This debate was
somewhat reminiscent of an earlier one, in the 19th century, between Lucien Adam and Charles
Baissac. Both of these scholars saw Creoles as extremely simplified languages, due to their
creation by non-Europeans. But Adam, like Taylor, considered Creoles such as Mauritian Creole
as languages with two “parents” whereas Baissac took the same Mauritian Creole to be
genealogically affiliated with its European superstrate language—more precisely with the 17th-
century French patois spoken in Normandy and Brittany (Baissac 1880:XLVII). Regardless of
their incompatible positions about Creoles’ genealogical affiliation, one belief permeated the
writings of creolists in the 19th and 20th centuries: that Creoles were extraordinarily simple
languages.

Now, in the 21st century, there is a prolific group of creolists who still view Creoles as
exceptionally simple languages, following the intellectual trend that began in the earliest writings
on Creole languages (see DeGraff 2005a for overviews, and Martin 2012 for recent case studies
with focus on Portuguese-derived Creoles). These 21st-century creolists, like Taylor, but on
somewhat different theoretical grounds, insist that the concept of genetic relatedness cannot be
applied to Creole languages which must, therefore, live outside the language families defined by
the Comparative Method in historical linguistics. This is the view that is most often represented
in linguistic textbooks. There, Creoles are generally defined as nativized Pidgins, and described
in a chapter separate from the one on language change (see, e.g., O’Grady et al 2010). The
status of Creoles as the world’s simplest languages is also reified in textbooks (e.g., Dixon 2010,
vol. 1, p. 21).

The goals of this chapter are two-fold:

(i) We evaluate recent theoretical arguments in favor of Creole Exceptionalism (i.e., the
view that Creoles belong to an exceptional class of languages on diachronic or
typological grounds). Such exceptionalist claims have been most forcefully aired in
recent writings such as Parkvall 2008, Plag 2008a,b, 2009a,b, McWhorter 1998, 2011,
Bakker et al 2011, and Daval-Markussen & Bakker 2012. The latter two articles are
based on new computational phylogenetics methods, of the sort advocated by Dunn et
al 2008. So our paper constructively evaluates the scope and limitations of
computational phylogenetic packages such as SplitsTree (Huson & Bryant 2006),
which are becoming increasingly popular in historical linguistics (cf. Nakhleh et al
2005a,b, Wichmann & Saunders 2007, Nichols & Warnow 2008, Dunn et al 2008,
Donohue et al 2011, etc.). In a related vein, some of these writings (e.g., Plag
2008a,b, 2009a,b) enlist language-acquisition studies in
order to support the claim that Creole languages are exceptional. We show the
limitations of these studies as well. From that perspective our paper can serve as a
reference for the proper use of acquisition studies and computational phylogenetics in
the study of language creation and language change (cf. Ringe et al, Nakhleh et al
2005a,b, on computational phylogenetics and Indo-European historical linguistics).
(ii) Against repeated arguments to the contrary (in, e.g., Taylor 1956, Thomason &
Kaufman 1988, Ringe et al 2002, Nakhleh et al 2005b, Labov 2007, etc.), we argue
that Creole languages (e.g., the ‘classic’ Caribbean Creoles) lie within the scope of
the Comparative Method and its family-tree (Stammbaumtheorie) model for language
evolution, while keeping in view contact and diffusion phenomena (aka “substrate
effects”) which create (quasi-)Sprachbund¹ patterns across certain Creole groupings
such as in the Caribbean (see Mufwene 2008, DeGraff 2009, Aboh & DeGraff, 2014,
to appear, for related arguments).

So this paper is really about fundamentals of historical linguistics and about ‘family values’ so to
speak: the family values of our Creole-formation and language-change theories. The family
values that we argue for in this paper place Creoles in the genealogical branches of their
European superstrates, following well-established criteria of the Comparative Method. This is in
the tradition of Uriel Weinreich, Rebecca Posner, Robert Chaudenson, Salikoko Mufwene, etc.,
though we, unlike Mufwene, do not consider Creoles to be dialects of their European ancestors.
Our view about Creoles as genetically related to, but not dialects of, their European ancestors is
on a par with the view that the Romance languages descend from, but are not dialects of, Latin.

¹ Our use of terms such as ‘quasi-Sprachbund’ or ‘Sprachbund-like’ does not entail that Caribbean Creoles have
borrowed mutually from each other. It only means that they have borrowed from overlapping sets of Niger-Congo
languages during Creole formation and, therefore, they have converged toward certain similar patterns (in the
domains of, e.g., TMA marking, serial verb constructions, predicate clefts).

Our family values are inclusive to the extent that, alongside descent with modification from
Stammbaumtheorie ancestor languages, we also recognize the aforementioned contact-induced
quasi-Sprachbund phenomena in the formation of Atlantic Creoles. These phenomena are the
results of substrate influence and are, ultimately, due to native-language transfer in the
acquisition of European languages (i.e., Germanic or Romance) by adult speakers of Niger-
Congo languages. In other words, though we consider Caribbean and other Atlantic Creoles as
descended from their Germanic or Romance superstrate languages, we also recognize that the
Niger-Congo substrate languages played a key role in Creole formation in the Caribbean. We
classify the effects of such African-based influence under the rubric of areal diffusion effects via
second-language acquisition in situations of language contact.

In the concluding section of the chapter (section 5), we sketch the basic framework and data for
such a family-friendly theory (data that contradict exceptionalist claims about Creole formation).
In the main body of the paper (sections 3–4), we evaluate a recent series of exceptionalist claims
about Creole formation and about a certain “Creole typology,” with general implications for the
scope and methods of the Comparative Method (Parkvall 2008, Plag 2008a,b, 2009a,b, Bakker et
al 2011, Daval-Markussen & Bakker 2012). We show that these exceptionalist claims are
empirically, conceptually and methodologically flawed. We focus on these particular papers
because, even as they introduce innovative methods in Creole studies, they represent long-
standing intellectual traditions in our field. Furthermore, unlike many of their intellectual
antecedents, these papers make extremely specific claims with concrete theoretical and empirical
consequences. The first set of proposals (Plag 2008a,b, 2009a,b) considers “Creoles as
conventionalized interlanguages” and the second set of proposals (Bakker et al 2011,
Daval-Markussen & Bakker 2012) produces an exceptional “Creole typology,” based on computational
phylogenetics methods à la Dunn et al 2008.

2. Terminological and conceptual background

To begin, let us clarify our objects of study and their labels. We will do so without any
presumption as to whether Creoles form a natural class on any diachronic (developmental) or
synchronic (typological) grounds. As we will see in this paper, the literature on Creole
languages abounds with unclear definitions and with reification and circular-reasoning (petitio
principii) fallacies. This chapter will show that it is often assumed, with little if any historical or
empirical evidence, that Creoles as a class emerged out of Pidgins. The latter, by definition,
have extraordinarily reduced and unstable structures. This hypothetical Pidgin-to-Creole cycle is,
in turn, often used to define Creoles. Then, it is claimed that certain (arbitrarily selected)
characteristics of Creoles are due to the “fact” that they emerged from Pidgins (i.e., with
relatively little influence from pre-existing “older” languages).

One postulated pan-Creole characteristic that is often assumed to derive from their common
ancestry in Pidgins is the absence of inflectional morphology, which in turn is taken to signal
utmost structural simplicity—as part of a postulated “Creole Prototype” (McWhorter 1998, 2001,
2011; also see the classic Bickertonian scenario).

This claim is empirically problematic due to: (i) the aforementioned lack of documentation of
such reduced Pidgins in the history of Caribbean Creoles (Bakker 2003, Mufwene 2008); (ii) the
fact that many Pidgins manifest the sort of inflectional morphology that is lacking in many
Creoles (Bakker 2003, Roberts & Bresnan 2008); (iii) the fact that certain Creoles have
inflectional morphology from their earliest stages onward (DeGraff 2001a,b, Luís 2008, Holm
2008).

The claim that Creoles as a class lack overt inflectional morphology is, therefore, an
exaggeration. How about the assumed link between overt inflectional morphology and
structural simplicity? In other words, can we assume that absence of overt inflection is a
diagnostic for absence of complex structures and complex grammatical operations? On the
theoretical front, there is ample argumentation in modern linguistics that absence of (overt)
morphology does not necessarily correlate with absence of (covert) structure (see, e.g., Liceras et
al 2008, Miyagawa 2010). For example, it has been argued that adult learners who produce
verbal forms that seem homophonous with infinitives insert these forms, not in reduced
structures that lack the functional projections of finite clauses, but in more complex structures
that do correspond to finite clauses in the target language (Prévost & White 2000). Yet
advocates of Creole simplicity claim that, given their morphological profile, Creoles as a class
are simpler than non-Creoles. In such studies, “simplicity” is often defined along some measure
on surface forms, a measure that is not rooted in any general theory of complexity (see Parkvall
2008, Bakker et al 2011, McWhorter 1998, 2011 and Hurford 2012:415–441 for recent
examples of such an approach to structural simplicity in Creoles).

In order to avoid such methodological fallacies, we will follow the practice suggested in DeGraff
2009: we will use the phrase “Creole languages” as an ostensive label to refer to an extensionally
defined set of languages. This terminology is in the spirit of Mufwene’s caveat that creolization
should be taken as a socio-historical, and not a linguistic, concept (2008:40–58). In this paper, as
in DeGraff 2009, our main objects of study come from the extensionally defined set of classic
Creoles, namely the Creole languages of the Caribbean, especially Haitian Creole. We will use
these classic Creoles as benchmarks to evaluate popular claims in contemporary Creole studies
and to motivate our own approach.

2.1. A brief history of “simplification” and “break of transmission” in Creole studies

The colonization of the Americas, including the Caribbean, was part of an imperialist plan
whereby the “New World” should bring great wealth to Europe. It’s in the execution of that plan
that lies the key historical and geo-political factor that triggered the language-contact situation
that was ultimately responsible for Creole formation in the Caribbean. Millions of Africans were
enslaved and brought to the Americas to produce wealth for the benefit of Europe. The Atlantic
slave trade thus brought speakers of African (Niger-Congo) languages into contact with speakers
of European (Indo-European) languages. In my native Haiti, the contact between speakers of
French varieties and speakers of Kwa and Bantu languages started in the middle of the 17th
century. The key question for us in this paper is whether these exceptional socio-historical
conditions of language contact under duress created typologically exceptional languages.

Though we grant that the socio-historical conditions surrounding Creole formation may have
been “exceptional” (on a par with every other combination of socio-historical conditions), we
still assume that the null hypothesis would have it that the biological brain/mind bases for
language acquisition and language change would remain uniform across such diverse
socio-historical conditions (see DeGraff 2009 for a more elaborate discussion).

But there are historical and political reasons why such a null hypothesis would have been
unthinkable in the colonial era: to morally and intellectually justify race-based slavery, the
European mission civilisatrice had to consider the African laborers as non- or lesser humans in
need of improvement (or “embellishment” via “domesticity,” in Moreau de Saint-Méry’s 1797
euphemistic terms). In a world view whereby languages were routinely enlisted to measure the
intellectual and moral levels of nations, it was “normal” scholarly practice (“normal” in the sense
of Kuhn 1970:10–34) for linguists in the colonial period to believe that “lesser” humans could
only speak “lesser” languages, thus the necessity to postulate a fundamental difference (a “break
in transmission”) between European languages and the latter’s renditions by the enslaved
Africans in the Caribbean and their Caribbean-born (i.e., “Creole”) descendants. Thus began
some of the most persistent dogmas in Creole studies even when the contemporary proponents of
modern versions of said dogmas are, by and large, using theoretical arguments that are
(thankfully!) far removed from the racialist theories of the colonial past.

2.2. Simplification: Creoles as contemporary “fossils of language” with exceptional “rigor and simplicity”?

One persistent dogma in linguistics is the notion that Creole languages have extraordinarily
simple grammars. This “simplicity” notion finds intellectual antecedents in the earliest scholarly
writings about Creole languages. These writings, in the colonial period, are rife with explicit
claims that Creole languages are perforce much simpler than, or downright inferior to, their
European ancestors (see DeGraff 2005a for historical overviews; Martin 2012 is a recent analysis
of these views in the context of Portuguese-derived Creoles). It’s fortunate that the “simplicity”
claims in modern linguistics can be extricated from the “inferiority” claim of the colonial period.
Yet these simplicity claims, even in the modern era, can still have neo-colonial effects, as in the
context of education where Creole languages are still, by and large, excluded, for reasons that are
germane to linguists’ claims that Creole languages are extraordinarily simple (more on this
below).

The French Jesuit missionary Father Pelleprat was among the first to formulate, in print, the
hypothesis that the versions of European languages spoken by Africans in the Caribbean showed
reflexes of early stages in untutored second-language acquisition. In 1655 he wrote:

“We wait until they learn French before we start evangelizing them. It is French that they
try to learn as soon as they can, in order to communicate with their masters, on whom
they depend for all their needs. We adapt ourselves to their mode of speaking. They
generally use the infinitive form of the verb [instead of the inflected forms] ... adding a
word to indicate the future or the past. ... With this way of speaking, we make them
understand all that we teach them. This is the method we use at the beginning of our
teaching ... Death won’t care to wait until they learn French.” (Pelleprat 1655 [1965, 30–31],
our translation)

In the 19th century, Saint-Quentin would consider Guyanais Creole to be “a spontaneous, hasty
and unconscious product of the human mind, freed from any kind of intellectual culture,” a
language “create[d] from complete scratch,” a language that requires “little strain on memory
and little effort,” etc.

In the 20th century, this view of Creoles as exceptionally simple languages became a cornerstone
assumption of “normal science” in linguistics, being enshrined in the classic textbooks of the
field, such as Bloomfield 1935. In one of the most cited papers about Creole formation and the
alleged simplicity of Creole languages, Whinnom (1971:110) even worried that Creole speakers’
intellectual development may be “handicapped” due to the extreme simplicity of the morphology
of their native languages. Then such simplicity became elevated as a “historical universal” in the
writings of scholars such as Seuren (1998:292–293). This claim is now found in a linguistic
textbook where it is stated in a categorical fashion: “... of the well documented creoles, none
equals the complexity [...] of a non-creole language” (Dixon 2010 vol. 1, p. 21).

In modern versions of Saint-Quentin’s claims, Creole languages have been described as the
offspring of virtually structureless Pidgins, and even as “living linguistic fossils” similar to
Homo sapiens’s earliest (i.e., most primitive) languages. The key assumption here is that the
structurelessness of Pidgins makes them similar to the Protolanguage spoken by our hominid
ancestors before the rise of Homo sapiens, thus the hypothesized resemblance between the
Pidgin-to-Creole cycle and the emergence of Human Language from Protolanguage (Bickerton
1990:169–171,181–185; see Hurford 2012:416,421 and passim for a 21st-century version of this
hypothesis).

In these scenarios, the extreme simplicity of Creoles as compared to non-Creoles is, thus, a result
of the Pidgin-to-Creole cycle. The first stage of this cycle (i.e., “pidginization”) is a massive
simplification process whereby almost all grammatical items and almost all structural complexity
vanish, including morphological structure, all grammatically sensitive morphemes (e.g.,
inflectional affixes and functional items) and concomitant structure-dependent agreement
phenomena (e.g., subject-verb agreement, case morphology) (see, e.g., Bickerton 1999:69n16,
McWhorter 2001:158–159). The Pidgin is thus made up, by and large, of referential lexical
items and the few function words that survive the radical pidginization process. It is this
extraordinarily unstable and reduced Pidgin that, by hypothesis, provides the Primary Linguistic
Data (PLD) of the first generation of Creole-speaking children. As the immediate precursor of
the corresponding Creole, the Pidgin would thus constitute a bottleneck for the transmission of
complex structures from the languages in contact that initially gave rise to the Pidgin.

According to Bickerton (1981, 1984, 1999, etc.), the transition from Pidgin to Creole is ushered
in via a Language Bioprogram. This Bioprogram, which is a genetically wired grammar accessible
to children only, is the primary source of Creole structures, in the absence of any data evincing
complex structures.

Thus emerge extraordinarily simple Creole grammars (for variations on this scenario, see
Bloomfield 1935:472–475, Hall 1962, Bickerton 1981ff, McWhorter 1998ff, etc.). Versions of
this “Pidgin-to-Creole life-cycle,” which was first explicitly proposed in Hall 1962, are now found
in most contemporary introduction-to-linguistics textbooks (see, e.g., O’Grady et al 2010:503f).

In Bickerton’s and related scenarios such as McWhorter 1998ff, Parkvall 2008, etc., Creoles lack
complex structures because “human languages [...] accrete complexity as a matter of course over
millennia” (McWhorter 2011:125): since Creoles are the world’s youngest languages, with
ancestry in structureless Pidgins, “Creole grammars are the world’s simplest grammars”
(McWhorter 2001).

It is thus that the beginning of the 21st century has seen a resurgence of efforts to group Creoles
into a typological class with the “simplest” grammars (see, e.g., McWhorter 1998ff, Parkvall
2008, Bakker et al 2011, Hurford 2012 for various versions of this claim). In McWhorter 1998ff
and Parkvall 2008, Pidgins instantiate some near-zero level of structural complexity, and the
latter is the root for the presumed maximal simplicity of Creole languages as a class:

“Creole languages are unique in having emerged under conditions which occasioned the
especial circumstance of stripping away virtually all of a language’s complexity (as
defined in this paper), such that the complexity emerging in a creole is arising essentially
from ground zero, rather than alongside the results of tens of thousands of years of other
accretions. As such, creoles tend strongly to encompass a lesser degree of complexity
than any older grammar.” (McWhorter 2001:155)

DeGraff 2005a, 2014 offers an overview and critique of such “Creole Exceptionalism” views and
related issues throughout Latin America, with an analysis of their consequences in and
beyond linguistics.

2.3. Break of transmission: Creoles as languages outside Stammbaumtheorie?

The belief that Creole languages manifest the most extreme structural simplicity thus depends on,
among other things, the belief that there was a severe “break in transmission” in Creole
formation—a “break in transmission” that acted as a bottleneck preventing the diachronic
transmission of complex structures from the languages in contact to the emerging contact
language. This diachronic rupture is perhaps the most popular truism in Creole studies.
Linguistics textbooks generally echo Taylor’s (1956) and Thomason & Kaufman’s (1988) oft-
quoted claim that Creoles are languages without genealogical affiliation due to their “abrupt
formation,” with Creole formation pigeon-holed in a separate rubric of diachronic phenomena,
outside the scope of the Comparative Method.

It has been pointed out that “in a long tradition dating back to the early nineteenth century,
Creoles have typically been excluded from the family trees of their lexifiers” (Noonan 2010:60).
In the case of Atlantic Creoles, they have been excluded both from the families of their Indo-
European (Germanic and Romance) superstrates and from the families of their Niger-Congo (e.g.,
Kwa and Bantu) substrates.

Why should we care about the fact that these scenarios are most “family-unfriendly” to Creoles?
If these scenarios are right, then Creole languages, as a class, would be singled out as languages
that live outside language families. This exceptional characteristic of Creoles would be even
more striking in light of the fact that the circumstances of their history are relatively well documented,
alongside the fact that their lexica (from which cognate sets are usually taken in the Comparative
Method) are straightforwardly derived from some European language in the case of the Atlantic
Creoles.

Consider Haitian Creole (HC). The historical circumstances of its formation are relatively well
known (see, e.g., Singler 1996). The social milieu for the formation of proto-HC in colonial
Saint-Domingue included speakers of Romance (mostly French varieties) and speakers of mostly
Kwa and Bantu languages (Gbe, Congo, etc.).2 HC wears its French ancestry on its sleeve, so to
speak (e.g., in the French-derived forms and semantics of the bulk of its lexicon and in the
diverse morpho-phonological patterns that manifest recurrent correspondences with French)
while various grammatical aspects show the influence of Niger-Congo languages, especially
Kwa (see Aboh & DeGraff, 2014, to appear, for case studies). Yet the “break of transmission”
view takes HC to belong to neither Romance nor Niger-Congo.

In some of these “break of transmission” scenarios, the diachronic “break” is due to the
aforementioned Pidgin-to-Creole cycle whereby drastically reduced Pidgins seed Creoles when
children create their native languages with Primary Linguistic Data provided solely by said
reduced Pidgins. In Thomason & Kaufman’s “abrupt creolization” scenario, the diachronic
“break” is stipulated to arise through language shift “without normal transmission,” though no
explicit operational criteria are provided to distinguish “normal” vs. non-“normal” transmission.
Yet Creole languages are categorically considered as new linguistic phyla outside the scope of
standard Stammbaumtheorie groupings.

The next two Sections (3–4) evaluate recent exponents of simplicity and break-of-transmission
claims in Creole studies. In Section 5, we offer data and argument that suggest that Atlantic
Creoles, though they exhibit Niger-Congo substrate effects, fall within the scope of the
Comparative Method as descendants of their European superstrate languages. (Readers who are
already convinced that Creoles are not “simplest” languages and who are more interested in the
latest break-of-transmission claims through application of computational phylogenetics should
skip directly to Section 4.)

2
Recent arguments may suggest an input from Amerindian languages as well (Viada Bellido de Luna & Faraclas
2012). But this Amerindian input seems more tenuous. For example, Viada Bellido de Luna & Faraclas’s main
linguistic argument in favor of an Amerindian input relies on the hypothesis that copula-less patterns in Caribbean
Creoles derive from analogous patterns in Amerindian languages such as Arawakan. But the evidence they provide
consists mostly of isolated superficial strings without any grammatical analysis. The argument is weakened by the
fact that copula-less patterns are also quite common in the utterances of language learners, even when the target
language does not allow such copula-less patterns.

3. Creoles are not extraordinarily simplified “conventionalized interlanguages of an early stage”3

According to the Interlanguage Hypothesis (Plag 2008a,b, 2009a,b), “Creoles are
conventionalized interlanguages of an early stage” and it is because of such origins in early
interlanguages that Creoles are relatively simpler than non-Creoles (Plag 2008a: 130–132). This
hypothesis is clearly rooted in an internalist approach to Creole formation. The basic claim,
which somewhat echoes previous work in the 1970s and 1980s on “second-language acquisition
as pidginization” (e.g., Schumann 1976 and Andersen 1983), is that Creole formation depends on,
inter alia, mental processes that are attested in the earliest stages of second-language acquisition
by adults—and not on those that are attested in the advanced stages. Though these early-
interlanguage mental processes reflect universal properties of human speakers, one corollary of
the EIH is that, in effect, Creoles are structurally exceptional to the extent that they stand out, by
and large, as early L2A interlanguages in a fossilized state of ‘arrested development’ (cf. Hall
1962:152). In other words, Creoles tend to show relatively few patterns that depend on cognitive
processes that are available in only the most advanced stages of second-language acquisition.
The EIH is one of the most recent exponents of the view that Creoles’ structural simplicity
originates from ‘pidginized’ interlanguages.

However, under closer scrutiny, the details of Plag’s proposal seem, to various
degrees, incompatible with a variety of socio-historical and linguistic facts about Caribbean
Creoles and with what we know about language development in children and adults. In any
given community characterized by contact among speakers of diverse languages (and, thus, by
untutored second-language acquisition), every language learner will, at any time, be at a distinct
point in their language-learning path. Furthermore, the social matrix of Caribbean Creole
formation was generally quite complex, with a variety of language-contact settings—from
relatively intimate on small homesteads to rigidly regimented and distant on large plantations
(Alleyne 1971, Chaudenson & Mufwene 2001). It thus cannot be assumed that, in the colonial
Caribbean, the initial cohorts of Creole speakers would have, by and large, fossilized
at a similar early stage in their second-language acquisition trajectory.

3
Unless noted otherwise, the HC data and related observations in this section come from DeGraff 2009:948-958.

Moreover, the archival evidence that is available to us suggests that Plag’s Early-Interlanguage
Hypothesis (hereafter “EIH”) may well be ahistorical. Creole texts from the colonial Caribbean
period also attest to the fact that emergent varieties of Creole speech were spoken far and wide in
diverse social contexts and by diverse ethnic groups, including native speakers of European and
African languages. French settlers made their formal debut in the Caribbean in the 17th century.
Then, one of the earliest Creole texts available to us, from a 1671 court deposition in Martinique
at a time near the onset of Creole formation, already suggests that this early Creole was used as a
lingua franca among blacks and whites, and this early Creole archival text already exhibits the
sort of complex structures (clausal embeddings, relative clauses, etc.) that are excluded by the
EIH. Furthermore, in Anonymous 1811:2 for example, we read that Creole in Saint-Domingue
was “generally spoken by the Blacks, the Creoles [i.e., the Caribbean-born] and by most of the
colonists in our islands in the Americas.” About this early colonial variety of Haitian Creole, we
also read in Moreau de Saint-Méry: “this [Creole] language, ... is often unintelligible when
spoken by an old African; one speaks it all the more fluently if one learns it at a younger age [...]
Europeans, no matter how long they have practiced it and no matter how long they have lived on
the Islands, are never fluent in all its nuances” (1797:64, my translation). In Moreau de Saint-
Méry’s perspective, the “old African” may have been stuck in some fossilized early-
interlanguage stage, but this was certainly not the case for those who “learn[ed Creole] at a
younger age.” Elsewhere, Moreau de Saint-Méry tells us quite explicitly which ethnic groups he
thought were the better learners of Creole: he considers the Aradas the worst (1797:31) and the
Congos and the locally born the best (1797:33,40). Notwithstanding the prejudices evident in
such reports, they are unsurprising to the extent that we do expect language attitudes and learners’
age to influence language learning, especially among adults.

More generally, given that early Creole varieties were spoken by nearly everyone in the colonial
setting, it seems unlikely that the eventual “conventionalized” norms for the Creole as a
communal language would emerge uniformly from the early-interlanguage stage that is posited
by Plag as the origins of Creole structures. The EIH is contradicted by linguistic structural
evidence as well, both from contemporary and from early Creole speech—to which we return
below.

3.1. Empirical and theoretical evidence against the Early-Interlanguage
Hypothesis (EIH)

One of the basic tenets of the EIH is that the passing of grammatical information within and
across syntactic constituents is altogether lacking in the earliest interlanguage stages and, thus, in
Creole languages (Plag 2008a: 122–128). “[T]hese languages seem to display almost
exclusively structures for which no information exchange between constituents is necessary”
(Plag 2008a: 125). The EIH considers information exchange to be a necessary ingredient for
context-dependent inflectional morphology such as number agreement in NPs, subject-verb
agreement and structural case marking, but the realization of such syntax-dependent overt
morphology “requires the most advanced processing procedures and occurs therefore only at
later stages” (Plag 2008a: 124). It is then argued that Creoles as conventionalized interlanguages
of early stages would be predicted to not show overt reflexes of such information exchange. As
we show here, such a claim seems incompatible with natural-language universals, thus making
Creoles exceptionally un-natural—to the extent that said information exchange is a fundamental
property of UG.

As shown in DeGraff (2009:949–958), Creole languages do manifest constructions with the sort
of overt inflectional morphology that depends on the syntactic environment and on information
exchange across constituents. Some of these constructions (e.g., overt gender agreement in HC)
are noted in Plag 2008a: 126. Here we want to focus on some of the more general aspects of the
counter-arguments to Plag in DeGraff 2009, along with additional counter-arguments brought
forward in Aboh, to appear, based on Saramaccan data.

We agree with the EIH assumption that information exchange is a necessary condition for the
appearance of overt context-dependent inflectional morphology. But there is a variety of
arguments and observations, including some from the literature on second-language acquisition,
to the effect that information exchange is not a sufficient condition for overt inflectional
morphology. In other words, there exist plenty of syntactic phenomena, all across natural
languages, that require inter-phrasal information exchange even in the absence of any overt
inflectional morphology (see Miyagawa 2010 for a variety of examples). Some of these
agreement phenomena that depend on information exchange without any overt morphological
correlates are analyzed in Booij 2005, a reference that, nonetheless, is cited by Plag in support of
the EIH. One such case is the Dutch determiner het for definite singular neuter NPs vs. de,
which is the default determiner for definite NPs. Similar facts obtain for relative pronouns in
Dutch. “These examples [...] illustrate that agreement is not always marked by means of
morphology, but may also be marked through the choice of a specific lexical item” (Booij
2005:108f). Similar examples are found in HC with respect to the marking of number: senk liv
yo/*la ‘five books’ (literally: ‘five books PLURAL’) where the cardinal morpheme senk ‘five’ is
incompatible with the singular definite article la. To wit: * senk liv la. That too is, in Booij’s
terms, a case of contextually determined agreement in the absence of morphological inflection.

Aboh (to appear) provides similar examples of contextually determined agreement (i.e.,
information exchange) in Saramaccan as part of a larger set of data and observations that directly
contradict the EIH: indeed, Saramaccan, like HC, shows configurations with contextually
determined agreement between the numeral and the determiner, with an alternation between the
singular determiner di as in di man kodo womi ‘the single man’ (literally: ‘the.SINGULAR one
single man’) vs. the plural determiner dee as in dee dii womi ‘those/the three men’ (data from
Rountree & Glock 1992). If one were to argue that the determiners in these examples show
inherent agreement (i.e., plurality in the “real world”) instead of contextually-determined
agreement, then there’s another example where the same sort of agreement shows up on both
the determiner and the complementizer head, providing yet another example of contextually-
determined agreement. Indeed, Aboh (to appear) shows that the di vs. dee contrast is also found
in the domain of relativizers that agree in number with the head noun: Di fisi di mi tata kisi bigi
‘The (singular) fish that (singular) my father caught is big’ vs. Dee fisi dee mi tata kisi bigi ‘The
(plural) fish that (plural) my father caught are big.’ This is yet another case of contextually
determined agreement (i.e., information exchange across pieces of structure) without
morphological inflection.

Such cases of information exchange in Creole languages are pervasive. If “simplicity” is
contingent on absence of information exchange, then any grammatical pattern that depends on
information exchange is evidence against such “simplicity” claims. Among such patterns we
find selectional restrictions and long-distance dependencies. Let us look at each of these in turn.

As far as we know, all natural languages, including Creole languages, encode selectional
restrictions whereby one head (e.g., X0 in (1)) imposes restrictions on the content of its sister YP
(e.g., the lexical content of the head Y0 of the sister of X0), according to the following template:

(1) Selectional restriction via inter-phrasal information exchange:

Such selectional restriction requires that the relevant feature on the lexical head Y0 be shared
with the maximal projection YP. In turn, such information about Y0 must be “exchanged,”
through YP, with X0 for a compatibility check. This is thus a straightforward instance of inter-
phrasal exchange—in this case, between X0, YP and Y0—which enforces the selectional
restriction that X0 imposes on Y0.

Let us illustrate such patterns of information exchange with two straightforward examples from
Haitian Creole (HC).4 These examples involve the selectional restrictions imposed by the verbs
konte ‘to rely’ and mete ‘to put.’ The verb konte ‘to rely’ requires a PP complement that is
headed by the preposition sou ‘on’:

                                                                                                               

(2) Mwen konte *(sou) wou

I count on you

‘I rely on you’

No other preposition is acceptable as the head of the PP complement of konte:

(3) * Mwen konte de / anba / anlè wou

I count of / below / above you

‘I rely on you’

As for the verb mete ‘to put’, it selects for two complements: NP and PP, with the NP
intervening linearly between V and PP, thus providing an example where inter-phrasal
information exchange takes place across an intervening phrase. In this particular case, the
selected PP must be headed by a locational preposition (e.g., sou ‘on’, anba ‘under’, anlè ‘above’,
akote ‘beside’):

(4) Mwen mete liv la sou/anba/anlè/akote/*de/*san tab la

I put book the on/under/above/beside/of/without table the

‘I’ve put the book on/under/above/beside/*of/*without the table’
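
The selectional patterns in (2)–(4) can be restated as a small computational sketch. It is only an illustration under stated assumptions, not an analysis proposed in the works cited: the class names, the feature label "locational" and the toy lexicon are hypothetical, and the intervening NP of (4) is not modelled. The point it illustrates is that the check performed by the selecting head X0 (here the verb) needs information that originates on Y0 (the preposition) and is made visible on YP.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Head:
    """A lexical head (X0 or Y0) with its category and features."""
    form: str
    category: str                      # e.g. "V", "P"
    features: frozenset = frozenset()  # e.g. {"locational"}


@dataclass(frozen=True)
class Phrase:
    """A maximal projection YP; its label is projected from its head Y0."""
    head: Head

    @property
    def category(self) -> str:
        return self.head.category + "P"

    @property
    def features(self) -> frozenset:
        # Information exchange, step 1: Y0 shares its features with YP.
        return self.head.features


# Hypothetical toy lexicon (the feature name is an illustrative assumption).
LOC = frozenset({"locational"})
PREPOSITIONS = {
    "sou": Head("sou", "P", LOC),
    "anba": Head("anba", "P", LOC),
    "anlè": Head("anlè", "P", LOC),
    "akote": Head("akote", "P", LOC),
    "de": Head("de", "P"),
    "san": Head("san", "P"),
}

# What each selecting head X0 demands of its PP complement.
SELECTION = {
    "konte": lambda yp: yp.category == "PP" and yp.head.form == "sou",
    "mete": lambda yp: yp.category == "PP" and "locational" in yp.features,
}


def satisfies_selection(verb: str, preposition: str) -> bool:
    """Step 2: X0 inspects YP (and, through it, Y0) for compatibility."""
    yp = Phrase(PREPOSITIONS[preposition])
    return SELECTION[verb](yp)
```

Under these assumptions, satisfies_selection("konte", "sou") holds while satisfies_selection("konte", "anba") fails, mirroring the contrast between (2) and (3). Likewise, "mete" accepts any of the locational prepositions in (4) but rejects de and san.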

Another well studied case of selectional restriction in Caribbean Creoles involves the selection of
finite vs. non-finite clauses by verbs that take propositions as complements. For example, Sterlin
(1989) shows that kwè ‘to believe’ only selects for finite complements whereas vle ‘to want’ can
take either finite or non-finite complements. The following examples illustrate this distinction:

(5) Jan kwè Mari te di li te konte sou Woje

John believe Mary ANT say 3sg ANT count on Roger

‘John believes that Mary had said that (s)he counted on Roger’

(6) a. Jan te vle pou Mari te genyen

John ANT want for Mary ANT win

‘John wanted for Mary to have won’

b. Jan te vle Mari (*te) genyen

John ANT want Mary ANT win

‘John wanted Mary to win’

Here, too, information must be shared across phrases, namely across the TP or CP phrase
boundary. That is, for the relevant selection restrictions to be enforced, information must be
shared between the clause-taking verbs (either kwè or vle) and the head of their respective clausal
complements (be it the finite T0 in (5) and (6a) or the non-finite T0 in (6b)). (Aboh & DeGraff,
to appear a, analyze related cases of finite vs. non-finite complementation in HC.)

As in other natural languages that we are familiar with, inter-phrasal information exchange in
HC also takes place across embedded clausal domains of increasing depth, as in the cases of wh-
movement. In (7), the displaced constituent must share its features with the trace left under the
most embedded VP headed by konte.

(7) Sou ki moun Jan kwè Mari te di li te konte t?

On which person John believe Mary ANT say 3sg ANT count

‘On whom does John believe that Mary had said that (s)he counted?’

Such feature sharing between sou ki moun in the matrix Spec(CP) and its trace in the lowest VP,
two clauses down, is required in order to satisfy the selectional restrictions of konte and to ensure
the adequate semantic interpretation of the sentence. Indeed the preposed wh-phrase sou ki moun
is interpreted as the object of konte, though there are two TP boundaries intervening between the
wh-phrase in the sentence-initial Spec(CP) and its trace in the sentence-final object position
under the VP headed by konte.

Such inter-phrasal information exchange often has overt syntactic reflexes that can be used to
diagnose the involvement of certain structural positions in the path of wh-movement. For
example, wh-extraction in HC, somewhat on a par with its French superstrate, shows a robust
subject-object asymmetry, as illustrated in:

(8) a. Ki moun *(ki) renmen Mari ?

Which person WH love Mary

‘Who loves Mary?’

b. Ki moun (*ki) Mari renmen ?

Which person WH Mary love

‘Who does Mary love?’

When wh-movement takes place from the subject position as in (8a) the element ki, which we
gloss as WH without any theoretical commitment, must surface at or near the site of extraction.
But when wh-movement takes place from the object position, ki cannot surface at all.

As expected in light of our ongoing discussion, this subject-object asymmetry obtains even when
wh-movement takes place across successive TP boundaries. To wit:

(9) a. Ki moun Jan kwè Woje te di *(ki) renmen Mari ?

Which person John believe Roger ANT say WH love Mary

‘Who does John believe that Roger had said loves Mary?’

b. Ki moun Jan kwè Woje te di (*ki) Mari renmen ?

Which person John believe Roger ANT say WH Mary love

‘Who does John believe that Roger had said Mary loves?’

In typical generative treatment, the displacement of wh-phrases is implemented via successive-cyclic
movement of the wh-phrase through the Spec(CP) of each intermediate clausal domain, up
to the ultimate landing site of wh-movement. Such wh-movement is one technical
implementation, among others, of the intuition that feature sharing in such cases takes place over
unbounded domains (cf. the copy-theory of movement in the Minimalist Framework and the
“slash” feature in GPSG and its theoretical descendants; see Müller 2011 for an overview and
Alexandre 2012 for one application to a Creole language). This is inter-phrasal information
exchange to the hilt!

Though wh-movement is unbounded, it is also sensitive to features of (or “information” from)
intervening lexical items. That sensitivity as well must rely on inter-phrasal information
exchange. One such example involves the interaction between wh-movement and non-bridge
verbs such as chichote ‘to whisper’:

(10) * Sou ki moun Jan kwè Mari te chichote li te konte t ?

‘On whom did John believe that Mary had whispered that (s)he counted?’

Here the long-distance movement that was illustrated in (7) is blocked by chichote ‘whisper’ (a
non-bridge verb) which intervenes between the trace and its antecedent. Compare with the
absence of blocking in (7) where the intervening verb in a comparable position is the bridge verb
di ‘say’ (see Koopman 1982:216).

The well-known predicate-cleft construction is another example of long-distance inter-phrasal
exchange in HC. Here the long-distance dependency (possibly across multiple clause
boundaries) is between two copies of a predicate head, one in situ in the embedded clause and
one in the leftmost periphery of the embedding clause. This dependency too is sensitive to
whether the intervening verbs are bridge verbs, as in (11); see Piou 1982.

(11) a. Se bo Jan kwè Woje te di Mari bo Jak ?

FOC kiss John believe Roger ANT say Mary kiss Jack

‘John believes that Roger had said Mary KISSED Jack?’

b. * Se bo Jan kwè Woje te chichote Mari bo Jak ?

FOC kiss John believe Roger ANT whisper Mary kiss Jack

‘John believes that Roger had whispered Mary KISSED Jack?’

Furthermore, information about the clefted predicate head must be exchanged long-distance with
the in-situ predicate position in order to ensure that the form in-situ is indeed a “copy” of the
clefted form. To wit, the ungrammaticality of (12).

(12) * Se bo Jan kwè Woje te chichote Mari karese Jak ?

FOC kiss John believe Roger ANT say Mary caress Jack

As far as we can tell, all Caribbean Creoles show instances of selectional restriction, recursive
clausal embedding, successive-cyclic wh-movement, predicate-clefts with copying, and related
grammatical constructions. The predicate clefts in Creoles are all the more striking in that they
have analogues in the substrates, which casts further doubt on the EIH since early
interlanguages would have excluded such constructions, contrary to the fact that such
constructions made it through Creole formation. In any case, all such constructions involve
inter-phrasal information exchange, often including information exchange across clause
boundaries (see, e.g., Aboh 2006 for Saramaccan, Durrelman 2008 for Jamaican Creole,
Alexandre 2012 for Capeverdean Creole, and Byrne & Winford 1993 and Holm & Patrick 2007
for data from a variety of other Creole languages, all of which have wh-movement and other
long-distance exchange of grammatical information of a rather complex nature). The appearance
of non-local dependencies in Creole languages and at all stages of Creole formation (see below for
archival evidence) thus stands as a robust counter-example to the EIH’s claim that Creole
languages are conventionalized interlanguages at an early stage.

Here comes a further rub for the EIH: According to Plag (2008a:123f, 2008b:314–316), clausal
embeddings enter the competence of L2 learners only at the latest and most advanced stages,
which in the EIH are beyond the grasp of those who created Caribbean Creoles. Yet, we do find
in Creole languages a variety of phenomena that are parasitic on the availability of clausal
embeddings and on inter-phrasal information exchange across clause boundaries, as exemplified
in the above examples for HC and in the above-cited references for a variety of other Creoles. These
constructions generally have potential scope across multiply embedded clauses. Therefore, they
make Creole languages, on a par with other natural languages, fundamentally distinct from the
sort of early-interlanguage varieties hypothesized by Plag. Indeed, such early-interlanguage
varieties lack one property (i.e., clausal embedding) that seems widespread among natural
languages, including Creoles. Such embedding is even found at the earliest stages of Creole
formation, as we show in the next section. And this is naturally expected: Since Creoles are
natural languages, with the unrestricted Merge operation as one essential computational building
block as it is in all natural languages (Chomsky 2004), it is naturally expected that, in Creoles as
well, Merge would recursively apply to its own output, thus producing embeddings of various sorts,
including clausal embeddings, with the possibility of wh-movement qua Internal Merge
(Chomsky 2004). In fact, it can even be argued that a language that lacks clausal embeddings
and wh-movement would require extra rules (thus, be more “complex”) than a language that is
similar except for the presence of clausal embeddings and wh-movement. The extra rules would be the ones
that limit the application of Merge in a way that prevents it from taking a clause as one of its
arguments or that prevents Internal Merge from applying (cf. Chomsky 2004; see DeGraff
2001b:249f for a related discussion).
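
The "extra rules" argument can be made concrete with a deliberately simplified sketch. Everything in it is an assumption made for illustration: syntactic objects are reduced to labelled binary trees, every clause is labelled "CP", and Internal Merge is not modelled at all. The sketch only shows that unrestricted Merge yields clausal embedding for free, whereas a grammar without embedding must add a stipulation on top of Merge.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class SO:
    """A syntactic object: a labelled pair built by Merge."""
    label: str
    left: "Node"
    right: "Node"


Node = Union[str, SO]  # a lexical item (str) or a previously merged object


def merge(label: str, a: Node, b: Node) -> SO:
    """Unrestricted Merge: any two objects combine, including earlier outputs."""
    return SO(label, a, b)


def restricted_merge(label: str, a: Node, b: Node) -> SO:
    """Merge plus one EXTRA stipulation: no clause may be an argument."""
    if any(isinstance(x, SO) and x.label == "CP" for x in (a, b)):
        raise ValueError("stipulation violated: clausal argument")
    return SO(label, a, b)


def clause_depth(node: Node) -> int:
    """Number of clauses (CP labels) on the deepest path of the tree."""
    if isinstance(node, str):
        return 0
    inner = max(clause_depth(node.left), clause_depth(node.right))
    return inner + (1 if node.label == "CP" else 0)


# A rough rendering of the clause structure of (5), ignoring tense markers.
inner = merge("CP", "li", merge("VP", "konte", "sou Woje"))
middle = merge("CP", "Mari", merge("VP", "di", inner))
top = merge("CP", "Jan", merge("VP", "kwè", middle))
```

With these definitions, clause_depth(top) is 3, and nothing beyond merge itself was needed to obtain the embedding. By contrast, restricted_merge("VP", "di", inner) raises an error, and it does so only because of the additional condition that the restricted variant carries.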

3.2. Archival and substratal evidence against the EIH

One (easily disconfirmed) response to arguments against exceptionalist theories of Creole
formation is that the data that seem to support these arguments did not exist at the time of Creole
formation. Instead, it is claimed, these properties entered the language during a post-creolization
stage (“well after the creolization period”) as the language became “de-creolized” with
prolonged contact with the superstrate language (cf. Plag 2008a:126; McWhorter 2011:90).

One could imagine such properties entering the language as a consequence of language change
even without de-creolization (i.e., without any prolonged contact with the superstrate). In any
case, there is ample documentation and argumentation against the posited decreolization
scenarios. For example, the evidence provided in Alleyne 1971 and Lalla & D’Costa 1989
shows that so-called “acrolectal” Creole varieties (i.e., those structurally close to European
superstrate varieties) have emerged at least as early as the “basilectal” varieties (those that seem
structurally the most distant from the corresponding European superstrate languages). In other
words, the earliest Creole varieties would already show superstrate-derived properties in all
domains of grammar (see Fattier 1998 for detailed documentation in the case of HC), so one
cannot claim that such properties are post-creolization reflexes. Bickerton, for one, has admitted
that his abrupt Creole-formation scenario based on a Pidgin-to-Creole cycle is challenged by the
early existence of such acrolectal varieties (Bickerton 1996).

Be that as it may, the available archival evidence suggests that early Creole varieties near the
start of Creole formation did manifest constructions that depend on inter-phrasal information
exchange (e.g., selectional restriction, clausal embeddings, wh-movement and predicate clefts).
Take, say, one of the earliest Creole texts, a court deposition from 1671 Martinique about the
sighting of a mermaid. This short text generously exhibits examples of selectional restriction,
clausal embedding of purpose complement, relative clauses, etc. These constructions include
grammatical constraints that apply across phrase domains, thus triggering the sort of
inter-phrasal information exchange that is excluded in the early-interlanguage varieties postulated by
the EIH.

Here is one telling example from the 1671 Mermaid text from Martinique (quoted from Hazaël-
Massieux 2008:31):

(13) mouchié faire yon autre negre courir après li pour prendre li avec ligne

man make a other negro run after 3sg for take 3sg with line

‘The man made another negro run after it in order to catch it with a line’

This example shows not one, but two, embedded clauses: (i) the clausal complement of the
causative main verb faire ‘to make’, namely yon autre negre courir après li ‘another negro run
after it [the mermaid]’ and (ii) the purposive clause pour prendre li avec ligne ‘in order to catch
it [the mermaid] with a line’. Other examples from that 1671 text instantiate relative clauses
and other cases of clausal embeddings. It is evident that the Creole speakers who uttered such
sentences in the 17th century were not drastically limited in their processing capacities. This
short 17th century text contradicts the EIH claims that Creole structures originate from the
limited processing capacities of learners at the earliest stages of L2A (e.g., from the lack of inter-
phrasal information exchange and clausal embedding).

Another empirical strike against the EIH is the presence of focus constructions that take scope
over recursively embedded domains and that show influence from the Kwa substrates. One such
construction is the afore-mentioned predicate-cleft in Caribbean Creoles, with two copies of the
predicate head across clausal boundaries (see examples in (11)–(12)). That these predicate-cleft
patterns have direct analogues in the substrate languages (e.g., in Gbe; cf. Byrne & Winford
1993) suggests that they entered the Creole from the early stages of Creole formation, at the
time when these substrate languages were still spoken in Saint-Domingue. If so, then these early
Creole speakers must already have reached advanced L2A stages, beyond the early stages
posited by the EIH. Recall that, in the EIH, these early stages are devoid of any processing
capacity for clausal embedding. This lacuna would have excluded any possibility of transfer of
predicate-cleft structures from Gbe to the emergent Creole (cf. Plag 2008b:317). In other words,
in light of the EIH’s own postulates about a processability hierarchy in L2A (Plag 2008a:121–124,
2008b:314,316), those who introduced the Gbe-influenced predicate-cleft constructions into
the early Creole must have been learners at the very latest stage of acquisition, contrary to the EIH’s
basic claim. (Predicate clefts and other robust counter-examples to the EIH—constructions that
are found in virtually all Caribbean Creoles—are discussed in more detail, alongside other
counter-examples and various conceptual problems, in DeGraff 2009:953–958.)

3.3. Creoles definitely do not, and could not, stand out as “conventionalized
interlanguages of an early stage”

One major impetus of the EIH is the notion that “‘universal tendencies’ in creoles can be
accounted for as results of limited processing capacities in second language acquisition” (Plag
2008b:308). The EIH would effectively turn Creoles’ immediate ancestors into abnormal
languages. Indeed Creole grammars, according to the EIH, would manifest their origins from
early interlanguages that lack the necessary syntactic apparatus for one basic building block,
namely information exchange across pieces of syntactic structure (intra-phrasal and inter-phrasal
information exchange, in Plag’s terms). Such exchange enters into a variety of constructions that
are widespread across the world’s languages, including agreement and government
configurations, Case marking, co-occurrence restrictions on lexical items, head-to-head
selectional restrictions, clausal embedding, long-distance wh-movement, predicate-clefts, etc.
These claims echo the long-held belief that Creoles are exceptional languages that reflect some
state of ‘arrested development.’ But our ongoing observations based on available data about
early and contemporary Creoles robustly contradict the EIH.

Furthermore, Plag appeals to the EIH to account for certain cases of “relative and local
simplicity,” which is defined as follows:

“... in certain areas, it can indeed be shown that the creole grammar is simpler than that of
its input languages, with ‘simpler’ being rather crudely defined in terms of either
markedness, or number of forms, features or morphosyntactic distinctions being
expressed.” (Plag 2008a:117n4)

It is claimed that such cases of relative and local simplicity are “due to the nature of creoles as
conventionalized interlanguages” (Plag 2008a:117). However, such decrease of “markedness,
or number of forms, features or morphosyntactic distinctions” seems a regular happenstance of
language change more generally, as (e.g.) in the history of Germanic and Romance. These
insights go as far back as Bunsen 1854. It has also been pointed out, quite repeatedly since at
least Meillet 1919, that L2A does play a major role in such increments of local simplicity,
especially in paradigms of inflectional morphology as in the verbal and nominal systems.
Compare, say, the inflectional morphology of Latin vs. that of Romance. In both the verbal and
the nominal domains, Romance languages show a decrease of inflectional paradigms as
compared to their Latin ancestor. Ditto regarding English as compared to Proto-Germanic. Thus,
French and English as well can be argued to be “simpler than [their] input languages” on rather
superficial grounds. (Then again, we must also remember that both language contact and
language change can also increase the number of distinctions. See Aboh & DeGraff 2014 for
one such case study on Haitian Creole.)

Various authors have discussed the role of L2A in the decrease of inflectional morphology in the
course of contact-induced language change as in the history of French and English. But, as far as
we know, it hasn’t yet been argued that modern French and modern English emerged as
conventionalized early-interlanguage varieties. It is thus reasonable, in Uniformitarian fashion,
to analyze the cases of local simplicity in Creoles without postulating that these languages, as a
class, emerged as conventionalized early interlanguages.

Plag (2008a:117n4) stresses that the EIH “does not entail any commitment to whether creoles are
simpler overall, or to whether such a notion of overall simplicity is meaningful in the first place.”
In the same article it is stated that “the creole creators made use of the same mental processes as
any second language learner does” (Plag 2008a:128). Yet the EIH, in effect, is an instance of
Creole Exceptionalism to the extent that Creole languages, and Creole languages only, are
considered as “conventionalized early interlanguages of an early stage.” Furthermore, the
hypothesis that Creoles would “display almost exclusively structures for which no information
exchange between constituents is necessary” (Plag 2008a:125) does make Creoles “exceptional,”
contrary to our observations in this Section. As shown here, the syntax and semantics of natural
languages, including Creoles, routinely rely on intra- and inter-phrasal “information exchange”
for a variety of constraints and operations, including agreement phenomena, whether the latter
correlate with morphological inflection, lexical choice or other sorts of co-occurrence restrictions.
Such agreement mechanisms, whether or not they have reflexes in overt morphological inflection,
may well constitute a core engine of all natural languages, including Creoles (cf. Miyagawa
2010).

4. When computational phylogenetics is not about evolutionary history

We now examine the related claim that not only are Creoles overall simpler languages, but they
also form an exceptional group of languages on the grounds of complexity, typology and genealogy.
This claim is made most forcefully in a 2008 paper by Mikael Parkvall on “Creole simplicity”
and in a 2011 paper by Bakker, Daval-Markussen, Parkvall & Plag titled “Creoles are
typologically distinct from non-Creoles,” with a follow-up paper in 2012 by Daval-Markussen &
Bakker titled “Explorations in creole research with phylogenetic tools.” This claim is what we
shall refer to as the Exceptional Creole Typology Hypothesis (hereafter “CTH”). In
arguing in favor of CTH, Bakker et al 2011 (“B&al”) and Daval-Markussen & Bakker 2012
(“D-M&B”) use novel computational phylogenetics algorithms such as NeighborNet, as
implemented in SplitsTree (Huson & Bryant 2006, cf. Dunn et al 2008). According to B&al:33, these
computational methods produce results that are “quite staggering,” “results [that] invariably
cluster all the creoles and pidgins, quite separately from the non-creole languages of the world”
(B&al:33). Furthermore it is claimed that “creoles as a group stand out as being less complex
than non-creoles” (B&al:8) and that “the conclusion that creoles (and pidgins, for that matter)
are typologically distinct from the languages of the world is inescapable and robust” (B&al:35).
Parkvall 2008 and B&al are approvingly cited in McWhorter’s (2011:10) search for “a litmus
test for creole status” in synchronic structural terms. And D-M&B call B&al “the strongest
piece of evidence” for a Creole typology, evidence that is, in D-M&B’s own words, improved by
their “irrefutable evidence” (D-M&B:94).

In this section, we will show that the CTH is undermined by methodological, conceptual, logical,
and empirical problems (cf. Fon Sing & Leoue 2012 for one previous critique of methodological
and empirical failings in B&al). Thus, B&al’s and D-M&B’s results, far from “staggering” and
“irrefutable,” are quite illusory once certain issues are clearly laid out—about the foundations
and scope of NeighborNet (Bryant & Huson 2006), which is the main algorithm in SplitsTree
used by D-M&B, and about critical flaws in B&al’s and D-M&B’s basic definitions and
assumptions and in their methods, data and feature selection (also see Aboh & DeGraff, to
appear, for a related critique of Parkvall 2008 and Bakker et al 2011). The larger lesson here is
that novel computational methods such as those in SplitsTree must be reviewed with great care,
as we do here, before they are applied in the analysis of language-change and language-creation
phenomena.

4.1. Methodological problems in the Creole Typology Hypothesis: Problematic
assumptions and circularity of definitions

The basic claim of the CTH is that Creole languages as a class fit a narrowly defined and
uniform set of structural templates that, in turn, place them at the bottom of a certain complexity
hierarchy.

But the CTH runs into problems from the outset, starting with circularity in B&al’s
definition of “Creoles,” a definition that also bears on their biased sampling of “Creole”
languages, as we will see in Section 4.3. B&al seem aware of this problem: “in order to avoid
circularity” they consider a socio-historical definition, that is, they consider Creoles as
“nativized or vernacularized developments of pidgins, which are makeshift languages used in
some contact situations” (2011:10, also p. 36). But, here they shift the definitional burden onto
“pidgins” as “makeshift languages,” and this introduces a basic empirical issue, namely the
absence of any documentation for such “makeshift languages” in the history of the classic
Caribbean Creoles. In Bakker’s (2003:26) own words: “There are no cases where we have
adequate documentation of a (non-extended) pidgin and a creole in the same area.” This “data
problem,” as it is called in Bakker 2003, is well known in Creole studies. Mufwene (2003:34f)
has provided maps that illustrate this problem by highlighting the “geographical complementary
distribution between the territories where creoles developed and those where pidgins emerged.”

More recently, McWhorter (2011:70) reckons that “we will never have empirical data of the
pidgin stages” for Caribbean Creoles. It is thus that the CTH becomes both circular and biased
when the putatively simple structures of B&al’s narrow sample of Creole languages (most of
them with documented ancestry in subsets of Germanic, Romance and Niger-Congo languages)
are attributed to these Creoles’ hypothetical ancestry in un-documented Pidgins qua “simplified
forms of interethnic makeshift languages [that] were insufficient for communication”
(McWhorter 2011:36).

The CTH is further weakened by B&al’s problematic assumptions about, and their use of, certain
computational methods for establishing typological or historical (un)relatedness among
languages, especially the methods they borrow from the important, but controversial, work of
Dunn and his colleagues on the use of structural-typological features in computational
phylogenetics.

4.2. What is computational phylogenetics? Basic concepts, scope and limitations

Among its many applications, modern computational phylogenetics can be used to formalize the
insights of the 19th century’s Stammbaumtheorie: these sophisticated algorithms can be used to
discover systematic correspondences across languages that can, then, be clustered into families
that are, in turn, considered to be related to one another in virtue of shared, evolutionary history.
A familiar and famous example of the Comparative Method is the reconstruction of Proto-Indo-
European on the basis of shared features of cognate morphemes from extant Indo-European
(“IE”) languages, including the possibility that some cognates differ in their phonology via
“descent with modification,” to use Darwin’s terminology. Similarly, in biology one can observe
that our human species and, e.g., crocodiles, platypus and horses all have four limbs, implying a
common tetrapod ancestor from which all these species evolved. On this account, birds and bats,
which are assumed to descend from this common tetrapod ancestor, must have independently
developed wings from two of these four limbs: the most parsimonious account is that the
common ancestor originally had four limbs, and that birds and bats had two of these four limbs
(the forelimbs) develop into wings (McGhee 2011:6f). This parallel-evolution account is more
parsimonious than the alternative in which the common ancestor had two limbs and two wings,
and following this, several species then independently developed two additional limbs from the
wings (Stearns & Hoekstra 2000).
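
The parsimony comparison above can be made concrete with a small count. The following is only a minimal sketch under a strong simplifying assumption that is not in the original argument: every species is treated as descending directly from the common ancestor (a star-shaped tree), so each mismatch with the hypothesized ancestral state costs exactly one independent change. The species list and the two-way coding ("limb" vs. "wing") are likewise simplifications for illustration.

```python
# Forelimb state for the species mentioned in the text (simplified coding).
forelimb = {
    "human": "limb",
    "crocodile": "limb",
    "platypus": "limb",
    "horse": "limb",
    "bird": "wing",
    "bat": "wing",
}


def changes_needed(ancestral_state, tip_states):
    """Count the tips whose state differs from the hypothesized ancestral state.

    Assumes a star-shaped tree, so every mismatch is one independent change.
    """
    return sum(1 for state in tip_states.values() if state != ancestral_state)


for hypothesis in ("limb", "wing"):
    print(hypothesis, changes_needed(hypothesis, forelimb))
```

Under these assumptions the four-limbed ancestor requires two independent changes (birds and bats), whereas the winged ancestor requires four, which is why the former counts as the more parsimonious scenario.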

The end result of phylogenetic analysis for both languages and biological species has
traditionally been a branching tree whose tips denote features of the observed contemporary
species. The tree’s topology indicates family relationships grounded on evolutionary history.
The lengths of the branches approximate time or more simply the number of changes in feature
values, and the internal nodes denote putatively historically reconstructed ancestral states. All
such models adopt, either explicitly or implicitly, some model of evolutionary change (in the
case of languages, linguistic change), so that one can correctly ascertain whether features are
common in virtue of common history (as in the limbs of humans, crocodiles and horses), as
opposed to independent ‘inventions’ (as in the wings of birds and bats).

However, besides common descent and independent innovation, there are other ways that two
linguistic (or biological) species might come to share a feature. One possibility is horizontal
transfer, as when two languages are in geographical contact, and one borrows lexical items or
structural properties from the other. In this case, the classic Stammbaum model no longer strictly
applies and evolution is no longer purely treelike (Nakhleh 2005b). While horizontal transfer is
relatively uncommon in biological species apart from prokaryotes (e.g., the Bacteria; cf. Jain et
al 1999, Syvanen & Kado 2002 and Theobald 2010), this process is widely recognized as a
productive source of shared features in human languages, as the prevalence of word borrowing
attests. There are two main approaches to this issue: methods that construct an explicit
representation of the (historical) contact/borrowing events; and methods that construct an
implicit, ahistorical representation of multiple tree-like possibilities simply as a graphical
representation of alternatives. As Nichols & Warnow (2008) note, “when there is borrowing
between languages, the proper graphical model will reflect such borrowing through the addition
of contact edges. Such graphical models are called ‘explicit phylogenetic networks’ since they
represent an explicit evolutionary scenario” (2008:762f).5

5 Nichols & Warnow assume that creolization is an exceptional diachronic phenomenon with “dual parentage”
(2008: 762; cf. Taylor 1956:413). We also find Creoles excluded from the comparative data in Ringe et al 2002: 63f,
108, Nakhleh et al 2005b: 404n18, Labov 2007: 344, 346, 349, 371, etc. and Holman et al 2008: 339. But our main
claim in this paper is that Creole formation instantiates the same sort of language-contact phenomena (e.g.,
borrowings and other areal diffusion phenomena) that we find in established language families such as
Indo-European, as documented in (e.g.) Ringe et al 2002 and Nakhleh et al 2005b. (See DeGraff 2009: 923–932 for
explicit arguments against the sort of Creole Exceptionalism illustrated in Ringe et al 2002, Nakhleh 2005b, etc.).

As before, interior nodes in an explicit phylogenetic network denote reconstructed ancestral
states as in Stammbaumtheorie, while the contact edges (horizontal lines) denote specific
historical events. For example, one might analyze the Haitian Creole agentive suffix –adò as in
mantadò ‘liar’ (cf. manti ‘to lie’) as a borrowing from Spanish (compare HC –adò with the
Spanish agentive suffix -dor as in matador ‘killer’; cf. matar ‘to kill’). Such a borrowing can be
represented by a (notional) phylogenetic tree like the one in (14) below, where we have also
displayed French and Italian in their usual position. According to this very simplified
evolutionary graphical model, French, HC, Italian and Spanish all share a common ancestor “A1”
(i.e., Latin or, more precisely, a cluster of Latin dialects) while French and HC share a common
ancestor “A2” (i.e., a cluster of 17th century French dialects) that excludes Italian and Spanish.
The shared feature that connects HC -adò and Spanish -dor is due to the “horizontal transfer”
event that is shown by the double-headed vertical arrow in (14) —a transfer event that is quite
likely in light of the history and geography of Haiti, which neighbors the Spanish-speaking
Dominican Republic (see DeGraff 2001a:68 for further details).
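
One way to see what makes such a network "explicit" is to write it down as a data structure in which inheritance edges and contact edges are kept apart and every internal node names a hypothesized ancestor. The sketch below is illustrative only: the node labels "A1" and "A2" follow the description above, but attaching Italian and Spanish directly to A1 is a simplifying assumption, and the function names are hypothetical.

```python
# Inheritance (vertical) edges: ancestor -> descendants.
# Assumption: Italian and Spanish hang directly off A1 in this toy network.
tree_edges = {
    "A1": ["A2", "Italian", "Spanish"],  # A1: cluster of Latin dialects
    "A2": ["French", "Haitian Creole"],  # A2: cluster of 17th-century French dialects
}

# Contact edges: (source, target, transferred feature).
contact_edges = [("Spanish", "Haitian Creole", "agentive suffix -dor / -adò")]


def ancestors(language):
    """Return the chain of ancestors of `language`, nearest first."""
    parent_of = {
        child: parent
        for parent, children in tree_edges.items()
        for child in children
    }
    chain = []
    while language in parent_of:
        language = parent_of[language]
        chain.append(language)
    return chain


def closest_common_ancestor(a, b):
    """Return the nearest node that is an ancestor of both languages, if any."""
    ancestors_of_b = set(ancestors(b))
    for node in ancestors(a):
        if node in ancestors_of_b:
            return node
    return None


print(closest_common_ancestor("French", "Haitian Creole"))   # A2
print(closest_common_ancestor("Spanish", "Haitian Creole"))  # A1
```

In this representation the shared suffix is recorded as a contact event rather than as inheritance, so the inheritance structure still places Haitian Creole with French under A2.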

(14) An explicit phylogenetic network for a toy example of horizontal transfer into Haitian
Creole

Note that one could have posited an alternative, though much less probable, tree, without any
borrowing, whereby Spanish and Haitian Creole shared some common (Romance) ancestor, and
it was this ancestor that possessed the agentive suffix –dor which was then passed down to
Spanish relatively unchanged, but which in Haitian Creole mutated to -adò. In a third, yet even
less likely, scenario, Spanish and HC would have developed the agentive suffixes -dor and -adò,
respectively, in parallel and independently of each other, somewhat on a par with the
independent, but convergent, development of wings in birds and bats.

Given the available data (both linguistic and historical), these three competing scenarios are
fairly easy to choose from. Yet they do illustrate the potential difficulty that is inherent in
untangling shared borrowing due to geographic contact vs. common descent vs. independent
convergent development. The point about shared common descent (genealogy) vs. borrowing
due to contact (geography) is made more forcefully and with realistic data in Donohue et al 2011.
Models that explicitly introduce horizontal transfer (i.e., contact-induced borrowing) in linguistic
phylogenetic analysis have been advanced by Nakhleh et al (2005b), Barbançon et al (2012) and
others.

Finally, besides the explicit reconstruction of evolutionary (historical) language relationships,
one can construct implicit network relationships among languages without specifying any
assumed evolutionary or historical relationships among them (Huson & Bryant 2006:256). This
approach is exemplified by the computer program SplitsTree, which can use any one of several
methods to first group related languages together and then any one of a number of ways to
display the results as a network instead of a tree (Huson & Bryant 2006). As Nichols & Warnow
observe, the output in such a method, which produces what Huson (2012) calls a “splits graph”,

“... instead represents graphically how the input data … do not fit a tree exactly. Thus the
graph represents a combination of tree-like signal and noise in the data. In particular, the
internal nodes of this graph do not represent ancestors of the given languages, but are
introduced in order to make possible the representation of the conflict between the
different splits that are produced in the data analysis.” (764f).

It is the algorithms in SplitsTree that have been vigorously applied in recent analysis of Creole
languages. But this use of computational phylogenetics in Creole studies comes with a
methodological twist whereby diachronic claims are extrapolated from the display of certain
typological similarities. Witness the following quotes: “In recent years, a number of algorithms
have been developed by bioinformaticians to help visualize biological evolution (see, e.g., Huson
& Bryant 2006). The resulting phylogenetic networks have a number of advantages over the old
evolutionary trees. First, they can account for horizontal relationships, i.e., contact phenomena”
(B&al 2011:7, emphasis added) and “[these] algorithms now make it possible to draw trees that
show not only inheritance, but also horizontal influence (contact, borrowing)” (B&al 2011:10).

The analyses that we examine in the remainder of this paper typically use two specific
components of SplitsTree: an incremental, agglomerative clustering algorithm, Neighbor-Joining
(NJ) applied to a particular collection of measured language features, followed by a so-called
‘split decomposition’ of the resulting graph output from NJ.6 NJ clustering amounts to a
‘similarity’ or ‘distance metric’ based on language features that progressively combines language
clusters considered four groups at a time, allowing for overlap. Thus, NJ is actually just
computing how close or how far apart one language is from another in terms of the selected
features regarded as distance, without regard to their possible historical ‘relatedness’ or
evolutionary history. It is important to emphasize, while keeping in mind the evolutionary claims
in B&al, that such a method is thus purely taxonomic or cladistic, rather than phylogenetically
(historically) based. The second step, the calculation of the ‘splits graph’, is one way to visually
display uncertainty in the way that one language is ‘close’ to others or not.
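
To illustrate that this kind of clustering operates on distances alone, here is a minimal sketch of the first selection step of the textbook Neighbor-Joining criterion, applied to the made-up feature strings of the toy example in (15). It is not the SplitsTree implementation, and the function names are hypothetical. For n languages the criterion computes Q(i,j) = (n - 2)·d(i,j) - Σ d(i,k) - Σ d(j,k) and joins the pair with the smallest Q; no historical information enters the computation.

```python
from itertools import combinations

# Made-up binary feature strings from the toy example in (15).
features = {
    "German": "01010101",
    "English": "01100110",
    "Spanish": "10101010",
    "French": "10101011",
    "Haitian Creole": "10101001",
}


def hamming(a, b):
    """Number of positions at which two equal-length feature strings differ."""
    return sum(x != y for x, y in zip(a, b))


def nj_first_join(feats):
    """Return the first pair joined by the Neighbor-Joining criterion, plus all Q values."""
    names = list(feats)
    n = len(names)
    dist = {
        frozenset(pair): hamming(feats[pair[0]], feats[pair[1]])
        for pair in combinations(names, 2)
    }
    # Sum of distances from each language to all the others.
    row_sum = {
        i: sum(dist[frozenset((i, k))] for k in names if k != i) for i in names
    }
    q = {
        pair: (n - 2) * dist[frozenset(pair)] - row_sum[pair[0]] - row_sum[pair[1]]
        for pair in combinations(names, 2)
    }
    best_pair = min(q, key=q.get)
    return best_pair, q


best, q_values = nj_first_join(features)
print(best, q_values[best])
```

On these strings the criterion first joins German and English (Q = -32); the next-lowest values (Q = -26) pair French with either Spanish or Haitian Creole. The result depends only on the selected features and the distance computed from them.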

An artificial but concrete example may help make this distinction clear. Consider the made-up
binary feature data for five languages as shown below in (15), where 0/1 denotes the
absence/presence of 8 different features numbered 1 through 8:

(15) Language Feature values

German 01010101

English 01100110

Spanish 10101010

French 10101011

Haitian Creole 10101001

By inspection, looking down the feature columns at each language, it is apparent that French
differs from Haitian Creole by exactly one feature, the penultimate one. Therefore, the so-called
‘Hamming’ distance between HC and French (defined as the number of 0-1 bit differences
between the feature values of HC vs. French) is 1. Spanish differs from Haitian Creole by two
features (the last two), with a Hamming distance of 2; and so forth, with English and German
being the most ‘distant’ from Haitian Creole, each with a Hamming distance of 6. The
phylogenetic tree one obtains by ordinary methods (using the Neighbor Joining method) reflects
these feature (or “character”) differences transparently.

6 The SplitsTree program also lets one choose other metrics for calculating clusters, but NJ works more efficiently
and accurately with large numbers of languages and measured features and so is the method of choice used by the
analyses that we examine here.
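
The distances just described can be checked mechanically. The following minimal sketch recomputes the Hamming distance from Haitian Creole to each of the other four languages, using the feature strings exactly as given in (15); the function name is hypothetical.

```python
# Made-up binary feature strings from the toy example in (15).
features = {
    "German": "01010101",
    "English": "01100110",
    "Spanish": "10101010",
    "French": "10101011",
    "Haitian Creole": "10101001",
}


def hamming(a, b):
    """Number of positions at which two equal-length feature strings differ."""
    if len(a) != len(b):
        raise ValueError("feature strings must have the same length")
    return sum(x != y for x, y in zip(a, b))


for name, bits in features.items():
    if name != "Haitian Creole":
        print(name, hamming(bits, features["Haitian Creole"]))
# German 6
# English 6
# Spanish 2
# French 1
```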

(16) Neighbor Joining phylogenetic tree for toy example in (15):

If we analyze the same data using NeighborNets (NJ plus split-decomposition display),
constructing networks as advocated by B&al, one obtains the following graphical output:

(17) NeighborNet network for toy example in (15):

As before, this figure depicts Haitian Creole as ‘closer’ to French and Spanish than to English or
German. In addition, three parallel lines have been drawn in the graph, which is no longer a tree.
These lines mark the ‘split’ between (i) German and English vs. (ii) Spanish, French, and Haitian
Creole: if one makes a single ‘cut’ through the three red lines, the graph would fall apart (i.e.,
‘decompose’) into two completely connected, separate components (or ‘splits’), namely
(i) Haitian Creole, French and Spanish vs. (ii) English and German—hence the term ‘Split
Decomposition’ for this method of displaying the differences between (i) vs. (ii).

These three distinct lines indicate that German and English can be split off from Spanish, French,
and Haitian Creole in three possible, different ways: via feature column 1 (German and English
have the value 0 while the remaining 3 languages have the value 1); via column 2 (German and
English with value 1 vs. the remaining 3 languages value 0); or via column 5 (German and
English with value 0 vs. the remaining languages value 1). As should be apparent from the figure
in (17), the three red lines do not necessarily denote possible horizontal, historically grounded
transfer events such as borrowing or diffusion via language contact. In other words, it is these 3
features that “support” the split between (i) Spanish, French and Haitian Creole vs. (ii) English
and German.7
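
Which feature columns support a given split can also be read off the data directly. The sketch below is a hypothetical helper, not part of SplitsTree: it lists the columns in which all languages on one side of a proposed split share one value and all remaining languages share the other value.

```python
# Made-up binary feature strings from the toy example in (15).
features = {
    "German": "01010101",
    "English": "01100110",
    "Spanish": "10101010",
    "French": "10101011",
    "Haitian Creole": "10101001",
}


def supporting_columns(feats, side_a):
    """Return the 1-based columns that separate `side_a` from all other languages."""
    length = len(next(iter(feats.values())))
    columns = []
    for i in range(length):
        values_a = {bits[i] for name, bits in feats.items() if name in side_a}
        values_b = {bits[i] for name, bits in feats.items() if name not in side_a}
        # A column supports the split when each side is uniform and the sides differ.
        if len(values_a) == 1 and len(values_b) == 1 and values_a != values_b:
            columns.append(i + 1)
    return columns


print(supporting_columns(features, {"German", "English"}))
```

On the strings exactly as printed in (15), this check returns columns 1, 2, 5 and 6: the three columns named above, plus column 6, where German and English also have the value 1 and the other three languages the value 0. Either way, the supporting columns are simply properties of the selected features and carry no information about borrowing or descent.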

In short, the red lines in (17) simply denote those features that happen, after the actual clustering
analysis has taken place, to split the graph into two separate parts, without regard for
evolutionary history, with the distance between language species indicating simply similarity or
dissimilarity (Huson & Bryant, 2006:255f).

While there is nothing wrong with using this visualization as a heuristic device, it may be
problematic to assume that this method meets the burden of proof required to reveal historical
events such as processes of Creole formation. Indeed, it must be stressed that such implicit
networks cannot be interpreted as necessarily indicating any sort of diachronic scenario, though
one can reasonably argue that close clustering might be suggestive of such a scenario. This is not
a novel observation. The same point holds for any similar ‘cladistic’ approach that calculates
distances between (language) species without assuming any evolutionary model that produces
the distances. This point is confirmed by some of the very figures in B&al, such as examples where
“non-creoles do not classify along genetic or areal lines. For example, Basque (isolate, Western
Europe), Hindi (Indo-European, India/South Asia), Burushaski (isolate, North Pakistan), and
Hunzib (East-Caucasian, Caucasus) cluster” (B&al:33; see (18) below which is their figure 10).

7 The length of the parallel lines gives an estimate of the number of characters that ‘support’ the split.

(18) Network of 153 non-Creole languages, 34 pidgins and creoles, and Esperanto (EESP)

Underscoring the fact that such networks are not necessarily indicative of evolutionary scenarios,
take another look at the figure in (17). Notice that the red lines connecting French and Haitian
Creole back to the German-English portion of the graph end at internal nodes that have no labels
whatsoever, as must be the case since the NJ algorithm using distances produces only a cladistic,
cluster analysis rather than an historical, evolutionary analysis. Thus, these internal nodes do not
denote a possible reconstructed ancestral language. In Huson & Bryant’s terms, “internal nodes
in a split network do not necessarily correspond to hypothetical ancestors” (2006:259).

We have now established that, contrary to B&al’s claims (e.g., 2011:11–14), the Split
Decomposition methodology does not necessarily model horizontal contact events or other
historical scenarios. B&al define “phylogenetic network” as “any graph used to visualize
evolutionary relationships (represented by edges or branches) between [...] languages
(represented by nodes or taxa)” (2011:12, emphasis added). Furthermore they claim that:

“[t]hese networks that account for both historical relationships and borrowed items are
ideal for application to creole languages, as both inheritance and contact play an
important role in the formation and development of creoles. ... These relations as
established in by the programs can represent loans, structural borrowings, shared
inheritance, substratal influence or independent developments...” (2011:14, emphases
added).

However the fact is that the networks displayed in B&al’s article do not represent any
evolutionary relationship; these networks can and should only be used to display similarities
along the specific set of features that they’ve selected—with their selection proceeding in a
documentedly biased fashion (see below).

We next turn to a critique of some of the actual datasets in B&al and D-M&B, with an eye
towards examining a second critical component of accurate phylogenetic analysis, the proper
selection of languages, features and feature values. Phylogenetically informative feature
selection is not straightforward. Yet, using the wrong features unavoidably leads to erroneous
results, in biology as in linguistics (for linguistics, see the caveats in Nichols & Warnow
2008:769, 814). The right features to select need not be apparent, and even when they are
apparent, analysis may prove difficult.

A classic example illustrating this point in biology centers on the proper phylogenetic position of
the family including whales and dolphins, the Cetacea. Using external morphology (vestigial
hip and foot bone structure), cetaceans had long been grouped with the family of ungulates such
as deer and camels. However, with the advent of molecular sequencing, less ‘visible’, more
deep-seated features based on genomic information have firmly established that the closest living
relative of cetaceans is the hippopotamus (Nomura & Yasue 1999). The analogy with ‘overt’
and ‘covert’ structure in Creoles is clear: without an independent, unbiased way to select and
confirm the features that are used in comparing languages, existing classifications can be
incorrect or even circular, subject to sampling bias. We turn to an examination of this possibility
in the Creole-related analysis of the next section.

4.3. Methodological problems in Bakker et al 2011: Biases in feature and
language selection8

4.3.1. Fundamental issues in statistics as regards (un)biased samples

The field of statistics deals with determining what inferences can be drawn from data.
Fundamental issues in statistics such as causality, bias, significance and reproducibility are
prominently discussed in most good introductory statistics textbooks (e.g. Starnes et al 1996,
Witte & Witte 2010). Improper applications of statistics can result in spurious conclusions.
Such errors can be extremely subtle and difficult to prevent or detect for non-specialists.9 Such
errors undermine the claims in Parkvall 2008, B&al and D-M&B.

Let us consider the basic logic of B&al’s and D-M&B’s argumentation for the exceptional
typological status of a particular class of languages C (“C” for “Creoles and Pidgins”):

(19) The basic logic in B&al’s and D-M&B’s argumentation:

a) Select a “reasonable” set C0 ⊂ C of languages in C.
b) Select a “reasonable” set N0 ⊂ N(C) of the languages N(C) not in C.
c) Select a “reasonable” set F0 ⊂ F of available features F.
d) Demonstrate that the languages in C0 are, as a set, closer to each other by normalized
Hamming distance than to the languages in N0, considering groups of four languages at a
time. More specifically, NeighborNet uses a modified version of Neighbor Joining
clustering, defining the distance between two languages as the fraction of attested
features on which they differ, where a feature is attested if its value is known for both
languages. Note that this assumes equal weighting of all features.

8 Aboh & DeGraff, to appear, offers a related analysis of some of the sampling biases in Parkvall 2008 and Bakker
et al 2011.
9 For non-technical surveys of this widespread phenomenon, see Huff 1954 and Joel 2001.
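
The distance used in step (d) can be sketched as follows. This is a minimal illustration and not the code used by B&al or by SplitsTree: the marker "?" for an unknown value, the language labels and the feature strings are all hypothetical. The distance is taken as the share of attested features on which two languages differ, as a normalized Hamming distance requires.

```python
from itertools import combinations, product
from statistics import mean


def normalized_hamming(a, b, missing="?"):
    """Share of attested features (known for both languages) on which a and b differ."""
    attested = [(x, y) for x, y in zip(a, b) if x != missing and y != missing]
    if not attested:
        raise ValueError("no feature is attested for both languages")
    return sum(x != y for x, y in attested) / len(attested)


# Hypothetical samples C0 and N0 over five hypothetical features F0.
c0 = {"C1": "1011?", "C2": "10010"}
n0 = {"N1": "01100", "N2": "0?000"}

within_c0 = mean(
    normalized_hamming(c0[a], c0[b]) for a, b in combinations(c0, 2)
)
c0_to_n0 = mean(
    normalized_hamming(c0[a], n0[b]) for a, b in product(c0, n0)
)
print(within_c0, c0_to_n0)
```

With these invented strings the within-C0 distance is smaller than the C0-to-N0 distance, but that outcome was built into the choice of C0, N0 and F0, which is precisely the weakness of the logic in (19).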

Using an argument of this type, a class of languages C can be shown to be “exceptional” if there
exist “reasonable” subsets F0 ⊂ F, C0 ⊂ C, and N0 ⊂ N(C) such that the languages in C0 are
closer to each other than to the languages in N0 using normalized Hamming distance with respect
to F0. Note also that these papers generally define “reasonable” for the selected sets of languages
as diverse in terms of “lexifiers, geography and circumstances of genesis” (2011:16f) rather
than as “representative” or “sufficiently large.”

The logic in (19) suffers from a variety of flaws, the main one being the possibility of sampling
bias. Such bias occurs when there is a systematic failure to equally represent all classes of
objects that are supposed to be represented in a sample. Linguistic applications introduce many
complications, so let us first consider a simplified example from everyday life: the relation
between height and age in people. Using the logic in (19), we will show that the members of the
set C of 15-year-old males are, as a class, significantly taller than males of other ages, which is
clearly an absurd hypothesis. Select a set C0 ⊂ C of 15-year-old males. Make sure that this set is
“reasonable” in the sense that it is diverse with respect to many factors: ethnicity, nationality,
hair color, eyesight, age of parents at birth, etc. Select a set N0 ⊂ N(C) of males who are not 15
years old. Make sure that this set is similarly diverse, adding age as a diversity factor. If the 15-
year-olds we select happen to all be young basketball stars and the factors we use to justify
diversity do not allow us to identify this very important fact, the hypothesis is likely to be
erroneously verified. More subtle examples can be constructed as well. We know that average
heights vary across countries, for example, so if we omit the basketball player condition and
simply upweight the selection of 15-year-olds to include more individuals from countries with
larger average heights, our hypothesis may again appear true. Weighting the sample of
non-15-year-olds to include more young children would be similarly problematic.
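
The basketball-star scenario can be simulated in a few lines. The sketch below rests on assumptions made purely for illustration: both groups are given the same height distribution (mean 170 cm, standard deviation 8 cm, both invented numbers), the sample size of 50 is arbitrary, and the "basketball stars" are modeled as the 50 tallest 15-year-olds.

```python
import random
from statistics import mean

rng = random.Random(0)  # fixed seed so that the sketch is repeatable


def draw_heights(n):
    """Draw n heights (cm) from one shared, invented distribution."""
    return [rng.gauss(170, 8) for _ in range(n)]


fifteen_year_olds = draw_heights(1000)  # the class C
others = draw_heights(1000)             # the complement N(C)

n0 = rng.sample(others, 50)                    # sample of non-15-year-olds
random_c0 = rng.sample(fifteen_year_olds, 50)  # unbiased sample of C
biased_c0 = sorted(fifteen_year_olds)[-50:]    # "basketball stars": the 50 tallest

print("random C0 minus N0:", round(mean(random_c0) - mean(n0), 1))
print("biased C0 minus N0:", round(mean(biased_c0) - mean(n0), 1))
```

Although the two populations are identical by construction, the biased sample makes the 15-year-olds look far taller than everyone else, while the random sample shows only a small chance difference. Diversity with respect to ethnicity, nationality or hair color would do nothing to expose the selection on height.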

Going beyond the one-feature case (height), it is not a difficult exercise to identify for any
additional feature how we can select sets C0 and N0 of people on which their values differ
significantly. In fact, it is far more challenging to find sets that do not suffer from bias. As we
add more features and extend the scope of our conclusions, the difficulty of eliminating bias
increases if features are added without due attention to their representativeness vis-à-vis the
relevant populations. (And, as we find more features on which a particular class of people differs,
it may become increasingly tempting to qualify this class as “exceptional.”) While the above
example is blatant, sampling bias is often subtle and difficult to detect. The extreme difficulty of
detecting and eliminating bias is, in fact, one of the primary justifications for the existence of
such a large and well-funded field of statistics. As we search for typological clusters, we must
question whether the selection of languages and features is biased in any way.

B&al seem aware of the possibility of biases in their selection of languages and features. They
write: “We used other scholars’ pre-existing samples in order to avoid any potential bias by the
present authors with regard to the selection of languages or features. The use of software also
guarantees that all features have the same weight, thus minimizing the bias” (p. 15). However,
giving all features the same weight is not a proven way to minimize bias. A more empirically
reliable method to reduce bias might include weighting features according to, say, their
relative significance and time stability given a particular theory about their cross-linguistic
distribution and their transmission (see Section 4.7), though this is challenging to do given that
the features used for distance metrics should be given equal weights, all other things being
equal. There is also the risk of introducing theoretical biases into the input to the algorithms. But
this risk is mitigated if the theory finds broad empirical support elsewhere.

Recall B&al’s claim that theirs is “a balanced sample, with a fair distribution across lexifiers,
geography and circumstances of genesis, including at least one that is fairly deviant structurally
and not always classified as a creole (Nagamese)” (2011:17). The fact that most of the lexifiers
of the sampled Creoles are from Germanic or Romance and that their substrates are from Niger-
Congo should already cast doubt on the claim that we have “a balanced sample,” unless it is
assumed in ad hoc circular fashion that Creoles, by definition, tend to have Germanic or
Romance lexifiers and Niger-Congo substrates. (If this assumption is made, the conclusions of
the analysis may shed more light on languages with this particular history than on Creoles.
Correlation does not imply causality: the fact that a certain set of languages clusters separately
does not imply that it clusters separately because it is composed of Creoles.) In any case,
nowhere in B&al’s definition of “Creoles” is it assumed that Creoles’ lexifiers are generally
Germanic or Romance or that their substrates are generally Niger-Congo. In our basketball-
player example, it would not make sense either to assume that the definition of “15-year-old
male” includes an assumption that they tend to be basketball players.

One of the primary problems that our basketball-player example makes clear is that in-sample
diversity (e.g., the inclusion of Nagamese in B&al’s sample) is not necessarily a good
determinant of whether a sample is representative of the whole population. While the set of
people who are basketball players is diverse with respect to many factors, it is not representative
of the whole population. This point may seem intuitively obvious, but it is often subtle and can
be missed by even discerning readers.

To make statistically valid claims about a whole population through the type of argument in
B&al, it is critical to base calculations on a sufficiently large and unbiased sample of that
population, such as one that is uniformly randomly selected (this is a necessary but not a
sufficient condition for statistical validity). Given the limited datasets available and the fact that
they are not themselves representative of the whole population of languages (which languages
are present, which features are attested for which languages, etc.), this is a tall order. The types
of typological arguments that can be made are greatly limited by the available data.

4.3.2. Feature- and language-related biases in Bakker et al 2011 and Daval-Markussen & Bakker 2012

In this section we demonstrate that the “pre-existing samples” of languages and features in B&al
and D-M&B were themselves created with explicit biases, in light of previous authors’ searches
for pan-Creole features or for substrate influence in Creoles. Moreover, we will also show that it
is not accurate to claim that the features in these studies all have the same weight (regardless of
whether this is a welcome characteristic): certain features are weighted more heavily because of
logical dependencies among them (Kouwenberg 2010). One clear example is the pair of features
“double object marking” and “object marking” where the former entails the latter, thus causing
the latter feature to be counted twice in many instances. These inter-dependence relationships
were themselves introduced in B&al’s and D-M&B’s own recoding of WALS features.
Furthermore, many of the values assigned to these features are empirically invalid and, worse yet,
logically contradictory. But let us first start with the bias-related questions and related
fundamental issues in sampling.

Faced with the question whether Creoles are substantially typologically distinct, a less biased
study would collect a sufficiently large and random sample of languages across geographical and
typological areas, then pick a sufficiently large and random sample of features across all domains
of grammars, and then see if, in this randomly selected group of languages, languages that are
identified as “Creoles” on independent grounds would cluster together (see note 10). Such “independent”
grounds may appeal to, say, the labels that speakers assign to their native languages (“Creole” or
some variant thereof) for ethnographic or historical factors.
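One way to operationalize the last step of such a study is a permutation test: compare how tightly the independently labelled languages cluster with how tightly random sets of the same size cluster. The Python sketch below is only an illustration under stated assumptions. The database is random toy data, the function names are hypothetical, mean pairwise Hamming distance is just one possible measure of tightness, and the planted group exists only to show what a genuine cluster would look like.

```python
import random
from itertools import combinations

def mean_pairwise_distance(rows):
    """Mean Hamming distance over all pairs of equal-length binary rows."""
    pairs = list(combinations(rows, 2))
    return sum(sum(x != y for x, y in zip(a, b)) for a, b in pairs) / len(pairs)

def clustering_p_value(matrix, labelled, n_perm=500, seed=0):
    """Share of random same-size language sets at least as tight as `labelled`."""
    rng = random.Random(seed)
    observed = mean_pairwise_distance([matrix[i] for i in labelled])
    hits = 0
    for _ in range(n_perm):
        group = rng.sample(range(len(matrix)), len(labelled))
        if mean_pairwise_distance([matrix[i] for i in group]) <= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

rng = random.Random(42)
# Hypothetical database: 200 languages x 50 binary features, all random.
database = [[rng.randint(0, 1) for _ in range(50)] for _ in range(200)]

# Steps 1 and 2: random samples of languages and of features.
lang_ids = rng.sample(range(200), 60)
feat_ids = rng.sample(range(50), 20)
matrix = [[database[l][f] for f in feat_ids] for l in lang_ids]

# Step 3: rows labelled "Creole" on independent grounds (here: arbitrary rows).
labelled = list(range(8))
p_random = clustering_p_value(matrix, labelled)

# Contrast case: plant a genuine cluster by making the labelled rows identical.
planted = [row[:] for row in matrix]
for i in labelled:
    planted[i] = [0] * len(feat_ids)
p_planted = clustering_p_value(planted, labelled)

print(f"arbitrary labels: p = {p_random:.3f}")
print(f"planted cluster:  p = {p_planted:.3f}")
```

The design choice that matters here is the order of operations: languages and features are sampled before anyone looks at which feature values the labelled languages share, so a small p-value cannot be produced by the selection itself.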

Note how this method differs from the logic in (19).

Such an unbiased typological study might strive to heed, at least, the following guidelines, as
advocated in Dunn et al 2008, which is taken as a model by B&al:

(20) Dunn et al’s guidelines toward reliable computational phylogenetics:

(i) Use a “combination of structural features from different domains of a grammar
(phonology, morphology, syntax, semantics)” (p. 715).
(ii) Investigate “as many abstract structural features from as many parts of the grammar
as possible” (p. 716f; also see p. 733 and Nichols & Warnow 2008:784).
(iii) Provide “a large body of basic features for each language, which together give a
broad typological profile, regardless of whether any given feature seems typologically
significant. The resultant phylogenies are thus not likely to reflect a sampling bias.”
(Also see Wichmann & Saunders 2007:383 on how areal effects introduce noise in
the data when the comparison is based on a small set of features.)
(iv) “[A]void the charge of ‘hand-picking’ features by including in [the] sample the
widest feasible range of noninterdependent typological phenomena” (p. 733; cf.
Wichmann & Saunders 2007:376,382,385n7).

These caveats are all ignored in B&al and in D-M&B. In effect, both of these studies selected a
relatively small number of features that, by and large, were already established to show relatively
uniform markings across their small sample of Creole languages, and then they went on to claim
that these features “prove” that Creoles form an exceptional typology.

                                                                                                               
10
Even if we were able to do this, there would still be substantial sampling biases. For example, we are restricted to
languages that are sufficiently documented and (mostly) to living languages. There is also the now familiar “data
problem”: most of the world’s languages that have been labeled “Creoles” have superstrate languages from
Germanic and Romance, and substrate languages from Niger-Congo.

The study that B&al take as the most convincing uses 43 features for the comparison of 188
languages, 34 of which are Pidgins and Creoles. One problem is that, of the 8,084 possible
feature values defined by this language/feature matrix, 1,743 are marked “?” for missing data,
nearly 22% of the total, with substantial unwelcome consequences for NeighborNet’s clustering
algorithm (see Section 4.4.3 below).
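The arithmetic behind these figures can be checked directly. The short Python sketch below uses only the counts given in the text; the final estimate is an added illustration that assumes, hypothetically, that missing values are spread independently and evenly across languages and features, which the actual data need not satisfy.

```python
n_features = 43
n_languages = 188
n_missing = 1743  # cells marked "?" according to the text

n_cells = n_features * n_languages
missing_share = n_missing / n_cells

print(n_cells)                  # 8084
print(f"{missing_share:.1%}")   # 21.6%

# Illustrative assumption only: if "?" cells were spread independently,
# a pair of languages would share about this many attested features.
expected_shared = n_features * (1 - missing_share) ** 2
print(round(expected_shared, 1))
```

Under that independence assumption, a typical pairwise comparison would rest on roughly 26 of the 43 features rather than on all of them.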

In comparison, Dunn et al 2008 used 115 features for their sample of 22 Papuan languages (p.
728, 730), with features ranging over phonology, morphology and syntax—a stark contrast with
B&al’s study whose 43 features come almost exclusively from narrow domains of
morphosyntax.

In D-M&B, the number of features relevant to their Creole typology goes down to 4, and
these 4 features were selected as those that occur in 80% of Creoles. This is a spectacular
example of sampling bias, an example to which we return below, in Section 4.5 and in Appendix
B. There we will show that there exist many other sets of languages with 4 or more feature
values in common. In fact, what this appendix shows is that there exist some 1017 sets of
languages with 4 identical feature values in common and that the maximum cardinality of such
sets is 146. Therefore, the 36 Creole languages in D-M&B are by no means “exceptional”: they,
like many other (arbitrary) sets of languages, can be distinguished by some array of linguistic
features. The logical problem here is that there are many (1017) arbitrary sets of languages that
have some set of 4 features in common without any such set of languages forming a natural class
on any diachronic or synchronic grounds. Essentially, in statistical terms, these papers draw
conclusions from ‘noise’.
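The general point, that sets of languages sharing 4 feature values arise even where no natural class exists, can be illustrated on random data. The Python sketch below is a toy demonstration and not the computation in Appendix B: the matrix is random, its dimensions (60 languages, 12 binary features) are hypothetical and much smaller than the real dataset, and only the maximal group for each combination of features and values is counted.

```python
import random
from collections import defaultdict
from itertools import combinations

rng = random.Random(1)
N_LANGUAGES, N_FEATURES, K = 60, 12, 4   # hypothetical toy dimensions

# Random binary "typological" data: no natural classes by construction.
data = [[rng.randint(0, 1) for _ in range(N_FEATURES)]
        for _ in range(N_LANGUAGES)]

groups = []  # (feature indices, shared values, languages in the group)
for feats in combinations(range(N_FEATURES), K):
    by_values = defaultdict(list)
    for lang, row in enumerate(data):
        by_values[tuple(row[f] for f in feats)].append(lang)
    for values, langs in by_values.items():
        if len(langs) >= 2:   # at least two languages share all K values
            groups.append((feats, values, langs))

largest = max(len(langs) for _, _, langs in groups)
print(len(groups), "groups share some", K, "feature values")
print("largest group has", largest, "languages")
```

Every group found this way is "distinguished" by its own set of 4 feature values, yet none of them reflects anything beyond chance.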

In B&al’s and D-M&B’s studies, the features that are used in investigating a Creole typology are
from Hancock 1987, Holm & Patrick 2007 and Parkvall 2008. These features all come from
relatively narrow areas of grammar.

To begin with, Hancock 1987 is based on a relatively superficial comparison of 50 isolated
sentences and phrases, without any analysis and without any effort to embed these data in larger
paradigms in any of the given languages. Such a sketch cannot be taken to reliably reveal
typological features of the languages in question. Hancock himself called these sentences “an
organon for continuing work” (1987:282).

The study that is based on Parkvall 2008 deals with 43 of Parkvall’s 53 features, features that are
all taken from narrow and superficial areas of morphosyntax, mostly having to do with overt
morphology (see critiques in Aboh 2009, Kouwenberg 2010, 2012). Furthermore, Parkvall’s
features were explicitly chosen with a particular complexity metric as a criterion: “For all the
traits listed, I consider their presence (or the presence in larger numbers) to add to the overall
complexity of a language” (2008:270). Consider, for example, the fact that, in his search for
“Creole simplicity,” Parkvall looks for features outside of WALS and includes in his list a
feature such as F52 “Alienability” from Nichols 1992. But a grammatical marker for alienability
is among the “closed-class and minority patterns [that] are so driven by universal preferences in
their marking as to yield little interesting typological or geographical variation” (WALS, p. 103).
In other words, a feature such as alienability can hardly be taken to identify typological clusters.

As for the features in Holm & Patrick’s 2007 book Comparative Creole Syntax (CCS), they were
selected with the explicit objective to cluster Atlantic Creoles and Niger-Congo languages
together, and to contrast them with the Creoles’ European ancestors (Holm 2007:vi).
Furthermore, like the features in Parkvall 2008, the CCS features come from isolated and inter-
dependent areas of grammar. Consider (e.g.) the fact that among the 97 features in Holm &
Patrick 2007, 40 of them concern the Tense-Mood-Aspect (TMA) and verbal systems, with 24 of
them related to temporal interpretation; among the other features, some 20 are related to the
nominal system (Véronique 2009:153f). Below we give examples of inter-dependent (i.e.,
redundant) features in Parkvall 2008 and describe how such redundancy creates additional biases
in B&al’s and D-M&B’s samples of features (cf. Kouwenberg 2010).

It must be stressed that the studies in Hancock 1987 and Holm & Patrick 2007 started out with
the stated objective of looking for pan-Creole properties (by and large, interdependent morpho-
syntactic properties from limited aspects of grammar). They looked for these features among
Creoles that, for the most part, had similar sets of superstrate languages (from Germanic and
Romance) and substrate languages (from Niger-Congo). B&al and D-M&B then used the
features in these studies, alongside Parkvall’s (2008) features, in order to detect a Creole
typology. In other words, the features that form the basis of the CTH (be it the features and
languages from Hancock 1987, Holm & Patrick 2007 or Parkvall 2008) are interdependent
features from narrow domains of morphosyntax, without balanced representation from other
areas of grammar. To recapitulate, these features are not at all representative of the space of
parametric variation across the world’s languages. Therefore they have little value for arguing
that Creoles, as a class and independently of their superstrate and substrate languages, are
typologically distinct from non-Creoles. Given such biases, all that B&al and D-M&B can
conclude is that the Creoles in their sample tend to show similar values for the very features that
they have chosen. Such narrow similarity, though an interesting exercise on typological grounds,
is not on a par with the systematic sets of correspondences that have been used in the
Comparative Method in order to show genetic relatedness. Therefore, we must be skeptical of
B&al’s aforementioned claim that their feature sets, when fed into phylogenetic programs such
as SplitsTree, can help identify diachronic events such as “loans, structural borrowings, shared
inheritance, substratal influences or independent developments” (2011:14).

We also find biases in the samplings of Creole languages in B&al’s and D-M&B’s studies.
Consider the sample of 34 Pidgin and Creole languages in Parkvall 2008. This sample is biased
because its languages were chosen based on a non-random selection of features: these are
“languages with values for at least 30 out of 53 features considered” based on Parkvall’s notion
of simplicity (2008:273). The selection of languages is thus biased since it depends on a prior
biased selection of features. By definition, such languages cannot be taken as a random,
representative sample of the underlying space of “Creole” languages, if (big “if”) the term “Creole”
were to be given an operational structural definition that would be independent of the particular
contingent ancestry of the most commonly studied Creoles (i.e., those with Germanic or
Romance superstrates and Niger-Congo substrates). B&al’s definition of “Creole” is one such
definition, as they define “Creoles” based on their “sociohistorical origin” as nativized pidgins,
with pidgins defined as “interethnic makeshift languages” (2011:36). Such a definition does not,
in principle, force Creoles to have Romance or Germanic superstrates and Niger-Congo
substrates. Therefore B&al’s sample of Creoles is not representative of the set of languages
picked out by said intensional definition.

Other biases come up when we look closely at the sample in Parkvall 2008. Two of these
languages were from Dryer et al’s (2005) World Atlas of Language Structures (WALS), 18 of
them from Holm & Patrick’s (2007) Comparative Creole Syntax: Parallel Outlines of 18 Creole
Grammars (“CCS”) and the rest of Parkvall’s own choosing. The majority of languages in
this sample have sources in Western European languages (Germanic or Romance) and Niger-
Congo (usually Kwa and Bantu). These source languages thus belong to restricted areas of
typological variation, a fact that has been noted in Alleyne 1980:146–180, Thomason &
Kaufman 1998:154, Bakker 2003:26, Mufwene 2008:136–153, Holm 2008:319f, Kouwenberg
2010, etc. Presumably, such a typologically limited set of Creole languages constitutes an
extremely biased sample from the start. This bias is further enhanced by the fact that the features
in Parkvall 2008 give priority to overt morphological marking: given the morphological profile
of Atlantic Creoles’ superstrate and substrate languages (from Germanic, Romance and Niger-
Congo, which themselves do not have high inventories of morphological paradigms, except for
Bantu) it is naturally expected that these Creoles will tend to score low on these overt markings,
especially keeping in mind the effects of second-language acquisition in language-contact
situations (see the many studies since Weinreich 1953; also see note 11).

Having established that the sampling of languages and features in B&al’s and D-M&B’s studies
is thoroughly biased from the start, let us now see how these sampling biases play out in the
specific study based on the features and languages in Parkvall 2008.

4.3.3. Biases inherited from Parkvall’s (2008) features and languages

First, the sampling bias in the Parkvall feature set can be determined by a simple inspection of
the features and their values across Creoles and non-Creoles. One crucial issue here is that the
43 Parkvall-based features in B&al are of very low diversity, contrary to Dunn et al’s (2008:716)
caveat to investigate “as many abstract structural features from as many parts of the grammar as
possible.” Indeed these features are by and large based on two sorts of distinctions: (i) overt
morphological markings for a relatively small set of morphosyntactic distinctions (e.g., for
gender, number, person, perfective, evidentiality); (ii) cardinality of various sets of signals (e.g.,
number of vowels and consonants, number of genders), forms (e.g., suppletive ordinals,
obligatory numeral classifiers) and “constructions” (e.g., passive, antipassive, applicative,
alienability distinction, difference between nominal and verbal conjunction). Such a choice of
features ignores large domains of grammatical phenomena and is certainly not representative of
the space of cross-linguistic variation (Kouwenberg 2010).

11
One might wonder whether morphologically complex conjugated forms of French verbs could have led us to
expect such forms in French-derived Creoles as well. But here one must stress that the varieties of French in the
colonial Caribbean would favor periphrastic over synthetic verbal forms, as argued by Chaudenson & Mufwene
(2001); see DeGraff 2005b for a case study.

The sampling bias and low diversity problems in B&al’s selection of features are illustrated by
the following fact: out of the 43 features in the Parkvall-based study in B&al, there are 23
features for which no Creole shows a “1” (i.e., “presence”). This fact
alone may suggest that the selected features may have been consciously or unconsciously
handpicked in order to produce the desired result about a Creole “cluster.” As we will see
below (in Section 4.4), these features are not only biased, but B&al and D-M&B produce many
feature values that are empirically problematic and even contradictory in many cases. As a result,
their Creole typology is largely an artifact of both methodological and empirical errors. We can
picture the issue more clearly with the heatmap in (21), which translates B&al’s sampling biases
into a color coding: ‘warmer’ colors (orange to red shading) indicate languages that are closer
to each other (in terms of Hamming distance, i.e., the number of 0/1 bit differences) while ‘cooler’
colors (green to blue) indicate languages that are farther apart from one another in terms of Hamming
distance.

(21) Heatmap for pairwise similarities based on the Parkvall-based feature data in B&al.
The similarities among Creoles and Pidgins (“CP”) are located at the upper left corner of the map,
based on their number code (from “2” to “34”). “E” is Esperanto and “N” are languages in
B&al’s sample that are outside the “CP” set.

The similarity values represented by this heatmap are from the distance matrix that is computed
based on the feature values that B&al used from Parkvall 2008 (see B&al’s online appendix).
The heatmap plots similarity as 1 minus the NeighborNet distance for each language pair. The
red area at the upper left corner of the heatmap in (21) shows that the Creole languages in
B&al’s sample have the highest degree of pair-wise similarities as compared to other groupings
in their sample. In other words, the Creole languages in B&al’s sample are not only much more
similar to each other as a set than they are to the non-Creoles as a set, but also much more
similar to each other as a set than the non-Creoles are to the other non-Creoles as a set. A biased
selection of features in B&al could explain this pattern of similarities. Indeed there is no a priori
reason, independent of the selection of features, why Creole languages should be so similar to
one another, much more so than languages in any other sociohistorically, linguistically or
geographically relevant groupings in the sample, including Germanic, Romance, European
languages, Balkan languages, Niger-Congo languages, etc. The unanswered question is: Why
would the posited sociohistory of Creoles force them into such an extraordinarily tight
typological cluster? As we will see below (in Section 4.4), some of the values that create this
extraordinarily tight cluster are either inter-dependent or contradicted both by logic and by actual
data from Creole and non-Creole languages, including data from the very WALS database
(Dryer & Haspelmath 2011), which is listed as a source of data in B&al and D-M&B.
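The kind of pairwise similarity that such a heatmap displays can be sketched as follows. This Python example is a simplified stand-in and not the NeighborNet or SplitsTree computation: it assumes a Hamming distance normalized over the features attested in both languages, the function name is hypothetical, and the four feature strings are invented for illustration rather than taken from B&al's appendix.

```python
def hamming_distance(a, b, missing="?"):
    """Share of mutually attested features on which two languages differ."""
    pairs = [(x, y) for x, y in zip(a, b) if missing not in (x, y)]
    if not pairs:
        raise ValueError("no feature is attested for both languages")
    return sum(x != y for x, y in pairs) / len(pairs)

# Hypothetical feature strings (one character per binary feature).
languages = {
    "CP1": "0000?0",
    "CP2": "00000?",
    "N1":  "110100",
    "N2":  "0?1011",
}

names = list(languages)
# Similarity is plotted as 1 minus the distance, as in the heatmap.
similarity = {
    (p, q): 1 - hamming_distance(languages[p], languages[q])
    for p in names for q in names
}

for p in names:
    print(p, [round(similarity[(p, q)], 2) for q in names])
```

In this toy data the two "CP" rows are maximally similar simply because every attested feature is one on which both score "0", which is the pattern a feature set dominated by such features would produce.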

Another fact that is of importance in evaluating the sampling bias that B&al inherit from
Parkvall 2008 is that Parkvall’s features, most of which are taken from WALS, are often
logically or empirically interdependent (Kouwenberg 2010:371). This interdependence, which
skews B&al’s results, partly stems from the encoding of WALS multi-state features as binary
features in B&al’s study. Such translation (from multi-valued to binary features) introduces
extra weight to certain distinctions. Altogether B&al’s re-coding of WALS features introduces
errors that are far from trivial (see note 12). But let us first consider some straightforward examples that will
illustrate some of the errors that exacerbate the alleged differences between the Creole and non-
Creole languages in B&al’s already biased sample.

Consider for example the following features and their multi-state vs. binary values in WALS vs.
B&al, respectively: WALS feature 23 “locus of marking the clause” has 5 values based on the
marking of the “direct object”: (i) head-marking; (ii) dependent-marking; (iii) double-marking;
(iv) no marking; (v) other. In B&al, WALS multi-valued feature 23 is recoded as two distinct
binary-valued features F06 “overt marking of direct object” and F07 “double marking of direct
object.” But here a “1” (i.e., “presence”) for F07 entails a “1” for F06. In other words, “double
marking of direct object” logically entails that the direct object is “overtly marked.”
Equivalently, a “0” (i.e., “absence”) for F06 entails a “0” for F07: if the direct object gets no
“overt marking,” then there cannot be “double marking of direct object.” As Kouwenberg
notes in her critique of Parkvall 2008, “[t]he inclusion of interdependent features gives
languages an opportunity to ‘score’ twice (or more times), or to ‘fail to score’ twice (or more
times)” (2010:371).

12
Another methodological problem introduced by the translation from multistate to binary features is the
introduction of homoplasies, and thus a higher likelihood of chance fluctuation, which in turn increases the likelihood
of erroneous clusters (Wichmann & Saunders 2007:401; also see Nichols & Warnow 2008:803, 806, 808).

Such double counting is widespread. Consider another example: B&al’s binary features F30
“Grammaticalized past/non-past” and F31 “Remoteness distinctions of past” are meant to
translate the single multi-valued feature 66 in WALS “The Past Tense.” The latter in WALS has
the following values: (i) “present, no remoteness distinctions”; (ii) “present, 2-3 remoteness
distinctions”; (iii) “present, 4 or more remoteness distinctions”; (iv) “No past tense.” Here
comes another redundancy factor in B&al’s distance calculation: a “0” for F30 (i.e., no past vs.
non-past distinction) entails a “0” for F31 as well (i.e., no remoteness distinctions of past). In
other words, it is a tautology that there cannot be any remoteness distinction in past marking if
there is no past marking at all. Similarly, any remoteness distinction in the marking for past
implies that past is overtly marked.
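The extra weight introduced by such recoding can be shown with a small example based on WALS feature 23. The Python sketch below rests on an assumption: B&al do not document how they mapped WALS values onto F06 and F07, so the `RECODE` table is a hypothetical mapping that merely respects the entailment described above (F07 = “1” entails F06 = “1”). The WALS value “other” is left out.

```python
# Hypothetical recoding of WALS feature 23 into (F06, F07).
RECODE = {
    "no marking":        (0, 0),
    "head-marking":      (1, 0),
    "dependent-marking": (1, 0),
    "double-marking":    (1, 1),
}

def multistate_distance(v1, v2):
    """One multi-valued feature contributes at most one difference."""
    return int(v1 != v2)

def binary_distance(v1, v2):
    """Number of differing bits after recoding into F06/F07."""
    return sum(a != b for a, b in zip(RECODE[v1], RECODE[v2]))

for pair in [("no marking", "double-marking"),
             ("no marking", "head-marking"),
             ("head-marking", "dependent-marking")]:
    print(pair, multistate_distance(*pair), binary_distance(*pair))
```

Under this mapping, a language with no marking and a language with double marking differ by one step on the original feature but by two bits after recoding, while the head-marking versus dependent-marking contrast disappears entirely.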

F06/F07 and F30/F31 are among the 6 cases of inter-dependent pairs of binary features that re-
encode multi-valued features in WALS. In (22) we provide a list of such logically inter-
dependent features—12 features out of a total of 43, without counting cases of inter-dependency
that are due to cross-linguistic implication tendencies (see note 13). Altogether nearly half of
B&al’s features show some inter-dependency of one kind or another (we return to this issue in
Section 4.4 and in Appendices A and B).

(22) Logically inter-dependent binary features in B&al’s WALS-based study
(cf. Kouwenberg 2010:371f; see note 13):

a) F06/F07 (“overt marking of direct object”/“double marking of direct object”) for WALS
23 (“locus of marking the clause”). Inter-dependency: F07 = “1” entails F06 = “1.”
b) F08/F09 (“Possession by double marking”/“Overt possession marking”) for WALS 24
(“Locus of Marking in Possessive Noun Phrases”). Inter-dependency: F08 = “1” entails
F09 = “1.”
c) F11/F13 (“Gender”/“Non-semantic gender assignment”) for WALS 30–32 (“Number of
Genders”/“Sex-based and Non-sex-based Gender Systems”/“Systems of Gender
Assignment”). Inter-dependency: F13 = “1” entails F11 = “1.”
d) F23/F24 (“Ordinals exist as a separate class beyond ‘first’”/“Suppletive ordinals beyond
‘first’”) for WALS 53 (“Ordinal Numerals”). Inter-dependency: F24 = “1” entails F23 =
“1.”
e) F30/F31 (“Grammaticalized past/non-past” /“Remoteness distinctions of past”) for
WALS 66 (“The Past Tense”). Inter-dependency: F31 = “1” entails F30 = “1.”
f) F36/F37 (“Evidentiality (grammatical)”/“Both indirect and direct evidentials”) for WALS
78/77 (“Coding of Evidentiality”/“Semantic Distinctions of Evidentiality”). Inter-
dependency: F37 = “1” entails F36 = “1.”

13
These are only the features that are logically inter-dependent. Kouwenberg (2010:371) also notes certain
empirical correlations among Parkvall’s features along the lines of well documented cross-linguistic implicational
tendencies. For example: “F38 ‘non-neutral marking of full NPs’ and F39 ‘non-neutral marking of pronouns’: this
feature pertains to Case marking, and there is obvious interdependence between Case marking of different types of
nominals; in particular, it is unlikely for full NPs to be Case-marked unless pronouns are Case-marked too” (ibid).

Recall that such inter-dependency arbitrarily introduces extra weight in the distance calculation,
and arbitrarily exacerbates the distance between Creoles and non-Creoles, especially in light of
the biases in the selection of features (see note 14). Matters are made even worse when such features are
assigned values that are contradicted by the actual data (Fon Sing & Leoue 2012), thus
introducing additional mistakes in the computation of a “Creole cluster” (in this case, the
clustering is partly based either on the double-counting of certain values or on the assignment of
erroneous values or on both mistakes combined). We now turn to examples of such logical and
empirical errors, including (i) erroneous feature values in well studied languages such as French,
Spanish, English, German and Swedish; (ii) logically impossible combinations of feature values
with (e.g.) certain languages being assigned the mutually contradictory characteristics of
simultaneously showing “no overt marking on direct object” and “double marking of direct object.”
Let us look at these errors in turn, alongside their consequences for B&al’s claims about a Creole
typology.

                                                                                                               
14
Not all multiple-value features in WALS are translated into multiple binary features in Bakker et al. Contrast, say,
features 23 and 66 in WALS, each of which is encoded as two features in Bakker et al (F06/F07 and F30/F31,
respectively, as explained in main text) vs. feature 99 in WALS “Alignment of Case Marking of Pronouns” which is
translated as one single feature in Bakker et al, namely F39. As far as we have been able to ascertain, Bakker et al
do not explain their algorithm for the translation of individual multi-valued features in WALS into (multiple)
binary features. Since no actual linguistic example is mentioned or cited in Bakker et al 2011, we cannot tell how
the feature values were determined. This becomes all the more problematic in light of the empirical errors discussed
in the main text.

4.4. Empirical problems in the Creole Typology Hypothesis: Clusters based on
interdependent features and erroneous feature values, including empirically
problematic values and logically contradictory values (see note 15)

4.4.1. Logically contradictory feature values

Let us consider again B&al’s features F06 “overt marking of direct object” and F07 “double
marking of direct object.” It is logically impossible for any language to simultaneously exhibit
“double marking of direct object” and no “overt marking of direct object” (i.e., the combination
F06 = “0” and F07 = “1” is a logical impossibility). Yet, this is exactly what we find in B&al’s
data (their online appendix), not for one, but for the following seven languages, with WALS
code in parentheses: Amele (ame), Chukchi (chk), Ewe (ewe), Georgian (geo), Goonyandi (goo),
Diola-Fogny (dio), and Slave (sla).

A similar logical contradiction involves features F11 “Gender” and F13 “Non-semantic gender
assignment.” The presence of the latter (F13 = “1”) entails the presence of the former (F11 =
“1”), yet we find languages like Guaraní (gua), Khalkha (kha), Meithei (mei), Maricopa (mar) and
Zuni (zun) that are described by B&al as having “non-semantic gender assignment” while
simultaneously lacking “gender,” which is another logical impossibility. Also problematic are languages
described as having “non-semantic gender assignment” but with “unknown value” (i.e., a
question mark “?”) for “gender”: Ladakhi (lad), Mundari (mun), Hunzib (hzb) and Ingush (ing)
are such languages. Indeed, once we know that a language has “non-semantic gender assignment,”
we also know that it has “gender” assignment. In other words, F13 = “1” and F11 = “?” is
another logical contradiction.
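Contradictions of this kind can be detected mechanically. The Python sketch below is a minimal illustration and not the procedure behind Appendix A: the function name and the three sample rows are hypothetical, only three of the entailments discussed above are encoded, and a “?” on the entailed feature is counted as inconsistent, in line with the reasoning in the preceding paragraph.

```python
# (dependent feature, entailed feature): a "1" on the first requires
# a "1" on the second.
ENTAILMENTS = [("F07", "F06"), ("F13", "F11"), ("F31", "F30")]

def inconsistent_pairs(row):
    """Return the entailments violated by one language's feature values."""
    return [
        (dep, base)
        for dep, base in ENTAILMENTS
        if row.get(dep) == "1" and row.get(base) != "1"  # "0" or "?"
    ]

# Hypothetical rows illustrating the situations discussed in the text.
sample = {
    "lang_A": {"F06": "0", "F07": "1", "F11": "1", "F13": "1",
               "F30": "1", "F31": "0"},
    "lang_B": {"F06": "1", "F07": "1", "F11": "?", "F13": "1",
               "F30": "0", "F31": "0"},
    "lang_C": {"F06": "1", "F07": "0", "F11": "0", "F13": "0",
               "F30": "1", "F31": "1"},
}

report = {name: inconsistent_pairs(row) for name, row in sample.items()}
for name, problems in report.items():
    print(name, problems)
```

Here `lang_A` reproduces the F06/F07 contradiction, `lang_B` the F13 = “1” with F11 = “?” case, and `lang_C` is consistent.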

We find similar contradictions for almost every other combination of inter-dependent features.
Out of the 188 languages in the Parkvall-based dataset used in B&al, 152 have at least one
inconsistent pair, for a total of 215 inconsistent pairs (see data in Appendix A). Out of the 5
pairs of inter-dependent binary features, only one pair (F09/F10 “Possession by double
marking”/“Overt possession marking”) shows a logically consistent array of markings across all
the languages in the sample. Such widespread combinations of erroneous feature values—
combinations that either contradict each other or contradict WALS values—undermine the
                                                                                                               
15
All the feature values here are taken from the online Appendix 6 to Bakker et al 2011:
http://aal.au.dk/fileadmin/www.aal.au.dk/lingvistik/afdelingen/bakkeretal2011JPCLappendix.pdf

reliability of B&al’s results (cf. Fon Sing & Leoue 2012).

4.4.2. The statistics of Parkvall’s (2008) interdependent features

As we have seen a few times now through concrete examples, one major flaw in Parkvall 2008,
B&al and D-M&B concerns the choice of linguistic features, especially the issue of
interdependent features. We now discuss this problem from a more abstract—and strictly
mathematical and computational—perspective.

As a general rule, the more dependent a set of features is, the easier it is to find sets of languages
that share subsets of those features. Holman (2008) found that there is a significant amount of
dependence in WALS features. Holman uses the adjusted Rand index (Hubert & Arabie 1985),
henceforth “ARI.” The original Rand index for two languages with respect to a set of features F
is the fraction of mutually attested pairs of features in F for which the two features agree, where
agreement is defined as both features having either the same value in both languages or different
values in both languages. The ARI adjusts for chance agreement so that it has an expected value
of 0 for a set of independent features, a maximum of 1 for features that agree perfectly, and a
minimum of -1. From the 2,560 languages and 138 features of the WALS database, Holman
constructs a slightly reduced version with 2,488 languages and 130 features.16 This version of
WALS has an ARI of 0.0161, which indicates a significantly higher dependence than the ideal
value of 0. Holman then reduces the 130 selected WALS features to just 47 “approximately
independent” features, by successively eliminating the most highly dependent remaining features
until the average ARI reaches the desired 0.17

                                                                                                               
16
“The present study excludes the four features with redundant data, and also the four features referring to color
terms, which are attested for a sample of languages that does not overlap enough with the rest to allow reliable
comparisons; thus, 130 features are compared. The analyses are based on 2488 languages, excluding pidgins, creoles,
and sign languages” (Holman 2008:216). (The last sentence is yet another instance of Creole Exceptionalism
among historical linguists; cf. note 5.)
17
It is worth noting that Holman’s definition of “approximately independent” as having a mean ARI of at most 0
does not preclude the possibility of significant dependencies within the features. Therefore, “approximately
independent” is a somewhat misleading term. In addition, Holman makes only a minimal attempt to demonstrate the
optimality of the features he identifies as “approximately independent.” The only evidence provided as to “whether a
better procedure would find more than 47 independent features” is that two additional tested procedures did not find
an approximately independent set of features of a larger size, a highly unsatisfactory answer. However, as Holman’s
47 features have significantly less dependence than do the WALS features overall, we use them in this paper as a
proxy for independent features, in lieu of a more rigorously identified set of “approximately independent” features.

How independent are the features in the dataset that B&al and D-M&B use from Parkvall 2008?
This dataset contains 188 languages and 43 features. The mean ARI of this dataset is 0.0231,
significantly above 0, indicating high dependence. We applied Holman’s algorithm to this
dataset to find a subset of features with average ARI of at most 0, resulting in 28 approximately
independent features. If we also eliminate the 10 features with logically inconsistent values in
this dataset based on our earlier incompatibility argument (F06, F07, F11, F13, F23, F24, F30,
F31, F36, F37), the ARI of the resulting 33 features remains at 0.0231 but now Holman’s
algorithm run on these 33 features yields a set of features of size 22, around half the size of the
original set of 43.
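The elimination procedure can be pictured roughly as follows. This is a sketch, not Holman's actual implementation: it assumes that pairwise ARI values have already been computed, and it assumes one particular reading of “most highly dependent” (the feature with the largest summed ARI against the remaining features). The function names and the toy values are hypothetical.

```python
from itertools import combinations

def mean_pairwise_ari(features, ari):
    """Mean of the precomputed pairwise ARI values over the given features."""
    pairs = list(combinations(features, 2))
    if not pairs:
        return 0.0
    return sum(ari[frozenset(p)] for p in pairs) / len(pairs)

def greedy_independent_subset(features, ari, target=0.0):
    """Drop features one at a time until the mean pairwise ARI is at most target."""
    remaining = list(features)
    while len(remaining) > 2 and mean_pairwise_ari(remaining, ari) > target:
        # Assumed criterion: remove the feature most dependent on the rest.
        worst = max(
            remaining,
            key=lambda f: sum(ari[frozenset((f, g))] for g in remaining if g != f),
        )
        remaining.remove(worst)
    return remaining

# Hypothetical pairwise ARI values for four toy features.
toy_ari = {
    frozenset(("A", "B")): 0.5,
    frozenset(("A", "C")): 0.25,
    frozenset(("A", "D")): 0.0,
    frozenset(("B", "C")): 0.25,
    frozenset(("B", "D")): -0.125,
    frozenset(("C", "D")): -0.125,
}
kept = greedy_independent_subset(["A", "B", "C", "D"], toy_ari)
```

In the toy case the mean ARI starts at 0.125, feature A is removed first, and the remaining three features have a mean ARI of exactly 0, so the loop stops.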

For the remaining approximately independent sets of features, while the Creoles are still more
similar to each other than to the other languages based on the normalized Hamming distance, this
difference in similarity is significantly reduced. (Here, we define similarity as one minus the
normalized Hamming distance.) For example, the mean similarity among Creoles is 48% higher
than the mean similarity between Creoles and non-Creoles across the whole set of languages, but
only 28% higher when the logically inconsistent and dependent features are eliminated. While
the feature reduction strategy here does not itself unilaterally invalidate the conclusions drawn in
Parkvall 2008, B&al and D-M&B, it does seriously call them into question. Their conclusions
are further undermined by the scope and magnitude of the empirical and conceptual errors in
B&al as illustrated throughout this paper.

(23) Table of mean similarities:

                                              Mean similarity
Exclude inconsistent?   Exclude dependent?    (All,All)   (N,N)    (C,N)    (C,C)
No                      No                    0.6332      0.6280   0.6139   0.9056
No                      Yes                   0.6645      0.6484   0.6745   0.8793
Yes                     No                    0.6361      0.6273   0.6249   0.9063
Yes                     Yes                   0.6690      0.6495   0.6879   0.8777

Sets of languages are abbreviated as “All” (all languages), “C” (Creoles and Pidgins), and “N”
(languages other than Creoles, Pidgins, and Esperanto). Mean similarity for a pair of language
sets (A,B) is calculated as the mean similarity score of all language pairs (a,b) where a is in A
and b is in B.
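As a minimal sketch of how such mean similarities can be computed, the following Python code defines similarity as one minus the normalized Hamming distance and averages it over pairs drawn from two language sets. The function names and the toy feature strings are hypothetical; skipping pairs of a language with itself is our assumption, since the definition above does not settle that point. The last lines simply redo the arithmetic behind the 48% and 28% figures using the (C,C) and (C,N) columns of table (23).

```python
def similarity(x, y, missing="?"):
    """One minus the normalized Hamming distance over jointly attested features."""
    pairs = [(a, b) for a, b in zip(x, y) if missing not in (a, b)]
    if not pairs:
        return None  # no feature attested in both languages
    return 1 - sum(a != b for a, b in pairs) / len(pairs)

def mean_similarity(set_a, set_b, data):
    """Mean similarity over pairs (a, b) with a in set_a and b in set_b."""
    scores = [similarity(data[a], data[b])
              for a in set_a for b in set_b if a != b]  # assumption: no self-pairs
    scores = [s for s in scores if s is not None]
    return sum(scores) / len(scores)

# Hypothetical toy data: two "C" languages and one "N" language.
toy = {"c1": "110?", "c2": "1100", "n1": "0011"}
cc = mean_similarity(["c1", "c2"], ["c1", "c2"], toy)
cn = mean_similarity(["c1", "c2"], ["n1"], toy)

def percent_higher(within, between):
    """How much higher (in percent) one mean similarity is than another."""
    return 100 * (within / between - 1)

all_features = percent_higher(0.9056, 0.6139)  # first row of table (23)
reduced = percent_higher(0.8777, 0.6879)       # last row of table (23)
```

Rounded to the nearest integer, the two ratios come out at 48 and 28 percent, matching the figures quoted in the text.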

(24) Heatmaps for mean similarities: In these heatmaps, Pidgins and Creoles are, again,
located at the upper left corner. The upper left heatmap uses the original 43 features;
dependencies are excluded in the lower left for a total of 28 features; logical inconsistencies are
excluded in the upper right for a total of 33 features; and logical inconsistencies and (then)
dependencies are excluded in the lower right for a total of 22 features.

Note the degradation in the uniform red color to orange when inconsistent and dependent feature
values are removed, in the lower right heatmap, as compared to the heatmap for the original
pairwise similarities, displayed in the upper left heatmap. This difference, thus, suggests that the

inconsistent and dependent features in B&al’s analyses play a role in making the Creole
languages in their sample look more similar to one another than they actually are.

4.4.3. Empirically problematic feature values and their problematic treatment in B&al’s clustering calculations

We will now analyze some of the ways in which missing data and erroneous feature values in
B&al directly affect the validity of their claims about a Creole typology.

One of the basic CTH claims is that the commonalities in the structural profiles of Creole
languages reflect their Pidgin ancestry—such common structures, it is claimed, cannot be taken
to derive from the Creoles’ source languages. In B&al, this claim is based, among other things,
on NeighborNet networks in which the sampled Creole languages constitute a cluster that is
clearly distinct from other languages. But our ongoing observations about the erroneous data in
B&al warrant skepticism about the interpretation of such networks.

One way to further test CTH is to look more carefully at the NeighborNet output networks that
B&al’s data produce for specific subsets of languages. These tests become even more
enlightening when we run NeighborNet on some of the languages that belong to the families
whose members participated in Creole formation. Even from B&al’s own figures and comments,
we already know that NeighborNet’s networks do not reproduce phylogenetic information about
well-established language families: in one of the aforementioned networks in B&al, we find that
“Basque (isolate, Western Europe), Hindi (Indo-European, India/South Asia), Burushaski
(isolate, North Pakistan), and Hunzib (East-Caucasian, Caucasus) cluster [together]” based on
the features taken from Parkvall 2008 (B&al 2012:33). We will next examine the degree to
which such phylogenetically anomalous clusters are artifacts of the specific data in B&al,
including the low diversity of their feature samples, their missing data and their erroneous
feature values.

4.4.3.1. The problem of missing data

Improperly handling missing data can lead to incorrect conclusions and violate experimental
reproducibility. Therefore careful attention to missing data is a critical component of any
statistical analysis. As a result, over the past few decades a substantial literature has arisen on

this topic (e.g., Engers 2010, Little and Rubin 2002, Allison 2002, Schafer and Graham 2002,
Rubin 1976), and this is a familiar problem in phylogenetic and cladistic analysis generally.
There are many reasons for caution when dealing with missing data. One such reason is that the
distribution of missing data may be different from that of observed data. If, for example, the
average height of a population is estimated on the basis of a survey, and younger people (who
tend to be shorter) have a lower response rate, the result may overestimate average height. In
general, care must be taken in any analysis to minimize the probability that missing data, once
populated, would lead to unanticipated conclusions. Below we’ll see an actual example where
missing values for certain features skew B&al’s results in favor of their CTH. This
example (about reduplication) is all the more intriguing because the values of the
features involved are well known in Creole studies and have been written about in
previous papers by Bakker and Parkvall.

Meanwhile the way NeighborNet deals with missing data can be troubling for linguistic
typological analysis. The default definition of distance (Hamming distance divided by the
number of attested features) is not particularly sensitive to biases (read: non-uniformity) in the
loci of missing data, especially when there is a substantial volume of missing data, as is the case
with WALS. Worse still, any biases in missing data may artificially create or magnify splits.
For example, if there is little overlap between the missing features in Creoles vs. the missing
features in non-Creoles or if the features on which Creoles and non-Creoles would agree are
missing more often than those on which they disagree, then this could result in spurious splits or
in splits of greater magnitude between Creoles and non-Creoles. In fact, it turns out that the
dataset from Parkvall 2008 upon which B&al and D-M&B heavily rely has a highly non-uniform
distribution of missing data loci, as displayed in (25). For example, features ‘F35’ and ‘F36’ are
present for all Creoles and Pidgins but are missing in approximately 74% of languages other than
Creoles, Pidgins, and Esperanto. (See Appendix C for the statistical details on these missing
features. These details are especially helpful for the feature numbers that overlap in the bottom-
left corner of the graph.)

(25) Features from Parkvall 2008 plotted by identification code (with “F” prefix and
leading zeros removed). Multiple features at the same coordinates are separated by
commas:

Populating the missing data could easily result in a different picture. One basic question here is
whether “the missing data are missing at random or the observed data are observed at
random.” Only in this case is it appropriate for our inferences to ignore the process that causes
the data to go missing, and even then such abstraction would only be appropriate for certain
classes of statistical problems (Rubin 1976:582). Rubin also advises us to “explicitly consider
the process that causes missing data.”

In the case of B&al and D-M&B, the missing data do not seem uniformly randomly distributed,
as hinted at by (25). As a concrete example, consider the feature “Reduplication” (i.e., F10 =

WALS 27). B&al assign F10 = “1” to only 2 Creoles (namely Kinubi and Fanakalo) and they
assign F10 = “?” (unknown values) to the majority of the Pidgins and Creoles in their sample.
Yet productive reduplication is well documented in a variety of Creole languages (Kouwenberg
2003), and we even find references to such documentation in Bakker & Parkvall 2005 (with
mention of “reduplication [as] fairly common in Creoles, possibly more so than in languages in
general” p. 511) and in Plag 2009 (with mention of “a wide range of reduplication processes
across a wide range of creoles,” p. 357). When it comes to languages outside their Pidgin/Creole
sample, B&al find productive reduplication in European languages such as English, French,
German and Spanish even though WALS indicates that such languages have no productive
reduplication, a point that is confirmed in Plag 2009:358 (see note 21 below for related
contradictions). This is one case where the process that causes data to go missing accentuates
the split between Creoles and non-Creoles at the expense of well-documented facts.

To further illustrate the pitfalls of interpreting the output from NeighborNet given missing data
as in B&al and D-M&B 2012, we provide a simple fictitious example with eight languages and
eight features (see (26)). With no missing data, there are no obvious clusters in this dataset other
than four pairs of languages. We then omit data in two different ways, resulting in two different
patterns of clusters. In both cases, the NeighborNet algorithm groups these languages
into distinct clusters that contradict the actual featural make-up of these languages. Such
erroneous clusters are due exclusively to missing data. Two distinct arrays of missing data
produce two distinct conclusions. These examples were constructed using a 25% rate of missing
data, which is approximately the same as the missing data rate (approximately 22%) in the B&al
data.

(26) An example of inconsistent interpretation of missing data

Missing      Data            Clusters in NeighborNet output

None         a1: 11111111    {a1, a2}
             a2: 11111111    {a3, a4}
             a3: 11110000    {b5, b6}
             a4: 11110000    {b7, b8}
             b5: 00001111
             b6: 00001111
             b7: 00000000
             b8: 00000000

Scenario 1   a1: 1111??11    {a1, a2, a3, a4}
             a2: 111111??    {b5, b6, b7, b8}
             a3: 1111?0?0
             a4: 11110?0?
             b5: 0000?11?
             b6: 00001?1?
             b7: 00000??0
             b8: 0000?0?0

Scenario 2   a1: ??111111    {a1, a2, b5, b6}
             a2: ?1?11111    {a3, a4, b7, b8}
             a3: 1?1?0000
             a4: 1??10000
             b5: ?00?1111
             b6: ?0?01111
             b7: 0??00000
             b8: ?00?0000
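
The effect in (26) can be checked directly on the pairwise distances, without running NeighborNet itself. The sketch below is only an illustration: it uses the feature strings from (26) and the default distance described above (Hamming distance divided by the number of jointly attested features), while the helper names are hypothetical. It compares the largest distance inside each reported cluster with the smallest distance across the two clusters.

```python
from itertools import combinations

FULL = {"a1": "11111111", "a2": "11111111", "a3": "11110000", "a4": "11110000",
        "b5": "00001111", "b6": "00001111", "b7": "00000000", "b8": "00000000"}
SCENARIO_1 = {"a1": "1111??11", "a2": "111111??", "a3": "1111?0?0", "a4": "11110?0?",
              "b5": "0000?11?", "b6": "00001?1?", "b7": "00000??0", "b8": "0000?0?0"}
SCENARIO_2 = {"a1": "??111111", "a2": "?1?11111", "a3": "1?1?0000", "a4": "1??10000",
              "b5": "?00?1111", "b6": "?0?01111", "b7": "0??00000", "b8": "?00?0000"}

def distance(x, y, missing="?"):
    """Hamming distance divided by the number of jointly attested features."""
    pairs = [(a, b) for a, b in zip(x, y) if missing not in (a, b)]
    return sum(a != b for a, b in pairs) / len(pairs)

def max_within(data, group):
    """Largest pairwise distance inside one group of languages."""
    return max(distance(data[p], data[q]) for p, q in combinations(group, 2))

def min_between(data, group_a, group_b):
    """Smallest pairwise distance across two groups of languages."""
    return min(distance(data[p], data[q]) for p in group_a for q in group_b)

# Clusters reported in (26) for each scenario.
S1 = (["a1", "a2", "a3", "a4"], ["b5", "b6", "b7", "b8"])
S2 = (["a1", "a2", "b5", "b6"], ["a3", "a4", "b7", "b8"])

for data, (g1, g2) in [(FULL, S1), (FULL, S2), (SCENARIO_1, S1), (SCENARIO_2, S2)]:
    within = max(max_within(data, g1), max_within(data, g2))
    print(round(within, 3), round(min_between(data, g1, g2), 3))
```

With complete data, the largest within-group distance equals the smallest between-group distance (0.5) for either grouping, so neither split is favored. Once cells are deleted as in Scenario 1 or Scenario 2, the within-group distances fall below the between-group distances for the corresponding grouping, which is the kind of signal a distance-based method will then pick up.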

The large volume of missing data in B&al also creates questions about the significance of cluster
results. The figure in (27) plots mean values by feature for Creoles and Pidgins and for
languages other than Creoles, Pidgins, and Esperanto (respectively “Creoles” and “non-Creoles”
for the remainder of this paragraph), with accompanying error bars, based on the standard error
of estimate (see Appendix C for all details). Features with equivalent mean values for Creoles
and non-Creoles would be expected to fall on the diagonal line, and rectangles are drawn such
that a feature whose rectangle intersects the diagonal line is not statistically distinguishable from
a feature with equivalent mean values; this applies to only 8 features (F9, F10, F14, F15, F27,
F28, F29, and F39). As the figure in (27) also makes apparent, many features are identical for all
Creoles where they are completely populated (comprising 23 features in all: F06, F07, F08, F13,
F16, F17, F19, F22, F31, F35, F36, F38, F41, F42, F43, F44, F45, F46, F47, F48, F49, F50, and
F52), whereas there is not one single feature on which all non-Creoles agree. This suggests
additional bias in the selection of features in B&al. Of the remaining twelve features (43 total
features minus 8 statistically indistinguishable features minus the 23 features that may have been
selected on some a priori basis), several have large standard errors, particularly F24, F25, and
F34,18 indicating the small numbers of Creoles for which they are populated (they are populated
for 5, 4, and 8 Creoles, respectively). While a multi-factor analysis, as done by NeighborNet,
                                                                                                               
18
F24: Suppletive ordinals beyond ‘first’. F25: Obligatory numeral classifiers. F34: Morphological imperative.

may pick up patterns that only arise through feature combinations, this discussion of features
considered individually casts serious doubt on the quality of the data in Parkvall 2008, B&al and
D-M&B.

(27) Mean values per feature for Creoles/Pidgins vs. other languages in B&al
(cf. Appendix C):

Features from Parkvall 2008 plotted by identification code (with “F” prefix and leading zeros
removed). The x-coordinate of a feature identifier is its mean value for languages other than
Creoles, Pidgins, and Esperanto (where presence = 1 and absence = 0) and the y-coordinate is its
mean value for Pidgins and Creoles. The surrounding rectangle at which an identifier is centered
indicates standard error bars: the width is two times the standard error for languages other than
Creoles, Pidgins, and Esperanto and the height is two times the standard error for Creoles and
Pidgins. Standard error is defined as the sample standard deviation divided by the square root of
population size (in this case the number of languages for which a feature is attested).
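
The quantities plotted in (27) can be sketched as follows. The code is an illustration only: the function names and toy values are hypothetical, and the diagonal test encodes our reading of the figure, namely that a rectangle centered at (x, y) crosses the line y = x exactly when the intervals x ± SE and y ± SE overlap.

```python
from math import sqrt
from statistics import mean, stdev

def mean_and_se(values, missing="?"):
    """Mean and standard error of a binary feature over attested languages."""
    attested = [int(v) for v in values if v != missing]
    n = len(attested)
    # Sample standard deviation divided by the square root of the
    # number of languages for which the feature is attested.
    se = stdev(attested) / sqrt(n) if n > 1 else float("nan")
    return mean(attested), se

def touches_diagonal(x, se_x, y, se_y):
    """True if the rectangle of width 2*se_x and height 2*se_y meets y = x."""
    return max(x - se_x, y - se_y) <= min(x + se_x, y + se_y)

# Hypothetical toy column: four attested values and one missing value.
m, se = mean_and_se(["1", "1", "0", "0", "?"])
```

A feature attested for only a handful of languages yields a large standard error under this definition, which is why sparsely populated features have such wide rectangles.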

4.5. The latest version of the Creole Typology Hypothesis: A typology based on 9,
then 6, then 4 features (Daval-Markussen & Bakker 2012)

Much of our critique of B&al applies to D-M&B: both papers enlist the narrowly sampled inter-
dependent feature sets from Hancock 1987, Holm & Patrick 2007 and Parkvall 2008. Recall that
these features were not randomly sampled and so do not neutrally represent the space of cross-
linguistic variation. D-M&B (2012:93) recognize both the biased nature of the CCS features and
the inability of SplitsTree to reliably detect phylogenetic signals based on such feature data:
“[T]he software was able to detect a clear phylogenetic signal in only a few cases, which in itself
is not surprising, since the features were originally selected as representative of the Atlantic
creoles (Holm and Patrick 2007: vi).” Yet, D-M&B (2011:36) proceed to claim that clusters that
are inferred from such biased features support the hypothesis that Creoles emerged from

radically reduced Pidgins.

The problem posed by inter-dependent features that were explicitly selected because of their
likely “Creole” character can be illustrated most readily by D-M&B’s comparison of the 18
Creoles in CCS with 7 lexifiers, namely Arabic (aeg), Assamese (ass), Dutch (dut), English
(eng), French (fre), Portuguese (por) and Spanish (spa):

(28) D-M&B’s NeighborNet output network for 18 creoles and 7 lexifiers, with additional
information about “supporting characters”

Here the two features that can be selected to divide the overall graph into two disjoint subgraphs
are in positions 1 and 5, that is, “1.1 Statives with non-past reference” and “1.4 Non-statives with
past reference.” Both are CCS features, from Holm & Patrick 2007. However, these features’
definitions crucially depend on a specific morphosyntactic and interpretative pattern that is found
in many Atlantic Creoles and Niger-Congo substrates, namely the co-occurrence of “unmarked
verbs” with non-affixal markers of tense, mood and aspect (TMA): “Unmarked verbs: In the
Atlantic Creoles, verbs generally indicate tense and aspect not with inflections but rather with
preverbal (in some cases postverbal) markers.” This seems to be a reflex of analyticity of the
sort found in the Atlantic Creoles’ Niger-Congo substrates, especially Kwa. As for the
interpretive contrast between unmarked stative verbs as non-past and unmarked non-stative verbs
as past, it is related to the well-known “factative effect,” which is quite common in Niger-Congo,
especially Kwa (Welmers 1973:346–348, Déchaine 1991, Nurse et al 2010).

Recall, again, that the CCS features are far from a representative sample of cross-linguistic
diversity: of the 97 CCS features, there are some 40 features that are related to TMA markers and
to the verbal system, including the features in the first 27 positions in D-M&B’s input file.
These features are set up in such a way that they will tend to receive a “0” for any language that
does not fit the aforementioned Atlantic Creole morphosyntactic (analytic) and interpretive
patterns, many of which show correspondences with the European superstrates or Niger-Congo
substrates (see DeGraff 2005b for case studies). Take a language like (Standard) French for
example, which does not have a strictly analytic preverbal TMA system as is found in, say,
Haitian Creole. It is taken to score a “0” on the features in columns 1 to 27 of D-M&B’s CCS-
based input files. Ditto for Dutch, English, Portuguese and Spanish. This is ‘double-scoring’
to an extreme (e.g., double-scoring of analyticity in the TMA cum verbal system), and
an extreme case of sampling bias, which seems designed to create the observed split. Indeed the
split in (28) is, by and large, based on inter-dependent features like the ones related to the
interpretation of invariant verbal forms alongside analytic TMA markers. Yet, D-M&B conclude
from such biased data points that “the superstrates have had a rather limited influence on the
grammatical makeup of the incipient creoles at the time of restructuring” (2012:91). What D-M&B
do not mention is the fact that these very TMA markers in Atlantic Creoles are all derived
from cognates in Germanic or Romance periphrastic verbal constructions; nor do they
mention the Creole-substrate correspondences in this domain (more on this below).

We now proceed to test D-M&B’s claim that Creoles cluster away from their substrates. When
we ran NeighborNet on their input file with 18 Creoles and 19 substrates, the software did not
output any single “supporting character” for the Creole vs. Substrate clusters. So we used the
“Select Taxa” command in the SplitsTree program to trim the input file to focus exclusively on the Atlantic
Creoles and their Niger-Congo substrates:

(29) The command “Select Taxa” in the SplitsTree program menu allows the selection of
Atlantic Creoles and their source languages from D-M&B’s feature data:

Once we selected the Atlantic Creoles from D-M&B’s dataset, we obtained the following splits
with one single “supporting character” in column 17, a feature with value “0” in the Niger-
Congo substrates (except for a “?” for Fongbe) and value “1” in the Atlantic Creoles:

(30) NeighborNet output network for 12 Atlantic Creoles and their substrate languages,
based on D-M&B’s feature data:

The feature in column 17 is CCS’s “4.4 Anterior plus habitual”—that is, a habitual marker co-
occurs with a tense marker to indicate a past habit. D-M&B do not give their sources for the
assignment of feature values to the non-Creole languages, and WALS itself does not provide
direct information about the marking of habituality. But, contrary to D-M&B, it is not the case
that all the Niger-Congo languages in (30) lack the combination tense + habitual. Counter-
examples include Akan and Wolof (Dahl 1985:100). So here too we have a clustering that seems
a spurious artifact of unreliable data.

Such empirical problems seem to affect practically every clustering in D-M&B.

The punchline in D-M&B takes the logic in B&al, as summarized in (19), to an extreme
conclusion: they assert an “irrefutable” Creole Typology based on fewer and fewer features, all
the way down to 4 features. At this point, we have reached the most biased sample to date,
namely “the 4 features which were shared by at least 80% of the CCS creoles” (2012:95).19 Such
Creole typologies based on single-digit numbers of features flout some of the aforementioned
basic heuristics in computational phylogenetics. Recall various caveats to the effect that the
charge of ‘hand-picking’ of features can only be avoided by the inclusion of the widest possible
range of non-interdependent features (Wichmann & Saunders 2007, Nichols & Warnow 2008,
Dunn et al 2008). Furthermore, invoking sophisticated phylogenetic tools like NeighborNet
(alternatively, ordinary Neighbor Joining), which are designed to handle large language-feature
matrices, seems unwarranted for any demonstration that Creoles are exceptional on the basis of
such small sets of features.

Lastly, the bias problem is compounded by the empirical and conceptual failings in D-M&B,
which are on a par with those in B&al. Consider, say, the four WALS features of the Creole
typology in D-M&B 2012:95 as computed for the 18 Creoles in CCS:

(31) WALS features that are shared by at least 80% of the CCS Creoles, per D-M&B:

a) 38A with value 2: Indefinite article = one (18 out of 18)


b) 69A with value 5: No tense-aspect inflection (18 out of 18)
c) 112A with value 6 (double negation) for Angolar and Palenquero and value 2 (negative
particle) for the other 16 Creoles
                                                                                                               
19
And even down to 2 features in Daval-Markussen’s most recent, unpublished work (i.e., his recent presentations at
the LSA in January 2012 and Aarhus, Denmark, in April 2012).

d) 117A with value 1 (predicate possession: locational) for Berbice Dutch and Krio, and
value 5 (predicate possession = have) for the other 16 Creoles

D-M&B take these 4 features to reveal “creole similarities [...] to be explained by restructuring
universals” (2012:91). In B&al, these similarities are taken to emerge from reduced Pidgins qua
“simplified forms of inter-ethnic communication” (2011:36). But is this the case? We will now
go through each of the features in (31) and show that these features do not, and could not,
support D-M&B’s Creole-formation scenario whereby Creole features are, by and large,
independent of their superstrate and substrate languages.

Consider WALS 38A with value 2 (indefinite article = one) for 18 out of 18 Creoles: This is
perhaps the most surprising feature in the whole set because it contradicts one feature that was
taken in B&al to also characterize every Pidgin or Creole language in one of their samples,
namely the absence of indefinite articles. Any reader who has taken time and patience to look
under the hood of the complicated graphs and feature data in B&al will perhaps recall that F16
“Indefinite article” is marked “0” for every single Pidgin or Creole in B&al’s Parkvall-based
study, even though both Hancock 1987 and Holm & Patrick 2007 describe indefinite articles in
some Creoles. But in D-M&B (published one year after B&al) it is claimed that all Creoles have
an indefinite article homophonous with one. In other words, the value of F16 went from being
“1” for some Creoles to “0” for all Creoles in B&al’s 2011 study (which makes that study
internally inconsistent), and then F16 is “1” for all Creoles in D-M&B’s 2012 study.

What are the facts? According to WALS, the use of the numeral for one as indefinite article is
the most common pattern among languages that have indefinite articles, and it is found among
most of the Atlantic Creoles’ lexifier languages: French, Portuguese and Spanish. Furthermore,
the development of the cardinal one into an indefinite article is a well-known case of
grammaticalization, which does not require the existence of any prior pidgin, as in the history of
Germanic and Romance (Lehmann 2002:46).

Let us now turn to the claim that WALS’s feature 69A has value 5 (no tense-aspect inflection)
for 18 out of 18 Creoles: This claim is empirically contradicted by Portuguese-lexifier Creoles
such as Capeverdean Creole and Korlai (Holm & Patrick 2007, Holm 2008, Luís 2008).20 As

                                                                                                               
20
Here, D-M&B may counter-argue that Capeverdean Creole also shows preverbal TMA markers, and should still
be given value “5.” But consider Indonesian as described in WALS: it only has “one suffix that occurs on a number

explained by Holm (2008:319f), it is not surprising that most Atlantic Creoles would show few
tense-aspect affixes since their source languages are “partially inflected superstrates and largely
non-inflected, isolating substrate languages ... the complete loss of inflectional morphology is
not an inherent part of the process of creolization.”

How about WALS 112A with value 2 (negative particle) for 16 out of 18 Creoles? According to
WALS, this is, by far, the most frequent pattern for negation marking in their sample—502
languages out of a total of 1,159. And it is widespread throughout Atlantic Creoles’ source
languages: Germanic, Romance and Niger-Congo. Here too, there is no need to appeal to
restructuring universals.

WALS 117A with value 5 (predicate possession = have) for 16 out of 18 Creoles: According to
WALS, this too is the most frequent pattern in the sample: 63 languages out of a total of 240.
And it too is found across Germanic, Romance and Niger-Congo.

In short, D-M&B’s “Creole typology,” like B&al’s, seems to arise as an artifact of a variety of
empirical and methodological errors, in addition to the conceptual errors discussed throughout
this paper.21 Furthermore, Appendix C shows that there are some 1022 subsets of languages in
WALS with 4 feature values in common. The maximum size of such sets is 146, as compared to
the 18 Creoles in D-M&B. Applying D-M&B’s logic to these data, one could thus conclude that
WALS contains 1022 “exceptional” typologies based on 4 common feature values. This makes
such “exceptionality” a rather banal phenomenon. Perhaps it is ironic to point out that, given
our results so far, the state of affairs that would make the set of 18 Creoles in D-M&B truly

                                                                                                                                                                                                                                                                                                                                                                   
of transitive verbs to indicate repetitiveness or thoroughness” (Dryer 2001), yet because of that single suffix, it is
given the value 2 (“tense-aspect suffixes”), and so should Korlai and Capeverdean Creole. Another
counter-argument would be that tense/aspect suffixes would be post-creolization developments (i.e., not present at the birth
of these Creoles). Luís (2008:106f) argues explicitly against this hypothesis with evidence from the structure and
history of Indo-Portuguese Creoles such as Korlai (also see footnote 24 below). Luís shows how the formation of
Indo-Portuguese Creoles does enlist key features from the specific languages in contact, contrary to Bakker et al and
D-M&B. Luís’s arguments also apply against “Creole simplicity” claims, as in McWhorter 2011:20, to the effect
that Creoles, because of some hypothetical Pidgin ancestry, lack inflectional affixes.
21
B&al and D-M&B are riddled with empirical contradictions. Here are three more: According to B&al, 2 out of
the 34 Creoles in their Parkvall-based sample have definite articles. This contradicts the data from both Hancock
1987 and Holm & Patrick 2007 that B&al use in their own studies in the early part of the paper. We find similar
contradictions in the Creoles’ values for the feature passive (F41 = WALS 107): none of the Creoles has “1” for this
feature, but CCS documents such a construction in 8 Creoles: in Capeverdean Creole, Guinea-Bissau Creole, Haitian
Creole, Jamaican Patwa, Nubi Arabic, Nagamese, Papiamentu and Seychellois. As for reduplication (F10 = WALS
27), here too B&al contradict related findings both in WALS and in Creole studies, including their own (e.g.,
Bakker & Parkvall 2005; see Section 4.4.3.1 in main text for further discussion of this point).

“exceptional” is a hypothetical situation in which they, unlike the 1022 aforementioned subsets,
did not share any 4 features in common!

4.6. Against quantitative exceptionalism in Creole studies

More broadly, in our view it is troublesome that B&al’s and D-M&B’s argument that Creoles
belong to an exceptional class C is based merely on finding one or more examples of subsets C0
⊂ C and N0 ⊂ N(C), as in (19), such that the languages in C0 are closer to each other than to the
languages in N0 given some hand-picked features F0 of smaller and smaller sizes, down to 4
features in D-M&B. As we noted earlier, this may be sufficient to show that certain languages
are probabilistically distinguishable based on F0, in the same way that someone who is tall and
strong and athletic (i.e., distinguishable based on height and strength and fitness level) is more
likely to be a basketball player than a (uniformly) randomly selected member of the population.
But is this sufficient to show exceptionality? With enough languages and features from which to
draw, distinguishability can be demonstrated for an astoundingly large number of other sets of
languages as well, outside of well-established typological and phylogenetic groupings, especially
when the size of F0 goes down to single digits, as in D-M&B. Here the statistical flaw in D-
M&B’s argument is that a procedure that compares too few features across too many languages
inevitably creates spurious “typologies” (i.e., equivalence classes with little typological or
phylogenetic significance): for example, D-M&B’s 4 features put Albanian, Dutch, English,
Lakota and Yaqui in the same equivalence class. As mentioned above, there are some 1022 such
equivalence classes with 4 common feature values.

The algorithms for heuristic and probabilistic search alongside the data in Appendix B further
illustrate statistical flaws in B&al’s and D-M&B’s argument by mechanically identifying many
other sets of languages, other than Creoles, that share 9, 6 or 4 feature values in common—
features that are not usually taken to define any typological or phylogenetic groupings. The fact
that many other sets of languages, outside well-established phylogenetic or typological groupings,
have 9, 6 and 4 features in common, deprives D-M&B’s use of the term “exceptional” of its
empirical bite. It’s also striking that the sets of languages that share 9, 6, and 4 features in
common are much larger than the set of Creole languages in D-M&B.
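
The point can be illustrated with a minimal Python sketch, under assumptions that are not part of B&al’s or D-M&B’s data and that do not reproduce the search procedures of Appendix B. The six languages, their feature values and the function names `shared_value_classes` and `pigeonhole_lower_bound` are hypothetical. The sketch enumerates, by brute force, every subset of k binary features and reports each set of languages that shares the same values on that subset. It also computes a simple pigeonhole bound: with k fully specified binary features there are at most 2^k value combinations, so any sample of n languages must contain at least one class with ceil(n / 2^k) members, whatever the languages are.

```python
from collections import defaultdict
from itertools import combinations


def shared_value_classes(data, k, min_size):
    """Return {(feature_indices, shared_values): [languages]} for every
    k-feature subset on which at least `min_size` languages agree."""
    n_features = len(next(iter(data.values())))
    classes = {}
    for subset in combinations(range(n_features), k):
        groups = defaultdict(list)
        for language, values in data.items():
            groups[tuple(values[i] for i in subset)].append(language)
        for shared, languages in groups.items():
            if len(languages) >= min_size:
                classes[(subset, shared)] = sorted(languages)
    return classes


def pigeonhole_lower_bound(n_languages, k):
    """Smallest possible size of the largest class when n languages are
    sorted by k binary features with no missing values: ceil(n / 2**k)."""
    return -(-n_languages // 2 ** k)


# Hypothetical toy data: six made-up languages, four binary features.
toy = {
    "A": (1, 0, 1, 0),
    "B": (1, 0, 0, 1),
    "C": (1, 1, 1, 0),
    "D": (0, 0, 1, 1),
    "E": (1, 0, 1, 1),
    "F": (0, 1, 0, 0),
}

classes = shared_value_classes(toy, k=2, min_size=3)
for (subset, shared), languages in sorted(classes.items()):
    print(subset, shared, languages)

# With a hypothetical sample of 200 languages and 4 binary features,
# some class of at least this many languages is guaranteed to exist.
print(pigeonhole_lower_bound(200, 4))
```

Even in this tiny made-up sample, four different pairs of features each single out a three-language “class.” Real samples with more languages and more candidate features offer far more such subsets to choose from.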

We conclude that significant statistical care must be taken in the application of phylogenetic
tools to understanding the typological relationships among languages. Daval-Markussen and
Bakker (2012) claim to have “shown that the application of phylogenetic tools can help shed new
light on the typological relationships between languages.” However, the presence of significant
flaws in the paper suggests that extreme care must be taken in performing such analysis. The
data and methodology in B&al and Parkvall 2008 and the examples we present in the current
chapter are further evidence that an increased attention to critical statistical inquiry should be
more central to typological analyses in Creolistics.

This call to action is motivated by B&al’s and D-M&B’s “quantitative exceptionalism,” which
we can define as “the widespread and often harmful belief that insights reached via quantitative
means form an exceptional class” with respect to the standard of rigor to which they are held in
and beyond the scholarly literature (Bass 2012).

4.7. Other conceptual issues in determining typological vs. phylogenetic
relatedness: Structural vs. lexical features? Stable vs. unstable features?

One underlying assumption of the CTH, one that has to do with diachrony, is that Creoles do not
belong to the language families of any of their superstrate and substrate languages.22 This
assumption seems based on the belief that, given their structural features, Creoles typologically
cluster with one another, instead of their superstrates. Here it is important to note the jump from
typology to diachrony in B&al’s analysis, and recurrent slippages therein between the two
dimensions of analysis, even though B&al’s paper is cast, in the title, as an argument about
Creole typology.

B&al offer the following rationale for their exclusive preference of structural features over
lexical features in determining genealogical classification: “Creoles typically show lexical
continuity with their lexifiers, but only limited continuity in their structural make-up, making it
strictly seen impossible to consider a creole language a genetic descendant of its lexifier”
(2011:14). But one should ask: Is such limited continuity with a (potential) ancestor language
                                                                                                               
22
But this assumption seems somewhat ambivalent since B&al also write: “... any cross-linguistic study of language
similarities across language families needs to rely on structural rather than lexical data” (2011:15; emphasis added).
Here there is an implicit acknowledgment that the languages under study (i.e., Creoles) are spread “across [distinct]
language families,” thus the need for structural data for cross-Creole comparison.

also considered when determining whether French, say, is a genetic descendant of Latin?

As it turns out, it is on similar grounds of “limited continuity [with their lexifier] in their
structural make-up” that Creoles are excluded from the Comparative Method by Taylor 1956,
Thomason & Kaufman 1988, etc. (cf. note 5). But, unlike Taylor and Thomason & Kaufman,
B&al invoke “objective” algorithmic approaches to phylogeny (p. 12). Yet they do not offer any
“objective” operational measure to decide when “limited continuity” is limited enough to
exclude genealogical affiliation. Consider the fact that Romance languages as well show
“limited continuity” vis-à-vis their Latin ancestor, as already noted, long ago, by Meillet
(consider, e.g., word order in Latin vs. Romance, as mentioned in Dunn et al 2008: 716 with
reference to Harris & Campbell 1995: 230). Would such “limited continuity” from Latin to
Romance make it “impossible to consider a [Romance] language a genetic descendant of
[Latin]”?

The application of computational phylogenetic methods requires a prior step of sensitivity
analysis in order to ascertain the sources of uncertainty within the various inputs to the relevant
calculations, so that one can check the validity of one’s assumptions: in this case, the exclusive use
of structural-typological features. In B&al, this sensitivity-analysis step is carried out with
Hancock’s 1987 data samples from 33 English-based Creoles, all with Niger-Congo substrates
except for two (Hawaiian Creole and Norfolk Island Creole). The objective of this sensitivity
analysis was to check whether lexical data and structural-typological data would produce similar
results in NeighborNets (B&al:19), and it was concluded from that lone study that typological
data alone would be sufficient to establish reliable groupings among Creole languages: “We
conclude that structural features may be safely used for evolutionary studies (cf. Dunn et al
2005), even though often only lexical-formal data have been used in most classifications”
(B&al:19). But this is problematic in light of the fact that this sensitivity-analysis step was
applied to Creoles that all share the same lexical stock, namely English. Since the Creoles in
the Hancock sample all have lexica derived from English, the structural distances across these
Creoles are likely to outweigh the lexical distances. Therefore one would not expect the lexical
data alone to make any substantial difference to the classification results. Of course, a different
situation obtains among Creoles with distinct lexifiers, and the sensitivity-analysis carried out
with Creoles with the same lexifier cannot be taken as a gauge for Creoles with distinct lexifiers.
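
The logic of this objection can be sketched in Python under strongly simplifying assumptions. The sketch does not use NeighborNet or Hancock’s data. It uses unweighted Hamming distances and a nearest-neighbor lookup as a stand-in, and the languages `E1`, `E2`, `E3` (one shared lexifier) and `F1` (a different lexifier), their feature values and the function names are all hypothetical.

```python
def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def nearest(target, sample, *tables):
    """Language in `sample` closest to `target`, summing Hamming distances
    over the given feature tables (ties broken alphabetically)."""
    def dist(other):
        return sum(hamming(t[target], t[other]) for t in tables)
    return min((l for l in sample if l != target), key=lambda l: (dist(l), l))


# Hypothetical lexical data: each position is a cognate class label.
lexical = {
    "E1": ("e", "e", "e", "e"),
    "E2": ("e", "e", "e", "e"),
    "E3": ("e", "e", "e", "e"),
    "F1": ("f", "f", "f", "f"),
}
# Hypothetical binary structural features.
structural = {
    "E1": (1, 0, 1, 0),
    "E2": (1, 0, 0, 1),
    "E3": (0, 1, 0, 1),
    "F1": (1, 0, 1, 1),
}

same_lexifier = ["E1", "E2", "E3"]
mixed_lexifiers = ["E1", "E2", "E3", "F1"]

# Same lexifier: adding lexical data changes nothing.
print(nearest("E1", same_lexifier, structural))
print(nearest("E1", same_lexifier, structural, lexical))

# Distinct lexifiers: adding lexical data changes the grouping.
print(nearest("E1", mixed_lexifiers, structural))
print(nearest("E1", mixed_lexifiers, structural, lexical))
```

In the same-lexifier sample all lexical distances are zero, so the structural-only and combined results necessarily coincide. That agreement says nothing about what would happen once a language with a different lexifier is added, as the last two lines of the sketch show.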

There are other reasons to be suspicious of B&al’s exclusive use of structural-typological
features: As we already discussed, the structural-typological features that are enlisted in B&al
are themselves biased and cannot be taken to reliably establish genealogical classification. But
there is another issue—a possibly larger, methodological bias—worth pointing out, an issue that
is adumbrated in B&al: “Holman et al. (2008) show that in fact the most successful method of
language development and classification combines lexical with grammatical (typological) data”
(2011:18). In other words, one simply cannot ignore the basic criteria of the traditional
Comparative Method, namely correspondences across potential cognate lists.

As it turns out, the very examples of computational phylogeny methods cited in B&al (e.g., Gray
et al 2009, McMahon & McMahon 2006, McMahon et al 2007) use lexical, not structural, data
in their computation. Most genealogical classifications to date, especially the most robust
classifications, have crucially relied on lexical and phonological data (Paul 1891, Meillet 1958,
Hoenigswald 1960; see Harris & Campbell 1995 and Ringe & Eska 2013 for
overviews; cf. B&al:19). One advocate of structural-typological data in computational
phylogeny (Holman et al 2008) makes it clear that typological data are best when they
complement lexical data—this heuristic for genealogical classification seems relatively well
accepted in a field well known for its contentious nature (see, e.g., Nakhleh et al 2005a: 172).
The helpful complementarity of lexical and typological features in phylogenetic analysis is also
noted in B&al 2011:18: “Holman et al 2008 show that in fact the most successful method of
language development and classification combines lexical with grammatical (typological) data.”
But what is not noted there is Holman et al’s recommendation for a particular typological/lexical
ratio, with potential lexical correspondences taking precedence over typological ones when
determining genetic relatedness (Holman et al 2008:348).

In this vein, Wichmann & Saunders (2007:396) illustrate the limitations that are inherent in
using WALS features as the exclusive sources of feature data in computational phylogenetics
methods such as the one in B&al and D-M&B. And Wichmann & Saunders cautiously add:
“[F]or a conclusive demonstration we would like to be able to find cognate grammatical
elements and systematic sound correspondences” (p. 399).

The cases that make exclusive use of typological data in determining genetic relatedness (e.g.,
Dunn et al 2008, which B&al approvingly cite) are those where the lexical data are no longer
reliable because of the time depth during which the relevant languages (Austronesian and non-
Austronesian) have been in contact in Western Melanesia. This time-depth limitation (10,000
years according to Dunn et al 2008: 710–712; cf. Wichmann & Saunders 2007:378) certainly
does not apply to Creole languages, which are often dubbed the world’s “youngest” languages.
Dunn et al 2008 also stress the importance of balancing out features from diverse modules of
grammar (phonology, morphology, syntax and semantics) and of including the widest possible
range of features (p. 715f, 733) in one’s phylogenetic analyses. Again, this caveat is not taken
into account in B&al and D-M&B.

Interestingly, Dunn et al (2008:715) alert us to “special conditions” when typological features
are unreliable, namely when grammatical properties can be transferred without concomitant
transfer of lexical properties. One of these “special conditions” is: “where the donating
language is adopted wholesale by the speakers of another language, in the classic case in full
language shift. In this case, the tendency is for substrate influences to be more apparent in
structure (phonology, grammar) than in lexicon [...] due to imperfect learning or interference.”
These, we take it, are exactly the sort of conditions that applied to language shift in the colonial
Caribbean. To the reader who might argue that Creole formation fundamentally differs from
other cases of shift, our answer is that it doesn’t, a point that is argued at length in DeGraff
(2009). Some readers will claim, as in Lefebvre 1998, that Creole languages fall outside the
Comparative Method, due to their belief that Creoles take their lexicon from one language (e.g.,
French for HC) and their syntax from other unrelated languages (e.g., from Gbe for HC). This
“Relexification Hypothesis” has been shown to be empirically and theoretically untenable
(DeGraff 2002).

Also of importance is the time-stability of typological features (Dunn et al 2008: 716, 730; cf.
Nichols 2003 and Longobardi & Guardiano 2009). Various historical linguists have pointed out
the importance of choosing features that are relatively immune to borrowing and homoplasy, the
latter having to do with the possibility that two languages share features, not due to shared
inheritance, but rather ‘accidental’ convergence through back mutation or parallel development
(see, e.g., Barbançon et al 2013 and Nichols & Warnow 2008). (In biology, common examples
of homoplasy include the independent evolution of eyes and wings across species.) This issue of
time-stability is amplified by Wichmann & Saunders 2007:379–383, with reference to areal
factors as well. Wichmann & Holman 2012 (in their Appendix 1) offer a table of relative
temporal stabilities for WALS features. The least stable features are those that are most prone to
change, and thus the least reliable for computing genealogical classification. Among the 40
WALS features used by B&al for determining their Creole typology, almost half are unstable (i.e.,
7 are very unstable, 12 are unstable) while 21 are stable (8 stable, 13 very stable).

One related caveat has to do with the choice of binary vs. multi-valued features. Recall that
B&al, unlike some of the studies in D-M&B, converted WALS multi-valued features into binary
features. Yet the latter are much less reliable than the former:

“[A] character is likely to be more informative the more states it has. It may not be simple
to find characters with several states that can be attested for any language, so it might be
tempting to simply use binary ones. Nevertheless, we would recommend avoiding binary
characters since their states are expected to be more prone to chance fluctuation than
characters with several states.” (Wichmann & Saunders 2007: 401)
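
One way to see why binary characters invite chance resemblance is the following Python sketch. It rests on a deliberately crude assumption that is ours and not Wichmann & Saunders’s: every character has equiprobable states and characters are independent of one another. Under that assumption two unrelated languages agree on a character with s states with probability 1/s, and on all of k such characters with probability (1/s)^k. The function names and the sample size of 100 languages are hypothetical.

```python
from fractions import Fraction
from math import comb


def chance_agreement(states, k=1):
    """Probability that two unrelated languages agree on all k characters,
    assuming each character has `states` equiprobable, independent states."""
    return Fraction(1, states) ** k


def expected_matching_pairs(n_languages, states, k):
    """Expected number of language pairs, out of all pairs in a sample of
    n languages, that agree on all k characters purely by chance."""
    return comb(n_languages, 2) * chance_agreement(states, k)


# Four binary characters versus four 4-state characters,
# in a hypothetical sample of 100 languages.
print(chance_agreement(2, 4), expected_matching_pairs(100, 2, 4))
print(chance_agreement(4, 4), expected_matching_pairs(100, 4, 4))
```

Under these assumptions, four binary characters are expected to match by accident in 1 of every 16 language pairs, against 1 of every 256 pairs for four 4-state characters.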

Taken altogether, the above considerations render the CTH’s genealogical claims about Creole
languages quite fragile. B&al’s and D-M&B’s results are based on structural-typological
features alone, features from very restricted areas of grammar. These results are far from
conclusive, and they become even less conclusive in light of the statistical and empirical flaws
highlighted in this paper.

5. Back to family values: Creole languages in the Comparative Method23

The Comparative Method is often criticized on the grounds that it cannot handle language-
contact phenomena of the sort that seems most central to Creole formation. Here we should
recall some relevant facts in biological phylogenetics that we hope will bring much needed
clarity to this critique. In biology, there is a growing consensus that, indeed, phylogenetic or
cladistics-based trees alone cannot account for all the facts about biological evolution, especially
when we start looking at reticulate events such as hybridization and horizontal gene transfer
(Huson & Bryant 2006:254). And this is exactly one reason for the rise of phylogenetic
networks in biology—namely, to account for such “reticulate events.” But such events simply
do not do away with the fact that mutation and speciation are among the most important
mechanisms for biological evolution, and these processes are best captured by phylogenetic trees
                                                                                                               
23
This section is more amply elaborated in DeGraff (2009) and Aboh & DeGraff (2014, to appear).

of the traditional sort—similar to the Stammbaumtheorie tree most familiar to historical linguists.

In fact, reticulate networks (i.e., phylogenetic networks that explicitly represent reticulate events)
“provide an ‘explicit’ representation of evolutionary history, generally depicted as a phylogenetic
tree with additional edges” (Huson & Bryant 2006:255; recall the figure in (14) above for one
simplistic example; Nakhleh et al 2005b provide more realistic and complex examples whereby
reticulate networks can be viewed as containing sets of phylogenetic trees, as in the history of
Indo-European).

Similar considerations apply to the Comparative Method: though there are contact (e.g.,
Sprachbund-like) phenomena to account for (see note 1), these contact phenomena do not do
away with the usefulness of the Stammbaumtheorie model in accounting for the diversification
of language through history, as in the Indo-European family. Ringe et al 2002 and Nakhleh et al
2005b are two recent efforts to accommodate contact phenomena in the history of Indo-European
while keeping the basic insights of the Stammbaumtheorie model, alongside the advantages of
contact edges in reticulate networks.

To further illustrate these points about the usefulness of both phylogenetic trees and reticulate
networks, we take as our benchmark HC, a “radical Creole” in Bickerton’s (1984) terms or a
“most Creole of Creoles” in McWhorter’s (1998: 809) terms—that is, HC is generally assumed
to have emerged from a most radically reduced pre-Creole Pidgin. The core properties we
discuss below are already found in the earliest (proto-)HC varieties and, thus, cannot be
dismissed as post-creolization features that would have entered the language via relatively recent
contact with French, long after the Creole-formation period (DeGraff 2001b:291–294,
2009:940f).24

Here we appeal, again, to what Dunn et al (2008: 710–712) describe as the “gold standard for
historical linguistics,” to be applied whenever cognate sets can be reasonably established within
a time depth of some 10,000 years (cf. Wichmann & Saunders 2007:378). This “gold standard”
is the classic Comparative Method, and here is its basic heuristic: If a system of lexical and
                                                                                                               
24
As discussed in the main text (Sections 3.2 and 5), the very concept of “decreolization” in the history of
Caribbean Creoles is problematic to the extent that the “acrolectal” varieties that are often taken to be the results of
decreolization would have been among the first to emerge, then, to subsequently co-exist with the more “basilectal”
varieties. In the case of Haiti, the contact with French was reduced to a minimum after the independence in 1804,
with most Creole speakers being monolingual and with little, if any, contact with French speakers (DeGraff 2001b:
229–232).

morphophonological correspondences can be identified between a set of languages, then these
correspondences can be taken as evidence of genealogical relatedness, with ancestry in a
common protolanguage. What counts as a robust system of evidence in the Comparative
Method? Let’s assume that such a system must meet some “individual-identifying” threshold
(Nichols 2006). In other words, these correspondences must contain a fair amount of “faits
particuliers” (or “language-particular idiosyncratic properties” in Meillet’s terminology) in order
to reliably rule out chance correspondences.

Any good set of dictionaries and descriptive grammars for HC and other French-based Creoles
such as those spoken in Guadeloupe, Martinique, Guyane, Seychelles, Mauritius, etc., will offer
robust evidence of systematic correspondences between these languages, and similar evidence is
straightforwardly available from dictionaries and descriptive grammars for English-based
Creoles such as those in the Caribbean. In many instances, these correspondences amount to
identity, as I’ve personally attested when listening to or speaking with Creole speakers from
Martinique, Mauritius, Seychelles. These correspondences do meet Nichols’s “individual-
identifying” threshold (Fattier 1998, 2003, DeGraff 2009). Listed below is a sample of “faits
particuliers” in HC with systematic correspondences with respect to French. Similar
correspondences can be established, straightforwardly, between other French-based Creoles and
French. Such language-particular idiosyncratic properties in the case of HC include the majority
of HC affixes, the majority of HC paradigmatic lexical sets (including items from Swadesh lists),
all grammatical morphemes, pronouns, deictic elements, etc. The sample below is taken from
Aboh & DeGraff, to appear:

• All HC cardinal numbers are derived from French: en ‘1’, de ‘2’, twa ‘3’, kat ‘4’ ... san
‘100’ ... mil ‘1,000’ ... from French un, deux, trois, quatre ... cent ... mille ...
• All HC ordinal numbers, including the suffix /-jɛm/ and its morphophonology (sandhi,
suppletion, etc.), are derived from French: premye ‘1st’, dezyèm ‘2nd’, twazyèm ‘3rd’,
katryèm ‘4th’, ... santyèm ‘100th’ ... milyèm ‘1,000th’ ... from French premier, deuxième,
troisième, quatrième, ... centième, ... millième ...
• All HC kinship terms are derived from French: for example, frè ‘brother’, sè ‘sister’,
kouzen ‘cousin’, kouzin ‘cousin (feminine)’ ... from French frère, soeur, cousin, cousine ...
• All color terms are derived from French: blan ‘white’, nwa ‘black’, rouj ‘red’ ... from
French blanc, noir, rouge ...
• All body-part terms are derived from French: cheve ‘hair’, zòrèy ‘ear’, je ‘eye’, nen
‘nose’, bouch ‘mouth’, dan ‘tooth’, lang ‘tongue’ ... from French cheveux, oreille, yeux,
nez, bouche, dent, langue ...
• All TMA markers are derived from French: te ‘ANT’, ap ‘PROG, FUT’, ava
‘IRREALIS’, fini ‘COMPLETIVE’ ... from French étais/était/été (imperfect and
participle of ‘to be’), après ‘after’, va(s) ‘go+3sg/2sg+PRES’, finir/fini(s) ‘to finish’ and
its various participial and finite forms...
• All prepositions are derived from French: nan ‘in’, pou ‘for’, apre ‘after’, anvan ‘before’,
devan ‘in front of’, dèyè ‘behind’ ... from French dans, pour, après, avant, devant,
derrière...
• All determiners, demonstratives, etc., are derived from French: yon ‘a’, la ‘the’, sa
‘this/that’ from French un, la/là, ça.
• All pronouns are derived from French: m(wen) ‘1sg’, ou ‘2sg’, li ‘3sg’, nou ‘1pl, 2pl’, yo
‘3pl’ ... from French moi, vous, lui, nous, eux ...
• All complementizers are derived from French: ke ‘that’, si ‘if’, pou ‘for’ ... from French
que, si, pour ...
• Almost all HC derivational morphemes have French-derived distribution and semantics:
for example, HC de- as in deboutonnen ‘to unbutton’ and dezose ‘to debone’ from French
de- which, like HC de-, has inversive and privative uses.
• HC morphophonological phenomena with French roots include liaison phenomena as in
an Bèljik ‘in Belgium’ vs. ann Ayiti ‘in Haiti’; de zan /de zã/ ‘two years’, twa zan /twa
zã/ ‘three years’, san tan /sã tã/ ‘one hundred years’ ... (cf. the pronunciation of the HC
and French ordinal and cardinal numbers above; Cadely (2002, 2003) provides further
examples of HC-French correspondences in phonology)

In Nichols’ terminology, these sets would count as individual-identifying “lexical categories with
some of their (phonologically specific) member lexemes.” Even more striking is the fact that HC
even instantiates Nichols’s example of “the miniparadigm of good and better ... as diagnostic of
relatedness.” To wit, HC bon and miyò straightforwardly derive from French bon and meilleur.
This HC example is all the more telling in that the last vowel in miyò (written millor in
Ducœurjoly’s 1802 Creole language manual, p. 330) reflects an Old and Middle French
pronunciation of the word meilleur as meillor (Nyrop 1903v2: 312), a pronunciation that is partly
retained as mèlyor in Franco-Provençal dialects (Stich 2001), thus indicating that this paradigm
was inherited from French during the Creole-formation period and is not a late
(“decreolization”/post-creolization) feature of HC (see Aboh & DeGraff, to appear, for a more
detailed discussion, with additional related examples).

There is a great diversity of HC morpho-phonological patterns that resemble the bon/miyò
example, with clear cognates in 17th-century French, as documented in Fattier 1998ff and
Cadely 2002, 2003. Fattier’s 6-volume dialect atlas for HC dialects documents a great variety of
morphophonological and morphosyntactic phenomena and lexical patterns that were inherited
from colonial varieties of French during Creole formation—and not borrowed late due to post-
/de-creolization contact with French (cf. notes 20 and 24). In other words, from Creole
formation onward, HC has manifested a robust set of “faits particuliers” that can be used to
straightforwardly establish its genetic affiliation with French.

Cadely’s work documents phonological correspondences between HC and French, including
arguments as to why certain correspondences prevailed from the onset of Creole formation (see,
e.g., Cadely 2002:449 on the status of HC’s oral vs. nasal vowels), thus excluding their source in
post-creolization contact with French. Such a system of correspondences is incompatible with
the postulation of an extraordinarily reduced Pidgin as the immediate ancestor of HC.

Sylvain (1936), Fattier (1998ff), DeGraff (2001a,b, 2005b) and Aboh & DeGraff (2014, to
appear) provide further details on the French-based origins of HC morphophonology and syntax,
alongside Niger-Congo substrate influence. The latter creates Sprachbund-like effects across
Caribbean Creoles (in, e.g., TMA distribution and interpretation, serial verb constructions,
predicate clefts; see note 1). These effects may seem even more striking in cases where the
superstrate and substrate structures are congruent, as in certain areas of TMA marking (DeGraff
2005b).

It thus seems most reasonable to accept, without further hesitation, HC’s genealogical
relatedness with French, “relatedness by descent” as in the better-studied Stammbaumtheorie
branches of Indo-European. If so, then the still popular “break in transmission” scenarios à la
Taylor, Bickerton, Thomason & Kaufman, McWhorter, B&al, etc. cannot hold. Then again, one
may argue that Caribbean Creoles such as HC show more “significant discrepancy” between
their lexical- vs. grammatical-correspondences vis-à-vis their respective lexifiers than French
does vis-à-vis Latin (this argument goes back to Taylor 1956 and Thomason & Kaufman 1988,
and is echoed throughout B&al). But such arguments are challenged by Meillet’s observation
long ago that French is of a grammatical type distinct from Latin even though French can be duly
considered a descendant of Latin according to the Comparative Method. DeGraff (2001b,
2009:919–922) compares HC and French using the structural parameters that, according to
Meillet, show that French “fall[s] into a typological class that is quite remote from the structural
type represented by Latin” (Meillet 1958: 148).25 Such comparison suggests that HC and French
may actually be typologically closer to each other than French and Latin are—with respect to
word order, Case morphology, definite articles, etc.

By the same token, the notion of global Creole simplicity falls apart: though French may look
“simpler” than Latin on the surface (e.g., absence of nominal declensions) it developed structural
devices (e.g., articles) that are absent in Latin. Similarly HC developed structural devices (e.g., a
complex TMA system, focus-marking strategies including predicate copying) that are absent in
French (for case studies, see Aboh & DeGraff 2014, to appear).

6. Envoi

Daval-Markussen & Bakker claim that their analyses “illustrate various ways in which
phylogenetic tools can advantageously be put to use in investigating and visualizing the
relationships of Creole languages to other languages” (p. 89). In light of the evidence in our own
paper, what D-M&B’s analyses illustrate is that great care must be taken in the use of
phylogenetic tools in Creole studies and, more generally, in historical linguistics.

Similar caveats apply to the use of language-acquisition results in studies of language creation
and language change. Plag’s (2008a,b) claims that Creole structures originate in adult learners’
early interlanguages are historically and empirically unfounded.

What we are left with is the relatively banal conclusion that, indeed, Creole languages are, both
                                                                                                               
25
The (apparent?) typological distance between Latin and the Romance languages, alongside the lexical continuity
from Latin to Romance, may also present a challenge to approaches such as Longobardi & Guardiano 2009 where
syntactic parameters alone are used as markers for genealogical classification. But as Longobardi (p.c., 5/24/2016)
explains, the apparent Latin-to-Romance typological gaps may be an artifact of the choice and abstractness of the
parameters being compared. This, again, points to the importance of theory in our choices of features to be used in
phylogenetic analyses.

in their history and their structures, pretty much like any other language. As such the classic
Comparative Method of historical linguists can apply straightforwardly: given the cognate sets
that can be established between Caribbean Creoles and their European source languages, the
former can be taken to descend, with modification, from the latter. This goes counter to one of
the long-standing orthodoxies of our field (Taylor 1956, Thomason & Kaufman 1988, Ringe et
al 2002, Nakhleh et al 2005b, Labov 2007, etc.).

As for substratum effects, they are of the same sort as can be noticed in the history of, say,
European languages: in the Caribbean, as in Europe, we find Sprachbund-like effects that create
structural similarities across language families. In the Caribbean case, these quasi-Sprachbund
effects originate from the Niger-Congo languages spoken by the African adults as they were
learning, and transforming, European varieties in the Caribbean.

Sign Languages too were once thought to be ‘exceptional’ in virtue of their unique modality.
Linguists now recognize them for what they are: signed languages are ‘just like’ any other
human language. In our view, the same holds for Creole languages: they are no more and no less
‘exceptional’ than any other possible grouping of full-fledged human languages. Each grouping
is just as ‘exceptional’ as any other, based on the features that we use to compare them.

Appendix A: Incompatible feature values in Bakker et al 2011

The table below lists the inconsistent pairs in the Parkvall-based dataset used in B&al.
Languages are listed in the leftmost column and inconsistent pairs are listed in the topmost row.
Languages are abbreviated as in B&al, and inconsistent pairs are written in the form F1|F2=v1v2,
where features F1 and F2 take respective values v1 and v2. For example, there are 7 languages for
which F6|F7=01, and Esperanto (EESP) has one inconsistent pair (F23|F24=01).

Recall from (22) in main text: Logically inter-dependent binary features in B&al’s WALS-based
study (cf. Kouwenberg 2010:371f)

a) F06/F07 (“overt marking of direct object”/“double marking of direct object”).
F07 = “1” entails F06 = “1.”
b) F08/F09 (“Possession by double marking”/“Overt possession marking”).
F08 = “1” entails F09 = “1.”
c) F11/F13 (“Gender”/“Non-semantic gender assignment”).
F13 = “1” entails F11 = “1.”
d) F23/F24 (“Ordinals exist as a separate class beyond ‘first’”/“Suppletive ordinals beyond
‘first’”).
F24 = “1” entails F23 = “1.”
e) F30/F31 (“Grammaticalized past/non-past” /“Remoteness distinctions of past”).
F31 = “1” entails F30 = “1.”
f) F36/F37 (“Evidentiality (grammatical)”/“Both indirect and direct evidentials”).
F37 = “1” entails F36 = “1.”
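The entailments in (a) through (f) can be checked mechanically. The following is a minimal sketch, not the procedure actually used to build the table: it assumes a hypothetical encoding in which each language maps feature names to "1", "0", or "?" (missing), and it flags a pair whenever the entailing feature is "1" while the entailed feature is "0" or "?". The language names and values are invented for illustration.

```python
# Each pair (a, c) reads "a = 1 entails c = 1", following items (a)-(f) above.
IMPLICATIONS = [
    ("F07", "F06"),
    ("F08", "F09"),
    ("F13", "F11"),
    ("F24", "F23"),
    ("F31", "F30"),
    ("F37", "F36"),
]


def inconsistent_pairs(values):
    """Return the implications violated by one language's feature values.

    A pair is flagged when the entailing feature is "1" while the entailed
    feature is "0" or "?" (missing), mirroring labels such as F23|F24=01
    and F23|F24=?1 in this appendix.
    """
    flagged = []
    for antecedent, consequent in IMPLICATIONS:
        entailed = values.get(consequent, "?")
        if values.get(antecedent, "?") == "1" and entailed != "1":
            flagged.append((antecedent, consequent, entailed))
    return flagged


# Toy rows, invented for illustration (not taken from the B&al dataset).
toy = {
    "LangA": {"F23": "0", "F24": "1"},  # F23|F24=01
    "LangB": {"F06": "1", "F07": "1"},  # consistent
    "LangC": {"F08": "1", "F09": "?", "F36": "0", "F37": "1"},
}
report = {name: inconsistent_pairs(vals) for name, vals in toy.items()}
```

Treating a missing entailed value as an inconsistency is a choice made here to match the “?1” and “1?” labels; a more lenient check would skip those cases.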

  83  
F11|F13=01

F23|F24=01

F30|F31=01

F36|F37=01
F11|F13=?1

F23|F24=?1

F30|F31=?1

F36|F37=?1
F6|F7=01

F8|F9=10

F8|F9=1?
F6|F7=?1
Language Total errors 7 5 4 82 3 15 1 32 66
EESP 1 1
PFRT
CBSM 1 1
CGBC
CHCR 1 1
CHWC
CMCR 1 1
CMLC 1 1
CMQC
CNDY 1 1
CNPI 1 1
CPAP 1 1
CSAN 2 1 1
CSEY
CSRA 1 1
CSRM 1 1
CTAY
CPIJ 1 1
CAUS 1 1
CJAM 1 1
CKRI 1 1
CDOM
CGUA
CLUC 1 1
CNEH 1 1
PFAN 1 1
PLIF
PCHJ
CKIN 1 1
PMOB
CPRI 1 1
CANN 1 1
CPAL
CSAT 1 1
PRUN
Naco 2 1 1
Nain 1 1
Name 3 1 1 1
Nasm 2 1 1
Nbag 2 1 1
Nbrh 1 1
Nbsq 1 1
Ncah 3 1 1 1
Ncha 1 1
Nchk 4 1 1 1 1
Nckr 2 1 1

Ncmn 1 1
Ncoo 1 1
Ncyv 1 1
Ndag 1 1
Ndni
Nepe 1 1
Neve
Newe 3 1 1 1
Nfin 1 1
Ngeo 2 1 1
Ngoo 3 1 1 1
Ngrw 1 1
Ngua 1 1
Nhai 1 1
Nhmo 1 1
Nhun 2 1 1
Nigb 1 1
Nika 1 1
Nimo
Nind 1 1
Nkay 1 1
Nkew 1 1
Nkha 3 1 1 1
Nklv 1 1
Nknr 2 1 1
Nkoa 2 1 1
Nkob 1 1
Nkse 1 1
Nkut 1 1
Nlad 1 1
Nlan 2 1 1
Nlez 1 1
Nmal 1 1
Nmao 1 1
Nmap
Nmei 2 1 1
Nmnd 1 1
Nmrt 1 1
Nmss
Nnen 1 1
Nnez 1 1
Nngi
Nnht 1 1
Nniv 1 1
Notm
Nprs 2 1 1
Nqim 2 1 1

Nram 1 1
Nrap 1 1
Nshk 1 1
Nsml 1 1
Ntab
Ntha 1 1
Ntur 1 1
Nurk 1 1
Nusa
Nwic
Nvie 2 1 1
Nwra 1 1
Nyaq 2 1 1
Nyid 2 1 1
Nyko 1 1
Nyur 2 1 1
Nzqc 1 1
Nawp 1 1
Nbrm 2 1 1
Nkrk 2 1 1
Nmar 3 1 1 1
Nzun 2 1 1
Narm 2 1 1
Nfij 1 1
Nkhm 2 1 1
Nlah 1 1
Npai 2 1 1
Nsue
Ntuk 1 1
Nyor 2 1 1
Nala 2 1 1
Napu 1 1
Nbej 2 1 1
Nbma 2 1 1
Ncre 2 1 1
Nfre 1 1
Nhau 2 1 1
Nheb 2 1 1
Nhin 2 1 1
Nhix
Nirq 1 1
Nkhs 2 1 1
Nlat 1 1
Nmay 1 1
Npsm
Nspa 1 1
Nswe 1 1

Ntag 1 1
Ntiw 1 1
Naeg 2 1 1
Norh 1 1
Nmun 1 1
Nabk 1 1
Nbrs
Neng 1 1
Nger 1 1
Ngrb 1 1
Ngrk 1 1
Nkho 1 1
NKnd 1 1
Nlav 2 1 1
Nmyi 2 1 1
Nond
Nwar 1 1
Nket 1 1
Nrus 1 1
Nwrd 1 1
Nbur 1 1
Ndyi 2 1 1
Npau 1 1
Nprh
Narp 2 1 1
Ndio 1 1
Nhzb 4 1 1 1 1
Ning 1 1
Njuh
Nluv 2 1 1
Nmau 1 1
Nmxc
Nnug
Nsup 1 1
Nswa 2 1 1
Nzul 2 1 1
Nyim 2 1 1
Njak 2 1 1
Njpn 1 1
Nkio 1 1
Nknm
Nkor 2 1 1
Nkro
Nlkt 2 1 1
Nsla 2 1 1
Nsnm
Nwch 1 1
Nyag

Out of the 188 languages in the Parkvall-based dataset used in B&al, 152 have at least one
inconsistent pair, for a total of 215 inconsistent pairs. Here are examples of languages with
inconsistent feature-value pairs:

• F23/F24: Esperanto, Sango, Fanakalo, Basque, Chukchi, Ewe, Finnish, Georgian,
Persian, Quechua, Turkish, Vietnamese, Burmese, Khmer, Yoruba, Hausa, Hindi,
Spanish, Swedish, Arabic, English, German, Greek, Russian, Korean, and many other
(well described) languages are described as having “suppletive ordinals beyond ‘first’”
yet lacking “ordinals [...] as a separate class beyond ‘first’.”
• F30/F31: Acoma, Cahuilla, Chukchi, Canela-Krahô, Gooniyandi, Lango, Yidiny, Zoque,
Paiwan, Cree, Lavukaleve, Mangarrayi, Wardaman, Hunzib and Kiowa are described as
having “remoteness distinctions of past” yet lacking “past-nonpast distinctions.”
• F36/F37: Bislama, Mauritian Creole, Papia Kristang, Ndyuka, Nigerian Pidgin,
Papiamentu, Sango, Sranan, Saramaccan, Solomon Islands Pidgin, Australian Creole
English, Jamaican Creole, Krio, Saint Lucian, Negerhollands, Kinubi, Principense,
Annobon Creole, Sao Tomense, Ainu, Canela-Krahô, Mandarin, Vietnamese, Alamblak,
Hebrew, Khasi, Khoekhoe, Wari’, Yimas, Japanese, Korean are described as having
“both indirect and direct evidentials” yet lacking grammatical “evidentiality” markers.

Combining features WALS 77 with WALS 78 in Dryer & Haspelmath’s (2011) database shows,
unsurprisingly, that there is no WALS language with such logically impossible combinations of
features: no language could show both “no grammatical evidentials” and “direct and indirect
evidentials” (see http://wals.info/feature/combined/78A/77A ). Another striking fact is that the
majority of Pidgins and Creoles in Bakker et al’s sample are described as having this
incompatible combination of features, thus making Creole languages truly “exceptional”—so
exceptional that they contradict basic axioms of logic.

Appendix B: Exceptionality or distinguishability?

We consider the extreme case of identifying subsets of languages in the WALS database that
share exactly the same values in several features. Note that this is a more onerous challenge than
identifying languages that will cluster tightly according to the type of method employed in B&al
and D-M&B, for two principal reasons: (1) WALS features can take more than two possible
values compared to those in B&al and D-M&B; and (2) where the Creoles in B&al and D-M&B
have high similarity, we require language sets to share exactly the same values in all chosen
features. Therefore, if we are successful even while bearing these greater burdens, our findings
will cast doubt on the methods used in those papers. We will approach this task
using two different methods: heuristic search and probabilistic search.

B.1. Finding examples of languages in WALS that share many common features

To help us with heuristic search, we implemented two algorithms.

Algorithm 1. This algorithm takes as input a minimum number of shared features N and outputs
examples of language sets that are equal in at least N features. In particular, for each language L,
let M(L) be the maximum number of languages in a set containing that language that are equal in
at least N features. The algorithm returns, for each seed language L, one set of languages of size
M(L) that is equal in at least N features (if one exists). It can be easily modified to return a list of
all such “maximal” language sets for each seed language L, though the running time will increase.
This algorithm calls the following recursive function, find_maximal_language_set, for each
language in WALS (the function takes seed_language_set, remaining_language_set, and
shared_feature_count as input and produces maximal_language_set as output):

• For each language A in remaining_language_set:
  • If this language A is equal in at least shared_feature_count features to the
    seed_language_set:
    • Set candidate_maximal_language_set(A) = find_maximal_language_set (input:
      seed_language_set = seed_language_set plus A, remaining_language_set =
      remaining_language_set minus A).
• Set max_size = maximum size of any candidate_maximal_language_set.
• Choose a candidate_maximal_language_set of max_size.

Note that this function should never be called when the languages in seed_language_set are equal
in fewer than shared_feature_count features. The algorithm is optimized for finding small sets of
languages that share large sets of features; in other words, it runs faster for higher N. As N gets
smaller the running time increases very quickly. Therefore we also defined an alternative
algorithm below that is optimized differently. Without too much trouble, though with increased
running time, Algorithm 1 can be adapted into an exhaustive search algorithm that returns all sets
of languages that share more than a given number of features.
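The recursion behind Algorithm 1 can be sketched as follows. This is an illustrative reading under stated assumptions, not the implementation used for the results reported here: the data layout (a dictionary from language name to a dictionary of attested feature values) is hypothetical, the seed set is returned unchanged when no language can be added, and only languages that come later in sorted order are passed down the recursion, a pruning introduced here so that each candidate set is explored once.

```python
def common_features(languages, data):
    """Features attested in every language of the set with one shared value."""
    langs = list(languages)
    first = data[langs[0]]
    return {
        feature for feature, value in first.items()
        if all(data[other].get(feature) == value for other in langs[1:])
    }


def find_maximal_language_set(seed, remaining, n, data):
    """Largest superset of `seed`, drawn from `remaining`, equal in >= n features.

    Assumes `seed` is itself equal in at least n features, as the text requires.
    """
    best = set(seed)
    ordered = sorted(remaining)
    for i, lang in enumerate(ordered):
        candidate_seed = seed | {lang}
        if len(common_features(candidate_seed, data)) >= n:
            # Recurse on the languages after `lang` only (assumed pruning).
            candidate = find_maximal_language_set(
                candidate_seed, set(ordered[i + 1:]), n, data
            )
            if len(candidate) > len(best):
                best = candidate
    return best


# Invented toy data: four languages, up to three features each.
toy = {
    "A": {"f1": "x", "f2": "y", "f3": "z"},
    "B": {"f1": "x", "f2": "y", "f3": "w"},
    "C": {"f1": "x", "f2": "y"},
    "D": {"f1": "q", "f2": "y", "f3": "z"},
}
result = find_maximal_language_set({"A"}, set(toy) - {"A"}, 2, toy)
```

In the toy data, A, B and C agree on f1 and f2, so seeding with A and requiring two shared features yields that three-language set; requiring three shared features leaves only the seed.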

We present two examples of the output of Algorithm 1. Example 1 has 10 languages that are
equal in 30 common features, and Example 2 has 66 languages that are equal in 9 common
features.

Example 1: The following 10 languages:
1. English
2. Finnish
3. French
4. German
5. Greek (Modern)
6. Hindi
7. Hungarian
8. Persian
9. Russian
10. Spanish
have common values in these 30 features (common values in parentheses):
1. Glottalized Consonants (No glottalized consonants)
2. Lateral Consonants (/l/, no obstruent laterals)
3. Tone (No tones)
4. Absence of Common Consonants (All present)
5. Fusion of Selected Inflectional Formatives (Exclusively concatenative)
6. Coding of Nominal Plurality (Plural suffix)
7. Inclusive/Exclusive Distinction in Independent Pronouns (No inclusive/exclusive)
8. Obligatory Possessive Inflection (Absent)
9. Possessive Classification (No possessive classification)
10. Nominal and Verbal Conjunction (Identity)
11. The Past Tense (Present, no remoteness distinctions)
12. The Optative (Inflectional optative absent)
13. Verbal Number and Suppletion (None)
14. Order of Demonstrative and Noun (Demonstrative-Noun)
15. Order of Numeral and Noun (Numeral-Noun)
16. Order of Degree Word and Adjective (Degree word-Adjective)
17. Order of Adverbial Subordinator and Clause (Initial subordinator word)
18. Alignment of Verbal Person Marking (Accusative)
19. Passive Constructions (Present)
20. Antipassive Constructions (No antipassive)
21. Applicative Constructions (No applicative construction)
22. Numeral Bases (Decimal)
23. N-M Pronouns (No N-M pronouns)
24. Minor morphological means of signaling negation (None)
25. M in Second Person Singular (No m in second person singular)
26. M in First Person Singular (m in first person singular)
27. Other Roles of Applied Objects (No applicative construction)
28. Zero Marking of A and P Arguments (Non-zero marking)
29. Productivity of the Antipassive Construction (no antipassive)
30. Number of Possessive Nouns (None reported)
Example 2: The following 66 languages
1. Arapesh (Abu)
2. Abun
3. Acholi
4. Adzera
5. Arop-Lokep
6. Ambae (Lolovoli Northeast)
7. Ambai
8. Angas
9. Anufo
10. Mufian
11. Au
12. Bagirmi
13. Birom
14. Buduma
15. Buma
16. Cham (Western)
17. Efate (South)
18. Ewe
19. Fyem
20. Goemai
21. Hatam
22. Mina
23. Ifumu
24. Igede
25. Irarutu
26. Jabêm
27. Jukun
28. Kara (in Central African Republic)
29. Kaulong
30. Karen (Bwe)
31. Kele
32. Kera
33. Kaliai-Kove
34. Karen (Sgaw)
35. Kayah Li (Eastern)
36. Labu
37. Lagwan
38. Lunda
39. Lele
40. Lenakel
41. Lewo
42. Lamaholot
43. Loniu
44. Luvale
45. Masa
46. Maybrat
47. Mba
48. Mbum
49. Mbay
50. Meyah
51. Musgu
52. Mooré
53. Mor
54. Mpur
55. Margi
56. Mumuye
57. Mupun
58. Ngambay
59. Ngoni
60. Ngizim
61. Olo
62. Pero
63. Rotuman
64. Sahu
65. Tidore
66. Ura

have common values in these 9 features (common values in parentheses):


1. Order of Subject, Object and Verb (SVO)
2. Order of Subject and Verb (SV)
3. Order of Object and Verb (VO)
4. Order of Adjective and Noun (Noun-Adjective)
5. Order of Numeral and Noun (Noun-Numeral)
6. Position of Interrogative Phrases in Content Questions (Not initial interrogative
phrase)
7. Relationship between the Order of Object and Verb and the Order of Adjective
and Noun (VO and NAdj)
8. Postverbal Negative Morphemes (VNeg)
9. Minor morphological means of signaling negation (None)


Algorithm 2. Let m(F) be the maximum number of languages that are attested and have
equal values in all features in feature set F, and let $M(N) = \max_{|F|=N} m(F)$ be the
maximum number of languages that are attested and have equal values in any feature set
F of size N. Algorithm 2 is designed to calculate M(N). M(N) can be straightforwardly
calculated by brute force, computing m(F) for all combinations F of N distinct features,
but this calculation can be quite computationally intensive. If we know that M(N) is at
least a certain threshold value $M_T$, then we can speed up this algorithm:

• Iterate from shared feature count s = 1 to N:
  • If s = 1:
    • Then: initialize I to be the set of all feature sets of size one:
      $\{f_1\}, \{f_2\}, \dots, \{f_n\}$.
    • Else: set $I_0 = I$ and initialize I to the empty set. Eliminate feature sets F in
      $I_0$ for which m(F) is less than $M_T$. For each remaining feature set
      $F = \{f_{k(1)}, f_{k(2)}, \dots, f_{k(s-1)}\}$ in $I_0$, add to I the feature sets
      $F \cup \{f_k\}$ for all k greater than each of k(1), k(2), …, k(s-1) such that
      $F_k = \{f_{k(1)}, f_{k(2)}, \dots, f_{k(s-2)}\} \cup \{f_k\}$ is already a member of
      $I_0$ (if $F_k$ is not a member of $I_0$, then we can be sure that
      $m(F \cup \{f_k\}) < M_T$, so that set is not worth considering).
  • Calculate m(F) for each feature set F in I (each of which now contains s
    features).
• Calculate $M(N) = \max_{F \in I} m(F)$.
This algorithm is optimized for small numbers of shared features and high values of
shared languages; in particular, it runs faster for higher values of $M_T$.
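The level-by-level search of Algorithm 2 can be sketched as below. This is a hedged illustration rather than the code behind the reported figures. It relies on the observation that adding a feature can never enlarge a cluster, so a feature set whose m(F) falls below the threshold cannot be extended into one at or above it. The data layout and function names are hypothetical, and the sketch requires every smaller subset of a candidate to have survived, which is somewhat stricter than the single-subset check described in the text.

```python
from collections import Counter
from itertools import combinations


def m(features, data):
    """Size of the largest group of languages attested and equal on `features`."""
    groups = Counter()
    for values in data.values():
        if all(f in values for f in features):
            groups[tuple(values[f] for f in features)] += 1
    return max(groups.values(), default=0)


def max_languages_sharing(n, threshold, data):
    """Return M(n), assuming M(n) >= threshold; 0 if no feature set survives."""
    all_features = sorted({f for values in data.values() for f in values})
    # Level 1: every single-feature set, stored as a sorted tuple.
    level = {(f,): m((f,), data) for f in all_features}
    for size in range(2, n + 1):
        # Prune sets whose largest cluster is already below the threshold.
        survivors = {fs for fs, count in level.items() if count >= threshold}
        level = {}
        for fs in survivors:
            for f in all_features:
                if f <= fs[-1]:
                    continue  # extend with later features only, avoiding duplicates
                candidate = fs + (f,)
                # Keep the candidate only if all its smaller subsets survived.
                if all(sub in survivors
                       for sub in combinations(candidate, size - 1)):
                    level[candidate] = m(candidate, data)
    return max(level.values(), default=0)


# Invented toy data: four languages, up to three features each.
toy = {
    "A": {"f1": "x", "f2": "y", "f3": "z"},
    "B": {"f1": "x", "f2": "y", "f3": "w"},
    "C": {"f1": "x", "f2": "y"},
    "D": {"f1": "q", "f2": "y", "f3": "z"},
}
pruned = max_languages_sharing(2, 3, toy)
brute = max(m(c, toy) for c in combinations(["f1", "f2", "f3"], 2))
```

On the toy data the pruned search and the brute-force maximum agree, since the true value M(2) meets the chosen threshold of 3.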

Algorithm 2a. This algorithm takes as input a minimum number N of shared features and
outputs sets of languages that are equal in at least N features. However, it searches the
language and feature space differently by first identifying features that are likely to result
in clusters. In particular, it starts with a seed feature and iteratively adds features that
produce large clusters of languages. The algorithm calls the following function for each
seed feature in WALS:


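• Initialize F to contain only the seed feature.
• For n = 2 to N: add a feature A to F that maximizes M(F + A), where M(F) here denotes
  the size of the largest set of languages with equal values in all features in F. If
  more than one feature attains the maximum, choose one; if there are none, the search
  has failed.
• Choose a language set L of size M(F) whose members have common values in all
  features in F.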
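The greedy steps above can be sketched as follows. This is a minimal illustration under assumptions, not the implementation used for Examples 3 and 4: the data layout is hypothetical, ties are broken by feature name order, and failure is reported as None when no feature remains to be added.

```python
def largest_cluster(features, data):
    """Largest group of languages with attested, equal values on `features`."""
    groups = {}
    for lang, values in data.items():
        if all(f in values for f in features):
            key = tuple(values[f] for f in features)
            groups.setdefault(key, []).append(lang)
    return max(groups.values(), key=len, default=[])


def greedy_feature_set(seed_feature, n, data):
    """Grow a feature list from `seed_feature` to n features, greedily."""
    all_features = sorted({f for values in data.values() for f in values})
    chosen = [seed_feature]
    while len(chosen) < n:
        options = [f for f in all_features if f not in chosen]
        if not options:
            return None  # the search has failed: no feature left to add
        # Ties go to the first feature in sorted order (an arbitrary choice).
        best = max(options,
                   key=lambda f: len(largest_cluster(chosen + [f], data)))
        chosen.append(best)
    return chosen, largest_cluster(chosen, data)


# Invented toy data: four languages, up to three features each.
toy = {
    "A": {"f1": "x", "f2": "y", "f3": "z"},
    "B": {"f1": "x", "f2": "y", "f3": "w"},
    "C": {"f1": "x", "f2": "y"},
    "D": {"f1": "q", "f2": "y", "f3": "z"},
}
features, cluster = greedy_feature_set("f3", 2, toy)
```

Because each step keeps only one locally best feature, the result need not be a maximal language set, which is the trade-off against Algorithms 1 and 2.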

Algorithms 1 and 2 identify maximal sets of languages that are equal in N features. While
Algorithm 2a does not possess this property, it does have its own advantages. With a few
minor optimizations, this algorithm works quickly for small numbers of features. It can
also be adapted without too much trouble to return additional language sets that share N
given features. We can easily control through parameters how many languages it returns,
though the algorithm will become cost prohibitive quickly as it is configured to return
more and more language sets. Another advantage of this algorithm is that its output
clearly indicates what features are related so the user can quickly identify when it is
picking up languages with logical dependencies. For example, priming this algorithm at
N = 9 with two different features yields (the features in the sets are listed in order of
being added):

Example 3. Seed feature = ‘Order of Subject and Verb’. F = {‘Order of Subject and
Verb’, ‘Minor morphological means of signaling negation’, ‘Order of Adjective and
Noun’, ‘Order of Numeral and Noun’, ‘Order of Demonstrative and Noun’, ‘Order of
Object and Verb’, ‘Relationship between the Order of Object and Verb and the Order of
Adjective and Noun’, ‘NegSVO Order’, ‘SVNegO Order’}. The largest cluster of
languages that have the same values in these features in WALS is of size M(F) = 122.

Example 4. Seed feature = ‘Consonant Inventories’. F = {‘Consonant Inventories’,
‘Absence of Common Consonants’, ‘Front Rounded Vowels’, ‘Uvular Consonants’,
‘Lateral Consonants’, ‘Presence of Uncommon Consonants’, ‘Glottalized Consonants’,
‘Minor morphological means of signaling negation’, ‘Order of Subject and Verb’}. The
largest cluster of languages that have the same values in these features in WALS is of
size M(F) = 31.


It is clear that these heuristic search algorithms suffer from the problem of feature
dependency. This is especially clear when we consider features such as SVO, SV and
VO in Example 3. Accordingly, if we restrict the algorithms to look at less dependent
sets of features in WALS (as in Holman 2008), they will find many fewer sets of
languages. However, despite their supposed “reasonableness,” we have found, in Section
4 above in the main text, that the feature sets in B&al and D-M&B also suffer from high
dependency and are subject to similar flaws.

To further illustrate the problem posed by inter-dependent features, consider the
following plot for the maximum number of WALS languages that share the number of
feature values indicated on the x-axis (the discontinuity indicates data that have not been
computed due to computational limitations; see table below). The number of languages
that share large numbers of features drops dramatically when the features are restricted to
the “approximately independent” features in Holman 2008. These values were calculated
using slightly modified versions of Algorithms 1 and 2.


The table below indicates the maximum number of WALS languages that share the
indicated number of common feature values, displayed separately for all 192 WALS
features and for the 47 approximately independent features in Holman 2008. For
example, the maximum number of languages in WALS that share 27 feature values in
common is 18, and the maximum number of languages sharing 27 features goes down to
3 when only Holman’s features are considered.

Calculations were completed using Algorithms 1 and 2 (Algorithm 1 for high numbers of
shared features, and Algorithm 2 for low numbers of shared features) from Appendix B.
Values marked “-nc-” were not calculated (due to computational limitations where the
number of features was too low for Algorithm 1 and too high for Algorithm 2 to handle
comfortably), and blank values indicate zero (i.e., that there are no languages with the
corresponding number of attested features).


Number of shared features   Maximum number of languages
                            All WALS features   Holman 2008 features
1                           1,316               525
2                           988                 415
3                           575                 272
4                           524                 146
5                           382                 69
6                           332                 37
7                           302                 27
8                           248                 19
9                           231                 14
10                          178                 12
11                          155                 10
12                          140                 9
13                          128                 8
14                          128                 7
15                          107                 7
16                          86                  6
17                          74                  6
18                          61                  5
19                          53                  5
20                          -nc-                5
21                          -nc-                4
22                          -nc-                4
23                          -nc-                4
24                          -nc-                4
25                          27                  3
26                          21                  3
27                          18                  3
28                          13                  2
29                          11                  2
30 to 31                    10                  2
32 to 34                    9                   2
35                          8                   2
36 to 38                    8                   1
39 to 44                    7                   1
45 to 47                    6                   1
48 to 52                    6
53 to 62                    5
63 to 75                    4
76 to 90                    3
91 to 112                   2
113 to 192                  1


B.2. Estimating the number of sets of languages in WALS that share many
common features

While the algorithms above can easily find many examples of sets of languages that share
many common features, they are computationally intractable for exhaustive searches due
to the large size of the search space. In other words, they are not appropriate for
precisely calculating, for any $N_L$ and $N_F$, the number $M(N_L, N_F)$ of sets of $N_L$
languages that are equal in $N_F$ features in WALS. One alternative approach is to estimate
$M(N_L, N_F)$ using a Monte Carlo simulation, which relies on a large number of random searches.

We conducted a 100 million iteration randomized simulation to estimate $M(20, N_F)$, the
number of sets of $N_L = 20$ languages in WALS that share $N_F$ common features. A single
iteration of the simulation consists of picking 20 languages uniformly at random and
checking how many features they share in common. To reduce the search space, we
restricted our attention to the 158 languages in WALS that each have at least 100
nonblank features, rendering our estimates conservative. The search space contains
158 choose 20, or around $10^{25}$, sets of 20 languages. Based on our simulation we
estimate that there are approximately $10^{22}$ sets of languages that share at least 4 features
in common and $10^{19}$ sets that share at least 6 features in common. Note that our sample
size of 11 for the 7-feature case is small, so the error may be significant, though it is
exceedingly unlikely that our calculations are off by even one order of magnitude. That
this simulation did not find any language sets with more than 7 features in common
does not mean that there are not a huge number of them, only that the number of such
language sets is likely an order of magnitude smaller than $10^{18}$ (so it could still be a huge
number).
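A single run of such a simulation can be sketched as follows. This is an illustrative toy under stated assumptions, not the simulation reported here: the data are randomly generated binary features without missing values, the set size and iteration count are far smaller than in the actual study, and the function names are hypothetical.

```python
import random


def shared_feature_count(languages, data):
    """Number of features attested in every listed language with one common value."""
    first, *rest = (data[lang] for lang in languages)
    return sum(
        1 for feature, value in first.items()
        if all(other.get(feature) == value for other in rest)
    )


def simulate(data, set_size, iterations, seed=0):
    """Tally how many random language sets share exactly k feature values."""
    rng = random.Random(seed)
    names = sorted(data)
    tally = {}
    for _ in range(iterations):
        sample = rng.sample(names, set_size)  # uniform, without replacement
        shared = shared_feature_count(sample, data)
        tally[shared] = tally.get(shared, 0) + 1
    return tally


# Invented toy data: 30 languages, 12 binary features, no missing values.
toy_rng = random.Random(1)
toy = {
    f"lang{i}": {f"f{j}": toy_rng.choice("01") for j in range(12)}
    for i in range(30)
}
tally = simulate(toy, set_size=5, iterations=2000)
```

The tally plays the role of the “Count” column: dividing cumulative counts by the number of iterations gives the estimated likelihoods, and multiplying by the size of the search space gives the expected number of sets.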


Common     Count        Likelihood of language sets   Expected number of language sets
features                sharing at least this many    sharing at least this many
                        features                      features
0          17,958,358   1.0                           $1.1 \times 10^{25}$
1          64,680,158   0.82                          $0.9 \times 10^{25}$
2          15,375,145   0.17                          $1.9 \times 10^{24}$
3          1,895,367    0.020                         $2.2 \times 10^{23}$
4          86,784       0.00091                       $1.0 \times 10^{22}$
5          3,986        0.000042                      $4.6 \times 10^{20}$
6          191          0.0000020                     $2.2 \times 10^{19}$
7          11           0.00000011                    $1.2 \times 10^{18}$
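The last two columns of this table follow arithmetically from the “Count” column. As a worked check, the sketch below sums the counts at or above each number of common features, divides by the number of iterations to get the likelihood, and scales by the size of the search space (158 choose 20). The variable names are introduced here for illustration only.

```python
import math

# Counts of sampled 20-language sets sharing exactly k features (from the table).
counts = {0: 17_958_358, 1: 64_680_158, 2: 15_375_145, 3: 1_895_367,
          4: 86_784, 5: 3_986, 6: 191, 7: 11}

iterations = sum(counts.values())   # total number of sampled sets
total_sets = math.comb(158, 20)     # size of the search space

estimates = {}
for k in sorted(counts):
    # Sets sharing at least k features: this row plus all rows below it.
    at_least_k = sum(c for j, c in counts.items() if j >= k)
    likelihood = at_least_k / iterations
    estimates[k] = (likelihood, likelihood * total_sets)
```

For example, the rows for 4 or more common features sum to 90,972 of the 100 million samples, which gives the likelihood of about 0.00091 shown above.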

While this method is only tractable for language sets of small sizes, it is enough to cast
serious doubt on the methods employed in D-M&B. Two examples that are meant to
demonstrate a “Creole Typology” in that paper use feature sets of sizes 4 and 6. The
current results show that there are tremendous numbers of language sets that cluster even
more tightly with respect to feature sets of these sizes than do the Creole sets used in that
paper. Arguments similar to those used in that paper could be used to show that any of
those sets of languages is also exceptional.

Of course, as expected, when features are restricted to the approximately independent
features in Holman 2008, the situation changes substantially. We repeated the simulation
above (100 million sets of 20 languages chosen from the same 158 languages) but
restricted to Holman’s 47 features. But even in this case, it is striking that there are
$7.7 \times 10^{17}$ sets of languages that are expected to have 4 features with common values.
Also recall that the maximum number of such languages (with 4 feature values in common) is
146 (as compared to D-M&B’s 18 Creoles).

Common     Count        Likelihood of language sets   Expected number of language sets
features                sharing at least this many    sharing at least this many
                        features                      features
0          82,995,585   1.0                           $1.1 \times 10^{25}$
1          16,534,833   0.17                          $1.9 \times 10^{24}$
2          466,220      0.0047                        $5.2 \times 10^{22}$
3          3,355        0.000034                      $3.7 \times 10^{20}$
4          7            0.000000070                   $7.7 \times 10^{17}$

Creoles may be distinguishable from other languages based on certain choices of features.
However, based on evidence we present here, there are huge numbers of language sets,
outside well-established phylogenetic groupings, that are also so distinguishable.
Therefore Creoles are not unique in their exceptionality, a contradiction in terms.


Appendix C: Statistics for Bakker et al’s feature values from Parkvall 2008

This table shows the following statistics for each of the Parkvall 2008 features as used in
B&al 2011, decomposed by language type: the number of attested values (“Attested”),
the percentage of missing values (“Missing”), the mean value (“Mean”), the standard
deviation of the values (“Std”), and the standard error of the values (“SE”).
         Creoles and Pidgins                   Languages other than Creoles, Pidgins, Esperanto
Feature  Attested  Missing  Mean  Std   SE     Attested  Missing  Mean  Std   SE
F03      21        38%      0.71  0.46  0.10   130       15%      0.20  0.40  0.04
F06      34        0%       0.00  0.00  0.00   149       3%       0.77  0.43  0.03
F07      32        6%       0.00  0.00  0.00   120       22%      0.51  0.50  0.05
F08      6         82%      0.00  0.00  0.00   120       22%      0.16  0.37  0.03
F09      21        38%      0.48  0.51  0.11   120       22%      0.58  0.50  0.05
F10      8         76%      0.25  0.46  0.16   120       22%      0.41  0.49  0.05
F11      32        6%       0.78  0.42  0.07   120       22%      0.43  0.50  0.05
F13      34        0%       0.00  0.00  0.00   143       7%       0.10  0.31  0.03
F14      14        59%      0.71  0.47  0.13   111       27%      0.75  0.44  0.04
F15      13        62%      0.15  0.38  0.10   111       27%      0.13  0.33  0.03
F16      34        0%       0.00  0.00  0.00   136       11%      0.46  0.50  0.04
F17      34        0%       0.00  0.00  0.00   132       14%      0.83  0.37  0.03
F19      33        3%       0.00  0.00  0.00   132       14%      0.10  0.30  0.03
F21      23        32%      0.30  0.47  0.10   132       14%      0.89  0.32  0.03
F22      25        26%      0.00  0.00  0.00   151       1%       0.33  0.47  0.04
F23      31        9%       0.10  0.30  0.05   149       3%       0.36  0.48  0.04
F24      5         85%      0.40  0.55  0.24   141       8%       0.91  0.29  0.02
F25      4         88%      0.25  0.50  0.25   150       2%       0.60  0.49  0.04
F27      33        3%       0.03  0.17  0.03   140       8%       0.05  0.22  0.02
F28      26        24%      0.69  0.47  0.09   118       23%      0.63  0.49  0.04
F29      29        15%      0.55  0.51  0.09   118       23%      0.58  0.50  0.05
F30      32        6%       0.84  0.37  0.07   152       1%       0.53  0.50  0.04
F31      34        0%       0.00  0.00  0.00   137       10%      0.18  0.39  0.03
F32      28        18%      0.71  0.46  0.09   124       19%      0.52  0.50  0.05
F33      28        18%      0.86  0.36  0.07   124       19%      0.27  0.45  0.04
F34      8         76%      0.50  0.53  0.19   40        74%      0.80  0.41  0.06
F35      34        0%       0.00  0.00  0.00   40        74%      0.53  0.51  0.08
F36      34        0%       0.00  0.00  0.00   40        74%      0.35  0.48  0.08
F37      21        38%      0.95  0.22  0.05   114       25%      0.75  0.43  0.04
F38      34        0%       0.00  0.00  0.00   129       16%      0.09  0.28  0.02
F39      9         74%      0.44  0.53  0.18   107       30%      0.40  0.49  0.05
F40      19        44%      0.21  0.42  0.10   85        44%      0.73  0.45  0.05
F41      3         91%      0.00  0.00  0.00   84        45%      0.81  0.40  0.04
F42      3         91%      0.00  0.00  0.00   132       14%      0.55  0.50  0.04
F43      34        0%       0.00  0.00  0.00   145       5%       0.50  0.50  0.04
F44      34        0%       0.00  0.00  0.00   107       30%      0.12  0.33  0.03
F45      34        0%       0.00  0.00  0.00   148       3%       0.49  0.50  0.04
F46      34        0%       0.00  0.00  0.00   148       3%       0.16  0.36  0.03
F47      34        0%       0.00  0.00  0.00   47        69%      0.49  0.51  0.07
F48      34        0%       0.00  0.00  0.00   152       1%       0.26  0.44  0.04
F49      34        0%       0.00  0.00  0.00   132       14%      0.27  0.44  0.04
F50      34        0%       0.00  0.00  0.00   142       7%       0.38  0.49  0.04
F52      34        0%       0.00  0.00  0.00   142       7%       0.20  0.40  0.03
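The five statistics reported per feature can be computed as in the sketch below. It is a hedged illustration, not the script used to produce the table: the input column is hypothetical (15 ones, 6 zeros and 13 missing values among 34 languages), chosen because it is consistent with the rounded F03 row for Creoles and Pidgins, and the use of the sample standard deviation (n - 1 in the denominator) is an assumption.

```python
import math


def feature_stats(values):
    """Summaries for one binary feature; None marks a missing value."""
    attested = [v for v in values if v is not None]
    n = len(attested)
    missing = 1 - n / len(values)          # share of languages without a value
    mean = sum(attested) / n
    # Sample standard deviation (n - 1 denominator), assumed here.
    std = math.sqrt(sum((v - mean) ** 2 for v in attested) / (n - 1))
    se = std / math.sqrt(n)                # standard error of the mean
    return n, missing, mean, std, se


# Hypothetical column: 15 ones, 6 zeros, 13 missing values out of 34 languages.
column = [1] * 15 + [0] * 6 + [None] * 13
n, missing, mean, std, se = feature_stats(column)
```

Rounded to two decimals, this column gives 21 attested values, 38% missing, a mean of 0.71, a standard deviation of 0.46 and a standard error of 0.10.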


References:

Aboh, Enoch O. 2006. Complementation in Saramaccan and Gungbe: The case of C-type
modal particles. Natural Language & Linguistic Theory 24.1, 1–55.
Aboh, Enoch O. 2015. The Emergence of Hybrid Grammars: Contact, Language
Change, and Creation. Cambridge: Cambridge University Press.
Aboh, Enoch O. and Michel DeGraff. 2014. Some notes on bare noun phrases in Haitian
Creole and Gbe: A transatlantic Sprachbund perspective. In Åfarli, Tor A., and
Brit Mæhlum, eds. The sociolinguistics of grammar, 203–236. Amsterdam: John
Benjamins.
Aboh, Enoch O. and Michel DeGraff. To appear. A null theory of Creole formation based
on Universal Grammar. In Ian Roberts, ed. The Oxford Handbook on Universal
Grammar, Oxford University Press.
Adam, Lucien. 1883. Les idiomes négro-aryen et maléo-aryen. Essai d’hybridologie
linguistique. Paris: Maisonneuve et cie.
Alexandre, Nélia. 2012. The Defective Copy Theory of Movement: Evidence from wh-
Constructions in Cape Verdean Creole. Amsterdam: John Benjamins.
Alleyne, Mervyn C. 1971. Acculturation and the cultural matrix of creolization. In
Hymes, 169–186.
Alleyne, Mervyn C. 1980. Comparative Afro-American: A Historical-Comparative
Study of English-Based Afro-American Dialects of the New World. Ann Arbor:
Karoma Publishers.
Allison, Paul. 2002. Missing Data. Thousand Oaks CA: Sage Publications.
Andersen, Roger W. 1983. Pidginization and Creolization as Language Acquisition.
Rowley, Mass: Newbury House.
Anonymous. 1811. Idylles et Chansons, ou Essais de Poësie Créole par un Habitant
d'Hayti. Philadelphie: Imprimerie de J. Edwards.
Baissac, Charles. 1880. Étude sur le patois créole mauricien. Nancy: Imprimerie
Berger-Levrault et cie.
Bakker, Peter. 2003. Pidgin inflectional morphology and its implications for creole
morphology. In Geert Booij & Jaap van Marle (eds), Yearbook of Morphology
2002, 3–33. New York: Kluwer Academic Publishers.
Bakker, Peter, Aymeric Daval-Markussen, Mikael Parkvall & Ingo Plag. 2011. Creoles
are typologically distinct from non-creoles. Journal of Pidgin and Creole
Languages 26.1, 5–42.
Bakker, Peter & Mikael Parkvall. 2005. Reduplication in pidgins and creoles. In Hurch,
Bernhard (ed.,) Studies on reduplication, 511–531. Berlin: Mouton de Gruyter.
Barbançon, François; Tandy Warnow; Steven N. Evans, Donald A. Ringe, Jr & Luay
Nakhleh. 2013. An experimental study comparing linguistic phylogenetic
reconstruction methods. Diachronica 30:2, 143–170.

Bass, Trevor. 2012 (September 24). Quantitative Exceptionalism [Web log post].
Retrieved from http://databitten.com/quantitative-exceptionalism
Best, Joel. 2001. Damned Lies and Statistics: Untangling Numbers from the Media,
Politicians, and Activists. Berkeley: University of California Press.
Bickerton, Derek. 1981. Roots of Language. Ann Arbor MI: Karoma.
Bickerton, Derek. 1984. The language bioprogram hypothesis. Behavioral and Brain
Sciences 7.2, 173–203.
Bickerton, Derek. 1990. Language and Species. Chicago: University of Chicago Press.
Bickerton, Derek. 1996. The origins of variations in Guyanese. In Gregory Guy,
Crawford Feagin, Deborah Schiffrin and John Baugh (eds.), Towards a Social
Science of Language: Papers in Honor of William Labov, 311–327. Amsterdam:
John Benjamins.
Bickerton, Derek. 1999. How to acquire language without positive evidence: What
acquisitionists can learn from Creoles. In Michel DeGraff, ed., Language
Creation and Language Change. Creolization, Diachrony and Development,
1999c, 49-74. Cambridge MA: MIT Press.
Bloomfield, Leonard. 1935. Language. London: George Allen & Unwin.
Booij, Geert. 2007. The Grammar of Words: An Introduction to Linguistic Morphology. Oxford:
Oxford University Press.
Bunsen, Christian Karl Josias. 1854. Outlines of the Philosophy of Universal History
Applied to Language and Religion. London: Longman, Brown, Green, and
Longmans.
Byrne, Francis, and Donald Winford. 1993. Focus and grammatical relations in creole
languages. Amsterdam: John Benjamins.
Cadely, Jean-Robert. 2002. Le statut des voyelles nasales en Créole haïtien. Lingua.
112 (6). 435–464.
Cadely, Jean-Robert. 2003. Les sons du créole haïtien. Journal of Haitian Studies 9(2).
4–41.
Chaudenson, Robert & Salikoko Mufwene. 2001. Creolization of Language and Culture.
London: Routledge.
Chomsky, Noam. 2004. Beyond explanatory adequacy. In Belletti, Adriana, ed.
Structures and beyond, 104–131. Oxford: Oxford University Press.
Daval-Markussen, Aymeric & Peter Bakker. 2012. Explorations in creole research with
phylogenetic tools. In Miriam Butt, Sheelagh Carpendale & Gerald Penn (eds.),
Visualization of Linguistic Patterns and Uncovering Language History from
Multilingual Resources, Proceedings of the European Association of
Computational Linguistics 2012 Joint Workshop, 89-97. Stroudsburg PA:
Association for Computational Linguistics.


Déchaine, Rose-Marie (1991). Bare Sentences. Proceedings of the First Semantics and
Linguistic Theory Conference. 31--50. Cornell University.
DeGraff, Michel. 2002. Relexification: A Reevaluation. Anthropological
Linguistics. 44 (4): 321–414.
DeGraff, Michel. 2005a. Linguists’ most dangerous myth. The fallacy of Creole
Exceptionalism. Language in Society 34.4. 533–591.
DeGraff, Michel. 2005b. Word order and morphology in ‘creolization’ and beyond.
In Guglielmo Cinque & Richard Kayne (eds.), The Oxford Handbook of
Comparative Syntax, 293–372. New York: Oxford University Press.
DeGraff, Michel. 2009. Language acquisition in creolization and, thus, language
change: Some Cartesian-Uniformitarian boundary conditions. Language and
Linguistics Compass 3/4. 888–971.
DeGraff, Michel. 2014. The Ecology of Language Evolution in Latin America: A
Haitian Postscript Toward a Postcolonial Sequel. In Salikoko Mufwene, ed.,
Iberian Imperialism and Language Evolution in Latin America, 274–327,
Chicago : The University of Chicago Press.
Dixon, Robert M. W. 2010. Basic linguistic theory. Volume 1, Methodology. Oxford:
Oxford University Press.
Dobrushina, Nina; Johan van der Auwera & Valentin Goussev. 2011. The Optative. In:
Matthew Dryer & Martin Haspelmath (eds.) The World Atlas of Language
Structures Online. Munich: Max Planck Digital Library, chapter 73. Available
online at http://wals.info/chapter/73. Accessed on 2012-08-22.
Donohue, Mark; Simon Musgrave; Bronwen Whiting & Søren Wichmann. 2011.
Typological feature analysis models linguistic geography. Language. 87 (2).
369–383.
Dryer, Matthew. 2011. Position of Tense-Aspect Affixes. In Matthew Dryer & Martin
Haspelmath (eds.) The World Atlas of Language Structures Online. Munich: Max
Planck Digital Library, chapter 69. Available online at http://wals.info/chapter/69
Accessed on 2012-08-23.
Dunn, Michael; Stephen C. Levinson; Eva Lindström; Ger Reesink & Angela Terrill.
2008. Structural phylogeny in historical linguistics: Methodological explorations
applied in Island Melanesia. Language 84 (4). 710-759.
Enders, Craig. 2010. Applied Missing Data Analysis. New York: Guilford Press.
Fattier, Dominique. 1998. Contribution à l’étude de la genèse d’un créole: L’atlas
linguistique d’Haïti, cartes et commentaires (6 volumes). Université de Provence,
France: PhD dissertation. (Distributed by Presses Universitaires du Septentrion,
Villeneuve d’Ascq, France.)
Fattier, Dominique. 2002. Lexique: Approche synchronique, à propos de l’haïtien. In
Claudine Bavoux & Didier de Robillard (eds.) Linguistique et créolistique.
Univers créoles 2, 111–128. Paris: Anthropos.

Fattier, Dominique. 2003. Grammaticalisations en créole haïtien : Morceaux choisis.
Creolica. 23 avril 2003.
Fon Sing, Guillaume & Jean Leoue. 2012. Creoles are not typologically distinct from
non-creoles. Paper presented at Ninth Creolistics Workshop: Contact languages in
a global context: Past and present. Aarhus University, Denmark, 11–13 April
2012.
Gray, Russell; A. J. Drummond & Simon J. Greenhill. 2009. Language phylogenies
reveal expansion pulses and pauses in Pacific settlement. Science 323. 479–483.
Hall, Robert A., Jr. 1958. Creole languages and genetic relationships. Word 14. 367–373.
Hall, Robert A., Jr. 1962. The life-cycle of pidgin languages. Lingua 11. 151–156.
Hancock, Ian. 1987. A preliminary classification of the Anglophone Atlantic creoles
with syntactic data from thirty-three representative dialects. In Glenn Gilbert
(ed.), Pidgin and Creole Languages. Essays in Memory of John E. Reinecke, 264–
333. Honolulu: University of Hawaii Press.
Harris, Alice & Lyle Campbell. 1995. Historical Syntax in Cross-Linguistic Perspective.
Cambridge: Cambridge University Press.
Hoenigswald, Henry M. 1960. Language change and linguistic reconstruction.
[Chicago]: University of Chicago Press.
Holm, John. 2008. Creolization and the fate of inflections. In Thomas Stolz, Dik
Bakker & Rosa Salas Palomo (eds.), Aspects of Language Contact. New
Theoretical, Methodological and Empirical Findings with Special Focus on
Romancisation Processes, 299–324. Berlin: Mouton.
Holm, John & Peter Patrick. 2007. Comparative Creole Syntax: Parallel Outlines of 18
Creole Grammars. Westminster Creolistics Series, 7. London: Battlebridge
Publications.
Holman, Eric. 2008. Approximately independent features of languages. International
Journal of Modern Physics C. 19 (2). 215–220.
Holman, Eric; Søren Wichmann; Cecil H. Brown; Viveka Velupillai; André Müller &
Dik Bakker. 2008. Explorations in automated language classification. Folia
Linguistica 42 (3-4). 331–354.
Hubert, Lawrence & Phipps Arabie. 1985. Comparing partitions. Journal of
Classification 2 (1). 193–218.
Huff, Darrell. 1954. How to Lie with Statistics. New York: W. W. Norton.
Hurford, James R. 2012. The origins of grammar. Oxford: Oxford University Press.
Huson, Daniel H. & David Bryant. 2006. Application of phylogenetic networks in
evolutionary studies. Molecular Biology and Evolution 23 (2). 254–267.
Hymes, Dell (ed.) 1971. Pidginization and Creolization of Languages. Cambridge:
Cambridge University Press.
Jain, Ravi; Maria C. Rivera & James A. Lake. 1999. Horizontal Gene Transfer among
Genomes: The Complexity Hypothesis. Proceedings of the National Academy of
Sciences of the United States of America. 96 (7). 3801–3806.
Koopman, Hilda. 1982. Les questions. In Lefebvre et al, 204–241.
Kouwenberg, Silvia (ed.). 2003. Twice as Meaningful: Reduplication in Pidgins,
Creoles and Other Contact Languages. London: Battlebridge.
Kouwenberg, Silvia. 2010. Creole studies and linguistic typology: Part 2. Journal of
Pidgin and Creole Languages. 25(2). 359–380.
Kouwenberg, Silvia. 2012. Rejoinder. Journal of Pidgin and Creole Languages. 27(1).
167–169.
Kuhn, Thomas S. 1970. The structure of scientific revolutions. Chicago: University of
Chicago Press.
Labov, William. 2007. Transmission and diffusion. Language 83(2). 344–387.
Lalla, Barbara & Jean D’Costa (eds.). 1989. Voices in Exile: Jamaican Texts of the 18th
and 19th Centuries. Tuscaloosa: University of Alabama Press.
Lehmann, Christian. 2002. Thoughts on Grammaticalization. Second Edition. Erfurt:
Seminar für Sprachwissenschaft der Universität.
Lefebvre, Claire. 1998. Creole Genesis and the Acquisition of Grammar: The Case of
Haitian Creole. Cambridge Studies in Linguistics 88. Cambridge: Cambridge
University Press.
Lefebvre, Claire; Hélène Magloire-Holly & Nanie Piou (eds.). 1982. Syntaxe de l’haïtien. Ann
Arbor MI: Karoma Publishers.
Liceras, Juana; Helmut Zobl & Helen Goodluck (eds.). 2008. The role of formal features
in second language acquisition. New York, NY: Lawrence Erlbaum Associates.
Little, Roderick & Donald Rubin. 2002. Statistical Analysis with Missing Data. Second
Edition. New York: Wiley.
Longobardi, Giuseppe & Cristina Guardiano. 2009. Evidence for syntax as a signal of
historical relatedness. Lingua 119(11). 1679–1706.
Luís, Ana. 2008. Tense marking and inflectional morphology in Indo-Portuguese
creoles. In: Susanne Michaelis (ed.) Roots of Creole Structures: Weighing the
Contribution of Substrates and Superstrates. Amsterdam: John Benjamins.
Martin, Carla D. 2012. Sounding Creole: The Politics of Cape Verdean Language, Music,
and Diaspora. PhD diss., Harvard University.
McGhee, George R. 2011. Convergent Evolution: Limited Forms Most Beautiful.
Cambridge MA: MIT Press.
McMahon, April & Robert McMahon. 2006. Keeping contact in the family: Approaches
to language classification and contact-induced change. In Yaron Matras, April
McMahon & Nigel Vincent (eds.) Linguistic Areas. Convergence in Historical
and Typological Perspective, 51–74. Houndmills Basingstoke: Palgrave
Macmillan.
McMahon, April; Paul Heggarty; Robert McMahon, & Warren Maguire. 2007. The
sound patterns of English: Representing phonetic similarity. English Language
and Linguistics 11. 113–143.
McWhorter, John. 1998. Identifying the Creole Prototype: Vindicating a Typological
Class. Language 74 (4). 788–818.
McWhorter, John. 2001. The world's simplest grammars are creole grammars.
Linguistic Typology 5 (2/3). 125–166.
McWhorter, John. 2011. Linguistic Simplicity and Complexity: Why Do Languages
Undress? Boston: De Gruyter Mouton.
Meillet, Antoine. 1919. Le genre grammatical et l’élimination de la flexion. Scientia.
Rivista di Scienza, XXV, LXXXVI, 6. Reprinted in Antoine Meillet. 1958.
Linguistique historique et linguistique générale. Volume I. Paris, France: Honoré
Champion.
Meillet, Antoine. 1958. Linguistique historique et linguistique générale. Volume I.
Paris: Honoré Champion.
Michaelis, Susanne. 1993. Temps et aspect en créole seychellois: Valeurs et
interférences. Hamburg: H. Buske.
Miyagawa, Shigeru. 2010. Why Agree? Why Move? Unifying Agreement-Based and
Discourse-Configurational Languages. Cambridge MA: MIT Press.
Moreau de Saint-Méry, M. L. E. 1797. Description topographique, physique, civile,
politique et historique de la partie française de l’isle de Saint Domingue (3
volumes). Philadelphia: Chez l’auteur.
Mufwene, Salikoko. 2008. Language evolution: Contact, competition and change.
London, UK: Continuum.
Müller, Gereon. 2011. Constraints on Displacement: A Phase-Based Approach.
Amsterdam: John Benjamins.
Nakhleh, Luay; Tandy Warnow; Don Ringe & Steven N. Evans. 2005a. A comparison
of phylogenetic reconstruction methods on an Indo-European dataset.
Transactions of the Philological Society. 103 (2). 171–192.
Nakhleh, Luay; Don Ringe & Tandy Warnow. 2005b. Perfect Phylogenetic Networks:
A new methodology for reconstructing the evolutionary history of natural
languages. Language. 81 (2). 382–420.
Nichols, Johanna. 1992. Linguistic Diversity in Space and Time. Chicago: University of
Chicago Press.
Nichols, Johanna. 2003. Diversity and Stability in Language. In Brian Joseph & Richard
Janda (eds.) The Handbook of Historical Linguistics, 283–310. Malden, MA:
Blackwell Pub.
Nichols, Johanna & Balthasar Bickel. 2011. Locus of Marking in Possessive Noun
Phrases. In Matthew Dryer & Martin Haspelmath (eds.) The World Atlas of
Language Structures Online. Munich: Max Planck Digital Library, chapter 24.
Available online at http://wals.info/chapter/24. Accessed on 2012-08-21.
Nichols, Johanna & Tandy Warnow. 2008. Tutorial on Computational Linguistic
Phylogeny. Language and Linguistics Compass. 2 (5). 760–820.
Nomura, Osamu & Hiroshi Yasue. 1999. Genetic relationships among hippopotamus,
whales, and bovine based on SINE insertion analysis. Mammalian Genome:
Official Journal of the International Mammalian Genome Society. 10 (5). 526–
527.
Nurse, Derek; Sarah Rose & John Hewson (eds.). 2011. Verbal categories in Niger-
Congo. http://www.mun.ca/linguistics/nico/
O’Grady, William Delaney; John Archibald; Mark Aronoff & Janie Rees-Miller. 2010.
Contemporary Linguistics: An Introduction. Boston: Bedford/St. Martin’s.
Parkvall, Mikael. 2008. The simplicity of creoles in a cross-linguistic perspective. In
Matti Miestamo, Kaius Sinnemäki & Fred Karlsson (eds.) Language Complexity:
Typology, Contact, Change, 265–285. Amsterdam: John Benjamins.
Paul, Hermann. 1891. Principles of the history of language. London: Longmans, Green &
Co.
Pelleprat, Pierre. 1665. Relation des missions des PP. de la Compagnie de Jésus dans
les îles et dans la terre ferme de l’Amérique Méridionale. Paris: Cramoisy &
Cramoisy.
Piou, Nanie. 1982. Le clivage du prédicat. In Lefebvre et al, 122–151.
Plag, Ingo. 2008a. Creoles as interlanguages: Inflectional morphology. Journal of
Pidgin and Creole Languages 23(1). 114–135.
Plag, Ingo. 2008b. Creoles as interlanguages: Syntactic structures. Journal of Pidgin
and Creole Languages 23(2). 307–328.
Plag, Ingo. 2009a. Creoles as interlanguages: Phonology. Journal of Pidgin and Creole
Languages. 24 (1). 119–138.
Plag, Ingo. 2009b. Creoles as interlanguages: Word-formation. Journal of Pidgin and
Creole Languages. 24 (2). 339–362.
Posner, Rebecca. 1985. Creolization as typological change: Some examples from
Romance syntax. Diachronica 2(2). 167–188.
Prévost, Philippe & Lydia White. 2000. Missing surface inflection or impairment in
second language acquisition? Evidence from tense and agreement. Second
Language Research, 16(2). 103–133.
Rankin, Robert. 2003. The Comparative Method. In Brian Joseph & Richard Janda
(eds.) The Handbook of Historical Linguistics, 183–212. Malden, MA: Blackwell
Pub.
Ringe, Donald A. & Joseph F. Eska. 2013. Historical linguistics: Toward a twenty-first
century reintegration. Cambridge: Cambridge University Press.
Ringe, Donald; Tandy Warnow & Ann Taylor. 2002. Indo-European and Computational
Cladistics. Transactions of the Philological Society 100(1). 59–129.
Roberts, Sarah & Joan Bresnan. 2008. Retained inflectional morphology in pidgins: A
typological study. Linguistic Typology 12(2). 269–302.
Rountree, S. Catherine & Naomi Glock. 1982. Saramaccan for Beginners: A
Pedagogical Grammar of the Saramaccan Language. Paramaribo, Suriname:
Instituut voor Taalwetenschap.
Rubin, Donald. 1976. Inference and Missing Data. Biometrika 63(3). 581–592.
Saint-Quentin, Alfred de & Auguste de Saint-Quentin. 1872. Introduction à l’histoire
de Cayenne suivie d’un recueil de contes, fables et chansons en créole avec
traduction en regard, notes & commentaires. Antibes: J. Marchand.
Schafer, Joseph & John Graham. 2002. Missing data: Our view of the state of the art.
Psychological Methods 7(2). 147–177.
Schumann, John H. 1978. The Pidginization Process: A Model for Second Language
Acquisition. Rowley MA: Newbury House Publishers.
Singler, John. 1996. Theories of Creole Genesis, sociohistorical considerations, and the
evaluation of the evidence: The Case of Haitian Creole and the Relexification
Hypothesis. Journal of Pidgin and Creole Languages 11(2). 185–230.
Starnes, Daren; Dan Yates & David Moore. 1996. The Practice of Statistics.
Fourth Edition. New York: W. H. Freeman.
Stearns, S. C., & Rolf F. Hoekstra. 2000. Evolution: An Introduction. London: Oxford
University Press.
Sterlin, Marie-Denise. 1989. Les caractéristiques de pou: Un modal en position de
complémenteur. Revue québécoise de linguistique 18(2). 131–147.
Sylvain, Suzanne. 1936. Le créole haïtien. Morphologie et syntaxe. Port-au-Prince:
Wetteren.
Syvanen, Michael & Clarence I. Kado. 2002. Horizontal Gene Transfer. San Diego
CA: Academic Press.
Taylor, Douglas. 1956. Language contacts in the West Indies. Word 12. 399–414.
Theobald, Douglas L. 2010. A formal test of the theory of universal common ancestry.
Nature 465 (7295). 219–222.
Thomason, Sarah & Terrence Kaufman. 1988. Language Contact, Creolization, and
Genetic Linguistics. Berkeley: University of California Press.
Viada Bellido de Luna, Marta & Nicholas Faraclas. 2012. Linguistic evidence for the
influence of indigenous Caribbean grammars on the grammars of the Atlantic
Creoles. In Nicholas Faraclas (ed.) Agency in the Emergence of Creole
Languages: The Role of Women, Renegades, and People of African and
Indigenous Descent in the Emergence of the Colonial Era Creoles. Amsterdam:
John Benjamins.
Welmers, William Everett. 1973. African Language Structures. Berkeley: University of
California Press.
Weinreich, Uriel. 1958. On the compatibility of genetic relationship and convergent
development. Word 14. 374–379.
Whinnom, Keith. 1971. Linguistic hybridization and the ‘special case’ of Pidgins and
Creoles. In Hymes, 91–115.
Wichmann, Søren & Arpiar Saunders. 2007. How to use typological databases in
historical linguistic research. Diachronica 24(2). 373–404.
Witte, Robert & John Witte. 2010. Statistics. Ninth Edition. Hoboken: John Wiley.
