Ramazan Korkmaz, Gürkan Doğan, Endangered Languages of The Caucasus and Beyond (2017)

Endangered Languages of the Caucasus and Beyond
The Languages of Asia Series
Series Editor
Alexander Vovin (Ecole des Hautes Etudes en Sciences Sociales (EHESS), France)
Editorial Board
José Andrés Alonso de la Fuente (Universitat Autònoma de Barcelona)

Wolfgang Behr (University of Zurich)
Uwe Bläsing (Leiden University)
Bjarke Frellesvig (University of Oxford)
Stefan Georg (University of Bonn)
Juwon Kim (Seoul National University)
Ross King (University of British Columbia)
Dongho Ko (Jeonbuk National University)
Mehmet Ölmez (Istanbul Technical University)
Toshiki Osada (Institute of Nature and Humanity, Kyoto)
Laurent Sagart (CRLAO, Paris)
Claus Schönig (Freie Universität Berlin)
Marek Stachowski (Jagellonian University of Kraków)
Yukinori Takubo (Kyoto University)
John Whitman (Cornell University)
VOLUME 15
The titles published in this series are listed at brill.com/la

Endangered Languages of the
Caucasus and Beyond
Edited by
Ramazan Korkmaz and Gürkan Doğan
LEIDEN | BOSTON
Cover illustration: “FLYING ‘WHITE’ SHAMAN IN ‘T’ LETTER SHAPE” by Ahmet Ali Aslan.
Reproduced with kind permission. According to Altaian belief, the soul is carried off by the spirits
eastward if the youth is destined to become a ‘White’ shaman. The flying shaman form, full of symbols
referring to the cycle of life (snake), and the divine (Upper, Middle, and Lower World, Black and White),
creates the letter “T” in the original alphabet used by Turkic peoples in their Orhun and Yenisei script.
The Library of Congress Cataloging-in-Publication Data is available online at http://catalog.loc.gov

LC record available at http://lccn.loc.gov
Typeface for the Latin, Greek, and Cyrillic scripts: “Brill”. See and download: brill.com/brill-typeface.
ISSN 2452-2961
ISBN 978-90-04-32564-7 (hardback)
ISBN 978-90-04-32869-3 (e-book)
Copyright 2017 by Koninklijke Brill NV, Leiden, The Netherlands.

Koninklijke Brill NV incorporates the imprints Brill, Brill Hes & De Graaf, Brill Nijhoff, Brill Rodopi and
Hotei Publishing.
All rights reserved. No part of this publication may be reproduced, translated, stored in a retrieval system,
or transmitted in any form or by any means, electronic, mechanical, photocopying, recording or otherwise,
without prior written permission from the publisher.
Authorization to photocopy items for internal or personal use is granted by Koninklijke Brill NV provided
that the appropriate fees are paid directly to The Copyright Clearance Center, 222 Rosewood Drive,
Suite 910, Danvers, MA 01923, USA. Fees are subject to change.
Brill has made all reasonable efforts to trace all rights holders to any copyrighted material used in this work.
In cases where these efforts have not been successful the publisher welcomes communications from
copyright holders, so that the appropriate acknowledgements can be made in future editions, and to settle
other permission matters.
This book is printed on acid-free paper and produced in a sustainable manner.

Contents
Preface
1 Consequences of Russian Linguistic Hegemony in (Post-)Soviet Colonial Space
Gregory D. S. Anderson
2 The Contacts between the Ossetians and the Karachay-Balkars, According to V. I. Abaev and Marrian Ideology
Johnny Cheung
3 Why Caucasian Languages?
Bernard Comrie
4 International Research Collaboration on Documentation and Revitalization of Endangered Turkic Languages in Ukraine: Crimean Tatar, Gagauz, Karaim, Qrymchak and Urum Experience
İryna M. Dryga
5 Cases-Non-cases: At the Margins of the Tsezic Case System
Diana Forker
6 Language Endangerment in the Balkans with Some Comparisons to the Caucasus
Victor A. Friedman
7 Instilling Pride by Raising a Language’s Prestige
George Hewitt
8 Unwritten Minority Languages of Daghestan: Status and Conservation Issues
Zaynab Alieva and Madzhid Khalilov
9 Report on the Fieldwork Studies of the Endangered Turkic Languages
Yong-Sŏng Li
10 Empire, Lingua Franca, Vernacular: The Roots of Endangerment
Nicholas Ostler
11 Endangered Turkic Languages from China
Mehmet Ölmez
12 The Death of a Language: The Case of Ubykh
A. Sumru Özsoy
13 Diversity in Dukhan Reindeer Terminology
Elisabetta Ragagnin
14 How Much Udi is Udi?
Wolfgang Schulze
15 Language Contact in Anatolia: The Case of Sason Arabic
Eser Erguvanlı Taylan
16 Language and Emergent Literacy in Svaneti
Kevin Tuite
17 The Internet as a Tool for Language Development and Maintenance? The Case of Megrelian
Karina Vamling
18 Linguistic Topography and Language Survival
George van Driem
19 And So Flows History
Alexander Vovin
Index
Preface
According to UNESCO, at least half of the nearly 7,000 languages spoken
around the world are expected to fall out of use within the next 100 years.
If this issue is neglected, people will lose not only their cultural heritage but
also invaluable understandings about the history of all humankind. In other
words, with the disappearance of unwritten and undocumented languages,
humanity would lose not only cultural wealth but also important ancestral
knowledge embedded, in particular, in indigenous languages. As many of these
languages are no longer being handed down from generation to generation,
the number of their native speakers is decreasing, leaving
them endangered. For instance, the Ubykh language, one of the languages spoken
in the Northwest Caucasus, became extinct when its last speaker, Tevfik
Esenç, died in 1992. For such reasons, if efforts are not made to document the
speech and cultural practices of those who use these languages, many speech
forms will disappear along with the cultural heritage they embody.
Against this background, the 1st International CUA Conference on
Endangered Languages was held by the Caucasus University Association (CUA)
on 13-16 October 2014, at Ardahan, Turkey, in collaboration with the Turkish
Language Society, Ardahan University and Harvard University. The goal of this
“open by invitation conference” was to bring together 35 prominent scholars
from all over the world who were actively engaged in research projects and
partnerships on different aspects of language endangerment, documentation and
revitalization, and to create a forum for discussing the problems of dying
languages on both theoretical and practical levels. The motto of the Conference
was “before shooting stars vanish”, in the sense that the CUA views each
language as a star in the sky that could turn into a meteor and believes that as
many of these languages as possible should be documented and/or revitalised
before they fade away. The focus of the Conference was global, with
a particular emphasis on the languages and cultures of the Caucasus and
Asia. The aim was to draw international discussion to a linguistically diverse and
unique corner of the globe, from which glocal scholars would be able to
benefit in their work. The conference program is available on the official web
site (https://www.ardahan.edu.tr/CUAConference2014/).
The region in which the 124 participant universities of the CUA are situated
is home to a rich linguistic diversity, much of it highly endangered, making it
particularly appropriate to situate the Conference at Ardahan.
The primary intent of the Conference was to offer the invited scholars the
opportunity to address relevant linguistic issues and to articulate best practices
in the field. Hence the participants addressed issues such as the state of the
field of language documentation, conservation and revitalization; experiences
that reflect on establishing research centres and international research
collaborations; or topics related to technology, data collection, archiving and
preservation.
The conference concluded with an action plan whose ultimate goal
was to establish an “Endangered Languages Research Centre” at Ardahan. We
are very happy to announce that the Turkish Language Society has kindly pledged
to support this Centre financially.
All 35 participants were invited to submit their papers for this publication,
and 19 scholars submitted their work; these contributions make up the papers in this
volume.
We would like to thank all of the presenters in the panel sessions, who made
the conference not only interesting but provocative as well.
The editors at Brill kindly accepted the current editors’ suggestion to publish
these papers in the form of a special volume.
We believe that the present volume forms a coherent collection that complements
the previously published volumes of Global Oriental. Special
thanks go to our conference sponsors for their continued support: the Turkish
Language Society, Kyrgyz-Turkish Manas University and the Turkish National
Commission for UNESCO. Last but not least, we wish to thank the Smithsonian
Institution, the Foundation for Endangered Languages, and the Living Tongues
Institute for Endangered Languages for their invaluable academic support.
Prof. Ramazan Korkmaz

Prof. Gürkan Doğan
Chapter 1
Consequences of Russian Linguistic Hegemony in (Post-)Soviet Colonial Space
Gregory D. S. Anderson
Introduction
Siberia – the very name can evoke a shudder. Why? True, it is the coldest
inhabited part of the earth so a shiver, yes, but why a shudder? This is due
to the fact that Siberia is known around the world and in Russia itself, where
it constitutes the vast majority of the land mass of that giant nation,1 as a
frozen wall-less prison. To be sure, Siberia served as a penal colony for the
Tsarist Imperial Russian leaders and this tradition was institutionalized with
murderous zeal by their successor Soviet hegemons. But this is only part of the
sad story of the consequences of Russian/Soviet colonialism on the diverse
native populations of Siberia. For the purposes of the present study I focus
only on the linguistic consequences of Russian imperialism and hegemony on
the Native Siberian peoples.
I start with a brief overview of the diverse Native Siberian groups as they
stood at the time of the initial colonialist expansion and exploitation in the
16th century in section 1. I turn in section 2 to an introduction to various pre-
Soviet phases of Russian colonialism and hegemony over Native Siberian
populations. In section 3, using post-Soviet census data, I discuss the issues
of language shift and ethnic shame that move ever forward among the Native
Siberian population groups in the post-Soviet colonial space. Finally in section
4 I present some structural linguistic consequences of Russian linguistic hegemony
on the grammatical structures of the dwindling and receding languages
of the vast Siberian territory.
1 Indeed we should say the entire Asian portion of Russia in a traditional, non-administrative
understanding of the term Siberia.
1 Native Siberia at the Time of Russian Contact
Siberia at the time of contact was home to several dozen languages belonging
to a range of different language families. Moving west to east we find various
northern Samoyedic (Nenets, Enets, Nganasan) and Ob-Ugric (Khanty, Mansi)
speaking peoples in the western edge of Siberia between the Urals and the
Ob-Irtysh River complex. These people mainly pursued reindeer breeding in
the north and hunting and fishing economies in the southern parts of this
region. To their east were found a range of Siberian Turkic groups (Tuvan,
several Altai and Xakas groups, Shor, Chulym Turks, the Tofa and Siberian Tatar
groups) in the southern regions and southern Samoyedic (Selkup, Kamasian,
etc.) and Yeniseic groups, today represented only by the Ket and Yugh. The
Turkic speakers were largely pastoral nomads, but this was mixed with hunting/
fishing or reindeer breeding in the northern mountainous and swampy regions
where these economic pursuits were more viable, while the Samoyedic and
Yeniseic peoples originally pursued hunting/fishing economies with some
limited reindeer-based economies, for example among the northern Selkup.
The Tungusic speaking groups occupied a vast territory stretching eastward
from central Siberia all the way to Sakhalin in the south, Kamchatka in the east,
and the Russian Arctic Far East in the north. These include such languages as
Evenki, Even, Negidal, Nanai, Udihe, Ulcha, Oroch and Orok. Tungusic peoples
were engaged in either hunting pursuits or reindeer economies, depending
on the environmental conditions. In the southern part of the western half of
this territory, Tungusic speakers were in contact with the pastoralist Buriat,
speakers of a Mongolic language, and their now extinct linguistic cousins
the Soyot. On Sakhalin and in the Amur river area, Tungusic speakers were
in contact with Nivkh (Gilyak), a riverine fishing-oriented people who speak
a language isolate. In the southern and central part of Kamchatka, Itelmen-
speaking people were found, and in the north Koryak who speak a Chukotko-
Kamchatkan language. In the northern part of the Tungus-speaking area were
found to the east the reindeer-breeding Yukaghiric-speaking peoples like the
Odul, Wadul, Chuvan and Omok, the latter two now extinct linguistically, and
the Omok ethnically as well. To the east of the Yukaghiric peoples were
the reindeer-herding Chukchi, also of the Chukotko-Kamchatkan language
family (which also includes Chukchi’s sister languages Kerek and Al’utor),
while in the coastal parts of Chukotka, Chukchi-speaking people pursued sea-
mammal hunting oriented economies similar to the local Eskimoic-speaking
populations, the Sireniki, Naukan and Siberian Yupik. In short, Native Siberia
at the time of Russian contact was home to a vast array of different peoples
speaking several dozen languages belonging to a range of unrelated linguistic
taxa or genetic units.
2 Phases of Colonialism and Hegemony in Native Siberia
Siberia was subjected to at least two very different patterns of colonialism
during the Russian Imperial era. The first stage treated Siberia as a revenue
resource for the Imperial coffers (Forsyth 1992, Slezkine 1994). At this stage,
during roughly three hundred years stretching from the 16th through
the 18th century, Siberia stood as a classic example of an exploitation colony.
The expansion to Siberia was pursued to fulfill a Tsarist lust for fur, complete
with forts to ensure the tax collection and protect the Crown’s interests. As
long as the dreaded yasak or fur tribute tax was paid, often set at debilitatingly
high rates, which resulted in a state of virtual enslavement for many families
across generations, Native Siberian people were more or less free to continue
their traditional livelihoods (Forsyth 1992, Slezkine 1994). When the tribute was not
paid, harsh physical punishments (including death) were meted out (Forsyth 1992,
Slezkine 1994).
The next phase in the colonialist mistreatment of Siberia was that
of the penal-cum-settlement colony, for which it is known the world over.
Exiles with families were to begin an irrevocable shift in the demographics
of Siberia. Territorial hegemony was expanded and economic hegemony over
the already exploited Native Siberians solidified throughout the 19th century.
Ultimately this was followed by attempts at spiritual hegemony over the Native
Siberian populations as well, since inevitably with the exiles and newcomers
came missionaries determined to bring the heathens of Siberia into the Russo-
Christian light.
While several Native Siberian groups abandoned their languages in this
period, the shift was from one indigenous language to another, generally
accompanied by a change in traditional economic pursuits. Thus several former
hunter-gatherer groups speaking Yeniseic (Arin, Assan, Kott, Pumpokol)
or southern Samoyedic languages (Kamasian, Mator, Taigi) adopted a local
Siberian Turkic lect when they adopted pastoral nomadism. Russian thus still
had relatively little impact on the Native Siberian languages up until the 20th
century, except for lexical borrowings which were generally phonologically
accommodated to the indigenous sound systems (Anderson 1995).
What changed all this was the rise of Soviet Siberia. Ideologically, the
Native Siberian populations were problematic to place in the social order of
Marxism-Leninism, seeing as many were engaged in Stone-Age economies,
with little relevance to the urban worker milieu that engendered the Bolshevik
Weltanschauung. An early idealism with respect to the proper means of delivering
the message of Red Salvation to the exploited masses – i.e., it was seen as
best to be delivered in their own tongues – was at best incompetently attempted
and quickly abandoned. Native Siberians were definitely, as put by Slezkine
(1994), last among equals in the fraternal brotherhood of nations moving to
the great Soviet Socialist future. Ideological hegemony over these populations
was only part of the goal. The environment of paranoia that spread in the late
1920s and accelerated through the 1930s during Stalin’s reign of terror and
power consolidation led to requisitioning vast stretches of Native Siberian territory
and putting them into the service of the State-supported terror regime
against its citizenry, with Native Siberian populations being disproportionately
subjected to this. During the Second World War, massive numbers of people
including entire ethnic groups from the Caucasus and hundreds of thousands
of Ukrainians were deported to Siberia (Grenoble 2003). This further tipped
the demographic scales away from Native Siberians. Moreover, officially the
only access to education and economic and social advancement was through
the medium of Russian, following the implementation of Stalin’s ‘second
mother tongue’ policy (Grenoble 2003). To compound the issues even more,
traditional lifeways were suppressed (collectivization of herds, shamans executed
as kulak-overlords), which further disrupted the populations and eroded
the Native Siberian cultural fabric.
The result of all of these trends was a massive shift away
from Native Siberian ethnic identities and towards the now supremely dominant
Russian language. We can see this manifest itself in language retention
rates and reported rates of ethnic affiliation, both discussed in section 3 below, and
in the wholesale restructuring of the grammars of the minority languages
actually still used by Native Siberians, which I turn to in section 4.
3 Census Data and Indicators of Ethnic Shame & Language Shift
All things being equal, minority population groups should grow steadily unless
some catastrophe has occurred like epidemic disease or war that has fractioned
the population. Indeed we can see this trend when looking at census returns
on ethnic affiliation for some (but not all) Native Siberian groups. If we look at
these trends as reported in six successive census returns from 1959 to 2010, four in
the Soviet period and two in the post-Soviet period, for groups like the Nenets,
the Evenki, the Dolgan or the Yukaghir, everything appears normal in terms of
demographic growth.
Population figures from census records
(1)
              1959    1970    1979    1989    2002    2010
Nenets      23,007  28,705  29,894  34,665  41,302  44,640
Evenki      24,151  25,471  27,294  30,163  35,527  38,396
Dolgan       3,932   4,877   5,053   6,945   7,261   7,885
Yukaghir       442     615     835   1,142   1,509   1,603
For many other groups however, there are disturbing trends revealed, and
the number actually declines in many cases. What the demographic decline
numbers actually reveal is not loss of physical representatives/actual
population members, but rather the decline in the subjective evaluation
of being associated with that identity. This is a process called ethnic shame.
Thus, most Native Siberian groups have experienced, or are now undergoing,
a rapid spread of ethnic shame and are hiding their identity in favor of reporting
as Russian. We can divide the Native Siberian groups into roughly four
sets, as they reflect somewhat different trends in the spread of ethnic
shame. Two groups from western Siberia, the Khanty and Mansi, experienced
this only during the Soviet era, between 1970 and 1979. This period marked
a massive upturn in the exploitation of the extractive oil and gas industries
on their territories. Since then, the group that self-identifies as Mansi
has grown steadily, while the Khanty experienced an upturn in ethnic identity
between 1989 and 2002, but then a more plausible increase between 2002
and 2010.
(2)
              1959    1970    1979    1989    2002    2010
Khanty      19,410  21,138  20,934  22,521  28,678  30,943
Mansi        6,449   7,710   7,563   8,474  11,432  12,269
Four other Native Siberian groups began showing this spread of ethnic shame
already in the same period, but this has continued and/or accelerated. These
are the Oroch, Nganasan, Ket and Nivkh. The various fluctuating numbers show
local trends that distinguish each of these groups. Oroch began this decline in
ethnic identity and spread of ethnic shame in the period between 1979 and 1989,
the other three between 1970 and 1979. Nivkhs experienced normal growth reports
between 1970 and 1979. Nganasans returned a serious upturn in self-reported
ethnic identity between 1979 and 1989, but experienced steep decline between
1989 and 2002. Kets showed a reverse trend with continued decline between
1979 and 1989 and an up-turn in the immediate post-Soviet decade (1989-2002).
However, all four report declines in the most recent census.
(3)
              1959    1970    1979    1989    2002    2010
Oroch          782   1,089   1,198     915     686     596
Nganasan       748     953     867   1,278     862     834
Ket          1,019   1,182   1,122   1,113   1,494   1,219
Nivkh        3,717   4,420   4,397   4,673   5,162   4,652
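The 2002-2010 declines reported in (3) can be checked arithmetically. What follows is a minimal illustrative sketch in Python and not part of the original analysis: the figures are copied from (3), and the names CENSUS, percent_change and declining are hypothetical helpers introduced only for this illustration.

```python
# Census counts copied from table (3); inner keys are census years.
CENSUS = {
    "Oroch":    {1959: 782, 1970: 1089, 1979: 1198, 1989: 915, 2002: 686, 2010: 596},
    "Nganasan": {1959: 748, 1970: 953, 1979: 867, 1989: 1278, 2002: 862, 2010: 834},
    "Ket":      {1959: 1019, 1970: 1182, 1979: 1122, 1989: 1113, 2002: 1494, 2010: 1219},
    "Nivkh":    {1959: 3717, 1970: 4420, 1979: 4397, 1989: 4673, 2002: 5162, 2010: 4652},
}


def percent_change(counts, start, end):
    """Relative change in reported population between two census years, in percent."""
    return 100.0 * (counts[end] - counts[start]) / counts[start]


def declining(census, start, end):
    """Names of groups whose reported count fell between the two census years."""
    return [name for name, counts in census.items() if counts[end] < counts[start]]


if __name__ == "__main__":
    for name, counts in CENSUS.items():
        print(f"{name}: {percent_change(counts, 2002, 2010):+.1f}% (2002-2010)")
    print("Declined 2002-2010:", declining(CENSUS, 2002, 2010))
```

The same two helpers can be pointed at any other pair of census years in the table to compare the local fluctuations described above.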
Indeed this is the trend one finds across Siberia: the process of ethnic shame
has advanced and spread considerably in the post-Soviet period. Some groups
began to show this decline in the period between the end of the USSR and
the first post-Soviet census (1989-2002), a group which includes four Tungusic
languages, Nanai, Ulcha, Udihe and Negidal, plus Koryak and Aleut. This
process has continued in the 2010 census returns.
(4)
              1959    1970    1979    1989    2002    2010
Nanai        8,026  10,005  10,516  12,023  12,160  12,003
Koryak       6,287   7,487   7,879   9,242   8,743   7,953
Ulcha        2,055   2,448   2,552   3,233   2,913   2,765
Udihe        1,444   1,469   1,551   2,011   1,657   1,496
Aleut          421     450     546     702     540     482
Negidal          —     537     504     622     567     513
The last group are just recently revealing these trends in ethnic shame and self-
invisibilization. The 2010 census returns show the first reported demographic
decline among the Chukchi, Sel’kup, Itelmen and Tofa.
(5)
              1959    1970    1979    1989    2002    2010
Chukchi     11,727  13,597  14,000  15,184  15,767  15,098
Sel’kup      3,768   4,282   3,565   3,612   4,249   3,649
Itelmen      1,109   1,301   1,370   2,481   3,180   3,193
Tofa           586     620     763     731     837     762
Unsurprisingly, language retention rates for Native Siberian languages
show even further decline. What makes this particularly alarming is that
the reported language numbers are generally inflated massively (Grenoble
2003; Anderson 2010, 2011). Thus, while the 2002 census of Tofa reported 378
speakers, I myself carried out a door-to-door survey in 2001 in the villages where
Tofa is spoken, and the actual number then was under 40! Indeed Native
Siberia is one of the most extreme areas of language endangerment and thus
represents one of the hottest of the Language Hotspots (Anderson 2010, 2011, in
preparation). The vast majority of languages are either moribund or seriously
endangered. Even the groups whose populations do not show advancing
ethnic shame are rapidly shifting to Russian and away from their ancestral
languages.
Compare the reported retention rates for the censuses between 1959 and
2002 for the above mentioned five sets of languages. We see lower percentages
across the board as a rule. Some languages only show this decline in reported
numbers. Where numbers increase, we are dealing with specific social conditions
that reflect more nostalgia for, and identity with, the ancestral language,
and not actual linguistic practices. This is in part triggered by the Russian term
rodnoj jazyk (‘native language’) that census takers use, which is a phrase that connotes something
more like the language of your heart and your heritage than the one of your
tongue and brain. And all these upward trends are followed in subsequent
returns by more realistic declines.
(6)
              1959    1970    1979    1989    2002
Nenets       84.7%   83.4%   80.4%   77.1%   75.8%
Evenki       54.9%   51.3%   42.8%   30.4%   21.3%
Dolgan       93.9%   89.8%   90.0%   81.7%   67.0%
Yukaghir     52.5%   46.8%   37.5%   32.8%   40.0%

Khanty       77.0%   68.9%   67.8%   60.5%   47.3%
Mansi        59.2%   52.4%   49.5%   37.1%   24.0%

Oroch        68.4%   48.6%   40.7%   18.8%   37.5%
Nganasan     93.4%   75.4%   90.2%   83.2%   60.6%
Ket          77.1%   74.9%   61.0%   48.3%   32.5%
Nivkh        76.3%   49.5%   30.6%   23.3%   13.3%

Nanai        86.3%   69.1%   55.8%   44.1%   32.0%
Koryak       90.5%   81.1%   69.0%   52.4%   34.5%
Ulcha        84.9%   60.8%   38.8%   30.8%   25.2%
Udihe        73.7%   55.1%   31.0%   26.3%   13.7%
Aleut        22.3%   21.8%   17.8%   28.3%   32.4%
Negidal          —   53.3%   44.4%   28.3%   25.9%

Chukchi      93.9%   82.6%   78.2%   70.3%   49.1%
Sel’kup      50.6%   51.1%   56.6%   47.6%   38.6%
Itelmen      36.0%   35.7%   24.4%   19.6%   12.1%
Tofa         89.1%   56.3%   62.1%   43.0%   45.2%
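The figures in (5) and (6) can be related to the Tofa speaker count cited earlier. The sketch below is illustrative only and rests on an assumption not stated in the text: that each retention rate in (6) is a percentage of the self-identified population for the same census year. Under that assumption the 2002 Tofa figures (837 people, 45.2%) imply about 378 reported speakers. The name implied_speakers is a hypothetical helper.

```python
def implied_speakers(population, retention_percent):
    """Speaker count implied by a census population and a retention rate in percent.

    Assumes the retention rate is computed over the self-identified population,
    which is an assumption made for this illustration.
    """
    return round(population * retention_percent / 100)


# Tofa, 2002: population from table (5), retention rate from table (6).
tofa_reported = implied_speakers(837, 45.2)

# The 2001 door-to-door survey found fewer than 40 speakers, so 40 is an upper
# bound on the surveyed count and the ratio below is a lower bound on the
# factor by which the census figure overstates it.
SURVEY_UPPER_BOUND = 40
min_inflation = tofa_reported / SURVEY_UPPER_BOUND

if __name__ == "__main__":
    print(tofa_reported, round(min_inflation, 1))
```

On these assumptions the census figure overstates the surveyed number of Tofa speakers by a factor of more than nine.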
Russian linguistic hegemony, together with a language ideology that valorizes Russian as
inherently superior to the other languages of Russia and is enfranchised through the
institutionalization of the Russian language as the only medium of access to
transnational culture, education, and social and economic advancement, spells
doom for most Native Siberian languages. Since it is unlikely these attitudes
will change soon, it is unlikely that the rapid decline of Native Siberian
languages will be reversed. These processes have accelerated significantly
in the post-Soviet period, possibly as a result of the removal of any pretense
of state support for indigenous minority identity in Siberia. Indeed it is not
a stretch to say that most of the language families of Siberia will be extinct
before 2100.
4 Other Consequences of Russian Linguistic Hegemony: Codeswitching and Restructuring
Since basically all Native Siberian people outside of Tuva (which only joined
the USSR in the 1940s) are fluent in Russian, it is no surprise that codemixing
or codeswitched utterances are commonplace in the speech of those Native
Siberians who continue to use the minority languages, such as the following
mixed Russian-Chulym Turkic sentence.
(7) Chulym Turkic (Ös)

kør-ze-m na nas kakoj=ta ʃybyr moɣalaq [kør-ybyly]
see-cond-1 R.at R.us R.some.such poor bear [look-prs]
‘I look and what do I see but some poor bear (looking) at us’
(Anderson and Harrison 2004: 184)
Code-switching with Russian is still not very well explored for almost any
minority language of the Russian Federation and will likely advance the study
of this phenomenon greatly when it can be better investigated. For example,
complex gender agreement interactions are found in Erzya-Russian
codeswitching (Janurik 2015), and similar phenomena are found in most
languages of Russia today, but the topic has not been explored in most of them.
4.1 Evidence of Contact-Induced Restructuring I: Syntax of Complex Sentences
For those languages still extant, Russian structures are penetrating into all
domains of grammar in contemporary Native Siberian language usage. This
includes restructuring the basic syntactic structures of the languages, e.g.,
morphosyntactic projections or frames with respect to case assignment to
arguments by predicates (4.2) or the wholesale restructuring of complex
sentence structure towards ones where the embedded or dependent clause
has a finite verb and is headed by a complementizer, relative pronoun or
subordinator and follows the matrix clause rather than complex sentences
with embedded clauses preceding the matrix clause and appearing with a non-
finite (participle or converb) verb form and no clause initial complementizer,
relative pronoun or subordinator. I will use mainly data from Siberian Turkic
languages (Anderson 2004, 2005) to exemplify the range of changes found
resulting directly from Russian contact and the adoption of Russian linguistic
norms into these originally quite different and distinct language systems.
One area of the syntax of complex sentences that has a clear Russian origin
and one which was originally entirely alien to Siberian Turkic structure is the
use of a clause-initial complementizer followed by a finite verb of the type
S <finite.matrix> COMP S <finite.“embedded”> (8) instead of the original structure of
S <non-finite.“embedded”> + S <finite.matrix> (9), often with a genitive marking on the
subject of the embedded clause to further show the non-finite or nominalized
quality of the embedded predicate. The borrowed Russian complementizer ʃto
introduces this new finite complement clause in (8).
(8) Abakan Xakas

noɣa sɪler saɣɯn-tʃa-zar ʃto min xorɯx-tʃa-m
why you.pl think-prs.i-2pl R.comp I be.scared-prs.i-1
‘why do you think I am scared?’
(Anderson 2005: 197)
(9) Abakan Xakas

sirer-nɪŋ irtɪ nan-dʒaŋar min undu-p
you.pl-gen early return-modal.nonfin:2pl I forget-cv
sal-tɯr-bɯn
pfv-evid.pst-1
‘I forgot that you had to leave early’
Another feature of Russian syntax now found in high-contact varieties of
Siberian Turkic is the use of relative clauses with finite verbs introduced by relative
pronouns that follow the noun relativized on (10). Both such features are alien
to Siberian Turkic which rather used non-finite pre-nominal participle forms
and no relative pronouns, still found in the speech of some less restructured
speakers (11).
(10) Abakan Xakas

sin pil-bin-tʃe-zɪŋ ol kɪʒɪ-dɪ xajzɯ-nɣa min
you know-neg-prs.i-2 that person-acc which-3.dat I
paz-ɯbɯs-xa-m
write-prf-pst-1
‘you don’t know that person I wrote to’
cf. (11) Abakan Xakas

ol kør-gen pyyr-neŋ min xorɯx-pas-tɯɯx-pɯn
he see-pst.prtcpl wolf-gen I fear-neg.irr-sbjnctv-1
‘I wouldn’t have been scared of the wolf he saw’
Another Russian feature that has found its way into Native Siberian languages
is the use of a scopeless negative operator and a finite verb together with a
borrowed subordinator in a type of temporally subordinate clause generally
introduced by ‘until’ or ‘before’ in English translations, and by poka in
the original Russian and restructured Siberian Turkic forms. So compare
for example (12) and (14) with borrowing/restructuring with the original
formations in (13) and (15).
(12) Abakan Xakas

poka pol-bas-tar soox-tar
subord be-neg.fut-pl frost-pl
‘until/before it gets cold’
(13) Abakan Xakas

soox pol-ɣandʒa
frost be-cv
‘until/before it gets cold’
(14) Abakan Xakas

poka turu-bas-pɯn
subord stand-neg.fut-1
‘until I stand’
(15) Abakan Xakas

min tur-ɣandʒa
I stand-cv
‘until I stand’ (Anderson 2005: 219-20)
Intermediate stages in this shift can be found in the speech of some speakers
too. Thus in (16) we find a form introduced by the borrowed subordinator and
with the scopeless negative operator, but with a non-finite verb as well (here in
a participle form marked also with the locative case).
(16) Abakan Xakas

poka pɪs par-ba-an-de ib-zer
subord we go-neg-pst.prtcpl-loc house-all
‘before/until we went home’ (Anderson 2005: 219)
This Russian-originated structure has proven to be particularly easy to borrow.
In addition to the Turkic data above, we find similar formations in Yeniseic
languages too. For Yugh, Verner (1997: 194) lists poka + bəɲ [neg] + Verb, with
a fully restructured subordinator + scopeless negative operator + finite verb. In
S. Ket, on the other hand, one finds in the 1970s a similar structure with a calqued
rather than borrowed subordinator, asjka . . . bən Verb (Kostjakov 1976: 59). In
Central Ket there was variation between the original non-finite (‘converb’)
structure and an intermediate one introduced by the calqued
subordinator + scopeless negative operator, i.e., Verb=baŋdiŋa alternating with
asjka . . . bən . . . Verb=baŋdiŋa (Grishina 1977: 105).
4.2 Contact-Induced Restructuring II: Case Usage, Morphosyntax and Shift in Verbal Subcategorization Frames
Complex sentence structure is not the only domain that shows the tangible
effects of contact with Russian. More subtle morphosyntactic patterns also
reflect the hegemonic pressure Russian has exerted over Native Siberian
languages. One such area is the introduction of dative case for impersonal
subjects (17), whereas the original structure, still used by less restructured
speakers, had the nominative (18) and triggered verb agreement with that
referent as the subject.
(17) Abakan Xakas

maɣaa nan-arɣa kirek
I:dat return.home-inf nec
‘I have to go home’
(18) Abakan Xakas

min ib-zer par-arɣa kirek pol-ɣa-m
I house-all go-inf nec aux-pst-1
‘I had to go home’
While originally essive constructions were marked by a postposition derived
from a converb form of ‘be’ (i.e., ‘being X’ > ‘as X’), and this formation can
still be found for example in the speech of the Bel’tir Xakas (20), in urban
vernacular varieties one finds the very Russian use of instrumental case in this
function (19).
(19) Abakan Xakas

ol traktorist-peŋ toɣɯn-tʃa
he tractor.driver-ins work-prs.i
‘he works as a tractor-driver’
cf. (20) Bel’tir Xakas

dojarka pol-ɯp toɣɯn-tʃa-m
milkmaid be-cv work-prs.i-1
‘I work as a milkmaid’
(Subrakova 1992: 46)
Lastly, the introduction of new functions to cases also extends to the genitive
in the speech of high-contact varieties of Siberian Turkic, such as Tofa (21) or
Abakan Xakas (22). Previously, the genitive was never the case subcategorized
for by any verb as the form of its complement/object/2nd argument in the
Turkic languages, but in these highly restructured varieties it is now
the case that is projected onto the complement of the verb ‘fear, be afraid of’,
a pattern of usage that clearly reflects Russian norms. Originally,
and still in less restructured varieties of these Turkic languages, the ablative
case was/is used (23).
(21) Tofa
kør-gen-ɪ irezaŋ-nɯŋ men kòrt-pa-an men
see-pst.prtcpl-def bear-gen I fear-neg-pst 1
‘I was not afraid of the bear I saw’
(Field Notes 2001, SDA-Bear Story)
(22) Abakan Xakas

ol pyyr-nɪŋ xorɯx-tʃa
he wolf-gen fear-prs.i
‘he is afraid of the wolf’
(Anderson 2005: 179)
(23) Abakan Xakas

olar pyyr-deŋ xorɯx-tʃa-lar
they wolf-abl fear-prs-3pl
‘they are scared of the wolf’
(Anderson 2005: 180)
Some contact-triggered restructuring towards Russian norms is found
throughout the languages of Siberia. For example, Gruzdeva (2015) describes
a host of changes in Nivkh, such as the importation of forms of ‘give’ into the
imperative paradigm as first person imperative forms (24), calqued on the
Russian model (25),
(24) Nivkh
t‘ana ñ-aχ lu-gu-ja
give.imp.2sg I-acc sing-caus-imp.2sg
‘let me sing’
(Gruzdeva 2015: 171)
(25) Russian
daj spo-ju
give.imp sing-fut.1sg
‘let me sing’
(Gruzdeva 2015: 171)
or the calqued use of the clause-initial conditional marker aif, which is
redundant, as there is already conditional inflection on the verb, but which is
used exactly as Russian esli is (Gruzdeva 2015: 174). Nor are Siberian languages unique among
the subjects of linguistic hegemony in the post-Soviet colonial space, although
they are certainly the most severely affected. Similar syntactic restructuring
has been reported in such diverse languages under Russian dominion as
Udmurt (Kaysina 2015) and Chechen (Guerin 2015).
5 Summary
Five centuries of Russian colonialism and hegemony have left the once rich
diversity of languages in Native Siberia in steep decline. Economic exploitation
followed by waves of settlement colonization altered the demographics of
Siberia forever, and Russian emerged as the only language with a place long-
term in the linguistic market. Pressures came from below and above, and
language abandonment and ethnic shame have arisen and spread. The future is
very bleak for the languages of Siberia. Before they disappear entirely, massive
structural borrowing and incorporation of loans from Russian is their fate.
Only adequate documentation undertaken now can leave future generations
of Native Siberians the legacy they deserve should the social and economic
conditions ever once again become favorable to reclaim these identities.
References
Anderson, Gregory D. S. 1995. Diachronic Aspects of Russianisms in Siberian Turkic.
In Berkeley Linguistics Society 21. Berkeley: Berkeley Linguistics Society, pp. 365-76.
Anderson, Gregory D. S. 2004. The Languages of Central Siberia: Introduction and
Overview. In E. Vajda (ed.) Languages and Prehistory of Central Siberia. Amsterdam:
John Benjamins, pp. 1-119.
Anderson, Gregory D. S. 2005. Language Contact in South Central Siberia. Turcologica
54. Wiesbaden: Harrassowitz Verlag.
Anderson, Gregory D. S. 2006. Towards a Typology of the Siberian Linguistic Area.
In Yaron Matras, April McMahon, and Nigel Vincent (eds.) Linguistic Areas.
Convergence in Historical and Typological Perspective. Basingstoke: Palgrave
Macmillan, pp. 266-300.
Anderson, Gregory D. S. 2010. Perspectives on the global language extinction crisis: The
Oklahoma and Eastern Siberia Language Hotspots. Revue Roumaine de Linguistique
XLV: 129-142.
Anderson, Gregory D. S. 2011. Language Hotspots: what (applied) linguistics and
education should do about language endangerment in the twenty-first century.
Language and Education 25 (4): 273-289.
Anderson, Gregory D. S. in preparation. Language Extinction. Cambridge University
Press.
Anderson, Gregory D. S. and K. David Harrison. 2004. Shaman and Bear: Siberian
Prehistory in Two Middle Chulym Texts. In E. Vajda (ed.) Languages and Prehistory
of Central Siberia. Amsterdam: John Benjamins, pp. 179-197.
Forsyth, James. 1992. A History of the Peoples of Siberia. Russia’s North Asian Colony
1581-1990. Cambridge: Cambridge University Press.
Grenoble, L. 2003. Language Policy in the Soviet Union. Dordrecht: Kluwer.
Grishina, N. M. 1977. Upotreblenie slova “bang” v slozhnom predlozhenii ketskogo jazyka
[Use of the word “bang” in Ket complex sentences]. Jazyki i toponimija 4: 102-107.
Gruzdeva, Ekaterina. 1998. Nivkh. Languages of the World Materials LW/M 111. Munich:
LINCOM-EUROPA.
Gruzdeva, Ekaterina. 2015. Sociolinguistic and linguistic outcomes of Nivkh-Russian
language contact. In Stolz (ed.), pp. 153-181.
Guerin, F. 2015. The evolution of Chechen in asymmetrical contact with Russian. In
Stolz (ed.), pp. 183-198.
Janurik, B. 2015. The emergence of gender agreement in code-switched verbal
constructions in Erzya-Russian bilingual discourse. In Stolz (ed.), pp. 199-217.
Kaysina, I. 2015. Grammatical effects of Russian-Udmurt language contact. In Stolz
(ed.), pp. 219-235.
Kostjakov, M. M. 1976. Ketskie sootvetstvija russkomu slozhnopodchinennomu
predlozheniju s pridatochnym vremeni. [Ket correspondences to the Russian complex
sentence with a subordinate clause of time] Jazyki i toponimija 1: 56-62.
Slezkine, Y. 1994. Arctic Mirrors. Russia and the Small Peoples of the North. Ithaca, New
York: Cornell University Press.
Stolz, Christel (ed.) 2015. Language Empires in Comparative Perspective. Berlin: Mouton
de Gruyter.
Subrakova, O. P. 1992. Padezhnaja sistema v bel’tirskix govorax xakasskogo jazyka.
[The case system of the Bel’tir variety of Xakas] Xakasskaja dialektologija, pp. 32-50.
Abakan: XakNIIJALiI.
Verner, G. K. 1997. Jugskij jazyk. [The Yugh language] In A. P. Volodin (ed.) Jazyki mira:
Paleoaziatskie jazyki. Moscow: Indrik, pp. 187-195.
Internet Sources
http://www.perepis2002.ru/
http://www.tooyoo.l.u-tokyo.ac.jp/Russia/bibl/
Chapter 2
The Contacts between the Ossetians and the Karachay-Balkars, According to V. I. Abaev and Marrian Ideology
Johnny Cheung
A modest tribute to Uwe Bläsing and his forensic approach
to etymology and the origin of words
1 Introduction
Ossetic, spoken in the Northern Caucasus, is geographically the westernmost
East Iranian language. Its speakers live in two areas, viz. in North Ossetia-
Alania, which is part of the Russian Federation, and in South Ossetia, which
declared its independence in 2008 in the aftermath of the Georgian-Russian
war. According to the Russian census of 2010, 488,254 persons declare
Ossetic as a first language. The dominant dialect is Iron, which also serves as
the main or official language for the Ossetians. A minority of fewer than
100,000 speaks the relatively archaic dialect of Digoron, which is used
predominantly by the Sunni Muslim minority in North Ossetia-Alania.
In contrast, the Karachays and Balkars speak two very closely related Turkic
languages that are usually classified as “West Kıpçak”. They are settled primarily
in two Russian republics, viz. in Kabardino-Balkaria (Balkars) and in Karachay-
Cherkessia (Karachays). Both areas are situated in the Northern Caucasus
region of the Russian Federation. Although the Karachays and the Balkars share
the same standard, literary language, often simply called Karachay-Balkar, it is
mostly based on the speech of the numerically superior Karachays (218,403, vs.
112,924 Balkars, according to the Russian census of 2010). Historically, the
language had never acquired literary status, as the speakers would have resorted
to writing Arabic and / or Russian instead, until the Soviet government
commissioned the introduction of this literary Karachay-Balkar in 1935/6.
According to the same census, 212,522 of the combined total of Karachays and
Balkars declare that they use Karachay-Balkar natively. The Karachay-Balkar
settlements are divided between two contiguous political units, but unlike the
Ossetians in their home regions, the Karachays and Balkars do not constitute a
majority in their respective republics, where there are sizeable populations of
ethnic Russians and of Caucasian-speaking Kabardino-Cherkess (also known as
Circassians).
© Koninklijke Brill NV, Leiden, 2017 | doi 10.1163/9789004328693_003
Historically, the Scythians, Alans, and Sarmatians are considered the linguistic
ancestors of the modern Ossetians, although, evidently, the linguistic
documentation is rather meagre and often limited to personal names, the
occasional quote in a Classical Greek source (such as Herodotus), and grave
inscriptions.
The same may apply to the attempts to establish a direct linguistic link
between the modern-day Karachay-Balkars and the (presumably) Kıpçak-
speaking Cumans and Pechenegs, if not including the other elusive Bolghars
with their unclear Turkic affiliation. This is obviously a cause of disagreement.
Of course, there are other, mostly lesser-known Turkic languages spoken in the
Caucasus, such as Nogay (a South/Central Kıpçak or “Aralo-Kaspian” Turkic
language) and Kumyk (West Kıpçak), whose speakers might also lay claim to
these historically attested peoples and tribes.
Ossetic and Karachay-Balkar are not under imminent threat of extinction, as these
languages have an enshrined position within the political framework of their
autonomous republics. Retention of the native language among
the Ossetians and the Karachay-Balkars is high, despite the omnipresence of
Russian, which is the language of education and serves as the natural lingua
franca among the many Caucasian nationalities. In their respective republics,
of the 459,688 North Ossetians, 402,248 indicated that they had a
command of Ossetic (87.5%), whereas of the 194,324 Karachays living in the
Karachay-Cherkessian Republic 181,740 had a command of Karachay-Balkar
(93.5%) and in the case of the 108,577 Balkars in the Kabardino-Balkarian
Republic, 96,252 (88.6%), according to the census of 2010.1 Although in the
larger towns and cities Russian is heard pretty much everywhere, Ossetic
and Karachay-Balkar respectively are usually the everyday language of
communication in the countryside.
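As a quick check on the arithmetic, the retention percentages quoted above can be recomputed from the census counts given in the text. The following Python sketch is illustrative only: the dictionary labels and the function name retention_rate are assumptions introduced here, and only the figures themselves come from the census data cited in the paragraph.

```python
# Census counts as quoted in the text (Russian census of 2010):
# (members of the group in the republic, members reporting command of the language).
# The labels are illustrative names chosen for this sketch.
census = {
    "Ossetians in North Ossetia-Alania": (459_688, 402_248),
    "Karachays in Karachay-Cherkessia": (194_324, 181_740),
    "Balkars in Kabardino-Balkaria": (108_577, 96_252),
}


def retention_rate(population: int, speakers: int) -> float:
    """Return the share of speakers in the population, in percent (one decimal)."""
    return round(100 * speakers / population, 1)


rates = {group: retention_rate(pop, spk) for group, (pop, spk) in census.items()}

for group, rate in rates.items():
    print(f"{group}: {rate}%")
```

Rounded to one decimal place, the recomputed values agree with the percentages given in the text (87.5%, 93.5% and 88.6%).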
1 All figures are cited from the documents available at
http://www.gks.ru/free_doc/new_site/perepis2010/croc/perepis_itogi1612.htm.
The level of marginalisation in the former Soviet and contemporary
Russian society differs considerably between the two communities. Ossetians are
well integrated in mainstream Russian society and are therefore treated relatively
favourably, especially after the annexation of South Ossetia by Russia in 2008.
In contrast, the Karachay-Balkars are generally viewed with some suspicion,
because of their religion (Islam) and possible ties to other Turkic groups and
communities in Russia and abroad, including possible political aid and
interference from Turkey. Even their full rehabilitation and measures to
compensate for the wrongdoings in the past were not in place until Boris Yeltsin signed
an official decree on March 3, 1994, which restored their cultural rights in their
assigned Republics. Compounding their rather marginal position in Russian
society is the dearth of prominent Karachay-Balkar intellectuals, who could
speak and carve out a cultural and political space for their communities in
Russian society.
The scholarly study of Ossetic and Karachay-Balkar did not start in earnest
until the second half of the 19th century, right after the conquest of the
Caucasus by the Russians. Ossetic was studied in depth, thanks to the efforts of
the Finnish scholar Anders Sjögren and a prominent Russian scholar, Vsevolod
Miller. Subsequently, the native Ossetian scholar Vasilij Abaev would build on
their works. Especially the historic relations of Ossetic with the other Iranian
languages, several European language groups (Slavic, Celtic), and, of course,
with Turkic too, became much better known. Abaev has also published several
important synchronic descriptions of Ossetic and textual editions of Ossetic
myths and folklore.
Karachay-Balkar, on the other hand, lacks similar kinds of wide-ranging
research, such as the interpretation of the customs, the historic dimensions
of the language, possible contacts with other ethno-linguistic groups, and so
on. Actually, it was not until the turn of the 20th century that the Russian
linguist Nikolaj Karaulov recorded Karachay-Balkar in earnest (Karaulov
1908). Even an in-depth description of the dialects of Karachay-Balkar has yet
to appear. A bilingual Karachay-Balkar – Russian dictionary did not appear
until 1989 (Tenišev 1989). The older Russian-language literature on
Karachay-Balkar was meant as an aid to help the Karachay-Balkars to master Russian.
The impact of the contributions made by linguists, especially in Russia and
the Soviet Union, on the intellectual formation and appraisal of the “mother”
language by its speakers is undeniable though. Ossetic was actively researched
by Russian and, later, Soviet scholars, because it was Indo-European and
spoken by a largely Christian population, in a sea of largely non-Indo-European
languages with sizeable Muslim populations. For this reason, its speakers
received a relatively favourable treatment, in comparison to other minorities.
The situation for Karachay-Balkar could not be more different. The
Karachay-Balkars speak a Turkic tongue and are largely Muslim, and were
therefore seen as potentially hostile, overtly or latently, to the official,
atheistic Soviet system. The Karachay-Balkars were deliberately split into two
ethnic designations, despite the clear ethnic commonalities and almost identical
language. In addition, they were “housed” in separate republics, which they
also had to share with unrelated groups, viz. the Cherkessians and Kabardinians
respectively. Actually, these Cherkessians and Kabardinians speak very closely
related West Caucasian languages. The Soviet authorities actively pursued this
kind of ethnic or tribal fragmentation, creating micro-nationalities against the
wishes of the local intellectuals, as observed by Alexandre Bennigsen (1983).
Obviously, the ulterior motive was “divide and conquer” in order to prevent
potentially big challenges to Soviet rule.
Russian or Soviet research on Karachay-Balkar was rather limited, other
than within the context of its status as a minor Turkic language. According
to the School of the prominent Soviet scholar Nikolaj Marr, Karachay-Balkar
was a kind of linguistic “mongrel”, the result of “crossbreeding” (скрещения)
between Turkic and “Svan” elements, as described in a speech at a 1929 meeting
of the Soviet Academy of Sciences: “. . . The timeliness and urgency to study this
language [of the Karachay-Balkars, JC] is because of the established relations
of the Svan with the Turkic languages of the aforementioned peoples. This
makes it possible to identify the process of crossbreeding on the one hand,
but on the other hand, to establish the presence of Japhetic elements in those
languages, and, therefore, in Turkic generally, inasmuch as Balkar and Karachay
appear as the languages of the said system.”2
2 “. . . . Своевременность и срочность изучения этого языка объясняется установленными
связями сванского с тюркскими языками названных народов, что дает возможность
с одной стороны выявить процессы скрещения, с другой – установить наличие
яфетических элементов в указанных языках, а следовательно и в тюркских вообще,
поскольку балкарский и карачаевский являются языками именно этой системы.” In
Письмо Н. Я. Марра в Президиум АН СССР с обоснованием необходимости экспедиции
в Кабардино-Балкарию и перечнем предполагаемых ее участников [Letter of N. Ya. Marr
to the Presidium of the Academy of Sciences of the Soviet Union on the necessity for an
expedition to Kabardino-Balkaria and a list of potential participants], 19 February 1929
(Letter 141, reprinted by Anfert’eva and Kazanskij 2013, 214 f.).
Even worse, at the end of the Second World War many nationalities were the
target of large-scale deportations, such as the Ingush, Chechens, Karachays,
Balkars, the Buddhist Kalmyks and the Muslim Digor Ossetians, who were
accused of collaboration with the retreating Nazi German troops. One can
notice the huge gap in (pan-)Soviet publications on Karachay-Balkar: nothing
was published between 1941 (Bizni zamannı ğigiti, Nalčik) and 1960.
The relative academic marginalisation of the Karachay-Balkars can also be seen
in Soviet publications on the language contacts between the Karachay-Balkars
and the Ossetians. The significant number of Turkic loanwords in modern
Ossetic bears witness to the fact that in the ancient past there were intensive
contacts between the ancestors of modern-day Ossetians and the Turkic world.
They commenced with the rise of the Judaeo-Turkic Khazar Empire in the
6th century CE, and later the contacts became even more intense during the rule
of the Golden Horde. Moreover, Oğuzic Azerbaijani was the lingua
franca throughout the Caucasus from the founding of the Safavid Empire in
the 16th century until the Russian conquest of the region in the 19th century.
In fact, this Turkic influence on the Ossetic language is much more profound
than that from the Caucasian languages that are spoken in the region, such as
adjacent Ingush-Chechen, Kabardino-Cherkess (= Adyghe) or Georgian.
The Turkic groups that have been geographically closest to these Ossetians
are these Balkars and Karachays, who themselves have borrowed many Ossetic
forms. The modern Ossetians usually call their Balkar neighbours Asy, a name
which, historically speaking, referred to the Ossetians themselves. Both names,
Balkar and Asy, were already mentioned in the famous early Persian
geographical work “The Regions of the World” (Ḥudūd al-‘Alam), as balqar and ās
respectively. The transfer of the ethnonym ās to the Balkar points to intensive
cultural contacts, such as interethnic marriages and strategic alliances between
these two peoples. According to the eminent Turkologist Omelyan Pritsak they
may have lived together in the Northern Caucasus until the Mongol invasions
(Pritsak 1958, 341). Indeed, as asserted recently, for instance, by Džurtubaev
(2010, 4) in his introduction, the ethnogenesis of both the Ossetians and the
Karachay-Balkars is an “interrelated process” (взаимосвязанных процесса).
Mediaeval European sources further confirm the co-existence and co-mingling
of these groups, when an Ossetic group (known as the Jász) and the Kıpçak-
speaking Cumans (Kun) settled in Hungary during the 13th century. It is only
natural to wonder whether the Karachay-Balkars can be considered the last
remnant of the mediaeval Cumans. After all, those Cumans were living very
closely, if not in some sort of symbiotic relation, with the Alans, who are the
conventionally accepted (immediate) ancestors of the modern Ossetians. In
the past, the Mingrelians would apply alani to the Karachays. Even today, the
term alan is still employed by the Karachay-Balkars as a self-identification.
According to Thordarson (2009, 28 f.), “[w]e are thus justified in regarding the
Karachay-Balkars as Turkicised Alans”, also on account of the numerous place
names of Ossetic / Alanic origin attested in the Karachay-Balkar regions. This
is difficult to prove (or disprove) though, as we have no idea what the social
circumstances and interactions were between the Iranophone Ossetic speakers
and the Turkic-speaking Karachay-Balkars in the modern Karachay-Balkar regions.3
2 Abaev (1933) and Nikolaj Marr
The famous Ossetian linguist and prominent Ossetologist Vasilij I. Abaev seized
upon this historic “cohabitation” to confirm Nikolaj Marr’s Japhetic theory,
which was proclaimed in the early 1920s.4 According to this pseudo-scientific
theory, also known as the New Study of Language, languages rather reflect a
continuous merger of previous languages.5 Nikolaj Marr gave a Marxist twist
to the European mediaeval idea that, analogous to the legendary origin of the
Semitic peoples and their languages from Noah’s son Shem, most European
nations and ethnicities descended from Noah’s other son, Japheth. According
to Marr’s own interpretation, the languages spoken by Japheth’s children would
be the substrate that was later overlaid by Indo-European languages. The
different layers (of borrowing) would correspond to the different social classes
of ancient societies (in Europe). Language was considered a superstructure on
the base of society, concurrent to the creation of a (single) socialist economy.
As language mixing was therefore the logical consequence, the notion that the
languages of peoples could be traced back and therefore classified according
to a common origin (as, notably, proposed by Indo-European linguists),
was dismissed as a “false consciousness” (ложное сознание), which was
introduced by bourgeois nationalism. New languages were rather the result
of crossbreeding, while the ultimate origin of languages derives from the four
primordial sounds sung by the ancient people during their chores, viz. ber, yon,
roš, sal (Marr 1936, 130).
3 Other explanations are conceivable: a large group of mostly Iranophone Alans may have
vacated these areas, voluntarily or involuntarily (after the well-documented Mongol
invasions). Subsequently, the Turkic Karachay-Balkar population occupied or dominated the
region, while they would also impose their language on the remaining (Iranophone)
sheepherders. Neighbouring groups would still have called them by their older names, etc. It is
also conceivable that a previous, centuries-old situation of active bilingualism in these areas
had shifted in favour of a predominant monolingual Turkic environment, when an external,
Turkic language, i.e. Azeri, became the lingua franca of the Caucasus. This kind of linguistic
symbiosis and co-existence between two linguistically unrelated groups is well known
elsewhere in the world, e.g. between the Iranophone Balochis and Dravidian-speaking Brahuis
in Pakistan. The dominance of one language over another depended on the political
constellation and the linguistic preference often alternated with each generation.
4 Most of his articles on the Japhetic Theory can be found in Marr (1933).
5 This has led to a complaint by a Soviet scholar critical of Marr that “the study of the
connections between related languages is turned over as a monopoly to bourgeois linguistics”
(cited by Pollock 2006, 122).
In an article published in the journal Language and Thought, Abaev (1933)
discussed precisely the intermingling and mixing of the Karachay-Balkars and
Ossetians, as reflected in the mutual borrowings between these two groups.
Later on, he incorporated this study in his collected writings known as Ossetic
Language and Folklore (Abaev 1949).6
Apparently, shortly after the official endorsement by Marr, a youthful Abaev
undertook a field trip to Baksan and its surroundings (located in the Kabardino-
Balkarian Republic). He then compiled a comprehensive list of putative Ossetic
and Karachay-Balkar parallels as published in Abaev (1933), which reflected
his faithful adherence to Marr’s Japhetic theory, the prevailing dogma of that
period. It comes as no surprise that this publication shows severe
methodological shortcomings. Notably, the article does not give an ultimate origin of
the forms, i.e. whether derived from Proto-Iranian, Old / Proto-Turkic or from
“Caucasian”. Even if we ignore the ideological bias, Abaev’s paper also contains
numerous factual errors, which he did not correct when, subsequently, these
forms were incorporated in his famous Historical-Etymological Dictionary
of Ossetic. In this dictionary, he frequently assigned an older origin to these
parallels, which was finally permitted after Stalin had denounced the Japhetic
theory in an article.7
6 Lajpanov (1967) more or less repeated this work.
7 This was first published in the newspaper Pravda on June 20, 1950, on which see also the
editorial in the linguistic journal Voprosy Jazykoznanija 1952, 3 f.
Unfortunately, Abaev often suggested etymologies that were a priori
implausible. Also, it seems that some of the forms cited by Abaev were rather
ephemeral, such as the counting system with Digoron-sounding names. This system
was apparently used in the southern Kabardino-Balkar region of Greater
Khulam (near the Russian-Georgian border). Actually, these numerals do
not appear to be attested in other Karachay-Balkar-speaking areas (and were
therefore not incorporated in the standard(ized) Karachay-Balkar language). The
publication includes other terms too. A good example is Ossetic gæmæx ‘bare,
with bare spots’, for which a Karachay-Balkar form gǝmǝx ‘a spot covered by
scarce vegetation’ was cited by Abaev as parallel, without giving any source.
So far, I have not found any corroboration for this, only Karachay qımıja ‘bare
(footed)’ (?). Forms such as gǝmǝx are perhaps no more than ad hoc
borrowings that we may very well encounter in the vocabulary of the few (bilingual?)
Balkar speakers who happened to have been in intensive contact with local
Digoron speakers (by marriage, trade or otherwise).
Although Abaev introduced the region as a kind of melting pot of customs,
traditions and languages of the local peoples, he did not explain the exact social
or sociolinguistic circumstances (such as code-switching, active bilingualism,
and other aspects of interlinguistic and multilingual communications)
of this region. It remains, for instance, unclear how competent those
informants were in either Digoron Ossetic or Balkar, and how the linguistic skills were
acquired, through marriage, upbringing, trade or otherwise. There were
arguably no religious objections against intermarriages between Sunni Digoron
and Balkar speakers. Abaev asserted that the Ossetic elements in Karachay-
Balkar were not recent but the result of “the legacy of ancient Alanic-Turkic
mingling, which took place in the areas of all the gorges, from the Terek to the
Upper Kuban river”8 (Abaev 1949, 18). However, many of the claimed Ossetic
loanwords in Karachay-Balkar are not attested elsewhere, which would rather
suggest recent or ad hoc borrowing. This could be an indication of (recent)
bilingualism.
Although Abaev did distinguish elements that Karachay-Balkar had borrowed
from Ossetic from those that Ossetic had borrowed from Karachay-Balkar, the
criteria for the distinction were rather erratic. He did, though, invoke certain
semantic and morphological criteria to decide from which direction a term was
borrowed, e.g. Ossetic bælas ‘tree’. Its generic sense would have been passed on
to Karachay-Balkar balas ‘a wooden hay-dragger’, which is rather specialized
(but, theoretically, both bælas and balas could have been independently
borrowed from a third source). Elements that Abaev could not identify
as borrowed from Ossetic to Karachay-Balkar, or vice-versa, were considered to
have originated in the postulated Japhetic substrate.
At first sight, this kind of categorization was in the spirit of Marr, but
Marr’s theory muddles the distinction between original and borrowed forms.
Therefore, Abaev relied implicitly on an etymology postulated by previous
(non-Marrian) linguists. Without the traditional historical-comparative
framework, any etymological attempts in his article were rather ad hoc.
8 “наследие старого алано-тюркского смешения, происходившего на территории всех
ущелий, от Терека до верхней Кубани.”
Assigning an (ultimate) origin for the forms was, of course, of secondary
importance to Abaev (1933), as these “parallels” were rather classified according
to semantic categories:
A. terms from the inanimate nature,
B. terms from the animate nature,
C. designations of cultivated plants,
D. designations of domesticated animals,
E. terms from the material culture,
F. anatomical and medical terms,
G. social and ethnic terms,
H. designations of physical and mental properties,
I. varia,
J. counting system (as an indication of economic interactions),
K. religion, mythology and folklore,
L. toponyms.
For this conference, I would like to present a few of my own observations and a
personal assessment of Abaev’s treatment of the Ossetic and Karachay-Balkar
“parallels”. I will limit myself to categories A and B. Abaev’s work is a very
instructive example of Soviet linguistics of the interbellum.
As for the assessment of the Karachay-Balkar material,9 I have relied
principally on the dictionary that Tenišev published in 1989, in order to confirm
whether Karachay-Balkar has genuinely assimilated the loanwords
mentioned by Abaev into its vocabulary. Another valuable publication is that
by Gustav Schmidt, who more explicitly considered the Ossetic borrowings
into Karachay (Schmidt 1931). Again, regrettably, he did not always identify
the ultimate origin of the Ossetic elements: whether they were inherited from
Old Iranian or were merely local, Caucasian Wanderwörter remained unanswered.
Recently, Ewa Siemieniec-Gołas has carried out a very valuable lexical study
on the Turkic “Erbwortschatz” of Karachay-Balkar (Siemieniec-Gołas 2000).
I have considered this work as well for the present talk.
9 The transcription of the Karachay-Balkar forms follows the modern Romanized
Turkish alphabet. However, x is used here to denote the voiceless velar fricative, whereas
ğ is its voiced counterpart. As is the case in most Turkic languages, the Karachay-Balkar
velars k, g have both back and front realizations (the allophones [q]/[k] and [ɢ]/[g]/
[ɣ]), depending on the vocalic environment. Quite often, the complementary distribution of
these realizations does not apply to (especially, the most recent) borrowings, e.g. from Arabic
or Russian.
A. In the category of inanimate terms, Abaev cites several “parallels”. Of the
16 forms, 4 forms are of Ossetic / Iranian origin, 3 forms of Karachay-Balkar /
Turkic origin, 3 were wrongly analyzed (or simply unclear), perhaps 4 from a
third source and 1 was perhaps ephemeral:
i. The following forms have a clear Iranian origin:
– Oss.10 cægat ‘northern side of the mountain’ (PIr. *čakāta-, Middle Persian
cagād ‘peak, summit’, Sogdian ck”t ‘peak, forehead’) ~ KB çeget (Balk.)
‘north(ern) direction’ (borrowing from Balk. çeget would have yielded Oss.
†cægæt).
– Oss. awwon ‘darkness, cover’ (*āwa-wahāna- ‘covering into/down’, cf. Persian
bahāneh ‘pretext, cover’) ~ KB awana ‘contour, silhouette, outline’.
– KB dorbun (Kar.) ‘(bear) cave’, (Balk.) ‘cave, cavern’ (lit. ‘stone-bottom’, from
PIr. *darwa-11 ‘solid, hard (as wood)’, cf. Khotanese dūra- ‘hard’, and *buna-
‘bottom’, cf. Pers. bon). The corresponding Oss. formation *dorbun is not
attested, though; only its elements dor ‘stone’ and bun ‘bottom, floor’ are
clearly Ossetic in origin.
– Oss. swadon / sawædonæ ‘well’ (< pl./f. *syāwā ‘black’ + *dānā ‘river, waters’)
~ KB (Balk.) şawdan ‘springs, well’. Etymologically speaking, Ossetic sawædonæ
literally means ‘black water(s)’, which could be a calque on an earlier
Karachay-Balkar *kara sū for ‘well, or spring’?
ii. On the other hand, Ossetic must have borrowed quite substantially from
Karachay-Balkar as well. The difficulty is that quite often the Karachay-
Balkar forms are almost indistinguishable from their Turkic correspondences.
The following forms may derive from Karachay-Balkar due to its typical
phonological features:
10 Additional abbreviations include: KB = Karachay-Balkar, Balk. = Balkar, Kar. = Karachay;
Oss. = Ossetic, Dig. = Digoron dialect of Ossetic; PIr. = Proto-Iranian, OT = Old Turkic, PT =
Proto-Turkic. Close Iron and Digoron counterparts cited as examples are separated by the
sign “/”, with the Iron form on the left and the Digoron form on the right of the slash sign.
11 The reconstruction *dárwa- would explain the vocalism in Ossetic (cf. Cheung 2002,
128 f.), and in Khotanese dūra-, cf. Emmerick (1989, 211). This thematized adjectival formation
is a derivative of *dāru (gen. stem dru-) ‘wood’, Persian dār, etc. I have taken the
cue for this connection from Maciuszak (2007, 205 f.). The additional cognate forms (Old
Persian duruva- ‘secure, firm’, Avestan druua-, Sanskrit dhruvá- ‘healthy’) cited by her
are unconnected, though, as (also) shown by its morphological derivational process (in
Sanskrit) and the semantic discrepancies.
– KB töppe ‘top, crown (of the head); peak; tuft’ (< PT *töppe, cf. OT töpi,
Kumyk töbe, Turkish tepe) ~ Oss. (Dig.) c’opp ‘pluck, wool’, (Iron) c’upp ‘summit,
peak’ (c- < *ti).
– KB kaya ‘rock, boulder’ (< PT *kaya, Turkish kaya etc.) ~ Oss. k’æj/ k’æjæ
‘slate’. Evidently, the Ossetic forms may also derive from another Turkic lan-
guage. The Svanetic form k’a ‘slate’ however is rather a direct loanword from
Iron Ossetic k’æj.
– KB ırxı ‘stream, creek’ (< OT arık ‘irrigation canal’, cf. Chagatay arığ, Turkish
ark, etc.) ~ Oss. (Dig.) ærxæ ‘gorge, dry riverbed’.
– KB tılpıw ‘vapour; air’ ~ Oss. (Dig.) tulfæ ‘vapour, steam’, see further below.
iii. The following forms may be wrongly analyzed or unrelated:

– Oss. k’oyldym / k’uldun ‘(mountain) slope, hill’, unrelated to KB küllüm ‘sun-kissed
spot’ (deriv. of kün ‘sun’).
– Oss. (Iron) ran, (Digoron) rawæn ‘place’. The cited KB ran is only attested in
the expression kaya-ran ‘rock ledge, a certain spot on the rock, rock terrace’;
in fact, it just reflects a compound with a-elision, kaya aran (aran ‘valley, lowland’)
> kaya ’ran. KB ran is therefore an accidental form, being unrelated to
Oss. ran.
– Oss. k’æxæn ‘slope; cliff’. Abaev (1958-1995, 1: 631) no longer incorporated the
cited Karachay-Balkar parallel təxən (sic!) ‘flat area on the rock’.
iv. The following forms may stem from a third source, perhaps independently:
– Oss. (æ)zme(n)sæ (Dig.) ‘sand’ (Iron yzmis) ~ KB (Balk.) üzmez ‘id.’. The Balkar
form does not conform to Turkic morphology, hence it might be a borrow-
ing from Ossetic, although it has no further correspondences in Iranian or
in the neighbouring Caucasian languages. An Iranian preform *uz-maišā-
‘mixture, being mixed up’ (*maiz- ‘to mix, mingle’) has often been suggested
(cf. Abaev 1958-1995, 4: 282), but this reconstruction is fraught with prob-
lems, both semantically and morphologically.
– Oss. xuræ ‘gravel’ (Iron xoyr) ~ ? KB (Balk.) xuru ‘stony place, cobblestones’
(no further documentation).
– Oss. cuxcur ‘flowing water’ ~ KB çuçxur ‘waterfall’ ← South Caucasian /
Kartvelian ?, cf. *me-rčx-e ‘shallow (of water)’, *rečx-/rčx- ‘to purl, babble,
murmur’ (Klimov 1998, 119, 157).
– typpyr / tuppur ‘bloated, fat’; [Dig.] ‘hill’ ~ duppur ‘hill’, with similar forms in
Darginian dupur ‘mountain’, Persian topoli ‘fat’, derived from a Turkic formation
with *töppe?
v. A very recent, ephemeral borrowing is:

– Oss. gæmæx ‘bare, with bare spots’ ~ KB (Balk.) gǝmǝx ‘a spot covered by
scarce vegetation’, see above.
B. The 32 terms from the animate natural field are largely neither from Iranian
nor Turkic. The botanical terms are usually indigenous (Caucasian). Of the
parallels, 7 are Ossetic forms borrowed into Karachay-Balkar and 5 are
Karachay-Balkar forms borrowed into Ossetic, whereas another 14 most likely
stem from a third source (independently). Finally, 6 borrowed forms may be
just ephemeral (4) or misinterpreted (2).
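The breakdown of the 32 animate terms can be checked with a quick tally. The following is only a minimal Python sketch: the dictionary keys are hypothetical labels introduced here for illustration, and only the counts themselves come from the text above.

```python
# Counts as stated in the text for category B (animate terms).
# The key names are illustrative labels, not terminology from the source.
category_b = {
    "ossetic_to_karachay_balkar": 7,
    "karachay_balkar_to_ossetic": 5,
    "third_source": 14,
    "ephemeral": 4,
    "misinterpreted": 2,
}

# The five groups should add up to the 32 terms mentioned in the text.
total = sum(category_b.values())
print(total)  # 32
```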
i. Ossetic forms borrowed into Karachay-Balkar are:

– Oss. bærz / bærzæ ‘birch’ ~ KB mırzı ‘id.’ (< PIr. *barzā-, Skt. bhurjá- m. ‘Betula
utilis’).
– Oss. kærdæg ‘grass’ (< PIr. *karta-ka- ‘cut’) ~ KB kırdık ‘id., greens’ (form con-
taminated with kırdış?).
– Oss. xans (Digoron) ‘long, thick grass; tall weeds’ (< PIr. *kāsa- ‘tall grass’, cf.
Pers. kāh ‘straw’; ºns from fans ‘wool’) ~ KB xans ‘grass’.
– Oss. fadawon (Dig.) ‘soft, dry grass (for decking)’ (lit. ‘foot-covering’, with fad
‘foot’ < PIr. *pāda-) ~ KB (Balk.) fadawan ‘straw often used as padding in
mountain shoes’.
– Oss. mælʒyg / mulʒug ‘ant’ (< Iranian *marwi- + *-čī-ka-) ~ KB (Balk.) gumul-
cuk ‘id.’ (with gu˚ from gubu). The Balkar form is evidently a borrowing from
Dig. mulʒug. This Digoron formation shows an additional u-umlaut in com-
parison to the Iron correspondence.
– Oss. synʒ / sinʒæ ‘thorn; blackthorn; splinter’ ~ KB (Balk.) şinji ‘spine, (plant)
needle’, see below.
– Oss. tæk’uzgæ (Dig.) ‘rowan (berry)’ ~ taqüzük (tüqüzgü, Abaev 1958-1995,
3: 255) ‘id.’. The lack of vowel harmony and the velar q in front of ü of the
Balkar form all point to borrowing from Dig. tæk’uzgæ, but the ultimate ori-
gin is unknown.
ii. Several Karachay-Balkar forms from the animate realm have entered
Ossetic. We may cite the following forms, which in turn may be borrowings
from another language:
– KB bittir ‘bat’ ~ Oss. (Dig.) bittir (Iron xælyn-byttyr) ‘id.’, see below.
– KB gabu ‘dandruff’, (Karachay) gıbı, (Balkar) gubu ‘spider’ ~ Oss. gæby, gyby /
gæbu ‘mite’. The Ossetic forms appear to be borrowings from Karachay-Balkar
gabu, etc., which again may be an adaptation of a Kartvelian
formation, notably from a Georgian dialect form, cf. Gurian ǯɣiba- ‘tick’
(Klimov 1998, 100).
– KB gılıw ‘foal; rat’ ~ Oss. (Dig.) gælæw ‘rat’. According to Abaev, gælæw is
an “infantile deformation” of k’ælæw ‘foal’, which would be comparable to
Kabardian qolow ‘piglet’, Georgian qoqo, Megrelian ɣoɣo ‘calf of buffalo’.
Rather than considering “infantile deformation”, gælæw may simply be a
loanword from Karachay-Balkar, as gılıw has retained the two meanings
‘foal; rat’. Of course, Karachay-Balkar gılıw may well be Caucasian in origin.
– KB mıga ‘quail’ ~ Oss. mæga ‘snipe’. The Balkar form probably stems directly
from Kartvelic, notably Georgian mc̣q̣er- ‘quail’. Balkar would have simplified
the consonant cluster of the original Kartvelian formation and then
passed the term on to Ossetic: Oss. mæga clearly shows a semantic
shift.
– Oss. (Dig.) pursa (Iron pysyra) ‘nettle, Urtica urens’ ~ KB mursa ‘id.’, see below.
iii. The following forms feature borrowings from a third source. They consist
mostly of terms from the local flora, which are often Caucasian:
– Oss. æxsæli, æxsælæ, (Iron æxsæly) ‘juniper’ ~ KB (Balk.) şkeyli, şkildi
‘id.’ ← South Caucasian?, cf. Georgian ašk’ili ‘wild rose’, Mingrelian
šker- ‘rhododendron’.
– Oss. sk’eldu ‘cowberry’ ~ KB şkildi ‘juniper’ (‘можжевельник’, Tenišev 1989,
751), kızıl şkildi, (Kar.) kızıl işkildi ‘cowberry, Vaccinium vitis-idaea’ (also dial.
ışkıldı ?) ← a variant of the Caucasian ‘juniper’ forms (-di: unanalyzable suffix
in both Ossetic and Karachay-Balkar). The Karachay-Balkar forms seem
to be borrowings from an unattested Iron correspondence *(y)sk’ildy, cf.
Dig. sk’eldu.
– Oss. cym/cumæ ‘dogwood, Cornus’ ~ KB çum ‘id.’, cf. Lezgian čumal,
Tabassaran čemel ‘id.’ (similar forms: Turkish çim ‘grass’).
– Oss. ʒedyr, ʒeʒyr, ʒedyræg / ʒæduræ ‘blackberry’ ~ KB züdür ‘id.’ ← a Caucasian
language ?, perhaps to be analyzed as *zǝ ‘red; blackberry’ (cf. Adyghe zǝ
‘red’ or Abkhaz Bzyp a-z ‘blackberry shrub, bush’, Chirikba 1996, 87), and
*dur ‘fruit’ ? (cf. Lezgian dur ‘dried fruit’, Ossetic dyrǧ ‘fruit’, loanword).
Alternatively, it may be a borrowing from Finno-Ugric, according to Tenišev
(1989, 807), who apparently follows Abaev (1958-1995, 1: 396).
– Oss. ʒæbidyr / ʒæbodur, ʒæbedur ‘mountain goat, Capra caucasica’ ~ cuǧutur
‘id.’ ← a preform *ǯəɣʷətur, undoubtedly of Caucasian origin, probably West-
Caucasian, cf. Adyghe šəquɫtər ‘id.’ (Apažev – Kokov 2008, 576).
– Oss. mæntæg / mæntæg, mont ‘burdock’ ~ KB mant ‘id.’ ← Wanderwort ?, cf.
Svan mant ‘id.’, Greek mínthē ‘mint’.
– Oss. næzy / næzi ‘pine, Pinus sylvestris’ ~ KB (Balk.) nazı, (Karachay) nızı
‘fir’ ← Kartvelic *naʒw ‘spruce, fir(-tree)’, cf. Georgian naʒv (but also as a
regional Wanderwort in other Middle Eastern languages, cf. Persian nāz,
nāžu, nājū ?).
– Oss. murtgæ, murk’æ ‘Viburnum’ ~ KB (Karachay) murtxu, from Kartvelic, cf.
Georgian marc̣qv- ‘strawberry’.
– Oss. mæra / mura, pura ‘hollow’ ~ KB pura ‘hollow, rotten (tree)’ ← ?, cf.
Chechen, Ingush mur ‘hollow tree’.
– Oss. ninæǧ ‘raspberry, Rubus idaeus’ (Iron mænærǧ) ~ KB nanık ‘id.’ ← ?
– Oss. tægær ‘maple’ ~ tıgır, (Balk.) tıkır ← Caucasian, cf. Svan tek’er, tek’ra
‘maple’.
– Oss. turtu, (Iron) tyrty ‘barberry, Berberis vulgaris’ ~ KB türtü ← Wanderwort ?,
cf. Lezgian turt ‘id.’, similar forms such as Persian tūt.
– Oss. ug ‘owl’ (Iron wyg) ~ KB uku. No doubt, these forms are onomatopoetic
in origin, cf. Megrelian, Laz ɣu, Svan ɣu, etc. Ossetic and Karachay-Balkar
have probably borrowed the forms independently from each other, perhaps
from another Caucasian language (if we are not dealing with “spontaneous”
expressive forms).
– Oss. gælæbo, gæbælo, (Iron) gælæbu ‘butterfly’ ~ KB (Karachay) göbelek
‘id.’ Abaev also cites the Balkar forms gebelo, gelbo (← Digoron?), probably,
ultimately, of Turkic origin (cf. Turkish kelebek). Almost all Turkic corre-
spondences of kelebek have retained a final velar (with the exception of geo-
graphically distant Uyghur kepilɛ). In addition, the voiced velar g- needs an
explanation.
– Oss. mæga ‘snipe’ ~ KB (Balk.) mıga ‘quail’ ← a Kartvelic preform *mc̣q̣a-,
cf. Georgian mc̣q̣er- ‘quail’. Both Ossetic and Karachay-Balkar would have
simplified a difficult-to-pronounce consonant cluster mc̣q̣°. On final *–r in
Proto-Kartvelic, cf. Klimov (1998, 317f.).
iv. Ephemeral are probably:

– Oss. kældæ ‘dry wood, deadwood’ (Iron kældyn) ~ KB kıldı (not found in
Tenišev and other publications): kıldı is rather an ad hoc borrowing from
Digoron ? The Digoron form seems to be a lexicalized past participle of the
verb kælun (Iron kælyn) ‘to spill, fall down’, which is of PIr. origin.12
12 Abaev also entertains the possibility of a connection with several European designations
for ‘wood; log’, e.g. Greek kládos ‘branch’, Slavic *kòlda ‘block, log’ (Russian kolóda),
Germanic (Old Icelandic) holt, German Holz ‘wood’. This may be coincidental rather
than an instance of “Scytho-European” borrowing.
– Oss. ʒumarǧ, zumarǧ, (shortened) zum (Iron zym) ‘Caucasian snowcock,
Tetraogallus caucasicus’ (lit. ‘winter-bird’ < (thematized) Ir. *zyama- ‘winter’
+ *mrga- ‘bird’) ~ KB cumarık ‘id.’ (not in other publications). Karachay-Balkar
cumarık as a borrowing from Ossetic is not attested elsewhere.
– Oss. kændys / kændus a slightly toxic plant ~ KB kündeş ‘id.’
– Oss. qoppæg / qoppæǧ, qobæǧ ‘an edible lily’ ~ KB xömpek, xoppug ‘id.’
v. The following cited forms are entirely unclear, partly because of their unclear
meaning (misinterpretation, misheard?):
– Oss. bynʒ / binʒæ ‘fly’ ~ KB didin ‘wasp’ (not confirmed elsewhere, also not
included in Abaev 1958, 280).
– Oss. ʒægæræg ‘not fully bloomed flower’ ~ KB cıgıra, zıǧıra a kind of edible
plant.
3 Some Observations
We can notice several highly interesting forms that the linguistic ancestors
of Karachay-Balkar and Ossetic must have borrowed in the period prior to
their arrival in the (northern) Caucasus, i.e. before the Mongol invasions in
the 13th century. Abaev was the first scholar to label these ancient borrowings
as “Scytho-European” isoglosses, which, in practice, meant that the ancestors
of the Ossetians would have borrowed mainly from Germanic and Slavic
(also Celtic and Latin), on which see Abaev (1965). However, similar ancient
borrowings from Hungarian were not included in this label, simply because
Hungarian is not part of the Indo-European language family.
A typical example of such a “Scytho-European” isogloss as defined by Abaev is
the following “Ossetic ~ Karachay-Balkar parallel”:
– Oss. synʒ / sinʒæ ‘thorn; blackthorn; splinter’ ~ KB (Balk.) şinji ‘spine, (plant)
needle’. In this case, sinʒæ may reflect older *spina-13 + dimin. suff. *čī. The
preform *spina- would be a loanword, most conceivably from East Slavic,
cf. Russian spiná ‘spine’, Old Polish spina ‘id.’ (inherited forms or loanwords
ultimately from Latin ?). The “spine” form appears to be a widespread
European cultural term, attested in Latin spina ‘thorn’, Baltic (Latvian) spina
‘rod’, Germanic (e.g. Old High German spinela ‘hairpin’), English spine, etc.
13 Initial *sp- > *sf- > *s’s’ (palatalization) > modern Oss. s-, cf. sistæ, Iron syst ‘louse’ < PIr.
*spiš + *čī (e.g. Avestan nom. sg. spiš, Persian šepeš ‘id.’).
However, there is an implicit bias in the treatment of these ancient borrowings, as
Abaev considered mostly (pre-)Ossetic as the first receiver of those so-called
“Scytho-European isoglosses”, effectively disregarding the possibility that the ancestors
of the Karachay-Balkars could have contributed to these “Scytho-European
isoglosses” as well. After all, we may consider the Karachay-Balkars as a modern
remnant of the powerful Cumans and Pechenegs, who used to occupy a good
chunk of the Eurasian steppes. Thanks to their expansion, the Cumans and the
Pechenegs certainly came into contact with South Slavic speaking groups, and
for a prolonged period. These Southern Slavs may have only recently acquired
a literary language (which is now known as Old Bulgarian, or alternatively, Old
Church Slavonic).
The second politically significant ethnic group the Cumans and the
Pechenegs would have met were the Hungarians, who had just completed their
conquest of Carpathia in the 9th-10th century CE. The Hungarian arrival in the
Balkans came in the aftermath of the attacks by these Cumans around 895.
A written testimony to these contacts is the so-called Codex Cumanicus
compiled in Hungary in the 12th-13th century. This Codex served as a textbook
of Cumanic.
We can cite several borrowed forms for which Abaev claims Ossetic as the
initial adopter, but actually, they most likely have entered an earlier stage of
Karachay-Balkar first, before their adoption into Ossetic:
– KB bittir ‘bat’ ~ Oss. (Dig.) bittir (Iron xælyn-byttyr) ‘id.’ ← South Slavic, espe-
cially Church Slavic nepŭtyrǐ ‘bat’ (which shows metathesis of t . . . p > p . . . t,
cf. Russian netopyr’). The Ossetic and Karachay-Balkar forms appear to be
an ancient borrowing from (South) Slavic. The question of course is which
language has borrowed first. The apparent loss of ne˚ may give us a clue.
Karachay-Balkar (and other Turkic languages) does not have native nouns
with initial ne˚, only derivatives of the pronoun ne ‘what’ are attested, cf.
Siemieniec-Gołas (2000, 158 f.). A Turkic speaker would have most likely
re-analyzed such a foreign formation, South Slavic nepŭtyrǐ, as an expres-
sion with the interrogative pronoun ne. In contrast, there would be no
apparent reason to resort to such a re-interpretation in Ossetic. There are
several inherited formations with initial (Proto-Ossetic) *ne˚, e.g. *nez
(= Digoron nez, Iron niz) ‘disease’, *new- ‘to cry’ (= Dig. new-, Iron niw-), *ne-
negative prefix. Initially, an early predecessor of Karachay-Balkar would
thus have borrowed the South Slavic form, after which it was passed on
to Ossetic.
– KB mursa ~ Oss. (Dig.) pursa (Iron pysyra) ‘nettle, Urtica urens’. According
to Abaev (1949) the Ossetic form has been borrowed into Karachay-Balkar,
with the initial labial stop becoming the corresponding nasal m-. This
however cannot be correct, as only older voiced b-14 may become m- in
Karachay-Balkar, e.g. the indigenous name for the Balkar is Malkar. It is
more likely that an earlier Karachay-Balkar form *bursa is the source of the
Ossetic form, which does not look very ancient anyway (with atypical p- and
final -a, rather than f- and -æ respectively). Therefore, those Cumans with
their extensive contacts may have passed on this *bursa to Karachay-Balkar.
If so, Karachay-Balkar mursa < *bursa may have been an old borrowing from
Hungarian, viz. borsó ‘pea’ (Old Hungarian burso 1254, a place name), with
final -ó < *-Vk(V) and de-affricatisation of *č > s.15 The Hungarian form
itself reflects a Turkic loanword *burčak (Benkő 1992-1997, 1: 129), which is
the term for a legume, pulse(-like) plant, notably pea, vetch (and also ‘hail-
stone’), cf. Turkish burçak ‘vetch’, Karachay-Balkar burçak ‘hail’, (Balkar) ‘pea’
(Siemieniec-Gołas 2000, 70 f.; Clauson 1972, 357; Sevortjan 1978, 275 f.). The
semantic shift from ‘a legume’ to ‘nettle’ in Karachay-Balkar mursa needs an
explanation16 though.
14 Admittedly, the fate of the initial labial stops in Turkic is rather complicated. According to
Pritsak (1958, 352), b- becoming m- is a typical Kıpçak development (“echt kiptschakisch!”),
e.g. maka ‘frog’ (< PT *bāka, cf. Kumyk baka), (Karachay) miyik ‘big’ (but Balkar biyik, cf.
Tatar, Nogay biyik < PT *bädük, cf. Turkish büyük). It is difficult to postulate a watertight
phonetic rule though, especially since there are relatively few cases within Karachay-
Balkar (and borrowings from other Turkic languages may have distorted a possible pho-
netic distribution).
15 Besides, Chuvash părça ‘pea’ shows loss of the final velar. Assuming that Karachay-Balkar
mursa is a loanword from Chuvash is fraught with phonological and historical inconsis-
tencies. The older suggestion that pre-historic Ossetic would have borrowed somehow
from Chuvash is equally problematic. According to Gombocz (1912, 52), the Finno-Ugric
forms, Mari pursa, pırsa (both modern Chuvash and Mari lack indigenous voiced stops)
and Hungarian borsó would all have been borrowed from Old Chuvash *burčaɣ, but this
reconstructed Old Chuvash *burčaɣ is simply too close to all the other Turkic forms to
corroborate this statement, at least, with regard to Hungarian borsó.
16 Perhaps, the preform *bursa is a blend formation of two similar Hungarian forms: borsó
‘pea, vetch’ and bors ‘pepper’ (bors ← Turkic burç ← ultimately Sanskrit marica, Clauson
1972, 771 f.; Sevortjan 1978, 274 f.).
As another example of such a labial correspondence / adaptation, we may cite:
– KB Abıstol, Amıstol ~ Oss. (Dig.) Amistol ‘summer month (June-July)’ ← ultimately
Greek apóstolos. Abaev (1949, 283) suggested that the source of
the Karachay-Balkar form is Ossetic Digoron Amistol. The Digoron form is
difficult to explain, notably -m- and the vocalism -i-: if it were a direct borrowing
from Greek, or more likely via a Slavic intermediary apostolŭ, the
expected Ossetic (Digoron) form would have been †ap(’)ostol (and Iron
†ap(’)ustul). Rather, the Balkar form may be the source of the Digoron form:
a voiced stop *b is normally not found natively17 in intervocalic position in Ossetic,
and would therefore have been adapted as -m-. The back vowel ı
is rendered as -i- in Digoron. The Balkar form, on the other hand, shows a regular
phonetic adaptation of the Slavic outcome apostol of the Greek form.
Balkar indigenous vocabulary does not contain an intervocalic voiceless
labial stop, hence Slavic / Greek -p- → Balkar -b-, cf. Proto-Turkic *tāpan
(or *tāban?) ‘heel’ > Balkar taban and Proto-Turkic *tōpık ‘knee’ > Balkar tobuk.
In addition, Balkar also shows a regular alternation b ~ m, unlike Ossetic,
which does not have an intervocalic -b- in its indigenous phonemic inventory,
as all Old Iranian intervocalic *-p-, *-b- have become -v- (except after
*u). Finally, the extraneous vowel sequence a . . . o (of apostolŭ) would naturally
be adapted as a . . . ı in Balkar.
Finally, from the inanimate sphere (cat. A., see above), we can cite another
likely instance of ancient Cumanic borrowing:
– KB tılpıw ‘vapour; air’ ~ Oss. tulfæ ‘vapour, steam’. Abaev (1958-1995, 3:
316 f.) cautiously cited a rather far-fetched connection with Sanskrit turīpa-
‘semen (fluid)’. Both forms, Ossetic tulfæ and Karachay-Balkar tılpıw, have
probably been borrowed, perhaps rather from (South) Slavic *toplŭ, Old
Church Slavic toplъ ‘warm’ (Derksen 2007, 490), with regular metathesis of
*pl > lp. Ossetic does not have a native labial stop p in its phonemic inventory;
all forms with p either point to a foreign origin or are the result of a simplifying
gemination of a consonant segment (e.g. nk > Iron pp).
Considering the more faithful phonetic adaptation of Slavic -ŭ/-ъ in
Karachay(-Balkar) as -ıw, Ossetic tulfæ seems to have been borrowed from
Cumanic. The Karachay(-Balkar) form tılpıw appears to show umlaut, a feature
that can already be noticed in the writing of the Codex Cumanicus, and in
modern Karachay-Balkar also in certain lexicalized phrases, e.g. bu-kün ‘today’ >
bügün. However, the exact circumstances of this kind of umlaut are unclear.
If the direction of the borrowing were the other way round, Ossetic initial -u-
would have been consistently adapted as -u-/-ü- in Karachay-Balkar.
17 On the development of intervocalic *p > Ossetic v, (after *u) b, cf. Cheung (2002, 18f.).
4 Summary and Conclusions
The one-sided concentration on research of the (putative) linguistic ancestors
of the Ossetians in the past hundred years by Russian and Soviet scholars
has led to the marginalisation and even downplaying of the Turkic linguistic
component of this centuries-old relationship. Ossetic was intensively studied and
many aspects of its history, speakers, literature and dialectology became better
known, resulting in an appreciation and pride among its modern speakers. This,
however, cannot be said of the speakers of Karachay-Balkar. Marked by academic
neglect (and deportation of its speakers during the dark days of Stalinism), the
Karachay-Balkar language was also considered to be somewhat of a linguistic
crossbreed, as fostered by the Japhetic Theory developed by Nikolaj Marr. This
has largely resulted in an approach in which many non-Turkic borrowings
found in Karachay-Balkar were considered to be taken directly from Ossetic or
from a common Japhetic / Caucasian substrate language.
A long exposé published by the Ossetian scholar Vasilij Abaev (Abaev 1933)
illustrates this situation on the relation between the linguistic ancestors of
modern Iranophone Ossetic and Turcophone Karachay-Balkar speakers.
He interpreted the Ossetic – Karachay-Balkar parallels found in the local
Balkar dialect as the outcome of ancient “Alanic-Turkic” mingling, on top of
a Japhetic/Caucasian substrate. Nevertheless, the bias is not only due to the
adoption of Marr’s Japhetic Theory. It also reflects a personal bias, as he ascribed
the great majority of these cases to an earlier Ossetic provenance, giving little
thought to the possibility that Karachay-Balkar could also have passed on
many borrowings to Ossetic.
We can summarize our conclusions drawn from the assessment of the Ossetic–
Karachay-Balkar parallels discussed by Abaev (1933), as follows (based on two
semantic categories):
– there is no clear tendency in the direction of the borrowings: both Ossetic
and Karachay-Balkar have contributed in almost equal measure to each other’s
vocabulary.
– as can be expected, a large group of these “parallels” consists of borrowings
from local (Caucasian) languages, and it is often unclear whether they
entered Ossetic or Karachay-Balkar first.
– in addition, the linguistic ancestors of the modern Karachay-Balkars, the Cumans,
may have also borrowed forms from European languages, which would
subsequently have entered Ossetic: (Greek apóstolos →) South Slavic *apostolŭ
‘apostle’ → Cumanic *abïstol (> Karachay-Balkar Abıstol) → Ossetic Amistol
‘Apostle(’s Month)’ (→ dial. Balkar Amıstol!).
The main criteria that have allowed us to distinguish the direction of borrowing
between Ossetic and Karachay-Balkar are:
– phonological criteria: e.g. the presence of vowel harmony in Karachay-Balkar
forms and its (general) absence in Ossetic, the phonological restrictions and
adaptations typical for Karachay-Balkar and Ossetic respectively. In Karachay-Balkar,
we may notice, for instance, the lack of forms with initial ne-. Ossetic,
on the other hand, does not possess (indigenous) p and intervocalic -b-, while
it shows the frequent substitution of initial stops (especially from Karachay-Balkar
and other non-Caucasian languages) with their corresponding ejective
consonants. Notable examples are: (Slavic) nepŭtyrǐ → Karachay-Balkar
ne bittir → Ossetic byttyr / bittir; Karachay-Balkar töppe, kaya → Ossetic c’upp /
c’opp (c < *tj), k’æj / k’æjæ; Ossetic tæk’uzgæ → Karachay-Balkar taqüzük.
– semantic shifts: the language that has preserved the meaning of the bor-
rowed form from a donor language most closely, may also have adopted
the form first. Examples include: Karachay-Balkar gılıw ‘foal; rat’ → Ossetic
gælæw ‘rat’; (Kartvelic) *mc̣q̣a- ‘quail’ → Karachay-Balkar mıga ‘quail’ →
Ossetic mæga ‘snipe’.
– historical-comparative evidence: forms directly inherited within their respective
language family, i.e. (Indo-)Iranian or Turkic, as shown
by historical-comparative methods, may decisively point to the direction
of the borrowing: Karachay-Balkar töppe (< Proto-Turkic *töppe, cf. Turkish
tepe, Kumyk töbe, etc.) → Ossetic c’upp / c’opp; Ossetic kærdæg ‘grass’ (< PIr.
*karta-ka- ‘cut’) → Karachay-Balkar kırdık ‘id.’.
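Two of the phonological criteria listed above, vowel harmony and the absence of initial ne- in Karachay-Balkar, lend themselves to a rough mechanical check. The following Python sketch is offered only as an illustration under stated assumptions: the function names are hypothetical, the vowel classes are the standard front/back sets of the Turkish-based transcription used in this chapter, and letters outside those sets are simply ignored. It flags candidate loanwords and does not by itself establish the direction of borrowing.

```python
import unicodedata

# Assumption: front/back vowel classes of the Turkish-based transcription.
BACK_VOWELS = set("aıou")
FRONT_VOWELS = set("eiöü")


def has_backness_harmony(form: str) -> bool:
    """Return True if all recognized vowels are back, or all are front."""
    text = unicodedata.normalize("NFC", form).lower()
    vowels = [ch for ch in text if ch in BACK_VOWELS or ch in FRONT_VOWELS]
    all_back = all(v in BACK_VOWELS for v in vowels)
    all_front = all(v in FRONT_VOWELS for v in vowels)
    return all_back or all_front


def loan_flags(form: str) -> list[str]:
    """Collect features that are atypical for native Karachay-Balkar words."""
    flags = []
    if not has_backness_harmony(form):
        flags.append("no backness harmony")
    if unicodedata.normalize("NFC", form).lower().startswith("ne"):
        flags.append("initial ne-")
    return flags


print(loan_flags("taqüzük"))  # ['no backness harmony']
print(loan_flags("töppe"))    # []
```

The flagged form taqüzük is the one the text treats as a borrowing from Digoron, whereas inherited töppe passes both checks. A harmonic form may of course still be a loanword, as kırdık shows, so the check is one-directional.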
A further (re-)assessment of the Ossetic and Karachay-Balkar material
may shed more light on the historical contacts between the Ossetians and
Karachay-Balkars, which in turn may assist in the formation of their respective
self-image and identity.
It is precisely the lack of prominent, and especially local, researchers and scholars
that has created a cultural and historic void in the national narrative of the
Karachay-Balkars. For this reason, many Karachay-Balkars have resorted to “borrowing”
aspects of their culture and historiography from their Turkic brethren, notably
from Turkey. This can have dire consequences in the future. Being ignorant of
their unique (language) history, the Karachay-Balkars could well devalue their
language to such a degree that they may decide no longer to pass it on to the
next generation, switching to Russian or even (Istanbul) Turkish instead.
References
Abaev, Vasilij I. 1933. “Poezdka k verxov’jam Kubani, Baksana i Čereka [Trip to the
upper reaches of the Kuban, Baksan and Cherek rivers].” Jazyk i Myšlenie I: 71-89
[= Abaev 1949, 271-290].
Abaev, Vasilij I. 1949. Osetinskij Jazyk i Fol’klor, vypusk I [Ossetian Language and Folklore,
part I]. Moskva – Leningrad: Izdatelʹstvo Akademii nauk SSSR.
Abaev, Vasilij I. 1958-1995. Istoriko-ètimologičeskij slovar’ osetinskogo jazyka [Historical-
Etymological Dictionary of the Ossetic Language]. 5 vols. Moskva – Leningrad:
Institut jazykoznanija RAN.
Abaev, Vasilij I. 1965. Skifo-evropejskie izoglossy na styke Vostoka i Zapada [Scytho-
European isoglosses on the crossroad between East and West]. Moskva: Nauka.
Anfert’eva, Antonina N. and Nikolaj N. Kazanskij. 2013. “Materialy k istorii Instituta
Lingvističeskix issledovanij RAN 1921-1934 gg. (ot Instituta jafetidologičeskix
Izyskanij do Instituta jazyka i myšlenija im. N. Ja. Marra) [= Materials on the history
of the Institute of Linguistic Research RAN from the years 1921-1934 (from
the Institute of Japhethological Investigation to the Institute of Language and
Thought, named after N. Ja. Marr)].” Acta Linguistica Petropolitana, Trudy Instituta
Lingvističeskix issledovanij RAN 9, no. 1: 1-437.
Apažev, Muxamed and Džamaldin N. Kokov. 2008. Adygè-urys psal’al’è / Kabardino-
čerkessko-russkij slovar’ [Kabardino-Cherkess Russian Dictionary]. Nalčik:
Gosudarstvennoe izdatelʹstvo inostrannyx i nacionalʹnyx slovarej.
Benkő, Loránd. 1992-1997. Etymologisches Wörterbuch des Ungarischen. 3 vols.
Budapest: Akadémiai Kiadó.
Bennigsen, Alexandre. 1983. The Islamic Threat to the Soviet State. London – Canberra:
Croom Helm.
Cheung, Johnny. 2002. Studies in the Historical Development of the Ossetic Vocalism.
Wiesbaden: Dr. Ludwig Reichert Verlag.
Chirikba, Viacheslav. 1996. Common West Caucasian: the reconstruction of its phonologi-
cal system and parts of its lexicon and morphology. Leiden: Research School CNWS,
School of Asian, African, and Amerindian studies.
Clauson, Gerard. 1972. An Etymological Dictionary of Pre-Thirteenth-Century Turkish.
Oxford: Clarendon Press.
Derksen, Rick. 2007. Etymological Dictionary of the Slavic Inherited Lexicon. Leiden:
Brill Publishers.
Gombocz, Zoltán. 1912. Die bulgarisch-türkischen Lehnwörter in der ungarischen
Sprache. Helsinki: Société finno-ougrienne.
Karaulov, Nikolaj A. 1908. “Balkary na Kavkaze [the Balkars in the Caucasus].” Sbornikʹʹ
materialovʹʹ dlja opisanija mestnostej i plemenʹʹ Kavkaza, 38: 131-180.
Lajpanov, Kazi T. 1967. “O tjurkskom èlemente v ètnogeneze osetin [On Turkic elements
in the ethnogenesis of the Ossetians].” In Proisxoždenie osetinskogo naroda [The
Origin of the Ossetian people], edited by Xazbi S. Čerdžiev: 209-214. Ordžonikidze:
Severoosetinskoe knižnoe izdat.
Maciuszak, Kinga. 2007. Review of Studies in the Historical Development of the Ossetic
Vocalism, by Johnny Cheung, Studia Etymologica Cracoviensia, 12: 203-206.
Marr, Nikolaj. 1933. Ètapy razvitija jafetičeskoj teorii [The stages of the Japhetic
Theory], edited by V. B. Aptekarev. Leningrad: Gosudarstvennaja akademija istorii
material’noj kultury. Vol. 1 of Izbrannye raboty [Collected Works].
Marr, Nikolaj. 1936. Voprosy jazykoznanija [Fundamental Questions of Linguistics],
edited by A. G. Ioannisjan. Leningrad: Gosudarstvennoe socialno-èkonomičeskoe
izdatel’stvo. Vol. 2 of Izbrannye raboty [Collected Works].
Pollock, Ethan. 2006. Stalin and the Soviet Science Wars. New Jersey: Princeton
University Press.
Pritsak, Omeljan. 1958. “Das Karatschaische und Balkarische.” In Philologiae Turcicae
Fundamenta I, edited by Deny, Jean, Kaare Grønbech, Helmuth Scheel, and Zeki
Velidi Togan: 340-368. Wiesbaden: Franz Steiner Verlag.
Schmidt, Gustav. 1931. “Über die ossetischen Lehnwörter im Karatschajischen.” In
Mélanges de Philologie offerts à M. J. J. Mikkola, professeur de philologie slave à l’Uni-
versité de Helsinki: 364-395. Helsinki: Suomalainen Tiedeakatemia.
Sevortjan, Ervand V. 1978. Ėtimologičeskij slovarʹ tjurkskich jazykov na bukvu “b”
[Etymological Dictionary of Turkic Languages on the letter b]. Moskva: Nauka.
Siemieniec-Gołas, Ewa. 2000. Karachay-Balkar Vocabulary of Proto-Turkic origin.
Kraków: Księgarnia Akademicka.
Thordarson, Fridrik. 2009. Ossetic Grammatical Studies. Edited by Sonja Fritz. Wien:
Verlag der Österreichischen Akademie der Wissenschaften.
Voprosy Jazykoznanija. 1952. “Dva goda dviženija sovetskogo jazykoznanija po novomu
puti [Two years of the movement of Soviet Linguistics on a new path].” May-June,
3: 3-18.
Chapter 3
Why Caucasian Languages?1

Bernard Comrie
Introduction
The articles in this volume are for the most part concerned with sociolinguistic
aspects of languages of the Caucasus, but in the present article I want to
draw attention to the fact that these languages also have unusual features
of grammatical interest. In addition to the importance of documenting and
preserving languages of the Caucasus as part of their communities’ cultural
heritage, the languages are also of scientific importance because of their
structural properties.
I have selected six features for brief discussion, with reference to more
extensive treatment in the literature. The examples represent all three of the
indigenous language families of the Caucasus, with one each from Kartvelian
and West Caucasian (Abkhaz-Adyghe), and four from East Caucasian (Nakh-
Daghestanian), more specifically the Tsezic branch of East Caucasian; the
greater concentration on the last mentioned simply reflects my own greater
familiarity with these languages.
1 Georgian Verb Indexing (“Agreement”)
In Georgian, the finite verb indexes (or in more traditional terminology: agrees
with) its subject and object in person and number. This means that a transitive
verb indexes both its subject and its object. The relevant forms can be found in
any standard grammar of Georgian, and are to a large extent straightforward.
The forms in (1) show a selection of relevant forms in the present tense.
1 A version of this article was presented at the 1st International CUA Conference on Endangered
Languages, Ardahan, Turkey, 13-16 October 2014. I am grateful to the conference organizers
for making this event possible and to all those who participated in the discussion of my pre-
sentation. The following abbreviations are used: abs absolutive, ad ad(essive), adj adjec-
tive, all allative, antip antipassive, attr attributive, caus causative, erg ergative, in
in(essive), ipfvcvb imperfective converb, loc locative, obl oblique, prf perfect, prs pres-
ent, pstunw past unwitnessed, rel relative, resptcp resultative participle, sg singular.

The stem of the verb is invariable, xedav, to which can be attached a prefix
and/or a suffix, depending on the person of the arguments to be indexed.
(1) v- xedav     ‘I see him/her/it’
       xedav     ‘you see him/her/it’
       xedav -s  ‘s/he sees him/her/it’
    m- xedav -s  ‘he/she/it sees me’
    g- xedav -s  ‘he/she/it sees you’
    m- xedav     ‘you see me’
    g- xedav     ‘I see you’
Some of the cases are morphologically transparent, and likewise transparent
in terms of their processing, in particular forms like m-xedav-s and g-xedav-s,
where the prefix consistently indexes the object (first or second person) and
the suffix consistently indexes the subject (third person).
However, the system is somewhat more complex than this, in particular
because certain persons are indexed by zero, which means that the hearer2
cannot simply link a given affix to a given combination of grammatical rela-
tion (subject or object) and person, but rather has to infer the interpretation
from patterns of absence of prefix or suffix. In the simple cases, we can identify
subject and object affixes as in (2).
(2) Subject  1sg  v-
             2sg  ∅-
             3sg  -s
    Object   1sg  m-
             2sg  g-
             3sg  ∅-
Each of the overt subject prefix, overt subject suffix, and the two overt object
prefixes receives a consistent interpretation. Beyond that, inferencing is
required, as developed in more detail in Comrie (2013: 24-26).
2 Although the inferences are presented as if conscious processes undertaken by the hearer,
such processing is of course below the level of consciousness, and combinations may well be
routinized by the time a child acquiring Georgian achieves fluency in the system.
First, absence of an object prefix on a transitive verb always indicates a
third person object. Thus, having identified that none of the object prefixes
is present, the hearer can make this inference. The case of absence of subject
prefixes is more complex, and here the hearer may have to go through several
stages. There is no overt second person affix, whether prefix or suffix, so as a
general rule the absence of both the first person subject prefix and the third
person subject suffix can be taken as an indication of a second person sub-
ject. There is, however, one exception to this, which follows from the fact that
Georgian does not permit combinations of person prefixes. The form g-xedav
has a second person object prefix, so clearly the object is ‘you’. What is the
subject? It cannot be third person, since this would require the third person
subject suffix -s, as indeed shows up in the form g-xedav-s in (1). Normally,
absence of a subject affix indicates a second person subject, so can g-xedav
mean ‘you see yourself’, with coreferential subject and object second person
arguments? The answer is negative, because another rule of Georgian inter-
venes, namely one that says that objects coreferential with the subject are not
expressed by means of an object affix on the verb, but rather by means of a
separate word tavi, literally ‘head’, so that a combination of second person sub-
ject and second person object is expressed literally as ‘you hit your head’; since
‘head’ counts as third person, it has no overt person affix in the verb morphol-
ogy. For g-xedav this leaves only one remaining interpretation, namely ‘I see
you’, and this turns out to be correct, since in Georgian when a subject and an
object prefix compete for the prefixal person position in the verb morphology,
the object wins out and the subject remains unexpressed. In g-xedav, there
is no piece of the verb’s structure that we can identify as expressing the first
person subject, and a zero subject marker does not in itself indicate a first per-
son subject; rather, it is the interaction of a number of principles of Georgian
verb morphology that conspire to determine the unique correct interpretation
of g-xedav.
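The chain of inferences just described can be restated as a short procedure. The following Python sketch is purely illustrative and covers only the singular present-tense forms of xedav listed in (1). The function name interpret and the use of hyphen-segmented input are assumptions made for this illustration and are not part of any published analysis of Georgian.

```python
# Illustrative sketch: interprets only the singular present-tense forms in (1).
STEM = "xedav"
OBJECT_PREFIXES = {"m": 1, "g": 2}  # overt object prefixes from (2)


def interpret(form):
    """Return (subject person, object person) for a form such as 'g-xedav-s'."""
    before, found, after = form.partition(STEM)
    if not found:
        raise ValueError(f"expected the stem {STEM!r} in {form!r}")
    prefix = before.strip("- ")
    suffix = after.strip("- ")

    # No object prefix on a transitive verb means a third person object.
    obj = OBJECT_PREFIXES.get(prefix, 3)

    if suffix == "s":
        subj = 3   # overt third person subject suffix
    elif prefix == "v":
        subj = 1   # overt first person subject prefix
    elif obj == 2:
        # A second person subject would be coreferential with the object,
        # which requires tavi instead of g-; the suppressed subject is 'I'.
        subj = 1
    else:
        subj = 2   # default reading when no subject affix is present
    return subj, obj


for f in ["v-xedav", "xedav", "xedav-s", "m-xedav-s", "g-xedav-s", "m-xedav", "g-xedav"]:
    print(f, interpret(f))
```

The ordering of the tests mirrors the text: overt affixes are read off first, and the first person reading of g-xedav emerges only by elimination.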
2 Kabardian Pre- and Postnominal Relative Clauses
The discussion of relative clauses in Kabardian in this section is based on
Applebaum (2013: 108-118); cf. also the published summary in Applebaum &
Berez (2009: 30-33).
Kabardian has a fully productive prenominal relative clause construction,
as illustrated in (3)-(4). The head noun, ‘man’ in (3) and ‘girl’ in (4), is outside
the relative clause and to its right, while the relative clause precedes the head
noun, i.e. exactly the opposite order from what one finds in English and most
other European languages (although it does parallel the dominant construc-
tion in Turkish).
(3) [qálɐ-m   ∅-kʷ’-á]    ɬ’ə́-r
    city-obl  3sg-go-prf  man-abs
    ‘The man who went to the city.’
(4) [maxʷɐqɐ́s  mektéb-əm   jə-ʃə-s-ɬɐɣ-á]         həgɐ́bz-ər
    every_day  school-obl  3sg-there-1sg-see-prf  girl-abs
    ‘The girl whom I saw at school every day.’
In examples (3) and (4), the relative clause consists of a sequence of several
words (in fact, two in (3) and three in (4)), and the head is a separate word. This
can be seen most obviously from the presence of one stress per word, marked
in this transcription by means of an acute accent on the vowel in question.
In addition, Kabardian also has a postnominal relative clause, with basi-
cally the same internal structure as the prenominal relative clause, except that
there are much heavier restrictions and a very different prosodic structure. An
example is provided in (5).
(5) a    žɐlɐ  [də-z-ɣɐ-t’əs-á]-m
    the  boy   1pl-rel-caus-sit-prf-obl
    ‘The boy who made us sit down.’
Perhaps the most important characteristic of (5) is that it is phonologically
a single word, despite the transcription with three words following the
morphosyntax rather than the phonology. This can be seen most clearly in the
fact that (5) has a single accent. But there are also additional prosodic cues
that this is a single word. As a separate phonological word, ‘boy’ would have
the form ʃálɐ, with an accent on the first vowel, and strengthening of this vowel
from [ɐ] to [a]. In (5), the word for ‘boy’ does not show this strengthening,
since it is not a separate word and does not have an accent on its first vowel.
The initial consonant of the verb form also shows that there is no phonological
word boundary between it and the preceding ‘boy’. As a separate phonological
word its initial stop would be devoiced, i.e. tə-z-ɣɐ-t’əs-á-m; the absence of
devoicing shows that the verb is not at the beginning of a phonological word.3
Given that the postnominal relative clause must form a single phonological
word with its head, one might well ask whether this imposes further restric-
tions, of length or complexity, on the relative clause, given that the whole must
be pronounced as a single phonological word. The answer is affirmative: The
postnominal relative clause may not be longer than a single grammatical word,
a restriction that considerably facilitates the prosodic unification of the
construction, although it does mean that ideas requiring expression by means of
more than a single grammatical word in the relative clause must be expressed
by means of prenominal relatives.
3 Though not directly relevant to the structure of relative clauses, the voicing of the initial
fricative of ‘boy’ in (5) shows that there is no phonological word boundary between it and the
preceding article.
Kabardian thus provides a clear example of a language that has both
prenominal and postnominal relative clauses but where it is the postnominal
relative clause that is more restricted in its expressive possibilities. Moreover, it
provides an excellent illustration of the need to include phonological information
in carrying out syntactic analysis.
3 Tsezic Pharyngealization
This section examines two languages from the Tsezic branch of East Caucasian,
Tsez (based on the fuller discussion in Maddieson, Rajabov, & Sonnenschein
1996) and Bezhta, with the link between the relevant phenomenon in the two
languages following the discussion in Comrie (2003).
One of the characteristic phonetic features of Tsez is pharyngealization. In
the indigenous vocabulary, phonemic pharyngealization occurs in two sets
of circumstances. First, uvulars may occur pharyngealized in any position in
the word, including word-finally, cf. the contrast between raq ‘side’ and raqˁ
‘wound’. In this position, phonetic pharyngealization characterizes basically
just the consonant, including its release, with only minimal effect on the pre-
ceding vowel.
Second, and more interestingly for present purposes, phonemic pharynge-
alization may characterize the initial (C)V of a word, cf. the contrast between
-oƛo4 ‘amongst’ and ˁoƛno ‘seven’, and between mo ‘(eye) tear’ and mˁow ‘kind
of mushroom’. Here, phonetic pharyngealization characterizes the whole of
the vowel, and the last part of the initial consonant if there is one; it does not
extend to the rest of the word, i.e. beyond the initial (C)V.
Corresponding to a pharyngealized vowel (or CV sequence) in Tsez, in
Bezhta one finds what has traditionally been described as an “umlauted”
vowel, as seen when comparing Tsez ˁaƛ ‘village’ with Bezhta äƛ. The precise
distinctive phonetic value of umlauted vowels in Bezhta remains to be investigated
in detail, although impressionistically they are less pharyngealized and
more fronted than their Tsez cognates.5 But there is a further, perhaps more
interesting difference.
4 The leading hyphen indicates that the postposition is normally preceded by a gender-number
prefix. While pharyngealization is frequent in Tsez lexical representations and clearly
phonemic, it is not easy to find minimal or even near-minimal pairs.
In Tsez, as noted, pharyngealization on the initial syllable is restricted pho-
netically to that syllable. By contrast, in Bezhta umlauting of the vowel of an ini-
tial syllable extends to following syllables, though tending to fall off in intensity
as one gets towards the end of the word, especially with longer words. Thus, the
past tense of the verb ‘cough’ in Bezhta is öhƛö-yö, where not only the second
vowel of the stem but also the vowel of the suffix is umlauted, giving rise to vowel
alternations in suffixes, cf. xuƛo-yo, past tense of ‘drink’. In other words, Bezhta
has developed vowel harmony: In general, the vowels of a word are either all
umlauted or all non-umlauted. The comparison of Tsez and Bezhta thus shows
us one possible scenario for the origin of vowel harmony, through the extension
of what was originally an opposition phonetically characterizing only the first
syllable to one that characterizes phonetically the whole of the word.
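The alternation in the past-tense suffix can be sketched as a simple rule. The Python fragment below is an illustration under strong simplifying assumptions: it lets the first vowel of the stem decide the umlaut status of the suffix, pairs a, o, u with ä, ö, ü (the three umlauted vowels mentioned in this section), treats i and e as plain vowels, and ignores the gradual weakening of umlaut in longer words. The name attach_suffix is hypothetical.

```python
# Illustrative sketch of umlaut harmony in suffixes; not a full account of Bezhta.
UMLAUT = {"a": "ä", "o": "ö", "u": "ü"}      # assumed plain/umlauted pairs
UMLAUTED = set(UMLAUT.values())
VOWELS = set(UMLAUT) | UMLAUTED | {"i", "e"}  # i and e treated as plain (assumption)


def first_vowel(word):
    """Return the first vowel of the word, or None if it has none."""
    for ch in word:
        if ch in VOWELS:
            return ch
    return None


def attach_suffix(stem, suffix):
    """Attach a suffix, umlauting its vowels if the stem begins with an umlauted vowel."""
    if first_vowel(stem) in UMLAUTED:
        suffix = "".join(UMLAUT.get(ch, ch) for ch in suffix)
    return f"{stem}-{suffix}"


print(attach_suffix("öhƛö", "yo"))  # 'cough', umlauted stem
print(attach_suffix("xuƛo", "yo"))  # 'drink', non-umlauted stem
```

On the two verbs cited above, the sketch yields the umlauted suffix with ‘cough’ and the plain suffix with ‘drink’.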
4 Personification in Tsez
Tsez, like most East Caucasian languages, has a gender system.6 In Tsez, there
are four genders, distinguished in the singular for instance by the prefix which
they require on vowel-initial verbs and adjectives that agree with that noun
phrase. The four classes are set out in (6), with the prefix used for agreeing with
a noun of that gender, and a brief semantic characterization of the nouns that
constitute each class.
(6) Tsez gender (noun class) prefixes in the singular
    I    ∅-  all and only male humans
    II   y-  all female humans; some inanimates
    III  b-  all animals; many inanimates
    IV   r-  many and only inanimates
5 While the Bezhta umlauted vowels ä, ö, and ü are often traditionally characterized as front
vowels, acoustic investigation of their formant structure by Sven Grawunder (Max Planck
Institute for Evolutionary Anthropology) suggests that while there is some fronting, at least
for ö and ü, it does not advance even as far as the central position. The Tsez and Bezhta coun-
terparts sound different, but the precise phonetic characterization of the difference remains
open.
6 Traditionally, this is often referred to as a noun class system in speaking of East Caucasian
languages. For present purposes at least, this is purely terminological. In general, the gender of a
noun is not reflected in its own form, much as in German one simply has to learn the genders
of Löffel ‘spoon’ (masculine), Gabel ‘fork’ (feminine), and Messer ‘knife’ (neuter).
In somewhat more detail: Gender I contains all and only nouns denoting
male humans (plus supernatural beings assimilated to male humans).
Gender II contains all nouns denoting female humans plus a restricted
number of inanimate nouns. Gender III includes all nouns denoting animals,
plus a large number of inanimate nouns. Gender IV contains a large number
of nouns, all of them inanimate. While in one sense the main problem in
learning the gender of nouns in Tsez is the fact that inanimates are assigned
across three genders, for present purposes we are only concerned with the
semantically determined features of the system: Male humans are in gender I,
female humans in gender II, animals in gender III.
What, then, happens in a traditional tale where the participants are ani-
mals, but playing the role of humans? One traditional Tsez story that I discuss
in more detail in Comrie (2005) involves as main participants a rooster and a
hen, who play the role of a husband and wife, with traditional family roles, as
can be seen in (7), where the rooster goes out to work while the hen stays home
to look after the house.7 There is thus a potential conflict between the fact that
the protagonists are animals (which would require gender III), but behave as
humans (which would require genders I and II).
(7) mamalay  ɣuddes     b-ik’i-x        zew-no     qaci-x
    rooster  every_day  III-go-ipfvcvb  be-pstunw  firewood-ad
    ciq-x-ār       onoču  b-eynoy-xo        zew-no     idu-z-ā
    forest-ad-all  hen    III-work-ipfvcvb  be-pstunw  home-loc-in
    ‘Every day the rooster went to the forest for firewood,
    and the hen did the housework at home.’
A similar problem arises in English with the choice of pronouns in the third
person singular, where English distinguishes, roughly speaking, male human
he, female human she, non-human (including animals and inanimates) it.
If I were to tell the story of the rooster and the hen in English, then I would
almost certainly use the pronouns he for the rooster, she for the hen, since what
is deemed relevant is the human social role that each illustrates. In Tsez, by
contrast, gender III, appropriate for animals, is used, so that in the first line
of (7) the verb b-ik’i-x has the gender III prefix agreeing with ‘rooster’, and in
the second line the verb b-eynoy-xo again has the gender III prefix, this time
agreeing with ‘hen’.
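The semantically determined part of the system, and the contrast with English in cases of personification, can be summarized in a few lines of Python. This is only a sketch: the category labels and the functions agreement_prefix and english_pronoun are invented for the illustration, and inanimate nouns, whose gender is not predictable from meaning, are deliberately left out.

```python
# Illustrative sketch: semantically assigned genders of (6) and personification.
PREFIX = {"I": "∅", "II": "y", "III": "b", "IV": "r"}
GENDER_BY_MEANING = {"male human": "I", "female human": "II", "animal": "III"}


def agreement_prefix(meaning, personified_as=None):
    """Tsez: the input category decides; the role played in the story is ignored."""
    return PREFIX[GENDER_BY_MEANING[meaning]]


def english_pronoun(meaning, personified_as=None):
    """English (typical choice): the personified role, if any, decides."""
    role = personified_as or meaning
    return {"male human": "he", "female human": "she"}.get(role, "it")


# The hen of (7): an animal playing the role of a wife.
print(agreement_prefix("animal", personified_as="female human") + "-eynoy-xo")
print(english_pronoun("animal", personified_as="female human"))
```

The only difference between the two functions is whether the personified role is consulted, which is the point of the comparison.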
7 The full Tsez text of the story, with a Russian translation, is available in Abdulaev & Abdullaev
(2010: 44-47); the version there is slightly different from the one used here, but the differences
are not significant.
This points to an interesting difference in the interaction between grammar
and discourse in English and Tsez gender in cases of personification. In
English, it is the result of personification—male or female human—that pre-
dominates; in Tsez, it is the input (animal).
5 Bezhta Antipassive
In English, voice alternations are in general independent of the expression
of tense-aspect-mood (TAM), so that corresponding to the active simple past
in (8) one can change the aspect to progressive without changing the voice,
as in (9), and likewise one can change the voice without changing the aspect,
as in (10).
(8)  The detective followed the suspect.
(9)  The detective was following the suspect.     progressive
(10) The suspect was followed by the detective.   passive
In Bezhta, as generally in Nakh-Daghestanian languages, the alignment of a
simple transitive clause is ergative-absolutive, i.e. the subject/agent (A) in (11)
stands in the ergative case, while the direct object/patient (P) stands in the
absolutive case. (If the verb indexes an argument, this will be the P; in (11), the
word ‘bread’ belongs to gender III.)
(11) öždi     bäbä             m-üq-čä
     boy.erg  bread(III).abs   III-eat-prs
     ‘The boy eats the bread.’
There is an alternative construction as in (12), a so-called antipassive (Comrie,
Khalilov & Khalilova 2015: 551-553), in which the P shows up in an oblique case
(in Bezhta, usually the instrumental), while the A shows up in the absolutive
case; there is a suffix indicating that the verb is in the antipassive voice.8
Just like the English passive, the Bezhta antipassive changes the grammatical
relations in the clause; note, for instance, that the verb in (12) now indexes
the A (gender I), not the P.
8 The formation of antipassives in Bezhta is, however, rather idiosyncratic, lexicalized, as will
be seen in the following examples.
(12) öžö         bäbäla-d   ∅-üⁿq-dä-š
     boy(I).abs  bread-ins  I-eat-antip-prs
     ‘The boy is busy eating the bread.’
However, unlike the English passive, the Bezhta antipassive not only changes
the grammatical relations, it also changes the TAM of the clause, adding the
semantic component of durative, i.e. extended in time, which depending on
the lexical meaning of the verb might involve simply extending the particular
action in time, as in (12), or repeating it so that the resultant complex action
takes up more time than a single instance of the action in question, as in (14)
and (16) below. The question therefore arises whether the Bezhta antipassive
is really to be characterized as a voice, even though in (12) the grammatical
relations are different from in (11), or whether it should rather be characterized
as an aspectual, durative form, since this is its semantic import.
The situation becomes more complex if we include intransitive verbs. First,
Bezhta has a small class of intransitive verbs with onomatopoeic semantics
that take their single argument in the ergative case, as in (13). Diachronically,
these may originally have been transitive constructions (of the general type
‘the boy uttered X’), but synchronically they are one-place predicates.
(13) öždi     öhƛö-yö
     boy.erg  cough-pst
     ‘The boy coughed (once).’
In one respect, the antipassive of such an onomatopoetic verb is similar to that
of a transitive verb: The argument in the ergative case in (13) shows up in the
absolutive case in (14), just like the A of a transitive clause.
(14) öžö      öhdää-yö
     boy.abs  cough.antip-pst
     ‘The boy coughed (several times).’
With the general run of intransitive verbs, the antipassive, as in (16), has exactly
the same case frame as the basic construction, as in (15), although the form of
the verb shifts to indicate the antipassive.
(15) öžö         ∅-ogic’-iyo
     boy(I).abs  I-jump-pst
     ‘The boy jumped (once).’
(16) öžö         ∅-ogiyac-ca
     boy(I).abs  I-jump.antip-prs
     ‘The boy jumps (several times).’
But is (16) then really an antipassive? There is no change in voice relative to
(15), the only change being in the aspectual value. Is the so-called antipassive
in Bezhta then purely aspectual, i.e. more accurately described as durative aspect
rather than as antipassive voice? The aspectual shift in meaning is constant,
but nonetheless there is an undeniable change in grammatical relations with
transitive verbs and, though perhaps less spectacular, with onomatopoetic verbs.
Bezhta thus illustrates an interesting interaction between voice and TAM.
The relevant morphological form is consistently durative. However, if the
verb is transitive an obligatory change in grammatical relations is required,
paralleling a prototypical antipassive construction. A corresponding change
is also required in onomatopoetic verbs. Nothing in the nature of human
language predicts this syntactic change. The Bezhta construction does pose
a terminological problem (antipassive? durative? both?), although the generalization,
though unusual, is easy to state: The resultant form is consistently
durative, and a change in voice to antipassive is required for transitive and
onomatopoetic verbs.
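The generalization can be laid out explicitly by comparing case frames. The Python sketch below is an illustration only: the labels for the three verb classes, the role names A, P and S, and the functions antipassive_frame and relations_change are assumptions introduced here, and the oblique case of the P is simply set to the instrumental, which the text describes as the usual choice.

```python
# Illustrative sketch: case frames of basic and antipassive/durative forms.
BASIC_FRAME = {
    "transitive": {"A": "erg", "P": "abs"},    # as in (11)
    "onomatopoeic": {"S": "erg"},              # as in (13)
    "intransitive": {"S": "abs"},              # as in (15)
}


def antipassive_frame(verb_class):
    """Case frame of the durative form: P becomes oblique, A or S is absolutive."""
    frame = {}
    for role in BASIC_FRAME[verb_class]:
        frame[role] = "ins" if role == "P" else "abs"  # instrumental assumed for P
    return frame


def relations_change(verb_class):
    """True if the durative form differs in case frame from the basic form."""
    return antipassive_frame(verb_class) != BASIC_FRAME[verb_class]


for cls in BASIC_FRAME:
    print(cls, BASIC_FRAME[cls], "->", antipassive_frame(cls), relations_change(cls))
```

Only the general run of intransitive verbs comes out with an unchanged frame, which is where the label ‘antipassive’ becomes questionable.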
6 Bezhta Adjectives
As the final set of data on Tsezic languages, we consider the word classes
represented in the native and borrowed vocabulary of Bezhta, following
Comrie & Khalilov (2009). Of particular interest here are adjectives. Bezhta has
only a small number of indigenous underived adjectives, such as -uk’o9 ‘big’,
k’et’o ‘good’. In addition, within the indigenous vocabulary there are adjectives
derived from other word classes, such as participles derived from verbs, e.g.
-uɣo-yo ‘dead’, morphologically -die-resptcp, i.e. something like ‘having died’.
A large proportion of Bezhta’s adjectives, however, are loans from Avar, the
traditional lingua franca of the area, such as bercinab ‘beautiful’.
Within the Loanword Typology project, of which Comrie & Khalilov (2009)
forms part, for each language a fixed list of lexical meanings was used, translated
into the language in question, and then identified as an indigenous formation
or a loan, in the latter case with further identification of the immediate
source. The data in (17) show, for each word class in Bezhta, the percentage of
words on the fixed list that are indigenous versus borrowed from Avar, from
Russian, from Georgian, or from some other source.
9 A leading hyphen indicates the position where an agreement (indexing) prefix precedes
the stem.
(17) Loanwords in Bezhta by part of speech and source language (%)
                    Indigenous  Avar  Russian  Georgian  Other
     Noun                 55.6  23.1     12.1       8.3    0.8
     Verb                 94.0   6.0      0.0       0.0    0.0
     Adjective            70.4  29.6      0.0       0.0    0.0
     Function word        88.1   9.9      1.0       1.0    0.0
In the overwhelming majority of languages, from any given source of a
substantial number of loans the number of borrowed nouns exceeds that of
borrowed adjectives, as can be seen for loans from French into English in (18).10
(18) Loanwords in English by part of speech and source language (%) (Grant 2009)
                    Indigenous  French  Latin  Old Norse  Other
     Noun                 52.0    29.5    9.7        2.7    6.0
     Verb                 65.9    22.3    5.9        4.8    1.1
     Adjective            66.2    19.2    6.2        6.2    2.3
     Function word        91.1     2.2    2.2        2.2    2.2
Bezhta is unusual in that among its substantial number of loans from Avar, the
main source of its borrowings in the LWT wordlist, the percentage of adjectives
that is borrowed is greater than the corresponding percentage for any other
word class. These adjectives also, incidentally, include many frequent words,
such as bercinab.
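The comparison can be checked directly against the figures in (17) and (18). The short Python sketch below copies the Avar column of (17) and the French column of (18) and reports which word class has the highest borrowed share from each source; the variable and function names are chosen only for this illustration.

```python
# Percentages copied from the Avar column of (17) and the French column of (18).
BEZHTA_FROM_AVAR = {"Noun": 23.1, "Verb": 6.0, "Adjective": 29.6, "Function word": 9.9}
ENGLISH_FROM_FRENCH = {"Noun": 29.5, "Verb": 22.3, "Adjective": 19.2, "Function word": 2.2}


def most_borrowed_class(shares):
    """Return the word class with the highest percentage of loans."""
    return max(shares, key=shares.get)


print("Bezhta from Avar:", most_borrowed_class(BEZHTA_FROM_AVAR))
print("English from French:", most_borrowed_class(ENGLISH_FROM_FRENCH))
```

For English the noun column leads, as in most languages, whereas for Bezhta it is the adjective column.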
7 Conclusion
In this article I have drawn attention to some of the properties of indigenous
languages of the Caucasus that continue to attract linguists to them.
Documenting and preserving these languages is not only an important social
and societal task, but also a rewarding scientific endeavor.
10 Likewise from Latin into English, although the numbers are smaller and therefore less
probative, since a single item can shift the percentages of word classes significantly. From Old
Norse into English there are actually more adjectives than nouns or verbs, although the
numbers are again small.
References
Abdulaev, Arsen K. & Isa K. Abdullaev. 2010. Cezyas folklor [Tsez folklore]. Leipzig–
Makhachkala: “Lotos”.
Applebaum, Ayla Ayda Bozkurt. 2013. Prosody and grammar in Kabardian. PhD disser-
tation, University of California, Santa Barbara.
Applebaum, Ayla B. & Andrea L. Berez. 2009. A theory is only as good as the data: cast-
ing a wide net in Kabardian and Ahtna documentation. In Peter K. Austin, Oliver
Bond, Monik Charette, David Nathan, & Peter Sells (eds) Proceedings of Conference
on Language Documentation and Linguistic Theory 2, 29-38. London: SOAS.
Comrie, Bernard. 2003. A note on pharyngealization and umlaut in two Tsezic lan-
guages. In Winfried Boeder (ed.) Kaukasische Sprachprobleme, 105-109. Oldenburg:
Bibliotheks- und Informationssystem der Universität Oldenburg.
Comrie, Bernard. 2005. Grammatical gender and personification. In Dorit Diskin Ravid
& Hava Bat-Zeev Shyldkrot (eds) Perspectives on language and language develop-
ment: Essays in honor of Ruth A. Berman, 105-114. Dordrecht: Kluwer.
Comrie, Bernard. 2013. Ergativity: some recurrent themes. In Edith L. Bavin & Sabine
Stoll (eds) The acquisition of ergativity, 15-34. Amsterdam: John Benjamins.
Comrie, Bernard & Madzhid Khalilov. 2009. Loanwords in Bezhta, a Nakh-Daghestanian
language of the North Caucasus. In Martin Haspelmath & Uri Tadmor (eds)
Loanwords in the world’s languages: A comparative handbook, 414-429. Berlin: De
Gruyter Mouton.
Comrie, Bernard, Madzhid Khalilov, & Zaira Khalilova. 2015. Valency in Bezhta. In
Andrej Malchukov & Bernard Comrie (eds) Valency classes in the world’s languages,
541-570. Berlin: De Gruyter Mouton.
Grant, Anthony P. 2009. Loanwords in British English. In Martin Haspelmath & Uri
Tadmor (eds) Loanwords in the world’s languages: A comparative handbook, 360-383.
Berlin: De Gruyter Mouton.
Maddieson, Ian, Ramazan Rajabov, & Aaron Sonnenschein. 1996. The main features of
Tsez phonetics. UCLA Working Papers in Phonetics 93.
Chapter 4
International Research Collaboration on
Documentation and Revitalization of Endangered
Turkic Languages in Ukraine: Crimean Tatar,
Gagauz, Karaim, Qrymchak and Urum Experience
İryna M. Dryga
I would like to begin by thanking Ardahan University and Türk Dil Kurumu
for inviting me to speak here today. It’s a real pleasure to have this opportunity
to share my views on documentation and revitalization with you. I represent
the A. Krymsky Institute of Oriental Studies of the National Academy of Sciences
of Ukraine and, at the same time, the National Network organized for the
revitalization of the Turkic minority languages in our country.
This initiative, which combines the efforts of professional linguists, politicians
and legislators, did not take shape at once. It started years ago with a few
isolated revitalization attempts.
All of the Turkic languages in Ukraine (Crimean Tatar, Gagauz, Karaim,
Krymchak, Urum) are endangered; that is, they are at risk of extinction in
the short or the long term. According to UNESCO’s “Atlas of Endangered
Languages” [http://www.unesco.org/languages-atlas/index.php], the degree of
their vitality varies from definitely endangered (Gagauz) to extinct (Karaim)
[Czató 2010]. Even Crimean Tatar, which has the largest population of speakers
among them, is considered endangered because the language
is spoken mainly by members of the older generation.
The assessment of this language in the Atlas of Endangered Languages has
long appeared to be too optimistic and it is necessary to revise it on the basis
of a more profound analysis.
First, for the last several years we have spared no effort to collect and
preserve the remains of the ‘small’ or ‘insular’ Turkic languages in our country
[Dryga 2010: 195-200, 220-233, 355-362, 406-419]. In 2006-2007 we conducted
a field study in Ukraine and Lithuania in cooperation with the Altaic Society
of Korea, with the goal of reviving, preserving and studying the languages and, if
possible, the cultures of numerically small Turkic-speaking minorities such
as the Karay, Qrymchaq and Urum, who reside mainly in rural localities spread
over the multilingual regions of the Crimea, Trakai and Azov and have no
native language education [Author’s recordings 2006]. As a result, Korean and
Ukrainian linguists collected invaluable language material, given that
as of 2008 there remained only one living speaker of Qrymchak, about ten speakers
of the Qypchaq-Polovets dialects of Urum, eight speakers of the Crimean dialect
of Karaim and two speakers of its Halych-Volyn’ dialect [Altaic Society
of Korea, 2006: 198-203; http://altaireal.snu.ac.kr/askreal_v25/photogroup.html].
All the speakers were over 60, and most were 80 or older.
Second, Lenara Kubedinova and Radovan Garabik from Slovakia began to
develop the corpus of the Crimean Tatar Wikipedia, a collaborative free-content
internet encyclopedia [Yavorska, Dryga 2015: 130-136]. Natural language
processing often turns to Wikipedia as a source of material, because it
makes its content openly available, represents the contemporary living language
and is a valuable source of texts, for example for the Linguistic Corpus of the
Crimean Tatar Language. For Crimean Tatar it became a point of grassroots prestige
and a major source of accessible online texts in the language. By comparison, no other
endangered Turkic language of Ukraine has a Wikipedia corpus. The Crimean Tatar
Wikipedia (in Latin script) officially started on 12 January 2008, though the first
pilot version dates from September 2006. It contains about 4000 articles, ranking
164th among Wikipedias by number of articles.
Moreover, Miquel Cabal-Guarro, on the basis of data drawn from the
sociolinguistic survey that he conducted in 2011 amongst Crimean Tatars
across the Crimean peninsula, provided an analysis of language use and
transmission [http://ru.krymr.com/a/25467619.html]. Although the Crimean
Tatar language was seldom or never spoken, especially amongst
members of the younger generation (who tend to use only or mostly
Russian in their everyday communication, even with their relatives), it is
still one of the main markers of Crimean Tatar ethnicity:
in nearly all cases respondents claimed it as their language of identity,
and almost always declared it their native language. Miquel Cabal-Guarro
also tried to establish the degree of endangerment of the Crimean
Tatar language.
Tudora Arnaut organized three international symposiums in Kyiv in 2005-2010
[Arnaut 2005, 2010, 2013] on the problems of the Turkic-speaking peoples,
and these became our first experience of public collaborative discussion of
the linguistic problems of the endangered Turkic peoples. The decisions taken
by the symposiums were important and commendable; they were forwarded
to the relevant governmental departments and commissions but had
no effect [Arnaut 2007]. Moreover, the new language law proposed in 2010 by
the pro-Russian and Communist majorities in Parliament
[https://en.wikipedia.org/wiki/Language_policy_in_Ukraine] ignored these recommendations.
Oleksandr Rybalko proposed art and cultural events as an
additional tool for promoting endangered languages, taking Urum as a case
[Yavorska, Dryga 2015: 264-269]. Understanding that, in order to be protected,
endangered languages require great collective efforts by various government,
academic and other institutions, he made a field trip with the ArtPole Art
Agency to Stary Krym (Mariupol), a village founded by Urums speaking the
Oghuz dialect, and recorded several pieces (fairytales and songs) from an
87-year-old native Urum speaker, a storyteller and singer. One of the fairytales,
“Ashyk Garip”, was selected for an art project in the form of video art with
subtitles in Ukrainian and English, accompanied by live music. The project was
shown at the ArtPole festival and was well received by the audience. It had
educational value, showing the public how rich and diverse Ukraine is
linguistically and culturally and emphasizing the urgent need for support to
protect this diversity. Other isolated efforts by enthusiasts could also be added.
In conclusion: we realized that all the individual efforts were no more
than professional hobbies, and the Russian occupation of Crimea and the
South-Eastern region forced us to reconsider the measures taken before.
The Russian invasion has set us new challenges of immediate, precisely
targeted protection of the endangered Turkic languages, whose situation
worsened fundamentally within less than half a year. OSCE observers, on
arriving in the country, immediately noted the importance of the appearance
in Ukrainian public discourse of the topic of the rights of Turkic language
minorities, against the background of the steady deterioration in the
situation of the Crimean Tatar language.
Add to this the targeted physical destruction of non-Russian villages in
the South-Eastern region of Ukraine. Tragically, settlements of Meskhetian
Turks (near Sloviansk) and Urums became the sites of the most brutal
battles. For example, as the illustration for our conference “Endangered
Turkic languages of Ukraine” we took a photo that became famous among
Ukrainian Internet users.1 It shows the Urum village of Starobeshevo
completely razed to the ground: not a single house was left, only a common
grave of unknown Ukrainian soldiers with a homemade cross on it.
This became the impetus for the immediate organization of a working group
of all interested Ukrainian and Crimean linguistic experts, joined by
European and Turkish experts, that worked on answers to the following
questions: how can the threat of linguistic assimilation of the Crimean
Tatars and the speakers of other Turkic languages represented in Ukraine be
prevented? What strategies of protection and support should our country
choose?
1 http://oriental-studies.org.ua/uk/перша-міжнародна-конференція-мови-п/.
The conference analyzed language policies and educational practices
concerning the endangered languages [Ajniuk 2012] and raised awareness of
these languages in Ukraine in the context of language diversity in Europe.
It also discussed issues of language planning and language shift, including
the expected cultural and cognitive consequences of language loss, from
different perspectives. In particular, when discussing theoretical issues,
Paul Billbao-Sarria mentioned that successful language recovery requires
four conditions: 1) adequate legislation, 2) proper planning, 3) sufficient
and well-spent economic resources, and 4) public interest and support, such
as the backing of a community [Yavorska, Dryga 2015: 69-70]. As a step
towards creating a civil society, the community must be able to create new
speakers of the language and new spaces in which to use it, to protect the
means of recovering the language, to create a political lobby for its
language issues, to prepare regular reports on the status of language
rights, etc. All these tasks were taken on by our National Network on
Endangered Turkic languages in Ukraine. Institutionally, we have responded
positively to the offer to join projects of the European Language Equality
Network (ELEN) and hope that this will help in monitoring language rights
in Crimea, to which access is currently unavailable.
I would like to draw your attention to the fact that, in the context of
language ideologies, endangered languages are a metaphorical notion, and
security is the main concept here. For many years a situation developed in
Ukraine in which not only the Turkic languages but also Ukrainian and
Russian were perceived as languages under threat. As a result of the 2014
conflict, the perception of the Ukrainian language shifted: it began to be
perceived not as a language under threat but as a language-protector
[Yavorska, Bogomolov 2010; Yavorska, Dryga 2015: 8]. Crimean Tatar
participants of the conference noted that after the events in Crimea in the
spring of 2014 Crimean Tatar youth started to communicate more in their
native language, which seems to serve as a kind of shelter for many
[Yavorska, Dryga 2015: 119-129; 160-172]. Thus endangered languages have
suddenly come into demand, and we discussed how to explore the mechanism of
this shift and how to use the trend properly.
Our Polish colleagues shared their experience of documenting the heritage
of endangered languages, which is close to ours methodologically and
technically. In particular, their research group has developed special
software for the digital processing of various linguistic materials. The
researchers have collected data on Kashub, Armenian-Qypchak, Tatar, Karaite
and other languages under threat of extinction, as well as on dormant
languages. We were also told about the initiatives of Tatar activists in
Poland to revitalize their native language [Yavorska, Dryga 2015: 10-11].
And the last lesson taught to us by the recent events, and ultimately by
the language law, is that the definition of the endangerment of a
particular language must clearly reflect the status of the language in both
synchrony and diachrony. Because this definition is not clearly
established, in certain countries vigorous languages like German and
Russian receive the same level of protection as languages on the verge of
extinction.2
2 For more about the conference see: http://internationaal.pvda.nl/2014/10/01/een-ontmoeting-met-krimtataren/ and http://teraze.org.ua/page.php?id=9&article=4896.
Conclusions based on the reports presented and the discussions held on
the topic:
1. The dominant culture and the ideologies embedded in it may often affect
endangered languages negatively, even if unwittingly [Yavorska,
Bogomolov 2010].
2. Split jurisdictions in the aftermath of the collapse of the Soviet Union
and of wars (the occupation of Crimea) complicate sustained and consistent
education policies toward the endangered languages.
3. Historical depth has been interrupted from the native perspective and
appropriated by cultural others; hence, e.g., the Crimean Tatar texts are
perceived as Ottoman.
4. Quite a few bicultural communities are still on the scene, such as
Armenian Qypchaks, Urums, Karaites, and Gagauzes who sometimes prefer to be
Bulgarians.
5. Communities face dilemmas of loyalty and cultural affiliation, affected
by the cultural policies and historical narratives of various kin-states
and communities.
6. Bilingualism, as both a necessity and a policy issue, is often not
properly reflected upon: both the natives and the host states tend to side,
explicitly or implicitly, with a preference for monolingualism.
7. The need for a specific, probably phonetic, alphabet in the case of the
endangered Turkic languages: instead of serving as an instrument to ensure
better access to the language for those willing to improve their
competence, alphabets often become an ideological minefield. E.g. Crimean
Tatars failed to adopt a new Latin-based alphabet. To this can be added:
– the need for digitizing texts, including old and new oral ones, while
native speakers of the dialects are still alive;
– the need for both bilingual and explanatory dictionaries;
– the need for oral accounts to be collected while there is still something
to collect;
– the need for translation as an instrument for enhancing the corpus of
modern texts and for sustaining and developing the language.
8. UNESCO’s Atlas of World’s Endangered Languages still characterizes a
number of Turkic languages inadequately. The cause is the lack of sources.
Consequently, the situation demands a professional re-assessment of the
current status of each of the Turkic languages.
Our research institute and our organization are ready to join any kind of
international research project or group as a partner organization, and to
contribute our own language databases and accomplishments to international
language corpora and databases.
We believe that professional discussion of the issues of preserving
endangered languages will contribute to the promotion of cultural diversity
as a value and of tolerance towards the different ethnic groups and peoples
living in Ukraine. This would ensure dialogue and understanding between the
Crimean Tatars and the Ukrainian- and Russian-speaking population of
Crimea. It will also create additional opportunities for the Crimean Tatars
and other Turkic-speaking peoples of Ukraine to take at last, though
perhaps too late, a worthy place in the social and cultural life of the
country. It will also help to overcome the political tension around the
language issue in Ukraine.
Primary Sources
Author’s recordings of Crimean Tatar (Razdol’ne, 1998-1999), Karaim (Simferopol –
Evpatoria, 2006, Trakai 2006-2007), Urum (Mariupol – Stary Qrym, 2007, 2008),
Krymchak (Simferopol, 2006).
Secondary Sources
Ajniuk, Bohdan (ed.), 2012. Ekologia movy i movna polityka v suchasnomu suspilstvi
(zbirnyk naukovyh prats). Kyiv: Vydavnychyj dim Dmytra Burago, 376 p.
Altaic Society of Korea, 2006. Fieldwork Studies of Endangered Altaic Languages. For
the Genealogical Study of Korean and the Preservation of Endangered Languages. The
Language and Cultural Studies Series 2. Seoul: Altaic Society of Korea.
Arnaut, Fedora and Dermenci, Ömer (eds.), 2005. Kruhlyi stil: Tiurkomovni narody
Ukrajiny (movy ta kultury tatar, gagauziv, urumiv, karajimiv i krymchakiv). Kyiv:
Vydavnychyj dim Dmytra Burago, 112 p.
Arnaut, Tudora and Nasrattinoglu Irfan (eds.), 2010. Materialy druhoho Vseukrajinskoho
kruhloho stolu “Problemy osvity tiurkomovnyh narodiv Ukrajiny: gagauziv, urumiv,
karajimiv, krymchakiv i krymskyh tatar” / II. Ukrayna’daki Türkçe Konuşan Halklar
(Gagauz, Urum, Karay, Kırımçak ve Kırım Tatarları’nın eğitim sorunları) Paneli. Kyiv:
Vydavnycho-polihrafichnyj tsentr “Kyivskyj universytet”, 257 p.
Arnaut, Tudora and Nasrattinoglu Irfan (eds.), 2013. Materialy tretioho Mizhnarodnoho
sympoziumu “Tiurkomovni narody Ukrajiny” / III. Uluslararası Ukrayna’da Türkçe
Konuşan Halklar Sempozyumu. Kyiv: Vydavnycho-polihrafichnyj tsentr “Kyivskyj
universytet”, 335 p.
Arnaut, Tudora, 2007. Ukrayna Gagavuzların Arasında Ana Dilinin Gelişmesi, In:
Gagavuz Türkçesi Araştırmaları 27-29 Aralık 2007 Bilgi Şöleni. Ankara: Türk Dil
Kurumu Yayınları, p. 52.
Csató, Éva Á., 2010. ‘Report on an Uppsala workshop on Karaim studies’, In: Johanson,
Lars and Csató, Éva Á. (eds.), Turkic languages, Wiesbaden: Routledge.
Dryga, Iryna, 2010. Pontika: Türkoloji yazıları. Тюркологічні студії. Kyiv: Chetverta
hvylia, 530 p.
Yavorska, Halyna and Bogomolov, Alexander, 2010. Nepevny object bazhannia: Jevropa v
ukrajins’komu politychnomu dyskursi. Kyiv: Vydavnychyj dim Dmytra Burago, 136 p.
Yavorska, Halyna, Dryga, Iryna (eds.), 2016. Zahrozheni movy. Krymskotatars’ka ta
inshi tiurkski movy v Ukrajini: zbirnyk naukovyh prats’. 1st International Conference
Endangered languages: Crimean Tatar and other Turkic languages in Ukraine
(26-27th September 2014). Proceedings. Kyiv: Aksioma Medobory, 344 p.
Internet Resources
http://altaireal.snu.ac.kr/askreal_v25/photogroup.html
http://oriental-studies.org.ua/index.php?news=5379.
http://ru.krymr.com/a/25467619.html
http://sonseslerduyulmadan.hacettepe.edu.tr
http://www.turkiyat.hacettepe.edu.tr/kitap/tehlikedekidillerbildirileri_090413.pdf
http://www.unesco.org/languages-atlas/index.php
https://en.wikipedia.org/wiki/Language_policy_in_Ukraine
Appendix
FIGURE 1 Turkic speaking minorities on ethnographic map of Ukraine.
FIGURE 2 Photo of destroyed Urum village Starobeshevo on the banner of the 1st
International Conference “Endangered Languages: Crimean Tatar and Other Turkic
Languages in Ukraine”.
Chapter 5
Cases-Non-cases: At the Margins of the Tsezic Case System
Diana Forker
1 Introduction
The Tsezic languages are a group of closely related languages that form one
subbranch within the Nakh-Daghestanian (or East-Caucasian) language
family. They can be divided into East Tsezic, comprising Hunzib and Bezhta,
and West Tsezic, comprising Khwarshi, Tsez and Hinuq. Tsezic languages are
spoken in the Republic of Daghestan, which belongs to the Russian Federation.
Daghestan is located in the north-eastern part of the Caucasus. Smaller groups
of Tsez and Bezhta speakers also live in Turkey, and some Bezhta speakers live
in Georgia. The largest Tsezic language is Tsez with about 12,000 speakers
(according to the Russian census of 2010); the smallest language is Hinuq with
around 600 speakers.
Case assignment in the Tsezic languages is largely semantically moti-
vated, and morphosyntactic features play only a marginal role (cf. Kibrik
1997). Due to the dominant role of semantics in the assignment of case it
seems that it is relatively simple to extend the case inventory. That is, suffixes
and enclitics with an autonomous distinguished form paired with a clear-
cut meaning can, in principle, develop into cases. This seems to explain the
origin of many spatial cases in Tsezic that most probably go back to spatial
postpositions.
The aim of this paper is to explore a number of nominal markers that
resemble cases and compare them with genuine case markers with respect
to functional and formal similarities and differences. I will adopt the canoni-
cal approach as exemplified by Corbett (2008) for the feature of case. Corbett
(2008) provides ten criteria for canonical case markers and examines the
Russian cases in regard to the criteria. I will use these criteria for investigating
whether the respective nominal markers from the Tsezic languages could be
analyzed as case markers.

2 Some Notes on the Nominal Inflection in Tsezic Languages
Inflectional categories of nouns in Tsezic languages are number and case. The
languages have elaborate gender systems with up to five genders, but nouns
are normally not inflected for gender. Inflection is almost exclusively suffixing
and largely agglutinative. For the nominal inflection this means that number
and case are expressed by different suffixes.
Tsezic languages have a rich case inventory with a large number of spatial
cases. Non-spatial cases found in all Tsezic languages are absolutive, ergative,
genitive, and instrumental. All cases except for the absolutive are expressed by
suffixes that are added to the so-called ‘oblique stem’. By contrast, the form that
the noun takes in the absolutive case is called ‘direct stem’. The same distinc-
tion between direct and oblique stem is found with all sorts of pronouns, and
partially also with adjectives and participles.
Case formation is straightforward and regular. The main difficulty in the
nominal morphology of the Tsezic languages is the formation of the oblique
stem from the direct stem. There are several operations that are used in order
to form the oblique stem: insertion of vowels, consonants or glides, ablaut,
stress shift, conversion, and suffixation of oblique stem markers (Forker 2010a).
The last process, the suffixation of an oblique stem marker, is the most fre-
quent and the only productive way of forming oblique stems. The number
of oblique markers differs from language to language. In Khwarshi, there are
only six markers whereas Hunzib has more than twenty. In all languages,
the number of markers is much lower for plural nouns than it is for nouns
in the singular. Table 1 illustrates the formation of oblique singular stems to
which the genitive suffix has been added.
Table 1 Examples of oblique stems for singular nouns
Language   Operation                                  Gloss         Absolutive   Genitive
Tsez       ablaut, vowel deletion & oblique marker    ‘place’       moči         meč-o-s
Hinuq      stress shift                               ‘boy, son’    úži          uží-š
Khwarshi   consonant & vowel insertion                ‘sibling’     ɨs           ɨs-t-ɨ-s
Bezhta     deletion of glide                          ‘eye’         häy          hä-l
Hunzib     suffixation of oblique marker              ‘stable’      bež          bež-li-s
There is some variation in the stem formation systems. This means that often
nouns have more than one way of forming the oblique stem; they follow a
rare and unproductive pattern but can also be used with a common oblique
marker. Moreover, with spatial cases the case suffix is sometimes attached to
the unmodified noun. Depending on the analysis we can either say that in such
examples the spatial cases are suffixed to the direct stem or that we deal with
conversion.
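The two-step formation described in this section (direct stem, then oblique stem, then case suffix) can be sketched as a small lookup-based model. The following Python sketch is only illustrative: the dictionary layout and the function names `absolutive` and `genitive` are assumptions made for the example, not part of any published analysis. The forms, and the split between oblique stem and genitive suffix, simply follow the hyphenation in Table 1. Because oblique stems formed by unproductive patterns are lexically determined, the oblique stem is stored per noun rather than computed.

```python
# Illustrative sketch: data copied from Table 1; structure and names are hypothetical.
# Each entry lists the direct (absolutive) stem, the lexically stored oblique stem,
# and the genitive suffix as segmented in Table 1.
NOUNS = {
    ("Tsez", "place"): {"direct": "moči", "oblique": "meč-o", "gen": "s"},
    ("Hinuq", "boy, son"): {"direct": "úži", "oblique": "uží", "gen": "š"},
    ("Khwarshi", "sibling"): {"direct": "ɨs", "oblique": "ɨs-t-ɨ", "gen": "s"},
    ("Bezhta", "eye"): {"direct": "häy", "oblique": "hä", "gen": "l"},
    ("Hunzib", "stable"): {"direct": "bež", "oblique": "bež-li", "gen": "s"},
}


def absolutive(language, gloss):
    """The absolutive form is the bare direct stem."""
    return NOUNS[(language, gloss)]["direct"]


def genitive(language, gloss):
    """The genitive suffix is attached to the oblique stem, not the direct stem."""
    entry = NOUNS[(language, gloss)]
    return entry["oblique"] + "-" + entry["gen"]
```

For instance, `genitive("Hunzib", "stable")` concatenates the stored oblique stem bež-li with the suffix -s, whereas `absolutive("Hunzib", "stable")` returns the unsuffixed direct stem.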
3 First Case Study: -ɣo / -ɣa
It is plausible to assume that Proto-Tsezic had a case marker *-ɣo that is still
found in Bezhta and Khwarshi, but has almost been lost in Hinuq and is absent
from Tsez and Hunzib (see Forker 2010b, 2012a for more information on the
spatial case systems of Tsezic languages). In Bezhta, the spatial suffix -ɣa has
the meaning ‘next to, by, at, near’ (1a) (Kibrik & Testelec 2004: 236). It is also
used for the standard of comparison, and occasionally with temporal and
metaphorical meaning. The Khwarshi suffix -ɣo translates as ‘in close contact
with, nearby’ (1b) (Khalilova 2009: 82). In both languages, further directional
cases (lative, ablative, etc.) can be added to the suffix as the Khwarshi example
illustrates.
(1a) Bezhta
buxari-ya-ɣa gäʔä sukʼo=na
flue-obl-next be.neg who=add
‘There is nobody next to the flue.’
(1b) Khwarshi (Khalilova 2009: 82)

y-õk’-un abaxar yuq’uˁč’eɣo-l=in
ii-go-uwpst neighbor(ii) old.woman.apud-lat=add
uq’uˁč’eɣo-l=in
old.man.apud-lat=add
‘The neighbor went to the grandmother and grandfather.’
In Hinuq, the situation is different because -ɣo is poorly integrated and only
occurs in very few spatial and temporal adverbs (Forker 2013: 103). The spatial
adverbs č’ek’k’uzaɣo ‘everywhere’ and seda-ɣo ‘in one place’ are formed from
quantifiers. The temporal adverbs are sebedoɣo ‘in autumn’ (< sebe ‘autumn’)
and aldoɣo ‘formerly, before, in front’. As can be seen in the spatial adverbs the
suffix is added to the oblique stem, i.e., seda is the oblique form of the numeral
hes ‘one’ and -za is an oblique plural stem suffix that has been attached to the
quantifier č’ek’k’u ‘all’.
4 Second Case Study: The Vocative
Hinuq and Khwarshi are the only Tsezic Nakh-Daghestanian languages that
have a vocative suffix -(i)yu. In Hinuq, it is only used with two common nouns
(uži-yu ‘boy-voc’, ked-bi-ža-yu ‘girl-pl-obl.pl-voc’), and it also expresses
affectionateness and endearment (Forker 2013: 433-434). In Khwarshi, it seems
that it is more widely used because it can occur with various types of common
nouns, e.g. cana-yu ‘she.goat.obl-voc’ (Khalilova 2009: 72-73). Neither in
Hinuq nor in Khwarshi can the vocative be suffixed to proper names. In both
languages the vocative is added to the oblique stem of the noun.
5 Third Case Study: The Functive
5.1 The Functive in Tsezic, Avar and Andic Languages

The Tsezic languages as well as Avar and many Andic languages have a suffix
that Creissels (2014a, b) calls a ‘functive marker’. I will start my discussion of
the functive by describing its functions in Hinuq.
The Hinuq functive marker is -ɬun (Forker 2013: 432-433). It is exclusively
added to nominals, including case-marked nouns or pronouns and nominal-
ized verbs or adjectives and specifies the property of accomplishing the role
or manner or function of whatever the hosting nominal denotes, similar to
English ‘as’. More specifically, it occurs in three contexts of use:
i. professions and other social functions (‘as a teacher’, ‘as a friend’, etc.)
(2) Hinuq
hago zoq’ʷe-n toxtor-ɬun
he be-uwpst doctor-func
‘He was a doctor.’
ii. the intellectual activity of regarding a situation in a certain light or aspect
(e.g. ‘consider as’)
(3a) sud-mo-y Malla Rasadan bitʼaraw-ɬun ∅-uː-ho

court-obl-erg Mullah(i) Nasrudin right-func i-do-prs
‘The court gives Mullah Nasrudin right (lit. makes him the one who is right).’
(3b) deče=gozon debez žo r-eqʼi-yon, me ∅-iči

how.much=prt 2sg.dat thing(v) v-know-conc 2sg i-be
r-eqʼi-š-me-ɬun
v-know-ptcp.res-neg-func
‘How much ever things you know, be as if you would not know them.’
iii. adverbial clauses expressing causes
(4) aˁši iše y-iq-oru-ƛ’o-ɬun hayɬoqo
much snow(iv) iv-become-ptcp.pst-spr-func he.at
awariya b-iq-iš
accident(iii) iii-happen-pst
‘Because of the heavy snowfall he had an accident.’
In other Tsezic languages, cognates of the Hinuq suffix are attested (with the
exception of Hunzib for which the relevant information is lacking). It seems
that in all Tsezic languages the functions i and partially ii are found. The
functive suffixes can be added to nouns bearing case suffixes and if they are
added to items without overt case suffixes they are attached to the direct stem
(not to the oblique stem). In Khwarshi, the suffix is -ɬun / -ɬin (Khalilova 2009:
257-258); in Tsez it is -ɬun (5a). In Bezhta, the functive suffix -ɬun is also used
with participles of verbs in a construction with the meaning ‘as if’, e.g. when
people are pretending to act in a certain way (5b). Furthermore, all Tsezic
languages including Hunzib have a postposition sababɬun consisting of the
Avar noun sabab ‘cause, reason’ and the functive suffix. The complex word
sababɬun is also used in Avar, but it is not regarded as a postposition in the
recent Avar grammar by Alekseev et al. (2014).
(5a) Tsez (Abdulaev & Abdullaev 2010)

deber di žek’u-ɬun=ä c’ok’i-x-anu=ƛin
2sg.lat 1sg man-func=q consider-ipfv.cvb-neg=quot
‘Don’t you consider me to be a man?’
(5b) Bezhta
y-egay-ʔeš-ɬun=na ∅-aq-na giɣa
ii-see-ptcp.pst.neg-func=add i-happen-cvb down
ẽxe-ɣa=na gowacʼo-na ∅-ẽƛʼe-š Tʼahir
river-next=add look.i-cvb i-go-prs Tahir(i)
‘Pretending not to see (her) and looking at the river Tahir is walking by.’
Avar has the suffix -ɬun that is used in the same or at least in a very similar
manner as the functive suffixes of the Tsezic languages (6). The suffix is not
explicitly mentioned in the grammars by Charachidzé (1981) and Alekseev
et al. (2014). Ebeling (1966: 72), following an older grammar of Avar written in
Georgian, writes that it “forms the ‘predicative’, meaning ‘as, in the quality of’”
(see Creissels 2014b: 442 for further references).
(6) Avar (Axlakov 1976: 13)

[Go, but don’t convince him with your hand (i.e. with the force), but]
k’ole-b b-at-ani, k’al-zu-l xabar-al-da-ɬun
be.able-n n-find-irr.cond mouth-obl-gen talk-obl-loc-func
b.aq-e!
get.n-imp
‘if you can, get it by talking to him.’
In most of the Andic languages functive suffixes are attested as well, and the
meaning corresponds to the meaning of the cognate suffixes in Avar and
Tsezic. In Karata, Godoberi, Tindi and Botlikh the suffixes are formally very
similar and in one case even identical to the Avar and Tsezic suffixes:
– Karata -ɬe (Creissels 2014b: 444, citing Magomedova & Xalidova 2001)
– Godoberi -ɬu (Saidova 2006: 226)
– Tindi -ɬo (Magomedova 2003)
– Botlikh -ɬun (Creissels 2014b: 444, citing Saidova & Abusov 2012: 109)
Relevant examples are (7a, b).
(7a) Godoberi (Saidova 2006: 226)

den muʕalim-ɬu=da
1sg teacher-func=cop
‘I am a teacher.’
(7b) Tindi (Magomedova 2003)

ihwa-ɬo ħalt’ijaː
herdsman-func worked
‘(He) worked as a herdsman.’
In the languages Northern Akhvakh and Bagvalal, which also belong to the
Andic branch of Nakh-Daghestanian, the functive suffix is formally slightly
different and is followed by a gender agreement marker. The Northern Akhvakh
suffix -ɬ+agreement (+-he/-hi) is described in detail by Creissels (2014a, b). It is
suffixed to nouns, pronouns, adverbs, etc. The agreement is controlled by the
absolutive argument of the clause. The same agreement suffixes are used with
the general converb.
(8) Northern Akhvakh (Creissels 2014b: 441)

di-la hu-be čaka χːirada ʕadati-ɬ-eː harigʷ-ari
1sg-dat dist-n very dear custom-func-adv.n see-pf
‘I considered this a very good custom.’
The Bagvalal suffix -l(h)i is analyzed by Daniel et al. (2001: 191-193) as a verb-
forming suffix that attaches to nouns, adjectives, adverbs, and postpositions.
Like in Akhvakh it is followed by a gender agreement marker, and in the
available examples also by a converbal suffix (9a, b).
(9a) Bagvalal
gurǯija-j hak’uj-li-j-o j-ah-aː
Georgian-f wife-vblz-f-cvb f-take-pot.inf
‘to take a Georgian (woman) as wife’ (Daniel et al. 2001: 193)
(9b) milica-lhi-w-o ħalt’idaː-X ida ow

police-vblz-m-cvb work.ipfv-cvb be he
‘He works as a police officer.’ (Magomedova 2004)
5.2 Creissels’ Analysis

In his analysis of Northern Akhvakh, Creissels (2014a, b) introduces the
term ‘functive’ and compares the Akhvakh suffix -ɬ to essive cases in Uralic
languages such as Hungarian, Estonian, and Finnish. Creissels (2014b) also
makes a proposal concerning the diachronic development of functives in
Tsezic, Avar and Andic. Following Alekseev (1988: 35), he suggests that the
functive originates from a verb *ɬ- ‘become’ inflected with a converbal suffix.
In fact, Avar has a perfective converb suffix -(u)n (Alekseev & Ataev 1998: 62),
but synchronically there is no verb ɬ-ize ‘become’.1 Moreover, in Avar-Andic and
in Tsezic there are a number of verb-forming suffixes that can reasonably be
analyzed as going back to such a verb (Table 2).
1 There is a verb ɬeze ‘put’, but I am not sure about its relation to the functive suffix.
Table 2 Verb-forming suffixes in Avar-Andic and Tsezic
Language   Suffix   Example
Avar       -ɬ       xera- ‘old’ > xer-ɬ-ize ‘grow old’
Akhvakh    -ɬ       ĩk’a ‘large’ > ĩk’a-ɬ-urula ‘enlarge’ (Creissels 2014b: 441)
Godoberi   -ɬ       -eč’uxa ‘big, large’ > -eč’uxa-ɬ-i ‘become large’ (Saidova 2006: 67)
Karata     -ɬ       herk’a ‘big, old’ > herk’ã-ɬ-aɬa ‘become big / old’ (Magomedova & Xalidova 2001: 122)
Bagvalal   -l(h)i   muk’u- ‘small’ > muk’u-li ‘become small, decrease’ (Daniel et al. 2001: 191)
Hinuq      -ɬ       -oč’č’u ‘cold’ > -oč’-iɬ-a ‘become cold’ (Forker 2013: 329)
Khwarshi   -ɬ       c’odora- ‘clever’ > c’odor-ɬ- ‘become clever’ (Khalilova 2009: 266)
Bezhta     -ɬ       -uq’o ‘big, old’ > -uq’-ɬ-al ‘grow, increase’ (Khalilov 1995: 136)
Tsez       -ɬ       c’uda(ni) ‘red’ > c’uda-ɬ-a ‘become red’ (Khalilov 1999: 279)
In the Andic languages, not only adjectives and adverbs/postpositions, but
also nouns serve as the base for the derivation of verbs. In the Tsezic languages,
this seems to be impossible. But Tsez, Hinuq, and Bezhta make use of the same
suffix for the derivation of inchoative and potential verbs from other verbs, e.g.
Hinuq -ac’- ‘eat’ > -ac’eɬ- ‘be able to eat’ (Forker 2013: 328-330). In Khwarshi, the
derivational suffix for potential verbs is -l(l) (Khalilova 2009: 267).
When analyzing the Akhvakh functive Creissels (2014b: 441) discusses the
question whether it could be considered a derivational suffix employed for the
formation of verbs, as has been done by Daniel et al. (2001) for Bagvalal. In
fact, such an approach seems plausible since the gender marker following -ɬ in
the Akhvakh functive marker also constitutes the suffix of the general converb.
Moreover, it is cross-linguistically common to express the functive meaning
through a construction with a verb such as ‘be, become’. However, Creissels
(2014b) provides four counterarguments and at the same time suggests treating
the functive as a case marker. The counterarguments are:
1. it can be added to base words from which verbs normally cannot be
formed, e.g. interrogative pronouns such as ‘what’
2. the nouns to which the suffix is added can be marked for case and/or
number (6)
3. nouns bearing the functive suffix can take modifiers (e.g. genitive
modifiers or adjectives) (8), (10)
4. nouns with the functive suffix serve not only as adjuncts but also as
predicative arguments of verbs such as ‘be’, ‘become’, ‘consider’, etc. (2),
(5a), (10)
However, despite the functional similarity to cases such as the Uralic essive,
I want to argue that it does not seem the best solution to treat the functive as
belonging to the case systems of Tsezic, Avar or Andic. I will concentrate in my
argumentation on the functive in Hinuq because this is the language for which
I have most data, but my first and my second argument equally apply to the
other Tsezic languages.
Why should the Hinuq functive not be considered a case marker? How does
it diverge from the other case suffixes? First of all, it can be added to other
case markers (4). Second, it is suffixed to the direct form of the nominal or
nominalized adjective or verb. For instance, if the resultative participle takes
case suffixes, it must be in its oblique form -za, but, as (3b) illustrates, this
is not the case for the functive. Third, nouns to which the functive is added
do not trigger the use of the second genitive (10). These three properties
are not attested for the case suffixes. Case suffixes are not added to other
case suffixes (though there are complex spatial cases, but they always have as
the second suffix a directional marker). Case suffixes are added to the oblique
stem (though for some nominals oblique and direct stem are identical). All
nominals inflected for any case other than the absolutive trigger the use of
the second genitive (10b). The first genitive is only available for nouns in the
absolutive case.
(10a) Hinuq
hago di ∅-egennu essu-ɬun ∅-iči-ƛo!
he 1sg.gen1 i-young sibling(i)-func i-be-opt
‘May he be my younger brother!’
(10b) dižo essu-y tʼek tʼotʼer-ho

1sg.gen2 sibling-erg book read-prs
‘My sibling reads a book.’
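The three diagnostics applied above can be summarized as a simple checklist. The sketch below is a hedged illustration in Python, not an analysis in its own right: the class `Suffix`, its field names, and the function `patterns_like_case` are hypothetical, and the property values merely restate the claims made in this section for Hinuq -ɬun, with the ergative suffix of essu-y in (10b) standing in as a representative case suffix.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Suffix:
    """Hypothetical record of the three properties discussed for Hinuq."""
    form: str
    stacks_on_case_suffix: bool      # can follow another case suffix, as in (4)
    attaches_to_oblique_stem: bool   # as opposed to the direct stem, cf. (3b)
    triggers_second_genitive: bool   # modifier appears as gen2, cf. (10b)


def patterns_like_case(suffix: Suffix) -> bool:
    """True if the suffix shows all three properties stated for case suffixes."""
    return (
        not suffix.stacks_on_case_suffix
        and suffix.attaches_to_oblique_stem
        and suffix.triggers_second_genitive
    )


# Values restate the text; they are assumptions of this sketch, not new data.
FUNCTIVE = Suffix("-ɬun", stacks_on_case_suffix=True,
                  attaches_to_oblique_stem=False,
                  triggers_second_genitive=False)
ERGATIVE = Suffix("-y", stacks_on_case_suffix=False,
                  attaches_to_oblique_stem=True,
                  triggers_second_genitive=True)
```

On this encoding the functive fails all three checks while the ergative passes them, which is the asymmetry the argument relies on. The parenthetical qualifications in the text (complex spatial cases, nominals whose direct and oblique stems coincide) are deliberately left out of the sketch.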
The similarity of the Hinuq suffix -ɬun with its Avar source suggests that it must
be a more recent borrowing and considerably younger than the verb-deriving
suffix -ɬ. The derivational suffix is used with a wide range of base words
including many native Hinuq words and is rather deeply integrated into the
linguistic system. The derived verbs behave just like simple verbs. Therefore,
Hinuq -ɬun and -ɬ cannot directly be traced back to the same origin, even if this
is probably the case for the Avar equivalents.
6 Fourth Case Study: Tsezic -ɬi
The last case study examines the suffix -ɬi and its use in the Tsezic languages.
This suffix has again been borrowed from Avar. Its functions in Tsezic largely
overlap with its functions in Avar, but there are also some differences between
Avar -ɬi and Tsezic -ɬi as well as among the Tsezic languages. The functions are:
i. Derivation of abstract nouns from other nouns, adjectives, non-finite
verb forms, etc. in all Tsezic languages, e.g. Bezhta gäččö ‘be not’ > gäččöɬi
‘absence’, Tsez halmaɣ ‘friend’ > halmaɣɬi ‘friendship’, Hunzib ɨs ‘brother’
> ɨsɬi ‘family’
ii. Marking complements of verbs of knowledge and cognition in all Tsezic
languages except Hunzib (11a, b) (Forker 2016). In Bezhta, this function
seems to be broader than in the other languages and also includes other
types of subordinate clauses (Kibrik & Testelec 1994: 261).
(11a) Bezhta
[do hoƛoʔ biƛo-ʔ ∅-eče-yo-ɬi] ∅-iqʼe-da . . .
1sg here house-in i-be-ptcp-abst i-know-cond
‘if they knew that I lived here in the house . . .’
(11b) [biƛo r-oː-ro-ɬi] zuq’o-ro kutakalda zaħmatab

house(iv) iv-do-ptcp-abst be-pst strongly difficult
‘Building the house was very difficult.’ (Kibrik & Testelec 1994: 261)
iii. Marking the topic of a conversation or narration in Hinuq and Tsez
(12a, b). In Hinuq, the topic of a conversation can also be marked with the
first genitive, and in Tsez with the cont-Ablative (or some other spatial
cases).
(12a) Tsez (Abdulaev & Abdullaev 2010)

žoyä [babi-ya ɬina-ɬ xizay egäru-ɬi]
boy.erg father-erg what.obl-cont after send.ptcp.pst-abst
esi-n
tell-uwpst
‘The boy told why (lit. after what) the father had sent him.’
(12b) Hinuq
dižo tʼek-mo-za-ɬi r-egi roži eƛi-š
1sg.gen2 book-obl-obl.pl-abst v-good word(v) say-pst
batʼi-batʼiyaw poʔet-za-y=no, ʡalim-za-y=no
various poet-obl.pl-erg=add intellectual-obl.pl-erg=add
‘Various poets and intellectuals said good things about my books.’
iv. Marking X in phrases like ‘X turns into Y’ in Hinuq (13), and apparently
also in Tsez (in the latter language this is done in combination with the
spr-Lative)
(13) Hinuq
haɬu ked-i zon-ɬi b-uː-ho arxi,
that.obl girl-erg refl.obl-abst iii-make-prs ditch(iii)
gulu-za-ɬi r-uː-ho ɬe
horse-obl.pl-abst v-make-prs water(v)
‘The girl turns herself into a ditch and the horses into water.’
v. Marking of nominals that are governed by the postposition ʡolo ‘because
of’ in Hinuq (14) (alternatively, the purposive suffix -ƛi can be used in the
same function)
(14) Hinuq
xexza-ɬi ʡolo eli aƛ-a-do nox-iš
child.obl.pl-abst because.of 1pl village-in-dir come-pst
‘Because of the children we came to the village.’
vi. Formation of a simultaneous converb and of a few spatial adverbs in
Hinuq (Forker 2013: 141, 241, 358-361)
Most of the usages are functionally related because they involve the formation
and the use of abstract nominals, either as genuine parts of the nominal lexicon,
or in a more abstract sense. For example, complements can be analyzed as
nominalized propositions that occur in a position where otherwise nouns
occur. The topic of a conversation or narration is an abstract object that can be
expressed in the form of a nominalized proposition or other item.
Once again we ask the question whether the suffix -ɬi can be analyzed as a
case suffix. If we restrict ourselves to Hinuq, which seems to be the language
with the broadest range of use, the functions iii-v are functions that are typically
associated with (spatial) cases.2 As far as the morphosyntactic side is concerned, in
functions iii-v -ɬi is added to oblique stems of the respective nominals just like
case suffixes (13), (14). Modifiers of the nominal bearing the suffix must appear
in their oblique form or as second genitive (12b). Finally, in functions iii-v the
suffix can be followed by another suffix -zo that is identical to the second geni-
tive and the second ablative. The second ablative belongs to the directional
markers, which are allowed to follow other case suffixes. This might appear
to provide us with a further argument in favor of the analysis of -ɬi as a case
suffix. However, the directional markers only attach to spatial cases expressing
location and not to any other cases, and the functions iii-v do not involve any
unambiguously spatial semantics. Yet Hinuq has a couple of spatial adverbs
such as hayɬi ‘there’, hibayɬi ‘there’, and haɬi ‘here’ that are diachronically com-
plex consisting of stems of demonstrative pronouns and a suffix -ɬi (Forker
2013: 358-361). It seems that it is precisely the suffix that turns the demonstra-
tive pronouns into spatial adverbs. Under such an approach -ɬi can be viewed
as a case suffix that occupies the boundaries of the Hinuq spatial case system.
7 Exploring the Margins of the Hinuq Case System with the Canonical Approach
In order to explore the borderline cases of the Tsezic case systems I adopt the
canonical approach as developed by Corbett and his colleagues (see Corbett
(2005, 2008) and Brown, Chumakina & Corbett (2013) among others). The basic
idea of Canonical Typology is that when investigating a linguistic phenomenon
we should start by modelling a canonical item that represents an idealized
instance of the examined phenomenon. The properties of the canonical item
will help us to classify specific examples with respect to their similarity to
the canonical item. In the following, I take Corbett’s (2008) analysis of case
and apply it to the four linguistic items that have been treated in Sections 4-7,
thereby concentrating on the Hinuq data. In other words, I will clarify to what
extent the following suffixes can be analyzed as belonging to the case system
of Hinuq: the spatial suffix -ɣo, the vocative -(i)yu, the functive -ɬun, and the
abstract suffix -ɬi.
2 Function vi, i.e. the formation of verb forms employed in adverbial clauses, is also attested
with some spatial cases in Hinuq, in particular the spr-essive (Forker 2013: 240-241; 255-257).
7.1 The Non-spatial Cases of Hinuq Viewed through the Eyes of Canonical Typology
Corbett (2008) lists ten criteria that help to distinguish canonical instances
of cases from less canonical ones. Below I briefly present the criteria. In his
analysis, Corbett stresses that he treats case as a morphosyntactic feature. He
also mentions Nakh-Daghestanian local cases as an example of cases expressing
morphosemantic features to which the criteria do not apply. Therefore, I will
largely concentrate on two non-spatial cases of Hinuq, the ergative and the
dative, and show how these cases behave with respect to the criteria.
Criterion 1: Canonical features and their values have a dedicated form.
For instance, the Hinuq dative has a suffix that is formally unique within the
case system (-z). By contrast, the form of the ergative overlaps with the form
of the in-Essive for some nouns (-i) and with the absolutive and partially the
first genitive for personal pronouns (see Forker (2013: 130) for the paradigms).
Criterion 2: Canonical features and their values are uniquely distinguishable
across other logically compatible features and their values. We do not have to
select particular combinations: any of them will serve.
This criterion is met by all Hinuq cases: they are expressed independently of
other categories such as number or gender.
Criteria 3 & 4: Canonical features and their values are distinguished
consistently across relevant word classes and across lexemes within relevant word
classes.
Almost all word classes and all lexemes within the different word classes
can be inflected for all cases in Hinuq and thus the system is close to being
canonical. The ergative represents a minor exception because singular and
plural personal pronouns for the first and second person do not distinguish
between absolutive and ergative (and with plural personal pronouns the first
genitive is also identical to the absolutive/ergative).
Criterion 5: The use of canonical morphosyntactic features and their values
is obligatory.
The use of cases in Hinuq is obligatory and every nominal is treated as bearing
a specific case value. In this sense, criterion 5 is met by all Hinuq cases. Choices
between cases are treated as non-canonical (see Criteria 6 & 7).
Criteria 6 & 7: Canonical use of morphosyntactic features and their values
does not admit syntactic or semantic conditions.
Cases in Hinuq, as well as in the other Nakh-Daghestanian languages, have a
clear and specific semantic load which governs their use. For instance, the use
of the ergative to mark agents is obligatory; it cannot be replaced by any other
case because it is the only case that expresses agentivity. In non-canonical
agent constructions, the at-essive instead of the ergative is used, but the case
marking turns the argument into a non-canonical agent that lacks a number of
agentive properties such as control of his/her actions and volitionality (Forker
2013). Similarly, the dative obligatorily marks experiencers, beneficiaries and
purpose. It also marks recipients, but in this function it alternates with the
at-essive/at-lative. However, the alternation implies clear differences in the
semantics: the dative expresses recipients to whom an object is permanently
transferred whereas the at-essive/at-lative expresses temporary recipients
(Daniel, Khalilova & Molochieva 2010).3
Therefore, Corbett (2008) excluded the spatial cases right from the begin-
ning. However, it is not clear to me whether the just described alternations
between non-spatial and spatial cases are also irrelevant in the view of Corbett
or if they can be counted as non-canonical behavior of the non-spatial cases.
Corbett points out that differential object marking represents an instance of
non-canonical behavior regulated in many languages by the semantic condi-
tion of definiteness. Non-canonical agent marking and recipient marking as
mentioned above are regulated by similar semantic conditions, but, in contrast
to differential object marking in languages such as Turkish, involve an alter-
nation between grammatical and spatial cases and not between two different
grammatical cases. Hinuq has a further construction that might count as non-
canonical behavior of the case system, the biabsolutive construction, in which
the agent occurs in the absolutive instead of the ergative (Forker 2012b, 2013:
522-529). Thus, now we have an alternation between two grammatical cases
that is determined by morphosyntactic properties of the verbal predicate and
by semanto-pragmatic conditions.
In sum, there are no additional syntactic conditions on the use of the Hinuq
cases, but there are a few constructions that might count as semantic condi-
tions leading to a deviation from the canonical case ideal.
3 There are even more semantic restrictions with respect to the spatial cases. For example, cer-
tain place names and microtoponyms can be only inflected for directional cases but not for
the spatial cases expressing locations because of the inherent locational semantics of those
nominals. For nouns with animate referents it is difficult if not impossible to find a context
of use for spatial cases such as the in-directional. But these restrictions are purely semantic
and do not concern the morphology or the syntax.
Criteria 8 & 9: Canonical use of morphosyntactic features and their values does
not admit additional lexical conditions from the target (governee) or from the
controller (governor). The controller has a single requirement (e.g. it governs the
dative).
There are (almost) no special lexical conditions on the side of targets, e.g. all
nouns take the same case if the same meaning needs to be expressed. There
are occasional cases of idiosyncrasy (e.g. the noun zoro ‘barn’ is inflected for
the sub-essive that normally means ‘under’ to express the meaning ‘in the
barn’), but they perhaps have a semantic explanation. From the side of the
controllers there are also (almost) no lexical conditions, e.g. all transitive verbs
require the agent to be marked with the ergative, and all affective verbs require
the experiencer to be marked with the dative. However, the affective verb -aši-
‘find’ additionally allows for the at-essive to occur on the experiencer and this
might be considered as an additional construction, but again it is a variation
between a grammatical case and a spatial case.
Criterion 10: The use of canonical morphosyntactic features and their values
is sufficient (they are independent).
This criterion is fully met in Hinuq because the non-spatial cases do not permit
additional markers such as postpositions.4
7.2 The Borderline Cases

Table 3 summarizes the behavior of the Hinuq ergative, dative, vocative
(Section 5), functive (Section 6) and of the suffix -ɬi in the functions iii to v. The
suffix -ɣo (Section 4) has been excluded since it is clearly an old spatial case
and spatial cases were regarded by Corbett as expressing morphosemantic and
not so much morphosyntactic features. Table 3 displays how well the cases and
case-like morphemes fulfill the ten criteria. At the bottom the table indicates
if the morphemes require the nominal stem to occur in the oblique form and
if modifiers such as demonstrative pronouns also need to occur in the oblique
form. These two properties are specific to the Tsezic case system and are
therefore also considered.
4 The spatial cases allow them, but the use of postpositions with spatial cases is mostly
optional.
Table 3 Comparing non-spatial cases and case-like morphemes

Criteria          Ergative    Dative   Vocative   Functive    -ɬi (functions iii-v)
1                 partially   yes      yes        yes         yes
2                 yes         yes      yes        yes         yes
3&4               partially   yes      no         yes         yes
5                 yes         yes*     no         partially   partially
6&7               yes*        yes*     yes        yes         yes
8&9               yes         yes*     no         yes         yes
10                yes         yes      yes        yes         no
obl stem          yes         yes      yes        no          yes
obl modifier(s)   yes         yes      ?          no          yes

The ergative behaves relatively canonically except for its partial formal overlap
with the absolutive and the in-essive and its alternation with the absolutive
and the at-essive in non-canonical agent constructions (indicated by * in
Table 3). The dative is even more canonical since it has a unique ending. Again
the only instances that might count as non-canonical behavior are alternations
with certain spatial cases. The vocative is the least canonical nominal marker.
Its use is not obligatory because the absolutive can be used instead, and it
occurs only with a very limited number of nouns. However, it is suffixed to
the oblique stem. With respect to the criteria 1-10, the functive is as canonical
as the ergative or the dative. The only non-canonical feature is found with
respect to criterion 5 because the functive can occasionally be replaced with
the absolutive, e.g. compare (15) with example (2) above.
(15) Hinuq
Abdukarim ħaži Buynaksk šahar-mo-s imam goɬ
Abdukarim Gadzhi Buynaksk town-obl-gen1 imam be
‘Abdukarim Gadzhi is the imam of the town of Buynaksk.’
Thus, Creissels’ (2014b: 442-443) critique of excluding the functive from the
case inventory, as is done in all grammars of Tsezic languages (as well as in
the grammars of Avar and Andic languages), is surely justified. Yet the functive
does not conform to all other cases when it comes to the specific property of requiring
nominals to occur with the oblique stems and modifiers to appear in their
oblique form. Moreover, as mentioned in Section 6, the functive can be added to
other case markers, and such a behavior is not typical for genuine case suffixes.
Finally, the suffix -ɬi is slightly less canonical than the functive because
functions iii and v can also be fulfilled by other markers (criterion 5), and
the suffix can optionally be followed by another suffix, i.e., it permits additional
markers (criterion 10). But in contrast to the functive it behaves well with
regard to oblique stems.
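
The comparison in Table 3 can also be read off mechanically. The following is only a rough sketch in Python: the cell values are copied from Table 3, but the numeric weights (1 for "yes" and "yes*", 0.5 for "partially", 0 for "no") and the function names are illustrative assumptions that are not part of Corbett's (2008) criteria or of the analysis presented here. The two rows on oblique stems and oblique modifiers are left out because they are not among the ten criteria.

```python
# Cell values copied from Table 3 (criteria 1-10 only).
# The numeric weights below are an illustrative assumption, not a proposal
# made in the chapter or in Corbett (2008).
CRITERIA = ["1", "2", "3&4", "5", "6&7", "8&9", "10"]

TABLE_3 = {
    "ergative": ["partially", "yes", "partially", "yes", "yes*", "yes", "yes"],
    "dative":   ["yes", "yes", "yes", "yes*", "yes*", "yes*", "yes"],
    "vocative": ["yes", "yes", "no", "no", "yes", "no", "yes"],
    "functive": ["yes", "yes", "yes", "partially", "yes", "yes", "yes"],
    "-ɬi":      ["yes", "yes", "yes", "partially", "yes", "yes", "no"],
}

# Hypothetical weights: "yes*" (canonical apart from alternations with
# other cases) is counted like a plain "yes".
WEIGHTS = {"yes": 1.0, "yes*": 1.0, "partially": 0.5, "no": 0.0}


def canonicity_score(values):
    """Sum the assumed weights over the criteria rows for one marker."""
    return sum(WEIGHTS[value] for value in values)


def rank_markers(table):
    """Return marker names ordered from highest to lowest assumed score."""
    return sorted(table, key=lambda marker: canonicity_score(table[marker]),
                  reverse=True)


if __name__ == "__main__":
    for marker in rank_markers(TABLE_3):
        print(marker, canonicity_score(TABLE_3[marker]))
```

Under these assumed weights the vocative receives the lowest score and -ɬi scores slightly below the functive, which agrees with the qualitative assessment given in the text. Other weightings are equally defensible and could shift the relative position of the ergative and the dative.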
8 Conclusion
In this paper, I investigated borderline cases of the Tsezic case systems that
resemble core cases, but also deviate from them in various ways. This was
done by applying the canonical approach as illustrated by Corbett’s (2008) ten
criteria for canonical case markers. The examined nominal markers can be
ordered along a scale of canonicity from being more canonical (functive, -ɬi) to
less canonical (vocative).
The Tsezic case systems are far from being homogenous because of the divi-
sion into non-spatial and spatial cases with concomitant differences in their
formal and functional properties. They lose members (e.g. the spatial suffix -ɣo
that was described in Section 4; certain spatial case combinations also seem to have
fallen out of use in Bezhta, see Forker 2012a, b). But they also gain new
members. The functive and the abstract suffix -ɬi are both borrowings from
Avar that are functional equivalents of cases in other languages, and the latter
suffix in Hinuq also shares some formal properties with the core case suffixes.
Abbreviations
i-v: genders i-v, abst: abstract suffix, add: additive, adv: adverbial, apud:
apud-essive, at: location ‘at, by, near’, conc: concessive, cond: conditional,
cont: cont-essive, cop: copula, cvb: converb, dat: dative, dir: directional,
dist: distal, erg: ergative, f: feminine, func: functive, gen: genitive, imp:
imperative, in: in-essive, inf: infinitive, ipfv: imperfective, irr: irrealis,
lat: lative, loc: locative, n: neuter, neg: negation, next: location ‘next to’,
obl: oblique stem, opt: optative, pf: perfective, pl: plural, pot: potential,
prs: present, prt: particle, pst: past, ptcp: participle, q: question marker,
quot: quotative, res: resultative, sg: singular, spr: location ‘on’, uwpst:
unwitnessed past, vblz: verbalizer
References
Abdulaev, Arsen K. & I. K. Abdullaev. 2010. Didojskij (cezskij) fol’klor. Makhachkala:
Lotos.
Alekseev, Mixail E. 1988. Sravnitel’no-istoričeskaja morfologija avaro-andijskix jazykov.
Moscow: Nauka.
Alekseev, Mixail E., S. Z. Alixanov, Boris M. Ataev, M. A. Magomedov, I. I. Magomodov,
G. I. Madieva, Patimat A. Saidova & Dzh. S. Samedov. 2014. Sovremennyj avarskij
jazyk. Makhachkala: Aleph.
Alekseev, Mixail E. & Boris M. Ataev. 1998. Avarskij jazyk. Moscow: Academia.
Axlakov, A. A. 1976. Avarskie teksty. In A. A. Axlakov & Kh M. Khalilov (eds.), Satir i
jumor narodov Dagestana, 7-42. Makhachkala: Daginogoizdat.
Brown, Dunstan, Marina Chumakina & Greville G. Corbett (eds.) 2013. Canonical
morphology and syntax. Oxford: Oxford University Press.
Charachidzé, Georges. 1981. Grammaire de la langue avar (langue du Caucase Nord-
Est). Paris: Farvard.
Corbett, Greville G. 2005. The canonical approach in typology. In Zygmunt Frajzyngier,
Adam Hodges & David S. Rood (eds.), Linguistic diversity and language theories,
25-49. Amsterdam: Benjamins.
Corbett, Greville G. 2008. Determining morphosyntactic feature values: the case of
case. In Greville G. Corbett & Michael Noonan (eds.), Case and grammatical rela-
tions, 1-34. Amsterdam: Benjamins.
Creissels, Denis. 2014a. Functive phrases in typological and diachronic perspective.
Studies in Language 38. 605-647.
Creissels, Denis. 2014b. Functive-transformative marking in Akhvakh and other
Caucasian languages. In Michael Daniel, Vladimir A. Plungian & Ekatarina A.
Ljutikova (eds.), Jazyk. Konstanty. Peremennye: Pamjati Aleksandra Evgen’eviča
Kibrika, 430-449. Sankt-Peterburg: Aletejja.
Daniel, Michael, Nina Dobrushina, Tat’jana Sosenskaja & Sergei Tatevosov. 2001.
Derivacionnaja morfologija. In Aleksandr E. Kibrik (ed.), Bagvalinskij jazyk, 186-197.
Moscow: Nasledie.
Daniel, Michael, Zaira Khalilova & Zarina Molochieva. 2010. Ditransitive constructions
in East Caucasian: A family overview. In Andrej Malchukov, Martin Haspelmath
& Bernard Comrie (eds.), Studies in ditransitive constructions, 277-315. Berlin: De
Gruyter.
Ebeling, Carl L. 1966. The grammar of Literary Avar: Review of Chikobava and
Cercvadze’s ‘The Grammar of Literary Avar’. Studia Caucasica 2. 58-100.
Forker, Diana. 2010a. Nonlocal uses of local cases in the Tsezic languages. Linguistics
48. 1083-1109.
Forker, Diana. 2010b. Variation in stem formation in Tsezic languages. Suvremena
lingvistika 69. 1-19.
Forker, Diana. 2012a. Spatial relations in Hinuq and Bezhta. In Luna Filipović & Kasia
M. Jaszczolt (eds.), Space and time across languages, disciplines, and cultures: Volume
I. Linguistic diversity, 15-34. Amsterdam: John Benjamins.
Forker, Diana. 2012b. The bi-absolutive construction in Nakh-Daghestanian. Folia
Linguistica 46. 75-108.
Forker, Diana. 2013. A grammar of Hinuq. Berlin: De Gruyter.
Forker, Diana. 2016. Complementizers in Hinuq. In Kasper Boye & Petar Kehayov (eds.),
Semantic functions of complementizers in European languages. Berlin: De Gruyter.
Khalilov, Madžid Š. 1995. Bežtinsko-russkij slovar’. Makhachkala: Institut JaLI DNC RAN.
Khalilov, Madžid Š. 1999. Cezsko-russkij slovar’. Makhachkala: Institut JaLI DNC RAN.
Khalilova, Zaira. 2009. A grammar of Khwarshi. Utrecht: LOT.
Kibrik, Aleksandr E. 1997. Beyond subject and object: towards a comprehensive rela-
tional typology. Linguistic Typology 1. 279-346.
Kibrik, Aleksandr E. & Jakov G. Testelec. 2004. Bezhta. In Michael Job (ed.), The indig-
enous languages of the Caucasus, vol. 3: The North East Caucasian languages, part 1,
217-295. Ann Arbor: Caravan Books.
Magomedova, Patimat T. 2003. Tindinsko-russkij slovar’. Makhachkala: Institut JaLI
DNC RAN.
Magomedova, Patimat T. 2004. Bagvalinsko-russkij slovar’. Makhachkala: Institut JaLI
DNC RAN.
Magomedova, Patimat T. & Rašidat Š. Xalidova. 2001. Karatinsko-russkij slovar’. Sankt-
Peterburg: Scriptorium.
Saidova, Patimat A. 2006. Godoberinsko-russkij slovar’. Makhachkala: Institut JaLI DNC
RAN.
Saidova, Patimat A. & Magomed G. Abusov. 2012. Botlixsko-russkij slovar’. Makhachkala:
Institut JaLI DNC RAN.
Chapter 6
Language Endangerment in the Balkans with Some Comparisons to the Caucasus
Victor A. Friedman
1 Introduction
In this article, I would like to offer a comparative perspective on language
endangerment in the Balkans and the Caucasus. My focus here will be more
on the Balkans than the Caucasus, as the former has received less attention
in this regard than the latter.1 In a sense, we can view the Balkans and the
Caucasus as two prongs of southeastern Europe, both of which have been
long-time zones of contact between Christian and Muslim polities. To be sure,
the phrase southeastern Europe usually conjures up images of the Balkans but
not the Caucasus, and the relationship of the Caucasus to Europe remains
ideologically unstable.2 Nonetheless, the western and eastern shores of the
Black Sea (the west being a peninsula defined by the Adriatic and the east
being an isthmus defined by the Caspian) do share aspects of the ecologies
of language endangerment. In addition to being zones of contact between
Muslim and Christian Empires, the Balkans and the Caucasus also share
certain marginalized minorities, viz. Gypsies and Jews, where in each case
the linguistic fates and adaptations are also worthy of comparison. In this
article, therefore, my goal will be to draw attention to the types of language
endangerment occurring in the Balkans and to make appropriate comparisons
with the Caucasus.
1 On the situation in the Caucasus, see Friedman (2010). There is no literature to speak of
on language endangerment in the Balkans beyond the third edition of the UNESCO atlas
(Moseley 2010).
2 A clear example of this instability occurs in Webster’s Geographical Dictionary, where the
political border, i.e. the northern borders of Iran and Turkey, defines the southern border of
Europe in the entry for Asia, but the Caucasian ridge defines the boundary between Europe
and Asia in the entry for Europe (Bethel 1949: 74, 347).

2 Balkans and Caucasus Compared
Among the language-ecological commonalities between the Balkans and the
Caucasus in the twentieth century, we can identify the effects of forced
migration (including genocide), colonization, marginalization, economic
transformation (including urbanization, economic migration, and shifts
in traditional market patterns), and the rise of new polities.3 One way of
combating endangerment is to be found in literacy practices, but this is not
in and of itself a simple or sufficient solution. It is now well understood that
language preservation cannot be divorced from social context (e.g., Mühlhäusler
2002). At the same time, however, in the context of symbolic capital in societies
that are already deeply embedded in modern nation-states, literacy becomes
important (cf. Friedman 2012a).
In order to set this discussion in the present context, a look at history and
the concept of indigeneity is useful. For most of the world’s endangered languages,
indigeneity is part of the discourse involved in the exercising of
rights, including, among others, language rights.
2.1 Indigeneity in the Caucasus

In the case of the Caucasus, the indigenous argument is straightforward in
so far as the South (Kartvelian), Northwest (Abkhaz-Adyghe) and Northeast
(Nakh-Daghestanian) families are concerned. Regardless of the prehistory
that accounts for current distributions, the antiquity of the three Caucasian
language families in the Caucasus is beyond dispute, just as is the antiquity of
Basque in Europe. Nonetheless, such a criterion leaves out many languages that
have been spoken in—or achieved their formation in—the Caucasus despite
being obvious subsequent but nonetheless ancient arrivals, viz. various Indo-
European, Turkic, and Semitic languages such as Armenian, Ossetic, Karachay-
Balkar, Kumyk, Assyrian, etc.
2.2 Indigeneity in the Balkans

In the case of the Balkans, all of the languages are either Indo-European or
Turkic, and none should therefore claim indigeneity in the sense of the
Caucasian language families. Nonetheless, those Indo-European languages
descended from languages spoken in the Balkans longer than others use
3 The literature on these phenomena is too vast to cite, but among the relevant works dealing
with events that are less well known we can note particularly Ladas (1932), Üngör (2011), and
Richmond (2013).
precisely the “autochthonous” argument as part of their justification for what
at times can be viewed as ideologically colonial practices (cf. Friedman 2016). In
addition to Albanian and Greek, however, most of the languages of the Balkans
took shape precisely in the Balkans, and, like Albanian and Greek, arrived from
outside the peninsula and replaced the languages already spoken there, among
them Aromanian, Meglenoromanian, Romani, the Balkan dialects of Judezmo,
all of the Balkan Slavic languages and dialects, Gagauz, and West Rumelian
Turkish, as well as Arvanitika, and Tsakonian in the Albanic and Hellenic
branches, respectively (Friedman and Joseph Forthcoming).
If we take Bugarski’s (1992: 10) definition of ‘autochthonous’ in the former
Yugoslav context, i.e. spoken natively on the territory of the relevant polity for
at least a century, and if we extend that measure back by an extra five hundred
years or a millennium, we arrive at a concept of ‘indigenous’ that adequately
accounts for the Balkan languages in the sense of the Balkan Sprachbund as
well as for languages specific to the Caucasus, and that includes languages descended
from later arrivals. We can thus say that in this extended sense, indigeneity in
the Balkans and the Caucasus can be defined in terms of the centuries-long
language contact that creates a linguistic area (cf. Friedman 2010b, Comrie
2007). As we shall see, competing claims to relative ‘indigeneity’ in the Balkan
context are used to justify policies that lead to the extermination of languages
that have been spoken on a given territory for centuries and even millennia.
3 Imperial Contexts
Turning for a moment to the competing empires, and beginning with the
early modern period, it was the Ottoman, Safavid, and Russian empires that
impacted the Caucasus and the Ottoman, Russian, and Austro-Hungarian that
impacted the Balkans. On a scale of linguicidal practices, Muslim empires
were, at least until the early twentieth century, relatively benign, the Austro-
Hungarian Empire had competing destructive and protective tendencies
(cf. Gal 2015), while the Russian Empire has been linguicidal (cf. Wixman 1980).
In both the Balkans and the Caucasus, ideologies of totalizing, mono-ethnic
nation-states that had their origins in Western Europe worked to destroy
traditional multilingualisms, albeit through different mechanisms, and with
different results in different locales.
4 Stateless Languages in the Balkans
Among the Balkan languages, all of those without titular nation-states suffer
from some degree of endangerment ranging from moribund to ecological
fragility.
4.1 Judezmo
Of the moribund languages, the worst affected is Judezmo, the majority of
whose speakers were murdered by the Nazis and their collaborators during
World War Two. Unlike Yiddish, which has seen a genuine revival in recent
years owing in large part to the spread of Lubavitcher Hasidism, now known as
Chabad, Judezmo is retreating. In Macedonia, by 2008 there were between five
and ten fluent speakers left, all survivors of World War Two. Their numbers have
declined since then, and although there is still a handful of semi-speakers among
their children, prospects for the future are non-existent as of this writing. The
situation is similar, albeit not as dire, in places such as Sofia, Salonika, Sarajevo,
and Bucharest, as well as Istanbul, although there are some younger activists
in Istanbul. The communities in the USA, Canada, and Israel consist, for the most
part, of members of older generations. In the Caucasus, several hundred Juhuro
speakers were killed by Nazi occupiers, but for the most part the community
survived, although today its dispersal in Israel, the US, and Russia leaves the
language endangered in Daghestan and Azerbaijan (cf. Altschuler 1990). We can
mention here in passing both Judeo-Greek (Romaniote) and Judeo-Georgian
(Q’ivruli), which are basically ethnolects of their respective languages. In both
cases, the number of speakers is dwindling owing to the Holocaust in the case
of the former and migration to Israel in the case of the latter.
4.2 Meglenoromanian
The other most highly endangered language in the Balkans is Meglenoromanian,
which was originally limited to a cluster of villages on and near Mt. Pajak, in a
region that is today divided between the Hellenic Republic and the Republic
of Macedonia. By the time this territory was ceded by Turkey to Serbia and
Greece in 1913, the language was limited to eleven villages: three ended up on the Serbian
side, later the Republic of Macedonia, and eight were on the Greek side. Of
the eight that were ceded to Greece, however, the largest, Nãte/Nãnte, was
Muslim, and so all its inhabitants were sent to Turkey in the 1923 exchange of
populations mandated by the Treaty of Lausanne. There they disappeared from
view until being rediscovered in Turkish Thrace, independently by Turkish
and by Austrian and Turkish ethnologists working in the region (Kahl 2006,
Kurtişoğlu and Kurtişoğlu 2012). But the modern Nãntintsi preserve only a few
Meglenoromanian songs and dances and the memory of where their forebears
came from. Meanwhile, the language has gone extinct in two other villages
(Atanasov 1990, 2002)—one on each side of the border—leaving a total of
seven villages. Prior to the Balkan Wars, Macedonian was the primary contact
language for all Meglenoromanian villages, but on the Greek side Macedonian
was outlawed in the 1930s and practically all the Macedonian-speaking villages
in the Meglen were emptied and/or destroyed at the end of the 1940s. Thus, at
present Meglenoromanian is assimilating to Greek in Greece but to Macedonian
in Macedonia. The future of Meglenoromanian is particularly grim since 1) in
Romania it is considered a dialect of Romanian (Rusu 1984), 2) in Greece all
citizens of Greece are Greeks and minority languages are actively discouraged
(Friedman 2012a), and 3) in Macedonia there is no support for Meglenoromanian
maintenance and the speakers are counted with Aromanians (Friedman 2001).
4.3 Aromanian
Aromanian is in a distinctly different situation from Meglenoromanian in that
there are still hundreds of thousands of speakers. They live in northern Greece,
southern Albania, the Republic of Macedonia, and southwestern Bulgaria.
Thousands also emigrated to Romania in the wake of the Balkan Wars and
World War One. In Romania, Aromanian immigrants were settled mostly
around Tulcea in Dobrudja, which at the time was the least Romanian and
most ethnically mixed region of Romania and had only recently been assigned
to Romania. The situation can be compared to the settlement of Pontic and
Cappadocian Greeks in northern Greece, i.e., Greek Macedonia, after the 1923
exchange of populations.4 More or less officially, however, Romania considers
Aromanian to be a dialect of Romanian and it thus receives no support aside
from the occasional folklore production. In the countries where Aromanian
originated, it is only in the Republic of Macedonia, which has one of the
smallest numbers of Aromanian speakers, that the language has any sort of
official recognition and support. In Greece, the language is mostly limited to
4 Pontic and Cappadocian, the Hellenic languages spoken by Christians expelled from
Turkey in the population exchanges mandated by the 1923 Treaty of Lausanne and, in the
case of Pontic, also spoken in the Soviet Caucasus until the dissolution of the USSR, when the
speakers left for Greece, could be considered independent Hellenic languages (like Griko in
Italy), but owing to Greek language policy, they have been driven to the brink of extinction
upon being relocated to Greece. Tsakonian differs historically from these in that it is in many
respects descended directly from Doric Greek rather than the Hellenistic Koine that is the
source of all other Hellenic languages. Nonetheless, the situation for Tsakonian is likewise
precarious.
speakers over 40, although in some more isolated villages, as elsewhere in the
original Aromanian territory, the language is still being passed on to children
(cf. Bara 2005). Despite official support for Aromanian in the Republic of
Macedonia, however, numbers of speakers are declining, and the language
remains vital mostly in isolated villages. There is also a significant Aromanian
diaspora in Western Europe and North America, but Constantin Belamaci’s
(1888, cited in Balamaci 1987) rage remains a forlorn call, and Hadži Daniil’s
(1802, reproduced in Ninčev 1977) poem continues as Greek policy.5
4.4 Romani
Of the remaining stateless languages, Romani has some speakers in the
Caucasus, but, perhaps most significantly, its apparent closest relatives—
Lomavren and Domari—are also spoken in the Caucasus (Marushiakova and
Popov 2014). This region is thus unique in having—or at least having had—
representatives of all three of these diasporic Indic languages, of which Romani
took its definite shape in the Byzantine Empire, Domari (also called Karachi,
Garbet, Zutt, Nawar, etc.) in the Middle East, and Lomavren (also called Bosha,
5 Constantin Belamaci was the author of an Aromanian poem entitled Parinteascã dimandare
‘Paternal commandment’ that became the Aromanian anthem. The verse cited here gives a
sense of the content:
Cari-shi alasã limba a lui For whomever leaves his language:
S’lu ardã pira focului Let him be burned by flame
Si s-dirinã yiu pri loc Let him be destroyed alive where he stands
Si lli si frigã limba n-foc. Let his tongue be burned in flame.
In contrast, Hadži Daniil of Moschopole (modern Voskopoja, Albania) was a Hellenized
Aromanian who published a quadrilingual phrasebook the express purpose of which was
the extinction of Balkan languages other than Greek, as seen in the opening lines of his
introduction (reproduced in Ninčev 1977:83), given here in original and transliterated Greek and
in Wace and Thompson’s (1913:6) English verse translation.
Ἀλβανοὶ, Βλάχοι, Βούλγαροι, Ἀλλόγλωσσοι χαρῆτε,
Κ᾽ἑτοιμασθῆτε ὅλοι σας Ῥωμαῖοι νὰ γενῆτε.
Βαρβαρικὴν ἀφήνοντες γλῶσσαν, φωνὴν καὶ ἢθη,
Ὁποῦ στοὺς Ἀπογόνους σας νὰ φαίνωνται σὰν μῦθοι.
Albanoi, Vlakhoi, Voulgaroi, Alloglōssoi kharēte,
K’etoimasthēte oloi sas Rōmaioi na genēte.
Varvarikēn aphēnontes glōssan, fōnēn kai ēthē,
Opou stous Apogonous sas na phainōntai san mythoi.
Albanians, Bulgars, Vlachs and all who now do speak
An alien tongue rejoice, prepare to make you Greek,
Change your barbaric tongue, your customs rude forego,
So that as byegone myths your children may them know.
Posha, etc.) in Armenia. Etymologically, it is clear that all three autonyms—Rom,
Dom, Lom—are derived from the same Indic caste name ḍom, but the languages
differ in such fundamental respects that Matras (2012), quite reasonably, doubts
that they necessarily ever constituted a single speech community, although it
is equally clear that at times their contact was sufficiently intimate for them
to have shared innovations. The three languages also occupy three distinct
places on the cline of endangerment. Romani is ecologically fragile: it remains
vital as long as its speakers suffer a social marginalization that is as effective
as the physical isolation of a mountain village, this latter being what protects
ecologically fragile languages in, e.g., Daghestan. With advances in social
equality, however, strategies for language preservation become necessary. A
number of these are being deployed in various contexts, and the need for them
continues. Domari speakers are in the final stages of shift to majority contact
languages. Predication is still possible, but whole sections of the grammar have
already shifted. The Lomavren recorded by Patkanov (1887) appears to have
been where Domari is now (cf. Matras 2012). At present, Lomavren, like Anglo-
Romani, is only a lexicon (see Friedman 2011).
4.5 Caucasian Languages in the Balkans: Circassian, Armenian, Georgian

Aside from Romani, other languages that are or were shared by the Balkans
and the Caucasus were Circassian (West Circassian [Adyghe] and East
Circassian [Kabardian]), Armenian, and Georgian. According to Kănčov (1900:
116), the majority of Circassians that came to the Balkans did so as refugees
from the expanding Russian Empire in 1864. They were settled for the most
part in the Ottoman vilayets of Selânik and Üsküp/Kosova. With the Ottoman
loss of that territory in the Balkan Wars of 1912-1913, followed by World War One,
followed by the forced migrations of the 1920s (cf. Ladas 1932), the majority of
Balkan Circassians left for or were forced to move to what became the Turkish
Republic. By the late twentieth century, there was one Circassian-speaking
village left in the Balkans, Čerkes kjoj, about ten kilometers from Prishtina
(Kurmel 1994). These last speakers of Balkan Circassian were evacuated to the
Adyghe Republic during the NATO bombing of 1999.6
Two other languages of the Caucasus with a presence in the Balkans are
Georgian and Armenian. A Georgian monastery (P’et’ric’on) was founded at
Bačkovo in the Bulgarian Rhodopes in the eleventh century (construction
6 The only datum we have is Besirov and Tlebzu (1981), who mention the replacement of the
ergative by the instrumental. We can speculate that this is a result of Serbian influence.
Kănčov (1900: 116, 178, 215) gives locations and statistics for Circassians in Macedonia, but
their dialect is completely lost to us.
completed 1083), when the Georgian province of Tao and Bulgaria were both
part of the Byzantine Empire. Two inscriptions and old copies (thirteenth century)
of the monastery’s tipikon survive (Šanidze 1971).
4.6 Minority Languages and Dialects Related to Nation-state Languages

When we turn to Balkan languages that are related to Balkan nation-state
languages but are spoken in polities where the dominant language belongs to
a different linguistic group, we see a fairly consistent pattern of endangerment.
Arvanitika, which separated from the main body of Albanian and migrated
to what is today Greece about a millennium or so ago, has been pushed to
the brink of extinction in the course of the past 60 years or so. Of the Slavic
languages/dialects spoken in non-Slavic polities, Goran is endangered in
Kosovo by Bosnian and in Albania by Albanian, although the relative isolation
of the villages is impeding the process. The isolated Macedonian dialect of
Boboshtica in the Korcha region of Albania, which speakers themselves called
Kajnas ‘like us’, is now a linguistic tourist attraction performed for visiting
foreign linguists by a single old woman. Pomak in Bulgaria is Bulgarian, while
in Greece, since all its speakers there are Muslim, it is giving way to Turkish.
Christian speakers of these Rhodopian dialects consider
themselves and their language to be Bulgarian, whereas Muslim speakers do
not identify as ethnic Bulgarians. The Greek government has supported this
situation by allowing Pomak teaching materials (e.g., Kokkas 2004), which
are vigorously opposed by the Turkish elites in Greece (cf. Steinke and Voss
2007).7 In Macedonia, West Rumelian Turkish remained quite vital as a home
language despite schooling in standard Turkish, but at present it is in retreat
owing to greater freedom of communication. Gagauz has official support in
the Republic of Moldova, but not elsewhere, where it is in retreat.
In some ways, researching endangered languages in Greece is like researching
them in a war zone, because the Greek government and its supporters pretend
that Greece’s minority languages constitute a threat to its national security
(see Friedman 2012a). The irony is that it is Greece that has contributed to the
Republic of Macedonia’s insecurity, and yet, at the same time, endangered language
documentation in the Republic of Macedonia is unproblematic.
7 In linguistic terms, Rhodopian dialects are very distinctive, albeit part of the Balkan Slavic
dialect continuum that includes Bulgarian, Macedonian, Goran, Torlak, etc. The singling out
of Torlak as a separate language in Mosley (2010), while understandable in the context of the
former Serbo-Croatian, is more problematic in the context of Balkan Slavic.
5 Conclusion
I would like to conclude with some thoughts on language endangerment as
it relates to the Balkans and the Caucasus. In the case of the Caucasus, the
linguistic diversity, typological specificity, and ecological fragility are all
obvious arguments for attention. And there are obvious scientific reasons
for prioritizing the documentation of typologically unusual or historically
relatively unconnected languages. The sense of urgency is profound, and the
need for resources—both human and monetary—is undeniable. At the same
time, however, the larger intellectual justifications for rescue and revitalization
include social issues of equity and human well-being that are equally relevant
to minority languages under threat in the Balkans. To this can be added the
historical argument that the endangered languages and dialects of the Balkans
contain historical and cultural information that will be lost when they are. And
owing to the centuries of stable multilingualism, these languages and dialects
have much to tell us about language contact as well (see, e.g., Friedman 1994 on
the admirative in the Aromanian dialect spoken by the Frasheriote—but not
Mbaliote—villagers from Bela di suprã [Gorna Belica, Macedonia]). There is
also a certain historical irony in the fact that in many parts of the world, where
Europe’s settlement colonies have drastically reduced linguistic diversity, the
heirs of these colonists have some recognition of the need for trying to reverse
or mitigate the harm that has been done, but in Europe itself, especially in
the southeast, there is little to no such recognition, and in some places there is even
resistance to, and outright harassment of, efforts at documentation, let
alone revitalization. This, too, lends a certain sense of urgency to the argument.
Acknowledgments
This article draws on more than forty years of my own field work in the Balkans
and the Caucasus. I wish to acknowledge support from fellowships from the
following US granting agencies: American Council of Learned Societies for
a Fellowship in East European Studies (1986) and (2000-01) financed in part
by the National Endowment for the Humanities and the Ford Foundation,
International Research and Exchanges Board for travel grants to Macedonia
in 1991 and 1992, National Endowment for the Humanities (2001, Reference
FA-36517-01), and Fulbright-Hays Post-Doctoral Fellowship from the U.S.
Department of Education as well as a fellowship from the John Simon
Guggenheim Memorial Foundation in 2008-2009, American Council of
Learned Societies for a Fellowship in East European Studies with support from
the National Endowment for the Humanities and the Social Science Research
Council (2012-2013), an American Councils for International Education (ACTR/
ACCELS) Title VIII Research Fellowship with support from the U.S. Department
of State, Title VIII Program for Research and Training on Eastern Europe and
Eurasia (Independent States of the former Soviet Union) (2012). Some of the
research reported here from other sources was conducted while I was an
honorary visitor at the Center for Research on Language Diversity at La Trobe
University. The opinions expressed herein are entirely my own.
References
Adamou, Evangelia. 2008. “Armenian.” In Evangelia Adamou (ed.), Le Patrimoine plurilingue
de la Grèce (Le nom des langues II), 71-76. Leuven: Peeters.
Altshuler, M. 1990. Yehudei Mizraħ Kavkaz. Jerusalem: Hebrew University.
Atanasov, Petar. 1990. Le mégléno-roumain de nos jours. (Balkan-Archiv Neue Folge—
Beiheft 8). Hamburg: Buske.
Atanasov, Petar. 2002. Meglenoromâna astăzi. Bucharest: Romanian Academy.
Bara, Maria. 2005. Južnoaromunskij govor sela Turia (Pind). Munich: Biblion.
Balamaci, Nicolas S. 1987. A New Tone for Our Cultural Discussions. The Newsletter of
the Society Farsarotul 1(2)[July]. 11-12.
Bethel, John P. (gen. ed.) 1949. Webster’s Geographical Dictionary. Springfield, MA:
G. & C. Merriam Co.
Bugarski, Ranko. 1992. Language in Yugoslavia: Situation, Policy, Planning. In Ranko
Bugarski and Celia Hawkesworth (eds.), Language Planning in Yugoslavia, 9-26.
Columbus: Slavica.
Comrie, Bernard. 2008. Linguistic Diversity in the Caucasus. Annual Review of
Anthropology 37. 131-143.
Friedman, Victor A. 1994. Surprise! Surprise! Arumanian Has Had an Admirative!
Indiana Slavic Studies 7. 79-89.
Friedman, Victor A. 2001. The Vlah Minority in Macedonia: Language, Identity,
Dialectology, and Standardization. In Juhani Nuorluoto, Martti Leiwo, Jussi
Halla-aho (eds.), Selected Papers in Slavic, Baltic, and Balkan Studies (Slavica
Helsingiensia 21), 26-50. Helsinki: University of Helsinki.
Friedman, Victor A. 2010a. Sociolinguistics in the Caucasus. In Martin Ball (ed.),
Encyclopedia of Sociolinguistics of the World’s Languages, 127-38. London: Routledge.
Friedman, Victor A. 2010b. The Balkan Languages and Balkan Linguistics. Annual
Review of Anthropology 40. 275-291.
Friedman, Victor A. 2011. Review of Matras, Yaron. 2010. Romani in Britain: The Afterlife
of a Language. Edinburgh: Edinburgh University Press. The Journal of Language
Contact, 4. 295-301.
Friedman, Victor A. 2012a. A Tantrum from the Cradle of Democracy: On the Dangers
of Studying Macedonian. In Victor C. de Munck, and Ljupcho Risteski (eds.),
Macedonia: The Political, Social, Economic and Cultural Foundations of a Balkan
State, 22-43. London: I.B. Tauris.
Friedman, Victor A. 2012b. Copying and Cognates in the Balkan Sprachbund. In Lars
Johanson and Martine Robeets (eds.), Copies vs Cognates in Bound Morphology,
323-336. Leiden: Brill.
Friedman, Victor A. 2016. The Importance of Aromanian for the Study of Balkan Language
Contact in the Context of Balkan-Caucasian Parallels. In Thede Kahl and Ioana
Nechiti (eds.), Ethnic and Linguistic Diversity in Southeast Europe and the Caucasus.
Vienna: Austrian Academy of Sciences.
Friedman, Victor A. and Brian D. Joseph. Forthcoming. The Balkan Languages.
Cambridge: Cambridge University.
Gal, Susan. 2015. Imperial Linguistics and Polyglot Nationalism in Austria-Hungary:
Hunfalvy, Gumplowicz, Schuchardt. Balkanistica 28. 151-173.
Kahl, Thede. 2006. The Islamisation of the Meglen Vlachs (Megleno-Romanians): The
Village of Nânti (Nótia) and the “Nântinets” in Present-Day Turkey. Nationalities
Papers 34. 71-90.
Kokkas, N. 2004. Úchem so pomátsko. Ksanthi: Politistiko Anaptyksiako Kentro Thrakēs.
Kănčov, Vasil. 1900. Makedonija: Etnografija i statistika. Sofia: Bălgarsko knižovno
družestvo.
Kurmel, Ömer Aytek. 1994. Kosova ve Çerkesler. Yedi Yıldız 4.14-15.
Kurtişoğlu, Belma and Bülent Kurtişoğlu. 2012. Hidden Latin in Thrace: Notyalılar.
Paper delivered at the 3rd Symposium of the Study Group on Music and Dance
in Southeast Europe, Berovo, Republic of Macedonia, 4/17/2012-4/23/2012,
<http://www.ictmusic.org/group/music-and-dance-southeastern-europe>
accessed 11 December 2015.
Ladas, Stephen P. 1932. The Exchange of Minorities: Bulgaria, Greece, and Turkey. New
York: Macmillan.
Matras, Yaron. 2012. A Grammar of Domari. Berlin/Boston: Walter de Gruyter.
Mosley, Christopher (chief ed.) 2010. Atlas of the World’s Languages in Danger. Paris:
UNESCO.
Marushiakova, Elena and Veselin Popov. 2014. The Gypsies (Dom—Lom—Rom) in
Georgia. Proceedings of Annual Meeting of the Gypsy Lore Society and Conference
on Romani Studies, Bratislava, September 11-13, 2014. <https://sites.google.com/site/
glsproceedings/presentations> accessed 11 December 2015.
Mühlhäusler, Peter. 2002. How One Cannot Preserve Languages (but can preserve language
ecologies). In David Bradley and Maya Bradley (eds.), Language Endangerment
and Language Maintenance, 34-38. London: Routledge Curzon.
Ninčev, A. 1977. Četiriezičnijat rečnik na Hadži Daniil. Sofia: BAN.
Patkanov, K. I. 1887. O narečijah zakavkazskih Cigan: Boša i Karači. St. Petersburg:
Imperatorskaja Akademija Nauk.
Richmond, Walter. 2013. Circassian Genocide. New Brunswick, NJ: Rutgers University
Press.
Rusu, Valeriu (chief ed.) 1984. Tratat de dialectologie românească. Craiova: Scrisul
românesc.
Šanidze, A. 1971. Kartvelta monast’iri bulgaretši da misi t’ip’ik’oni. Tbilisi: Mecniereba.
Steinke, Klaus and Christian Voss (eds.) 2007. The Pomaks in Greece and Bulgaria: A
Model Case for Borderland Minorities in the Balkans. Munich: Verlag Otto Sagner/
Südosteuropa Gesellschaft.
Üngör, Uğur Ümit. 2011. The Making of Modern Turkey. Oxford: Oxford University.
Wace, A. J. B. & M. S. Thompson. 1913. The Nomads of the Balkans. New York: Dutton.
Wixman, Ronald. 1980. Language Aspects of Ethnic Patterns and Processes in the North
Caucasus. Chicago: University of Chicago Department of Geography.
Chapter 7
Instilling Pride by Raising a Language’s Prestige

George Hewitt
I have to begin by mentioning Ubykh, given my good fortune in meeting the
last speakers in Hacı Osman Köyü and Istanbul in 1974. The demise of the North
West Caucasian language Ubykh was played out on Turkish soil following the
mass-migration of the Ubykhs (perhaps up to 50,000 souls) at the end of the
Great Caucasian War in 1864. The late French scholar, Georges Dumézil, who,
of course, made the greatest contribution to documenting and analysing this
language, wrote how the Ubykhs even in their homeland (viz. in the environs of
modern-day Sochi) may always have been bi-/tri-lingual in one or both of the
languages of their neighbours and close linguistic relatives (namely the western
Abkhazians and the western Circassians). After resettling in regions of Turkey
where the majority of their fellow-Caucasian migrants were either Circassians
or Abkhazians, their elders evidently decided that, since they would have to
learn the language of their Turkish hosts and also to communicate with their
fellow-migrants, they would simply not bother passing on their native Ubykh
to the new generations. The decision was regrettable but understandable,
given (a) the circumstances in which the Ubykhs found themselves and
(b) that at that time nobody worried about the disappearance of languages.
What we have here, then, is a clear example, it would seem, of a community-
decision to allow their language to wither away, deeming it to be of less value
than those of their neighbours and hosts.
And wither it duly did, being described as already moribund by those scholars
who first visited the Ubykh settlements (the Dane Å. Benediksten 1898, the
German A. Dirr 1913/14, the Hungarian J. Meszaros 1930/31, and Dumézil 1930).
When Dumézil returned to Turkey in 1953 after World War II, he expected to
find no speakers left, but fate brought him together with Tevfik Esenç, who,
much younger than the other speakers, had been raised by his grandparents
and thus had had the by-that-time rare opportunity to be exposed largely to
Ubykh alone until he started school. Though the language of Tevfik’s wife and
the other Ubykh females in the Manyas settlements never seems to have been
investigated, Tevfik has become universally described as the last fully competent
speaker, taking the language with him to the grave in 1992, though he did
pre-decease his wife.

Had the Ubykhs reconciled themselves to Russia’s victory in 1864 and
accepted the offer to live compactly under Russian domination (albeit perhaps
in the Kuban basin away from their traditional hilly terrain), their language
would surely be alive today, even if many of its speakers would certainly have
perished in the upheavals of the Revolution and Stalin’s Terror. This is a realistic
supposition, since both standard Abkhaz (plus its divergent Abaza dialect)
and two varieties of Circassian (western Adyghe and eastern Kabardian) were
included in the list of so-called ‘Young Written Languages’ by the early Soviets.
This gave them literary status and officially approved scripts. It would be interesting
to learn if anything has been published about the discussions over which
languages were to be included in this classification or even if records of those
discussions have survived. The effort that went into supporting languages in
the Caucasus in the 1920s until around the mid-1930s is astonishing. For example,
although the Daghestanian Lezgic language Udi was spoken in only 3 villages
on either side of the Georgian-Azerbaijani border, a 2-dialect primer was
published (in Sukhum!) in 1934, and, despite the fact that only a tiny number
of South Caucasian Laz speakers lived on Soviet (Georgian) territory, a school-
primer was published (again in Sukhum) in 1935, two issues of a Laz paper
Mch’ita Murutsxi ‘Red Star’ having appeared in 1929 (see Feurstein 1992.320-22
for images of these rarities). Though Abkhaz, Abaza, Adyghe and Kabardian
have retained their literary status, literary aspirations for Udi and Laz (perhaps
unsurprisingly) quickly faded. But at first glance a far more surprising loss of
status was suffered by the S. Caucasian language Mingrelian, given the huge
number of ethnic Mingrelians concentrated in the west Georgian province of
Mingrelia.
I introduce Mingrelian as an example of how a state can act deliberately to
undermine the prestige that I argue is necessary for speakers to feel in order
to sustain their willingness to pass their language on to their children, thereby
ensuring the language’s long-term survival. Back in the 1980s when I was
researching Mingrelian the elderly mother of my main Mingrelian informant
(in Ochamchira, Abkhazia) asked her son (in his 50s) what use Mingrelian was
to a British linguist when it was of no use even to native Mingrelians! It would
be foolish, given the numbers involved,1 to suggest that Mingrelian is in any
immediate danger, but its fate over the last century well illustrates the negative
consequences for languages that government-actions can have. In his paper
to the 1990 Caucasian Colloquium at SOAS the German Wolfgang Feurstein,
who has long championed the cause of the Laz in Turkey and that of Laz’s
1 The number of ethnic Mingrelians (let alone that of competent speakers) is unknown, but it
has been estimated that speakers might number up to half a million.
close sister Mingrelian (plus that of the more distant sister Svan) in Georgia,
compared the treatment of Laz in Turkey with that of Mingrelian and Svan in
Georgia (see Feurstein 1992), and I myself have contributed to the discussion
(Hewitt 1995). More recent examples could be cited, but simply consider this
brief summary: the leading Mingrelian politician of the 1920s, Ishak Zhvania,
advocated autonomy for Mingrelia together with the teaching of, and publication
in, Mingrelian; books such as Ch’ita Chxoria ‘Red Ray’ and the Zugdidi
newspaper Q’azaxishi Gazeti ‘Peasant’s Paper’ appeared from 1930, but around
this time, as Stalin secured control of the Kremlin and Lavrent’i Beria assumed
Zhvania’s mantle as the leading Mingrelian politician, all talk of autonomy
ceased, Mingrelians were officially classified as ‘Georgians’, and publishing
in Mingrelian for Mingrelians (as opposed to linguistic or folkloristic works
for specialists) ended c.1938. Whilst it is popularly believed in Georgia that
Mingrelian, Laz and Svan are mere dialects of Georgian, even local linguists
who know this not to be the case but who do not wish to upset the official designation
of these peoples as ‘Georgians’ have invented the (to my mind) odd
term ‘sociolinguistic dialect’ to describe the status of these tongues in Georgia.
This general downplaying of Mingrelian must surely have contributed to the
sad remark made by that old lady back in the 1980s.
Of course, the clue as to why Tbilisi has wanted as many residents of
Georgia as possible to identify themselves as ‘Georgians’ (just as for most of its
existence the Turkish Republic has tended to regard all residents of Turkey as
‘Turks’) lies in Zhvania’s desire to establish a Mingrelian autonomy. Separatism
was feared and remains feared to the present day.2 Let us recall Stalin’s own
words from his famous 1913 article on Marxism and the National Question:
‘What is to be done with the Mingrelians, the Abkhasians, the Adjarians, the
Svanetians, the Lesghians [lek’ebi in the Georgian version—BGH], and so on,
who speak different languages but do not possess a literature of their own? To
what nations are they to be attached? . . . What is to be done with the Ossets, of
whom the Transcaucasian Ossets are becoming assimilated (but are as yet by
no means wholly assimilated) by the Georgians and the Ciscaucasian Ossets
are partly being assimilated by the Russians and partly continuing to develop
and are creating their own literature? How are they to be “organised” into a
single national union?’ (pp. 48-9 of an undated English translation).
I asked an elderly Mingrelian in Upper Gal in the autumn of 2013 about his
ethnic self-awareness and how he distinguished between Mingrelians and
2 Interestingly, it seems that the quick demise of the Laz newspaper in Georgia, after just
2 issues in 1929, may have been the result of a request from Turkey to discontinue it
(Feurstein 1992.299).
Georgians. He replied: ‘We Mingrelians have our own language, but we are only
a people [Georgian xalxi], whereas the Georgians are a nation [Georgian eri].’
Asked how he differentiated between peoples and nations, he said that the
Georgians formed a nation ‘because they have a literature’! Whilst Stalin’s view
might have prevailed amongst (?most/many/all) Mingrelians, recent attempts
to introduce this categorisation to the Laz in Turkey by visitors from Georgia
have caused what I would judge to be entirely natural resentment. Consider
these quotes from a statement released in 2013 by a group of Laz intellectuals:
We do not find it ethical and vigorously condemn the appropriation
of people who have contributed to the Laz Community and the
attribution of those to the Georgians, through distortion and forced
interpretations . . . If Georgia and institutions in Georgia seek to raise the
Laz’s sympathy and increase their satisfaction it is recommended that
they should turn to topics like cultural autonomy and mother-tongue
education for Mingrelian and Svan, and support ways to encourage
and strengthen these languages. It would be a more democratic manner
for Georgia if they intended recognition of these languages as
mother-tongues without being pressed by Europe or other political powers
but by means of a feeling of responsibility coming from within the
country and its institutions. It should be noted that Mingrelian and Svan
are a part of the world’s cultural heritage and efforts made to keep these
languages alive are nourished by universal principles, which arise out of
universal human rights. The extinction of these languages is a problem
that concerns not only the speakers themselves but all humanity (To the
Turkish and Georgian Public, 2013).3
There is an oblique reference in this statement to the European Charter for
Regional or Minority Languages (ECRML), which Georgia has yet to sign
(cf. Wheatley 2009). It is true that Turkey has not signed it either, but,
possibly as part of its drive to join the EU, Turkey has relaxed official sanctions
against the use of minority-languages on its territory. And the Laz (along with
Circassians and Abkhazians) have taken welcome advantage of this positive
change in official attitudes.4 It is my firm conviction that linguists wishing
3 Read the statement at: http://abkhazworld.com/news/statements/973-laz-intellectuals-explain-their-view-of-laz-ethnicity.html
4 The Laz have made tremendous progress as outlined in a recent communication from a
member of the community, Eylem Bostancı:
to preserve endangered languages should (even at the risk of incurring
unpopularity), either themselves or through statist institutions like the ECRML,
seek to persuade non-progressive states to recognise that making proper
provision for minority languages (including language-tuition at school up to
a modest level perhaps, just to endow the language concerned with some
standing)5 is NOT necessarily an encouragement to separatism. If proper
linguistic provision is accompanied by respect for ethnic identity and non-
discriminatory treatment across the board, there is absolutely no reason why
separatism should result.
To turn finally and briefly to Abkhazia, I would note that, possibly as a
direct result of the war with Georgia, one has the impression that the need to
preserve Abkhaz is more widely felt than previously, language being an obvious
and immediate marker of identity—in this case the identity targeted by
1) From September 2013, in the 5th-8th classes of the secondary schools Laz can be taken
as an elective lesson. In 2013, there were only five classes in the towns of Fındıklı (Rize)
and Arhavi (Artvin). In September 2014, the number of classes went up to 15, with pupils
taking Laz as an elective lesson in the towns of Ardeşen, Pazar and Fındıklı in the province
of Rize, and in the towns of Borcka and Arhavi in the province of Artvin. We are hoping that
the number of classes will go up to at least 30 in the next academic year and that classes will
also open in Hopa.
2) The Laz Institute, which was established in 2013, is developing relations with the linguists
and scholars around the world. It introduced courses for the teaching of Laz for adults
in Istanbul. The Institute has been commissioned by the Turkish Ministry of Education to
prepare text books to be used during the elective Laz lessons at the secondary schools.
The institute is also having talks with a number of universities in Turkey for Laz courses to
be introduced at these universities. Laz elective courses will continue to take place at the
University of Bosphorus (Bogazici Universitesi) in the 2014 academic year.
3) Laz intellectuals continue to work towards creating a Laz literature. The Lazika Yayın
Kollektifi (The Lazika Publication Collective) has published more than 60 books in the last
four years of its establishment, of which only five are in Turkish and all the rest are in Laz.
We have published Laz dictionaries, a Laz periodical (Tanura), five Laz novels, books on the
Laz history, Laz poem books, a high number of children’s books, and a variety of translations
into Laz from famous fairytales/stories from around the world. The fairytales/stories that
have been translated into Laz so far are as below and more are to come. We will be attending
to the Tüyap Istanbul Book Fair (the largest book fair in the country) in November. The Little
Prince (also translated into Mingrelian by the Laz Cultural Association), Romeo and Juliet,
The Little Black Fish, Polly-Anna, The Snow White, Don Quixote, Pinocchio.
5 This is not to suggest that the prime responsibility for passing on a language should lie with
schools. Native speakers are the ones who can and should most easily fulfil this task by
merely opening their mouths and speaking to children. But they might not feel inclined to
do so, if they do not recognise the importance of this simple task.
Georgian nationalism. With the aim of encouraging universal study and use
of the language, a law was introduced on 27 November 2007 under the late
Pres. S. Bagapsh requiring all official business to be conducted in Abkhaz from
1 January 2015. Whilst this might have been (and indeed remains as of October
2015) a noble aspiration, passing a law without making provision in terms
of teaching, publication of relevant language-materials, etc. means that it
will be impossible to enforce (the Vice-President elected on 24 August 2014
does not, for example, know Abkhaz!) and renders that law pointless. With
Abkhazia’s economy still in a parlous state, any assistance in terms of help with
producing suitable manuals and the training of teachers (especially for those
classes and subjects that have traditionally been taught in Russian even in
Abkhaz-language schools once tuition switches after the first few grades from
Abkhaz to Russian) would be welcome.
And, of course, the question of Mingrelian is also relevant to Abkhazia.
Here I have long advocated that, if the local Mingrelians want their children to
be educated through the medium of Georgian (as seems to be the case), this
should be allowed on condition that Mingrelian too is taught in local schools
up to a certain level of competence in order to raise its profile and prestige
amongst its native speakers. This view is not popular, however, with many
Abkhazians, I have to admit! But if the cause is noble, the battle is worth fight-
ing, as I hope we can all agree.
Postscript
I should add in conclusion that in an e-mail received from Georgia only a week
before this paper was read at Ardahan University a Mingrelian correspondent
who has prepared and published a Mingrelian translation of ‘The Little
Prince’ = ch’ich’e mapaskiri6 (organised by the Laz [sic] Cultural Organisation
in Turkey, as noted in Footnote 4 above) told me that attitudes in Georgia
to publishing Mingrelian materials seem to be changing, a greater readiness
than, say, 10-15 years ago to accept such publications being noticeable. This was
welcome news, and one can only hope firstly that it is true and secondly that
such tolerance widens and deepens there.
6 See http://www.petit-prince-collection.com/lang/show_livre.php?lang=en&id=2615.
References
Feurstein, Wolfgang. 1992. Mingrelisch, Lazisch, Swanisch. Alte Sprachen und Kulturen
der Kolchis vor dem baldigen Untergang, in [B.] George Hewitt (ed.) Caucasian
Perspectives, 285-328. Unterschleissheim/München: Lincom Europa.
Hewitt, [B.] George. 1995. Yet a third consideration of Völker, Sprachen und Kulturen
des südlichen Kaukasus, Central Asian Survey, 14.2, 285-310.
Wheatley, Jonathan. 2009. Georgia and the European Charter for Regional or Minority
Languages, European Centre for Minority Issues (ECMI), Working Paper 42.
Chapter 8
Unwritten Minority Languages of Daghestan: Status and Conservation Issues
Zaynab Alieva and Madzhid Khalilov
The modern world is divided into spheres of influence of several economic
and cultural (and linguistic) powers, which gradually absorb small nations
and ethnic groups. As a result, small ethnic groups (with their fragile
languages) lose their social significance. The use of their native languages becomes
limited; they cease to be spoken and eventually become extinct.
Among the languages that fall victim to this process are the indigenous languages
of Daghestan, namely the unwritten languages of the minority ethnic groups.
In modern ethno-sociolinguistics, the peoples of a country are traditionally
classified according to a quantitative characteristic, viz. the number of people
speaking a particular language. According to this principle, ‘minority ethnic
groups’ are distinguished among the peoples of Russia. According to the official census,
the minority ethnic groups comprise 63 ethnic units in Russia; they are listed in
the ‘Red Book of the Peoples of Russia’, are considered to be in a state of ethnic
disaster, and are to become the object of prioritized protection. All Daghestanian
unwritten and newly written languages are listed in this book.
The problem of multilingualism in Daghestan and the origin of its peoples
has always attracted the attention not only of linguists but also of other researchers.
The migration theory of the origin of the languages of the Caucasus was popular among
early researchers. According to this theory, the Caucasus served for many centuries
as a transit route for many tribes and nations in the era of the Great
Migrations. The inadequacy of the migration theory of the peoples and languages
of the Caucasus was partly established by their historical-comparative study
and genetic classification.
The Caucasus is one of the few regions in the world where an
extraordinary diversity of languages is represented in a relatively small area.
More than half of the languages of the Caucasus are spoken in the Republic of
Daghestan, where there are more than three dozen languages. Twenty-six of them
belong to the indigenous population and are genetically related. These are the
following languages:

Avar        Hinukh      Tabasaran
Agul        Hunzib      Tindi
Andi        Godoberi    Udi
Archi       Dargwa      Khwarshi
Akhwakh     Karata      Khinalug
Bagwalal    Kryz        Tsakhur
Bezhta      Lak         Tsez
Botlikh     Lezgi       Chamalal
Budukh      Rutul       Chechen
which form the Nakh-Daghestanian language family. There are also a significant
number of languages spoken in Daghestan that belong to other families:
Azeri, Kumyk, Nogay, which belong to the Turkic family, and Tat, which
belongs to the Iranian branch of the Indo-European family.
Currently, there are eleven official languages:
Avar        Tabasaran   Nogay
Dargwa      Chechen     Russian
Lak         Azeri       Tat
Lezgi       Kumyk
three languages that recently became written:
Agul Rutul Tsakhur
and 18 unwritten languages:
Andi        Budukh      Tindi
Archi       Hinukh      Udi
Akhwakh     Hunzib      Khwarshi
Bagwalal    Godoberi    Khinalug
Bezhta      Karata      Tsez
Botlikh     Kryz        Chamalal
in the Republic of Daghestan. It should be noted that all speakers of Budukh,
Kryz, Khinalug, and Udi live in the Republic of Azerbaijan. The list includes
small one-village languages (i.e. languages spoken in a single village),
for example Hinukh, which is spoken by 600 people. The largest of the unwritten ethnic
groups is the Andi, whose language is spoken by about 30 000-40 000 people.
The lack of reliable data on the number of the so-called small ethnic groups
is one of the curiosities of demographic statistics in Daghestan.
However, according to the most precise data available, the number of
speakers of the unwritten languages, out of the three million people inhabiting
the region, is assumed to exceed 150 000. More than 100 000 people represent the 13 ethnic
groups of the Andi-Tsez language group. Agul, Tsakhur, Rutul, and Archi, which
belong to the Lezgic language group and have recently received writing systems,
are spoken by ca. 50 000 people. It is very difficult to determine
the number of speakers of the other unwritten languages of the Lezgic group,
such as Budukh, Khinalug, Kryz, and Udi, whose speakers live outside Daghestan and
are almost totally assimilated to Azeri (Gamzatov 1995). Azeri was the language of
inter-ethnic communication for Aguls, Rutuls, Tabasarans, and Tsakhurs until
their languages received official status.
An interesting situation obtains within the Avar-Ando-Tsezic group, which consists
of fourteen languages: Andi, Akhwakh, Bagwalal, Botlikh, Godoberi, Karata,
Tindi, Chamalal (Andic subgroup); Bezhta, Hinukh, Hunzib, Khwarshi, Tsez
(Tsezic subgroup); and Avar, which is the only written and literary language.
The other thirteen languages have no writing tradition and are limited
to domestic and family use. As might be expected, multilingualism (i.e. knowledge
of more than two or three languages) is widespread in this region.
One pattern is the so-called ethnic-ethnic-Avar-Russian multilingualism
(which is, for example, widespread among Hinukh speakers: Hinukh-Tsez-Avar-Russian
or Hinukh-Bezhta-Tsez-Avar-Russian multilingualism). Another, widespread pattern
is ethnic-Avar-Russian trilingualism, against the background of which the ethnic identity
of the speaker of an unwritten language develops. Trilingualism is
the standard mode of communication for Andi-Tsezic speakers. The speakers
of the Andi language, for example, identify themselves primarily as Andi and
yet call themselves Avar, as the Avar language is the lingua franca in this group.
Interestingly, outside the republic, Andi speakers identify themselves as
Daghestanians, and outside the country they are simply Russians. This is the
multi-ethnic hierarchy of ethnic and linguistic identity of the Daghestanian
highlanders (Gamzatov 2005).
The majority of the speakers of the Andic languages live in the most mountainous
part of Daghestan, in the Andi Koysu river basin between the Andi and Bogos
ridges. The western border between Daghestan, Chechnya and Georgia coincides
with the ethnic boundaries. Some Andic languages are also represented
in the Republic of Azerbaijan (for example, in the village of Ahvahdere, where Akhvakh
speakers live). The main language of communication there is Avar.
Since ancient times, the Tsez people (also known by the Georgian name
Dido) have inhabited the western part of Daghestan and, partially, Georgia, keeping
close ties with each other. Interestingly, the languages of intra-group
communication among Tsezic speakers were Bezhta (used with Hunzib and
Hinukh speakers) and Tsez (used with Hinukh speakers); today Bezhta and Tsez still
partially fulfil these functions. Owing to economic factors and geographical
proximity, the speakers of the Tsezic languages were influenced by Georgia and
have been linked more closely with the Georgians than with the Avars.
Currently, Avar serves as the language of interethnic communication among
the Tsezic peoples.
Analysis of the Daghestanian languages clearly confirms that declarations
and decisions on language development in the given circumstances are
impossible without taking the unwritten languages into account, since a large
and prolific branch of linguistics is dedicated to these languages.
Moreover, the unwritten languages of Daghestan have become not just a subfield
but an independent field of modern Daghestanian linguistic and ethnographic
research. As a result, various projects and special programs for the study of
minority peoples, their genesis, and their historical and cultural past and present have
been developed in the historical, philological and sociological departments
of the Daghestan Scientific Center of the Russian Academy of Sciences. Serial
publication of historical and ethnographic materials under the title ‘Small peoples of
Daghestan’ is ongoing, and historical and ethnographic descriptions
for all Andi-Tsez languages and Archi have been published. The collection of the oral
and poetic heritage of unwritten peoples and ethnic groups by folklorists has
been a very successful endeavor. Much work has been done in the study of
the folk-art traditions and handicraft culture of these ethnic groups.
The preservation of small unwritten languages and the protection of
ethnic identities call for a number of plans, projects and programs. The most
important current tasks for the Daghestanian languages, particularly the small unwritten
ones, are the collection and documentation, comprehension and interpretation
of the lexical data of each language. Therefore, the department of lexicography
and lexicology of the Daghestan Scientific Center of the Russian
Academy of Sciences is carrying out a multi-year program of compiling ethnic-Russian
bilingual dictionaries of unwritten languages. It is well known that the
only reliable way to gather the lexical material of these minority languages is to
work among the ethnic groups themselves, i.e. through direct fieldwork with their
speakers. Lexical information is being documented during extensive fieldwork
in the highland regions of Daghestan, and almost all unwritten languages now
have dictionary manuscripts of up to eight hundred pages, containing about
seven to nine thousand lexical units. Thirteen of the eighteen
dictionaries of this type have been published within two decades. This became
possible thanks to the sponsorship of the Holland Science Organization, the Max
Planck Institute for Evolutionary Anthropology (Germany) and the Russian
Humanitarian Foundation. The most realistic and effective way of documenting
the language of any nation is the preparation, publication and distribution
of its lexicon.
The following dictionaries have been published within the program of
ethnic-Russian dictionaries of unwritten languages:
Budukh-Russian dictionary (1984) by Meilanova
Bezhta-Russian dictionary (1995) by Khalilov
Tsez-Russian dictionary (1999) by Khalilov
Chamalal-Russian dictionary (1999) by Magomedova
Karata-Russian dictionary (2001) by Magomedova and Khalidova
Hunzib-Russian dictionary (2001) by Isakov and Khalilov
Khinalug-Russian dictionary (2002) by Ganieva
Tindi-Russian dictionary (2003) by Magomedova
Bagvalal-Russian dictionary (2004) by Magomedova
Hinukh-Russian dictionary (2005) by Khalilov and Isakov
Godoberi-Russian dictionary (2006) by Saidova
Akhvakh-Russian dictionary (2007) by Magomedova and Abdullaeva
Botlikh-Russian dictionary (2012) by Saidova and Abusov
Dictionary of the Bezhta language (2015) by Khalilov
Dictionaries of unwritten languages such as Andi, Archi and Khwarshi are
at different stages of preparation. The compilation of Kryz-Russian and Udi-Russian
dictionaries is part of the long-term research plans of the Institute.
According to academician Gamzatov (1995: 12), ‘doing a well-supplied
dictionary work on unwritten languages of Daghestan, we proceed from the
assumption that documentation of native and borrowed vocabulary of a lan-
guage which has no writing system, its maintenance and preservation for future
generations, of the still living experience of the human life and the ethnic way
of thinking is certainly a task of huge scientific and humanitarian importance
and at the same time in all respects humanitarian and extremely moral event,
as well as a noble and rewarding one.’
We believe that, in order to preserve the unwritten languages of Daghestan,
we first of all need to draw the attention of the local state
authorities and the public to the existing problems in this field. It is therefore
necessary to develop appropriate recommendations and to take the most serious
and urgent action.
1. There are no writing systems in the Andi-Tsezic language subgroup. These
languages are oral and are used at home as vernaculars. The languages of school
education are Russian and Avar; the latter is taught in village schools as the
so-called rodnoj jazyk (“native language”). Additionally, one foreign language
is taught. This means that children are exposed to three new languages at
school, whereas their native language is absent from the school curriculum.
However, in primary school (especially in the first grade) teachers use
the native languages as a means of communication with the pupils, as the children know
neither Avar nor Russian yet. The design of writing systems for these languages is a
necessary condition for the preservation and development of ethnic minority groups.
The introduction of writing would be valuable for many reasons, particularly given
the challenges and difficulties of primary education, which is conducted in
Avar and Russian.
2. It is urgent to develop a project plan focused specifically on
the educational aspect of the preservation and revival of the Daghestanian oral
languages. It should provide a detailed account of mother tongue acquisition within
the family, at school and in other institutions; the creation and publication of alphabets,
primers and other teaching guides, including radio and television programs; the targeted
training and retraining of teaching staff, etc. The implementation of the first
stage of the proposed project will serve as a prerequisite for subsequent programs;
their humanistic importance and significance cannot be overestimated.
3. In present circumstances, there is as yet no direct threat of disappearance of
the oral minority languages of Daghestan. However, owing to the spread of different
types of bilingualism, migration and other processes, there is evidence
of the loss of essential components of language activity such as folklore, of the ‘blurring’ of
the structural peculiarities of these languages, and so on. It should also be noted
that Russian has a strong impact on these languages: high school students
and the younger generation communicate in Russian, and their knowledge of their
native languages is much more restricted.
4. In the last few decades, there has been increasing interest among
the minority Ando-Tsezic ethnic groups in their history and culture, as well as in
their mother tongue. The language of each nation is part of the natural and cultural
heritage of all humankind. Therefore, our duty is to create the necessary conditions
for studying ethnic minority groups and their cultures and languages:
for instance, detailed historical-ethnographic and grammatical
research (Kibrik et al. 1977; van den Berg 1995; Radžabov
1999; Kibrik et al. 2001; Khalilova 2009; Forker 2013; Isakov & Khalilov 2012;
Magomedova 2012; Comrie, Khalilov, & Khalilova 2015), the collection and systematization
of folklore and other text genres, the compilation of different types of dictionaries, etc.
The main and urgent task is therefore the collection and documentation of the
still existing linguistic, historical, cultural, folklore and ethnographic material
of the indigenous ethnic groups of Daghestan which has not previously
been studied. Preserving all this wealth is a necessary prerequisite for the study
of history, ethnography and language, for familiarizing the younger generation with
folk wisdom, and for transmitting the language and the whole body of folklore to
future generations.
A great amount of work in documenting the culture and the unwritten languages
of Daghestan has been done in the Department of Linguistics of the Max Planck
Institute for Evolutionary Anthropology (Leipzig, Germany) under the supervision
and with the financial support of Professor Bernard Comrie, former director of
the Department of Linguistics. In particular, fundamental work
on the folklore of the Tsezic peoples was completed within a few years. To date the following
books have been prepared and published: Tsez folklore by Abdulaev and
Abdullaev and Khwarshi folklore by Karimova. These books include Khwarshi
and Tsez tales, legends, stories, anecdotes, folk songs and ritual laments, as well
as sample texts of the spoken language (dialogues and monologues).
The ‘Bezhta-Russian phraseological, folklore-ethnographic dictionary’, of 420
pages, was prepared by Khalilov and first released in 2014. It includes
rich folklore and ethnographic material of the Bezhta people. The dictionary is the
first lexicographical description of Bezhta phraseology, including paremiology
and folklore-ethnographic expressions. It includes
phraseological units, proverbs and a full list of common set expressions. It
also lists Bezhta good wishes, teachings, swear words,
curses, folk beliefs, omens, divinations, spells, riddles, rhymes, tongue twisters
and funny expressions, as well as tribal (domestic) labels, i.e. signs used to denote
appurtenance. The dictionary can serve as a source for future theoretical research
on phraseology, paremiology and, in particular, Daghestan-Russian phraseography
and paremiography.
Bezhta phraseological, paremiological and folklore-ethnographic units
are diverse in their content and function. Bezhta folklore emerged from
everyday observations of social and natural phenomena; some units are
associated with mythology and real historical events, and some are borrowed
from related (Avar) or unrelated (Georgian and Russian) languages. A
certain part of the idiomatic, paremiological and folklore-ethnographic material
of Bezhta goes back to oriental languages, namely Arabic. In general, this
folklore work is most valuable material for studying the life of the Bezhta people
in its historical aspect.
As can be seen from this brief analysis, the languages of the Tsezic subgroup
have been studied relatively well. A great deal of work has been done on compiling
and publishing ethnic-Russian dictionaries (except for the Khwarshi-Russian
dictionary, which is still in progress). Detailed grammars have been prepared and
published by different linguists. For almost all Tsezic languages, folklore material
(as well as ethnographic material) has been collected and systematized. This has
been achieved through the substantial financial support of the Holland Science Organization
and the Max Planck Institute for Evolutionary Anthropology and the active and
collaborative work of various scientists (Bernard Comrie, Maria Polinsky, Madžid
Khalilov, Ramazan Radžabov, Zaira Khalilova, Diana Forker, Arsen Abdulaev,
Raisat Karimova, who are fully engaged in the study of the Tsezic languages).
Such description of grammatical structure and such collection and systematization
of folklore and ethnographic material have yet to be carried out for the
oral languages of the Andic subgroup of the Avar-Andi-Tsezic languages. This mostly
depends on the financial support of local and foreign research centers, and on
the researchers involved in the study of the Andic languages.
Another important point is worthy of note. Not all unwritten languages of
Daghestan are studied and represented at the G. Tsadasa Institute of Language,
Literature and Art of the Daghestan Scientific Center RAS,
which is the main center for research on the unwritten languages of
Daghestan. For example, Bezhta, Godoberi and Chamalal are actively investigated:
such scientists as Khalilov, Khalilova, Saidova, Magomedova and Alieva
work on these minority languages. However, there are no specialists working
on the other languages of the Andi-Tsezic subgroup, or on Archi, Budukh,
Kryz, Khinalug and Udi. Therefore, the main task of the local research institute
is the recruitment and training of scientific personnel who are themselves
representatives of such minority languages and who will be interested and engaged in
their study.
It is crucial that in October 1991 Russia adopted the ‘Law on Languages of
the Peoples of the Russian Federative Republic’, in which the languages of all peoples
living in the country are declared a national, historical and cultural heritage
and are guaranteed state protection. This law is supported by the European Charter
for Regional or Minority Languages, adopted a year later by the Council
of Europe, in which the linguistic rights of national minorities are declared.
However, it should be noted that the Republic of Daghestan is the only region
of the Russian Federation that has not yet adopted its own ‘Law on Languages of
the Peoples of the Republic of Daghestan’. The draft law, which is under discussion
and preparation, should clearly specify measures for the preservation
and development of the endangered unwritten languages of Daghestan. It is
hard to believe that attitudes towards the languages will change radically,
or that there will be a turning point in their study, immediately after its adoption.
Nevertheless, the adoption of this law would guarantee their preservation and strengthen
their position. The law could clearly specify the place and the role of the native
languages in the life of the Daghestanian peoples.
Serious social support is still needed for another category of Daghestanian
languages, namely the literary languages and the languages that recently became
written (about fourteen in all). These languages already have educational,
scientific and artistic material, printed books, and mass media (radio and television).
Thus, they can be considered “safe” or not endangered. However,
there is nowadays a dangerous tendency towards the narrowing of the quantitative and
qualitative areas of native language use and the reduction of school programs in the
native language and native literature. Many children who grow up in
urban areas do not speak their mother tongue. This happens as a result of the irreversible
processes of urbanization and globalization and under the influence of Russian,
which is the means of inter-ethnic communication today. Native languages are
being moved into the background of public attention.
A change in the language situation and a revival of ethnolinguistic life
in the country are only possible with the help of reasoned, comprehensive and
radical measures aimed at the future. At present, there is ongoing mass
ethno-linguistic assimilation: out of three million Daghestanian people almost
two-thirds live in ethnic amalgamation. In addition, more than six hundred
thousand live outside Daghestan. Such territorial dispersal, as well as ethnic
intermarriage (almost every tenth Daghestanian family is exogamous, i.e. the
marriage partners are representatives of different ethnic groups), leads to a
state in which many traditional forms of culture are no longer associated with ethnic
identity.
Bibliography
Abdulaev, Arsen K. & Isa K. Abdullaev. 2010. Didojskij (cezskij) fol’klor. Makhachkala:
Lotos.
Comrie, Bernard, Khalilov, Madzhid and Khalilova, Zaira. 2015. Grammatika
bežtinskogo jazyka: fonetika, morfologija, slovoobrazovanie (A grammar of Bezhta).
Leipzig-Makhachkala: ALEF.
Forker, Diana. 2013. A Grammar of Hinuq. Berlin: De Gruyter.
Gamzatov, Gadži G. 1995. Bespis’mennyi, no Zhivoi, Real’nyi (Predislovie k serii
«Bespis’mennye jazyki Dagestana»). In: Khalilov, Madžid Š. Bežtinsko-russkij slovar’.
Makhachkala: Nauka, p. 12.
Gamzatov, Gadži G. 2005. Lingvističeskaja planeta Daghestan. Etnojazykovoj Aspekt
Osvoenija. Moskva: Nauka, p. 63.
Ganieva, Faida A. 2002. Khinalug-Russian Dictionary. Makhachkala: Nauka.
Isakov, Isak A. & Khalilov, Madžid Š. 2012. Gunzibskiy Jazyk: Fonetika. Morfologiya.
Slovoobrazovanie. Leksika. Teksty (A grammar of Hunzib). Makhachkala: Nauka.
Jazyki Narodov Rossii. Krasnaja Kniga. Enciklopedičeskij slovar’-spravočnik. Moskva:
Academia, 2002.
Karimova, Raisat Š. 2014. Fol’klor xvaršincev. Makhachkala: ALEF.
Khalilov, Madžid Š. 1995. Bežtinsko-russkij slovar’. Makhachkala: Institut JaLI DNC RAN.
Khalilov, Madžid Š. 1999. Cezsko-russkij slovar’. Makhachkala: Institut JaLI DNC RAN.
Khalilov, Madžid Š. 2014. Bežtinsko-russkij frazeologičeskij, fol’klorno-etnografičeskij
slovar’. Makhachkala: ALEF.
Khalilov, Madžid Š. 2015. Slovar’ bežtinskogo jazyka. Makhachkala: Nauka.
Khalilov, Madžid Š. & Isakov, Isak. 2001. Gunzibsko-russkij slovar’. Moskva: Nauka.
Khalilov, Madžid Š. & Isakov, Isak A. 2005. Ginuxsko-russkij slovar’. Makhachkala:
Nauka.
Khalilova, Zaira. 2009. A grammar of Khwarshi. Utrecht: LOT.
Kibrik, Aleksandr E., Sandro V. Kodzasov, Irina P. Olovjannikova, and Džalil S. Samedov.
1977. Opyt strukturnogo opisanija arčinskogo jazyka. Tom 1. Leksika. Fonetika (in
Russian). Moscow: Izdatel’stvo moskovskogo universiteta.
Kibrik, Aleksandr E. et al. 2001. Bagvalinskij jazyk: Grammatika. Teksty. Slovar’ (A gram-
mar of Bagwalal). Moskva: Nasledie.
Magomedova, Patimat T. 1999. Čamalinsko-russkij slovar’. Makhachkala: Nauka.
Magomedova, Patimat T. 2003. Tindinsko-russkij slovar’. Makhachkala: Institut JaLI
DNC RAN.
Magomedova, Patimat T. 2004. Bagvalinsko-russkij slovar’. Makhachkala: Institut JaLI
DNC RAN.
Magomedova, Patimat T. 2012. Tindinskij jazyk. Makhachkala: Nauka.
Magomedova, Patimat T. & Rašidat Š. Xalidova. 2001. Karatinsko-russkij slovar’. Sankt-
Peterburg: Scriptorium.
Magomedova, Patimat T. & Abdullaeva, Indira I. 2007. Axvaxsko-russkij slovar’.
Makhachkala: Nauka.
Meilanova, Unejzat A. 1984. Buduxsko-russkij slovar’. Moskva: Nauka.
Radžabov, Ramazan N. 1999. Sintaksis cezskogo jazyka. Moskva: Academia.
Saidova, Patimat A. 2006. Godoberinsko-russkij slovar’. Makhachkala: Institut JaLI DNC
RAN.
Saidova, Patimat A. & Magomed G. Abusov. 2012. Botlixsko-russkij slovar’. Makhachkala:
Institut JaLI DNC RAN.
van den Berg, Helma. 1995. A Grammar of Hunzib (with texts and lexicon). München:
Lincom Europa.
Chapter 9
Report on the Fieldwork Studies of the Endangered Turkic Languages
Yong-Sŏng Li
1 Introduction
Having realized the importance of research on and documentation of the
endangered Altaic languages, 8 scholars of the Altaic Society of Korea carried
out field research at least 15 times from 1972 to 2002, at their own expense or
on a personal basis, on Dagur, Sibe, Uilta, spoken Manchu, Ewenki, Orochen, and
Hezhe.1 All 8 of these scholars are male and graduates of the Department of
Linguistics at Seoul National University. There is no Turkologist among them.
These field research trips are tabulated as follows:2
No.  Year  Language(s)      Participant(s)
1    1972  Dagur            Baeg-in SEONG
2    1972  Sibe             Baeg-in SEONG
3    1981  Dagur            Baeg-in SEONG
4    1986  Uilta            Ju-won KIM
5    1997  Spoken Manchu    Jae-il KWON, Dong-ho KO, To-sang CHUNG, Gyu-dong YURN
6    1998  Ewenki, Orochen  Ju-won KIM
7    1999  Ewenki, Dagur    Baeg-in SEONG, Jae-il KWON, Ju-won KIM, Dong-ho KO, Gyu-dong YURN, Jae-mog SONG
8    1999  Ewenki           Ju-won KIM, Dong-ho KO
9    1999  Sibe             Ju-won KIM
10   2000  Ewenki, Dagur    Baeg-in SEONG, Jae-il KWON, Che-mun CHONG, Ju-won KIM, Jae-mog SONG
11   2000  Ewenki           Ju-won KIM
12   2000  Ewenki           Ju-won KIM
13   2001  Hezhe            Ju-won KIM, Dong-ho KO
14   2001  Dagur            Baeg-in SEONG, Che-mun CHONG
15   2002  Sibe             Dong-ho KO, Ju-won KIM
1 See Seong 2008: 24-33 and Yu 2014: 23-24.
2 See Seong 2008: 25.
© Koninklijke Brill NV, Leiden, 2017 | doi 10.1163/9789004328693_010
Members of this small group began to work to organize the ASK REAL (Altaic
Society of Korea, Researches on Endangered Altaic Languages), the full-
scale research and documentation project on endangered Altaic languages.3
Supported by the Korea Research Foundation Grant, the following two succes-
sive research projects were carried out:4
(1) The 1st research project
    subject: Fieldwork Studies of Altaic Languages for Genealogy of Korean
    period: September 1, 2003-August 31, 2006
(2) The 2nd research project
    subject: Building Digital Archive of Altaic Languages for the Study of
    Genealogy of Korean
    period: July 1, 2006-June 30, 2009
These two research projects are unofficially called ASK REAL. Both were
directed by Ju-won KIM.5
These two research projects had the following two main goals:6
(1) Accumulating the extensive data on Altaic languages for the future study
on the genealogy of the Korean language;
(2) Joining the world-wide efforts for the documentation of endangered
languages.
3 See Yu 2014: 24.

4 See Seong 2008: 24, Kim 2008: 63-64, Kim 2011: 31-32, http://altaireal.snu.ac.kr/askreal_v25/,
and http://www.cld-korea.org/eng/archives/archives_1.php.
5 See http://altaireal.snu.ac.kr/askreal_v25/ and
http://altaireal.snu.ac.kr/askreal_v25/members.html.
6 See Kim 2008: 31-34 and Yu 2014: 24.
2 Field Researches
2.1 Determination of the Language or Dialect to Survey

The ASK REAL classified the Altaic languages into 55 languages (34 Turkic, 10 Mongolic,
and 11 Manchu-Tungusic). The Altaic languages are distributed
widely across Eurasia, and their speakers live in multilingual
environments. In most of these areas, the Altaic languages are not official
languages. Therefore, in those countries, the Altaic languages are in danger
of extinction.7
The most important criteria for determining which language or dialect
to survey were “the degree of endangerment” and “the possibility of survey”.
The first criterion selects languages that are seriously endangered.
The second selects languages whose speakers are easily and
safely accessible.8
2.2 Organization of the Fieldwork Teams

The ASK REAL had three fieldwork teams: one for Turkic, one for Mongolic,
and one for Manchu-Tungusic. These three teams usually carried out their
fieldwork at the same time and, where possible, two teams worked together. The author
was responsible for the Turkic team.
A fieldwork team was composed mainly of four people: one team leader
(a professor) who oversaw the whole process of the field research; one specialist
(with a doctor’s degree) in Manchu-Tungusic, Mongolic, or Turkic languages
who did the transcription; one questioner who was a fluent speaker of Khalkha
Mongolian, Chinese, or Russian; and one person for sound and video recording.9
This was the composition of a fieldwork team at the beginning of the 1st research
project. Later, however, the specialists (with a doctor’s degree) in the languages
in question also assumed the role of team leader in the Turkic and Mongolic teams.
The persons responsible for sound and video recording were graduate students, the next
generation of the discipline, of the Department of Linguistics at Seoul National
University. The questioners were usually foreign graduate students of the same department.
7 See http://altaireal.snu.ac.kr/askreal_v25/altlang_main.html.
8 See Kim 2008: 136, Kim 2011: 51, and Yu 2014: 24.
9 See Kim 2008: 139, Kim 2011: 52, and Yu 2014: 24. For field research equipment, see
Kim 2008: 143-151, Kim 2011: 71-77,
http://www.cld-korea.org/eng/fieldwork/fieldwork4_1.php,
http://www.cld-korea.org/eng/fieldwork/fieldwork4_2.php, and
http://www.cld-korea.org/eng/fieldwork/fieldwork4_3.php.
Depending on the circumstances, the roles of the team members could overlap,
and the number of team members could increase or decrease by one or two.10
2.3 Questionnaire
With reference to Èwēnkèyǔ Jiǎnzhì (鄂温克语简志 “Brief record of the Ewenki
language”), published in China, the first questionnaire of the Altaic Society
of Korea was prepared and used for Ewenki and Dagur in 2000.11 This
questionnaire consisted of vocabulary, grammar and conversational components.
It was revised
and enlarged. In the version of September 2003, i.e. the first version for the first
research project, it had 1,625 vocabulary items, 219 conversational sentences,
and 182 grammatical sentences.12 Based on this version for China, the ver-
sions for Russia and Mongolia were prepared in 2003 and in 2005 respectively.13
These questionnaires were revised and enlarged continuously. Now they typi-
cally list around 2,750 vocabulary items, 340 conversational sentences, and 380
grammatical sentences.14
The Altaic Society of Korea has selected 24 lexical item classifications, seven
grammatical categories, and 17 situations for conversational settings.15 These
questionnaires are designed for surveys which typically last 3-4 days at 6 hours
a day.16 The informants were asked to answer each entry twice. In addition,
spontaneous speech was often gathered in an impromptu manner.17
A phonological component is omitted from these questionnaires, because in
many cases the phoneme inventory cannot be known in advance, owing to the lack
of a phonemic description of the language to be surveyed. The inventory is
investigated later using the lexical data.18
When considering content placement, it is important to record information
about the native-speaker informant at the front of the questionnaire. The
informant’s name, age, gender, ethnicity, family situation, information about
10 See Kim 2008: 139.

11 See Seong 2008: 30, Kim 2008: 153, and Kim 2011: 57.
12 See http://altaireal.snu.ac.kr/askreal_v25/enqpdf/enq_0004.pdf and
http://altaireal.snu.ac.kr/askreal_v25/field_research.html.
13 See Kim 2008: 154, 159 and Kim 2011: 57. 2004 is given instead of 2005 due to an editorial
error in Kim 2011: 57.
14 See http://www.cld-korea.org/eng/fieldwork/fieldwork3_1.php.
15 See Yu 2014: 25 and http://www.cld-korea.org/eng/fieldwork/fieldwork3_1.php.
17 See http://www.cld-korea.org/eng/fieldwork/fieldwork3_2.php.
18 See Kim 2008: 153 and http://www.cld-korea.org/eng/fieldwork/fieldwork3_2.php.
language use, birthplace, residence history, and other relevant pieces of infor-
mation should be recorded.19
2.4 Processing Collected Data

We digitized the collected data, and marked and extracted each item using
Sony Sound Forge for audio data and Windows Movie Maker for video data.20
This work was done by the graduate research assistants in almost all cases.21
2.5 Publications
The ASK REAL published three books in Korean as consolidated reports of the
project:22
(1) The Altaic Society of Korea (ed.) (2006). Fieldwork Studies of Endangered
Altaic Languages – For the Genealogical Study of Korean and the
Preservation of Endangered Languages –. The Language and Cultural
Studies Series 2. Paju: Taehaksa. [한국알타이학회 (엮음) 2006. 절멸 위
기의 알타이언어 현지 조사 – 한국어 계통 연구와 알타이언어
보존을 위하여 –. 알타이학회 언어 문화 연구 2. 파주: 태학사.]
This book is a general report on the ASK REAL project done from
September 2003 to February 2006 (Fieldwork Studies of Altaic Languages
for Genealogy of Korean).
(2) Kim, Ju-won, et al. 2008. Documentation of Endangered Altaic Languages.
Paju: Taehaksa. [김주원 외. 2008. 사라져가는 알타이언어를 찾아서. 파
주: 태학사.]
This book is a general report on the ASK REAL project done from
September 2003 to August 2006 (Fieldwork Studies of Altaic Languages
for Genealogy of Korean).
(3) Kim, Ju-won, et al. 2011. Documentation of Altaic Languages for the
Maintenance of Language Diversity. Paju: Taehaksa. [김주원 외. 2011. 언어
다양성 보존을 위한 알타이 언어 문서화. 파주: 태학사.]
This book is a general report on the ASK REAL project done from July
2006 to June 2009 (Building Digital Archive of Altaic Languages for the
Study of Genealogy of Korean).
19 See Kim 2008: 156, Kim 2011: 58, and
http://www.cld-korea.org/eng/fieldwork/fieldwork3_1.php.
20 See http://www.cld-korea.org/eng/archives/archives_1.php. For details, see
Kim 2008: 176-193, Kim 2011: 77-87, and Yu 2014: 27-28.
21 See Kim 2008: 176.
The members of the ASK REAL published seven descriptive grammars of some
endangered Manchu-Tungusic, Mongolic, and Turkic languages in English as the
Altaic Languages Series:23
(1) Kim, Ju-won, et al. 2008. Materials of Spoken Manchu. Altaic Languages
Series 01. Seoul: Seoul National University Press.
(2) Yu, Won-soo, et al. 2008. A Study of the Tacheng dialect of the Dagur lan-
guage. Altaic Languages Series 02. Seoul: Seoul National University Press.
(3) Li, Yong-Sŏng, et al. 2008. A Study of the Middle Chulym dialect of the
Chulym language. Altaic Languages Series 03. Seoul: Seoul National
University Press.
(4) Yu, Won-soo. A Study of the Mongol Khamnigan spoken in northeastern
Mongolia. Altaic Languages Series 04. Seoul: Seoul National University
Press.
(5) Li, Yong-Sŏng. 2011. A Study of Dolgan. Altaic Languages Series 05. Seoul:
Seoul National University Press.
(6) Kim, Ju-won. 2011. A grammar of Ewen. Altaic Languages Series 06. Seoul:
Seoul National University Press.
(7) Ko, Dong-ho & Gyu-dong Yurn. 2011. A Description of Najkhin Nanai.
Altaic Languages Series 07. Seoul: Seoul National University Press.
The first three books were selected among the excellent scholarly books
(and were the only books in a foreign language) for 2009 by the National
Academy of Sciences of Korea.
The members of the ASK REAL also published two monographs in Korean on the
methodology of field research and documentation:24
(1) Choi, Moon-jeong, et al. 2011. Linguistic Questionnaire for Investigation of
Altaic Languages. Paju: Taehaksa. [최문정 외. 2011. 알타이언어 현지 조사
질문지. 파주: 태학사.]
(2) Choi, Woon-ho. 2011. The Practice of the Text Material Analysis and the
Digital Archive Constructions for Altaic Languages. Paju: Taehaksa. [최운
호. 2011. 알타이언어 텍스트 자료의 분석과 디지털 아카이브 구축의 실
제. 파주: 태학사.]
23 See Yu 2014: 29-30. These books do not contain spontaneous speech. The 8th volume
of this series was also published in 2015 as follows: Kim, Ju-won, Dong-ho Ko, Antonina
Kile & Moon-jeong Choi. 2015. The Life and Rituals of the Nanai People. Altaic Languages
24 See Yu 2014: 29.
Papers on the findings of the fieldwork studies have been published in domestic
and foreign academic journals, or presented at conferences, in Korean or in
English.25
3 Fieldwork Studies of the Endangered Turkic Languages
3.1 The 1st Research Project

The original plan of the Altaic Society of Korea was to select two languages for
each team and to survey each language three times over three years, i.e. once
a year. The following languages were selected for the Turkic team: (1) Fuyü
Kyrgyz in China, (2) the Oghuz dialects of the Amu Darya region in Turkmenistan.
In the regions where the endangered Altaic languages are spoken, it is usually
very cold in winter, whereas in summer the ground is so muddy that contact with
native speakers is very difficult. Therefore, at the beginning of the research
project these three teams carried out fieldwork studies during the semester.
A fieldwork trip, including departure and return, lasted 7-10 days in total.
So, it was not possible to survey many informants.
The first fieldwork for Fuyü Kyrgyz was carried out on September 23-24,
2003 in the villages of Wujiazi and Qijiazi. The questionnaire had 1,625
vocabulary items, 219 conversational sentences, and 182 grammatical sentences.
It was only possible to obtain the following data:
216 words and 22 conversational sentences from Wujiazi

305 words and 32 conversational sentences from Qijiazi.
The amount of data obtained was very small. Therefore, the second fieldwork
for Fuyü Kyrgyz was carried out on January 15-16, 2004 in the village of
Qijiazi. However, it was only possible to obtain 363 words and 43
conversational sentences. It was also impossible to obtain any data on
grammatical sentences.
In light of the results of the fieldwork on Fuyü Kyrgyz, the original plan of
the research project was changed to survey as many Altaic (Turkic in particular)
languages as possible. Each team tried to save expenses. Consequently, it was
possible to carry out more fieldwork studies than in the original plan.
It was impossible to get a visa for Turkmenistan in Seoul at that time.
Therefore, Prof. Dr. Mehmet Ölmez, one of the foreign collaborators, carried
25 See Yu 2014: 30-31.

out with his colleagues the fieldwork studies on the Oghuz dialects of Amu
Darya region in Turkmenistan during May 7-17, 2004.
The following Turkic languages were surveyed in the 1st research project.26
(Information is given in the following order: name of the object language
(or dialect); place of the fieldwork study; time period of the fieldwork study):
(1) Fuyü Kyrgyz; Wujiazi and Qijiazi, Fuyu County, Qiqihar, Heilongjiang
Province, China; September 23-24, 2003
(2) Fuyü Kyrgyz; Qijiazi, Fuyu County, Qiqihar, Heilongjiang Province, China;
January 15-16, 2004
(3) Shor (Mrass dialect); Myski, Kemerovo Province, Russia; April 20-23, 2004
(4) Oghuz dialects of Amu Darya region; Ashgabat, Turkmenistan; May 12-17,
2004
(5) Tuvan (Kök Monchak dialect); Aqqaba, Altay Prefecture, Xinjiang Uyghur
Autonomous Region, China; October 21-22, 2004
(6) Kazakh; Almaty, Kazakhstan; January 4-7, 2005
(7) Yakut; Yakutsk, Sakha Republic, Russia; February 16-20, 2005
(8) Chuvash; Cheboksary, Chuvash Republic, Russia; April 19-22, 2005
(9) Tuvan (Kök Monchak dialect); Qanas and Aqqaba, Altay Prefecture,
Xinjiang Uyghur Autonomous Region, China; April 28-May 4, 2005
(10) Tuvan (Tsaatan dialect); Khatgal, Khövsgöl Aymag, Mongolia; June 23-26,
2005
(11) West Yugur; Hongwansi, Sunan Yugur Autonomous County, Gansu
Province, China; October 17-21, 2005
(12) Gagauz; Kiev, Ukraine; February 5 and 17, 2006
(13) Urum (Oghuz dialect); Mariupol’, Donetsk Province, Ukraine; February
7-11, 2006
(14) Urum (Kypchak dialect); Mariupol’, Donetsk Province, Ukraine; February
8-9, 2006
(15) Krymchak; Simferopol’, Autonomous Republic of Crimea, Ukraine;
February 12-14, 2006
(16) Karaim (Crimean dialect); Jevpatorija and Simferopol’, Autonomous
Republic of Crimea, Ukraine; February 15-16, 2006
(17) Chulym Tatar (Middle Chulym dialect); Tomsk, Tomsk Province, Russia;
May 15-18, 2006
(18) Siberian Tatar; Tomsk, Tomsk Province, Russia; May 18 and 20, 2006
26 Two co-researchers performed a preliminary survey of Altai and Khakas during
February 17-19, 2004 in Novosibirsk, Russia. This survey is not included here.
(19) Chulym Tatar (Lower Chulym dialect); Tomsk, Tomsk Province, Russia;
May 19-21, 2006
(20) Tuvan (Uriankhai dialect); Tsagaan-Üür, Khövsgöl Aymag, Mongolia;
July 1-3, 2006
The author participated in all the fieldwork studies listed here except for
(4)-(7). The informant for Lower Chulym was the last speaker of this dialect
and died in 2011. Therefore, the linguistic data collected by the author are
probably the last to be recorded for this dialect.
3.2 The 2nd Research Project

The original plan of the Altaic Society of Korea was to select six languages for
each team and to survey one language every six months. The following lan-
guages were selected for the Turkic team: (1) Gagauz, (2) Dolgan, (3) Bashkir,
(4) Salar, (5) Kyrgyz, and (6) Tofa. All the fieldwork studies were planned to be
carried out during the summer or winter vacation.
The following Turkic languages were surveyed in the 2nd research proj-
ect. (Information is given in the following order: name of the object language
(or dialect); place of the fieldwork study; time period of the fieldwork study):
(1) Salar; Xining, Qinghai Province, China; August 19-26, 2006

(2) Kyrgyz (Talas subdialect of Northern dialect); Bishkek, Kyrgyzstan;
December 22-25, 2006
(3) Kyrgyz (Ičkilik subdialect of Southern dialect); Bishkek, Kyrgyzstan;
December 28-30, 2006 and January 2, 2007
(4) Kyrgyz (Narïn subdialect of Northern dialect); Bishkek, Kyrgyzstan;
January 2-3, 2007
(5) Kyrgyz (Čüy subdialect of Northern dialect); Bishkek, Kyrgyzstan; January
3-5, 2007
(6) Kyrgyz (Ïsïk-Köl subdialect of Northern dialect); Bishkek, Kyrgyzstan;
January 6-8, 2007
(7) Dolgan; Yakutsk, Sakha Republic, Russia; January 30-February 5, 2007
(8) Yakut; Yakutsk, Sakha Republic, Russia; February 3-5, 2007
(9) Karaim (Trakai dialect); Trakai, Lithuania; July 17-28, 2007
(10) Bashkir (Ĕyĕk-Haqmar subdialect of Southern dialect); Ufa, Republic of
Bashkortostan, Russia; August 8-10 and 21, 2007
(11) Bashkir (Urta subdialect of Southern dialect); Krasnousol’sk, Republic of
Bashkortostan, Russia; August 11-13, 2007
(12) Bashkir (Zilim subsubdialect of Urta subdialect of Southern dialect);
Krasnousol’sk, Republic of Bashkortostan, Russia; August 14, 2007
(13) Bashkir (Dim subdialect of Southern dialect); Rajevka, Republic of

Bashkortostan, Russia; August 14-16, 2007
(14) Bashkir (Qïδïl subdialect of Eastern dialect); Asqar, Republic of Bash-
kortostan, Russia; August 19-20, 2007
(15) Chuvash; Ufa, Republic of Bashkortostan, Russia; August 22, 2007
(16) Tatar; Ufa, Republic of Bashkortostan, Russia; August 22, 2007
(17) Khakas (Sagay dialect); Abakan, Republic of Khakassia, Russia; August
2-4, 2008
(18) Khakas (Shor dialect); Chernogorsk / Abakan, Republic of Khakassia,
Russia; August 5-7 / 15, 2008
(19) Khakas (Koibal subdialect of Kacha dialect); Abakan / Bejskij rajon selo
Kojbaly / Abakan, Republic of Khakassia, Russia; August 7 / 12 / 13, 15-16,
2008
(20) Khakas (Kïzïl dialect); Abakan, Republic of Khakassia, Russia; August
8-11, 13, 16, 18, 2008
(21) Khakas (Beltir subdialect of Sagay dialect); Askizskij rajon selo
Apchinajev / Abakan, Republic of Khakassia, Russia; August 12 / 13, 18, 2008
(22) Khakas (Kacha dialect); Abakan, Republic of Khakassia, Russia; August
13-16, 2008
(23) Altai (Telengit dialect); Gorno-Altajsk, Altai Republic, Russia; January
18-20/28, 2009
(24) Altai (Altai-kizhi dialect); Gorno-Altajsk, Altai Republic, Russia; January
20-21/24/26/27, 2009
(25) Altai (Kumandy dialect); Gorno-Altajsk, Altai Republic, Russia; January
22/25, 2009
(26) Altai (Chalkandu dialect); Gorno-Altajsk, Altai Republic, Russia; January
23-24, 2009
(27) Altai (Teleut dialect); Gorno-Altajsk, Altai Republic, Russia; January 25,
2009
(28) Altai (Tuba dialect); Gorno-Altajsk, Altai Republic, Russia; January 28,
2009
The author participated in all the fieldwork studies listed here except
for (4).
The Tofa-speaking region was not easily and safely accessible. It would have
been very expensive to go there. Moreover, we were informed that there were no
fluent Tofa speakers. Therefore, Khakas was selected instead of Tofa.
Altai was selected instead of Gagauz, because it was almost impossible to
find informants at that time in winter.
3.3 The Unrealized 3rd Research Project

To carry out fieldwork studies on Altaic languages from July 1, 2009 to
June 30, 2014, the members of the ASK REAL applied in March 2009 for a grant
of the National Research Foundation of Korea (formerly the Korea Research
Foundation) with the subject ‘Documentation of Endangered Altaic Languages –
For the Study of Genealogy of Korean –’ (called ASK DEAL unofficially). The
Altaic languages to survey were as follows:
(1) Turkic: Fuyü Kyrgyz, Shor, Tuvan, Kumyk, Gagauz, Karaim, Urum
(2) Mongolic: Mongolian, Buryat, Kangjia, Oirat-Kalmyk, Bonan
(3) Manchu-Tungusic: Manchu, Uilta, Nanai, Ewenki, Ewen, Sibe, Orochi,
Solon
However, this research project was rejected by the examiners of the National
Research Foundation of Korea on the pretext that the members of the
ASK REAL had done nothing in spite of the enormous support of the Korea
Research Foundation. This is simply not true. As mentioned above, the members
of the ASK REAL wrote many books, articles, etc. Moreover, three of the books
were among the excellent scholarly books (and were the only books in a foreign
language) selected for 2009 by the National Academy of Sciences of Korea. The
real reason is the common practice in Korea of funding at most two successive
research projects, regardless of good results.
From July 2009 to January 2010, Seoul National University supported the
members of the ASK REAL in preparing a web site in English providing
information on the ASK REAL. The result is the web site
http://altaireal.snu.ac.kr/askreal_v25.
With funding from the National Research Foundation of Korea, Ju-won KIM
and his colleagues carried out the “Languages and Culture of the Indigenous
Peoples of the Amur River” Project from May 1, 2010 to April 30, 2013. The
author participated in this project for the first six months.
Sponsored by the Ministry of Culture, Sports and Tourism of Korea,
Ju-won KIM and his colleagues including the author prepared the web site
of the Center for Language Diversity (http://www.cld-korea.org/index.php)
from October 2010 to March 2011. This site also has an English version
consisting of four sections: About, Fieldwork, Archives, and Bibliography. The
materials in the Fieldwork, Archives, and Bibliography sections relate to the
ASK REAL.
4 Conclusion
Using homogeneous questionnaires and excellent audio/video equipment, the
members of the Altaic Society of Korea carried out two successive research
projects with the unofficial name ‘ASK REAL’ and published some books
and papers in connection with these projects. The following points are to be
mentioned:
(1) The period of fieldwork studies was not sufficient. So, it was not possible
to survey many informants. However, it would cost too much to survey for
a long time, for example, one or two months. Moreover, there was almost
nobody to do so, especially among the graduate research assistants.
(2) The number of informants was usually very small for each language. In
many cases, there were only one or two informants. Therefore, the reli-
ability of the collected data may be questionable. However, the members
of the ASK REAL did their best in a given situation.
(3) It was not possible for the members of the ASK REAL to continue their
fieldwork studies on (endangered) Altaic languages, owing to the common
practice in Korea of funding at most two successive research projects
regardless of good results. Financial support is needed for those fieldwork
studies.
(4) The members of the ASK REAL digitized the collected data, and
marked and extracted each item using Sony Sound Forge for audio data
and Windows Movie Maker for video data. This work was done by
the graduate research assistants in almost all cases. However, most of
these data have not been transcribed by experts. Practically no one among
the graduate students, the next generation of the discipline, wants to
devote himself/herself to Altaic studies in Korea, because there is little
prospect of finding any opportunity in Altaic studies.
References
The Altaic Society of Korea (ed.) (2006). Fieldwork Studies of Endangered Altaic
Languages – For the Genealogical Study of Korean and the Preservation of Endangered
Languages –. The Language and Cultural Studies Series 2. Paju: Taehaksa. [한국알
타이학회 (엮음) 2006. 절멸 위기의 알타이언어 현지 조사 – 한국어 계통 연구
와 알타이언어 보존을 위하여 –. 알타이학회 언어 문화 연구 2. 파주: 태학사.]
Baskakov, Nikolaj A. (ed.) (1966). Jazyki narodov SSSR: 2. Tjurkskije jazyki. Moskva:
Nauka.
Choi, Moon-Jeong, et al. 2011. Linguistic Questionnaire for Investigation of Altaic
Languages. Paju: Taehaksa. [최문정 외. 2011. 알타이언어 현지 조사 질문지. 파주:
태학사.]
Choi, Woon-ho. 2011. The Practice of the Text Material Analysis and the Digital Archive
Constructions for Altaic Languages. Paju: Taehaksa. [최운호. 2011. 알타이언어 텍스
트 자료의 분석과 디지털 아카이브 구축의 실제. 파주: 태학사.]
Deny, Jean, et al. (eds.) (1959). Philologiae Turcicae Fundamenta; T. 1. Wiesbaden: Steiner.
Grenoble, Lenore. 2008. Endangered Languages. In Peter K. Austin (ed.), One thousand
languages: living, endangered, and lost, 214-235, Berkeley & Los Angeles: University
of California Press.
Johanson, Lars & Éva Á. Csató (eds.) (1998). The Turkic Languages. London & New York:
Routledge.
Kim, Ju-won, et al. 2008. Materials of Spoken Manchu. Altaic Languages Series 01. Seoul:
Seoul National University Press.
Kim, Ju-won, et al. 2008. Documentation of Endangered Altaic Languages. Paju:
Taehaksa. [김주원 외. 2008. 사라져가는 알타이언어를 찾아서. 파주: 태학사.]
Kim, Ju-won. 2011. A grammar of Ewen. Altaic Languages Series 06. Seoul: Seoul
National University Press.
Kim, Ju-won, et al. 2011. Documentation of Altaic Languages for the Maintenance of
Language Diversity. Paju: Taehaksa. [김주원 외. 2011. 언어 다양성 보존을 위한 알
타이 언어 문서화. 파주: 태학사.]
Ko, Dong-ho & Gyu-dong Yurn. 2011. A Description of Najkhin Nanai. Altaic Languages
Series 07. Seoul: Seoul National University Press.
Li, Yong-Sŏng. 2008. Endangered Turkic Languages – Preliminary Report on Fieldwork
Studies –. Sibirische Studien 3/1. 1-25.
Li, Yong-Sŏng, et al. 2008. A Study of the Middle Chulym dialect of the Chulym language.
Altaic Languages Series 03. Seoul: Seoul National University Press.
Li, Yong-Sŏng. 2011. A Study of Dolgan. Altaic Languages Series 05. Seoul: Seoul National
University Press.
Seong, Baeg-in, Ju-won Kim, Dong-ho Ko & Jae-il Kwon. 2010. Grammar and lexicon
of Dagur and Ewenki Spoken in China. Daewoo Academic Series 597. Seoul: Acanet.
[성백인, 김주원, 고동호 & 권재일. 2010. 중국의 다구르어와 어웡키어의 문
법∙어휘 연구. 대우학술총서 597. 서울: 아카넷.]
Tekin, Talat. 1991. A New Classification of the Turkic Languages. Türk Dilleri
Araştırmaları (Researches in Turkic Languages) 1. 5-18.
Yu, Won-soo, et al. 2008. A Study of the Tacheng dialect of the Dagur language. Altaic
Languages Series 02. Seoul: Seoul National University Press.
Yu, Won-soo. A Study of the Mongol Khamnigan spoken in northeastern Mongolia. Altaic
Languages Series 04. Seoul: Seoul National University Press.
Yu, Won-soo. 2014. Documentation of Altaic Languages. In Workshop on Language

Documentation and Typology, 21-36. [This workshop was held at Seoul National
University with the same name on 12-13 September, 2014 by the Department of
Linguistics at Seoul National University.]
chapter 10
Empire, Lingua Franca, Vernacular: The Roots
of Endangerment
Nicholas Ostler
Endangered languages are one kind of outcome of a historical process which is
fundamental to the story of humanity. Recounting this process illuminates the
causes of language endangerment, and may give rise to a more realistic under-
standing of what it is, why it matters, and how any policy may be designed to
affect it or reduce it.
When we speak of language endangerment, we are not suggesting that the
human language faculty itself is in danger: what is endangered is one or more
specific languages, languages which stand to cease being used without being
replaced by any other equally distinct languages. In short, and to generalize
over many individual cases, the danger is that language diversity will be dimin-
ished: the number of languages in the world will decrease, and the average
population of languages remaining will increase correspondingly.
The first question to ask, then, in order to give the background to language
endangerment, is: Where does language diversity come from? How was
it created?
The answer is that it is the joint result of the development of the human
language faculty (presumably a single event, approximately 100,000 years ago),
and the spread of humanity, a single species which might be called Homo
Loquens, to dominate ecological niches in every part of the earth. This process
of spread, which is believed to have taken place between the eras 100,000 and
12,000 years ago,1 totally outpaced any ability of the human race to keep in
touch with its constituent tribes as they moved across and out of Africa, and
ultimately into every part of the earth’s land surface except Antarctica. Since
human tribes were able to survive and flourish independently without mutual
contact, there was no common constraint on how the language which each
community spoke would develop.
The process of language change will have accompanied this vast spread of
humanity, and will have continued after all the niches had been filled with
peoples, each making a living out of local conditions and resources. Inevitably,
1 Burenhult 2000.

they will have developed different vocabularies to support their very different
life-styles; but they will also have perpetuated their languages by means that
are as old as language itself: the young learn to speak and understand from
the old. This process of learning requires imitation of sound in the context of
stimulus, but also rational reconstruction of the system and the meanings it
conveys. As a process, it does not give perfect copies of the elders’ speech, and
there is always the potential, in every generation, for the language to be a little
different from how the elders spoke it. And since there is no general opportu-
nity for distant tribes to contact one another and communicate, this process
of imperfect learning will have led – over 100,000 years – to all the diversity of
language, in phonetics, vocabulary and structures that we now discover when
we review the languages humanity still speaks and (where documented in
writing) has spoken.
This account of the origin of human language diversity is not just a ratio-
nal reconstruction. The purported route of humanity round the world is
confirmed by the pattern of gene mutations (Y chromosome for men,2 mito-
chondrial DNA for women),3 and also the gross positioning of detectable fami-
lies of languages.4
The ability to use language must have been important in making this
explosion of humanity’s homelands the success that it was. Language makes
thoughts, ideas and plans discrete and potentially explicit: therefore, they
become available for inspection, and for discussion with, and transmission to
others who use the same code. Joint planning is possible, and so is co-ordination
of action beyond the instinctive or habitual. Imagined possibilities can be
explored, loyalties pledged, memories recalled. This makes a tribe with lan-
guage much more powerful and effective in the game against nature.
But this does not exhaust language as it was experienced. It was not just
language as such: rather, it was a multitude of languages.
Given the spread of people out beyond the bounds of regular contact and
the iron law of imperfect transmission from generation to generation, differ-
ent, incompatible codes of language came to be used. This had little ill-effect
in these early days of human spread and residence: by its nature, the people
who spoke languages other than yours were the people you were unlikely to
meet, or to collaborate with. But even within a single code well within your
understanding, regional groups would speak with perceivable variation, so that
2 https://en.wikipedia.org/wiki/Human_Y-chromosome_DNA_haplogroup#Major_Y-DNA_
haplogroups.
3 https://en.wikipedia.org/wiki/Human_mitochondrial_DNA_haplogroup.
4 http://emeld.org/workshop/2004/bibiko/bibiko-original.html.
speech became a marker of tribe or kin-group: how much more so, when one
encountered people who were incomprehensible. The way one spoke became
a badge of belonging – both reinforcing solidarity among comrades, but also –
when among enemies – providing a give-away shibboleth of one’s true loyalties.
This badge function of language patterned with, and perhaps sometimes
also against, other markers in dress and custom. But it also fed into the reserves
of remembered lore which came with an upbringing in a particular language,
so that each language grew its own traditions and literature, recall of which
served to reinforce, in a different way, the conscious identity of speakers.
However, this scattering of humanity into small, self-conscious and self-loyal
groups ran up against another characteristic of the environment, sometime
after the human species had taken effective occupation of all the landmasses
of the earth. This was man’s ability to domesticate and intensify the food ani-
mals and food plants: pastoralism and agriculture. Rather than living off the
land, seeking whatever might be hunted or gathered, some groups began to
adapt their life-style to control of foodstuffs.
This had various effects on their lived world. Importantly, they became more
sedentary, staying with their fields and pastures, and growing their population
in place, since the new methods meant more food could be produced with the
new approach to applying human effort. Sedentary bases needed to be pro-
tected, both from weeds and wild beasts, but also from other human beings
applying the world’s oldest labour-saving device: robbery. Protection and
predation – defensive and aggressive warfare – became specialist occupations;
and the utility of language in organizing large numbers of people was increas-
ingly valued. Large groups came to be safer – and more powerful – than small
ones; and the large groups began to define particular roles – ranks, functions
and jobs – for individuals. It became possible to progress through a career, but
even more to destine people for particular roles – including priests to preside
over tradition, and rulers to preside over the whole structure of agricultural and
pastoral society.
All in all, the economic principle of falling marginal costs began to assert
itself. When controlled units of production were large, the cost of adding to
them was less than where the units were smaller: growth fed on itself, mak-
ing bigger units natural winners – and conversely the traditional, small self-
sufficient units, natural losers. At least, this was the case where resources were
relatively easy to access: so the rich lands fell to larger and fiercer tribes, with
marginal land being left to the traditional hunter-gatherers.
Two of the problems of running large units are the threat of anonymous theft,
and the difficulty of enforcing distributive contracts when they rely solely on
the memory of (corruptible) participants and witnesses. A useful response to
this was the development of tally systems, and as their flexibility was
appreciated, they changed into a complete system of writing, which could
represent distinctly anything that could be said. Once this was available
(defining another
rank, function and job – the scribe) it soon became clear that there were many
other uses to which this frozen speech could be put, not least correspondence
which would make rulers’ decisions knowable, and enforceable, wherever
their language was understood.
Writing made possible the control of large-scale empires, the use of written
orders and intelligence overcoming a natural limitation on an oral command
structure. It also facilitated long-distance trade, by clarifying the content of
orders and payments. But the effectiveness of these could only be realized if
the recipients of messages could understand them. There was now a premium
on understanding the language of the ruler or the merchant – in other words,
the need for a common language between instructor and agent.
People now had a motive to learn other languages, at least the language of
those in power. This was the simplest case, but in fact, there was no pressing
need for the language of communication to be the instructor’s own, as long
as there was some language – a lingua franca – which was accessible both to
instructor and agent. Hence the Achaemenid empire of the Persians, founded
by Cyrus and destroyed by Alexander the Great, had for 400 years kept Persian
as the language of the rulers, while imperial communications over their massively
multilingual domain “from Hōdu to Kush” (i.e. from India to Sudan) were
conducted in Aramaic, which had been the native language of the (preceding)
Babylonian empire.
A further use was discovered for written language during this period
of the mid first millennium BC (called by German philosopher Karl Jaspers
“the Axial Age”). This was the writing down of religious scriptures, making
permanent and evident (just like the ruler’s commands) the tradition passed
down by the priests. Since writing was highly explicit in its form, this also had
the effect of sanctifying a particular style and dialect of language. Once written
down, language does not change, even if it is read by successive generations
whose spoken language is different, and the result was to give scriptural
language an authority that surpassed any vernacular dialect.
The institution of human empire, then, with its three clear prongs of military
command (including taxation), commercial exchange and religious worship,
created a need for a large-scale super-language – a lingua franca – with
currency that extended beyond the range of any vernacular, and inevitably a
language which could be written down fully and clearly.
With this established, the tendencies which have since made for language
endangerment were all in place. As the empires expanded, and became
hardened by inherited loyalties, there was less space left for communities
which relied only on their own traditional vernaculars; inevitably, these would
only survive in less favoured sites, such as marshes, mountains and deserts. As
the empires tended to use their power to monopolize wealth, there would be
little left for the free peoples outside. The hierarchy of command and wealth in
the empires would be reflected in a class system with defined échelons of rank:
but the outsiders who spoke their own language would be seen as beyond the
pale. And with the establishment of religions, notably the missionary faiths
Buddhism, Christianity and Islam, the denizens of the large empires would
also see themselves as having a virtue that was superior to the “lesser breeds
without the law”.
Although the small tribes had shown the variability and innovations which
once provided the wherewithal to spread humanity into every corner of the
habitable earth, they were in effect disrespected and humbled when humanity
had turned the world – and above all, its sources of food – into its own domain.
Language endangerment, then, which can be defined as the pressure to
give up on the use of the languages of smaller communities, is a tendency with
a long tradition. Looking, for example, at the area of western Europe, which
was largely conquered by Rome up to the first century AD, the 16 or more lan-
guages spoken in 500 BC had declined to some half dozen by 500 AD.
Most had been replaced by Latin, the language of a large-scale empire; but
there was still a lot of use of Gothic, which had come in latterly with maraud-
ing tribes. The old languages that survived in Europe (Aquitanian aka Basque,
and Celtic in Brythonic and Goidelic forms), together with Punic and Albanian
slightly further afield, were all spoken in areas which had been on the edge of,
or quite beyond, the land occupied by Rome. Their territories were also moun-
tainous, making them of less economic value.
In the east of the Mediterranean, the Greek language, which had spread as a
lingua franca with Alexander’s empire (following on from a widespread Greek
commercial network round the whole sea), proved quite able to resist the con-
quering might of Rome: this was a case of one lingua franca (Greek) ranged
against another (Latin), and Greek won: it was already in place, already widely
known to many Romans (in both elite and slave classes), and did not yield in
prestige to Latin. It may also have benefited from the fact that there was rela-
tively less settlement of Roman colonies in the eastern Mediterranean (though
there was some – e.g. Corinth, sacked by Rome in 146 BC but re-settled only a
century later, in 44 BC, and with many Romans. It would become the capital of
the Roman province of Achaea, but still reverted to Greek.)
Since we have represented the tendency to language consolidation – the
flip-side of language endangerment – as almost a social necessity of human
development, at least in the period of increasing empires, it is worth pausing a
moment to consider why it might be seen as regrettable.
As we have painted it, language concentration, or at least the widespread
use of lingua francas, is a side-effect of the requirements of imperial adminis-
tration, which itself is an effect of falling marginal costs as economic units are
organized on a larger and larger scale. If so, perhaps the increasing population
of lingua-francas, and the decline, even loss, of minority languages are inevitable.
Part of the regret at language loss is undeniably sentiment, and empathy for
what is lost as one human condition makes way for another: as the Roman poet
Virgil puts it – his work the very pinnacle of all-conquering Latin –
Sunt hic etiam sua praemia laudi;
sunt lacrimae rerum et mentem mortalia tangunt.
Solve metus; feret haec aliquam tibi fama salutem.
Triumphs have their rewards here, but events provoke tears too, and the
mortality of things touches us. Still, dissolve your fears: the renown of
this will bring you some relief.5
These words are spoken by Virgil’s (Anatolian) hero Aeneas, once a Trojan, but
soon to be Roman, as he contemplates a monument to his own city’s destruc-
tion. The fact that we may understand the vast forces that annihilate some of
our valued possessions will not lessen the regret at their loss: and loss is just
that – it does not necessarily lead to a corresponding, or even greater, gain
further down the line.
Look at the loss of the Celtic languages, once spoken all over the west of
Europe and Iberia, but all overwhelmed in Gaul and in Spain by Latin. (In
fact the modern Romance languages are spoken almost exactly in the same
places where once was Celtic.) An artefact like the Gundestrup Cauldron from
Denmark, which shows a variety of strange details associated with the antler-
bearing god Cernunnos, makes it evident that he had a highly complex myth:
but we shall never know its content, since Greeks and Romans saw no reason
to write it down – neither in their own languages, nor (much less) in its own
Gaulish.
In fact, the loss of a language, and the community that was identified by
it, is not actually necessary in these cases, since the lingua-franca which has
endangered it could just as well be acquired in bilingualism with it. What
causes the loss is the lack of respect or affection for the old order – the
class-based contempt for others’ way of life and web of old relationships, all
articulated in a language which has fewer speakers.
5 Virgil, Aeneid, i. 461-3; translation mine.
Indeed, the survival of diversity can be a source of strength, though it is
not typically seen as such by those who have no concern for endangerment.
Andaman Islanders showed amazing savoir faire, and lack of casualties, when
confronted with the vast tsunami that devastated coasts all round the Indian
Ocean; and in both world wars, the USA benefited from having access to
American Indian language speakers to use as “code-talkers” – since their lan-
guages (such as Navajo, Choctaw and Cherokee) were effectively undecipher-
able to intelligence officers in enemy powers such as Germany and Japan. It
takes an extreme situation to bring out the salutary value of diversity and flex-
ibility. But it is evident to those who value their own traditions that something
important stands to be lost.
There is in fact a curious irony about knowledge and the value of diversity.
Typically, those who represent, by their language and their lives, a minority
culture will not see its value as stemming from its diversity: they just want to
hold on to what has been their own. It is the representatives of empire – often
native speakers of the lingua franca – who are in a position to be conscious of
diversity as a value in itself, indeed a form of cultural wealth.
We are presently in Ardahan, a city now in the extreme north-east of Turkey,
but in a land which has previously formed part of various language zones –
and indeed empires. Together with the neighbouring Caucasus, it is a vivid
example of how not only imperial re-organization, but also the movements of
peoples can radically alter the linguistic profile of a territory.
They show that the general secular trends towards endangerment of minority
languages, which we have identified as diminishing linguistic diversity in
the modern world, may be difficult to discern in the language history of a specific
region. Imperial lingua-francas can cloak the existence of the common
people’s vernaculars in the historical record; yet this situation may be unstable
in the longer term. Such lingua-francas can ultimately either be eliminated in
competition with some other elite language, allowing the vernacular to surge
up as the dominant language again, or else be replaced by another invading
vernacular. All of this has happened around Ardahan.
When we look as far back as we can for the language of this region, we get
back to the early first millennium BC. In that era, Ardahan lay in the northern
reaches of the kingdom of Urartu (a name which is also seen in the name of
Mt Ararat, not far to the south-east).
This people the Greek historian Herodotus calls the Alarodioi, as one
of many tribes ranged against the Greeks in Xerxes’ army of 480 BC – their
last appearance in recorded history – although the army of Orondas, says
Xenophon (5th century), besides Armenians, included Mards and Khaldian
mercenaries – who would probably be Kurds and Urartians.
map 1 Ardahan in west Asia today.
source: After Google Maps data.
Urartu was centred on Lake Van, and after successfully opposing the
Assyrian empire to its south in the 8th century, seems to have become a vassal
state ca 705 BC, through agreement with the Assyrian king Sennacherib. This
seems to have led to a period of peace and prosperity, which gradually degen-
erated through increasing attacks from the west and south, from marauding
Cimmerians, Scythians and Medes, all of whom would have been speaking
Iranian languages of the Indo-European family. These people are perhaps the
ancestors of the (equally Iranian) Kurds who dwell in a large area extending
east and west to the south of Lake Van. Urartu finally yielded to the Mede Cyaxares
in 612 BC, but shortly afterwards it was incorporated by Cyrus into the Persian
(Achaemenian) empire.
map 2 The Kingdom of Urartu, 9th-6th centuries BC.
source: Licensed under CC BY-SA 3.0 via Commons – https://commons.wikimedia.org/wiki/File.
Urartian is a language attested in cuneiform inscriptions which have survived
to the present day (also called Haldian, after the empire’s ruling deity Ḫaldi,
a warrior god who is usually depicted standing on a lion), and is related both to
Hurrian to its west, and Lezgian and Avar to the north in the Caucasus. But
it did not survive the collapse of the Urartian state. Instead, the Armenians,
with their Indo-European language separate from Iranian, appear to have
replaced it. The first Armenian-speaking dynasty was called Orontid (prop-
erly Eruand, later Yervanduni). It is clear, however, that there was consider-
able bilingualism between Armenians and Iranians in this area, since the
vocabulary of the Armenian language (called Hai by its speakers) is full of
Iranian loans.
This replacement seems comparable to the emergence of English from
under French in 14th-century England, after that country recovered from the
demographic (and social) shock of the Black Death. Perhaps because there was
little social mobility before the shock, it had not been possible for the elite
lingua-franca (respectively Urartian or French) to permeate into lower orders;
but when the aristocracy were shattered, the vernacular could emerge.
In fact, there is no literacy to attest the actual language being spoken until
two more conquests had passed over and around the Armenians: first the
Macedonian Greek conquest from the late 4th century (led by Alexander,
reinforced by his general Seleucus, and under Antiochus III finally depos-
ing the Orontids in 212 BC); then the Roman conquest (the general Lucullus
defeated Pontus and Armenia at the battle of Tigranocerta in 69 BC). One
long-lasting effect of the Roman domination of Anatolia was the penetration of
Christianity. This reached Armenia at the end of the 3rd century AD (hitherto
largely a Zoroastrian country), and the country has the distinction of being
the first officially Christian state. This led after a century (in 405) to the inven-
tion of the Armenian alphabet by Mesrop Mashtots, and the translation of the
scriptures and liturgy into Armenian.
map 3 Greater Armenia in 5th century AD.
source: Licensed under CC BY-SA 3.0 via Commons – https://commons.wikimedia.org/wiki/File:Armenian4thcenturies.gif#/media/File:Armenian4thcenturies.gif.
map 4 Armenia and Seljuq expansion 11th century.
source: After http://www.conflicts.rem33.com/images/Georgia/geor_histr%202.htm.
The Arsacids (a Persian dynasty) had gained control of Armenia in the first
century AD, and may have reinstated elite use of Persian in this region, but this
appears to have had little effect on continuing vernacular use of Armenian.
Politically and religiously, Armenia was even autonomous from Persia after 451,
hence effectively a monolingual country.
This would only change in the 11th century. First the Byzantine Greeks
invaded. This would have had little effect on language use, except in court cir-
cles. But then the Seljuq Turks (under Alp Arslan) invaded Armenia, as a first
fruit of the Turkish invasion of Anatolia more generally, enabled by their vic-
tory at Manzikert in 1071. This invasion actually involved long-term settlement,
hence penetration of Anatolia by the Oǧuz Turkish language, side by side with
Armenian, but not as an elite lingua-franca, since for the Turks this role would
be played by Persian (or the mixture of Persian and Turkish known as Lisân-ı
Osmânî – “Ottoman”).
So far, our story has extended over at most 2000 years. But for most of the
following millennium, the pace of change slows down. Armenian and Turkish
co-existed (with Greek) as vernacular languages within the domain of the
Ottoman empire.
The latest accident of history (in the last century) is to segregate languages
much more into discrete states, so that Ardahan is capital of a Turkish-speaking
area, while Armenian is spoken in the state of Armenia, capital Yerevan
(which some conjecture to be related to Yervand, the modern form of Oronta).
Meanwhile to the north, in the mountainous Caucasus, Kartvelian languages
unrelated to Armenian or Turkish, such as Georgian, Mingrelian and Laz, have
largely maintained their age-old locations.
In the Ardahan region over all these millennia, the languages which have been
endangered – indeed which have ultimately disappeared – have not been the
languages of down-trodden lower orders, but the reverse, the languages of
elites.
These are the languages which – ironically enough – tend to survive in the
written record, because they are used for royal inscriptions: Urartian from the
10th to 6th centuries BC, Old Persian and Imperial Aramaic from the 6th to
3rd, Greek from the 3rd BC to 10th AD, Persian and Osmani from the 10th to the
20th AD. But actual speakers and users have ultimately deserted them.
It is the languages of the people, Kurdish, Armenian and Oǧuz Turkish,
which have shown permanence: Kurdish and Armenian throughout the 3000
years we surveyed (though the presence of Armenian in eastern and central
Anatolia has recently been restricted artificially); and Turkish ever since its
arrival with Seljuq settlers in the 11th century.
How has this divergence been possible, from the general human pattern
outlined at the beginning of this paper? Are not dominant classes more capa-
ble of hanging on to, and indeed spreading use of, their languages?
The answer seems to be that this only applies when dominant urban upper
classes are in contest with an urban proletariat. By contrast, farming, i.e. settled
cultivation of the land, is more effective in perpetuating a language community
not only than the hunter-gatherer lifestyle which historically it replaced,
but also than the elite urban cultures which grow up, ultimately supported by
the surplus of food created by farming. It was the urbanized classes, upper and
lower, which were at risk of being “driven out” when their cities were conquered
and put under new rulers.
As an Irish immigrant put it in Margaret Mitchell’s popular 1936 novel
Gone with the Wind, “Land is the only thing in the world that amounts to
anything, . . . for ’tis the only thing in this world that lasts, and don’t you be
forgetting it!”
Empires, then, do have an effect of oppressing, and ultimately suppressing,
minority languages within their domains, provided that there is a lively pros-
pect of recruitment of members of lower orders into richer and more domi-
nant classes. Demeaned classes may try to ape their presumed betters, leaving
their own languages behind. If this kind of mobility is ruled out, then there
may be a polarization of languages, with a serf or peasant class continuing to
speak their language, and never finding means of promotion: this seems to
have been the position in mediaeval England, but also ancient Anatolia. The
language of the elite, though, may itself be under threat when power politics
leads to an overthrow of the elite by others; and in such changes the language
of the rural food producers is likely to be immune to change.
Reference
Burenhult, Göran. 2000. Die ersten Menschen, Augsburg: Weltbild Verlag.

chapter 11
Endangered Turkic Languages from China

Mehmet Ölmez
The Caucasus is perhaps the foremost region in the world in terms of endan-
gered languages. When we analyze the linguistic connection between Turkey
and the Caucasus, among the first languages that stand out are the Laz language
and Hamshin Armenian. What I shall be focusing on in this study,
however, takes us farther east to the Altai Mountains, a region that bears striking
similarities in its topography to Ardahan and the Caucasus. A number of
Chinese sources name Jin Shan (金山 “Gold Mountains”) as the region Turkic
peoples settled in.1 On that note, let us mention the peoples that first appeared
in this area and belong to the Turkic ethno-linguistic group. Those that remain
to this day within the borders of China are:
1. Uyghur 维吾尔, 2. Kazakh 哈萨克, 3. Kirghiz 柯尔克孜, 4. Salïr 撒拉, 5. Tatar
塔塔尔, 6. Tuvan 图瓦, 7. Yellow Uyghur 裕固, 8. Uzbek 乌孜别克, 9. Fuyu
Kirghiz; Kïrkïs 柯尔克孜.
Only six of the peoples listed continue to use their native language
effectively. The speakers of Uzbek have recently given up speaking
their own language and started speaking Uyghur. Consequently,
it is hard to consider Uzbek as one of the Turkic languages spoken in China.
The same applies to Tatar and Fuyu Kirghiz as well. Among the Fuyu Kirghiz, the
middle and older generations speak Mongolian (“Ölöt Mongolian”) while the young
generation speaks Chinese. The related details will be given below.
In China, more than one language belonging to the Turkic language group
is spoken. Notably, we also see isolated Turkic languages spoken in China.
In the 1970s, nine Turkic languages existed in China. Unlike the above-mentioned
classification, these languages can be listed by number of speakers as
follows: Uyghur, Kazakh, Kirghiz, Salïr, Uzbek, Yellow Uyghur, Tatar, Tuvan
and Fuyu Kirghiz. Although not up to date, the demographics of China can
be viewed in Chinese articles.2 The list shows the languages and the number
of people that speak them, in order of population size: 6. 维吾尔族 The Uyghurs
8,399,393; 17. 哈萨克族 The Kazakhs 1,250,458; 32. 柯尔克孜族 The Kirghiz 160,823;
36. 撒拉族 The Salïrs 104,503; 48. 裕固族 The Yellow Uyghurs (speakers of both
Mongolian and Turkic) 13,719; 49. 乌孜别克族 The Uzbeks 12,370; 53. 塔塔尔族 The
Tatars 4,890. The Fuyu Kirghiz and the Tuvans are not included in official
censuses, since they were not recognized as separate communities during the
minority identification work carried out in China.
1 From 通典 tong dian (http://ctext.org/tongdian/zhs?searchu= 代居金山).
2 http://zhidao.baidu.com/question/97856583.html?si=7. Bold numerals belong to the
website.
In this paper, Uyghur, Kazakh and Kirghiz, which are spoken in Xinjiang and
thus well known to Turcologists, as well as Uzbek, which has almost merged into
Uyghur, and Tatar, will not be dealt with in detail. Apart from that, another Turkic
community whose language has not yet been studied sufficiently is the Tarbagatay
Kirghiz living in Xinjiang. Their ethnology has been studied by M. Čertïkov3
and their language by Erkin Awgaly.4
The information I provide below is based on my previously published
studies, related sources, and personal studies in the field, as well as the TÜBİTAK
project numbered 108K413, which is connected with this study. My library,
which I am using for a detailed study to be included in a future book, has not
yet been fully explored, and thus some of the important sources regarding
the subject have not been used in this paper.
Both the book of the project and separate books about various languages
included in the project will be published soon. I would like to extend my
thanks to Batubayr from Ürümči Pedagogy University, Toolay from Bowuršin
& Kanas, Arslan from Sunen / Gansu, Yunus from Xining / Qinghai, TÜBİTAK
and Korea Research Institute for the help and support they provided me dur-
ing my studies in the field.
Now let me say just a few words on already extinct or almost extinct Turkic
languages from Xinjiang.
Uzbek
The Uzbeks living in China do not have their own autonomous prefecture.
Having been scattered to the north and south of Tian Shan 天山 (Uyg. Tengri
Tagh; literally “God’s Mountains”) in Xinjiang Uyghur Autonomous Region,
most Uzbeks live in the cities. The Uzbek population is centered in İli, Kashgar,
Yarkant, Urumchi, Aksu, Uchturpan, Chöchek and Karghilik. The population is
about 20,000 (Özbek Edebiyati Tarihi, 2005, p. 9). In China, a written language
or an alphabet peculiar to the Uzbeks does not exist. As mentioned above, the
Uzbeks use Uyghur as a written language. Adalaiti Abdulla’s study gives
more detailed information (Abdulla 2013).
3 M. A. Čertïkov, Tarbagataiskie Kırgızı.
4 According to a private communication on the topic with Erkin Awghaly after his field research.
Tatar
According to the 2006 census, the Tatar population in China is around 5400.
Most of the Tatars live in the Xinjiang Uyghur Autonomous Region, chiefly in
Urumchi, Gulja and Chöchek. Like the Uzbeks, the Tatars do not have their
own written language or alphabet. Ersin Teres’ study will give us new informa-
tion (Teres 2011).
Within the scope of my research, let us now analyze the four languages, three
of which are spoken outside of Xinjiang, and one in Xinjiang.
Fuyu Kirghiz
The Fuyu Kirghiz are a Turkic community whose language and society we
have the least information and material on. The earliest information we have
about the subject relies on the collected works and studies of Hu Zhenhua,
faculty member of Kirghiz/Kazakh Language and Literature Department
in Zhongyang Minzu Daxue (Minzu University of China; formerly known as
Central University for Nationalities). The information I will be providing here,
however, is mainly from the field work that the faculty members of Philology
Department in Seoul University, my colleagues from Korea Altay Research
Institute and I did in September, 2003. The Fuyu Kirghiz live in the villages of
Fuyu County, which is under the administration of Qiqihaer City in China’s
Heilongjiang (widely known as Manchuria) province. Named after the afflu-
ents of the Amur River (Heilongjiang “Black Dragon River”), Fuyu is a residen-
tial area of 30,000 people and 1 hour away from Qiqihaer. Although the Fuyu
Kirghiz are scattered around a few villages, they mainly live in two villages. The
research group I was a part of collected material in the villages of Wujiazi and
Qijiazi in September 2003 (my studies on the Fuyu Kirghiz people and the
Fuyu Kirghiz language were carried out within the framework of the “Korea
Research Institute” project numbered KRF-2003-072-AL2002).
Since Hu Zhenhua visited this region, the number of speakers of Fuyu
Kirghiz has decreased, and as of the 2000s no native speakers remain. Today, the
Fuyu Kirghiz use Ölöt Mongolian and Chinese in their daily lives. During the
compilation work we did among the Fuyu Kirghiz, whose number reaches up
to 1000, we could talk to only seven people. Two of them were over 50 and the
others were over 70. Our best source who was over 70 stated that she used their
native tongue when she was 18. The population of Wujiazi and Qijiazi villages
consists of different communities and the Fuyu Kirghiz make up nearly half of
it. Despite all our efforts, it was not possible to fill in half of the 3000
questionnaires. However, even the limited material compiled is enough to show that
this language is a branch of Khakas and Shor. After all of the compilations are
discussed and published, useful results will come up for South Siberian Turkic
languages in particular and for all Turkic languages in general. The first studies
related to the data we collected were published in 2007 and 2010/2011 (see Li,
Ölmez and Kim, 2007 and 2010/2011).
With respect to the importance of Fuyu Kirghiz, we can safely conclude
from our work based on limited language material that Fuyu Kirghiz is a
Turkic language that falls into the same category as the South Siberian Turkic
languages and, in essence, the modern Khakas written language. On the basis of
the data related to that language, we can draw several conclusions about the
phonetic properties and vocabulary of South Siberian Turkic languages and modern
Turkic languages. First of all, the general belief among the local people is that
they came to that region after the second half of the 18th century. In fact, after
the Manchus started to rule over China, they had several Mongol clans (mainly
Dagur Mongols) settle in the region. Among those clans coming from
Southern Siberia were the Fuyu Kirghiz, who today call themselves Kirghiz
and speak the same Turkic language as today’s Khakas.
When Fuyu Kirghiz is compared with Old Turkic and modern Turkic lan-
guages in the light of previous studies, a few characteristic features can be
listed as follows:
1. The Old Turkic and Common Turkic sound y- becomes ǰ-: OT yap- “to cover,
close” = ǰap-; OT yay “summer” = ǰay; OT yė- “to eat” = ǰe-; OT yürek “heart”
= ǰürüh;
2. The Old Turkic and Common Turkic sound d becomes z: uzï- “to sleep”, ɢizin-
“to get dressed”, güzi “groom”;
3. The final -g of a word becomes -h: OT tag “mountain” = tah, OT sarïg “yellow”
= sarïh, OT ulug “great, big” = uluh.
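The three correspondences listed above can be illustrated with a small Python sketch. It is only a mechanical approximation under stated assumptions: the function name apply_correspondences, the order in which the rules are applied, and the treatment of a trailing hyphen as a verb-stem marker are all hypothetical choices made for illustration, not part of the fieldwork or of any published analysis. Vowel changes and other developments are not modelled.

```python
# Illustrative sketch only: applies the three listed correspondences mechanically.
# The function name and the rule ordering are assumptions made for this example.

def apply_correspondences(old_turkic: str) -> str:
    """Return the shape that the three correspondences predict for an Old Turkic form."""
    # Set aside a trailing hyphen (used in the text to mark verb stems).
    stem = old_turkic.rstrip("-")
    marker = old_turkic[len(stem):]

    # Correspondence 1: word-initial y- becomes ǰ-.
    if stem.startswith("y"):
        stem = "ǰ" + stem[1:]

    # Correspondence 2: d becomes z.
    stem = stem.replace("d", "z")

    # Correspondence 3: word-final -g becomes -h.
    if stem.endswith("g"):
        stem = stem[:-1] + "h"

    return stem + marker


# Old Turkic forms quoted in the text.
for form in ["yap-", "yay", "tag", "sarïg", "ulug", "bod"]:
    print(form, ">", apply_correspondences(form))
```

Forms such as ǰürüh (from yürek) or buzïh (from bedük) involve further changes, so a sketch of this kind would not reproduce them.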
In order to make a comparison with modern Turkic languages, I can give a few
short sentences from our field research results:
sïn kaya tursïn “where do you live?”
sin mï tart! “(You) Smoke (a cigarette) too!”
men mïnda gïlgïn uşgun boltïr “it has been three days since I came here.”
aragï ïzïrtïr “(alcoholic) drinks make you drunk.”
min at mïlbat gïlçïk ~ mïn at mïl gïlçïk “I rode a horse and came here”
min bozïn aldïr “I am taking (it) on my own”
sïn miniŋ piçeŋ buzïh “you are a little older than me”
bïs sïnïŋ olïh “we are older than you”
What we see here as tart- (to smoke) corresponds to that of most Turkic
languages.
Likewise, aragï “drink” has the same root as Turkish rakï (a kind of alcoholic
beverage), now common in all Altaic languages as a borrowing from Arabic
through Persian (?).
ïzïrt- is related to Old and Middle Turkic esür-, esürt- “to become drunk”
(cf. Anatolian dialects: esirik, esirikli “fractious; little bit crazy”).
In the fifth sentence, gïlçïk is nothing other than Old Turkic kelyük “has
come” (for details see Ölmez, 2007).
boz is another form of Old Turkic bod “himself”.
buzïh is the Fuyu Kirghiz form of Old Turkic bedük “big”.
olïh is the word ulug in Old Turkic.
For these examples, see Ölmez, 2001; Ölmez 2006a, b and c.
The first work on the Fuyu Kirghiz and their social life was prepared by Gundula
Salk and Mambet Turdi. The last and detailed work belongs to Mixail Čertïkov
(Čertïkov, 2008a and 2008b).
Below I provide the words “moon”, “day” and some number names in order
to show the phonetic similarities and differences between Fuyu Kirghiz and
the other Turkic languages, including Turkish.
Salïr
The first data on the Salïrs come from works conducted in the West, which
outnumber the Turkish sources. At this point we need
to mention the Russian scholar Potanin’s compilations from the last decades
of the 19th century (and N. Poppe’s work based on them), the compilations and
publications of E. R. Tenišev, the text publications of Zsuzsa Kakuk, the works of
the anthropologist Kevin Stuart, the articles of R. Hahn and, recently, the works of
Arienne M. Dwyer.
We owe the primary information about the Salïrs to Kashgari, specifically
the part where the Oghuz clans are mentioned and the words Turkmen and
boy (clan) are explained. As we know, the Oghuz consist of twenty-four clans.
One of them is the Salïr. Sufficient information regarding the Salgurs/Salurs
can be found in the sources that give information about the Oghuz and the
Oghuz clans. For the historical source entries about the name and the clan
of Salur, Salur Atabegs and Salur settlements in Anatolia, see Sümer, 1980,
pp. 336-344; 447-448.
When the Salïrs are discussed in Turcological studies, their language and
geography are almost always treated together with those of the Yellow
Uyghurs. However, both the culture (religion, clothing, production styles) and
the geography of these peoples are different. The two languages and peoples have
been covered under the same entry in basic reference sources and handbooks
published since 1959. Yet the trip from Xining, the capital of Qinghai province,
in which Xunhua, the home of the Salïrs, is located, to the Sunan region where
the Yellow Uyghurs live takes approximately 10 hours with the vehicles found
in the region (for similar
information and opinions see Dwyer, 2007, p. 1, footnote 1). While the Salïrs
live together on the “Salïr” plain, surrounded by Huizu (Muslim Chinese)
and Tibetan villages, nearly two hours away from Xining, the Yellow
Uyghurs live on the tableland two hours away from the city of Zhangye
in Gansu province.
There is no precise information about their population, either. According
to the records, their population was 40,000 in the 1960s, 56,000 in 1978 and 69,000
in 1982 (Schwarz, 1984, p. 39). According to the 1990 census, they had a population
of nearly 65,000. The data for the year 2000 suggest a population
of about 100,000. A considerable portion of this population lives in Salïr
(Xunhua), in Salïr Autonomous County (循化撒拉族自治县 Xunhua Salazu
Zizhixian), the city of Xining, Huizu Autonomous County of Gansu and Ili prefecture
of Xinjiang (for Ili Salïrs see Bibliography, Yakup, 1998 and 2002).
The Muslim Salïr people are very similar to the Huizus and the Muslim Han
Chinese in terms of culture (religion, clothing, production styles). They
intermarry and share the same mosque (mišit, mišt). According
to my colleague Ma Wei (Yunus), the most important difference probably lies in the
wedding traditions and ceremonies.
The Salïr people are involved in commercial activities in the whole region
(Qinghai / Tibet Autonomous Region and the capital of Tibet) and in many
cities of the People’s Republic of China. Mainly engaged in restaurant manage-
ment, the Salïr people run restaurants known as “Lanzhou Restaurant” which
are in compliance with Islamic standards. Not all of these restaurant owners
are Salïr however; most of them are Muslim Chinese, the Huizus.
As to the origin of the Salïr, we cannot find detailed information about their
history in Chinese sources. In this connection, a one-folio text found
and published by Tenišev is important and interesting (Tenišev, 1977). According
to the story circulating among the Salïr themselves, they came to the region from
Samarkand at the end of the 14th century; they chose that place as their homeland
since the soil and water were similar to those of their motherland and the
accompanying camel, which was carrying a Qur’an on its neck, was found petrified
exactly at that point (Altiūli) after getting lost. That event is the subject of
the story named döye yül ~ döye yuli “camel spring” (Chin. 駱駝泉 luo tuo quan).
The altitude of the land where they live is about 1800 meters. In a region where the
Yellow River originates and reaches the inner parts of China, the Salïr usually
settle along the “river”. Since it is the sole and closest river, the Salïr call the
Yellow River morun ~ morïn (< Mo.) for short; they never use the word “yellow”.
They have very good relationships with the Tibetans, especially with those liv-
ing on the mountainous parts. The Tibetan villagers usually shop from Salïr
shops. They speak Salïr effectively among themselves and in their daily lives.
They are engaged in agriculture relying on irrigation on vast plains. Apart from
Qinghai, there are a few thousand Salïr living in the city of Gulja, in Xinjiang
Uyghur Autonomous Region (I will deal with this issue in another paper; the
details are in the book of Abdurishid Yakup, Yakup, 2002). Those living within
the borders of Gansu (according to the information Yunus provided) have
almost forgotten the Salïr language.
Today, the Salïr call Xunhua “Salïr” among themselves: Men salïrga
(~ Salïːga) vargur (~ vaːgur) “I will go to Xunhua”. They do not use the name
Xunhua.
With intensive efforts to learn their history and language since the 1990s, the
Salïr started to publish a periodical, which includes texts in Latin script and has been issued
twice a year since 2008.
Distinctive Features of the Salïr Language

When compared with other Muslim Turkic language groups in Central Asia,
Chaghatay and Kipchak groups, Salïr language distinguishes itself in terms of
both grammar and vocabulary. When we compare it with Turkmen and Turkish,
however, it partially shows an archaic characteristic. On the other hand, it has
lost many grammatical elements in comparison with Yellow Uyghur, which has
far fewer speakers (about 5000 people). For example, it is almost impossible to
see the first person affix for present when conjugating verbs. Palatal and labial
harmony may not be seen in case of vowels. When we consider the vocabu-
lary, we see that local Tibetan and Chinese have enormously affected Salïr and
the words borrowed from these languages have replaced the words of
Turkic origin. Today, Tibetan gača ~ geče is used even for the words “word,
utterance, language”:
Yaxçux gačanï hemme kiš yišar, yaxçux išnï hemme kiš etmes “anybody
(can) say a good word, not everybody does a good job / not everybody
can exhibit good behaviors.”
We separate the Salïr language from Kipchak and Chaghatay and approximate
it to the Oghuz family of languages since the Salïr use the verb et- ~ ėt-, and the
word el “hand” (in Kipchak and Chaghatay qol is used instead). Certainly, com-
mon use of the sound d- (dört “four” etc.) at the beginning of a word plays an
important role in this approximation. (Several important features along with
this will be discussed in detail in another paper; this paper is a report of a short
field work carried out in August, 2009).
The number system does not go beyond “forty”. Especially the men, the middle-aged
and the elderly do not know much about numbers. As to the women, I
have heard the numbers forty, fifty and sixty even from the young and middle-aged
ones. We can relate this to the fact that the men are in more contact with other
communities (the Chinese, the Tibetans) than the women are. According to Musa Haji,
at the age of 64:
Bir elli ma on dört “one fifty plus fourteen, sixty-four” (Old Turkic *bir elig
yme on tört)
Below are the words which I heard on the first day and noted in a hurry, in order
of subject and hearing. More information on the material I recorded and the
sources I collected will be included in another paper.
Atuh (~kȯp) išse kursagïm agrïr. “if I eat a lot, I will become ill; if I eat a lot,
I will have a stomachache”
Saŋa bala neče (~neǰe?) var? segis, oːl dört, ane dört; suŋzï on segis vara. “how
many kids do you have? Eight, son four, daughter four, grandchildren eighteen”.
sïh “right, right hand side; well, healthy”
dal “tree”
ağaš “wooden, timber; wood”
ağašli “forested; the forested village”
emih “bread”
hos “walnut”
ǰiǰek “flower”
ėt- “to do, to make”
kini “spouse, wife”

gadïn “woman”
ame “mother” (ana “mam” is also known despite being an old word)
ape “father” (ata “forefather” is also known despite being an old word)
kiš “person, human, man”
Salïr and Modern Uyghur can be compared in terms of similarities and differ-
ences with the following examples:
Salïr        Uyghur        Meaning
sarï         sėriq         “yellow”
sağal        saqal         “beard”
satïhǰï      sėtiqçi       “seller, merchant”
satïhlï iş   soda          “trade”
sen-         öçmek         “go out”
sender-      söndürmek     “extinguish”
sinih        söŋek         “bone”
sïh          oŋ            “healthy”
soğan        piyaz         “onion”
sor-         sori-/sora-   “to ask”
sorma        harak         “drink, alcoholic beverage”
suva-        ussi-         “to become thirsty”
süt          süt           “milk”
For brief information on the Salïr see Ölmez, 2012.
Yellow Uyghur
The population of the Yellow Uyghur, who live in the Gansu province of China,
in the Sunan Uyghur Autonomous Region close to the city of Zhangye and on the
surrounding tablelands, is over 10,000. Originally called Yellow Uyghur, this
community refers to itself in short as Yogur ~ Yugur (Chin. Yugu). Those
who speak a Turkic language are called Sarïg Yogur, while the others, who speak
Mongolian, are called Shira Yogur. Their Chinese names are Western Uyghur (Xibu Yugu
西部裕固) and Eastern Uyghur (Dongbu Yugu 东部裕固), and their languages are
Western Uyghur and Eastern Uyghur (Xibu Yuguyu and Dongbu Yuguyu 西部裕固语
and 东部裕固语), respectively.
The history and the language of the Yellow Uyghur in earlier periods
are known quite well compared to those of the other Turkic peoples in China. The
Yaglakar clan of the Uyghurs, who established the khaganate and ruled for
almost a hundred years (744-840) in Mongolia after the Uyghurs ended the
Turkic rule (Tujue 突厥), live in the Mongolian-speaking region (see Ölmez
2012). For a collective bibliography of works on the Yellow Uyghur,
see Ariz, 2002.
Living on the tablelands around Sunan, today the Yellow Uyghurs move to
tablelands at 2000-2500 metres. With the coming of spring, they go up
to tablelands at 3000-3500 metres, and with the coming of summer to tablelands
at 4000 metres (personal information received from Arslan).
We need to mention G. N. Potanin as the first Western scholar doing research
about the Yellow Uyghur. Having organized excursions to the Tangut region
between the years 1884-1886, Potanin visited Yugur villages and residences; he
collected data about the Mongolian-speaking Shira Yugurs and Turkic-speaking
Kara Yugurs. According to Potanin, the Kara Yogurs are divided into two fac-
tions as Yaglak and Hurungut. These factions also are divided into smaller
families (= otok). In the following years (1906-1908), C. G. Mannerheim arrived
in the region and compiled texts both from the Yellow Uyghur and the Shira
Yugurs. That was followed by Malov’s excursions (1909-1913) and detailed studies.
After Malov, E. Tenišev took part in the work of the People’s Republic
of China on recording the languages of its minorities; he published
texts, a dictionary and grammatical studies. Afterwards, Lei Xuanchun and
Chen Zongzhen published related works in China. Today, Martti Roos, Erkin
Ariz and Zhong Jingwen conduct studies in the field.
With regard to the language of the Yellow Uyghur, they generally have been
assumed to be the descendants of the Old Uyghurs presumably due to their
names and some secondary language properties. As we mentioned above, it
would be more accurate to classify them as the relatives of the Uyghurs who
migrated to the region from Mongolia rather than assuming them to be the
direct relatives of the Turfan Uyghurs. They are, naturally, connected with
the Turfan Uyghurs. We can compare that with the migration of some Buddhist
Uyghurs eastward to Dunhuang due to the expansion of Islam in the Turfan region
and with the Old Uyghur Altun Yaruk Sudur found by S. Ye. Malov.
We should note that Yellow Uyghur shows similarities with the Khakas language
in terms of some phonetic developments: OT -d-, -d becomes -z-, -z (OT adak “foot”
> YUyg. azak, OT adgïr “stallion” > YUyg. azgïr, OT ïd- “to send” > YUyg. ïz-).
However, it differs from Khakas in some respects. For example, the OT consonant y-
regularly becomes č- in Khakas, while it is sometimes retained in Yellow Uyghur:
OT yïl “year”, Khak. čïl, YUyg. yïl; OT yïltïz “root”, YUyg. yiltïs; OT yigit “young;
strong”, Khak. čit, YUyg. yïgït, yigit. The OT consonant b- regularly becomes p- in
Khakas, while it appears as both b- and p- in Yellow Uyghur (see Ölmez,
1996 and 1998).
To put it precisely, in Yellow Uyghur the OT consonant -g seen at the end of
the polysyllabic words is retained as -k/-g but the consonant d becomes z as in
the above given examples.
Another old feature of Yellow Uyghur is seen in the number system, which shows
similarities with Old Turkic: yidigirma “17” < yėti yėgirmi; sagïs yigirma “18”
< sekiz yėgirmi; per otut “21” < bir otuz (see Clark, 1996; Geng & Clark, 1992-93).
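The Old Turkic forms cited above name the unit together with the next higher decade (“seven twenty” = 17, “one thirty” = 21). A minimal Python sketch of this pattern follows. The function name `upper_count` is hypothetical, the word lists contain only the numerals quoted in this chapter, and the restriction to 1-29 is an assumption of the sketch, not a claim about the language.

```python
# Old Turkic numerals quoted in this chapter (units and decades only).
UNITS = {1: "bir", 4: "tört", 6: "altï", 7: "yėti", 8: "sekiz", 9: "tokuz"}
DECADES = {10: "on", 20: "yėgirmi", 30: "otuz"}


def upper_count(n: int) -> str:
    """Name n by its unit plus the NEXT decade, e.g. 17 = 'seven twenty'."""
    if not 1 <= n <= 29:
        raise ValueError("this sketch only covers 1-29")
    unit, decade = n % 10, n - n % 10
    if unit == 0:
        return DECADES[decade]           # round decades are named directly
    if unit not in UNITS:
        raise ValueError(f"unit {unit} is not in the illustrative word list")
    if decade == 0:
        return UNITS[unit]               # plain units below ten
    return f"{UNITS[unit]} {DECADES[decade + 10]}"


for n in (17, 18, 21):
    print(n, "=", upper_count(n))
```

The three numbers in the loop yield the Old Turkic source forms given above for Yellow Uyghur 17, 18 and 21.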
In some words, glottalization occurs before the unvoiced consonants k and t:
YUyg. ahldï “six” < OT altï, YUyg. tohɢïs “nine” < OT tokuz.
Tuvan
Unlike the above-mentioned Fuyu Kirghiz, Salïr and Yellow Uyghur, Tuvan is
not a Turkic language exclusively spoken in China. Most Tuvans live in the
Tuva Republic within Russia (in fact an autonomous region), in several cities
of the Russian Federation and in some regions of Mongolia. The Tuvan language
spoken in China does not differ much from the Tuvan spoken in Tuva; therefore, I will
not dwell on the language of the Tuvan people of China in this short paper. I
will just touch upon the differences between Chinese Tuvan and the Tuvan
spoken in Tuva. For Tuvan, see Ölmez, 2007, p. 25 and following. For the Tuvans
living in Mongolia see Erika Taube’s works. The Tuvans of China became better
known through the works of Geng Shimin, Talant Mawkanuli and Song Zengchun.
The Tuvans living in Mongolia, their folklore, population and traditions have
long been known thanks to the studies of Erika Taube. However, the villages
and the towns and the population of the Tuvans in China are not documented
as much when compared to Mongolia. Therefore, Marina V. Monguš went to
the region, conducted field work and gave information about the Tuvan vil-
lages and towns with pictures. According to that information, the Tuvan living
in China belong to the Ak Soyan (White Soyan) and Kara Soyan (Black Soyan)
clans (2002, p. 21). Marina Monguš touches on the current situation of the
Tuvan language and adds an interview at the end of her book. The interview
made with Daš Čömblöv is given below. I would like to add that I interviewed
the same source in Altay in 2004 (2002, p. 105):
– Башкы, силер каш дыл билир силер?
– Моолдуң дылын билир мен.
Казактың дылын билир мен. Мончак дыл билир мен. Кыдатча
бичии-бичии билир мен.
– Teacher, how many languages do you speak?

– I speak the language of the Mongols. I speak the language of the
Kazakhs. I speak Tuvan. I speak very little Chinese.
(Monguš, 2002)
As can be understood from the answer, Daš Čömblöv speaks three languages very
well but speaks the fourth language poorly. The interview is from the year 1993.
Yet, on the basis of my personal observation, I can safely state that the situation of
the language in the Altay region of China has changed quite a lot since then;
the Tuvans have improved their Chinese, and Tuvan state officials in particular speak
Chinese very well. Those who receive education in Urumchi also speak Uyghur
in addition to these four languages. To conclude, a well-educated “Monchak”
speaks five languages including their mother tongue.
This interview was conducted in 1993 with Daš Čömblöv, who was born in 1962.
The interview was written in the present-day Tuvan alphabet, but the spelling
reflects the pronunciation of the Tuvans living in China. For example, in
Tuva хөвей is used with x-; however, only көвей is used in the interview
with Daš Čömblöv. Also, the word газыр, which is derived from Kazakh qazïr, is
used to mean “now, at the moment, today” (Kazakh < Arabic). Another noticeable
usage is the use of ивяааш instead of Tuvan эвээш.
In the later parts of the interview, Daš Čömblöv Oronbayv briefly tells how
his grandfather migrated from Russia to Kanas and met other Tuvans there;
according to him, that migration occurred in 1913. Two women whom I talked
to in Kanas told me a similar story. They, however, moved from Mongolia.
As mentioned before, the Tuvan population is concentrated in three villages:
Kanas (Hanas), Akkaba and Kom. Kom is essentially a Tuvan village. Despite
being mainly populated by Tuvans, Kanas is also home to Kazakhs and
Mongolians. Half of Akkaba’s population is Kazakh and the other half is Tuvan.
The Kazakhs call the Tuvans “Kök Monchak”. The Tuvans, however, call
themselves diva or monchak. For population etc. see Mawkanuli, 1999, pp. 1-36;
Ölmez, 2007, pp. 25-29; Yolboldi and Kasi̇, 1987, pp. 287-289. Among the places
where the Tuvans live are the cities of Bowurǰin and Altay, and the towns of
Köktogay and Lamajao. Since it is difficult to distinguish the Tuvans from the
Mongols with regard to lifestyle and beliefs, the Tuvans were counted as Mongols
and were not included as a separate group among the 56 officially recognized
ethnic groups in China’s censuses.
Chinese Tuvan was first made known by Geng Shimin. The Russian Mongolist
B. H. Todayeva met people speaking a language that resembled Mongolian but was
unfamiliar to her while she was identifying and recording Mongol languages in the
Altay region. Being of the opinion that these people were speaking a Turkic language,
she informed Geng Shimin, and thus the research of Geng Shimin began
(this is what I was told by Geng Shimin).
Unlike in standard Tuvan, words in Chinese Tuvan do not start with an h- sound but always with a k- sound:
kep, kerek, köl, kün, küreš (Tuvan hep “shape, form”, herek “necessary”, höl “lake”,
hün “day, sun”, hüreš “wrestling”) etc.
In the Tuvan standard written language č- is systematically used for OT
y-, but there are some words pronounced with a ǰ- sound. In Chinese Tuvan,
however, only the ǰ- sound occurs: ǰan-, ǰït-, ǰi- (Tuvan čan- “to turn”, čït- “to sleep”,
či- “to eat”).
The Tuvan language spoken in the Chinese Altay includes many Kazakh
(< Arabic) words such as mekeme, mekdep etc., while the Tuvan spoken in Russia does not.
We can compare some words used in both regions:
Chinese Tuvan abïš, Tuva adïš “palm”

Chinese Tuvan töödö, Tuva tödü “whole, all”
Chinese Tuvan din, Tuva šažïn “religion”
Chinese Tuvan ayïïr, Tuva ača “fork”
Bibliography
Abdulla, Adalaiti. 2012. Field Research on the Uzbek Language in China. Dilleri ve
Kültürleri Yok Olma Tehlikesine Maruz Türk Toplulukları, 4. Uluslararası Türkiyat
Araştırmaları Sempozyumu Bildirileri, ed. M. Erdal, Y. Koç, M. Cengiz, Ankara,
pp. 25-34.
Ariz, E. 2002. Sarı Uygur Araştırmalarının Toplu Bir Kaynakçası. KÖK Araştırmalar: Kök
Sosyal ve Stratejik Araştırmalar Dergisi: Vol. IV, S 1, pp. 117-148.
Chen, Z. 2004. Xibu Yuguyu yanjiu: Beijing.
Clark, L. 1996. Sarig Yugur Historical Linguistics and Early Turkic Counting system.
Turfan, Khotan und Dunhuang. Vorträge der Tagung „Annemarie von Gabain und
die Turfanforschung“, veranstaltet von der Berlin-Brandenburgische Akademie der
Wissenschaften in Berlin (9.-12.12.1994): (R. E. Emmerick, W. Sundermann, I. Warnke
u. P. Zieme, Ed.), pp. 17-49.
Çağatay, S. 1962. Sarı Uygurların Dili. Türk Dili Araştırmaları Yıllığı Belleten 1961:
pp. 37-42.
Čertïkov, M. A. 2008a. “Fuyuyskie Kırgızı. İstoriko-etnografiçeskoe İssledovanie” [I].
Sibirische Studien: Vol. 3, S 1, pp. 45-126.
Čertïkov, M. A. 2008b. “Fuyuyskie Kırgızı. İstoriko-etnografiçeskoe İssledovanie” [II].
Sibirische Studien: Vol. 3, S 2, pp. 141-256 (and following articles).
Čertïkov, M. A. Tarbagataiskie Kırgızı. http://www.kyrgyz.ru/?page=297.
Dwyer, A. M. 2007. Salar, A Study in Inner Asian Language Contact Processes, Part I:
Phonology: Wiesbaden.
Geng, S.; Clark, L. 1992-93. Saryg Yugur Materials. Acta Orientalia Academiae Scientiarum
Hungaricae: Vol. XLVI, pp. 189-224.
Geng, Shimin. 2001. Materials of Tuvanian Language in China (4). Türk Dilleri
Araştırmaları: Vol. 11, pp. 5-21.
Hahn, R. F. 1998. Yellow Uyghur and Salar. The Turkic Languages: (Ed. L. Johanson and
É. Á. Csató), Routledge, London and New York, pp. 397-402.
Hermanns, P. M. 1942-44. Uiguren und ihre neuentdeckten Nachkommen. Revue
Internationale d’Ethnologie et de Linguistique / Internationale Zeitschrift für Völker-
und Sprachkunde: Anthropos, XXXV-XXXVI, 1940-1941, pp. 78-99.
Hu, Zhenhua H. 1983. Heilongjiang Fuyuxiande Ke’erkezi-zu ji Qiyuyantedian.
Zhongyang minzuxue yuanxue bao: pp. 65-69.
Hu, Zhenhua H. 1991. Heilongjiangsheng Fuyuxiande Ke’erkezizu ji Qiyuyantedian.
Zhongguo Tujueyu Yanjiulunwenji: pp. 253-263.
Hu, Zhenhua H.; Imart, G. 1987. Fu-Yü Gırgıs: A tentative description of the easternmost
Turkic Language: Bloomington, Indiana.
Kakuk, S. 1961. Textes salar. AOH XIV: Vol. 1, S 2, pp. 95-117.
Lei, X.; Chen Z. 1992. Xibu Yugu Hanyucidian: Chengdu.
Li, Y.; Ölmez, M.; Kim, J. 2007. Some New Identified Words in “Fuyu Kirghiz (Part 1)”.
Ural-Altaische Jahrbücher: Vol. 21, pp. 141-169. Neue Folge.
Li, Y.; Ölmez, M.; Kim, J. 2010/2011. Some New Identified Words in “Fuyu Kirghiz
(Part 2)”. Ural- Altaische Jahrbücher: Vol. 24, pp. 165-188. Neue Folge.
Malov, S. Ye. 1957. Yazık jyoltıh Uygurov: Slovar’ i grammatika: Alma-Ata.
Malov, S. Ye. 1967. Yazık jyoltıh Uygurov: Teksti i perevodı: Moskva.
Malov, S. Ye. 2005. Sarı Uygurlarda Şamanlığın Kalıntıları. (Çev. M. Duranlı). Türk
Dünyası İncelemeleri Dergisi. Vol. 5, Nr. 2, pp. 391-399.
Mannerheim, C. G. E. 1911. Sarö and Shera Yögurs. Journal de la Société finno-ougrienne:
Vol. 27, Nr. 2, pp. 1-72 + 5˚ + 1 map.
Mawkanuli, T. 1999. The Phonology and Morphology of Jungar Tuva: Unpublished
Doctoral Dissertation. Indiana University.
Mawkanuli, T. 2001. The Jungar Tuvas: language and national identity in the PRC. Central
Asian Survey: Vol. 20, S 4, pp. 497-517.
Mawkanuli, T. 2005. Jungar Tuwan Texts: Indiana University.
Monguş, Marina V. 2002. Tuvintsı mongolii i kitaya. Etnodispersnıye gruppı (İstoriya i
sovremennost’): Novosibirsk.
Ölmez, M. 1996. Sarı Uygurlar Sarı Uygurca. Çağdaş Türk Dili, Nr. 98, pp. 31-37.
Ölmez, M. 1996. Tuvalar ve Tuvaca. Çağdaş Türk Dili, Nr. 95, pp. 10-17.
Ölmez, M. 1998. Potanin’s Yellow Uigur Material and its Importance Today. Studia
Turcologica Cracoviensia 5: (= Languages and Culture of the Turkic Peoples,
M. Stachowski, Ed.), pp. 149-182.
Ölmez, M. 2001. Fuyu Kırgızcası ve Akrabaları. Türk Dilleri Araştırmaları: Vol. 11,
pp. 137-152.
Ölmez, M. 2006a. Fuyu Kırgızcasında ‘değil’. Ege Üniversitesi, Türk Dünyası Araştırmaları
Enstitüsü, I. Uluslararası, Türk Dünyası Kültür Kurultayı 9-15 Nisan 2006, Çeşme-:
İzmir.
Ölmez, M. 2006b. Fuyu Kırgızcasında Geçmiş Zaman Biçimleri. Sibirische Studien:
Vol. 1, S 1, pp. 117-124.
Ölmez, M. 2006c. Fuyu Kırgızcası Hakkında Yeni Bilgiler ve Türkolojiye Katkıları.
Sibirische Studien: Vol. 2, S 1, pp. 65-70.
Ölmez, M. 2007. Tuwinischer Wortschatz (mit alttürkischen und mongolischen
Parallelen), Tuvacanın Sözvarlığı (Eski Türkçe ve Moğolca Denkleriyle): Wiesbaden.
Ölmez, M. 2012. Oğuzların En Doğu’daki Kolları: Salırlar ve Dilleri. Türk Dili: Vol. CIII, Nr.
732, pp. 38-43.
Ölmez, M. 2013. Çin’deki Türk Dilleri ve Bugünü. Dilleri ve Kültürleri Yok Olma
Tehlikesine Maruz Türk Toplulukları, 4. Uluslararası Türkiyat Araştırmaları Sempo-
zyumu Bildirileri, ed. M. Erdal, Y. Koç, M. Cengiz, Ankara, pp. 397-422.
Ölmez, M. 2015. Moğolistan’daki Eski Türk Yazıtları (3rd ed.): Ankara.
Özbek Edebiyati Tarihi: 2005. Ürümči, Šincaŋ Helk Nešriyati.
Poppe, N. 1953. Remarks on the Salar language. Harvard Journal of Asiatic Studies:
Vol. 16, N 3-4, pp. 438-477.
Potanin, G. N. 1950. Tangutsko-Tibetskaya Okraina Kitaya Tsentralnaya Mongoliya:
Sankt-Peterburg, 1893, Sobranie Slov Salarskago Nareçiya: The new edition, Moskva
(1950), pp. 426-434.
Qualin, M.; Wanxiang, M.; Zhicheng, M. 1993. Salar Language Materials: (K. Stuart
Ed.). Sino-Platonic Papers Nu: 43, Department of Asian and Middle Eastern Studies
University of Pennsylvania, Pennsylvania.
Saguchi, T. 1986. Historical Development of the Sarïgh Uyghurs. Memoirs of the
Research Department of the Toyo Bunko: Vol. 44, pp. 1-26.
Salk, Gundula; Mambet, Turdı. 1998. The “Fu-Yu Gırgıs” according to the present-day
situation and the legendary past: Kraków.
Schönig, C. 1997. A New attempt to classify the Turkic languages. Turkic Languages: Vol.
1-2, pp. 117-133; pp. 262-277.
Schönig, C. 1998. A New attempt to classify the Turkic languages. Turkic Languages: 3,
pp. 130-151.
Schönig, C. 1998. Bemerkungen zum Fuyu-Kirgisischen. Bahşı Ögdisi. Festschrift für
Klaus Röhrborn/Klaus Röhrborn Armağanı: (ed. von J. P. Laut, M. Ölmez). Freiburg/
İstanbul, pp. 317-340.
Schönig, C. 1999. The Internal Division of Modern Turkic and its Historical Implications.
Acta Orientalia Academiae Scientiarum Hungaricae: Vol. 52, p. 85.
Song, Z. 1981. Tuwayu yanjiu: Beijing.
Taube, E. 1993. Zur aktuellen Situation der Tuwiner im westmongolischen Altai. http://
userpage.fu-berlin.de/~corff/im/Landeskunde/Taube_Tuwa [December 1993].
Tenišev, E. R. 1963. Salarskiy Yazık. Moskva.
Tenišev, E. R. 1976. Stroy salarskogo yazıka. Moskva.
Tenišev, E. R. 1976. Stroy Sarıg-yugurskogo yazıka. Moskva.
Tenišev, E. R. 1997. “Salarskiy Yazık”, Yazıki Mira: Tyurkskie yazıki, Moskva, pp. 335-345.
Tenišev, E. R.; Todayeva, B. H. 1966. Yazık jyoltıh Uygurov: Moskva.
Teres, E. 2011. Çin Tatarcası Grameri: Ankara.
Thomsen, K. 1959. Die Sprache der Gelben Uiguren und das Salarische. (J. Deny, et al.
Ed.) Philologiae Turcicae Fundamenta 1, Wiesbaden, pp. 565-568.
Thomsen, K. 1989. Sarı Uygurların Dili ve Salarca. Türk Dili Araştırmaları Yıllığı Belleten
1985: (Tr. İ. Çeneli), pp. 191-197.
Wei, M.; Jianzhong, M.; Stuart, K. 2001. The Folklore of China Islamic Salar Nationality:
Vol. 15, The Edwin Mellen Press, Chinese Studies, Lewiston.
Wu, H. 1999. Tuwayu yanjiu: Shanghai.
Yakup, A. 1998. A Sample of Oral literature of Xinjiang Salars. Türk Dilleri Araştırmaları:
Vol. 8, pp. 49-72.
Yakup, A. 2002. An Ili Salar Vocabulary. Introduction and Provisional Salar-English
Lexicon: Contribution to the Studies of Eurasian Languages Series 5. Department of
Linguistics, University of Tokyo.
Zhong, J. 2007. Xibu Yuguyu miaoxie yanjiu: Beijing.
General Reference Resources for the Turkic Languages and People in
China and the Bibliography
Hoppe, T. 1996. Die ethnischen Gruppen Xinjiangs: Kulturunterschiede und intereth-
nische Beziehungen: Hamburg 1995, (Unveränderter Nachdruck 1996).
Schwarz, H. G. 1984. The Minorities of Northern China. A Survey: Western Washington.
Tekin, T.; Ölmez, M. 2003. Türk Dilleri / Giriş: İstanbul.
Wei, Cui-yi. 1985. The Turkic-Speaking Minorities in China. Materialia Turcica:
Band 11, pp. 61-84.
Yolboldi, N.; Ḳasi̇, M. 1987. ǰuŋgudiki Türkiy Tillar: Ürümči, pp. 287-314. Šinǰaŋ Dašu
Nešriyati.
Zhongguo tujueyuzu yuyan cihuiji [English subtitle: Comparative Dictionary of Turkic
Languages in China]: Beijing 1990.
chapter 12
The Death of a Language: The Case of Ubykh

A. Sumru Özsoy
1 Introduction
Can a language itself play a role in bringing about its own extinction? Can the
complexity of the linguistic structure of a language have a negative effect on
its transmission to the next generation? With respect to the conditions that
lead to language endangerment and extinction, the issues have generally been
approached in terms of extra-linguistic factors such as forced language shift as a
consequence of military, economic, religious, cultural, or educational interven-
tion or voluntary language shift due to the speakers’ negative attitude towards
their own language (UNESCO, Nettle and Romaine 2014, Farfán and Ramallo
2010, Tsunoda 2006). This paper considers the case of Ubykh, a Northwestern
Caucasian language which became extinct with the quiet passing away of
its last speaker, the 88-year-old Tevfik Esenç, in his home in the Hacı Osman
village of the Manyas region in Balıkesir, Turkey, on Oct. 8, 1992. The paper
basically claims that, although some of the external factors of extinction did
prevail for Ubykh, among the factors that led to its own extinction was in fact
also the language itself. In contrast to the relatively longer survival of its related
languages which were exposed to the same external factors, the fact that it was
Ubykh which became extinct seems to suggest that this might indeed be the
case, i.e. that the complexity of the linguistic system, specifically of the phono-
logical system, of the language contributed negatively to the transmission of
the language to the younger generations and therefore constitutes one of the
factors to be considered in the endangerment of the language.
It is a fact that the Ubykh children acquiring language were exposed to
linguistic systems with much simpler phonological systems than Ubykh’s. It
is therefore not inconceivable that this had an adverse effect on the acquisition
of the language to the degree that the inter-generational transfer of the
language took place at a diminishing pace, resulting in the extinction of the
language, while the other Caucasian languages with simpler systems, though
exposed to the same external factors of endangerment, did not share the same
fate. Despite the fact that they lost their linguistic heritage, the majority of
the Ubykh speakers retained their Caucasian and Ubykh identity as well as

the non-linguistic aspects of cultural heritage. The following quotation from
Smeets (2013) expresses this process: “Many Ubykh gradually assimilated to
things Circassian and all of them eventually developed into Turkish citizens
cherishing their Ubykh (-Circassian) descent.”
The quiet passing away of the 88-year-old Tevfik Esenç in 1992 marked not
only the end of the life span of an individual, but also of a linguistic heri-
tage. The last breath of this one man erased the name of the Ubykh language
from the list of endangered languages and rewrote it on the list of extinct
languages. Despite the fact that the conditions generally associated with lan-
guage endangerment and extinction unquestionably prevailed for Ubykh,
what is crucial is that they did so too, with the exception of one, for the other
Caucasian languages. While Ubykh became extinct, the other Caucasian
languages which were exposed to the same historical, socio-political events
and sociolinguistic conditions during the last hundred and fifty years were
able to survive relatively longer and quite robustly, many of them still being spoken
by the diaspora, at least by the older members. The recent revival among
the members of the younger generation of these languages is an indication
of their awareness of their linguistic heritage and will hopefully reverse lan-
guage attrition, resulting in an increase in the number of more competent
speakers of these languages. Ubykh, unfortunately, does not share hopes of
the same process. Its complex phonological system renders Ubykh
highly unlikely to be revived. This paper aims to provide arguments to
this effect.
The organization of the paper is as follows. Section 2 gives a brief survey
of the external factors associated with language endangerment: the migration
of the Ubykh speakers, as well as of the speakers of many other Caucasian
languages, out of the Caucasus and their settlement in Anatolia. Section 3
discusses some approaches to language endangerment and extinction. The
phonological system of Ubykh is discussed in Section 4, where it is argued that
its complexity is an additional factor that led to the extinction of the language.
Section 5 concludes with a discussion.
2 Historical Background
The homeland of Ubykh was the northwest region of the Caucasus. Until the
1860s, the Ubykh lived along the eastern shores of the Black Sea, in the area of
present-day Sochi, neighboring the Adyghe-speaking Shapzug and Abaza.
With a population of approximately 40,000-50,000 (Russian sources cited in
Landmann 1981), the Ubykh lived in this ‘ethnically (and linguistically)
heterogeneous’ region until 1864 (Bodenstedt 1848 in Landmann 1981).
The Death of a Language: The Case of Ubykh 153
Map 1 Northwest Caucasian languages. Present-day distribution of West Caucasian
languages in the Caucasus. Adapted from Paris (1974a:6).
The middle of the 19th century was a period of disruption for the Caucasus.
The military invasion of the area by the Russian forces and the drastic changes
in the sociopolitical structure of the region that followed in the mid-1860s also
brought about a dramatic change in the demography of the Caucasus. More
than 400,000 Caucasians were forced to leave their homeland and emigrate
to the neighboring Ottoman Empire.1 Of the groups that immigrated to the
Ottoman state, the Ubykh were unique in that they were the only Caucasian
group whose entire population left the region. At the three ports of entry into
the Ottoman Empire – İstanbul and the two ports on the Black Sea, Samsun and
Trabzon – the immigrants were assigned locations for settlement across Anatolia.
The groups speaking different northwestern Caucasian languages were all
treated as Circassians.
Landmann (1981) observes that there were approximately 600 Northwest
Caucasian villages in Turkey in the mid-20th century. These villages were
dispersed over the large expanse of the Anatolian terrain. As a consequence of
the settlement policies of the authorities, the Caucasian groups were split,
settling in locations distant from each other. Landmann identifies 15 of the
Caucasian settlements as Ubykh villages and indicates their geographical
distribution as follows:
1 The historical details of the plight of the Caucasians during the forced migration can be
found in the extensive research listed in the reference list at the end of the paper.
154 Özsoy
Kocaeli-Sapanca    8
Balıkesir          3-4
Samsun             2
Kahraman Maraş     2
(Landmann, 1981)
The geographical distribution of the Ubykh villages above is as follows: 12 in
the Marmara (Kocaeli-Sapanca, Balıkesir), 2 in the Black Sea (Samsun), and
2 in the South-east (Kahraman Maraş) regions of Turkey. As the distribution
indicates, the great majority of the Ubykh were settled in the Marmara
region, although even the two major districts in this region are relatively far
apart from each other. The consequence of the settlement plan was that these
villages were dispersed over different regions of Anatolia, making contact
among the Ubykh groups almost impossible.
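As a rough cross-check of the regional tally above, the district counts can be summed in a few lines of Python. This is only an illustrative sketch: the names `villages`, `regions` and `tally` are assumptions introduced here, the region assignment follows the paragraph above, and the Balıkesir figure of 3-4 is carried through as a low/high pair.

```python
# Village counts per district as listed by Landmann (1981).
# Balıkesir is given as a range (3-4), so every count is a (low, high) pair.
villages = {
    "Kocaeli-Sapanca": (8, 8),
    "Balıkesir": (3, 4),
    "Samsun": (2, 2),
    "Kahraman Maraş": (2, 2),
}

# Grouping of districts into regions, following the text (illustrative names).
regions = {
    "Marmara": ["Kocaeli-Sapanca", "Balıkesir"],
    "Black Sea": ["Samsun"],
    "South-east": ["Kahraman Maraş"],
}


def tally(districts):
    """Return the (low, high) range of village counts for the given districts."""
    low = sum(villages[d][0] for d in districts)
    high = sum(villages[d][1] for d in districts)
    return low, high


by_region = {name: tally(districts) for name, districts in regions.items()}
total = tally(list(villages))

for name, (low, high) in by_region.items():
    label = str(low) if low == high else f"{low}-{high}"
    print(f"{name}: {label}")
print(f"Total: {total[0]}-{total[1]}")
```

Under these assumptions the Marmara region comes to 11-12 villages and the overall total to 15-16, so the figure of 12 for Marmara corresponds to the upper bound for Balıkesir, whereas the overall count of 15 corresponds to the lower bound.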
Andrews (1989) gives a slightly longer list of the distribution of the Ubykh
villages in Anatolia than Landmann’s. Andrews lists the names of 31 villages
which seem to be predominantly Ubykh and 21 villages in which the Ubykh do
not constitute the majority of the population. Significantly, Andrews names
two villages not mentioned by Landmann. These are in the Adana area which
is located in the southern part of Turkey. The following map reflects the distri-
bution of the locations of Circassian settlements in Anatolia.
Map 2 The distribution of the villages of Circassian-origin population in Anatolia.

The following statistics given by Landmann (1981) on the demographic
distribution of the various linguistic and ethnic groups in the two Ubykh
villages of Büyük Çamurlu and Akifiye, in the province of Kahraman Maraş,
indicate that ‘Ubykh village’ is a relative term: it means that the group with
the largest share of the village population is of Ubykh origin, not that the
village is inhabited only by Ubykh speakers.
Büyük Çamurlu (108 households)    Akifiye (80 households)
83 Ubykh (77%)                    41 Ubykh (52%)
9 Shapzug (8%)                    14 Shapzug (17%)
4 Abaza (4%)                      13 Abaza (16%)
1 Kabardian (1%)                  1 Kabardian (1%)
1 Chechen (1%)                    10 Turkish (13%)
4 Pomak (4%)                      1 Kurdish (1%)
5 Turkish (4%)
1 Kurdish (1%)
Landmann (1981) remarks that in 1965, 1971 and 1975, the speakers in the two
villages could speak only a few words of Ubykh. The Ubykh considered them-
selves as Cherkess. Landmann further remarks that the same also holds for the
Lezghi and the Chechen of the Kahraman Maraş province.2
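The household figures for the two villages can be re-tallied as follows. This is a minimal sketch, not part of Landmann's study: the dictionary and function names are assumptions made for illustration, and the recomputed shares may differ by about one percentage point from the printed percentages because of rounding.

```python
# Household counts by group, as reported by Landmann (1981).
buyuk_camurlu = {
    "Ubykh": 83, "Shapzug": 9, "Abaza": 4, "Kabardian": 1,
    "Chechen": 1, "Pomak": 4, "Turkish": 5, "Kurdish": 1,
}
akifiye = {
    "Ubykh": 41, "Shapzug": 14, "Abaza": 13, "Kabardian": 1,
    "Turkish": 10, "Kurdish": 1,
}


def shares(households):
    """Map each group to its percentage of all households (one decimal place)."""
    total = sum(households.values())
    return {group: round(100 * n / total, 1) for group, n in households.items()}


for name, data in (("Büyük Çamurlu", buyuk_camurlu), ("Akifiye", akifiye)):
    print(f"{name}: {sum(data.values())} households")
    for group, pct in shares(data).items():
        print(f"  {group}: {pct}%")
```

The counts add up to the stated 108 and 80 households, and in both villages the Ubykh are the largest single group, which is the sense of ‘Ubykh village’ used here.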
3 Ubykh and Language Endangerment
Research on language endangerment and language extinction generally
approaches the issues in terms of the nature of the extra-linguistic factors that
have negative effects on the linguistic heritage of a community of speakers.
Nettle and Romaine (2000: 250) describe language extinction as “. . . . part of
the larger picture of worldwide near total ecosystem collapse”. Brenzinger and
de Graaf (2006) distinguish between external and internal conditions of language
endangerment and extinction. They define external forces as “military,
economic, religious, cultural, or educational subjugation” and internal forces as
“a community’s negative attitude towards its own language, a general decline
of group identity” (UNESCO, EOLSS 6.20B.10.3). Brenzinger and de Graaf further
maintain that internal pressures derive from external factors and that “Together,
they halt the intergenerational transmission of linguistic and cultural traditions.”
2 As Landmann (1981) comments, the Ubykh of the two villages in the Kahraman Maraş
province were not researched by Dumezil or Dirr or any other linguist. The same holds for
the Ubykh villages in the Adana area. To the best of my knowledge, that situation has not
changed since then.
For Nettle and Romaine (2012), the three major causes of language endangerment
and extinction are population loss, forced language shift and voluntary shift.
They propose that the vitality of a language can be negatively affected by top-down
policies whereby it is withdrawn from official institutions and public domains –
e.g. courts, commerce, politics – and restricted to home use, and/or by
bottom-up practices whereby the language “retreats from everyday use and
survives primarily in ceremonial or more formal use, e.g. school”. Tsunoda (2006)
distinguishes between different types of language death, classifying them in
terms of cause, speed, and the combination of the two, as well as the role of
register in language attrition.
That the conditions generally held to underlie language endangerment
and extinction prevailed for the speakers of the Ubykh language is evident
from the brief account of the dramatic history of the Ubykh people
given in Section 2. Crucially, however, expulsion from the homeland and
forced migration to the Ottoman Empire were not the fate of the Ubykh
people alone. The speakers of the other northwestern Caucasian languages
also suffered through the same dramatic events as the Ubykh people
and were forced to migrate out of their homeland and settle in a land and
terrain strange to them, surrounded by an unrelated dominant language. The
villages in which the Caucasian immigrants settled were dispersed throughout
various regions across Turkey, in most cases distant from each other. Their
languages were surrounded by other languages, Turkish being the dominant
language in many regions. It is evident that these immigrants were exposed,
with one exception, to the same external conditions as the Ubykh speakers,
conditions which are generally taken to be the cause of language endangerment
and extinction.
The exception, an external force which Nettle and Romaine (2012) maintain to be
a crucial factor in language extinction, was loss in population. Population loss
affected the Ubykh people much more drastically than the other Caucasian
groups since the Ubykh had already started to decrease in numbers long before
they were forced to leave their homeland in the Caucasus in the 1860s. Smeets
(2013) summarizes the plight of the Ubykh as follows:
. . . . . . the Ubykh, who have been hit worst of all peoples involved in the
Caucasian War. Their numbers, which were initially already lower than
those of their immediate relatives, were first reduced dramatically by the
losses inflicted during the defense of their homeland, then during the
exode and its aftermath, and again during the turbulent period in Turkey
following the First World War. The survivors lived in a few pockets of vil-
lages dispersed over distant Turkish vilayets, which left their language
with little chances for survival.
Since there does not seem to be any evidence that the authorities distinguished
between the different Caucasian groups in their settlement policies, the villages
of the immigrants with different linguistic backgrounds were dispersed over
different regions of Anatolia, with relatively restricted possibility of
communication between the villages. As for the other two factors put forth by
Nettle and Romaine (2012), i.e., forced language shift and voluntary language
shift, most Caucasian speakers, initially mainly men but gradually women as
well, went through stages of bilingualism, acquiring proficiency in Turkish as
well as in their native tongues, and in many cases in the surrounding (or
neighboring) Caucasian language(s). Nevertheless, the Caucasian languages other
than Ubykh have managed to survive to the present day, albeit with observable
language attrition among the members of all the groups. This attrition grew as
the level of education of each new generation of the diaspora improved and as
mass media reached even the most remote parts of the country, particularly with
the introduction of television, which made Turkish accessible to even the oldest
female members of the communities, who had not had much exposure to the
dominant language until then. The question then is why it was Ubykh, but not
the other Caucasian languages, that became extinct.
The claim here is that the answer lies not only in the nature of the extra-
linguistic factors of language endangerment and extinction but that the com-
plexity of the linguistic system of the language, specifically of its phonological
system, should also be taken into consideration. Given the same external fac-
tors, the complexity of a phonological system, particularly when the children
are exposed to languages with much simpler phonological systems, is very
likely to have a negative effect on the transfer of the language from one genera-
tion to the next, eventually leading to the extinction of the language, as in the
case of Ubykh.
4 The Linguistic System of Ubykh
Ubykh belongs to the Adygean branch of the Northwest Caucasian language
family. Its morphological and morphosyntactic features are prototypical of
those of the NW Caucasian language family. Within the assumptions of the
generative grammar framework, Ubykh has been shown to abide by the general
(universal) syntactic constraints: movement out of complement clauses
but not out of adjunct clauses is licensed (Özsoy 1998), and its relative clause
constructions conform to universal properties of relativization (Özsoy 1988). As
expressed by Colarusso, however, the phonological system of Ubykh is one
where “. . . any rigorous account of human phonetic perceptual capacity will
have to take into account this precious marvel, Ubykh”
(www.languagesoftheworld.info/ . . ./obituary-the-ubykh-language.html, Jan. 25, 2012).
The descriptions of the phonological system of Ubykh by Vogt (1963),
Dumezil (1965), and Charachidze (1989) reflect the extent of the complexity
of the system. Measured against the notion of simplicity in consonantal systems
defined by WALS, the differences in the three accounts of the consonantal
system of Ubykh provide support for the view that, along with the diminishing
number of its speakers as the most significant of the external forces
of language endangerment and extinction, the complexity of its phonological
system had a strong effect on the eventual extinction of Ubykh. In WALS,
phonological/phonetic simplicity is defined as follows: “Consonants (which)
are in various ways inherently simpler (perhaps because they involve smaller
movements to pronounce them, or are easier for a listener to distinguish from
other sounds)”. Complexity in the consonantal system of a language can then
be described in terms of a number of different properties: the size of its
phonemic inventory, the degree of (dis)proportionateness among the members
of the phonemic inventory, and the nature of its phonetic features. The
differences among the three existing descriptions of the phonology of Ubykh
show that the phonological component of the language exhibits all these
properties.
4.1 Ubykh Phonology

The three accounts of the phonological system of Ubykh by Vogt (1963),
Dumezil (1965), and Charachidze (1989) differ from each other in the nature
and number of the consonantal phonemes of the language. As Colarusso (1992)
reports, Dumezil (1965) modified the analysis of the consonantal system initially
set up by Vogt (1963) by adding the velar plosive series [k, g, k’] and deleting the
voiceless glottal fricative [h]. Colarusso (1992) further comments that Leroy
and Paris (1974) carried out a number of X-ray analyses of Tevfik Esenç to
clarify the issues in the phonetics of Ubykh. Regardless of the non-uniformity in
the descriptions and the phonetic issues left unclarified, the fact remains that
the consonant inventory of Ubykh is one of the largest among the languages
of the world according to Maddieson’s (2013) survey of consonant systems.3
The consonants of Ubykh have a very complex set of phonetic features, with a
large number of points of articulation and many sounds being produced with
double secondary articulations. Ubykh is also recorded to possess
the most disproportionate ratio of phonemic consonants to vowels. As these
facts reflect, Ubykh has a very complex phonological system.
3 With 122 consonants, the Khoisan family of southern Africa has the largest
consonant inventory (Maddieson, 2013). Khoisan languages have click sounds,
which are not found in Ubykh.
4.1.1 Consonant Inventory

Dumezil (1965) posits 81 consonantal phonemes for Ubykh, differing from
the number given by Vogt (1963), who differentiated 80 consonantal contrasts.
According to Dumezil’s system, Ubykh has 29 fricatives, 44 plosives/affricates,
27 sibilants, 20 uvulars, 3 different l-sounds, sonorants, and three vowels. The
plosive and affricate systems are given in Table 1.
Based on Charachidzé’s (1983) description, Table 1 shows that for every
point of articulation of the plosives/affricates there is an ejective
sound. The table also reflects ten points of articulation for the plosives and
affricates, distinguishing alveolar, post-alveolar, alveolo-palatal and retroflex
plosives/affricates. Within the alveolar and post-alveo-palatal regions, there
are seven contrasting sounds, and the velar-uvular region has 16 contrastive
plosive sounds. Labialization as secondary articulation is common to all
plosive/affricate sounds with the exception of the lamino-post-alveolar and
apico-palatal series. Velar and uvular sounds have palatalized counterparts,
and the labials have pharyngealized ones.

Table 1 Plosive and affricate phonemes of Ubykh (Charachidzé 1983)
Plosives and affricates                   Voiceless  Voiced  Glottalic
Labial                   Plain            p          b       p’
                         Pharyngealized   p̱          ḇ       p̱’
Apico-dental             Plain            t          d       t’
                         Labialized       tº̱         dº̱      tº̱’
Apico-alveolar                            –          –       ɫ’
Lamino-alveolar                           c          ʒ       c’
Dorso-post-alveolar      Plain            ċ          ʒ̇       ċ’
                         Labialized       ċº̱         ʒ̇º̱      ċº̱’
Lamino-post-alveolar                      č          ǯ       č’
Apico-palatal                             ç          ƺ       ç’
Dorso-velar              Plain            –          –       (k’)
                         Palatalized      k’         g’      k’·
                         Labialized       kº̱         gº̱      kº̱’
Dorso-uvular             Plain            q          –       q’
                         Palatalized      q’         –       q’’
                         Labialized       qº̱         –       qº̱’
Dorso-uvular-pharyngeal  Plain            q̱          –       q̱’
                         Labialized       q̱º̱         –       q̱º̱’
Table 2 Fricative series in Ubykh
Fricatives                                Voiceless  Voiced
Labial                   Plain            –          –
                         Pharyngealized   –          –
Labio-dental             Plain            f          –
                         Pharyngealized   –          v
Apico-dental             Plain            –          –
                         Labialized       –          –
Apico-alveolar                            ɫ          l
Lamino-alveolar                           s          z
Dorso-post-alveolar      Plain            ṡ          ż
                         Labialized       ṡº̱         żº̱
Lamino-post-alveolar     Plain            –          –
                         Labialized       šº         žº
Apico-palatal                             ş          Ȥ
Dorso-velar              Plain            x          ɣ
Dorso-uvular             Plain            X          Ɣ
                         Palatalized      X’         Ɣ’
                         Labialized       Xº̱         Ɣº̱
Dorso-uvular-pharyngeal  Plain            X̱          Ɣ̱
                         Labialized       X̱º̱         Ɣ̱º̱
Glottal                                   h          –
As Table 2 reflects, Ubykh has a very complex series of fricative sounds. The
table illustrates that Ubykh has a fricative contrast for every point of
articulation with the exception of the labial and apico-dental regions. There
are 10 contrasts in the sibilant series and 12 contrasts in the velar/uvular
series. At each point of articulation, there is a voicing contrast as well,
resulting in voiced and voiceless members of each pair. Labialization and
palatalization are the common secondary articulations.
Table 3 Sonorants in Ubykh
Sonorants                        Nasals  Liquids  Glides
Labials         Plain            m       –        w
                Pharyngealized   m̱       –        w̱
Apico-dental    Plain            n       –        –
Apico-alveolar                   –       r        –
As shown in Table 3, the sonorant system of Ubykh has two points of articulation
for nasals. The bilabial nasal and the glide have pharyngealized counterparts.
4.1.2 Phonetic Features

The complexity of the phonetic features of the Ubykh sounds is reflected in the
phonemic charts given above. As can be observed in the charts, Ubykh employs
almost every possible point of articulation in the oral cavity for contrast. The
contrasts in the alveolar/post-alveolar/palatal areas produced with apico/
lamino/dorso distinctions, with retroflexed affricates and fricatives, seem to
exhaust all possible phonetic productions. The complexity of the velar/uvular
series of plosives and fricatives and the sibilant series are phonetic challenges,
particularly in view of the much simpler systems of the surrounding languages.
Ubykh has 3 different l-sounds.
An additional source of phonetic complexity of Ubykh sounds is the three
processes of secondary articulation in the plosive and fricative series. The
majority of sounds in these series have contrasting phonemes produced with
the secondary articulations of labialization, palatalization, and
pharyngealization. Ubykh also has a pharyngealized bilabial nasal.
4.1.3 Consonant-vowel (dis)proportion

Ubykh is a language with the most disproportionate ratio of consonants
to vowels. In contrast to the 81 consonants of the language, Dumezil posits
3 vowels.
4.2 Consonantal Systems of Neighboring Languages

A comparison of Ubykh’s phonemic system with the phonemic systems of some
of the neighboring Caucasian languages brings the complexity of the former
into sharper focus. For one, all other languages have a consonantal inventory
much smaller than that of Ubykh. Colarusso (1989) posits 48 consonants
for Kabardian (Kabartay); Abaza has 60 consonants (cf. Lomtatidze and Klychef
1989). A consequence of these facts is that the phonetic features of the
consonants are much simpler than those of Ubykh sounds. Further, the
consonant-vowel ratio of these languages is not as disproportionate as it is in
Ubykh. All these indicate that the consonantal systems of these languages are
not as complex as Ubykh’s. Table 4 reflects the consonant system of Kabardian
adapted from Colarusso (1989).

Table 4 The consonant system of Kabardian (Colarusso, 1989)
                  Stops            Fricatives       Sonorants
                  vcl  vcd  gl     vcl  vcd  gl     nasal  trill  glide
Labial            p    b    p’     f    v    f’     m
Alveolar          t    d    t’                      n      r
                  c    ʒ    c’     s    z
Lateral                            ɫ    l    ɫ’
Alveo-palatal                      ṡ    ż    ṡ’
Palato-alveolar                    š    ž
Palatal           k’   g’   kº’    x̂    ĝ                         y
Velar             kº   gº   kº’    xº                             w
Uvular            q         q’     X    Ɣ
                  qº        qº’    Xº   Ɣº
Pharyngeal                         ħ    ʕ
Laryngeal         ʔ                h
                  ʔº
Typical of a Caucasian language, the consonantal system of Kabardian
too includes ejectives in its inventory. Nevertheless, the total number of
consonants in Kabardian is significantly smaller than in Ubykh. According to
Colarusso (1989), Kabardian has 50 phonemes in total, 48 of which are
consonants and 2 of which are vowels. As can be noted, Kabardian has much
simpler plosive and fricative series than Ubykh. The contrasts in the sibilant
and uvular/pharyngeal series are far fewer and secondary articulations much
more restricted than those of Ubykh, resulting in a much simpler phonological
system within the definition of simplicity provided by WALS.
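The consonant-vowel (dis)proportion discussed above can be made concrete with a short calculation. The sketch below uses only the counts cited in the text (Ubykh after Dumezil, Kabardian after Colarusso 1989); the names `inventories` and `cv_ratio` are illustrative assumptions, and Abaza is left out because the text gives its consonant count (60) but no vowel count.

```python
# Consonant and vowel counts as cited in the text:
# Ubykh after Dumezil (1965), Kabardian after Colarusso (1989).
inventories = {
    "Ubykh": {"consonants": 81, "vowels": 3},
    "Kabardian": {"consonants": 48, "vowels": 2},
}


def cv_ratio(entry):
    """Number of consonants per vowel for one phoneme inventory."""
    return entry["consonants"] / entry["vowels"]


for language, entry in inventories.items():
    print(
        f"{language}: {entry['consonants']} consonants, "
        f"{entry['vowels']} vowels, ratio {cv_ratio(entry):.1f}:1"
    )
```

On these figures Ubykh has 27 consonants per vowel against 24 for Kabardian, so the ratio alone separates the two languages less sharply than the absolute size of the consonant inventory does.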
5 Discussion
Language extinction takes place when a language is no longer passed on to the
younger generations. What is significant in the case of Ubykh is that the
factors that lead to language endangerment and extinction (cf. Brenzinger and de
Graaf 2006) were, as discussed earlier, common to the speakers of other
Caucasian languages as well. Further, the relevance of both the external and
internal factors defined by Brenzinger and de Graaf (2006) should be
approached with caution in the case of the speakers of Caucasian languages
in the present day. In the absence of any survey or official statistics regarding
the linguistic affinity of individuals, it is difficult to determine the extent to
which members of the Caucasian diaspora identify with their cultural and
linguistic heritage. The last census in which proficiency in language(s) other than
Turkish was recorded was in the 1960s. Since then, no official record of
linguistic or cultural heritage has been available. However, the existence of
several cultural associations founded by the Caucasian diaspora, which are
actively involved in promoting their cultural and linguistic heritage among the
younger generations, is an indication that, despite the irreversible historical
facts, the internal factors generally associated with language extinction should
be considered with great caution in the case of the people of Caucasian descent
living in Turkey today. The growing awareness of their common descent and
interest in their linguistic heritage among the younger generations of the
Caucasian diaspora are an indication of pride in their cultural heritage. What is
also significant is that people of Ubykh origin, although the language has now
been extinct for over two decades, are proud of their heritage and are highly
esteemed by the other groups of Caucasian descent.
It is therefore highly unlikely that an analysis of language extinction in
terms of the common definition of internal factors – the absence of group
identity, loss of – can be considered in the case of Ubykh. Then the question
remains: Why did Ubykh become extinct? The answer lies in the account
of the phonemic inventory of Ubykh and its comparison with the phonemic
inventories of the other northwestern Caucasian languages given above, which
also leaves dim hopes for the revival of the language. In the language contact
situation that developed as a consequence of the external factors discussed
above, the complexity of the phonological system of Ubykh became a source of
challenge for the child acquiring language. Children acquire language through
exposure (cf. Chomsky 1975, Crain and Lillo-Martin 1999, Saxton 2010). When
exposed to phonetically simpler systems, the child is more likely to acquire the
less complex system. Given the fact that the number of speakers of Ubykh was
much smaller than that of the other Caucasian languages, “Many
Ubykh gradually assimilated to things Circassian and all of them eventually
developed into Turkish citizens cherishing their Ubykh (-Circassian) descent”
(Smeets, 2013).
References
Andrews, P. 1989. Ethnic Groups in the Republic of Turkey. Wiesbaden: Harrassowitz.
Brenzinger, M., T. de Graaf. 2006. Language Documentation and Maintenance.
Encyclopedia of Life Support Systems (EOLSS 6.20B.10.3), UNESCO. (On-line
encyclopedia: http://www.eolss.net/)
Charachidze, G. 1989. Ubykh. In Greppin, J. A. C. (gen. ed.). The Indigenous Languages
of the Caucasus, vol. 2, B. G. Hewitt (ed.). The North West Caucasian Languages,
357-459. Delmar, New York: Caravan Books.
Chomsky, N. 1975. Reflections on Language. New York: Pantheon Books.
Colarusso, J. 1992. How many consonants does Ubykh have? In Hewitt, G. (ed.),
Caucasian Perspectives, 145-55. Unterschleissheim: Lincom Europa.
Colarusso, J. 1989. Kabardian. In Greppin, J. A. C. (gen. ed.). The Indigenous Languages
of the Caucasus, vol. 2, B. G. Hewitt (ed.). The North West Caucasian Languages,
263-355. Delmar, New York: Caravan Books.
Crain, S., D. C. Lillo-Martin. 1999. An Introduction to Linguistic Theory and Language
Acquisition. Malden (MA): Blackwell Publishing Ltd.
Dumezil, G. 1975. Le verbe oubykh. Paris: Klincksieck.
Farfán, J. A. F., F. Ramallo. 2010. New Perspectives on Endangered Languages.
Amsterdam: John Benjamins.
Landmann, A. 1981. Akifiye-Büyükçamurlu: Ubychen-Dörfer in der Südost-Türkei.
Heidelberg: Esprint Verlag.
Lomtatidze, K., R. Klychef. 1989. Abaza. In Greppin, J. A. C. (gen. ed.). The Indigenous
Languages of the Caucasus, vol. 2, B. G. Hewitt (ed.). The North West Caucasian
Languages, 91-153. Delmar, New York: Caravan Books.
Maddieson, I. 2013. Consonant Inventories. In Dryer, Matthew S. & Haspelmath, Martin
(eds.) The World Atlas of Language Structures Online. Leipzig: Max Planck Institute
for Evolutionary Anthropology.
Nettle, D., S. Romaine. 2000. Vanishing Voices. Oxford: Oxford University Press.
Özsoy, A. S. 1998. Ibıhça’da Yapısal Engeller (Structural Constraints in Oubykh). Doğan
Aksan Armağanı. Ankara: Ankara Üniversitesi Basımevi.
Özsoy, A. S. 1988. Relative Clause Construction in Oubykh. Proceedings of the IVth
Colloquium on Caucasology. CNRS, Paris.
Saxton, M. 2010. Child Language: Acquisition and Development. Thousand Oaks (CA):
SAGE Publications Ltd.
Smeets, R. 2013. Reflections on the Caucasus: 21 May 1864-2010.
https://circassianworld.wordpress.com/2013/09/06/reflections-on-the-caucasus-21-may-1864-2010/.
Tsunoda, Tasaku. 2006. Language Endangerment and Language Revitalization. Berlin:
Mouton de Gruyter.
Vogt, H. 1963. Dictionnaire de la langue oubykh. Oslo: Universitetsforlaget.
World Atlas of Language Structures (WALS). http://wals.info/chapter/1.
Whaley, Lindsay. 2004. Can a Language that never existed be saved? In Freeland, J. & D.
Patrick (eds.). Language Rights and Language Survival. Manchester, UK: St. Jerome
Publications.
chapter 13
Diversity in Dukhan Reindeer Terminology1

Elisabetta Ragagnin
The Dukhan People and Language
The Dukhan people are a Turkic-speaking nomadic group inhabiting the
northernmost parts of Mongolia’s Khövsgöl region. This area borders on the
northeast with Buryatia and on the west with the Tuvan republic. Nowadays
ethnic Dukhans number approximately 500 people and are divided into two
main groups: those of the “West Taiga” (barïïn dayga) originate from Tere Khöl,
whereas those of the “East Taiga” (ǰüün dayga) came from Toju; both regions
are in Tuva.
Presently, around 32 Dukhan families are reindeer herders2 in the
surrounding taiga areas, on the south slopes of the Sayan mountains, whereas the
remaining families have settled down in the village of Tsagaan Nuur and in
neighbouring river areas, abandoning reindeer breeding. Some families,
however, regularly rejoin the taiga in the summer months and tend to reindeer.
Although the Dukhan people identify themselves as tuhha, in Mongolia they
are generally called Tsaatan, a rather derogatory term meaning ‘those who
have reindeer’, stressing in this way the fact that they are not like Mongolian
herders.3 Recently the more neutral Mongolian term tsaačin ‘reindeer herders’
has been introduced. In the available published materials, Dukhans have been
designated by several other names such as “Urianxay”, “Taiga Urianxay”, “Taigïn
Irged” ‘peoples of the taiga’, “Oin Irged” ‘peoples of the forest’ and “Soiot”
(Badamxatan 1962: 3). Dukhans do not call themselves Uyghur, as claimed in
some publications; see Ragagnin (2011: 20-21).
1 I wish to thank the Dukhan community for their constant cooperation in documenting their
language and culture.
2 Dukhans follow the so-called Sayan-type of reindeer breeding, characterized by small-size
herds of reindeer used as pack and riding animals and as a source of milk products. On the
Sayan-type of reindeer herding, see Vainshtein (1980). For more recent views of Sayan
economies, see Donahoe & Plumley (2003). Hunting used to be an important part of the Dukhan
economy. However, hunting and fishing proscriptions were recently issued by the Mongolian
government. In order to balance the impact of these proscriptions, the Mongolian
government has granted Dukhan families dwelling in the taiga and tending to reindeer a state
pension calculated on the basis of family numbers.
3 The Mongolian style of pastoral nomadism is based on the so-called five snouted animals:
sheep, goats, cattle (cows and yaks), horses and camels.
Linguistically, Dukhan belongs to the Taiga subgroup of Sayan Turkic
together with Tofan, the Toju variety of Tuvan and some varieties of the Tere-
Khöl area as well as Soyot of Buryatia.4 Reindeer-breeding and hunting char-
acterizes or characterized the lifestyle of these groups until not too long ago.
On the other hand, Standard Tuvan and the rest of its dialects belong to
Steppe Sayan Turkic together with Altay-Sayan varieties spoken in China and
western Mongolia, and Uyghur-Uriankhay (Tuhan) of East Khövsgöl.5
Nowadays, Dukhan is actively spoken by the older generation, that is by
speakers older than 40. Younger Dukhans communicate in Darkhat-Mongolian,
although they possess passive knowledge of Dukhan. Furthermore, language
loss is more acute in Tsagaan Nuur and river areas, where most of the house-
holds have already completely switched to Darkhat-Mongolian.6
As the result of a Mongolian-Tuvan educational project, Standard Tuvan,
which differs from Dukhan, was taught on a non-compulsory basis as a for-
eign language for three hours per week in the local boarding school for just a
few years (1990-1993 and 2002-2005). In more recent years, Oyunbadam, the
director of the local boarding school, has been organizing Dukhan language
summer schools in the taiga camps. She is also trying to reintroduce Tuvan
language teaching in the school curriculum. Mass media in the Tuvan language are
not available. Television, which is available in the taiga as well, broadcasts
programmes in Khalkha-Mongolian.
The fact that Dukhan people are quite famous in Mongolia (and beyond)
as being the only reindeer herders of the country – and thus representing a
highly rated tourist destination advertised by national and international
tourist agencies – has positively affected the Dukhan community with regard to
the consciousness of the uniqueness of their culture and native language.
However, the future status and use of Dukhan will be dependent on the
socio-economic status and the level of education, professional knowledge
and employment of the Dukhan people.
4 On the taiga vs. steppe division, though with slight differences from the view presented here,
see Žukovskaja et al. (2002). Furthermore, on Soyot, see Rassadin (2010), on Tofan, Rassadin
(a.o. 1971, 1978, 2014) and Harrison (2003), on Toju, Čadamba (1974), on Tere-Khöl Tuvan,
Seren (2006), and on Dukhan, Ragagnin (2011).
5 On standard Tuvan, see, a.o. Isxakov & Palm’bax (1961) and Anderson & Harrison (1998); on
Tuvan dialects, see Sat (1987) and on Uyghur-Urianxay, see Bold (1975) and Ragagnin (2009).
6 The general view among scholars is that the Darkhat people are of Turkic origin and that
their language and customs have become Mongol in the past few centuries. For a short survey
of Darkhat grammatical features, see Sanžeev (1931) and Gáspár (2006).
Below I shall list and comment on Dukhan reindeer terminology, a
unique part of their lexicon that surely belongs to the world's cultural heritage. This
scarcely documented special lexicon is rapidly being lost and falling into
oblivion. During my fieldwork sojourns, I could see for myself how blurred it has
become among many Dukhans.
Dukhan Reindeer Terminology7
To start with, the general term referring to ‘reindeer’ is iβɨ ‘rangifer tarandus
sibiricus’ (cf. Tuvan ivi and Tofan ibi ~ ivi ‘id’). This term is possibly etymologically
related to iwiq ‘the she-antelope, which frequents stony tracks and
deserts’, recorded in Maḥmūd Al-Kāšɣarī’s encyclopaedic work Compendium of
the Turkic Dialects and glossed with Arabic ẓabya (Dankoff & Kelly 1982: 108);
cf. Hauenschild (2003: 100). A wild, i.e. untamed, reindeer, on the other hand,
is called tasfanaŋ. This term possibly goes back to taspan (see below) and aŋ
‘wild animal’. Among Taiga Sayan Turkic varieties, only Soyot displays the
close cognate daspanaŋ ‘wild reindeer’ (Rassadin 2006: 43).
With regard to a new-born reindeer calf, Dukhan displays the terms ehsɨrɨk
and anhay. The former is a transparent Turkic agent formation from the ver-
bal stem ehsɨr- ‘to get drunk’ and literally means ‘drunkard’. This denomination
most probably is based on the fact that the new-born baby reindeer tumbles
like a drunkard. Moreover, it surely belongs to the set of taboo names applied
to young creatures (both humans and animals) in order to protect them from
evil spirits. It is quite unlikely that evil spirits would take away a drunkard.
Moreover, from ehsɨrɨk the verbal stem ehsɨrɨkte- ‘to calve/to foal’ is formed. The
latter term, anhay, is etymologically more obscure. It could be traced back
to *ana ‘mother’ augmented by the hypocoristic suffix -KAy, thus meaning
‘mommy’.8 Both formal and semantic correspondences of Dukhan ehsɨrɨk and
anhay are documented throughout Taiga Sayan Turkic, e.g. Toju-Tuvan aˁniy
‘reindeer calf’, eˁzirik ‘affectionate name for reindeer calf’ (Čadamba 1974: 63),
Soyot aˁnay ~ aˁnhay ‘reindeer calf till one year of age’, eˁsirik ‘new-born rein-
deer calf’ (Rassadin 2006: 22, 85, 204). Steppe Sayan Turkic, on the other hand,
displays corresponding items referring to babies of other animals, cf. ezirik
(eˁzirik) ‘goatling, (kid), deer’s cub (fawn)’ (Tenišev 1968: 608b, Dorlig & Dadar-ool
1994: 242a), and aˁnay ‘young offspring of a goat or mountain goat’ (Monguš
2003: 130b).
7 Some Dukhan reindeer terms are listed in Badamxatan (1962: 9), Somfai-Kara (1998: 18-19),
Kuular & Suvandi (2011a) and Ragagnin (2012).
8 Similar expressions are used a.o. in Turkish, see Ragagnin (2012: 135-136) for details.
Moreover, with regard to new-born reindeer, the Dukhan inventory includes the
terms öskʉsek and hosbayak, which are so far not documented in the rest of Taiga Sayan
Turkic. The former denotes a motherless fawn and morphologically represents
a diminutive formation from öskʉs ‘orphan’. The latter refers to a new-born
reindeer fawn rejected by its own mother and represents a nominal formation
from the verbal stem hos- ‘to refuse animal’s babies’.9
Proceeding along the age axis, a young reindeer up to one year of age is
called hokkaš, a rather obscure term, arguably going back to the diminutive
formation kuškaš ‘small bird’10 from kuš ‘bird’ through phonological distortion,
not uncommon in taboo names. Cf., a.o., Tere-Khöl Tuvan xokaš ‘reindeer calf
below one year of age’ (Seren 2006: 81), Tofan hokkaš ‘one-year old reindeer
calf’ (Rassadin 1971: 190) and Soyot hoqaš ~ hokkaš ‘one year-old reindeer calf
(in its second year)’ (Rassadin 2006: 85).
When reaching one year of age, the young reindeer is called taspan.
Cf. Toju Tuvan daspan ‘1 year-old reindeer’ (Sat 1987: 77), Tofan daspan
‘2 year-old young wild reindeer’ (Rassadin 1995: 21a). The etymology of taspan
is obscure; for some proposals, see Tatarincev (2002: 105-106).
Six months later, in the autumn, at the age of 18 months, the male reindeer
is named toŋgǝr and the female toŋgʉy.11 Both toŋgǝr and toŋgʉy are etymologically
rather obscure.12 They may be related to a rhotacised form of Turkic
toŋuz ‘pig’.13 This assumption, however, needs further investigation. Cognates
of Dukhan toŋgǝr and toŋgʉy are documented in neighbouring Taiga-Sayan
varieties, cf. Tere-Khöl Tuvan toŋgur/tuŋxur ‘about two-year old male reindeer’
and tuŋxuy ~ tuŋguy ‘about two-year old reindeer doe’ (Kuular & Suvandi 2011:
165); also cf. Toju Tuvan dongur ‘about one-year-old young reindeer’, Tofan
dongor ‘grown-up wild male reindeer’, Altay dongur ‘young reindeer till one
year of age’ (Seren 2006: 82) and Soyot doŋgur ~ doŋgɨr ‘two-year-old reindeer
buck’ (Rassadin 2006: 45).
9 On the other hand, Standard Tuvan employs the form xosturgan (xos- ‘to refuse animal
babies’-caus-past part) to characterize a young animal refused by its own mother, e.g.
xosturgan xuragan ‘rejected lamb’. I wish to thank my colleague Choduraa Tumat for providing
me with this example.
10 Note, in this respect, that a structurally similar lexeme is documented in Sarigh Yugur:
gohqaš ‘small bird’ (Nugteren & Roos 2006: 110).
11 For all the analyzed terms, to distinguish between males and females, the terms er/erhek
vs. epšɨ may be used.
12 Within the whole of Sayan Turkic, Turkic toŋuz shows modern correspondences merely
in the so-called Uyghur-Uriankhay variety of Eastern Khövsgöl Aimag in the form toos
(Ragagnin 2009: 229).
13 It is worth noting that traces of Turkic toŋuz occur in other documented languages of this
area, such as Mator toŋoi ‘pig’ (Helimski 1997: 365, §1060); Toju Tuvan doŋay ‘two year-old
wild reindeer’ and Tuvan doŋay ‘bear cub’ (Monguš 2003: 474a) may also belong here. In
this respect, it should be kept in mind that names of strong and dangerous animals, such
as the boar, belong to the set of taboo names in use across Siberia and neighbouring areas.
With regard to productive male reindeer, Dukhan uses the term ehter, liter-
ally ‘screamer in rut’, a Turkic participial formation from eht- ‘to scream in rut’,
already documented in Old Turkic sources. Evidently, ‘screaming in rut’ is the
most salient characteristic of a male reindeer in rut. Cognates are documented
throughout Taiga Sayan Turkic, e.g. Toju Tuvan eˀder (Čadamba 1974:
63-64) and Soyot eˀter ~ eˀtǝr (Rassadin 2006: 205) ‘reindeer buck’. Cognate
forms are also documented in neighbouring Buryat Sayan dialects: eteer ‘breed-
ing reindeer’ (Rassadin 1996: 149). Besides, Dukhan displays the term döŋxʉr,
originally meaning ‘hornless’ and used with reference to a young reindeer
buck. Tofan displays the cognate dönggür ‘male domesticated uncastrated
ridable reindeer in its third year and first mating season, but not ready for
mating’ (Harrison 2010: 57). Also cf. Toju Tuvan döˁnggür ‘without antlers, with
dropped antlers, one of the terms in use for a breeding reindeer’ (Čadamba
1974: 64). Accordingly, this term may be used for a reindeer buck after its first
antlers have fallen. Also see Tatarincev (2002: 235-236) and Monguš (2003: 495b) in
this respect. A term peculiar to Dukhan is usǝn but, literally ‘long leg’, specifi-
cally referring to a reindeer buck that will be castrated in the autumn, at least
according to some informants. This term is not documented in other Taiga
Sayan varieties. Finally, an “experienced” reindeer buck is called ulǝγ ehter, literally
‘big ehter’.
The general term to refer to a fertile reindeer doe is mïndɨ. Etymologically,
it most likely represents a loanword from Samoyedic; cf. Mator méinde ‘rangifer
ferus’.14 Cognates of this item are widespread across north Asian languages
whose speakers are reindeer herders, see Tatár (1985) for examples. Further
terms specific to does are mïndɨǰak, a diminutive form of mïndɨ, denoting a
reindeer doe that has foaled once, and ǰaš doŋgʉy, literally ‘young doŋgʉy’,
referring to a young doe about to foal. Furthermore, xur doŋgʉy, literally ‘last
year’s doŋgʉy’, is employed for a reindeer doe that has foaled twice, haš mïndɨ
refers to an older reindeer doe with little fur, kïsǝr mïndɨ to a dry doe and,
finally, a mature reindeer doe is called ulǝɣ mïndɨ, i.e. ‘big mïndɨ’.
14 According to Helimski (1997: 301-302), Mator méinde may be traced back to Proto-Samoyedic
*məjan-ce̮ɜ (məjan ‘ground (gen)’ + ce̮ɜ ‘(tamed) reindeer’).
With regard to gelded reindeer, Dukhan displays a rather large set of terms,
including guuday, tüktǝɣ mĩĩs, pir tüktǝɣ guuday, ihx̃ɨ tüktǝɣ guuday, üš tüktǝɣ
guuday, ǰarɨ and bogana. Castration usually occurs when the reindeer is in
toŋgǝr-age, in autumn;15 the expression toŋgǝr tïhrtaar, literally ‘pull the toŋgǝr-reindeer’,
in fact means ‘to castrate’. Once castrated, the reindeer is called guuday;
also cf. Soyot quuday ‘domesticated three years old reindeer buck’, Tere-Khöl
Tuvan kuuday ‘small/young male reindeer’ (general term), Tofan kuuday ‘reindeer
buck about 2-3 years old’ (Rassadin 1995: 33a). Ščerbak (1961: 91-92) and
Tatarincev (2004: 327) derive the term kuuday from Turkic kuu ‘grey’ and day
‘foal, young horse’. One year after castration, that is when the reindeer is three
years old, it is called tüktǝɣ mĩĩs, literally ‘hairy antler’. A Dukhan synonym for
tüktǝɣ mĩĩs is pir tüktǝɣ guuday, literally ‘one-haired guuday’ (one hair-der guuday).
Based on the same syntactic structure (cardinal number + tük ‘hair’, augmented
by the adjectivalizing suffix -LXɣ + guuday) are ihx̃ɨ tüktǝɣ guuday and üš
tüktǝɣ guuday, referring, respectively, to ‘two-haired guuday’ and ‘three-haired
guuday’, i.e. ‘four-year vs. five-year old gelded reindeer’. Formally related items
are documented in the other Taiga Sayan Turkic varieties, a.o. Soyot düktɨγ miis
‘domesticated young reindeer buck in its third year of age’ (Rassadin 2006: 47),
Tere-Khöl Tuvan iyi tüktüg kuuday ‘three years old male reindeer’, üš tüktüg
kuuday ‘four years old male reindeer’ (Seren 2006: 82), Toju Tuvan bir düktüg
mïyïs ‘male reindeer about 3 years of age’, iyi düktüg mïyïs ‘male reindeer about
four years of age’, üš düktüg mïyïs ‘male reindeer about five years of age’ (Seren
2006: 82). Furthermore, in Dukhan, the lexeme mĩĩs occurs in the construction
sããrsǝk mĩĩs with reference to a reindeer with one dropped antler (sããrsǝk
‘one of two’).
The term ǰarɨ refers to a “calm” riding and packing reindeer, older than
four years of age, at least according to some informants. Cognates are found
throughout Taiga Sayan Turkic, e.g. Tofan ǰarǝ, Soyot čarï ‘riding and packing
reindeer’ (Rassadin 1971: 194; 2006: 153), Tere-Khöl and Toju Tuvan čarï
‘castrated reindeer’ (Seren 2006: 82). Interestingly enough, Steppe Sayan Turkic
varieties show a rather different picture. The standard Tuvan cognate čarï
refers to a breeding male reindeer (Tenišev 1968: 520a) and in the so-called
Uyghur Uriankhay Sayan-Turkic variety of Eastern Kubsugul ǰarǝ is the only
existing term meaning ‘reindeer’ (Ragagnin 2009). Notably, a
cognate is documented in Rašīd-ud-Dīn’s Compendium of Chronicles. In §107,
dealing with the “forest” Uriangqat tribe, the Ilkhanid Persian historian wrote:
“They had no cattle or sheep but raised and caught instead mountain oxen,
mountain rams, and jür (antelope), which is like a mountain sheep, which they
milked and drank” (Thackston 2012: 42, §107).16 Note, in this respect, that the
Sayan western Buryat dialects, which show several Turkic-induced features,
display the cognate zari denoting both a gelded reindeer (older than 4 years
of age) and a (breeding) reindeer (Čeremisov 1973: 251b). Cognates of this term
are quite widespread in Siberia. Yenisseic terms such as, for instance, Yug 4sɛh:ːr
‘reindeer’ and Ket sʼɛlʼ ‘reindeer’ may be related items (Werner 2002: 183); also
see Khabtagaeva (2015: 116). Furthermore, Dukhan ǰarɨ occurs in the nominal
compound ehter-ǰarɨ which designates a breeding reindeer. Formal and
semantic cognates are well documented in other Taiga Sayan Turkic varieties,
e.g. Tofan eˀter ǰarï (Rassadin 1996: 149-150) and Soyot eˀter čarï (Rassadin 2006:
153). In these formations ǰarɨ is clearly used as a species collective denomination.17
The compound noun can thus be interpreted as ‘the category of rutting
reindeer’.
15 On reindeer castration, see, a.o. Vainshtein (1980: 126).
Last of all, bogana ~ mogana refers to a male reindeer castrated at an advanced
age. Tere-Khöl Tuvan patterns with Dukhan, displaying boxana ‘gelded
reindeer’ (Kuular & Suvandi 2011: 165), whereas Toju-Tuvan bogana čarï denotes
an older breeding reindeer. The etymology of bogana ~ mogana is obscure.
It may represent a nominal formation built with the Mongolic suffix -gAnA,
forming names of plants and animals (Poppe 1954: 41), or a deverbal formation
built with the Mongolic suffix -gAn (Poppe 1954: 45).18 Finally, the relation
between Dukhan bogana and Gagauz bobana ‘seven/eight years-old sheep’
(Ščerbak 1961: 153) needs further scrutiny.
Origin of Reindeer Breeding
Much ink has been spilled in trying to trace the origin of reindeer taming and
reindeer herding culture across northern Eurasia. According to the renowned
Russian anthropologist Vainshtein (a.o. 1980 and 1986), reindeer breeding originated
in the Baykal-Sayan area.19
16 Also cf. the information supplied by Marco Polo’s 13th-century travelogue concerning
reindeer herding nomads in the Bargu area (Ragagnin 2015).
17 In this regard, also see Hauenschild (2003: 105-106).
18 Cf. Erdal (1991: 87) for the corresponding denominal suffix -gAn deriving zoological and
botanical names in Old Turkic.
There are many Dukhan legends on the origin of reindeer taming. To conclude
the present contribution, I wish to include one of these. Several further variants
circulating within the Dukhan community have been collected and will
be published in Oyunbadam & Ragagnin (forthcoming).
“How Dukhans Became Reindeer Herders” (original Dukhan version)
(1) Ehrte purǝn šaɣ-da tuhha ǰon-nar tuhha ulǝs-tar
early previous time-loc Dukhan people-pl Dukhan people-pl
tayga taŋdǝ pile aŋ-na-p tiiŋ-ne-p gulaš pir
taiga high plain with game-v.der-cb squirrel-v.der-cb on foot one
hem-den pir hem-be gulaš-ta-p amǝdǝra-p
river-abl one river-dat on foot-v.der-cb live-cb
ǰora-an ǰime iyen. (2) Ïn-ǰa-n-gaš göhh-ey
move-post.vbn thing ptc that-v.der-med-cb many-adj.der
ǰïl-lǝŋ so-on-da göhh-ey ǰïl-ǝn aŋ-na-p
year-gen end-poss3-loc many-adj.der year-adv.der game-v.der-cb
tiiŋ-ne-p amǝdǝra-p ǰora-aš mïn-ǰa-p amǝdǝra-p
squirrel-v.der-cb live-cb move-cb this-v.der-cb live-cb
ǰora-aš palǝk paylaŋ pol-gaš aŋ meŋ guš tiiŋ
move-cb fish small fish become-cb game grain bird squirrel
altǝ tiiŋ mïn-dǝɣ aŋ pičče pičče aŋ paylak-tar-ǝ
sable squirrel this-adj.der game small small game richness-pl-poss3
pile aŋ-na-p amǝdǝra-p pir hem-ner hem-ner hem
with game-v.der-cb live-cb one river-pl river-pl river
ihšt-ǝn-ge ǰïht-ar pol-gan ǰime. (3) Ol
inside-poss3-dat lie-intra.vbn become-post.vbn thing that
hem hem-ner-nǝŋ ihšt-ǝn-de thuskay pasa hem hem-nǝŋ
river river-pl-gen inside-poss3-loc separated and river river-gen
tuhha-lar-ǝ hem-nǝŋ tuhha-lar-ǝ thuskay ol
Dukhan-pl-poss3 river-gen Dukhan-pl-poss3 separated that
hem hem-ǝn-de aŋ meŋ-ne-p aŋ-na-p
river river-poss3-loc game grain-v.der-cb game-v.der-cb
amǝdǝra-p ǰora-an ǰimee. (4) J̌a göhh-ey
live-cb move-post.vbn thing:emph yeah many-adj.der
19 In this respect, also see Laufer (1917), Vasilevič & Levin (1951) and Vitebsky (2005).
ǰïl-lǝŋ so-on-da maɣatčok pir ǰüs ǰïl-lǝŋ
year-gen end-poss3-loc maybe one hundred year-gen
so-on-da la ǰüs ǰïl-lǝŋ so-on-da ǰime
end-poss3-loc ptc hundred year-gen end-poss3-loc thing
be ol tayga taŋdǝ-sǝn-dan am pir göhh-ey
q that taiga high plain-poss3-abl now one abundant-adj.der
aŋ sïïn pol-gaš thoš irey iβɨ pol-gaš
game maral deer become-cb antelope bear reindeer become-cb
öske aŋ-nar tïkka höy pol-gan. (5) Onǝŋ ihšt-ǝn-den
other game-pl very many become-post that:gen inside-poss3-abl
iβɨ iβɨ te-p pir pasa pir ïndǝɣ aŋ par
reindeer reindeer say-cb one and one such game existent
pol-gandǝroo. (6) J̌a ol iβɨ te-p aŋ-nəŋ ol
become-res.emph yeah that reindeer say-cb game-gen that
tuhha ǰon-nar-nǝŋ hïyo-on-da ïrak emes tïkka ǰook
Dukhan people-pl-gen edge-poss3-loc distant neg.cop very near
töhxǝm ǰorǝ-p tur-ar ïndǝɣ iβɨ te-p aŋ-nar
close move-cb stand-intra.vbn such reindeer say-cb game-pl
tur-ar pol-gandǝrǝ iyen. (7) Ol tuhha-lar-nǝŋ
stand-intra.vbn become-res ptc that Dukhan-pl-gen
hïyo-on-da ol tayga taŋdǝ-lar-nǝŋ amǝdǝra-p ǰïht-ar
edge-poss3-loc that taiga high plain-pl-gen live-cb lie-intra.vbn
ǰer-ler-nǝŋ ol hem-ner-nǝŋ tayga taŋdǝ-sǝn-da
place-pl-gen that river-pl-gen taiga high plain-poss3-loc
ǰa ïn-ǰa-n-gaš ol am göhp ǰïl-lǝŋ so-on-da
yeah that-v.der-med-cb that now abundant year-gen end-poss3-loc
ǰime iyen ol tuhha-lar am ol iβǝ te-p aŋ pile
thing ptc that Dukhan-pl now that reindeer say-cb game with
pot-ǝ pile tïkka töhxǝm pol-gan-ǝ. (8) J̌a
self-poss3 with very close become-post.vbn-poss3 yeah
po iβǝ te-p po aŋ-dan tuht-ǝp al-gaš pot-tar-ǝβǝs
this reindeer say-cb this game-abl catch-cb take-cb self-pl-poss1.pl
mun-ǝp etele-p süt-ǝn sa-ap ihhj-ǝp ïn-ǰa
ride-cb use-cb milk-poss3.acc milk-cb drink-cb that-adv.der
pot-ǝβǝs-tǝŋ po amǝdǝra-l-ǝβǝs-ka hereɣle-se
self-poss1.pl-gen this live-n.der-poss1.pl-dat use-cond3
gan-dǝɣ ǰime erɣöö te-p ïn-ǰa-n-gaš ulǝs-tar
which-adj.der thing ptc-emph say-cb that-v.der-med-cb people-pl
ïn-ǰa sooda-ǰ-ǝp gel-gendǝroo. (9) Ïn-ǰa-n-gaš
that-adv.der speak-coop-cb come-res.emph that-v.der-med-cb
sooda-š-kan. (10) höy ïnǰa pir gahš ǰïl on aǰǝɣ
speak-coop-post many such one how many year ten more than
ïn-ǰa-aš sooda-ǰ-ǝp gel-gen. (11) Ïn-ǰa-aš la
that-v.der-cb speak-coop-cb come-post that-v.der-cb ptc
am ol iβǝ-nǝ tuht-ar thööhǝ pile ara-sǝn-ga
now that reindeer-acc catch-intra.vbn history with interval-poss3-dat
sooda-š-kaš ekkǝ thee tuht-ǝp tuht-pa-an thaa la
speak-coop-cb good ptc catch-cb catch-neg-post ptc ptc
tur-ǝp al-gandǝrǝ. (12) J̌a ïn-ǰa-n-gaš am pir le
stand-cb take-res yeah that-v.der-med-cb now one ptc
hün ol am po arhtïï tayga iβɨ tayga te-p pir
day that now this behind taiga reindeer taiga say-cb one
tayga-da pir göhh-ey tayga-lar ara-sǝn-da iβɨ
taiga-loc one abundant-adj.der taiga-pl interval-poss3-loc reindeer
tayga te-p pir ǰaŋgǝs poš tayga par pol-gandǝrǝ. (13) Ol
taiga say-cb one single free taiga existent become-res that
tayga-da ol iβɨ te-p aŋ ǰerle göhh-ey
taiga-loc that reindeer say-cb game really abundant-adj.der
pol-gandǝroo. (14) J̌a ïn-ǰa-n-gaš la am pir
become-res.emph yeah that-v.der-med-cb ptc now one
le hün ol tayga-da pir gatay ïn-da amǝdǝra-p
ptc day that taiga-loc one woman that-loc live-cb
ǰïht-ar pol-gandǝroo ol tayga-sǝn-da iβɨ tayga
lie-intra.vbn become-res.emph that taiga-poss3-loc reindeer taiga
te-p höy iβɨ tur-ar tayga-da. (15) Ïn-ǰa-n-gaš
say-cb many reindeer stand-intra.vbn taiga-loc that-v.der-med-cb
ol gatay-nǝŋ hïyo-on-da ol iβɨ te-p ak gök
that woman-gen edge-poss3-loc that reindeer say-cb white blue
ǰime-ler pol-sa tïkka töhxǝm oht-ta-p tur-ar
thing-pl become-cond3 very close grass-v.der-cb stand-intra.vbn
pol-gandǝr ol aŋ-nar ol iβɨ aŋ-nar. (16) Ïn-ǰa-n-gaš
become-res that game-pl that reindeer game-pl that-v.der-med-cb
la am pir hün ol gatay am gïrɣan pir peǰen aǰ-a
ptc now one day that woman now old one fifty cross-cb
gar-lǝɣ gatay ǰime dǝrǝ. Ol gatay am po iβɨ-den
snow-adj.der woman thing cop that woman now this reindeer-abl
piree-nǝ tuht-ǝp al-sa gandǝɣ ǰime-l te-eš pot-ǝn-ga
one-acc hold-cb take-cond3 which thing-ptc say-cb self-poss3-dat
ööred-ǝr te-eš hünnǝŋne ol iβɨ teŋ guɣǝr-da-p
teach-intra.vbn say-cb everyday that reindeer like gugur-v.der-cb
hün-se-er pol-gandǝrǝ. (17) ol guɣǝr-da-p la
day-v.der-intra.vbn become-res that gugur-v.der-cb ptc
hün-se-er pol-gandǝrǝ. (18) Sitǝk mitiin
day-v.der-intra.vbn become-res urine ech.der:poss3.acc
ǰïlga-d-ǝp gör-ǝp ïn-ǰa-p la töhxǝ-se le töhxǝ-p
lick-caus-cb see-cb that-v.der-cb ptc approach-cond3 ptc approach-cb
le tur-ar pol-gandǝrǝ. (19) Hünnǝŋne guɣǝr-da-p
ptc stand-intra.vbn become-res everyday gugur-v.der-cb
pot-ǝn-ǝŋ ün-ǝn-ge ööred-ǝp tur-ar pol-gandǝrǝ.
self-poss3-gen voice-poss3-dat teach-cb stand-intra.vbn become-res
(20) Ïn-ǰa-ar-da la ol iβɨ te-p ol
that-v.der-intra.vbn-loc ptc that reindeer say-cb that
iβɨ-ler-nǝŋ ol tayga gatay-ǝ ol guɣǝr-da-p
reindeer-pl-gen that taiga woman-poss3 that gugur-v.der-cb
gïškǝr-ǝp suɣǝr-ǝp tur-ar-ǝn šuptɨ-sən öören-ǝp
scream-cb wail-cb stand-intra.vbn-poss3.acc all-poss3.acc learn-cb
pil-ǝp al-gandǝrǝ. (21) Ïn-ǰa-n-gaš la ol gatay-ga
know-cb take-res that-v.der-med-cb ptc that woman-dat
tïkka la töhxö-ör pol-ǝp gel-gen. (22) Ïn-ǰa-p
very ptc approach-intra.vbn become-cb come-post that-v.der-cb
pir on aǰǝɣ on peš hire hün ïnǰa pot-ǝn-ga ïn-ǰa
one ten more than ten five about day so self-poss3-dat so
ǰül ǰül ǰüme söɣle-p tur-sa tur-sa töhxe-y per-ɣendǝrǝ.
solid solid thing say-cb stand-cb stand-cb approach give-res
(23) Ol iβɨ-ler pür gol-ga ǰeht-er
that reindeer-pl extremely hand/arm-dat reach-intra.vbn
mïn-ǰa-p tuht-ar gol-ga ǰeht-kǝdeɣ pol-ǝp
this-v.der-cb catch-intra.vbn hand/arm-dat reach-n.der-pot become-cb
gel-gendǝrǝ. (24) J̌a pir le hün ol gatay šïlba
come-res yeah one ptc day that woman lasso
gaht-ǝp al-gaš aŋ šïlba gaht-ǝp al-gaš aŋ šïlba
interlace-cb take-cb game lasso interlace-cb take-cb game lasso
tuht-ar te-eš usǝn šïlba gaht-ǝp al-gaš la am
catch-intra.vbn say-cb long lasso interlace-cb take-cb ptc now
pir le hün pir ekkɨ töhxǝ-p gel-gen pičče-ǰek
one ptc day one good approach-cb come-post.vbn small-n.der
iβɨ-nɨ urǝk-ta-andǝrǝ. (25) Am tuht-ǝp a-ar orhta
reindeer-acc lasso-v.der-res now catch-cb take-intra.vbn middle
orǝk-ta-ar orhta ol ǰime am tïkka thaa ǰütkǝ-p
lasso-v.der-intra.vbn middle that thing now very ptc struggle-cb
möö-βiin möö-βiin ïn-ǰa-n-gaš ol gatay
spring-neg.cb spring-neg.cb that-v.der-med-cb that woman
ukša urǝk-ta-aš pahš pa-an pahhj-in-ga paš
immediately lasso-v.der-cb head tie-poss3.acc head-poss3-dat head
paɣ sup al-gaš am paɣ-la-p al-gandǝrǝ. (26) Ol
tie put:cb take-cb now tie-v.der-cb take-res that
al-gan iβǝ-sǝ ihx̃ǝ gar-lǝɣ pičče toŋgʉy
take-post.vbn reindeer-poss3 two snow-n.der small toŋgʉy-reindeer
pol-gandǝrǝ. (27) J̌a am tããrda-sǝ pasa gaht-ta-p
become-res yeah now tomorrow-poss3 and layer-v.der-cb
piree-nǝ tuht-ǝp pol-ǝr emes-pe te-eš hölse-p
one-acc catch-cb become-intra.lf neg.cop-q say-cb be excited-cb
tur-ar ǰime tuht-ǝp pol-ǝr emes-pe te-eš
stand-intra.vbn thing catch-cb become-intra.lf neg.cop-q say-cb
hölse-p tur-ar ǰime pol-gandǝrǝ. (28) Ol tüün
be excited-cb stand-intra.vbn thing become-res that yesterday
tuht-kan iβɨ-sǝn hïyo-on-da pasa onǝ eter-ǝp
catch-post.vbn reindeer-poss3 edge-poss3-loc and that:acc follow-cb
tur-ar iβɨ-ler yaβa-ǰ-ǝp tur-ar pol-gandǝrǝ.
stand-intra.vbn reindeer-pl go-coop-cb stand-intra.vbn become-res
Moon pasa ïra-βas höy iβɨ-ler. (29) Õõsǝn-dan
this:abl and be far-intra.vbn many reindeer-pl that:poss3-abl
tããrda-sɨ ol pasa la pir pičče-ǰek iβɨ-nǝ
tomorrow-poss3 that and ptc one small-n.der reindeer-acc
urǝk-ta-p al-gandǝrǝ iyen. (30) J̌a urǝk šïlβa šïlβa-sɨ
lasso-v.der-cb take-res ptc yeah lasso lasso lasso-poss3
pile paɣ šïlba-sɨ pile urǝk-ta-p al-gaš ǰa am ihx̃ɨ
with rope lasso-poss3 with lasso-v.der-cb take-cb yeah ptc two
iβɨ-lǝɣ pol-a per-ɣendǝr am. (31) Am pir
reindeer-adj.der become-cb give-res now now one
toŋgǝr pir toŋgʉy iβɨ-lǝɣ pol-ǝp
toŋgǝr-reindeer one toŋgʉy-reindeer reindeer-adj.der become-cb
al-gaš ǰa ol gatay am ol peǰen aǰǝɣ gar-lǝɣ
take-cb yeah that woman now that fifty more than snow-adj.der
gatay ihx̃ǝ iβɨ-lǝɣ pol-ǝp gel-gen-ǝ. (32) J̌a
woman two reindeer-adj.der become-cb come-post-poss3 yeah
ol gatay am ol toŋgǝr toŋgǝy ïn-ǰa-aš
that woman now that toŋgǝr-reindeer toŋgʉy-reindeer that-v.der-cb
ïnǰa am pot-ǝ-nǝŋ gïl-ǝp al-gan šïlβa-sɨ pile
so now self-poss3-gen make-cb take-post.vbn lasso-poss3 with
ïn-ǰa-p la ol arɣamǰǝ pile oht-kar-ǝp ïn-ǰa-p
that-v.der-cb ptc that lasso with grass-v.der-cb that-v.der-cb
tur-sa ol ïn-ǰa-ar ǰerle amthan-nəɣ ǰime
stand-cond3 that that-v.der-intra.lf entirely taste-adj.der thing
gol-ga ǰi-tǝr-t-ǝp tur-sa tur-sa la am
arm/hand-dat eat-caus-caus-cb stand-cb stand-cb ptc now
pot-ǝn-ga ööred-ǝp al-gaš am ïn-ǰa-n-gaš ol gatay
self-poss3-dat teach-cb take-cb now that-v.der-med-cb that woman
ol peǰen aǰǝɣ gar-lǝɣ gatay ihx̃ɨ iβɨ-lǝɣ
that fifty more than snow-adj.der woman two reindeer-adj.der
pol-ǝp gel-gen. (33) Hïyo-on-da tur-ɣan am ol
become-cb come-post edge-poss3-loc stand-post.vbn now that
hem tayga-lar-da ǰurht-a-p tur-ɣan ol tuhha
river taiga-pl-loc land-v.der-cb stand-post.vbn that Dukhan
ǰon-nar öske tuhha ǰon-nar am po pir gatay
people-pl other Dukhan people-pl now this one woman
iβɨ-lǝɣ pol-gan am pis-ter pasa onǝ ol
reindeer-adj.der become-post now we-pl and that:acc that
gihhjǝ teɣ tuht-ǝp al-aalǝ te-eš am ara
person like catch-cb take-vol.pl say-cb now interval
ara-sǝn-ga tïŋna-š-kaš sooda-š-kandǝroo. (34) J̌a
interval-poss3-dat listen-coop-cb speak-coop-res.emph yeah
am ol iβɨ-nɨ tuht-ar-ǝl te-eš am sooda-š-kaš
now that reindeer-acc catch-intra.lf-ptc say-cb now speak-coop-cb
gel-gen. (35) Am gesǝk ǰamdǝk ol gatay-nǝŋ arɣa-sɨ
come-post now part some that woman-gen means-poss3
pile tuht-ǝp tur-ar gatay gihhjɨ gesek ǰamdǝk
with hold-cb stand-intra.vbn woman person part some
ulǝs-tar am ulǝɣ gïhska gašpal ǰer-ler-ɣe tusak sal-ǝp
people-pl now big short gorge place-pl-dat trap set-cb
sür-ǝp göhh-ey göhh-ey pile böl-ǝp le
chase-cb abundant-adj.der abundant-adj.der with gather-cb ptc
ïn-ǰa-p tur-ɣaš ol tuhha-lar iβɨ-lǝɣ pol-ǝp
that-v.der-cb stand-cb that Dukhan-pl reindeer-adj.der become-cb
tuht-ǝp al-gan ǰime dǝroo. (36) Ïn-ǰa-n-gaš
catch-cb take-post.vbn thing cop.emph that-v.der-med-cb
ooŋ so-on-da ol pis-tǝŋ am po tuhha ǰurht-tǝŋ
that:gen that-poss3-loc that we-gen now this Dukhan country-gen
aram onuun soŋ ol höy höy iβɨ-ler-ǝn
let’s see that:abl end that many many reindeer-pl-poss3.acc
ös-kǝr-ǝp ol eβeeš pičče thïp al-gan
grow-caus-cb that few small find:cb take-post.vbn
iβɨ-sǝn ös-kǝr-ǝp ol ïn-da tayga
reindeer-poss3.acc grow-caus-cb that that-loc taiga
taŋdǝ-sǝn-da ǰoru-ur ol ǰerlǝk gihhjɨ-ɣe
high plain-poss3-loc move-intra.vbn that wild person-dat
töhxǝ-βes iβɨ-ler thee ïn-da par la
approach-neg.intra.vbn reindeer-pl ptc that-loc existent ptc
pol-dɨ. (37) Ïn-ǰa-n-gaš am ol tuht-ǝp al-gan
become-past that-v.der-med-cb now that catch-cb take-post.vbn
iβɨ-ler-nɨ ös-kǝr-ǝp hem hem hem-de ǰorǝ-p
reindeer-pl-acc grow-caus-cb river river river-loc move-cb
ǰoru-ur ol göhh-ey tuhha-lar am pir-er
move-intra.vbn that abundant-adj.der Dukhan-pl now one-adj.der
iβɨ-lǝɣ pol-ǝp iβɨ-sǝn ös-kǝr-ǝp gesǝk
reindeer-adj.der become-cb reindeer-poss3.acc grow-caus-cb part
ǰamdǝk am iβɨ-sɨ pile eβeeš iβɨ-sɨ pile
some now reindeer-poss3 with few reindeer-poss3 with
ši-in-de ǰorǝ-p iβɨ-sǝn mun-ar ol
direction-poss3-loc move-cb reindeer-poss3.acc ride-intra.vbn that
iβɨ-sǝn am gohš-ta-p aŋ-na-p
reindeer-poss3.acc now balanced luggage-v.der-cb game-v.der-cb
gel-gen. (38) Am aŋ-nǝŋ eht-ǝn
come-post now game-gen meat-poss3.acc
gohš-ta-p urǝɣ darï-ïn mun-dǝr-ǝp
balanced luggage-v.der-cb child seed-poss3.acc ride-caus-cb
süt-ǝn sa-ap ihhj-ǝp ïn-ǰa-p ol iβɨ-nɨ
milk-poss3.acc milk-cb drink-cb that-v.der-cb that reindeer-acc
am ašǝɣla-p pot-ǝn-ǝŋ amǝdǝra-l-ǝn-ga tïkka ašǝɣla-p
now exploit-cb self-poss3-gen live-n.der-poss3-dat very exploit-cb
ašǝɣla-ar pol-ǝ per-ɣendǝroo. (39) Ïn-ǰa-n-gaš
exploit-intra.vbn become-cb give-res.emph that-v.der-med-cb
am ol am ol hem-de gulaš-ta-p palǝk-ta-p
now that now that river-loc on foot-v.der-cb fish-v.der-cb
aŋ-na-p men-ne-p ǰi-p ǰora-an tuhha
game-v.der-cb grain-v.der-cb eat-cb move-post.vbn Dukhan
ǰon-nar ïn-ǰa-n-gaš pot-ǝ mun-ar-lǝɣ sa-ap
people-pl that-v.der-med-cb self-poss3 ride-intra.vbn-adj.der milk-cb
ihhj-er süt-tǝɣ po iβɨ-sǝn süt-ǝn sa-ap
drink-cb milk-adj.der this reindeer-poss3.acc milk-poss3.acc milk-cb
ihhj-ǝp (40) õõsǝn-dan ïŋgay peerǝ göhhj-ǝp maŋna-p
drink-cb that:poss3-abl further since nomadize-cb run-cb
iβɨ-nɨ mun-ǝp göhhj-er-ǝn-de
reindeer-acc ride-cb nomadize-intra.vbn-poss3-loc
gohš-ta-p al-gaš göhhj-ǝp hem hem-den
balanced luggage-v.der-cb take-cb nomadize-cb river river-abl
taɣ-dan ïn-ǰa-p ǰoru-ur pol-a per-ɣendǝroo.
mountain-abl that-v.der-cb move-intra.vbn become-cb give-res.emph
(41) Am mun-ar-lǝɣ pol-gan šuptǝ
now ride-intra.vbn-adj.der become-post.vbn all
ǰorǝ-ǰ-ǝr ǰoru-u usǝn-a-p gel-gen.
move-coop-intra.vbn move-n.der-poss3 long-v.der-cb come-post
(42) Pir hem-den pir hem-be göhhj-ǝp gel-gen.
one river-abl one river-dat nomadize-cb come-post
(43) Göhhj-ǝp aŋ-na-p amǝdǝra-r pol-ǝp
nomadize-cb game-v.der-cb live-intra.vbn become-cb
gel-gen. (44) Ara ara-sǝn-ga aal-la-ǰ-ǝp
come-post interval interval-poss3-dat family-v.der-coop-cb
ǰorǝ-ǰ-ar pol-ǝp gel-gendǝroo. (45) J̌a
move-coop-intra.vbn become-cb come-res.emph yeah
ïn-ǰa-n-gaš am tuhha gihhjǝ / iβɨ te-p ǰime
that-v.der-med-cb now Dukhan person reindeer say-cb thing
ïn-ǰa tïhh-ǝl-gan.
that-adv.der find-pass-post
Translation20
(1) Once upon a time (lit. in early times before) the Dukhan people, the vari-
ous Dukhan peoples were living going about hunting for large and small ani-
mals through the taiga and high plains, and moving about from one river to
the other. (2) So, after many years, after living from hunting large and small
animals, after living thus, living from hunting, fishing big and small fish, hunt-
ing game and searching for grain, hunting birds and squirrels, sables and
squirrels, and such animals, all the small animals and the resources (of the
forest), among many rivers they settled down inside one of them. (3) Separated
within the riverine land from each other by the rivers, the Dukhans of one river
or the other, the Dukhans of one river or the other were separated, and they lived
from hunting for game and collecting grains and berries along one river or another.
20 The English translation is faithful to the original Dukhan text. Thus, hesitations and reformulations
of the speaker are retained in order to demonstrate the normal flow of speech
when telling a story. Speakers in fact modify and repeat sentences when building up
a story.
(4) After many years, after maybe a hundred years, even after a hundred
years – or something (like that) – there happened to be many wild animals
(coming) from the taiga and high plains: maral deer, antelopes, bears, reindeer
and other animals. (5) Within those, there was such an animal called rein-
deer. Sooo it was. (6) Well, those animals called reindeer were staying beside
the Dukhan peoples not far away, very near. Such animals (called reindeer)
were coming to live (there). (7) Alongside those Dukhans of that taiga and
high plains, in the taigas and high places of the lands and those rivers, where
they were living, so, yeah, after many years, those Dukhans well, the animal
called the reindeer and the Dukhans came very close to each other: (8) Yeah,
the Dukhans came to speak about the following: “How would it be if we
captured some of these animals, these (animals) called reindeer and we our-
selves were to ride them, use them, milk their milk, drink it, and to make use of
them in this life of ours in that way”. (9) So, they spoke in that way. (10) They
went on speaking for many years, like this, several years, about ten. (11) So, well,
having spoken together and among themselves about capturing that reindeer,
they remained at the point of almost catching them. (12) Yeah, so, well, one day
there was this taiga behind, a taiga called the reindeer taiga, an isolated and
free taiga among the various taigas. (13) In that taiga there were really a lot of
those animals called reindeer. (14) So, well, one day an old woman settled down
there, in that taiga, in that taiga, in the taiga called the reindeer taiga where
many reindeer lived. (15) So, as for those white and blue things called reindeer,
they came to graze very close beside that woman, those animals, those rein-
deer animals. (16) Well, one day that woman, well, that old woman, a woman
around fifty, well, that woman thought, “If I would capture one of those rein-
deer, how would it be?”. In order to teach herself the voice (of the reindeer) she
passed her days just making gugurt-sounds. (17) She just passed her days mak-
ing gugurt-sounds. (18) She tried to get them to lick her urine and in this way
(the reindeer) kept getting closer and closer. (19) Doing gugurt sounds every
day, she happened to make the reindeer familiar to her own voice. (20) And by
doing this, that reindeer, those reindeer, the woman of that taiga was able to
learn all of the gugurt-sounds, cries and whines of those reindeer called reindeer,
all of them. (21) So (they) were really getting very close to that woman. (22) For
about ten fifteen days she kept on saying such kind of things and it (the rein-
deer) became close to her. (23) Those reindeer came up to (her) hand in such
a way to be almost touched, they (gradually) started to be reachable by hand.
(24) Well, and one day that woman twisted together a lasso, a lasso for ani-
mals, she twisted together a long lasso in order to catch that reindeer. Well,
and one day she caught with the lasso a small reindeer that came nicely close
to her. (25) As soon as she caught it, since it was not struggling and kicking a
lot, she roped it immediately. Then she put a collar on its head, a collar and,
now, she has got it tied up. (26) That reindeer she got was a small two year old
tongʉy-reindeer. (27) Yeah, well, the next day she got excited thinking “Isn’t it
possible to catch one of them again?” (28) Beside the reindeer she had caught
the day before, also (other) reindeer that followed that one happened to be
wandering around (there) and not far away from this place there were also
many reindeer. (29) The next day after that, she roped another small reindeer
as well. (30) Well, a rope lasso, with a lasso, she roped it with her collar and
lasso, yeah, well, that woman of more than fifty years old became indeed some-
body who has two reindeer. (31) Now she had one toŋgǝr-reindeer and one
tongʉy-reindeer. Well, that woman of more than fifty years of age (gradually)
became somebody who has two reindeer. (32) Yeah, that woman, now that
toŋgǝr-reindeer, tongʉy-reindeer, having done that, and doing that with her
lasso that she had made herself, and doing that, she took them out to graze
with the lasso, in that way. She was feeding it (the reindeer) little by little by
hand with very tasty things (salt),21 and having done that, now, that woman,
a woman of about fifty years old had two reindeer. (33) Those other Dukhan
people who were living along rivers and in the taigas that were nearby, well,
they heard (it) and they were talking among themselves, well, that this one
woman had reindeer and thought, “Let us now also catch them like that person
(did)”. (34) Yeah, now they spoke about how to catch that reindeer. (35) Some
of them catch it (with) the same kind of lasso of that woman – that woman
person. Some of those people now set up traps in places where there are big
and small gorges/ravines and chase (them), and gather them in great numbers.
Having done that, those Dukhan came to possess reindeer. They caught (them).
Sooo it was. (36) So, after that, our, well, of this Dukhan country, wait let me
think, after that in the end the wild reindeer that wanders around in the taiga
and high plains and that does not approach peoples exists as well. They have
raised the many many of those reindeer from that (time) on, they raised those
few, small reindeer they had caught, and there in the taiga and high plains
there always also existed wild reindeer that do not approach anyone. (37) So
now, growing that reindeer they caught, they move from river to river. Each of
the many Dukhans now have reindeer, and they raise their reindeer, some of
them go around with reindeer, they go around in their own directions with few
reindeer, and they, they ride their reindeer. Now they came to pack up these
reindeer of theirs and go hunting. (38) Now they (i.e. the Dukhans) have come
to be people who pack the meat of wild game on them (i.e. the reindeer), have
their children ride on them, milk them and drink their milk, and in this way
now they use that reindeer, they became people who make much use of the
reindeer in their own life. Sooo it was. (39) Now the Dukhan peoples who used
to move around in that river area living from fishing, hunting and gathering,
after that, they came to have a riding animal, they milked and drank its milk,
they milked and drank the milk of the reindeer; (40) besides, they nomadize
back and forth moving fast, when they nomadize they use the reindeer as pack
animals, they nomadize from river to river, from mountain to mountain, they
came to be moving around like this, they raised them to carry their packs,
many of them. (41) Now they have a riding animal, all of them, the distances
they cover in their trips became long(er). (42) They have come to nomadize
from one river to another river. (43) They have come to be people who live
by nomadizing and hunting. (44) They have come to be people who move
and visit each other. (45) Yeah, so, the Dukhan person . . . the reindeer was
discovered/found in this way.

21 Reindeer, like other animals, always lick urine for its salt content. The story implies that
the woman is trying to domesticate them in this way.
Transcription and Abbreviations
The transcription employed here follows general Turcological principles, with
the following additions: the signs ɨ and ʉ represent the high central vowels
occurring beyond first syllables, and the superscript [h] designates preaspiration
of fortis consonants, corresponding to Tuvan and Tofan glottalization/
pharyngealization. Grammatical abbreviations occurring in the interlinear
glosses are:
ABL       ablative
ACC       accusative
ADJ.DER   adjectival derivation
AST       assertive
CAUS      causative
CB        converb
COLL      collective
COMP      completive
COND      conditional converb
COOP      cooperative
COP       copula
DAT       dative
DES       desiderative
DIR       directive
ECH.DER   second participant of an echo-compound
EMPH      emphatic
GEN       genitive
HF        high-focal
IMP       imperative
INT       intensification
INTRA     intraterminal
ITER      iterative
ITJ       interjection
LF        low-focal
LIM       limitative converb
LOC       locative
MED       medial
N.DER     nominal derivation
NEG       negative
NF        non-focal
PASS      passive
PAST      past
PL        plural
POSS      possessive
POST      postterminal
POT       potential
PTC       particle
Q         question particle
REC       reciprocal
REFL      reflexive
RES       resultative
SG        singular
SIM       similative
V.DER     verbal derivation
VBN       verbal nominal
VOL       voluntative
1         first person
2         second person
3         third person
Chapter 14
How Much Udi is Udi?

Wolfgang Schulze
1 What is This Paper About?
This short essay offers some reflections on the object of ‘language
documentation’ in a specific setting of functional bilingualism. I start
from a sociological stance that can be opposed to a linguistic approach to
defining the object at issue. The perspective taken here is especially relevant
because functional bilingualism grounded in the tradition of first language
bilinguals (as is the case with the subjects of this paper, namely the ‘Udis’) is
best interpreted in terms of the holistic view of Grosjean (1982), who “argues that
each bilingual is a unique individual who integrates knowledge of and from
both languages to create something more than two languages that function
independently of each other” (Reyes 2008: 79). As for the community of Udis,
a small ethnic group dwelling mainly in the village of Nij in Northwestern
Azerbaijan, the problem is even more complicated, because one of the
‘languages’ involved in this type of bilingualism can be considered as being
massively influenced by the other language, namely Azeri, the official language of
Azerbaijan. In order to illustrate this point, I refer to one example randomly
taken from published Udi texts (translation of Jona), cf. (1):
(1) däniz-ä   gele   ost’ahar   tufan-e     bak-sa
    sea-LOC   much   strong     storm-3SG   become-PRES
    ‘There comes up a strong storm on the sea.’ (Jona 1.4)
    (Translation by a translator group in Nij 2009)
Except for the few morphological units and the lexical unit bak- ‘to become’, all
terms are either taken from Azeri or are older (Oriental) loans. Hence, one
may ask to what degree it is justified at all to relate this phrase to the ‘Udi’
language. If the speech of the Udis were surveyed in terms of
‘language documentation’, one might thus ask what kind of linguistic object
we have to deal with. According to the holistic perspective as discussed by
Grosjean (1982), it would be appropriate to speak of a unitary linguistic
knowledge system expressed in terms of a ‘mixed language’. However, this does
not seem to be the case among many speakers in Nij. Rather, their linguistic
practices are characterized by an internally structured, yet holistic
knowledge system that is profiled towards Udi or Azeri according to the
social and situational setting of communicative acts (supplemented by role
features and the language biography of the speakers). The question of how to
account for the given linguistic practices in terms of ‘language’ becomes
complicated again, because the language components of bilingualism among Udis
are marked by an unbalanced pattern with respect to language and power. In
fact, we have to deal with a typical pattern that contrasts a ‘powerful’
majoritarian communicative system (Azeri) with a minoritarian one (Udi), which in
turn is majoritarian in the small community of Nij itself. This complex situation
invites us to reconsider the concept of ‘language’, at least when aiming at the
documentation of linguistic practices in communities such as Nij. In my paper,
I will propose some arguments that aim at a more sociological understanding
of ‘language’ in terms of linguistic practices.
2 Linguistic Practices
Standard approaches to language documentation today usually refer to a
usage-based perspective, claiming that it is the actual use of a given language
by members of the corresponding language community that should be under
survey. For instance, Himmelmann (1998: 166) claims that the “aim of a
language documentation is to provide a comprehensive record of the linguistic
practices characteristic of a given speech community (. . .). This (. . .) differs
fundamentally from (. . .) language description [which] aims at the record of a
language (. . .) as a system of abstract elements, constructions, and rules (. . .).”
Likewise, Woodbury (2003: 39) states that “(. . .) direct representation of
naturally occurring discourse is the primary project, while description and analysis
are contingent, emergent byproducts which grow alongside primary
documentation but are always changeable and parasitic on it.” It is hence crucial
to determine what is meant by “linguistic practices of a given speech
community” at all and in which way these practices are related to what is called
a ‘language’. The notion of “linguistic practices” does not make reference
to a particular language but basically refers to a set of linguistic actions that can
“genuinely be regarded as forms of social behavior” (Skinner 1971: 1). Linguistic
actions belong to the world of “rhetorical acts” that “characterize collections
of communicative acts that achieve specific medium-independent rhetorical
goals and include actions such as identifying an entity, describing it, dividing
it into its subparts or subtypes, narrating events and situations, or arguing to
support a conclusion” (Maybury 1999: 380). They are part of the set of genres
that “dynamically embody a community’s ways of knowing, being, and acting”
(Bawarshi and Reiff (2010: 78), referring to Berkenkotter and Huckin (1993)). The
French psychologist Yves Clot has pointed out that “[l]es attendus sociaux d’un
genre – souvent sous-entendus – concernent autant les activités techniques et
corporelles que les activités langagières” [the social expectations of a genre,
often left implicit, concern technical and bodily activities as much as
linguistic ones] (Clot 2008: 77). Accordingly, linguistic
practices or rhetorical acts can be described as a type of social action. The
rhetorical genre underlying these practices “refers to a conventional category
of discourse based in large-scale typification of rhetorical action; as action, it
acquires meaning from situation and from the social context in which that
situation arose” (Miller 1984: 163). Applying the notion of ‘typification’ (Schütz
and Luckmann 1975), we can say that the linguistic practices of a community
represent a specific type of social action, which is grounded in the fact that the
members of the community “share common types; this is possible insofar as
types are socially created (or biologically innate)” (Miller 1984: 157).
Hence, linguistic practices can be regarded as a specific type of practiced
communicative knowledge that is part of the habitus of a social group. It
follows that the delimitation of particular “linguistic practices” is not
primarily based on criteria related to a particular language (in its linguistic sense).
Rather, it is intimately linked to the question of what is meant by a “given
speech community” (Himmelmann 1998). Obviously, the relevant
determinant here is ‘speech’. If we refer to ‘speech’ (in its conventional sense) as the
vocalized form of human communication, Himmelmann’s definition turns
out to be somewhat tautological: In fact, the vocalized form of “human
communication” can easily be paralleled with “linguistic practices” (or: rhetorical
acts). The decisive term thus is “community”. Taking Himmelmann’s wording
literally, we can assume that ‘linguistic practices’ are seen as a marker for the
communality of a group of people. In this sense, a group of people
representing a social structure shares a collective knowledge about conventionalized
and habitualized linguistic practices. Linguistic practices are thus regulated
by norms of a particular type of social behavior that is grounded in a common
system of symbolic knowledge. In other words: Linguistic practices of a
community reflect a particular genre, which might be called ‘a language’ from a
sociological point of view (see below).
As a consequence, Himmelmann’s definition of ‘language
documentation’ does not tell us about those parameters that may be relevant for
describing a given social group as representing a ‘speech community’. It seems
that we have to start from the subliminal assumption that the delimiting factor
would be a ‘language’. In this sense the definition starts from two reciprocal
hypotheses: (a) the assumption that a group of people actually represents a
‘community’, and (b) the assumption that this ‘community’ defines itself by
referring to a collective knowledge system called ‘language’. However, there is
sufficient evidence to assume that social groups construe themselves as social
units (‘communities’) by referring to a distinctive bundle of inherited or
otherwise established sociocultural patterns of knowledge and of practices
resulting therefrom. This bundle may include patterns of linguistic practices,
but does not necessarily do so. In other words: We cannot equate the ‘speech’ of
a community (as large as it may be) with ‘a’ language as such. Taking the
above-given definition by Himmelmann seriously, we have to start from communities
delimited by sociocultural features and monitor their linguistic practices,
whether or not these practices are grounded in a knowledge system
scientifically described as ‘a’ language.
Relating ‘a’ language to dimensions of language use preconditions the
delimitation of a given language and is thus determined by sociological fea-
tures rather than by mere language-internal structural features, cf. the defini-
tion of ‘language’ as “a form of activity of human beings in societies” (Halliday,
McIntosh, Strevens 1964: 4).
Starting from a group-internal perspective, we may ask whether the members
of the corresponding community relate their linguistic practices to what
Mannheim (1980 [1922]) has called “communicative knowledge”, that is, to
explicit and reflective knowledge in terms of awareness. This view does not
necessarily mean that communicative knowledge has to show up in terms of
language awareness, if language is seen as a scientifically identifiable
unit. It may well be the case that members of a speech community are aware
of their linguistic practices even though the underlying collective knowledge
system does not represent a ‘language’ of its own (in the linguistic sense). One
might argue that an indicator of language awareness is the presence of a
corresponding linguistic sign in terms of an endonymic language name (be it
underived or derived from other units such as geographical terms, ethnic names etc.).
However, note that the naming of linguistic practices by the community
itself is not always present, cf. Gregersen (1976: 95), who reports: “There
is no native name for the language [sc. Manam, W.S.] (unnamed vernaculars
seem to be the rule in New Guinea)” (see Mühlhäusler (2006) for discussion).
Most likely, linguistic knowledge given in a community can at least partly be
understood as “conjunctive knowledge” in terms of Mannheim (1980 [1922]),
that is, as implicit, experiential, non-reflective, praxeological knowledge
grounded in everyday practices (see also I. Schulze 2014a). Taking this point
into consideration, we may reformulate Himmelmann’s definition given above
as follows: The aim of a language documentation is to provide a
comprehensive record of the linguistic practices as documented for a community, more
precisely of the rhetorical genre as a type of social action that is grounded in a
corresponding collective knowledge system.
The degree of awareness that the practitioners of linguistic actions within a
community may have with respect to their linguistic practices strongly depends
on the degree to which these linguistic practices are ‘administered’ and
‘controlled’ by corresponding institutions. By ‘institution’ I mean all kinds of
socially accepted and habitualized mechanisms that structure expected forms
of interaction and cooperation between members of a community in
recurrent situations (cf. Schwarz 2009). In this sense, linguistic practices expressing
a corresponding collective knowledge system can be viewed as being
controlled by “cultural institutions” in the sense of Parsons (1991 [1951]: 33). The
more this kind of institutionalization of linguistic practices is ‘delegated’ to
individuals or peer groups that then have the power to decide upon both
the rules and the social value of these practices, the more some kind of
awareness regarding one’s own practices may emerge. This process may result in
various types of “language ideology”, that is, in “self-evident ideas and
objectives a group holds concerning roles of language in the social experiences of
members as they contribute to the expression of the group” (Heath 1977: 53).
Another kind of output may show up as “a bias toward an abstract, idealized
homogeneous language, which is imposed and maintained by dominant
institutions” (Lippi-Green 1997: 64). Representatives of such “cultural institutions”
may thus play the role of ‘language sages’, be it through official installment,
be it through self-assignment. The language ideology embodied by such peer
groups or language sages may thus become a part of the conceptual world of a
‘speech community’ as well as part of the cultural capital of the speakers (see
Bourdieu 1979). For the objectives of this paper it is relevant to stress the role
of both state organizations and local language sages. In the case of Udi (see
section 3 below), language awareness has been strengthened (if not generated
in the first place) by Azerbaijani state organizations that interpret Udi as a witness of the
non-Armenian past of Western Azerbaijan (including Mountain Karabakh,
see Schulze (2011a) for details). Its ‘cultivation’ and fostering by state
organizations has established a new type of language awareness among Udi
speakers, supported by the activities of one local language sage, namely Georgi (Jora)
Keçaari (1930-2006), a former school teacher in the Udi village of Nij who had
been trained at the University of Baku. Keçaari undertook various efforts to
turn Udi linguistic practices, that is, the embodied cultural capital (in the sense
of Bourdieu), into its objectified version (by producing school books, readers
etc.). Supported by its institutionalization, this cultural capital has become a
major motivator of language awareness among Udis in Nij. This process has
resulted in changes in the attitude of Udi speakers with respect to their original
linguistic practices: Before the Armenian-Azerbaijani conflict erupted in
1989/90, Udi-Armenian-Azerbaijani trilingualism had been a standard pattern
in Udi settlements (see below). Armenian was at that time part of the
linguistic landscape in Nij. Fifteen years later, all public appearances of Armenian
had been eliminated, just as actions had been undertaken to ‘purify’ Udi of
Armenian elements.
The processes briefly alluded to here illustrate that language awareness has
to be regarded as a substantial part of social and even political models of lin-
guistic practices. The institutionalization of linguistic practices present in a
corresponding community can be interpreted as a shift from ‘variety’ to ‘lan-
guage’ (in the sense of sociolinguistics). The famous dictum “a shprakh iz a
dialekt mit an armey un flot” (a language is a dialect with an army and a navy)
(Weinreich 1945: 13) illustrates that a ‘language’ can be described as some kind
of ‘armed dialect’, controlled and administered by corresponding official insti-
tutions. In this sense, the total of standard linguistic practices of a single ‘speech
community’ can be regarded as representing a linguistic variety (Ammon and
Arnuzzo-Lanszweert 2001: 815) or – if confined by geographical aspects – a dia-
lect, whether or not there are other related varieties (or dialects) present in
the region. From an internal view, ‘related’ should be paralleled with mutual
comprehensibility to the degree that every-day communication is made pos-
sible somehow. It should be noted that these assumptions do not start from a
scientific definition and delimitation of languages. It makes sense to describe
the total of intercomprehensible varieties/dialects as a Gesamtsprache from
a sociolinguistic point of view (Ammon and Arnuzzo-Lanszweert 2001: 815).
However, the idea that the individuals of a ‘speech community’ are marked for
a common ‘language’ (as opposed to a variety of a Gesamtsprache) seems to be
justified only if we can show that some kind of ‘arming’ of the given variety,
that is, the institutionalization of linguistic practices, has taken place.
Summing up the point made so far, we can say that the term ‘language
documentation’ is misleading if we refer to Himmelmann’s definition quoted
at the beginning of this section. The linguistic practices observable for a given
community may well constitute a linguistic variety of its own even though the
internal structure of these practices (that is, grammar, lexicon, phonology etc.)
does not represent what is conventionally called a ‘language’ from a scientific
point of view. In other words: It is not necessarily the case that the linguistic
definition of ‘a’ language corresponds to the understanding of ‘language’ by a
given speech community.
The problem becomes even more complicated in case a ‘speech com-
munity’ is marked for what is scientifically called ‘bilingualism’ or (if given)
‘multilingualism’. Here, I only refer to bi- or multilingual speech communi-
ties that are marked for bilingual or multilingual first language acquisition
(McLaughlin 1984: 73) typical e.g. for the speech community at issue (Udis,
see section 3). From their early childhood, most of the Udi children are today
exposed to two different communicative systems, namely Udi and Azerbaijani.
Both systems are present in the every-day linguistic practices of the speech
community in Nij. The third system, namely Armenian, was especially pres-
ent in the adjacent community of Vartashen (now Oğuz) before 1989, but
has become an anathema since then. Here, I cannot dwell upon the question
whether a “unitary language system hypothesis” or an “independent develop-
ment hypothesis” should be applied in order to describe the knowledge system
of Udi/Azerbaijani ‘first language’ bilinguals (see Hoffmann 2014: 75-76). After
school enrollment, the input of bilingual first language acquisition becomes
restructured and reorganized on the basis of controlled language acquisition
(mainly Azerbaijani, occasionally Udi in three local schools). Hence, Udi chil-
dren are exposed not only to the every-day linguistic practices in the village,
but also to the ‘armed variant’ of the Azerbaijani Gesamtsprache and to the
semi-armed variety of Udi. This bundle of different impact factors strongly
influences both language awareness and actual linguistic practices of later
adults. In fact, the linguistic practices in Nij are marked for both massive code-
witching phenomena and patterns of segregation with respect to language
use triggered by corresponding communicative frames. Actually, it makes
sense to describe the bilingual patterns of linguistic practices in the speech
community of Nij in terms of features relatable to the notion of “language
variety according to use” (see Halliday, McIntosh, Strevens 1964: 87). The ques-
tion then is, whether the two instantiations of local linguistic practices (here:
Azerbaijani and Udi) represent two different and independent communica-
tive systems, or whether they can be seen as two varieties of a unitary system
of linguistic practices. If the latter is true, the question emerges what has to
be documented in terms of ‘language documentation’ when aiming at the
linguistic practices of a given speech community. The problem becomes even
more complicated, if one of these ‘varieties’, namely Udi, is massively influ-
enced by features typical for the other (in our case: ‘armed’) variety, namely
Azerbaijani. This question will be discussed in some more detail in the
following section.
3 The Case of Udi
From a linguistic point of view, Udi is generally described as being a member
of the East Caucasian language family, more precisely as a member of its
southern branch, conventionally termed ‘Lezgian’. It still is under discussion
whether Udi is a marginal member of the Lezgian branch or whether it rep-
resents an early off-spring of the Eastern Samur subgroup (including Lezgi,
Tabasaran, and Aghul, see Gippert et al. 2009, Schulze 2015). Udi is a more or
less direct descendant of Caucasian Albanian, a language spoken in the realm
of Caucasian Albania in late antiquity and early medieval times and docu-
mented mainly in the lower texts of two palimpsest manuscripts found in the
Mount Sinai monastery (see Gippert et al. 2009 for a full coverage). The lan-
guage name ‘Caucasian Albanian’ is a modern coinage: We do not have evi-
dence for what the corresponding endonym once had been. Today there are
about 10,000 people who claim to belong to the ethnic group of Udis and who
are Christians by belief. Until 1990, most of them related themselves to either
the Georgian Orthodox Church or the Armenian Apostolic Church. Until
1836, the Caucasian-Albanian Christianity was marked for a certain degree of
autocephaly, which was revived in 2002.
Today, the only place where Udis live as a compact group is the village of Nij
(Azerbaijani Nic, Udi niˤž), located in Northwestern Azerbaijan and inhabited
by some 6,000 people, see Map 14.1:
Map 14.1 The village of Nij in Northwestern Azerbaijan.

© 2016 Google Image Landsat.
In 2009, some 65% of the inhabitants of Nij declared themselves to be ethnic Udis, the
rest being chiefly Azerbaijanis. Nij is divided into sixteen ‘family-based’ quar-
ters (šaq’q’a or mähällä), two of which are mainly inhabited by Azerbaijanis
(Yalgaşlı, Abdallı).
Until 1989, a more or less compact group of ethnic Udis was present in the vil-
lage of Vartashen (now Oğuz), too, located some 20 km northwest of Nij. Before
1989 Vartashen was inhabited by some 5,000 people (roughly 40% Armenians,
15% Jewish Tats, and 30% Udis). Together with the local Armenians, most of
the Udis from Vartashen were forced to leave the village in 1990 due to the
Armenian-Azerbaijani conflict and thus moved to various places of the for-
mer USSR, e.g. to Armenia, more precisely to the borderlands of the Armenian
province Tavush and to the village of Zinobiani (1938-2000: Okt’omberi) in
Eastern Georgia, which had been founded by emigrants from Vartashen in
1922 in the context of the Armenian-Azerbaijani conflict 1918-1920 (see Schulze
2011a; today, some 200 ethnic Udis live in Zinobiani, together with the same
number of Georgians, cf. SSSD 2003: 110-111). Just as is true for the Nij Udis, the
Zinobiani Udis have a ‘language sage’ (Simon (Mamuli) Nešumašvili), who has
undertaken many efforts to advance language awareness among the local Udis.
He has founded a “Society of Georgia’s Udis” and has developed a Georgian-
based script for the local Udis that he has used when preparing a new transla-
tion of the Gospels (Nešumašvili 2012). The number of Udis who have stayed
in the Oğuz region after 1990 is difficult to determine. Azerbaijani sources talk
about 79 Udis in 2009. However, it is unknown to which degree they live in
mutual contact at all. USSR-internal migration (esp. in the 1970s) has meant
that quite a number of Udis are now to be found in scattered places of
the former USSR, e.g. in the Rostov region, near Moscow, and in Kazakhstan.
In sum, we can assume that the number of ethnic Udis does not exceed 10,000
people. This number, however, does not match the actual number of people
who use Udi in every-day communication. In many places outside Nij, Udi
has become an endangered variety, being replaced by the local language as a
general means of communication. For instance, the Udis who migrated
to Armenia 35 years ago have switched to Armenian nearly completely (see
Schulze and Schulze 2016).
Disregarding minor patterns present e.g. in Russia, it becomes obvious
that Nij and (to a lesser extent) Zinobiani are the main places where people
are exposed to Udi as a means of every-day communication. As has been said
above, both villages are marked for bi-ethnic composition: 50% of the inhab-
itants of Zinobiani are ethnic Georgians, and 35% of the inhabitants of Nij
are Azerbaijanis. The local varieties of Georgian (Kakhetian) resp. Azerbaijani
(Northern variety) are supplemented by the normative and thus ‘armed’ ver-
sion of the corresponding Gesamtsprache. Hence, the total of linguistic prac-
tices observable in the two villages is marked for at least three layers, two of
which are closely related (the two varieties of Georgian resp. the two varieties
of Azerbaijani). In the rest of this paper, I will concentrate on the linguistic
situation of the village of Nij in order to ask to what extent the third communicative
system (namely Udi) can be regarded as a distinct pattern of linguistic practices
fully independent from the Azerbaijani varieties.
Based on field work carried out in April 1998, Clifton et al. (2005) report
that “[l]anguage mixing is not common in Nic [Nij, W.S.]. According to Keçaari
[the late ‘language sage’ of Nij, W.S., see above], Udi young people in Nic do not
mix Russian and Udi when speaking. And they use Udi consistently at home”
(Clifton et al. 2005). It is interesting to see that Keçaari does not refer
to Azerbaijani, but to Russian in order to show that the linguistic practices of
the Udis are not affected by ‘language mixing’. He is silent about Azerbaijani
(just as Clifton et al. (2005) by and large are, even though they say that “[t]he goals
of the research were to investigate patterns of language use, bilingualism, and
language attitudes with regard to the Udi, Russian, and Azerbaijani languages
in the Udi community”). The results presented in the survey by Clifton et al. (in fact the
only survey on sociolinguistic attitudes of inhabitants of Nij so far) thus have
to be treated with great care. It should be stressed that the authors worked with
only 34 informants (less than 1% of the Udis in Nij): “Half of the Udi speakers
who answered questions were chosen by Jora Keçaari, the other half were cho-
sen randomly as we sat in the teahouse or visited various offices. Participants
were given a particular set of questions at random.” Jora Keçaari’s
role as a propagandist of Udi language awareness, together with the fact that he
selected half of the small group of informants, raises doubts and
puts the survey at risk of reflecting a biased view. In addition, it is remarkable
that the study concentrates on Russian, even though the relevance of
Russian has decreased considerably since the 1990s, especially in rural areas
(3% claiming to be fluent in Russian, see Ramazanova (2014)). The survey is
rather silent about the role of Azerbaijani. From this we may conclude that the
presence of Azerbaijani within every-day linguistic practices in Nij has become
part of the standard habitus of the villagers, no longer featuring a specific phe-
nomenon to be surveyed.
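The remark that 34 informants amount to less than 1% of the Udis in Nij can be checked with a short calculation. The following Python sketch is illustrative only: it assumes that the figures quoted in this section (some 6,000 inhabitants, some 65% of them ethnic Udis in 2009) may be combined into an estimate of the Udi population, although the survey itself dates from 1998. All names in the code are hypothetical.

```python
# Figures quoted in the text; combining them is an assumption of this sketch,
# since the population share is from 2009 and the survey from 1998.
NIJ_INHABITANTS = 6_000   # "some 6,000 people"
UDI_SHARE = 0.65          # "some 65%" declared themselves ethnic Udis (2009)
INFORMANTS = 34           # informants in the survey by Clifton et al.


def sample_share(informants: int, population: float) -> float:
    """Return the sample size as a fraction of the population."""
    if population <= 0:
        raise ValueError("population must be positive")
    return informants / population


# Estimated number of ethnic Udis in Nij (roughly 3,900).
udi_population = NIJ_INHABITANTS * UDI_SHARE
share = sample_share(INFORMANTS, udi_population)

print(f"Estimated Udi population of Nij: {udi_population:.0f}")
print(f"Share of informants: {share:.2%}")
```

Under these assumptions the sample covers well under one percent of the Udi population, which is in line with the figure given above.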
Hence, the question emerges what has to be considered as the linguistic
substance embodied by the linguistic practices of the speech community of
Nij. On the one hand, we have to ask whether Nij represents a single ‘speech
community’ or whether we have to divide it just for linguistic reasons into an
Azerbaijani and an Udi speech community. Basically, the inhabitants of Nij
share common socioeconomic patterns (this includes e.g. cattle breeding,
sericulture, horticulture, poultry farming, and craftsmanship). Intermarriage
between ethnic Udis and ethnic Azerbaijanis is a pattern well-documented for
Nij. Moreover, Udis and Azerbaijanis share quite a number of cultural patterns
typically present in the Transcaucasus. Nevertheless, the fact that Udis adhere
to rites of Christianity, whereas most Nij Azerbaijanis are Muslims, suggests
that corresponding patterns influencing everyday-culture (such as festivals,
allowance of pig breeding among Christians etc.) condition some kind of ‘func-
tional biculturalism’ (also cf. Hałas (2010) for a detailed discussion of the under-
lying term ‘culturalism’). I use this term in the sense that Udis share a common
cultural knowledge that is indexed as being more Udi-like or more Azerbaijani-
like according to different ranges of every-day contexts and events (as for
the Udi-like style, see Volkova 1994, Çabanov and Hüseynov 1999, Guvarasy
2001). Functional biculturalism “concerns, when, where and with whom”
people refer to the corresponding subsystems of cultural knowledge
(paraphrasing Colin Baker’s definition of ‘functional bilingualism’, Baker 2011: 5).
Within the village, the majority status of the Udis may lead to the impression that
the corresponding cultural patterns are more prevalent than those relatable
to Azerbaijanis. However, the possible dominance of these patterns is coun-
terbalanced by the fact that the Azerbaijani traditions in Nij are backed by
parallel traditions outside the village including the institutional propagation of
an overall Azerbaijani cultural style (cf. the “concept of culture of Azerbaijan”
propagated by Azerbaijani president Ilham Aliev). The Udi traditions have been
incorporated into the concept of a common Azerbaijani cultural tradition in
the context of the Karabakh conflict, allowing certain cultural idiosyncrasies
to be practiced in order to document the residues of an earlier non-Armenian
culture (including religion) in Western Azerbaijan. In this sense, Udi culture
is seen as being part of the cultural traditions of Azerbaijan and hence of its
inhabitants, the ‘Azerbaijani people’. Accordingly, the external view on the Udi
component of Nij is not marked for delimitation, but for integration into
the overall construction of Azerbaijani ‘national’ identity (embodied mainly
in the ethnicity of Azerbaijanis). The adoption of this view by the Udis in
Nij conditions that they construe their own identity as being “the same as
the Azerbaijani one, but different at the same time” (Keçaari 2004, p.c.; cf.
I. Schulze (2014b) for the description of rather analogous views among mem-
bers of minority groups in Armenia).
Accordingly, it is problematic to describe two distinct sociocultural lay-
ers for the Nij community. The concept of ‘functional biculturalism’ allows
assuming that the cultural practices of the members of this community waver
between a more Azerbaijani-like and a more Udi-like style, the selection of
which depends on situational frames as well as upon biographical and social
parameters. Both styles can be seen as being grounded in the overall cultural
patterns of Azerbaijan that have become stabilized and habitualized over cen-
turies and that enjoy public propagation in present-day Azerbaijan.
The term ‘speech community’ (see again Himmelmann’s quote given in
section 2) suggests that ‘speech’, that is, the ensemble of linguistic practices,
can be seen as a delimiting factor that would subdivide (in the given case)
the community of Nij into two ‘sub-communities’. Accordingly, one would
have to start from some kind of ‘language-specific linguistic practices’, that
is one would have to isolate those layers of observed linguistic practices
that are grounded in particular linguistic knowledge systems, called ‘languages’.
However, this procedure presupposes that we have corresponding instruments
and tools available that would allow revealing these layers. One way would be
to ask people how they would label their actual linguistic practices in terms of
‘language’. Still, this way of referring to the internal view of people on their lin-
guistic practices presupposes that the individuals are sufficiently aware of the
different layers of their linguistic practices, which again presupposes that their
understanding of ‘language’ corresponds to that of the scientific perspective.
Moreover, the isolation of language-specific layers when documenting linguistic
practices risks producing an artificial output, namely an “abstract, idealized
homogeneous language”, to quote Lippi-Green (1997: 64) again.
4 Udi: Language or Communicative Style?
Obviously, the whole problem is related to different ways of approaching the
definition of ‘a language’. If we view ‘a language’ from a sociological point of
view, we might claim that individual languages (better: Gesamtsprachen, see
section 2) represent the collective knowledge system of a social group or community
to which its members refer when acting in terms of linguistic practices.
The delimiting factor would not be given by some structural or ‘linguistic’
properties (in the sense of scientific parameters), but merely by sociocultural
parameters separating the given social group from others. However, this view
faces the problem that individuals may belong to different social groups at the
same time (Simmel 1890: 110-116, also see Degele 1999). Accordingly, it is crucial
to set up adequate parameters in order to fix a relevant social circle (Sozialer
Kreis, Simmel 1890) that would furnish the basis for assessing a ‘community’.
This aspect is also relevant because ‘functional biculturalism’ and ‘functional
bilingualism’ present in Nij are both strongly controlled by just those parameters
that are often referred to when describing individual social circles. This
includes role-dimensions such as ‘family’, ‘producer’, ‘consumer’, ‘believer’,
‘public role’ etc. Udi-based and Azeri-based linguistic practices among the
inhabitants of Nij are strongly related to these roles (supplemented by situ-
ational, demographic and other parameters). In other words: The ‘language’
present in the linguistic practices of the Nij community cannot be seen as
reflecting a homogeneous knowledge system, but shows up as a complex network
of knowledge units that is profiled differently depending on given roles,
situations and so forth.
The alternative, namely starting from a linguistic definition of ‘a language’,
faces similar problems. First, it is crucial to decide which parameters have to
be taken into account when arguing that one language is different from the other.
Traditionally, people refer to a set of arguments starting from the question of
intelligibility. In addition, arguments stemming from observation with respect
to phonology, grammar, and the lexicon are introduced in order to fix the sta-
tus of a given communicative system as a ‘distinct’ language. The question is,
whether this external view matches the internal view of a ‘speech community’.
The best way to test this aspect is to look at strategies of people in a community
to render their own communicative system unintelligible for others who use
the same communicative system. The corresponding strategies are best docu-
mented in so-called special languages such as Rotwelsch or Jenisch in Germany
(e.g. Siewert 2003, Efing 2005), Verlan or Louchébem in France (e.g. Méla 1991,
Alliot 2009), Minderico in Portugal (I. Schulze 2014) and so on. The fact that
the relevant strategies mainly concern phonetics and the lexicon but rarely
grammatical features goes together with the above-mentioned distinction of
communicative and conjunctive knowledge as described by Mannheim (1980
[1922], see section 2). Obviously, it is basically the communicative knowledge
(here: lexicon, phonetics) that allows manipulating a given paradigm of lin-
guistic practices. From this we may conclude that people usually define ‘their
language’ as being specific because of particular features of phonetics and the
lexicon. Grammar, being less accessible due to its inclusion in conjunctive
knowledge, does not seem to play a major role in this context. It follows that
the scientific (linguistic) characterization of ‘a language’ on the basis of just
grammatical features hardly matches the standard perception of people
exhibiting the corresponding linguistic practices. From a folk-linguistic point
of view, ‘a language’ is marked especially for pronounced differences in the lex-
icon supplemented by pronounced differences in the phonetics of utterances.
Differences in the grammatical system (mainly in its morphological domain)
may be understood as ‘strange’ or somehow ‘group-specific’, but they rarely
help laypersons to identify a foreign language. The question naturally is
whether there is some kind of threshold related to the quantity of differences e.g.
in the lexicon that has to be crossed for a given communicative system
to be defined as representing a different language. To my knowledge, no studies
of this kind exist so far. In addition, it is relevant to note that the question does
not address just quantitative issues: We can likewise hypothesize that a certain
number of salient or clue terms already suffices to label linguistic practices as
being ‘foreign’ (especially among people who are marked for a certain degree
of bi- or multilingualism).
From a linguistic point of view, Udi qualifies as a distinct ‘language’ because
of its idiosyncrasies present in all domains of the ‘language system’. It is
marked for a phonological system that is opposed to the Azeri system because
of the presence e.g. of pharyngealized vowels (e.g. xaˤ ‘dog’ vs. xa ‘fur, wool’,
uˤš ‘firewood’ vs. uš(e) ‘night’ and so on) and lengthened (historically glottalized)
consonants (e.g. k:ul ‘earth’ vs. kul ‘hand’, p:i ‘blood’ vs. pi ‘having said’
etc.). The morphosyntax is marked for moderate agglutination with certain
tendencies towards fusion. Other distinctive features are e.g. the presence of
an ergative case and the paradigm of so-called floating agreement clitics (see
Harris 2002, Schulze 2011b). Nevertheless, Udi grammar shares a number of rel-
evant features with Azeri, such as the presence of an O-split (basically ‘given/
non-given’), of a monopersonal system of agreement markers triggered by the
S/A NPs, of postpositions, and of converbial strategies (less pronounced in Udi
than in Azeri). Accordingly, speakers of Udi are not confronted with a fully
different system of grammatical categories when exposed to Azeri linguistic
practices, cf. the following sentences:
(2) Udi:   (bez)        xaʕ           te-zax-p’u
           I.POSS       dog           NEG-I.DAT-COP.PRES
    Azeri: mən-im       it-im         yox-dur
           I-1SG.POSS   dog-1SG.POSS  be_not-COP.PRES
    ‘I do not have a dog.’
(3) Udi:   meyvan-a     k’ä-i-zu
           fruit-O.DEF  eat.PAST-PAST-1SG
    Azeri: meyvən-a     ye-di-m
           fruit-O.DEF  eat-PAST-1SG
    ‘I ate the fruit.’
Nevertheless, the number of differences in realizing the corresponding categories
seems to be sufficient in order to set Udi apart from Azeri from a grammatical
point of view.
As for the lexicon, however, things are quite different. The two dictionaries
of Udi currently available (Gukasjan (1974) and Mobili (2010)) contain some
7,000 resp. 9,000 entries (including compounds, phrasemes etc.). 26.3% of
the entries in Gukasjan’s dictionary are direct loans from Azeri. This figure has
risen to 37.6% in Mobili’s dictionary. These loans cover all domains of every-day
life and cannot be related to specific conceptual domains or domains of
cultural/social practices. Out of the 1,233 lexical concepts listed in Comrie and
Khalilov (2010) that have a match in Udi, 278 Udi terms (22.45%) are loans
from Azeri (disregarding loan translations). The following diagram illustrates
the percentage of loans from Azeri with respect to the individual conceptual
domains proposed by Comrie and Khalilov (2010):
Diagram 14.1 Relative percentage of Azeri loans in Udi according to
conceptual domains.
It becomes obvious that the percentage of Azeri terms in the domains of ‘war-
fare and hunting’, ‘possession’, ‘clothing’, ‘dwelling’, ‘agriculture’, ‘religion and
belief’ is higher than the average (22.45%). Roughly the same picture emerges
when we relate the number of Azeri loans both to the total of loans and to the
total of conceptual units, cf. Diagram 14.2.
Diagram 14.2 Absolute percentage of Azeri loans in Udi.
Given the lack of a comprehensive contemporary corpus of linguistic practices,
it is difficult to judge upon the question which actual frequencies can
be ascribed to the individual loans. In addition, it has to be assumed that the
individual domains are present in accordance with the given communicative
frame and in accordance with the social roles of the interlocutors. This means
that we have to include parameters related to the communicative habitus of
the corresponding social circle. For instance, it can be assumed that in the
frame of traditional role concepts, Udi men would talk more about topics
like agriculture, hunting and animals, politics and the like in public, whereas
women would address topics related to housework, social relations, cuisine
etc. more frequently when talking to each other in a corresponding setting.
This means that the linguistic practices of the Udi community may be marked
for an even higher percentage of Azeri loans in certain communicative situa-
tions, whereas they are less frequent in others. Personal observation suggests
that the more communication is done in public, the more Azeri loans occur.
The question thus emerges what is meant by ‘Udi’ when aiming at the
“language documentation (. . .) of the linguistic practices characteristic of a
speech community”, to quote Himmelmann (1998) again. One might argue
that the degree of phonological integration of Azeri loans into the collective
communicative knowledge system called ‘Udi’ can be an argument for judg-
ing upon the ‘Udishness’ of these loans. It goes without saying that especially
older loans may have undergone this type of accommodation to the phono-
logical system of Udi. However, the relevant changes are marginal and would
not prevent Azeri speakers from understanding them if embedded into Udi phrasing.
On the other hand it is difficult to tell whether the spontaneous use of Azeri
terms in Udi is nothing but another type of code switching that would not
harm communication because in fact all Udis share the corresponding bilin-
gual knowledge system.
These assumptions also question the value of lexical documentation if not
derived from corpus linguistics reflecting the actual linguistic practices of a
‘speech community’. For instance, when asked what their term for ‘fish’
is, Udis would probably answer “čäli”, that is, they would refer to the
non-Azeri term (without clear etymology). However, when talking about fish it
may well be the case that they say something like ayč:ä balıq aq:alzu ‘tomorrow
I will buy a fish’, using the Azeri term balıq ‘fish’ (or even sabah balıq aq:alzu,
using Azeri sabah for ‘tomorrow’). Hence, Udi čäli might be present in the
knowledge system of these speakers, but it would not be part of their every-
day linguistic practices. So, when documenting the linguistic practices of these
Udis, the Azeri term would show up, but not the Udi one.
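The contrast between dictionary-based and usage-based figures can be made concrete with a small calculation. The Python sketch below counts the share of Azeri tokens in the two utterances quoted above. The tagging of ayč:ä and aq:alzu as Udi, like all function and variable names, is an assumption made for illustration, and two utterances are of course no substitute for a corpus.

```python
from collections import Counter

# Each utterance is a list of (token, source) pairs. The tokens come from the
# two example utterances in the text; labelling the tokens that the text does
# not identify as Azeri as "udi" is an assumption of this sketch.
utterances = [
    [("ayč:ä", "udi"), ("balıq", "azeri"), ("aq:alzu", "udi")],
    [("sabah", "azeri"), ("balıq", "azeri"), ("aq:alzu", "udi")],
]


def loan_share(utterance, loan_source="azeri"):
    """Return the fraction of tokens in an utterance tagged as loan_source."""
    counts = Counter(source for _, source in utterance)
    total = sum(counts.values())
    return counts[loan_source] / total if total else 0.0


for utterance in utterances:
    text = " ".join(token for token, _ in utterance)
    print(f"{text}: {loan_share(utterance):.0%} Azeri tokens")
```

A count of this kind works on tokens in actual speech, whereas the dictionary percentages count entries; the two measures need not coincide.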
Turning the perspective around, one may ask to which extent Azeri speak-
ers of Nij are able to follow a conversation in Udi. Informants told me that
they are usually able to understand the general topics of such conversations
by referring to the Azeri terms showing up in the speech of the interlocutors.
A more comprehensive understanding would depend on the question of how
many Azeri terms the speakers use. They likewise admit that under certain
circumstances such as private communication, they would have difficulty
getting even an idea of the contents of the conversation. From a perceptual
perspective, they would thus be faced with some kind of massive special language
as described above. The percentages of loans given in Gukasjan’s and
Mobili’s dictionaries illustrate that the share of Azeri loans has increased over the
last 40 years (from 26.3% to 37.6%). Naturally, these figures do not tell much
about the actual relevance of the corresponding terms in the linguistic prac-
tices of the members of the Udi speech community. Likewise, it is difficult to
tell to which degree Robert Mobili, being a full Udi-Azeri bilingual, has tested
his words with the help of other Udi speakers. It should be noted that Mobili
does not include recent loans related to the domains of politics and sciences,
except if they have undergone characteristic phonetic changes (Mobili 2010: 5).
Accordingly, we may assume that the number of Azeri terms in actual Udi
speech is even higher than 37.6%.
The observations presented in this paper question the notion of ‘Udi’ as ‘a
language’ at least from a sociolinguistic point of view. The bilingual setting of
the Nij community cannot be described as the co-existence of two distinct lin-
guistic knowledge systems. The strong tendency toward the inclusion of Azeri
patterns in the linguistic practices of Udis does not result in some kind of ‘mixed
language’, which would typically result in a monolingual knowledge system.
Rather, we may assume that Udi is on its way to becoming a communicative
style of Azeri, retaining to a certain degree its grammatical, phonological, and
in part lexical idiosyncrasies. The more Udis include Azeri lexical terms in
their linguistic practices the more this communicative system will become
intelligible to Azerbaijanis, at least on a rudimentary level. Udi will neverthe-
less keep on representing a particular way of speech in the village of Nij, just
as would be typical for other types of Azeri registers and dialects. To be sure, as
long as the grammatical system does not change dramatically, Udi will not simply
end up as some kind of Azeri dialect. Rather, we have to assume that it will
result in a new type of communicative style or register that may be termed
a culturally-defined variety or ‘cultural dialect’ of Azeri. However, as has been
said above, the growing language awareness among certain proponents of the
Udi community is a factor to be included in this forecast. The official acknowl-
edgement of Udi as some kind of ‘cultural capital’ for the Azerbaijani state and
society may condition that Udi survives as ‘a language’ because of its surplus
for the Azerbaijani state and – by adopting this external view – for the Udi
community of Nij.
5 Conclusions
The remarks presented in this short paper illustrate that the scientific param-
eters used to define ‘a language’ do not necessarily meet what can be derived
from the observation of the linguistic practices of a ‘speech community’. It
seems more adequate to distinguish between ‘Udi’ as a scientifically defined
unit and ‘Udi’ as representing a particular rhetorical genre that is part of the
habitus of people in Nij. Obviously, the scientifically delimited unit (Udi as ‘a
language’) is part of the knowledge system of many Udis, although we have
to assume that not all Udis share this knowledge system to the same extent.
Hence, Udi as ‘a language’ can be regarded as a catalogue of linguistic phenom-
ena abstracted from the documentation of linguistic practices that are filtered
according to scientifically established criteria. However, even this perspective
only allows delimiting Udi from other communicative systems by referring to
the feature of intelligibility. Over centuries, the communicative system called
‘Udi’ has undergone massive changes in nearly all domains of phonology,
grammar, and lexicon. Many of these changes were induced by language con-
tact. The ‘original’, that is East Caucasian profile of Udi has become distorted
to a degree that it is even doubtful whether we should label Udi as an East
Caucasian language today. Thus the question of intelligibility by people using
another communicative standard seems to be the only valid way of defining
Udi from a linguistic point of view.
However, the linguistic practices of the Udi community become intelligi-
ble to Azerbaijani members in and outside the community the more the Udis
‘tune’ their practices toward the communicative system of these people. Thus,
these linguistic practices do not represent a stable and non-dynamic knowl-
edge system, but are part of the social dynamics given for the members of
the community. In this sense, the understanding of ‘Udi’ as a stable and well-
defined linguistic system does not seem to be justified. Rather, we have to assume
that this system is just a subset or one of the layers of actual linguistic practices
that may be activated to a different degree depending on the corresponding
social roles, social conditions, frames, and situations.
In this respect, it does not make sense to claim that Udi is that component
of the linguistic practices observable in the Nij community that is marked for
‘Udishness’. Such a claim would presuppose that we know what ‘Udi’ is. Rather,
we might say that Udi is that component of the local linguistic practices that
cannot be processed or understood by ‘other’ people. Still, such a definition
would fragment the reality of linguistic practices. ‘Udi’ as ‘a language’ would
then no longer be understood as a typified rhetorical genre present in the com-
munity of Nij (and elsewhere) the practice of which might in parts be intelligi-
ble for others sharing a related rhetorical genre knowledge. From this it follows
that there are at least two ways of defining ‘a language’: First, by referring to
linguistic criteria (of whatever kind); second, by referring to the actual lin-
guistic practices of a community sharing a common rhetorical genre. It will be
up to people engaged in ‘language documentation’ to decide which object
they will choose.
References
Alliot, David. 2009. Larlépem-vous Louchébem? L’argot des bouchers. Paris: Éditions
Horay.
Ammon, Ulrich and Arnuzzo-Lanszweert. 2001. Varietätenlinguistik. In G. Holtus,
M. Metzeltin, and C. Schmitt (eds.), Lexikon der Romanistischen Linguistik I,2,
793-823. Tübingen: Niemeyer.
Baker, Colin. 2011. Foundations of Bilingual Education and Bilingualism. Bristol, Buffalo,
Toronto: Multilingual Matters.
Bawarshi, Anis S. and Mary Jo Reiff. 2010. Genre. An Introduction to History, Theory,
Research, and Pedagogy. West Lafayette, Indiana: Parlor Press.
Berkenkotter, Carol and Thomas N. Huckin. 1993. Rethinking Genre from a
Sociocognitive Perspective. Written Communication 10: 475-509.
Bourdieu, Pierre. 1979. La Distinction – Critique sociale du jugement. Paris: Éditions de
Minuit.
Çabanov, Gəmeršah and Rauf Hüseynov. 1999. Udilər. 2nd edition. Baku: Elm.
Clifton, John M., Deborah A. Clifton, Peter Kirk, and Roar Ljøkjell. 2005. The
sociolinguistic situation of the Udi in Azerbaijan. SIL Electronic Survey Reports 2005-014.
Clot, Yves. 2008. Travail et pouvoir d’agir. Paris: Presses Universitaires de France.
Comrie, Bernard and Madzhid Khalilov. 2010. The Dictionary of Languages and Dialects
of the Peoples of the Northern Caucasus. Leipzig and Makhachkala: MPI Evolutionary
Anthropology.
Degele, Nina. 1999. Soziale Differenzierung: Eine subjektorientierte Perspektive.
Zeitschrift für Soziologie 28,5: 345-364.
Efing, Christian. 2005. Das Lützenhardter Jenisch. Studien zu einer Sondersprache.
Wiesbaden: Harrassowitz.
Gippert, Jost, Wolfgang Schulze, Zaza Aleksidze, Jean Pierre Mahé. 2009. The Caucasian
Albanian Palimpsests from Mount Sinai. 2 vols. Turnhout: Brepols.
Gregersen, Edgar A. 1976. A note on the Manam language of Papua New Guinea.
Anthropological Linguistics 18,3: 95-111.
Grosjean, François. 1982. Life with two languages: An introduction to bilingualism.
Cambridge, MA: Harvard University Press.
Gukasjan, Vorošil. 1974. Udincǝ-azǝrbajcanca rusca lüǧǝt. Bakı: Elm.
Guvasary, Venera Antonova. 2001. Udiland. Oslo: Bergersen.
Hałas, Elżbieta. 2010. Towards the World Culture Society: Florian Znaniecki’s Culturalism.
Bern: Peter Lang.
Halliday, M. A. K., A. McIntosh, P. Strevens. 1964. The Linguistic Sciences and Language
Teaching. London: Longmans.
Harris, Alice. 2002. Endoclitics and the Origins of Udi Morphosyntax. Oxford: Oxford
University Press.
Heath, Shirley Brice. 1977. Social history. In: Joshua A. Fishman et al. (eds.), Bilingual
Education: Current Perspectives. Vol. 1: Social Science, 53-72. Arlington, VA: Center
for Applied Linguistics.
Himmelmann, Nikolaus P. 1998. Documentary and descriptive linguistics. Linguistics
36: 161-195.
Hoffmann, Charlotte. 2014 [1991]. An Introduction to Bilingualism. New York and
London: Routledge.
Lippi-Green, Rosina. 1997. English with an accent: Language, ideology, and discrimina-
tion in the United States. London: Routledge.
Mobili, Robert. 2010. Udi-azerbaycanin urusin ǝyitluğ. Bakı: Qrifli nǝşr.
Mühlhäusler, Peter. 2006. Naming Languages, Drawing Language Boundaries and
Maintaining Languages with Special Reference to the Linguistic Situation in Papua
New Guinea. In: D. Cunningham, E. E. Ingram and K. Sumbuk (eds.), Language
Diversity in the Pacific. Endangerment and Survival, 24-39. Clevedon etc.: Multilingual
Matters.
Mannheim, Karl. 1980 [1922]. Eine soziologische Theorie der Kultur und ihrer
Erkennbarkeit – konjunktives und kommunikatives Denken. Frankfurt a.M.: Suhrkamp.
Maybury, Mark T. 1999. Communicative Acts for Multimedia and Multimodal Dialogue.
In: M. M. Taylor, F. Néel, and D. G. Bouwhuis (eds.), The Structure of Multimodal
Dialogue II, 375-392. Philadelphia/Amsterdam: Benjamins.
McLaughlin, Barry. 1984. Second Language Acquisition in Childhood. I. Preschool
Children. Hillsdale, NJ: Lawrence Erlbaum.
Méla, Vivienne. 1991. Le verlan ou le langage du miroir. Langages 101: 73-94.
Miller, Carolyn R. 1984. Genre as Social Action. Quarterly Journal of Speech 70,2:
151-167.
Nešumašvili, Simon. 2012. Eve’l däftär matveyaxo mark’nuxo, luk’axo va ioananxo udin
muzen. Tbilisi: Krist’ianuli memk’vidreobis liga.
Parsons, Talcott. 1991 [1951]. The Social System. With a New Preface by Bryan S. Turner.
London: Routledge.
Ramazanova, Aynur. 2014. Knowledge of Russian in Azerbaijan. Caucasus Research
Resource Center 21/04/2014. [http://crrc-caucasus.blogspot.de/2014/04/knowledge-
of-russian-in-azerbaijan.html]
Reyes, Iliana. 2008. Bilingualism in holistic perspective. In: J. González (ed.),
Encyclopedia of bilingual education, 79-82. Thousand Oaks, CA: SAGE Publications.
Schulze, Ilona. 2014a. Sprache als fait culturel. Studien zur Emergenz, Motiviertheit und
Systematizität des Lexikons des Minderico (Portugal). Hamburg: Kovač.
Schulze, Ilona. 2014b. Methodologische Überlegungen zur soziokulturellen
Dokumentation von Minderheiten in Armenien. Iran and the Caucasus 18, 2:
169-193.
Schulze, Ilona and Wolfgang Schulze. 2016. A Handbook of the Minorities of Armenia.
Hamburg: Kovač.
Schulze, Wolfgang. 2011a. A brief note on Udi-Armenian relations. In: J. Dum-Tragut
and U. Bläsing (eds.), Cultural, Linguistic and Ethnological Interrelations in and
Around Armenia, 151-170. Newcastle upon Tyne: Cambridge Scholars Publishing.
Schulze, Wolfgang. 2011b. The Origins of Personal Agreement Clitics in Caucasian
Albanian and Udi. In: Manana Tandaschwili and Zakaria Pourtskhvanidze (eds.),
Folia Caucasica, Festschrift für Jost Gippert zum 55. Geburtstag, 119-168. Frankfurt
a.M. and Tbilisi: Univ. Frankfurt/Staatl. Univ. Tbilisi.
Schulze, Wolfgang. 2015. From Caucasian Albanian to Udi. Iran and the Caucasus 19,2:
149-177.
Schütz, Alfred und Thomas Luckmann. 1975. Strukturen der Lebenswelt. Neuwied:
Luchterhand.
Schwarz, Anna. 2009. Vorlesung “Soziologische Grundbegriffe”. Script. Europa
Universität Viadrina, Frankfurt a.d. Oder.
Siewert, Klaus. 2003. Grundlagen und Methoden der Sondersprachenforschung. Mit
einem Wörterbuch der Masematte aus Sprecherbefragungen und den schriftlichen
Quellen. Wiesbaden: Harrassowitz.
Simmel, Georg. 1890. Über soziale Differenzierung. Soziologische und psychologische
Untersuchungen. Leipzig: Duncker & Humblot.
Skinner, Quentin. 1971. On Performing and Explaining Linguistic Actions. The
Philosophical Quarterly 21,82: 1-21.
SSSD. 2003. Sakartvelos mosaxleobis 2002 c’lisp’irveli erovnuli saq’oveltao aγc’eris Ʒiritadi
šedegebi. Vol. 2. Tbilisi: Sakartvelos st’at’ist’ik’is saxelc’ipo dep’artmenti.
Volkova, N. 1994. The Udis. In: Paul Friedrich and Norma Diamond (eds.). The
Encyclopedia of World Cultures, Vol. 6. Russia and Eurasia/China, 375-378. Boston:
Hall and Co.
Weinreich, Max. 1945. Der YIVO un di prablemen von undzer tsayt. YIVO bleter 25,1:
3-18.
Woodbury, Anthony C. 2003. Defining documentary linguistics. In: P. Austin (ed.),
Language Description and Documentation, Vol. 1. (Endangered Languages Project.),
35-51. London: School of Oriental and African Studies.
Chapter 15
Language Contact in Anatolia: The Case of Sason Arabic
Eser Erguvanlı Taylan
1 Introduction
The region of present-day Turkey has, throughout history, been the homeland
of diverse peoples with different cultural, religious and linguistic backgrounds.
This geographical area of great linguistic diversity is thus fertile ground for linguistic
research on language contact. This study aims to present a brief overview of
Sason Arabic (SA), one of the least documented and least studied dialects of
Arabic spoken in multilingual and multicultural eastern Turkey, as a good
case for a language contact study.1 The town of Sason, which is situated in the
north of the mountainous province of Batman, has slightly more than thirty
thousand inhabitants according to the 2014 census. SA is spoken in a multilingual
environment where the main contact languages have been Armenian,
Zazaki (Kurdish) and Turkish, but due to demographic changes that took
place in the nineteenth and twentieth centuries SA speakers are now diminishing
in number.2 There is no diglossia among SA speakers.
Jastrow (2006: 144-5) classifies the Arabic dialects spoken in Turkey into
three major groups, and several sub-groups. SA is identified as a sub-group of
Mesopotamian Arabic:
i) Syrian sedentary Arabic
Hatay
Mersin, Adana
ii) Syrian Bedouin Arabic
Urfa
iii) Mesopotamian Arabic (qəltu-dialects)3
Mardin
Siirt
Diyarbakır
Kozluk-Sason-Muş
1 See the appendix for the location of Batman and Sason on the map.
2 There is no reliable source on the number of SA speakers today or in the past. The estimate
given by our younger informant was about 5,000 to 6,000 people, though it is not clear what
this estimate is based on.
Both Jastrow (2005, 2006) and Talay (2011) refer to Haim Blanc as the person
who first introduced the term Mesopotamian Arabic in his monograph entitled
Communal Dialects in Baghdad (Cambridge, Massachusetts 1964), in which
he gives a description of the Muslim, Christian and Jewish dialects spoken
in Baghdad until 1950. According to Haim Blanc, based on the pronunciation
of the Classical Arabic qultu ‘I said’, the Jewish and Christian dialects comprised
the qəltu-dialects, while the Muslim dialect was a gilit-dialect.4 Haim Blanc also argued
that the qəltu-dialects spoken by the Christians and Jews in Baghdad were an
older variety spoken by the sedentary population in the time of the Abbasid
caliphs, while the Muslim dialect was a much later development, introduced
by the Bedouins who gradually moved into the country during the reign of
the Ottoman Empire (Jastrow 2006: 157-8). This brief historical perspective
indicates that the qəltu-dialects found in Anatolia must belong to the older
linguistic stratum, which implies that the earlier settlers in the region must
have been Christian and Jewish speakers of qəltu-dialects rather than speakers
of the Muslim gilit-dialect. Indeed, our informants, when asked who
their ancestors were, replied that the Arabic speaking population in the area
was believed to have migrated from the Basra region a long time ago, and that
SA was exclusively spoken by the Christian population until only recently. It
should be recalled that, historically speaking, the whole area was the land of
the Armenian people long before the Arabs and the Kurds appeared on the
scene. However, due to the migrations of the non-Muslims from the area in
the late nineteenth and early twentieth centuries, the remaining SA speakers
today are predominantly Muslims living in the surrounding villages.5 The
contact languages in the area have basically been SA, Armenian and Zazaki,
the variety of Kurdish spoken in that region. Turkish appeared in the lives of
the people of Batman as the language of the state much later, probably as late
as the early twentieth century.
3 See Talay (2011) for a survey of the work done on the Mesopotamian dialects of Arabic spoken
in Turkey.
4 The qəltu dialects preserve the Old Arabic *q as a voiceless uvular stop /q/, while in the gilit
dialects this sound surfaces as a voiced velar stop /g/ in most cases. Furthermore, in the
qəltu dialects the 1.SG perfect is -tu, while it is -it in the gilit dialects (Jastrow 2005, 2006).
5 Some of the Muslim SA speakers of today might be Christians who have converted to Islam.
It is well-known that long-term language contact leads to linguistic change;
the degree and type of change vary depending on a number of factors, such
as nature and duration of contact, social status of the languages in contact,
etc. The linguistic changes contact languages undergo may result in language
loss, borrowing at different levels or mixing of languages. SA, being one of the
languages spoken in the multilingual communities of northern Batman, would
then be expected to exhibit some effects of language contact. Thus, the nature
and extent of language contact and their effects on SA duly call for an investi-
gation. However, before raising any questions regarding how contact with the
languages in the area may have influenced SA, first a brief description of its
major linguistic features will be presented. Then, the findings will be discussed
from the perspective of the possible effects of language contact.
2 Linguistic Features of SA
The SA data presented in this study derive from data collected in a field
methods class at Boğaziçi University in the spring of 2013 and from Akkuş (2014b).6
The data were supplied by two informants: one a middle-aged gentleman
around 50-55 years old, an illiterate polyglot in SA, Armenian, Zazaki
and Turkish (the last of which he learned after the age of 25), and the other a bilingual
male university student aged 22-23.
When asked to what extent they could understand the other Arabic dia-
lects they were exposed to, our informants emphatically stated that they were
unable to comprehend any other Arabic dialect, be it one spoken in Turkey,
such as the Mardin or Adana/Hatay dialect, or one outside of Turkey. They said
they could only recognize certain words.7
6 The source is specified when data from Akkuş are used. Examples with no source specified
are part of the author’s data collected in the field methods class.
It is worth noting that the first linguistic study on SA is Akkuş (2014a). I would like to
thank him for doing the final check on the SA data in this study.
7 The complete unintelligibility between SA and the other Arabic dialects spoken in the area was
also confirmed by another SA speaker at the conference, who was an employee of Ardahan
University.
Some basic expressions of greeting in SA are provided below to give a feel for the dialect.
(i) bəlxer dʒi:t/e (M/F) ‘Welcome.’
(ii) ʃəme kənt/e (M/F) ‘How are you?’
(iii) Baʃ kəttu. ‘I am fine.’
An overview of the major syntactic and morpho-syntactic properties of
SA, predominantly at the simplex structure level, is now presented.
(i) Word Order

The default word order is SVO; however, certain other orders are observed
typically in sentences with focused or topicalized elements in the appropriate
syntactic and/or pragmatic contexts. As focused elements are preferred pre-
verbally, SOV and OVS are other frequently occurring orders used to express
object focusing, as seen in (1)b-c. VSO is a highly marked order, and VOS occurs
when the verb phrase is contrastively used, as in (1)d and (1)e, respectively.
OSV, on the other hand, is not acceptable at all, as (1)f illustrates.
(1) a. bənt ayale ʃi (SVO)

girl eat.PST.3SG.F food
‘The girl ate (food).’
b. bənt ʃi ayale (SOV)
c. ʃi ayale bənt (OVS)
d. ??ayale bənt ʃi (VSO)
e. ayale ʃi bənt (VOS)
f. *ʃi bənt ayale (OSV)
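The judgments in (1) can be restated as a small lookup table. The following Python sketch is only an illustration: the names JUDGMENTS, WORDS, linearize and all_orders are hypothetical, and the labels "ok", "??" and "*" simply transcribe the acceptability marks used in the example.

```python
from itertools import permutations

# Acceptability marks transcribed from example (1):
# "ok" = acceptable, "??" = highly marked, "*" = unacceptable.
JUDGMENTS = {
    "SVO": "ok",  # default order, (1)a
    "SOV": "ok",  # object focus, (1)b
    "OVS": "ok",  # object focus, (1)c
    "VSO": "??",  # highly marked, (1)d
    "VOS": "ok",  # contrastive verb phrase, (1)e
    "OSV": "*",   # not acceptable, (1)f
}

# The three words of example (1), keyed by grammatical role.
WORDS = {"S": "bənt", "V": "ayale", "O": "ʃi"}


def linearize(order):
    """Spell out the words of (1) in the given order, e.g. 'SOV'."""
    return " ".join(WORDS[role] for role in order)


def all_orders():
    """Map each of the six logically possible orders to its word string."""
    return {"".join(p): linearize(p) for p in permutations("SVO")}


if __name__ == "__main__":
    for order, sentence in all_orders().items():
        print(f"{JUDGMENTS[order]:>2}  {order}  {sentence}")
```

Enumerating the permutations makes it easy to check that every logically possible order has been assigned a judgment.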
When the verb is in the progressive aspect, the auxiliary occurs before the verb,
as shown in (2):
(2) Ayşe ki təʃreb meyn

Ayşe COP.3SG.F drink-3SG.F water
‘Ayşe is drinking water.’
As pronominal objects in SA surface in the form of a post-verbal clitic, this
yields an SVO order. In (3), -liyen is the clitic expressing the pronominal third
person plural object, while in (4)b -a stands for the third person feminine pronominal
object, illustrating that these clitics express gender, too.
(3) nana nə-dur-liyen
we 1PL-search-3PL
‘We are searching for them.’
(4) a. ina ki-tu atßex lali dʒeʒa

I COP-1SG cook.1SG this(F) chicken
‘I am cooking this chicken.’
(4) b. ina ki-tu atßex -a

I COP-1SG cook.3SG.F
‘I’m cooking it.’
In complement clauses, as shown in (5) and (6)a, and in relative clauses, as seen
in (6)b, the observed order is VS(O), which was noted to be a highly marked order
in simplex sentences.
(5) Ali irəllu [i-fqez]

Ali want.3SG.M 3SG.M-run.IPV
‘Ali wants to run.’
(6) a. ma-səma-tu le [ dʒo zɣar ] (Akkuş 2014b, ex. 2a-b)
NEG-hear-PST.1SG that came.3PL children
‘I didn’t hear that the children came.’
b. ənt kitab le [ i-habb Cihan ] tə-qri
2SG.M book that 3SG.M-love Cihan 2SG.M-read
‘You read the book that Cihan likes.’
In non-verbal sentences the copula occurs sentence finally. The copula form,
which is kən for first and second persons (singular or plural), gets inflected
for number and gender, as shown in (7)a-c and (7)f-g. For third person singu-
lar or plural there is no gender distinction on the copula; it is ye for singular
third persons and -nen for plural third persons, regardless of gender, as seen in
(7)d-e and (7)h.8 These examples also demonstrate that the predicate adjective
agrees in gender and number with the subject. The masculine singular form
is the unmarked form of the adjective while the feminine singular carries the
suffix -e (e.g. mamlu:n-e ‘happy-F’). In the plural form of the adjective there is
a single plural suffix, which is -in for both genders (e.g. mamlu:n-in ‘happy-PL’).
(7) a. ina mamlu:n kən-tu

I happy COP-1SG
‘I am happy.’
8 In quick speech ye was observed to have the variants -yi, -i, -iye conditioned by the preceding
sound.
(7) b. ənte mamlu:n-e kən-te

you.SG.F happy-F COP-2SG.F
‘You(SG.F) are happy.’
c. ənt mamlu:n kən-t
you.SG.M happy COP-2SG.M
‘You (SG.M) are happy.’
d. iya mamlu:n-e ye
she happy-F COP-3SG
‘She is happy.’
e. iyu mamlu:n ye
he happy COP-3SG
‘He is happy.’
f. na:na mamlu:n-in kənn-a

we happy-PL COP-1PL
‘We are happy.’
g. ənto mamlu:n-in kən-to

you.PL happy-PL COP-2PL
‘You (PL) are happy.’
h. iyen mamlu:n-in-nen
they happy-PL-COP.3PL
‘They are happy.’
When the predicate is a locative NP, the unmarked word order is now sub-
ject-copula-locative, as seen in (8)a. Furthermore, in (8)b we see the third per-
son plural copula form, namely kənno, surfacing in this order, which was not
observed with predicate adjectives.
(8) a. ənte kən-te fə Istanbul

you.SG.F COP-2SG.F in Istanbul
‘You are in Istanbul.’
b. insana:d kən-no f-o:de

people COP-3PL in-room
‘The people are in the room.’
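The copula forms and the two word orders seen in (7) and (8) can be collected in a lookup table. The Python sketch below is an illustration only: the names ATTESTED_COPULA, copula and clause are hypothetical, the feature labels follow the glosses, and combinations not shown in (7)-(8) are deliberately left out rather than guessed.

```python
# Affirmative copula forms attested in (7) and (8), keyed by
# (subject features, predicate type). Unattested cells are omitted on purpose.
ATTESTED_COPULA = {
    ("1SG", "adjective"): "kən-tu",    # (7)a
    ("2SG.F", "adjective"): "kən-te",  # (7)b
    ("2SG.M", "adjective"): "kən-t",   # (7)c
    ("3SG.F", "adjective"): "ye",      # (7)d
    ("3SG.M", "adjective"): "ye",      # (7)e
    ("1PL", "adjective"): "kənn-a",    # (7)f
    ("2PL", "adjective"): "kən-to",    # (7)g
    ("3PL", "adjective"): "-nen",      # (7)h
    ("2SG.F", "locative"): "kən-te",   # (8)a
    ("3PL", "locative"): "kən-no",     # (8)b
}


def copula(features, predicate_type="adjective"):
    """Return the attested copula form, or raise if (7)-(8) do not show one."""
    key = (features, predicate_type)
    if key not in ATTESTED_COPULA:
        raise LookupError(f"no form attested in (7)-(8) for {key}")
    return ATTESTED_COPULA[key]


def clause(subject, features, predicate, predicate_type="adjective"):
    """Linearize subject, predicate and copula in the order seen in (7)-(8)."""
    form = copula(features, predicate_type)
    if predicate_type == "locative":
        # subject-copula-locative order, as in (8)
        return f"{subject} {form} {predicate}"
    if form.startswith("-"):
        # 3PL -nen attaches directly to the predicate adjective, as in (7)h
        return f"{subject} {predicate}{form}"
    # copula in sentence-final position, as in (7)a-g
    return f"{subject} {predicate} {form}"


if __name__ == "__main__":
    print(clause("ina", "1SG", "mamlu:n"))
    print(clause("iyen", "3PL", "mamlu:n-in"))
    print(clause("insana:d", "3PL", "f-o:de", "locative"))
```

Keeping the predicate type in the key makes the contrast between 3PL -nen and kən-no explicit without claiming anything about cells the examples do not cover.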
(ii) Negation
There are two different morphemes to negate verbs; ma is used when the verb
is in the indicative, as in (9)a, but la when the verb is in the imperative mood,
as in (9)b. Both forms immediately precede the verb.
(9) a. kitab ma: qari-tu-n

book NEG read.PST-1SG-it
‘I didn’t read the book.’
b. la: tamel
NEG work.2SG.M
‘Don’t work.’
In the presence of an auxiliary/copula, the negative morpheme ma always precedes
the auxiliary/copula, as shown in (10)a and (11). The placement of ma in
any other position yields an ungrammatical structure, as seen in (10)b.
(10) a. Ayşe ma: ki təʃreb meyn

Ayşe NEG COP.3SG.F drink.3SG.F water
‘Ayşe is not drinking water.’
b. *Ayşe ki ma: təʃreb meyn

Ayşe COP.3SG.F NEG drink.3SG.F water
(11) ina mamlu:n ma: kən-tu

I happy NEG COP-1SG
‘I am not happy.’
In sentences with a non-verbal predicate, when the subject is third person singular
or plural, number and gender agreement surface on the negative morpheme,
as the full form of the copula is absent; this is illustrated in (12)a and (13)-(14).
(12) a. sabi orandʒi mu-u

boy student NEG-3SG.M
‘The boy isn’t a student.’
b. */??sabi mu-u orandʒi

boy NEG-3SG student
(13) bənt orandʒi me-y

girl student NEG-3SG.F
‘The girl isn’t a student.’
(14) zɣar orandʒi me-nnen

children student NEG-3PL
‘The children aren’t students.’
The ungrammaticality of (12)b demonstrates that the negative copula, just
like the affirmative copula, is not favored in any position other than sentence-final
position.
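The negation patterns in (9)-(14) amount to two choices: which negator to use, and where to put it. The Python sketch below restates them under stated assumptions: the names NEGATOR, NEG_COPULA_3, negate_verbal and negate_nominal_3rd are hypothetical, the caller is assumed to supply the position of the verb (or of the auxiliary/copula when one is present), and only forms shown in the examples are listed.

```python
# Negators as they appear in (9)-(11): ma: with indicative verbs, la: with imperatives.
NEGATOR = {"indicative": "ma:", "imperative": "la:"}

# Third person negative forms carrying agreement, from (12)a, (13) and (14).
NEG_COPULA_3 = {"3SG.M": "mu-u", "3SG.F": "me-y", "3PL": "me-nnen"}


def negate_verbal(words, head_index, mood="indicative"):
    """Insert the negator immediately before the word at head_index.

    head_index is the position of the verb, or of the auxiliary/copula
    when the clause contains one, as in (10)a and (11).
    """
    return words[:head_index] + [NEGATOR[mood]] + words[head_index:]


def negate_nominal_3rd(subject, predicate, features):
    """Build a negated non-verbal clause with a third person subject.

    The agreeing negative form stands sentence-finally, as in (12)a-(14).
    """
    return [subject, predicate, NEG_COPULA_3[features]]


if __name__ == "__main__":
    print(" ".join(negate_verbal(["Ayşe", "ki", "təʃreb", "meyn"], 1)))
    print(" ".join(negate_verbal(["tamel"], 0, mood="imperative")))
    print(" ".join(negate_nominal_3rd("zɣar", "orandʒi", "3PL")))
```

Passing the head position explicitly avoids building a parser into the sketch; it only encodes the placement rule stated in the text.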
(iii) Question Formation

Wh-questions occur in-situ in nominal sentences; for example, amma ‘where’
is the predicate in (15) and is in sentence final position. In verbal sentences
wh-questions are always pre-verbal as seen in (16) and (17)b. The examples
in (17) also illustrate that an affirmative sentence with SVO order is expressed
with SOV order when the object is a wh-question word. In fact, as wh-question
words are inherently focused elements, it is expected for them to occur pre-
verbally since this is the order observed with focused NPs.
(15) Ali amma-ye?

Ali where-3SG
‘Where is Ali ?’
(16) Ali amma məʃ-i

Ali where went-3SG
‘Where did Ali go to?’
(17) a. Ayşe kəllom tə-tbex ʃ̧orbiye.

Ayşe every=day 3SG-cook soup
‘Ayşe cooks soup every day.’
b. Ayşe kəllom ʃəne tə-tbex?

Ayşe every=day what 3SG-cook
‘What does Ayşe cook every day?’
When there are two wh-questions in a sentence, order becomes important.
As illustrated in (18)a and b, ande ‘who’ has to precede amma ‘where’,
and this is the only possible order.9 Again, the verb is in final position in
all cases.
(18) a. ande amma məʃ?
who where go.PST
‘Who went where?’
b. *amma ande məʃ?

where who go.PST
There is no special particle or morpheme to form yes-no questions. Rising intonation
differentiates declaratives from yes-no questions; hence, (19)b, where
the intonation rises sentence-finally, has a yes-no question interpretation.
(19) a. kən-t f-stanbul

COP-2SG.M in-Istanbul
‘You are in Istanbul.’
b. kən-t f-stanbul ↑
COP-2SG.M in-Istanbul
‘Are you in Istanbul?’
(iv) NP Structure and Nominal/Substantive Morphology

SA is a prepositional language, and adjectives follow the nouns as shown in (20)
and (21)a-b, respectively. In these respects SA exhibits head initial properties.
(20) məʃa kelp

for dog
‘for the dog.’
(21) a. kelp gbir

dog big
‘big dog.’
9 The wh-question ande ‘who’ also stands for ‘which’:

Ali ande bənt ihab.
Ali which girl love.3SG.M.
‘Which girl does Ali love?’
b. bənt gbir-e
girl big-F
‘the big girl.’
Adjectives agree with the gender of the noun they modify. Masculine gender
is unmarked; feminine gender is marked by the suffix -e as observed on the
adjective in (21)b.
Numerals and demonstratives precede the noun, as illustrated in (22) and
(23), respectively:
(22) arba habu təffa hamər

four item apple red
‘four (items of) red apples.’
(23) ina ala dʒam qaraf-tu10

I this (M) glass break.PST-1SG
‘I broke this glass.’
SA has no definite article. There are a few frozen expressions where the rem-
nants of a definite article are visible, such as in the expression for ‘welcome’
given below.
(24) bi-l-xer dʒi-to

in-the-goodness came-2PL
‘welcome’
Indefiniteness is expressed through the morpheme ma, which follows the noun
just like an adjective.11 As there is no definite article, an object clitic on the verb
is observed to serve the function of expressing the specificity or definiteness of
the direct object.
10 The other demonstrative is laga ‘that (M)’. Demonstratives agree in number and gender
with the noun; e.g. lala (F) kitab ‘that book’, lagu kita:ba ‘those books’.
11 The adjective follows the indefinite marker:
bənt ma koisi
girl a beautiful
‘a beautiful girl’
(25) a. Ali bənt ma ku idu:r

Ali girl a COP look=for.3SG
‘Ali is looking for a(any) girl.’
b. Ali bənt ma ku idu:r-a

Ali girl a COP look=for-3SG.F
‘Ali is looking for a (specific) girl.’
c. Ali bənt ku idu:r-a

Ali girl COP look=for-3SG.F
‘Ali is looking for the girl.’
d. Ali ku idu:r-a
Ali COP look=for-3SG.F
‘Ali is looking for her.’
The object in (25)a has an indefinite reading as it has the [N ma] structure
with no object clitic on the verb. However, when the object clitic -a appears on
the verb, as seen in (25)b, the object in the form of a [N ma] structure receives
an indefinite but specific reading. When ma is not present then the object is
interpreted as definite, as seen in (25)c. When there is no overt direct object
but the verb has the object clitic, then we get a pronominal object reading, as
shown in (25)d.
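The four readings in (25) depend on three yes/no properties of the clause, which can be laid out as a small decision function. The Python sketch below is an illustration only: the name object_reading and its parameters are hypothetical, and combinations that (25) does not discuss return None rather than a guessed reading.

```python
def object_reading(overt_noun, has_ma, has_clitic):
    """Return the reading of the direct object as described for (25).

    overt_noun: an object noun is present in the clause
    has_ma:     the noun is followed by the indefinite marker ma
    has_clitic: the verb carries an object clitic
    Combinations not covered by (25)a-d yield None.
    """
    if not overt_noun:
        # (25)d: no overt object, clitic on the verb
        return "pronominal" if has_clitic else None
    if has_ma:
        # (25)b with the clitic, (25)a without it
        return "indefinite, specific" if has_clitic else "indefinite"
    # (25)c: bare noun plus clitic
    return "definite" if has_clitic else None


if __name__ == "__main__":
    examples = {
        "(25)a": (True, True, False),
        "(25)b": (True, True, True),
        "(25)c": (True, False, True),
        "(25)d": (False, False, True),
    }
    for label, features in examples.items():
        print(label, object_reading(*features))
```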
If the direct object is masculine (e.g. ʃi ‘food’ in (26)), then the object clitic
agreeing with it surfaces as -u:
(26) a. Ahmad ʃi ayal-Ø

Ahmad food eat.PST-3SG.M
‘Ahmad ate food.’
b. Ahmad ayal-u.
Ahmad eat.PST-3SG.M
‘Ahmad ate it.’
The examples above also illustrate that SA has no case marking on subjects
and objects.
(v) Verbal Morphology

The perfective and imperfective verb forms fall into two different inflectional
paradigms. Subject agreement in number and gender surfaces as a verbal suffix
in the perfective, whereas subject agreement is marked through both prefixes
and suffixes in the imperfective, as shown in (27):
(27) ‘run’ (Akkuş 2014b)
           Perfective (V+suffix)   Imperfective (prefix+V+suffix)
1SG.M/F    faqas-tu                a-fqez
2SG.M      faqas-t                 tə-fqez
2SG.F      faqas-te                tə-fqəz-e
3SG.M      faqaz                   i-fqez
3SG.F      faqaz-e                 tə-fqez
1PL.M/F    faqaz-na                nə-fqez
2PL.M/F    faqas-to                tə-fqəz-o
3PL.M/F    faqaz-o                 ə-fqəz-o
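The contrast between the two paradigms in (27) can be checked mechanically by splitting each form at its hyphens. The Python sketch below is an illustration only: the names PARADIGM, split_affixes and agreement_slots are hypothetical, and the segmentation relies on the hyphenation given in the table rather than on any independent morphological analysis.

```python
# Forms of 'run' copied from table (27); keys are person/number(/gender) labels.
PARADIGM = {
    "1SG":   {"perfective": "faqas-tu", "imperfective": "a-fqez"},
    "2SG.M": {"perfective": "faqas-t",  "imperfective": "tə-fqez"},
    "2SG.F": {"perfective": "faqas-te", "imperfective": "tə-fqəz-e"},
    "3SG.M": {"perfective": "faqaz",    "imperfective": "i-fqez"},
    "3SG.F": {"perfective": "faqaz-e",  "imperfective": "tə-fqez"},
    "1PL":   {"perfective": "faqaz-na", "imperfective": "nə-fqez"},
    "2PL":   {"perfective": "faqas-to", "imperfective": "tə-fqəz-o"},
    "3PL":   {"perfective": "faqaz-o",  "imperfective": "ə-fqəz-o"},
}


def split_affixes(form, aspect):
    """Return (prefix, suffix) for a hyphen-segmented form from (27)."""
    parts = form.split("-")
    if aspect == "perfective":
        # shape: stem(-suffix); there is never a prefix
        return "", (parts[1] if len(parts) > 1 else "")
    # imperfective shape: prefix-stem(-suffix)
    return parts[0], (parts[2] if len(parts) > 2 else "")


def agreement_slots(aspect):
    """Map each paradigm cell to its (prefix, suffix) pair for one aspect."""
    return {cell: split_affixes(forms[aspect], aspect)
            for cell, forms in PARADIGM.items()}


if __name__ == "__main__":
    for aspect in ("perfective", "imperfective"):
        print(aspect)
        for cell, (prefix, suffix) in agreement_slots(aspect).items():
            print(f"  {cell:6} prefix={prefix or '-':3} suffix={suffix or '-'}")
```

Run over the table, the split shows an empty prefix slot throughout the perfective and a filled prefix slot throughout the imperfective, which matches the description preceding (27).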
(i) The templatic morphology of classical/standard Arabic is greatly reduced

in SA. Only passive structures are productively formed with the prefix ən-; no
by-phrases can be used with passive verbs.
(28) a. Ali qaraf mase

Ali break.PST.3SG table
‘Ali broke the table.’
b. mase ən-qaraf-e
table PASS-break.PST-3SG.F
‘The table is broken.’
Note that when the feminine direct object of the active sentence (28)a becomes
the subject of the passive structure (28)b, the verb now exhibits gender agree-
ment with the feminine subject through the suffix -e.
Morphological causative formation has been reduced to only a few verbs;
(29)b illustrates that the verb ‘cut’ is one of the verbs that retains its morpho-
logical causative form where the second consonant of the root is doubled.
(29) a. Fatma qad-e ra:sa

Fatma cut.PST-3SG.F hair.her
‘Fatma cut her hair.’
b. Fatma qat-t-e ra:sa məʃa quafor

Fatma cut.PST-CAUS-3SG.F hair.her to hair.dresser
‘Fatma had the hairdresser cut her hair.’
For most of the verbs, however, periphrastic means employing verbs such as
‘give’ and ‘make’ are used to express their causatives as shown in (30) and (31).
Note that the agent the causer acts upon is expressed in the form of a PP (i.e.
məʃa N) both in periphrastic causatives and in morphological causatives.
(30) imma məʃa Fatma ʃi adəd-u addil.

her.mom to Fatma food give.PST-3SG.M making
‘Her mother made Fatma cook.’
(31) Doxtor məʃa Ali ku isi f-iju sipor.

doctor to Ali COP make in-him sports
‘The doctor is making Ali do sports.’
(vi) Existential Sentences and Possessives

Both existential and possessive relations at the sentential level are expressed by
the same existential predicate ifi, as illustrated in (32)a and (32)b, respectively.
(32) a. fə e:ne ifi dʒalabma zəxar (existential)

in room exist some boys
‘There are some boys in the room.’
b. Ahmad ifə-l:u kitap (possessive)

Ahmad exist-3SG.M book
‘Ahmad has a book.’
3 Effects of Language Contact on SA
This preliminary survey of the basic properties of SA has shown that this variety
deviates from classical/standard Arabic significantly in a number of ways. SA
is not unique in this respect as other dialects of Arabic have also been noted to
differ notably from classical Arabic. Work done on Arabic dialects spoken in different
countries has revealed that these dialects exhibit a simplification in their
grammar, such as the loss of the dual form on verbs, adjectives and pronouns,
loss of case endings in nouns and adjectives, and loss of mood distinctions in
the verb, etc. (Kaye 2009: 562). SA follows suit, and exhibits the same simplifica-
tions as well as additional ones. Based on the limited data presented in the above
section, the following points stand out as some of the characterizing features
of SA:
a) There is no definite article in SA; remnants of the definite article are

observed only in certain frozen forms. The definite reading of a noun
derives from the absence of the indefinite marker ma and the grammati-
cal context.
b) Nominal/substantive morphology has been simplified. Nouns have just
singular and plural forms, and no longer a dual form. Nouns have no case
marking. Adjectives are inflected for number and gender agreement.
c) Verbal inflections and templatic morphology have been heavily reduced.
Loss of the dual category is not the only simplification on verbs; gender
agreement on the plural verb forms has been lost, too. Only second and
third person singular verb forms reflect gender agreement with the sub-
ject.
Passive voice is productively formed through morphological means,
that is, with the invariant prefix ən-, as was shown in (28). Causativization,
however, makes use of periphrastic means as the productive strategy;
the morphological strategy of consonant gemination survives only in the
lexicalized causative forms of a restricted set of verbs.
d) The simplification in the morphological system covers derivational mor-
phology, too. For example, instead of using an affix to derive an agentive
noun, analogous to the agentive -er in English, SA makes extensive use of
relative clauses which are introduced with the relativizer le:
(33) racel le ikri kitap

person REL write book
‘writer (lit: person who writes books).’
e) Varying constituent orders are observed. Some orders demonstrate that

SA displays head-final properties, such as the copula in non-verbal sen-
tences occurring in sentence final position, while others display head-
initial properties, such as the N+Adj. order.
f) In the phonological system, pharyngealized consonants (i.e. the emphatic
consonants) and the glottal stop seem to have been lost. Other conso-
nants such as [p], [ʒ], [tʃ] appear to have been introduced to the conso-
nant inventory. The vowels encountered in SA are [i e ə u o a]; length is
also observed but its phonemic status needs to be investigated.
g) There seem to be layers of lexical borrowing, as some of the basic vocabulary
items are non-Arabic, such as atsura ‘bird’, tʃotʃo ‘grandmother’, tʃowo
‘grandfather’, tornij ‘grandchild’, etc. Borrowings from Turkish constitute
a more easily identifiable layer, for example, orandʒi ‘student’ (instead of
the Arabic talebe), o:retmen ‘teacher’, o:de ‘room’, mase ‘table’, cam
‘glass’, etc. The other sources of lexical borrowing need to be investigated,
and this will necessitate knowledge of the varieties of Zazaki and Armenian
that have been spoken in the area.
Even in a survey as brief as this one, SA as a contact language with the above
designated features raises questions regarding the social and demographic
structure of the region through time. It is well-known that the Armenian and
Kurdish populations have existed in eastern Turkey for a very long time, though
Armenian is the earlier language spoken in the region. What is not exactly known
is when the Arabic speaking population or populations moved into the area.
One possible explanation for their presence may be the Arab conquests of
Mesopotamia, the Middle East and North Africa from the 7th century onward and the
invasions in the Abbasid era. The linguistic consequence of this expansion would
be the introduction of Arabic as the superimposed language. Thus, the other
existing languages spoken in the conquered lands, such as the different dialects
of Aramaic, Armenian, Persian, Coptic, etc. must all have come into contact
with Arabic, in varying degrees of intensity. Whether the Arabic speaking population
in Batman (Sason) can be traced back to this first wave of Arab conquests
or whether their presence in this area is due to some later migration from
Basra (the date of which we do not know), as claimed by the informants, needs
to be researched. In either case, we can assume that the first Arabic settlers
brought the social and geographical variety they spoke with them; the question
then is whether this earlier variety of Arabic can be traced or not.
Furthermore, if SA speakers in the past were exclusively Christians, as told
by our informants, then, who were these Christians? Were they Armenians or,
going back in history, could there have been Syriac/Aramaic speakers, who
lived in the area as well? Assuming Christian population(s) existed in the
Batman region before Arabic speakers appeared, they must then have been
exposed to Arabic over a long period of time, especially when Arabic became
the dominant language socially and/or politically, and thus learned this vari-
ety of Arabic. Or can the Christian SA speakers be traced back to Christian or
Jewish Arabic speakers moving into the region from Baghdad or other parts
of Iraq in an early period? Recall that both SA and the Christian and Jewish
dialects of Baghdad belong to the older qəltu-dialect group. In either scenario,
due to the cultural and linguistic diversity of the region, we can assume that
bilingualism emerged as a consequence of this prolonged social, political and
economic contact between communities speaking different languages, lasting
until today. Given the linguistic features of SA and the changes and simplifications
observed at different levels of grammar, it is valid to ask whether SA today
is just another regional Arabic vernacular or whether it has developed into a
distinct language. One hypothesis would be that
long-term, close contact with other languages in the area has led to borrowing,
language change and possibly to a certain degree of language mixing. If we
assume Arabic was the superstrate, we then need to determine what the substrate
languages have been, what layer(s) their presence is witnessed in and
what changes they have triggered or brought about. Another hypothesis could
be that the variety of Arabic spoken in the region, with time, developed into a
creole, which would mean that SA has now become a separate language. For
this hypothesis to be pursued, however, we would need a thorough analysis of
SA and also information on the properties and structural features of creoles,12
bearing in mind that drawing the line between a creole and a non-creole is no
simple matter. This alternative may appear to be a far-fetched speculation, but
it is probably one that is worth investigating, as there are two attested Arabic-based
creoles that have been identified, namely Nubi Arabic spoken in some regions
of Uganda and Kenya and Sudanese Creole Arabic (Juba).13
The major aim of this study has been to draw attention to the existence
of the unexplored and endangered Sason Arabic spoken in multilingual east-
ern Turkey and introduce its basic linguistic properties which reflect interest-
ing effects of language contact. We hope that this research will trigger interest
in the many issues and questions that call for further investigation.
12 Peter Bakker et al. (2011), after surveying a large set of data and comparing structural
properties which are analogous in terms of complexity, claim that creoles constitute a
typologically distinct category of languages. They argue that creoles are more similar to
one another than they are to other languages, and thus can be identified in terms of their
particular structural features.
Also see Bickerton (1981) for the twelve properties of creoles that he identified.
13 127 pidgin and creole languages are listed in the repertoire of Hancock (1977) given in
Romaine (1988).
References
Akkuş, Faruk. 2014a. Functional Categories and Phrase Structure of Sason Arabic. MA
Thesis, Boğaziçi University.
Akkuş, Faruk. 2014b. Clause Structure in Contact Contexts: the Case of Sason Arabic.
Talk given at Boğaziçi University.
Bakker, Peter, Aymeric Daval-Markussen, Mikael Parkvall & Ingo Plag. 2011. Creoles are
typologically distinct from non-creoles. In Bhatt, Parth and Tonjes Veenstra (eds.),
Creoles and Typology. Special Issue of the Journal of Pidgin and Creole Languages, 26:
1, 5-42. Amsterdam: John Benjamins.
Bickerton, Derek. 1981. Roots of Language. Ann Arbor: Karoma.
Jastrow, Otto. 2005. Anatolian Arabic. In K. Versteegh (ed.), Encyclopedia of Arabic
Language and Linguistics, vol. I: 86-96. Leiden: Brill.
Jastrow, Otto. 2006. Arabic dialects in Turkey: Towards a comparative typology. Türk
Dilleri Araştırmaları, 16: 153-164. Istanbul.
Kaye, Alan. 2009. Arabic. In Bernard Comrie (ed.), The World’s Major Languages (2nd
ed.), 560-578. London: Routledge.
Romaine, Suzanne. 1988. Pidgin and Creole Languages. Longman Linguistics Library.
Talay, S. 2011. Arabic Dialects of Mesopotamia. In Stefan Weninger (ed.), The Semitic
Languages, 909-920. De Gruyter Mouton.
Appendix
Map 15.1 Location of the province of Batman in eastern Turkey.
Map 15.2 Sason as the northernmost district of Batman.
Chapter 16
Language and Emergent Literacy in Svaneti

Kevin Tuite
1 Introduction
The Svan-speaking communities of Upper and Lower Svaneti have for centu-
ries been identified as – and identify themselves as – Georgians, even though
Svan speech is not mutually intelligible with Georgian. Depending on whom
you ask, or what sources you consult, Svan has 15,000, or 50,000, speakers, or
some number in between; is endangered, or is not; is a language, or merely a
dialect of Georgian; and its speakers are, or are not, a distinct ethnic group.
Svan is the outlier in the Kartvelian family, having probably separated from
the common ancestor in the Bronze Age. Svan shares the basic morphosyn-
tactic profile of Georgian – bipersonal verb, three series of tense-aspect-mood
paradigms, shifting case assignment by transitive and active intransitive verbs
(“split-ergativity”), a rich variety of dative-subject constructions, the gram-
matical category of “version” – but has very divergent vocabulary (Tuite 1997).
To give an impression of how impenetrable Svan sounds to Georgians from
elsewhere, here is an excerpt from a Svan folk poem with parallel Georgian
translation (Shanidze & Topuria 1939: 54)
Svan text                           Georgian translation
cxemæd miča ži xok’ida              tavisi mšvild-isari auɣia,
liz-ličedi č’ur xobina.             svla-c’asvla dauc’q’ia.
mešjæl mare mæg wešgd laxcwir,      meomari k’aci q’vela uk’an dast’ova,
sgwebin otčæš, txum, esogæn.        c’in gausc’ro, tavši moekca.

Gloss of Svan text
[bow.and.arrow:NOM his up he.has.taken
go-leave indeed he.has.begun
fighter man:NOM all:NOM behind he.left
before he.managed, head:DAT, he.stood.to.them]

Free translation of Svan text
‘He has taken up his bow and arrow,
He has set out.
He left all the warriors behind,
He took the lead, he stood at their head.’

The population of Svaneti in 2006 was 22889, of whom 14270 lived in Upper Svaneti
(Mestia Municipality) and 8619 in Lower Svaneti (Lentexi Municipality). By
way of comparison, the population was estimated at 15000 in 1886, 9533 of
whom lived in Upper Svaneti. One recent estimate of the number of Svan
speakers gives a total of 26120, 14709 of whom speak an Upper Svan dia-
lect (Lower Bal or Upper Bal), and the remainder (11411) speak Lower Svan
(Lashx, Lent’ex or Cholur dialect) (Tschantladse, Babluani & Fähnrich
2003: 12).
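The figures quoted above can be cross-checked with simple arithmetic. The following is a minimal sketch in Python: the numbers are those given in the text, while the variable names, the `share` helper, and the derived Lower Svaneti figure for 1886 are illustrative assumptions (the latter presumes that the 1886 estimate covers only Upper and Lower Svaneti).

```python
# Figures exactly as quoted in the text; all names are illustrative.
population_2006 = {"Upper Svaneti (Mestia)": 14270,
                   "Lower Svaneti (Lentexi)": 8619}
reported_total_2006 = 22889

speakers = {"Upper Svan (Lower Bal, Upper Bal)": 14709,
            "Lower Svan (Lashx, Lent'ex, Cholur)": 11411}
reported_total_speakers = 26120

estimate_1886 = 15000
upper_1886 = 9533


def share(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Do the regional subtotals add up to the reported totals?
totals_agree = (sum(population_2006.values()) == reported_total_2006
                and sum(speakers.values()) == reported_total_speakers)

# Assumption: the 1886 estimate covers only Upper and Lower Svaneti,
# so the Lower Svaneti figure is derived here by subtraction.
lower_1886 = estimate_1886 - upper_1886

print("subtotals consistent:", totals_agree)
print("Upper Svaneti share in 1886 (%):", share(upper_1886, estimate_1886))
print("Upper Svaneti share in 2006 (%):",
      share(population_2006["Upper Svaneti (Mestia)"], reported_total_2006))
```

On these figures the proportion of the population living in Upper Svaneti is nearly the same in 1886 and in 2006, at a little under two thirds.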
Accounts of the present-day situation of the Svan language diverge signifi-
cantly. An article in the October 2014 issue of National Geographic paints a
pessimistic picture: an 86-year-old from the remote village Adishi is said to be
“one of the few remaining fully fluent speakers of Svan”, and a 14-year-old is
quoted predicting that “the Svan language will disappear with my generation”
(Larmer 2014). The latest version of Ethnologue (Lewis, Simons & Fennig 2015)
credits Svan with 15000 speakers. Its vitality is evaluated at level 7 (shifting)
on the EGIDS (Expanded Graded Intergenerational Disruption Scale), which
indicates the assessment that the language “is not being transmitted to chil-
dren”. (By comparison, Mingrelian has 500K speakers & EGIDS rating 6a “vig-
orous”; Laz has 22K speakers & EGIDS rating 6b “threatened”). Gippert’s 2005
report on “Endangered Caucasian languages in Georgia” is more optimistic.
The size of the Svan speech community is estimated at 50,000, but Gippert
and his colleagues noted a high incidence of code-mixing and code-switching
with Georgian.
I have had the opportunity to observe Svan usage in Upper Svaneti (mostly
in the commune of Latali), Lower Svaneti (during fieldwork in 1997), and in
two communities of Svans who were relocated to lowland Georgia (Axali
Xaishi & Jandari/Lemshvanier). In Lower Svaneti, children were mostly spo-
ken to in Georgian, although I encountered a handful of older women in
one of the more remote villages who spoke little if any Georgian. In Latali,
children speak and are spoken to in Svan, but several people expressed con-
cern about the extent to which a full command of Svan is being passed on to
the youngest generation. One friend in his mid-40s explained that, whereas
he and others of his age learned Georgian only after acquiring Svan as their
mother tongue, the newest speakers appear to be Georgian-dominant. Svan
is still the principal language within compact ‘diaspora’ settlements of Svans,
many of which are composed of people from specific villages. Here as well I
took note of children speaking Svan; I also had the opportunity to observe an
instance of local conflict resolution, which took place mostly in Svan, although
some participants preferred to speak Georgian. On the whole, the Svan language
remains prevalent where Svans live compactly in homogenous communities,
but it is rarely heard when Svans move to the cities or migrate abroad in search
of work.1
2 The Emergence of Written Svan
In a celebrated and much-quoted passage from Giorgi Merčule’s Life of Grigol
Xandzteli (c. 950), Kartli, at first the name of an East-Georgian state, was
redefined as “the spacious country within which the liturgy is celebrated and all
prayers are performed in the Georgian language” (kartlad priadi kweq’anay
aɣiracxebis, romelsa-ca šina kartulita enita žami šeic’irvis da locvay q’oveli
aɣesrulebis). This larger territory, defined by a common liturgical language,
included many districts in western Transcaucasia where vernacular languages
other than Georgian were in use (Tuite 2008). By the 10th century, Svaneti was
a flourishing center of Georgian Orthodox church-building and icon-making.
Written documents and inscriptions from medieval Svaneti are in the Georgian
language only (Ingoroq’va 1941; Silogava 1986, 1988).
Little attention was paid to the local vernaculars spoken by Georgian
Orthodox communities, until Prince Vaxusht’i Bat’onishvili’s 1745 Description
of the Kingdom of Georgia. Since none of the vernaculars was used in writing,
they were classified “by ear” as either some form of Georgian, or a different
language. On the basis of a handful of words which phonetically resembled
their Georgian equivalents, Mingrelian was characterized by Vaxusht’i as
“degraded Georgian”. Svan, however, was simply a different language, as was
Abkhazian.
Mingrelians: “The great and prominent speak Georgian, although they
also have their own language, like degraded Georgian” (enit arian didni
da c’arčinebulni kartuli enita, aramed akwst tvisica ena, gana c’amqdari
kartulive . . .; 783).
Svans: “They have their own language, but know Georgian as well”
(ena tvisi akwst sak’utari, gana uc’q’ian kartulica; 788).
Abkhazians: “They have their own language, although the elites
know Georgian” (ena sak’utari tvisi akwst, aramed uc’q’ian c’arčinebulta
kartuli; 786).
1 According to Richard Bærug (pers. comm.), the language situation in the provincial capital
Mestia, which has become the center for a burgeoning tourist industry, differs somewhat
from that of the remaining Upper Svan villages. Georgian is widely used in everyday
communication, and some children seem to prefer it to Svan.
A major breakthrough was made by the explorer Johann Güldenstädt (1787),
who for the first time detected the Kartvelian affiliation of Svan on the basis
of crudely-transcribed word lists. This laid the groundwork for the systematic
description and comparison of the languages of the Caucasus, carried out with
increasing intensity in the middle and later 19th century.
The genetic affinities established by the new field of historical linguistics
were incorporated into the concept of Georgian national identity which came
to prominence in the latter half of the 19th century, most notably in Jakob
Gogebashvili’s highly-influential primers for schoolchildren. In his primary-
school textbook Bunebis k’ari (22nd ed, 1912: 496-513, 537-547), Kartvelian-
speaking Mingrelians and Svans are included in the Georgian people (eri),
but the West-Caucasian-speaking Abkhazians and Indo-Iranian-speaking
Ossetians are not.
With the exception of toponyms and personal names recorded in medieval
documents from Svaneti, the words and short phrases collected by Güldenstädt
and Klaproth during their expeditions to the Caucasus in the late 18th and
early 19th centuries were the earliest attestations of the Svan language in the
written medium. The first grammatical sketch of Svan was published by Rosen
in 1847 (Cagareli 1873: 78-80). Rosen’s Svan and Mingrelian examples are writ-
ten in Georgian script with a few additional characters; Laz – although closely
related to Mingrelian – was transcribed by Rosen in Arabic characters, as the
data were collected in Ottoman territory. By the 1860s, Peter Uslar was
collecting grammatical data from Svan speakers, some of which appeared
posthumously in Vol. 10 of the Sbornik materialov dlja opisanija mestnostej i plemën
Kavkaza (1890: V-LI). Although Uslar considered the Georgian alphabet partic-
ularly suited to the complex phonologies of Caucasian languages (Otčet 1864:
10), he devised a modified Cyrillic script, incorporating some Georgian letters.
One of the first appearances of the Uslar script was in an 1864 Svan primer
intended for use in schools to be opened by the Society for the Reestablish-
ment of Orthodox Christianity in the Caucasus [Obščestvo vosstanovlenija
pravoslavnogo xristianstva na Kavkaze], which had been founded 4 years ear-
lier (Otčet 1864: 11). Similar schoolbooks were prepared in Chechen, Abkhaz and
Ossetic (Savenko 2010). The Svan primer, entitled Lušnu anban “Svan alphabet”,
contains spelling exercises for teaching the script to children, prayers, a cat-
echism, and a trilingual glossary (Svan-Georgian-Russian).
At the time, the Svan peasantry was largely illiterate (Tepcov 1890: 64).
Knowledge of spoken Georgian was unevenly distributed by geography and
gender, reaching its peak among adult males in eastern and southern Svaneti
(Nizharadze 1964: 169-172). Most women were monolingual. Nonetheless,
the Cyrillic-based Svan literacy initiative was met with suspicion and
hostility, notably from Besarion Nizharadze, an Orthodox priest who had
been trained under the auspices of the Society.2 The distancing of written
Svan from written Georgian, and the intended use of Svan as a medium for
Christian instruction, however well-intentioned, was understandably per-
ceived by Georgian intellectuals as a step toward the dissolution of both Giorgi
Merchule’s Georgia (defined by a shared liturgical language), and Jakob
Gogebashvili’s Georgia (defined by linguistic affiliation to a single “mother
tongue”), in the large body of the Russian Empire.3
Svan materials published in the Sbornik materialov dlja opisanija mest-
nostej i plemën Kavkaza from 1890-1910, including a Russian-Svan dictionary,
were written in variants of the Uslar modified-Cyrillic alphabet (Gren 1890; Iv.
Nizharadze 1890; G. & I. Nizharadze 1894; I. Nizharadze 1910). Beginning in 1910,
a new journal appeared, Materialy po jafetičeskomu jazykoznaniju, directed by
N. Ya. Marr, who brought in a very different policy for the presentation of texts
in the non-literary Kartvelian languages. First, a modified Georgian script was
adopted for Laz, Mingrelian and Svan materials. Second, these texts were not
accompanied by translations. This practice was maintained throughout the
Soviet period: the four-volume Svan Prose Texts series and Svan Chrestomathy
contain no translations or glosses.4 Some linguists have surmised off the
record that Svan-Georgian (and also Mingrelian-Georgian) bilingual editions
and dictionaries were not produced in order to avoid giving the impression
that Svans and Mingrelians are ethnically distinct from (other) Georgians.
Whether or not this was in fact a motivating factor, the new text-presentation
policy had at least one significant consequence. If the preceding phase in
Svan literacy could be described as an opening outward, both by making
Svan-language texts available to the broader scholarly community, and
providing instructional materials to the Svans in their own language, the new phase
would appear to have turned Svan writing inward. The Svan chrestomathies
published from 1910 to the end of the Soviet era are accessible only to a small
circle of Kartvelologists with the necessary training to decode them without
an accompanying translation, and to the Svans themselves. The medium of
written Svan was limited to the transcription of oral literature and ethnographic
descriptions originally produced in that language, and was not to be used for any
of the functions already assumed by Georgian.
2 Resistance increased in response to the stepped-up Russification campaign that followed the
accession of Tsar Alexander III to the throne in 1881. Kiril Ianovski, director of the Caucasus
Educational District (Kavkazskij učebnyj okrug), imposed the obligatory teaching of Russian
in Georgian primary schools, and also declared that the medium of instruction in schools
in Mingrelia should be Mingrelian, rather than Georgian, which was removed from the
curriculum.
3 Among those reacting negatively to the Tsarist literacy initiative for Kartvelian minority
languages was the writer Vazha-Pshavela, in the 1902 poem Vin aris k’aci?: “He sows enmity
between Kartli, Imereti and Kakheti [names of Georgian provinces], creates special alphabets
for Mingrelia and Svaneti” (sak’utar anbans šeudgens samegrelos da svanetsa).
4 For some reason, Svan poetry was an exception to this rule, as shown by the Georgian
translations provided in the 1939 Shanidze-Topuria anthology, and the 12-volume Georgian
Folk Poetry collection.
The Svan orthography developed by Marr and his Georgian colleagues
aimed at phonetic precision. In addition to the obsolete Old Georgian let-
ters for [q] (<ჴ>) and [j] (<ჲ>), new characters were created to represent long
vowels (marked by a macron), [æ] (written as umlauted “a”) and [ə] (<ჷ>).
The linguists hesitated between “v” (<ვ>), “ü” (<ჳ>) and a special character
called “u-brjgu” (<Ⴓ̂>) to represent Svan [w], before settling on the last option.
Even before Marr’s new journal appeared, however, Gvedo Nak’an, a soldier
from Upper Svaneti serving in what is now Turkish territory, worked out a Svan
orthography of his own. Nak’an’s 1908 diary, published in the first volume of
the Svan Prose Texts series (Shanidze & Topuria 1939: 41-48), employed only the
letters available in the standard Modern Georgian alphabet. The schwa vowel
is simply omitted from the spelling, and [æ] is not distinguished from [a]. Long
vowels are occasionally written with double letters, but often not distinguished
from short vowels. The editors of the anthology “cleaned up” Nak’an’s spell-
ing to conform to their phonetically-precise orthographic standards for Upper
Bal Svan, but his original spellings can be found in the endnotes (Shanidze &
Topuria 1939: 459-462). Rather than being an approximative representation of
Svan speech cobbled together by a semi-literate writer, Nak’an’s orthography
is surprisingly adequate, as long as the reader understands the inner workings
of the language. The presence of the schwa vowel, for example, can almost
always be predicted from phonotactic constraints on consonant sequences.5
The raising of [a] to [æ] is provoked by an /i/ or /e/ in the following syllable in a
word’s underlying morphological structure. Georgian “v” is perfectly adequate
for representing Svan [w], since [v] and [w] are noncontrastive allophones.
Long vowels, which only occur in the Upper Bal and Lashx dialects, are not
predictable, but their semantic load is low. Very few pairs of words are distin-
guished only by vowel length.
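The relationship between the two spelling styles can be illustrated with a short Python sketch that reduces a phonetically precise transcription to a Nak’an-style underspecified spelling. It applies only the reductions described above (schwa omitted, [æ] written as [a], [w] written as "v", vowel length unmarked). It operates on the Latin transliteration used in this chapter rather than on Georgian script, and the function name `underspecify` is hypothetical; this is an approximation of the principle, not a reconstruction of Nak’an’s actual practice, which sometimes marked long vowels with double letters.

```python
import unicodedata

MACRON = "\u0304"     # combining macron: marks long vowels in the transliteration
DIAERESIS = "\u0308"  # combining diaeresis: "ä" is an alternative notation for [æ]


def underspecify(word: str) -> str:
    """Reduce a precise transliteration to an underspecified spelling.

    Illustrative assumption: every schwa is dropped, as in the 1908 diary.
    """
    # Decompose so that length and umlaut marks become separate code points,
    # while other diacritics (e.g. the caron of "ž", "č") stay untouched.
    decomposed = unicodedata.normalize("NFD", word)
    kept = []
    for ch in decomposed:
        if ch in (MACRON, DIAERESIS):
            continue            # long vowel -> short; "ä" -> "a"
        if ch == "ə":
            continue            # schwa is not written
        if ch == "æ":
            ch = "a"            # [æ] is not distinguished from [a]
        elif ch == "w":
            ch = "v"            # [w] is written with the letter for [v]
        kept.append(ch)
    return unicodedata.normalize("NFC", "".join(kept))


for form in ["daqəls", "mešjæl", "wešgd", "līziži"]:
    print(form, "->", underspecify(form))
```

A reader who knows the phonotactic and morphological regularities described above can recover most of the dropped information; only vowel length is genuinely lost.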
Thus, more than a century ago, two approaches to writing Svan were
independently devised: a phonetically exact but phonologically redundant system
developed by and for linguists; and a phonetically imprecise but semantically
adequate system created by and for Svan speakers. These are the two options
confronting the new generation of Svan writers today, as will be discussed in
the final section.
5 Schwa was also not written in the spelling of Svan place names in medieval manuscripts,
e.g. <lha> for Ləha, <pxt’ler> for Pxət’rer, etc. (Ingoroq’va 1941: 20, 129).
3 Debates over Svan Language, Identity and Literacy
The 1897 Russian Imperial census included the number of individuals (pop.
15756) using Svan speech (“narechie”), which was listed in the Kartvelian lan-
guage category along with Mingrelian. Svans were also counted separately, as
a subgroup of Georgians, in the 1926 USSR census (pop. 13218). They were not
counted in Soviet censuses from 1939 onward, nor are Svans recognized as a
distinct group in the post-Soviet Georgian republic. Interestingly, the Svans
were counted separately in the 2002 and 2010 Russian Federation censuses
(pop. 41 in 2002, 45 in 2010), and also included in the total count of Georgians.
The issue of whether the Svans represent a distinct national or ethnic
group is also tied up with the curious debate – difficult for many outsiders
to understand – about the status of their speech: is Svan a “language” or a
“dialect”? The proponents of the latter opinion (see, for example, Putkaradze
2002, 2003) operate with an exclusivist concept of language inherited partly
from Giorgi Merchule, who defined Georgia on the basis of a shared liturgi-
cal language; and the prominent 19th-century intellectuals Chavchavadze and
Gogebashvili, who attributed a single “mother tongue” (deda ena, also the title
of Gogebashvili’s best-known school primer) to the Georgian people. A less-
noticed predecessor is the Soviet definition of nationality: Each officially rec-
ognized national group had a single “native language” (rodnyj jazyk), which had
a written form, and which was used in at least the initial years of education.
Furthermore, each individual was ascribed a single native language, whether
or not they had equal or greater competence in other languages.
If one accepts these presuppositions, it follows that the identification
of Svan as a “language” would be tantamount to recognizing the Svans as a
nationality distinct from the Georgians. By identifying Svan and Mingrelian as
“dialects” – even though acknowledging that they are not mutually intelligible
with Georgian – Putkaradze and others who share his views assert that they
serve the same function as Georgian dialects in the accepted sense (such as
Pshavian, Tushetian or Gurian); that is, as nonliterary vernaculars vis-à-vis the
single literary language of the Georgian nation. As did Gogebashvili, they define
Georgianness on the basis of linguistic affiliation (speakers of Kartvelian lan-
guages) rather than a shared liturgical language. As one would expect, propo-
nents of this view frequently recall the Tsarist educational policy of the 1880s
and 1890s, and the simultaneous publication of Svan and Mingrelian texts in
Cyrillic script, to justify their conviction that Russia is trying to stir up separat-
ist sentiment among Mingrelians and Svans as part of a “divide and conquer”
strategy in the South Caucasus.
Also arousing the suspicions of the Svan-as-Georgian-dialect camp are
reports that translations of the Bible into Mingrelian and Svan are planned or
even underway. This has drawn the attention of the Georgian Orthodox church
hierarchy, which officially condemned the publication of religious texts in
Kartvelian languages other than Georgian. The former Georgian ombudsman
Sozar Subari drew heavy criticism – not least from the Church leadership –
for having voiced support in 2005 for a projected Svan-language version of the
New Testament (more on which later).
One prominent target of this camp is the European Charter for Regional or
Minority Languages, which has yet to be ratified by Georgia 15 years after it
joined the Council of Europe (Putkaradze, Dadiani, Sherozia 2010). The Charter
obliges participating states to promote the use of regional languages in education,
the justice system, public services, media and culture. As defined by the Charter,
however, the category of “regional or minority languages . . . does not include . . .
dialects of the official language(s) of the State”, further reinforcing the position of
those who refuse to acknowledge Svan as a language.
4 Present-Day Language Practice and Emergent Svan Literacy
Even as debate continues over the status and role of the Svan language, a new
manifestation of Svan literacy is emerging, which seeks to slip between the
Scylla of Georgian exclusivity, and the Charybdis of standardization – which
would entail the selection of one of the Svan dialects as the basis of the written
language, normalization of the orthography, decisions about what is or is not
“correct” Svan vocabulary and grammar, and the fixing of ground rules for the
creation of new words. More significantly, none of the contributors to contem-
porary Svan writing gives the slightest indication that their practice is in any
way incompatible with their identity as Georgians. In this concluding section,
I will present three recent initiatives in Svan writing.6
6 This is by no means an exhaustive list. Other instances of present-day written Svan usage
include blogs by local doctors on topics such as hepatitis and cancer; comments on
Svan-themed videos on YouTube; and posters announcing a skiing contest in Upper Svaneti.
(i) The Svan New Testament

A draft of the translation is presently in circulation, but there are no plans to my
knowledge to publish it. For understandable reasons, the participants in this
project prefer to remain anonymous for the time being. The translation is in a
variety of the Lower Bal dialect, with an orthographic style that adheres closely
to the transcription norms used by Georgian Kartvelologists – notably, with
respect to the representation of the schwa vowel, and the usage of “u-brjgu”
<ŭ> to represent /w/ (which most present-day Svans write with the Georgian
letter <v>).7 The draft contains numerous alternative renderings (in parenthe-
ses), and occasional citations from Georgian, Russian and German versions
of the New Testament. Here are two verses from the Gospel of St Matthew
(chap 1: 18-19):
<mateš läxenär, mänk’viš txŭim> (Matthew’s gospel, first head/chapter)
<18. i ieso krisdeš litŭene amži (lasŭ) atxŭid: miča di märiam iosebiš
ləq’dän lasŭ i mine ušxvarte liqdäld märiam čolɣanŭeli c’q’ilän
kunxenka.>
(And Jesus Christ’s birth thus (was) happened: His mother Mary was
Joseph’s betrothed and before their coming to each other, Mary has
become pregnant from the Holy Spirit)
<19. i ioseb, miča č’äš (leč’šəri), mac’vdi (mare) lasŭ, i made xek’vad eča
liušxe i ušdil ka lipšŭdes laxp’ire (//ka lipšŭded gŭi laxad).>
(And Joseph her husband (fiancé), was an upright (man), and did not
want to expose her, and he intended to secretly release her (//out release-
ADV heart come-to-him))
7 Written text is enclosed in angled brackets, with the following coding of languages and
scripts: <Svan in Georgian script>; <Svan in Latin script>; <Georgian in Georgian script>;
<Georgian in Latin script>. Phonological renderings are placed between /slanted bars/.
The Svan terms for “gospel” and “chapter” are calqued from Georgian saxareba
“joyful news” and tavi “head, chapter”, respectively. In verse 19, the translator
appears to hesitate between /la=x-p’ir-e/, employing a root borrowed from
Georgian a-p’ir-eb-s “intends”, and the more idiomatic /gwi la=x-a-d/ “heart
came to him”; and also between ascribing the role of husband or fiancé to
Joseph. For the rendering of New Testament terminology pertaining to Jewish
and pagan ritual, the translators drew upon the rich vocabulary referring to
various aspects of Svan folk religious practice, which are still very much a part of
everyday life in Svan villages. In the passage Matthew 23:18-19, for example, the
term “offering” (dōron) is translated by no fewer than three Svan equivalents:
/qid/ “gift”, /q’wiž/ “sacrificial animal” (especially its roasted liver), and /lemzir/
“consecrated” (used most often with reference to bread offerings). The holy
breads (toùs ártous tēs prothéseōs) mentioned in Mark 2:26, “which only the
priests were allowed to eat”, are described as /uc’onäš/ – literally, “unseeable” –
a term used by Svans to denote offering breads baked from special wheat flour,
which only household members are allowed to eat or even see. As equivalents
for the Greek terms for “altar” (thusiastērion, bōmos), the translators had no
Svan term readily to hand, since, in Svan ritual practice, offerings are held
up by the presenters (while facing eastwards), and not placed on a table. As
equivalents, the translators chose either /laqwmi/ “ritual site” (the term com-
monly used to designate Christian churches), or /ladbäši/, which denotes the
enclosed space adjoining a church where women baked bread for use as offer-
ings (Bardavelidze 1941: 15).
(ii) Six Young Authors in Search of an Orthography

My second example is a short-story competition for teen-age writers, spon-
sored by the Grand Ushba Hotel in Becho and its Norwegian director Richard
Bærug. The first competition took place in 2013, and a second the following
year. The submissions were evaluated by a jury of Svan native speakers
(including the poet Erekle Saghliani, who projects an image which is simultaneously
strongly Georgian and strongly Svan). Here is part of one of the
announcements of the 2013 competition on Facebook, written in Georgian script with
additional characters.
<lax si xi 12-xenka 18-teka ləzai (1995-2001 zäiži lətav), si ču ǰamiēda
monac’ileob axk’əda lušnu lit’erat’urä k’onk’urste 2013 zaisga! čvatīr
lušnud tavisupal tema – imvaiži – žicxändads – livadži, sgvebd līziži
mädei eǰk’älibži si maivai ǰalat’. [. . .]. nambuäl xek’ves lēsv 4-xanka 16
gverdteka mädei 2000-xanka 8000 sit’q’va – lēkvisg. īra sačukräl.
mačēne nambuälar ira ečeisga. drev: 1 mart’ 2013.>
(If you are 12 to 18 years old (born in the years 1995-2001), you can take
part in the Svan literature competition for 2013! Write in Svan on any
theme – whatever you would prefer – your desires, your successes or
whatever you like. [. . .] The story must be from 4 to 16 pages or 2000 to
8000 words. There will be gifts (prizes) for the best stories. Deadline: 1
March 2013.)
Not all comments posted on the contest’s Facebook page were positive: some
criticized “mistakes” in the use of Svan, or what they took to be unwarranted
code-mixing (Georgian loans in the announcement are marked by underlin-
ing). Another commentator cited the passage in the Georgian constitution
concerning the status of Georgian as sole official language. Other writers,
however, vigorously defended the competition (most of these comments were
posted in Georgian, but quite a few in Svan).8
Six young authors, aged 12 to 17, won prizes in the 2013 competition. The
texts composed by the prize winners, published in an anthology (Bærug 2013),
show interesting variation in orthographic style, since each author had to
work out his or her own norms for writing Svan. (This could also be said of the
authors of the texts accompanying the stories in the anthology, and the posters
and Facebook announcements promoting the contest). The most “authorita-
tive” models for writing Svan are the 20th-century anthologies compiled by
linguists, who, as noted earlier, aimed for a fairly explicit representation of
the pronunciation. Some of the young writers – especially 12-year-old Erek’le
and 14-year-old Mari – appear to have been influenced by the linguists’ ortho-
graphic norms, including their use of apostrophes to mark vowel syncopation
when a clitic is attached to the following word (e.g. <ž’eser> = /ž(i) eser/ “in
QUOT”). Two writers however devised a phonologically-based orthography
reminiscent of that used in the 1908 soldier’s diary. 17-year-old Jemal tended
not to write schwas in contexts where they were automatically inserted before
resonants (/x-a-k’pən-x/ “offers” written <xak’pnx>; /daqəls/ “goat-DAT” writ-
ten <daqls>), or otherwise predictable. On the other hand, schwa was usually
written when it functioned as the root vowel of a word (e.g. <ɣən> “festival”).9
One writer from the Lower Bal dialect area, 17-year-old Giorgi, employed
phonetically-precise spellings, but not necessarily those favored by the linguists.
8 Responding to previous comments criticizing the quality of the Svan used in the competi-
tion announcement, one user posted the following sarcastic remark in Svan: <si xochaamd
atdawy lushnud i echqaango axgacxad qa; konkurs> (You have such a good command of
Svan, so announce your own competition).
9 It is worth noting that Sopho and Jemal – the two authors who favored phonological
spellings – are the grandchildren of Goguca Xergiani, now 80 years old, who was one of the
pioneers of the newest phase of Svan writing. She is the author of a 2-volume collection of
Svan-language poetry and prose, “Maxvshi Baba”; vol 1, 1999; vol 2, 2004.
To represent the schwa vowel, Giorgi chose the Georgian letter “o” with an
umlaut sign, implying that he pronounced schwa with a degree of lip-rounding
(<löxins> = /ləxin-s/ “good times-DAT”; <daqöl> = /daqəl/ “goat”).10
The orthographic styles of the six prize-winners can be summarized by
comparing three parameters: (1) representation of schwa, (2) representation
of the glide [j], (3) use of apostrophes to signal vowel loss at the point of word
liaison:
                  (-1) underspecified      (0) middle ground          (+1) precise
schwa [ə]         [ə] usually omitted:     [ə] written:               [ə] = <ö>:
                  Sopho, Jemal             Mari, Salome, Erekle       Giorgi
yod [j]           [j] = <i>:               [j] = <y> most times:      [j] = <y> always:
                  Sopho, Jemal             Mari, Giorgi, Erekle       Salome
liaison with      no apostrophes:          apostrophes after ž(i):    apostrophes always:
vowel syncope     Sopho, Jemal, Giorgi     Mari                       Salome, Erekle
(iii) Intimate Literacy: Svan-Language Facebook Chat

As has been frequently noted, phone text-messaging and the social media have
contributed to the emergence of new genres of writing, as well as an intensi-
fication of written communication among many users. One seemingly para-
doxical feature of new-media communication is that it favors both the use of
widely-spread languages (English, in particular) and in-group-oriented linguis-
tic innovations (such as the rapidly-changing corpus of abbreviations, neolo-
gisms and cybercultural references inventoried on such sites as “Know Your
Meme” and “Urban Dictionary”). In terms of the contrastive directions of soci-
olinguistic evolution described by Thurston (1987), social-media communica-
tion is simultaneously exotero- and esotero-genic: oriented toward openness
and exclusion. Of the social media which have emerged in the new millen-
nium, I have focussed my attention on Facebook (FB). Users of Tumblr, Reddit,
4Chan and the like generally identify themselves by pseudonyms, rarely if ever
10 This vowel was described by the phonetician S. Zhghent’i (1949: 65-66) as a “delabialized
/u/” ([ɯ] or [ɤ]), which is how it sounds to me. One wonders if some Svan speakers are
manifesting the same trend toward rounded schwa as has occurred in some varieties of
European French.
meet face to face, and form thematically-centered discussion groups. Most
FB users, on the other hand, display their real names, and cluster into
mini-communities largely comprising individuals who are also acquainted with
each other off-line.
In Facebook communities comprising a Svan-speaker and his/her “friends”,
or Svaneti-based discussion groups and their members, the proportion of
Svan language use relative to Georgian varies considerably.11 Some commu-
nities appear to avoid Svan altogether, whereas in others Svan expressions
and interventions occur in almost every discussion. These electronically-
mediated exchanges among acquaintances could be characterized as a form
of intimate literacy. Like more traditional manifestations of intimate literacy,
such as diaries and personal letters, Svan-language FB chat tends toward
in-group-oriented opacity (from the standpoint of outsiders), spontane-
ity, code-switching, playfulness, and laxity with respect to norms for written
communication. FB Svan orthography, whether in Georgian, Latin (or occa-
sionally Russian) characters, resembles the system used by the soldier Nak’an
in 1908: limited to the letters in the standard alphabet, but fairly adequate in
phonological terms. Unlike its predecessors, though, Svan intimate literacy on
Facebook is multiparticipant, evolves in real time, and is not dependent on the
production of relatively durable artifacts such as letters or books. The frontier
between private and public communication is more porous, since outsiders
can “listen in” on the discussions on many FB pages – one is reminded of the
simultaneously closed and open nature of cellphone conversations in public
spaces. Here are two examples of recent Svan Facebook communication (per-
sonal names have been anonymized):
(a). The first example is a posting headed <talibani mulaxeli q’opila, icodit?>
(The Taliban is from Mulakh, didn’t you know?), accompanied by a photograph
of a man holding a gun. This elicited a sequence of joking comments in both
Georgian and Svan, including a Russian adjective as well. Some participants
switched codes within a single intervention.
AB: <o---s xoša xaǰeš> :)) (He really resembles O-)

EF: <namet’ani didi p’at’ivi xom araa talibanistvis>  (That is not excessive
respect for the Taliban)
11 My sample – which includes groups based in Lower as well as Upper Svaneti – is of course
biased toward people I know personally, or friends of friends.
CD: <o---s deesa, xadu k---alšaal isgiidraal>  (Not O-, rather he looks a lot
like the K- family) [reference to AB’s relatives]
EF: <k’rasni mlax xo iq’o da axla taliban . . . t’ašiii> :))) (There was [R: red]
Mulakh and now a Taliban . . . Applause :))))). [This refers to a previous
posting where the expression “Red Mulakh” appeared, albeit entirely in Svan].
CD: <čven “švania txvim” vart, is k’i ara, ra kvia> . . . :))) (We are “the head of
Svaneti”, this is not, what’s it called . . .)
EF: <uoiiiii uoiiiiii dedee> 
GH: <mulaxši tu mest’iaši imaleba> (Is he hiding in Mulakh or Mestia?)
[Mulakh and Mestia are neighboring communes in Upper Svaneti].
IJ: )))))))))))))
(b). A prayer for one’s brothers. The following text, also in Georgian script, was
posted by KL in early February 2014, at the time of the mid-winter torch festi-
val (Limp’æri). At this time, Svan men carry lit torches to their neighborhood
church, one for each male in the family, and pray for the peace and well-being
of the community’s menfolk.
<he ɣerbet, ǰgurags didab, atpišir mušvan mulump’ari, žaxirian limačd
merde, atasd atgen ladi xedvai korxanka lamp’ar kačes eči lizge lirde.
xoca paq’ esag mušvan mara čiesgi nensga, atasdu amgenenad, atasu
aǰhienax, mišgu laxvbas, L, K, D, Z, X, G, G . . .>
(O God, glory to St George, increase the numbers of Svan torch-bearers,
renowned elders. Raise up by thousands those who go out from their
houses today with torches, [increase] their lives and being. Set a good hat
upon all the Svan men among them, raise them up a thousandfold,
may you be numbered in thousands, my brothers [there follows a list
of names])
The posting was followed by two responses, one in Svan and one in Georgian:
MN: <ɣerbetu ǰamz ri> (May God bless you)

OP: dzma xar, --. genacvale šen .. (You are a brother, K--. You are dear to me.)
KL’s orthography ignores the distinction between long and short vowels, and
that between [æ] and [a]. He represents schwa with the Georgian letter “u”
(e.g. <ǰgurags> = /ǰgərǟgs/ “St-George-DAT”). MN, however, leaves a space in the
middle of the word <ǰamz ri> where schwa appears (/ǰamzəri/ “blesses you”).
5 Conclusion
As much as it might go against the ambitions of both those who proclaim
Georgian as the sole literary language for those who identify as Georgians, and
those who espouse the creation of a standardized Svan for use in education,
publication and administration, the young authors and Facebook-users seem
to have found a third way. Through their efforts, and those of many of their
contemporaries, a new Svan literacy is emerging which coexists with literacy
in Georgian and other languages (formerly Russian, and now increasingly
English). The new literacy has so far sidestepped the fraught issue of standard-
ization, which would favor one variety of Svan to the detriment of the others.12
Svan writing has assumed many of the same functions as Svan orality, as an
intimate register signalling identity, belonging and closeness. It is developing
as a medium for humor, prayer and the reinforcing of attachment. If there is a
message in this for us concerned outsiders – including those lowlanders who
believe they know better what the Svans need than do the Svans themselves – it
may well be that we should stand aside and let the young people decide what
forms and functions Svan literacy should assume. We should do what we can
to provide encouragement and resources, but above all we must not stand in
their way.
Acknowledgments
I wish to express my deepest thanks to those of my Svan friends and colleagues
who contributed directly or indirectly to this paper, and in particular, Medea
Saghliani; Nino, Leri and Levan Tserediani; and Rusudan Ioseliani. Thanks
also to Richard Bærug, Paata Bukhrashvili and Greg Anderson for helpful
comments; and to Gürkan Doğan and his colleagues for their generous and
intellectually stimulating hospitality during my stay at Ardahan University.
Iwæsu xarid!
12 Standardization is also not strictly necessary in the case of Svan, since the dialects are
largely mutually intelligible. The principal differences concern vowel length (which
as mentioned has little semantic load), umlaut, syncope and some aspects of morphology.
Selection of the Upper Bal dialect spoken in Mestia – the economic and administrative
center of Upper Svaneti – as the basis for a standard language would oblige speakers of
the Lower Bal and Lentekh dialects to learn to write long vowels which do not exist in
their vernaculars.
Bibliography
Abuladze, Ilia, chief ed. 1963. Grigol xandztelis cxovreba (The Life of Grigol of Xandzta).
Dzveli kartuli agiograpiuli lit’erat’uris dzeglebi, c’igni I (V-X ss.), 248-319. Tbilisi:
Mecniereba.
Bardavelidze, Vera. 1941. kartvelta udzvelesi sarc’munoebis istoriidan: ɣvtaeba barbar-
babar (From the history of the ancient beliefs of the Georgians: the deity Barbar-
Babar). Tbilisi: Mecniereba.
Bærug, Richard, ed. 2013. lušnu bopšre lit’erat’ura / svanuri sabavšvo lit’erat’ura/ Svan
youth literature. Tbilisi: Sulakauri Publishing.
Cagareli, A. 1873. O grammatičeskoj literature gruzinskago jazyka. Kritičeskij očerk.
Sanktpeterburg: Imperatorskaja Akademija Nauk.
Davitiani, A., V. Topuria and M. Kaldani, eds. 1957. svanuri p’rozauli t’ekst’ebi II: balskve-
mouri k’ilo (Svan prose texts, II: Lower Bal dialect). Tbilisi: Mecniereba.
Gippert, Jost. 2005. Endangered Caucasian languages in Georgia. Lessons from
Documented Endangered Languages, David K. Harrison, David S. Rood & Arienne
Dwyer, eds. Amsterdam 2008, 159-194.
Gogebashvili, Jakob. 1876/1976. deda ena, anu anbani da p’irveli sak’itxavi c’igni (Mother
tongue, or Alphabet and first reader). Tbilisi: Ganatleba.
Gogebashvili, Jakob. 1912/1976. bunebis k’ari. mesame da meotxe c’lisatvis (The Gate of
Nature. For third and fourth grade). Tbilisi: Ganatleba.
Gren, A. N. 1890. Svanetskie teksty. Sbornik materialov dlja opisanija mestnostej i plemën
Kavkaza X, Otdel 2: 76-160.
Güldenstädt, Johann Anton. 1787. Reisen durch Russland und im Caucasischen Gebürge.
Vols. I-II. St. Petersburg: Kayserliche Akademie der Wissenschaften.
Ingoroq’va, P’avle. 1941. svanetis saist’orio dzeglebi (The historical documents of
Svaneti). Tbilisi: Mecniereba.
Klaproth, Julius von. 1814b. Kaukasische Sprachen. Anhang zur Reise in den Kaukasus
und nach Georgien. Halle & Berlin: Buchhandlung des Hallischen Waisenhauses.
Larmer, Brook. 2014. Medieval mountain hideaway. National Geographic, October 2014,
78-99.
Lewis, M. Paul, Gary F. Simons, and Charles D. Fennig (eds.). 2015. Ethnologue:
Languages of the World, Eighteenth edition. Dallas: SIL International. Online ver-
sion: http://www.ethnologue.com.
Lušnu anban / Svanetskaja Azbuka (Svan alphabet) 1864. Tiflis: Tipografija Glavnago
Upravlenija Namestnika Kavkazskago.
Nizharadze, Besarion. 1962 (Vol I), 1964 (Vol II). ist’oriuli-etnograpiuli c’erilebi.
(Historical and ethnographic essays). Tbilisi: Tbilisi State University Press.
Nizharadze, G. & I. 1894. Svanetskie teksty. Sbornik materialov dlja opisanija mestnostej
i plemën Kavkaza XVIII: 91-132.
Nizharadze, Iv. 1890. Svanetskija skazki. Sbornik materialov dlja opisanija mestnostej i
plemën Kavkaza X, Otdel 2: 161-241.
Nizharadze, I. I. 1910. Russko-svanskij slovar’. Sbornik materialov dlja opisanija mest-
nostej i plemën Kavkaza XLI: 1-520.
Oniani, A., M. Kaldani and Al. Oniani, eds. 1979. svanuri p’rozauli t’ekst’ebi IV: lašxuri
k’ilo (Svan prose texts IV: Lashx dialect). Tbilisi: Mecniereba.
Otčet 1864. Otčet o sostojanija i deistvijax Otdela s 1859 do 1863. Zapiski Kavkazskago
Otdela Imperatorskago Russkago Geografičeskago Obščestva Knižka VI: 1-42.
Put’k’aradze, T’ariel. 2002. kartvelebi, zogadkartuli samc’ignobro ena da kartvelta
dialekt’ebi [The Georgians, the Common-Georgian literary language and the
Georgians’ dialects]. Kartveluri memk’vidreoba 6.187-203.
Put’k’aradze, T’ariel. 2003. monatesave enobriv erteulta k’valipik’aciis sak’itxisatvis
tanamedrove mecnierebaši [On the question of the qualification of related linguis-
tic units in contemporary science]. Kartveluri memk’vidreoba 7.206-227.
Put’k’aradze, T’ariel, Dadiani, Ek’a & Sherozia, Revaz. 2010. evrop’uli kartia regionuli an
umciresobis enis šesaxeb da sakartvelo (The European Charter for Regional or
Minority Languages and Georgia). Kutaisi: National Institute of Education.
Savenko, E. A. 2010. Iz istorii dejatel’nosti obščestva vosstanovlenija pravoslavnogo
xristianstva na Kavkaze (k 150-letniju osnovanija). Universitetskie čtenija 2010:
materialy naučno-metodičeskix čtenij Pjatigorskogo gosudarstvennogo
lingvističeskogo universiteta. Pjatigorsk, 15. pp 91-97.
Shanidze, A., Kaldani, M. & Ch’umburidze, Z., eds. 1978. svanuri enis krest’omatia.
(A chrestomathy of the Svan language). Tbilisi: Tbilisi University Press.
Shanidze, A., V. Topuria & M. Gujejiani, eds. 1939. svanuri p’oezia (Svan poetry). Tbilisi:
Mecniereba.
Shanidze, A. & V. Topuria, eds. 1939. svanuri p’rozauli t’ekst’ebi I: balszemouri k’ilo (Svan
prose texts, I: Upper Bal dialect). Tbilisi: Mecniereba.
Silogava, Valeri. 1986, 1988. svanetis c’erilobiti dzeglebi (X-XVIII ss). (The written docu-
ments of Svaneti, 10th-18th centuries). Tbilisi: Mecniereba.
Tepcov, V. Ja. 1890. Svanetija (Geografičeskij očerk). Sbornik materialov dlja opisanija
mestnostej i plemën Kavkaza X, Otdel 1: 1-68.
Thurston, William R. 1987. Processes of change in the languages of north-western New
Britain. Pacific linguistics. Series B no. 99. Canberra: Research School of Pacific
Studies, Australian National University.
Topuria, V. & Kaldani, M. eds. 1967. svanuri p’rozauli t’ekst’ebi III: lent’exuri k’ilo. (Svan
prose texts, III: Lent’ex dialect). Tbilisi: Mecniereba.
Tschantladse, Isa, Rusudan Babluani & Heinz Fähnrich. 2003. Tscholurswanisch-
Deutsches Verbenverzeichnis. Jena: Friedrich-Schiller-Universität.
Tuite, Kevin. 1997. Svan. (Languages of the World / Materials, vol. 139); München:
Lincom Europa.
Tuite, Kevin. 2008. The Rise and Fall and Revival of the Ibero-Caucasian Hypothesis.
Historiographia Linguistica, 35 #1; 23-82.
Vaxusht’i Bat’onishvili. Aɣc’era sameposa Sakartvelosa (Description of the Kingdom of
Georgia). Kartlis cxovreba, IV. (S. Q’auxchishvili, ed.) Tbilisi. Sabch’ota Sakartvelo,
1973.
Xergiani, Goguca. Maxvshi Baba; vol 1, 1999; vol 2, 2004.
Zhghent’i, Sergi. 1949. svanuri enis ponet’ik’is dziritadi sak’itxebi. (Principal questions of
Svan phonetics). Tbilisi: Mecniereba.
Chapter 17
The Internet as a Tool for Language Development
and Maintenance? The Case of Megrelian
Karina Vamling
1 Introduction
Urbanisation and globalisation, in combination with the increasing use
of information and communication technology (ICT), are processes that
have a profound influence on societies today. At the same time, we are wit-
nessing an increase in endangered languages. These processes, and in
particular, the widespread use of English and other major languages on
the Internet, are often assumed to pose a serious threat to the survival of
smaller languages (Crystal 2001). However, there is also another side to this
spread of information and easy access to ICT; namely, the potential use
of computer-mediated communication by smaller language communities
for the promotion and development of their languages. The increased use
of the Internet has opened up new communication channels. As Crystal
points out:
[. . .] the Net provides an identity which is no longer linked to a geographical
location. People can maintain a linguistic identity with their relatives,
friends, and colleagues, wherever they may be in the world. Whereas, tra-
ditionally, the geographical scattering of a community through migration
has been an important factor in the dissolution of its language, in future
this may no longer be the case. (Crystal 2002, 142)
The general focus of this paper is on the Internet as a potentially important
tool for the development and maintenance of smaller languages, especially
in geographically dispersed communities. As a case in point, the paper
discusses the increased availability and use of ICT by the Megrelians in
Western Georgia, its importance for the maintenance of the Megrelian lan-
guage, and how this changes the traditional diglossic situation (Vamling and
Tchantouria 2010).

2 Computer-Mediated Communication and Multilingual Speakers
There are many types of technologies for computer-mediated communication
(Thurlow and Mroczek 2012): real-time chats, email, instant messaging, discussion
forums, web pages, to mention but a few. Each differs according to certain
parameters, and it is to be expected that the interaction is more conversation-like
in some than in others. One such aspect is whether the interaction
is asynchronous or, as in a typical conversation, synchronous (Paolillo 2011).
Another feature is to what extent the communication is person-to-person
or directed to multiple recipients (where the linguistic repertoire is not fully
known). There is a large body of research on computer-mediated communica-
tion and its relation to speech in comparison with writing. Crystal (2001, 24-61)
discusses the issue under the notion of “netspeak”. In reviewing the speech/
writing dimension of computer-mediated communication, Baron (2008, 48)
notes:
[. . .] it resembled speech in that it was largely unedited; it contained many
first- and second-person pronouns; it commonly used present tense and
contractions; it was generally informal; and CMC language could be rude
or even obscene. At the same time, CMC looked like writing in that the
medium was durable, and participants commonly used a wide range of
vocabulary choices and complex syntax.
Multilingual speakers have a rich and complex repertoire of languages and
styles as a resource in communication. A special situation arises in online
communication when the choice is between a more prestigious variant that
is used in writing and a variant that is used in informal communication
(Warschauer et al. 2002). In a study of the use of standard German in comparison
with Swiss–German dialects in Internet Relay Chat (IRC), Siebenhaar
(2008) observed that a large part of the IRC communication was
in dialect rather than standard German and that each chatter tended to use
his/her own written dialect conventions. Paolillo (2011) poses the question
whether bilingual computer-mediated interaction is similar to ordinary
multi- or bilingual face-to-face conversation, especially with respect to choice
of code and code-switching; i.e., to what extent does the medium of interac-
tion affect the communication?
The spontaneous and active use of Megrelian in computer-mediated com-
munication is particularly important with respect to questions of maintenance
of the language, as this offers a new (and potentially growing) domain for the
use of the language:
[. . .] Internet having an incredible social impact, gives credence to the
attempts by minority language speakers and members of numerically
smaller cultures to use online resources to support cultural maintenance.
This might be done by creating culturally specific web-based content
[. . .] to strengthen a culture that feels under threat from social and lin-
guistic communities. (Green 2010, 186)
3 Case to be Discussed: Megrelian
Megrelian (Mingrelian) is a Kartvelian (South Caucasian) language, mainly
spoken in Western Georgia (Samegrelo and Abkhazia). The estimated number
of speakers of Megrelian is over 400,000 (Klimov 1999), but the language has no
literary standard. Megrelians are bi- or trilingual and use Georgian (or Russian)
as their literary language. Despite its fairly large number of speakers, Megrelian has
been classified by UNESCO as “definitely endangered” (Endangered Languages:
The Full List). The alternating use of Megrelian and Georgian constitutes a diglossic
situation (Wexler 1971) between two languages that are not mutually intelligible;
i.e. not dialects of one language. In the Georgian/Megrelian diglossic
situation, there is a rather clearly established functional division between the
use of Georgian in formal contexts/written form and the use of Megrelian in
informal contexts/oral form (Vamling and Tchantouria 2010).
Megrelian is closely related to Laz (or Chan), a Kartvelian language spoken
in Turkey and by small numbers in Georgia. There is some mutual intelligi-
bility between Megrelian and Laz, but it is made difficult as a result of heavy
Turkish influence on the Laz lexicon. Some researchers regard Megrelian and
Laz as dialects of the common ancestor language, Zan (Chikobava 1936). Other
terms refer to both Megrelian and Laz (Chan) as Kolkhian or the Megrelo-
Chan language.
Factors underlying the decreasing number of speakers of Megrelian are
urbanisation and forced migration as well as the absence of any central
institutional support for the language. The political development during the
post-Soviet period has split up the Megrelian population and they currently
live in communities on different sides of borders, hindering direct contact.
The situation has its roots in the ethnopolitical conflict in Abkhazia in 1993
(Chervonnaya 1994) when much of the Georgian population (approx. 250,000)
were forced to leave Abkhazia and take refuge in Georgian-controlled parts
of Western Georgia or in Russia. A majority of the forced Georgian migrants
from Abkhazia were speakers of Megrelian. The conflict is still unresolved, and
contacts across the administrative border have become increasingly difficult
due to the former Autonomous Republic of Abkhazia proclaiming its inde-
pendence from Georgia. Russia and Georgia have lacked diplomatic relations
since the Russo–Georgian war in 2008. Following the war, Abkhazia was recog-
nised as an independent state by Russia (2008). The administrative borderline
dividing Abkhazia and Georgia “proper” follows the Enguri River and thereby
divides the Megrelian speaking area.
3.1 Not a Minority Language

Megrelians do not identify as a minority, rather as Georgians, and therefore
they do not appear to strive as hard for linguistic rights as speakers of many
minor languages do in other settings around the world. This also relates to
how Megrelians view language endangerment and the need for institutional
support (Vamling and Tchantouria 2010). Because Megrelians consider them-
selves Georgians (but at the same time, Megrelians), the issue of Megrelian
identity has attracted some attention (Broers 2001; Svitzer 2012). A question
that has been debated extensively in Georgian society is whether Megrelian
is a language or a dialect and whether it would be appropriate to develop a
standardised orthography or if it should be left to a laissez-faire situation. This
concern of the Georgian public over the status of Megrelian has partly arisen
in the context of the discussion of the principles of the European Charter
for Regional or Minority Languages. Some politicians and scholars fear that
granting Megrelian the status of a regional language would fuel separatist
sentiments in the already conflict-ridden Georgia. The discussion over the
language/dialect issue and whether a standardised orthography should be
developed for the language has also been reflected in the scholarly debate
(Gvantseladze 2006).
As shown above, it is difficult to delimit “Megrelians” from Georgians in a
simple way. Persons referred to as Megrelians in this paper speak the Megrelian
language (along with Georgian and/or Russian), have their roots in Western
Georgia, and are typically recognised by family names ending in -ava, -ua,
and -ia.
3.2 Re-emergence of Megrelian

Research on the Megrelian language falls largely into two periods, an earlier
period up to the end of the 1930s and a later one. The following researchers of
the earlier period should be mentioned: Rosen (1845), Tsagareli (1880), Marr
(1910), Kipshidze (1914), Deeters (1930), Chikobava (1936) and Khubua (1937).
During a short period in the late 1920s and early 1930s the local Megrelian com-
munist party elite aimed at establishing an Autonomous Megrelian Region
and published newspapers in Megrelian. However, the plans were abruptly
stopped, as were the Megrelian newspapers (Gamsakhurdia 1989). From the
end of the 1930s until recent decades, very little research was published
on Megrelian for political reasons. However, several studies of Megrelian
have been published in the post-Soviet period. Among them are a grammar
sketch in English by Harris (1991), a Megrelian–Georgian dictionary in three
volumes (Kajaia 2001-2002), a recent Megrelian–Russian–Georgian dictionary
(Klimov and Kajaia 2013), a reading grammar of Megrelian–Laz in Georgian
(Danelia and Dundua 2006), grammatical descriptions of Megrelian (Kartozia
et al. 2010; Lomaia and Gersamia 2012a) and studies of Megrelian grammar by
the present author (Tchantouria and Vamling 2005; Vamling and Tchantouria
1991, 1993; Vamling 2005) and of the use of the language in different domains of
society (Vamling 2000; Vamling and Tchantouria 2010). Collections of Megrelian
folklore and tales have been published by Kipshidze (1914), Khubua (1937) and
later by others such as Lomaia and Gersamia (2012b).
4 Megrelian and Its Presence on the Internet
The increased use of the Internet has opened new communication channels.
Generally, the Internet is highly available in Georgia and particularly in Tbilisi
(IDFI 2013). Internet penetration in Georgia in 2014-2015 is estimated at
49% in a recent report by Freedom House (2015), which notes in its
comments: “Internet access and usage continues to grow rapidly in Georgia,
particularly as interest in connecting with friends through social-networking
sites has increased in recent years.”
For the Megrelians, this increased access to the Internet has meant new –
virtual – possibilities for contacts across borders. What impact has this had on
the use and development of the language? Megrelian is their natural choice
for communication at home and in informal contexts. To a large extent,
computer-mediated communication belongs to this sphere as well. However,
Megrelian has no standardised literary form as the language is not used in
printed media, education and administration. Despite this, a new domain
is emerging for the language in the form of the spontaneous and non-
standardised use of Megrelian on the Internet.
A preliminary study has been conducted of interactions in Megrelian online
forums such as Facebook and YouTube. In total, approximately 800 entries of
various lengths have been collected. In such interactions, it has been observed
that elements of Megrelian, Georgian, Russian and even Laz are often
combined. All three scripts, Georgian, Latin and Cyrillic, are used frequently, and
switching between languages is common.
4.1 Mixing of Codes and Code-Switching

It is generally not possible to determine a writer’s geographical area of origin solely on
the basis of the language or script used. Computers in Georgia have Georgian,
Russian and Latin keyboards preinstalled, whereas Georgian keyboards are
not always found on computers in Russia. One has to take into consideration
that members of the Megrelian community are generally trilingual and can
master all three scripts. This mixing of codes, both of languages and scripts,
is illustrated in comments to the video clip megruli simghera mengrelian
song (YouTube 2009). Comment (a) is in Megrelian, written in the Georgian
script. The comment dating from June 25, 2010, starts in Georgian in the Latin
script (b), and switches into Megrelian using the Russian script (c-d).1 The last
Megrelian comment (e) uses the Latin script.
(1a) ქუგალე სამარგალო შურს დო გურს . . .!!!!

kugale samargalo šurs do gurs
dear Samargalo soul.DAT and heart.DAT
‘Dear Samargalo, to (my) soul and heart.’
Megrelian in Georgian script. Sep. 2, 2010
(1b) gmerto damiloce chemi saqartvel!!!!!!

God.VOC bless.IMP my Georgia
‘Oh God bless my Georgia!’
Georgian in Latin script. June 25 2010
(1c) маргали ворек дзаламс миорс нина маргалури

margali vorek dzalams miors nina margaluri
Megrelian I am very much I like language Megrelian
‘I’m Megrelian. I like the Megrelian language very much.’
Megrelian in Russian script. Continued from above. June 25, 2010
1 The Georgian script does not distinguish small and capital letters. This principle is kept in the
glosses as well.
(1d) самаргалоши нина оре ак мута ре усквамури

samargaloši nina ore ak muta re uskvamuri
Samegrelo.GEN language is here nothing is most beautiful
‘The language of Samargalo is the most beautiful language.’
Megrelian in Russian script. Continued from above. June 25, 2010
(1e) martalo mangar re arzo martalo mangaro miort

true strong is all true very S1SG.O2PL.love.PRS
‘It’s truly strong. I like you (all) very much.’
Megrelian in Latin script. Aug 16, 2010
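The script labels attached to comments (1a)-(1e) can in principle be assigned mechanically when a larger collection of entries is annotated. What follows is only a minimal sketch in Python, not the procedure used in this study: the function names script_counts and dominant_script are hypothetical, and the approach simply assumes that the Unicode name of each letter begins with the name of its script (GEORGIAN, CYRILLIC or LATIN). It identifies the script only; deciding whether a comment is in Megrelian, Georgian or Russian still requires a reader who knows the languages.

```python
import unicodedata
from collections import Counter

# Scripts of interest in the Megrelian online material (an assumption of this sketch).
SCRIPTS = ("GEORGIAN", "CYRILLIC", "LATIN")


def script_counts(text):
    """Count the letters of a comment by script, ignoring digits, spaces and punctuation."""
    counts = Counter()
    for ch in text:
        if not ch.isalpha():
            continue
        # Unicode character names start with the script name,
        # e.g. "CYRILLIC SMALL LETTER EM" or "LATIN SMALL LETTER G".
        name = unicodedata.name(ch, "")
        for script in SCRIPTS:
            if name.startswith(script):
                counts[script] += 1
                break
        else:
            counts["OTHER"] += 1
    return counts


def dominant_script(text):
    """Return the most frequent script in a comment, or None if it contains no letters."""
    counts = script_counts(text)
    if not counts:
        return None
    return counts.most_common(1)[0][0]


if __name__ == "__main__":
    # Comments (1a), (1b) and (1c) from the examples above.
    for comment in (
        "ქუგალე სამარგალო შურს დო გურს",
        "gmerto damiloce chemi saqartvel!!!!!!",
        "маргали ворек дзаламс миорс нина маргалури",
    ):
        print(dominant_script(comment), "|", comment)
```

Applied to comments (1a), (1b) and (1c), the sketch labels them GEORGIAN, LATIN and CYRILLIC respectively. Because script_counts keeps the full tally, comments that mix scripts within a single entry can be flagged rather than forced into one category.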
The following example (2) represents comments to a song in Megrelian on
YouTube and shows several cases of code-switching into Russian (R) and two
cases (2g-h) of Georgian (G) words in the Megrelian text (YouTube 2011). All
three languages are written in the Latin script and in non-standard forms. For
instance, the Russian words tancuim ‘we dance’ and vaafsheee ‘in general’ correspond
to the standard forms танцуем tancuem and вообще voobšče. The frequent
use of smileys and visually prolonged words mimics exclamations, which
contributes to the conversation-like impression of the passage.
(2a) aeee tancuim vseeeeeeee (R) :D:D:D:D

(interj.) dance.1PL.PRS all
‘Hey, we are all dancing.’
(2b) gela karoche (R) skan kampania gashinee jimaa :D:D:D

Gela (filler word) your company 2SG.get.FUT brother
‘Gela, listen, my brother, you will find your company.’
(2c) gelash kampanias xolo tena ibirioo?: D:D::D

Gela.GEN party.DAT also that sing.2SG.AOR.Q
‘Did you also sing that (song) at Gela’s party?’
(2d) vaaaaaaaa saxoll jimaa sagoool margalef

(interj.) great brother great Megrelian.PL
‘Oh, well done, brother – great Megrelians!’
(2e) mangar antikvar xalx voret quggalee brat (R)

good old people 1PL.be.PRS my dear brother
‘We are a good ancient people, my dear brother.’
(2f) quggaleeeeeeeeeeeeeeeeeeeeeeeeeee mangar koch req!! sagollll!!!

my dear good man you are great
‘Great, you are a good man, my dear.’
(2g) leqsooo si vaafsheee prosta (R) chiche miork

Leqso.VOC you in general just a little S1SG.O2SG.love.PRS
ra (G) jima
(filler) brother
‘Leqso you just (unfinished sentence), I like you, you know, my brother.’
(2h) :D:D:D:D:D:D Genacvaleeeeeeeeeeeeeeeeeeeeeet (G)

my dear (PL)
‘my dear.’
It is also the case that different languages may be used within the same thread
of the conversation (FB margaluri nina). In thread 3, the person “Z” opens the
conversation in Megrelian and continues in Georgian. The person “H” answers
in Megrelian, and then “Z” closes the short thread in Georgian.
(3a) Z: ჯგირ რე. ბრავო! (M)

jgir re Bravo!
good is bravo!
‘It’s good. Bravo!’
(3b) Z: თარგმანიც მაგარია ორიგინალივით! (G)

targmanic magaria originalivit
translation.also great.is original.like
‘The translation is also great, like the original.’
(3c) H: ჩხე ჩილამურით ეფშა . . . (M)

chkhe chilamurit epsha
warm tear.Instr full
‘Full of warm tears’
(3d) Z: მაგარია! (G)

magaria
great.is
‘It’s great!’
4.2 Megrelian on Facebook

One of the uses of Megrelian on the Internet occurs in social media such as
Facebook, Twitter and Russian Vkontakte (vk.com), Odnoklassniki
(www.odnoklassniki.ru), Moymir (my.mail.ru), forum groups, chats, films and comments
on YouTube, Rutube (rutube.ru) and other video sites. Social media is
a natural arena for the younger generation and functions as a complement in
maintaining community networks. The largest public Facebook communi-
ties with a Megrelian profile are margaluri nina–megruli ena, (The Megrelian
language [in Megrelian]–The Megrelian language [in Georgian]) with 18,839
followers (October 22, 2015) and kolkhuri tsentri (Kolchian Center) with 11,359
followers. A community page that has almost tripled its number of followers
in one year is megruli poeziiz shedevrebi (Masterpieces of Megrelian poetry).
In this paper, we look closer at one Facebook page, margaluri nina (Megrelian
language), with 2,649 followers (October 2015), showing an over 80% increase
of followers in one year.
4.2.1 Facebook “margaluri nina” (Megrelian Language)

What does the Facebook network “margaluri nina” (Megrelian language) look
like? At the time of this study, the page had 1,490 “friends” or members (August
2014). As noted above, the Megrelian community has been subject to pressure
and extensive migration in the post-Soviet period, and one question to address
is, therefore, how this is reflected in the online network and to what extent
the network seems to bridge a geographical division. Is this a platform for
communication in a scattered community? Presentations on Facebook often include
some geographical information, but as this is not always shown, it is impos-
sible to map the full information. Studying the choice of writing systems used
on the basis of available keyboards could be one source, but that would not be
a reliable indication, as explained above. However, it emerges that among the
113 persons who were actively posting and commenting on the page, there was
a rather large group from the city of Zugdidi in Samegrelo, Western Georgia
(15 “friends”). Among the active “friends”, eight showed that they have their
roots in Abkhazia, and ten “friends” resided abroad. Several other locations
were indicated. Without having access to the full geographic coverage, the
impression is that the focal area is Zugdidi in Samegrelo with a rather wide
spread within and outside of Georgia.
Bearing in mind that speakers of Megrelian use Georgian in written
communication and Megrelian (and Georgian) in oral communication, the question
remains: which language is chosen in the conversation-like social media
situation? When all verbal comments on postings on the “margaluri nina” page,
approximately 300 in total, were analysed, it emerged that only a small number
of the 1,490 “friends” had been active. Moreover, 18% of the 300 comments were
made by outsiders, persons who were not “friends” of “margaluri nina”. In such
cases, comments were almost always made in Georgian. Of the remaining 246
comments, i.e. comments made by “friends” (76 persons), approximately 60% were
written in Megrelian and 40% in Georgian. Furthermore, the majority of the
active “friends” are Megrelians according to the features mentioned above.
Three quarters of the active “friends” have surnames ending in -ua, -ava or -ia.
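The proportions reported in this paragraph can be checked with a few lines of arithmetic. The following Python sketch is only an illustration: the function name comment_breakdown and the rounding to whole comments are assumptions introduced here, and the inputs (approximately 300 comments, 18% by outsiders, roughly 60% of the remainder in Megrelian) are the approximate values given in the text.

```python
def comment_breakdown(total, outsider_pct, megrelian_pct):
    """Split a comment count into outsider, Megrelian and Georgian comments.

    Percentages are whole numbers. Results are rounded to whole comments,
    which is an assumption of this sketch, not a figure from the study.
    """
    outsiders = round(total * outsider_pct / 100)
    friends = total - outsiders
    megrelian = round(friends * megrelian_pct / 100)
    georgian = friends - megrelian
    return {
        "outsiders": outsiders,
        "friends": friends,
        "megrelian": megrelian,
        "georgian": georgian,
    }


# Approximate figures from the text: 300 comments, 18% by outsiders,
# about 60% of the comments by "friends" written in Megrelian.
result = comment_breakdown(300, 18, 60)
print(result)
```

With these inputs the remainder is 246 comments by “friends”, which matches the figure given in the text. The Megrelian and Georgian counts are indicative only, since the shares are reported as approximate.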
It must also be taken into account that the conversation on the Facebook
page “margaluri nina” is public and not person-to-person in the ordinary
sense. The participants are therefore aware that comments written in Megrelian
are understood only by those who are fluent in Megrelian, whereas comments
written in Georgian will be understood by all, whether they have a Megrelian
background or not.
4.3 Internet as a Depository

Webpages publishing Megrelian online materials represent a more static form
of communication, but they are also of interest in this context as they rep-
resent one aspect of the Megrelian presence on the Internet, although such
materials more closely resemble printed publications. There is also a connec-
tion in that Megrelian publications – online or offline – often are a topic dis-
cussed in social media.
There is a large body of folkloristic material of various kinds, with songs and
oral poetry, as well as attempts to teach the Megrelian language using
various forms of video materials. One of the most popular YouTube channels
is “margaluri nina – megruli ena – Megrelian Language”, which has over
3,300,500 views (25 October 2015). The interest in the channel has increased
approximately 1.5 times in the last year (www.youtube.com/user/1986Giga/videos).
Another popular YouTube channel is “zugdiduri”, with 5,920,600 views (in
October 2015) and a mixture of video clips in Georgian, Megrelian and even
Laz (www.youtube.com/user/zugdiduri/featured). The Facebook page
“margaluri literatura” (Megrelian literature) posts and discusses literature in
Megrelian, often in a combination of both oral and written forms
(www.facebook.com/profile.php?id=100005462970091). The following comments were
posted in response to the news on “margaluri literatura” of a new Megrelian
publication:
(4) X: – cignis maghaziebs ivapuno qoichquda? (M)

‘Do you know, will it be on sale in bookshops?’
I: – gexarit patoni . . . ma iro axlos vipiq (M)

‘My dear Sir, I will always be nearby.’
Z: – ქო იჸუაფ, დო ხოლო გიაჸუნუ შხვა ლიტერატურა მარგალურო. (M)
– ko, iɁuap, do xolo giaɁunu šxva lit’erat’ura margaluro.
‘Yes, it will, and there will be more literature in Megrelian.’
M: – სამწუხაროდ “უტყდებათ” მეგრულად საუბარი, ვაი, ასეთ მეგრელებს რა ვუთხარი, ტუტუცები არიან უბრალოდ და შეუგნებელნი. (G)
– samc’uxarod “ut’q’debat” megrulad saubari, vai, aset megrelebs
ra vutxari, t’ut’ucebi arian ubralod da šeugnebelni
‘Unfortunately, they are “ashamed” of speaking Megrelian. Ah, such
Megrelians, what shall I say to them? They are foolish and irresponsible.’
Finally, we note the educational and cultural enterprise of setting up Wikipedia
in Megrelian, working in tandem with the Megrelian Wikipedia Facebook page. So
far, the Megrelian Wikipedia includes over 5,300 article entries (October 2015),
and the number of articles has increased by approximately 13% during the last
year. The popularity of the Megrelian Wikipedia Facebook page also increased
by 28% in 2015.
5 Concluding Remarks
The general observation that can be made on the basis of this limited study
is that the presence of Megrelian on the Internet is quite diverse. Megrelian
is found in social media, in video clips and in various (private) initiatives to
create online publications of literature, folkloristic and language materials,
often used in different combinations (i.e. Megrelian videos and online
publications are promoted and discussed in social media). Moreover, the interest
in Internet communities and web pages with a Megrelian profile seems to be
increasing quite substantially. Social media is a natural arena for the younger
generation, and it functions as a complement in maintaining growing, often
transnational, community networks. In this sense, the Internet functions as a
tool for the development of the language.
Social media, with its conversation-like mode of communication, provides
a situation that is close to casual oral communication, one of the main
domains of the use of Megrelian. The observed use of Megrelian in chats and
social media is thus a challenge to the written/oral divide in the
Georgian/Megrelian diglossia, and a sign that this divide may be in the process
of changing. The Internet and CMC may create and maintain a common arena of
Megrelian communication to strengthen cohesion and thereby contribute to the
maintenance of the language.
Abbreviations
AOR aorist, CMC computer-mediated communication, DAT dative, FUT future,
G Georgian, GEN genitive, ICT information and communication technology,
IMP imperative, M Megrelian, PL plural, PRS present, Q question marker,
R Russian, SG singular, VOC vocative.
References
Baron, Naomi S. 2008. Always On: Language in an Online and Mobile World. Oxford:
Oxford University Press.
Broers, Laurence. 2001. “Who are the Mingrelians? Language, Identity and Politics in
Western Georgia.” Paper presented at the Sixth Annual Convention of the
Association for the Study of Nationalities 2001. Accessed September 14, 2016. http://
georgica.tsu.edu.ge/?p=319.
Chervonnaya, Svetlana. 1994. Conflict in the Caucasus. Georgia, Abkhazia and the
Russian Shadow. London: Gothic Image Publications.
Chikobava, Arnold. 1936. č’anuris gramat’ik’uli analizi. Tbilisi: ssrk’ mecnierebata
ak’ademiis sakartvelos pilialis gamocema.
Crystal, David. 2001. Language and the Internet. Cambridge: Cambridge University
Press.
Crystal, David. 2002. Language Death. Cambridge: Cambridge University Press.
Danelia, Nana and Inga Dundua. 2006. k’olxuri (megrul-lazuri) ena. Tbilisi: Universali.
Deeters, Gerhard. 1930. Das kharthwelische Verbum. Vergleichende Darstellung des
Verbalbaus der südkaukasischen Sprachen, Leipzig.
“Endangered languages: The Full List.” The Guardian Datablog. Accessed January 29,
2013. www.guardian.co.uk/news/datablog/2011/apr/15/language-extinct-endangered.
Freedom House. 2015. “Georgia.” Accessed October 30, 2015.
https://freedomhouse.org/report/freedom-net/2015/georgia
Gamsakhurdia, Zviad. 1989. “samegrelos sak’itxi”. isak’i žvania da samegrelos
“avt’onomia”. Lit’erat’uruli sakartvelo, 3 Nov: 6-8.
Green, Leila. 2010. The Internet: An introduction to New Media. New York: Berg.
Gvantseladze, Teimuraz. 2006. enisa da dialekt’is sak’itxi kartvelologiaši. Tbilisi:
universali.
Harris, Alice C. 1991. “Mingrelian.” In: The Indigenous Languages of the Caucasus. Vol. 1,
The Kartvelian Languages, 313-394. Delmar N.Y.: Caravan Books.
IDFI. 2013. “Statistics on Internet Users in Georgia.” Institute for Development of
Freedom of Information (ms.), http://www.idfi.ge/?cat=news&topic=384&lang=en.
Kajaia, Otari. 2001-2002. megrul-kartuli leksik’oni. Vol. 1-3, Tbilisi: Nekeri.
Kartozia, Guram, Rusudan Gersamia, Maia Lomia and Taia Tskhadaia. 2010. megrulis
lingvist’uri analizi. Tbilisi: meridiani.
Khubua, Makar. 1937. megruli t’ekst’ebi. Tbilisi: sakartvelos ssr mecnierebata ak’ademiis
gamomcemloba.
Kipshidze, Iosif. 1914. Grammatika mingrelskago (iverskago) jazyka s xrestomatieju i
slovarem. Sankt-Peterburg.
Klimov, Georgii. 1999. “Mingrelskij jazyk.” In: Kavkazskie jazyki, 52-59. Moscow:
Akademia.
Klimov, Georgii and Otar Kajaia. 2013. Megrelo-russko-gruzinskij slovar’. Moskva:
Govorun.
Lomaia, Maia and Rusudan Gersamia. 2012a. xaztašoris morpemuli glosireba. Tbilisi:
Ilias saxelmc’ipo universit’et’i. nac’ili/part 1.
Lomaia, Maia and Rusudan Gersamia. 2012b. megruli t’ekst’ebi. Tbilisi: Ilias saxelmc’ipo
universit’et’i. nac’ili/part 2.
Marr, Nikolai. 1910. Grammatika čanskago (lazskago) jazyka s xrestomatieju i slovarem.
Sankt-Peterburg.
Paolillo, John. 2011. “‘Conversational’ Codeswitching on Usenet and Internet Relay
Chat.” Language@Internet, 8,
http://www.languageatinternet.org/articles/2011/Paolillo.
Rosen, Georg. 1845. Über das Mingrelische, Suanische und Abchasische (Vorgelegt in der
Akademie der Wissenschaften am 31 Januar 1845).
Siebenhaar, Beat. 2008. “Quantitative Approaches to Linguistic Variation in IRC:
Implications for Qualitative Research.” Language@Internet, 5, article 4 (2008),
www.languageatinternet.org/articles/2008/1615.
Svitzer, Bobby. 2012. The Peculiar Case of the Megrelians. Representation and Identity
Negotiation in Post-Soviet Georgia. MA thesis, Dept. of Global Political Studies,
Malmö University.
Tchantouria, Revaz and Karina Vamling. 2005. “Basic verb frequency in Megrelian.”
Working Papers 51, Department of Linguistics, 199-207. Lund: Lund University.
Thurlow, Crispin and Kristine Mroczek. 2012. Digital Discourse: Language in the New
Media. Oxford Scholarship Online. DOI:10.1093/acprof:oso/9780199795437.001.0001
Tsagareli, Alexander. 1880. Mingrelskie etjudy. vypusk I i II, Sankt-Peterburg.
Vamling, Karina. 2000. “Language Use and Attitudes among Megrelians.” Analysis of
Current Events. Slavic & East European Studies. Baylor University. Vol. 12: 5-6,
September 2000: 9-11.
Vamling, Karina. 2005. “Sentences with Double Subordinators – namda and -ni in
Megrelian.” Haptačahaptaitiš. Festschrift for Fridrik Thordarson, edited by Dag Haug
and Eirik Welo, 319-328. Oslo: Oslo University Press.
Vamling, Karina and Revaz Tchantouria. 1991. “Complement clauses in Megrelian.”
Studia Linguistica 48 (1/2): 71-89.
Vamling, Karina and Revaz Tchantouria. 1993. “On subordinate clauses in Megrelian.”
In The internal structure of adverbial clauses, edited by Kees Hengeveld. Eurotyp
Working Papers V: 67-86.
Vamling, Karina and Revaz Tchantouria. 2010. “Language use and language attitudes
among Megrelians.” In Language, History and Cultural Identities in the Caucasus,
Caucasus Studies 2, edited by Karina Vamling. Papers from the conference, June
17-20 2005, School of International Migration and Ethnic Relations, 81-92. Malmö:
Malmö University.
Warschauer, Mark, Ghada R. El Said and Ayman G Zohry. 2002. “Language Choice
Online: Globalization and Identity in Egypt.” Journal of Computer-Mediated
Communication, 7. doi: 10.1111/j.1083-6101.2002.tb00157.x
Wexler, Paul. 1971. “Diglossia, language standardization and purism. Parameters for a
typology of literary languages.” Lingua 27: 330-354.
YouTube. 2009. “megruli simgera mengrelian song.” Accessed October 25, 2015. https://
www.youtube.com/all_comments?threaded=1&v=CYRo1CmCWX8
YouTube. 2011. “mengrelian song chqim chonguri megruli simgera chemi chonguri
margaluri chqimi chonguri.” Accessed October 25, 2015. https://youtu.be/ZzNm5jX86n8
Internet Sources
Kolkhuri tsentri: https://www.facebook.com/kolkhikolxi

Margaluri literatura: www.facebook.com/profile.php?id=100005462970091
Margaluri nina: www.facebook.com/megrelianlanguage/info
Margaluri nina – megruli ena – Megrelian Language (Kolkhi TV): www.youtube.com/
user/1986Giga/videos
Megrelian Wikipedia: https://xmf.wikipedia.org/wiki/მარგალური_ვიკიპედია
Megrelian Wikipedia Facebook: https://www.facebook.com/xmfwikipedia
Megruli poeziis shedevrebi: https://www.facebook.com/მეგრული-პოეზიის-
შედებრები-246805548845870/info/?tab=page_info
Moy mir: http://my.mail.ru
Odnoklassniki: http://www.odnoklassniki.ru/
Rutube: http://rutube.ru/
Vkontakte: http://vk.com/
Zugdiduri: www.youtube.com/user/zugdiduri/featured
Chapter 18
Linguistic Topography and Language Survival

George van Driem
A number of heterogeneous factors determine the survival and death of
languages. At Ardahan in 2014, I coined the term linguistic topography to denote
the sociolinguistic situation of endangered languages in terms of the diverse
factors which determine a language’s prospects for extinction or survival.1 The
notion of linguistic topography is inspired by August Schleicher and Salikoko
Mufwene and opposed to a distinct and, as I shall argue here, complementary
approach to language, of which I am a proponent, inspired by Friedrich Max
Müller. Charting the linguistic topography of any particular language embod-
ies an attempt to distinguish, analyse and quantify the heterogeneous factors
which determine the propensity of that language at any given time in its his-
tory to thrive or to fall into desuetude.
1 Two Darwinian Approaches to Language
Evolution as a phenomenon in the natural world resulting from cumulative
changes in heritable traits from one generation to the next looms large in the
writings of Pierre-Louis Moreau de Maupertuis (1698-1759), Georges-Louis
Leclerc, Comte de Buffon (1707-1788), Jean-Baptiste Pierre Antoine de Monet,
Chevalier de Lamarck (1744-1829) and Thomas Robert Malthus (1766-1834).
Inspired by the writings of Malthus, the naturalist Alfred Russel Wallace con-
ceived of natural selection as the key mechanism that drove evolution, and
in 1856 at the age of thirty-three Wallace seeded the brain of Charles Darwin,
then aged forty-seven, with this seminal idea in a letter which he wrote from
the Indonesian archipelago. Darwin eagerly incorporated Wallace’s ideas into
his own writings and propagated natural selection as the principal mechanism
driving evolutionary change.
Generations of biologists have heaped obloquy onto Lamarck and his
conception of evolution, for it is too easily forgotten that Darwin too was a
Lamarckian. Not only were Wallace and Darwin both deeply influenced by the
1844 English popularisation of Lamarck’s work, entitled Vestiges of the Natural
History of Creation, but Darwin also explicitly counted ‘the inherited effects
of use and disuse’ as being amongst the ‘general causes’ and ‘general laws’
which govern whether or not variations are transmitted to offspring (1871, i: 9).
Darwin’s views are clearly spelt out in the Descent of Man (e.g. 1871, i: 116-121).
He conceived of ‘natural selection’ as ‘the chief agent of change, though largely
aided by the inherited effects of habit, and slightly by the direct action of the
surrounding conditions’ (1871, i: 152-153).
1 This paper was presented at the 1st International Caucasus University Association Conference
on Endangered Languages at Ardahan Üniversitesi on the 15th of October 2014.
With respect to the inheritance of characteristics acquired during the life-
time of an organism, Darwin was just as much a Lamarckian as Lamarck. As
the celebrated linguist Friedrich Max Müller pointed out, ‘Darwin’s real merit
consisted, not in discovering evolution, but in suggesting new explanations of
evolution, such as natural selection, survival of the fittest, influence of environ-
ment, sexual selection, etc.’ (1889: 273). Meanwhile, in light of the promiscu-
ous intricacies of molecular genetics, the old polemic about Lamarckian vs.
Darwinian evolution today appears a trifle dated, for our understanding of
evolutionary dynamics has progressed well beyond such a simplistic confron-
tation of dogmas.
Charles Darwin’s On the Origin of Species was published on 24 November
1859. The German translation by the palaeontologist Heinrich Georg Bronn
appeared in 1860 as Über die Entstehung der Arten. The maverick German
biologist Ernst Haeckel sent a copy of the German translation to his friend,
the linguist August Schleicher. Inspired by this work, Schleicher adopted the
view of individual languages as species, which compete against each other ‘im
Kampfe ums Dasein’ (1863). A modern proponent of Schleicher’s view of lan-
guages as species subject to natural selection is Salikoko Mufwene (2001, 2005a,
2005b). By contrast, Friedrich Max Müller conceived language as such to be an
organism. On the 6th of January 1870, in the very first issue of Nature, Müller
took issue with Schleicher’s idea of language survival in terms of ‘die Erhaltung
der höher entwickelten Organismen’ and instead argued that language survival
was a more complex issue.
Although this struggle for life among separate languages exhibits some
analogy with the struggle for life among the more or less favoured spe-
cies in the animal and vegetable kingdoms, there is this important dif-
ference that the defect and the gradual extinction of languages depend
frequently on external causes, i.e. not on the weaknesses of the languages
themselves, but on the weakness, physical, moral or political, of those
who speak them. A much more striking analogy, therefore, than the
struggle for life among separate languages, is the struggle for life among
words and grammatical forms which is constantly going on in each lan-
guage. Here the better, the shorter, the easier forms are constantly gain-
ing the upper hand, and they really owe their success to their inherent
virtue. (1870: 257)
Darwin (1871, i: 60-61) adopted Müller’s conception of language evolution in
his Descent of Man. Over a century later, I voiced an essentially similar view,
which at least in this one respect gives the appearance of being diametrically
opposed to that of Schleicher and Mufwene.
The survival of a language is not determined by its grammatical subtlety,
its degree of refinement or the richness of concepts and notions which
find expression in its lexicon, but by largely unrelated economic, demo-
graphic and political factors affecting the people who happen to speak
the language. Languages which survive are not necessarily in any way
superior to those that go extinct . . . The fecundity with which a particular
language spreads and outcompetes another language may have little or,
in some cases, nothing to do with its grammatical propensities or lexical
richness and refinement. (2001: 113)
These two approaches, language as an organism vs. languages as species,
represent distinct views of language evolution. In the Müller-van Driem approach,
the emergence and evolution of language in hominids is viewed in terms of
language as a semiotic organism which arose symbiogenetically within the
human brain. Relevant to our understanding of the nature of this semiosis is
the novel claim advanced by George Grace (1981, 1987) that language evolved
primarily not as a system of communication, but as an epistemological system
in order to organise the vast amount of sensory input and build conceptual
models of possible realities. The communicability of language-borne con-
structs and categories would, in Grace’s conception, be a secondary feature.
The language organism model studies natural selection as operative at the
levels of lexical and grammatical morphemes and language structures. This
model of language evolution is called Symbiosism (van Driem 2015b).
By contrast, the Schleicher-Mufwene conception views individual languages
as species in competition on a global scale. Whereas both models envisage
natural selection as operating on observable linguistic diversity and driving
language change, the units of selection are of a different order of magnitude.
Notwithstanding my initially skeptical stance with regard to the
Schleicher-Mufwene conception, the premiss formulated by Schleicher and
elaborated by Mufwene is an interesting and testable model, which merits
elaboration in the face of the global scale of the threat of language
extinction today. I propose a programme of research which aims analytically
to apply the Schleicher-
Mufwene model to individual languages in order to assess the sociolinguistic
and semiotic factors determining their viability. To do so requires distinguish-
ing multiple levels of analysis. By enhancing our understanding of the anatomy
of the relationship between language and its human host, linguistic topogra-
phy unifies the two Darwinian approaches by the combined application of the
analytical frameworks of both models.
2 Linguistic Topography
Such a programme would have to assess the applicability of the notion of
inclusive fitness to grammatical structures and semantic systems in the light
of competing linguistic developments in the cultural environment of a lan-
guage community. Mathematical models have been developed to quantify
inclusive fitness, e.g. Dawkins (1982), Demetrius and Ziehe (1994), Grafen
(2009), Keller (1994), Maynard Smith (2000, 2004), but for languages weighted
assessments of socio-economic, demographic and politico-historical factors
affecting the vitality of individual languages would also have to be quantified
and modelled. Without overstretching biological analogies, the utility and
applicability of the notion of an extended phenotype manifestly holds prom-
ise for modelling the vitality of individual languages. One reason why such a
programme of research has not been undertaken until now is the sheer diffi-
culty and analytical complexity of conducting an empirically grounded study
of all linguistic and other observable phenomena relevant to developing and
testing the Schleicher-Mufwene model.
Another reason why this model has not been tested to date is that the
concept of individual languages as entities in competition goes back to the early
days of language typology, a period which left the field with a chequered
history. After Pott (1848) distinguished the basic linguistic types, e.g.
‘isolirend, agglutinirend, flexivische, einverleibend’, a racist form of linguistic
typology was developed by others who did not heed the exhortations of Julius
von Klaproth and Max Müller not to confuse linguistic affinity and biological
ancestry. Scholars such as Arthur de Gobineau, Heymann Steinthal and Ernest
Renan used language typology to buttress a racist world view and arranged
language types hierarchically on a typological ladder of evolutionary
development. If we keep this egregious episode of Social Darwinism in linguistics
in mind as a cautionary example, it should be possible today to devise a
programme of inquiry to explore and test the Schleicher-Mufwene hypothesis
gramme of inquiry to explore and test the Schleicher-Mufwene hypothesis
within a Darwinian framework devoid of ludicrous value judgements.
The inclusive fitness of a language is an important dimension of its linguis-
tic topography. However, since language is a semiotic life form, and individual
languages are entities borne by living and speaking populations of hominid
hosts, various levels of analysis must be distinguished in order to make quan-
titative assessments and predictions about the prospects that a language may
thrive or die and to discover hitherto unmooted factors which may determine
the inclusive fitness and survival prospects of a language. The following pro-
visional short list cannot yet claim to be exhaustive and will no doubt require
augmentation and enhancement in due course. Yet the list specifies some of
the sociolinguistic factors which form part of the assemblage of parameters
characterising the linguistic topography of any given language.
(1) The domains of use of a language and the facility of use of the language
(2) What Wilhelm von Humboldt called the Inhalt of a language
(3) The demographics of the human population using the language as a
mother tongue
(4) The socio-economic situation of the language community in relation to
competing or neighbouring language communities
The quantification and weighting of these various dimensions of linguistic
topography is no trivial undertaking.
3 Domains of Use and Facility of Use
The domains of use constitute one determinant of the linguistic topography
of a language, and closely tied to this issue is the facility of the use of the
language. It might be expected that a person’s native language should be the
easiest language for that person to use in any given context. In 1569, one of the
several arguments advanced by Goropius Becanus of Hilvarenbeek, alias Jan
van Gorp, that Flemish or Dutch must be the original language of mankind was
that, to his mind, as a medium of expression Flemish was more to the point
than any other language, and Flemish words meant exactly what they
signified. Although this naïve viewpoint was expressed in writing long ago, one may
on occasion still hear similar views innocently expressed by people with regard
to their own native language, which for obvious reasons strikes them as being
the most apt and most natural of all languages.
Yet for reasons which have nothing to do with the aptness, richness or preci-
sion of expression of a language, a language community may cede domains of
usage to the tongue of another language community. The different sociolin-
guistic situations in which speakers of a language either decide to surrender or
acquiesce to ceding a domain of language use to another tongue merit iden-
tification and study. Let us look at one such case, which is ongoing and easily
observable. In 1989, Jo Ritzen became Minister of Education and Sciences in
The Hague. Ritzen introduced the idea and later the practice of using English
medium in university education in the Netherlands. Hitherto scientific
discourse, whether in experimental physics, astronomy, theoretical physics,
microbial genetics, cell physiology, economics, medicine or linguistics, had
been conducted almost exclusively in Dutch. The language has for
centuries had a continually expanding arsenal of precise specialised lexical
terms in the sciences. Antoni van Leeuwenhoek did not bother to translate his
letters to the Royal Society in London into English.
In terms of precision or richness of expression, nothing whatsoever is
gained by replacing Dutch terms such as eiwitmantel ‘capsid’, geleedpotigen
‘arthropods’, holtedieren ‘coelenterates’, tweezaadlobbigen ‘dicotyledons’,
celvocht ‘cytoplasm’, bedektzadigen ‘angiosperms’, achterhoofdskwab ‘occipital
lobe’, traagheid ‘inertia’ and trage massa ‘inertial mass’ with their English
equivalents. In fact, it can be argued quite defensibly that the English forms are
inferior because of their semantic opacity. The motive behind Ritzen’s policy
was to tap into a lucrative global education market. The use of English medium
in tertiary education enables Dutch universities to sell Bachelor’s, Master’s and
Doctoral programmes more competitively to international students, just as do
the universities in the Anglo-Saxon countries. Yet Ritzen’s policies have set into
motion the ultimate surrender of a vital domain of the Dutch language and
may even have sounded the knell for Dutch as a language of science.
As a language of science, Afrikaans has been able to piggy-back on Dutch,
with its over twenty-four million native speakers in the Netherlands, Belgium,
the West Indies and Surinam. Afrikaans has nearly seven million native speak-
ers, and policy makers in the Afrikaans language community have always
been perceptive enough to recognise the importance of using their language
as a medium of science. Scientific articles written in Afrikaans bearing titles
such as Die klassifikasie van ’n sianoprokarioot deur van ligmikroskopie,
transmissie elektronmikroskopie en molekulêre tegnieke gebruik te maak ‘The
classification of a cyanoprokaryote using light microscopy, transmission electron
microscopy and molecular techniques’, Die effekte van verskillende n-3 en n-6
poli-onversadigde vetsure op die sekresie van insulienagtige groeifaktor I in
MC3T3-E1 osteoblaste ‘The effects of various n-3 and n-6 polyunsaturated fatty
acids on the secretion of insulin-like growth factor I by MC3T3-E1 osteoblast-
like cells’ or Nuut-ontwikkelde metode vir die meting van triptofaan en triptofaan-
metaboliete: kliniese toepassing ‘Newly developed method for quantification of
tryptophan and its metabolites: clinical application’ are typical and routine
and have been taken here at random from a recent issue of the Suid-Afrikaanse
Tydskrif vir Natuurwetenskap en Tegnologie.2
The examples of Dutch and Afrikaans clearly illustrate that loss of terrain
is a matter of domain, and that language loss in many cases begins at home,
for the developments in the Netherlands and South Africa have been precipi-
tated by decisions taken at a political level. The Himalayan region as a whole,
and the Eastern Himalaya in particular, represents one of the world’s hot
spots in terms both of linguistic diversity and of language endangerment. In
terms of domains, the national languages of Bhutan and Nepal provide some-
what contrasting examples of linguistic topography. For both languages, the
struggle is as much about not ceding domains of use to English as acquiring
hitherto uncolonised domains of use for the language. Yet in terms of political
motivation, the relative success of Nepali is due not so much to political
decisions as to the vibrancy of the language community, whereas in
Bhutan the best intentions of the Royal Government of Bhutan to advance the
national language often appear to be foiled, or at least somewhat blunted,
by a number of other factors.
Nepali is more robust than Hindi and has colonised and thrived in new
domains more effectively than Hindi. Nepali terms that are part and parcel of
normal lay speech and natural educated discourse include गुरुत्व आकर्षण gurutva
ākarṣaṇ ‘gravitational attraction’, बहुदलीय प्रणाली bahudalīya praṇālī
‘multi-party system’, संविधान सभा saṃvidhān sabhā ‘constitutional assembly’, आतङ्कवाद
ātaṅkavād ‘terrorism’, प्राकृतिक उपग्रह prākṛtik upagraha ‘natural satellite’,
संग्रहालय saṅgrahālaya ‘museum’ and, obviously, countless other terms in such
speech registers. The many indigenous languages of Nepal generally adopt the
Nepali technical terms if such registers of discourse are not just conducted
by their speakers directly in Nepali in preference to the native language. Of
course, Nepal is one of the few countries in Asia which managed to safeguard
its sovereignty intact throughout the age of European colonial expansion.
2 In respective order, the authors of the articles named are L. Labuschange, M. Wescott, S. du
Plessis, A. Venter and A. Levanets; E. Moseley, T. Steynberg and M. Coetzee; P. Bipath and
M. Viljoen, and the issue cited is Suid-Afrikaanse Tydskrif vir Natuurwetenskap en Tegnologie,
Jaargang 28 No. 2: Junie 2009.
In India, by contrast, English is generally used instead of the Hindi neol-
ogisms that have been coined to express certain notions. Of course, Hindi
also has long coined neologisms, such as सूक्ष्म-दर्षक यन्त्र sūkṣma-darṣak yantra
‘microscope’ or दूरभाष संख्या dūrbhāṣ-saṅkhyā ‘telephone number’, except
that these terms generally remain unused. In terms of ease or convenience,
English telephone number may perhaps have little to recommend itself in pref-
erence to the Hindi neologism, which is just as apt. However, often enough
the Hindi neologism is so extraordinarily clumsy as to render the coinage
definitively unusable in any natural register of spoken language other than
satire, such as भुमिगत पैदल पार पथ bhumigat paidal pār path ‘underground foot
crossing path’ for ‘subway’, often seen written on signage in Delhi. Despite the
far greater number of native speakers of Hindi, the linguistic topography of
Nepali today is immeasurably healthier than that of Hindi, for Hindi has ceded
numerous domains to English. The contrast can be most vividly illustrated in
cases where Nepali and Hindi happen to use the same neologisms. Speakers of
Nepali will usually be heard to say विश्वविद्यालय viśvavidyālaya ‘university’ and
संग्रहालय saṅgrahālaya ‘museum’ in normal speech, whereas speakers of Hindi
are far more likely than not to say what I have sometimes even seen written in
Devanāgarī script as yunivarsiṭī ‘university’ and myuziyum ‘museum’.
Whilst protagonists in Hindi films and speakers in Hindi talk shows glibly,
perennially and almost invariably shift from English to Hindi and back, often
within the same sentence, natural Nepali speech is seldom if ever charac-
terised by the same coquettish code switching. The situation is yet different
again in Bhutan, where Dzongkha has the status of national language and
has long been used in legal, political and religious contexts as a spoken lan-
guage throughout the kingdom. Native to western Bhutan, Dzongkha is also
used throughout the country in official contexts. Dzongkha has only in recent
history become a written language, although some traditionalist advocates
might contend that the language has been used in writing for centuries under
the guise of its literary exponent Chöke, which in reality, however, is a distinct
language, the Classical Tibetan liturgical tongue.
In its traditional domains, the Dzongkha and Chöke terms are often identi-
cal, and Dzongkha suffers from no dearth of vocabulary for notions such as
བཀའ་ཤོག་ (bKaḥ-śog) kasho ‘edict, royal decree’, སྤྲུལ་སྐུ་ (sPrul-sku) trüku ‘reincar-
nation’ or དབང་ (dBan̂ ) ’wang ‘empowering benediction’. Dzongkha struggles
to colonise domains which in Bhutan are presently dominated by English. In
the political and administrative realm, Dzongkha neologisms have made easy
inroads, e.g. རྒྱལ་ཡོངས་ཚོགས་འདུ་ (rGyal-yon̂ s Tshogs-ḥdu) gäyong tshôdu ‘national
assembly’, ཆ་སྦྱོར་ (Cha-sbyor) chajo ‘ratification’, ཁྲིམས་དོན་ཚོགས་ཆུང་། (Khrims-don
Tshogs-chun̂ ) thrimdön tshôchung ‘legislative committee’, not least because
the usage of such terms is required by the compulsory use of the national lan-
guage in administration, but some of these coinages strike people as artificial
so that in speaking they may often resort instead to the English word. Coinages
are quite often devised by Bhutanese specialists in Chöke who happen not
to be native speakers of Dzongkha. For new items of material culture some
Dzongkha neologisms have met with success, although most are dismissed
as clumsy and hence never adopted in actual usage. Some slightly more suc-
cessful neologisms include བརྒྱུད་འཕྲིན་ཨང་། (brGyud-thrin An̂ ) jüthrin ’ang ‘tele-
phone number’, སྣུམ་འཁོར་ (sNum-ḥkhor) ’numkho ‘car’, དེད་གཡོག་ཆོག་ཐམ་ (Ded-gyog
Chog-tham) deyo chôtam ‘driving licence’, གནམ་གྲུ་ཐང་ (gNam-gru-than̂ ) ’namdru-
thang ‘airport’. Yet science and modern technology remain exclusively English
domains.
The acceptance or rejection of such neologisms does not, however, provide
adequate insight into the precarious situation of Dzongkha in Bhutan. Like
Nepal, the Kingdom of Bhutan was one of the few Asian countries not to be
subjugated by a European imperialist power and so succeeded in preserving
its sovereignty. Yet linguistically Bhutan has suffered from various forms of self-
inflicted linguistic imperialism. One struggle is the process of vernacularisa-
tion, which is not unlike the mediaeval transition from Latin to French as a
language of writing in France. Whilst རྫོང་ཁ་ (rDzon̂-kha) Dzongkha ‘language
of the fort’ is the official spoken language, native to western Bhutan, Classical
Tibetan or ཆོས་སྐད་ (Chos-skad) Chöke ‘language of the dharma’ has for centu-
ries been the traditional liturgical and literary language in Tibet, Bhutan and
Sikkim. People spoke in Dzongkha, but they did not write in the vernacular.
Therefore, when people in Bhutan say ‘good Dzongkha’, they generally used to
mean a good command of the written language Chöke.
The spelling systems of English and French are esoteric works of art, but in
fact only French orthography has been determined by a venerable council of
aesthetes called les Immortels, who have been elected as members of l’Académie
française, whereas English orthography is the poor legacy of a lexicographers’
compromise. Although the vagaries of both spelling systems are notoriously
arcane, the orthography of Dzongkha, despite piecemeal and unsystematic
orthographic reforms since the 1960s, is still largely based directly on Chöke.
Consequently, Dzongkha spelling remains unnecessarily complicated. For
example, the Dzongkha consonant phoneme written as j in the phonological
transcription, Roman Dzongkha, corresponds not only to the combination
རྗ་ (rJ) in native Bhutanese ’Ucen script, but also to the spellings བརྗ་ (brJ) j, ལྗ་ (lJ)
j, འཇ་ (ḥJ) j, མཇ་ (mJ) j, རྒྱ་ (rGy) j, འགྱ་ (ḥGy) j, བརྒྱ་ (brGy) j, སྒྱ་ (sGy) j, བསྒྱ་ (bsGy) j
and སྦྱ་ (sBy) j.
Practical experience has amply demonstrated that Dzongkha spelling is
experienced as being overly complicated for Bhutanese learners. The complex-
ity of Dzongkha spelling hampers the use of Dzongkha in new media such as
internet chats, text messages and email. The use of ad hoc romanisations is
often experienced as being so unsystematic in nature that in practice English
is most usually used instead. A phonological orthography of Dzongkha in the
native Bhutanese script will be publicly presented this year for the first time
(Karma Tshering and van Driem, forthcoming). Hopefully this phonologically
consistent spelling system in the ’Ucen script, called Phonological Dzongkha,
will, alongside Roman Dzongkha, enhance the facility of use of the national
language in contemporary written media.
Another challenge is that the Bhutanese educational system has severely
restricted the domains into which Dzongkha has been permitted to venture.
When the first two secular schools were opened in Bhutan during the reign
of འབྲུག་རྒྱལ་པོ་ཨོ་རྒྱན་དབང་ཕྱུག་ King ’Ugä ’Wangchu (imperabat 1907-1926), Hindi was
chosen as the medium of instruction because of the ready availability of inex-
pensive textbooks. Chöke remained the medium of instruction in the lamasery
schools. In 1961, འབྲུག་རྒྱལ་པོ་འཇིགས་མེད་རྡོ་རྗེ་དབང་ཕྱུག་ King Jimi Dôji ’Wangchu decreed
that Dzongkha was the national language. At one level, this decree simply
recognised the status quo. At a deeper level, the intent was vernacularisation,
a move away from Chöke to living Dzongkha. Another aim was to eradicate
instruction in Hindi.
Until 1971, the ‘Dzongkha’ taught in the schools was in fact Chöke. As a con-
sequence of the royal decree of 1961, new English-medium textbooks were
especially developed for the Bhutanese schools. These new course books
replaced the Hindi textbooks in 1964. In 1971, the རྫོང་ཁ་ཡར་རྒྱས་སྡེ་ཚན་ Dzongkha
Division of the ཤེས་རིག་ལས་ཁུངས་ Department of Education was established in
order to develop materials for instruction in Dzongkha. Textbooks and learn-
ing materials in Dzongkha were developed at a rapid pace for both primary
and secondary education. Initially, English remained the medium of instruc-
tion for subjects other than Dzongkha, but nowadays virtually all subjects are
taught in English. Only Dzongkha is taught in Dzongkha as well as some mod-
ules of certain subjects such as history and geography. Bhutan in effect chose
a language policy in formal education diametrically opposed to the Malaysian
policy of replacing English with Malay as the medium of formal education,
including the coining of Malay neologisms for scientific terms. The result
is that, with the exception of remote villages, young and upwardly mobile
Bhutan, rather than Singapore, is the most English speaking country in Asia
today. In language endangerment, the loss and gain of domains of use repre-
sent one dimension determining the viability of a language and its potential
for survival.
4 The Semiotic Content of a Language
Both ceding domains of use and failing to colonise new domains of use create
a linguistic topography that is less favourable to the survival of a language.
However, when neologisms merely denote new entities which have come
into use in our material culture, then these new coinages do not enrich the
notional repertoire of the language. Whilst French has ordinateur, Czech has
počítač and Afrikaans has rekenaar, Dutch seems to make do with computer,
and Japanese fares well with コンピュータ konpyūta. The use of native roots
in coining apt and facile neologisms attests to the creativity and vitality of a
language, especially when these coinages catch on by their own virtue and are
not enforced by top-down measures, although administrative interventions
too quite often prove effective. Yet these precise translation equivalents for
referring to newly invented objects do not enhance the conceptual repertoire
of a language more than would an English loan word. They fail to augment or
diversify what Wilhelm von Humboldt called the Inhalt of a language.
The research programme spearheaded by Wierzbicka and Goddard sought to
identify shared semantic primitives presumed to be common to all languages.
Both Wierzbicka and Goddard as well as the participants in their research pro-
gramme earnestly believed in the existence of semantic primes, yet they were
unable to demonstrate the existence of shared universal categories of meaning
without resorting to the methodologically indefensible ploys of polysemy, allo-
lexy and so-called non-compositional polysemy in order to ‘find’ the purported
‘exponents’ of the hypothetical primes (van Driem 2004). The negative result of
their quest represents one of the most significant contributions to linguistics in
recent years, for their inadvertent and unwanted finding provides the strongest
corroboration to date for the theory of linguistic relativity developed in the
writings of Pierre de Maupertuis (1698-1759), Wilhelm von Humboldt (1767-
1835) and other linguists and subsequently popularised in North America by
Edward Sapir (1884-1939) and Benjamin Whorf (1897-1941). Grammatical and
lexical meanings in different languages generally tend to embody semantically
non-equivalent notional repertoires, and part of the resistance to the work on
Pirahã by Daniel Everett stems from a lingering but recalcitrant reluctance to
accept his empirical findings in many linguistic quarters still today.
The notional repertoire of English today is not the same as it was at the
time of King Alfred. The categories of meaning available to an English speaker
today, whether grammatically or lexically expressed, are not at all congruent
with those available to a speaker of Old English in the 9th century. Whilst the
language of King Alfred lives on today in the form of modern English by virtue
of an unbroken continuity of speech history, it can also defensibly be stated
that Old English is a dead language. Latin too is conventionally termed a dead
language, although through a continuous unbroken line of use the language
still exists as modern French, Romanian, Portuguese and the other Romance
tongues. The inexorable and universal nature of change was long ago expressed
by Heraclitus (ca. 535-475 bc), to whom the phrase πάντα ῥεῖ ‘everything flows’
is traditionally attributed, and this fact is personally experienced by all.
When proponents of linguistic diversity defend the use of native languages
and combat language endangerment in order to preserve mankind’s linguistic
heritage, presumably they are aware that language does and will change. The
relentlessness of change will cause one language to be replaced by another,
whether this takes the form of an alien tongue, as when the Celtic inhabitants
of Britain adopted the Teutonic tongue imported by Anglo-Saxon migrants,
or of drastic cumulative change over time, as in the case of Latin turning into
French or Old English ultimately becoming modern English. What is worth-
while preserving, or at least attempting to document, in addition to phonetic
diversity and the panoply of different types of morphological systems operative
in language is the language-specific repertoire of notions, meanings and con-
cepts which are lexically, grammatically or idiomatically expressed in any given
language. The danger to diversity is not change, but centripetal change in the
same direction in order to conform to one single global semiotic repertoire.
The insidious peril of semantic assimilation through the globalisation of
categories of meaning was a central theme in the writings of David Hubert
Greene, alias Dáithí Ó hUaithne (1913-2008). In the context of the Irish language,
Greene explained what is meant by such semantic assimilation and
convergence.
Unfortunately, many people are under the impression that such mod-
ern terms as development, influence, interesting represent essential con-
cepts of human thought, and that no language can afford to be without
them; yet, although they are all of Latin origin, not one of them occurs in
Latin in anything resembling its modern meaning . . . But most European
languages, from Welsh to Russian, have accepted them either as loan-
words, or calques, as these equivalents of influence indicate: German
Ein-fluß, Russian v-liyaniye, Welsh dy-lanwad, where the second element
in each case means ‘flowing’. (1966: 57-58)
Greene further illustrated this with a random but well-chosen Irish example.
An example is English development, for which Irish has no one equivalent.
Rather than say ‘await further developments’, the fitting Irish expression
in a similar situation might be fanacht le cor nua sa scéal ‘waiting for
a new turn in the matter’. In other contexts where the English meaning
development would be appropriate, various different Irish categories of
meaning have to be found in Irish: forleathnú (smaoinimh) ‘widening out
(of an idea)’, imeachtaí ‘proceedings’, saothrú (na haigne) ‘cultivation (of
the mind)’, tabhairt chun cinn (ceantair) ‘advancing (of a district)’, tarlú
‘happening’, toradh ‘result’. Yet even Irish is not immune to the effects
of globalised categories of meaning. In recent times, the Irish word for-
bairt ‘growing, increasing’ has been used increasingly as an equivalent
for English ‘development’ in all contexts in which English ‘development’
could appropriately be used, even though Irish forbairt has never meant
‘development’ at any stage of its history. (1966: 59)
The observations made by Greene alert us to the danger of the loss of linguis-
tic diversity without actual language death. Semantic assimilation of one lan-
guage to another will reduce overall linguistic diversity. In fact, this insidious
phenomenon exerts a far greater impact on diversity, yet remains less ame-
nable to observation by the semantically unsophisticated, the monoglot and
the linguistically naïve observer. This threat raises questions which present a
fundamental challenge to the science of linguistics.
Will the languages of the future be more viable if these languages merely
represent exact or nearly precise translation equivalents of each other? Will
different languages become increasingly superfluous as they are all increas-
ingly compelled by normative influences exerted in the process of globali-
sation, including automated translation, shared international discourse
and the bullying scourge that is called political correctness, to give expression
to the same conceptual repertoire and so to have the same semiotic content?
At the same time, another pressing question which, given the current state,
direction and biases of linguistics, presently defies answering is the following:
Do certain types of conceptual repertoire render a language more resilient
than another language or in some sense intrinsically valuable? Methodologies
should be developed to address this central research query.
Instead, today a highly vocal segment of the linguistic community has
responded to the research findings published by Everett by crying foul and
alerting the Brazilian authorities, who have meanwhile undertaken linguisti-
cally and culturally to assimilate the Pirahã forcibly to the sedentary Occidental
mainstream culture and national language of Brazil and so to expunge forever
any and all trace of a conceptual repertoire and world view that was demon-
strably distinct from our own and so proved to be embarrassingly at variance
with the preconceptions of a subset of linguists that are blinded by essential-
ist notions and by their own typological and grammatical labels and biases.
A current obstacle in the ongoing discussion about language universals and
linguistic categories is precisely the presumed universality of putative lin-
guistic categories for which labels have been coined and artificial ‘test cases’
have been devised by a certain common breed of language typologist, some of
whom have recently gone into explicit denial regarding their Platonic agenda
and the essentialist underpinnings of their approach to language.
5 Demography and Socio-economic Factors
In addition to the factors which bear directly upon the language, its domains
of use and its semiotic content, there are sets of factors which determine
language viability that are related to the human speakers of the language.
Statistics and sophisticated methods of quantification appear ludicrous in
some extreme cases where a language has disappeared, as very many have,
because entire populations of speakers of these languages have been extermi-
nated by rival groups. Not only has genocide been perpetrated at times during
the colonisation of the Americas, Australia and the Andamans, but the whole-
sale slaughter of rival groups also features in the recorded history of the Old
World. Sometimes populations are wiped out not just by violent aggression
perpetrated by the rival group, but also equally by diseases introduced by an
incursive population. Often the genocide is incomplete, and then the small
contingent of survivors is afterwards easily linguistically assimilated so that
often no trace of the original language remains. Yet demographic change is not
invariably this drastic.
Sometimes demographically marginal groups hold on to a distinct ancestral
language alongside an overwhelming linguistic majority, such as the aston-
ishing resilience of Yiddish and Sorbian over time, whereas sometimes large
populations abandon their languages, as in the case of the many now extinct
Celtic languages of Europe and many languages of antiquity, such as Elamite,
Hittite, Hattic and Hurrian. The phenotypical, cultural, ritual or religious dif-
ferences between populations of speakers also all play a role, as do the specific
dynamics of any process of acculturation, conquest or domination. In future, it
would be desirable to be able to quantify or meaningfully to characterise the
effects of each of such factors. Demographic factors affecting the number and
the fecundity of the population of speakers must be distinguished from those
affecting the socio-economic circumstances of the given language community.
Economy is a determinant of language vitality, but just how economic factors
affect language viability has yet to be fully understood.
Herodotus famously recorded the linguistic experiment ostensibly carried
out by the pharaoh Psammetichus I (664-610 bc) to discover the original lan-
guage of man. Children were brought up by themselves on an island or at some
remote locality, and, when they finally learnt to speak, they turned out to be
saying becos, the Phrygian word for ‘bread’. Yet was the man who supplied the
tiny and isolated experimental population of children with their daily allow-
ance of food not himself a Phrygian? The result of the legendary experiment
may have more to say about the socio-economic factors which determine the
direction of linguistic assimilation than about the original language of man-
kind. Languages are not all economically equally weighted. The languages that
pop up at you from your computer screen each time that a new operating sys-
tem of Apple is introduced reflect the economic weight in terms of consumer
potential of a highly select group of the world’s language communities. Certain
language communities which are an order of magnitude more populous in
terms of numbers of speakers, such as Bengali or Telugu, are not represented
in the same way as the languages of certain affluent but small language com-
munities in Europe, like Norwegian or Finnish, whose numbers of speakers
pale in comparison with the burgeoning populations speaking many of the
neglected languages.
The list of factors that determine the linguistic topography of a language
adduced above requires refinement and enhancement. The aim of this paper
has merely been to formulate the challenge to develop a programme of
research to study the linguistic topography of individual languages. Analysing
and charting the linguistic topography of a language should enable us to pro-
vide an insightful assessment of the viability of a language and a prediction
of its potential for survival. Although the proposed research programme has
been conceived within the Schleicher-Mufwene framework which envis-
ages individual languages as species in competition, the inclusive fitness of a
language can only be properly assessed and quantified when the anatomy of
the relationship between language as a semiotic organism and its human host
is properly understood, the distinction between language as organism and
individual languages as species is appreciated, and the interplay of various fac-
tors affecting each of the entities operative at the distinct levels of interaction
is understood. Not only will new methods have to be developed, but in the process
certain ingrained biases prevalent in some quarters will also have to be overcome.
Both semantic precision and semiotic sophistication are indispensable pre-
requisites, as has been argued in the prolegomena to the synoptic Bumthang
grammar (cf. van Driem 2015a). In future, the study of linguistic topography
could yield recommendations for policy makers, educators and members of
language communities based on an understanding and quantification of the
sociolinguistic dimensions of individual language endangerment situations at
different levels of analysis.
References
Darwin, Charles Robert. 1871. The Descent of Man and Selection in Relation to Sex (two
volumes). London: John Murray.
Dawkins, Richard. 1982. The Extended Phenotype. Oxford: Oxford University Press.
Demetrius, L., and M. Ziehe. 1984. ‘The measurement of Darwinian fitness in human
populations’, Philosophical Transactions of the Royal Society, 222: 33-50.
Driem, George van. 2001. Languages of the Himalayas: An Ethnolinguistic Handbook of
the Greater Himalayan Region, containing an Introduction to the Symbiotic Theory
of Language (2 volumes). Leiden: Brill.
Driem, George van. 2004. Book Review: ‘Meaning and Universal Grammar: Theory and
Empirical Findings. 2 vols. Ed. by Cliff Goddard and Anna Wierzbicka. (Studies in
Language Companion Series.) Amsterdam: John Benjamins, 2002. ISBN 1588112640.
$80 (Hb.)’, Language, 80 (1): 163-165.
Driem, George van. 2015a. ‘Synoptic grammar of the Bumthang language, a language of
the central Bhutan highlands’, Himalayan Linguistics Archive, 6: 1-77.
Driem, George van. 2015b. ‘Symbiosism, Symbiomism and the perils of memetic man-
agement’, pp. 327-347 in Mark Post, Stephen Morey and Scott Delancey, eds.,
Language and Culture in Northeast India and Beyond. Canberra: Asia-Pacific
Linguistics.
Goropius Becanus, Johannes. 1569. Origines Antwerpianæ sive Cimmeriorum
Becceselana novem libros complexa. Antwerpen: Christophorus Plantinus.
Grace, George William. 1981. An Essay on Language. Columbia, South Carolina:
Hornbeam Press.
Grace, George William. 1987. The Linguistic Construction of Reality. London: Croom
Helm.
Grafen, Alan. 2009. ‘Formalizing Darwinism and inclusive fitness theory’, Philosophical
Transactions of the Royal Society, 364: 3135-3141.
Greene, David Hubert (Dáithí Ó hUaithne). 1966. The Irish Language – An Ghaeilge.
Dublin: Cultural Relations Committee of Ireland.
Karma Tshering of Gaselô and George van Driem [forthcoming]. The Grammar of
Dzongkha Revised and expanded, with a Guide to Roman Dzongkha and to
Phonological Dzongkha. Canberra: Asia-Pacific Linguistics.
Keller, E. F. 1994. ‘Fitness: reproductive ambiguities’, pp. 120-121 in E. F. Keller and
E. Lloyd, eds., Keywords in Evolutionary Biology. Cambridge, Massachusetts: Harvard
University Press.
Leeuwenhoek, Antoni van. 1693. Derde Vervolg der Brieven, Geſchreven aan de
Koninglijke Societeyt tot Londen. Delft: Henrik van Kroonevelt.
Maynard Smith, John. 2000. Evolution and the Theory of Games. Cambridge: Cambridge
University Press.
Maynard Smith, John. 2004. Animal Signals. Cambridge: Cambridge University Press.
Mufwene, Salikoko S. 2001. The Ecology of Language Evolution. Cambridge: Cambridge
University Press.
Mufwene, Salikoko S. 2005a. Créoles, écologie sociale, évolution linguistique: Cours donnés
au Collège de France durant l’automne 2003. Paris: L’Harmattan.
Mufwene, Salikoko S. 2005b. ‘Language evolution: The population genetics way’,
pp. 30-52 in Günter Hauska, ed., Gene, Sprachen und ihre Evolution. Regensburg:
Universitätsverlag Regensburg.
Müller, Friedrich Max. 1870. ‘The science of language’, Nature, I: 256-259.
Pott, August Friedrich. 1848. ‘Die wissenschaftliche Gliederung der Sprachwissenschaft:
eine Skizze’, Jahrbücher der freien deutschen Akademie, 1: 185-190.
Schleicher, August. 1863. Die Darwinsche Theorie und die Sprachwissenschaft: Offenes
Sendschreiben an Herrn Dr. Ernst Häckel, a.o. Professor der Zoologie und Direktor des
zoologischen Museums an der Universität Jena. Weimar: Böhlau.
Chapter 19
And So Flows History

Alexander Vovin
In this modest contribution I intend to demonstrate two major points: first,
that gradual language loss is an inevitable part of the historical development
of humanity, and second, that language revitalization is essentially
doomed. In other words, history (an objective process) cannot be stopped
or reversed as far as language death is concerned, no matter how much we may
lament (a subjective process) this sad state of affairs. In order to illustrate these
two points with concrete examples, I have chosen several case studies for both.
1 Gradual Reduction of the Linguistic Diversity
The three case studies that I am going to present here are Inner Asia, the main
Japanese islands, and Korea. In the first and third cases, I review the linguistic
diversity from approximately 500 AD to the present, and in the second case
from about 700 AD to the present, roughly covering 1,500-1,300 years.
2 Case Study #1: Inner Asia
Around 500 AD we can observe the following language families
and languages co-existing in Inner Asia:
– Iranian languages: Sogdian, Khwarezmian, Khotanese, Tumshuqese, Persian,
Pamir languages.
– Indo-Aryan languages: Indian prakrits.
– Tocharian languages: Tocharian A, Tocharian B
– Turkic: East Old Turkic, Bulgar, etc.
– Mongolic: Ancestor of Mongolian, Naiman, Kereit, etc.
– Para-Mongolic: Khitan, Tabɣač, Särpi (Xianbi)
– Tungusic: Jurchenic, Nanaic, North Tungusic still spoken in Manchuria
– Tibeto-Burman: Tibetan, Tangut, various unattested languages of Tibet and
Sichuan
– Chinese

– Ruan-ruan1
– Yeniseian: Xiong-nu, ancestral languages of Ket-Yugh, Kott-Assan-Arin, and
Pumpokol subgroups
Even given the fact that Iranian, Indo-Aryan, and Tocharian languages all
belong to the Indo-European family, and Mongolic and para-Mongolic languages
are certainly related as well, while Tibeto-Burman and Chinese might
together constitute the Sino-Tibetan phylum, we still have 7-8 unrelated language
families coexisting in this region.
However, one thousand years later, at approximately 1500 AD, the situation
has changed quite drastically:
– Iranian: Persian, Pamir

– Turkic: Chagatay, New Uyghur, Kazakh, etc.
– Mongolic: Mongolian, Oirat, etc.
– Tungusic: Jurchenic, Nanaic; Solon, Ewenki (Oroqen), and Kili as remnants
of North Tungusic
– Tibeto-Burman: Tibetan, various unattested languages of Tibet and Sichuan
– Chinese
We can clearly see not only that the overall number of language families and
languages has been reduced, some of them simply dying out, like Tocharian,
Ruan-ruan, Sogdian, Khotanese, Tumshuqese, and Tangut, or being pushed outside
of this linguistic area, like Yeniseian, but also that, for the most part (with the
notable exception of Turkic, where internal diversification has increased), the
internal diversification within the families has been greatly reduced as well.
At first glance, the situation in 2016 AD, 500+ years later, does not seem
to be very different:
– Iranian: Tajik, Pamir

– Turkic: Uzbek, New Uyghur, Kazakh, Turkmen, etc.
– Mongolic: Mongolian, Oirat, etc.
– Tungusic: Sibe, Nanaic, Solon, Ewenki (Oroqen), and Kili as remnants of
North Tungusic
– Tibeto-Burman: Tibetan, various languages of Tibet and Sichuan
– Chinese
1 On Ruan-ruan as a language not linguistically related to any other Inner Asian linguistic
family see Vovin (2004, 2010).
To be sure, certain languages, such as Jurchen, Manchu, and Chagatay, have died
out as spoken languages. But the main change that has occurred is not so much
quantitative as qualitative. All Tungusic languages, even the most numerous
among them, Sibe with more than twenty thousand native speakers, are
severely endangered, and some of them, like Solon and Kili, are moribund.
The same is true of most Pamir languages and of the Oirat language. On the
Chinese side of Inner Asia, even languages with several million speakers,
like New Uyghur and Tibetan, are slowly becoming endangered due to the large
influx of Chinese immigrants into Xinjiang and Tibet, as well as due to
PRC language policy, which is felt especially acutely in Xinjiang, where
even elementary school education in Uyghur is often forcibly replaced by
education in Chinese.2 It is not inconceivable, therefore, that in another one
hundred years all of Chinese Inner Asia will become Chinese-speaking. And unless
there is a major cataclysm in China that would prevent this from becoming a
reality, there seems to be no force that could either stop or contain this
process. On the other hand, the outlook is much brighter in the western part of
Inner Asia, where the newly independent states liberated from the Soviet yoke
now seem to have escaped the danger of linguistic Russification for good.
3 Case Study #2: Main Japanese Islands
Ca. 700 AD we find considerable diversity in the main Japanese islands
(Sakhalin, Hokkaidō, Honshū, Shikoku, Kyūshū, and smaller islands, excluding
the Ryūkyū islands):
– Japonic: Western Old Japanese, Eastern Old Japanese, Kyūshū Old Japanese,
the ancestral language of Ryūkyūan.
– Ainu: Northern Honshū Ainu, Azuma Ainu, Kyūshū Ainu.
– Affiliation unknown: Hayato, Kumaso. Possibly some other languages whose
names we do not even know.
– The language of the Okhotsk culture on Sakhalin and Hokkaidō, probably
some ancestral form of Nivx (Gilyak).
2 The situation in Tibet is slightly better (Kapstein 2006: 298-299).
278 Vovin
Thus, we have at least three genetically unrelated language families,
possibly more, present in the main Japanese islands 1,300 years ago. It can
also be demonstrated that two of them, Japonic and Ainu, had considerable
internal diversity. One thousand years later, ca. 1700, the linguistic
situation has changed drastically:
– Japonic: Japanese (roughly, a descendant of Middle Japanese), Hachijō
(a descendant of Eastern Old Japanese)
– Hokkaidō Ainu, Sakhalin Ainu, Kuril Ainu (descendants of Northern
Honshū Ainu)
We can observe here the same picture as in case study #1: not only is the
overall reduction in the number of language families obvious, but the internal
diversity within the remaining language families has also become less
significant. In 2016 AD, 300+ years later, the language loss is even more
drastic:
– Japonic: Japanese (a descendant of Middle Japanese), Hachijō (a descendant
of Eastern Old Japanese)
The Hachijō language is moribund: it will die out completely in the next ten
to twenty years. Thus, all the linguistic diversity of the past is reduced
essentially to a single language, which is, roughly speaking, a descendant of
only one of the Japonic languages.
4 Case Study #3: Korea
Ca. 500 AD we find three genetically unrelated language families on the Korean
peninsula and in the adjacent territory of Southern Manchuria:
– Koreanic: Koguryŏan, Sillan, aristocratic Paekchean, the ancestor of the
Chejudo language
– Japonic: various Japonic languages in the center and the south of the Korean
peninsula: commoner Paekchean, Japonic Sillan, the language of Karak
(Kaya, Mimana)
– Tungusic: Jurchenic, possibly other Tungusic
One thousand years later, ca. 1500, only one language family survives on the
Korean peninsula, represented by three languages:
– Koreanic: Korean, Chejudo, Yukchin
In 2016 AD, 500+ years later, there is no quantitative change:
– Koreanic: Korean, Chejudo, Yukchin
However, as in case study #1, there is a very significant qualitative
change: the Chejudo language is severely endangered, and Yukchin is moribund
in Kazakhstan and Russia, to which its speakers moved or were forcibly moved
within the last two hundred years, and severely endangered in China. There
are no data available on its sociolinguistic status in North Korea, but given
the overall language policy in North Korea, one of whose major goals is the
standardization of existing non-standard varieties, it would be no wonder if
it were at least severely endangered there as well.
Thus, one can observe the same common developments over the last 1,300 to
1,500 years in all three case studies, namely:
1) Loss of external diversity: the gradual reduction of the overall number
of language families and languages.
2) Loss of internal diversity: the gradual reduction of the overall number of
languages within the same family.
3) Gradual language death, resulting in the severely endangered or moribund
state of many surviving languages.
5 One Success and Multiple Failures of Language Revitalization
In this section I will briefly survey one (and the only one!) story of
successful language revitalization and three cases of failure (among many).
6 One Success Story
The only success story of language revitalization is well known: it is the
case of Hebrew. But why did language revitalization (which was not yet known
under this name when Hebrew was successfully revitalized) succeed in this
single case, when in all others it has so far failed miserably? The answers
seem to me to be quite self-evident, although they are not frequently spelled
out. Eliezer Ben-Yehuda is often almost solely credited with this success
(Blum and Rabin 1982: 5-14). However, enthusiasts of language revitalization
have existed and continue to appear in other cases, albeit without any
tangible success. In short, the reasons are the following:
1) For more than two thousand years of its history Hebrew always existed as
a liturgical language.
2) It was used as a means of communication between European and Middle
Eastern Jews. Therefore, it always existed as a second language: a privilege
that other languages subject to revitalization never had.
3) It was the only choice as the language for Israel: not Yiddish, not Arabic,
not German, Polish, or Russian.
Let me now turn to the three case studies of failure.
7 Case Study of Failure #1: Ainu
Many efforts were made to revitalize Ainu in the twentieth and twenty-first
centuries, all of them in Japan. No such efforts were made in Russia,
although the last two speakers of Sakhalin Ainu passed away around 1975 AD
(V. M. Alpatov, p.c. around 1998). Moreover, Sakhalin Ainu was not even listed
as one of the minority languages of the USSR, although there is documentary
evidence for its existence (Novikova and Savel’eva 1953: 128-133). In Japan, on
the other hand, there were schools and groups for the study of Ainu, not only
on Hokkaidō, but also in Tokyo and Osaka, where not only ethnic Ainu
themselves but also ethnic Japanese took classes. Ainu was taught as a subject
at Waseda University, Chiba University, Hokkaidō University and Asahikawa
Pedagogical University. Five textbooks of Ainu were published: (Tamura 1979),
based on the Saru dialect of Hidaka on Hokkaidō; (Nakagawa et al. 1994),
combining South-Western and North-Eastern Hokkaidō dialects; (Nakagawa and
Nakamoto 1997), based on the Chitose dialect of Hokkaidō; (Izutsu and Tezuka
2006), based on the Asahikawa dialect of Hokkaidō; (Nakagawa and Nakamoto
2007), based on the language of folklore; and even a textbook of Sakhalin
Ainu published after the language became extinct (Murasaki 2009). There is
also one pedagogical grammar with a short reader (Satō 2008), and numerous
teaching aids, such as, for example, (Izutsu 2006). There were also yearly
Ainu language contests in the city of Furano on Hokkaidō. And yet all these
heroic efforts were in vain, and the Ainu language has by now effectively
died out. Why?
One of the key answers can be glimpsed from the Ainu language contests.
The author of these lines was present at one of these contests in 2001 in
Furano. What I saw, however, was encouraging only at first glance. Young men
and women went on stage and spoke very fluent Ainu, either individually or
in groups. But that was the end of it: once they descended from the stage,
they immediately switched to Japanese. That was my first clue to understanding
the main reason for the failure of the revitalization efforts: the Ainu
language could serve as a showpiece, but it had no communicative function. In
the next few days I conducted an experiment, trying to buy a hot dog in
various convenience stores on Hokkaidō using Ainu. Naturally, I was given
looks as if I were an extraterrestrial. This finally convinced me that the
Ainu language had no communicative function or value in daily life on
Hokkaidō. But how can a language survive, let alone be revitalized, if it
loses its main social function: the communicative one?
The second observation concerns the impossibility of using the Ainu language
even for quite elementary purposes outside of everyday conversation. In 2008
the Japanese National Diet finally recognized the Ainu people as the aboriginal
population of Japan, so in one of my publications (Vovin 2009) I decided
to write a dedication to the Ainu people on this occasion. The English and
Japanese versions were composed within a minute:
– “To the Ainu people who were finally recognized as the aboriginals of the
Japanese archipelago”
– ついに日本列島の原住民として認められたアイヌ民族の方々へ
But when I proceeded to the Ainu version, I quickly realized that, in spite of
many years of study of both modern and classical Ainu, I was completely unable
to do it, so I asked my colleague Izutsu Katsunobu for help. He was able to
come up with something after several days, but warned me that it was extremely
unnatural and clumsy:
– tanepo sisam utarpa utar yaykopeker ayne aynu utari anakne yaunmosir
untar hoski okay utar ne ruwe eraman siri ne. nean aynu utari nispa utar
katkematutar ku-koonkami na
This translation is indeed extremely awkward and demonstrates quite well
that the Ainu language lacks even the appropriate terminology. In short, when
a language cannot function in the modern world, its chances of survival are
very slim, if they exist at all. And if we ‘revitalize’ a language in this sad
state of affairs, unnatural and artificial passages will be the norm rather
than the exception.
8 Case Study of Failure #2: Okinawan
One of the striking aspects of the language loss in Okinawa is that, while the
Ainu language death was more gradual and took over two hundred years to
complete, the language loss in Okinawa developed with catastrophic speed in
the post-World War II years, and really accelerated in the last ten or fifteen
years. In 2001 a number of local taxi drivers still spoke Okinawan in Naha, but
in 2012 I did not encounter a single one: it was strictly Japanese. It might
seem that Okinawan would be in a more advantageous position than Ainu,
because, in contrast to the latter, which had no writing system before the
mid-twentieth century, the former had its own written tradition dating back to
the early sixteenth century. Okinawa was de facto an independent kingdom
before 1609 AD, and although it was conquered by the Satsuma clan of Kyūshū in
that year, it officially became a prefecture of Japan only in 1879 AD. The
Satsuma overlords were more interested in Okinawa’s commerce and did not
interfere much with its political and educational system. This interference
became significant after Okinawa officially became a Japanese prefecture, and
especially in the years immediately preceding and during World War II, when
any education in Okinawan was essentially banned. There are, however, two
sides to this coin, as a number of elderly persons in Okinawa told me that it
was the initiative of their parents to educate them entirely in Japanese,
because this ensured that they would have better job opportunities both in
Okinawa and in the main Japanese islands. Consequently, the infamous hōgen
fuda (方言札) ‘dialect tags’ that were put on children as a punishment for
speaking Okinawan at school may not have been the sole responsibility of the
Imperial government.
In any case, the destruction of education in the native tongue seems
to be the first cause (though not the most important one) of the language
death in Okinawa. As in the case of Ainu, efforts have been made to teach
Okinawans their former mother tongue: (Kawabatake 1982), (Nishioka and
Nakahara 2000), (Karimata n. d.). Recently even a pilot edition of a textbook
of Okinawan designed for Americans and Brazilians of Okinawan ancestry
has been published (Sakihara et al. n. d.). There is also a dictionary that
could be considered to a certain extent pedagogical (Uchima and Nohara 2006).
But overall, far fewer efforts are directed at the revitalization and teaching
of Okinawan than at Ainu revitalization. In spite of the fact that it is still
possible to find native Okinawan speakers (most of them are at least in their
sixties), the language is certainly moribund. The second reason, after the
elimination of education in Okinawan, is that it ceased to be the language of
communication within the family; this reason is considerably more important.
It is certainly directly connected to the first: the first generation of
Okinawans educated entirely in Japanese continued to use Japanese with their
children even within the family circle. This created a snowball effect: with
each following generation the competence in Okinawan diminished, until for
those who are now in their forties and for the following generations it
essentially became a foreign language, which could either be learned or be
ignored altogether, because there is no compulsory education in Okinawan
either in schools or in universities. Even at the University of Ryūkyū, the
Okinawan language is an elective in the general education program. In short,
Okinawan, like Ainu, has also lost its communicative function.
9 Case Study of Failure #3: Hawaiian
At first glance, Hawaiian is in an extremely advantageous position
compared with Ainu and Okinawan, which have no official status in Japan.
Hawaiian, on the other hand, enjoys the status of an official language of the
State of Hawai’i alongside English. This situation is quite peculiar at
the federal level, because English is not an official language of the United
States of America. The USA does not have any official state language at all,
contrary to all other countries on the globe. It is more a matter of practice
and convenience that English fulfills the actual communicative function of a
state language, but this status is not legalized by any means. No other State
in the Union has an official state language, either. So the State of Hawai’i
is an exception in this respect, as in many others. All state documents have
to be issued in both English and Hawai’ian. So at first glance this legal
practice ensures that the Hawai’ian language preserves its communicative
function, at least on the written level. But does it? In other words, who
reads these documents in Hawai’ian? It appears that practically no one does.
To justify what might seem at first glance an extreme position, let us look at
the Hawai’ian demography of the State of Hawai’i, both ethnic and linguistic.
According to the official count (I would not really call it a “census”), about
6-7% of the people in the State of Hawai’i have Hawai’ian ancestry. But
how this figure is arrived at is completely unclear: under official Hawai’ian
law only a person with ½ blood quantum can be considered a Native Hawai’ian.
But this exists solely on paper. I knew a person who had a much lighter skin
color than Irish or Anglo-Saxon whites and two blond children: they were all
Native Hawai’ians. To give this person his due, he was really trying to educate
his kids in Hawai’ian. But neither he nor they were native speakers.
Unfortunately, he is not the only example.
The Hawai’ian language is still spoken natively on the westernmost island
of Ni’ihau, which has a population of 300 people out of almost two million in
the State of Hawai’i.3 Some elderly native speakers can still be found at the
Hawai’ian homesteads throughout the islands. But there is hardly anyone there
younger than sixty or seventy who is not, in the best-case scenario, a
complete English-Hawai’ian bilingual. The Hawai’ian language is, of course,
spoken on the University of Hawai’i campuses. But without any doubt it is not
native: it is HSL, Hawai’ian as a Second Language. In contrast to the cases of
Okinawan and Ainu, there are even two showcase schools, one on the island
of O’ahu and the other on the island of Hawai’i (Big Island), where almost all
instruction is conducted in Hawai’ian. But these are only two schools in the
whole State of Hawai’i, and their main purpose is to impress dignitaries from
the US mainland and linguists from abroad. Hawai’ian is, of course, taught
at the University of Hawai’i, primarily at the Mānoa and Hilo campuses. But it
is not required, and it is not a part of any university or school curriculum.
The result: the Hawai’ian language has completely lost its communicative
function in the State of Hawai’i. One has no greater chance of buying a hot
dog in one of the Honolulu convenience stores speaking Hawai’ian than one has
on Hokkaidō using Ainu. So, who is reading these state documents in Hawai’ian
except fishermen and agriculturalists from the island of Ni’ihau? Probably
no one.
There are numerous textbooks, (Elbert 1970), (Kahananui and Anthony
1970), (Hopkins 1992), (Wight 1992), (Cleeland 2006), and other teaching aids
for Hawai’ian, (Judd et al. 1945), (Judd n. d.), (Hitchcock 1968), (Alexander
1968), (Pukui et al. 1975), and there has been no dearth of attempts to
revitalize it, probably more than for Ainu and Okinawan combined. And yet the
results are largely the same: ultimate failure, albeit failure in a noble
undertaking. And the one common thing that unites all three failures is the
complete loss of the communicative function.
Before proceeding to the conclusion, I would like to draw the attention of
my readers to one case (which is by no means unique, so it serves here just as
an example) where such a catastrophic failure did not occur in spite of the
relatively small number of speakers, and where language death is not to be
expected any time soon.
3 There is an additional problem with this island, which is privately owned by
the Robinson family from Kaua’i. There is a wonderful law inherited,
apparently, from feudal times: once a resident decides to leave, s/he can come
back only to visit relatives. This is, undoubtedly, the smartest possible
strategy the Robinson family could have come up with regarding the
preservation of the Hawai’ian language.
10 The Case of Survival: Western Toyama Dialect of Japanese
Before the advent of the Meiji government in 1868, the Western Toyama region
of Japan belonged to Kaga province, with which it was bound by strong
cultural and linguistic ties. The Western Toyama dialect is just a variety of
Edo period (1600-1867 AD) Kaga speech, although the former is separated from
the latter by a mountain range. Most of the former Kaga territory is nowadays
in Ishikawa prefecture, and linguistically Western Toyama is just a variety of
the Ishikawa dialect and has much less in common with the rest of Toyama
prefecture. The total population of Western Toyama (the former Western Tonami
county) hardly exceeds 50,000 people, a mere drop in the ocean of the total
Japanese population of 126 million. In this situation one would expect that
in the modern world Western Toyama would be swallowed linguistically, if not
by Standard Japanese, then by the rest of Toyama prefecture. Yet this has not
happened, and it is not even on the horizon. The question is: why? The
conditions for complete linguistic destruction seem to be much more favorable
here in the geopolitical sense than in the three cases of failure that I have
surveyed above.
The answer again lies in the communicative function. The language of the
school in Western Toyama is Standard Japanese. So is the language of any
workplace. But the language of the family is not: it is still the local
language. And one uses the local language, and not Standard Japanese, when one
goes to a local fishmonger, tatami maker, meat seller, or sake dealer on the
same street or in the same quarter. Not only the inner circle of the family,
but all the neighbors still use the same language, and this ensures its
survival, and will continue to ensure it for countless generations as long as
they stick to the same practice.
Conclusion
Once a language loses its communicative function, it becomes endangered
very quickly, and then, in the course of two or three generations, moribund.
The next stage is certainly complete language death. The most important
factor in language survival is the attitude of the actual speakers; other
factors, such as imperial pressure for unification, may play a role, but
they are rarely decisive except in a situation of genocide. Once a language
reaches the endangered stage, let alone the moribund stage, there is not much
we as linguists can do, except to document to the best possible extent a
language which is either endangered or moribund. We can shed our tears over
the irretrievable language loss, but so flows history . . . And it is much
better to channel our efforts into language documentation than into the
unachievable dream of ‘revitalization’, which will always be a dream inside of
a dream.
References
Alexander, William D. 1968 (1891). A Short Synopsis of the Most Essential Points in
Hawaiian Grammar. Rutland and Tokyo: Charles E. Tuttle.
Blum, Shoshana and Chaim Rabin. 1982. Modern Hebrew. Jerusalem: Tarbut.
Cleeland, Hōkūlani. 2006. ‘Ōlelo ‘Ōiwi. Hawaiian Language Fundamentals. Honolulu:
Kamehameha Publishing.
Elbert, Samuel H. 1970. Spoken Hawaiian. Honolulu: University of Hawaii Press.
Hitchcock, H. R. 1968. An English-Hawaiian Dictionary. Rutland and Tokyo: Charles E.
Tuttle.
Hopkins, Alberta Pualani. 1992. Ka Lei Ha’aheo. Beginning Hawaiian. Honolulu:
University of Hawaii Press.
Izutsu, Katsunobu (ed.) 2006. I/YAY-PAKASNU. Ainu go no gakushū to kyōiku no tame
ni. Asahikawa: Hokkaidō kyōiku daigaku Asahikawa kō.
Izutsu, Katsunobu and Yoritaka Tezuka. 2006. Kiso Ainu go. Sapporo: Sapporo dō
shoten.
Judd, Henry P. n. d. The Hawaiian Language and Hawaiian-English Dictionary: a
Complete Grammar. Honolulu: Hawaiian Service, Inc.
Judd, Henry P., Mary Kawena Pukui, and John F. G. Stokes. 1945. Introduction to the
Hawaiian Language. English-Hawaiian Dictionary. Hawaiian-English Dictionary.
Honolulu: Tongg Publishing Company.
Kahananui, Dorothy M. and Alberta P. Anthony. 1970. Let us Speak Hawaiian. Honolulu:
University of Hawaii Press.
Kapstein, Matthew T. 2006. The Tibetans. Malden and Oxford: Blackwell.
Karimata, Shigehisa. n. d. Wakai hito no tame no Uchināguchi nyūmon. Naha: Ryūkyū
daigaku.
Kawabatake, Yasuo. 1982. Okinawa hōgen nyūmon. Naha: Okinawa kyōiku shuppan.
Murasaki, Kyōko. 2010. Karafuto Ainu go. Nyūmon kaiwa. Kushiro: Midori kujira sha.
Nakagawa, Hiroshi et al. 1994. Akor itak. Ainu go tekisuto 1. Sapporo: Hokkaidō utari
kyōkai.
Nakagawa, Hiroshi and Mutsuko Nakamoto. 1997. Ekusupuresu Ainu go. Tokyo:
Hakusuisha.
Nakagawa, Hiroshi and Mutsuko Nakamoto. 2007. Kamuy yukar de Ainu go wo manabu.
Tokyo: Hakusuisha.
Nishioka, Satoshi and Jō Nakahara. 2000. Okinawa go no nyūmon. Tokyo: Hakusuisha.
Novikova, Klavdiia A. and Vera N. Savel’eva. 1953. K voprosu o iazykakh korennykh
narodnostei Sakhalina (On the problem of the aboriginal languages of Sakhalin). In:
Iazyki i istoriia narodnostei krainego severa SSSR. Uchenye zapiski Leningradskogo
Universiteta. Fakultet narodov severa (Languages and history of the peoples of the
Extreme North of the USSR. Communications of the Leningrad State University.
Faculty of the peoples of the North), #157.2: 84-133.
Pukui, Mary Kawena, Samuel H. Elbert and Esther T. Mookini. 1975. The Pocket
Hawaiian Dictionary. Honolulu: University of Hawaii Press.
Sakihara, Masashi, Shigehisa Karimata, Moriyo Shimabukuro and Lucila Etsuko Gibo.
n. d. Rikka, Uchinaa-nkai! n. p.
Satō, Tomomi. 2008. Ainu go bunpō kiso. Tokyo: Daigaku shorin.
Tamura, Suzuko. 1979. Aynu itak. Ainu go nyūmon. Tokyo: Waseda University.
Uchima, Chokujin and Mitsuyoshi Nohara. 2006. Okinawa go jiten. Tokyo: Kenkyūsha.
Vovin, Alexander. 2004. Some notes on Old Turkic 12-year animal cycle. Central Asiatic
Journal, 48.1: 118-132.
Vovin, Alexander. 2009. 『万葉集と風土記に見られる不思議な言葉と上代日本列島における
アイヌ語の分布』。京都：国際日本文化研究センター。
(Strange Words in the Man’yōshū and the Fudoki and the Distribution of the Ainu
Language in the Japanese Islands in Prehistory). Kyoto: The International Research
Center for Japanese Studies, 57 pp.
Vovin, Alexander. 2010. Once again on the Ruan-ruan language. In: Ötükenden
Istanbul’a. Türkçenin 1290 Yılı (From Ötüken to Istanbul. 1290 years of the Turkish
language). Istanbul, ed. by Mehmet Ölmez et al., pp. 27-36.
Wight, Kahikāhealani. 1992. Learn Hawaiian at Home. Honolulu: Bess Press.
Index
Abaev, Vasilij I. 22, 23, 35 Andaman Islands 128

Abakan Xakas 10, 11, 12, 13 Anglo-Romani 85
Abaza 92 Arabic dialects
Abkhasians 91, 93, 94, 96 gilit-dialect 210
Abkhaz-Adyghe 80 loss of grammatical features 221
Abkhaz 92, 95, 96 qəltu-dialects 210, 223
az 29 Turkey 209-210
Abkhazia 92, 95, 96, 246, 247, 252, 255 Aramaic 125, 133
Abkhazians 91, 94, 96, 228, 229 Ardahan 128, 132
Achaemenid 125, 130 Ardahan University 96
Adjarians 93 Ardeşen 95n1
adjective 48-49 Arhavi 95n1
Adyghe 92 see Circassian Armenia 85
Afrikaans 263-264 Armenian 80, 85, 192, 193, 195
agreement see verb indexing Armenian, Hai 130-133
agriculture 124 see farming Arsacid 132
Ainu 277, 278, 280, 281, 282, 283, 284 Artvin 95n1
Alan 18, 21, 22, 35 ASK DEAL 118
Albania 83, 86 ASK REAL 109, 110, 112, 113, 118, 119
Albanian 81, 86, 126 Assyrian 129
Albanic 81 Atlas of Endangered Languages 51, 56
Aleut 2, 6 Axial Age 125
Alexander, William D. 275, 284 Avar 48-49
alignment 46-48 Avar multilingualism 100
Altai Turks 2 Azerbaijani 192, 193, 196, 197, 198
Altaic Society of Korea 51-52 Azov 52
Al’utor 2
Anthony, Alberta P. 284 Babylonian 125
antipassive 46-48 Bačkovo 85
Arabic 280 badge function of language 124
Armenian-Qypchak(s) 54-55 Bagapsh, Sergej 96
Arnaut, Tudora 52 Balkan Wars 83, 85
Aromanian 81, 83, 84, 87 Bashkir 116, 117
ArtPole, Art Agency 53 Basque 126
Arvanitika 81, 86 Bela di Suprnã 87
Ashyk Garip 53 Belamaci, Constantin 84
aspect 47 Bel’tir Xakas 13
Assyrian 80 Benediksten, Age 91
Austro-Hungarian 81 Beria, Lavrent’i 93
Azerbaijan 82 Bezhta 43-44, 46-49, 60-62, 64, 67, 69,
Altai 117 76, 104
Altaic languages 110, 114, 118 Bible translation 233-235
Altaic Languages Series 113 bi-cultural communities 55
Altaic Society of Korea 108, 109, 111, 114, bilingualism 55, 127
116, 119 Billbao-Sarria, Paul 54
290 Index
Blum, Shoshana 279 Circassian(s) (Adyghe, Kabardino-

Boboshtica 86 Cherkess) 85, 91, 92, 94
Bonan 118 qolow (Kab.) 29
Borcka 95n1 šəquɫtər (Adyghe) 29
borrowing 22-25, 28-29, 31-34, 36 zǝ (Adyghe) 29
Bosha see Lomavren Cleeland, Hökûlani 284
Bosnian 86 code
Bostancı, Eylem 94n4 mixing 9, 227, 236, 238-239
Bucharest 82 switching 9, 245, 249, 250
Buffon, Georges-Louis Leclerc, Comte de talkers 128
258 colonialism 1, 3, 14
Bulgar 275 commerce 125
Bulgaria 83, 85, 86 communicative
Bulgarian(s) 55, 86 habitus 202
Buriat 2 knowledge 189, 190, 199, 202
Buryat 118 system 188, 193, 196, 199, 200, 203, 204
Byzantine 84, 86 complementizers 9, 10
complex sentence structure 9, 10
Cabal-Guarro, Miquel 52 computer-mediated communication 244,
Canada 82 245, 248, 255, 257
canonical typology 71-74 conceptual domain 201
Cappadocian 83 conditionals 14
case marking 9, 10, 11, 12, 13 conjunctive knowledge 190, 199
Caucasus 130 contract enforcement 124
Caucasian 151-153, 156-157, 161, 163-164 contact-triggered restructuring 1, 4, 9, 10, 11,
north west 91 12, 13, 14
s[outh] 92 Corinth 126
Caucasian Albanian 194 Creole 224
Celtic 126, 127 Crimea 52-56
census data 5, 6, 7 Russian-speaking population 56
Census, Russia 17, 18 Crimean Tatar language 51-53
Čerkes kjoj 85 Latin based alphabet 55
Chan 246 texts 55
Chagatay 276, 277 Wikipedia, Linguistic corpus 52
Chechen Crimean Tatars 52 -56
mur 30 cultural
Chejudo language 278, 279 affiliation 55
Chinese 110, 275, 276, 277 capital 191, 192, 204
Christian 79, 83n4, 86 dialect 204
Christianity 126, 131 institution 191
Chukchi 2, 7 policy 55
Chukotko-Kamchatkan 2 Cumans 18, 21, 32-36
Chulym Tatar 115, 116
Chulym Turkic (Ös) 2, 9 Daghestan 82, 85
Chuvash 115, 117 Daghestanian 92
părça 33 Dagur 108, 111
Cimmerians 129 Daniil of Moscopolis 84
Index 291
Darginian European Language Equality Network

dupur 27 (ELEN) 54
Darkhat-Mongolian 167 Even 2
Darwin, Charles 258-260 Evenki 2, 5
dative 72-75 extended phenotype 261
demographic factors in language survival and extinction 151, 152, 155-158, 163
endangerment 262, 271-273 extra-linguistic factors 151, 155
Department of Linguistics 108, 110 Ewen 118
dictionaries of unwritten languages 102 Èwēnkèyǔ Jiǎnzhì 111
diglossic situation, diglossia 244, 246, Ewenki 108, 111, 118, 276
254, 257
Dirr, Adolf 91 facebook 235-240, 248, 252-254, 257
disease (Black Death) 130 falling marginal costs 124, 127
DNA 123 farming, food production 124, 133
Dobrudja 83 fertile reindeer doe 170
Dolgan 5, 116 Feurstein, Wolfgang 92, 93
Dom 85 Fındıklı 95n1
domains of folk-linguistic 199
cultural/social practices 201 Frasheriote 87
use 262-268 French 49, 130, 131
Domari 84, 85 functional
Doric 83 biculturalism 187, 197, 199
Dormant languages 55 bilingualism 197, 199
Dumézil, Georges 91 Fuyü Kyrgyz 114, 115, 118, 137
durative 47
Dutch 263-264 Gagauz 81, 86, 115, 116, 117, 118
Dzongkha 265-267 Gagauz language 51
Garbet see Domari
East Old Turkic 275 Gagauzes 55
Eastern Old Japanese 277, 278 Garabik, Radovan 52
East Taiga, 166 gelded reindeer 171
ecology 79, 80, 82, 85, 87 gender 44-46
Elbert, Samuel H. 284 gene mutation 123
elite languages 133 genocide 80
endangerment 151, 152, 155-158, 163, 165 Georgia (country) 92, 92, 94, 95, 96,
endangerment and extinction of languages 226-240, 244, 246-249, 252, 255
258-273 Georgian (language) 39-41, 85, 94, 96,
Enets 2 246-255
England, English 46-47, 49, 53, 130, 134 ašk’ili 29
Enguri 247 ǯγiba- (dial.) 29
ergative 61, 72-75 marc̣qv- 30
Esenç, Tevfik 91 mc̣qer- 29
Eskimo-Aleut, Eskimoic 2 naʒv 30
ethnic shame 5, 6, 7 qoqo 29
Europe 54, 79n2, 80, 81, 84, 87 Georgian script 229-232, 234-239
European Charter of Regional and Minority Georgians 93, 94
Languages (ECRML) 94, 95, 233, 247 German 55, 280
292 Index
Gesamtsprache 192, 193, 196, 190 Imperial Russia 1, 3

Gogebashvili, Jakob 229-232 inclusive fitness of languages 261-262
Goran 86 Indian prakrits 275
Gorna Belica see Bela di Supra Indic 84, 85
Gothic 126 indigeneity 80, 81
grammars of unwritten languages 105 Indo-Aryan 275, 276
Greece 83, 86 Indo-Aryan languages 275
Greek 81, 83, 126, 128, 131, 132 Indo-European family 80, 276
apóstolos 34, 36 inferencing 40-41
kládos 30 Ingush
mínthē 29 mur 30
Griko 83n4 Inhalt of a language 268-271
Gundestrup Cauldron 127 institutional support 246, 247
Gypsies 79 see also Dom, Domari, Lom institutionalization 191, 192
Insular Turkic languages 51
Hachijö 278 internal factors 163
Haci Osman Köyü 91 Iran 79n2
Haeckel, Ernst 259 Iranian 129, 130, 275, 276 see Scythians,
Haldi 130 Medes, Persians
Hasidism 82 Iranian languages 275
Hawai’ian 283, 284 Israel 82
Hayato 277 Istanbul 82, 95n2
Hebrew 279, 280 Italy 83n4
hegemony 1, 3, 4, 8, 14 Itelmen 2, 7
Hellenic 81, 83n4 Izutsu, Katsunobu 280, 281
Hewitt, B. George 93
Hezhe 108 Japanese 275, 277, 278, 280, 281, 282, 285
Hidaka 280 Japhetic theory 20, 22-24, 35
Hindi 264-265 Japonic 277, 278
Hinuq 60-64, 67-76 Jastrow 209-210
Hitchcock, H.R. 284 Jews 79
Hokkaidö 277, 278, 280, 281, 284 Judd, Henry 284
Hokkaidö Ainu 278 Jurchen 277
Holocaust 82 Jurchenic 275, 276, 278
Honshü 277 Judeo-Georgian 82
Hopa 95n1 Judeo-Greek 82
Hopkins, Alberta Pualani 284 Judezmo 81, 82
humanity Juhuro 82
global spread 122
language ability 122, 123 Kabardian 41-43 see Circasian 92
Humboldt, Wilhelm 262, 268 Kahananui, Dorothy M. 284
Hungarian Kajnas 86
borsó 33 Kamasian 2
Kangjia 118
ideology 79, 81 Kapstein, Matthew T. 277
immigrants 83 Karachay-Balkar 80
imperfect learning see language change, Abıstol, Amıstol 34, 36
language learning aran 27
    awana 26
    balas 24
    bittir 28, 32, 36
    biyik (Balk.) 33
    burçak 33
    cıgıra, zıǧıra 31
    cumarık 31
    çeget (Balk.) 26
    çuçxur 27
    çum 29
    didin 31
    dorbun 26
    duppur 27
    fadawan (Balk.) 28
    gabu 28
    gebelo, gelbo 30
    gıbı (Kar.) 28
    gılıw 29, 36
    gǝmǝx (Balk.) 28
    göbelek 30
    gubu (Bal.) 28
    gumulcuk 28
    ırxı 27
    kaya 27, 36
    kıldı 30
    kırdık 28, 36
    küllüm 27
    kündeş 31
    maka 33
    mant 29
    miyik (Balk.) 33
    mıga 29, 30, 36
    mırzı 28
    mursa 29, 33
    murtxu (Kar.) 30
    nanık 30
    nazı (Balk.) 30
    nızı (Kar.) 30
    pura 30
    qımıja (Kar.) 24
    şinji (Balk.) 28, 31
    şkeyli, şkildi 29
    taban 34
    taqüzük 28, 36
    tıgır, (Balk.) tıkır 30
    tılpıw 27, 34
    tobuk (Balk.) 34
    töppe 27
    türtü 30
    xans 28
    xömpek, xoppug 31
    xuru 27
    züdür 29
Karachi see Domari
Karaim 115, 116, 118
Karaim language 51
    Crimean dialect 52
    Halych-Volyn’ dialect 52
Karaite(s) 54-55
Karak (Kaya, Mimana) 278
Karaulov, Nikolaj 19
Karimata, Shigehisa 282
Kartvelian language family 80, 226, 229-233, 246, 255
Kashub 54
Kawabatake, Yasuo 282
Kazakh 115, 276
Kereit 275
Kerek 2
Ket 2, 6, 12
Ket-Yugh 276
Khakas 117
Khalkha Mongolian 110
Khanty 2, 5
Khitan 275
Khotanese 275, 276
Khövsgöl region 166
Khwarezmian 275
Khwarshi 60-64, 67
Kıpçak 17, 18, 21, 33
Kili 276, 277
Kolkhian (language) 246
Koguryöan 278
Korcha 86
Korean 278, 279
Koreanic 278, 279
Koryak 2
Kosova 86 see Üsküp
Kott-Assan-Arin 276
Krymchak 115
Kubedinova, Lenara 52
Kumaso 277
Kumyk 80, 118
    baka 33
Kurds 129, 133
Kuril Ainu 278
Kyiv 51-52
Kyrgyz 116
Kyüshü 277
Kyüshü Ainu 277
Kyüshü Old Japanese 277

Lamarck, Jean-Baptiste Pierre Antoine de Monet, Chevalier de 258
language
    awareness 190-193, 195, 196, 204
    background 122
    causes 122
    change 122
    diversity 122
    endangerment (scale) 7, 106, 126, 128, 133, 258-273
    ideology 191
    loss 127
    maintenance 244-246, 255
    sage 191, 195, 196
    shift 151, 156, 157
    standardization 233, 240
    value 128
language and dialect 232-233
language and identity 226, 229, 232-233
language contact(s) 24, 32-33, 36
    eastern Anatolia 209-211, 223
    effects of 211, 223-24
language families 123
language learning 122, 123
language retention 4, 7, 8
language shift 2, 3, 7, 8
language survival 258-273
Latin 126, 127
    imperial 128
Laz 92, 93, 94, 95n1, 95n2, 95n3, 246, 248, 249, 253, 255, 256
Laz Cultural Organisation 96
Laz Institute 95n2
Lazika Publication Collective 95n3
Lesghians 93
Lausanne, Treaty of 82
Lezgian 194
    čumal 29
    dur 29
    turt 30
Lezgic 92
lingua franca 125, 127, 128, 131
linguicide 81
linguistic
    heritage 151, 152, 155, 163
    relativity 268-271
    rights 247
    topography 258-273
literacy 80
    emergent 228-237
    intimate 237-239
literary standard 246, 248
Lithuania 51
loanword 48-49
Lom 85
Lomavren 84, 85
loyalty 55
Lubavitcher 82

Macedonia 82, 83, 84, 85n6, 86
Macedonian 83, 86
Malthus, Thomas Robert 258
Manchu 118, 277
Manchu-Tungusic 110, 113, 118
Mansi 2, 5
Manyas 91
Mari
    pursa, pırsa 33
Marr, Nikolaj 20, 22, 230-231
Mariupol 53
Maupertuis, Pierre-Louis Moreau de 258, 268
Mbaliote 87
Medes 129
Meglen 83
Meglenoromanian 80, 82, 83
Mesrop Mashtots 131
Mészáros, Julius von 91
military command 125
Mingrelia 92, 93
Megrelian(s) 92, 93, 94, 96, 244-257, 246, 255
    Megrelian grammars 248
    Megrelian identity 247, 256
    Megrelian Wikipedia 254, 257
    γoγo 29
    šker- 29
Megrelo-Chan language 246
migration 244, 252
    forced migration 246, 247
missionary 126 see religion
Moldova 86
Mongolian 118, 276
Mongolic 2, 110, 113, 118
monolingualism 55
multilingualism 81, 87
Murasaki, Kyöko 280
Muslim 79, 86
Müller, Friedrich Max 259-260

Naiman 275
Nakagawa, Hiroshi 280
Nakahara, Jö 282
Nakamoto, Mutsuko 280
Nakh-Daghestanian 60-76, 80
Nanai 2, 6, 118
Nanaic 275, 276
Nãte/Nãnte 82
nation-state 80, 81, 82, 86
National Network on Endangered Turkic languages 51, 54
Native Siberian languages 1, 2, 3, 4, 5, 6, 7, 8, 9, 11, 13, 14
NATO 85
Nawar see Domari
Nazi 82
Negidal 2, 6
Nenets 2, 4, 5
Nepali 264-265
New Uyghur 276, 277
new-born reindeer 168, 169
Nganasan 2, 6
Nij 187, 188, 191-199, 203-205
Nishioka 282
Nivkh 2, 6, 13, 14
Nivx (Gilyak) 277
Nogay
    biyik 33
Nohara, Mitsuyoshi 282
North Tungusic 275, 276
Northern Honshü Ainu 277
Novikova, Klavdiia 280

Ob-Ugric 2
Ochamchira 92
Odul 2
Oghuz dialects of Amu Darya region 114, 115
Oirat 276, 277
Oirat-Kalmyk 118
Okinawan 281, 282, 283, 284
Omok 2
onomatopoeic verb 47
orality 231, 240
Oroch 2, 6
Orochen 108
Orochi 118
Orok 2
Orontid 130, 133 see Armenian
orthography 231-232, 234-240
Ossetic 80
    Amistol (Dig.) 34, 36
    awwon 26
    ærxæ 27
    æxsæli, æxsælæ (Dig.) 29
    æxsæly (Iron) 29
    bælas 24
    bærz / bærzæ 28
    bittir (Dig.) 28, 32, 36
    bynʒ / binʒæ 31
    cægat 26
    cuxcur 27
    cym / cumæ 29
    c’upp / c’opp 27, 36
    dyrǧ (Iron) 29
    fadawon (Dig.) 28
    ʒæbidyr / ʒæbodur, ʒæbedur 29
    ʒægæræg 31
    ʒedyr, ʒeʒyr, ʒedyræg / ʒæduræ 29
    ʒumarǧ (Dig.) 31
    gæby, gyby / gæbu 28
    gælæbo, gæbælo, (Iron) gælæbu 30
    gælæw 29, 36
    gæmæx 23, 28
    kældæ 30
    kændys / kændus 31
    kærdæg 28, 36
    k’æj / k’æjæ 27
    k’æxæn 27
    k’oyldym / k’uldun 27
    mæga 29, 30, 36
    mælʒyg / mulʒug 28
    mænærǧ 30
    mæntæg / mæntæg, mont 29
    mæra / mura, pura 30
    murtgæ, murk’æ 30
    næzy / næzi 30
    ninæǧ (Dig.) 30
    pursa (Dig.) 29, 33
    pysyra (Iron) 29, 33
    qoppæg / qoppæǧ, qobæǧ 31
Ossetic (cont.)
    ran / rawæn 27
    sk’eldu (Dig.) 29
    swadon / sawædonæ 26
    synʒ / sinʒæ 28, 31
    tægær 30
    tæk’uzgæ (Dig.) 28, 36
    tulfæ (Dig.) 27, 34
    turtu, (Iron) tyrty 30
    typpyr / tuppur 27
    ug, (Iron) wyg 30
    xans (Dig.) 28
    xælyn-byttyr (Iron) 28
    xoyr / xuræ 27
    yzmis / (æ)zme(n)sæ 27
    zum, zumarǧ (Dig.) 31
Ossets 93
Ottoman, Osman 55, 81, 85, 132, 133

Paekchean 278
Pajak, Mt. 82
Pamir 275, 276, 277
Pamir languages 275, 277
para-Mongolic languages 276
pastoralism 124 see food production
Pazar 95n1
Persian 130, 133, 275, 276
    ās 21
    bahāneh 26
    balqar 21
    bon 26
    cagād 26
    dār 26
    kāh 28
    nāz, nāžu, nājū 30
    šepeš 31
    topoli 27
    tūt 30
personification 44-46
pharyngealization 43-44
phonological criteria 32-34, 36
phonology 158
Poland 55
Polish 54, 280
Pomak 86
Pontic 83
Posha see Lomavren
post-Soviet Russia 1, 4, 5, 6, 8, 14
predation 124
priests 124
Prishtina 85
Pritsak, Omelyan 21, 33
protection 124
processing 40-41
productive male reindeer 170
Pukui, Mary Kawena 284
Pumpokol subgroups 276
Punic 126
Put’k’aradze, T’ariel 232-233

Q’ivruli see Judeo-Georgian
Qrymchak language 51-52
questionnaire(s) 111, 119

Rabin, Chaim 279
rational reconstruction 123
reindeer taming 172
relative clause 41-43
relative pronouns 9, 10
religion
    Buddhism 126
    Christianity 126, 131
    Islam 126
    worship 125, 126
    Zoroastrianism 131
revival 152, 163
rhetorical
    act 188, 189
    genre 189, 191, 204, 205
Rhodopes 85
Rhodopian 86
riding and packing reindeer 171
Rom 85
Romani 81, 84, 85
Romania 83
Romanian 83
Romaniote see Judeo-Greek
Rome 126, 127
ruan-ruan 276
ruler 124
Rumelian 81, 86
Rize 95n1
Russia 81, 82, 85, 92, 247, 249
Russian(s) 1, 4, 5, 7, 8, 9, 10, 11, 12, 13, 14, 93, 96, 110, 280
Russian influence 9, 10, 11, 12, 13, 14
Russian language 54-55
Rybalko, Oleksandr 53

Safavid 81
Sakhalin 277
Sakhalin Ainu 278, 280
Sakihara, Masashi 282
Salar 116
Salonika 82
Slovakia 52
Sloviansk 53
Samegrelo 246, 250, 252, 255
Samoyedic 2, 3
Sarajevo 82
Sarpi (Xianbi) 275
Sason Arabic
    causatives 220-221
    complement clauses 213
    consonant loss 222
    copula 213-215
    demonstratives 218
    existential sentences 221
    expressions of greetings 211n
    head-final properties 222
    head-initial properties 217, 222
    indefinite nouns 218-219
    loan words 222
    negation 215-216
    nominal agreement 213-215, 218
    object clitics 212-213, 219
    passive 220
    question formation 216-217
    relativizer 213, 222
    simplification in grammar 222
    verbal subject agreement 220
    vowels 222
    word order 212-219
Satö 280
Savel’eva, Vera N. 280
Sayan Turkic 167
Schleicher, August 259-261
scopeless negative operators 11, 12
scribe 125
scripts
    Cyrillic 249, 250
    Georgian 249, 250
    Latin 249, 250
scripture 125
Scythians 129
Scytho-European 30-32
sedentary lifestyle 124
Selânik 85
Salïr 139
Sel’kup 2, 7
semi-speakers 82
Semitic 80
semiotic content of a language 268-271
Seoul National University 108, 110, 118
Serbia 82
Serbian 85n6
Serbo-Croatian, former 86n7
shibboleth 124
Shikoku 277
Shor 115, 118
Sibe 108, 118, 276, 277
Siberia 1, 2, 3, 4, 5, 6, 7, 8, 9, 11, 13, 14
Siberian Tatar 115
Sillan 278
Slavic 81, 86
Sochi 91
social
    circle/sozialer kreis 198
    class 133
    media 235-240
socio-economic factors in language survival and endangerment 262, 271-273
Sofia 82
Sogdian 275, 276
Solon 118, 276, 277
Sony Sound Forge 112, 119
Soviet Union 1, 3, 4, 5, 6, 55
Soyot 2
spatial cases 60-62, 68-76
special language 198, 203
speech vs writing 245
spoken Manchu 108
Sprachbund 81
Stalin 92, 93
Standard German 245
standardised orthography 247
Starobeshevo 53
Stary Krym 53
Stokes, John F.G. 284
subordination 9, 11, 12
Sukhum 92
Svan (language) 93, 94, 226-240
Svan(etic)
    γu 30
    k’a 27
    mant 29
    tek’er, tek’ra 30
Svanetians 93
Swiss-German dialects 245
synchronous vs asynchronous interaction 245

Tabassaran
    čemel 29
Tabyac 275
Tajik 276
Talay 210
tally 125
tam (tense-aspect-mood) 46-48
Tamura, Suzuko 280
Tangut 275, 276
Tao 85
Tatar 54-55, 117, 137
taxation 125
Tbilisi 93
tense-aspect-mood see tam
Tere Khöl 166
Tezuka, Yoritaga 280
theft, robbery 124
Thrace 82
Tibetan 275, 276, 277
Tibeto-Burman 275, 276
Tocharian 275, 276
Tocharian A 275
Tocharian B 275
Tocharian languages 275, 276
Tofa 2, 7, 13, 116, 117
Toju 166
Torlak 86
Trakai 52
transnational networks 254
tribe 124
Tsaatan 166
Tsagaan Nuur 166, 167
Tsakonian 81, 83
Tsez 43-46, 60-62, 64, 67, 69
Tsezic languages 60-76
Tulcea 83
Tumshuqese 275, 276
Tungusic 2, 6, 275, 276, 278
Tungusic languages 277
Turk 132, 133
Turkey 79n2, 85, 91, 92, 93, 94, 96
Turkic 2, 3, 9, 10, 11, 12, 13, 14, 80, 110, 113, 114, 116, 118, 275, 276
Turkic language speaking minorities 51
Turkic (minority) languages 51, 55-56
Turkish 81, 82, 86
    ark 27
    burçak 33
    büyük 33
    çim 29
    kaya 27
    kelebek 30
    tepe 27, 36
Turkmen 276
Turks 93
Turks-meskhetians 53
Tuvan 2, 9, 115, 116, 118, 145, 166

Ubykh 91, 151-165
Ubykhs 91, 92
Uchima, Chokujin 282
Udi 92
Udihe 2, 6
Uilta 108, 118
Ukraine 51-54, 56
    language diversity 54
    language policy 53
    public discourse 53
    south-eastern region 53
    The Language law 55
Ukrainian language 53-54
Ukrainian-speaking population 56
Ulcha 2, 6
umlaut 43-44
UNESCO 51, 56, 246
unwritten minority languages of Daghestan 99
Upper Gal 93
Urartian, Urartu 128, 129, 130, 131
urban classes 133
Urum 115, 118
Urum language 51-53
    Oghuz dialect 53
    Qypchaq-Polovets dialects 52
Urums 53, 55
USA 82
Uslar, Peter 229-230
USSR 83n4
Uyghur 277
Uzbek 136, 276
Üsküp 85

Vartashen/Oğuz 193, 195
verb indexing 39-41
verb subcategorization frames 9, 12, 13
vilayets 85
Virgil 127
voice 46
    see also antipassive
Vovin, Alexandre 275, 276, 281
vowel harmony 44

Wadul 2
Wallace, Alfred Russel 258-259
warfare 124
West Taiga 166
West Yugur 115
Western Old Japanese 277
Western Toyama 285
Wight 284
Wheatley, Jonathan 94
window movie maker 112, 119
word
    grammatical 42
    phonological 42
word class 48-49
written vs oral communication 246, 252-254

Xakas 2
Xiong-nu 276

Yakut 115, 116
Yellow Uyghur 143
Yeniseian 276
Yeniseic 2, 3, 12
Yiddish 82, 280
young written languages 92
youtube 248-250, 252, 253, 257
Yugh 2, 12
Yugoslavia 81
Yukaghir 2, 5
Yukchin 278, 279

Zan 246
Zhvania, Ishak 93
Zinobiani 195
Zugdidi 93, 252
Zutt see Domari


