You are on page 1of 12

Peering at the Tocharians through Language

A Window to the Ancient Europoid Folk of
Western China
Written by Afsheen Sharifzadeh, a graduate of Tufts University focusing on Iran and the
Caucasus. The goal of this article is to present the Tocharian narrative in a
broad linguistic framework, with a focus on affinities to earlier Proto-Indo-European.

Europoid-type “Tarim Mummies” found in XInjiang, China, dating back to around 1800 BC.
The Tocharians are described as having full beards, red or blond hair, deep-set blue or green
eyes and high noses and with no sign of decline as attested in Chinese sources for nearly
a millennium. The mummies, particularly the early ones, are frequently associated with the
presence of the Indo-European Tocharian languages in the Tarim Basin, although Mallory and
Mair attribute the later mummies to the Iranian Saka (Scythian) people who settled later in the
western part of the basin.


What do Englishmen, Sicilians, Spaniards, Bengalis, Kurds, Russians, Welshmen, Germans,
Pashtuns, Lithuanians, Armenians, Australians, Persians, Irish, Greeks, Swedes, Punjabis,
Albanians, Brazilians, Icelandics, Romani, Ossetians, and many other peoples all have in
common? Astonishingly enough, we all speak languages derived from a single Mother Tongue.
This is a humorously underappreciated fact amidst the clutter of our daily social interactions, and
more broadly, in our latent perpetuation of decidedly irreconcilable ethnic consciousnesses.

Welsh and Hindi. to the modern linguist’s dismay.C. its daughters were splitting and differentiating via mass migrations of peoples throughout Eurasia. . live? After all. communal responsibilities. the Proto-Indo-Europeans. they were not some obscure race of language-speaking humanoids roaming aimlessly on a primitive Earth. but rather. and shares a common ancestor with languages such as English. Afghanistan. Pashto is an Eastern Iranian branch language. But where did the progenitors of all these modern languages. And neither was it ever recorded or attested to. a pluralistic people who lived fairly recently with families. These registers have been applied in synergy with archaeological evidence to paint a compelling narrative for one of the most appreciable ethno-linguistic progenitors of modern human civilization.But this Mother Tongue was not a monolith. Russian. Variations of it were spoken for a span of roughly two thousand years between 4500 BC and 2500 BC. speaking an adaptive language with which they sang. In the absence of written attestation to any of these highly dynamic prehistoric vernaculars. sometimes losing and regaining contact with each other after centuries or millennia. Italian. spoken between ~4. ambitions and concerns like you and I.500- 2. Pashtun children in the village of Khost. in the form of the Proto-Indo-European language. linguists have used the comparative method of language reconstruction to produce a long.500 B. All the while. fragmentary list of words used in daily speech. as it underwent drastic regional transformations and passed through defining bottleneck events. The Mother Tongue is known to linguists as Proto-Indo-European.

In the opinion of this author. wherein language shift occurs due to emulation of an intruding but more powerful minority. This phenomenon. This so-called “Kurgan Hypothesis” posits that in the riverine steppe lands darting from southern Ukraine deep into the Ural Mountains of Russia lived a semi-nomadic. mortuary mound-building (called kurgans.joked. animal-sacrificing.500-2. in a world populated by many different language families but theirs came to include roughly half the world’s population by the modern era. and chronological evidence points to a Pontic-Caspian Urheimat (homeland) for the speakers of Proto-Indo-European. from Turkic). probably pulled to the Russian . pastoral tribes which more or less could understand each other.C. the preponderance of archaeological. The Kurgan hypothesis postulates a Pontic-Caspian steppe homeland for the speakers of Proto- Indo-European (pink). The black arrows represent the various branch splittings of neolithic PIE- speaking peoples between ~4. is called Elite Dominance. The Yamna culture (early Proto-Indo-Europeans) was in a turn a collection of semi-nomadic. loved. pastoralist. The Tocharians as Indo-Europeans The answer to that question has been the subject of heated debate among archaeologists and linguists for over a century. philological. glory-inspired people whose commitment to ceremony and client- host tradition coupled with their militaristic ingenuity served as a franchising incentive for the widespread adoption of their languages by subject peoples. In archaeology. a postulation that also explains the later extinction of Iranian languages in Central Asia and Azerbaijan beginning in the in the 11th century AD upon the arrival of a minority of Turkic-speaking peoples.500 B. our conjectured Proto-Indo-Europeans are said to have composed the early mesolithic Yamna culture. lamented and prayed. cannabis-smoking.

spoken by bands of foragers near the end of the last glacial period some 13.000 years ago. Many migrations (especially the Corded Ware cultural horizon that stretched from the Netherlands to the Volga) . This horizon transformed the Yamna culture (Proto-Indo-Europeans) into a mobile. the new invention spurred what archaeologists refer to as the Yamnaya horizon. Tocharian. and Korean. Uralic. and disputably other families. of another [Uralo-Siberian] group of an earlier [Eurasiatic] group of the proposed primitive Nostratic language macrofamily. the easternmost Indo-European language spoken in the Silk Road caravan cities of the Tarim Basin in northwestern China. Somali. When the first wheel-driven wagons rolled into the Pontic-Caspian steppe via the Caucasus piedmont from the ancient urban civilizations in the Near East around 4. Albanian has incongruities). Arabic. Estonian. Georgian.steppe (Samara and Khvalynsk) from the northeast Black Sea basin by adverse climatic changes. primitive but expressive ancestral language to Indo-European. Centum languages (blue) departed first and share a number of archaic phonological features that were later innovated in the Satem (red) languages that stayed behind (Indo-Iranian. also lacked the Satem and Ruki innovations. As such. so it likewise seems to have departed prior to the Satemization phenomenon. and Mari). Armenian. Turkish. expansive economy. The hypothetical area of origin of satemization happens to also be in the range of the Sintasha/Abachevo/Srubna cultures (dark red). Semitic. Slavic. Kartvelian.500 BCE. Hungarian. Altaic. Vladislav Illich-Svetych suggested that the Nostratic language was an incredibly remote. then Indo-European languages would have remote genetic affinities to modern languages like Mongolian. we might better conceive of Pre-Proto-Indo-European (the stage of linguistic development before the Yamna horizon) as a group of related dialects which evolved from one group. Baltic. Indo-Uralic (connecting Indo-European to Uralic languages like Finnish. Nivkh. Ainu. The implications of such a theory are earth-shaking for modern social constructs of “ethnicity” throughout the world. Centum-Satem isogloss between Indo-European branches descended from splitting events of neolithic pastoralists migrating out of the Pontic-Caspian Indo-European homeland. If such a conjecture were to have any baring in reality.

including Indra. Andronovo. The Mitanni dynasty ruled over a Hurrian-speaking (non-Indo-European. Varuna. which bares an interesting resemblance to the nearby Repin Centum derived Tocharian Kokale (from PIE *kwel-/ *kwol). The Mitanni rulers regularly made references to the hymns and deities of the Rig Veda to the east. and it wasn’t until a millenium later that wagon chariots appear in China. who usurped the throne–a common pattern in Near Eastern and Iranian dynastic histories. etc. with the new revolutionary technology of the wheeled wagon. and the Nasatyas or Divine Twins. but likely was founded by Old Indic-speaking mercenaries. the combination of push and pull factors—perhaps a combination of tribal conflicts. perhaps charioteers. Timber-grave. Catacomb-Grave. climatic changes. and gave raise to a number of distinct and innovative cultural horizons (TRB/Globular Amphora culture. The Beijing Chinese word for wheel is KuLu. first . likely related to Urartian and to modern Northeast Caucasian languages) population in what is today northern Syria between 1500 and 1350 BC. The speakers of Proto-Indo-European then pioneered the chariot using a technology from the Fertile Crescent.) that interacted with and often displaced many Proto-Uralic and Paleo-European speaking cultures (prehistoric European languages of unknown provenance. employed in the Rig Veda to refer to the heavenly war-band assembled around Indra. such as the language of the Cucuteni- Tripolye culture. The Mitanni military aristocracy was headed by the “maryanna” (from Indic “marya”: “young man”. Over the millennia. A modern survivor is Basque). Usatovo.coincide. Abashevo-Fatyanovo-Balanovo. Funnel Beaker. Pit-Grave/Poltavka. The Proto-Indo-Europeans were the first to domesticate the horse and develop chariotry.) All Mitanni Kings. as reflected in their Indo-European lexicons. and economic incentives—spread the speakers of PIE throughout Europe and Asia.

The projected Pre-Tocharian migration from the Eastern dialects of PIE accross the Ural Moutain range and into the Altai region around 3. Even later. Pre-Baltic and Pre-Slavic split off probably from the northwestern dialects of PIE probably around 2. a mass migration took place from the central zone of the Yamna culture. It is to the Afanasievo cultural horizon that supporters of the Kurgan Hypothesis ascribe a Pre-Tocharian pedigree.800 BCE. and finally Pre- Indo-Iranian between last.000 BCE. Pre-Armenian. see Middle Dnieper multi-ethnic “vortex” culture for more reading).300 BCE from the northeastern group. Pre-Germanic and Pre-Italo-Celtic would split several centuries later into the Danube valley. Pre-Phrygian.700-3.500-2. . Tocharian was probably closest of kin to the PIE dialects that were ancestral to the later Thraco-Phrygian and Armenian. The push factors for this Trans-Ural exodus are unknown. and this area developed into the Early Bronze Age Afanasievo culture.C.700-3500 BCE. a Mitanni horse trainer used many Old Indic terms for technical details. and Pre- Greek split later yet with PIE transhumance into the Balkans.500 B. Pre-Albanian. and in the oldest surviving horse-training manual in the world. including horse color and number of laps. At the time of this hypothetical Pre-Tocharian split from the eastern dialects. perhaps Proto-Altaic-speaking) a startling 2000 km to the east of their starting point. around 3. but from the western and central dialects respectively. Proto-Indo- European was still in an early stage of its development. The migrants settled on virgin land on the contact zone with Siberian foragers (hunters and gatherers. such as Tvesa-ratha (“having an attacking chariot”). or perhaps it was due to a conflict event. where the migrants likely interacted with speakers of Proto-Uralic and Proto-Altaic before migrating southward into the Tarim Basin. but it may have been encouraged by the new opportunities for social and economic expansion offered by the novel mobile economy discussed above.. but their origins are conflicting and their affinities with each other are problematic for a number of reasons that are outside the scope of this article (such as incongruities in Satemization and Centum superstrate. around a site called Repin between the Don and Volga.300-3. At some point around 3. took Old Indic throne names.

most likely archaic Anatolian branch languages Luwian and Hittite. complex verbal tenses (Anatolian only has present and perfect). The Lion Gates at the ruins of Hattusha.but similarities with Italo-Celtic suggest an extended period of contact following an initial separation event. the dual case for nouns. and major phonemic and lexical shifts that would be passed down to the rest of her daughters. Luwian. in the coming millennia. whereas almost all modern Indo-European languages have inherited PIE *do. For example. Hittite. who are now believed to have been remote relatives of the Proto-Northwest Caucasian- speaking peoples.“to give”. for some Indo-Europeanists these traits suggest that the Anatolian branch did not develop from Proto-Indo-European at all but rather that the two evolved from different geographic dialects of a Pre-Proto-Indo-European ancestral dialect continuum. and the Hattians were ultimately absorbed and assimilated into Indo- European-speaking society after nearly two thousand years of coexistence by the end of the 2nd . all of which are now long-extinct but were once spoken for thousands of years in modern-day Turkey.“to give”. this root originally meant “to take” in archaic PIE around the time of the Pre-Anatolian branch splitting. and the poorly attested Palaic. termed “Indo-Hittite” by William Sturtevant. It later underwent a semantic shift probably in the context of the mesolithic Proto-Indo-European client-host gift-offering tradition in the steppe. Pre-Anatolian had split first of all daughters. As such. perhaps half a millennium before Pre-Tocharian around 4200 BCE from archaic Proto-Indo-European. Turkey. among other languages. Beginning in the 4th millennium BCE the Hattic language was gradually displaced by archaic Indo-European languages. Pre-Anatolian then differentiated into Lycian. which lacked grammatical gender. Hattusha was once the capital city of the Hattians. so Hittite (Anatolian branch) has instead the archaic *Pai.

Afrikaans: kom) help her mälka (German melken. we can imagine there was once a young Tocharian girl on a farm in western China who called out the ñem (Swedish namn. it does not feature the subsequent innovation of Ruki sound law (*s > *š / {*r.millennium BCE. while her macer (Armenian mayr. Middle Persian (Iranian. West Frisian wetter) to wash it down. *gṓʰ. labial to velar shift and *r/*l merger reflected in “wolf” = “gurg”) and Portuguese (Italic). Hindi pitr. Belorussian vadá. Russian and Armenian s. *kṓ ʰ. the latter adopted the former’s self-designation (<Hatti. and plain velars *k. Armenian kov. As the second major branching-off event of PIE. as it split before the Satem shift occurred in PIE (the Satem group merged Proto-Indo-European palatovelars *kṓ . Irish Gaelic each) for dinner and fetching fresh war (Hittite wa-a-tar. Armenian mis) of a yekwe (Latin equus. Satemization and Ruki rule reflected in the register for “eight” = “hašt”. Tocharian still shares a striking number of cognates with her sisters in her core vocabulary. Ancient Greek (Hellenic). Lithuanian š [ʃ]. and a Tocharian-like Ural-Volga area Repin Centum language. Sorani Kurdish mang. Phrygian matar. Hittite ekuus. Persian barâdar. *kṓ became Sanskrit ś [ɕ]. Persian pedar) were preparing the misa (English meat. Tocharian maintains a number of archaisms that are absent in later branches. Lithuanian trýs.) As such. Tocharian is the only geographically “eastern” Centum language. However. Latin lux. Latin mulgere) the tri (Spanish tres. *g. French nom) of her older procer (Dutch broeder. including English (Germanic). Albanian mjel. Latvian. German Mutter) and pacer(Italian pater. Avestan. Kurmanji Kurdish: gav. is likely also the root of the Armenian self-designation Հայ Hay). yielding plain velars only. If we indulge ourselves for moment. and Albanian th [θ] but k before a resonant. The Volga Uralic Mordvin languages (Erzya/Moskha) have loanwords from early Indo-Iranian. *gʰ. However. East Baltic. Gothic mats. . *gṓ. *w. The Tocharian Language Despite her early separation from PIE. For example. *kʰ. including Pre-Greek Catacomb and Pre-Tocharian Don-Repin. Pashto dre) kews(English cow. Ukrainian misjac’). Persian gâv) in the pen at nighttime under the lyuks (English light. *y}). inferring another contact period probably whilst en route to Afanasievo. Kurmanji Kurdish nav. English Tocharian B Ancient Greek Middle Persian Portuguese Proto-Indo-European name ñem ónoma nâm nome *h₃néh₃-mmn eight okt oktṓṓ hašt oito *h₃ekṓ téh₃(u) mother macer mḗṓ tḗr mâdar mãe *méh₂tḗr foot paiye poús pây pé *pṓṓ ds wolf walkwe lúkos gurg lobo *wĺm kʷos new ñuwe néos nṓg novo *néwos star śre astḗṓ r stâr estrela *h₂stḗṓ r Language comparison chart prepared by the author illustrating a few readily recognizable cognates between Tocharian and a few extant/extinct Centum and Satem members of the Indo- European family. but retained the labiovelars as a distinct set. some Indo-European tribes (dialects) maintained tribal-linguistic contact—often via assimilated substrate—prior to their various distinct Proto-stages. *K. For example. Armenian luys) of the beautiful stars and meñe (Danish måne. which in the opinion of this author. Russian brat) to käm (English: come.

Lastly. the lack of a secular corpus in Tocharian A could simply be an accident. proto-Uralic foragers interacted through trade with the neolithic Proto-Indo-European tribes migrating out of the homeland who introduced them to . Saami. Tocharian A and Tocharian B were strikingly different languages with radical divergence in their plural markers. noun endings. Hungarian. Tocharian A (Turfanian) is distributed along the eastern part of the Silk Road. case system and verbal system. Tocharian A was more archaic and used solely as a Buddhist liturgical language. suggesting that it may have been the spoken language of the entire area (discussed below). Mari etc. while the Tocharian B corpus includes documents that are both secular and religious in nature. conceptualized by linguists who reconstructed these loanwords and attributed their origin to some unknown sister of Tocharian A and B. Linguistic evidence suggests that Proto-Uralic and Proto-Indo-European likely shared two kinds of linkages. Alternatively. perhaps a broadly related set of intergrading dialects spoken by hunters roaming between the Carpathians and the Urals at the end of the last Ice Age (Joseph Greenberg calls this language stock “Eurasiatic”. Estonian. perhaps ultimately descended from Nostratic). the result of a fragmentary preservation of texts. and shared basic vocabulary. The Mordvin people centered in the middle Volga region of Russia speak languages belonging to the Uralic macrofamily (includes Finnish.C. whose proto- language homeland was probably in the birch-pine forest zone on the southern flanks of the Ural Mountains. although it is unclear whether they were mutually intelligible. The second link is cultural. Tocharian C is only attested to in about 100 words in Prakrit documents. could be ancestral: the proto-languages probably shared some quite ancient common ancestor. while Tocharian B (Kuchean) is centered in the northern part.Documents from the 6th to 8th centuries identify two Tocharian languages which probably split in the first millennium B.). revealed in the similarity of pronouns. one kind.

The Chinese .” It is very likely that B was also the language of everyday monastery life in the east. concludes: “at the time when the extant materials in dialect A were written it was purely a liturgical language in the monasteries of the east. an authority on Tocharian.agriculture and the wheel. the Tocharians interacted with and borrowed extensively from the Indo-Iranians. it had long since ceased to be a vernacular [as a result of Turkic immigration into the area]… whereas Tocharian B was clearly the vernacular of a comparatively rich and flourishing culture [to the west and better protected by the mountains and the desert from the influence of the Turks]. Turks. their distant Indo-European cousins unbeknownst to them at the time. The primary effect of these languages upon Tocharian was in loanwords into the lexicon. along with the Hsiung-nu (Mongolian nomads). as a result of extensive missionary activity from Iran and India which coincides with the Tocharian’ adoption of Buddhism. again of Iranian provenance. The most recent linguistic influences upon Tocharian were Iranian and Sanskrit.” Wooden tablet with an inscription showing Tocharian B in its Brahmic form. as one of the barbarian kingdoms in their western region which had been involved in many wars with the Chinese. in my estimation. Kucha. The Europoid-type residents of Turfan and Kucha were first noted by the Chinese in the Han-shu in the first century BC. Lane concludes: “the two Tocharian dialects A and B have gone through a long period of independent development… anywhere from five hundred to a thousand years…they are. and again much later with Indo-Iranian and Tocharian migrants trekking eastward out of the PIE homeland. There was also a notable Manichaean minority. 5th- 8th century (Tokyo National Museum) The Decline of the Tocharians Following their Trans-Ural exodus by over a millennium. George Lane. China. On the relationship of Tocharians A and B. existing side by side with the liturgical form of A. and had been so preserved for several centuries at least…. especially in religious terminology. no longer mutually intelligible. and Tibetans.

As such the footprint of the speakers of Tocharian languages remains blurred. Belezek. c. . As a result of Iranian Buddhist proselytic activity. and the various Iranian peoples to the west. Kushans and Khotanese. or as warriors dressed is Sassanian Iranian dress. wherein they present themselves as Buddhists dressed in north Indian clothes. with a Tocharian on the left. red-haired inhabitants of the Tarim basin as Yuezhi. Nevertheless it is likely to have been around the beginning of the Common Era. the Tocharians adopted Mahayana Buddhism and served as an important conduit for the spread of Buddhism into China and the East. the Tocharians appear to have emerged as a devoutly Buddhist and mercantile people. Tocharians are represented iconographically.sources refer to the fair. such as the Bactrians. Ultimately. Southwest Asia. 8th century AD. serving as middle-men between the more advanced civilizations of early Imperial China. Painting of Buddhist monks from the Eastern Tarim Basin. as we can only observe them through the lens of Buddhism. Exactly when Buddhism was introduced to Tocharia from India by Middle Iranian-speaking peoples is unknown since there are no historical records describing such a transmission. the Tocharians seem to have played a role in the Silk Road transmission of Buddhism to China. Together with East Iranian peoples.

Greek. we can learn from Tocharian about the effect that a long migration and contacts with members of other language families can have on an IE language and. and the influence of non-IE languages. after which time they were either assimilated into the growing Turkic-speaking population in the area or simply died out. for instance Sanskrit. as Winter says. has not been as helpful in reconstructing PIE as. This means that the Tocharians composed a distinct ethno-linguistic grouping within Indo-European for nearly three millennia–in turn providing quite a considerable window for study–but their own scant literary productions (mostly monastic and mercantile in nature) coupled with those of their neighbors fail to provide us with any substantive account regarding their there were already Kuchean missionary Buddhist monks in China beginning the third century AD. or Hittite have been. it seems that their culture lasted up until the end of the first millenium of the Common Era. Although details surrounding the social and political undertakings of the Tocharians remains shrouded in mystery due to lack of attestation. its geographic isolation from other IE languages. However. due to the rather late date of the extant documents. therefore. “below the rather forbidding surface of our Tocharian data there are some real treasures to be found. the Tocharian evidence. In general.” .