Seventeenth century texts as a key to Malagasy linguistic and ethnic history
Preprint · October 2020
Alexander Adelaar
Asia Institute, University of Melbourne
1. Introduction1
In this paper I attempt to obtain some phonological information about the Malagasy language
(henceforth MLG) as it is recorded in the two oldest written sources dealing with MLG: 17th
century Sorabe texts and Frederik de Houtman’s textbook and wordlist2. These sources are four
centuries old, and they are bound to contain some information that is pertinent to the linguistic
history of MLG. I also make a critical evaluation of the possible origins of early Sorabe texts,
which are generally assumed to be Taimoro (or Antaimoro).
Sorabe is one name for the MLG adaptation of the Arabic writing system. Another
(older) name is Onjatse. The script is also called “Arabico-Malagasy”, an awkward designation
as it also refers to Arabic applied to MLG in general. A lot has been written in it, and traditional
Sorabe texts represent three languages: Arabic, MLG, and Kalamotetsitetsy (/Antetsitetsy/
Antetetsitetsy). The latter is a mixed language, if not a pidgin3.
Sorabe originated on Madagascar’s south east coast. It is often identified with the
Taimoro area, where the writing tradition is still upheld, and the language of traditional Sorabe
texts is likewise supposed to be a form of old Taimoro. However, this identification is
problematic (see Section 3). Moreover, it ignores the fact that the script was also used for writing
Merina MLG in the early 19th century (before a romanised script became official).
Some Sorabe texts are four centuries old and therefore of vital importance for the history
of the MLG language. However, they are not easy to read as they were originally handwritten
and show archaic language and considerable spelling variation. Many texts are considered sacred
and are guarded jealously by their owners, which makes them difficult to access.
1 I am grateful to Professor Noël Gueunier (Paris) for his careful critical reading of an earlier version of this paper,
and to Redha Ameur (Melbourne) for his assistance with the use of Arabic in MS Word. The usual disclaimers apply.
2 Frederick de Houtman van Gouda, Spraeck ende woord-boeck inde Maleysche ende Madagaskarsche talen met
vele Arabische ende Turcsche woorden […], Amsterdam, J.E. Cloppenburch, 1603.
3 Philippe Beaujard, Le parler secret arabico-malgache. Recherches étymologiques, Paris, L’Harmattan, 1998, p. 5.
Elsewhere4 I propose that Sorabe, and presumably early Islam itself, were introduced
from Southeast Asia. Basic evidence for this claim is the form of the Arabic letters designating
‘d’, ‘t’, ‘ŋ’ and ‘p’ in Sorabe: they seem to be taken from Southeast Asian adaptations of the
Arabic script (such as Pegon, used for Javanese) and not from Standard Arabic, Omani Arabic or
Swahili ones. Other evidence is the term sumbidi or sumbili ‘ritual slaughter’, a concept of
crucial importance in traditional Islamic societies on Madagascar’s southeast coast. It derives from
Malay səmbəlih (originally ‘to slaughter according to Muslim law’). This suggests a Southeast
Asian origin of Islam as practised in coastal Southeast Madagascar, and it corroborates
other evidence that Madagascar was still in contact with Southeast Asia at the time Islam was
introduced.
Houtman’s textbook5 is the oldest European source of MLG and the first systematic study
of the language. It represents MLG as it was spoken in the late 16th century around Antongil Bay
on Madagascar’s northeast coast. The medium language is Dutch, and the MLG data is in an
archaic 17th century Dutch spelling, which is not very uniform. The book consists of sample
sentences with translation and a wordlist, together some hundred pages. Although a treasure
trove for historical linguists, it has never been fully sourced, which may be due to lack of
familiarity with 17th century Dutch among the researchers concerned.
A third source to be mentioned is Flacourt’s dictionary6. It is slightly more recent than
Houtman and represents Tanosy (or Antanosy) MLG, a southeast coast dialect. It has been used
more extensively for comparative study and for practical reasons will not be used in this paper.
MLG is originally a language of South Borneo. It belongs to the Southeast Barito (SEB)
language subgroup or linkage7 of Malayo-Polynesian languages, which in turn form a branch of
the Austronesian language family. The other members of the SEB subgroup are Ma’anyan,
Samihim, Dusun Witu, Dusun Malang, and Bayan. MLG is most directly related to Ma’anyan,
Samihim and Dusun Witu (“southern SEB”), and somewhat more distantly to Dusun Malang and
Bayan8. These languages are spoken east of the Barito River in southern Borneo (currently the
Central and South Kalimantan provinces in Indonesia).
4 Alexander Adelaar, Asian roots of the Malagasy: a linguistic perspective, Bijdragen tot de Taal-, Land-
en Volkenkunde vol. 151, no 3, 1995, 325-357.
5 Frederick de Houtman van Gouda, op. cit.
6 Étienne Flacourt, Dictionnaire de la langue de Madagascar, Paris, Georges Josse, rue St. Jacques à la Couronne
d’Épines, 1658.
7 Alexander D. Smith, The languages of Borneo: a comprehensive classification, Ph.D., linguistics, Honolulu,
University of Hawaiʻi, 2017.
For a better understanding of the historical development of MLG, it helps to distinguish
between two hypothetical stock languages. Proto Malagasy 1 (PMLG1) is supposed to have been
a southern SEB sociolect in South Borneo which had developed under direct influence of the
Hindu-Malay metropole there and must have been spoken at the time of the MLG migrations to
East Africa. It showed much Malay influence as well as influence from Javanese, Sanskrit, South
Sulawesi languages and (presumably but not systematically investigated) Ngaju and other
languages in the Barito basin. In comparison to other SEB languages, it must have been more
directly affected by early urbanisation along Borneo’s coast9. A later form, Proto Malagasy 2
(PMLG2), evolved after the migrations and had integrated Bantu lexical and grammatical
elements as a result of substrate influence from coastal East Bantu languages. This stock
language is more directly at the origin of all MLG dialects spoken today than is PMLG1.
This paper is organised as follows. Section 2 demonstrates that the Arabic letters yā’ and wāw
were still vocoids at the early stages of Sorabe, and therefore that early east coast MLG and also
PMLG2 still had the semivowels y and w rather than the corresponding fricatives z and v that are almost
ubiquitous today. Section 3 argues that the phonological developments in early Sorabe texts are
not unequivocally of an (Old) Taimoro signature. Section 3 shows that some of the palatal nasals
in Houtman’s data10 are historically relevant and must be inherited from Proto SEB, or at least
from PMLG1. Some concluding remarks follow in Section 4.
Malagasy language data are written in the original orthography used in the sources,
except for the following changes: 'o' is replaced by 'u' and final 'y' by 'i', whispered vowels at the
end of a word (or “paragogic” vowels) are marked with a breve (ă, ĕ, ĭ etc.), and velar nasals are
rendered as 'ŋ'. Stress is indicated wherever the original sources do so. Exceptions to these
spelling conventions are data from Houtman11 as well as toponyms and dialect names, which are
left in their original spelling. The capital N refers to a nasal when it occurs with a hyphen
attached to it (-N, -N- etc.).
8 Alexander Adelaar, unpublished field notes.
9 Alexander Adelaar, Who were the first Malagasy, and what did they speak?, In Andrea Acri, Roger Blench et
Alexandra Landmann (ed.), Spirits and Ships. Cultural Transfer in Early Monsoon Asia, Singapour, Institute of
South East Asian Studies, 2017, p. 441-469.
10 Frederick de Houtman van Gouda, Spraeck ende woord-boeck, op. cit.
Names of dialects are given without the aN- prefix with which they are often found in the
literature, thus Antaimoro, Antankarana, Antandroy, Antanosy are written as Taimoro,
Tankarana, Tandroy, Tanosy, and so on.
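As a rough illustration, the two purely mechanical respellings described above ('o' to 'u', word-final 'y' to 'i') and the dropping of the aN- prefix from dialect names could be sketched as follows. This is only a sketch under stated assumptions: the function names are hypothetical, the input is assumed to be a plain lowercase romanised word, and the sample inputs are illustrative spellings. The marking of paragogic vowels, velar nasals and stress is not attempted, since it cannot be derived mechanically from spelling alone.

```python
def respell(word: str) -> str:
    """Apply the two mechanical respellings: o -> u, word-final y -> i.

    Not meant for Houtman's data, toponyms or dialect names, which the
    paper leaves in their original spelling.
    """
    word = word.replace("o", "u")
    if word.endswith("y"):
        word = word[:-1] + "i"
    return word


def strip_an_prefix(dialect_name: str) -> str:
    """Drop the aN- prefix from a dialect name (assumed here to be spelled 'An')."""
    if dialect_name.startswith("An"):
        return dialect_name[2:].capitalize()
    return dialect_name


# Illustrative inputs only.
print(respell("tady"), respell("atody"))
print([strip_an_prefix(n) for n in ["Antaimoro", "Antankarana", "Antandroy", "Antanosy"]])
```

The two functions are kept separate on purpose, because the respelling rules explicitly do not apply to dialect names.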
Proto Malayo-Polynesian (henceforth PMP) etyma are used rather than Proto
Austronesian ones as they are closer to their Malagasy reflexes and thus easier to interpret.
Reconstructed (hypothetical) words and sounds are indicated by a preceding asterisk (*).
Finally, this paper uses the following abbreviations and dialect sources:
(B) Sorabe data from Beaujard12

(D) Sorabe data from Dez13
(F1904) Sorabe data from Ferrand (1904)14
(F1905) Sorabe data from Ferrand (1905)15
(G) Sorabe data from Gauthier16
(H) data of early MLG spoken around Antongil Bay from Houtman17
id. idem (indicating that a MLG word has the same meaning as the
MLG word preceding it)
MLG Malagasy (the language)
PMLG Proto Malagasy
PMLG1 Proto Malagasy 1
PMLG2 Proto Malagasy 2
PMP Proto Malayo-Polynesian
SEB Southeast Barito
11 Ibidem.
12 Philippe Beaujard, Le parler secret arabico-malgache, op. cit., p. 5.
13 Jacques Dez, Vocabulaire pour servir au déchiffrement des documents arabico-malgaches. Paris: Université Paris
7, Département de recherches Linguistiques, 1981, p. iv.
14 Gabriel Ferrand, Un texte arabico-malgache du XVIe siècle. Transcrit, traduit et annoté d’après les mss 7 et 8 de
la Bibliothèque Nationale, Paris, Imprimerie Nationale, 1904.
15 Idem, Un texte arabico-malgache, Alger, Imprimerie Orientale Pierre Fontana, 1905.
16 E.-F. Gauthier, Notes sur l’écriture antaimoro. Paris: Ernest Leroux, 1902.
17 Op. cit.
2. The Arabic letters yā’ (y or long i) and wāw (w or long u) were still vocoids in early Sorabe
The Arabic script is basically an abjad, a script in which letters mainly indicate consonants. In
agreement with the phonemic structure of Arabic, it has two dental series (including voiced د and
ض and voiceless ت and ط).18 Furthermore, while the letters yā’ (ي) and wāw (و) respectively
indicate the semivowels y and w, they also may indicate the corresponding long vowels ī and ū.
Apart from that, all vowels can also be represented by diacritics but this is optional and often
does not happen.
Sorabe is an adaptation of the Arabic script and has some different conventions. Among
others, in native MLG vocabulary it only uses د and ط for the voiced and voiceless dentals
respectively, both written with a dot under them (see below). Another convention is that it makes
almost exclusive use of Arabic letters as consonants, and diacritics as vowels. Finally, the
original yā’ (ي19) and wāw (و) do not – or no longer – express semivowels but are as a rule
transcribed as the fricatives z and v respectively.
Apart from using different conventions from the original Arabic script, Sorabe texts also
exhibit a fair amount of spelling variety, and spelling rules in them are by no means uniform.
While these texts as a rule express all vowels with the use of diacritics, some texts occasionally
apply an alif, yā’ and wāw to denote a, i/e, and u/o respectively, and a very few among them use
these letters throughout to express vowels (see 2.4 below). Although in most Sorabe texts wāw,
yā’ and alif are not used as vowels, these letters do have an important second function as dummy
letters to support vowels that are not preceded by a consonant they can be attached to, as will be
shown below.
The Sorabe use of yā’ and wāw as fricatives does not cause many problems, as most
MLG dialects – including all those that have made use of the Sorabe script – currently have no
semivowels at the phonemic level. The historical semivowels *y and *w which they had
inherited from SEB have become z and v respectively. At the same time, Arabic has no v, and
while it does have phonemic contrast between y and z, practically this does not seem to cause
much ambiguity.
18 ض and ط are pronounced with the tongue approaching the pharynx.
19 Note that yā’ is written as ﯾـ when it is the initial letter of a word and connected to a following letter, and as ـﯿـ
when connected to both a preceding and following letter.
Two prominent Malagasy scholars who investigated Sorabe, Jacques Dez20 and Philippe
Beaujard21, transcribe yā’ and wāw as z and v. They apparently do so because in current Taimoro
(and other South East coast dialects) these letters also correspond to fricatives, and they are
pronounced as such by the katibo (scribes) reading traditional Sorabe texts today. While this
makes sense from a synchronic perspective, it can also be shown beyond reasonable doubt that
yā’ and wāw were still semivowels in early Sorabe writing. The historical importance of
this is that if yā’ and wāw initially still referred to semivowels in Sorabe, these semivowels must
also have occurred in the southeastern coastal form of MLG for which Sorabe was first used. In
other words, *y and *w had not yet become z and v respectively in this dialect form.
Furthermore, since the semivowels in question correspond to semivowels in current South East
Barito languages, they should also be reconstructed for PMLG1, even if forms of MLG spoken
today generally have fricatives instead, including those on Madagascar’s South East coast.
There are exceptions to the shift from semivowels to fricatives, especially where *y is
concerned, which is maintained as y in a few lexical items in some dialects, as discussed in
Section 2.1. However, *w has become a v throughout Madagascar, and no current form of MLG
has maintained *w as a semivowel.
As will become clear in the following sub-sections, the evidence that yā’ and wāw
originally stood for semivowels in Sorabe is clearly more transparent for *y than for *w.
2.1 *y survives as a semivowel in some dialects and historical records

Proto SEB *y and de-syllabified *i became z in most present-day dialects. However, it was
largely maintained in the Vorimo dialect close to Madagascar’s central east coast. It was still a
semivowel in Houtman’s22 data on the dialect of Antongil Bay on the north-east coast, where it is
written as ‘i’, ‘j’, ‘ij’ or ‘y’, variant ways used almost interchangeably in early 17th century Dutch
texts and indicating syllabic i as well as the semivowel y [j]. It also occurs in some frequent
lexemes in the Tanala, Tandroy and Bara Vinda dialects. The repertoire is different in each of
these but usually includes a few question words, personal pronouns and kinship terms. One
lexeme showing *y that they all have in common is the one reflecting *aia ‘where?’, see below.
20 Jacques Dez, Vocabulaire…, op. cit., p. iv.
21 Philippe Beaujard, Vocabulaire pour servir au déchiffrement des documents arabico-malgaches. Paris: Université
Paris 7, Département de recherches Linguistiques, 1981, p. iv.
22 Frederick de Houtman van Gouda, Spraeck ende woord-boeck, op. cit.
Some well-represented cognate sets maintaining *y and de-syllabicised *i as y or i are:
Merina aìza ‘where?’    Tanala, Tandroy àia, Bara Vinda aya, Vorimo, (H) aia ‘id.’
Merina ìza ‘who?’    Tanala, Tandroy ìa, Bara Vinda iya ‘id.’
Merina zàhu ‘I’    Tanala iàhu, Bara Vinda yahu, Vorimo iau, (H) jahoeu ‘id.’
Vezo (i) àbi ‘all; entire’    Tandroy iàbi, Bara Vinda iyabi, (H) yiabe, iabe ‘id.’
Merina izàhai ‘we (exclusive)’    Tanala iahài, Bara Vinda yahài, Vorimo iehe ‘id.’
Merina zàza ‘child (not one’s own)’    Tandroy àja, ajàja, Vorimo iaia, (H) jajia
In other instances, *y and de-syllabicised *i were maintained as y or i more sporadically, e.g.
Merina zàndri ‘youngest sibling’    Tanala iàndri ‘id.’
Merina zùki ‘oldest sibling’    Tanala iùki ‘id.’
Merina ìzi ‘(s)he’    Tandroy ìe, Vorimo iie ‘id.’
Merina zànakă ‘offspring’    Vorimo ianaka ‘id.’, (H) ianack ‘child’ (< *ianak)
Tsimihety zàma ‘maternal uncle’    (H) iama ‘(maternal uncle)’ (< *iama ‘id.’)
Merina zau-bàvi ‘sister-in-law’, zau-dàhi ‘brother-in-law’    (H) jahots ‘sibling-in-law’ (< *iauT ‘id.’)
Merina ra/fùzană ‘father-in-law’    Vorimo rafuya (‘rafuia’) (< *ra-fuyaŋ ‘id.’)
Merina ùzatră ‘nerve, muscle, tendon’    (H) oejats ‘id.’ (< *(h?)uyaT ‘id.’)
Merina vùzună ‘neck’    (H) wojong ‘id.’ (< *wuyuN ‘neck’ (Fr. ‘cou’))
Merina vàzană ‘molar’    (H) vajangkoho, Vorimo vayankuu ‘nail’ (< *wayaŋ ‘molar’ + *huhu ‘nail’)
As to *w, it became a voiced labial fricative (v) in all current dialects; however, w is still found
in Houtman23, where it often precedes u and is usually in free variation with v, e.g.
23
Proto SEB Houtman

*watu ‘stone’ watou, vatou ‘id.’
*wuyuŋ ‘neck’ wojong ‘id.’
*wulu ‘body hair’ wullo ‘hair (unspecified)’
*wehi ‘iron’ wi ‘id.’; vy ‘nail’
*wulan ‘moon; month’ woulan ‘moon’; voelan ‘month’; dea voelan ‘moon’
*wawa ‘mouth’ wava ‘id.’
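The regular outcome of the two glides in present-day dialects can be stated as a pair of replacement rules. The sketch below applies them to some of the Proto SEB forms listed above. It is a deliberately minimal illustration: the function name is hypothetical, and it models only *w > v and *y > z, ignoring every other change (vowels, final consonants, paragogic vowels) that separates the proto-forms from modern words.

```python
# Only the two glide changes discussed in the text are modelled here.
GLIDE_TO_FRICATIVE = str.maketrans({"w": "v", "y": "z"})


def fricativise(proto_form: str) -> str:
    """Strip the reconstruction asterisk and apply *w > v and *y > z."""
    return proto_form.lstrip("*").translate(GLIDE_TO_FRICATIVE)


for proto in ["*watu", "*wulu", "*wawa", "*wuyuŋ"]:
    print(proto, "->", fricativise(proto))
```

The output for *wuyuŋ still ends in ŋ, so it differs from Merina vùzună in the treatment of the final nasal, which this sketch does not attempt.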
2.2 Sorabe texts often use a dedicated Arabic letter ‘z’ in Arabic loanwords containing z
Modern MLG z is written with a yā’. It is therefore conceivable that at the time the Sorabe script
was adopted, MLG did not have a z but still had a semivowel *y. The latter would then have
acquired a fricative pronunciation only later on. The proposition is strengthened further by the
fact that Sorabe users must have been aware of the existence of z phonemes in Arabic even if it is
not used for what is nowadays identified as z in MLG. Otto Christian Dahl24 reminds us of the
fact that Sorabe still makes use of dedicated Arabic letters for ‘z’ in Arabic loanwords originally
containing z. Arabic has two dedicated letters for ‘z’. One, ‫ز‬, expresses an ordinary z, and the
other, ‫ظ‬, is pronounced with the back of the tongue approaching the pharynx. In the Sorabe script
these letters are not used in native MLG words, but they do feature in Arabic loanwords.
Examples are:
Sorabe transliterated Sorabe Arabic Arabic transliterated
Zafarana25 ‘saffron’ َ ‫( َز َﻋﻔَ َﺮ‬F1904 p. 35, line 10)

‫ان‬ ‫َز ْﻋﻔَﺮان‬ za‘farān ‘id.’)
allorozy ‘rice’ ِ ‫( ُآﻻر‬B p.156-157)

‫ُز‬ ‫اﻷَ ُر ّز‬ al-aruzz ‘id.’
moza-mizo‘sour-
tasting banana’ ‫( ُﻣ َﺰ ِﻣ ُﺰ‬B p.158-159) ‫َﻣ ْﻮز‬ mauz ‘id.’)
24 Otto Christian Dahl, Sorabe révélant l’évolution du dialecte antaimoro, Antananarivo, Trano Printy Fiangonana
Loterana Malagasy, 1983, p. 6.
25 This is the MLG pronunciation of َ‫ َز ْﻋﻔَﺮان‬ rather than an accurate transliteration of its spelling, which in fact would
be za‘afarana. The latter contains the letter ‘ expressing a pharyngeal, a sound for which MLG has no equivalent.
But there is spelling variation. Compare moza-mizo (above) with maozo ‘bananas’ مَوْجُ
(Beaujard 1998:158-159), which is written with the Arabic letter jīm (ج, a voiced affricate in
Classical Arabic).
And there is also the occasional use of a yā’ instead of a dedicated Arabic z, e.g.
Sorabe transliterated Sorabe Arabic Arabic transliterated
zobo ‘penis’ ُ‫ﯾـ ُ ْﻮب‬ ّ‫ ُزب‬26 zubb ‘id.’ (B, p.162-163)
The use, in some texts, of dedicated Arabic ‘z’ letters and their occasional variation with yā’ in
loanwords show that the authors of these texts were aware of the original vocoid value of yā’.
Dahl27 also points out that in at least one Sorabe manuscript28, z is indicated with
a dedicated Arabic ‘z’ letter ز even in inherited vocabulary, citing pairs such as
yatu / zatu ‘a hundred’, and uyati / uzati ‘people’. He interprets this as an innovation. Such a
practice suggests that the author of this manuscript still interpreted yā’ as a semivowel, or at
least, was aware of its historical value as a vocoid.
2.3 Sorabe yā’ and wāw are also support letters for vowels not preceded by a consonant
Another indication that both yā’ and wāw were still vocoids rather than fricatives at the time
Sorabe came into being is that they are also used as support letters. As already mentioned, if
(short) vowels are written at all in the Arabic writing system, they are indicated as diacritics on
top of or under the previous consonant letter. However, some vowels are not preceded by a
consonant, for instance, when they occur in a word beginning with a vowel or containing a
sequence of vowels. In such cases support letters such as yā’, wāw and alif are used, as well as a
glottal stop sign called hamzah (the latter symbol is of no relevance for our further discussion).
26 The w-like diacritic (tashdīd) indicates gemination of the letter under it (b).
27 Otto Christian Dahl, Sorabe, op. cit., 1983, p. 40.
28 Edited by G.H. Julien (G.H. Julien, Pages arabico-madécasses. Histoire, légendes et mythes, Paris, Société
d’Éditions Géographiques, Maritimes et Coloniales, 1929; Idem, Pages arabico-madécasses (Deuxième série).
Histoire, légendes et mythes, Paris, Société d’Éditions Géographiques, Maritimes et Coloniales, 1933).
Both yā’ and wāw serve as support letters for vowel diacritics to indicate that two vowels directly
follow one another. Observe the following examples (in which yā’ and wāw are rendered as
capital Y and W respectively if they are used as support letters):
Roman orthography realization of Sorabe writing Transliteration

non-phonemic glide
vua ‘fruit; kidney’ [wuʷa] ‫ُو ًو‬ wuWa
tua ‘of course’ [tuʷa] tuWa
ruazatu ‘two hundred’ [ruʷazatu] ruYaYatu
feu ‘voice’ [feʸu] ‫ﻓِ ْﻮ‬ feWu
fuitri ‘navel’ [fuʷițʳ] ‫ﻓُ ْﯿﺮﱢ‬ fuYrri
a-ivu ‘(in the) middle’ [aʸvu] ‫اَ ْﯾ ُﻮ‬ ayvu
Note that the a in vua, tua and rua (in rua zatu) is indicated with a double vowel sign (or
fatḥah) for a. This happens specifically in cases where a directly follows u and a non-phonemic
u-glide is involved. The reason for this orthographic convention remains unclear.29
Note furthermore that in the cases under discussion yā’ and wāw are really only support
letters, and not ad hoc developments from non-phonemic semivowel glides that are heard
between the vowels involved. In cases like vua and tua it may look as if they are, because the
transition from u to a involves a non-phonemic glide [ʷ], and the support letter is a wāw. But a
case like feu contradicts this: although here too a wāw is used, the transition of e to u actually
involves a non-phonemic glide [ʸ] instead of [ʷ], if anything. So do rua (in rua zatu) and fuitri,
where the preceding u is rounded and a labial glide [ʷ] is produced, although the support letter is
a yā’. In these instances yā’ and wāw have not only become support letters for vowels, they have
also lost their original vowel colour. However, while it is clear that there is no longer a
connection between the original vowel colour of yā’ and wāw and the environment in which they
occur, it is very likely that originally they did develop from semivowels, and not from the
fricatives z and v.
29 It certainly exasperated Gauthier, who made the somewhat high-handed remark “Il est difficile de reconstituer le
processus de déformation par lequel le cerveau du scribe est arrivé à cette anomalie” [it is hard to trace the distortion
process that brought the brain of the scribe to produce such an anomaly] (Gauthier, op. cit. p. 12).
2.4 In some Sorabe texts yā’ and wāw also indicate vowels
In Sorabe, vowels are as a rule expressed by diacritics written on top of or under the preceding
consonant: ◌َ (fatḥah), placed over the letter and indicating ‘a’; ◌ِ (kasrah), placed
underneath the letter and indicating ‘i’ or ‘e’; and ◌ُ (ḍammah), placed over the letter and
indicating ‘u’ or ‘o’, as in:
Roman Sorabe writing Transliteration
mamunu ‘to kill’ (F1905 p.10, line 2) ‫َﻣ ُﻤ ُﻦ‬ MaMuNu
maru ‘many’ (F1905 p.9, lines 11, 14) ‫َﻣ ُﺮ‬ MaRu
mati ‘dead’ (G1902 p.47, line 6) MaTi
zahu ‘I, me’ (G1902 p.56, line 6) ُ‫ﯾَﮫ‬ ZaHu
vulafutsi ‘silver’ (G1902, p.49, line 1) ِ ُ‫ُوﻟَﻔ‬

‫ﺖ‬ VuLaFuTSi
tafiki ‘army’ (G1902, p.58, line 13) TaFiKi
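The convention just illustrated (a consonant letter followed by a vowel diacritic) lends itself to a small transliteration sketch. The code below is an illustration only: the letter table is a hypothetical, deliberately tiny subset covering mīm, rā’ and nūn, kasrah and ḍammah are rendered simply as 'i' and 'u', and the function name is an assumption rather than any established tool. Code points are written as escapes to avoid display-order problems with right-to-left text.

```python
# Toy tables: only three consonant letters and the three vowel diacritics.
CONSONANTS = {"\u0645": "M", "\u0631": "R", "\u0646": "N"}  # mim, ra, nun
VOWELS = {"\u064E": "a", "\u0650": "i", "\u064F": "u"}      # fathah, kasrah, dammah


def transliterate(sorabe: str) -> str:
    """Render letters as capitals and diacritics as lowercase vowels."""
    out = []
    for ch in sorabe:
        if ch in CONSONANTS:
            out.append(CONSONANTS[ch])
        elif ch in VOWELS:
            out.append(VOWELS[ch])
        else:
            raise ValueError(f"character U+{ord(ch):04X} is outside this toy table")
    return "".join(out)


maru = "\u0645\u064E\u0631\u064F"                # mim+fathah, ra+dammah
mamunu = "\u0645\u064E\u0645\u064F\u0646\u064F"  # mim+fathah, mim+dammah, nun+dammah
print(transliterate(maru))    # MaRu
print(transliterate(mamunu))  # MaMuNu
```

Raising an error on unknown characters, rather than skipping them, keeps the sketch honest about how little of the script it covers.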
However, vowels are on occasion also indicated by yā’, wāw and alif. Consider the following
examples from a text edited by Gauthier30. In the transcription below, these letters are written
with a capital Y, W and A respectively, whether they represent a semivowel or a vowel. Their
use as vowels here does not seem to follow a clear rule, as evidenced by the occurrence of pairs
such as lWhA /luhA, and tafiki / tafYki among the following examples31:
30 E.-F. Gauthier, op. cit.
31 Note that the Sorabe script does not indicate preconsonantal nasals.
Roman Sorabe writing Transliteration
lùha ‘head’ (G p.49), luhA (G p.48) ‫ ﻟُﻮْ ھَﺎ‬،‫ﻟُﮭَﺎ‬ lWhA, luhA
tsìka ‘we’ (G, p.49) َ ‫ ﺗِ ْﯿ‬،‫ﻚ‬

‫ﻚ‬ َ ِ‫ﺗ‬ tsYka, tsika
etu ‘here’ (G, p.50) YtuW, Ytu,

ituW
(z,y?)anaki vavi ‘child via female line’
(G, p. 49) ِ ‫ﯾَﻨَ ِﻚ َو‬
ْ‫اوي‬ yanakY vAvY
aŋumbi ‘cattle’ (G, p.50) ِ ‫اَ ُﻋ‬

‫ﺐ‬ aŋubi
aŋumbiku ‘my cattle’ (G, p.50) ‫اَ ُﻋﺒِ ْﯿ ُﻜ ْﻮ‬ aŋubYkuW
tafiki ‘army’ (G, p.58, line 13) taFiki
tafiki ‘army’ (G, p.65, line 12) tafYki
This vocalic use of wāw, yā’ and alif is no hard evidence that yā’ and wāw were originally used
as semivowels in Sorabe, as any katibo striving for a more authentic use of the Arabic script
might have decided to use these letters as vowels. Nevertheless, it shows that Sorabe users were
still aware of their vocalic origins.
2.5 Double marking of a final vowel indicates a historical long vowel

As already shown in previous sections, final vowels as a rule are marked as diacritics on top of
or under the preceding letter (a consonant), as in the following instances:
Roman Sorabe
lèla ‘tongue’ ‫ِﻟ َل‬
zàza ‘child’ ‫ﻲ‬

َ َ‫ﯾ‬
àti ‘liver’
nìfi ‘tooth’ ‫ف‬

ِ ‫ِﻧ‬
afèru ‘bile, gall’ ‫ا َ ِﻓ ُر‬
rànu ‘water’ ‫َر ُن‬

A special instance of yā’ and wāw (but not alif) expressing vowels are some monosyllabic words
ending in a vowel. In these words, they apparently denote vowels, although they are always used
in conjunction with diacritics. These diacritics are supported by the preceding letter, whereas the
juxtaposed yā’ or wāw is marked with a so-called sukūn (an o-shaped diacritic on top of a
consonant to indicate that this consonant is not followed by a vowel). It thus seems as if final
vowels are doubly marked. The basis for this convention is only clear from a comparative
linguistic perspective: these final letters are used when the vowel in question reflects the
previous existence of a sequence of identical vowels. These sequences can still be observed in
cognate forms in SEB languages in South Borneo, where the vowels in question are divided by a
glottal (ʔ or h). Compare:
Roman    Sorabe    Transliteration    Phonetic    PMLG2    Ma’anyan    Proto-SEB
fe ‘thigh’    ‫ِﻓ ْﻲ‬    fiY    [fe]    *fee    pe’e    *pe’e
fu ‘heart’    ‫ﻓ ُ ْو‬    fuW    [fu]    *fuu    lim/puhu32    *puhu
ra ‘blood’    ْ‫َرا‬    raA    [ra]    *raa    ira33 (Dusun Malang raha’)    *raha’
be ‘big, much, many’    ‫ِﺑ ْﻲ‬    biY    [be], [bey]    *bei    wahai34    *wahai
In these instances, the use of yā’ and wāw in addition to diacritic signs to indicate historical
geminated vowels is explained by the fact that they still represented vocoids at the time when
Sorabe was first applied to MLG.
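The spelling pattern described in this section (a vowel diacritic on the consonant, followed by yā’, wāw or alif carrying a sukūn) can be detected mechanically. The following is a minimal sketch, assuming the basic Unicode code points for the letters and diacritics and a hypothetical function name. It only inspects the last three characters of a word and says nothing about whether the vowel is historically long, which is the comparative argument made above.

```python
VOWEL_SIGNS = {"\u064E", "\u0650", "\u064F"}      # fathah, kasrah, dammah
SUPPORT_LETTERS = {"\u064A", "\u0648", "\u0627"}  # ya, waw, alif
SUKUN = "\u0652"


def has_double_marked_final_vowel(word: str) -> bool:
    """True if the word ends in vowel sign + ya/waw/alif + sukun."""
    if len(word) < 4:
        return False
    return (
        word[-1] == SUKUN
        and word[-2] in SUPPORT_LETTERS
        and word[-3] in VOWEL_SIGNS
    )


fe = "\u0641\u0650\u064A\u0652"    # fa + kasrah, ya + sukun (transliterated fiY)
fu = "\u0641\u064F\u0648\u0652"    # fa + dammah, waw + sukun (transliterated fuW)
maru = "\u0645\u064E\u0631\u064F"  # ordinary single marking (MaRu)
for w in (fe, fu, maru):
    print(has_double_marked_final_vowel(w))
```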
32 This historically complex form most likely reflects Proto East Barito *ulu ‘head, upper part’ + *-N- ‘(linker)’ +
*pusuq ‘heart’, compare cognates in Taboyan (Northeast Barito) lᵻpusu and Dusun Deyah (Central East Barito)
lumpusu’ ‘heart’ (Alfred B. Hudson, The Barito isolects of Borneo, Ithaca (New York), Cornell University Press,
Southeast Asia Program (Department of Asian Studies), Data Paper no 68, 1967, p. 70).
33 Dahl explained the first vowel in Ma’anyan ira as the possible result of a back formation: *mi- + *raa ‘to bleed’
--> m- + ira ‘id.’ --> ira ‘blood’ (Otto Christian Dahl, Malgache et Ma’anyan. Une comparaison linguistique, Oslo,
Egede Instituttet, Avhandlinger utgitt av Instituttet 3, 1951, p. 350).
34 Ma’anyan wahai means ‘numerous’. As to the relation between Ma’anyan wahai and MLG be, note that in the
history of MLG, proximity between *w and a following *h or another glottal consonant causes fortition (Ibidem,
p. 350).
2.6 Summarising the evidence
It appears that:
(1) *y survives as a semivowel in some individual word roots in some dialects, and is
maintained in Houtman’s 1603 data of the MLG dialect spoken in the Bay of Antongil. As to *w,
it became v in all current dialects; in Houtman’s data, there is still an orthographic ‘w’ which is
usually in free variation with ‘v’.
(2) Both yā’ and wāw are used as orthographic devices to support vowels.
(3) In some texts, yā’ and wāw (and alif) also indicate vowels.
(4) The double marking of vowels combining the use of yā’, wāw and alif with diacritics reflects
the historical presence of geminated vowels.
(5) Finally, the z in Arabic loanwords is usually written with a dedicated Arabic ‘z’ letter.
This is clearly enough evidence to conclude that both *y and *w were still vocoids at the time
the Arabic script was first adapted to MLG and Sorabe came into being. It shows beyond
reasonable doubt that yā’ and wāw started out as semivowels and changed into fricatives only
some time after the introduction of the Arabic script to Madagascar’s South East coast. This
allows for some chronological conclusions: (1) the change was relatively late, and, as it
happened after the introduction of the Arabic script, (2) it certainly happened after the
divergence of PMLG2 into the various dialects in Madagascar today. There are still forms of
MLG that have not – or not entirely – undergone the change from *y to z, and Houtman’s early
17th century text is evidence that *w was still a semivowel, although apparently one in free
alternation with v.
3. The identification of Sorabe with the Taimoro dialect is problematic
As already mentioned, Sorabe texts are usually identified with the Taimoro area. The Sorabe
writing tradition is thought to belong to the Taimoro region, and likewise, the dialect on which the
MLG writing in the oldest texts is based is often taken to be Taimoro (Dahl35; Rajaonarimanana36;
Beaujard37). However, the association between Sorabe and Taimoro needs some qualification.
Étienne Flacourt was a French government administrator and a long-time resident of Fort Dauphin,
a town on the southern Malagasy coast. He was also the compiler of the first MLG - French dictionary38.
In this work he related how he read texts with the help of a katibo when he worked in Fort
Dauphin39. While there, he also drew up a treaty and published several translations of texts which
almost certainly were in use in or around that town40. Fort Dauphin is in a region where Tanosy
and Tandroy people live and is several hundred kilometers away from the Taimoro region. This
would indicate that the Sorabe writing tradition was originally practised in a part of coastal
southeastern Madagascar much wider than the Taimoro region alone, although the tradition is
nowadays predominantly maintained in the Taimoro region.
35 Otto Christian Dahl, Sorabe, op. cit., 1983, p. 5.
36 Narivelo Rajaonarimanana, Sorabe. Traités divinatoires et recettes medico-magiques de la tradition malgache
antemoro, PhD thesis, 4 vols, Paris, Institut National de Langues et Cultures Orientales, 1990.
37 Philippe Beaujard, Le parler secret arabico-malgache. Recherches étymologiques, Paris, L’Harmattan, 1998, p. 7;
Idem, Histoire et voyages des plantes cultivées à Madagascar, Paris, Karthala, 2017, p. 39.
Linguistic support for this seems to be that the oldest Sorabe texts (e.g. Ferrand’s 1904
manuscript)41 have retained the PMP (and Proto SEB) syllable *li as li. The retention of li is also
reflected in the Tanosy dialect spoken on Madagascar’s south coast, but not in Taimoro, where *li
has become di42. Therefore, as far as this development is concerned, Sorabe is more in alignment
with Tanosy than with Taimoro. Compare the following cognate sets:
English      PMLG2     Merina    Taimoro   Tanosy   Sorabe
Digging      *-hali    -hàdi     hàdi      -hàli    -hali (D)
Skin         *huliT    hùditra   hùditri   hùliSi   hulitsi (F1904, p. 93)
Rope         *tali     tàdi      tàdi      tàli     tali (F1904, p. 115)
Forgetting   *halinu   hadìnu    ---       halìŋu   haliŋu (F1904, p. 115)
Leech        *linta    dìnta     ---       lìta     linta (D)
Ear          *taliŋɛ   ---       tadìŋi             taliŋi (F1904, p. 88)
Egg          *atuli    atùdi     atùdi     atùli    ---
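The reflexes in these cognate sets can also be tallied mechanically. The following Python sketch is illustrative only: the helper names (`strip_stress`, `li_reflex`, `reflexes_by_dialect`) are hypothetical, the test is a simple substring check for li or di after removing the stress accent, and the ‘ear’ row is left out because its column assignment is not fully clear.

```python
import unicodedata

# Forms copied from the cognate table; None marks a gap ("---") in the table.
# The 'ear' row is omitted because its column assignment is not fully clear.
COGNATES = {
    "digging":    {"Merina": "-hàdi",   "Taimoro": "hàdi",    "Tanosy": "-hàli",  "Sorabe": "-hali"},
    "skin":       {"Merina": "hùditra", "Taimoro": "hùditri", "Tanosy": "hùliSi", "Sorabe": "hulitsi"},
    "rope":       {"Merina": "tàdi",    "Taimoro": "tàdi",    "Tanosy": "tàli",   "Sorabe": "tali"},
    "forgetting": {"Merina": "hadìnu",  "Taimoro": None,      "Tanosy": "halìŋu", "Sorabe": "haliŋu"},
    "leech":      {"Merina": "dìnta",   "Taimoro": None,      "Tanosy": "lìta",   "Sorabe": "linta"},
    "egg":        {"Merina": "atùdi",   "Taimoro": "atùdi",   "Tanosy": "atùli",  "Sorabe": None},
}


def strip_stress(form):
    """Drop the grave accent (stress mark) so that 'lì' and 'li' compare equal."""
    decomposed = unicodedata.normalize("NFD", form)
    return unicodedata.normalize("NFC", decomposed.replace("\u0300", ""))


def li_reflex(form):
    """Return 'li' or 'di' according to which sequence the form contains."""
    plain = strip_stress(form)
    if "li" in plain:
        return "li"
    if "di" in plain:
        return "di"
    return None


def reflexes_by_dialect(cognates):
    """Collect the set of *li reflexes attested for each dialect."""
    summary = {}
    for forms in cognates.values():
        for dialect, form in forms.items():
            if form is not None:
                summary.setdefault(dialect, set()).add(li_reflex(form))
    return summary


if __name__ == "__main__":
    for dialect, reflexes in reflexes_by_dialect(COGNATES).items():
        print(dialect, sorted(reflexes))
```

On these data Merina and Taimoro show only di, whereas Tanosy and Sorabe show only li.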
The similarity between Sorabe and Tanosy is striking, although its significance is diminished by
the fact that Sorabe and Tanosy li represents a retention rather than an innovation. On the other
hand, Sorabe and Tanosy are the only forms of MLG in which *li remained unchanged but *ti
palatalised to tsi/si, whereas all other MLG dialects maintained both *li and *ti (especially in the
West and South of Madagascar) or innovated both as di and tsi (or si) (in the Centre, North and
East). In other words, all other dialects either retain li as well as ti or have changed them pairwise into
di and tsi/si. Remarkably, this pair of changes does not only run through the dialects of MLG but
also through the East Barito languages of Borneo. Although there is no apparent phonetic
38
Étienne Flacourt, Dictionnaire de la langue de Madagascar, op. cit.
39
Noël Gueunier (email communication 2018).
40
Étienne Flacourt, Dictionnaire de la langue de Madagascar, op. cit., 178-188.
41
Gabriel Ferrand, Un texte arabico-malgache du XVIe siècle, Paris, Imprimerie Nationale, 1904.
42
The maintenance of *li as li, the use of ts as a reflex of final *T, and the variation of final whispered vowels in
Sorabe are also noted by Velonandro (Velonandro, Lexique des dialectes du Nord de Madagascar, par des
missionnaires et séminaristes catholiques, Tuléar (Madagascar), Centre de Documentation et de Recherche sur
l'Art et les Traditions Orales à Madagascar, Centre Universitaire Régional / Valbonne (France): Centre de
Documentation et de Recherche sur l'Asie du Sud-Est et le Monde Insulindien, CNRS-EHESS, 1983) and
Philippe Beaujard (Philippe Beaujard, Rituel et société à Madagascar. Le cas des antemoro (côte sud-est),
Paris, (manuscript), s.d., 35).
connection between the *li > di and *ti > tsi/si changes, they may still be connected in the overall
phonological structure of SEB languages. PAn *d, *z and *j all merged to *r in Proto SEB,
leaving SEB languages with only one inherited dental obstruent, Proto SEB *t. So when *li
became di, the newly created d in this sequence may have been perceived as rather similar to the
*t in *ti. Consequently, the need to disambiguate *ti from di may have triggered its
affricatisation to tsi. This might explain why the change from *li to *di almost always occurs in
tandem with that of *ti to tsi. It would also make the Tanosy and Sorabe cases exceptional, being
the only forms of SEB in which either the retentions *li and *ti or their changes into di and tsi/si
are not manifested in tandem.
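The "in tandem" observation can be stated as a simple check. The sketch below is a minimal illustration, not part of the original analysis: the class and function names are hypothetical, the dialect groupings follow the summary given above, and "tsi" stands for either tsi or si.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReflexPattern:
    li: str  # reflex of *li: "li" (retained) or "di" (innovated)
    ti: str  # reflex of *ti: "ti" (retained) or "tsi" (innovated; covers tsi and si)


def in_tandem(pattern):
    """True if *li and *ti are either both retained or both innovated."""
    li_innovated = pattern.li == "di"
    ti_innovated = pattern.ti == "tsi"
    return li_innovated == ti_innovated


# Groupings as summarised in the text; labels are for illustration only.
PATTERNS = {
    "West and South": ReflexPattern(li="li", ti="ti"),
    "Centre, North and East": ReflexPattern(li="di", ti="tsi"),
    "Tanosy": ReflexPattern(li="li", ti="tsi"),
    "Sorabe": ReflexPattern(li="li", ti="tsi"),
}

exceptions = [name for name, pattern in PATTERNS.items() if not in_tandem(pattern)]

if __name__ == "__main__":
    print(exceptions)
```

Only Tanosy and Sorabe fail the check, which is the exceptional status described above.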
Further research into the features of Taimoro, Tanosy and Sorabe will hopefully shed
more light on the assumed historical relationship between these forms of MLG.
As an aside, the contrasts between li and di (< PMP *li) and between ti and tsi (/si) (<
PMP *ti), are generally considered to be criteria for a major division between eastern and
western MLG dialects. However, their critical value must be rejected43 because (1) they
“subgroup” the western and southern dialects on the basis of retentions44, (2) northern MLG
dialects also maintain li and ti, whereas manifestations of di and tsi (/si) in these dialects are the
result of borrowing from central MLG (Merina?), and (3) although Tanosy and Sorabe belong to
South East Madagascar and represent eastern Malagasy, they have nevertheless retained *li; in
doing so they partly agree with western dialects and undermine the critical value of *li reflexes
for a dialect classification.
Another feature of Sorabe texts is that they sometimes show influence from various
dialect regions. They do so through the variety of final whispered vowels and reflexes of final *T
that they display.
Almost all MLG dialects have whispered vowels which appear after the historical final
consonants *-T, *-k, *-ŋ, *-n, or *-m. There is quite a variety of final whispered vowels among
dialects. For instance, in Central MLG dialects (Merina, Betsileo, Tanala) this vowel is ă; in
southern and southeastern dialects, it is often ĭ (Bara); in southwestern dialects, ĕ (Vezo and
South Sakalava); in Tandroy, it is ĕ or (in the case of a final nasal) ñĕ or Ø with loss of the final
nasal. In Tankarana, North Sakalava and Tsimihety (the northern MLG dialects), it is an echo-
vowel (it is a copy of the vowel in the preceding syllable and can be ă, ĭ or ŭ). Finally, in Vorimo
and in Old MLG sources, roots often lack corresponding whispered vowels, although they do
occur in roots borrowed from other dialects. The phonemic value of whispered vowels is
minimal, if they have such value at all. They are heard mainly in the isolated quotation of a word
43
Alexander Adelaar, Malagasy dialect divisions: genetic vs. emblematic criteria, Oceanic Linguistics vol. 52, no 2,
2013, p. 457-480.
44
In genetic linguistics, similarities based on innovations that are shared exclusively among certain languages within
a larger language family are considered to be potentially relevant for the classification of these languages into a
subgroup, whereas similarities based on retentions from the protolanguage are not.
and at the end of a phrase. If the position of a word ending in a whispered vowel is elsewhere in
a phrase, the vowel is dropped. Some dialects such as Bara and Vezo do not realise final nasals
in free forms or forms at the end of a phrase: in such cases there is also no whispered vowel. The
17th century Sorabe text published by Ferrand (1904)45 shows the influence of northern and
central MLG dialects. Northern influence appears in the occasional use of echo vowels, whereas
Central MLG influence appears in the occasional occurrence of final ă, the default whispered
final vowel in Central MLG dialects.
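The distribution of whispered vowels described above can be sketched as a small lookup. This is an illustrative simplification under stated assumptions: stems are given as consonant-final strings without stress marks, the function names are hypothetical, and the pairing of a preceding e with echo ĭ is an assumption based on forms such as utekĭ. Tandroy, the loss of final nasals in Bara and Vezo, and dialectal differences in the final consonant itself are not modelled.

```python
# Default whispered vowel per dialect, as listed in the text.
FIXED_VOWEL = {
    "Merina": "ă", "Betsileo": "ă", "Tanala": "ă",  # Central MLG
    "Bara": "ĭ",
    "Vezo": "ĕ", "South Sakalava": "ĕ",
}

# Northern dialects use an echo vowel instead of a fixed one.
ECHO_DIALECTS = {"Tankarana", "North Sakalava", "Tsimihety"}

# Echo-vowel quality keyed by the vowel of the preceding syllable.
# The e -> ĭ pairing is an assumption (compare utekĭ).
ECHO = {"a": "ă", "i": "ĭ", "e": "ĭ", "u": "ŭ"}


def whispered_vowel(stem, dialect):
    """Return the whispered vowel that a consonant-final stem takes in a dialect."""
    if dialect in ECHO_DIALECTS:
        for ch in reversed(stem):
            if ch in ECHO:
                return ECHO[ch]
        raise ValueError(f"no vowel to echo in {stem!r}")
    return FIXED_VOWEL[dialect]


def realise(stem, dialect, phrase_final=True):
    """Add the whispered vowel only in citation or phrase-final position."""
    if not phrase_final:
        return stem
    return stem + whispered_vowel(stem, dialect)


if __name__ == "__main__":
    for dialect in ("Merina", "Tsimihety"):
        print(dialect, [realise(s, dialect) for s in ("efatr", "lavitr", "sarutr", "hatuk")])
```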
Dialectal influence also appears in the use of two dialectally distinct reflexes of PMLG2
final *T. In some dialects, this *T developed into a retroflex tr consonant, whereas in other ones
it became a palatal affricate ts. In Ferrand’s (1904) Sorabe text46 both tr and ts are found.
Observe the distribution of final whispered vowels and final *T reflexes in various
dialects:
Dialect       Four        far      difficult  fear     fallen   reverse   nape of neck  road      leaf      person  west
Merina:       èfatră      làvitră  sàrutră    tàhutră  làtsakă  vàdikă    hàtukă        làlană    ràvină    ùlună   andrèfană
Bara:         èfatsĭ      làvitsĭ  sàrutsĭ    tàhutsĭ  làtsakĭ  vàlikĭ    hàtukĭ        làla      ràvi      ùlu     andrèfa
Vezo:         èfatsĕ      làvitsĕ  sàrutsĕ    tàhutsĕ  làtsakĕ  vàlikĕ    hàtukĕ        làla      ràve      ùlu     (ah-)anjèfa
Tandroy:      èfatsĕ      làvitsĕ  sàrutsĕ    tàhutsĕ  làtsakĕ  vàlikĕ47  hàtukĕ        làla(ŋĕ)  ràve(ŋĕ)  ùlu     (ah-)andrèfa
Taimoro:      èfatrĭ      làvitrĭ  ---        tàhutrĭ  làtsakă  ---       hàtukĭ        làlaŋă    ràviŋă    ùlu     andrèfaŋă
North MLG48:  èfatră      làvitrĭ  sàrutrŭ    ---      làtsakă  vàdikĭ    hàtukŭ        làlaŋă    ràviŋĭ    ùluŋŭ   andrèfaŋă
Vorimo:       (efatră)49  lavitr   sarutr     ---      latsak   vadik50   ---           ---       ravin     uln     andrefan
Compare these endings with the ones in various corresponding forms in the Sorabe text edited by
Ferrand (1904)51:
Four     far       difficult  fear     fallen    reverse  nape of neck  road    leaf    person     west
efatră,  lavitrĭ,  sarutsĭ    tàhutrŭ  latsakă,  valikă,  hatukŭ        lala,   ravină  ulună,     andrefană
efatrĭ   lavitsĭ                       latsakĭ   vadikă                 lalană          ulu, ulun
In this text we find examples of -ă, -ĭ, and echo-vowels (-ă, -ĭ, -ŭ). We also find retroflex -tr-
reflexes as well as palatal -ts- reflexes of final *-T, leaving the impression that the text was
interfered with by authors of different dialect backgrounds, including northern MLG ones.
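The mixture of endings can be quantified with a short tally. The sketch below is illustrative: the helper names are hypothetical, the forms are those listed for Ferrand's text above, and a final *T reflex is only recognised when tr or ts stands immediately before the whispered vowel (or at the very end of the word).

```python
import unicodedata
from collections import Counter

WHISPERED = "ăĭŭĕ"

# Forms as listed for the Sorabe text edited by Ferrand (1904).
SORABE_FORMS = [
    "efatră", "efatrĭ", "lavitrĭ", "lavitsĭ", "sarutsĭ", "tàhutrŭ",
    "latsakă", "latsakĭ", "valikă", "vadikă", "hatukŭ", "lala", "lalană",
    "ravină", "ulună", "ulu", "ulun", "andrefană",
]


def split_ending(form):
    """Split a form into (stem, whispered vowel or None)."""
    form = unicodedata.normalize("NFC", form)
    if form and form[-1] in WHISPERED:
        return form[:-1], form[-1]
    return form, None


def final_t_reflex(form):
    """Return 'tr' or 'ts' if the stem ends in that reflex of final *T, else None."""
    stem, _ = split_ending(form)
    for reflex in ("tr", "ts"):
        if stem.endswith(reflex):
            return reflex
    return None


def tally(forms):
    """Count whispered vowels and final *T reflexes over a list of forms."""
    vowels = Counter(split_ending(f)[1] for f in forms)
    reflexes = Counter(r for r in map(final_t_reflex, forms) if r is not None)
    return vowels, reflexes


if __name__ == "__main__":
    vowel_counts, reflex_counts = tally(SORABE_FORMS)
    print(vowel_counts)
    print(reflex_counts)
```

The counts make the heterogeneity explicit: three different whispered vowels occur alongside forms with none, and both tr and ts appear as reflexes of final *T.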
45
Gabriel Ferrand, Un texte arabico-malgache du XVIe siècle, op. cit.
46
Ibidem.
47
Mamalikĕ ‘to change, return’.
48
See Faridanona, Rantimbôlaŋa diksionera Tsimihety, Antananarivo, Akademia Malagasy, 1977, p. 5.
49
*efatr would be the expected form.
50
Am-badik ‘beyond’; am-badikimareñ ‘day after tomorrow’.
51
Gabriel Ferrand, Un texte arabico-malgache du XVIe siècle, op. cit.
These various influences make it hard to link up Sorabe with only one dialect of MLG.
To be sure, in Taimoro MLG, final *T became a retroflex tr, not an affricate ts. As to the final
whispered vowel, both ă and ĭ are found in this dialect: ĭ as a rule occurs after tr (historical final
*T), whereas ă tends to occur after (historical final) k and (historical final) nasals, although ĭ may
also occur in these positions (the rules of this variation remain unclear). To some extent,
Taimoro has the possession of several whispered vowels in common with Sorabe. However,
unlike the latter it does not exhibit -ŭ, and notwithstanding their shared ă/ĭ variation, it is clear
that Taimoro MLG and Sorabe show different patterns of variation in their final whispered
vowels and *T reflexes. Once again, further research is needed to obtain more clarity on the
dialectal affiliations of Sorabe.
Note incidentally that the origins of many words remain ambiguous. For instance, it is
easy to see that a word is not central MLG if it has no final ă, and it is not South East coast MLG
if it has no final -ĭ. On the other hand, on the basis of a final ă alone it is not possible to
determine whether the word in question is central MLG if the vowel in the preceding syllable is
the same, nor is it always possible to diagnose a word ending in ĭ as South East coast MLG if the
preceding syllable has ĭ or ĕ: in both cases they may represent echo-vowels, which is a feature of
North MLG. Consequently, while it is easy to spot words of North MLG provenance if they have
u as an echo-vowel, such as matahutrŭ ‘have fear’, hatukŭ ‘neck’, and huhutrŭ ‘foot’, other
words, such as utekĭ ‘brain’ and ulikĭ ‘intestines’ could also be South East coast MLG, and
latsakă ‘fallen’, latakă ‘penis’, and vutrakă ‘belly’, could also be central MLG if assessed by
their final whispered vowel alone.
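The diagnostic reasoning in this paragraph can be written out as a decision procedure. The sketch below is a hedged illustration that uses the final whispered vowel as its only criterion; the function names and region labels are hypothetical, and words without a whispered vowel, or with a final ŭ that does not echo a preceding u, are simply left undetermined.

```python
import unicodedata

VOWELS = "aeiou"


def last_vowel(stem):
    """Return the last full vowel of the stem, or None if there is none."""
    for ch in reversed(stem):
        if ch in VOWELS:
            return ch
    return None


def provenance_candidates(word):
    """Return the dialect regions compatible with a word's final whispered vowel."""
    word = unicodedata.normalize("NFC", word)
    if not word:
        return set()
    stem, final = word[:-1], word[-1]
    previous = last_vowel(stem)
    if final == "ă":
        # Central by default; could equally be a northern echo vowel after a.
        return {"Central", "North"} if previous == "a" else {"Central"}
    if final == "ĭ":
        # South East coast by default; could be a northern echo vowel after i or e.
        if previous in ("i", "e"):
            return {"South East coast", "North"}
        return {"South East coast"}
    if final == "ŭ":
        # Only explained here as a northern echo vowel after u.
        return {"North"} if previous == "u" else set()
    return set()


if __name__ == "__main__":
    for word in ("matahutrŭ", "hatukŭ", "utekĭ", "ulikĭ", "latsakă", "vutrakă"):
        print(word, sorted(provenance_candidates(word)))
```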
4. Evidence for a historical palatal nasal (ñ < *ñ) in Houtman’s 1603 wordlist
As pointed out in Section 2.1, the 17th century Dutch orthography in Houtman52 uses ‘ij’, ‘j’, ‘y’
and ‘i’ more or less indiscriminately to denote the palatal semivowel y. Furthermore, ‘j’ occurs in
the combination ‘nj’ denoting a palatal nasal. The following instances featuring this palatal nasal
provide evidence that it can be traced to PMLG1:
source                           Merina          Houtman (1603)
Banjar Malay pañu ‘turtle’       fànu ‘turtle’   fanjou [fàñu] ‘turtle or tortoise(?)’
SEB (Ma’anyan) amini ‘id.’       amàni ‘urine’   amanji [amàñi] ‘to urinate’
PMP *buni ‘to hide, conceal’     vùni ‘id.’      avoenji [avùñi] ‘on the stealth’
Malay pəniŋ ‘dizzy’              fànină ‘dizzy’  fanjing [fàñiŋ] ‘weak, faint’
SEB (Dusun Malang) mɛyaʔ ‘red’   mèna ‘id.’      meynja [mèña] ‘id.’
Malay ajar ‘learning, teaching’  ànatră ‘id.’    mang’anjarts [maŋ-àñats] ‘to teach’
                                                 mienjats [mi-añats]53 ‘to practice’,
                                                 myanjaets [mi-añats] ‘school’
52
53
With presumed palatalisation of initial a to e under the influence of preceding i.
These instances show that the Antongil Bay dialect still had palatal nasals in the 17th century.
Some of these nasal instances are only circumstantial evidence for inherited *ñ as they can also
be explained as the result of secondary palatalization of *n under the influence of the following
*i. Secondary palatalization is no doubt the case in fanjing, which derives from Malay pəniŋ54, a
source form that has not undergone the process. And it is also conceivable in amanji and avoenji,
whose nj could also be attributed to secondary palatalization of *n under the influence of a
following *i. However, it cannot be the case in words like fanjou, myanjaets, or meynja, in which
nj is followed by a different vowel. Here, secondary palatalisation does not apply, and nj must
reflect a historical *ñ. This is most obvious in fanjou reflecting Banjar Malay pañu. Less
obviously, it may also reflect *ñ in meynja [mɛyña] and in the derivations involving ianjaets
[añats]. These are roots which have developed from etyma with an intermediate semivowel *y,
which became nasalized as a result of progressive assimilation or “nasal spread”, a phenomenon
happening under the influence of nasal prefixation at the onset of the previous syllable55. It can
be observed in various languages in South Borneo (including Ngaju, Ma’anyan, and possibly
even Banjar Malay) and was probably also at work in the early history of MLG. It seems to have
been an areal feature in the linguistic history of South Borneo.
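The vowel-based test used above to separate ambiguous from unambiguous instances of nj can be sketched as follows. The function name and the labels are hypothetical, and the check only looks at the letter immediately following the first nj in Houtman's spelling.

```python
# Houtman's spellings as cited in the table of palatal nasal instances.
HOUTMAN_FORMS = [
    "fanjou", "amanji", "avoenji", "fanjing",
    "meynja", "mang’anjarts", "mienjats", "myanjaets",
]


def classify_nj(form):
    """Label nj as 'ambiguous' before i (possible secondary palatalisation),
    otherwise as 'inherited' (must reflect a historical palatal nasal)."""
    index = form.find("nj")
    if index == -1:
        raise ValueError(f"no nj in {form!r}")
    following = form[index + 2:index + 3]
    return "ambiguous" if following == "i" else "inherited"


if __name__ == "__main__":
    for form in HOUTMAN_FORMS:
        print(form, classify_nj(form))
```

The forms with nj before i (amanji, avoenji, fanjing) come out as ambiguous, while fanjou, meynja and the derivations of the root for ‘learning/teaching’ come out as requiring a historical palatal nasal.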
Blust56 treats nasal spreading in a wider context and gives the following Ngaju Dayak examples:
kayu ‘wood; firewood’   ma-ŋañu ‘to gather firewood’
uyah ‘salt’             m-uñah ‘to salt something’
payoŋ ‘umbrella’        ma-mañoŋ ‘to shelter with an umbrella’
The phenomenon can also be observed in Ma’anyan, where the following cases occur:
wayat ‘(paying)’        mañat ‘to pay’
huyu ‘(ordering)’       nuñu ‘to order’
ayak ‘(inviting)’       ŋ-añak ‘to invite’
kuyum ‘mouthful’        ŋuñum ‘to mouth without swallowing (tobacco)’
hayaŋ ‘a pity, waste’   na-hañaŋ ‘let go waste’ (Adelaar personal fieldnotes)
54
It most probably derives from a Banjarese source form *paniŋ, although no such form is attested in Abdul Djebar
Hapip’s dictionary (Abdul Djebar Hapip, Kamus Banjar – Indonesia, Banjarmasin, PT. Grafika Wangi Kalimantan,
2006).
55
Robert A. Blust, The Austronesian languages, Canberra, Australian National University Press, Asia-Pacific
Linguistics, A-PL 008, 2013, pp 238-239.
56
Blust shows that nasal spreading also occurs in Narum. However, this must be an unrelated development as
Narum is spoken in northern Sarawak, which is far away from southern Borneo. Moreover, the nasal spreading in
Narum affects a following l, not a y, as in hulet ‘skin’ versus m-unet ‘to skin’, and alaut ‘boat’ versus ŋ-anaut
‘paddle a boat’ (Ibidem, p. 239).
Malay muyaŋ ‘ancestors’ muñaŋ ‘great-great-grandfather’57
Furthermore, in Banjar Malay58, variation involving nasal spread seems to occur in one instance:
samua’an, samuya’an, samuña’an ‘all, everybody’.
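The nasal spread illustrated by the Ngaju, Ma’anyan and Banjar Malay forms above can be modelled as a single rewrite rule. The sketch below rests on assumptions that are not stated in the sources: input forms already carry the prefix and any nasal substitution of the initial consonant (for example ŋayu for kayu); vowels, h, the glottal stop and morpheme hyphens are treated as transparent to the spread; and the function name is hypothetical.

```python
import unicodedata

NASALS = set("mnñŋ")
VOWELS = set("aeiouɛə")
# Assumption: these segments let nasality pass through to a following y.
TRANSPARENT = VOWELS | set("hʔ’'-")


def nasal_spread(form):
    """Turn y into ñ when the nearest preceding non-transparent segment is a nasal."""
    form = unicodedata.normalize("NFC", form)
    out = []
    for ch in form:
        if ch == "y":
            trigger = next((c for c in reversed(out) if c not in TRANSPARENT), None)
            out.append("ñ" if trigger in NASALS else "y")
        else:
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    for form in ("ma-ŋayu", "m-uyah", "ma-mayoŋ", "mayat", "na-hayaŋ", "samuya’an", "kayu"):
        print(form, "->", nasal_spread(form))
```

Unprefixed bases such as kayu and payoŋ are left unchanged, because the segment preceding y is an oral consonant.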
In MLG, nasal spreading can be shown to have occurred in historical hindsight in at least one
root. In mena ‘red’, the intervocalic n is the result of nasalisation of *y (reflecting an earlier PMP
*R) under the influence of the preceding adjectival prefix *m(a)-:
PMP *ma-iRaq ‘red’ > Proto SEB *m-ɛyaʔ ‘id.’ > Dusun Malang mɛyaʔ, Samihim mɛaʔ, MLG
mèna ‘id.’
Another root that may have undergone nasal spread is MLG anatră ‘learning/teaching’, although
in this case the source (Malay ajar ‘(studying/teaching)’) has currently an affricate j instead of a
semivowel y. Here we may speculate that Malay ajar evolved from an original form *(h)ayar
‘learning/teaching’ and that this form was adopted into PMLG1 (as *ayaT) which later on
underwent nasal spread through prefixation of the verbal prefixes mi- and maN- (which both
contain nasals). In so doing it would become possible to explain both (Merina MLG) (mi-)
ànatră, (man-)ànatră and modern Malay məŋ-ajar, as follows:
From early Malay *(h)ayar ‘learning/teaching’ to modern Malagasy:

> (via borrowing) PMLG1 *ayaT ‘learning/teaching’
--> (+ nasal spread) *mi-añaT ‘to study’, *meŋ-añaT ‘to teach’
> via back-formation: *mi-añaT ‘to study’, *meŋ-añaT yield a root *añaT
> Houtman’s 1603 data from the Bay d’Antongil: mienjats ‘to practice’,
mang’anjarts ‘to teach’
> (after loss of palatalisation) Merina MLG anatră, mi-anatră ‘to study’,
man-anatră ‘to teach’
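The stages of this proposed development can be traced step by step. The sketch below is a toy model of the speculative scenario set out above, not an established derivation: reconstruction asterisks are omitted, nasal spread is reduced to replacing y by ñ after the prefix mi-, the reflexes of final *T (ts in Houtman's data, tr plus whispered ă in Merina) are applied as plain string substitutions, and all function names are hypothetical.

```python
def add_prefix(root, prefix):
    """Attach a verbal prefix with a morpheme hyphen."""
    return f"{prefix}-{root}"


def nasal_spread(form):
    """Simplification: y becomes ñ once a nasal-bearing prefix is attached."""
    return form.replace("y", "ñ")


def back_form(prefixed):
    """Strip the prefix again, keeping the altered root."""
    return prefixed.split("-", 1)[1]


def antongil_reflex(root):
    """Houtman's Antongil Bay data: final *T appears as ts."""
    return root.replace("T", "ts")


def merina_reflex(root):
    """Merina: ñ merges with n; final *T becomes tr plus whispered ă."""
    return root.replace("ñ", "n").replace("T", "tră")


def derive(root="ayaT"):
    """Return the labelled stages from the borrowed root to the attested forms."""
    prefixed = add_prefix(root, "mi")
    spread = nasal_spread(prefixed)
    new_root = back_form(spread)
    return [
        ("PMLG1 borrowed root", root),
        ("with prefix mi-", prefixed),
        ("after nasal spread", spread),
        ("back-formed root", new_root),
        ("Antongil Bay (1603)", add_prefix(antongil_reflex(new_root), "mi")),
        ("Merina root", merina_reflex(new_root)),
    ]


if __name__ == "__main__":
    for label, form in derive():
        print(f"{label}: {form}")
```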
Fortition of semivowels is a common development in the phonological history of Malay59, and
modern Malay ajar could have evolved from *(h)ayar through the fortition of *y. The
development may have been as follows:
Early Malay *(h)ayar ‘learning/teaching’ to modern Malay:

(via regular inheritance):
> (with *y fortition): modern Malay ajar ‘learning, teaching’, bəl-ajar ‘to study’,
57
Alfred B. Hudson, The Padju Epat Ma’anjan Dajak in historical perspective, Indonesia (Cornell University), pp 8-
42, 1967.
58
Abdul Djebar Hapip, Kamus Banjar – Indonesia, op. cit.
59
Alexander Adelaar, More on Proto-Malayic, in Mohd. Thani Ahmad and Zaini Mohamed Zain (eds), Rekonstruksi
dan cabang-cabang Melayu Induk, Kuala Lumpur: Dewan Bahasa dan Pustaka, 1988, pp 62-63.
məŋ-ajar ‘to teach’.
Nasal spreading to a following *y may have been an areal feature in the history of South Borneo,
affecting Ngaju, Ma’anyan and PMLG1, the predecessor of modern MLG spoken in South
Borneo at the time of the Malagasy migrations to East Africa. It apparently also left a trace in
Banjar Malay, to wit the various forms samua’an, samuya’an, samuña’an ‘all, everybody’ in this
language. These languages used to be spoken in a continuous area and were in contact. They still
are in the case of Ma’anyan, Ngaju and Banjar Malay. One also recalls that the Banjar Malay
metropole and the Ma’anyan speaking region were more directly contiguous in the past than they
are today60, and that PMLG1 was most likely spoken in an area directly bordering that of Banjar
Malay (if not overlapping with it)61.
In PMLG1, nasal spreading may have resulted in a palatal nasal, as it did elsewhere in South
Borneo. PMLG1 evolved into PMLG2, the language showing Bantu influence which was
directly ancestral to all modern forms of MLG. While nasal spreading may have been productive
in PMLG1, it apparently lost its productivity in PMLG2: at any rate, none of the modern MLG
dialects show the sort of morphological alteration caused by nasal spreading observed in Ngaju
and Ma’anyan. However, PMLG2 did maintain the palatal nasal as a result of this spreading, as
testified by Houtman’s 1603 data on the Antongil Bay dialect. The palatal nasal in meynja and
mang’anjarts ‘to teach’ etc. directly supports this assumption. None of the other MLG dialects
have a palatal nasal phoneme, as they merged whatever *ñ they may have inherited from Proto
SEB or (Banjar) Malay with *n to n.62 If Houtman’s data attest the presence of a palatal nasal
which agrees with a palatal nasal in SEB languages and/or in loanwords from Malay, we assume
by implication that PMLG1 and PMLG2 also had an *ñ. We reconstruct for both proto-levels the
following etyma: *fañu ‘turtle’, *meña ‘red’, *añaT ‘studying/teaching’, *fà[ñn]iŋ ‘dizzy’, and
*wu[ñn]i ‘secretly’.
Note incidentally that in SEB languages and PMLG1, *y is always a secondary development. It
was not a reflex of PMP *y as the latter was lost (compare the PMP roots *kayu ‘wood; tree’ >
Ma’anyan ka-kau ‘tree’; *layu ‘fading’ > Ma’anyan la’u ‘weakened’). The y in SEB languages has
many origins: (1) it reflects *R (e.g. *suRuq ‘send’ > Ma’anyan huyu ‘id.’), (2) it is due to lexical
borrowing (e.g. wayat and mam-bayar ‘to pay’63, ‘id.’, layu ‘fading’), or (3) it is due to
60
Alfred B. Hudson, The Padju Epat Ma’anyan Dayak in historical perspective, Indonesia (Cornell University), p.
15.
61
Alexander Adelaar, Who were the first Malagasy, and what did they speak?, In Andrea Acri, Roger Blench et
Alexandra Landmann (ed.), Spirits and Ships. Cultural Transfer in Early Monsoon Asia, Singapour, Institute of
South East Asian Studies, 2017, p. 441-469.
62
In Merina, this merger was extended to *ŋ, so that this dialect ended up with only two phonemic nasals, n and m
(Otto Christian Dahl, Malgache et Maanyan, op. cit., pp 36-37).
63
Both words have the same origin, mam-bayar being a more modern reflex of the same Malay lending form bayar.
prefixation and subsequent desyllabification and fortition of the personal article *(h)i- (e.g. SEB
*hi + *andi > Merina zandri ‘younger sibling’; *i ahu ‘1st person singular nominative topical
pronoun’ > Merina izàhu, Bara iàhu, Tandroy zàhu, Old MLG64 yahu ‘id.’).
Conclusion
It is obvious that philological texts are of crucial importance for the study of history, language
and culture. This also includes the study of historical linguistics. Somewhat less obvious,
perhaps, is that in historical linguistics, philological texts are not always sourced properly, if they
are sourced at all.
In the present paper we were able to draw the following conclusions from textual materials
dating back to the 17th century CE. A deeper look into the orthographic conventions and spelling
variation in Sorabe texts reveals that Southeast coastal MLG still had the semivowels *y and *w
at the time the Arabic script was first used for MLG and its Sorabe adaptation came into being. It
also still distinguished long or double vowels. Furthermore, the linguistic data in Sorabe texts
betray influence from various dialects, including a dialect as far away as northern
Madagascar. They also cast some doubt on the assumption that the language of early Sorabe
texts is a form of Taimoro MLG, as it also has some critical features in common with Antanosy
MLG. Finally, Houtman’s data from the Antongil Bay area in northeastern Madagascar testify to
the occurrence of a palatal nasal phoneme in the early history of MLG.
64
Gabriel Ferrand, Un texte arabico-malgache du XVIe siècle, op. cit., pp 44, 63.