Kwary-Jalaluddin2015 ReferenceWorkEntry TheLexicographyOfIndonesianMal
International Handbook of Modern Lexis and Lexicography
DOI 10.1007/978-3-642-45369-4_83-1
© Springer-Verlag Berlin Heidelberg 2015
Email: deny_ak@yahoo.com
Abstract
The lexicography of Indonesian and Malay is closely related. The Indonesian and Malay languages originate
from the same language, called Melayu, which was the language of the people who lived on the coastal plains
of east and southeast Sumatra and offshore islands. The description of the lexicography of Indonesian/
Malay starts with the lexical characteristics of these languages. A general history of the lexicography of
Indonesian/Malay is then presented, followed by the specific further development of lexicography in
Indonesia and Malaysia, respectively. The third section deals with corpora for both languages. The
important role of the language planning institutions in Indonesia (called Badan Bahasa) and in Malaysia
(called Dewan Bahasa dan Pustaka) is given due attention, with particular reference to the paper and
electronic dictionary products of these institutions. The chapter concludes with future prospects.
Introduction
The Indonesian language and the Malay language share the same origin. Both languages originated with
the Malay (Melayu) people who lived on the coastal plains of east and southeast Sumatra and offshore
islands (Sneddon 2003, p. 7). By the turn of the twentieth century, the Malay language had two different
names (Bahasa Indonesia and Bahasa Melayu) with only slight differences in the vocabulary. At the
Second Indonesian Youth Congress in 1928, the delegates proclaimed Bahasa Indonesia as the language
of national unity. Bahasa Indonesia then became the national language of Indonesia after its independence
in 1945. The name Melayu is retained by Malay people, and the Malay language was declared the national
language of Malaysia when it gained its independence in 1957.
The Indonesian language has a considerable number of speakers, given that Indonesia is one of the most
populous countries in the world. There are over 240 million Indonesians, so we can assume that the
number of speakers is not less than that. However, their proficiency in Indonesian varies considerably,
because most of them speak Indonesian as a second language. The first language
of most Indonesians is one of the hundreds of local languages found in Indonesia.
According to the data from Ethnologue (http://www.ethnologue.com/), of the 7,105 languages spoken in
over 200 countries in the world, 706 are spoken in Indonesia. As for the Malay language, it is spoken by
approximately 28 million people. In a wider context, the number of people who speak Indonesian/Malay
can reach 400 million people, comprising those who live in Indonesia, Malaysia, Singapore, Brunei
Darussalam, and Southern Thailand.
Indonesian and Malay are similar in most respects. In terms of orthography, both languages share the
same vowels and consonants, so the spellings of most of the words are the same. In terms of phonetics and
phonology, there are also considerable similarities. The pronunciation of the words at the segmental level
is the same, and the only difference that can probably be noticed is at the suprasegmental level. However,
since suprasegmental features are not distinctive features, the difference in the pronunciation will not
cause any misperception of the words pronounced. Any misunderstandings only happen at the lexical
semantics level, because some of the words or cognates have developed different meanings. The
differences at the semantic level can also occur due to special contextual use. For example, in Indonesian
and Malay, the word lembu “cow” refers to an animal with four legs. However, in some contexts in Malay,
the word lembu may also connote bodoh “stupid.” In Indonesian, the word keledai “donkey” is used to
connote bodoh “stupid.”
In order to account for dialectal differences of Indonesian/Malay, particularly at the word level, a
dictionary known as Kamus Melayu Nusantara has been published by Dewan Bahasa, Brunei
Darussalam. The project was initiated by Majlis Bahasa Brunei, Indonesia, and Malaysia (MABBIM, “the
Language Council of Brunei, Indonesia, and Malaysia”). This dictionary is a combination of two
comprehensive dictionaries, i.e., Kamus Dewan (Malaysia) and Kamus Besar Bahasa Indonesia
(Indonesia), and additional corpus data from Brunei Darussalam (Omar 2008).
Description
Lexical Characteristics of Indonesian/Malay
Indonesian/Malay uses the Latin alphabet. There are 26 letters, comprising five vowels (a, i, u, e, o) and
21 consonants (b, c, d, f, g, h, j, k, l, m, n, p, q, r, s, t, v, w, x, y, z). Of these 26 letters, only the vowel “e”
has two different pronunciations, i.e., [e] and [ə], while the other 25 letters have regular pronunciations.
This means that the number of sounds (i.e., 27) is close to the number of letters (i.e., 26). In
addition, the spelling system is phonemic, so words can be read without difficulty. For instance, the
word makan “eat” is pronounced exactly as it is written, i.e., [mɑkɑn]. Loan words with complex syllable
structures undergo phonological modifications. For example, the English words “bomb,” “method,” and
“consonant” become bom, metode, and konsonan in orthography and are pronounced [bom], [metodə],
and [konsonɑn], respectively. The modification involves vowel insertion and consonant deletion, which are
triggered by the native phonological system.
Based on their position, there are three types of affixes in Indonesian/Malay, i.e., prefix, suffix, and
confix, where the prefix is the most productive one. The prefixes pose a challenge to lexicographers: how
should complex words with prefixes be placed in a dictionary? The most common method is to place the
inflections under the lemma. However, this usually confuses the users or learners because they may not
know the root of the word. Take, for example, the words mengajar “to teach” and mengejar “to chase.”
A learner who knows the prefix meng- will know that the root of mengajar is ajar, but this learner may also
infer that the root of mengejar is ejar. This is incorrect, because the root of mengejar is kejar. This means
that when this learner looks for the word ejar in a dictionary, they will not be able to find it. There has been a
suggestion to include all the inflected forms as headwords in a dictionary. However, this will make the
dictionary very thick under the letter M, because most of the verbs in Indonesian/Malay can take the prefix
meng-, which has several allomorphs, i.e., meng-, mem-, men-, me-, and menge-. In addition, the passive
forms in Indonesian/Malay are formed by adding the prefix di- or ter- to a verb. If these are listed as
headwords, the letters D and T in the dictionary will also be very thick. However, this is not a problem for an
electronic dictionary, because the lexicographers can put all the inflected forms as headwords and use cross-
referencing to the headwords that contain the roots of the words. For a printed dictionary, especially a learner
dictionary, Kwary (2010) suggested the use of an appendix that contains an explanation of the word
formation rules. The users, especially those who are not native speakers of Indonesian/Malay, can refer to
the appendix when they need to know the root of a particular complex word.
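The cross-referencing idea for an electronic dictionary can be illustrated with a small sketch. The Python code below is only an illustration under stated assumptions, not the method of any existing dictionary: the headword list HEADWORDS and the rule table RULES are hypothetical, and the table is a simplified picture of the meng- allomorphs (it also includes the variant meny-, which is not among the allomorphs listed above). Given a prefixed verb, the function proposes candidate roots and keeps only those that exist as headwords.

```python
# Hypothetical mini headword list; a real dictionary would supply its own.
HEADWORDS = {"ajar", "kejar", "baca", "pukul", "tulis", "sapu", "lihat", "cat"}

# Simplified, assumed rule table: (surface prefix, root-initial letters that
# may have been dropped when the prefix was attached). "" means nothing dropped.
RULES = [
    ("menge", [""]),       # menge- before short roots: mengecat -> cat
    ("meng", ["", "k"]),   # meng- before vowels, or root-initial k dropped
    ("meny", ["s"]),       # meny- replaces root-initial s (assumed extra rule)
    ("mem", ["", "p"]),    # mem- before b/f, or root-initial p dropped
    ("men", ["", "t"]),    # men- before c/d/j, or root-initial t dropped
    ("me", [""]),          # me- before l, m, n, r, w, y
]


def candidate_roots(word):
    """Return every root the rule table allows for a prefixed word."""
    candidates = []
    for prefix, dropped_letters in RULES:
        if word.startswith(prefix):
            rest = word[len(prefix):]
            for letter in dropped_letters:
                candidate = letter + rest
                if candidate and candidate not in candidates:
                    candidates.append(candidate)
    return candidates


def find_roots(word, headwords=HEADWORDS):
    """Keep only the candidate roots that are actual headwords."""
    return [c for c in candidate_roots(word) if c in headwords]


if __name__ == "__main__":
    for verb in ["mengajar", "mengejar", "membaca", "memukul", "menulis"]:
        print(verb, "->", find_roots(verb))
```

With this toy headword list, mengajar resolves to ajar and mengejar to kejar, because the unattested candidate ejar is filtered out. An electronic dictionary could store such links as cross-references from each prefixed form to its root entry.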
Teuku Iskandar was the chief editor of the first edition of Kamus Dewan, and he was assisted by A. Teeuw,
a Dutch scholar who was appointed by UNESCO as an advisor to this project. The drafting started in 1967
and was completed in 1970 (Baharom 2007).
DBP’s first attempt to produce a bilingual dictionary dates from 1979. It was followed by a joint effort
with the Australian National University to compile a comprehensive English-Malay dictionary, which
materialized in 1992. This dictionary serves as a useful tool for translators, especially for translation from English
to Malay. It focuses on polysemy and word choice in different contexts.
To date, DBP has produced nine Malay monolingual dictionaries and six bilingual dictionaries, namely,
English-Malay, French-Malay, Thai-Malay, Russian-Malay, Tamil-Malay, and Mandarin-Malay (Padilah
2012). In addition, DBP has engaged in coining terminology, which is subsequently compiled in the
form of dictionaries. These dictionaries are discipline-specific references, such as those for physics,
chemistry, biology, medicine, banking, economy, and linguistics. The main objective of this compilation
is to fulfill one of the national aspirations, namely that Malay, as the national language, is both
the language of knowledge and the language of national unity.
As far as lexicography is concerned, DBP’s greatest endeavor has been to produce Kamus Besar Bahasa
Melayu Dewan (KBBM), which is estimated to have about 100,000 entries. Each entry is rich in
information, including phonetic transcription, grammatical category, etymology, and Jawi transcription,
as illustrated in Fig. 1.
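[Figure residue: flowchart labels “Teks Digital dari Internet” (digital texts from the Internet) and “Pemilihan teks melalui Sistem Korpus” (text selection through the corpus system)]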
language. The corpus can be accessed at the website of the Sketch Engine, i.e.,
http://www.sketchengine.co.uk/. The second corpus of the Indonesian language, also consisting of
approximately 100 million words, was likewise created outside Indonesia, namely, at Leipzig University
in Germany (Quasthoff and Goldhahn 2013). The corpus is available at the website
http://corpora.informatik.uni-leipzig.de. The texts for this
corpus were taken from the websites of Indonesian newspapers and Wikipedia.
These two corpora were used by Kwary (2013) to create the first high-frequency list of the Indonesian
language. The first base list consists of 500 word families. This base list is embedded in a modified version
of the AntWordProfiler (AWP) software, which was originally created by Laurence Anthony of Waseda University.
The original software can be downloaded from the website http://www.laurenceanthony.net/software.html (Anthony
2012). The modified version, which can be used to check the profile of an Indonesian text, is called AWP-IWL
(Indonesian Word List) and is available from http://www.kwary.net/iwl.html. An analysis of several general
texts shows that the first base list (which consists of only 500 word families) covers more than 60% of the
words in those texts. Further work on the creation of a second base list and an academic word list
is needed so that the profile of general texts can be analyzed properly.
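The notion of coverage used here can be made concrete with a short calculation: the share of running words (tokens) in a text that belong to a base list. The Python sketch below is an illustration only and does not reproduce AWP-IWL. The tokenizer, the function names, and the four-word sample list are assumptions, and the sketch matches plain word forms, whereas the actual base list is organized in word families.

```python
import re


def tokenize(text):
    """Split text into lowercase word tokens, keeping hyphenated forms whole."""
    return re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower())


def coverage(text, base_list):
    """Return the fraction of tokens in `text` that appear in `base_list`."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    covered = sum(1 for token in tokens if token in base_list)
    return covered / len(tokens)


if __name__ == "__main__":
    # Hypothetical four-item list standing in for a 500-family base list.
    sample_base_list = {"saya", "makan", "di", "dan"}
    sample_text = "Saya makan nasi di rumah dan saya minum teh."
    print(f"coverage: {coverage(sample_text, sample_base_list):.1%}")
```

In the sample sentence, five of the nine tokens are in the list, so the coverage is 5/9. A figure such as "more than 60%" for the real base list is a ratio of the same kind, computed over much longer texts.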
of an entry has changed from intuition-based description to a corpus-based description. The citation of
examples is more natural and authentic. The lexicographers have also been able to look for more appropriate
synonyms and polysemies. The corpus has made the job of lexicographers much easier but at the same time
more challenging. As a result, Malay dictionaries have become more reliable and respectable.
Planning in Indonesia
In Indonesia, the language planning institution is called Badan Bahasa. The main task of Badan Bahasa is
to develop, cultivate, and preserve Indonesian languages and literature
(http://badanbahasa.kemdikbud.go.id/lamanbahasa/sejarah/). Badan Bahasa is under the Ministry of Education
and Culture. It has two divisions: the Centre for Language Development and Preservation and the Centre for Language
Cultivation and Socialization. The dictionary work is handled by the subdivision for standardization
and preservation under the Centre for Language Development and Preservation. The latest dictionaries
produced by this subdivision are as follows
(http://badanbahasa.kemdikbud.go.id/lamanbahasa/jenis_produk/Kamus%20Bahasa%20Indonesia):
1. Kamus Besar Bahasa Indonesia (KBBI). This is the comprehensive Indonesian dictionary already
mentioned. The dictionary is available in printed form and online; it can be accessed at
http://badanbahasa.kemdikbud.go.id/kbbi/. The front page of the website is shown in Fig. 3.
2. Kamus Bahasa Indonesia untuk Pelajar. This is a student dictionary which can be used as a reference
for students at junior high schools and senior high schools in Indonesia. This dictionary contains
31,200 entries and is only available in printed form.
3. Glosarium. This is a bilingual dictionary that focuses on scientific terms. It includes terms related to
religion, linguistics, mathematics, biology, psychology, etc. It is the main reference for translators who
want to know the Indonesian equivalents of foreign terms. This dictionary is available in printed form,
Planning in Malaysia
Even though DBP is not the only producer of dictionaries in Malaysia, it shoulders the task of compiling
high-quality dictionaries. The Department of Language Development at DBP comprises a
lexicography division, a lexicology division, and a dictionary division. These have complementary tasks in
sustaining rigorous dictionary work. In parallel with today’s technological advances, DBP has become more
user-friendly, encouraging users to interact with and respond to its dictionary work. As an
initial effort, DBP has set up Sistem Bahasa Melayu Bersepadu (SBMB) or Integrated Malay Language
System. SBMB incorporates all the systems, including the dictionary management and organization system,
the corpus management and development system, the terminology management and development system, the
encyclopedia management and development system, the minority language management and development
system, and a “hotline” language service. Users are free to interact with the staff.
DBP’s Pusat Rujukan Persuratan Melayu (PRPM) has become the most popular language website in
Malaysia. Its website address is http://prpm.dbp.gov.my/. PRPM serves as a one-stop center for all language
information seekers. Any language user can search for a specific section of the site, and the search engine
will bring the user to the desired page. Figure 5 shows a page in the dictionary-specific section. There are
twelve dictionaries that can be retrieved, and information can be extracted from each of them. This has
become a great help to users, who can obtain answers to Malay language inquiries within a split second.
Another promising and exciting avenue is DBP’s Gerbang Kata section. This section is specifically
designed for e-dictionary (e-kamus) services. Gerbang Kata is reachable at the website
http://ekamus.dbp.gov.my/. It serves as a platform for users to interact with the lexicography unit to discuss, give feedback,
contribute ideas, and introduce new words to the unit. Figure 6 is the front page of the website of
Gerbang Kata.
lexicographic practice and theory, blending them together in order for students to have a better
comprehension of this discipline. Consequently, we may be looking at brighter prospects for lexicography in
Indonesia and Malaysia.
There are a number of possible future tracks for lexicography in Indonesia and Malaysia. Considering
the number of local languages in both countries, we should be looking at digitalization of these local
languages, especially the endangered ones, and further studies on the role of a local language in the
society.
In Malaysia, Jalaluddin et al. (2013) have ventured into the study of an endangered language; the research
team attempts to relate lifelong learning with the understanding of an aboriginal community, with
specific focus on compiling a dictionary of that aboriginal language. A combination of field
research and knowledge of dictionary compilation provides a route to useful insights into an
aboriginal people’s language. In addition to the language of the aboriginal people, we are exposed to their
intellectual, economic, social, and personal contexts from which their language and values arise.
Therefore, this compilation is a two-way learning process: on the one hand, new information is disclosed
during fieldwork; and on the other, new insights about aboriginal world views are revealed.
Fieldwork as conducted by Jalaluddin et al. has also been conducted in Indonesia by Badan Bahasa
through their local units called Balai Bahasa. However, further lexicography training is needed for the
staff members of Balai Bahasa in order to document the lexicon of the local languages properly and to
reveal the insights of the local people about their lives and their surroundings. The proper documentation
of the lexicons of the hundreds of local languages in Indonesia will no doubt enrich the comprehensive
Indonesian dictionary which “only” consists of 90,000 entries in its latest (fourth) edition (2008).
In 2015, a new publication called the Frequency Dictionary Indonesian (Quasthoff et al. 2015) was
published by Leipziger Universitätsverlag. This work is the first to use the 100-million-word corpus of the
Indonesian language. It includes both the most frequent 1,000 word forms in order of frequency and data
on the relative frequency of 1,000,000 word forms. This publication is now being considered by Badan
Bahasa in order to inform the revision of the comprehensive Indonesian dictionary and to prepare for its
fifth edition.
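The two kinds of data mentioned for the frequency dictionary, a ranked list of the most frequent word forms and the relative frequency of each form, can be derived from any corpus with a simple count. The following Python sketch is a hedged illustration, not the procedure used by Quasthoff et al. (2015); the tokenizer, the function name, and the sample text are assumptions.

```python
import re
from collections import Counter


def frequency_table(text, top_n):
    """Return the `top_n` most frequent word forms as
    (word form, count, relative frequency) tuples, most frequent first."""
    tokens = re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower())
    counts = Counter(tokens)
    total = sum(counts.values())
    return [(form, count, count / total)
            for form, count in counts.most_common(top_n)]


if __name__ == "__main__":
    # Tiny made-up sample; a real frequency list needs a large corpus.
    sample = "yang dan yang di yang dan"
    for form, count, share in frequency_table(sample, top_n=3):
        print(f"{form}\t{count}\t{share:.3f}")
```

Run on a 100-million-word corpus instead of the sample string, the same counting step would yield a ranked list from which the first 1,000 word forms could be taken.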
References
Ahmad, I. (2002). Perkamusan Melayu: suatu pengenalan. Kuala Lumpur: Dewan Bahasa dan Pustaka.
Alwi, H., Dardjowidjojo, S., Lapoliwa, H., & Moeliono, A. M. (1998). Tata Bahasa Baku Bahasa
Indonesia. Jakarta: Balai Pustaka.
Anthony, L. (2012). AntWordProfiler (Version 1.4.0) [Computer Software].
http://www.laurenceanthony.net/antwordprofiler_index.html
Baharom, N. (2007). Perkamusan di Malaysia. In N. H. Jalaluddin & R. Baharudin (Eds.), Leksikologi
dan Leksikografi Melayu (pp. 18–52). Kuala Lumpur: Dewan Bahasa dan Pustaka.
Ghani, R. A., Husin, N. M., & Chin, L. Y. (2008). Pangkalan data korpus DBP: Perancangan, pembinaan
dan pemanfaatan. In Z. Ahmad (Ed.), Nahu Praktis Bahasa Melayu. Bangi: Penerbit UKM.
Jalaluddin, N. H., Zainudin, I. S., Ahmad, Z., Mohamad, F., Sultan, M., & Radzi, H. M. (2013). The
dictionary as a source of a lifelong learning. Paper presented at the 5th World congress on educational
sciences, Sapienza University, Rome, Italy.
Jalaluddin, N. H., Zainudin, I. S., Sanit, N., & Yusoff, Y. M. (2012). Teaching and learning lexicography:
from impressionistic to systematic understanding. Paper presented at U.K.M. teaching and learning
congress, Bentong, Pahang.
Kilgarriff, A., Reddy, S., Pomikálek, J., & Avinesh, P. S. V. (2010). A corpus factory for many languages.
Paper presented at the seventh international conference on language resources and evaluation, ELRA,
Malta.
Kridalaksana, H. (1979). Lexicography in Indonesia. RELC Journal, 10(2), 57–66.
Kwary, D. A. (2010). Bilingual dictionaries in language cultivation. Paper presented at the Language
Planning Symposium, Badan Bahasa, Jakarta.
Kwary, D. A. (2013). Creating and testing the Indonesian High Frequency Word List. Paper presented
at the 11th KOLITA (‘Annual Linguistics Conference’), Atma Jaya University, Jakarta.
Omar, A. H. (2008). Perkamusan Melayu: dari jejak pengembara ke pembangunan negara. In
N. H. Jalaluddin & R. Baharudin (Eds.), Leksikologi dan Leksikografi Melayu. Kuala Lumpur:
Dewan Bahasa dan Pustaka.
Padilah, A. (Ed.). (2012). Meneliti Jejak Membaharui Babak. Kuala Lumpur: Dewan Bahasa dan
Pustaka.
Quasthoff, U., Fiedler, S., Hallsteinsdóttir, E., Kwary, D. A., & Goldhahn, D. (2015). Frequency
dictionary Indonesian. Kamus Frekuensi Bahasa Indonesia. Leipzig: Leipziger Universitätsverlag.
Quasthoff, U., & Goldhahn, D. (2013). Indonesian Corpora. Leipzig: Universität Leipzig.
http://asv.informatik.uni-leipzig.de
Sneddon, J. (2003). The Indonesian language: Its history and role in modern society. Sydney: UNSW
Press.
Dictionaries
de Houtman, F. (1603). Spraeck ende woord-boeck, Inde Malaysche ende Madagaskarsche Talen met
vele Arabische ende Turcsche Woorden. Amsterdam: Jan Evertsz.
Gerbang Kata. http://prpm.dbp.gov.my/
Glosarium. http://badanbahasa.kemdikbud.go.id/glosarium
Haji, R. A. (1928). Kitab Pengetahuan Bahasa iaitu Kamus Loghat Melayu-Johor-Pahang-Riau-Lingga,
penggal yang pertama. Singapore: Al-Ahmadiah Press.
Kamus Bahasa Indonesia untuk Pelajar. http://badanbahasa.kemdikbud.go.id/lamanbahasa/produk/889
[KBBI] Kamus Besar Bahasa Indonesia. http://badanbahasa.kemdikbud.go.id/kbbi
[KBBM] Kamus Besar Bahasa Melayu Dewan. (Forthcoming in 2017). Kuala Lumpur: Dewan Bahasa
dan Pustaka.
Kamus Dewan. (1970). Kuala Lumpur: Dewan Bahasa dan Pustaka.
Poerwadarminta, W. J. S. (1957). Kamus Umum Bahasa Indonesia. Jakarta: Balai Pustaka.
Pusat Rujukan Persuratan Melayu. http://prpm.dbp.gov.my/
Syed Mahmud bin Almarhum Syed Abdul Kadir Al Hindi. (1894). Kamus Waman Yatawakkal.
Singapore: Al-Ahmadiah Press.
Wiltens, C., & Danckaerts, S. (1623). Vocabularium ofte Woortboeck naer ordre vanden Alphabet in’t
Duytsch-Malaysch ende Malaysch-Duytsch. ’s Graven-Haghe: by de Weduwe, ende Erfghenamen van
wijlen Hillebrant Jacobssz van Wouw.