
Turkish Journal of Computer and Mathematics Education Vol.12 No.3 (2021), 852-859
Research Article

A Study on Classical Texts in the Field of Computational Linguistics through Bibliometric Analysis
Athira Najwa Zakaria1, Anida Sarudin2*, Zulkifli Osman3, Husna Faredza Mohamed Redzwan4,
Muhammad Fadzllah Hj. Zaini5
1,5 Universiti Pendidikan Sultan Idris
2*,3,4 Fakulti Bahasa dan Komunikasi, Universiti Pendidikan Sultan Idris
athiranajwa3@gmail.com1, anida@fbk.upsi.edu.my2*, zulkifli@fbk.upsi.edu.my3,
husna.faredza@fbk.upsi.edu.my4, muhamadfadzllahhjzaini@gmail.com5

Article History: Received: 10 November 2020; Revised: 12 January 2021; Accepted: 27 January 2021;
Published online: 05 April 2021
Abstract: Manuscripts, or classical texts handwritten on paper, bark, and rattan, are relics of past generations.
According to the generated data, SpringerLink published a total of 111,070 articles concerning classical texts from 2015
to 2019. The present bibliometric analysis focuses on three aspects, namely year of publication, type of document, and field
of discipline. The data, presented in tables and visual displays, show the current trends of classical text studies.
The bibliometric analysis finds that the most common type of publication generated from the classical text keyword is
“article” with 6,962,098 hits. The field of research which records the highest search frequency is “Physics” with 30,075,
whereas the field of “Linguistics” records only 1,425. However, the analysis concentrates on research in the subdiscipline of
computational linguistics. The Language and Literature subdiscipline records the highest search number, 148. Through the
bibliometric study, three prominent lexicons from the field of linguistics are revealed to be closely related to classical texts:
Language, Corpus, and the Arabic Language. Visualising the topics of the research papers through a word cloud
reveals these three lexicons. In conclusion, bibliometric analysis related to the field of linguistics not only provides a clear
view of current developments in global classical text studies, but also indicates the future research potential of the field.
Keywords: Computational Linguistics, Bibliometric Analysis, Subdiscipline

1. Introduction

Researchers around the world are actively pursuing knowledge through classical text studies. In Malaysia,
Malay manuscripts have piqued the curiosity not only of local researchers but also of international researchers,
such as those from Leiden University and Hamburg University, who are avidly engaged in manuscript research
(Ming, 2003). These researchers find contemporary, interdisciplinary relevance in the contents of Malay
manuscripts. By drawing on higher-order thinking indicators (Sarudin et al., 2019a; 2019b), advanced Malay
language proficiency, and a strong grasp of the socio-cultural skills understood as “santun berbahasa”
(socio-cultural language politeness) (Mohamed Redzwan et al., 2020), one can explore the comprehensive
knowledge contained in Malay manuscripts, including aspects of intellect, intellectuals, weltanschauung
(worldview), and the civilisation of society. Malay manuscripts are the highest contribution of traditional Malay
society: they embody a versatile knowledge tradition and reflect the sociological advancement of its civilisation
(Abd. Azam & Yatim, 2012). Historically, manuscripts were produced in the palace, as only rulers such as kings
were able to read and write, whereas ordinary citizens disseminated knowledge orally. The estimated number of
manuscript collections in Jawi writing is 265. Among the prominent manuscripts still being researched today are
Sulalatus-Salatin, Hikayat Amir Hamzah and Hikayat Hang Tuah.

The British occupation of the Malay lands disseminated Malay manuscripts widely to an international
audience (Ming, 2003). The Malay language gained popularity in the West, beginning with the publication of
Malay-English dictionaries for communication during trade. As European scholarship developed, European
traders who came to the Malay lands were encouraged to purchase and preserve collections of Malay
manuscripts relating to the history and politics of the Malay Archipelago. English rulers at that time actively
had manuscripts copied when they were unable to purchase them. The transcribers were either paid local people
or English people who copied without understanding the contents. Foreign ownership is the reason why the
Malays do not own many manuscripts. For example, the owner of Hikayat Sri Rama, Edward Pococke, gave the
manuscript to the library of Oxford University for research (Ming, 2003). As a result, Malay manuscripts are
now held in many places in Britain and elsewhere in Europe, such as the Royal Asiatic Society, London (the
collection of Stamford Raffles), the library of Leiden University and the Baptist College library in Bristol.

Classical text studies are not confined to local research; investigating how they have influenced
international research is therefore the objective of this study. This paper examines the trends and
development of the keyword “classical text” through a bibliometric analysis.

2. Methodology

This paper uses bibliometric analysis to map the general knowledge structure of the research subject
(Rialti, Marzi, Ciappei, & Busso, 2019). Bibliometric analysis makes it possible to demonstrate progress in
both the method of application and the field of study (Iqbal et al., 2019). By applying relevant keywords, the
method uncovers current trends emerging around the target field. It also helps researchers identify key players
in the target field by analysing the frequency of related works found in, for example, journals, conference
papers, books, and reports. Bibliometric analysis is applicable to all fields of study, including science, social
science, and language. Gunashekar, Wooding and Guthrie (2017), Iqbal et al. (2019) and Muslu (2018) are
among the researchers who have applied bibliometric analysis in the fields of medicine, computer science, and
health science. For the literature review in this bibliometric analysis, the main keyword, “classical text”, is
used to obtain the related studies that underpin this paper.

The data were obtained from articles indexed by SpringerLink in October 2019. SpringerLink was chosen
for the value of its holdings, which comprise a wide range of document types such as journal articles, book
chapters, conference papers, reference work entries, protocols, and videos. SpringerLink is a leading database
on par with Scopus and ISI, and its quality is maintained by experts in various fields of research from around
the world. As of 2019, SpringerLink hosts 13,118,849 research materials, making it the most suitable database
for studying the trends of classical texts through bibliometric analysis.

Table 1. General information on SpringerLink

Type of Document          Total Publications
Article                            6,962,098
Book Chapters                      4,391,851
Conference Papers                  1,143,906
Reference Work Entries               562,947
Protocols                             57,939
Video                                    108

3. Results and Discussions

All the articles obtained were analysed according to the following aspects, namely type of document,
discipline, and subdiscipline.

Table 2. Types of Documents for the Keyword [classic text]

Types of Documents      Frequency  Percentage (%)
Articles                  111,070           53.94
Book Chapters              90,104           43.76
Reference Work Entry        4,217            2.05
Protocols                     494            0.24
Books                          36           0.017
Videos                          4          0.0019
Book Series                     1         0.00049
Total                     205,926             100

According to Table 2, articles have the highest frequency at 111,070 (53.94%), followed by book chapters
at 90,104 (43.76%). Both document types far exceed reference work entries, protocols, books, videos and book
series. Reference work entries account for 4,217 items (2.05%), while the remaining four document types each
fall below 1%: protocols at 0.24% (494), books at 0.017% (36), videos at 0.0019% (4), and book series at
0.00049% (1). The results shown do not tally with the generated data because the data show only the reachable
number of articles. The analysis then proceeds by type of discipline (refer to Table 3).
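
The percentages reported in Table 2 follow directly from the frequencies. The following is a minimal Python sketch of that arithmetic, not the authors’ own procedure: the frequencies are copied from Table 2, while the function name percentage_shares is a hypothetical helper introduced here for illustration.

```python
# Frequencies copied from Table 2 (keyword [classic text]).
table2 = {
    "Articles": 111_070,
    "Book Chapters": 90_104,
    "Reference Work Entry": 4_217,
    "Protocols": 494,
    "Books": 36,
    "Videos": 4,
    "Book Series": 1,
}


def percentage_shares(freqs):
    """Return each category's share of the overall total, in percent.

    Hypothetical helper for illustration only.
    """
    total = sum(freqs.values())
    return {name: 100 * count / total for name, count in freqs.items()}


shares = percentage_shares(table2)

# Print one row per document type: name, frequency, unrounded percentage.
for name, share in shares.items():
    print(f"{name:<22}{table2[name]:>9,}{share:>10.5f}")
```

The same helper could be applied to the frequencies of any other table in this paper by swapping in a different dictionary.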

Table 3. Types of Discipline for the Keyword [classic text]

Types of Discipline                             Frequency  Percentage (%)
Physics                                            30,075           15.53
Mathematics                                        29,076           15.01
Engineering                                        28,705           14.82
Computer Science                                   28,565           14.75
Material Science                                    7,881            4.07
Philosophy                                          7,262            3.75
Chemistry                                           6,436            3.32
Medicine and Allied Sciences                        6,366            3.29
Life Sciences                                       6,079            3.14
Earth Sciences                                      5,161            2.66
Social Sciences                                     5,161            2.66
Business Management                                 4,524            2.34
Biomedicine                                         4,520            2.33
Education                                           4,382            2.26
Economics                                           3,881            2.00
Media and Culture Studies                           3,725            1.92
Literature                                          3,681            1.90
Political Sciences and International Relations      2,942            1.52
Psychology                                          2,362            1.21
History                                             2,342            1.21
Statistics                                          2,316            1.20
Law                                                 1,989            1.03
Environmental Studies                               1,528            0.79
Linguistics                                         1,425            0.74
Energy Studies                                        997            0.51
Religious Studies                                     980            0.51
Popular Sciences                                      653            0.34
Geography                                             639            0.33
Finance                                               541            0.28
Criminology and Criminal Justice                      361            0.19
Pure Sciences, Humanities and Social Sciences,        208            0.11
  Multidisciplinary Sciences
Dentistry                                             108            0.06
Pharmacy                                               36            0.02
Architecture                                           24            0.01
Total                                             193,691             100

Table 3 shows the disciplines returned by the search for the keyword [classic text]. Physics records the
highest result, with a frequency of 30,075 (15.53%). Mathematics, Engineering and Computer Science display
results similar to Physics, with frequencies of 29,076 (15.01%), 28,705 (14.82%), and 28,565 (14.75%)
respectively. The remaining disciplines trail these four by a wide margin; Material Science, the next highest,
records a frequency of only 7,881 (4.07%).

Meanwhile, the disciplines that record around 3% are Philosophy, Chemistry, Medicine and Allied Health
Sciences, and Life Sciences. The disciplines with percentages of 2.00% to 2.66% are Earth Sciences, Social
Sciences, Business Management, Biomedicine, Education and Economics. Twelve disciplines fall below 1%,
among them Environmental Studies, Linguistics, Finance, Dentistry and Pharmacy, with frequencies ranging
from 24 to 1,528.

The results for the keyword [classic text] in Tables 2 and 3 show a general trend: classical text studies
span a wide range of disciplines. The next step is therefore to limit the search results to articles related
only to Linguistics, using the search specification [classic text]-articles-linguistics-2015-2020 (refer to
Diagram 1).

Diagram 1. Keyword Search [classic text]

According to the search specifications, the search for articles related to Linguistics returned 362 results
spanning several subdisciplines related to classical text studies (refer to Table 4). Only related articles from
these subdisciplines are selected, as shown in the following table:

Table 4. List of Subdisciplines for the Keyword [classic text]

Type of Subdisciplines                              Frequency  Percentage (%)
Language and Literature                                   148           10.33
General Linguistics                                       131            9.14
Comparative Linguistics                                    93            6.49
General Education                                          90            6.28
Syntax                                                     82            5.72
Historical Linguistics                                     66            4.61
Language Education                                         62            4.33
Language Philosophy                                        60            4.19
Comparative Literature                                     56            3.91
Philology                                                  56            3.91
Computer Linguistics                                       53            3.70
General Computer Science                                   51            3.56
Neurology                                                  47            3.30
Psycholinguistics                                          47            3.30
General Sociology                                          43            3.00
Literature                                                 41            2.86
Semantics                                                  41            2.86
Applied Linguistics                                        30            2.09
Neuro-Linguistic Programming (NLP)                         30            2.09
Phonetics and Phonology                                    23            1.61
Sign Language                                              23            1.61
Theoretical Linguistics                                    23            1.61
Political Science                                          19            1.33
Sociolinguistics                                           19            1.33
Chinese Language                                           16            1.12
Computer Applications in Literature and Humanities         11            0.77
Pragmatics                                                 11            0.77
Culture and Religious Studies                              10            0.70
Russian Language                                           10            0.70
Cognitive Linguistics                                       9            0.63
Corpus Linguistics                                          9            0.63
Multi-language                                              9            0.63
German Language                                             7            0.49
Japanese Language                                           7            0.49
Total                                                   1,433             100

Based on Table 4, the Linguistics results span 34 subdisciplines covering 1,433 articles. The subdiscipline
of Language and Literature shows the highest result with a frequency of 148 (10.33%), followed by General
Linguistics with 131 (9.14%). Comparative Linguistics and General Education show similar results, with
frequencies of 93 (6.49%) and 90 (6.28%) respectively. The subdisciplines scoring lower than 1% are Computer
Applications in Literature and Humanities, Pragmatics, Culture and Religious Studies, Russian Language,
Cognitive Linguistics, Corpus Linguistics, Multi-language, German Language and Japanese Language, with
percentages and frequencies of 0.77% (11), 0.77% (11), 0.70% (10), 0.70% (10), 0.63% (9), 0.63% (9),
0.63% (9), 0.49% (7), and 0.49% (7) respectively.

The subdiscipline that can be examined in further detail, because of its closer correlation with and
relevance to previous literature, is Computer Linguistics. Given this paper’s focus on Computer Linguistics,
the WordSift website is used to form a holistic word cloud that reflects the lexical frequencies of the
research topic in the results of the performed search.

Diagram 2. Word Cloud of Keywords [classic text]


Source: https://wordsift.org/

Diagram 2 illustrates the word cloud produced by WordSift from the titles of the uploaded research
papers. A word cloud is a web-based application service that visualises words derived from language or text,
in keeping with the growing prominence of big data (Jin, 2017). Academics use word clouds to gain a critical
and holistic overview of research ideas (Qeis, 2015). Based on Diagram 2, the most prominent lexicons, namely
Language, Corpus, and Arabic Language, signify a substantial correlation with classical text studies in the
context of linguistics. Other lexicons include historical, method, resources, recognition, analysis and
linguistics. From the lexicons seen in the word cloud, the initial overview identifies 54 articles as related to
classical texts.
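
How WordSift builds its cloud internally is not described in this paper, but the underlying idea of ranking lexicons by their frequency across paper titles can be illustrated with a short Python sketch. The five titles are taken from this paper’s reference list; the stop-word list, the two-letter minimum token length, and the function name word_frequencies are assumptions made for this example only.

```python
import re
from collections import Counter

# Five titles taken from this paper's reference list.
titles = [
    "A 700M+ Arabic corpus: KACST Arabic corpus design and construction",
    "Studying the history of the Arabic Language: language technology "
    "and a large-scale historical corpus",
    "Exploring and exploiting a historical corpus for Arabic",
    "Curras: an annotated corpus for the Palestinian Arabic dialect",
    "Constructing two vietnamese corpora and building a lexical database",
]

# Hypothetical, deliberately tiny stop-word list (an assumption).
STOP_WORDS = {"an", "and", "for", "of", "the", "two"}


def word_frequencies(texts, stop_words=STOP_WORDS):
    """Count lower-cased alphabetic tokens of two or more letters."""
    counts = Counter()
    for text in texts:
        tokens = re.findall(r"[a-z]{2,}", text.lower())
        counts.update(t for t in tokens if t not in stop_words)
    return counts


freqs = word_frequencies(titles)

# The most frequent tokens are the ones a word cloud would draw largest.
for word, count in freqs.most_common(5):
    print(word, count)
```

A word cloud tool then maps each count to a font size; that rendering step is left out of this sketch.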

In total, the lexicon Language appears most often for the keyword classic text, since 53 of the 54 uploaded
articles discuss classical text studies from the perspective of language, not limited to one language but
across all languages. In classical text studies, the Arabic language dominates the discipline of Linguistics and
its subdiscipline, Computer Linguistics. Research on the Arabic language includes Belinkov, Magidow,
Barrón-Cedeño, Shmidman and Romanov (2019); Neme and Paumier (2019); Hammo, Yagi, Ismail and
AbuShariah (2016); Djellab, Amrouche, Bouridane, and Mehallegue (2017); and Al-Thubaity (2015). These
studies discuss the development of language corpora aimed at uncovering the beauty of language from the
aspects of sociology, history, semantics and grammar. Studies that explore the history of the Arabic language,
such as Hammo et al. (2016), aim to develop a historical corpus of classical texts, including the al-Quran, that
have existed for more than a thousand years. Meanwhile, lexicological studies such as Neme and Paumier
(2019) and Mohamed and Oussalah (2019) investigated the correlation of text to semantics through a hybrid
approach.

Asian researchers have also studied classical texts in their native languages. Koo (2015) studied an
unsupervised, fully statistical method for developing a character-based n-gram classifier that detects
loanwords, or transliterated imported vocabulary, in Korean classical text, relying on word frequencies and
bigram analysis. Pham, Tucker, and Baayen (2019) conducted general corpus-based research on various
Vietnamese materials, including children’s literature, and compared them with film subtitles. Pham et al. also
investigated semantics using the Latent Semantic Analysis (LSA) method. Classical text studies in other
languages cover Latin (Kabala, 2018; Boschetti, 2015), French (Magistry, Ligozat, & Rosset, 2019), Turkish
(Eryiğit et al., 2019), German (Schulz & Ketschik, 2019), Slovenian (Fišer, Ljubešić, & Erjavec, 2018),
Indo-European languages (Eckhoff et al., 2018) and Hausa (Bimba, Idris, Khamis, & Noor, 2016). In
summary, classical text studies apply to all languages of the world. However, more classical text research is
found on the Arabic language than on other languages.
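
As a rough illustration of what “character-based n-gram” means, the Python sketch below extracts overlapping character n-grams from words and counts them. It is not the classifier of Koo (2015); the function names char_ngrams and ngram_counts and the two sample words (common Korean loanwords for “computer” and “television”) are hypothetical choices for this example, and a real detector would go on to compare such counts between native and borrowed vocabulary.

```python
import unicodedata
from collections import Counter


def char_ngrams(text, n=2):
    """Return the overlapping character n-grams of a string.

    The text is NFC-normalised first so that each Hangul syllable
    is a single code point. Hypothetical helper for illustration.
    """
    text = unicodedata.normalize("NFC", text)
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def ngram_counts(words, n=2):
    """Count character n-grams over a list of words."""
    counts = Counter()
    for word in words:
        counts.update(char_ngrams(word, n))
    return counts


# Sample words chosen for this example only.
sample_words = ["컴퓨터", "텔레비전"]

print(char_ngrams(sample_words[0]))
print(ngram_counts(sample_words).most_common(3))
```

Words shorter than n simply yield no n-grams, so the helper can be applied to a whole word list without special cases.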

A word that occurs frequently in the word cloud of articles related to classical texts is corpus. Classical
texts store an abundance of data on various aspects of life. Researchers investigating classical texts
organise the data and information into digital corpora so that future research can access them easily. For
example, Rubinstein (2019) developed the first corpus database of Emergent Modern Hebrew, which combines
writings and visuals from various genres. The database has advanced research on the Hebrew language by
tracing its historical development from the classical era to modern times. Besides tracing language change,
a corpus database can identify differences between intra-language dialects.

Jarrar, Habash, Alrimawi, Akra, and Zalmout (2017); Djellab et al. (2017); Masmoudi, Bougares, Ellouze,
Estève, and Belguith (2018); and Abainia (2019) are among the researchers who study the Arabic language in
its diverse dialects according to their respective geographies, such as the Tunisian, Algerian and Palestinian
dialects, as well as Algerian Arabizi–French code-switched text. Such research is made possible by corpora
that record classical texts. Intra-language dialect research shows that classical text studies are not limited
to literary studies but encompass a wide range of disciplines, including in-depth discussions of phonetics and
phonology.

4. Conclusion

In brief, international classical text studies in the SpringerLink search results mostly take the form of
articles rather than other publication types such as book chapters, conference papers, reference work entries,
protocols and videos. The large number of articles suggests to researchers that publication in article format
has a broader reach and that high-quality, relevant material is available in large quantities.

The discipline with the highest search frequency is Physics with 30,075 (15.53%), followed by Mathematics,
Engineering and Computer Science. By contrast, the disciplines with the lowest percentages, below 1%, include
Environmental Studies, Linguistics, Finance, Dentistry, and Pharmacy, with frequencies from 24 to 1,528. The
pure sciences record more classical text studies than this paper’s field of interest, Language and
Linguistics. Further academic exploration of classical text studies in all related disciplines is recommended
in order to understand the extent of each discipline. For this research, however, only the development and
trends of classical texts are studied, through the generated keyword classic text.
In addition, 34 subdisciplines related to linguistics, comprising 1,433 articles, were scrutinised as previously
discussed. The subdiscipline of Language and Literature shows the highest engagement with a frequency of 148
(10.33%), whereas the lowest, each below 1%, are Computer Applications in Literature and Humanities,
Pragmatics, Culture and Religious Studies, Russian Language, Cognitive Linguistics, Corpus Linguistics,
Multi-language, German Language, and Japanese Language with percentages and frequencies of 0.77% (11),
0.77% (11), 0.70% (10), 0.70% (10), 0.63% (9), 0.63% (9), 0.63% (9), 0.49% (7), and 0.49% (7) respectively.
Subsequently, the article focuses on the subdiscipline of Computer Linguistics identified in the initial
bibliometric analysis. In line with the research title, the WordSift website was used to generate a word cloud
of the frequent lexicons in the results of the performed database search. The three most frequent lexicons are
Language, Corpus and Arabic Language. Each lexicon provides a clear overview of the related field of study.

In conclusion, the bibliometric analysis of classical texts in Computer Linguistics gives other researchers
interested in a similar field a more systematic overview of the types of publications, disciplines and
subdisciplines involved. Mapping research development by way of bibliometric analysis is crucial for observing
a holistic global research trend using related keywords.

Bibliography

1. Abainia, K. (2019). DZDC12: a new multipurpose parallel Algerian Arabizi–French code-switched
corpus. Language Resources and Evaluation. https://doi.org/10.1007/s10579-019-09454-8.
2. Abd. Azam, A. and Yatim, O. (2012). Manuskrip lama: asas keupayaan dan kearifan Melayu tradisi.
International journal of the Malay world and civilisation, Vol. 30, No. 1, pp. 29–39.
3. Al-Thubaity, A. O. (2015). A 700M+ Arabic corpus: KACST Arabic corpus design and construction.
Language Resources and Evaluation, Vol. 49, No. 3, pp. 721–751.
https://doi.org/10.1007/s10579-014-9284-1.
4. Alkema, A. (2021). Lines and semi-countably differentiable primes. Mathematical Statistician and
Engineering Applications, 70(2), 90-98.
5. Belinkov, Y., Magidow, A., Barrón-Cedeño, A., Shmidman, A., and Romanov, M. (2019). Studying the
history of the Arabic Language: language technology and a large-scale historical corpus. Language
Resources and Evaluation, Vol. 53, No. 4, pp. 771–805. https://doi.org/10.1007/s10579-019-09460-w.
6. Bimba, A., Idris, N., Khamis, N., and Noor, N. F. M. (2016). Stemming Hausa text: using affix-
stripping rules and reference look-up. Language Resources and Evaluation, Vol. 50, No. 3, 687–703.
https://doi.org/10.1007/s10579-015-9311-x.
7. Boschetti, F. (2015). Barbara McGillivray: Methods in Latin computational linguistics. (Brill’s studies
in historical linguistics). Language Resources and Evaluation, Vol. 49, No. 4, pp. 927–931.
https://doi.org/10.1007/s10579-015-9305-8.
8. Chen, X., Xie, H., Wang, F. L., Liu, Z., Xu, J., and Hao, T. (2018). A bibliometric analysis of natural
language processing in medical research. BMC Medical Informatics and Decision Making, Vol.
18(Suppl 1), 1–14. https://doi.org/10.1186/s12911-018-0594-x
9. Djellab, M., Amrouche, A., Bouridane, A., and Mehallegue, N. (2017). Algerian Modern Colloquial
Arabic Speech Corpus (AMCASC): regional accents recognition within complex socio-linguistic
environments. Language Resources and Evaluation, Vol. 51, No. 3, pp. 613–641.
https://doi.org/10.1007/s10579-016-9347-6.
11. Eckhoff, H., Bech, K., Bouma, G., Eide, K., Haug, D., Haugen, O. E., and Jøhndal, M. (2018). The
PROIEL treebank family: a standard for early attestations of Indo-European languages. Language
Resources and Evaluation, Vol. 52, No. 1, pp. 29–65. https://doi.org/10.1007/s10579-017-9388-5.
12. Eryiğit, G., Eryiğit, C., Karabüklü, S., Kelepir, M., Özkul, A., Pamay, T., … Köse, H. (2019). Building
the first comprehensive machine-readable Turkish sign language resource: methods, challenges and
solutions. Language Resources and Evaluation. https://doi.org/10.1007/s10579-019-09465-5.
13. Fišer, D., Ljubešić, N., and Erjavec, T. (2018). The Janes project: language resources and tools for
Slovene user-generated content. Language Resources and Evaluation.
https://doi.org/10.1007/s10579-018-9425-z.
14. Gunashekar, S., Wooding, S., and Guthrie, S. (2017). How do NIHR peer review panels use
bibliometric information to support their decisions? Scientometrics, Vol. 112, No. 3, pp. 1813–1835.
https://doi.org/10.1007/s11192-017-2417-8.
15. Hammo, B., Yagi, S., Ismail, O., and AbuShariah, M. (2016). Exploring and exploiting a historical
corpus for Arabic. Language Resources and Evaluation, Vol. 50, No. 4, pp. 839–861.
https://doi.org/10.1007/s10579-015-9304-9.

16. Iqbal, W., Qadir, J., Tyson, G., Mian, A. N., Hassan, S., and Crowcroft, J. (2019). A bibliometric
analysis of publications in computer networking research. In Scientometrics, Vol. 119.
https://doi.org/10.1007/s11192-019-03086-z.
17. Jarrar, M., Habash, N., Alrimawi, F., Akra, D., and Zalmout, N. (2017). Curras: an annotated corpus for
the Palestinian Arabic dialect. Language Resources and Evaluation, Vol. 51, No. 3, pp. 745–775.
https://doi.org/10.1007/s10579-016-9370-7.
18. Jin, Y. (2017). Development of Word Cloud Generator Software Based on Python. Procedia
Engineering, Vol. 174, pp. 788–792. https://doi.org/10.1016/j.proeng.2017.01.223.
19. Kabala, J. (2018). Computational authorship attribution in medieval Latin corpora: the case of the Monk
of Lido (ca. 1101–08) and Gallus Anonymous (ca. 1113–17). Language Resources and Evaluation.
https://doi.org/10.1007/s10579-018-9424-0.
20. Koo, H. (2015). An unsupervised method for identifying loanwords in Korean. Language Resources
and Evaluation, Vol. 49, No. 2, pp. 355–373. https://doi.org/10.1007/s10579-015-9296-5.
21. Magistry, P., Ligozat, A. L., and Rosset, S. (2019). Exploiting languages proximity for part-of-speech
tagging of three French regional languages. Language Resources and Evaluation, Vol. 53, No. 4, pp.
865–888. https://doi.org/10.1007/s10579-019-09463-7.
22. Masmoudi, A., Bougares, F., Ellouze, M., Estève, Y., and Belguith, L. (2018). Automatic speech
recognition system for Tunisian dialect. Language Resources and Evaluation, Vol. 52, No. 1, pp. 249–
267. https://doi.org/10.1007/s10579-017-9402-y.
23. Ming, D. C. (2003). Kajian manuskrip Melayu: masalah, kritikan dan cadangan. Kuala Lumpur: Utusan
Publications & Distributors Sdn Bhd.
24. Mohamed, M., and Oussalah, M. (2019). A hybrid approach for paraphrase identification based on
knowledge-enriched semantic heuristics. Language Resources and Evaluation.
https://doi.org/10.1007/s10579-019-09466-4.
25. Mohamed Redzwan, H. F., Bahari, K. A., Sarudin, A. and Osman, Z. (2020). Strategi pengukuran
upaya berbahasa menerusi kesantunan berbahasa sebagai indikator profesionalisme guru pelatih
berasaskan skala morfofonetik, sosiolinguistik dan sosiopragmatik. Malaysian Journal of Learning &
Instruction, Vol. 17, No. 1, pp. 187-228.
26. Muslu, Ü. (2018). The Evolution of Breast Reduction Publications: A Bibliometric Analysis. Aesthetic
Plastic Surgery, Vol. 42, No. 3, pp. 679–691. https://doi.org/10.1007/s00266-018-1080-7.
27. Neme, A. A., and Paumier, S. (2019). Restoring Arabic vowels through omission-tolerant dictionary
lookup: تشكيل الكلمات عبر موارد حاسوبيّة. In Language Resources and Evaluation.
https://doi.org/10.1007/s10579-019-09464-6.
28. Pham, H., Tucker, B. V., and Baayen, R. H. (2019). Constructing two vietnamese corpora and building
a lexical database. Language Resources and Evaluation, Vol. 53, No. 3, 465–498.
https://doi.org/10.1007/s10579-019-09451-x.
29. Qeis, M. I. (2015). Aplikasi wordcloud sebagai alat bantu analisis wacana. International Conference on
Language, Culture, and Society - ICLCS LIPI, (November 2015). Retrieved from
https://www.researchgate.net/publication/316736417_APLIKASI_WORDCLOUD_SEBAGAI_ALAT_BANTU_ANALISIS_WACANA.
30. Rialti, R., Marzi, G., Ciappei, C., & Busso, D. (2019). Big data and dynamic capabilities: a bibliometric
analysis and systematic literature review. Management Decision.
https://doi.org/10.1108/MD-07-2018-0821.
31. Rubinstein, A. (2019). Historical corpora meet the digital humanities: the Jerusalem Corpus of
Emergent Modern Hebrew. Language Resources and Evaluation, Vol. 53, No. 4, pp. 807–835.
https://doi.org/10.1007/s10579-019-09458-4.
32. Sarudin, A., Mohamed Redzwan, H. F., Osman, Z., Raja Ma’amor Shah, R. N. F., and Mohd Ariff
Albakri, I. S. (2019a). Menangani kekaburan kemahiran prosedur dan terminologi awal Matematik:
Pendekatan leksis berdasarkan Teori Prosodi Semantik. Malaysian Journal of Learning and Instruction,
Vol. 16, No. 2, pp. 255-294.
33. Sarudin, A., Mohamed Redzwan, H. F., Osman, Z., and Mohd Ariff Al-Bakry, I. S. (2019b). Using the
Cognitive Research Trust scale to assess the implementation of the elements of higher-order thinking
skills in Malay Language teaching and learning. International Journal of Recent Technology and
Engineering (IJRTE), Vol. 8, No. 2S2, pp. 392-398.
34. Schulz, S., and Ketschik, N. (2019). From 0 to 10 million annotated words: part-of-speech tagging for
Middle High German. Language Resources and Evaluation, Vol. 53, No. 4, pp. 837–863.
https://doi.org/10.1007/s10579-019-09462-8
