
The South Slavic languages are one of the three branches of the Slavic languages [Friedman1999],
[Ronelle2000]. This group of languages is used by the nations of the Balkans and is spoken by
approximately 30 million people. It is divided into a western and an eastern part.
The Western South Slavic languages are Slovenian, Croatian, Bosnian, Serbian and Montenegrin,
while the Eastern South Slavic languages are Macedonian and Bulgarian.
The criterion used here for this division is the script in which these languages are written, i.e.
Latin (Western) and Cyrillic (Eastern).
We have to point out that some authors regard Montenegrin as a variety of Serbian
[Greenberg2004]. Unlike the other languages, Serbian and Montenegrin exhibit active
synchronic digraphia, allowing the use of both the Latin and the Cyrillic script [Dale1980]. Although
some researchers point out that the division between the Bosnian, Croatian and Serbian languages is
political [Kordic2004], others consider them closely related languages [Miller2008].
From all of the above, it follows that the differences between these languages are minimal. This is
confirmed by the fact that speakers of Serbian, Croatian, Bosnian and Montenegrin can easily
understand each other. Consequently, investigating the differences between these closely related
languages is a real challenge.
In this paper, we propose a new approach to the characterization of and distinction between the
Serbian and Croatian languages, which were previously known as the Serbo-Croatian or
Croatian-Serbian language (up to the political division of Yugoslavia). This approach extends our
previous work on script recognition [Brodic2013][Brodic2014] by introducing new features extracted
from the text. The feature vector obtained by co-occurrence analysis of the coded text, which has up
to 12 elements, is enlarged by 16 new elements obtained by adjacent local binary pattern analysis
[Nosaka2011]. Finally, feature classification is carried out by the state-of-the-art GA-ICDA method,
which proved efficient, leading to an adequate characterization of the Serbian and Croatian languages
and establishing their distinction.
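
To illustrate how two such feature groups can be joined into a single vector, the following is a minimal sketch only, not the feature extraction used in this paper. It assumes that the text has already been coded into a sequence of integer classes (the four-class coding, the function names, the one-dimensional four-neighbour binary pattern and the resulting vector length are all hypothetical choices made for the example and do not reproduce the 12 + 16 elements described above).

```python
from typing import List, Sequence


def cooccurrence_features(codes: Sequence[int], levels: int) -> List[float]:
    """Normalized co-occurrence counts of adjacent code pairs (row-major)."""
    counts = [[0] * levels for _ in range(levels)]
    for a, b in zip(codes, codes[1:]):
        counts[a][b] += 1
    pairs = max(len(codes) - 1, 1)  # avoid division by zero on short input
    return [c / pairs for row in counts for c in row]


def lbp_histogram(codes: Sequence[int]) -> List[float]:
    """Normalized histogram of a 1-D, 4-neighbour local binary pattern.

    Assumption for this sketch: each of the neighbours at offsets
    -2, -1, +1, +2 sets one bit when it is >= the centre code,
    which gives 2**4 = 16 possible patterns.
    """
    offsets = (-2, -1, 1, 2)
    hist = [0] * 16
    positions = 0
    for i in range(2, len(codes) - 2):
        pattern = 0
        for bit, off in enumerate(offsets):
            if codes[i + off] >= codes[i]:
                pattern |= 1 << bit
        hist[pattern] += 1
        positions += 1
    positions = max(positions, 1)
    return [h / positions for h in hist]


def feature_vector(codes: Sequence[int], levels: int = 4) -> List[float]:
    """Concatenate the co-occurrence part and the binary-pattern part."""
    return cooccurrence_features(codes, levels) + lbp_histogram(codes)


# Hypothetical coded text: each integer stands for one letter class.
example_codes = [0, 1, 2, 1, 0, 3, 1, 0]
print(len(feature_vector(example_codes)))
```

In this sketch the classifier would receive the concatenated vector as a single input; how the individual elements are selected and weighted is left open.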

To understand how closely the South Slavic languages are related, we have to briefly introduce some
historical facts. After the Slavic people settled the Balkans, they communicated with one another in a
common language, Old Church Slavonic. However, the Slavic people were pagan and illiterate. In the
9th century, the task of Christianizing the Slavic people was given to the Byzantine Greek
missionaries Saints Cyril and Methodius from Thessaloniki. As a part of the Christianization process,
it was necessary to create a script suitable for use with the Old Church Slavonic language in order to
translate and transcribe the Bible and church books. The first script, invented by Saint Cyril and
standardized for Old Church Slavonic, was the (round) Glagolitic script. It was used in territories
ranging from eastern Bulgaria in the east to Istria (Croatia) and Moravia (Czech Republic) in the
west. The Glagolitic script was complicated to write and required advanced writing skills. Hence, the
students of Saints Cyril and Methodius invented a new script, which was easier to write. It was called
Cyrillic. In the meantime, the Glagolitic script was maintained in parts of Croatia (Istria, some North
Adriatic islands, Dalmatia) in a new form with a more angular shape, called the angular Glagolitic
script. In the other territories of the Balkans, the Cyrillic script became widespread. The Slavic
people created different medieval states in the Balkans, which were under the influence of either
Byzantium or Rome. The states under Byzantine influence

(Friedman1999) Friedman, V. (1999). Linguistic emblems and emblematic languages: on language
as flag in the Balkans. Kenneth E. Naylor memorial lecture series in South Slavic linguistics; vol.
1. Columbus, Ohio: Ohio State University, Dept. of Slavic and East European Languages and
Literatures. p. 8.
(Ronelle2000) Alexander, R. (2000). In honor of diversity: the linguistic resources of the Balkans.
Kenneth E. Naylor memorial lecture series in South Slavic linguistics; vol. 2. Columbus, Ohio:
Ohio State University, Dept. of Slavic and East European Languages and Literatures. p. 4.
(Dale1980) Dale, I. R. H. (1980). "Digraphia". International Journal of the Sociology of
Language 26: 5–13.
(Miller2008) Miller, B. (2008). Translating Between Closely Related Languages in Statistical
Machine Translation, Master of Science by Research, School of Informatics, University of
Edinburgh.
(Kordic2004) Kordić, S. (2004). "Pro und kontra: "Serbokroatisch" heute" [Pro and con:
"Serbo-Croatian" nowadays]. In Krause, Marion; Sappok, Christian. Slavistische Linguistik 2002:
Referate des XXVIII. Konstanzer Slavistischen Arbeitstreffens, Bochum 10.-12. September 2002.
Slavistische Beiträge; vol. 434 (in German). Munich: Otto Sagner. p. 141. ISBN 3-87690-885-X
(Greenberg2004) Greenberg, R. D. (2004). Language and identity in the Balkans: Serbo-Croatian
and its disintegration. Oxford University Press
(Nosaka2011) Nosaka, R., Ohkawa, Y., Fukui, K.: Feature Extraction Based on Co-occurrence of
Adjacent Local Binary Patterns. In: Ho, Y.-S. (ed.) PSIVT 2011, Part II. LNCS, vol. 7088, pp.
82–91. Springer, Heidelberg (2011)
(Brodic2013) Brodić, D., Milivojević, Z.N., Maluckov, Č.A.: Recognition of the Script in Serbian
Documents using Frequency Occurrence and Co-occurrence Analysis. The Scientific World
Journal 2013(896328), 1–14 (2013)
(Brodic2014) Brodić, D., Milivojević, Z.N., Maluckov, Č.A.: An approach to the script
discrimination in the Slavic documents. Soft Computing, In Press, [Online]. Available:
http://dx.doi.org/10.1007/s00500-014-1435-1

Dmitrij Cizevskij. Comparative History of Slavic Literatures, Vanderbilt University Press (2000) p. 27
