
Scientometrics DOI 10.1007/s11192-012-0818-2

Inconsistent transliteration of Iranian university names: a hazard to Iran's ranking in ISI Web of Science
Mohammad Reza Falahati Qadimi Fumani, Marzieh Goltaji, Pardis Parto

Received: 20 May 2012. Akadémiai Kiadó, Budapest, Hungary 2012

Abstract Today, university ranking has become a critical issue worldwide. Each university is identified by a surface form under which its whole performance is assessed. This article intends to provide a clear picture of the inconsistencies observed in the recording of Iranian university titles by their affiliated authors and to clarify the negative impact of such inconsistencies on the position of Iranian universities in global university ranking systems. To collect the various surface forms of Iranian university names, ISI Web of Science was searched with the keywords Cu = Iran and py = 2000-2009. Only MSRT universities were considered. Two M.A. experts listed all variant forms of a single university under that name. The form publicized on a university's website was taken as its entry name. The major sources of variation identified were as follows: acronyms, misspellings, abbreviations, space variations, syntactic permutation, application of vowels/consonants and vowel/consonant combinations, /a/ vs. /aa/, Tashdid, Kasra ezafe, redundancy, downcasing, the voiceless glottal stop sound /?/, shortening and deletion of titles. It was found that, as things stand, Iranian universities are not receiving the rank they deserve simply because authors affiliated to a university use its title forms inconsistently. It was recommended that authors follow the surface form publicized by universities on their websites, seek the help of an editor, and not be credited for their articles if the forms deviate from those publicized on the websites. A spell checker, in the form of add-in software, is badly needed to homogenize Iranian university surface forms by replacing the variants with the proposed dominant form.

M. R. Falahati Qadimi Fumani, Faculty of Computational Linguistics Research Department, Regional Information Center for Science and Technology (RICeST), Shiraz, Iran; e-mail: mrfalahat@yahoo.com
M. Goltaji, Islamic World Science Citation Center, Shiraz, Iran
P. Parto, Department of Evaluation and Collection Development, RICeST, Shiraz, Iran


M. R. Falahati Qadimi Fumani et al.

Keywords University ranking systems · Iranian universities · ISI Web of Science · Persian-English transliteration · Misspellings · Information retrieval · Persian orthography · Downcasing

Introduction

Recognition of terms is a major component of any natural language processing (NLP) software. In this regard, the surface forms of words are also of great importance. Any inconsistency in the orthography of words may hurt an information retrieval (IR) system and decrease recall. An end user may fail to retrieve some relevant documents, despite their availability in the database, simply because they keyed in the word with a surface form different from the one used in those documents; "program" and "programm" in English, and the two variant spellings of the Persian word meaning "even", are two examples. Falahati Qadimi Fumani (2010) devoted a full chapter of his dissertation to the description of such inconsistencies in Persian orthography. He also elaborated on them in two more articles (Falahati Qadimi Fumani and Ramachandra, 2008, 2011). Such inconsistencies are so widespread that hundreds of articles have already been written on the issue by Persian scholars. LIS experts, as mediators between information producers and users, [as well as linguists] have long emphasized the importance of standardizing the orthography of terms, especially following the expansion of information databases (Mortezai, 2001).

Such orthographic inconsistencies not only hurt IR; they may also damage the position of universities in global university ranking systems like ISI, Shanghai, etc. This is simply because tracing the research and scientific production of universities (often appearing in the form of books, articles, etc.) is a major component in ranking them. As a routine, all publications related to a specific university are listed under that university and then analyzed. Hence, if all authors affiliated to a university use a given title as the university name, all those publications can be linked to that university. But if various forms of a university name are in use, the works will be grouped into different classes according to the specific variant form used by the authors. Under such circumstances, the university will be the great loser because ISI recognizes, and accordingly analyzes, each university by one and only one surface form. That is, a number of scientific publications will be excluded from the final ranking simply because authors used multiple titles, rather than a single one, to refer to the university to which they are affiliated. In fact, ISI will consider each variant a single item and will accordingly present a separate analysis and ranking for each case. For each university, then, only a part of its real performance will be reflected at best, and hence the university's position in the ranking system will be underestimated.

A large number of researchers across the globe have worked on university ranking and the evaluation of researchers. Chung and Park (2012), for instance, examined the Web visibility of researchers in the field of communication. Schulz and Manganote (2012) analyzed the Country Profiles, the open access data from ISI Thomson Reuters Science Watch. They discussed the advantages of defining a Country Profile Index (CPI), a tool for diagnosing the activity of the scientific community of a country and its possible strengths and weaknesses. Feng, Yong, Xiaolong and Wei (2012) discussed the evaluation of research universities in mainland China, Hong Kong and Taiwan. They considered two variables, namely the quality and quantity of research production in research universities.
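The fragmentation effect described in this section can be illustrated with a minimal sketch. The affiliation strings and paper counts below are hypothetical, and the lookup table `CANONICAL` is an assumption made for illustration, not part of the authors' method. The sketch contrasts counting papers per exact surface form with counting per university once variants are mapped to one entry name.

```python
from collections import Counter

# Hypothetical affiliation strings, one per indexed paper (illustrative only).
records = [
    "Shiraz Univ", "Shiraz Univ", "Shiraz Univ",
    "Univ Shiraz", "Univ Shiraz",
    "Shiraz Uni",
]

# Hypothetical lookup from each variant to the entry name chosen for the university.
CANONICAL = {
    "Shiraz Univ": "Shiraz Univ",
    "Univ Shiraz": "Shiraz Univ",
    "Shiraz Uni": "Shiraz Univ",
}

def count_by_surface_form(affiliations):
    """Count papers per exact string, as an index keyed on surface forms would."""
    return Counter(affiliations)

def count_by_university(affiliations, canonical):
    """Count papers per university after mapping each variant to its entry name."""
    return Counter(canonical.get(a, a) for a in affiliations)

by_form = count_by_surface_form(records)
by_univ = count_by_university(records, CANONICAL)
print(by_form)   # three separate entries, none holding all six papers
print(by_univ)   # a single entry holding all six papers
```

Under exact matching the largest entry holds only three of the six papers, which is the kind of undercount the text attributes to inconsistent surface forms.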


Inconsistent transliteration of Iranian university names

Based on the brief introduction presented above, the main objective of the present article is to provide a clear picture of the inconsistencies observed in the recording of Iranian university titles by their affiliated authors and to clarify the negative impact of such inconsistencies on the position of Iranian universities in global university ranking systems.

Literature review

Extensive work has already been published on Persian script and the problems it faces in a computing environment. Some problems are rooted in the writing system itself, but some are due to the unavailability of full-fledged, well-established and well-observed standards for Persian orthography; even where a standard exists, it is not followed completely by language users. Proposals for modifying Persian script go back half a century (Naseh, 2004). Emami (1992) stressed the need for the Persian alphabet to undergo modifications to fit the computing environment. Mortezai (2001) discussed the problems the Persian writing system faces in information retrieval. Among those discussing the issue of computers and the writing system, Horri (1993), Sanati (1992) and Masoumi-Hamadani (2002) are relevant examples. All three works concentrated on the difficulties in Persian computing. Akbarnejad (1997) discussed the difficulties space variability may introduce into information storage and retrieval systems. Ahmadi-Birjandi (1973) elaborated on the two possibilities in the Persian alphabet, namely the joining and disjoining of letters and morphemes. To him, this joining/disjoining contrast is a great burden on Persian computing. Kaboli (1995) discussed space variation within the framework of word formation processes. Hendi (2002) focused on compound formation and proposed a method to standardize the writing of compound terms that embody a number of space variants within their inner structures. To him, compounds must be tackled as an important issue in information retrieval endeavors. Some researchers, like Samai (2004), discussed the inconsistency in the application of punctuation marks. He underlined that punctuation is an instrument that greatly helps to communicate meaning, and thus any inconsistency in its application could cause problems in the communication process.

Tayyeb (1992), in his article "Homography in Persian", reiterated that the absence of short vowels was the root of 85 % of homographs in Persian. Other researchers examined the way terms from English are handled in Persian. Behzadi (1996), for example, examined the inconsistencies in the transliteration of English borrowed terms. Pourjavadi (2003) put forward suggestions to standardize Penglish (writing Persian using the English alphabet, which quite often happens in mobile SMSs, email exchanges and similar situations, e.g. when Persian university names are written using English letters; in fact, the recording of Persian university names using English letters is the main theme of the present article). As a few examples, he referred to the availability of oo, ou and u for the long close back vowel and of ee, ii, ie, y and i for the long close front vowel. He called for the production of guide books with the objective of standardizing the recording of Persian words in Penglish in SMSs, email exchanges and on the Web.

The inconsistencies observed in the Persian writing system do not and must not imply that no guide books on the principles of writing are available on the market. In fact, there are abundant books (Samii-Gilani, 2000; Jahanshahi, 1981; Yahaghi & Naseh, 1992; Saffarpour, 2001; IAPLL, 2007). Najafi's (2005) book entitled Ghalat Nanevisim (Let's Write Correct Persian) was also written to make the writings of different authors and, in general, of all native speakers more consistent. So, the problem is not the unavailability of guide books; rather, the real source of the problem is that even in such sources and guide books various discrepancies are observed and, more importantly, in many cases, mostly those which cause a lot of problems in Persian computing, no clear-cut recommendations are put forward. Even in the book written by the IAPLL (2007), the authors have in many cases left it to writers to opt for one possibility rather than the other(s). This is while Marashi (2002) reiterated that the IAPLL is the only authorized body that must take responsibility for revising the Persian writing system. He recommended that a grammar embodying 28 letters for the 28 sounds of Persian be proposed by the IAPLL.

Methodology

Instruments

To collect the various surface forms of Iranian university names, ISI Web of Science was searched with the keywords Cu = Iran and py = 2000-2009. Only data related to the time span 2000-2009 were considered and analyzed. To mark one surface form as dominant, the English pages of Iranian university websites were consulted; that is, the title appearing on a university's website was taken as the dominant form of that university name. These dominant forms are presented in Table 1 below. In this article, only MSRT (Ministry of Science, Research and Technology) universities were considered.

Participants

Two ISC (Islamic World Science Citation Center) experts, each with an M.A. degree, a good command of information science and at least 5 years of job experience, retrieved the whole collection of Iranian university surface forms available in ISI Web of Science, with the objective of listing all surface forms related to a given university name under a single entry. They further visited the website of each Iranian university to find the surface form used there. They marked the form observed as the dominant surface form to which all other surface forms of that university could be referred. This, of course, does not mean that the form used on each website was perfect and problem free (in fact, such forms also bore orthographic and linguistic problems); rather, the strategy was adopted only to save space, report a single surface form for each university, and enable the inclusion of all university names in a single table (Table 1). The strategy seemed justified since, logically, all faculty affiliated to a university are expected to draw on the surface form provided by that university. This could, at least, act as an easy way to reduce divergence in the orthography of university names.

Procedure

To carry out the study, the book by Goltaji and Alinejad Chamazkoti (2011) was drawn on as the main source of data collection. Having visited ISI Web of Science, they had extracted all variant forms of Iranian university titles, with all variant forms of a university under a single entry, through the keywords Cu = Iran and py = 2000-2009. The surface form appearing on a university's website was taken as the entry name. For each university name, the total number of variants available, including the dominant form, was also recorded (e.g. 4 for Ilam Univ, or 6 for Shiraz Univ). After


Table 1 Iranian universities' dominant names and their total variant forms

University name* | No.** | University name | No. | University name | No.
Tarbiat Modares Univ | 136 | Univ Isfahan | 18 | Bojnord Univ | 5
Payame Noor Univ*** | 114 | Power & Water Univ Technol | 18 | Univ Zabol | 5
Amirkabir Univ Technol | 108 | Sahand univ Technol | 17 | Ilam Univ | 4
KN Toosi Univ Technol | 94 | Damghan Univ | 16 | Tafresh Univ | 4
Ferdowsi Univ Mashhad | 67 | Univ Appl Sci & Technol | 16 | Jundi Shapour Univ | 4
Iran Univ Sci & Technol | 60 | Univ Tabriz | 15 | Shiraz Univ Technol | 4
Tarbiat Moallem Univ | 57 | Shahrood Univ Technol | 15 | Fasa Univ | 4
Shahid Beheshti Univ | 56 | Lorestan Univ | 14 | Golestan Univ | 4
Shahid Bahonar Univ Kerman | 50 | Sari Agr Sci & Nat Resources Univ | 13 | Gonbad High Educ Ctr | 4
Azarbaijan Univ Tarbiat Moallem | 48 | Univ Gilan | 13 | Art Univ Isfahan | 4
Shahid Chamran Univ Ahvaz | 45 | Univ Kurdistan | 11 | Imam Sadiq Univ | 3
Univ Mohaghegh Ardabili | 40 | Persian Gulf Univ | 10 | Univ Birjand | 3
Bu Ali Sina Univ | 37 | Khoramshahr Marine Sci & Technol Univ | 10 | Chabahar Maritime Univ | 3
Univ Sistan & Baluchestan | 35 | Petr Univ Technol | 9 | Univ Qom | 3
Isfahan Univ Technol | 34 | Ramin Univ Agr & Nat Resources | 9 | Dr Shariaty Coll | 2
Gorgan Univ Agr Sci & Nat Resources | 33 | Yasouj Univ | 9 | Imam Reza Univ | 2
Babol Noshirvani Univ Technol | 32 | Yazd Univ | 9 | Kerman Grad Univ Technol | 2
Allameh Tabatabai Univ | 29 | Inst Adv Studies Basic Sci | 8 | Urmia Univ Technol | 2
Alzahra Univ | 25 | Razi Univ | 8 | Police Univ | 2
Vali e Asr Univ Rafsanjan | 24 | Zanjan Univ | 8 | Univ Maragheh | 2
Imam Khomeini Int Univ | 23 | Shahed Univ | 8 | Malayer Univ | 2
Tarbiat Moallem Univ Sabzevar | 23 | Mazandaran Univ Sci & Technol | 8 | Univ Art | 2
Sharif Univ Technol | 23 | Semnan Univ | 7 | Birjand Univ Technol | 1
Urmia Univ | 21 | Univ Kashan | 7 | Qom Univ Technol | 1
Shahid Rajaee Teacher Training Univ | 21 | Shiraz Univ | 6 | Kermanshah Univ Technol | 1
Univ Tehran | 20 | Hormozgan Univ | 6 | Hamedan Univ Technol | 1
Shahrekord Univ | 20 | Arak Univ | 5 | Neishabour Univ | 1
Univ Mazandaran | 19 | Univ Bonab | 5 | Tabriz Islam Art Univ | 1

* The variant reported as university name was, in each case, extracted from the website of that university
** The frequencies reported cover the sum of the frequencies of all variants listed under a single university name
*** The case of Payame Noor Univ differs from the rest of the items. In all other cases we deal with a single university and the variants of its name, but Payame Noor University has many branches within Iran, mostly marked by city names. So, the real number of variations for each branch is much less than 114. Nevertheless, for ease of discussion, and for the sake of being comprehensive, the data related to all branches were merged under one single title, namely Payame Noor Univ

this step, the variants were inspected and analyzed linguistically; that is, variations were divided into classes based on the roots of the inconsistencies observed in the orthography of university names. The sources of variation observed included the use of abbreviations, shift, conversion of vowels, the availability of parallel vowel/consonant letters to represent the same phone, etc. These factors were used as the basis of analysis, as depicted in the data analysis below.

While linking the variant forms to their relevant university names, some problems were observed, which were handled as follows. Firstly, two surface forms were treated as distinct even when they differed in only one character, e.g. Shahrekord Uni vs. Shahrekord Univ, or when each surface form contained the same character set but in a different order, e.g. Shiraz Univ versus Univ Shiraz. Secondly, surface forms were observed which could be linked to multiple university names. This mostly happened with acronyms. For example, SUT could be linked both to Sahand Univ Technol and to Sharif Univ Technol. Since the acronym SUT does not reveal the location of the university, it was added to both entries. There were not, of course, many such cases in the data studied.

Data analysis

The main objective of the present article was to provide a clear picture of the inconsistencies observed in the recording of Iranian university titles by their affiliated authors and, accordingly, to clarify the negative impact of such inconsistencies on the position of Iranian universities in global university ranking systems. To attain this objective, a series of analyses was carried out as follows. Table 1 presents, in descending order, the list of 84 MSRT universities (research institutes were not considered), each with its number of orthographic variations, all extracted from ISI Web of Science.
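As a quick check on Table 1, the variant counts listed there can be tallied with a minimal sketch. The list below simply transcribes the "No." columns of the table in descending order; the variable names are illustrative.

```python
# Variant counts per university, transcribed from the "No." columns of Table 1.
counts = [
    136, 114, 108, 94, 67, 60, 57, 56, 50, 48, 45, 40, 37, 35, 34, 33, 32, 29,
    25, 24, 23, 23, 23, 21, 21, 20, 20, 19,
    18, 18, 17, 16, 16, 15, 15, 14, 13, 13, 11, 10, 10,
    9, 9, 9, 9, 8, 8, 8, 8, 8, 7, 7, 6, 6, 5, 5,
    5, 5, 4, 4, 4, 4, 4, 4, 4, 4, 3, 3, 3, 3,
    2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1,
]

n_universities = len(counts)                      # number of table entries
total_variants = sum(counts)                      # all variants over all entries
ten_or_more = sum(1 for c in counts if c >= 10)   # entries with 10 to 136 variants
below_ten = sum(1 for c in counts if c < 10)      # entries with 1 to 9 variants

print(n_universities, total_variants, ten_or_more, below_ten)
```

The tally gives 84 universities and 1668 variants, with 41 universities at ten or more variants and 43 below ten.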
Based on Table 1, in all, 1668 orthographic variations were observed for the 84 universities under analysis, that is, on average about 20 variants for each university name. Some key points observed in this table are as follows. More orthographic variations were found in some university titles than in others: Tarbiat Modares Univ, Payame Noor Univ and Amirkabir Univ Technol, with 136, 114 and 108 variants, showed the highest numbers of variations. In contrast, Birjand Univ Technol, Qom Univ Technol, Kermanshah Univ Technol, Neishabour Univ and finally Tabriz Islam Art Univ stood at the bottom with only 1 variant. In 41 universities, the total number of variants observed was ten or more, between 10 and 136. In the 43 other universities fewer variations were observed, between 1 and 9. Such figures clarify the extent to which authors are inconsistent in presenting their affiliation in their articles. Despite the great strides made by Iranian universities to promote their position in global ranking systems, it appears that what they get is much less than what they deserve and have truly achieved. In simple terms, by presenting various surface forms for their universities, authors have unintentionally reduced the ranking of the universities to which they are affiliated, since in ISI Web of Science each variant is taken as a distinct entry and the data listed under each variant are analyzed separately.

In what follows, some major patterns of inconsistency observed in university titles are introduced. The major classes discussed are as follows: acronyms, misspellings, abbreviations, space variations, syntactic permutation, application of vowels/consonants and vowel/consonant combinations, /a/ vs. /aa/, Tashdid, Kasra ezafe, redundancy, upper and lower case letters (downcasing), the voiceless glottal stop sound /?/, deletion of some terms/letters (shortening) and deletion of titles.

As indicated in Table 2, one root of variation in Iranian university titles is the parallel use of acronyms and full words. Based on the data, two classes of acronym application were observed. Part I of Table 2 shows examples of full acronyms, e.g. Bu Ali Sina Univ → BASU or Shahid Bahonar Univ Kerman → SBUK, whereas Part II shows examples of partial acronyms, in which the initials of some words appear together with abbreviated or full forms of other words, e.g. Tarbiat Modares Univ → TM Univ.
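The two acronym patterns, and the ambiguity noted earlier for SUT, can be sketched as follows. This is an illustrative reconstruction under the assumption that acronyms are built from word initials; the helper names `full_acronym` and `partial_acronym` are hypothetical and do not come from the study.

```python
from collections import defaultdict

def full_acronym(name):
    """Build an acronym from the first letter of each alphabetic word."""
    return "".join(w[0].upper() for w in name.split() if w[0].isalpha())

def partial_acronym(name, keep):
    """Reduce words to initials, except the words listed in `keep`."""
    tokens, initials = [], ""
    for word in name.split():
        if word in keep:
            if initials:              # flush the initials collected so far
                tokens.append(initials)
                initials = ""
            tokens.append(word)
        elif word[0].isalpha():
            initials += word[0].upper()
    if initials:
        tokens.append(initials)
    return " ".join(tokens)

print(full_acronym("Bu Ali Sina Univ"))                   # BASU
print(full_acronym("Univ Sistan & Baluchestan"))          # USB
print(partial_acronym("Tarbiat Modares Univ", {"Univ"}))  # TM Univ

# Group a few entry names by acronym to expose collisions such as SUT.
names = ["Sahand Univ Technol", "Sharif Univ Technol",
         "Isfahan Univ Technol", "Bu Ali Sina Univ"]
by_acronym = defaultdict(list)
for n in names:
    by_acronym[full_acronym(n)].append(n)
collisions = {a: ns for a, ns in by_acronym.items() if len(ns) > 1}
print(collisions)
```

Because an acronym discards everything but initials, it cannot be mapped back to one university without extra context, which is why SUT was listed under both entries.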
Authors may opt for one form rather than another, an inconsistency which has inevitably led to an array of surface forms for each university title. Table 3 illustrates samples of misspellings observed in university titles. Out of the 1666 entries inspected, 576 (34.57 %) contained misspellings, that is, roughly one-third of all entries analyzed. This may indicate that articles in ISI journals are not being edited or proofread by a person with a good command of both Persian and English. A non-Iranian editor or proofreader has no way of finding out that Tabrix or Tariz are wrong and that Tabriz is the standard form to use. Neither will browsing the Web resolve the problem, since on the Web the editor or proofreader will come across various surface forms. Of course, some misspellings should have been noticed by ISI
Table 2 Inconsistencies rooted in acronyms

Part I
No. | University name | Acronym(s)
1 | Bu Ali Sina Univ | BASU
2 | Imam Khomeini Int Univ | IKIU
3 | Tarbiat Modares Univ | TMU
4 | Isfahan Univ Technol | IUT
5 | Shahid Rajaee Teacher Training Univ | SRTTU
6 | Univ Sistan & Baluchestan | USB

Part II
No. | University name | Partial acronyms
1 | Tarbiat Modares Univ | TM Univ; TMU Univ
2 | Imam Khomeini Int Univ | IKI Univ
3 | Payame Noor Univ | PN Univ
4 | Khaje Nasir Toosi Univ Technol | KN Toosi U Technol
5 | Sharif Univ Technol | Sharif U T
6 | Univ Tehran | U Tehran


Table 3 Inconsistencies rooted in misspellings

No. | Correct form | Misspelling
1 | Tabriz | Tariz; Tabrix; Trabriz; Tabrize; Tebriz
2 | Tarbiat | Taebiat; Tabiat; Tabriat; Tarbat; Tarbayat; Tarbeiyat; Tarbia; Tarbial; Tarbian; Tarbiart; Tarblat; Tarbita; Tarbit; Tarbist; Tarbiate; Tariat; Tarbyat; etc.
3 | Ahvaz | Akhvaz
4 | Persian Gulf Univ | Persian Calf Univ
5 | Teacher Training Univ | Teacher Trianing Univ
6 | Shiraz | Shiran
7 | Rajaee | Rahaee; Radjaei; Rajae; Rajee; Rajaei; Rajaii
8 | Persian Gulf Univ | Persian Calf Univ

Table 4 Inconsistencies rooted in the application of abbreviations

No. | Word | Sample abbreviated forms
1 | Technology | Tech, Technol, Techno
2 | University | U, Univ, Unv*, Unvi*, Uuniv*, Unuiv*, Uinv*
3 | Science | Sci
4 | Petroleum | P as in (PUT); Petr as in (Petr Univ)
5 | Engineering | Eng; Engn
6 | Graduate | Grad as in (Kerman Grad Univ Technol)
7 | Center | Ctr as in (Gonabad High Educ Ctr)

* Asterisk shows ill-formed words

journal editors even if they had no Persian background. For example, they should have known that Unvi in Razi Unvi is wrong; the correct form is Razi Univ or, better, Razi University. So, the logical conclusion is that journals often trust the affiliation information submitted by Iranian authors. One way to remove this problem is for Iranian authors to seek help from Iranian editors or colleagues who also master English, or at least to use the surface form available on the university's website. As shown in Table 4, authors have used abbreviations inconsistently, with various abbreviations used for a single term. At times, the abbreviated form used seems awkward, e.g. the use of Engn along with Eng for Engineering (as in Engn
Table 5 Inconsistencies rooted in space variations

No. | With space | Without space
1 | Khajeh Nasir | Khajehnasir
2 | Kashan Univ | KashanUniv
3 | Amir Kabir Univ | Amirkabir Univ
4 | Bu Ali Univ | Buali Univ
5 | Bu Ali Sina Univ | Bu AliSina Univ
6 | Imam Reza Univ | Imamreza Univ Mashhad


Fac Bonab for Univ Bonab), or U and Unv along with Univ, to give only a few examples. According to Table 5, space variation (especially the zero and full space variants, half space being the third type) can result in the emergence of various surface forms. In ISI Web of Science, two strings of words or letters are taken as distinct even if the only difference between them is the space variant used. On this basis, Khajeh Nasir and KhajehNasir, or Kashan Univ and KashanUniv, are deemed different entries, though they are actually two variant surface forms of a single entity. Table 6 reveals that word order can also result in university title variations. On this basis, Arak Univ and Univ Arak are considered different, though both are variants of the same university name. The permutation model observed in Table 6 is A B → B A. More extended forms of permutation may also occur, especially in longer phrases, i.e. A B C → C A B, C B A, A C B, B A C and B C A. Quite a few such cases were observed in the data studied, e.g. Univ Bu Ali Sina → Bu Ali Sina Univ, where only the word Univ has been permuted. As indicated in Table 7, authors have also been inconsistent in transliterating Persian vowels and consonants into English. For instance, o, ow, ou, oo and u have all been used to stand for the half close back vowel. Similarly, q and gh have both been drawn on to stand for the Persian letters qaf (ق) and ghayn (غ), voiced velar fricative phonemes. At times, the variations have been remarkably numerous and awkward, a fact that highlights the urgent need for standardizing the transliteration of Persian vowels and consonants.
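The three sources of variation just described (spacing, word order and parallel vowel spellings) can each be neutralized by a comparison key, as in the minimal sketch below. The function names and the `VOWEL_FOLDS` table are assumptions made for illustration; in particular, the vowel folding is deliberately lossy (it also merges the /u:/ and /o/ rows of Table 7) and is not a transliteration standard proposed by the authors.

```python
import re

def space_key(name):
    """Ignore spacing and letter case: 'Kashan Univ' and 'KashanUniv' match."""
    return re.sub(r"\s+", "", name).lower()

def order_key(name):
    """Ignore word order: 'Arak Univ' and 'Univ Arak' match."""
    return tuple(sorted(name.lower().split()))

# Assumed folding of parallel vowel spellings onto a single letter (lossy).
VOWEL_FOLDS = [("ow", "o"), ("ou", "o"), ("oo", "o"), ("u", "o"), ("ee", "i")]

def vowel_key(word):
    """Collapse parallel vowel spellings so that variants share one key."""
    key = word.lower()
    for variant, target in VOWEL_FOLDS:
        key = key.replace(variant, target)
    return key

print(space_key("Khajeh Nasir") == space_key("KhajehNasir"))
print(order_key("Univ Bu Ali Sina") == order_key("Bu Ali Sina Univ"))
print({vowel_key(w) for w in ["Ferdoosi", "Ferdosi", "Ferdowsi", "Ferdousi"]})
```

Such keys are only useful for grouping candidate variants; the displayed name would still need to come from an agreed dominant form.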
Some more variation sources

In addition to the above general classes, some more sources of variation were observed, a brief account of which is given below.

(I) /a/ vs. /aa/ It appears that Iranian authors often make no distinction between the open front spread vowel a /æ/, as in cat, and the open back round vowel aa /ɑː/, as in car, when transliterating Persian university titles into English. In Shahed Univ vs. Shaahed Univ, for instance, a and aa have both been used to stand for the sound /ɑː/. Similarly, a has often been used to stand for /ɑː/, as in Zanjan, Baluchestan, Chamran and Hamadan. In all these words, the first occurrence of a sounds /æ/ and the second occurrence (in fact, the third one in Hamadan) stands for /ɑː/. Only rarely have some authors used the letters a and aa consistently to stand for /æ/ and /ɑː/ respectively, e.g. Univ Mazandaraan. Other authors have, of

Table 6 Inconsistencies rooted in syntactic permutation

No. | Form 1 | Form 2
1 | Arak Univ | Univ Arak
2 | Golestan Univ | Univ Golestan
3 | Mashhad Univ | Univ Mashhad
4 | Kurdestan Univ | Univ Kurdestan
5 | Malayer Univ | Univ Malayer
6 | Art Univ | Univ Art
7 | Police Univ | Univ Police


Table 7 Inconsistencies rooted in variations in the application of vowels/consonants and vowel/consonant combinations

No. | Vowel(s) | Variations
1 | /u:/ as in you | oo (Toosi, Shahrood), o (Tossi), u (Tusi), ou (Shahroud)
2 | /o/ as in book | u (Jundi), o (Jondi)
3 | /i/ as in seed | Beheshtee, Shahid, Shaheed
4 | / / as in know | Ferdoosi, Ferdosi, Ferdowsi, Ferdousi

No. | Consonant(s) | Variations
1 | /g/ | g (Gorgan), Gh (Ghorgan*)
2 | /ʁ/, as in French word merci /mɛʁsi/ | Q (Qom), Gh (Damghan)
3 | /h/ | h (Allameh), no letter (Allame)

course, used forms like Univ Mazandaran, where a stands both for /æ/ and for /ɑː/, even in a single word.

(II) Tashdid Tashdid simply means repeating a single letter and putting more emphasis on it. Compared to other letters, the pronunciation of such letters needs more force and duration. Some authors ignored such letters and used their simple forms (without Tashdid) rather than the emphasized forms, e.g. Modares rather than Modarres and Khoramshahr rather than Khorramshahr. At times, misapplication of Tashdid was observed: authors used it where they should not, e.g. Illam Univ rather than Ilam Univ. So, ignoring Tashdid or misapplying it acted as another contributing factor to the emergence of university title variations.

(III) Kasra ezafe In Persian, Kasra ezafe functions something like "of" in the English phrase "the house of the president". It is shown by a subscript diacritic in Persian orthography. It is, however, an optional symbol that most Iranians tend to skip in writing. The data analysis revealed that authors sometimes ignored this symbol altogether (e.g. Univ Tarbiat Modarres, where Kasra ezafe after the second word is missing), and sometimes used different symbols (joined or disjoined e and E letters) to represent it, e.g. Univ Tarbiate Modarres and Univ Tarbiat E Modarres, with e and E after Tarbiat representing Kasra ezafe. Another example is Univ Shahr E Kord versus Shahrekord Univ. So, at least three variations are observed regarding Kasra ezafe, which ultimately add to the collection of orthographic variations.

(IV) Redundancy Some terms have been used redundantly by authors when providing their affiliation information in their publications. The term Univ has quite often been used redundantly in university titles, as in IUST Univ: expanded, this title reads Iran Univ Sci & Technol Univ, in which the second Univ is clearly redundant. Univ TMU, the abbreviated form of Univ Tarbiat Modarres University, is another example.

(V) Upper and lower case letters (downcasing) Quite often upper case letters are used to form acronyms. Some authors, however, used upper and lower case letters inconsistently. As an example, some authors used TMU to stand for Tarbiat Modarres University, while others drew on the form Tmu, as in Tmu Univ. Here, the author has used the upper case letter T for Tarbiat and the lower case letters m and u for Modarres and University respectively. Such cases were observed abundantly in the data under study.
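The redundancy and downcasing cases can be handled together in a small sketch, assuming a list of known acronyms that already contain a U for University. The set `KNOWN_ACRONYMS` holds only the two acronyms named in the text, and `normalize_acronym_form` is a hypothetical helper, not a tool used in the study.

```python
# Acronyms named in the text; each already includes "U" for University.
KNOWN_ACRONYMS = {"IUST", "TMU"}
# Words that become redundant next to such an acronym.
REDUNDANT = {"univ", "university"}

def normalize_acronym_form(name, acronyms=KNOWN_ACRONYMS):
    """Upper-case a known acronym and drop a redundant 'Univ' beside it."""
    tokens = name.split()
    if not any(t.upper() in acronyms for t in tokens):
        return name                      # no known acronym: leave untouched
    kept = []
    for t in tokens:
        if t.upper() in acronyms:
            kept.append(t.upper())       # e.g. Tmu -> TMU
        elif t.lower() not in REDUNDANT:
            kept.append(t)
    return " ".join(kept)

for variant in ["IUST Univ", "Univ TMU", "Tmu Univ", "Shiraz Univ"]:
    print(variant, "->", normalize_acronym_form(variant))
```

Names without a known acronym pass through unchanged, so the rule does not strip Univ from ordinary titles such as Shiraz Univ.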


(VI) Voiceless glottal stop /?/ Authors did not use any dedicated symbol to represent the voiceless glottal stop in university titles. The letters a and aa after o in Moallem and Moaallem respectively have been used by authors to stand for the voiceless glottal stop sound. The sound /?/ is present before a and aa in these two words, or after a and before ii in Tabatabaii. The point is that none of these variations seems to work. Rather, authors could have used the simple superscript (') to stand for the glottal stop sound /?/. In this way, the words Moallem and Tabatabaii could have been written instead as Mo'allem and Tabaatabaa'ii respectively.

(VII) Deletion of some terms/letters (shortening) Cases were observed where title words had been deleted by the authors. This also gave rise to more title variations. Petr Univ Technology, for instance, was changed into Petr Univ by some authors. Sometimes, differences in pronunciation due to regional and social accents also led to orthographic variations. The word Mashhad, for example, also appeared as Mashad and Meshed, with h deleted in the former and a changed into e in the latter.

(VIII) Deletion of titles By title we mean words like Dr, Prof., Shahid (meaning Martyr), etc. Such titles were used by some authors but ignored by others, hence the divergence in university title names. Dr Shariaty Coll versus Shariaty Coll can be cited as one example, where Dr is missing in the latter. Further, such titles appeared in different forms (full, acronym-wise and abbreviated forms). The word Shahid, for instance, also appeared as S and Sh.
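The effect of title deletion can be neutralized when comparing names by stripping the title words before matching. The sketch below is an assumption-laden illustration: `TITLE_FORMS` lists only the titles and short forms mentioned in the text, `strip_titles` is a hypothetical helper, and the variants Sh Beheshti Univ and S Beheshti Univ are constructed examples rather than forms reported in the data.

```python
# Titles and short forms mentioned in the text, lower-cased for comparison.
TITLE_FORMS = {"dr", "prof", "shahid", "sh", "s"}

def strip_titles(name):
    """Drop title words (with or without a trailing dot) from a name."""
    kept = [t for t in name.split() if t.strip(".").lower() not in TITLE_FORMS]
    return " ".join(kept)

print(strip_titles("Dr Shariaty Coll"))   # Shariaty Coll
# Constructed variants of one title, for illustration only.
variants = ["Shahid Beheshti Univ", "Sh Beheshti Univ", "S Beheshti Univ"]
print({strip_titles(v) for v in variants})
```

A single-letter form such as S is risky to strip blindly, since it could equally be the initial of another word; a real tool would need to check the result against a list of known names.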

Discussion Based on the analyses in the previous section, the following points could be made: Firstly, a surprisingly large number of variations were observed in transliterating Iranian university names into English. This sort of diversity could be due to a number of reasons including lack of standards for transliteration of Persian words into English, authors less interest in having their articles edited and proof read by a qualied editor and dearth of effective laws to prevent authors from introducing different surface forms. From among 1668 title forms inspected, 576 (34.57 %) entries embodied misspellings. This was shown to be mostly due to carelessness of authors and, in part, due to unfamiliarity of journal editors with orthography of Persian university names. Such journals could, of course, be criticized with certainty for having missed some clear examples of misspellings, like Unvi vs. Univ. Misspelling was not the only source of problems observed, in fact, some 15 sources of inconsistency were discussed in this article. Secondly, such inconsistencies will downgrade the ranking of Iranian universities in global ranking systems. Some may claim that had authors followed the orthography publicized by their afliated university, through its website, the issue of various surface forms would have been settled totally, but in reality this is not the case. To clarify this point, four sample universities, each with 8 variants, were inspected. Further, the number of articles under each variant was also recorded. Based on the data in Fig. 1, it is true that most authors draw on the surface form publicized by their afliated universities in their websites; nevertheless, the number of forms other than the dominant one is not that marginal. 
Fig. 1 Percentage of dominant versus other surface forms in 4 Iranian university titles

In Mazandaran Univ Sci & Technol, Shahed Univ, Zanjan Univ and Inst Adv Studies Basic Sci, 15.38 % (12 records under other forms versus 66 under the dominant form), 10 % (66 versus 594), 28.71 % (172 versus 727) and 31.33 % (26 versus 57) of the surface forms differed from the form publicized on the university's website. Such deviant forms comprise almost one-third of the total forms used by authors of Inst Adv Studies Basic Sci. So the mere application of the dominant form would not resolve the problem completely, though it could reduce it to a great extent.

It seems that there is an urgent need to propose a standard system for transliterating Persian words into English. This system could be included as add-in software in Microsoft Office Word and could act as a spell checker for Iranian university names. Any form other than the default form could be identified by this software and converted into the standard form proposed. This is, of course, the topic of another article by the authors.
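The percentages quoted for the four sample universities appear to be the number of records under non-dominant forms divided by the sum of non-dominant and dominant records. The sketch below checks that reading; it is an assumption about how the paired counts should be interpreted, and the function name is illustrative. The Zanjan Univ pair is left out because, under this reading, the counts as printed (172 and 727) do not reproduce the stated 28.71 %.

```python
def other_form_share(other: int, dominant: int) -> float:
    """Percentage of records filed under non-dominant surface forms."""
    return round(100 * other / (other + dominant), 2)


# (records under other forms, records under the dominant form), as quoted.
samples = {
    "Mazandaran Univ Sci & Technol": (12, 66),
    "Shahed Univ": (66, 594),
    "Inst Adv Studies Basic Sci": (26, 57),
}

for name, (other, dominant) in samples.items():
    print(f"{name}: {other_form_share(other, dominant)} %")
```

Under this assumption the three pairs give 15.38 %, 10.0 % and 31.33 %, matching the figures in the text.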

Recommendations

Based on the findings, a number of recommendations could be made as follows:

There is a need to standardize the transliteration of Persian words in general, and Persian university titles in particular, into English.

Persian authors are advised to have their articles revised by a person who has mastered both English and Persian. They may also enlist such a person as a coauthor. This will surely reduce title variations and will also improve the quality of the articles in terms of grammar, etc.

University faculty members are promoted to higher ranks based on a number of factors, among which their publications form a core element. Rules could be ratified, or existing laws reinforced, so that just as authors receive credit for their articles, they are also penalized at least for not observing the university title publicized on the website of that university. (Of course, in October 2011 the MSRT approved a law emphasizing that an article by a faculty member will be considered for their promotion provided the author(s) have included the affiliation publicized through the website of the university in which they work.)

A standard list of Iranian university names could be produced and embedded as add-in software in Microsoft Word to act as a spell checker for university titles. This last recommendation could be the most tangible and effective one.
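The last recommendation, a standard list used as a spell checker, can be illustrated with approximate string matching. The sketch below is not the add-in the authors propose: it uses Python's standard `difflib` module as a stand-in, the four list entries are simply names mentioned in this article, and the function name and the 0.8 cutoff are assumptions chosen for illustration.

```python
import difflib

# Hypothetical standard list; a real one would be compiled from the
# surface forms publicized on university websites.
STANDARD_FORMS = [
    "Mazandaran Univ Sci & Technol",
    "Shahed Univ",
    "Zanjan Univ",
    "Inst Adv Studies Basic Sci",
]


def suggest_standard_form(affiliation: str, cutoff: float = 0.8):
    """Return the closest standard form, or None if nothing is similar enough."""
    by_lowercase = {form.lower(): form for form in STANDARD_FORMS}
    matches = difflib.get_close_matches(
        affiliation.lower(), list(by_lowercase), n=1, cutoff=cutoff
    )
    return by_lowercase[matches[0]] if matches else None


print(suggest_standard_form("Zanjan Unvi"))   # a misspelled variant
print(suggest_standard_form("Oxford Univ"))   # not on the list
```

A similarity cutoff catches misspellings and small spelling variations but not acronyms or heavily shortened forms; those would need an explicit table of known variants alongside the standard list.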


Concluding remarks

This article tackled orthographic variations in Iranian university titles. The extent of such variations was found to be very wide, rooted in a variety of issues including misspellings, abbreviations, space variations, syntactic permutation, application of vowels/consonants and vowel/consonant combinations, /a/ vs. /aa/, Tashdid, Kasra ezafe, redundancy, upper and lower case letters (downcasing), the voiceless glottal stop sound /?/, deletion of some terms/letters (shortening) and deletion of titles. It was noted that positioning Iranian universities in global ranking systems is now regarded as an important issue and that the MSRT has, as a high priority, adopted policies to promote the ranking of Iranian universities at the global scale. It was found that, as things stand, Iranian universities are not receiving the rank they really deserve simply because authors affiliated to a university use various titles to stand for the university name. Authors have proved highly inconsistent in this regard. It was recommended that authors follow the surface form publicized by universities on their websites, use the help of an editor while writing their articles, and, just as they are rewarded for their publications, be penalized by not being credited for articles that deviate from the publicized surface form. A spell checker, as add-in software, is highly needed to homogenize Iranian university surface forms by replacing the variants with the default form proposed.

