• Embed Doc
  • Readcast
  • Collections
  • CommentGo Back
Download
 
Center for Research in Urdu Language Processing288
CONTEXTUAL SHAPE ANALYSIS OF NASTALIQ
 Aamir Wali, Atif Gulzar, Ayesha Zia, Muhammad Ahmad Ghazali,Muhammad Irfan Rafiq, Muhammad Saqib Niaz, Sara Hussain, and Sheraz Bashir 
 
ABSTRACT
Nastaliq calligraphic style is one of the mostcomplex and widely used styles of Urduscript. It employs different shapes and sizesfor the same letter in a bewildering variety of contexts. It is observed that these differentshapes vary with neighboring letters as wellas position of that letter in a ligature. Thispaper explores this variety in shapes thatNastalique offers, thus providing afoundation to eventually model this inherentvariety and recreate this script font aswritten by a calligrapher.
 
1. INTRODUCTION
 
Nastaliq is one of the most widely used fontsof Urdu script in Pakistan and is regarded asone of the most complex fonts in theliterature of electronic computing. Earlyefforts to make this font availableelectronically used the approach of storingall possible ligatures (sequence of Urducharacters occurring without space) of Urdu,which was not supported by the standardMicrosoft Windows font technology of TTF(True Type Font) available at that time.Consequently this font was limited to somespecialized Urdu Word Processors only.With the emergence of new font technologyOTF (Open Type Font) from Microsoft, it isnow possible to make Nastaliq font, whichwould be portable to any standard wordprocessor like Microsoft Word, Power Pointand Excel.A natural approach for this, is to make acharacter-based Nastaliq font i.e. in spite of storing all possible ligatures, individualcharacters are stored. Character-basedNastaliq font needs to specify the rules,which govern the change of shape of characters while moving from one segmentof ligature to another. The first step of sucha complex writing system is to identify theall-possible shapes of characters in ligaturesand then study the rules that change theshape in particular context.The aim of this paper is to identify allpossible shapes of characters based uponthe context in which they occur in ligatures,thus providing a foundation to eventuallymodel it.
2. LITERATURE REVIEW ANDPROBLEM STATEMENT
2.1 Urdu Writing System
 
Urdu is a complete language with its ownscript, which is a mixture of Arabic andPersian script. Urdu script has total 38alphabets excluding Aerabs (Vowel Marks).Character set and Aerabs of Urdu areshown below in Fig 1.
Figure 1(a)
Character Set of Urdu
 
 289 
Figure 1(b)
Aerabs of Urdu
 
From the above figure it can be observedthat several of the basic alphabets of Urdushare same shape, these are differentiatedonly by the placement of dots or diacriticTuay on the basic shape. This propertymakes Urdu script fast and easy to write.Urdu is written in the opposite direction toEnglish i.e. from right to left. An interestingconcept about Urdu is that its number system is written form left to right. So Urduwriting system has both the properties of leftto right and right to left writing systems asshown in figure 2.
Figure 2
Urdu Number SystemUrdu writing system is cursive. More thanone character joins together to form aligature. Important thing to be observed isthat the characters change their shapedepending upon their position in the ligature.Each letter is written in a slightly differentform depending on whether it comes in thebeginning, middle or end of a word or whether it occurs in isolated form. This isshown in the figure 3.
Figure 3
Different shapes according to theposition in ligature
 
In the above figure the character has formedfour shapes according to its position.Another interesting property of Urdu writingsystem is that characters change their shapes depending upon the charactersfollowing and preceding it as in figure 4. Thischange follows some rules like, shape of thenext letter to join with and the shape of thecharacter, which is joining.
Figure 4
Different Shapes of the Letter Seen
().
In the above figure it can be clearly seenthat the second character () changes itsshape in accordance with following andpreceding characters. Which symbolizesthat Urdu writing system is context sensitive.
2.2 Different Fonts of Urdu
Urdu fonts started growing after the Khat-e-Kofi, an Arabic font that is now being usedby Urdu. It was invented in the year 238 AH.But now Kofi has lost its fame because of itscomplex writing system as shown in figure 5.
Figure 5
Different fonts of UrduIn the year 310 AH a famous Muslim scholar Ibne-Muqalla invented a new and easy fontcalled Naskh. All the previously inventedfonts lost their prominence due to Naskh’seasy and fast writing system.Although a considerable number of studieson Urdu Naskh script have been conductedduring the last few years and consequentlyled to the development of software related toUrdu Naskh, studies on Urdu Nastaliq are
 
Center for Research in Urdu Language Processing290
still in their premature stages. This is mainlydue to extraordinary features of this uniqueand dynamic form of writing. There aremany other fonts of Urdu, some of which areshown in the figure 5.
2.3 Urdu Nastaliq Script
Urdu Nastaliq script is a collection of twoother scripts Naskh and Talique. These twoscripts were combined to form another scriptformally called Naskh-Talique, which wasshortened to Nastaliq. Hence, this 38alphabets script has properties of bothNaskh and Talique.Two most common feature of Nastaliq foundin Naskh or for that matter in any Persian or Arabic script is that it is cursive. That is, thetip of the pen is not raised until a ligature iscomplete. Another characteristic is thatNastaliq is written from right to left unlikeEnglish which is form left to right. In additionto these, there are other characteristics of Nastaliq that have made its automationdifficult.Nastaliq is actually written from top right tobottom left. Each ligature is tilted at approx.45 degree (See Fig. 6). This is of particular significance as there is no fixed level or height for any character. Positions of characters change at a continuous scale. InNaskh each character has four shapesdepending on whether the character isisolated, at start, in middle or coming at end.In Nastaliq, however characters may havesignificant amount of variation at oneposition alone. This is because of theinfluence of other characters in the sameligature.
Figure 6
 
Nastaliq diagonality
 
The aim of this paper is to identify allpossible shapes of characters of Nastaliqstyle based upon the context in which theyoccurs in ligature, thus providing afoundation to eventually model it.Noori Nastaliq (a type of Nastaliq font, seefig. 5) provides an ideal platform for this kindof analysis. Moving from right to left thisstyle of writing uses simple spacing rules(where size of a shape does not depend onhow much space is available for the letter).Thus Noori Nastaliq style of writingeliminates other complexities of Nastaliq fontwhile providing different shapes and their context.
3. METHODOLOGY
We need to study all possible ligatures of Noori Nastaliq, to identify different shapesfrom them. Practically speaking, it is notpossible to study all of the ligatures. So welimit our study up to 4 character ligatures(which are about 600,000 ligatures).We observed that some of them behavesimilarly under all circumstance, whilewriting. Which can be observed from thefollowing example in figure 7. In this figurethe middle characters have common shapeswith the difference of dots.
Figure 7
Examples of characters behavingsimilarly in all the contextsSo these 38 characters have beencategorized into 21 classes so that anymember of same class will have same
of 00

Leave a Comment

You must be to leave a comment.
Submit
Characters: ...
You must be to leave a comment.
Submit
Characters: ...