
##Adobe File Version: 1.000
#=======================================================================
#   FTP file name:  ARABIC.TXT
#
#   Contents:       Map (external version) from Mac OS Arabic
#                   character set to Unicode 2.1
#
#   Copyright:      (c) 1994-1999 by Apple Computer, Inc., all rights
#                   reserved.
#
#   Contact:        charsets@apple.com
#
#   Changes:
#
#       b02  1999-Sep-22    Update contact e-mail address. Matches
#                           internal utom<b1>, ufrm<b1>, and Text
#                           Encoding Converter version 1.5.
#       n10  1998-Feb-05    Show required Unicode character
#                           directionality in a different way. Matches
#                           internal utom<n4>, ufrm<n21>, and Text
#                           Encoding Converter version 1.3. Update
#                           header comments; include information on
#                           loose mapping of digits.
#       n07  1997-Jul-17    Update to match internal utom<n2>, ufrm<n17>:
#                           Change standard mapping for 0xC0 from U+066D
#                           to U+274A. Add direction overrides to
#                           mappings for 0x25, 0x2C, 0x3B, 0x3F. Add
#                           information on variants.
#       n03  1995-Apr-18    First version (after fixing some typos).
#                           Matches internal ufrm<n11>.
#
# Standard header:
# ----------------
#
#   Apple, the Apple logo, and Macintosh are trademarks of Apple
#   Computer, Inc., registered in the United States and other countries.
#   Unicode is a trademark of Unicode Inc. For the sake of brevity,
#   throughout this document, "Macintosh" can be used to refer to
#   Macintosh computers and "Unicode" can be used to refer to the
#   Unicode standard.
#
#   Apple makes no warranty or representation, either express or
#   implied, with respect to these tables, their quality, accuracy, or
#   fitness for a particular purpose. In no event will Apple be liable
#   for direct, indirect, special, incidental, or consequential damages
#   resulting from any defect or inaccuracy in this document or the
#   accompanying tables.
#
#   These mapping tables and character lists are subject to change.
#   The latest tables should be available from the following:
#
#   <ftp://ftp.unicode.org/Public/MAPPINGS/VENDORS/APPLE/>
#   <ftp://dev.apple.com/devworld/Technical_Documentation/Misc._Standards/>
#
#   For general information about Mac OS encodings and these mapping
#   tables, see the file "README.TXT".
#
# Format:
# -------
#

#   Three tab-separated columns;
#   '#' begins a comment which continues to the end of the line.
#     Column #1 is the Mac OS Arabic code (in hex as 0xNN).
#     Column #2 is the corresponding Unicode (in hex as 0xNNNN),
#       possibly preceded by a tag indicating required directionality
#       (i.e. <LR>+0xNNNN or <RL>+0xNNNN).
#     Column #3 is a comment containing the Unicode name.
#
#   The entries are in Mac OS Arabic code order.
#
#   Control character mappings are not shown in this table, following
#   the conventions of the standard UTC mapping tables. However, the
#   Mac OS Arabic character set uses the standard control characters at
#   0x00-0x1F and 0x7F.
#
# Notes on Mac OS Arabic:
# -----------------------
#
#   1. General
#
#   The Mac OS Arabic character set is intended to cover Arabic as
#   used in North Africa, the Arabian peninsula, and the Levant. It
#   also contains several characters needed for Urdu and/or Farsi.
#   Mac OS Arabic is used for the Arabic localizations, and for the
#   Arabic language support in the Arabic Language Kit.
#
#   The Mac OS Arabic character set is essentially a superset of ISO
#   8859-6. The 8859-6 code points that are interpreted differently
#   in the Mac OS Arabic set are as follows:
#    0xA0 is NO-BREAK SPACE in 8859-6 and right-left SPACE in Mac OS
#         Arabic; NO-BREAK SPACE is 0x81 in Mac OS Arabic.
#    0xA4 is CURRENCY SIGN in 8859-6 and right-left DOLLAR SIGN in
#         Mac OS Arabic.
#    0xAD is SOFT HYPHEN in 8859-6 and right-left HYPHEN-MINUS in
#         Mac OS Arabic.
#   ISO 8859-6 specifies that codes 0x30-0x39 can be rendered either
#   with European digit shapes or Arabic digit shapes. This is also
#   true in Mac OS Arabic, which determines from context which digit
#   shapes to use (see below).
#
#   The Mac OS Arabic character set uses the C1 controls area and other
#   code points which are undefined in ISO 8859-6 for additional
#   graphic characters: additional Arabic letters for Farsi and Urdu,
#   some accented Roman letters for European languages (such as French),
#   and duplicates of some of the punctuation, symbols, and digits in
#   the ASCII block. The duplicate punctuation, symbol, and digit
#   characters have right-left directionality, while the ASCII versions
#   have left-right directionality. See the next section for more
#   information on this.
#
#   Mac OS Arabic characters 0xEB-0xF2 are non-spacing/combining marks.
#
#   2. Directional characters and roundtrip fidelity
#
#   The Mac OS Arabic character set was developed in 1986-1987. At that
#   time the bidirectional line layout algorithm used in the Mac OS
#   Arabic system was fairly simple; it used only a few direction
#   classes (instead of the 13 or so now used in the Unicode
#   bidirectional algorithm). In order to permit users to handle some
#   tricky layout problems, certain punctuation and symbol characters

#   have duplicate code points, one with a left-right direction
#   attribute and the other with a right-left direction attribute.
#
#   For example, plus sign is encoded at 0x2B with a left-right
#   attribute, and at 0xAB with a right-left attribute. However, there
#   is only one PLUS SIGN character in Unicode. This leads to some
#   interesting problems when mapping between Mac OS Arabic and Unicode;
#   see below.
#
#   A related problem is that even when a particular character is
#   encoded only once in Mac OS Arabic, it may have a different
#   direction attribute than the corresponding Unicode character.
#
#   For example, the Mac OS Arabic character at 0x93 is HORIZONTAL
#   ELLIPSIS with strong right-left direction. However, the Unicode
#   character HORIZONTAL ELLIPSIS has direction class neutral.
#
#   3. Behavior of ASCII-range numbers
#
#   Mac OS Arabic also has two sets of digit codes.
#
#   The digits at 0x30-0x39 may be displayed using either European
#   digit shapes or Arabic digit shapes, depending on context. If there
#   is a "strong European" character such as a Latin letter on either
#   side of a sequence consisting of digits 0x30-0x39 and possibly comma
#   0x2C or period 0x2E, then these characters will be displayed using
#   European shapes. (This will happen even if there are neutral
#   characters between the digits and the strong European character.)
#   Otherwise, the digits will be displayed using Arabic shapes, the
#   comma will be displayed as Arabic thousands separator, and the
#   period as Arabic decimal separator. In any case, 0x2C, 0x2E, and
#   0x30-0x39 are always left-right.
#
#   The digits at 0xB0-0xB9 are always displayed using Arabic digit
#   shapes, and moreover, these digits always have strong right-left
#   directionality. These are mainly intended for special layout
#   purposes such as part numbers, etc.
#
#   4. Font variants
#
#   The table in this file gives the Unicode mappings for the standard
#   Mac OS Arabic encoding. This encoding is supported by the Cairo font
#   (the system font for Arabic), and is the encoding supported by the
#   text processing utilities. However, the other Arabic fonts actually
#   implement slightly different encodings; this mainly affects the code
#   points 0xAA and 0xC0. For these code points the standard Mac OS
#   Arabic encoding has the following mappings:
#     0xAA -> <RL>+0x002A ASTERISK, right-left
#     0xC0 -> <RL>+0x274A EIGHT TEARDROP-SPOKED PROPELLER ASTERISK,
#                         right-left
#   This mapping of 0xAA is consistent with the normal convention for
#   Mac OS Arabic and Hebrew that the right-left duplicates have codes
#   that are equal to the ASCII code of the left-right character plus
#   0x80. However, in all of the other fonts, 0xAA is MULTIPLY SIGN, and
#   right-left ASTERISK may be at a different code point.
#
#   The other variants are described below.
#
#   The TrueType variant is used for most of the Arabic TrueType fonts:
#   Baghdad, Geeza, Kufi, Nadeem. It differs from the standard variant
#   in the following way:

#     0xAA -> <RL>+0x00D7 MULTIPLICATION SIGN, right-left
#     0xC0 -> <RL>+0x002A ASTERISK, right-left
#
#   The Thuluth variant is used for the Arabic Postscript-only fonts:
#   Thuluth and Thuluth bold. It differs from the standard variant in
#   the following way:
#     0xAA -> <RL>+0x00D7 MULTIPLICATION SIGN, right-left
#     0xC0 -> 0x066D ARABIC FIVE POINTED STAR
#
#   The AlBayan variant is used for the Arabic TrueType font Al Bayan.
#   It differs from the standard variant in the following way:
#     0x81 -> no mapping (glyph just has authorship information, etc.)
#     0xA3 -> 0xFDFA ARABIC LIGATURE SALLALLAHOU ALAYHE WASALLAM
#     0xA4 -> 0xFDF2 ARABIC LIGATURE ALLAH ISOLATED FORM
#     0xAA -> <RL>+0x00D7 MULTIPLICATION SIGN, right-left
#     0xDC -> <RL>+0x25CF BLACK CIRCLE, right-left
#     0xFC -> <RL>+0x25A0 BLACK SQUARE, right-left
#
# Unicode mapping issues and notes:
# ---------------------------------
#
#   1. Matching the direction of Mac OS Arabic characters
#
#   When Mac OS Arabic encodes a character twice but with different
#   direction attributes for the two code points - as in the case of
#   plus sign mentioned above - we need a way to map both Mac OS Arabic
#   code points to Unicode and back again without loss of information.
#   With the plus sign, for example, mapping one of the Mac OS Arabic
#   characters to a code in the Unicode corporate use zone is
#   undesirable, since both of the plus sign characters are likely to
#   be used in text that is interchanged.
#
#   The problem is solved with the use of direction override characters
#   and direction-dependent mappings. When mapping from Mac OS Arabic
#   to Unicode, we use direction overrides as necessary to force the
#   direction of the resulting Unicode characters.
#
#   The required direction is indicated by a direction tag in the
#   mappings. A tag of <LR> means the corresponding Unicode character
#   must have a strong left-right context, and a tag of <RL> indicates
#   a right-left context.
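#
#   The direction-tag scheme just described can be sketched in Python as
#   below. This is only an illustration, not Apple's converter: the
#   function name wrap_with_overrides and the (tag, code point) input
#   shape are assumptions made for the example. It uses the Unicode
#   characters LEFT-TO-RIGHT OVERRIDE (U+202D), RIGHT-TO-LEFT OVERRIDE
#   (U+202E) and POP DIRECTIONAL FORMATTING (U+202C), and lets adjacent
#   characters that need the same direction share one override pair.

```python
LRO, RLO, PDF = "\u202D", "\u202E", "\u202C"

def wrap_with_overrides(mapped):
    """Build a Unicode string from (tag, code_point) pairs.

    tag is "LR", "RL", or None (no required direction). The input shape
    is a hypothetical one chosen for this sketch.
    """
    out = []
    current = None  # direction of the override currently open, if any
    for tag, code_point in mapped:
        if tag != current:
            if current is not None:
                out.append(PDF)  # close the previous override
            if tag is not None:
                out.append(LRO if tag == "LR" else RLO)
            current = tag
        out.append(chr(code_point))
    if current is not None:
        out.append(PDF)  # close an override left open at the end
    return "".join(out)

# Isolated Mac OS Arabic 0x2B, whose mapping is <LR>+0x002B.
isolated_plus = wrap_with_overrides([("LR", 0x002B)])
```

#   Here isolated_plus is the three-character sequence LRO, PLUS SIGN,
#   PDF. Untagged characters are passed through without any override.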
#
#   For example, the mapping of 0x2B is given as <LR>+0x002B; the
#   mapping of 0xAB is given as <RL>+0x002B. If we map an isolated
#   instance of 0x2B to Unicode, it should be mapped as follows (LRO
#   indicates LEFT-RIGHT OVERRIDE, PDF indicates POP DIRECTION
#   FORMATTING):
#
#     0x2B ->  0x202D (LRO) + 0x002B (PLUS SIGN) + 0x202C (PDF)
#
#   When mapping several characters in a row that require direction
#   forcing, the overrides need only be used at the beginning and end.
#   For example:
#
#     0x24 0x20 0x28 0x29 -> 0x202D 0x0024 0x0020 0x0028 0x0029 0x202C
#
#   When mapping from Unicode to Mac OS Arabic, the Unicode
#   bidirectional algorithm should be used to determine resolved
#   direction of the Unicode characters. The mapping from Unicode to
#   Mac OS Arabic can then be disambiguated by the use of the resolved
#   direction:
#
#     Unicode 0x002B -> Mac OS Arabic 0x2B (if L) or 0xAB (if R)
#
#   However, this also means the direction override characters should
#   be discarded when mapping from Unicode to Mac OS Arabic (after
#   they have been used to determine resolved direction), since the
#   direction override information is carried by the code point itself.
#
#   Even when direction overrides are not needed for roundtrip
#   fidelity, they are sometimes used when mapping Mac OS Arabic
#   characters to Unicode in order to achieve similar text layout with
#   the resulting Unicode text. For example, the single Mac OS Arabic
#   ellipsis character has direction class right-left, and there is no
#   left-right version. However, the Unicode HORIZONTAL ELLIPSIS
#   character has direction class neutral (which means it may end up
#   with a resolved direction of left-right if surrounded by left-right
#   characters). When mapping the Mac OS Arabic ellipsis to Unicode, it
#   is surrounded with a direction override to help preserve proper
#   text layout. The resolved direction is not needed or used when
#   mapping the Unicode HORIZONTAL ELLIPSIS back to Mac OS Arabic.
#
#   2. Mapping the Mac OS Arabic digits
#
#   The main table below contains mappings that should be used when
#   strict round-trip fidelity is required. However, for numeric
#   values, the mappings in that table will produce Unicode characters
#   that may appear different than the Mac OS Arabic text displayed
#   on a Mac OS system with Arabic support. This is because the Mac OS
#   uses context-dependent display for the 0x30-0x39 digits.
#
#   If roundtrip fidelity is not required, then the following
#   alternate mappings should be used when a sequence of 0x30-0x39
#   digits - possibly including 0x2C and 0x2E - occurs in an Arabic
#   context (that is, when the first "strong" character on either side
#   of the digit sequence is Arabic, or there is no strong character):
#
#     0x2C	0x066C	# ARABIC THOUSANDS SEPARATOR
#     0x2E	0x066B	# ARABIC DECIMAL SEPARATOR
#     0x30	0x0660	# ARABIC-INDIC DIGIT ZERO
#     0x31	0x0661	# ARABIC-INDIC DIGIT ONE
#     0x32	0x0662	# ARABIC-INDIC DIGIT TWO
#     0x33	0x0663	# ARABIC-INDIC DIGIT THREE
#     0x34	0x0664	# ARABIC-INDIC DIGIT FOUR
#     0x35	0x0665	# ARABIC-INDIC DIGIT FIVE
#     0x36	0x0666	# ARABIC-INDIC DIGIT SIX
#     0x37	0x0667	# ARABIC-INDIC DIGIT SEVEN
#     0x38	0x0668	# ARABIC-INDIC DIGIT EIGHT
#     0x39	0x0669	# ARABIC-INDIC DIGIT NINE
#
# Details of mapping changes in each version:
# -------------------------------------------
#
#   Changes from version n03 to version n07:
#
#   - Change mapping for 0xC0 from U+066D to U+274A.
#
#   - Add direction overrides (required directionality) to mappings
#     for 0x25, 0x2C, 0x3B, 0x3F.
#
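#   The alternate (loose) digit mappings from note 2 above can be
#   sketched in Python as below. This is a rough illustration under
#   stated assumptions, not the Text Encoding Converter's logic: the
#   names arabic_context_runs and loose_map are hypothetical, only
#   ASCII Latin letters are treated as "strong European", only the
#   Arabic letter code points of the main table are treated as strong
#   Arabic, and a run next to both an Arabic and a European strong
#   character is treated as a European context.

```python
import re

# Loose mappings from note 2: Mac OS Arabic byte -> Unicode code point.
LOOSE = {0x2C: 0x066C, 0x2E: 0x066B}
LOOSE.update({0x30 + d: 0x0660 + d for d in range(10)})

# A run starts and ends with a digit and may contain 0x2C / 0x2E inside.
_RUN = re.compile(rb"[0-9](?:[0-9,.]*[0-9])?")

def _strength(b):
    """Classify one Mac OS Arabic byte (simplified; see assumptions)."""
    if 0x41 <= b <= 0x5A or 0x61 <= b <= 0x7A:
        return "european"
    if (b == 0x8B or 0xC1 <= b <= 0xDA or 0xE0 <= b <= 0xEA
            or 0xF3 <= b <= 0xFA or b >= 0xFE):
        return "arabic"
    return None  # everything else is treated as neutral in this sketch

def _first_strong(data, indices):
    """Return the class of the first strong byte met along indices."""
    for i in indices:
        s = _strength(data[i])
        if s is not None:
            return s
    return None

def arabic_context_runs(data):
    """Return (start, end) spans of digit runs in an Arabic context."""
    spans = []
    for m in _RUN.finditer(data):
        start, end = m.span()
        left = _first_strong(data, range(start - 1, -1, -1))
        right = _first_strong(data, range(end, len(data)))
        if "european" not in (left, right):
            spans.append((start, end))
    return spans

def loose_map(run):
    """Map the bytes of one digit run through the loose table."""
    return "".join(chr(LOOSE[b]) for b in run)

# ARABIC LETTER ALEF (0xC7), a space, then the digits "12.5".
sample = bytes([0xC7, 0x20]) + b"12.5"
spans = arabic_context_runs(sample)
```

#   Bytes outside the returned spans would still go through the strict
#   mappings of the main table.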

0x20	<LR>+0x0020	# SPACE, left-right
0x21	<LR>+0x0021	# EXCLAMATION MARK, left-right
0x22	<LR>+0x0022	# QUOTATION MARK, left-right
0x23	<LR>+0x0023	# NUMBER SIGN, left-right
0x24	<LR>+0x0024	# DOLLAR SIGN, left-right
0x25	<LR>+0x0025	# PERCENT SIGN, left-right
0x26	<LR>+0x0026	# AMPERSAND, left-right
0x27	<LR>+0x0027	# APOSTROPHE, left-right
0x28	<LR>+0x0028	# LEFT PARENTHESIS, left-right
0x29	<LR>+0x0029	# RIGHT PARENTHESIS, left-right
0x2A	<LR>+0x002A	# ASTERISK, left-right
0x2B	<LR>+0x002B	# PLUS SIGN, left-right
0x2C	<LR>+0x002C	# COMMA, left-right
0x2D	<LR>+0x002D	# HYPHEN-MINUS, left-right
0x2E	<LR>+0x002E	# FULL STOP, left-right
0x2F	<LR>+0x002F	# SOLIDUS, left-right
0x30	0x0030	# DIGIT ZERO
0x31	0x0031	# DIGIT ONE
0x32	0x0032	# DIGIT TWO
0x33	0x0033	# DIGIT THREE
0x34	0x0034	# DIGIT FOUR
0x35	0x0035	# DIGIT FIVE
0x36	0x0036	# DIGIT SIX
0x37	0x0037	# DIGIT SEVEN
0x38	0x0038	# DIGIT EIGHT
0x39	0x0039	# DIGIT NINE
0x3A	<LR>+0x003A	# COLON, left-right
0x3B	<LR>+0x003B	# SEMICOLON, left-right
0x3C	<LR>+0x003C	# LESS-THAN SIGN, left-right
0x3D	<LR>+0x003D	# EQUALS SIGN, left-right
0x3E	<LR>+0x003E	# GREATER-THAN SIGN, left-right
0x3F	<LR>+0x003F	# QUESTION MARK, left-right
0x40	0x0040	# COMMERCIAL AT
0x41	0x0041	# LATIN CAPITAL LETTER A
0x42	0x0042	# LATIN CAPITAL LETTER B
0x43	0x0043	# LATIN CAPITAL LETTER C
0x44	0x0044	# LATIN CAPITAL LETTER D
0x45	0x0045	# LATIN CAPITAL LETTER E
0x46	0x0046	# LATIN CAPITAL LETTER F
0x47	0x0047	# LATIN CAPITAL LETTER G
0x48	0x0048	# LATIN CAPITAL LETTER H
0x49	0x0049	# LATIN CAPITAL LETTER I
0x4A	0x004A	# LATIN CAPITAL LETTER J
0x4B	0x004B	# LATIN CAPITAL LETTER K
0x4C	0x004C	# LATIN CAPITAL LETTER L
0x4D	0x004D	# LATIN CAPITAL LETTER M
0x4E	0x004E	# LATIN CAPITAL LETTER N
0x4F	0x004F	# LATIN CAPITAL LETTER O
0x50	0x0050	# LATIN CAPITAL LETTER P
0x51	0x0051	# LATIN CAPITAL LETTER Q
0x52	0x0052	# LATIN CAPITAL LETTER R
0x53	0x0053	# LATIN CAPITAL LETTER S
0x54	0x0054	# LATIN CAPITAL LETTER T
0x55	0x0055	# LATIN CAPITAL LETTER U
0x56	0x0056	# LATIN CAPITAL LETTER V
0x57	0x0057	# LATIN CAPITAL LETTER W
0x58	0x0058	# LATIN CAPITAL LETTER X
0x59	0x0059	# LATIN CAPITAL LETTER Y

0x5A	0x005A	# LATIN CAPITAL LETTER Z
0x5B	<LR>+0x005B	# LEFT SQUARE BRACKET, left-right
0x5C	<LR>+0x005C	# REVERSE SOLIDUS, left-right
0x5D	<LR>+0x005D	# RIGHT SQUARE BRACKET, left-right
0x5E	<LR>+0x005E	# CIRCUMFLEX ACCENT, left-right
0x5F	<LR>+0x005F	# LOW LINE, left-right
0x60	0x0060	# GRAVE ACCENT
0x61	0x0061	# LATIN SMALL LETTER A
0x62	0x0062	# LATIN SMALL LETTER B
0x63	0x0063	# LATIN SMALL LETTER C
0x64	0x0064	# LATIN SMALL LETTER D
0x65	0x0065	# LATIN SMALL LETTER E
0x66	0x0066	# LATIN SMALL LETTER F
0x67	0x0067	# LATIN SMALL LETTER G
0x68	0x0068	# LATIN SMALL LETTER H
0x69	0x0069	# LATIN SMALL LETTER I
0x6A	0x006A	# LATIN SMALL LETTER J
0x6B	0x006B	# LATIN SMALL LETTER K
0x6C	0x006C	# LATIN SMALL LETTER L
0x6D	0x006D	# LATIN SMALL LETTER M
0x6E	0x006E	# LATIN SMALL LETTER N
0x6F	0x006F	# LATIN SMALL LETTER O
0x70	0x0070	# LATIN SMALL LETTER P
0x71	0x0071	# LATIN SMALL LETTER Q
0x72	0x0072	# LATIN SMALL LETTER R
0x73	0x0073	# LATIN SMALL LETTER S
0x74	0x0074	# LATIN SMALL LETTER T
0x75	0x0075	# LATIN SMALL LETTER U
0x76	0x0076	# LATIN SMALL LETTER V
0x77	0x0077	# LATIN SMALL LETTER W
0x78	0x0078	# LATIN SMALL LETTER X
0x79	0x0079	# LATIN SMALL LETTER Y
0x7A	0x007A	# LATIN SMALL LETTER Z
0x7B	<LR>+0x007B	# LEFT CURLY BRACKET, left-right
0x7C	<LR>+0x007C	# VERTICAL LINE, left-right
0x7D	<LR>+0x007D	# RIGHT CURLY BRACKET, left-right
0x7E	0x007E	# TILDE
#
0x80	0x00C4	# LATIN CAPITAL LETTER A WITH DIAERESIS
0x81	<RL>+0x00A0	# NO-BREAK SPACE, right-left
0x82	0x00C7	# LATIN CAPITAL LETTER C WITH CEDILLA
0x83	0x00C9	# LATIN CAPITAL LETTER E WITH ACUTE
0x84	0x00D1	# LATIN CAPITAL LETTER N WITH TILDE
0x85	0x00D6	# LATIN CAPITAL LETTER O WITH DIAERESIS
0x86	0x00DC	# LATIN CAPITAL LETTER U WITH DIAERESIS
0x87	0x00E1	# LATIN SMALL LETTER A WITH ACUTE
0x88	0x00E0	# LATIN SMALL LETTER A WITH GRAVE
0x89	0x00E2	# LATIN SMALL LETTER A WITH CIRCUMFLEX
0x8A	0x00E4	# LATIN SMALL LETTER A WITH DIAERESIS
0x8B	0x06BA	# ARABIC LETTER NOON GHUNNA
0x8C	<RL>+0x00AB	# LEFT-POINTING DOUBLE ANGLE QUOTATION MARK, right-left
0x8D	0x00E7	# LATIN SMALL LETTER C WITH CEDILLA
0x8E	0x00E9	# LATIN SMALL LETTER E WITH ACUTE
0x8F	0x00E8	# LATIN SMALL LETTER E WITH GRAVE
0x90	0x00EA	# LATIN SMALL LETTER E WITH CIRCUMFLEX
0x91	0x00EB	# LATIN SMALL LETTER E WITH DIAERESIS
0x92	0x00ED	# LATIN SMALL LETTER I WITH ACUTE
0x93	<RL>+0x2026	# HORIZONTAL ELLIPSIS, right-left
0x94	0x00EE	# LATIN SMALL LETTER I WITH CIRCUMFLEX
0x95	0x00EF	# LATIN SMALL LETTER I WITH DIAERESIS

0x96	0x00F1	# LATIN SMALL LETTER N WITH TILDE
0x97	0x00F3	# LATIN SMALL LETTER O WITH ACUTE
0x98	<RL>+0x00BB	# RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK, right-left
0x99	0x00F4	# LATIN SMALL LETTER O WITH CIRCUMFLEX
0x9A	0x00F6	# LATIN SMALL LETTER O WITH DIAERESIS
0x9B	<RL>+0x00F7	# DIVISION SIGN, right-left
0x9C	0x00FA	# LATIN SMALL LETTER U WITH ACUTE
0x9D	0x00F9	# LATIN SMALL LETTER U WITH GRAVE
0x9E	0x00FB	# LATIN SMALL LETTER U WITH CIRCUMFLEX
0x9F	0x00FC	# LATIN SMALL LETTER U WITH DIAERESIS
0xA0	<RL>+0x0020	# SPACE, right-left
0xA1	<RL>+0x0021	# EXCLAMATION MARK, right-left
0xA2	<RL>+0x0022	# QUOTATION MARK, right-left
0xA3	<RL>+0x0023	# NUMBER SIGN, right-left
0xA4	<RL>+0x0024	# DOLLAR SIGN, right-left
0xA5	0x066A	# ARABIC PERCENT SIGN
0xA6	<RL>+0x0026	# AMPERSAND, right-left
0xA7	<RL>+0x0027	# APOSTROPHE, right-left
0xA8	<RL>+0x0028	# LEFT PARENTHESIS, right-left
0xA9	<RL>+0x0029	# RIGHT PARENTHESIS, right-left
0xAA	<RL>+0x002A	# ASTERISK, right-left
0xAB	<RL>+0x002B	# PLUS SIGN, right-left
0xAC	0x060C	# ARABIC COMMA
0xAD	<RL>+0x002D	# HYPHEN-MINUS, right-left
0xAE	<RL>+0x002E	# FULL STOP, right-left
0xAF	<RL>+0x002F	# SOLIDUS, right-left
0xB0	<RL>+0x0660	# ARABIC-INDIC DIGIT ZERO, right-left
0xB1	<RL>+0x0661	# ARABIC-INDIC DIGIT ONE, right-left
0xB2	<RL>+0x0662	# ARABIC-INDIC DIGIT TWO, right-left
0xB3	<RL>+0x0663	# ARABIC-INDIC DIGIT THREE, right-left
0xB4	<RL>+0x0664	# ARABIC-INDIC DIGIT FOUR, right-left
0xB5	<RL>+0x0665	# ARABIC-INDIC DIGIT FIVE, right-left
0xB6	<RL>+0x0666	# ARABIC-INDIC DIGIT SIX, right-left
0xB7	<RL>+0x0667	# ARABIC-INDIC DIGIT SEVEN, right-left
0xB8	<RL>+0x0668	# ARABIC-INDIC DIGIT EIGHT, right-left
0xB9	<RL>+0x0669	# ARABIC-INDIC DIGIT NINE, right-left
0xBA	<RL>+0x003A	# COLON, right-left
0xBB	0x061B	# ARABIC SEMICOLON
0xBC	<RL>+0x003C	# LESS-THAN SIGN, right-left
0xBD	<RL>+0x003D	# EQUALS SIGN, right-left
0xBE	<RL>+0x003E	# GREATER-THAN SIGN, right-left
0xBF	0x061F	# ARABIC QUESTION MARK
0xC0	<RL>+0x274A	# EIGHT TEARDROP-SPOKED PROPELLER ASTERISK, right-left
0xC1	0x0621	# ARABIC LETTER HAMZA
0xC2	0x0622	# ARABIC LETTER ALEF WITH MADDA ABOVE
0xC3	0x0623	# ARABIC LETTER ALEF WITH HAMZA ABOVE
0xC4	0x0624	# ARABIC LETTER WAW WITH HAMZA ABOVE
0xC5	0x0625	# ARABIC LETTER ALEF WITH HAMZA BELOW
0xC6	0x0626	# ARABIC LETTER YEH WITH HAMZA ABOVE
0xC7	0x0627	# ARABIC LETTER ALEF
0xC8	0x0628	# ARABIC LETTER BEH
0xC9	0x0629	# ARABIC LETTER TEH MARBUTA
0xCA	0x062A	# ARABIC LETTER TEH
0xCB	0x062B	# ARABIC LETTER THEH
0xCC	0x062C	# ARABIC LETTER JEEM
0xCD	0x062D	# ARABIC LETTER HAH
0xCE	0x062E	# ARABIC LETTER KHAH
0xCF	0x062F	# ARABIC LETTER DAL
0xD0	0x0630	# ARABIC LETTER THAL
0xD1	0x0631	# ARABIC LETTER REH

0xD2	0x0632	# ARABIC LETTER ZAIN
0xD3	0x0633	# ARABIC LETTER SEEN
0xD4	0x0634	# ARABIC LETTER SHEEN
0xD5	0x0635	# ARABIC LETTER SAD
0xD6	0x0636	# ARABIC LETTER DAD
0xD7	0x0637	# ARABIC LETTER TAH
0xD8	0x0638	# ARABIC LETTER ZAH
0xD9	0x0639	# ARABIC LETTER AIN
0xDA	0x063A	# ARABIC LETTER GHAIN
0xDB	<RL>+0x005B	# LEFT SQUARE BRACKET, right-left
0xDC	<RL>+0x005C	# REVERSE SOLIDUS, right-left
0xDD	<RL>+0x005D	# RIGHT SQUARE BRACKET, right-left
0xDE	<RL>+0x005E	# CIRCUMFLEX ACCENT, right-left
0xDF	<RL>+0x005F	# LOW LINE, right-left
0xE0	0x0640	# ARABIC TATWEEL
0xE1	0x0641	# ARABIC LETTER FEH
0xE2	0x0642	# ARABIC LETTER QAF
0xE3	0x0643	# ARABIC LETTER KAF
0xE4	0x0644	# ARABIC LETTER LAM
0xE5	0x0645	# ARABIC LETTER MEEM
0xE6	0x0646	# ARABIC LETTER NOON
0xE7	0x0647	# ARABIC LETTER HEH
0xE8	0x0648	# ARABIC LETTER WAW
0xE9	0x0649	# ARABIC LETTER ALEF MAKSURA
0xEA	0x064A	# ARABIC LETTER YEH
0xEB	0x064B	# ARABIC FATHATAN
0xEC	0x064C	# ARABIC DAMMATAN
0xED	0x064D	# ARABIC KASRATAN
0xEE	0x064E	# ARABIC FATHA
0xEF	0x064F	# ARABIC DAMMA
0xF0	0x0650	# ARABIC KASRA
0xF1	0x0651	# ARABIC SHADDA
0xF2	0x0652	# ARABIC SUKUN
0xF3	0x067E	# ARABIC LETTER PEH
0xF4	0x0679	# ARABIC LETTER TTEH
0xF5	0x0686	# ARABIC LETTER TCHEH
0xF6	0x06D5	# ARABIC LETTER AE
0xF7	0x06A4	# ARABIC LETTER VEH
0xF8	0x06AF	# ARABIC LETTER GAF
0xF9	0x0688	# ARABIC LETTER DDAL
0xFA	0x0691	# ARABIC LETTER RREH
0xFB	<RL>+0x007B	# LEFT CURLY BRACKET, right-left
0xFC	<RL>+0x007C	# VERTICAL LINE, right-left
0xFD	<RL>+0x007D	# RIGHT CURLY BRACKET, right-left
0xFE	0x0698	# ARABIC LETTER JEH
0xFF	0x06D2	# ARABIC LETTER YEH BARREE
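#
#   The three-column format described in the header can be read with a
#   small Python sketch like the one below. It is illustrative only:
#   the function name parse_mapping_line and the returned tuple layout
#   are assumptions made for this example, and fields are split on any
#   whitespace rather than strictly on tabs.

```python
import re

# Column #2: optional <LR>+ or <RL>+ tag followed by 0xNNNN.
_UNICODE_FIELD = re.compile(r"^(?:<(LR|RL)>\+)?0x([0-9A-Fa-f]{4})$")

def parse_mapping_line(line):
    """Parse one table line.

    Returns (mac_code, tag, code_point, name), where tag is "LR", "RL",
    or None. Returns None for blank lines and comment-only lines.
    """
    body, _, comment = line.partition("#")  # '#' starts the comment
    fields = body.split()
    if len(fields) < 2:
        return None
    match = _UNICODE_FIELD.match(fields[1])
    if match is None:
        raise ValueError("unrecognized Unicode field: " + fields[1])
    mac_code = int(fields[0], 16)
    code_point = int(match.group(2), 16)
    return mac_code, match.group(1), code_point, comment.strip()

entry = parse_mapping_line("0xAB\t<RL>+0x002B\t# PLUS SIGN, right-left")
```

#   Here entry is (0xAB, "RL", 0x002B, "PLUS SIGN, right-left"). Lines
#   whose Unicode column has no tag are returned with tag set to None.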