Myanmar
> The Myanmar script is used to write Burmese, the majority
language of Myanmar. Variations and extensions of the script are
used to write other languages of the region, such as Shan, Mon, and
Karen, as well as Pali and Sanskrit.
Ref: http://www.unicode.org/versions/Unicode5.0.0/ch11.pdf
> The Myanmar writing system derives from a Brahmi-related
script borrowed from South India in about the eighth century to
write the Mon language.
> The basic consonants, independent vowels, and dependent
vowel signs required for writing the Myanmar language are encoded
at the beginning of the Myanmar range (U+1000 to U+109F).
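As a quick illustration of the range above, a minimal Python sketch can test whether a character falls inside the Myanmar block. The helper name `in_myanmar_block` is invented for this example; the bounds are the ones stated on the slide.

```python
# Bounds of the Myanmar block as stated above (U+1000 to U+109F).
MYANMAR_START = 0x1000
MYANMAR_END = 0x109F


def in_myanmar_block(ch: str) -> bool:
    """Return True if the single character ch lies in the Myanmar block.

    Hypothetical helper for illustration only; it checks the code point
    range and nothing else (it does not know about extension blocks).
    """
    return MYANMAR_START <= ord(ch) <= MYANMAR_END


if __name__ == "__main__":
    for ch in ("\u1000", "\u1031", "A"):
        print(f"U+{ord(ch):04X}", in_myanmar_block(ch))
```

Note that this tests the script block, not the language: text in Shan, Mon, or Karen can use the same range.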
Myanmar Unicode 5.0.0
Unicode 5.1: What's New?
Etymology: Myanmar Unicode
Unicode
> The Unicode Standard is the universal character encoding
standard for written characters and text. It defines a consistent way
of encoding multilingual text that enables the exchange of text data
internationally and creates the foundation for global software.
> It provides the capacity to encode all characters used for the
written languages of the world—more than 1 million characters can
be encoded.
> The Unicode character encoding treats alphabetic characters,
ideographic characters, and symbols equivalently, which means they
can be used in any mixture and with equal facility.
> The Unicode Standard specifies a numeric value (code point)
and a name for each of its characters.
> The Unicode Standard defines these and other semantic values,
and it includes application data such as case mapping tables (a, A)
and character property tables as part of the Unicode Character
Database (UCD). Character properties define a character’s identity
and behavior; they ensure consistency in the processing and
interchange of Unicode data.
> Unicode characters are represented in one of three encoding
forms: a 32-bit form (UTF-32), a 16-bit form (UTF-16), and an 8-bit
form (UTF-8). The 8-bit, byte-oriented form, UTF-8, has been
designed for ease of use with existing ASCII-based systems.
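To make the three encoding forms concrete, the following Python sketch encodes one Myanmar character in each form and reports the byte length. The function name `encoded_sizes` is an assumption for this example; the codec names are standard Python codecs, and the big-endian variants are chosen only so that no byte order mark is added.

```python
import unicodedata


def encoded_sizes(text: str) -> dict:
    """Byte length of text in each of the three Unicode encoding forms."""
    return {
        "UTF-8": len(text.encode("utf-8")),
        "UTF-16": len(text.encode("utf-16-be")),  # big-endian, no byte order mark
        "UTF-32": len(text.encode("utf-32-be")),  # big-endian, no byte order mark
    }


if __name__ == "__main__":
    ch = "\u1000"  # first code point of the Myanmar range
    # Every character has a code point and a name in the Unicode Character Database.
    print(f"U+{ord(ch):04X}", unicodedata.name(ch))
    print(encoded_sizes(ch))
    print(encoded_sizes("A"))  # an ASCII letter is a single byte in UTF-8
```

The same code point is stored as a different number of bytes in each form, which is why the encoding form has to be known when text is exchanged.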
What Is Unicode's Design Goal?
What Is Unicode?
Why is Unicode needed?
Unicode Design Principles
1. Universal repertoire
2. Processing efficiency
3. Characters, not glyphs
4. Semantics (properties)
5. Plain text
6. Logical order in memory
7. Unification (within scripts across languages)
8. Dynamic composition
9. Equivalent Sequences (compatibility with current standards)
10. Convertibility
Life of a Character
[Figure: a keystroke on the G key passes through the keyboard driver and
keyboard layout, is stored in memory as the character G (code point 0047),
and is drawn on screen by the renderer using a font file.]
Rendering a complex script
[Figure: the memory order 1000 103B 1031 (က ျ ေ) passes through glyph
selection and positioning, using the font's positional shapes and
ligatures, to give the display order ေကျ.]
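The memory-order sequence 1000 103B 1031 from this slide can be inspected with Python's standard `unicodedata` module. This is only a sketch for looking at what is stored; the helper name `describe` is made up here, and the reordering for display is done by the rendering engine and font, not by this code.

```python
import unicodedata

# Stored (logical) order from the slide: consonant, medial, vowel sign.
stored = "\u1000\u103B\u1031"


def describe(text: str) -> list:
    """List each stored character as (code point, Unicode character name)."""
    return [(f"U+{ord(c):04X}", unicodedata.name(c)) for c in text]


if __name__ == "__main__":
    for code_point, name in describe(stored):
        print(code_point, name)
```

The vowel sign E is last in memory even though it is drawn to the left of the consonant on screen.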
Scripts
かきく… Syllabic
ถไณ… Alphabetic
漢字… Ideographic
!,:;… Shared
Α Β Γ… α β γ… Alphabetic, bicameral
…ا ب ت Alphabetic, right-to-left
√ ≠… Symbols
Letter → Codes → Glyphs
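[Figure: the letter ĝ can be stored as a single coded character, ĝ (011D),
or as a coded character sequence, g followed by a combining circumflex
(0067 0302). Both are displayed with the same glyphs.]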
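The two encodings of ĝ on this slide are canonically equivalent, which can be checked with Unicode normalization. The sketch below uses Python's standard `unicodedata.normalize`; the function name `canonically_equivalent` is an assumption for this example.

```python
import unicodedata

precomposed = "\u011D"     # single coded character: g with circumflex
sequence = "\u0067\u0302"  # g followed by COMBINING CIRCUMFLEX ACCENT


def canonically_equivalent(a: str, b: str) -> bool:
    """Compare two strings after canonical decomposition (NFD)."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


if __name__ == "__main__":
    print(precomposed == sequence)                       # raw code points differ
    print(canonically_equivalent(precomposed, sequence)) # same letter after NFD
    # NFC composes the sequence back into the single coded character.
    print(unicodedata.normalize("NFC", sequence) == precomposed)
```

Comparing raw strings treats the two forms as different, so searching and sorting code usually normalizes text first.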
What Is the Standard?
+ Unicode Standard Annexes
Conclusion…
Myth #1
(based on the Unicode Standard)
…is a system built on the Myanmar-script specifications defined by the
Unicode Consortium, so that Myanmar text can be used on computers.…
Fact:
Conforming strictly to the Unicode Standard means following not only
the assigned code points but also the other specifications in the
standard.
Myth #2
(Code Page or Code Block)
A code page is assigned to each language. For the Myanmar language,
the code page runs from 1000 to 109F.
Fact:
Wikipedia defines a code page as follows: “Code page is the traditional IBM
term used for a specific character encoding table: a mapping in which a sequence
of bits, usually a single octet representing integer values 0 through 255, is
associated with a specific character. IBM and Microsoft often allocate a code
page number to a character set even if that charset is better known by another
name.”
The right term here is code block, and a block is assigned to each script, not
to each language.
Myth #3
(just drag and drop)
On any OS, it can be used simply by dragging and dropping the font
file.
Bravo!
Fact:
Myth #4
(word break or syllable break)
The input method also inserts the necessary word breaks so that words
can be recognized automatically. This is not unique to Myanmar; every
language needs word breaks.
Fact:
UTN #11 states: “From this we can say that a syllable break
may occur before a Myanmar digit, an independent vowel, one of the
various signs or a base consonant so long as the consonant:”
is not devowelised with an asat, has no stacked consonant below it, and
is not a kinzi.
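The quoted rule can be sketched in Python as follows. This is a simplified illustration, not the algorithm from UTN #11 itself: it assumes text in Unicode 5.1 order, treats only U+1000 to U+1021 as consonants, and all function names are invented for this example.

```python
ASAT = "\u103A"       # MYANMAR SIGN ASAT (devowelises a consonant)
VIRAMA = "\u1039"     # MYANMAR SIGN VIRAMA (stacks the next consonant below)
DOT_BELOW = "\u1037"  # may sit between a final consonant and its asat


def is_consonant(ch: str) -> bool:
    # Simplification: only the basic consonant range is considered.
    return "\u1000" <= ch <= "\u1021"


def is_other_starter(ch: str) -> bool:
    """Independent vowels (U+1023..U+102A), digits and signs (U+1040..U+104F)."""
    return ("\u1023" <= ch <= "\u102A") or ("\u1040" <= ch <= "\u104F")


def can_break_before(text: str, i: int) -> bool:
    """Apply the quoted rule at index i (i must be greater than 0)."""
    ch = text[i]
    if is_other_starter(ch):
        return True
    if not is_consonant(ch):
        return False
    following = text[i + 1 : i + 3]
    if text[i - 1] == VIRAMA:
        return False  # stacked below the previous consonant, so not a base
    if following.startswith(VIRAMA):
        return False  # has a stacked consonant below it
    if following.startswith(ASAT) or following == DOT_BELOW + ASAT:
        return False  # devowelised with an asat (this also covers kinzi)
    return True


def split_syllables(text: str) -> list:
    """Split text at every position where the rule allows a break."""
    if not text:
        return []
    pieces, start = [], 0
    for i in range(1, len(text)):
        if can_break_before(text, i):
            pieces.append(text[start:i])
            start = i
    pieces.append(text[start:])
    return pieces


if __name__ == "__main__":
    word = "\u1019\u103C\u1014\u103A\u1019\u102C"  # the word "Myanmar"
    print(split_syllables(word))
```

These are syllable breaks only. Finding word breaks needs a dictionary or similar linguistic data on top of this step.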
Myth #5
(vowel sign in Myanmar)
There are no separate letters for vowels. That is why it is very
surprising that ဩ, ဪ, ဥ, and ဦ were given vowel code points in
Unicode.
Fact:
Saya Maung Khin Min (Danubyu), in his book on the Myanmar language
and script, page 121: “Vowel symbols can be divided into two kinds:
pure (independent) vowels and vowels attached to consonants.”
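The same two-way distinction shows up in the Unicode Character Database, where independent vowels are letters and dependent vowels are combining signs. The sketch below uses Python's standard `unicodedata` module; the helper name `summarize` and the choice of sample characters are assumptions for this example.

```python
import unicodedata

independent_vowels = "\u1025\u1026\u1029\u102A"  # the four letters named in the myth
dependent_signs = "\u102F\u1030"                 # vowel signs U and UU


def summarize(chars: str) -> list:
    """Return (code point, name, general category) for each character."""
    return [
        (f"U+{ord(c):04X}", unicodedata.name(c), unicodedata.category(c))
        for c in chars
    ]


if __name__ == "__main__":
    for row in summarize(independent_vowels + dependent_signs):
        print(*row)
```

Independent vowels are reported with a letter category, while the dependent signs are marks that attach to a consonant.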
Myth #6
(fake, partial, pseudo Unicode)
Modifying the Unicode specifications.
Adding new Unicode code points.
Having to add separate, custom forms.
Fact:
The main purposes of standardizing on Unicode are:
1) To make data interchange easy and accurate.
2) The benefits of Unicode are realized only when text remains readable
even if the two users do not have identical systems.
Myth #7
(Windows supports Myanmar 100%)
Myanmar Unicode is 100 percent usable on MS Windows. Myanmar words
can be searched in Windows and sorted alphabetically. Computers can
now be used in Myanmar.
Fact:
The additional Unicode specifications for Myanmar are due to be
approved in March 2008 and are still under discussion. On Windows,
typed text is merely displayed. Compared with the level of support
available for other languages, this is only about 50 percent.
Myth #8
(the vowel sign should not be placed after the consonant)
On placing the vowel sign E (tha-way-htoe) after the consonant ...
If the vowel sign E comes before the consonant, typesetting, sorting,
and breaking will be a real headache.
Fact:
In Myanmar, the specified order of characters is very important. The
order consonant, medial, vowel is specified according to how words are
constructed.
Examples: လိမ္မော်, အိန္ဒြေ
Myth #9
(* Unicode Font)
Five Myanmar experts invented the Unicode font in 2005.
Fact:
Merely reassigning the positions of ordinary glyph shapes, without
following the specifications laid down by Unicode, should not be
called an invention.
Myth #10
(the government forces the use of Unicode fonts)
The state is forcing the use of standards that do not work.
Fact:
Having the relevant authorities set policy to adopt standards and put
them into trial use is common practice internationally.
There are more myths out there. Send them to us.
ngwetun@solvewaresolution.net
www.parabaik.info